Why Use Batches

Momentum

Adaptive Learning Rate

Optimization for Deep Learning