# Today I learned

Deep Learning:

- Whenever you define a neural network, either as a vanilla Python function or as an `nn.Module`, it should take only a mini-batch of training data as input. For example, for a net that works on images, the only input is `X`, whose shape could be `[256, 1, 28, 28]`. Here, 256 is the number of items in each mini-batch, 1 is the number of channels (for gray-scale images), and 28 * 28 is the size of each image.
  - The first dimension of the input is always the batch size.

- The loss function should return only a scalar (or a `Tensor` of size 1). That is because PyTorch’s `backward` function, when called without arguments, only works on scalars.
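
Both notes above can be checked with a small sketch. `TinyNet`, the random images, and the random labels are made up for illustration; only the shape `[256, 1, 28, 28]` comes from the notes. It assumes a 10-class problem and uses `nn.CrossEntropyLoss`, which averages over the batch by default.

```python
import torch
from torch import nn


class TinyNet(nn.Module):
    """Hypothetical classifier for 1x28x28 gray-scale images."""

    def __init__(self, num_classes: int = 10):
        super().__init__()
        self.flatten = nn.Flatten()  # [N, 1, 28, 28] -> [N, 784]
        self.linear = nn.Linear(28 * 28, num_classes)

    def forward(self, X: torch.Tensor) -> torch.Tensor:
        # X is one mini-batch; its first dimension is the batch size.
        return self.linear(self.flatten(X))


torch.manual_seed(0)
X = torch.randn(256, 1, 28, 28)   # fake mini-batch of images
y = torch.randint(0, 10, (256,))  # fake integer class labels

net = TinyNet()
logits = net(X)                   # one row of class scores per item

# The default reduction ("mean") collapses the batch into a single scalar.
loss = nn.CrossEntropyLoss()(logits, y)
loss.backward()                   # works: loss has zero dimensions

# reduction="none" keeps one loss per item, so the result is not a scalar.
per_item = nn.CrossEntropyLoss(reduction="none")(net(X), y)
try:
    per_item.backward()           # no gradient argument given
    backward_failed = False
except RuntimeError:
    backward_failed = True

# Reducing to a scalar first makes backward() usable again.
per_item.mean().backward()

print(logits.shape, loss.dim(), per_item.shape, backward_failed)
```

If a per-item loss is what you have, reducing it with `.mean()` or `.sum()` before calling `backward()` is the usual way out.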