I am one of those people who function better by writing things down. One day, I realized that most of my notes don’t have to be private, so here they are - my second brain. Be warned: if you stumble upon something here that doesn’t make sense to you, it isn’t meant to!
Today I learned
Deep Learning: Whenever you define a neural network, either as a vanilla Python function or as an nn.Module, it should only take a mini-batch of training data as input. For example, for a net that works on images, the input will only be X, whose shape could be [256, 1, 28, 28]. Here, 256 is the number of items in each mini-batch, 1 is the number of channels (for gray-scale images), and 28 * 28 is the size of the images. The first element of the input’s shape is always the batch size. The loss function should only return a scalar (a Tensor of size 1). That is because PyTorch’s backward function, called with no arguments, only works on scalars.
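A minimal sketch of both conventions, assuming PyTorch is installed (the network `TinyNet` and its layer sizes are made up for illustration): the forward pass takes only the mini-batch X, and the loss reduces to a scalar so that backward() can be called on it.

```python
import torch
import torch.nn as nn

class TinyNet(nn.Module):
    def __init__(self):
        super().__init__()
        # flatten the 28*28 gray-scale pixels into 10 class scores
        self.fc = nn.Linear(28 * 28, 10)

    def forward(self, X):               # X: [batch, 1, 28, 28]
        return self.fc(X.flatten(1))    # -> [batch, 10]

net = TinyNet()
X = torch.randn(256, 1, 28, 28)         # 256 = batch size, 1 = channels
y = torch.randint(0, 10, (256,))        # one label per item in the batch
loss = nn.functional.cross_entropy(net(X), y)  # a 0-dim (scalar) tensor
loss.backward()                         # works because loss is a scalar
```

Note that `cross_entropy` averages over the batch by default, which is what collapses a batch of per-item losses into the single scalar that backward() needs.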