I am one of those people who function better by writing things down. One day, I realized that most of my notes don’t have to be private, so here they are - my second brain. Be warned that, if you stumble upon something here that doesn’t make sense to you, it isn’t meant to!
Today I learned
Fastai:

- While updating weights, we multiply the learning rate by the derivative of the loss function with respect to the weights (see the first sketch below). The loss function is a function of the independent variables, X, and the weights.
- The cross-entropy loss function is useful for classification problems, where, unlike in regression, you don't care about how numerically close your prediction was, only whether it picked the right class.
- Softmax is an activation function that squashes your outputs to values between 0 and 1 (that sum to 1), somewhat like the Sigmoid, which also conforms output to a range. Some discussion on when to use one or the other of these: Softmax vs Sigmoid function in Logistic classifier?
- You generally want cross-entropy loss and Softmax for single-label multi-class classification problems; they go well together (second sketch below).
- Regularization techniques allow you to avoid over-fitting: weight decay, dropout, batch norm, data augmentation.
- One way to avoid over-fitting is to use fewer parameters. However, Jeremy proposes that we instead use a lot of parameters but penalize complexity. Weight decay is one way to do the latter (last sketch below).
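
A minimal sketch of that update rule in plain PyTorch. The toy data, variable names, and learning rate are my own, not from the lesson:

```python
import torch

# Toy data: X are the independent variables, y the targets.
X = torch.randn(100, 3)
y = X @ torch.tensor([2.0, -1.0, 0.5]) + 1.0

weights = torch.randn(3, requires_grad=True)
bias = torch.zeros(1, requires_grad=True)
lr = 0.1  # learning rate

for _ in range(100):
    pred = X @ weights + bias
    loss = ((pred - y) ** 2).mean()   # the loss depends on X, y, and the weights
    loss.backward()                   # d(loss)/d(weights), d(loss)/d(bias)
    with torch.no_grad():
        weights -= lr * weights.grad  # the update: learning rate * derivative
        bias -= lr * bias.grad
        weights.grad.zero_()
        bias.grad.zero_()
```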
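A small sketch of Softmax and cross-entropy working together, again in plain PyTorch; the logits and targets are made up for illustration:

```python
import torch
import torch.nn.functional as F

# Made-up logits for 2 examples and 3 classes, plus the correct class indices.
logits = torch.tensor([[2.0, 0.5, -1.0],
                       [0.1, 0.2, 3.0]])
targets = torch.tensor([0, 2])

# Softmax squashes each row into values between 0 and 1 that sum to 1.
probs = F.softmax(logits, dim=1)
print(probs.sum(dim=1))  # tensor([1., 1.])

# Cross-entropy is the negative log of the probability given to the correct class.
manual = -torch.log(probs[torch.arange(2), targets]).mean()

# F.cross_entropy takes the raw logits and applies (log-)softmax internally.
builtin = F.cross_entropy(logits, targets)
print(manual.item(), builtin.item())  # same number
```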
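And a sketch of weight decay as a penalty on the sum of squared weights. The wd value is arbitrary, and if I remember right fastai normally handles this for you via its wd argument rather than you writing it out:

```python
import torch

X = torch.randn(100, 3)
y = X @ torch.tensor([2.0, -1.0, 0.5])

weights = torch.randn(3, requires_grad=True)
lr, wd = 0.1, 0.01  # learning rate and an arbitrary weight-decay coefficient

for _ in range(100):
    mse = ((X @ weights - y) ** 2).mean()
    # Penalize complexity: add the scaled sum of squared weights to the loss.
    loss = mse + wd * (weights ** 2).sum()
    loss.backward()
    with torch.no_grad():
        # The penalty's gradient is 2 * wd * weights, so each step also
        # shrinks the weights a little, hence the name "weight decay".
        weights -= lr * weights.grad
        weights.grad.zero_()
```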