Getting Started with Neural Networks
Implementing and training a neural network can be complicated. Below is a checklist for modeling with PyTorch.
Preprocessing
- Input:
- Normalization / mean subtraction; reason: a better-conditioned loss surface makes optimization easier
- Features should be on the same scale
- Outliers/very large values should be capped or rescaled: for large inputs, the gradients of sigmoid and tanh are close to zero (saturation), which slows down optimization
- For CV tasks: resize images to the same dimensions
- For NLP tasks: text cleaning and tokenization
- Output:
- For regression problems, standardize $y$
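The preprocessing steps above can be sketched as follows; the tensor shapes, feature scales, and the clipping threshold are illustrative assumptions:

```python
import torch

torch.manual_seed(0)
X = torch.randn(8, 3) * torch.tensor([1.0, 100.0, 0.01])  # features on very different scales
y = torch.randn(8, 1) * 50 + 10                           # regression target, not centered

# Standardize features column-wise: zero mean, unit variance per feature
X_mean, X_std = X.mean(dim=0), X.std(dim=0)
X_norm = (X - X_mean) / X_std

# Standardize the regression target as well
y_mean, y_std = y.mean(), y.std()
y_norm = (y - y_mean) / y_std

# Cap extreme values so sigmoid/tanh layers do not saturate on outliers
X_norm = X_norm.clamp(-3.0, 3.0)
```

Remember to keep `X_mean`/`X_std` (and `y_mean`/`y_std`) from the training set and reuse them at inference time.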
Network architecture
- Base network
- Activation
- Regularization
- Dropout (torch.nn.X in Dropout Layers)
- Batch normalization (torch.nn.X in Normalization Layers)
- Weight regularization (weight decay, set via the weight_decay argument of torch.optim.X)
- Ordering of batch norm and dropout
- Loss function
- Eval mode: model.eval() disables Dropout and makes BatchNorm use running statistics
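A minimal sketch of these pieces together, assuming a toy MLP whose layer sizes are purely illustrative:

```python
import torch
import torch.nn as nn

# Hypothetical architecture: 20 input features, one regression output
model = nn.Sequential(
    nn.Linear(20, 64),
    nn.BatchNorm1d(64),   # normalize activations over the batch
    nn.ReLU(),
    nn.Dropout(p=0.5),    # one common ordering: batch norm before dropout
    nn.Linear(64, 1),
)

x = torch.randn(16, 20)

model.train()             # Dropout active, BatchNorm uses batch statistics
out_train = model(x)

model.eval()              # Dropout off, BatchNorm uses running statistics
with torch.no_grad():
    out_eval = model(x)
```

In eval mode repeated forward passes on the same input are deterministic; in train mode Dropout makes them vary.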
Optimizer
- Optimizer
- Learning rate
- LR scheduler: remember to call scheduler.step()
- ReduceLROnPlateau
- CyclicLR
- Early stopping
- Sanity check
- LR estimation: run a grid search/random search over $\log(\text{lr})$ for a few epochs
- Check the network can overfit a small batch of data
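The overfit-a-small-batch sanity check, combined with a ReduceLROnPlateau scheduler, might look like this sketch; the model size, learning rate, and step count are arbitrary assumptions:

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# A healthy model/optimizer pair should drive the loss on a single
# tiny batch close to zero (deliberate overfitting).
model = nn.Sequential(nn.Linear(10, 32), nn.ReLU(), nn.Linear(32, 1))
x, y = torch.randn(4, 10), torch.randn(4, 1)  # one small batch

optimizer = torch.optim.Adam(model.parameters(), lr=1e-2)
scheduler = torch.optim.lr_scheduler.ReduceLROnPlateau(
    optimizer, factor=0.5, patience=10
)
loss_fn = nn.MSELoss()

for step in range(300):
    optimizer.zero_grad()
    loss = loss_fn(model(x), y)
    loss.backward()
    optimizer.step()
    scheduler.step(loss.item())  # ReduceLROnPlateau needs the monitored metric
```

If the loss does not approach zero here, debug the model/optimizer before training on the full dataset.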
Dataloader
- Batch size
- Restricted by GPU VRAM
- Very large batch sizes can hurt generalization
- Increasing the batch size has a similar effect to decaying the learning rate
- Dataset
- Dataloader
- torchvision.transforms
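A minimal Dataset/DataLoader sketch; ToyDataset and its shapes are hypothetical:

```python
import torch
from torch.utils.data import Dataset, DataLoader

class ToyDataset(Dataset):
    """Hypothetical in-memory dataset wrapping feature/label tensors."""

    def __init__(self, n=100):
        self.x = torch.randn(n, 3)
        self.y = torch.randint(0, 2, (n,))

    def __len__(self):
        return len(self.x)

    def __getitem__(self, idx):
        return self.x[idx], self.y[idx]

# shuffle=True reshuffles samples each epoch
loader = DataLoader(ToyDataset(), batch_size=16, shuffle=True)
xb, yb = next(iter(loader))
```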
Model Serialization
- state_dict
- Save: torch.save(model.state_dict(), PATH)
- Load into existing model: model.load_state_dict(torch.load(PATH))
- Save/load the entire model (pickles the full object): torch.save(model, PATH); model = torch.load(PATH)
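A sketch of the recommended state_dict round trip; make_model and the temporary path are illustrative:

```python
import os
import tempfile
import torch
import torch.nn as nn

def make_model():
    # The architecture must be re-created before loading a state_dict
    return nn.Sequential(nn.Linear(4, 8), nn.ReLU(), nn.Linear(8, 2))

model = make_model()
path = os.path.join(tempfile.mkdtemp(), "model.pt")

# Persist only the parameters/buffers, not the whole pickled object
torch.save(model.state_dict(), path)

# Re-create the architecture, then load the weights into it
restored = make_model()
restored.load_state_dict(torch.load(path))
restored.eval()
```

Saving the state_dict is more portable than pickling the whole model, which ties the checkpoint to the exact class/module layout at save time.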
Diagnostics
- Get parameters: model.parameters()
- CUDA availability: device = 'cuda' if torch.cuda.is_available() else 'cpu'
- torch version: torch.__version__
TensorBoard
from torch.utils.tensorboard import SummaryWriter
writer = SummaryWriter()
writer.add_scalar("Loss/train", loss, epoch)
Host TensorBoard
tensorboard --logdir PATH
[ssh port forwarding](https://www.ssh.com/academy/ssh/tunneling/example): ssh -L 16006:127.0.0.1:6006 usr@server_ip