Topic: CNN (Convolutional Neural Networks) – Part 3: Batch Normalization, Dropout, and Regularization
---
1. Batch Normalization (BatchNorm)
• Normalizes layer inputs to improve training speed and stability.
• It reduces internal covariate shift by normalizing activations over the batch.
• Formula applied for each batch:
$$
\hat{x} = \frac{x - \mu}{\sqrt{\sigma^2 + \epsilon}} \quad;\quad y = \gamma \hat{x} + \beta
$$
where $\mu$ and $\sigma^2$ are the mean and variance computed over the batch, and $\gamma$, $\beta$ are learnable scale and shift parameters (a quick numeric check follows below).
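As a sanity check of the formula, here is a minimal sketch (my addition, not part of the original post) that normalizes a tiny batch by hand and compares it with nn.BatchNorm1d; with $\gamma$ and $\beta$ at their defaults (1 and 0), the two results should match:

import torch
import torch.nn as nn

# A tiny batch of 4 samples with 2 features (made-up values, for illustration only)
x = torch.tensor([[1.0, 2.0],
                  [2.0, 4.0],
                  [3.0, 6.0],
                  [4.0, 8.0]])

# Manual normalization: (x - mu) / sqrt(var + eps), per feature, over the batch
eps = 1e-5
mu = x.mean(dim=0)
var = x.var(dim=0, unbiased=False)        # BatchNorm normalizes with the biased variance
x_hat = (x - mu) / torch.sqrt(var + eps)

# nn.BatchNorm1d starts with gamma = 1 and beta = 0, so in training mode it should agree
bn = nn.BatchNorm1d(num_features=2)
bn.train()
y = bn(x)
print(torch.allclose(x_hat, y, atol=1e-4))  # expected: True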
---
2. Dropout
• A regularization technique that randomly "drops out" neurons during training to prevent overfitting.
• The dropout rate p (e.g., 0.5) is the probability that each activation is zeroed during training; at evaluation time dropout is disabled (see the small demo below).
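A minimal demo of that train/eval behaviour with PyTorch's nn.Dropout (my addition; PyTorch uses inverted dropout, so the surviving activations are scaled by 1/(1 - p) during training):

import torch
import torch.nn as nn

drop = nn.Dropout(p=0.5)   # each activation is zeroed with probability 0.5
x = torch.ones(1, 8)       # a toy activation vector

drop.train()               # training mode: random zeros, survivors scaled by 1/(1 - p) = 2
print(drop(x))             # e.g. tensor([[2., 0., 2., 2., 0., 0., 2., 2.]]) (random pattern)

drop.eval()                # evaluation mode: dropout is a no-op
print(drop(x))             # tensor([[1., 1., 1., 1., 1., 1., 1., 1.]])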
---
3. Adding BatchNorm and Dropout in PyTorch
import torch.nn as nn
import torch.nn.functional as F

class CNNWithBNDropout(nn.Module):
    def __init__(self):
        super(CNNWithBNDropout, self).__init__()
        self.conv1 = nn.Conv2d(1, 32, 3, padding=1)  # 1 input channel -> 32 feature maps
        self.bn1 = nn.BatchNorm2d(32)                # normalizes each of the 32 channels
        self.dropout = nn.Dropout(0.5)               # zeroes 50% of activations during training
        self.pool = nn.MaxPool2d(2, 2)               # halves spatial size: 28x28 -> 14x14
        self.fc1 = nn.Linear(32 * 14 * 14, 128)
        self.fc2 = nn.Linear(128, 10)                # 10 output classes

    def forward(self, x):
        x = self.pool(F.relu(self.bn1(self.conv1(x))))  # conv -> BN -> ReLU -> pool
        x = x.view(-1, 32 * 14 * 14)                    # flatten
        x = F.relu(self.fc1(x))
        x = self.dropout(x)                             # dropout after the first FC layer
        x = self.fc2(x)
        return x
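A quick way to check the output shape and the train/eval switching (my addition, assuming MNIST-style 1x28x28 grayscale inputs, which is what the 32 * 14 * 14 flatten implies):

import torch

model = CNNWithBNDropout()

# Training mode: BatchNorm uses batch statistics, Dropout is active
model.train()
dummy = torch.randn(16, 1, 28, 28)   # a batch of 16 grayscale 28x28 images
print(model(dummy).shape)            # torch.Size([16, 10])

# Evaluation mode: BatchNorm uses running statistics, Dropout is disabled
model.eval()
with torch.no_grad():
    logits = model(dummy)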
---
4. Why Use BatchNorm and Dropout?
• BatchNorm helps the model converge faster and allows higher learning rates.
• Dropout helps reduce overfitting by making the network less sensitive to specific neuron weights.
---
5. Other Regularization Techniques
• Weight Decay: adds an L2 penalty on the weights during optimization, usually via the optimizer's weight_decay argument.
• Early Stopping: stops training when the validation loss stops improving for a set number of epochs (a minimal sketch of both follows below).
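A minimal sketch of both techniques (my illustration; train_one_epoch, evaluate_val_loss, train_loader, and val_loader are placeholders for whatever training loop you already use):

import torch

model = CNNWithBNDropout()

# Weight decay: an L2 penalty applied by the optimizer to all parameters
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3, weight_decay=1e-4)

# Early stopping: track the best validation loss and stop after `patience` bad epochs
best_val_loss = float("inf")
patience, bad_epochs = 5, 0

for epoch in range(100):
    train_one_epoch(model, train_loader, optimizer)    # placeholder training step
    val_loss = evaluate_val_loss(model, val_loader)    # placeholder validation step

    if val_loss < best_val_loss:
        best_val_loss = val_loss
        bad_epochs = 0
        torch.save(model.state_dict(), "best_model.pt")  # checkpoint the best model
    else:
        bad_epochs += 1
        if bad_epochs >= patience:
            print(f"Early stopping at epoch {epoch}")
            break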
---
Summary
• Batch normalization and dropout are essential tools for training deep CNNs effectively.
• Regularization improves generalization and reduces overfitting.
---
Exercise
• Modify the CNN above by adding dropout after the second fully connected layer and train it on a dataset to compare results with/without dropout.
---
#CNN #BatchNormalization #Dropout #Regularization #DeepLearning
https://t.iss.one/DataScienceM