Machine Learning

Topic: CNN (Convolutional Neural Networks) – Part 3: Batch Normalization, Dropout, and Regularization

---

1. Batch Normalization (BatchNorm)

• Normalizes layer inputs to improve training speed and stability.

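• It normalizes activations over the mini-batch. The original motivation was to reduce internal covariate shift, i.e. the drift in a layer's input distribution as earlier layers are updated.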

• Formula applied for each batch:

$$
\hat{x} = \frac{x - \mu}{\sqrt{\sigma^2 + \epsilon}} \quad;\quad y = \gamma \hat{x} + \beta
$$

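where $\mu$ and $\sigma^2$ are the mean and variance computed over the current mini-batch (per channel for convolutional layers), $\epsilon$ is a small constant for numerical stability, and $\gamma$ and $\beta$ are learnable scale and shift parameters. At inference time, running estimates of the mean and variance collected during training are used in place of the batch statistics.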
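
To make the formula concrete, here is a minimal sketch in plain Python that applies it to a batch of scalar activations. The function name `batch_norm` and the sample values are illustrative assumptions; a real layer such as `nn.BatchNorm2d` applies the same arithmetic per channel and also tracks running statistics.

```python
import math

def batch_norm(xs, gamma=1.0, beta=0.0, eps=1e-5):
    """Normalize a batch of scalar activations, then scale and shift.

    Hypothetical helper for illustration only (training-mode behaviour).
    """
    n = len(xs)
    mu = sum(xs) / n                               # batch mean
    var = sum((x - mu) ** 2 for x in xs) / n       # biased batch variance
    x_hat = [(x - mu) / math.sqrt(var + eps) for x in xs]
    return [gamma * v + beta for v in x_hat]       # y = gamma * x_hat + beta

batch = [2.0, 4.0, 6.0, 8.0]                       # made-up activations
normalized = batch_norm(batch)
print([round(v, 3) for v in normalized])           # [-1.342, -0.447, 0.447, 1.342]
```

With the default $\gamma = 1$ and $\beta = 0$ the output has mean close to 0 and variance close to 1; other values of $\gamma$ and $\beta$ let the network undo the normalization if that helps.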

---

2. Dropout

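• A regularization technique that randomly "drops out" (zeroes) neuron activations during training to prevent overfitting. At inference time dropout is disabled and all neurons are used.

• The dropout rate (e.g., 0.5) specifies the probability of dropping each neuron on a given forward pass.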
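
The mechanism can be sketched in a few lines of plain Python. This is an illustrative version of "inverted" dropout, where surviving activations are scaled by 1 / (1 - p) so the expected value is unchanged; the helper name `dropout` and the `rng` parameter are assumptions for this example, not a library API.

```python
import random

def dropout(xs, p=0.5, training=True, rng=random):
    """Inverted dropout on a list of activations (illustrative sketch)."""
    assert 0.0 <= p < 1.0, "p is the probability of dropping a unit"
    if not training or p == 0.0:
        return list(xs)                  # no-op at inference time
    keep = 1.0 - p
    # Drop each unit with probability p; rescale the survivors by 1 / keep.
    return [x / keep if rng.random() >= p else 0.0 for x in xs]

rng = random.Random(0)                   # seeded for reproducibility
activations = [1.0] * 10
dropped = dropout(activations, p=0.5, rng=rng)
print(dropped)                           # a mix of 0.0 and 2.0 values
print(dropout(activations, training=False))  # unchanged at inference
```

Because of the rescaling, no extra correction is needed when dropout is switched off for evaluation.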

---

3. Adding BatchNorm and Dropout in PyTorch

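import torch
import torch.nn as nn
import torch.nn.functional as F

class CNNWithBNDropout(nn.Module):
    # Layer sizes assume 1-channel 28x28 inputs (e.g., MNIST).
    def __init__(self):
        super().__init__()
        self.conv1 = nn.Conv2d(1, 32, 3, padding=1)   # 1x28x28 -> 32x28x28
        self.bn1 = nn.BatchNorm2d(32)                  # normalizes each of the 32 channels
        self.dropout = nn.Dropout(0.5)                 # active only in train() mode
        self.pool = nn.MaxPool2d(2, 2)                 # 32x28x28 -> 32x14x14
        self.fc1 = nn.Linear(32 * 14 * 14, 128)
        self.fc2 = nn.Linear(128, 10)                  # 10 class logits

    def forward(self, x):
        x = self.pool(F.relu(self.bn1(self.conv1(x))))  # conv -> BN -> ReLU -> pool
        x = torch.flatten(x, 1)                          # keep batch dim, flatten the rest
        x = F.relu(self.fc1(x))
        x = self.dropout(x)                              # dropout before the output layer
        x = self.fc2(x)                                  # raw logits (no softmax)
        return x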
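
The input size of fc1 (32 * 14 * 14) follows from the layer shapes. The short check below reproduces that arithmetic in plain Python, assuming a 28x28 single-channel input (the source does not state the input size, but the fc1 dimensions imply it). The helper `conv_out_size` is a hypothetical name for the standard output-size formula.

```python
def conv_out_size(size, kernel, stride=1, padding=0):
    """Spatial output size of a conv or pooling layer (no dilation)."""
    return (size + 2 * padding - kernel) // stride + 1

side = 28                                          # assumption: 28x28 input image
side = conv_out_size(side, kernel=3, padding=1)    # conv1 keeps 28x28
side = conv_out_size(side, kernel=2, stride=2)     # 2x2 max pool halves it to 14x14
flat_features = 32 * side * side                   # 32 channels from conv1
print(side, flat_features)                         # 14 6272
```

If the input resolution changes, the same calculation gives the new in_features value for fc1.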

---

4. Why Use BatchNorm and Dropout?

• BatchNorm helps the model converge faster and allows higher learning rates.

• Dropout reduces overfitting by preventing the network from relying too heavily on any single neuron, which discourages co-adaptation between neurons.

---

5. Other Regularization Techniques

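• Weight Decay: Adds an L2 penalty on the weights during optimization, shrinking them toward zero at each update. In PyTorch it is typically set through the optimizer's weight_decay argument.

• Early Stopping: Stops training once the validation loss has failed to improve for a set number of epochs (the "patience"), usually keeping the weights from the best epoch.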
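
Early stopping is easy to sketch without any framework. The function below is an illustrative version with a patience counter; the name `early_stopping_epoch`, the `patience` parameter, and the loss values are all assumptions made up for this example.

```python
def early_stopping_epoch(val_losses, patience=2):
    """Return (stop_epoch, best_epoch) for a sequence of validation losses.

    Training stops once the loss has failed to improve for `patience`
    consecutive epochs. Illustrative sketch, not a library API.
    """
    best = float("inf")
    best_epoch = -1
    bad_epochs = 0
    for epoch, loss in enumerate(val_losses):
        if loss < best:
            best, best_epoch, bad_epochs = loss, epoch, 0   # new best: reset counter
        else:
            bad_epochs += 1
            if bad_epochs >= patience:
                return epoch, best_epoch                    # stop here
    return len(val_losses) - 1, best_epoch                  # never triggered

losses = [0.90, 0.70, 0.60, 0.62, 0.65, 0.70]               # made-up validation losses
stop_epoch, best_epoch = early_stopping_epoch(losses, patience=2)
print(stop_epoch, best_epoch)                               # 4 2
```

In a real training loop the model weights would be checkpointed at best_epoch and restored after stopping.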

---

Summary

• Batch normalization and dropout are widely used tools for training deep CNNs effectively.

• Regularization improves generalization and reduces overfitting.

---

Exercise

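• Modify the CNN above by adding a second dropout layer (for example after the pooling step; dropout is normally not applied to the output of the final layer fc2, since that would zero out class logits) and train it on a dataset to compare results with and without dropout.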

---

#CNN #BatchNormalization #Dropout #Regularization #DeepLearning

https://t.iss.one/DataScienceM



⚡️ How does regularization prevent overfitting?

📈 #machinelearning algorithms have revolutionized the way we solve complex problems and make predictions. These algorithms, however, are prone to a common pitfall known as #overfitting. Overfitting occurs when a model becomes too complex and starts to memorize the training data instead of learning the underlying patterns. As a result, the model performs poorly on unseen data, leading to inaccurate predictions.

📈 To combat overfitting, #regularization techniques have been developed. A common form of regularization adds a penalty term to the loss function during training. This penalty term discourages the model from fitting the training data too closely, which promotes better generalization.
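
As a rough sketch, a penalized objective of this kind can be written as follows. The symbols $J$, $L_{\text{data}}$, $\lambda$, and $\Omega$ are notation chosen here for illustration and do not come from the original post:

$$
J(w) = L_{\text{data}}(w) + \lambda \, \Omega(w)
$$

where $L_{\text{data}}$ is the usual training loss, $\Omega(w)$ measures the size of the weights $w$, and $\lambda \ge 0$ controls how strongly large weights are penalized. Setting $\lambda = 0$ recovers the unregularized loss.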

📈 There are different types of regularization techniques, but two of the most commonly used ones are L1 regularization (#Lasso) and L2 regularization (#Ridge). Both techniques aim to reduce the complexity of the model, but they achieve this in different ways.

📈 L1 regularization adds the sum of absolute values of the model's weights to the loss function. This additional term encourages the model to reduce the magnitude of less important features' weights to zero. In other words, L1 regularization performs feature selection by eliminating irrelevant features. By doing so, it helps prevent overfitting by reducing the complexity of the model and focusing only on the most important features.

📈 On the other hand, L2 regularization adds the sum of squared values of the model's weights to the loss function. Unlike L1 regularization, L2 regularization does not force any weights to become exactly zero. Instead, it shrinks all weights toward zero, penalizing large weights most strongly, so the model is less likely to fit noisy or irrelevant features. L2 regularization helps prevent overfitting by limiting the influence of any single feature while keeping all features in the model.
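
The difference between the two penalties can be seen in a small plain-Python sketch. It computes both penalty terms and applies one shrinkage step for each, ignoring the data loss to isolate the effect of the penalty. Note the assumptions: the L1 step uses soft-thresholding (a proximal update, a technique swapped in here for illustration because it shows the exact zeros clearly), and all function names, weights, and hyperparameters are made up for this example.

```python
def l1_penalty(weights):
    return sum(abs(w) for w in weights)       # sum of absolute values

def l2_penalty(weights):
    return sum(w * w for w in weights)        # sum of squares

def l1_step(weights, lr, lam):
    """Soft-thresholding: move each weight toward zero by lr * lam, clipping at 0."""
    t = lr * lam
    return [max(abs(w) - t, 0.0) * (1 if w > 0 else -1) for w in weights]

def l2_step(weights, lr, lam):
    """Gradient step on lam * sum(w^2): each weight is scaled by (1 - 2 * lr * lam)."""
    return [w * (1 - 2 * lr * lam) for w in weights]

weights = [0.05, -0.5, 2.0]                   # hypothetical weights
after_l1 = l1_step(weights, lr=0.1, lam=1.0)
after_l2 = l2_step(weights, lr=0.1, lam=1.0)
print(l1_penalty(weights), l2_penalty(weights))
print(after_l1)                               # the smallest weight becomes exactly 0.0
print(after_l2)                               # every weight shrinks, none reaches zero
```

The L1 step removes the same fixed amount from every weight, so small weights hit zero, while the L2 step shrinks each weight in proportion to its size and never zeroes it out.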

📈 Regularization techniques strike a balance between fitting the training data well and keeping the model's weights small. By adding a regularization term to the loss function, these techniques introduce a trade-off that prevents the model from being overly complex and overly sensitive to the training data. This trade-off helps the model generalize better and perform well on unseen data.

📈 Regularization techniques have become an essential tool in the machine learning toolbox. They provide a means to prevent overfitting and improve the generalization capabilities of models. By striking a balance between fitting the training data and reducing complexity, regularization techniques help create models that can make accurate predictions on unseen data.

📚 Reference: Hands-On Machine Learning with Scikit-Learn and TensorFlow: Concepts, Tools, and Techniques to Build Intelligent Systems by Aurélien Géron

https://t.iss.one/DataScienceM
