Topic: CNN (Convolutional Neural Networks) – Part 2: Layers, Padding, Stride, and Activation Functions
---
1. Convolutional Layer Parameters
• Kernel (Filter) Size: Size of the sliding window (e.g., 3x3, 5x5).
• Stride: Number of pixels the filter moves at each step. A larger stride produces a smaller output.
• Padding: Adding zeros around the border of the input to control the output size (see the sketch after this list).
* Valid padding: No padding; the output is smaller than the input.
* Same padding: Pads the input so the output size equals the input size.
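To make these settings concrete, here is a minimal sketch (the 28x28 single-channel input size is an assumption) comparing valid padding, same padding, and a larger stride in PyTorch:

import torch
import torch.nn as nn

x = torch.randn(1, 1, 28, 28)                        # dummy 28x28 grayscale input (assumed size)

valid   = nn.Conv2d(1, 8, kernel_size=3, padding=0)  # valid padding: no zeros added
same    = nn.Conv2d(1, 8, kernel_size=3, padding=1)  # same padding for a 3x3 kernel with stride 1
strided = nn.Conv2d(1, 8, kernel_size=3, stride=2, padding=1)

print(valid(x).shape)    # torch.Size([1, 8, 26, 26])
print(same(x).shape)     # torch.Size([1, 8, 28, 28])
print(strided(x).shape)  # torch.Size([1, 8, 14, 14])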
---
2. Calculating Output Size
For input size $N$, filter size $F$, padding $P$, stride $S$:
$$
\text{Output size} = \left\lfloor \frac{N - F + 2P}{S} \right\rfloor + 1
$$
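As a quick check of the formula, a small helper (the name conv_output_size is just illustrative) reproduces the sizes used in the example later in this post:

def conv_output_size(n, f, p, s):
    # floor((N - F + 2P) / S) + 1
    return (n - f + 2 * p) // s + 1

print(conv_output_size(28, 3, 1, 1))  # 28 (same padding, stride 1)
print(conv_output_size(28, 3, 0, 1))  # 26 (valid padding, stride 1)
print(conv_output_size(28, 3, 1, 2))  # 14 (stride 2)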
---
3. Activation Functions
• ReLU (Rectified Linear Unit): The most common choice; outputs zero for negative inputs and passes positive inputs through unchanged.
• Other activations: Sigmoid, Tanh, Leaky ReLU (compared in the sketch below).
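A quick comparison on a small tensor, using PyTorch's built-in activations:

import torch
import torch.nn.functional as F

x = torch.tensor([-2.0, -0.5, 0.0, 1.0, 3.0])

print(F.relu(x))                            # negatives clipped to 0, positives unchanged
print(torch.sigmoid(x))                     # squashed into (0, 1)
print(torch.tanh(x))                        # squashed into (-1, 1)
print(F.leaky_relu(x, negative_slope=0.1))  # small slope for negatives instead of zero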
---
4. Pooling Layers
• Reduces spatial dimensions to lower computational cost.
• Max Pooling: Takes the maximum value in a window.
• Average Pooling: Takes the average value in a window (see the sketch after this list).
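A minimal sketch contrasting the two on a 4x4 feature map:

import torch
import torch.nn as nn

x = torch.arange(16.0).reshape(1, 1, 4, 4)   # 4x4 feature map with values 0..15

max_pool = nn.MaxPool2d(kernel_size=2, stride=2)
avg_pool = nn.AvgPool2d(kernel_size=2, stride=2)

print(max_pool(x))   # 2x2 output: maximum of each 2x2 window
print(avg_pool(x))   # 2x2 output: mean of each 2x2 window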
---
5. Example PyTorch CNN with Padding and Stride
import torch.nn as nn
import torch.nn.functional as F

class CNNWithPadding(nn.Module):
    def __init__(self):
        super(CNNWithPadding, self).__init__()
        self.conv1 = nn.Conv2d(1, 16, kernel_size=3, stride=1, padding=1)   # same padding: output same size as input
        self.pool = nn.MaxPool2d(2, 2)
        self.conv2 = nn.Conv2d(16, 32, kernel_size=3, stride=1, padding=0)  # valid padding
        self.fc1 = nn.Linear(32 * 12 * 12, 10)                              # matches the 12x12 maps after conv2

    def forward(self, x):
        x = self.pool(F.relu(self.conv1(x)))   # 28x28 -> 28x28 -> 14x14 after pooling
        x = F.relu(self.conv2(x))              # 14x14 -> 12x12
        x = x.view(-1, 32 * 12 * 12)
        x = self.fc1(x)
        return x
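A quick sanity check (a sketch, assuming 28x28 single-channel inputs such as MNIST): pass a dummy batch through the model and confirm the output shape.

import torch

model = CNNWithPadding()
dummy = torch.randn(4, 1, 28, 28)   # batch of 4 grayscale 28x28 images (assumed input size)
out = model(dummy)
print(out.shape)                    # torch.Size([4, 10])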
---
6. Summary
• Padding and stride control output dimensions of convolution layers.
• ReLU is widely used for non-linearity.
• Pooling layers reduce spatial dimensionality, cutting computation and adding some tolerance to small translations.
---
Exercise
• Modify the example above to add a third convolutional layer with stride 2 and observe output sizes.
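One possible starting point (a sketch, not the only solution; the class name CNNThreeConv and the channel counts are assumptions):

import torch
import torch.nn as nn
import torch.nn.functional as F

class CNNThreeConv(nn.Module):
    def __init__(self):
        super().__init__()
        self.conv1 = nn.Conv2d(1, 16, kernel_size=3, stride=1, padding=1)   # 28x28 -> 28x28
        self.pool = nn.MaxPool2d(2, 2)                                      # 28x28 -> 14x14
        self.conv2 = nn.Conv2d(16, 32, kernel_size=3, stride=1, padding=0)  # 14x14 -> 12x12
        self.conv3 = nn.Conv2d(32, 64, kernel_size=3, stride=2, padding=1)  # 12x12 -> 6x6
        self.fc1 = nn.Linear(64 * 6 * 6, 10)

    def forward(self, x):
        x = self.pool(F.relu(self.conv1(x)))
        x = F.relu(self.conv2(x))
        x = F.relu(self.conv3(x))        # observe: shape is now (batch, 64, 6, 6)
        x = x.view(-1, 64 * 6 * 6)
        return self.fc1(x)

print(CNNThreeConv()(torch.randn(1, 1, 28, 28)).shape)   # torch.Size([1, 10])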
---
#CNN #DeepLearning #ActivationFunctions #Padding #Stride
https://t.iss.one/DataScience4