Topic: CNN (Convolutional Neural Networks) – Part 4: Training, Loss Functions, and Evaluation Metrics
---
1. Preparing for Training
To train a CNN, we need:
• Dataset – Typically image data with labels (e.g., MNIST, CIFAR-10).
• Loss Function – Measures the difference between predicted and actual values.
• Optimizer – Updates model weights based on gradients.
• Evaluation Metrics – Accuracy, precision, recall, F1 score, etc.
---
2. Common Loss Functions for CNNs
• CrossEntropyLoss – For multi-class classification (most common).
criterion = nn.CrossEntropyLoss()
• BCELoss – For binary classification; a minimal example follows below.
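For binary outputs, BCELoss expects probabilities from a sigmoid; in practice BCEWithLogitsLoss, which fuses the sigmoid with the loss, is the more numerically stable choice. A minimal sketch with made-up tensors (the shapes are illustrative, not from this tutorial):
import torch
import torch.nn as nn

criterion_bin = nn.BCEWithLogitsLoss()          # sigmoid + binary cross-entropy in one step
logits = torch.randn(8, 1)                      # fake batch of 8 raw model outputs
targets = torch.randint(0, 2, (8, 1)).float()   # binary labels must be floats
loss = criterion_bin(logits, targets)
print(loss.item())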
---
3. Optimizers
• SGD (Stochastic Gradient Descent) – simple, classic baseline; a sketch follows after this list.
• Adam – Adaptive learning rate; widely used for faster convergence.
optimizer = torch.optim.Adam(model.parameters(), lr=0.001)
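For comparison, plain SGD is set up the same way; the learning rate and momentum below are illustrative defaults, not tuned values:
optimizer = torch.optim.SGD(model.parameters(), lr=0.01, momentum=0.9)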
---
4. Basic Training Loop in PyTorch
for epoch in range(num_epochs):
    model.train()
    running_loss = 0.0
    for images, labels in train_loader:
        optimizer.zero_grad()              # clear gradients from the previous step
        outputs = model(images)            # forward pass
        loss = criterion(outputs, labels)
        loss.backward()                    # backpropagation
        optimizer.step()                   # weight update
        running_loss += loss.item()
    print(f"Epoch {epoch+1}, Loss: {running_loss:.4f}")
---
5. Evaluating the Model
correct = 0
total = 0
model.eval()
with torch.no_grad():
    for images, labels in test_loader:
        outputs = model(images)
        _, predicted = torch.max(outputs, 1)   # index of the largest logit = predicted class
        total += labels.size(0)
        correct += (predicted == labels).sum().item()

accuracy = 100 * correct / total
print(f"Test Accuracy: {accuracy:.2f}%")
---
6. Tips for Better CNN Training
• Normalize images.
• Shuffle training data for better generalization.
• Use validation sets to monitor overfitting.
• Save checkpoints with torch.save(model.state_dict(), path); a minimal save/load sketch follows.
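A minimal save/load sketch; the filename cnn_checkpoint.pth is just an illustrative choice:
torch.save(model.state_dict(), "cnn_checkpoint.pth")      # save the learned parameters only

# Later: rebuild the same architecture, then restore the weights
model.load_state_dict(torch.load("cnn_checkpoint.pth"))
model.eval()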
---
Summary
• CNN training involves feeding batches of images, computing loss, backpropagation, and updating weights.
• Evaluation metrics like accuracy help track progress.
• Loss functions and optimizers are critical for learning quality.
---
Exercise
• Train a CNN on CIFAR-10 for 10 epochs using CrossEntropyLoss and Adam, then print accuracy and plot loss over epochs; a data-loading starter sketch follows.
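A minimal starting sketch for loading CIFAR-10 with torchvision; the batch size and the simple 0.5/0.5 normalization are illustrative assumptions:
import torch
import torchvision
import torchvision.transforms as transforms

transform = transforms.Compose([
    transforms.ToTensor(),
    transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5)),  # scale images to roughly [-1, 1]
])
train_set = torchvision.datasets.CIFAR10(root="./data", train=True, download=True, transform=transform)
test_set = torchvision.datasets.CIFAR10(root="./data", train=False, download=True, transform=transform)
train_loader = torch.utils.data.DataLoader(train_set, batch_size=64, shuffle=True)
test_loader = torch.utils.data.DataLoader(test_set, batch_size=64, shuffle=False)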
---
#CNN #DeepLearning #Training #LossFunction #ModelEvaluation
https://t.iss.one/DataScienceM
# 📚 PyTorch Tutorial for Beginners - Part 2/6: Deep Neural Networks & Training Techniques
#PyTorch #DeepLearning #MachineLearning #NeuralNetworks #Training
Welcome to Part 2 of our comprehensive PyTorch series! This lesson dives deep into building and training neural networks, covering architectures, activation functions, optimization, and more.
---
## 🔹 Recap & Setup
import torch
import torch.nn as nn
import torch.optim as optim
import matplotlib.pyplot as plt
from torch.utils.data import DataLoader, TensorDataset
# Check GPU
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
print(f"Using device: {device}")
---
## 🔹 Deep Neural Network (DNN) Architecture
### 1. Key Components
| Component | Purpose | PyTorch Implementation |
|---------------|--------------------------------------|-------------------------------------|
| Input Layer | Receives raw features | nn.Linear(input_dim, hidden_dim) |
| Hidden Layers | Learn hierarchical representations | Multiple nn.Linear + activation |
| Output Layer | Produces final predictions | nn.Linear(hidden_dim, output_dim) |
| Activation | Introduces non-linearity | nn.ReLU(), nn.Sigmoid(), etc. |
| Loss Function | Measures prediction error | nn.MSELoss(), nn.CrossEntropyLoss() |
| Optimizer | Updates weights to minimize loss | optim.SGD(), optim.Adam() |

### 2. Building a DNN
class DNN(nn.Module):
    def __init__(self, input_size, hidden_sizes, output_size):
        super().__init__()
        layers = []
        # Hidden layers
        prev_size = input_size
        for hidden_size in hidden_sizes:
            layers.append(nn.Linear(prev_size, hidden_size))
            layers.append(nn.ReLU())
            prev_size = hidden_size
        # Output layer (no activation for regression)
        layers.append(nn.Linear(prev_size, output_size))
        self.net = nn.Sequential(*layers)

    def forward(self, x):
        return self.net(x)

# Example: 3-layer network (input=10, hidden=[64,32], output=1)
model = DNN(10, [64, 32], 1).to(device)
print(model)
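As a quick sanity check, here is a minimal sketch of one training step on random data with this model; the MSE loss, Adam optimizer, and tensor shapes are illustrative assumptions:
x = torch.randn(16, 10).to(device)    # fake batch: 16 samples, 10 features
y = torch.randn(16, 1).to(device)     # fake regression targets

criterion = nn.MSELoss()
optimizer = optim.Adam(model.parameters(), lr=1e-3)

optimizer.zero_grad()
loss = criterion(model(x), y)         # forward pass + error
loss.backward()                       # backpropagation
optimizer.step()                      # weight update
print(f"One-step loss: {loss.item():.4f}")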
---
## 🔹 Activation Functions
### 1. Common Choices
| Activation | Formula | Range | Use Case | PyTorch |
|------------|----------------------------------|----------|------------------------------|------------------|
| ReLU | max(0, x) | [0, ∞) | Hidden layers | nn.ReLU() |
| Leaky ReLU | max(0.01x, x) | (-∞, ∞) | Avoid dead neurons | nn.LeakyReLU() |
| Sigmoid | 1 / (1 + e^(-x)) | (0, 1) | Binary classification | nn.Sigmoid() |
| Tanh | (e^x - e^(-x)) / (e^x + e^(-x)) | (-1, 1) | RNNs, some hidden layers | nn.Tanh() |
| Softmax | e^x / sum(e^x) | (0, 1) | Multi-class classification | nn.Softmax() |

### 2. Visual Comparison
x = torch.linspace(-5, 5, 100)
activations = {
    "ReLU": nn.ReLU()(x),
    "LeakyReLU": nn.LeakyReLU(0.1)(x),
    "Sigmoid": nn.Sigmoid()(x),
    "Tanh": nn.Tanh()(x)
}

plt.figure(figsize=(12, 4))
for i, (name, y) in enumerate(activations.items()):
    plt.subplot(1, 4, i+1)
    plt.plot(x.numpy(), y.numpy())
    plt.title(name)
plt.tight_layout()
plt.show()
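Softmax is left out of the plot because it normalizes a whole vector of class scores rather than acting elementwise; a minimal sketch on a fake batch of logits (the shapes are illustrative):
logits = torch.randn(4, 3)            # fake batch: 4 samples, 3 classes
probs = nn.Softmax(dim=1)(logits)     # normalize across the class dimension
print(probs.sum(dim=1))               # each row sums to 1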
---
🔥 Trending Repository: comprehensive-rust
📝 Description: This is the Rust course used by the Android team at Google. It provides you the material to quickly teach Rust.
🔗 Repository URL: https://github.com/google/comprehensive-rust
🌐 Website: https://google.github.io/comprehensive-rust/
📖 Readme: https://github.com/google/comprehensive-rust#readme
📊 Statistics:
🌟 Stars: 30.8K stars
👀 Watchers: 149
🍴 Forks: 1.8K forks
💻 Programming Languages: Rust - JavaScript - TypeScript - Handlebars - Assembly - Shell
🏷️ Related Topics:
#android #training #rust #classroom #google #course #guide #training_materials
==================================
🧠 By: https://t.iss.one/DataScienceM
Part 5: Training the Model
We train the model using the fit() method, providing our training data, batch size, number of epochs, and validation data to monitor performance on unseen data.

history = model.fit(x_train, y_train,
                    epochs=15,
                    batch_size=64,
                    validation_data=(x_test, y_test))
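Since validation data is already passed in, training can also be stopped automatically when val_loss stops improving; a minimal sketch using Keras' EarlyStopping callback (not part of the original code, and the patience setting is an illustrative choice):
from tensorflow.keras.callbacks import EarlyStopping

early_stop = EarlyStopping(monitor='val_loss', patience=3, restore_best_weights=True)
history = model.fit(x_train, y_train,
                    epochs=15,
                    batch_size=64,
                    validation_data=(x_test, y_test),
                    callbacks=[early_stop])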
#Training #MachineLearning #ModelFit
---
Part 6: Evaluating and Discussing Results
After training, we evaluate the model's performance on the test set. We also plot the training history to visualize accuracy and loss curves. This helps us understand if the model is overfitting or underfitting.
# Evaluate the model on the test data
test_loss, test_acc = model.evaluate(x_test, y_test, verbose=2)
print(f'\nTest accuracy: {test_acc:.4f}')
# Plot training & validation accuracy values
plt.figure(figsize=(12, 4))
plt.subplot(1, 2, 1)
plt.plot(history.history['accuracy'])
plt.plot(history.history['val_accuracy'])
plt.title('Model accuracy')
plt.ylabel('Accuracy')
plt.xlabel('Epoch')
plt.legend(['Train', 'Test'], loc='upper left')
# Plot training & validation loss values
plt.subplot(1, 2, 2)
plt.plot(history.history['loss'])
plt.plot(history.history['val_loss'])
plt.title('Model loss')
plt.ylabel('Loss')
plt.xlabel('Epoch')
plt.legend(['Train', 'Test'], loc='upper left')
plt.show()
Discussion:
The plots show how accuracy and loss change over epochs. Ideally, both training and validation accuracy should increase, while losses decrease. If the validation accuracy plateaus or decreases while training accuracy continues to rise, it's a sign of overfitting. Our simple model achieves a decent accuracy. To improve it, one could use techniques like Data Augmentation, Dropout layers, or a deeper architecture.
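As a concrete illustration of those suggestions, here is a minimal sketch that adds augmentation layers and Dropout; the 32x32x3 input shape and the layer sizes are illustrative assumptions, not the tutorial's exact architecture:
from tensorflow.keras import layers, models

# Augmentation layers are active only during training
data_augmentation = models.Sequential([
    layers.RandomFlip("horizontal", input_shape=(32, 32, 3)),
    layers.RandomRotation(0.1),
    layers.RandomZoom(0.1),
])

improved_model = models.Sequential([
    data_augmentation,
    layers.Conv2D(32, (3, 3), activation='relu'),
    layers.MaxPooling2D((2, 2)),
    layers.Conv2D(64, (3, 3), activation='relu'),
    layers.MaxPooling2D((2, 2)),
    layers.Flatten(),
    layers.Dense(128, activation='relu'),
    layers.Dropout(0.5),              # randomly drop half the units to curb overfitting
    layers.Dense(10, activation='softmax'),
])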
#Evaluation #Results #Accuracy #Overfitting
---
Part 7: Making Predictions on a Single Image
This is how you handle a single image file for prediction. The model expects a batch of images as input, so we must add an extra dimension to our single image before passing it to model.predict().

# Select a single image from the test set
img_index = 15
test_image = x_test[img_index]
true_label_index = np.argmax(y_test[img_index])
# Display the image
plt.imshow(test_image)
plt.title(f"Actual Label: {class_names[true_label_index]}")
plt.show()
# The model expects a batch of images, so we add a dimension
image_for_prediction = np.expand_dims(test_image, axis=0)
print("Image shape before prediction:", test_image.shape)
print("Image shape after adding batch dimension:", image_for_prediction.shape)
# Make a prediction
predictions = model.predict(image_for_prediction)
predicted_label_index = np.argmax(predictions[0])
# Print the result
print(f"\nPrediction Probabilities: {predictions[0]}")
print(f"Predicted Label: {class_names[predicted_label_index]}")
print(f"Actual Label: {class_names[true_label_index]}")
#Prediction #ImageProcessing #Inference
━━━━━━━━━━━━━━━
By: @DataScienceM ✨