Topic: Python – Reading Images from Datasets and Organizing Them (Part 2): Using PyTorch and TensorFlow Data Loaders
---
1. Using PyTorch’s `ImageFolder` and `DataLoader`
torchvision's ImageFolder loads images from a directory that contains one subfolder per class, and PyTorch's DataLoader handles batching and shuffling.
from torchvision import datasets, transforms
from torch.utils.data import DataLoader

# Define transformations (resize, normalize, convert to tensor)
transform = transforms.Compose([
    transforms.Resize((128, 128)),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406],
                         std=[0.229, 0.224, 0.225])
])
dataset = datasets.ImageFolder(root='dataset/', transform=transform)

# Create DataLoader for batching and shuffling
dataloader = DataLoader(dataset, batch_size=32, shuffle=True)

# Access class names
class_names = dataset.classes
print(class_names)
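ImageFolder assigns integer labels by sorting the class subfolder names, which is why dataset.classes comes back in alphabetical order. The sketch below mimics that mapping with the standard library only, so it runs without PyTorch. The folder names and the helper class_index_from_folders are made up for illustration; this is not torchvision's own code.

```python
import os
import tempfile

def class_index_from_folders(root):
    """Map each class subfolder name to an integer label, in sorted order."""
    classes = sorted(
        entry.name for entry in os.scandir(root) if entry.is_dir()
    )
    return classes, {name: idx for idx, name in enumerate(classes)}

# Build a throwaway directory tree: one subfolder per (hypothetical) class.
with tempfile.TemporaryDirectory() as root:
    for name in ["dogs", "cats", "birds"]:
        os.makedirs(os.path.join(root, name))
    classes, class_to_idx = class_index_from_folders(root)

print(classes)       # ['birds', 'cats', 'dogs']
print(class_to_idx)  # {'birds': 0, 'cats': 1, 'dogs': 2}
```

Renaming or adding a class folder can therefore change the label indices, so it is worth saving the mapping alongside a trained model.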
---
2. Iterating Through DataLoader
for images, labels in dataloader:
    print(images.shape)  # (batch_size, 3, 128, 128)
    print(labels)
    # Use images and labels for training or validation
    break
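When the dataset size is not a multiple of batch_size, the last batch is smaller than the others unless drop_last=True is passed to DataLoader. The plain-Python sketch below only illustrates that arithmetic; batch_sizes is a hypothetical helper, not part of PyTorch, and the sample count of 70 is made up.

```python
import math

def batch_sizes(num_samples, batch_size, drop_last=False):
    """Return the size of each batch a loader would yield over one epoch."""
    full, remainder = divmod(num_samples, batch_size)
    sizes = [batch_size] * full
    if remainder and not drop_last:
        sizes.append(remainder)  # the final, smaller batch
    return sizes

sizes = batch_sizes(num_samples=70, batch_size=32)
print(sizes)                                # [32, 32, 6]
print(len(sizes) == math.ceil(70 / 32))     # True
print(batch_sizes(70, 32, drop_last=True))  # [32, 32]
```

So the first dimension of images.shape is not always equal to batch_size.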
---
3. Using TensorFlow `image_dataset_from_directory`
TensorFlow Keras also provides utilities for loading datasets organized in folders.
import tensorflow as tf

dataset = tf.keras.preprocessing.image_dataset_from_directory(
    'dataset/',
    image_size=(128, 128),
    batch_size=32,
    label_mode='int'  # can be 'categorical', 'binary', or None
)
class_names = dataset.class_names
print(class_names)

for images, labels in dataset.take(1):
    print(images.shape)
    print(labels)
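The label_mode argument controls the label format: 'int' yields integer class indices, while 'categorical' yields one-hot vectors. Here is a small sketch of the difference in plain Python, using a made-up three-class example. The helper to_one_hot is hypothetical and is not a TensorFlow function.

```python
def to_one_hot(label, num_classes):
    """Encode an integer class index as a one-hot list of floats."""
    vector = [0.0] * num_classes
    vector[label] = 1.0
    return vector

# What a batch of labels looks like with label_mode='int'
int_labels = [0, 2, 1]

# The same labels in the format label_mode='categorical' would use
categorical = [to_one_hot(label, 3) for label in int_labels]
print(categorical)
# [[1.0, 0.0, 0.0], [0.0, 0.0, 1.0], [0.0, 1.0, 0.0]]
```

Integer labels pair with a sparse categorical cross-entropy loss, while one-hot labels pair with categorical cross-entropy.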
---
4. Dataset Splitting
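With TensorFlow, you can split a dataset into training and validation sets by passing validation_split and subset. Use the same seed in both calls so the two subsets do not overlap: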
train_ds = tf.keras.preprocessing.image_dataset_from_directory(
    'dataset/',
    validation_split=0.2,
    subset="training",
    seed=123,
    image_size=(128, 128),
    batch_size=32
)
val_ds = tf.keras.preprocessing.image_dataset_from_directory(
    'dataset/',
    validation_split=0.2,
    subset="validation",
    seed=123,
    image_size=(128, 128),
    batch_size=32
)
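Both calls share the same seed and validation_split, which is what keeps the two subsets disjoint. The idea can be sketched in plain Python: shuffle the file list deterministically, then slice it. This illustrates the principle only and is not TensorFlow's internal code; split_paths and the file names are hypothetical.

```python
import random

def split_paths(paths, validation_split, seed, subset):
    """Deterministically shuffle paths, then return one side of the split."""
    shuffled = list(paths)
    random.Random(seed).shuffle(shuffled)  # same seed gives the same order
    num_val = int(len(shuffled) * validation_split)
    if subset == "validation":
        return shuffled[-num_val:] if num_val else []
    return shuffled[:len(shuffled) - num_val]

paths = [f"img_{i:03d}.jpg" for i in range(100)]
train = split_paths(paths, 0.2, seed=123, subset="training")
val = split_paths(paths, 0.2, seed=123, subset="validation")

print(len(train), len(val))        # 80 20
print(set(train).isdisjoint(val))  # True
```

With different seeds, the two calls would shuffle differently and some files could land in both subsets.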
---
5. Summary
• PyTorch’s ImageFolder + DataLoader offers a quick way to load and batch datasets.
• TensorFlow’s image_dataset_from_directory provides similar high-level dataset loading.
• Both allow easy transformations, batching, and shuffling.
---
Exercise
• Write code to normalize images in a TensorFlow dataset using map() with Rescaling.
---
#Python #DatasetHandling #PyTorch #TensorFlow #ImageProcessing
https://t.iss.one/DataScience4
Topic: Python – Reading Images from Datasets and Organizing Them (Part 3): Custom Dataset Class and Data Augmentation
---
1. Creating a Custom Dataset Class (PyTorch)
Sometimes you need more control over how images and labels are loaded and processed. You can create a custom dataset class by extending torch.utils.data.Dataset.
import os
from PIL import Image
from torch.utils.data import Dataset

class CustomImageDataset(Dataset):
    def __init__(self, root_dir, transform=None):
        self.root_dir = root_dir
        self.transform = transform
        self.image_paths = []
        self.labels = []
        # Treat each subfolder of root_dir as one class; skip stray files
        classes = sorted(
            name for name in os.listdir(root_dir)
            if os.path.isdir(os.path.join(root_dir, name))
        )
        self.class_to_idx = {cls_name: idx for idx, cls_name in enumerate(classes)}
        # Collect every file path together with its class index
        for cls_name in classes:
            cls_dir = os.path.join(root_dir, cls_name)
            for img_name in os.listdir(cls_dir):
                img_path = os.path.join(cls_dir, img_name)
                self.image_paths.append(img_path)
                self.labels.append(self.class_to_idx[cls_name])

    def __len__(self):
        return len(self.image_paths)

    def __getitem__(self, idx):
        img_path = self.image_paths[idx]
        # Convert to RGB so every image has three channels
        image = Image.open(img_path).convert("RGB")
        label = self.labels[idx]
        if self.transform:
            image = self.transform(image)
        return image, label
---
2. Using Data Augmentation with `transforms`
Data augmentation helps improve model generalization by artificially increasing dataset diversity.
from torchvision import transforms

transform = transforms.Compose([
    transforms.Resize((128, 128)),
    transforms.RandomHorizontalFlip(),
    transforms.RandomRotation(10),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406],
                         std=[0.229, 0.224, 0.225])
])
Pass this transform to the custom dataset:
dataset = CustomImageDataset(root_dir='dataset/', transform=transform)
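ToTensor() scales pixel values to the range [0, 1], and Normalize then computes (x - mean) / std for each channel. Here is a worked example in plain Python for single pixels, using the mean and std values from the transform above. The pixel values and the helper normalize_pixel are made up for illustration.

```python
mean = [0.485, 0.456, 0.406]
std = [0.229, 0.224, 0.225]

def normalize_pixel(pixel, mean, std):
    """Apply (x - mean) / std to each channel of one RGB pixel."""
    return [(x - m) / s for x, m, s in zip(pixel, mean, std)]

white = [1.0, 1.0, 1.0]  # a pure white pixel after ToTensor()
black = [0.0, 0.0, 0.0]  # a pure black pixel after ToTensor()

white_norm = [round(v, 3) for v in normalize_pixel(white, mean, std)]
black_norm = [round(v, 3) for v in normalize_pixel(black, mean, std)]
print(white_norm)  # [2.249, 2.429, 2.64]
print(black_norm)  # [-2.118, -2.036, -1.804]
```

After normalization the tensor values are no longer in [0, 1], so they need to be un-normalized before an image is displayed.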
---
3. Loading Dataset with DataLoader
from torch.utils.data import DataLoader
dataloader = DataLoader(dataset, batch_size=32, shuffle=True)
---
4. Summary
• Custom dataset classes offer flexibility in how data is loaded and labeled.
• Data augmentation techniques such as flipping and rotation can be applied using torchvision transforms.
• Use DataLoader for batching and shuffling during training.
---
Exercise
• Extend the custom dataset to handle grayscale images and apply a random brightness adjustment transform.
---
#Python #DatasetHandling #PyTorch #DataAugmentation #ImageProcessing
https://t.iss.one/DataScience4