Clean Code Tip:
For sequential CNN architectures, defining layers individually and calling them one-by-one in the
Example:
โโโโโโโโโโโโโโโ
By: @DataScienceM โจ
For sequential CNN architectures, defining layers individually and calling them one-by-one in the
forward method creates boilerplate. Encapsulate your network trunk in an nn.Sequential container. This makes your architecture declarative, compact, and much easier to read at a glance. ๐๏ธExample:
import torch
import torch.nn as nn
# --- The Verbose, Repetitive Way ---
class VerboseCNN(nn.Module):
def __init__(self, num_classes=10):
super().__init__()
# Layers are defined one by one
self.conv1 = nn.Conv2d(1, 16, kernel_size=3, padding=1)
self.relu1 = nn.ReLU()
self.pool1 = nn.MaxPool2d(2)
self.conv2 = nn.Conv2d(16, 32, kernel_size=3, padding=1)
self.relu2 = nn.ReLU()
self.pool2 = nn.MaxPool2d(2)
self.flatten = nn.Flatten()
self.fc = nn.Linear(32 * 7 * 7, num_classes)
def forward(self, x):
# The forward pass is a long, manual chain of calls
x = self.conv1(x)
x = self.relu1(x)
x = self.pool1(x)
x = self.conv2(x)
x = self.relu2(x)
x = self.pool2(x)
x = self.flatten(x)
x = self.fc(x)
return x
print("--- Verbose Way ---")
verbose_model = VerboseCNN()
print(verbose_model)
# --- The Clean, Declarative Way with nn.Sequential ---
class CleanCNN(nn.Module):
def __init__(self, num_classes=10):
super().__init__()
# The feature extractor is a clean, sequential block
self.features = nn.Sequential(
nn.Conv2d(1, 16, kernel_size=3, padding=1),
nn.ReLU(),
nn.MaxPool2d(2),
nn.Conv2d(16, 32, kernel_size=3, padding=1),
nn.ReLU(),
nn.MaxPool2d(2),
nn.Flatten()
)
self.classifier = nn.Linear(32 * 7 * 7, num_classes)
def forward(self, x):
# The forward pass is simple and clear
features = self.features(x)
output = self.classifier(features)
return output
print("\n--- Clean Way ---")
clean_model = CleanCNN()
print(clean_model)
โโโโโโโโโโโโโโโ
By: @DataScienceM โจ
โค1
Clean Code Tip:
When building complex architectures like ResNets, defining skip connections directly in the main
Example:
โโโโโโโโโโโโโโโ
By: @DataScienceM โจ
When building complex architectures like ResNets, defining skip connections directly in the main
forward method leads to repetitive, hard-to-read code. Encapsulate repeating patterns, like a residual block, into their own reusable nn.Module. This promotes modularity, follows the DRY principle, and makes your overall network architecture dramatically cleaner. ๐งฑExample:
import torch
import torch.nn as nn
# --- The Cluttered, Repetitive Way ---
class ClutteredResNet(nn.Module):
def __init__(self, in_channels=64, num_classes=10):
super().__init__()
# Defining layers for two blocks inline... gets messy fast.
self.conv1a = nn.Conv2d(in_channels, 64, 3, padding=1)
self.bn1a = nn.BatchNorm2d(64)
self.conv1b = nn.Conv2d(64, 64, 3, padding=1)
self.bn1b = nn.BatchNorm2d(64)
self.conv2a = nn.Conv2d(64, 64, 3, padding=1)
self.bn2a = nn.BatchNorm2d(64)
self.conv2b = nn.Conv2d(64, 64, 3, padding=1)
self.bn2b = nn.BatchNorm2d(64)
self.relu = nn.ReLU(inplace=True)
# ...imagine more blocks...
def forward(self, x):
# Manually implementing the first block's logic
identity1 = x
out = self.relu(self.bn1a(self.conv1a(x)))
out = self.bn1b(self.conv1b(out))
out += identity1 # The skip connection
out = self.relu(out)
# Repeating the same logic for the second block
identity2 = out
out = self.relu(self.bn2a(self.conv2a(out)))
out = self.bn2b(self.conv2b(out))
out += identity2 # Another skip connection
out = self.relu(out)
return out
# --- The Clean, Modular Way ---
# 1. First, create a reusable module for the repeating block
class ResidualBlock(nn.Module):
def __init__(self, in_channels, out_channels):
super().__init__()
self.conv1 = nn.Conv2d(in_channels, out_channels, 3, padding=1)
self.bn1 = nn.BatchNorm2d(out_channels)
self.conv2 = nn.Conv2d(out_channels, out_channels, 3, padding=1)
self.bn2 = nn.BatchNorm2d(out_channels)
self.relu = nn.ReLU(inplace=True)
def forward(self, x):
identity = x
out = self.relu(self.bn1(self.conv1(x)))
out = self.bn2(self.conv2(out))
out += identity # Encapsulated skip connection logic
out = self.relu(out)
return out
# 2. Then, compose the main model from these clean blocks
class CleanResNet(nn.Module):
def __init__(self, in_channels=64, num_classes=10):
super().__init__()
# The architecture is now clear and declarative
self.layer1 = ResidualBlock(in_channels, 64)
self.layer2 = ResidualBlock(64, 64)
# ... add more blocks easily ...
def forward(self, x):
# The forward pass is simple and readable
x = self.layer1(x)
x = self.layer2(x)
return x
print("--- Clean Model Architecture ---")
model = CleanResNet()
print(model)
โโโโโโโโโโโโโโโ
By: @DataScienceM โจ
โค3
#CNN #DeepLearning #Python #Tutorial
Lesson: Building a Convolutional Neural Network (CNN) for Image Classification
This lesson will guide you through building a CNN from scratch using TensorFlow and Keras to classify images from the CIFAR-10 dataset.
---
Part 1: Setup and Data Loading
First, we import the necessary libraries and load the CIFAR-10 dataset. This dataset contains 60,000 32x32 color images in 10 classes.
#TensorFlow #Keras #DataLoading
---
Part 2: Data Exploration and Preprocessing
We need to prepare the data before feeding it to the network. This involves:
โข Normalization: Scaling pixel values from the 0-255 range to the 0-1 range.
โข One-Hot Encoding: Converting class vectors (integers) to a binary matrix.
Let's also visualize some images to understand our data.
#DataPreprocessing #Normalization #Visualization
---
Part 3: Building the CNN Model
Now, we'll construct our CNN model. A common architecture consists of a stack of
โข Conv2D: Extracts features (like edges, corners) from the input image.
โข MaxPooling2D: Reduces the spatial dimensions (downsampling), which helps in making the feature detection more robust.
โข Flatten: Converts the 2D feature maps into a 1D vector.
โข Dense: A standard fully-connected neural network layer.
#ModelBuilding #CNN #KerasLayers
---
Part 4: Compiling the Model
Before training, we need to configure the learning process. This is done via the
โข Optimizer: An algorithm to update the model's weights (e.g., 'adam').
โข Loss Function: A function to measure how inaccurate the model is during training (e.g., 'categorical_crossentropy' for multi-class classification).
โข Metrics: Used to monitor the training and testing steps (e.g., 'accuracy').
#ModelCompilation #Optimizer #LossFunction
---
Lesson: Building a Convolutional Neural Network (CNN) for Image Classification
This lesson will guide you through building a CNN from scratch using TensorFlow and Keras to classify images from the CIFAR-10 dataset.
---
Part 1: Setup and Data Loading
First, we import the necessary libraries and load the CIFAR-10 dataset. This dataset contains 60,000 32x32 color images in 10 classes.
import tensorflow as tf
from tensorflow.keras import datasets, layers, models
import matplotlib.pyplot as plt
import numpy as np
# Load the CIFAR-10 dataset
(x_train, y_train), (x_test, y_test) = datasets.cifar10.load_data()
# Check the shape of the data
print("Training data shape:", x_train.shape)
print("Test data shape:", x_test.shape)
#TensorFlow #Keras #DataLoading
---
Part 2: Data Exploration and Preprocessing
We need to prepare the data before feeding it to the network. This involves:
โข Normalization: Scaling pixel values from the 0-255 range to the 0-1 range.
โข One-Hot Encoding: Converting class vectors (integers) to a binary matrix.
Let's also visualize some images to understand our data.
# Define class names for CIFAR-10
class_names = ['airplane', 'automobile', 'bird', 'cat', 'deer', 'dog', 'frog', 'horse', 'ship', 'truck']
# Visualize a few images
plt.figure(figsize=(10,10))
for i in range(25):
plt.subplot(5,5,i+1)
plt.xticks([])
plt.yticks([])
plt.grid(False)
plt.imshow(x_train[i])
plt.xlabel(class_names[y_train[i][0]])
plt.show()
# Normalize pixel values to be between 0 and 1
x_train = x_train.astype('float32') / 255.0
x_test = x_test.astype('float32') / 255.0
# One-hot encode the labels
y_train = tf.keras.utils.to_categorical(y_train, num_classes=10)
y_test = tf.keras.utils.to_categorical(y_test, num_classes=10)
#DataPreprocessing #Normalization #Visualization
---
Part 3: Building the CNN Model
Now, we'll construct our CNN model. A common architecture consists of a stack of
Conv2D and MaxPooling2D layers, followed by Dense layers for classification.โข Conv2D: Extracts features (like edges, corners) from the input image.
โข MaxPooling2D: Reduces the spatial dimensions (downsampling), which helps in making the feature detection more robust.
โข Flatten: Converts the 2D feature maps into a 1D vector.
โข Dense: A standard fully-connected neural network layer.
model = models.Sequential()
# Convolutional Base
model.add(layers.Conv2D(32, (3, 3), activation='relu', input_shape=(32, 32, 3)))
model.add(layers.MaxPooling2D((2, 2)))
model.add(layers.Conv2D(64, (3, 3), activation='relu'))
model.add(layers.MaxPooling2D((2, 2)))
model.add(layers.Conv2D(64, (3, 3), activation='relu'))
# Flatten and Dense Layers
model.add(layers.Flatten())
model.add(layers.Dense(64, activation='relu'))
model.add(layers.Dense(10, activation='softmax')) # 10 output classes
# Print the model summary
model.summary()
#ModelBuilding #CNN #KerasLayers
---
Part 4: Compiling the Model
Before training, we need to configure the learning process. This is done via the
compile() method, which requires:โข Optimizer: An algorithm to update the model's weights (e.g., 'adam').
โข Loss Function: A function to measure how inaccurate the model is during training (e.g., 'categorical_crossentropy' for multi-class classification).
โข Metrics: Used to monitor the training and testing steps (e.g., 'accuracy').
model.compile(optimizer='adam',
loss='categorical_crossentropy',
metrics=['accuracy'])
#ModelCompilation #Optimizer #LossFunction
---
Part 5: Training the Model
We train the model using the
#Training #MachineLearning #ModelFit
---
Part 6: Evaluating and Discussing Results
After training, we evaluate the model's performance on the test set. We also plot the training history to visualize accuracy and loss curves. This helps us understand if the model is overfitting or underfitting.
Discussion:
The plots show how accuracy and loss change over epochs. Ideally, both training and validation accuracy should increase, while losses decrease. If the validation accuracy plateaus or decreases while training accuracy continues to rise, it's a sign of overfitting. Our simple model achieves a decent accuracy. To improve it, one could use techniques like Data Augmentation, Dropout layers, or a deeper architecture.
#Evaluation #Results #Accuracy #Overfitting
---
Part 7: Making Predictions on a Single Image
This is how you handle a single image file for prediction. The model expects a batch of images as input, so we must add an extra dimension to our single image before passing it to
#Prediction #ImageProcessing #Inference
โโโโโโโโโโโโโโโ
By: @DataScienceM โจ
We train the model using the
fit() method, providing our training data, batch size, number of epochs, and validation data to monitor performance on unseen data.history = model.fit(x_train, y_train,
epochs=15,
batch_size=64,
validation_data=(x_test, y_test))
#Training #MachineLearning #ModelFit
---
Part 6: Evaluating and Discussing Results
After training, we evaluate the model's performance on the test set. We also plot the training history to visualize accuracy and loss curves. This helps us understand if the model is overfitting or underfitting.
# Evaluate the model on the test data
test_loss, test_acc = model.evaluate(x_test, y_test, verbose=2)
print(f'\nTest accuracy: {test_acc:.4f}')
# Plot training & validation accuracy values
plt.figure(figsize=(12, 4))
plt.subplot(1, 2, 1)
plt.plot(history.history['accuracy'])
plt.plot(history.history['val_accuracy'])
plt.title('Model accuracy')
plt.ylabel('Accuracy')
plt.xlabel('Epoch')
plt.legend(['Train', 'Test'], loc='upper left')
# Plot training & validation loss values
plt.subplot(1, 2, 2)
plt.plot(history.history['loss'])
plt.plot(history.history['val_loss'])
plt.title('Model loss')
plt.ylabel('Loss')
plt.xlabel('Epoch')
plt.legend(['Train', 'Test'], loc='upper left')
plt.show()
Discussion:
The plots show how accuracy and loss change over epochs. Ideally, both training and validation accuracy should increase, while losses decrease. If the validation accuracy plateaus or decreases while training accuracy continues to rise, it's a sign of overfitting. Our simple model achieves a decent accuracy. To improve it, one could use techniques like Data Augmentation, Dropout layers, or a deeper architecture.
#Evaluation #Results #Accuracy #Overfitting
---
Part 7: Making Predictions on a Single Image
This is how you handle a single image file for prediction. The model expects a batch of images as input, so we must add an extra dimension to our single image before passing it to
model.predict().# Select a single image from the test set
img_index = 15
test_image = x_test[img_index]
true_label_index = np.argmax(y_test[img_index])
# Display the image
plt.imshow(test_image)
plt.title(f"Actual Label: {class_names[true_label_index]}")
plt.show()
# The model expects a batch of images, so we add a dimension
image_for_prediction = np.expand_dims(test_image, axis=0)
print("Image shape before prediction:", test_image.shape)
print("Image shape after adding batch dimension:", image_for_prediction.shape)
# Make a prediction
predictions = model.predict(image_for_prediction)
predicted_label_index = np.argmax(predictions[0])
# Print the result
print(f"\nPrediction Probabilities: {predictions[0]}")
print(f"Predicted Label: {class_names[predicted_label_index]}")
print(f"Actual Label: {class_names[true_label_index]}")
#Prediction #ImageProcessing #Inference
โโโโโโโโโโโโโโโ
By: @DataScienceM โจ
#YOLOv8 #ComputerVision #ObjectDetection #IndustrialAI #Python
Applying YOLOv8 for Industrial Automation: Counting Plastic Bottles
This lesson will guide you through a complete computer vision project using YOLOv8. The goal is to detect and count plastic bottles in an image from an industrial setting, such as a conveyor belt or a storage area.
---
Step 1: Setup and Installation
First, we need to install the necessary libraries. The
#Setup #Installation
---
Step 2: Loading the Model and the Target Image
We will load a pre-trained YOLOv8 model. These models are trained on the large COCO dataset, which already knows how to identify common objects like 'bottle'. Then, we'll load our industrial image. Ensure you have an image named
#ModelLoading #DataHandling
---
Step 3: Performing Detection on the Image
With the model and image loaded, we can now run the detection. The
#Inference #ObjectDetection
---
Step 4: Filtering and Counting the Bottles
The model detects many types of objects. Our task is to go through the results, filter for only the 'bottle' class, and count how many there are. We'll also store the locations (bounding boxes) of each detected bottle for visualization.
#DataProcessing #Filtering
---
Step 5: Visualizing the Results
A number is good, but seeing what the model detected is better. We will draw the bounding boxes and the final count directly onto the image to create a clear visual output.
#Visualization #OpenCV
Applying YOLOv8 for Industrial Automation: Counting Plastic Bottles
This lesson will guide you through a complete computer vision project using YOLOv8. The goal is to detect and count plastic bottles in an image from an industrial setting, such as a conveyor belt or a storage area.
---
Step 1: Setup and Installation
First, we need to install the necessary libraries. The
ultralytics library provides the YOLOv8 model, and opencv-python is essential for image processing tasks.#Setup #Installation
# Open your terminal or command prompt and run this command:
pip install ultralytics opencv-python
---
Step 2: Loading the Model and the Target Image
We will load a pre-trained YOLOv8 model. These models are trained on the large COCO dataset, which already knows how to identify common objects like 'bottle'. Then, we'll load our industrial image. Ensure you have an image named
factory_bottles.jpg in your project folder.#ModelLoading #DataHandling
import cv2
from ultralytics import YOLO
# Load a pre-trained YOLOv8 model (yolov8n.pt is the smallest and fastest)
model = YOLO('yolov8n.pt')
# Load the image from the industrial setting
image_path = 'factory_bottles.jpg' # Make sure this image is in your directory
img = cv2.imread(image_path)
# A quick check to ensure the image was loaded correctly
if img is None:
print(f"Error: Could not load image at {image_path}")
else:
print("YOLOv8 model and image loaded successfully.")
---
Step 3: Performing Detection on the Image
With the model and image loaded, we can now run the detection. The
ultralytics library makes this process incredibly simple. The model will analyze the image and identify all the objects it recognizes.#Inference #ObjectDetection
# Run the model on the image to get detection results
results = model(img)
print("Detection complete. Processing results...")
---
Step 4: Filtering and Counting the Bottles
The model detects many types of objects. Our task is to go through the results, filter for only the 'bottle' class, and count how many there are. We'll also store the locations (bounding boxes) of each detected bottle for visualization.
#DataProcessing #Filtering
# Initialize a counter for the bottles
bottle_count = 0
bottle_boxes = []
# The model's results is a list, so we loop through it
for result in results:
# Each result has a 'boxes' attribute with the detections
boxes = result.boxes
for box in boxes:
# Get the class ID of the detected object
class_id = int(box.cls)
# Check if the class name is 'bottle'
if model.names[class_id] == 'bottle':
bottle_count += 1
# Store the bounding box coordinates (x1, y1, x2, y2)
bottle_boxes.append(box.xyxy[0])
print(f"Total plastic bottles detected: {bottle_count}")
---
Step 5: Visualizing the Results
A number is good, but seeing what the model detected is better. We will draw the bounding boxes and the final count directly onto the image to create a clear visual output.
#Visualization #OpenCV
๐ฅ1
# Create a copy of the original image to draw on
output_img = img.copy()
# Draw a bounding box for each detected bottle
for box in bottle_boxes:
x1, y1, x2, y2 = map(int, box)
# Draw a green rectangle around each bottle
cv2.rectangle(output_img, (x1, y1), (x2, y2), (0, 255, 0), 2)
# Add the final count as text on the image
summary_text = f"Bottle Count: {bottle_count}"
cv2.putText(output_img, summary_text, (20, 50),
cv2.FONT_HERSHEY_SIMPLEX, 1.5, (0, 0, 255), 4)
# Save the resulting image
cv2.imwrite('factory_bottles_result.jpg', output_img)
print("Result image with detections has been saved as 'factory_bottles_result.jpg'")
---
Step 6: Discussion of Results and Limitations
#Discussion #Limitations #FineTuning
Result: The code successfully uses a pre-trained YOLOv8 model to identify and count standard plastic bottles in an image. The final output provides both a numerical count and a visual confirmation of the detections.
Limitations of Pre-trained Model:
1. Occlusion: If bottles are heavily clustered or hiding behind each other, the model might miss some, leading to an undercount.
2. Unusual Shapes: The model is trained on common bottles (from the COCO dataset). If your factory produces bottles of a very unique shape or color, the model's accuracy might decrease.
3. Environmental Factors: Poor lighting, motion blur (if from a fast conveyor belt), or reflections can all negatively impact detection performance.
How to Improve (Next Steps): For a real-world, high-accuracy industrial application, you should not rely on a generic pre-trained model. The best approach is Fine-Tuning. This involves:
1. Collecting Data: Take hundreds or thousands of pictures of your specific bottles in your actual factory environment*.
2. Annotating Data: Draw bounding boxes around every bottle in those images.
3. Training: Use this custom dataset to train (or "fine-tune") the YOLOv8 model. This teaches the model exactly what to look for in your specific use case, leading to much higher accuracy and reliability.
โโโโโโโโโโโโโโโ
By: @DataScienceM โจ
โค1
Please open Telegram to view this post
VIEW IN TELEGRAM
#Pandas #DataAnalysis #Python #DataScience #Tutorial
Top 30 Pandas Functions & Methods
This lesson covers 30 essential Pandas functions for data manipulation and analysis, each with a standalone example and its output.
---
1.
Creates a new DataFrame (a 2D labeled data structure) from various inputs like dictionaries or lists.
---
2.
Creates a new Series (a 1D labeled array).
---
3.
Reads data from a CSV file into a DataFrame. (Assuming a file
---
4.
Writes a DataFrame to a CSV file.
#PandasIO #DataFrame #Series
---
5.
Returns the first
---
6.
Returns the last
---
7.
Provides a concise summary of the DataFrame, including data types and non-null values.
---
8.
Returns a tuple representing the dimensionality (rows, columns) of the DataFrame.
#DataInspection #PandasBasics
---
9.
Generates descriptive statistics for numerical columns (count, mean, std, min, max, etc.).
Top 30 Pandas Functions & Methods
This lesson covers 30 essential Pandas functions for data manipulation and analysis, each with a standalone example and its output.
---
1.
pd.DataFrame()Creates a new DataFrame (a 2D labeled data structure) from various inputs like dictionaries or lists.
import pandas as pd
data = {'col1': [1, 2], 'col2': [3, 4]}
df = pd.DataFrame(data)
print(df)
col1 col2
0 1 3
1 2 4
---
2.
pd.Series()Creates a new Series (a 1D labeled array).
import pandas as pd
s = pd.Series([10, 20, 30, 40], name='MyNumbers')
print(s)
0 10
1 20
2 30
3 40
Name: MyNumbers, dtype: int64
---
3.
pd.read_csv()Reads data from a CSV file into a DataFrame. (Assuming a file
data.csv exists).# Create a dummy csv file first
with open('data.csv', 'w') as f:
f.write('Name,Age\nAlice,25\nBob,30')
df = pd.read_csv('data.csv')
print(df)
Name Age
0 Alice 25
1 Bob 30
---
4.
df.to_csv()Writes a DataFrame to a CSV file.
import pandas as pd
df = pd.DataFrame({'Name': ['Charlie'], 'Age': [35]})
# index=False prevents writing the DataFrame index to the file
df.to_csv('output.csv', index=False)
# You can check that 'output.csv' has been created.
print("File 'output.csv' created.")
File 'output.csv' created.
#PandasIO #DataFrame #Series
---
5.
df.head()Returns the first
n rows of the DataFrame (default is 5).import pandas as pd
data = {'Name': ['A', 'B', 'C', 'D', 'E', 'F'], 'Value': [1, 2, 3, 4, 5, 6]}
df = pd.DataFrame(data)
print(df.head(3))
Name Value
0 A 1
1 B 2
2 C 3
---
6.
df.tail()Returns the last
n rows of the DataFrame (default is 5).import pandas as pd
data = {'Name': ['A', 'B', 'C', 'D', 'E', 'F'], 'Value': [1, 2, 3, 4, 5, 6]}
df = pd.DataFrame(data)
print(df.tail(2))
Name Value
4 E 5
5 F 6
---
7.
df.info()Provides a concise summary of the DataFrame, including data types and non-null values.
import pandas as pd
import numpy as np
data = {'col1': [1, 2, 3], 'col2': [4.0, 5.0, np.nan], 'col3': ['A', 'B', 'C']}
df = pd.DataFrame(data)
df.info()
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 3 entries, 0 to 2
Data columns (total 3 columns):
# Column Non-Null Count Dtype
--- ------ -------------- -----
0 col1 3 non-null int64
1 col2 2 non-null float64
2 col3 3 non-null object
dtypes: float64(1), int64(1), object(1)
memory usage: 200.0+ bytes
---
8.
df.shapeReturns a tuple representing the dimensionality (rows, columns) of the DataFrame.
import pandas as pd
df = pd.DataFrame({'A': [1, 2], 'B': [3, 4], 'C': [5, 6]})
print(df.shape)
(2, 3)
#DataInspection #PandasBasics
---
9.
df.describe()Generates descriptive statistics for numerical columns (count, mean, std, min, max, etc.).
import pandas as pd
df = pd.DataFrame({'Age': [22, 38, 26, 35, 29]})
print(df.describe())
โค3
Age
count 5.000000
mean 30.000000
std 6.363961
min 22.000000
25% 26.000000
50% 29.000000
75% 35.000000
max 38.000000
---
10.
df.columnsReturns the column labels of the DataFrame.
import pandas as pd
df = pd.DataFrame({'Name': [], 'Age': [], 'City': []})
print(df.columns)
Index(['Name', 'Age', 'City'], dtype='object')
---
11.
df.dtypesReturns the data type of each column.
import pandas as pd
df = pd.DataFrame({'Name': ['Alice'], 'Age': [25], 'Salary': [75000.50]})
print(df.dtypes)
Name object
Age int64
Salary float64
dtype: object
---
12. Selecting a Column
Select a single column, which returns a Pandas Series.
import pandas as pd
data = {'Name': ['Alice', 'Bob'], 'Age': [25, 30]}
df = pd.DataFrame(data)
ages = df['Age']
print(ages)
0 25
1 30
Name: Age, dtype: int64
#DataSelection #Indexing #Statistics
---
13.
df.loc[]Access a group of rows and columns by label(s) or a boolean array.
import pandas as pd
data = {'Age': [25, 30, 35], 'City': ['NY', 'LA', 'CH']}
df = pd.DataFrame(data, index=['Alice', 'Bob', 'Charlie'])
print(df.loc['Bob'])
Age 30
City LA
Name: Bob, dtype: object
---
14.
df.iloc[]Access a group of rows and columns by integer position(s).
import pandas as pd
data = {'Age': [25, 30, 35], 'City': ['NY', 'LA', 'CH']}
df = pd.DataFrame(data, index=['Alice', 'Bob', 'Charlie'])
print(df.iloc[1]) # Get the second row (index 1)
Age 30
City LA
Name: Bob, dtype: object
---
15.
df.isnull()Returns a DataFrame of the same shape with boolean values indicating if a value is missing (NaN).
import pandas as pd
import numpy as np
df = pd.DataFrame({'A': [1, np.nan], 'B': [3, 4]})
print(df.isnull())
A B
0 False False
1 True False
---
16.
df.dropna()Removes missing values.
import pandas as pd
import numpy as np
df = pd.DataFrame({'A': [1, np.nan, 3], 'B': [4, 5, 6]})
cleaned_df = df.dropna()
print(cleaned_df)
A B
0 1.0 4
2 3.0 6
#DataCleaning #MissingData
---
17.
df.fillna()Fills missing (NaN) values with a specified value or method.
import pandas as pd
import numpy as np
df = pd.DataFrame({'Score': [90, 85, np.nan, 92]})
filled_df = df.fillna(0)
print(filled_df)
Score
0 90.0
1 85.0
2 0.0
3 92.0
---
18.
df.drop_duplicates()Removes duplicate rows from the DataFrame.
import pandas as pd
data = {'Name': ['Alice', 'Bob', 'Alice'], 'Age': [25, 30, 25]}
df = pd.DataFrame(data)
unique_df = df.drop_duplicates()
print(unique_df)
Name Age
0 Alice 25
1 Bob 30
---
19.
df.rename()Alters axes labels (e.g., column names).
import pandas as pd
df = pd.DataFrame({'A': [1], 'B': [2]})
renamed_df = df.rename(columns={'A': 'Column_A', 'B': 'Column_B'})
print(renamed_df)
Column_A Column_B
0 1 2
---
20.
series.value_counts()Returns a Series containing counts of unique values.
import pandas as pd
s = pd.Series(['A', 'B', 'A', 'C', 'A', 'B'])
print(s.value_counts())
A 3
B 2
C 1
dtype: int64
#DataManipulation #Transformation
---
21.
series.unique()Returns an array of unique values in a Series.
import pandas as pd
s = pd.Series(['A', 'B', 'A', 'C', 'A', 'B'])
print(s.unique())
['A' 'B' 'C']
---
22.
df.sort_values()Sorts a DataFrame by the values of one or more columns.
import pandas as pd
data = {'Name': ['Charlie', 'Alice', 'Bob'], 'Age': [35, 25, 30]}
df = pd.DataFrame(data)
sorted_df = df.sort_values(by='Age')
print(sorted_df)
Name Age
1 Alice 25
2 Bob 30
0 Charlie 35
---
23.
df.groupby()Groups a DataFrame using a mapper or by a Series of columns for aggregation.
import pandas as pd
data = {'Dept': ['HR', 'IT', 'HR', 'IT'], 'Salary': [70, 85, 75, 90]}
df = pd.DataFrame(data)
grouped = df.groupby('Dept').mean()
print(grouped)
Salary
Dept
HR 72.5
IT 87.5
---
24.
df.agg()Applies one or more aggregations over the specified axis.
import pandas as pd
data = {'Dept': ['HR', 'IT', 'HR', 'IT'], 'Salary': [70, 85, 75, 90]}
df = pd.DataFrame(data)
agg_results = df.groupby('Dept')['Salary'].agg(['mean', 'sum'])
print(agg_results)
mean sum
Dept
HR 72.5 145
IT 87.5 175
#Aggregation #Grouping #Sorting
---
25.
df.apply()Applies a function along an axis of the DataFrame.
import pandas as pd
df = pd.DataFrame({'A': [1, 2, 3], 'B': [10, 20, 30]})
# Apply a function to double each value in column 'A'
df['A_doubled'] = df['A'].apply(lambda x: x * 2)
print(df)
A B A_doubled
0 1 10 2
1 2 20 4
2 3 30 6
---
26.
pd.merge()Merges two DataFrames based on a common column or index, similar to a SQL join.
import pandas as pd
df1 = pd.DataFrame({'ID': [1, 2], 'Name': ['Alice', 'Bob']})
df2 = pd.DataFrame({'ID': [1, 2], 'Role': ['Engineer', 'Analyst']})
merged_df = pd.merge(df1, df2, on='ID')
print(merged_df)
ID Name Role
0 1 Alice Engineer
1 2 Bob Analyst
---
27.
pd.concat()Concatenates (stacks) pandas objects along a particular axis.
import pandas as pd
df1 = pd.DataFrame({'A': ['A0'], 'B': ['B0']})
df2 = pd.DataFrame({'A': ['A1'], 'B': ['B1']})
concatenated_df = pd.concat([df1, df2])
print(concatenated_df)
A B
0 A0 B0
0 A1 B1
---
28.
df.pivot_table()Creates a spreadsheet-style pivot table as a DataFrame.
โค2
import pandas as pd
data = {'Date': ['2023-01-01', '2023-01-01', '2023-01-02'],
'City': ['NY', 'LA', 'NY'],
'Sales': [100, 150, 120]}
df = pd.DataFrame(data)
pivot = df.pivot_table(values='Sales', index='Date', columns='City')
print(pivot)
City LA NY
Date
2023-01-01 150.0 100.0
2023-01-02 NaN 120.0
#CombiningData #PivotTable
---
29.
df.set_index()Sets one or more existing columns as the DataFrame index.
import pandas as pd
data = {'ID': ['a1', 'a2'], 'Name': ['Alice', 'Bob']}
df = pd.DataFrame(data)
df_indexed = df.set_index('ID')
print(df_indexed)
Name
ID
a1 Alice
a2 Bob
---
30.
df.reset_index()Resets the index of the DataFrame, making the old index a new column.
import pandas as pd
data = {'Name': ['Alice', 'Bob']}
df = pd.DataFrame(data, index=['a1', 'a2'])
df_reset = df.reset_index()
print(df_reset)
index Name
0 a1 Alice
1 a2 Bob
#Indexing #PandasTips
โโโโโโโโโโโโโโโ
By: @DataScienceM โจ
โค8
Please open Telegram to view this post
VIEW IN TELEGRAM
Top 30 MATLAB Image Processing Functions
#MATLAB #ImageProcessing #Basics
#1.
Reads an image from a file into a matrix.
#2.
Displays an image in a figure window.
#3.
Writes an image matrix to a file.
#4.
Returns the dimensions of the image matrix (rows, columns, color channels).
#5.
Converts an RGB color image to a grayscale intensity image.
---
#MATLAB #ImageProcessing #Conversion #Transformation
#6.
Converts an image to double-precision format, scaling data to the range [0, 1].
#7.
Resizes an image to a specified size.
#8.
Rotates an image by a specified angle.
#9.
Crops an image to a specified rectangle.
#10.
Converts an RGB image to the Hue-Saturation-Value (HSV) color space.
---
#MATLAB #ImageProcessing #Enhancement
#11.
Displays the histogram of an image, showing the distribution of pixel intensity values.
#MATLAB #ImageProcessing #Basics
#1.
imread()Reads an image from a file into a matrix.
img = imread('peppers.png');
disp('Image "peppers.png" loaded into variable "img".');Image "peppers.png" loaded into variable "img".
#2.
imshow()Displays an image in a figure window.
img = imread('peppers.png');
imshow(img);
title('Peppers Image');Output: A new figure window opens, displaying the 'peppers.png' image with the title "Peppers Image".
#3.
imwrite()Writes an image matrix to a file.
img = imread('cameraman.tif');
imwrite(img, 'my_cameraman.jpg');
disp('Image saved as my_cameraman.jpg');Image saved as my_cameraman.jpg
#4.
size()Returns the dimensions of the image matrix (rows, columns, color channels).
rgb_img = imread('peppers.png');
gray_img = imread('cameraman.tif');
size_rgb = size(rgb_img);
size_gray = size(gray_img);
disp(['Size of RGB image: ', num2str(size_rgb)]);
disp(['Size of grayscale image: ', num2str(size_gray)]);Size of RGB image: 384 512 3
Size of grayscale image: 256 256
#5.
rgb2gray()Converts an RGB color image to a grayscale intensity image.
rgb_img = imread('peppers.png');
gray_img = rgb2gray(rgb_img);
imshow(gray_img);
title('Grayscale Peppers');Output: A figure window displays the grayscale version of the peppers image.
---
#MATLAB #ImageProcessing #Conversion #Transformation
#6.
im2double()Converts an image to double-precision format, scaling data to the range [0, 1].
img_uint8 = imread('cameraman.tif');
img_double = im2double(img_uint8);
disp(['Max value of original image: ', num2str(max(img_uint8(:)))]);
disp(['Max value of double image: ', num2str(max(img_double(:)))]);Max value of original image: 253
Max value of double image: 0.99216
#7.
imresize()Resizes an image to a specified size.
img = imread('cameraman.tif');
resized_img = imresize(img, 0.5); % Resize to 50% of original size
imshow(resized_img);
title('Resized Cameraman');Output: A figure window displays the cameraman image at half its original size.
#8.
imrotate()Rotates an image by a specified angle.
img = imread('cameraman.tif');
rotated_img = imrotate(img, 30, 'bilinear', 'crop');
imshow(rotated_img);
title('Rotated 30 Degrees');Output: A figure window displays the cameraman image rotated by 30 degrees, cropped to the original size.
#9.
imcrop()Crops an image to a specified rectangle.
img = imread('peppers.png');
% [xmin ymin width height]
cropped_img = imcrop(img, [100 80 250 200]);
imshow(cropped_img);
title('Cropped Image');Output: A figure window displays only the rectangular section specified from the peppers image.
#10.
rgb2hsv()Converts an RGB image to the Hue-Saturation-Value (HSV) color space.
rgb_img = imread('peppers.png');
hsv_img = rgb2hsv(rgb_img);
hue_channel = hsv_img(:,:,1); % Extract the Hue channel
imshow(hue_channel);
title('Hue Channel of Peppers Image');Output: A figure window displays the Hue channel of the peppers image as a grayscale image.
---
#MATLAB #ImageProcessing #Enhancement
#11.
imhist()Displays the histogram of an image, showing the distribution of pixel intensity values.
โค1
gray_img = imread('pout.tif');
imhist(gray_img);
title('Histogram of a Low-Contrast Image');Output: A figure window with a bar chart showing the intensity distribution of the 'pout.tif' image.
#12.
histeq()Enhances contrast using histogram equalization.
low_contrast_img = imread('pout.tif');
high_contrast_img = histeq(low_contrast_img);
imshow(high_contrast_img);
title('Histogram Equalized Image');Output: A figure window displays a higher contrast version of the 'pout.tif' image.
#13.
imadjust()Adjusts image intensity values or colormap by mapping intensity values to new values.
img = imread('cameraman.tif');
adjusted_img = imadjust(img, [0.3 0.7], []);
imshow(adjusted_img);
title('Intensity Adjusted Image');Output: A figure window showing a high-contrast version of the cameraman image, where intensities between 0.3 and 0.7 are stretched to the full [0, 1] range.
#14.
imtranslate()Translates (shifts) an image horizontally and vertically.
img = imread('cameraman.tif');
translated_img = imtranslate(img, [25, 15]); % Shift 25 pixels right, 15 pixels down
imshow(translated_img);
title('Translated Image');Output: A figure window shows the cameraman image shifted to the right and down.
#15.
imsharpen()Sharpens an image using the unsharp masking method.
img = imread('peppers.png');
sharpened_img = imsharpen(img);
imshow(sharpened_img);
title('Sharpened Image');Output: A figure window displays a crisper, more detailed version of the peppers image.
---
#MATLAB #ImageProcessing #Filtering #Noise
#16.
imnoise()Adds a specified type of noise to an image.
img = imread('cameraman.tif');
noisy_img = imnoise(img, 'salt & pepper', 0.02);
imshow(noisy_img);
title('Image with Salt & Pepper Noise');Output: A figure window displays the cameraman image with random white and black pixels (noise).
#17.
fspecial()Creates a predefined 2-D filter kernel (e.g., for averaging, Gaussian blur, Laplacian).
h = fspecial('motion', 20, 45); % Create a motion blur filter
disp('Generated a 2D motion filter kernel.');
disp(h);Generated a 2D motion filter kernel.
(Output is a matrix representing the filter kernel)
#18.
imfilter()Filters a multidimensional image with a specified filter kernel.
img = imread('cameraman.tif');
h = fspecial('motion', 20, 45);
motion_blur_img = imfilter(img, h, 'replicate');
imshow(motion_blur_img);
title('Motion Blurred Image');Output: A figure window shows the cameraman image with a motion blur effect applied at a 45-degree angle.
#19.
medfilt2()Performs 2-D median filtering, which is excellent for removing 'salt & pepper' noise.
noisy_img = imnoise(imread('cameraman.tif'), 'salt & pepper', 0.02);
denoised_img = medfilt2(noisy_img);
imshow(denoised_img);
title('Denoised with Median Filter');Output: A figure window shows the noisy image significantly cleaned up, with most salt & pepper noise removed.
#20.
edge()Finds edges in an intensity image using various algorithms (e.g., Sobel, Canny).