Machine Learning
Machine learning insights, practical tutorials, and clear explanations for beginners and aspiring data scientists. Follow the channel for models, algorithms, coding guides, and real-world ML applications.

Admin: @HusseinSheikho || @Hussein_Sheikho
#87. What is feature engineering?
A: Feature engineering is the process of using domain knowledge to create new features (predictor variables) from raw data. The goal is to improve the performance of machine learning models. Examples include combining features, creating dummy variables from categorical data, or extracting components from a date.
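
For instance, a minimal pandas sketch (the DataFrame and column names here are hypothetical):

import pandas as pd

# Hypothetical raw data
df = pd.DataFrame({
    "signup_date": pd.to_datetime(["2024-01-15", "2024-03-02"]),
    "price": [9.99, 24.50],
    "quantity": [3, 1],
    "plan": ["free", "pro"],
})

# Extract components from a date
df["signup_month"] = df["signup_date"].dt.month
df["signup_dayofweek"] = df["signup_date"].dt.dayofweek

# Combine two features into a new one
df["revenue"] = df["price"] * df["quantity"]

# Create dummy variables from categorical data
df = pd.get_dummies(df, columns=["plan"])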

#88. How would you handle a very large dataset that doesn't fit into your computer's memory?
A:
Chunking: Read and process the data in smaller chunks using the chunksize option of pd.read_csv (see the sketch after this list).
Data Types: Optimize data types to use less memory (e.g., using int32 instead of int64).
Cloud Computing: Use cloud-based platforms like AWS, GCP, or Azure that provide scalable computing resources (e.g., using Spark with Databricks).
Sampling: Work with a representative random sample of the data for initial exploration.
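
A rough sketch of the chunking and dtype ideas (the file and column names are hypothetical):

import pandas as pd

total = 0
# Stream the CSV in 100,000-row chunks instead of loading it all at once
for chunk in pd.read_csv("big_file.csv", chunksize=100_000):
    # Downcast to a smaller dtype to reduce memory use
    chunk["user_id"] = chunk["user_id"].astype("int32")
    total += chunk["amount"].sum()

print(total)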

#89. Where do you go to stay up-to-date with the latest trends in data analysis?
A: I actively read blogs like Towards Data Science on Medium, follow key data scientists and analysts on LinkedIn and Twitter, and listen to podcasts. I also enjoy browsing Kaggle competitions to see how others approach complex problems and occasionally review documentation for new features in libraries like pandas and Scikit-learn.

#90. What is a key performance indicator (KPI)?
A: A KPI is a measurable value that demonstrates how effectively a company is achieving key business objectives. Organizations use KPIs to evaluate their success at reaching targets. For a data analyst, it's crucial to understand what the business's KPIs are in order to align analysis with business goals.
---
#91. What is the difference between structured and unstructured data?
A:
Structured Data: Highly organized and formatted in a way that is easily searchable in relational databases (e.g., spreadsheets, SQL databases).
Unstructured Data: Data that has no predefined format or organization, making it more difficult to collect, process, and analyze (e.g., text in emails, images, videos, social media posts).

#92. Why is data cleaning important?
A: Data cleaning (or data cleansing) is crucial because "garbage in, garbage out." Raw data is often messy, containing errors, inconsistencies, and missing values. If this data is not cleaned, it will lead to inaccurate analysis, flawed models, and unreliable conclusions, which can result in poor business decisions.
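
A few typical cleaning steps in pandas (the column names are hypothetical):

import pandas as pd

df = pd.DataFrame({
    "age": [25, None, 25, 130],
    "city": ["NY", "ny", "NY", "LA"],
})

df = df.drop_duplicates()                          # remove exact duplicates
df["age"] = df["age"].fillna(df["age"].median())   # impute missing values
df = df[df["age"].between(0, 120)]                 # drop implausible values
df["city"] = df["city"].str.upper()                # fix inconsistent casing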

#93. Tell me about a time you had to work with ambiguous instructions or unclear data.
A: "I was once asked to analyze 'user engagement'. This term was very broad. I scheduled a meeting with the stakeholders (product manager, marketing lead) to clarify. I asked questions like: 'What business question are we trying to answer with this analysis?', 'Which users are we most interested in?', and 'What actions on the platform do we consider valuable engagement?'. This helped us collaboratively define engagement with specific metrics (e.g., likes, comments, session duration), which ensured my analysis was relevant and actionable."

#94. What is the difference between a dashboard and a report?
A:
Report: A static presentation of data for a specific time period (e.g., a quarterly sales report). It's meant to inform.
Dashboard: A dynamic, interactive BI tool that provides a real-time, at-a-glance view of key performance indicators. It's meant for monitoring and exploration.

#95. What is statistical power?
A: Statistical power is the probability that a hypothesis test will correctly reject the null hypothesis when the null hypothesis is false (i.e., the probability of avoiding a Type II error). In A/B testing, higher power means you are more likely to detect a real effect if one exists.
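
For example, statsmodels can estimate the sample size needed to reach a given power (the numbers below are purely illustrative):

from statsmodels.stats.power import TTestIndPower

# Sample size per group to detect a medium effect (d = 0.5)
# at alpha = 0.05 with 80% power
n = TTestIndPower().solve_power(effect_size=0.5, alpha=0.05, power=0.8)
print(round(n))  # ~64 per group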
#96. How do you know if your sample is representative of the population?
A: The best way is through proper sampling techniques. Random sampling is the gold standard, where every member of the population has an equal chance of being selected. You can also use stratified sampling, where you divide the population into subgroups (strata) and then take a random sample from each subgroup to ensure all groups are represented proportionally.
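
A quick pandas sketch of stratified sampling (the region column is hypothetical):

import pandas as pd

df = pd.DataFrame({
    "region": ["north"] * 80 + ["south"] * 20,
    "value": range(100),
})

# Take 10% from each region so both groups stay proportionally represented
sample = df.groupby("region").sample(frac=0.1, random_state=42)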

#97. What is your favorite data visualization and why?
A: "I find the box plot to be incredibly powerful and efficient. In a single, compact chart, it visualizes the distribution of data, showing the median, quartiles (25th and 75th percentiles), and potential outliers. It's excellent for comparing distributions across multiple categories and is much more informative than a simple bar chart of means."

#98. What is survivorship bias?
A: Survivorship bias is a logical error where you concentrate on the people or things that "survived" some process and inadvertently overlook those that did not because of their lack of visibility. A classic example is analyzing the habits of successful startup founders without considering the thousands who failed, which can lead to flawed conclusions about what it takes to succeed.

#99. You are given two datasets. How would you figure out if they can be joined?
A: I would first inspect the columns in both datasets to look for a common key or field. This field should ideally be a unique identifier (like user_id, product_id). I would check that the data types of these key columns are the same. Then, I would check the overlap of values between the key columns to understand how many records would match in a join.
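
A sketch of that check in pandas (the frames and key name are hypothetical):

import pandas as pd

df1 = pd.DataFrame({"user_id": [1, 2, 3, 4]})
df2 = pd.DataFrame({"user_id": [3, 4, 5]})

# Do the key columns share a dtype?
print(df1["user_id"].dtype == df2["user_id"].dtype)

# How many key values overlap?
overlap = set(df1["user_id"]) & set(df2["user_id"])
print(len(overlap), "matching keys")  # 2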

#100. Why do you want to be a data analyst?
A: "I am passionate about being a data analyst because I enjoy the process of transforming raw data into actionable insights that can drive real business decisions. I love the blend of technical skills like SQL and Python with the problem-solving and storytelling aspects of the role. I find it incredibly rewarding to uncover hidden patterns and help a company grow by making data-informed choices."

━━━━━━━━━━━━━━━
By: @DataScienceM
📌 How to Evaluate Search Relevance and Ranking

🗂 Category:

🕒 Date: 2024-05-30 | ⏱️ Read time: 6 min read

This article explores the key metrics used for evaluating Search Relevance and Ranking, empowering you…
📌 Implementing Generative and Analytical Models to Create and Enrich Knowledge Graphs for RAGs

🗂 Category:

🕒 Date: 2024-05-29 | ⏱️ Read time: 20 min read

Evaluate generative and analytical models to build Knowledge Graphs and facilitate them to power highly…
Forwarded from Kaggle Data Hub
Unlock premium learning without spending a dime! ⭐️ @DataScienceC is the first Telegram channel dishing out free Udemy coupons daily—grab courses on data science, coding, AI, and beyond. Join the revolution and boost your skills for free today! 📕

What topic are you itching to learn next? 😊
https://t.iss.one/DataScienceC 🌟
📌 From Classical Models to AI: Forecasting Humidity for Energy and Water Efficiency in Data Centers

🗂 Category: MACHINE LEARNING

🕒 Date: 2025-11-02 | ⏱️ Read time: 27 min read

From ARIMA to N-BEATS: Comparing forecasting approaches that balance accuracy, interpretability, and sustainability
📌 MobileNetV3 Paper Walkthrough: The Tiny Giant Getting Even Smarter

🗂 Category: ARTIFICIAL INTELLIGENCE

🕒 Date: 2025-11-02 | ⏱️ Read time: 29 min read

MobileNetV3 with PyTorch  —  now featuring SE blocks and hard activation functions
💡 Applying Image Filters with Pillow

Pillow's ImageFilter module provides a set of pre-defined filters you can apply to your images with a single line of code. This example demonstrates how to apply a Gaussian blur effect, which is useful for softening images or creating depth-of-field effects.

from PIL import Image, ImageFilter

try:
    # Open an existing image
    with Image.open("your_image.jpg") as img:
        # Apply the Gaussian Blur filter
        # The radius parameter controls the blur intensity
        blurred_img = img.filter(ImageFilter.GaussianBlur(radius=5))

        # Display the blurred image
        blurred_img.show()

        # Save the new image
        blurred_img.save("blurred_image.png")

except FileNotFoundError:
    print("Error: 'your_image.jpg' not found. Please provide an image.")


Code explanation: The script opens an image file, applies a GaussianBlur filter from the ImageFilter module using the .filter() method, and then displays and saves the resulting blurred image. The blur intensity is controlled by the radius argument.

#Python #Pillow #ImageProcessing #ImageFilter #PIL

━━━━━━━━━━━━━━━
By: @DataScienceM
💡 Top 50 Operations for Audio Processing in Python

Note: Most examples use pydub. You need ffmpeg installed for opening/exporting non-WAV files. Install libraries with pip install pydub librosa sounddevice scipy numpy.

I. Basic Loading, Saving & Properties

• Load an audio file (any format).
from pydub import AudioSegment
audio = AudioSegment.from_file("sound.mp3")

• Export (save) an audio file.
audio.export("new_sound.wav", format="wav")

• Get duration in milliseconds.
duration_ms = len(audio)

• Get frame rate (sample rate).
rate = audio.frame_rate

• Get number of channels (1 for mono, 2 for stereo).
channels = audio.channels

• Get sample width in bytes (e.g., 2 for 16-bit).
width = audio.sample_width


II. Playback & Recording

• Play an audio segment.
from pydub.playback import play
play(audio)

• Record audio from a microphone for 5 seconds.
import sounddevice as sd
from scipy.io.wavfile import write

fs = 44100 # Sample rate
seconds = 5
recording = sd.rec(int(seconds * fs), samplerate=fs, channels=2)
sd.wait() # Wait until recording is finished
write('output.wav', fs, recording)


III. Slicing & Concatenating

• Get a slice (e.g., the first 5 seconds).
first_five_seconds = audio[:5000] # Time is in milliseconds

• Get a slice from the end (e.g., the last 3 seconds).
last_three_seconds = audio[-3000:]

• Concatenate (append) two audio files.
combined = audio1 + audio2

• Repeat an audio segment.
repeated = audio * 3

• Crossfade two audio segments.
# Fades out audio1 while fading in audio2
faded = audio1.append(audio2, crossfade=1000)


IV. Volume & Effects

• Increase volume by 6 dB.
louder_audio = audio + 6

• Decrease volume by 3 dB.
quieter_audio = audio - 3

• Fade in from silence.
faded_in = audio.fade_in(2000) # 2-second fade-in

• Fade out to silence.
faded_out = audio.fade_out(3000) # 3-second fade-out

• Reverse the audio.
reversed_audio = audio.reverse()

• Normalize audio to a maximum amplitude.
from pydub.effects import normalize
normalized_audio = normalize(audio)

• Overlay (mix) two tracks.
# Starts playing 'overlay_sound' 5 seconds into 'main_sound'
mixed = main_sound.overlay(overlay_sound, position=5000)


V. Channel Manipulation

• Split stereo into two mono channels.
left_channel, right_channel = audio.split_to_mono()

• Create a stereo segment from two mono segments.
stereo_sound = AudioSegment.from_mono_audiosegments(left_channel, right_channel)

• Convert stereo to mono.
mono_audio = audio.set_channels(1)


VI. Silence & Splitting

• Generate a silent segment.
one_second_silence = AudioSegment.silent(duration=1000)

• Split audio based on silence.
from pydub.silence import split_on_silence
chunks = split_on_silence(
    audio,
    min_silence_len=500,
    silence_thresh=-40
)


VII. Working with Raw Data (NumPy & SciPy)
• Get raw audio data as a NumPy array.
import numpy as np
samples = np.array(audio.get_array_of_samples())

• Create a Pydub segment from a NumPy array.
new_audio = AudioSegment(
    samples.tobytes(),
    frame_rate=audio.frame_rate,
    sample_width=audio.sample_width,
    channels=audio.channels
)

• Read a WAV file directly into a NumPy array.
from scipy.io.wavfile import read
rate, data = read("sound.wav")

• Write a NumPy array to a WAV file.
from scipy.io.wavfile import write
write("new_sound.wav", rate, data)

• Generate a sine wave.
import numpy as np
sample_rate = 44100
frequency = 440 # A4 note
duration = 5
t = np.linspace(0., duration, int(sample_rate * duration))
amplitude = np.iinfo(np.int16).max * 0.5
data = (amplitude * np.sin(2. * np.pi * frequency * t)).astype(np.int16)
# This int16 array can now be written to a WAV file (e.g., with scipy.io.wavfile.write)


VIII. Audio Analysis with Librosa

• Load audio with Librosa.
import librosa
y, sr = librosa.load("sound.mp3")

• Estimate tempo (Beats Per Minute).
tempo, beat_frames = librosa.beat.beat_track(y=y, sr=sr)

• Get beat event times in seconds.
beat_times = librosa.frames_to_time(beat_frames, sr=sr)

• Decompose into harmonic and percussive components.
y_harmonic, y_percussive = librosa.effects.hpss(y)

• Compute a spectrogram.
import numpy as np
D = librosa.stft(y)
S_db = librosa.amplitude_to_db(np.abs(D), ref=np.max)

• Compute Mel-Frequency Cepstral Coefficients (MFCCs).
mfccs = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=13)

• Compute Chroma features (related to musical pitch).
chroma = librosa.feature.chroma_stft(y=y, sr=sr)

• Detect onset events (the start of notes).
onset_frames = librosa.onset.onset_detect(y=y, sr=sr)
onset_times = librosa.frames_to_time(onset_frames, sr=sr)

• Pitch shifting.
y_pitched = librosa.effects.pitch_shift(y, sr=sr, n_steps=4) # Shift up 4 semitones

• Time stretching (change speed without changing pitch).
y_fast = librosa.effects.time_stretch(y, rate=2.0) # Double speed


IX. More Utilities

• Detect leading silence.
from pydub.silence import detect_leading_silence
trim_ms = detect_leading_silence(audio)
trimmed_audio = audio[trim_ms:]

• Get the root mean square (RMS) energy.
rms = audio.rms

• Get the maximum possible amplitude for the audio format.
max_possible = audio.max_possible_amplitude

• Boost the loudest sections by stripping silence and normalizing.
from pydub.effects import normalize
loudest_part = normalize(audio.strip_silence(silence_len=1000, silence_thresh=-32))

• Change the frame rate (resample).
resampled = audio.set_frame_rate(16000)

• Create a simple band-pass filter.
from pydub.scipy_effects import band_pass_filter
filtered = band_pass_filter(audio, 400, 2000) # Pass between 400Hz and 2000Hz

• Convert file format in one line.
AudioSegment.from_file("music.ogg").export("music.mp3", format="mp3")

• Get the raw bytes of the audio data.
raw_data = audio.raw_data

• Get the maximum amplitude.
max_amp = audio.max

• Match the volume of two segments.
matched_audio2 = audio2.apply_gain(audio1.dBFS - audio2.dBFS)


#Python #AudioProcessing #Pydub #Librosa #SignalProcessing

━━━━━━━━━━━━━━━
By: @DataScienceM
📌 The Pearson Correlation Coefficient, Explained Simply

🗂 Category: STATISTICS

🕒 Date: 2025-11-01 | ⏱️ Read time: 7 min read

A simple explanation of the Pearson correlation coefficient with examples
📌 Graph RAG vs SQL RAG

🗂 Category: LARGE LANGUAGE MODELS

🕒 Date: 2025-11-01 | ⏱️ Read time: 7 min read

Evaluating RAGs on graph and SQL databases
📌 Understanding the Two Faces of Shiny for Python: Core and Express

🗂 Category: DATA SCIENCE

🕒 Date: 2024-05-29 | ⏱️ Read time: 7 min read

Exploring the Differences and Use Cases of Shiny Core and Shiny Express for Python
📌 Do You Need a Degree to Be a Data Scientist?

🗂 Category: DATA SCIENCE

🕒 Date: 2024-05-29 | ⏱️ Read time: 8 min read

No, but it certainly helps.
🤖🧠 HunyuanWorld-Mirror: Tencent’s Breakthrough in Universal 3D Reconstruction

🗓️ 03 Nov 2025
📚 AI News & Trends

The race toward achieving universal 3D understanding has reached a significant milestone with Tencent’s HunyuanWorld-Mirror, a cutting-edge open-source model designed to revolutionize 3D reconstruction. In an era dominated by visual intelligence and immersive digital experiences, this new model stands out by offering a feed-forward, geometry-aware framework that can predict multiple 3D outputs in a single ...

#HunyuanWorld #Tencent #3DReconstruction #UniversalAI #GeometryAware #OpenSourceAI
📌 Data Scientists Work in the Cloud. Here’s How to Practice This as a Student (Part 2: Python)

🗂 Category: DATA SCIENCE

🕒 Date: 2024-05-29 | ⏱️ Read time: 9 min read

Because data scientists don’t write production code in the Udemy code editor
💡 Top 50 Operations for Signal Processing in Python

Note: Most examples use numpy, scipy.signal, and matplotlib.pyplot. Assume they are imported as:
import numpy as np
from scipy import signal
import matplotlib.pyplot as plt

I. Signal Generation

• Create a time vector.
fs = 1000  # Sampling frequency
t = np.linspace(0, 1, fs, endpoint=False)

• Generate a sine wave.
freq = 50 # Hz
sine_wave = np.sin(2 * np.pi * freq * t)

• Generate a square wave.
square_wave = signal.square(2 * np.pi * freq * t)

• Generate a sawtooth wave.
sawtooth_wave = signal.sawtooth(2 * np.pi * freq * t)

• Generate Gaussian white noise.
noise = np.random.normal(0, 1, len(t))

• Generate a frequency-swept cosine (chirp).
chirp_signal = signal.chirp(t, f0=1, f1=100, t1=1, method='linear')

• Generate an impulse signal (unit impulse).
impulse = signal.unit_impulse(100, 'mid') # at index 50 of 100

• Generate a Gaussian pulse.
gaus_pulse = signal.gausspulse(t, fc=5, bw=0.5)


II. Signal Visualization & Properties

• Plot a signal.
plt.plot(t, sine_wave)
plt.xlabel("Time [s]")
plt.ylabel("Amplitude")
plt.show()

• Calculate the mean value.
mean_val = np.mean(sine_wave)

• Calculate the Root Mean Square (RMS).
rms_val = np.sqrt(np.mean(sine_wave**2))

• Calculate the standard deviation.
std_dev = np.std(sine_wave)

• Find the maximum value and its index.
max_val = np.max(sine_wave)
max_idx = np.argmax(sine_wave)


III. Frequency Domain Analysis (FFT)

• Compute the Fast Fourier Transform (FFT).
from scipy.fft import fft, fftfreq
yf = fft(sine_wave)

• Get the frequency bins for the FFT.
N = len(sine_wave)
xf = fftfreq(N, 1 / fs)[:N//2]

• Plot the magnitude spectrum.
plt.plot(xf, 2.0/N * np.abs(yf[0:N//2]))
plt.grid()
plt.show()

• Compute the Inverse FFT (IFFT).
from scipy.fft import ifft
original_signal = ifft(yf)

• Compute the Power Spectral Density (PSD) using Welch's method.
f, Pxx_den = signal.welch(sine_wave, fs, nperseg=1024)


IV. Digital Filtering

• Design a Butterworth low-pass filter.
b, a = signal.butter(4, 100, 'low', analog=False, fs=fs)

• Apply a filter to a signal (zero-phase filtering).
noisy_signal = sine_wave + noise
filtered_signal = signal.filtfilt(b, a, noisy_signal)

• Design a Chebyshev Type I high-pass filter.
b, a = signal.cheby1(4, 5, 100, 'high', fs=fs) # 5dB ripple

• Design a Bessel band-pass filter.
b, a = signal.bessel(4, [50, 150], 'band', fs=fs)

• Design an FIR filter using a window method.
numtaps = 101
fir_coeffs = signal.firwin(numtaps, cutoff=100, fs=fs)

• Plot the frequency response of a filter.
w, h = signal.freqz(b, a, fs=fs)
plt.plot(w, 20 * np.log10(abs(h)))

• Apply a median filter (good for salt-and-pepper noise).
median_filtered = signal.medfilt(noisy_signal, kernel_size=3)

• Apply a Wiener filter for noise reduction.
wiener_filtered = signal.wiener(noisy_signal)


V. Resampling & Windowing

• Resample a signal to a new length.
resampled = signal.resample(sine_wave, num=500) # Resample to 500 points

• Decimate a signal (downsample by a factor).
decimated = signal.decimate(sine_wave, q=4) # Downsample by 4

• Create a Hamming window.
window = signal.windows.hamming(51)

• Apply a window to a signal segment.
segment = sine_wave[0:51]
windowed_segment = segment * window


VI. Convolution & Correlation

• Perform linear convolution.
sig1 = np.repeat([0., 1., 0.], 100)
sig2 = np.repeat([0., 1., 1., 0.], 100)
convolved = signal.convolve(sig1, sig2, mode='same')

• Compute cross-correlation.
# Useful for finding delays between signals
correlation = signal.correlate(sig1, sig2, mode='full')

• Compute auto-correlation.
# Useful for finding periodicities in a signal
autocorr = signal.correlate(sine_wave, sine_wave, mode='full')


VII. Time-Frequency Analysis

• Compute and plot a spectrogram.
f, t_spec, Sxx = signal.spectrogram(chirp_signal, fs)
plt.pcolormesh(t_spec, f, Sxx, shading='gouraud')
plt.show()

• Perform Continuous Wavelet Transform (CWT).
# Note: signal.cwt and signal.ricker were removed in SciPy 1.15; use the PyWavelets package on newer versions
widths = np.arange(1, 31)
cwt_matrix = signal.cwt(chirp_signal, signal.ricker, widths)

• Perform Hilbert transform to get the analytic signal.
analytic_signal = signal.hilbert(sine_wave)

• Calculate instantaneous frequency.
instant_phase = np.unwrap(np.angle(analytic_signal))
instant_freq = (np.diff(instant_phase) / (2.0*np.pi) * fs)


VIII. Feature Extraction

• Find peaks in a signal.
peaks, _ = signal.find_peaks(sine_wave, height=0.5)

• Find peaks with prominence criteria.
peaks_prom, _ = signal.find_peaks(noisy_signal, prominence=1)

• Differentiate a signal (e.g., to find velocity from position).
derivative = np.diff(sine_wave)

• Integrate a signal.
from scipy.integrate import cumulative_trapezoid
integral = cumulative_trapezoid(sine_wave, t, initial=0)

• Detrend a signal to remove a linear trend.
trend = np.linspace(0, 1, fs)
trended_signal = sine_wave + trend
detrended = signal.detrend(trended_signal)


IX. System Analysis

• Define a system via a transfer function (numerator, denominator).
# Example: 2nd order low-pass filter
system = signal.TransferFunction([1], [1, 1, 1])

• Compute the step response of a system.
t_step, y_step = signal.step(system)

• Compute the impulse response of a system.
t_impulse, y_impulse = signal.impulse(system)

• Compute the Bode plot of a system's frequency response.
w, mag, phase = signal.bode(system)


X. Signal Generation from Data

• Generate a signal from a function.
t = np.linspace(0, 1, 500)
custom_signal = np.sinc(2 * np.pi * 4 * t)

• Convert a list of values to a signal array.
my_data = [0, 1, 2, 3, 2, 1, 0, -1, -2, -1, 0]
data_signal = np.array(my_data)

• Read signal data from a WAV file.
from scipy.io import wavfile
samplerate, data = wavfile.read('audio.wav')

• Create a pulse train signal.
pulse_train = np.zeros(fs)
pulse_train[::100] = 1 # Impulse every 100 samples


#Python #SignalProcessing #SciPy #NumPy #DSP

━━━━━━━━━━━━━━━
By: @DataScienceM
💡 Top 50 Matplotlib Commands in Python

Note: Examples assume the following imports:
import matplotlib.pyplot as plt
import numpy as np

I. Figure & Basic Plots

• Create a figure.
fig = plt.figure(figsize=(8, 6))

• Create a basic line plot.
x = np.linspace(0, 10, 100)
plt.plot(x, np.sin(x))

• Show/display the plot.
plt.show()

• Save a figure to a file.
plt.savefig("my_plot.png", dpi=300)

• Create a scatter plot.
plt.scatter(x, np.cos(x))

• Create a bar chart.
categories = ['A', 'B', 'C']
values = [3, 7, 2]
plt.bar(categories, values)

• Create a horizontal bar chart.
plt.barh(categories, values)

• Create a histogram.
data = np.random.randn(1000)
plt.hist(data, bins=30)

• Create a pie chart.
plt.pie(values, labels=categories, autopct='%1.1f%%')

• Create a box plot.
plt.boxplot([data, data*2])

• Display a 2D array or image.
matrix = np.random.rand(10, 10)
plt.imshow(matrix, cmap='viridis')

• Clear the current figure.
plt.clf()


II. Labels, Titles & Legends

• Add a title to the plot.
plt.title("Sine Wave")

• Add a label to the x-axis.
plt.xlabel("Time (s)")

• Add a label to the y-axis.
plt.ylabel("Amplitude")

• Add a legend.
plt.plot(x, np.sin(x), label='Sine')
plt.plot(x, np.cos(x), label='Cosine')
plt.legend()

• Add a grid.
plt.grid(True)

• Add text to the plot at specific coordinates.
plt.text(2, 0.5, 'An important point')

• Add an annotation with an arrow.
plt.annotate('Peak', xy=(np.pi/2, 1), xytext=(3, 1.5),
             arrowprops=dict(facecolor='black', shrink=0.05))


III. Axes & Ticks

• Set the x-axis limits.
plt.xlim(0, 5)

• Set the y-axis limits.
plt.ylim(-1.5, 1.5)

• Set the x-axis ticks and labels.
plt.xticks([0, np.pi, 2*np.pi], ['0', r'$\pi$', r'$2\pi$'])

• Set the y-axis ticks and labels.
plt.yticks([-1, 0, 1])

• Set a logarithmic scale on an axis.
plt.yscale('log')

• Set the aspect ratio of the plot.
plt.axis('equal') # Other options: 'tight', 'off'


IV. Plot Customization

• Set the color of a plot.
plt.plot(x, np.sin(x), color='red')

• Set the line style.
plt.plot(x, np.sin(x), linestyle='--')

• Set the line width.
plt.plot(x, np.sin(x), linewidth=3)

• Set the marker style for points.
plt.plot(x, np.sin(x), marker='o')

• Set the transparency (alpha).
plt.hist(data, alpha=0.5)

• Use a predefined style.
plt.style.use('ggplot')

• Fill the area between two curves.
plt.fill_between(x, np.sin(x), np.cos(x), alpha=0.2)

• Create an error bar plot.
y_err = 0.2 * np.ones_like(x)
plt.errorbar(x, np.sin(x), yerr=y_err)

• Add a horizontal line.
plt.axhline(y=0, color='k', linestyle='-')

• Add a vertical line.
plt.axvline(x=np.pi, color='k', linestyle='-')

• Add a colorbar for plots like imshow or scatter.
plt.colorbar(label='Magnitude')


V. Subplots (Object-Oriented Approach)

• Create a figure and a grid of subplots (preferred method).
fig, axes = plt.subplots(nrows=2, ncols=2, figsize=(8, 6))