Machine Learning
Real Machine Learning: simple, practical, and built on experience.
Learn step by step with clear explanations and working code.

Admin: @HusseinSheikho || @Hussein_Sheikho
🛠 Beyond the Gradient: The Mathematics Behind Loss Functions

ML engineers often treat loss functions as "set-and-forget" hyperparameters. But the loss is not just a training detail; it is the mathematical statement of what the model is supposed to care about.

โžก๏ธ In ๐ซ๐ž๐ ๐ซ๐ž๐ฌ๐ฌ๐ข๐จ๐ง, ๐Œ๐’๐„ pushes the model to reduce large errors aggressively, which makes it sensitive to outliers, while ๐Œ๐€๐„ treats all errors more evenly and is often more robust.
โ†ณ ๐‡๐ฎ๐›๐ž๐ซ ๐ฅ๐จ๐ฌ๐ฌ sits between the two, using squared error for small deviations and absolute error for larger ones.
โ†ณ ๐๐ฎ๐š๐ง๐ญ๐ข๐ฅ๐ž ๐ฅ๐จ๐ฌ๐ฌ becomes useful when the goal is not a single prediction, but an interval or asymmetric risk, and ๐๐จ๐ข๐ฌ๐ฌ๐จ๐ง ๐ฅ๐จ๐ฌ๐ฌ fits naturally when the target is a count or rate.
โžก๏ธ In ๐œ๐ฅ๐š๐ฌ๐ฌ๐ข๐Ÿ๐ข๐œ๐š๐ญ๐ข๐จ๐ง, ๐‚๐ซ๐จ๐ฌ๐ฌ-๐„๐ง๐ญ๐ซ๐จ๐ฉ๐ฒ remains the core objective because it trains the model to produce good probabilities, not just correct labels.
โ†ณ ๐๐ข๐ง๐š๐ซ๐ฒ ๐‚๐ซ๐จ๐ฌ๐ฌ-๐„๐ง๐ญ๐ซ๐จ๐ฉ๐ฒ is the natural choice for two-class or multi-label settings, while ๐‚๐š๐ญ๐ž๐ ๐จ๐ซ๐ข๐œ๐š๐ฅ ๐‚๐ซ๐จ๐ฌ๐ฌ-๐„๐ง๐ญ๐ซ๐จ๐ฉ๐ฒ extends that idea to multi-class softmax outputs.
โ†ณ ๐Š๐‹ ๐ƒ๐ข๐ฏ๐ž๐ซ๐ ๐ž๐ง๐œ๐ž is especially important when the task involves matching distributions, such as distillation, variational inference, or probabilistic modeling.
โ†ณ ๐‡๐ข๐ง๐ ๐ž ๐ฅ๐จ๐ฌ๐ฌ and squared hinge loss reflect the margin-based logic behind SVM-style learning, and focal loss is particularly valuable when easy examples dominate and the hard cases need more attention.
โžก๏ธ In ๐ฌ๐ฉ๐ž๐œ๐ข๐š๐ฅ๐ข๐ณ๐ž๐ ๐ญ๐š๐ฌ๐ค๐ฌ, the choice of loss becomes even more meaningful.
โ†ณ ๐ƒ๐ข๐œ๐ž ๐ฅ๐จ๐ฌ๐ฌ works well in segmentation because it focuses on overlap and helps with class imbalance.
โ†ณ ๐†๐€๐ ๐ฅ๐จ๐ฌ๐ฌ drives the generatorโ€“discriminator game in adversarial learning.
โ†ณ ๐“๐ซ๐ข๐ฉ๐ฅ๐ž๐ญ ๐ฅ๐จ๐ฌ๐ฌ and contrastive loss shape embedding spaces so that similarity is learned directly.
โ†ณ ๐‚๐“๐‚ ๐ฅ๐จ๐ฌ๐ฌ solves alignment problems in sequence tasks like speech recognition and OCR, where labels are unsegmented.
โ†ณ ๐‚๐จ๐ฌ๐ข๐ง๐ž ๐ฉ๐ซ๐จ๐ฑ๐ข๐ฆ๐ข๐ญ๐ฒ is useful when vector direction matters more than magnitude.

💡 The bigger takeaway: The loss function encodes your assumptions about the problem. It affects convergence, stability, calibration, robustness, and generalization; sometimes just as much as the architecture itself.
➜ So the real question is not only "Which model should I use?"
➜ It is also: "What behavior is this loss encouraging?"

https://t.iss.one/MachineLearning9
โค5๐Ÿ‘1๐Ÿ”ฅ1
🔖 10 Stanford courses on AI and ML, with official pages and all materials

โ–ถ๏ธ CS221: Artificial Intelligence
โ–ถ๏ธ CS229: Machine Learning
โ–ถ๏ธ CS229M: Theory of Machine Learning
โ–ถ๏ธ CS230: Deep Learning
โ–ถ๏ธ CS234: Reinforcement Learning
โ–ถ๏ธ CS224N: Natural Language Processing
โ–ถ๏ธ CS231N: Deep Learning for Computer Vision
โ–ถ๏ธ CME295: Large Language Models
โ–ถ๏ธ CS236: Deep Generative Models
โ–ถ๏ธ CS336: Modeling Language from Scratch

They cover the entire spectrum: classic ML, LLMs, and generative models, with theory and practice.

tags: #python #ML #LLM #AI

➡️ https://t.iss.one/MachineLearning9
Algorithms by Jeff Erickson - one of the best algorithm books out there 📚.

The illustrations make complex concepts surprisingly easy to follow 🎨. Highly recommend this 👍.

Link: https://jeffe.cs.illinois.edu/teaching/algorithms/ 🔗

https://t.iss.one/MachineLearning9
โค3๐Ÿ‘3๐Ÿ”ฅ1
Every data professional forgets which statistical test to use. Here's the fix. 🛠

(Bookmark it. Seriously. 📌)

I've been there:
↳ Staring at two datasets wondering which test to run 🤔
↳ Googling "t-test vs ANOVA" for the 10th time 🔍
↳ Second-guessing myself in an interview 😰

Choosing the wrong statistical test can invalidate your findings and lead to flawed conclusions. ⚠️

Here's your quick reference guide:

Comparing Means: 📊
↳ 2 independent groups → Independent t-Test
↳ Same group, before/after → Paired t-Test
↳ 3+ groups → ANOVA
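All three tests are one-liners in `scipy.stats`. A quick sketch on synthetic data, where the group sizes and effect sizes are invented purely for illustration:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)
before = rng.normal(10.0, 2.0, size=50)          # e.g. scores before a change
after = before + rng.normal(1.0, 1.0, size=50)   # same subjects, re-measured
group_c = rng.normal(10.5, 2.0, size=50)         # an unrelated third group

t_ind, p_ind = stats.ttest_ind(before, group_c)           # 2 independent groups
t_rel, p_rel = stats.ttest_rel(before, after)             # same group, before/after
f_stat, p_anova = stats.f_oneway(before, after, group_c)  # 3+ groups
print(p_ind, p_rel, p_anova)
```

Note how the paired test exploits the subject-level pairing: the shared per-subject noise cancels in the differences, so the same mean shift yields a far smaller p-value than an independent-samples comparison would.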

๐๐จ๐ง-๐๐จ๐ซ๐ฆ๐š๐ฅ ๐ƒ๐š๐ญ๐š: ๐Ÿ“‰
โ†ณ 2 groups โ†’ Mann-Whitney U Test
โ†ณ Paired samples โ†’ Wilcoxon Signed-Rank Test
โ†ณ 3+ groups โ†’ Kruskal-Wallis Test
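The rank-based versions live in `scipy.stats` too. A sketch on deliberately skewed (exponential) synthetic data; the Wilcoxon call just treats the two equal-length arrays as if they were paired, for illustration:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(7)
a = rng.exponential(1.0, size=60)   # skewed, clearly non-normal
b = rng.exponential(1.5, size=60)
c = rng.exponential(1.0, size=60)

u_stat, p_u = stats.mannwhitneyu(a, b)   # 2 groups, rank-based
w_stat, p_w = stats.wilcoxon(a, b)       # paired samples (paired by index here)
h_stat, p_k = stats.kruskal(a, b, c)     # 3+ groups
print(p_u, p_w, p_k)
```

Because these tests work on ranks, the heavy right tail of the exponential data does not distort them the way it would a t-test.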

๐‘๐ž๐ฅ๐š๐ญ๐ข๐จ๐ง๐ฌ๐ก๐ข๐ฉ๐ฌ: ๐Ÿ”—
โ†ณ Linear relationship โ†’ Pearson Correlation
โ†ณ Ranked/non-linear โ†’ Spearman Correlation
โ†ณ Two categorical variables โ†’ Chi-Square Test
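Again all three are single `scipy.stats` calls; the data below is synthetic, and the contingency-table counts are made up:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(3)
x = rng.normal(0.0, 1.0, size=100)
y = 2.0 * x + rng.normal(0.0, 0.5, size=100)   # roughly linear relationship

r, p_r = stats.pearsonr(x, y)        # linear association
rho, p_s = stats.spearmanr(x, y)     # rank-based (monotonic) association

# Two categorical variables: observed counts in a 2x2 contingency table
table = np.array([[30, 10],
                  [15, 25]])
chi2, p_chi, dof, expected = stats.chi2_contingency(table)
print(r, rho, p_chi)
```

On a monotonic but non-linear relationship (say y = exp(x)), Spearman would stay near 1 while Pearson drops, which is the practical reason to keep both in the toolbox.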

๐๐ซ๐ž๐๐ข๐œ๐ญ๐ข๐จ๐ง: ๐Ÿ”ฎ
โ†ณ Continuous outcome โ†’ Linear Regression
โ†ณ Binary outcome (yes/no) โ†’ Logistic Regression
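A sketch of both: the linear fit uses `scipy.stats.linregress`, while the logistic side is written as plain gradient descent in NumPy rather than any particular library call. The data and the yes/no rule are invented:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)

# Continuous outcome -> ordinary least-squares line
x = np.arange(20, dtype=float)
y = 3.0 * x + 2.0 + rng.normal(0.0, 0.5, size=20)
fit = stats.linregress(x, y)          # slope/intercept recovered near 3 and 2

# Binary outcome -> logistic regression fit by gradient descent
xs = (x - x.mean()) / x.std()         # standardize for stable steps
X = np.column_stack([np.ones_like(xs), xs])
labels = (x > 10).astype(float)       # made-up yes/no outcome

w = np.zeros(2)
for _ in range(2000):
    probs = 1.0 / (1.0 + np.exp(-X @ w))             # sigmoid
    w -= 0.5 * X.T @ (probs - labels) / len(labels)  # cross-entropy gradient step

preds = (1.0 / (1.0 + np.exp(-X @ w)) > 0.5).astype(float)
print(fit.slope, fit.intercept)
```

The gradient step is the same cross-entropy machinery used for classification losses: the update is driven by (predicted probability - label), so confident correct predictions contribute almost nothing.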

๐•๐š๐ซ๐ข๐š๐ง๐œ๐ž: โš–๏ธ
โ†ณ Compare spread between groups โ†’ Levene's Test / F-Test
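Levene's test is one `scipy.stats` call; the two groups below are synthetic, with the same mean but deliberately different spreads:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(5)
a = rng.normal(0.0, 1.0, size=80)
b = rng.normal(0.0, 3.0, size=80)   # same mean, much larger spread

stat, p = stats.levene(a, b)        # robust to departures from normality
print(p)                            # small p -> the spreads differ
```

Levene's test is usually preferred over the plain F-ratio of variances because the F-test is quite sensitive to non-normal data.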

Here are 5 resources to help you: 📚

1. Khan Academy Statistics: https://lnkd.in/statistics-khan
2. StatQuest YouTube Channel: https://lnkd.in/statquest-yt
3. Seeing Theory (Visual Stats): https://lnkd.in/seeing-theory
4. Statistics by Jim Blog: https://lnkd.in/stats-jim
5. OpenIntro Statistics (Free Textbook): https://lnkd.in/openintro-stats
โค2