Machine Learning with Python
Learn Machine Learning with hands-on Python tutorials, real-world code examples, and clear explanations for researchers and developers.

Admin: @HusseinSheikho || @Hussein_Sheikho
Most AI engineers have never fully understood the maths behind what they build! 🤯🧮

This is an open, unconventional textbook covering maths, CS, and AI from the ground up, written for curious practitioners who want to deeply understand the field, not just survive an interview. 📘✨

Over 7 years of AI/ML experience distilled into intuition-first, no-hand-waving explanations that connect the concepts in a way that actually sticks. 🧠🔗

What it covers:
- Vectors, linear algebra, calculus, and optimization 📐📉
- Classical machine learning and deep learning 🤖
- Transformer architectures and LLMs 🦄
- Efficient architectures, quantization, and distillation ⚡️
- CUDA, GPU programming, and SIMD 🚀
- AI inference and deployment 🌐

Ships with an MCP server so Claude Code, Cursor, and any MCP-compatible agent can use the compendium as a live knowledge base during development. You only need elementary maths and basic Python to start. 🐍

Repo: https://github.com/HenryNdubuaku/maths-cs-ai-compendium 🔗

https://t.iss.one/CodeProgrammer
โค7
Overfitting and Generalisation in ML.pdf
Overfitting and Generalization in Machine Learning

My ML model had 100% accuracy.
And was completely useless.

That's not a paradox; that's overfitting.

The model didn't learn. It memorized.

Here's the mathematical core most tutorials skip:

E[loss] = Bias² + Variance + σ²

→ Bias² dominates → model too simple → underfitting
→ Variance dominates → model too complex → overfitting
→ σ² → irreducible noise → always there (all three are estimated in the sketch below)
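You can check the decomposition numerically. Here is a minimal sketch (my own illustration, not code from the attached paper): refit the same polynomial class on many resampled training sets, then measure how far the average fit sits from the truth (Bias²) and how much individual fits scatter around that average (Variance).

```python
import numpy as np

rng = np.random.default_rng(0)
x_grid = np.linspace(0, 1, 100)
f_true = np.sin(2 * np.pi * x_grid)   # true signal
sigma = 0.3                           # noise std, so σ² = 0.09 is the irreducible term

def bias_variance(degree, n_train=20, n_repeats=300):
    preds = []
    for _ in range(n_repeats):        # many independent training sets from the same process
        x = rng.uniform(0, 1, n_train)
        y = np.sin(2 * np.pi * x) + rng.normal(0, sigma, n_train)
        preds.append(np.polyval(np.polyfit(x, y, degree), x_grid))
    preds = np.asarray(preds)
    bias_sq = np.mean((preds.mean(axis=0) - f_true) ** 2)  # (average fit - truth)²
    variance = np.mean(preds.var(axis=0))                   # scatter of fits around their average
    return bias_sq, variance

for degree in (1, 3, 9):
    b, v = bias_variance(degree)
    print(f"degree {degree}: Bias² ≈ {b:.3f}, Variance ≈ {v:.3f}, σ² = {sigma**2:.2f}")
```

Degree 1 is dominated by Bias², degree 9 by Variance; the σ² = 0.09 floor never moves.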

What this actually means in practice:

→ A degree-9 polynomial on 6 data points hits R² = 1.0 and oscillates wildly between them (reproduced in the sketch below)
→ A linear model on sine-wave data has near-zero variance, but massive bias
→ The optimal model isn't the simplest or the most complex; it's the one minimizing Bias² + Variance
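The degree-9 claim is easy to reproduce. A minimal sketch, assuming noisy sine data (not the paper's exact setup):

```python
import numpy as np
from sklearn.metrics import r2_score

rng = np.random.default_rng(1)
x_train = np.sort(rng.uniform(0, 1, 6))
y_train = np.sin(2 * np.pi * x_train) + rng.normal(0, 0.1, 6)

coeffs = np.polyfit(x_train, y_train, 9)   # 10 coefficients for 6 points; numpy emits a RankWarning
x_test = rng.uniform(0, 1, 200)
y_test = np.sin(2 * np.pi * x_test) + rng.normal(0, 0.1, 200)

print("train R²:", r2_score(y_train, np.polyval(coeffs, x_train)))  # ≈ 1.0: perfect interpolation
print("test  R²:", r2_score(y_test, np.polyval(coeffs, x_test)))    # typically far below 1, often negative
```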

And the generalization gap?

Formally defined as:
gen_gap(f) = R(f) − R_emp(f)
where R(f) is the true (population) risk and R_emp(f) is the empirical risk on the training set.

When this value is ≫ 0, your model is learning noise, not signal.
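In practice R(f) is estimated on held-out data, so the gap is simply test risk minus training risk. A minimal sketch, using an unpruned decision tree as a stand-in for an over-capacity model:

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeRegressor
from sklearn.metrics import mean_squared_error

rng = np.random.default_rng(2)
X = rng.uniform(-3, 3, (400, 1))
y = np.sin(X[:, 0]) + rng.normal(0, 0.3, 400)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.5, random_state=0)
model = DecisionTreeRegressor().fit(X_tr, y_tr)          # unconstrained depth → memorizes training noise

R_emp = mean_squared_error(y_tr, model.predict(X_tr))    # empirical risk (training error)
R_hat = mean_squared_error(y_te, model.predict(X_te))    # held-out estimate of the true risk R(f)
print(f"R_emp ≈ {R_emp:.3f}, R(f) ≈ {R_hat:.3f}, gen_gap ≈ {R_hat - R_emp:.3f}")
```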

The fix isn't "collect more data and hope."
The fix is regularization, which I derive fully in my paper: L1, L2, Dropout, and Early Stopping, all from first principles.
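For a quick taste of the first two (standard scikit-learn, not the paper's derivations): L2 shrinks all weights smoothly, while L1 drives many of them to exactly zero.

```python
import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures, StandardScaler
from sklearn.linear_model import LinearRegression, Ridge, Lasso

rng = np.random.default_rng(3)
x = np.sort(rng.uniform(0, 1, 30)).reshape(-1, 1)
y = np.sin(2 * np.pi * x[:, 0]) + rng.normal(0, 0.2, 30)

models = [("no regularization", LinearRegression()),
          ("L2 (Ridge)       ", Ridge(alpha=1e-2)),
          ("L1 (Lasso)       ", Lasso(alpha=1e-2, max_iter=100_000))]

for name, reg in models:
    # same over-parameterized degree-15 polynomial features, different penalties
    pipe = make_pipeline(PolynomialFeatures(degree=15), StandardScaler(), reg).fit(x, y)
    w = pipe[-1].coef_.ravel()
    print(f"{name}: max |w| = {np.abs(w).max():.2f}, non-zero weights = {(np.abs(w) > 1e-6).sum()}")
```

Dropout and early stopping play the analogous role for neural networks; the full reasoning is in the paper.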

Which regularization strategy do you use most and why?

https://t.iss.one/CodeProgrammer
โค6