How do transformers work? Learn it by hand 👇
𝗪𝗮𝗹𝗸𝘁𝗵𝗿𝗼𝘂𝗴𝗵
[1] Given
↳ Input features from the previous block (5 positions)
[2] Attention
↳ Feed all 5 features to a query-key attention module (QK) to obtain an attention weight matrix (A). I will skip the details of this module here and unpack it in a follow-up post.
[3] Attention Weighting
↳ Multiply the input features by the attention weight matrix (A) to obtain the attention-weighted features (Z). Note that there are still 5 positions.
↳ The effect is to combine features across positions (horizontally). In this case, X1 := X1 + X2, X2 := X2 + X3, etc.
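
Here is a minimal sketch of the attention weighting step in plain Python, so you can check the arithmetic by hand. The numbers in X are made up, matmul is a hypothetical helper, and I am assuming the last position has no right-hand neighbor and simply keeps its own features. The 0/1 weights are only for easy hand calculation; weights produced by a real attention module would be fractional.

```python
def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]

# X: 3 feature dimensions (rows) x 5 positions (columns). Values are made up.
X = [
    [1, 2, 3, 4, 5],
    [0, 1, 0, 1, 0],
    [2, 0, 2, 0, 2],
]

# A: 5 x 5 attention weights. Column j lists which input positions feed
# output position j. Ones in rows j and j+1 give Z_j = X_j + X_{j+1}.
# Assumption: the last column has a single one, so Z_5 = X_5.
n = 5
A = [[1 if i == j or i == j + 1 else 0 for j in range(n)] for i in range(n)]

Z = matmul(X, A)  # still 3 x 5: same number of positions as X
for row in Z:
    print(row)
```

Each column of Z is a sum of neighboring columns of X, which is what "combining across positions" means here.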
[4] FFN: First Layer
↳ Feed all 5 attention-weighted features into the first layer.
↳ Multiply these features by the weight matrix and add the biases.
↳ The effect is to combine features across feature dimensions (vertically).
↳ The dimensionality of each feature is increased from 3 to 4.
↳ Note that each position is processed by the same weight matrix. This is what the term "position-wise" is referring to.
↳ Note that the FFN is essentially a multi-layer perceptron.
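
A small sketch of the first FFN layer, assuming it multiplies each position by a weight matrix W1 (4 x 3) and adds a bias b1. The values of Z, W1, and b1 are made up for illustration, and first_layer is a hypothetical helper name. The last lines check the "position-wise" claim: running one position on its own gives the same result as that column of the full output.

```python
def first_layer(W1, b1, Z):
    """Compute H = W1 @ Z + b1, applied to every column (position) of Z."""
    d_out, d_in, n = len(W1), len(Z), len(Z[0])
    return [[sum(W1[i][k] * Z[k][j] for k in range(d_in)) + b1[i]
             for j in range(n)]
            for i in range(d_out)]

# Z: 3 feature dimensions x 5 positions. Values are made up.
Z = [
    [3, 5, 7, 9, 5],
    [1, 1, 1, 1, 0],
    [2, 2, 2, 2, 2],
]

# W1 maps 3 dimensions to 4; b1 has one bias per output dimension.
W1 = [
    [1, 0, 0],
    [0, 1, 0],
    [0, 0, 1],
    [1, -1, 0],
]
b1 = [0, -1, -3, -5]

H = first_layer(W1, b1, Z)  # 4 x 5: each position now has 4 dimensions

# Position-wise: the first position alone goes through the same W1 and b1.
first_position = [[row[0]] for row in Z]
H_first = first_layer(W1, b1, first_position)
print(H_first == [[row[0]] for row in H])
```

Unlike the attention step, nothing here mixes information between positions; only the feature dimensions within one position are combined.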
[5] ReLU
↳ Negative values are set to zero by ReLU.
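
ReLU is just max(0, v) applied to every entry. A tiny sketch, with made-up values for the 4 x 5 matrix H:

```python
def relu(M):
    """Replace every negative entry with zero; leave the rest unchanged."""
    return [[max(0, v) for v in row] for row in M]

# H: 4 feature dimensions x 5 positions. Values are made up.
H = [
    [3, 5, 7, 9, 5],
    [0, 0, 0, 0, -1],
    [-1, -1, -1, -1, -1],
    [-3, -1, 1, 3, 0],
]

R = relu(H)
for row in R:
    print(row)
```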
[6] FFN: Second Layer
↳ Feed all 5 features (d=4) into the second layer.
↳ The dimensionality of each feature is decreased from 4 back to 3.
↳ The output is fed to the next block to repeat this process.
↳ Note that the next block would have a completely separate set of parameters.
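
Putting the steps together, here is a rough sketch of one whole block as described in this walkthrough, followed by a second block with its own parameters. All numbers and helper names (matmul, add_bias, relu, block) are made up for illustration, and the attention matrix is supplied directly because the QK module is skipped in this post.

```python
def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]

def add_bias(M, b):
    """Add bias b[i] to every entry in row i of M."""
    return [[v + b[i] for v in row] for i, row in enumerate(M)]

def relu(M):
    """Replace every negative entry with zero."""
    return [[max(0, v) for v in row] for row in M]

def block(X, A, W1, b1, W2, b2):
    Z = matmul(X, A)                       # mix across positions: 3 x 5
    H = relu(add_bias(matmul(W1, Z), b1))  # first layer + ReLU: 4 x 5
    return add_bias(matmul(W2, H), b2)     # second layer, back to 3 x 5

n = 5
X = [[1, 2, 3, 4, 5], [0, 1, 0, 1, 0], [2, 0, 2, 0, 2]]

# Block 1 parameters (made up).
A = [[1 if i == j or i == j + 1 else 0 for j in range(n)] for i in range(n)]
W1 = [[1, 0, 0], [0, 1, 0], [0, 0, 1], [1, -1, 0]]
b1 = [0, -1, -3, -5]
W2 = [[1, 0, 0, 0], [0, 1, 0, 1], [0, 0, 1, -1]]
b2 = [0, 1, 0]

# Block 2 parameters: a completely separate set (also made up).
A_next = [[1 if i == j else 0 for j in range(n)] for i in range(n)]
W1_next = [[1, 1, 1], [1, 0, 0], [0, 1, 0], [0, 0, 1]]
b1_next = [0, 0, 0, 0]
W2_next = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0]]
b2_next = [0, 0, 0]

Y = block(X, A, W1, b1, W2, b2)
Y_next = block(Y, A_next, W1_next, b1_next, W2_next, b2_next)
print(len(Y), len(Y[0]), len(Y_next), len(Y_next[0]))
```

Because the output has the same 3 x 5 shape as the input, blocks can be stacked one after another.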
#ai #transformers #genai #learning
Forwarded from Machine Learning with Python
Found an easy way to learn math for ML: Mathematics for Machine Learning 🎓📚
This is a curated collection on GitHub, including books, research papers, video lectures, and basic materials on math for studying and reviewing the mathematical foundations of machine learning. 📖📊
It helps build a stronger knowledge base by bringing together trusted resources around topics that machine learning engineers constantly encounter: linear algebra, mathematical analysis, probability theory, statistics, information theory, matrix calculus, and deep learning mathematics. 🧮🤖
Free public repository on GitHub. 💻✨
https://github.com/dair-ai/Mathematics-for-ML
#MachineLearning #Mathematics #DataScience #Learning #GitHub #AI
GitHub - dair-ai/Mathematics-for-ML: 🧮 A collection of resources to learn mathematics for machine learning
Don't learn ML by randomly jumping through tutorials. 🚫📚
DS-ML Bootcamp is a public repository for a Data Science and machine learning course for beginners who want a structured path from zero to practical projects. 🚀📊
It helps learners move from installation and concepts to practical ML work, organizing lessons, assignments, code examples, datasets, and solutions around the main machine learning workflow. 🛠️🧠
Key features:
- End-to-end workflow - covers data collection, preprocessing, train/test split, model selection, training, evaluation, and deployment 🔄📈
- Lesson-based structure - starts with tools/setup, Data Science, ML, data fundamentals, and regression 📚🧮
- Practical materials - assignments give learners structured tasks, not just reading notes ✍️✅
- Code + datasets - Python examples and raw CSV datasets included for exercises 🐍📂
- Set up for repetition - the README says you can clone the repository and use Jupyter or VS Code while going through lessons 💻🔁
Free public repository on GitHub. 🆓
https://github.com/goobolabs/ds-ml-bootcamp
#MachineLearning #DataScience #Coding #Python #AI #Learning
✨ Join Best TG Channels https://t.iss.one/addlist/0f6vfFbEMdAwODBk
⭐️ Join Our WhatsApp Channel https://whatsapp.com/channel/0029VaC7Weq29753hpcggW2A
GitHub - goobolabs/ds-ml-bootcamp: Data Science and Machine Learning Bootcamp. (Jun - 2026)