𝐕𝐢𝐬𝐮𝐚𝐥 𝐛𝐥𝐨𝐠 on Vision Transformers is live.
https://vizuaranewsletter.com/p/vision-transformers?r=5b5pyd&utm_campaign=post&utm_medium=web
Learn how ViT works from the ground up, and fine-tune one on a real classification dataset.
𝐒𝐨𝐦𝐞 𝐑𝐞𝐬𝐨𝐮𝐫𝐜𝐞𝐬
ViT paper dissection
https://youtube.com/watch?v=U_sdodhcBC4
Build ViT from Scratch
https://youtube.com/watch?v=ZRo74xnN2SI
Original Paper
https://arxiv.org/abs/2010.11929
https://t.iss.one/CodeProgrammer
CNNs process images through small sliding filters. Each filter only sees a tiny local region, and the model has to stack many layers before distant parts of an image can even talk to each other.
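To put a rough number on "many layers", here is a small sketch of how the receptive field of a plain stack of identical convolutions grows. The helper names and the 224-pixel input size are assumptions for illustration; real CNNs mix strides, pooling, and dilation, so treat this as arithmetic, not a model of any specific network.

```python
def receptive_field(num_layers: int, kernel_size: int = 3, stride: int = 1) -> int:
    """Receptive field (input pixels along one axis) of a stack of identical conv layers."""
    rf = 1     # a single input pixel sees only itself
    jump = 1   # spacing, in input pixels, between neighbouring outputs of the current layer
    for _ in range(num_layers):
        rf += (kernel_size - 1) * jump
        jump *= stride
    return rf


def layers_to_cover(image_size: int, kernel_size: int = 3, stride: int = 1) -> int:
    """Smallest number of stacked layers whose receptive field spans image_size pixels."""
    layers = 0
    while receptive_field(layers, kernel_size, stride) < image_size:
        layers += 1
    return layers


print(receptive_field(1))              # 3
print(layers_to_cover(224))            # 112 layers of 3x3, stride 1
print(layers_to_cover(224, stride=2))  # 7 layers of 3x3, stride 2
```

With stride 1, each 3x3 layer adds only two pixels of context, which is why CNNs rely on striding or pooling to let distant regions interact.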
Vision Transformers threw that whole approach out.
ViT chops an image into patches, treats each patch like a token, and runs self-attention across the full sequence.
Every patch can attend to every other patch from the very first layer. No stacking required.
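The patch step can be sketched in a few lines of NumPy. The 224x224 input and 16x16 patch size are assumptions (the post does not state them), and `patchify` is a hypothetical helper name, not code from the blog.

```python
import numpy as np


def patchify(image: np.ndarray, patch_size: int) -> np.ndarray:
    """Split an (H, W, C) image into non-overlapping patches, one flattened row per patch."""
    h, w, c = image.shape
    if h % patch_size or w % patch_size:
        raise ValueError("image height and width must be divisible by patch_size")
    gh, gw = h // patch_size, w // patch_size
    # (gh, P, gw, P, C) -> (gh, gw, P, P, C): group the pixels that belong to each patch
    patches = image.reshape(gh, patch_size, gw, patch_size, c).transpose(0, 2, 1, 3, 4)
    return patches.reshape(gh * gw, patch_size * patch_size * c)


# Assumed sizes: a 224x224 RGB image cut into 16x16 patches.
image = np.arange(224 * 224 * 3, dtype=np.float32).reshape(224, 224, 3)
tokens = patchify(image, patch_size=16)
print(tokens.shape)  # (196, 768): 196 patch "tokens", each a 768-value vector
```

Each row plays the role of one token; a learned linear projection would then map it to the model's embedding size.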
That global view from layer one, combined with pretraining on very large datasets, is what let ViT match or surpass strong CNNs on large-scale image classification benchmarks.
𝐖𝐡𝐚𝐭 𝐭𝐡𝐞 𝐛𝐥𝐨𝐠 𝐜𝐨𝐯𝐞𝐫𝐬:
- Introduction to Vision Transformers and comparison with CNNs
- Adapting transformers to images: patch embeddings and flattening
- Positional encodings in Vision Transformers
- Encoder-only structure for classification
- Benefits and drawbacks of ViT
- Real-world applications of Vision Transformers
- Hands-on: fine-tuning ViT for image classification
The image below shows the contrast: self-attention connects every patch to every other patch at once, while convolution only sees a small local window. That's why ViT can capture long-range structure that CNNs tend to miss in their early layers, like the optical illusion painting where distant patches form a hidden face.
The architecture is simple. Split the image into patches, flatten each patch and project it into an embedding (like a word in a sentence), add positional encodings, prepend a class token, and run the sequence through a Transformer encoder. The class token collects information from all patches for the final prediction. Patch in, class out.
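A minimal sketch of that pipeline's input and output ends, in NumPy with random weights standing in for learned parameters. All sizes and names (`build_tokens`, `classify`, an embedding width of 64) are assumptions for illustration, and the Transformer encoder itself is deliberately left out, so the prediction here is meaningless; only the shapes and data flow are the point.

```python
import numpy as np

rng = np.random.default_rng(0)
num_patches, patch_dim, embed_dim, num_classes = 196, 768, 64, 37  # assumed sizes

# Stand-in for flattened image patches, plus random "learned" parameters.
patches = rng.normal(size=(num_patches, patch_dim))
w_embed = rng.normal(scale=0.02, size=(patch_dim, embed_dim))
cls_token = rng.normal(scale=0.02, size=(1, embed_dim))
pos_embed = rng.normal(scale=0.02, size=(num_patches + 1, embed_dim))
w_head = rng.normal(scale=0.02, size=(embed_dim, num_classes))


def build_tokens(patches, w_embed, cls_token, pos_embed):
    """Project patches to embeddings, prepend the class token, add positions."""
    x = patches @ w_embed                        # (N, D)
    x = np.concatenate([cls_token, x], axis=0)   # (N + 1, D), class token at index 0
    return x + pos_embed


def classify(encoded, w_head):
    """Read the prediction from the class token (row 0) of the encoder output."""
    logits = encoded[0] @ w_head
    return int(np.argmax(logits)), logits


tokens = build_tokens(patches, w_embed, cls_token, pos_embed)
# A real model would pass `tokens` through the Transformer encoder here.
predicted_class, logits = classify(tokens, w_head)
print(tokens.shape, logits.shape)  # (197, 64) (37,)
```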
Inside attention: each patch (query) compares itself to all patches (keys), softmax turns the scores into attention weights, and the weighted sum of values produces a new representation that is aware of the full image. The blog also visualizes what the CLS token actually attends to through attention heatmaps.
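The query/key/value computation can be sketched as single-head scaled dot-product attention in NumPy. Weights are random, and the 14x14 patch grid and width of 32 are assumed, so the heatmap has the right shape but no meaning; this is not the blog's code.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def self_attention(tokens, w_q, w_k, w_v):
    """Single-head scaled dot-product self-attention over a (T, D) token matrix."""
    q, k, v = tokens @ w_q, tokens @ w_k, tokens @ w_v
    scores = q @ k.T / np.sqrt(k.shape[-1])   # (T, T): every token scored against every token
    weights = softmax(scores, axis=-1)        # each row sums to 1
    return weights @ v, weights


rng = np.random.default_rng(0)
grid, dim = 14, 32                                   # assumed: 14x14 patches, width 32
tokens = rng.normal(size=(1 + grid * grid, dim))     # row 0 is the CLS token
w_q, w_k, w_v = (rng.normal(scale=dim ** -0.5, size=(dim, dim)) for _ in range(3))

out, weights = self_attention(tokens, w_q, w_k, w_v)
# Attention paid by CLS to each patch, laid back out on the patch grid.
cls_heatmap = weights[0, 1:].reshape(grid, grid)
print(out.shape, weights.shape, cls_heatmap.shape)  # (197, 32) (197, 197) (14, 14)
```

In a trained model, upsampling `cls_heatmap` to the image size gives the kind of attention heatmap the post describes.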
The second half of the blog is hands-on code. I fine-tuned ViT-Base from Google (86M params) on the Oxford-IIIT Pet dataset: 37 breeds, ~7,400 images.
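As a sanity check on the "86M params" figure, the count can be reproduced by hand. The sketch assumes the standard ViT-Base layout with 16x16 patches on 224x224 inputs (width 768, 12 layers, MLP width 3072); the post does not say which ViT-Base variant was used, so these values are assumptions.

```python
def linear(n_in: int, n_out: int) -> int:
    """Parameters in a dense layer: weight matrix plus bias."""
    return n_in * n_out + n_out


def vit_param_count(image_size=224, patch_size=16, channels=3,
                    dim=768, depth=12, mlp_dim=3072, num_classes=37):
    """Return (backbone, head) parameter counts for a plain ViT encoder."""
    num_patches = (image_size // patch_size) ** 2
    patch_embed = linear(patch_size * patch_size * channels, dim)
    cls_token = dim
    pos_embed = (num_patches + 1) * dim
    attention = linear(dim, 3 * dim) + linear(dim, dim)    # fused Q/K/V + output projection
    mlp = linear(dim, mlp_dim) + linear(mlp_dim, dim)
    layer_norms = 2 * (2 * dim)                            # two LayerNorms per block
    block = attention + mlp + layer_norms
    final_norm = 2 * dim
    backbone = patch_embed + cls_token + pos_embed + depth * block + final_norm
    head = linear(dim, num_classes)                        # new 37-way classifier
    return backbone, head


backbone, head = vit_param_count()
print(backbone)  # 85798656, i.e. about 86M
print(head)      # 28453
```

The new 37-class head is tiny next to the backbone, which is part of why fine-tuning on a few thousand images is practical.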
Forwarded from Machine Learning with Python
Follow the Machine Learning with Python channel on WhatsApp: https://whatsapp.com/channel/0029VaC7Weq29753hpcggW2A
📱 TorchCode — a PyTorch practice tool for ML interview preparation
40 tasks on implementing operators and architectures that actually come up in interviews. Automatic checking, hints, and reference solutions — all in the browser without installation.
If you're preparing for an ML interview, it's useful to go through at least half of them.
Link: https://github.com/duoan/TorchCode
tags: #useful #pytorch
https://t.iss.one/CodeProgrammer✅
SVFR — a full-fledged framework for restoring faces in videos.
It can enhance faces, colorize footage, and redraw damaged regions.
Essentially, the model takes old or damaged videos and makes them look as if they were shot yesterday. And it's free and open-source.
1. Create an environment
conda create -n svfr python=3.9 -y
conda activate svfr
2. Install PyTorch (for your CUDA)
pip install torch==2.2.2 torchvision==0.17.2 torchaudio==2.2.2
3. Install dependencies
pip install -r requirements.txt
4. Download models
conda install git-lfs
git lfs install
git clone https://huggingface.co/stabilityai/stable-video-diffusion-img2vid-xt models/stable-video-diffusion-img2vid-xt
5. Start processing videos
python infer.py \
--config config/infer.yaml \
--task_ids 0 \
--input_path input.mp4 \
--output_dir results/ \
--crop_face_region
Where task_ids:
0 — face enhancement
1 — colorization
2 — redrawing damage
An ideal tool if:
#python #soft #github
https://t.iss.one/CodeProgrammer
A huge cheat sheet for Python, Django, Plotly, Matplotlib, P.pdf
741 KB
Many topics are covered inside:
https://t.iss.one/CodeProgrammer
Not just another "what is a neural network" course — this one is about building production-ready ML systems around models.
What's inside:
▶️ Building autograd, optimizers, attention, and mini-PyTorch from scratch;
▶️ Batching, numerical precision, architectures, and training;
▶️ Performance optimization, hardware acceleration, and benchmarking.
You can read the book and the code for free right now.
https://github.com/harvard-edge/cs249r_book
📱 Python enthusiasts, this is for you — 15 BEST REPOSITORIES on GitHub for learning Python
▶️ Awesome Python — https://github.com/vinta/awesome-python
— the largest and most authoritative collection of frameworks, libraries, and resources for Python — a must-save
▶️ TheAlgorithms/Python — https://github.com/TheAlgorithms/Python
— a huge collection of algorithms and data structures written in Python
▶️ Project-Based-Learning — https://github.com/practical-tutorials/project-based-learning
— learning Python (and other languages) through real projects
▶️ Real Python Guide — https://github.com/realpython/python-guide
— a high-quality guide to the Python ecosystem, tools, and best practices
▶️ Materials from Real Python — https://github.com/realpython/materials
— a collection of code and projects for Real Python articles and courses
▶️ Learn Python — https://github.com/trekhleb/learn-python
— a reference with explanations, examples, and exercises
▶️ Learn Python 3 — https://github.com/jerry-git/learn-python3
— a convenient guide to modern Python 3 with tasks
▶️ Python Reference — https://github.com/rasbt/python_reference
— cheat sheets, scripts, and useful tips from one of the most respected Python authors
▶️ 30-Days-Of-Python — https://github.com/Asabeneh/30-Days-Of-Python
— a 30-day challenge: from syntax to more complex topics
▶️ Python Programming Exercises — https://github.com/zhiwehu/Python-programming-exercises
— 100+ Python tasks with answers
▶️ Coding Problems — https://github.com/MTrajK/coding-problems
— tasks on algorithms and data structures, also useful for interview preparation
▶️ Projects — https://github.com/karan/Projects
— a list of ideas for pet projects (not just Python). Great for practice
▶️ 100-Days-Of-ML-Code — https://github.com/Avik-Jain/100-Days-Of-ML-Code
— machine learning in Python in the format of a challenge
▶️ 30-Seconds-of-Python — https://github.com/30-seconds/30-seconds-of-python
— useful snippets and tricks for everyday tasks
▶️ Geekcomputers/Python — https://github.com/geekcomputers/Python
— various scripts: from working with the network to automation tasks
React ♥️ for more posts like this💛
Classical filters & convolution: The heart of computer vision
Before deep learning exploded onto the scene, traditional computer vision centered on filters: small, hand-engineered matrices that you convolve with an image to detect specific features like edges, corners, or textures. In this article, we dive into the details of classical filters and the convolution operation: how they work, why they matter, and how to implement them.
More: https://www.vizuaranewsletter.com/p/classical-filters-and-convolution
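As a small illustration of what convolving a hand-engineered filter with an image means, here is a sketch using the well-known Sobel kernel for horizontal gradients on a toy image. The function name and the 6x6 test image are assumptions, not code from the article, and the loop-based version favors clarity over speed.

```python
import numpy as np


def convolve2d(image: np.ndarray, kernel: np.ndarray) -> np.ndarray:
    """'Valid' 2D convolution of a grayscale image with a small kernel (no padding)."""
    kernel = np.flipud(np.fliplr(kernel))   # flip the kernel: true convolution, not correlation
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    out = np.zeros((out_h, out_w), dtype=float)
    for i in range(out_h):
        for j in range(out_w):
            # Slide the kernel over the image and sum the element-wise products.
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out


# Sobel kernel that responds to left-to-right intensity changes (vertical edges).
sobel_x = np.array([[-1, 0, 1],
                    [-2, 0, 2],
                    [-1, 0, 1]], dtype=float)

# Toy image: dark left half, bright right half, so there is one vertical edge.
image = np.zeros((6, 6))
image[:, 3:] = 1.0

edges = convolve2d(image, sobel_x)
print(edges)  # every row is [0, -4, -4, 0]: nonzero only where the edge is
```

The response is zero over flat regions and large in magnitude at the edge; the sign depends on the kernel flip, so the magnitude is usually what gets inspected.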