Hugging Face (Twitter)
RT @srush_nlp: The work Hugging Face does continues to be incredible. Putting in serious effort to make these topics accessible and detailed.
https://huggingface.co/spaces/HuggingFaceTB/smol-training-playbook#introduction
Hugging Face (Twitter)
RT @alexinexxx: thank god i’m unemployed so i can take a break from learning cuda & just read this banger hehe https://twitter.com/eliebakouch/status/1983930328751153159#m
Hugging Face (Twitter)
RT @Hesamation: holy shit... Hugging Face cooked again! 🔥
they just dropped a free blog (BOOK) that covers the no-bs reality of building SOTA models. i haven't seen any lab/researcher go into the real decisions behind LLM research and its nuances. this is literally a gem.
Syllabus:
→ Training compass: why → what → how
→ Every big model starts with a small ablation
→ Designing the model architecture
→ The art of data curation
→ The training marathon
→ Beyond base models — post-training in 2025
→ Infrastructure - the unsung hero
skimming through the blog, this is incredibly detailed just like their ultrascale playbook. i'm gonna read this and share more about it in the coming days.
Read here: https://huggingface.co/spaces/HuggingFaceTB/smol-training-playbook
Hugging Face (Twitter)
RT @RisingSayak: With simple changes, I was able to cut down @krea_ai's new real-time video gen's timing from 25.54s to 18.14s 🔥🚀
1. FA3 through `kernels`
2. Regional compilation
3. Selective (FP8) quantization
Notes are in 🧵 below
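For readers who want to try the same tricks, here is a minimal, hypothetical sketch (not the author's actual code) of two of the three ideas: "regional compilation", i.e. compiling each repeated transformer block on its own, and selective FP8 quantization assuming torchao's `quantize_` API. The `transformer.blocks` attribute and the skipped layer names are placeholders; the real Krea pipeline will differ.

```python
# Illustrative sketch only: regional compilation + selective FP8 quantization.
# `transformer.blocks` and the skip list are placeholders, not Krea's real code.
import torch
from torchao.quantization import quantize_, float8_dynamic_activation_float8_weight


def compile_repeated_blocks(transformer: torch.nn.Module) -> None:
    # Regional compilation: compile each repeated block separately instead of
    # wrapping the whole model, which keeps compile time low while the hot
    # loop still runs compiled kernels.
    for i, block in enumerate(transformer.blocks):
        transformer.blocks[i] = torch.compile(block, fullgraph=True)


def quantize_selected_linears(transformer: torch.nn.Module) -> None:
    # Selective FP8: quantize only the big Linear layers, skipping modules
    # whose names suggest they are numerically sensitive.
    def keep(module: torch.nn.Module, fqn: str) -> bool:
        return isinstance(module, torch.nn.Linear) and not any(
            s in fqn for s in ("norm", "embed", "proj_out")
        )

    quantize_(transformer, float8_dynamic_activation_float8_weight(), filter_fn=keep)
```

In practice one would typically quantize first and compile afterwards; the FA3 gain in the tweet comes from swapping in a FlashAttention-3 kernel (e.g. via Hugging Face's `kernels` package), which is not shown here.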
Hugging Face (Twitter)
RT @Thom_Wolf: We’ve cooked another one of these 200+ page practical books on model training that we love to write.
This time it’s on all the pretraining and post-training recipes and how to run hyperparameter exploration for a training project.
Closing the trilogy of:
1. Building a pretraining dataset with the « FineWeb blog post »
2. Scaling infra GPU cluster with the « Ultrascale Playbook »
3. And now all the training recipes and HP exploration for pre- and post-training with this « Smol Training Playbook »
The HF science team on fire https://twitter.com/eliebakouch/status/1983930328751153159#m
Hugging Face (Twitter)
RT @Yampeleg: hf are doing god’s work fr https://twitter.com/_lewtun/status/1983929588909797414#m
Hugging Face (Twitter)
RT @novasarc01: many people have asked me about how to keep up with frontier research and new models. this is one of the best resources to start with. it covers pre-training, post-training, infra, architecture nuances and recent advances. huge respect to the hf team for putting it together. https://twitter.com/eliebakouch/status/1983930328751153159#m
Hugging Face (Twitter)
RT @ahadj0: finally got around to implementing the software for teleoping the detachable gripper on top of @LeRobotHF
going to release open source files for it soon
Hugging Face (Twitter)
RT @ludwigABAP: if @huggingface released a mega book with all 4 massive essays they wrote on training LLMs (ultra scale playbook, eval guidebook, smol training playbook), I’d buy it for an exorbitant amount just because it’s the closest to what most of us have been interested in and/or doing for the last few years
Hugging Face (Twitter)
RT @yacinelearning: I quit my job so I can have enough time to read this book btw https://twitter.com/eliebakouch/status/1983930328751153159#m
Hugging Face (Twitter)
RT @TheAhmadOsman: yesterday, Hugging Face dropped a 214-page MASTERCLASS on how to train LLMs
> it’s called The Smol Training Playbook
> and if you want to learn how to train LLMs,
> this GIFT is for you
> this training bible walks you through the ENTIRE pipeline
> covers every concept that matters from why you train,
> to what you train, to how you actually pull it off
> from pre-training, to mid-training, to post-training
> it turns vague buzzwords into step-by-step decisions
> architecture, tokenization, data strategy, and infra
> highlights the real-world gotchas
> instabilities, scaling headaches, debugging nightmares
> distills lessons from building actual
> state-of-the-art LLMs, not just toy models
how modern transformer models are actually built
> tokenization: the secret foundation of every LLM
> tokenizer fundamentals
> vocabulary size
> byte pair encoding
> custom vs existing tokenizers
> all the modern attention mechanisms are here
>...
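For a taste of the tokenization material in that list, here is a tiny, self-contained example (ours, not taken from the playbook) of training a byte-level BPE tokenizer with the Hugging Face `tokenizers` library; the corpus file, the 32k vocabulary size and the special token are arbitrary placeholders.

```python
# Minimal BPE tokenizer training sketch (illustrative, not from the playbook).
# "corpus.txt", the 32k vocab size and the special token are placeholders.
from tokenizers import Tokenizer, decoders, models, pre_tokenizers, trainers

tokenizer = Tokenizer(models.BPE())
tokenizer.pre_tokenizer = pre_tokenizers.ByteLevel(add_prefix_space=False)
tokenizer.decoder = decoders.ByteLevel()

trainer = trainers.BpeTrainer(
    vocab_size=32_000,                 # vocabulary size is one of the key design choices
    special_tokens=["<|endoftext|>"],  # reserve ids for special tokens up front
)
tokenizer.train(files=["corpus.txt"], trainer=trainer)

print(tokenizer.encode("Byte pair encoding in a few lines.").tokens)
```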
Hugging Face (Twitter)
RT @_xjdr: What an incredible resource. Anyone interested in pretraining should read this carefully https://twitter.com/eliebakouch/status/1983930328751153159#m
Hugging Face (Twitter)
RT @angrypenguinPNG: Open-source might have just caught up to NanoBanana
dx8152 trained a Qwen Edit LoRA that nails multi-angle shots.
Here’s a product video I made using it with Kling Start/End frames. Link to LoRA below 👇
Hugging Face (Twitter)
RT @Thom_Wolf: Monday morning read: a fascinating deep dive into recent Chinese chip developments, as well as the coming co-evolution with LLM builders
fresh analysis from the HF team
=> https://huggingface.co/blog/huggingface/shifting-compute-landscape
Hugging Face (Twitter)
RT @andimarafioti: We grew up with gaming consoles. Our kids will grow up with robots. Fucking incredible.
Hugging Face (Twitter)
RT @ClementDelangue: No excuse anymore not to train your own models! 200+ pages with full transparency. Let's go open-source AI! https://twitter.com/eliebakouch/status/1983930328751153159#m
Hugging Face (Twitter)
RT @Gradio: 🎂 MCP turns 1 on Nov 25
We're throwing it the biggest birthday party in AI history
@AnthropicAI + Gradio are bringing together thousands of developers for 17 days of building MCP servers and Agentic apps
$500K+ free credits for participants. $17.5K+ in cash prizes.
Nov 14-30
Hugging Face (Twitter)
RT @abidlabs: Trackio will soon support Images in Tables. What other features do you need?
https://github.com/gradio-app/trackio/pull/328
Hugging Face (Twitter)
RT @yukangchen_: We open-sourced QeRL — Quantization-enhanced Reinforcement Learning!
🧠 4-bit quantized RL training
💪 Train a 32B LLM on a single H100 GPU
⚙️ 1.7× faster overall training
🎯 Accuracy on par with bfloat16 training
🔥 Supports NVFP4 quantization format
Moreover, we show that quantization helps exploration in RL training.
Paper: https://huggingface.co/papers/2510.11696
Code: github.com/NVlabs/QeRL
#NVIDIA #AIResearch #ReinforcementLearning #Quantization #LLM #EfficientAI
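Rough back-of-the-envelope arithmetic (ours, not from the paper) for why 4-bit weights are what lets a 32B policy fit on a single 80 GB H100:

```python
# Back-of-the-envelope weight-memory estimate (illustrative, not from QeRL).
params = 32e9                # 32B-parameter model
bf16_gb = params * 2 / 1e9   # ~64 GB just to hold bf16 weights
fp4_gb = params * 0.5 / 1e9  # ~16 GB for 4-bit (NVFP4) weights, ignoring scale overhead
print(f"bf16 weights: ~{bf16_gb:.0f} GB, 4-bit weights: ~{fp4_gb:.0f} GB")
# The ~48 GB freed is what leaves room on an 80 GB card for activations,
# rollout KV caches, and the trainable parameters' optimizer state.
```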