Hugging Face (Twitter)
RT @abacaj: Pretty bullish on LoRA fine tuning again. Idk if it’s because the models are so much better today that they adapt much more easily or what... someone should study this
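The post above only name-drops LoRA fine-tuning, so here is a minimal, self-contained sketch of the LoRA idea itself (illustrative numpy, hypothetical sizes — not any specific model or library): a frozen weight `W` gets a trainable low-rank delta `B @ A`, with `B` zero-initialized so the adapted layer starts out identical to the base layer.

```python
import numpy as np

# Minimal LoRA sketch (hypothetical shapes, illustrative only):
# instead of updating a frozen weight W (d_out x d_in), train a
# low-rank delta B @ A with rank r << min(d_out, d_in).
rng = np.random.default_rng(0)
d_in, d_out, r, alpha = 64, 64, 4, 8

W = rng.normal(size=(d_out, d_in))          # frozen pretrained weight
A = rng.normal(scale=0.01, size=(r, d_in))  # trainable, small random init
B = np.zeros((d_out, r))                    # trainable, zero init -> delta starts at 0

def lora_forward(x):
    # y = (W + (alpha / r) * B @ A) @ x, computed without materializing the sum
    return W @ x + (alpha / r) * (B @ (A @ x))

x = rng.normal(size=(d_in,))
# With B = 0 the adapted layer matches the base layer exactly.
assert np.allclose(lora_forward(x), W @ x)

# The trainable parameter count is tiny compared to the full weight:
full, lora = W.size, A.size + B.size
print(f"full: {full} params, LoRA: {lora} params ({lora / full:.1%})")
```

Only `A` and `B` (512 parameters here, vs. 4,096 for the full matrix) would receive gradients during fine-tuning, which is what makes the approach cheap.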
Hugging Face (Twitter)
RT @mervenoyann: meet-up next month at @huggingface Paris office with our friends at @bfl_ml and @fal 🇫🇷🥖🤗
talks, networking, food, swag 🕺🏻are you in? 🤝
Hugging Face (Twitter)
RT @pollenrobotics: The first Reachy Mini units are on their way! 🚀
Our Community Beta Program is starting soon — selected testers will receive their robots to help us improve docs, software & explore new features.
Lite & Wireless versions ship around Dec 15!
Hugging Face (Twitter)
RT @ClementDelangue: It's easier than ever to train, optimize and run your own models thanks to open-source (versus delegating all learning, control, capabilities to black-box APIs).
Cool to see @karpathy proving it once more by leveraging @huggingface fineweb (https://huggingface.co/datasets/karpathy/fineweb-edu-100b-shuffle)! https://twitter.com/karpathy/status/1977755427569111362#m
Hugging Face (Twitter)
RT @BdsLoick: New blog post analyzing the top 50 entities with the most downloaded models on @huggingface 🤗!
The purpose here is to get an idea of the profile of the models with the greatest impact in open source (we are not interested in closed models here!).
Some key findings:
Hugging Face (Twitter)
RT @karpathy: Excited to release new repo: nanochat!
(it's among the most unhinged I've written).
Unlike my earlier similar repo nanoGPT which only covered pretraining, nanochat is a minimal, from-scratch, full-stack training/inference pipeline of a simple ChatGPT clone in a single, dependency-minimal codebase. You boot up a cloud GPU box, run a single script, and as little as 4 hours later you can talk to your own LLM in a ChatGPT-like web UI.
It weighs ~8,000 lines of imo quite clean code to:
- Train the tokenizer using a new Rust implementation
- Pretrain a Transformer LLM on FineWeb, evaluate CORE score across a number of metrics
- Midtrain on user-assistant conversations from SmolTalk, multiple choice questions, tool use.
- SFT, evaluate the chat model on world knowledge multiple choice (ARC-E/C, MMLU), math (GSM8K), code (HumanEval)
- RL the model optionally on GSM8K with "GRPO"
- Efficient inference of the model in an Engine with KV cache,...
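GRPO is only name-checked in the step list above. As a minimal, self-contained sketch of its core trick (not nanochat's actual implementation): sample a group of completions per prompt, score each one (e.g. 1.0 if the GSM8K answer is correct, else 0.0), and normalize rewards within the group, which replaces a learned value function as the advantage baseline. Numbers below are illustrative.

```python
# Group-relative advantages, the centerpiece of GRPO-style RL:
# each sampled completion's reward is standardized against the
# other completions for the same prompt.

def group_relative_advantages(rewards, eps=1e-8):
    n = len(rewards)
    mean = sum(rewards) / n
    var = sum((r - mean) ** 2 for r in rewards) / n
    std = var ** 0.5
    # eps guards against a group where every sample got the same reward
    return [(r - mean) / (std + eps) for r in rewards]

# 4 sampled answers to one prompt; two happened to be correct.
rewards = [1.0, 0.0, 1.0, 0.0]
advs = group_relative_advantages(rewards)
print(advs)  # correct samples get ~ +1, incorrect ~ -1 (mean 0.5, std 0.5)
```

The resulting advantages then weight the policy-gradient update: completions that beat their group average are reinforced, the rest are suppressed, with no critic network to train.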
Hugging Face (Twitter)
RT @maximelabonne: New LFM2 release 🥳
It's a Japanese PII extractor with only 350M parameters.
It's extremely fast and on par with GPT-5 (!) in terms of quality.
Check it out, it's available today on @huggingface!
Hugging Face (Twitter)
RT @karpathy: @ClementDelangue @huggingface: Ty! huggingface work/infra/datasets are critical to projects like nanochat - to be accurate the source code of nanochat (e.g. at the $100 tier) is ~8KB of Python and ~30GB of fineweb/smoltalk.
Hugging Face (Twitter)
RT @vanstriendaniel: @nanonets just shipped Nanonets-OCR2: new 3B VLM for OCR!
LaTeX equations, tables, handwriting, charts, multilingual - it does it all!
You can try it against your data with one command via @huggingface Jobs - no local GPU needed!
The HF Jobs command/output from the model 👇
Hugging Face (Twitter)
RT @NielsRogge: Very cool new document AI release by @barrowjoseph on @huggingface
A free tool to automatically convert PDFs into fillable forms :)
Outperforms @Adobe Acrobat by training open-source models for <$500!
Hugging Face (Twitter)
RT @ClementDelangue: Am I wrong in sensing a paradigm shift in AI?
Feels like we’re moving from a world obsessed with generalist LLM APIs to one where more and more companies are training, optimizing, and running their own models built on open source (especially smaller, specialized ones)
Some validating signs just in the past few weeks:
- @karpathy released nanochat to train models in just a few lines of code
- @thinkymachines launched a fine-tuning product
- rising popularity of @vllm_project, @sgl_project, @PrimeIntellect, LoRAs, TRL,...
- 1M new repos on HF in the past 90 days (including the first open-source LLMs from @OpenAI)
And now, @nvidia just announced DGX Spark, powerful enough for everyone to fine-tune their own models at home.
Would you agree, or am I just seeing the future I want to exist? Also, why is this happening (just the advent of RL/post-training?)
Hugging Face (Twitter)
RT @Alibaba_Qwen: Introducing the compact, dense versions of Qwen3-VL — now available in 4B and 8B pairs, each with both Instruct and Thinking variants.
✅ Lower VRAM usage
✅ Full Qwen3-VL capabilities retained
✅ Strong performance across the board
Despite their size, they outperform models like Gemini 2.5 Flash Lite and GPT-5 Nano, and often beat them on benchmarks spanning STEM, VQA, OCR, video understanding, agent tasks, and more. In many cases, they even rival our flagship Qwen2.5-VL-72B from just six months ago!
Plus, FP8 versions are also available for efficient deployment.
Hugging Face: https://huggingface.co/collections/Qwen/qwen3-vl-68d2a7c1b8a8afce4ebd2dbe
ModelScope: https://modelscope.cn/collections/Qwen3-VL-5c7a94c8cb144b
Qwen3-VL-8B-Instruct API: https://modelstudio.console.alibabacloud.com/?tab=doc#/doc/?type=model&url=2840914_2&modelId=qwen3-vl-8b-instruct
Qwen3-VL-8B-Thinking API:...
Hugging Face (Twitter)
RT @_fracapuano: should we release a comprehensive 70+ pages tutorial on robot learning, with hands-on code examples using @LeRobotHF and @huggingface? 🤗
Hugging Face (Twitter)
RT @HuggingPapers: ByteDance just released FaceCLIP on Hugging Face!
A new vision-language model specializing in understanding and generating diverse human faces.
Dive into the future of facial AI.
https://huggingface.co/ByteDance/FaceCLIP
Hugging Face (Twitter)
RT @osanseviero: Introducing... Cell2Sentence Scale 27B🤏
Based on Gemma, it's an open model that generated hypotheses about cancer cellular behavior. In collaboration with Yale, we confirmed the predictions with experimental validation in living cells
Super excited about this one 🤯
Hugging Face (Twitter)
RT @sundarpichai: The model + resources are now on HuggingFace and GitHub so researchers can keep building and experimenting. More details here: https://blog.google/technology/ai/google-gemma-ai-cancer-therapy-discovery/
Google
How a Gemma model helped discover a new potential cancer therapy pathway
We’re launching a new 27 billion parameter foundation model for single-cell analysis built on the Gemma family of open models.
Hugging Face (Twitter)
RT @AziziShekoofeh: Explore the model and resources on Hugging Face:
https://huggingface.co/vandijklab/C2S-Scale-Gemma-2-27B