Hugging Face (Twitter)
RT @wjb_mattingly: TIL you can add a duration to spaces.GPU() when you expect a model to take longer than 60 seconds to run inference. I needed to do this because NuMarkdown-8B-Thinking took about 300 seconds to process a WW2 passport.
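For anyone hitting the same limit, a minimal sketch of the duration override on a ZeroGPU Space; the model id and generation call below are placeholders, not the setup from the tweet:

    import spaces
    from transformers import pipeline

    # Any long-running model; "gpt2" is just a stand-in for this sketch.
    pipe = pipeline("text-generation", model="gpt2")

    # ZeroGPU allocates 60 seconds by default; request more up front
    # when inference is expected to run longer.
    @spaces.GPU(duration=300)
    def infer(prompt: str) -> str:
        return pipe(prompt, max_new_tokens=256)[0]["generated_text"]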
Hugging Face (Twitter)
RT @ClementDelangue: So excited to announce that @huggingface @LeRobotHF can now be installed with a simple pip install and just crossed 15,000 @github stars!
It's already integrated into hundreds of robots, ranging from simple hand grippers like the SO-100/101 all the way to some of the most complex humanoid robots like @pollenrobotics Reachy 2.
Thanks to all the contributors of policies, models & datasets (like @nvidia, @physical_int, @microsoft, SmolVLA, ...), who have already shared over 1,500 models & 15,000 datasets, it's becoming the standard for bridging hardware and software in AI robotics.
Let's go open AI robotics 🦾🦾🦾
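The install really is one line (hedging only on extras: the LeRobot README lists optional dependencies for specific robots and simulators):

    # install from PyPI
    pip install lerobot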
Hugging Face (Twitter)
RT @QuixiAI: I really don't know why people are whining about gpt-oss.
I'm using the 120b for real work and, other than it being overly structured and prudish, I've had no problems. And I appreciate the innovation in the chat template. (Which was a happy bonus of this release.)
Finally - it's open source and Apache 2.0 - if there's anything you don't like about it, you can fine-tune it to act differently. (And you can sell your fine-tune and keep all the profit!)
The 120b is way, way faster than other models in its class (Mistral Large, Llama 3.3 70B, Qwen 2.5 72B), so it's perfect for home and small-office use on consumer hardware like 4x3090. (Can be built for $5,000.)
We don't need data centers to run capable AI.
I chatted with it for much of the day yesterday, and as a coding model, the code works the first time. Even complex code.
It's not as good as GPT-5 and Claude, of course, but that's a stupid comparison.
Compare it to Llama 3.3 70B. It's better at everything I tried, except creative writing.
It's a good model. It's not perfect but it's really nice and I appreciate that it's free and it's American. We need to praise good behavior and appreciate good things.
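For reference, a minimal sketch of chatting with the model through transformers; the repo id is openai/gpt-oss-120b on the Hub, while the dtype and generation settings below are illustrative, not the tweet author's setup:

    from transformers import pipeline

    pipe = pipeline(
        "text-generation",
        model="openai/gpt-oss-120b",
        torch_dtype="auto",
        device_map="auto",  # spreads the MoE across available GPUs, e.g. 4x3090
    )

    messages = [{"role": "user", "content": "Write a binary search in Python."}]
    result = pipe(messages, max_new_tokens=512)
    print(result[0]["generated_text"][-1]["content"])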
Hugging Face (Twitter)
RT @PGelsinger: Open always wins.
https://venturebeat.com/ai/why-open-source-ai-became-an-american-national-priority/
VentureBeat
Why open-source AI became an American national priority
To reflect democratic principles, AI must be built in the open. If the U.S. wants to lead the AI race, it must lead the open-source AI race.
Hugging Face (Twitter)
RT @HuggingPapers: Tencent AI Lab introduces R-Zero!
A groundbreaking framework enabling LLMs to self-evolve their reasoning capabilities from zero human-curated data through an autonomous Challenger-Solver loop.
Hugging Face (Twitter)
RT @localghost: putting the oss model weights on a usb is such a good merch idea
Hugging Face (Twitter)
RT @jxmnop: if you want to try the data, here you go, it's on huggingface:
https://huggingface.co/datasets/jxm/gpt-oss20b-samples
let me know what you find!
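Loading it takes two lines with the datasets library; the split name is an assumption (most sample dumps ship a single train split):

    from datasets import load_dataset

    ds = load_dataset("jxm/gpt-oss20b-samples", split="train")
    print(ds)     # schema and row count
    print(ds[0])  # first generated sample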
Hugging Face (Twitter)
RT @XiaomiMiMo: MiMo-VL 2508 is live! Same size, much smarter.
We've upgraded performance, thinking control, and overall user experience.
Benchmark gains across image + video: MMMU 70.6, VideoMME 70.8. Consistent improvements across the board.
Thinking Control: toggle reasoning with no_think.
On (default): full reasoning visible;
Off: direct answers, no reasoning ⚡⚡;
❤️ Real-world user experience: our VLM Arena rating improved from 1093.9 → 1131.2 (+37.3).
More capable, flexible, and reliable in everyday tasks.
Feedback welcome.
🤗 RL Version: https://huggingface.co/XiaomiMiMo/MiMo-VL-7B-RL-2508
🤗 SFT Version: https://huggingface.co/XiaomiMiMo/MiMo-VL-7B-SFT-2508
#XiaomiMiMo #MiMoVL
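A hedged sketch of the thinking toggle: the exact trigger (a /no_think tag in the user turn, as in other recent reasoning VLMs) and the auto classes below are assumptions, so check the model card before copying.

    from transformers import AutoModelForImageTextToText, AutoProcessor

    model_id = "XiaomiMiMo/MiMo-VL-7B-RL-2508"
    processor = AutoProcessor.from_pretrained(model_id)
    model = AutoModelForImageTextToText.from_pretrained(model_id, device_map="auto")

    # Appending "/no_think" is assumed to switch off the visible reasoning
    # trace; omit it to get the default full reasoning.
    messages = [{"role": "user", "content": [
        {"type": "image", "url": "https://example.com/photo.jpg"},  # placeholder image
        {"type": "text", "text": "Describe this image. /no_think"},
    ]}]
    inputs = processor.apply_chat_template(
        messages, add_generation_prompt=True, tokenize=True,
        return_dict=True, return_tensors="pt",
    ).to(model.device)
    out = model.generate(**inputs, max_new_tokens=256)
    print(processor.decode(out[0][inputs["input_ids"].shape[-1]:], skip_special_tokens=True))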
Hugging Face (Twitter)
RT @MaziyarPanahi: Academic visibility unlocked!! 🤗
This is super cool! Now your papers show up directly on your @huggingface profile!!! 🔥
Hugging Face (Twitter)
RT @Nouamanetazi: 📢 The Ultra-Scale Playbook is now available in print!
A deep dive into training Large Language Models efficiently on GPU clusters, from fundamentals to advanced parallelism.
Order here 👇
Lulu
The Ultra-Scale Playbook
Embark on a journey to orchestrate thousands of GPUs to scale LLM training to the largest compute clusters today. Starting with the memory and compute anatomy of model training, we then explore 5 dimensions of parallelism to distribute training efficiently…
Hugging Face (Twitter)
RT @Zai_org: Introducing GLM-4.5V: a breakthrough in open-source visual reasoning
GLM-4.5V delivers state-of-the-art performance among open-source models in its size class, dominating across 41 benchmarks.
Built on the GLM-4.5-Air base model, GLM-4.5V inherits proven techniques from GLM-4.1V-Thinking while achieving effective scaling through a powerful 106B-parameter MoE architecture.
Hugging Face: https://huggingface.co/zai-org/GLM-4.5V
GitHub: github.com/zai-org/GLM-V
Z.ai API: https://docs.z.ai/guides/vlm/glm-4.5v
Try it now: chat.z.ai
Hugging Face (Twitter)
RT @reach_vb: OpenAI gpt-oss has over 5M downloads, 400+ fine-tunes and *the* most liked release this year so far! 🔥
Great job @OpenAI 🤗
Hugging Face (Twitter)
RT @fdaudens: GPT-OSS:
- 5M downloads in <1 week on @huggingface
- 400 new models
- already outpacing DeepSeek R1's launch numbers, and that's without counting inference calls
- also the most-liked release of any major LLM this summer
Hugging Face (Twitter)
RT @romainhuet: Over 5M downloads in under a week for our @OpenAI open models, and 400+ fine-tunes on @huggingface! π€ https://twitter.com/reach_vb/status/1954909541805801799#m
Hugging Face (Twitter)
RT @BrigitteTousi: This Wednesday Aug. 13 at 11 am EDT, join @huggingface on Discord for an AMA with our CEO @ClementDelangue.
No bullshit, just real talk.
Sign up link in thread. 🤗
Hugging Face (Twitter)
RT @Xianbao_QIAN: The new talking head model, EchoMimicV3, from Ant Group seems to be pretty cool.
Based on Wan 2.1 1.3B