Hugging Face (Twitter)
RT @Xianbao_QIAN: Step 3 has just been released. It proposes a new infra-level optimization: attention-FFN disaggregation.
Model & Infra co-design is the way forward!
Model: https://huggingface.co/stepfun-ai/step3
Technical paper: arxiv.org/abs/2507.19427
Hugging Face (Twitter)
RT @nickfrosst: cohere vision model :)
weights on huggingface
https://huggingface.co/blog/CohereLabs/introducing-command-a-vision-07-2025
Hugging Face (Twitter)
RT @victormustar: Black Forest Labs did a great job here; really like the vibe of the outputs.
👇 Free demo is available on Hugging Face https://twitter.com/bfl_ml/status/1950920537741336801#m
Hugging Face (Twitter)
RT @bfl_ml: Today we are releasing FLUX.1 Krea [dev] - a new state-of-the-art open-weights FLUX model, built for photorealism.
Developed in collaboration with @krea_ai, this model is focused on images with unique aesthetics. No “AI look”, no blown-out highlights, just natural detail.
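To try it locally, a minimal diffusers sketch; the repo id black-forest-labs/FLUX.1-Krea-dev and the sampler settings are assumptions, so check the model card:

```python
# Minimal text-to-image sketch with diffusers. The repo id and the
# guidance/step settings are assumptions -- verify against the model card.
import torch
from diffusers import FluxPipeline

pipe = FluxPipeline.from_pretrained(
    "black-forest-labs/FLUX.1-Krea-dev",  # assumed repo id
    torch_dtype=torch.bfloat16,
)
pipe.to("cuda")

image = pipe(
    "candid photo of a street market at dusk, natural light, film grain",
    guidance_scale=3.5,
    num_inference_steps=28,
).images[0]
image.save("krea_dev.png")
```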
Hugging Face (Twitter)
RT @reach_vb: New favourite model: Flux.1 Krea Dev by @bfl_ml 🔥
Focused on aesthetics, and it nails prompt guidance too. You can run it for free via ZeroGPU! 🤗
Hugging Face (Twitter)
RT @multimodalart: I've built a demo to allow you to navigate some of the immersive worlds generated by HunyuanWorld 🌎 https://twitter.com/TencentHunyuan/status/1949288986192834718#m
Hugging Face (Twitter)
RT @Akashi203: I released the first Arabic reasoning dataset, designed to help train and fine-tune AI models for reasoning tasks in Arabic.
It’s open-sourced on Hugging Face:
https://huggingface.co/datasets/Jr23xd23/Arabic-Optimized-Reasoning-Dataset
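A minimal sketch of loading it with the datasets library; the split name "train" is an assumption, so see the dataset card for the actual schema:

```python
# Pull the dataset down for inspection. The split name is an assumption --
# check the dataset card for the real splits and columns.
from datasets import load_dataset

ds = load_dataset("Jr23xd23/Arabic-Optimized-Reasoning-Dataset", split="train")
print(ds)      # features and row count
print(ds[0])   # one example record
```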
Hugging Face (Twitter)
RT @eliebakouch: If you’re a researcher working on RL, you should definitely try SmolLM3-3B and get another data point besides Qwen3-3B.
1) We didn’t have time to try RL during post training, so I think there’s still some room to build an even better version of smollm!
2) We released the intermediate checkpoints from post-training, so you can use our model at different stages (base, mid-training, SFT, APO, merging) and see if it changes RL perf.
3) The model is also pretty good at long context; you can probably push it past 128k thanks to NoPE and YaRN.
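A sketch of point 2, pulling an intermediate stage through the Hub's revision mechanism; the revision name below is hypothetical, so list the repo's branches/tags to see which stages were actually published:

```python
# Load a specific post-training stage of SmolLM3-3B via a Hub revision.
# The revision name "sft" is hypothetical -- inspect the repo's branches
# to find the stages that were actually released.
from transformers import AutoModelForCausalLM, AutoTokenizer

repo = "HuggingFaceTB/SmolLM3-3B"
tok = AutoTokenizer.from_pretrained(repo)
model = AutoModelForCausalLM.from_pretrained(repo, revision="sft")  # hypothetical branch
```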
Hugging Face (Twitter)
RT @julien_c: 50 (!) LLMs released these past 2-3 weeks.
But the real kicker is when you think of this:
It is the most releases we’ve seen so far, but the fewest releases we’ll see in the future 🤯
Hugging Face (Twitter)
RT @ClementDelangue: Every tech company can and should train their own DeepSeek R1, Llama, or GPT-5, just like every tech company writes their own code (and AI is no more than software 2.0).
This is why we're releasing the Ultra-Scale Playbook. 200 pages to master:
- 5D parallelism (DP, TP, PP, EP, FSDP)
- ZeRO
- Flash Attention
- Compute/communication overlap and bottlenecks
All with accessible theory intros and 4,000+ scaling experiments.
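For a taste of one of the five axes, a minimal PyTorch FSDP sketch; this is illustrative only, not the playbook's own code, and assumes torch.distributed is launched via torchrun:

```python
# Minimal FSDP sketch: shard parameters, gradients, and optimizer state
# across ranks (the ZeRO-3 idea), gathering shards just-in-time for each
# forward/backward. Assumes launch via `torchrun --nproc-per-node=N`.
import torch
import torch.distributed as dist
import torch.nn as nn
from torch.distributed.fsdp import FullyShardedDataParallel as FSDP

def main():
    dist.init_process_group("nccl")
    torch.cuda.set_device(dist.get_rank() % torch.cuda.device_count())

    model = nn.Sequential(nn.Linear(1024, 4096), nn.GELU(), nn.Linear(4096, 1024)).cuda()
    model = FSDP(model)  # wrap before building the optimizer

    opt = torch.optim.AdamW(model.parameters(), lr=1e-4)
    x = torch.randn(8, 1024, device="cuda")
    loss = model(x).square().mean()
    loss.backward()
    opt.step()

if __name__ == "__main__":
    main()
```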
Hugging Face (Twitter)
RT @rohanpaul_ai: Wow. This is a HUGE 24-trillion-token web dataset with document-level metadata available on @huggingface
apache-2.0 license
- collected from Common Crawl
- each document is labeled with a 12-field taxonomy covering topic, page type, complexity, and quality.
- Labels are generated by EAI-Distill-0.5b, a 0.5B-parameter model fine-tuned on Qwen2.5-32B-Instruct outputs. It matches teacher quality within 3% agreement and preserves domain recall within 1pp.
- Simple SQL-style filters produce datasets competitive with specialist pipelines. Math is within 8% of SOTA, web code up 14.3%, STEM up 24.5%, and medical up 8.6%.
- Inference on 23.6B documents required about 90k AMD MI300X GPU-hours.
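The filter idea, sketched with the datasets library in streaming mode; the tweet doesn't name the repo, so the repo id and taxonomy field names below are hypothetical:

```python
# Hedged sketch of the "SQL-style filter" idea over document-level metadata.
# The repo id and the field names ("topic", "quality") are hypothetical --
# check the dataset card for the real 12-field taxonomy schema.
from datasets import load_dataset

ds = load_dataset("example-org/example-24t-web-dataset", split="train", streaming=True)

# Keep only high-quality STEM documents, mirroring a
# `WHERE topic = 'STEM' AND quality >= 4` style predicate.
stem = ds.filter(lambda doc: doc["topic"] == "STEM" and doc["quality"] >= 4)

for doc in stem.take(3):
    print(doc["text"][:200])
```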
Hugging Face (Twitter)
RT @ChujieZheng: Uploaded on July 25, GSPO is already the #1 most popular paper on @huggingface for July 🍻 https://twitter.com/ClementDelangue/status/1949934196148895799#m
Hugging Face (Twitter)
RT @theMetaStoneAI: 🚀 Introducing XBai o4: a milestone in our 4th-generation open-source technology based on parallel test-time scaling!
In its medium mode, XBai o4 now fully outperforms OpenAI o3-mini. 📈
🔗 Open-source weights: https://huggingface.co/MetaStoneTec/XBai-o4 ✅
Github link: https://github.com/MetaStone-AI/XBai-o4
Hugging Face (Twitter)
RT @HuggingPapers: StepFun just released Step-3 on Hugging Face!
It's a new 321B-parameter VLM that's "Large yet Affordable," co-designed for cost-effective decoding.
Achieves unprecedented efficiency, setting a new Pareto frontier for LLM inference.
Hugging Face (Twitter)
RT @NielsRogge: StepFun quietly dropped a 321B-parameter VLM on @huggingface.. trained on Hopper GPUs similar to DeepSeek, except more efficient
@StepFun_ai is yet another Chinese AI player besides @deepseek_ai, @Alibaba_Qwen, @Kimi_Moonshot, @MiniMax__AI, @TencentHunyuan and @Zai_org https://twitter.com/HuggingPapers/status/1952038716488208409#m
Hugging Face (Twitter)
RT @HaihaoShen: 🥳Qwen3-Coder-30B-A3B INT4 & INT2 GGUF models are available now -
https://huggingface.co/Intel/Qwen3-Coder-30B-A3B-Instruct-int4-AutoRound
https://huggingface.co/Intel/Qwen3-Coder-30B-A3B-Instruct-gguf-q2ks-mixed-AutoRound
#intel #int4 #autoround #huggingface
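A minimal sketch of running the INT2 GGUF build with llama-cpp-python; the .gguf filename inside the repo is a guess, so check the repo's file list:

```python
# Download the GGUF file and run it locally with llama-cpp-python.
# The repo id is from the tweet; the filename is hypothetical -- list the
# repo's files on the Hub to get the real one.
from huggingface_hub import hf_hub_download
from llama_cpp import Llama

path = hf_hub_download(
    repo_id="Intel/Qwen3-Coder-30B-A3B-Instruct-gguf-q2ks-mixed-AutoRound",
    filename="qwen3-coder-30b-a3b-instruct-q2ks-mixed.gguf",  # hypothetical filename
)
llm = Llama(model_path=path, n_ctx=4096)
out = llm("Write a Python function that reverses a string.", max_tokens=128)
print(out["choices"][0]["text"])
```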
Hugging Face (Twitter)
RT @Alibaba_Qwen: 🚀 Meet Qwen-Image — a 20B MMDiT model for next-gen text-to-image generation. Especially strong at creating stunning graphic posters with native text. Now open-source.
🔍 Key Highlights:
🔹 SOTA text rendering — rivals GPT-4o in English, best-in-class for Chinese
🔹 In-pixel text generation — no overlays, fully integrated
🔹 Bilingual support, diverse fonts, complex layouts
🎨 Also excels at general image generation — from photorealistic to anime, impressionist to minimalist. A true creative powerhouse.
Blog: https://qwenlm.github.io/blog/qwen-image/
Hugging Face: https://huggingface.co/Qwen/Qwen-Image
ModelScope: https://modelscope.cn/models/Qwen/Qwen-Image
Github: github.com/QwenLM/Qwen-Image
Technical report: https://qianwen-res.oss-cn-beijing.aliyuncs.com/Qwen-Image/Qwen_Image.pdf
Demo: https://modelscope.cn/aigc/imageGeneration?tab=advanced
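A hedged generation sketch, assuming the Qwen/Qwen-Image repo ships diffusers-compatible weights loadable through the auto pipeline; check the model card for the supported workflow:

```python
# Generate a text-heavy poster with the auto pipeline. Assumes the
# Qwen/Qwen-Image repo is diffusers-compatible -- verify on the model card.
import torch
from diffusers import DiffusionPipeline

pipe = DiffusionPipeline.from_pretrained("Qwen/Qwen-Image", torch_dtype=torch.bfloat16)
pipe.to("cuda")

prompt = 'A minimalist concert poster with the headline "OPEN SOURCE NIGHT" in bold type'
image = pipe(prompt, num_inference_steps=50).images[0]
image.save("qwen_image_poster.png")
```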
Hugging Face (Twitter)
RT @_fracapuano: We shipped @LeRobotHF to its first major release, on PyPI and GitHub.
Alongside the team at @huggingface, we’re making robotics more accessible and collaborative, and we hope this release makes contributing easier.
Links in 🧵