Hugging Face (Twitter)
RT @NVIDIAAIDev: We just opened over 26M lines of synthetic data that was used to train the Llama Nemotron Super v1.5 model.
This transparency into our model training also helps you build your own models without spending the time and effort required to produce your own datasets.
Find them on @HuggingFace: https://huggingface.co/datasets/nvidia/Nemotron-Post-Training-Dataset-v1
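If you want to poke at the data, here is a minimal sketch using the `datasets` library. The split name is an assumption; check the dataset card for the actual configs and splits.

```python
from datasets import load_dataset

# Stream rather than download all ~26M rows at once.
ds = load_dataset(
    "nvidia/Nemotron-Post-Training-Dataset-v1",
    split="train",       # assumed split name; see the dataset card
    streaming=True,
)

for row in ds.take(3):   # peek at a few records
    print(row)
```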
Hugging Face (Twitter)
RT @ClementDelangue: If you're a researcher or engineer releasing open science papers & open models and datasets, I bow to you.
From what I'm hearing, doing so, especially in US big tech, often means fighting your manager and colleagues, going through countless legal meetings, threatening to quit or taking a lower paycheck, and sometimes the result is only that you'll get scolded when what you shared is used by competitors.
But please remember: research papers and open models and datasets are how progress happens! Your efforts are pushing AI toward a more open and collaborative future. Thanks to openness, your research or models get a chance to be noticed and built upon by people you respect, to accelerate progress, grow your network & amplify your impact.
It might be tough right now, but open science will ultimately prevail, as it always has! The researchers & engineers we'll remember in ten years are the ones who share what they build, not the ones who keep it behind closed doors to maximize company profit.
Please keep fighting for openness. We see you and we thank you!
Hugging Face (Twitter)
RT @Xianbao_QIAN: Step 3 has just been released. It proposes a new infrastructure-level optimization: Attention-FFN disaggregation.
Model & Infra co-design is the way forward!
Model: https://huggingface.co/stepfun-ai/step3
Technical paper: arxiv.org/abs/2507.19427
Hugging Face (Twitter)
RT @nickfrosst: cohere vision model :)
weights on huggingface
https://huggingface.co/blog/CohereLabs/introducing-command-a-vision-07-2025
Hugging Face (Twitter)
RT @victormustar: Black Forest Labs did a great job here, really like the vibe of the outputs.
Free demo is available on Hugging Face: https://twitter.com/bfl_ml/status/1950920537741336801#m
Hugging Face (Twitter)
RT @bfl_ml: Today we are releasing FLUX.1 Krea [dev] - a new state-of-the-art open-weights FLUX model, built for photorealism.
Developed in collaboration with @krea_ai, this model is focused on images with unique aesthetics. No "AI look", no blown-out highlights, just natural detail.
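A rough sketch of trying the released weights with diffusers; the repo id, dtype and sampler settings below are assumptions rather than details confirmed by the announcement.

```python
import torch
from diffusers import FluxPipeline

# Assumed repo id; check the Black Forest Labs org on Hugging Face for the exact name.
pipe = FluxPipeline.from_pretrained(
    "black-forest-labs/FLUX.1-Krea-dev",
    torch_dtype=torch.bfloat16,
)
pipe.to("cuda")

image = pipe(
    "a sunlit kitchen, natural window light, soft film grain",
    num_inference_steps=28,   # illustrative settings, not official recommendations
    guidance_scale=4.5,
).images[0]
image.save("krea_sample.png")
```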
Hugging Face (Twitter)
RT @reach_vb: New favourite model: Flux.1 Krea Dev by @bfl_ml
Focused on aesthetics - nails prompt guidance too! You can run it for free via ZeroGPU!
Hugging Face (Twitter)
RT @multimodalart: I've built a demo to allow you to navigate some of the immersive worlds generated by HunyuanWorld: https://twitter.com/TencentHunyuan/status/1949288986192834718#m
Hugging Face (Twitter)
RT @Akashi203: I released the first Arabic reasoning dataset, designed to help train and fine-tune AI models for reasoning tasks in Arabic.
It's open-sourced on Hugging Face:
https://huggingface.co/datasets/Jr23xd23/Arabic-Optimized-Reasoning-Dataset
Hugging Face (Twitter)
RT @eliebakouch: If you're a researcher working on RL, you should definitely try SmolLM3-3B and get another data point besides Qwen3-3B.
1) We didn't have time to try RL during post-training, so I think there's still some room to build an even better version of SmolLM!
2) We released the intermediate checkpoints from post-training, so you can use our model at different stages (base, mid-training, SFT, APO, merging) and see if it changes RL performance (see the loading sketch below).
3) The model is also pretty good at long context; you can probably push it past 128k thanks to NoPE and YaRN.
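For point 2, a minimal loading sketch with transformers: the repo id is an assumption, and the `revision` value is a placeholder for whichever intermediate-stage tags are actually published on the model card.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

repo = "HuggingFaceTB/SmolLM3-3B"    # assumed repo id
tok = AutoTokenizer.from_pretrained(repo)
model = AutoModelForCausalLM.from_pretrained(
    repo,
    revision="main",                  # swap for an intermediate checkpoint tag (base, SFT, APO, ...)
    torch_dtype="auto",
)
```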
Hugging Face (Twitter)
RT @julien_c: 50 (!) LLMs released these past 2-3 weeks.
But the real kicker is when you think of this:
It is the most releases we've seen so far, but the fewest we'll see in the future.
Hugging Face (Twitter)
RT @ClementDelangue: Every tech company can and should train their own DeepSeek R1, Llama or GPT-5, just like every tech company writes their own code (and AI is no more than software 2.0).
This is why we're releasing the Ultra-Scale Playbook. 200 pages to master:
- 5D parallelism (DP, TP, PP, EP, FSDP)
- ZeRO
- Flash Attention
- Compute/communication overlap and bottlenecks
All with accessible theory intros and 4,000+ scaling experiments. (A minimal FSDP sketch follows below.)
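Not an excerpt from the playbook, just a minimal sketch of one technique it covers: sharding parameters, gradients and optimizer state with PyTorch FSDP (ZeRO-3-style). It assumes a `torchrun` launch so the process-group environment variables are set.

```python
import torch
import torch.distributed as dist
from torch.distributed.fsdp import FullyShardedDataParallel as FSDP

dist.init_process_group("nccl")
torch.cuda.set_device(dist.get_rank() % torch.cuda.device_count())

model = torch.nn.Sequential(        # stand-in for a real transformer
    torch.nn.Linear(4096, 4096),
    torch.nn.GELU(),
    torch.nn.Linear(4096, 4096),
).cuda()

model = FSDP(model)                 # params, grads and optimizer state sharded across ranks
opt = torch.optim.AdamW(model.parameters(), lr=1e-4)

x = torch.randn(8, 4096, device="cuda")
loss = model(x).pow(2).mean()       # dummy loss, just to exercise the backward pass
loss.backward()
opt.step()
dist.destroy_process_group()
```

Launch with e.g. `torchrun --nproc_per_node=8 train.py`.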
Hugging Face (Twitter)
RT @rohanpaul_ai: Wow. This is a HUGE 24-trillion-token web dataset with document-level metadata available on @huggingface
Apache-2.0 license
- collected from Common Crawl
- each document is labeled with a 12-field taxonomy covering topic, page type, complexity, and quality.
- labels are generated by EAI-Distill-0.5b, a 0.5B-parameter model fine-tuned on Qwen2.5-32B-Instruct outputs. It matches teacher quality within 3% agreement and preserves domain recall within 1 pp.
- simple SQL-style filters produce datasets competitive with specialist pipelines: math is within 8% of SOTA, web code up 14.3%, STEM up 24.5%, and medical up 8.6% (see the filtering sketch below).
- inference on 23.6B documents required about 90k AMD MI300X GPU-hours.
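A sketch of the "simple filters over document metadata" idea; the repo id and the taxonomy field names are assumptions (the real schema is documented on the dataset card).

```python
from datasets import load_dataset

ds = load_dataset(
    "EssentialAI/essential-web-v1.0",   # assumed repo id
    split="train",
    streaming=True,                      # 23.6B documents: stream, don't download
)

def is_high_quality_stem(doc):
    meta = doc.get("metadata", {})       # assumed field layout
    return meta.get("topic") == "STEM" and meta.get("quality") == "high"

stem_subset = ds.filter(is_high_quality_stem)
```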
Hugging Face (Twitter)
RT @ChujieZheng: Uploaded on July 25, GSPO is already the #1 most popular paper on @huggingface for July: https://twitter.com/ClementDelangue/status/1949934196148895799#m
Hugging Face (Twitter)
RT @theMetaStoneAI: Introducing XBai o4: a milestone in our 4th-generation open-source technology based on parallel test-time scaling!
In its medium mode, XBai o4 now fully outperforms OpenAI o3-mini.
Open-source weights: https://huggingface.co/MetaStoneTec/XBai-o4
GitHub link: https://github.com/MetaStone-AI/XBai-o4
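This is not XBai o4's actual pipeline, just a generic sketch of the parallel test-time scaling idea: sample several candidate answers in parallel and keep the one a scoring function prefers. The scoring function here is a placeholder; a real setup would use a reward or verifier model.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

repo = "MetaStoneTec/XBai-o4"   # repo id from the announcement; may need trust_remote_code=True
tok = AutoTokenizer.from_pretrained(repo)
model = AutoModelForCausalLM.from_pretrained(repo, torch_dtype="auto", device_map="auto")

prompt = "Prove that the sum of two even integers is even."
inputs = tok(prompt, return_tensors="pt").to(model.device)

# Sample N candidate answers in one batched generate call.
out = model.generate(
    **inputs,
    do_sample=True,
    temperature=0.8,
    max_new_tokens=512,
    num_return_sequences=8,
)
candidates = tok.batch_decode(out[:, inputs["input_ids"].shape[1]:], skip_special_tokens=True)

def score(answer: str) -> float:
    return len(answer)   # placeholder: swap in a reward/verifier model

print(max(candidates, key=score))
```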
Hugging Face (Twitter)
RT @HuggingPapers: StepFun just released Step-3 on Hugging Face!
It's a new 321B-parameter VLM that's "Large yet Affordable," co-designed for cost-effective decoding.
Achieves unprecedented efficiency, setting a new Pareto frontier for LLM inference.
Hugging Face (Twitter)
RT @NielsRogge: StepFun quietly dropped a 321B-parameter VLM on @huggingface... trained on Hopper GPUs similar to DeepSeek, except more efficient.
@StepFun_ai is yet another Chinese AI player besides @deepseek_ai, @Alibaba_Qwen, @Kimi_Moonshot, @MiniMax__AI, @TencentHunyuan and @Zai_org https://twitter.com/HuggingPapers/status/1952038716488208409#m