Hugging Face
Hugging Face (Twitter)

RT @NVIDIAAIDev: 👀 We just opened over 26M lines of synthetic data that was used to train the Llama Nemotron Super v1.5 model.

🔎 This transparency into our model training also helps you build your own models -- without expending the effort and time required to produce your own datasets.

🔢 Find them on @HuggingFace 🤗 https://huggingface.co/datasets/nvidia/Nemotron-Post-Training-Dataset-v1
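For context, a minimal sketch of pulling this dataset with the `datasets` library; the split name and streaming setup are assumptions, so check the dataset card for the actual configs:

```python
# Minimal sketch: stream the Nemotron post-training data without
# downloading the full corpus. Split/config names are assumptions.
from datasets import load_dataset

ds = load_dataset(
    "nvidia/Nemotron-Post-Training-Dataset-v1",
    split="train",   # assumed split name; see the dataset card
    streaming=True,  # stream instead of materializing 26M+ lines
)

for example in ds.take(3):
    print(example)
```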
Hugging Face (Twitter)

RT @ClementDelangue: If you're a researcher or engineer releasing open science papers & open models and datasets, I bow to you 🙇🙇🙇

From what I'm hearing, doing so, especially in US big tech, often means fighting your manager and colleagues, going through countless legal meetings, threatening to quit or taking a lower paycheck, and sometimes the only result is getting scolded when what you shared is used by competitors.

But, please remember: research papers and open models and datasets are how progress happens! Your efforts are pushing AI toward a more open and collaborative future. Thanks to openness, your research or models get a chance to be noticed and built upon by people you respect, accelerating progress, growing your network & amplifying your impact.

It might be tough right now, but open science will ultimately prevail, as it always has! The researchers & engineers we'll remember in ten years are the ones who share what they build, not the ones who keep it behind closed doors for company profit maximization.

Please keep fighting for openness. We see you and we thank you! 💚💛💙💜
Hugging Face (Twitter)

RT @Xianbao_QIAN: Step 3 has just been released. It proposes a new infra-level optimization: attention-FFN disaggregation.

Model & Infra co-design is the way forward!

Model: https://huggingface.co/stepfun-ai/step3
Technical paper: arxiv.org/abs/2507.19427
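A quick way to peek at the architecture without pulling the full weights is to fetch just the model config; `trust_remote_code=True` is an assumption here, in case the repo ships a custom architecture:

```python
# Sketch: inspect the Step-3 config from the Hub (a few KB) rather
# than downloading the 321B-parameter checkpoint.
from transformers import AutoConfig

config = AutoConfig.from_pretrained(
    "stepfun-ai/step3",
    trust_remote_code=True,  # assumed necessary for a custom arch
)
print(config)  # layer counts, hidden sizes, attention setup, etc.
```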
Hugging Face (Twitter)

RT @victormustar: Black Forest Labs did a great job here; really like the vibe of the outputs.

👇 A free demo is available on Hugging Face https://twitter.com/bfl_ml/status/1950920537741336801#m
Hugging Face (Twitter)

RT @bfl_ml: Today we are releasing FLUX.1 Krea [dev] - a new state-of-the-art open-weights FLUX model, built for photorealism.

Developed in collaboration with @krea_ai, this model is focused on images with unique aesthetics. No "AI look", no blown-out highlights, just natural detail.
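A minimal sketch of trying the weights locally with diffusers' FluxPipeline; the Hub repo id, dtype, and sampler settings below are assumptions rather than the official usage, so defer to the model card:

```python
# Sketch: text-to-image with a FLUX-family checkpoint via diffusers.
import torch
from diffusers import FluxPipeline

pipe = FluxPipeline.from_pretrained(
    "black-forest-labs/FLUX.1-Krea-dev",  # assumed Hub repo id
    torch_dtype=torch.bfloat16,
).to("cuda")

image = pipe(
    "a candid street photo at golden hour, natural skin detail",
    num_inference_steps=28,  # assumed reasonable defaults
    guidance_scale=4.5,
).images[0]
image.save("krea.png")
```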
Hugging Face (Twitter)

RT @reach_vb: New favourite model: FLUX.1 Krea [dev] by @bfl_ml 🔥

Focused on aesthetics, and it nails prompt guidance too. You can run it for free via ZeroGPU! 🤗
Hugging Face (Twitter)

RT @multimodalart: I've built a demo to allow you to navigate some of the immersive worlds generated by HunyuanWorld 🌎 https://twitter.com/TencentHunyuan/status/1949288986192834718#m
Hugging Face (Twitter)

RT @eliebakouch: If you're a researcher working on RL, you should definitely try SmolLM3-3B and get another data point besides Qwen3-3B.

1) We didn't have time to try RL during post-training, so I think there's still some room to build an even better version of SmolLM!

2) We released the intermediate checkpoints from post-training, so you can use our model at different stages (base, mid-training, SFT, APO, merging) and see if it changes RL perf (see the loading sketch below).

3) The model is also pretty good at long context; you can probably push it past 128k thanks to NoPE and YaRN.
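A loading sketch for point 2, using the standard `transformers` revision mechanism; the repo id and branch name are assumptions, so check the release notes for the actual list of intermediate checkpoints:

```python
# Sketch: load SmolLM3-3B at a chosen post-training stage by pointing
# `revision` at the corresponding Hub branch or tag.
from transformers import AutoModelForCausalLM, AutoTokenizer

repo = "HuggingFaceTB/SmolLM3-3B"  # assumed repo id

tok = AutoTokenizer.from_pretrained(repo)
model = AutoModelForCausalLM.from_pretrained(
    repo,
    revision="main",  # swap for an intermediate-checkpoint branch/tag
    torch_dtype="auto",
)
```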
Hugging Face (Twitter)

RT @julien_c: 50 (!) LLMs released these past 2-3 weeks.

But the real kicker is when you think of this:

It's the most releases we've seen so far, but the fewest we'll see in the future 🤯
Hugging Face (Twitter)

RT @ClementDelangue: Every tech company can and should train their own DeepSeek R1, Llama, or GPT-5, just like every tech company writes their own code (and AI is no more than software 2.0).

This is why we're releasing the Ultra-Scale Playbook. 200 pages to master:
- 5D parallelism (DP, TP, PP, EP, FSDP)
- ZeRO
- Flash Attention
- Compute/communication overlap and bottlenecks

All with accessible theory intros and 4,000+ scaling experiments.
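To make the parallelism vocabulary concrete, here is a toy, single-process illustration of tensor parallelism (the TP axis above), not code from the playbook itself: the weight matrix of one linear layer is split column-wise across two simulated devices, and concatenating the partial outputs reproduces the full layer.

```python
# Toy TP illustration: shard a linear layer's output dimension across
# two "devices" (here, just two local tensors) and recombine.
import torch

torch.manual_seed(0)
d_in, d_out = 8, 6
W = torch.randn(d_out, d_in)
x = torch.randn(2, d_in)

y_full = x @ W.T  # the non-parallel reference forward pass

W0, W1 = W.chunk(2, dim=0)           # each shard owns half the output dims
y0, y1 = x @ W0.T, x @ W1.T          # computed independently per "device"
y_tp = torch.cat([y0, y1], dim=-1)   # the all-gather step in real TP

assert torch.allclose(y_full, y_tp, atol=1e-6)
print("tensor-parallel shards match the full layer")
```

In a real multi-GPU setup the shards live on different devices and the final concatenation becomes a collective communication op, which is exactly where the compute/communication overlap chapters come in.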
Hugging Face (Twitter)

RT @rohanpaul_ai: Wow. This is a HUGE 24-trillion-token web dataset with document-level metadata available on @huggingface

Apache-2.0 license

- Collected from Common Crawl.
- Each document is labeled with a 12-field taxonomy covering topic, page type, complexity, and quality.
- Labels are generated by EAI-Distill-0.5b, a 0.5B-parameter model fine-tuned on Qwen2.5-32B-Instruct outputs. It matches teacher quality within 3% agreement and preserves domain recall within 1 percentage point.
- Simple SQL-style filters produce datasets competitive with specialist pipelines: math is within 8% of SOTA, web code up 14.3%, STEM up 24.5%, and medical up 8.6%.
- Inference on 23.6B documents required about 90k AMD MI300x GPU-hours.
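A sketch of the "simple SQL-style filters" idea using `datasets`; the dataset id and taxonomy field names here are illustrative placeholders, so consult the dataset card for the real schema:

```python
# Sketch: SQL-WHERE-style filtering over document-level metadata.
from datasets import load_dataset

# Placeholder dataset id and field names -- replace with the real
# ones from the dataset card.
ds = load_dataset("example-org/web-dataset", split="train", streaming=True)

stem_docs = ds.filter(
    lambda ex: ex["topic"] == "STEM" and ex["quality"] >= 4
)

print(next(iter(stem_docs)))
```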
Hugging Face (Twitter)

RT @theMetaStoneAI: 🚀 Introducing XBai o4: a milestone in our 4th-generation open-source technology based on parallel test-time scaling!
In its medium mode, XBai o4 now fully outperforms OpenAI o3-mini. 📈

🔗 Open-source weights: https://huggingface.co/MetaStoneTec/XBai-o4 ✅
GitHub link: https://github.com/MetaStone-AI/XBai-o4
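For readers new to the idea, here is a toy best-of-n sketch of parallel test-time scaling in general, not XBai o4's actual pipeline; the sampling settings and the length-based scorer are placeholders (a real setup would use a reward model):

```python
# Toy parallel test-time scaling: sample n candidates in one batched
# generate() call, score them, keep the best.
from transformers import AutoModelForCausalLM, AutoTokenizer

repo = "MetaStoneTec/XBai-o4"  # from the release above
tok = AutoTokenizer.from_pretrained(repo)
model = AutoModelForCausalLM.from_pretrained(repo, torch_dtype="auto")

prompt = "Prove that the sum of two even numbers is even."
inputs = tok(prompt, return_tensors="pt")

outputs = model.generate(
    **inputs,
    do_sample=True,
    temperature=0.8,
    max_new_tokens=256,
    num_return_sequences=4,  # the "parallel" in parallel scaling
)
candidates = tok.batch_decode(outputs, skip_special_tokens=True)

def score(text: str) -> float:
    return float(len(text))  # placeholder; use a reward model in practice

print(max(candidates, key=score))
```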
Hugging Face (Twitter)

RT @HuggingPapers: StepFun just released Step-3 on Hugging Face!

It's a new 321B-parameter VLM that's "Large yet Affordable," co-designed for cost-effective decoding.

Achieves unprecedented efficiency, setting a new Pareto frontier for LLM inference.
Hugging Face (Twitter)

RT @NielsRogge: StepFun quietly dropped a 321B-parameter VLM on @huggingface, trained on Hopper GPUs similar to DeepSeek, except more efficient.

@StepFun_ai is yet another Chinese AI player besides @deepseek_ai, @Alibaba_Qwen, @Kimi_Moonshot, @MiniMax__AI, @TencentHunyuan and @Zai_org https://twitter.com/HuggingPapers/status/1952038716488208409#m