Hugging Face
Hugging Face (Twitter)

RT @ClementDelangue: Every tech company can and should train their own DeepSeek R1, Llama, or GPT-5, just like every tech company writes their own code (and AI is no more than software 2.0).

This is why we're releasing the Ultra-Scale Playbook. 200 pages to master:
- 5D parallelism (DP, TP, PP, EP, FSDP)
- ZeRO
- Flash Attention
- Compute/communication overlap and bottlenecks

All with accessible theory intros and 4,000+ scaling experiments.
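As a taste of the kind of thing the playbook covers, here is a minimal pure-Python sketch of the data-parallel (DP) step: each worker computes gradients on its own shard of the batch, the gradients are averaged (what an NCCL all-reduce does across GPUs), and every replica takes the same optimizer step. The worker count, loss, and shapes below are illustrative, not taken from the book.

```python
# Toy data-parallelism sketch: per-worker gradients + all-reduce mean.
# Loss is 0.5*(w.x - y)^2; two "workers" each hold one (x, y) pair.

def local_gradient(weights, shard):
    # Gradient of the squared-error loss w.r.t. w, averaged over the
    # worker's local shard of (x, y) pairs.
    g = [0.0] * len(weights)
    for x, y in shard:
        err = sum(w * xi for w, xi in zip(weights, x)) - y
        for i, xi in enumerate(x):
            g[i] += err * xi / len(shard)
    return g

def all_reduce_mean(grads):
    # Average the per-worker gradients elementwise (the role NCCL
    # all-reduce plays across GPUs, done here in plain Python).
    n = len(grads)
    return [sum(col) / n for col in zip(*grads)]

weights = [0.0, 0.0]
shards = [  # one global batch, split across 2 "workers"
    [([1.0, 0.0], 2.0)],
    [([0.0, 1.0], 4.0)],
]
grads = [local_gradient(weights, s) for s in shards]
step = all_reduce_mean(grads)
weights = [w - 0.5 * g for w, g in zip(weights, step)]
```

ZeRO and FSDP refine this picture by sharding optimizer states, gradients, and parameters across the same workers instead of replicating them.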
Hugging Face (Twitter)

RT @rohanpaul_ai: Wow. This is a HUGE 24-trillion-token web dataset with document-level metadata available on @huggingface

apache-2.0 license

- collected from Common Crawl
- each document is labeled with a 12-field taxonomy covering topic, page type, complexity, and quality
- labels are generated by EAI-Distill-0.5b, a 0.5B-parameter model fine-tuned on Qwen2.5-32B-Instruct outputs; it matches teacher quality within 3% agreement and preserves domain recall within 1pp
- simple SQL-style filters produce datasets competitive with specialist pipelines: math is within 8% of SOTA, web code up 14.3%, STEM up 24.5%, and medical up 8.6%
- inference on 23.6B documents required about 90k AMD MI300X GPU-hours
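To see what a "SQL-style filter" over document-level metadata looks like in practice, here is a self-contained toy using Python's stdlib sqlite3: taxonomy labels go in a table and a WHERE clause carves out a subset. The column names and label values are illustrative stand-ins, not the dataset's actual 12-field schema.

```python
# Toy sketch: filter a labeled web corpus with a plain SQL query.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE docs (
    url TEXT, topic TEXT, page_type TEXT, complexity TEXT, quality TEXT)""")
conn.executemany(
    "INSERT INTO docs VALUES (?, ?, ?, ?, ?)",
    [
        ("a.example/1", "math", "tutorial", "advanced", "high"),
        ("b.example/2", "sports", "news", "basic", "low"),
        ("c.example/3", "medicine", "reference", "advanced", "high"),
    ],
)

# Carve out a high-quality math/medical subset.
rows = conn.execute(
    "SELECT url FROM docs WHERE topic IN ('math', 'medicine') "
    "AND quality = 'high'"
).fetchall()
urls = [r[0] for r in rows]
```

The point of the release is that filters this simple, applied to rich per-document labels, get close to what hand-built specialist pipelines produce.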
Hugging Face (Twitter)

RT @theMetaStoneAI: 🚀 Introducing XBai o4: a milestone in our 4th-generation open-source technology based on parallel test-time scaling!
In its medium mode, XBai o4 now fully outperforms OpenAI o3-mini. 📈

🔗Open-source weights: https://huggingface.co/MetaStoneTec/XBai-o4
Github link: https://github.com/MetaStone-AI/XBai-o4
Hugging Face (Twitter)

RT @HuggingPapers: StepFun just released Step-3 on Hugging Face!

It's a new 321B-parameter VLM that's "Large yet Affordable," co-designed for cost-effective decoding.

Achieves unprecedented efficiency, setting a new Pareto frontier for LLM inference.
Hugging Face (Twitter)

RT @NielsRogge: StepFun quietly dropped a 321B-parameter VLM on @huggingface, trained on Hopper GPUs similar to DeepSeek, except more efficient.

@StepFun_ai is yet another Chinese AI player, alongside @deepseek_ai, @Alibaba_Qwen, @Kimi_Moonshot, @MiniMax__AI, @TencentHunyuan, and @Zai_org https://twitter.com/HuggingPapers/status/1952038716488208409#m
Hugging Face (Twitter)

RT @Alibaba_Qwen: 🚀 Meet Qwen-Image — a 20B MMDiT model for next-gen text-to-image generation. Especially strong at creating stunning graphic posters with native text. Now open-source.

🔍 Key Highlights:
🔹 SOTA text rendering — rivals GPT-4o in English, best-in-class for Chinese
🔹 In-pixel text generation — no overlays, fully integrated
🔹 Bilingual support, diverse fonts, complex layouts

🎨 Also excels at general image generation — from photorealistic to anime, impressionist to minimalist. A true creative powerhouse.

Blog: https://qwenlm.github.io/blog/qwen-image/
Hugging Face: https://huggingface.co/Qwen/Qwen-Image
ModelScope: https://modelscope.cn/models/Qwen/Qwen-Image
GitHub: github.com/QwenLM/Qwen-Image
Technical report: https://qianwen-res.oss-cn-beijing.aliyuncs.com/Qwen-Image/Qwen_Image.pdf
Demo: https://modelscope.cn/aigc/imageGeneration?tab=advanced
Hugging Face (Twitter)

RT @_fracapuano: We shipped @LeRobotHF's first major release, on PyPI and GitHub.

Alongside the team at @huggingface we're making robotics more accessible and collaborative, and we hope this release makes contributing easier and better.

Links in 🧵
Hugging Face (Twitter)

RT @jandotai: Hugging Face 🤝 Jan

You can now use Hugging Face as a remote model provider in Jan.

Go to Settings -> Model Providers -> add your Hugging Face API key. Then open a new chat and pick a model from @huggingface.

Works with any Hugging Face model in Jan.
Hugging Face (Twitter)

RT @abidlabs: New Gradio component: 🥳 gr.Dialogue:

• As an output, it can be used to show diarized speech transcription
• As input, it's perfect for multispeaker TTS models, as it also supports auto-complete tags 🪄

Try it out in Gradio 5.40!
Hugging Face (Twitter)

RT @jackvial89: I've created a @LeRobotHF @huggingface dataset for the screwdriver robot. This dataset contains 391 human demonstrations of attaching a part with a screw in 3 positions: left, right, center. Currently training a few different models on this dataset!
Hugging Face (Twitter)

RT @RisingSayak: Wait is over 🤯

An Apache 2.0 DiT-based image generation model from @Alibaba_Qwen -- Qwen-Image 🔥

Supported in Diffusers. Training script PR is up and should be merged soon.

Go, fire!
Hugging Face (Twitter)

RT @romainhuet: A great day to be a developer! Stay tuned! 🤗
Hugging Face (Twitter)

RT @_lewtun: One line of code is all it takes to fine-tune the gpt-oss models from @OpenAI 🔥

> Support to target the MoE expert layers with PEFT
> Kernels for FlashAttention3 & MegaBlocks
> Fast inference with MXFP4 quantization format

In our testing, these models are extremely efficient to tune and can be adapted to new domains with just a few hundred samples 🤯

Download the models: huggingface.co/openai
Training & inference recipes: https://github.com/huggingface/gpt-oss-recipes/tree/main
Hugging Face (Twitter)

RT @mervenoyann: gpt-oss @OpenAI is here! 🔥

> two MoEs with 21B/3.6B and 117B/5.1B total/active params, efficient reasoning models 🤯
> use & fine-tune with transformers & TRL 🛠️
> inference powered by @huggingface Inference Providers 🫡
> apache 2.0 license 💗
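The total/active split above is the defining trick of a mixture-of-experts (MoE) model: a router picks only the top-k experts per token, so far fewer parameters run than the model stores. Here is a toy top-k routing sketch; the expert count, k, and parameter sizes are made-up round numbers, not gpt-oss's real configuration.

```python
# Toy MoE routing: only the top-k scoring experts are "active" per token,
# so active params are a small fraction of total params.

def active_experts(router_scores, k):
    # Indices of the k highest-scoring experts for one token.
    return sorted(range(len(router_scores)),
                  key=lambda i: router_scores[i], reverse=True)[:k]

num_experts, k = 32, 4
params_per_expert = 100  # arbitrary units

scores = [0.1, 0.9, 0.05, 0.7] + [0.0] * 28  # router output for one token
chosen = active_experts(scores, k)

total = num_experts * params_per_expert    # stored parameters
active = len(chosen) * params_per_expert   # parameters actually computed
```

With only k of the experts firing per token, inference cost tracks the active count (here 1/8 of the total), which is why a 117B-total model can run with ~5B active parameters.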