Hugging Face
Hugging Face (Twitter)

RT @gm8xx8: NVIDIA Nemotron-Nano v2

Models: 12B Base, 9B Reasoning, 9B Base
- Arch: Hybrid Mamba2–Transformer (128K ctx, 4 attn layers)
- Training: 10.6T tokens (3.5T synthetic from DeepSeek, Qwen, Nemotron-4, phi-4, etc.)
- 15 natural languages + 43 programming languages
- Datasets: Nemotron-CC v2 + Nemotron-CC-Math (133B tokens, 5.5× FineMath)

Benchmarks
- Math: 91.4 GSM8K CoT, 63.6 MATH L5, 56.7 AIME (+30)
- Code: 58.5 HumanEval+, 58.9 MBPP+
- Commonsense: 90.7 ARC, 79.9 HellaSwag
- Long-context: 82.2 RULER-128K

Highlights
- Nemotron-CC-Math: First scalable pipeline using Lynx + LLM cleanup to preserve LaTeX + code in web data. Delivers SOTA boosts (+12.6 MATH, +14.3 MBPP+) vs prior open math sets
- Efficiency: Distilled 12B→9B (480B tokens), ~1.5e24 FLOPs, ~724 MWh disclosed
- Deployment: Hugging Face, NGC, NeMo, TRT-LLM, vLLM | GPU-optimized
- Open: Models, datasets, and full extraction pipelines released
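A minimal sketch of loading the 9B reasoning variant locally with transformers; the repo id and chat usage below are assumptions, so check the model card for the exact name and recommended settings.

```python
# Minimal sketch: load the 9B reasoning variant with transformers.
# The repo id is an assumption -- check the model card for the exact name
# and any requirements of the hybrid Mamba2-Transformer layers.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "nvidia/NVIDIA-Nemotron-Nano-9B-v2"  # assumed repo id

tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",
    trust_remote_code=True,  # the hybrid architecture may ship custom modeling code
)

messages = [{"role": "user", "content": "Solve: 12 * 7 + 5"}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)
output = model.generate(inputs, max_new_tokens=256)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```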
Hugging Face (Twitter)

RT @ctnzr: Today we're releasing NVIDIA Nemotron Nano v2 - a 9B hybrid SSM that is 6X faster than similarly sized models, while also being more accurate.

Along with this model, we are also releasing most of the data we used to create it, including the pretraining corpus.

Links to the models, datasets, and tech report are here:

https://research.nvidia.com/labs/adlr/NVIDIA-Nemotron-Nano-2/
Hugging Face (Twitter)

RT @NielsRogge: Ok ngl this is cool! The end of LoRAs??

Powered by @FAL as inference provider. Try it out below! https://twitter.com/Alibaba_Qwen/status/1957500569029079083#m
Hugging Face (Twitter)

RT @maximelabonne: LFM2-VL support with GGUF and llama.cpp 🥳

You can now run these tiny, hyper-efficient VLMs on your watch!

We released quantized checkpoints for LFM2-VL-450M and LFM2-VL-1.6B on @huggingface
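A minimal sketch of pulling one of the quantized checkpoints and loading it with llama-cpp-python; the repo id and GGUF filename are assumptions, and image inputs (which need the multimodal projector file) are omitted.

```python
# Minimal sketch: fetch a quantized LFM2-VL GGUF from the Hub and load it
# with llama-cpp-python. Repo id and filename are assumptions -- check the
# model card for the actual names and for the multimodal projector file
# needed for image inputs (omitted here; this loads the language model only).
from huggingface_hub import hf_hub_download
from llama_cpp import Llama

gguf_path = hf_hub_download(
    repo_id="LiquidAI/LFM2-VL-450M-GGUF",   # assumed repo id
    filename="LFM2-VL-450M-Q4_K_M.gguf",    # assumed filename
)

llm = Llama(model_path=gguf_path, n_ctx=4096)
out = llm("Describe what a vision-language model does in one sentence.", max_tokens=64)
print(out["choices"][0]["text"])
```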
Hugging Face (Twitter)

RT @multimodalart: IT'S OUT! 🚀 MoDA: Multi-modal Diffusion Architecture for Talking Head Generation

finally a talking head:
open source 🏋️
fast
portrait + audio-driven 🧑‍🎨🎧
with emotion control

(and yes, i built an inference system + Gradio, generate in < 15s on @huggingface spaces 🤗)
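A minimal sketch of calling such a Space programmatically with gradio_client; the Space id, endpoint name, and argument order are assumptions, and the Space's "Use via API" panel shows the real signature.

```python
# Minimal sketch: drive the MoDA Space from Python with gradio_client.
# The Space id, api_name, and argument names are assumptions -- the Space's
# "Use via API" panel lists the real signature.
from gradio_client import Client, handle_file

client = Client("multimodalart/MoDA")  # assumed Space id
result = client.predict(
    handle_file("portrait.png"),   # driving portrait image
    handle_file("speech.wav"),     # driving audio clip
    api_name="/generate",          # assumed endpoint name
)
print(result)  # path to the generated talking-head video
```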
Hugging Face (Twitter)

RT @Xianbao_QIAN: nano-banana, qwen-image-edit, what else?

Try @StepFun_ai NextStep-1-Large-Edit

- 14B AR model
- Apache 2 license
- Demo available on @huggingface
- Pretrain model also made available

Link below
Hugging Face (Twitter)

RT @allen_ai: We’re releasing early pre-training checkpoints for OLMo-2-1B to help study how LLM capabilities emerge. They’re fine-grained snapshots intended for analysis, reproduction, and comparison. 🧵
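Intermediate checkpoints like these are usually exposed as git revisions on the Hub, so a minimal sketch of loading one snapshot is below; the repo id and revision string are assumptions, and listing the repo's refs shows the real names.

```python
# Minimal sketch: load one intermediate OLMo-2-1B pre-training snapshot.
# Repo id and revision name are assumptions -- intermediate checkpoints are
# usually published as branches/tags, so list the refs to find the real names.
from huggingface_hub import list_repo_refs
from transformers import AutoModelForCausalLM, AutoTokenizer

repo_id = "allenai/OLMo-2-0425-1B"  # assumed repo id

refs = list_repo_refs(repo_id)
print([branch.name for branch in refs.branches])  # available checkpoint branches

revision = "stage1-step10000-tokens42B"  # assumed revision name
model = AutoModelForCausalLM.from_pretrained(repo_id, revision=revision)
tokenizer = AutoTokenizer.from_pretrained(repo_id, revision=revision)
```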
Hugging Face (Twitter)

RT @RisingSayak: It's out friends!

Really great to see the state of image editing and video fidelity being pushed further and further, thanks to the community!

This release also features new fine-tuning scripts for Qwen-Image and Flux Kontext (with support for image inputs). So, get busy making these models your own 🤗

We also improved the loading speed of Diffusers pipelines & models. This will become particularly evident when operating with large models like Wan, Qwen, etc.

Release notes: https://github.com/huggingface/diffusers/releases/tag/v0.35.0
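A minimal sketch of loading one of the large pipelines where the faster loading should be most noticeable; the repo id is an assumption.

```python
# Minimal sketch: load a large pipeline with Diffusers >= 0.35.0, where the
# improved loading speed is most noticeable. The repo id is an assumption.
import torch
from diffusers import DiffusionPipeline

pipe = DiffusionPipeline.from_pretrained(
    "Qwen/Qwen-Image",          # assumed repo id; any large pipeline works
    torch_dtype=torch.bfloat16,
)
pipe.to("cuda")

image = pipe(prompt="a watercolor sketch of a lighthouse at dusk").images[0]
image.save("lighthouse.png")
```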
Hugging Face (Twitter)

RT @ClementDelangue: Just crossed 20M monthly requests with @huggingface inference providers, our router for open models.

@CerebrasSystems @novita_labs & @FireworksAI_HQ are growing the fastest!

It's now powering the official open playground from @OpenAI & integrates with apps like @cline & @roo_code.

Let's go!
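A minimal sketch of sending a chat completion through the Inference Providers router with huggingface_hub (a recent version is needed for the provider argument); the model id is an assumption and an HF token with inference permissions must be available.

```python
# Minimal sketch: route a chat completion through Inference Providers with
# huggingface_hub. The model id is an assumption; an HF token with inference
# permissions must be available (e.g. via the HF_TOKEN environment variable).
from huggingface_hub import InferenceClient

client = InferenceClient(provider="auto")  # let the router pick a provider
resp = client.chat_completion(
    model="meta-llama/Llama-3.1-8B-Instruct",  # assumed model id
    messages=[{"role": "user", "content": "In one line, what is a router for open models?"}],
    max_tokens=64,
)
print(resp.choices[0].message.content)
```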
Hugging Face (Twitter)

RT @dylan_ebert_: OmniPart: Part-Aware 3D Generation

🪛 Semantic decoupling
🏙️ Structural cohesion
🤗 Free and open source

demo available on Hugging Face
Hugging Face (Twitter)

RT @BrigitteTousi: Meet Surya: the newest open-source model from @IBM and @NASA that predicts solar storms before they fry satellites or knock out the grid. And it’s free for researchers, builders, and guardians of critical infrastructure on @huggingface. 🧵
Hugging Face (Twitter)

RT @IBM: In collaboration with @NASA, we've open-sourced Surya on @huggingface — a new foundation model designed to help researchers protect infrastructure through accessible, accurate modeling of space weather.

It's going to totally change how we forecast solar storms. See how.🧵
Hugging Face (Twitter)

Sun's out, models out. 😎
@IBM & @NASA dropped Surya, an open-source heliophysics model trained on 14 years of observations from NASA’s Solar Dynamics Observatory, and it's 🔥🔥🔥.
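A minimal sketch of fetching the Surya checkpoint from the Hub for local analysis; the repo id is an assumption, so search the Hub for the hosting IBM/NASA organization.

```python
# Minimal sketch: download the Surya checkpoint locally from the Hub.
# The repo id is an assumption -- search the Hub for the IBM/NASA org that hosts it.
from huggingface_hub import snapshot_download

local_dir = snapshot_download(repo_id="nasa-ibm-ai4science/Surya-1.0")  # assumed repo id
print("Surya files downloaded to:", local_dir)
```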