Hugging Face
Hugging Face (Twitter)

RT @ArtificialAnlys: IBM has launched Granite 4.0 - a new family of open weights language models ranging in size from 3B to 32B. Artificial Analysis was provided pre-release access, and our benchmarking shows Granite 4.0 H Small (32B/9B total/active parameters) scoring an Intelligence Index of 23, with a particular strength in token efficiency

Today IBM released four new models: Granite 4.0 H Small (32B/9B total/active parameters), Granite 4.0 H Tiny (7B/1B), Granite 4.0 H Micro (3B/3B) and Granite 4.0 Micro (3B/3B). We evaluated Granite 4.0 H Small (in non-reasoning mode) and Granite 4.0 Micro using the Artificial Analysis Intelligence Index. Granite 4.0 models combine a small number of standard transformer-style attention layers with a majority of Mamba layers, an approach IBM claims reduces memory requirements without impacting performance.
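The hybrid layout described above can be sketched generically. This is not IBM's implementation, only an illustration of interleaving a few softmax-attention blocks into a mostly-Mamba stack; the layer counts, dimensions, and 1-in-6 attention ratio are made-up values.

```python
# Illustrative hybrid stack (NOT Granite's actual architecture): a mostly-Mamba
# decoder with a few interleaved softmax-attention layers. Requires the
# `mamba-ssm` package and a CUDA GPU; all sizes and the 1-in-6 attention ratio
# are made up. Causal masking and positional handling are omitted for brevity.
import torch
import torch.nn as nn
from mamba_ssm import Mamba


class AttentionBlock(nn.Module):
    def __init__(self, d_model: int, n_heads: int = 8):
        super().__init__()
        self.norm = nn.LayerNorm(d_model)
        self.attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)

    def forward(self, x):
        h = self.norm(x)
        out, _ = self.attn(h, h, h, need_weights=False)
        return x + out  # residual connection


class MambaBlock(nn.Module):
    def __init__(self, d_model: int):
        super().__init__()
        self.norm = nn.LayerNorm(d_model)
        self.mixer = Mamba(d_model=d_model, d_state=16, d_conv=4, expand=2)

    def forward(self, x):
        return x + self.mixer(self.norm(x))  # residual connection


def hybrid_stack(d_model: int = 512, n_layers: int = 12, attn_every: int = 6):
    # A few attention layers among a majority of Mamba layers: the Mamba layers
    # carry no KV cache, which is where the claimed memory savings come from.
    layers = [
        AttentionBlock(d_model) if (i + 1) % attn_every == 0 else MambaBlock(d_model)
        for i in range(n_layers)
    ]
    return nn.Sequential(*layers)


x = torch.randn(1, 128, 512, device="cuda")  # (batch, seq_len, d_model)
print(hybrid_stack().cuda()(x).shape)        # torch.Size([1, 128, 512])
```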

Key benchmarking takeaways:
🧠 Granite 4.0 H Small Intelligence: In non-reasoning, Granite 4.0 H Small scores 23 on the...

Hugging Face (Twitter)

RT @ClementDelangue: IBM is back! They just joined Hugging Face Enterprise & released Granite 4.0 as open source, with a new hybrid Mamba/transformer architecture that reduces memory requirements with little loss in accuracy.

This set of models is great for agentic workflows like tool calling, document analysis, RAG, especially in an enterprise setup 🚀

The "Micro" (3.4B) model can even run 100% locally in your browser on WebGPU, powered by 🤗 TransformersJS!

3B dense: https://huggingface.co/ibm-granite/granite-4.0-micro
7B MoE with 1B active: https://huggingface.co/ibm-granite/granite-4.0-h-tiny
32B MoE with 9B active: https://huggingface.co/ibm-granite/granite-4.0-h-small

🗂️ Full Model collection: https://huggingface.co/collections/ibm-granite/granite-40-language-models-6811a18b820ef362d9e5a82c

🔗 In-browser demo: https://huggingface.co/spaces/ibm-granite/Granite-4.0-WebGPU
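A minimal sketch (not an official IBM example) of loading the 3B Micro checkpoint linked above with the standard transformers generation API; the prompt and generation settings are placeholders.

```python
# Minimal sketch: text generation with the Granite 4.0 Micro checkpoint linked
# above, using the standard transformers API. Prompt and settings are placeholders.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "ibm-granite/granite-4.0-micro"  # from the links above
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype="auto", device_map="auto")

messages = [{"role": "user", "content": "List three risks to flag in an NDA."}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

with torch.no_grad():
    output = model.generate(input_ids, max_new_tokens=256)
print(tokenizer.decode(output[0][input_ids.shape[-1]:], skip_special_tokens=True))
```

The same pattern should work for the larger H Tiny and H Small checkpoints in the collection, given enough memory.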
Hugging Face (Twitter)

RT @victormustar: another open source win:
opencode + GLM 4.6 is basically Claude Code (used it all day) but insanely cheap + better TUI. And you can use it with your Hugging Face token now 🔥 https://twitter.com/victormustar/status/1935285458394583356#m
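How the Hugging Face token part might look outside of opencode: a hedged sketch using huggingface_hub's InferenceClient. The repo id zai-org/GLM-4.6 and the assumption that an Inference Provider serves it are mine, not from the post.

```python
# Hedged sketch: chatting with GLM-4.6 using a Hugging Face token through
# huggingface_hub's InferenceClient. The repo id and the availability of an
# Inference Provider for it are assumptions; opencode's own config is not shown.
import os
from huggingface_hub import InferenceClient

client = InferenceClient(token=os.environ["HF_TOKEN"])  # your HF access token
response = client.chat_completion(
    messages=[{"role": "user", "content": "Write a shell one-liner that counts TODOs in a repo."}],
    model="zai-org/GLM-4.6",  # assumed repo id
    max_tokens=256,
)
print(response.choices[0].message.content)
```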
Hugging Face (Twitter)

RT @VoyageAI: To evaluate embeddings and retrieval, we need more benchmarks beyond MTEB that are less vulnerable to overfitting. That’s why RTEB was just beta-launched!

⚖️ Both open and held-out datasets to prevent overfitting to evaluation sets.
🌍 Realistic datasets from critical enterprise domains like law, healthcare, code, and finance.
🔎 A sole focus on retrieval applications, with relevant large-scale datasets.

Check out the blog and leaderboard on @huggingface and join the community in building a stronger, more reliable benchmark.

Blog: mongodb.social/6013Ai5sz
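RTEB ships its own evaluation harness; purely as an illustration of what a retrieval benchmark measures, here is a generic sketch that embeds a toy corpus and queries, ranks documents by cosine similarity, and reports recall@k. The embedding model and data are placeholders, not RTEB datasets.

```python
# Generic retrieval-evaluation sketch (not the RTEB harness): embed a toy corpus
# and queries, rank documents by cosine similarity, and report recall@1.
# The embedding model and data below are placeholders, not RTEB datasets.
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("sentence-transformers/all-MiniLM-L6-v2")  # placeholder embedder

corpus = [
    "The statute of limitations for civil fraud claims is six years.",
    "Metformin is a first-line treatment for type 2 diabetes.",
    "A futures contract obligates the buyer to purchase an asset at a set price.",
]
queries = ["How long can a fraud claim be brought?", "first-line diabetes medication"]
gold = [0, 1]  # index of the relevant document for each query

doc_emb = model.encode(corpus, convert_to_tensor=True, normalize_embeddings=True)
q_emb = model.encode(queries, convert_to_tensor=True, normalize_embeddings=True)
scores = util.cos_sim(q_emb, doc_emb)  # (num_queries, num_docs)

k = 1
hits = sum(int(gold[i] in scores[i].topk(k).indices.tolist()) for i in range(len(queries)))
print(f"recall@{k} = {hits / len(queries):.2f}")
```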
Hugging Face (Twitter)

What other features would you like to see in trackio, our experiment tracking library? https://twitter.com/TrackioApp/status/1973834043210018828#m
Hugging Face (Twitter)

RT @reach_vb: Pretty cool to see an MIT-licensed 15B model competing w/ DeepSeek R1 - how are the vibes? 👀
Hugging Face (Twitter)

RT @ClementDelangue: 🦾 Great milestone for open-source robotics: pi0 & pi0.5 by @physical_int are now on @huggingface, fully ported to PyTorch in @LeRobotHF and validated side-by-side with OpenPI for everyone to experiment with, fine-tune & deploy in their robots!

As described by Physical Intelligence, π₀.₅ is a Vision-Language-Action model which represents a significant evolution from π₀ to address a big challenge in robotics: open-world generalization.

While robots can perform impressive tasks in controlled environments, π₀.₅ is designed to generalize to entirely new environments and situations that were never seen during training.

Generalization must occur at multiple levels:
- Physical Level: Understanding how to pick up a spoon (by the handle) or plate (by the edge), even with unseen objects in cluttered environments
- Semantic Level: Understanding task semantics, where to put clothes and shoes (laundry hamper, not on the bed), and what tools...
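A hedged sketch of what running one of these ported checkpoints could look like; the import path, checkpoint id, and observation keys below are assumptions rather than the documented LeRobot API, so treat the LeRobot docs as the source of truth.

```python
# Hypothetical sketch of pi0-style policy inference through LeRobot. The import
# path, checkpoint id, and observation keys are assumptions for illustration
# (they vary by LeRobot version); check the LeRobot docs for the exact API.
import torch
from lerobot.common.policies.pi0.modeling_pi0 import PI0Policy  # assumed import path

policy = PI0Policy.from_pretrained("lerobot/pi0")  # assumed checkpoint id
policy.eval()

# A vision-language-action policy maps camera images + proprioceptive state +
# a language instruction to robot actions.
observation = {
    "observation.images.top": torch.rand(1, 3, 224, 224),  # assumed key and shape
    "observation.state": torch.rand(1, 14),                # assumed key and shape
    "task": ["put the shirt in the laundry hamper"],       # language instruction
}
with torch.no_grad():
    action = policy.select_action(observation)
print(action.shape)  # next action for the robot controller to execute
```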

Hugging Face (Twitter)

RT @calebfahlgren: .@pmarca: "My guess is we are going to live in a world in which most aggregate AI is going to be executed probably on smaller form factors and probably most of that is going to be open source" https://twitter.com/collision/status/1973473479061278737#m
Hugging Face (Twitter)

RT @MaziyarPanahi: just hit 4k followers on @huggingface! 🤗

couldn’t have done it without the incredible open-source AI community 💜

Grateful for your trust, support, and collaboration.
Hugging Face (Twitter)

RT @xenovacom: IBM just released Granite 4.0, their latest series of small language models! These models excel at agentic workflows (tool calling), document analysis, RAG, and more. 🚀

The "Micro" (3.4B) model can even run 100% locally in your browser on WebGPU, powered by 🤗 Transformers.js!
Hugging Face (Twitter)

RT @Alibaba_Qwen: 🚀 Qwen3-VL-30B-A3B-Instruct & Thinking are here!
Smaller size, same powerhouse performance 💪—packed with all the capabilities of Qwen3-VL!

🔧 With just 3B active params, it’s rivaling GPT-5-Mini & Claude4-Sonnet — and often beating them across STEM, VQA, OCR, Video, Agent tasks, and more.

And that’s not all: we’re also releasing an FP8 version, plus the FP8 of the massive Qwen3-VL-235B-A22B!

Try it out and make your multimodal AI applications run faster!🧠🖼️

Qwen Chat: https://chat.qwen.ai/?models=qwen3-vl-30b-a3b
GitHub & Cookbooks: https://github.com/QwenLM/Qwen3-VL/blob/main/cookbooks
API: https://www.alibabacloud.com/help/en/model-studio/models#5540e6e52e1xx
Blog: https://qwen.ai/blog?id=99f0335c4ad9ff6153e517418d48535ab6d8afef&from=research.latest-advancements-list
ModelScope: https://modelscope.cn/collections/Qwen3-VL-5c7a94c8cb144b
Hugging Face: https://huggingface.co/collections/Qwen/qwen3-vl-68d2a7c1b8a8afce4ebd2dbe
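A minimal local-inference sketch for the Instruct checkpoint using transformers' generic vision-language classes; the class choice, image URL, and generation settings are assumptions, and the official recipe is in the cookbook linked above.

```python
# Hedged sketch: image + text chat with Qwen3-VL-30B-A3B-Instruct via the
# generic transformers vision-language classes. Class choice, image URL, and
# settings are assumptions; see the cookbook above for the official recipe.
import torch
from transformers import AutoModelForImageTextToText, AutoProcessor

model_id = "Qwen/Qwen3-VL-30B-A3B-Instruct"
processor = AutoProcessor.from_pretrained(model_id)
model = AutoModelForImageTextToText.from_pretrained(model_id, torch_dtype="auto", device_map="auto")

messages = [{
    "role": "user",
    "content": [
        {"type": "image", "url": "https://example.com/chart.png"},  # placeholder image
        {"type": "text", "text": "What trend does this chart show?"},
    ],
}]
inputs = processor.apply_chat_template(
    messages, add_generation_prompt=True, tokenize=True, return_dict=True, return_tensors="pt"
).to(model.device)

with torch.no_grad():
    out = model.generate(**inputs, max_new_tokens=128)
print(processor.decode(out[0][inputs["input_ids"].shape[-1]:], skip_special_tokens=True))
```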
Hugging Face (Twitter)

RT @johnschulman2: Really happy to see people reproducing the result that LoRA rank=1 closely matches full fine-tuning on many RL fine-tuning problems. Here are a couple nice ones:
https://twitter.com/ben_burtenshaw/status/1974191312229577085 https://twitter.com/zzlccc/status/1973612326747336767#m
Hugging Face (Twitter)

RT @ClementDelangue: Cool reproduction of "LoRA Without Regret" from @thinkymachines by @ben_burtenshaw in TRL
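As a rough sketch of the setup being reproduced, TRL's GRPO trainer accepts a PEFT LoRA config with rank 1; the model, dataset, reward function, and hyperparameters below are placeholders rather than the exact settings from the posts above.

```python
# Hedged sketch of RL fine-tuning with a rank-1 LoRA adapter via TRL + PEFT.
# The model, dataset, reward function, and hyperparameters are placeholders,
# not the exact settings of the reproductions referenced above.
from datasets import load_dataset
from peft import LoraConfig
from trl import GRPOConfig, GRPOTrainer


def reward_len(completions, **kwargs):
    # Toy reward: prefer completions close to 100 characters (stand-in for a real reward).
    return [-abs(100 - len(completion)) for completion in completions]


dataset = load_dataset("trl-lib/tldr", split="train[:1%]")  # placeholder prompt dataset

peft_config = LoraConfig(
    r=1,                          # the rank-1 setting discussed in the posts above
    lora_alpha=32,
    target_modules="all-linear",
    task_type="CAUSAL_LM",
)

trainer = GRPOTrainer(
    model="Qwen/Qwen2.5-0.5B-Instruct",  # small placeholder policy model
    reward_funcs=reward_len,
    args=GRPOConfig(output_dir="grpo-lora-r1", max_steps=10),
    train_dataset=dataset,
    peft_config=peft_config,
)
trainer.train()
```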
Hugging Face (Twitter)

RT @jietang: Finally, our open-source GLM-4.6 is trending No. 1 on HF. Thanks to all for the support. We are working on the next version, so stay tuned!
Hugging Face (Twitter)

RT @arena: 🚨 New Top Open Model Update!

A relative newcomer to the Arena, @zai_org's GLM-4.6 takes the clear, undisputed #1 spot for Top Open Model. 🏆

It also ranks #4 overall, which is not an easy feat! The next top open model, DeepSeek R1 0528, had been the standing champion for months and now trails nine points behind.

Congrats to the @zai_org team on this achievement! 🙌 https://twitter.com/Zai_org/status/1973034639708344767#m
Hugging Face (Twitter)

RT @Tu7uruu: Just dropped on HF: kani-tts-370m
A lightweight open-source text-to-speech model that sounds great and runs fast!

> 370M parameters — efficient and deployable on consumer GPUs
> NanoCodec + LFM2-350M
> Natural & expressive voice trained with modern neural TTS techniques
> Fast inference: real-time on a single RTX 3060