Hugging Face
67 subscribers
699 photos
243 videos
1.2K links
Hugging Face (Twitter)

RT @dylan_ebert_: These are the current best Generative 3D

Render:
#1 - CSM
#2 - TRELLIS (open-source)
#3 - Zaohaowu3D

Topology:
#1 - Hunyuan3D-2
#2 - TRELLIS (open-source)
#3 - Hunyuan3D-2.1

as voted/submitted openly on 3D Arena
Hugging Face (Twitter)

RT @Xianbao_QIAN: 500+ hours of real-world manipulation data, covering residential, kitchen, retail, and office settings. An important step towards generalized manipulation models!

Great work Galaxea team!

https://huggingface.co/datasets/OpenGalaxea/Galaxea-Open-World-Dataset
Hugging Face (Twitter)

RT @DataChaz: This is wild.

A real-time webcam demo using SmolVLM from @huggingface and llama.cpp! 🤯

Running fully local on a MacBook M3.
Hugging Face (Twitter)

RT @Tu7uruu: Just dropped on HF! HunyuanVideo-Foley from Tencent AI Lab: an end-to-end Text-Video-to-Audio (TV2A) model that turns silent videos into lifelike soundscapes

> 100k-hour curated TV2A dataset via automated pipeline
> Modality-balanced MMDiT: dual-stream audio-video fusion + text cross-attention
> REPA loss: aligns internal states with self-supervised audio features → higher fidelity & stability
> DAC-VAE audio codec: 48kHz, continuous latents, strong reconstruction across speech/music/sfx
> SOTA on Kling-Audio-Eval, VGGSound, and MovieGen-Audio-Bench (audio quality, semantic + temporal alignment)
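The REPA loss in the list above aligns the model's internal states with self-supervised audio features. A toy NumPy sketch of such an alignment objective, assuming the common negative-cosine-similarity form (illustrative only, not Tencent's actual implementation; `repa_loss` is a hypothetical name):

```python
import numpy as np

def repa_loss(hidden_states, ssl_features):
    """Toy representation-alignment loss: negative mean cosine similarity
    between model hidden states and self-supervised audio features."""
    # normalize each feature vector along the last axis
    h = hidden_states / np.linalg.norm(hidden_states, axis=-1, keepdims=True)
    f = ssl_features / np.linalg.norm(ssl_features, axis=-1, keepdims=True)
    # cosine similarity per position, averaged; negated so alignment minimizes it
    return -np.mean(np.sum(h * f, axis=-1))
```

Minimizing this pulls the model's representations toward the frozen self-supervised features, which is what the tweet credits for the gains in fidelity and stability.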
Hugging Face (Twitter)

RT @TencentHunyuan: Today we're announcing the open-source release of HunyuanVideo-Foley, our new end-to-end Text-Video-to-Audio (TV2A) framework for generating high-fidelity audio.🚀

This tool empowers creators in video production, filmmaking, and game development to generate professional-grade audio that precisely aligns with visual dynamics and semantic context, addressing key challenges in V2A generation.🔊

Key Innovations:

🔹Exceptional Generalization: Trained on a massive 100k-hour multimodal dataset, the model generates contextually-aware soundscapes for a wide range of scenes, from natural landscapes to animated shorts.

🔹Balanced Multimodal Response: Our innovative multimodal diffusion transformer (MMDiT) architecture ensures the model balances video and text cues, generating rich, layered sound effects that capture every detail—from the main subject to subtle background elements.

🔹High-Fidelity Audio: Using a Representation Alignment...

Hugging Face (Twitter)

RT @pollenrobotics: Two Reachy 2 robots setting and clearing the table, all in real-time teleoperation!

Shot in a single take with all the successes... and a small fail👀
One example of what Reachy 2 can do: efficient, versatile object manipulation, with the precision needed for delicate or fragile tasks
Hugging Face (Twitter)

RT @reach_vb: 🚨 Apple just released FastVLM on Hugging Face - 0.5, 1.5 and 7B real-time VLMs with WebGPU support 🤯

> 85x faster and 3.4x smaller than comparably sized VLMs
> 7.9x faster TTFT for larger models
> designed to emit fewer output tokens and cut encoding time for high-resolution images
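The speedups above hinge on a vision encoder that produces far fewer tokens for high-resolution inputs. A toy NumPy sketch of the general idea, reducing a patch-token grid by spatial average pooling (illustrative only, not Apple's actual FastViTHD encoder; `pool_tokens` is a hypothetical name):

```python
import numpy as np

def pool_tokens(tokens, grid, stride=2):
    """Average-pool a row-major (grid*grid, dim) patch-token sequence,
    cutting the token count by stride**2."""
    d = tokens.shape[1]
    g = grid // stride
    # restore the 2D grid layout, then pool each stride x stride neighborhood
    t = tokens.reshape(grid, grid, d)
    t = t.reshape(g, stride, g, stride, d).mean(axis=(1, 3))
    return t.reshape(g * g, d)
```

Fewer vision tokens means less work for the language model per image, which is where the faster time-to-first-token comes from.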

Bonus: works in REALTIME directly in your browser powered by transformers.js and WebGPU 🔥

Try it out on the demo below 👇
Hugging Face (Twitter)

RT @xenovacom: NEW: Apple releases FastVLM and MobileCLIP2 on Hugging Face! 🤗

The models are up to 85x faster and 3.4x smaller than previous work, enabling real-time VLM applications! 🤯

It can even do live video captioning 100% locally in your browser (zero install). Huge for accessibility!
Hugging Face (Twitter)

RT @RisingSayak: Lovely time presenting at #AIDev Amsterdam today ❤️

We explored some 📹 models (Wan, LTX, etc.), their existing capabilities, and limitations.

I am glad that the attendees found my presentation to be an enjoyable experience 🫡

Find the slides here ⬇️
bit.ly/open-vid-gen
Hugging Face (Twitter)

RT @Xianbao_QIAN: Meituan just open sourced their new MoE LLM LongCat on @huggingface

It's exciting to see new players! The model looks very interesting too, and it comes with a technical report.

https://huggingface.co/meituan-longcat/LongCat-Flash-Chat
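LongCat is a mixture-of-experts (MoE) LLM. The core MoE idea, that each token is routed to only a few experts so compute scales with the active experts rather than the total count, can be sketched in a toy NumPy example (illustrative only, not LongCat's actual architecture; all names are hypothetical):

```python
import numpy as np

def moe_forward(x, gate_w, experts, k=2):
    """Toy top-k MoE layer: route token x to the k highest-scoring experts."""
    logits = gate_w @ x                    # router score per expert
    topk = np.argsort(logits)[-k:]         # indices of the k best experts
    weights = np.exp(logits[topk])
    weights /= weights.sum()               # softmax over the selected experts only
    # only the chosen experts run, so cost scales with k, not len(experts)
    return sum(w * (experts[i] @ x) for w, i in zip(weights, topk))
```

With sparse routing like this, a model can hold many experts' worth of parameters while each token only pays for `k` of them.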