ML Research Hub – Telegram

ML Research Hub

32.3K subscribers

6.73K photos

472 videos

24 files

7.34K links

Advancing research in Machine Learning – practical insights, tools, and techniques for researchers.

Admin: @HusseinSheikho || @Hussein_Sheikho

Download Telegram

About

Blog

Apps

Platform

ML Research Hub

32.3K subscribers

ML Research Hub

✨SPEED-Bench: A Unified and Diverse Benchmark for Speculative Decoding

📝 Summary:
SPEED-Bench is introduced as a new benchmark for Speculative Decoding SD evaluation. It provides diverse semantic domains and realistic serving regimes to address limitations of existing benchmarks. This enables accurate measurement of SD performance in production environments, setting a unified ...

🔹 Publication Date: Published on Feb 10

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2604.09557
• PDF: https://arxiv.org/pdf/2604.09557
• Project Page: https://huggingface.co/blog/nvidia/speed-bench
• Github: https://github.com/NVIDIA/Model-Optimizer

==================================

For more data science resources:
✓ https://t.iss.one/DataScienceT

#SpeculativeDecoding #AIBenchmarks #LLMs #DeepLearning #ModelOptimization

128 views08:05

✨ Explore Data Science 📝 Write your paper

ML Research Hub

✨TRACE: Capability-Targeted Agentic Training

📝 Summary:
TRACE improves LLM agents by identifying capability gaps from trajectory comparisons. It then creates targeted training environments for specific skills, using LoRA adapters for efficient, environment-specific self-improvement. This boosts performance on customer service and tool use tasks, outpe...

🔹 Publication Date: Published on Apr 7

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2604.05336
• PDF: https://arxiv.org/pdf/2604.05336
• Project Page: https://scalingintelligence.stanford.edu/blogs/trace/
• Github: https://github.com/ScalingIntelligence/TRACE

==================================

For more data science resources:
✓ https://t.iss.one/DataScienceT

#LLMAgents #AI #MachineLearning #LoRA #DeepLearning

193 views19:09

✨ Explore Data Science 📝 Write your paper

ML Research Hub

This media is not supported in your browser

VIEW IN TELEGRAM

✨Accelerating Speculative Decoding with Block Diffusion Draft Trees

📝 Summary:
DDTree enhances speculative decoding by constructing draft trees from block diffusion drafter distributions. It efficiently verifies multiple trajectories in parallel in a single target model pass, improving performance.

🔹 Publication Date: Published on Apr 14

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2604.12989
• PDF: https://arxiv.org/pdf/2604.12989
• Project Page: https://liranringel.github.io/ddtree
• Github: https://github.com/liranringel/ddtree

==================================

For more data science resources:
✓ https://t.iss.one/DataScienceT

#SpeculativeDecoding #BlockDiffusion #LLMAcceleration #DeepLearning #AIResearch

❤1

138 views08:04

✨ Explore Data Science 📝 Write your paper

ML Research Hub

✨Domain-Specific Latent Representations Improve the Fidelity of Diffusion-Based Medical Image Super-Resolution

📝 Summary:
Domain-specific autoencoders significantly enhance medical image super-resolution. Replacing generic VAEs improves fidelity, showing autoencoder choice is key, not the diffusion architecture. Autoencoder performance predicts overall SR quality.

🔹 Publication Date: Published on Apr 14

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2604.12152
• PDF: https://arxiv.org/pdf/2604.12152
• Github: https://github.com/sebasmos/latent-sr

==================================

For more data science resources:
✓ https://t.iss.one/DataScienceT

#MedicalImaging #SuperResolution #DiffusionModels #DeepLearning #Autoencoders

178 views11:05

✨ Explore Data Science 📝 Write your paper

ML Research Hub

✨3DTV: A Feedforward Interpolation Network for Real-Time View Synthesis

📝 Summary:
3DTV is a feedforward network combining lightweight geometry and learning for real-time, robust sparse-view interpolation. It generates novel views efficiently without scene-specific optimization, making it practical for interactive applications.

🔹 Publication Date: Published on Apr 13

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2604.11211
• PDF: https://arxiv.org/pdf/2604.11211
• Project Page: https://stefanmschulz.github.io/3DTV_webpage/
• Github: https://github.com/StefanMSchulz/3DTV

==================================

For more data science resources:
✓ https://t.iss.one/DataScienceT

#ViewSynthesis #DeepLearning #ComputerVision #NeuralNetworks #RealTimeAI

206 views11:05

✨ Explore Data Science 📝 Write your paper

ML Research Hub

✨ReconPhys: Reconstruct Appearance and Physical Attributes from Single Video

📝 Summary:
ReconPhys is the first feedforward framework to jointly learn physical attribute estimation and 3D Gaussian Splatting reconstruction from a single video. It offers significantly faster inference and superior reconstruction quality for non-rigid objects compared to prior optimization-based methods...

🔹 Publication Date: Published on Apr 9

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2604.07882
• PDF: https://arxiv.org/pdf/2604.07882
• Project Page: https://chuanshuogushi.github.io/ReconPhys/
• Github: https://chuanshuogushi.github.io/ReconPhys/

==================================

For more data science resources:
✓ https://t.iss.one/DataScienceT

#ComputerVision #3DReconstruction #GaussianSplatting #DeepLearning #AIResearch

188 views07:03

✨ Explore Data Science 📝 Write your paper

ML Research Hub

✨Qwen3.5-Omni Technical Report

📝 Summary:
Qwen3.5-Omni is a large multimodal model excelling in audio-visual understanding and generation, achieving SOTA results across many benchmarks. It features a Hybrid Attention MoE architecture, introduces ARIA for improved speech synthesis, and exhibits a new Audio-Visual Vibe Coding capability.

🔹 Publication Date: Published on Apr 17

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2604.15804
• PDF: https://arxiv.org/pdf/2604.15804

==================================

For more data science resources:
✓ https://t.iss.one/DataScienceT

#MultimodalAI #AIResearch #DeepLearning #GenerativeAI #SpeechSynthesis

147 views03:00

✨ Explore Data Science 📝 Write your paper

ML Research Hub

✨Learning Adaptive Reasoning Paths for Efficient Visual Reasoning

📝 Summary:
Existing visual reasoning models often overthink, using redundant steps. AVR is an adaptive framework that dynamically chooses efficient reasoning formats. It reduces token usage by 50-90 percent while maintaining accuracy.

🔹 Publication Date: Published on Apr 16

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2604.14568
• PDF: https://arxiv.org/pdf/2604.14568
• Github: https://github.com/RunRiotComeOn/AVR

==================================

For more data science resources:
✓ https://t.iss.one/DataScienceT

#VisualReasoning #AI #MachineLearning #Efficiency #DeepLearning

151 views04:02

✨ Explore Data Science 📝 Write your paper

ML Research Hub

This media is not supported in your browser

VIEW IN TELEGRAM

✨Repurposing 3D Generative Model for Autoregressive Layout Generation

📝 Summary:
LaviGen is a 3D layout generation framework that repurposes 3D generative models. It uses an adapted 3D diffusion model for autoregressive generation, explicitly modeling geometric relations and physical constraints. This achieves superior, more plausible 3D layouts 65% faster than previous methods.

🔹 Publication Date: Published on Apr 17

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2604.16299
• PDF: https://arxiv.org/pdf/2604.16299
• Project Page: https://fenghora.github.io/LaviGen-Page/
• Github: https://github.com/fenghora/LaviGen

==================================

For more data science resources:
✓ https://t.iss.one/DataScienceT

#3DGeneration #DiffusionModels #GenerativeAI #ComputerGraphics #DeepLearning

151 views05:02

✨ Explore Data Science 📝 Write your paper

ML Research Hub

Media is too big

VIEW IN TELEGRAM

✨Hierarchical Codec Diffusion for Video-to-Speech Generation

📝 Summary:
HiCoDiT generates speech from videos by leveraging the hierarchical structure of discrete speech tokens, achieving better audio-visual alignment through coarse-to-fine conditioning with dual-scale nor...

🔹 Publication Date: Published on Apr 17

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2604.15923
• PDF: https://arxiv.org/pdf/2604.15923

==================================

For more data science resources:
✓ https://t.iss.one/DataScienceT

#VideoToSpeech #DiffusionModels #GenerativeAI #SpeechSynthesis #DeepLearning

200 views12:06

✨ Explore Data Science 📝 Write your paper

ML Research Hub

✨Concrete Jungle: Towards Concreteness Paved Contrastive Negative Mining for Compositional Understanding

📝 Summary:
This paper improves vision-language models for compositional reasoning by using concreteness-based negative sample selection and a novel margin-based loss. Their framework, Slipform, achieves state-of-the-art accuracy on compositional benchmarks and cross-modal retrieval.

🔹 Publication Date: Published on Apr 14

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2604.13313
• PDF: https://arxiv.org/pdf/2604.13313

==================================

For more data science resources:
✓ https://t.iss.one/DataScienceT

#VisionLanguage #DeepLearning #AIResearch #ComputerVision #NLP

217 views10:07

✨ Explore Data Science 📝 Write your paper

ML Research Hub

✨UDM-GRPO: Stable and Efficient Group Relative Policy Optimization for Uniform Discrete Diffusion Models

📝 Summary:
UDM-GRPO integrates Uniform Discrete Diffusion Models with reinforcement learning, solving training instability issues. It optimizes using final samples as actions and reconstructed trajectories. This achieves state-of-the-art performance in text-to-image generation and OCR tasks.

🔹 Publication Date: Published on Apr 20

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2604.18518
• PDF: https://arxiv.org/pdf/2604.18518
• Project Page: https://yovecent.github.io/UDM-GRPO.github.io/
• Github: https://github.com/Yovecent/UDM-GRPO

🔹 Models citing this paper:
• https://huggingface.co/Yovecents/URSA-1.7B-IBQ512-UDMGRPO-GenEval
• https://huggingface.co/Yovecents/URSA-1.7B-IBQ512-UDMGRPO-PickScore

==================================

For more data science resources:
✓ https://t.iss.one/DataScienceT

#DiffusionModels #ReinforcementLearning #GenerativeAI #TextToImage #DeepLearning

❤1

88 views07:04

✨ Explore Data Science 📝 Write your paper

ML Research Hub

✨Scaling Test-Time Compute for Agentic Coding

📝 Summary:
This framework improves long-horizon agentic coding by using compact trajectory representations for test-time scaling. It employs Recursive Tournament Voting and adapted Parallel-Distill-Refine to significantly boost coding agent performance on benchmarks.

🔹 Publication Date: Published on Apr 16

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2604.16529
• PDF: https://arxiv.org/pdf/2604.16529

==================================

For more data science resources:
✓ https://t.iss.one/DataScienceT

#AgenticAI #CodingAgents #MachineLearning #AIResearch #DeepLearning

❤1

148 views07:02

✨ Explore Data Science 📝 Write your paper

ML Research Hub

This media is not supported in your browser

VIEW IN TELEGRAM

✨DeVI: Physics-based Dexterous Human-Object Interaction via Synthetic Video Imitation

📝 Summary:
DeVI enables physically plausible dexterous robot control by leveraging text-conditioned synthetic videos through a hybrid tracking reward that combines 3D and 2D tracking for improved hand-object int...

🔹 Publication Date: Published on Apr 22

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2604.20841
• PDF: https://arxiv.org/pdf/2604.20841
• Project Page: https://snuvclab.github.io/devi/
• Github: https://github.com/snuvclab/devi

==================================

For more data science resources:
✓ https://t.iss.one/DataScienceT

#Robotics #AI #ComputerVision #HumanRobotInteraction #DeepLearning

188 views09:04

✨ Explore Data Science 📝 Write your paper

ML Research Hub

✨Encoder-Free Human Motion Understanding via Structured Motion Descriptions

📝 Summary:
Structured Motion Description SMD converts human motion into natural language, enabling large language models LLMs to reason about it directly. This encoder-free method achieves state-of-the-art performance on motion question answering and captioning.

🔹 Publication Date: Published on Apr 23

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2604.21668
• PDF: https://arxiv.org/pdf/2604.21668
• Project Page: https://yaozhang182.github.io/motion-smd/
• Github: https://yaozhang182.github.io/motion-smd/

🔹 Models citing this paper:
• https://huggingface.co/zyyy12138/motion-smd-lora

✨ Datasets citing this paper:
• https://huggingface.co/datasets/zyyy12138/motion-smd-data

==================================

For more data science resources:
✓ https://t.iss.one/DataScienceT

#HumanMotionUnderstanding #LLMs #NLP #AI #DeepLearning

Encoder-Free Human Motion Understanding via Structured Motion Descriptions

The world knowledge and reasoning capabilities of text-based large language models (LLMs) are advancing rapidly, yet current approaches to human motion understanding, including motion question...

❤1

161 views11:07

✨ Explore Data Science 📝 Write your paper

ML Research Hub

✨A Comprehensive Survey of Mixture-of-Experts: Algorithms, Theory, and Applications

📝 Summary:
Mixture of Experts MoE models enhance large AI model efficiency and performance by dynamically selecting sub-models for diverse data. This survey details MoE design, algorithms, theory, and applications in various machine learning fields.

🔹 Publication Date: Published on Mar 10, 2025

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2503.07137
• PDF: https://arxiv.org/pdf/2503.07137
• Github: https://github.com/deepseek-ai/DeepEP

==================================

For more data science resources:
✓ https://t.iss.one/DataScienceT

#MixtureOfExperts #MoE #AI #MachineLearning #DeepLearning

60 views23:10

✨ Explore Data Science 📝 Write your paper