ML Research Hub – Telegram

ML Research Hub

32.7K subscribers

5.67K photos

359 videos

24 files

6.13K links

Advancing research in Machine Learning – practical insights, tools, and techniques for researchers.

Admin: @HusseinSheikho || @Hussein_Sheikho

ML Research Hub

✨Cautious Weight Decay

📝 Summary:
Cautious Weight Decay (CWD) is an optimizer modification that applies weight decay only to parameters whose signs align with the optimizer update. It improves accuracy and loss in large-scale models without additional tuning and works as a drop-in change for common optimizers.
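The selective rule described in the summary can be sketched in a few lines of plain Python. This is a minimal sketch, not the paper's reference implementation. It assumes that the optimizer update `u` is the quantity subtracted from the parameter, that "aligned" means `x * u > 0`, and that the decay is decoupled and scaled by the learning rate. The function name and its arguments are hypothetical.

```python
def cautious_weight_decay_step(params, updates, lr, weight_decay):
    """Apply one step where decay touches only sign-aligned coordinates.

    Assumption: each update u is subtracted from its parameter x, so
    x * u > 0 means the step and the decay both pull x toward zero.
    """
    new_params = []
    for x, u in zip(params, updates):
        aligned = x * u > 0  # tie handling at zero is an assumption
        decay = weight_decay * x if aligned else 0.0
        new_params.append(x - lr * (u + decay))
    return new_params


params = [1.0, -2.0, 3.0]
updates = [0.5, 0.5, -1.0]
stepped = cautious_weight_decay_step(params, updates, lr=0.1, weight_decay=0.1)
```

In this toy call only the first coordinate is decayed; the other two have a sign that disagrees with their update, so they move by the plain optimizer step.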

🔹 Publication Date: Published on Oct 14, 2025

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2510.12402
• PDF: https://arxiv.org/pdf/2510.12402
• Project Page: https://elm.baulab.info
• Github: https://github.com/google-deepmind/simply

==================================

For more data science resources:
✓ https://t.iss.one/DataScienceT

#WeightDecay #Optimization #DeepLearning #MachineLearning #AI

❤1

356 views · 11:03

ML Research Hub

✨OASIS: Open Agent Social Interaction Simulations with One Million Agents

📝 Summary:
OASIS is a scalable and generalizable social media simulator that models up to one million LLM agents. It replicates complex social phenomena like information spreading and group polarization across platforms, demonstrating enhanced group dynamics and diverse opinions with larger agent groups.

🔹 Publication Date: Published on Nov 18, 2024

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2411.11581
• PDF: https://arxiv.org/pdf/2411.11581
• Github: https://github.com/camel-ai/oasis

✨ Spaces citing this paper:
• https://huggingface.co/spaces/nguyenthanhasia/oasis-demo

==================================

For more data science resources:
✓ https://t.iss.one/DataScienceT

#LLMAgents #SocialSimulation #AgentBasedModeling #ComputationalSocialScience #GroupDynamics

214 views · 00:00

ML Research Hub

✨FlashPrefill: Instantaneous Pattern Discovery and Thresholding for Ultra-Fast Long-Context Prefilling

📝 Summary:
FlashPrefill speeds up LLM prefilling by using instantaneous pattern discovery and dynamic thresholding for sparse attention. It reports a 27.78x speedup on 256K-token sequences and still a 1.71x speedup on 4K-token contexts.
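The summary does not spell out the algorithm, so the following is only a generic illustration of threshold-based block selection for sparse attention, not FlashPrefill's actual method. It assumes each key block already has one importance score for a given query block, and it keeps the blocks whose softmax weight is at least a fraction of the strongest block's weight. The function name and the `rel_threshold` parameter are hypothetical.

```python
import math


def select_key_blocks(block_scores, rel_threshold=0.1):
    """Return indices of key blocks to keep for one query block.

    Weights are exponentiated relative to the maximum score, so the
    strongest block always has weight 1.0 and the threshold adapts to
    each query block instead of being an absolute cutoff.
    """
    top = max(block_scores)
    weights = [math.exp(score - top) for score in block_scores]
    return [i for i, w in enumerate(weights) if w >= rel_threshold]


kept = select_key_blocks([0.0, 5.0, 4.0, -3.0], rel_threshold=0.1)
```

Attention would then be computed only over the kept blocks, which is where the savings on long sequences would come from in a scheme of this kind.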

🔹 Publication Date: Published on Mar 6

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.06199
• PDF: https://arxiv.org/pdf/2603.06199
• Github: https://github.com/qhfan/FlashPrefill

==================================

For more data science resources:
✓ https://t.iss.one/DataScienceT

#LLM #FlashPrefill #SparseAttention #LongContext #DeepLearning

142 views · 02:00

ML Research Hub

✨PixARMesh: Autoregressive Mesh-Native Single-View Scene Reconstruction

📝 Summary:
PixARMesh reconstructs complete 3D indoor scene meshes from a single image. It uses a unified model with cross-attention and autoregressive generation to directly predict layout and geometry, producing high-quality, lightweight meshes.

🔹 Publication Date: Published on Mar 6

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.05888
• PDF: https://arxiv.org/pdf/2603.05888
• Project Page: https://mlpc-ucsd.github.io/PixARMesh/
• Github: https://github.com/mlpc-ucsd/PixARMesh

🔹 Models citing this paper:
• https://huggingface.co/zx1239856/PixARMesh-EdgeRunner
• https://huggingface.co/zx1239856/PixARMesh-BPT

✨ Datasets citing this paper:
• https://huggingface.co/datasets/zx1239856/3d-front-ar-packed
• https://huggingface.co/datasets/zx1239856/PixARMesh-eval-data

==================================

For more data science resources:
✓ https://t.iss.one/DataScienceT

#3DReconstruction #ComputerVision #DeepLearning #SingleView3D #MeshGeneration

PixARMesh: Autoregressive Mesh-Native Single-View Scene Reconstruction

We introduce PixARMesh, a method to autoregressively reconstruct complete 3D indoor scene meshes directly from a single RGB image. Unlike prior methods that rely on implicit signed distance fields...

129 views · 02:01

ML Research Hub

✨Penguin-VL: Exploring the Efficiency Limits of VLM with LLM-based Vision Encoders

📝 Summary:
Penguin-VL introduces a vision encoder initialized from a text-only LLM, outperforming traditional contrastive pretraining. This method achieves superior visual fidelity and performance in multimodal tasks with a lightweight architecture, enabling efficient deployment on resource-constrained devi...

🔹 Publication Date: Published on Mar 6

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.06569
• PDF: https://arxiv.org/pdf/2603.06569
• Github: https://github.com/tencent-ailab/Penguin-VL

==================================

For more data science resources:
✓ https://t.iss.one/DataScienceT

#VLM #LLM #MultimodalAI #EfficientAI #AIResearch

104 views · 03:01

ML Research Hub

✨RoboMME: Benchmarking and Understanding Memory for Robotic Generalist Policies

📝 Summary:
RoboMME introduces a large-scale standardized benchmark for evaluating memory in vision-language-action models for long-horizon robotic manipulation. It comprises 16 tasks assessing temporal, spatial, object, and procedural memory. Experiments show memory effectiveness is highly task-dependent, w...

🔹 Publication Date: Published on Mar 4

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.04639
• PDF: https://arxiv.org/pdf/2603.04639
• Project Page: https://robomme.github.io/
• Github: https://github.com/RoboMME/robomme_benchmark

==================================

For more data science resources:
✓ https://t.iss.one/DataScienceT

#Robotics #AI #Benchmark #RoboticManipulation #Memory

95 views · 03:01

ML Research Hub

✨Reasoning Models Struggle to Control their Chains of Thought

📝 Summary:
Reasoning models exhibit very low control over their Chain-of-Thought steps compared to their final outputs. This low controllability, though poorly understood, currently suggests CoT monitoring remains a reliable tool for understanding models.

🔹 Publication Date: Published on Mar 5

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.05706
• PDF: https://arxiv.org/pdf/2603.05706

==================================

For more data science resources:
✓ https://t.iss.one/DataScienceT

#AI #MachineLearning #ChainOfThought #LLMs #AIResearch

95 views · 03:01

ML Research Hub

✨Physical Simulator In-the-Loop Video Generation

📝 Summary:
PSIVG integrates a physical simulator into video diffusion processes to generate physically consistent videos while maintaining visual quality and diversity.

🔹 Publication Date: Published on Mar 6

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.06408
• PDF: https://arxiv.org/pdf/2603.06408
• Project Page: https://vcai.mpi-inf.mpg.de/projects/PSIVG/

==================================

For more data science resources:
✓ https://t.iss.one/DataScienceT

#AI #DataScience #MachineLearning #HuggingFace #Research

123 views · 03:02

ML Research Hub

✨Dynamic Chunking Diffusion Transformer

📝 Summary:
Dynamic Chunking Diffusion Transformer adapts token sequence length based on image content and diffusion timestep, improving efficiency and performance over fixed-token approaches.

🔹 Publication Date: Published on Mar 6

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.06351
• PDF: https://arxiv.org/pdf/2603.06351

==================================

For more data science resources:
✓ https://t.iss.one/DataScienceT

#AI #DataScience #MachineLearning #HuggingFace #Research

149 views · 03:02

ML Research Hub

✨π-StepNFT: Wider Space Needs Finer Steps in Online RL for Flow-based VLAs

📝 Summary:
Flow-based VLA models face challenges in online RL. We propose π-StepNFT, a critic-free framework that uses step-wise guidance for wider exploration. It improves generalization and robustness in complex environments, offering a scalable solution.

🔹 Publication Date: Published on Mar 2

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.02083
• PDF: https://arxiv.org/pdf/2603.02083
• Project Page: https://wangst0181.github.io/pi-StepNFT/
• Github: https://github.com/wangst0181/pi-StepNFT

==================================

For more data science resources:
✓ https://t.iss.one/DataScienceT

#ReinforcementLearning #OnlineRL #MachineLearning #DeepLearning #πStepNFT

155 views · 04:02

ML Research Hub

✨BandPO: Bridging Trust Regions and Ratio Clipping via Probability-Aware Bounds for LLM Reinforcement Learning

📝 Summary:
BandPO addresses entropy collapse in LLM RL by replacing fixed PPO clipping. It uses Band, a dynamic probability-aware projection operator that prevents suppression of high-advantage actions. This method improves stability and exploration.
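For context, the fixed clipping that BandPO replaces is the standard PPO clipped surrogate. The sketch below shows only that baseline, not the Band operator, whose definition the summary does not give. The second helper illustrates why a fixed ratio bound leaves little absolute headroom for low-probability actions. Both function names are hypothetical.

```python
def ppo_clipped_objective(ratio, advantage, eps=0.2):
    """Standard PPO surrogate for one action: min(r * A, clip(r) * A)."""
    clipped_ratio = max(1.0 - eps, min(ratio, 1.0 + eps))
    return min(ratio * advantage, clipped_ratio * advantage)


def max_unclipped_prob(old_prob, eps=0.2):
    """Largest new probability that still gets gradient when A > 0."""
    return min(1.0, old_prob * (1.0 + eps))


rare_action_cap = max_unclipped_prob(0.01)    # old probability 1%
common_action_cap = max_unclipped_prob(0.5)   # old probability 50%
```

With `eps=0.2`, an action at 1% probability can rise by only 0.2 percentage points per update before the clip removes its gradient, while an action at 50% can rise by 10 points. This is the kind of suppression of high-advantage actions that a probability-aware bound is meant to avoid.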

🔹 Publication Date: Published on Mar 5

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.04918
• PDF: https://arxiv.org/pdf/2603.04918
• Github: https://github.com/OpenMOSS/BandPO

==================================

For more data science resources:
✓ https://t.iss.one/DataScienceT

#LLM #ReinforcementLearning #PPO #MachineLearning #AIResearch

❤1

135 views · 06:02

ML Research Hub

✨Progressive Residual Warmup for Language Model Pretraining

📝 Summary:
Progressive Residual Warmup (ProRes) stabilizes transformer pretraining by gradually activating residual connections layer by layer. This "early layer learns first" strategy improves convergence speed, generalization, and downstream performance.
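One way to picture layer-by-layer activation is a per-layer scale on the residual branch that ramps from 0 to 1, with deeper layers starting later. The schedule below is a hypothetical staggered linear ramp chosen for illustration; the summary does not state the schedule the paper uses, and the function names and the `warmup_steps_per_layer` parameter are assumptions.

```python
def residual_scale(layer_idx, step, warmup_steps_per_layer):
    """Hypothetical staggered linear ramp in [0, 1] for one layer."""
    start = layer_idx * warmup_steps_per_layer  # deeper layers start later
    progress = (step - start) / warmup_steps_per_layer
    return max(0.0, min(1.0, progress))


def residual_block(x, branch_output, scale):
    """Residual update with a gated branch: x + scale * f(x)."""
    return x + scale * branch_output


scales_at_step_150 = [residual_scale(i, 150, 100) for i in range(4)]
```

At step 150 with 100 warmup steps per layer, the first layer is fully active, the second is halfway through its ramp, and the remaining layers still pass their input through unchanged.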

🔹 Publication Date: Published on Mar 5

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.05369
• PDF: https://arxiv.org/pdf/2603.05369
• Github: https://github.com/dandingsky/ProRes

==================================

For more data science resources:
✓ https://t.iss.one/DataScienceT

#LLM #Transformer #DeepLearning #NLP #Pretraining

167 views · 06:03

ML Research Hub

✨SLER-IR: Spherical Layer-wise Expert Routing for All-in-One Image Restoration

📝 Summary:
SLER-IR is an all-in-one image restoration framework using spherical layer-wise expert routing. It introduces a spherical degradation embedding with contrastive learning for reliable routing and a granularity fusion module for non-uniform degradations. It consistently outperforms state-of-the-art...

🔹 Publication Date: Published on Mar 6

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.05940
• PDF: https://arxiv.org/pdf/2603.05940

==================================

For more data science resources:
✓ https://t.iss.one/DataScienceT

#AI #DataScience #MachineLearning #HuggingFace #Research

174 views · 07:03

ML Research Hub

✨HiMAP-Travel: Hierarchical Multi-Agent Planning for Long-Horizon Constrained Travel

📝 Summary:
HiMAP-Travel is a hierarchical multi-agent framework that solves long-horizon constrained travel planning. It decomposes tasks into strategic coordination and parallel execution, achieving superior performance over baselines and reducing latency.

🔹 Publication Date: Published on Mar 5

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.04750
• PDF: https://arxiv.org/pdf/2603.04750

==================================

For more data science resources:
✓ https://t.iss.one/DataScienceT

#AI #DataScience #MachineLearning #HuggingFace #Research

133 views · 08:03

ML Research Hub

✨Planning in 8 Tokens: A Compact Discrete Tokenizer for Latent World Model

📝 Summary:
CompACT, a discrete tokenizer that reduces observation encoding from hundreds to 8 tokens, enables faster and more efficient world model planning for real-time control applications.

🔹 Publication Date: Published on Mar 5

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.05438
• PDF: https://arxiv.org/pdf/2603.05438
• Github: https://github.com/kdwonn/CompACT

==================================

For more data science resources:
✓ https://t.iss.one/DataScienceT

#AI #DataScience #MachineLearning #HuggingFace #Research

88 views · 09:03

ML Research Hub

✨WildActor: Unconstrained Identity-Preserving Video Generation

📝 Summary:
WildActor generates consistent human videos with full-body identity preservation across varying viewpoints and motions using a large-scale dataset and novel attention mechanisms.

🔹 Publication Date: Published on Feb 28

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.00586
• PDF: https://arxiv.org/pdf/2603.00586
• Project Page: https://wildactor.github.io/

==================================

For more data science resources:
✓ https://t.iss.one/DataScienceT

#AI #DataScience #MachineLearning #HuggingFace #Research

82 views · 09:03

ML Research Hub

✨Beyond the Grid: Layout-Informed Multi-Vector Retrieval with Parsed Visual Document Representations

📝 Summary:
ColParse introduces a document parsing approach that generates layout-informed sub-image embeddings to create compact, structurally-aware representations for visual document retrieval, achieving over ...

🔹 Publication Date: Published on Mar 2

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.01666
• PDF: https://arxiv.org/pdf/2603.01666

==================================

For more data science resources:
✓ https://t.iss.one/DataScienceT

#AI #DataScience #MachineLearning #HuggingFace #Research

87 views · 09:04

ML Research Hub

✨Making Reconstruction FID Predictive of Diffusion Generation FID

📝 Summary:
A new metric called interpolated FID is proposed that shows strong correlation with generation FID in diffusion models, addressing the poor correlation issue between reconstruction FID and generation ...

🔹 Publication Date: Published on Mar 5

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.05630
• PDF: https://arxiv.org/pdf/2603.05630
• Github: https://github.com/tongdaxu/Making-rFID-Predictive-of-Diffusion-gFID

==================================

For more data science resources:
✓ https://t.iss.one/DataScienceT

#AI #DataScience #MachineLearning #HuggingFace #Research

133 views · 09:04

ML Research Hub

✨Demystifying Action Space Design for Robotic Manipulation Policies

📝 Summary:
Large-scale empirical study demonstrates that action space design significantly impacts robotic policy learning, with delta action prediction improving performance and joint-space/task-space represent...
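To make the term "delta action" concrete, the sketch below converts a sequence of absolute targets into step-to-step differences. It is only one possible convention (each delta is taken relative to the previous target, starting from the current state); the summary does not say which convention the study uses, and the function name is hypothetical.

```python
def to_delta_actions(current_state, absolute_targets):
    """Convert absolute targets into per-step deltas.

    Assumption: each delta is measured from the previous target, and the
    first delta is measured from the current state.
    """
    deltas = []
    previous = current_state
    for target in absolute_targets:
        deltas.append([t - p for t, p in zip(target, previous)])
        previous = target
    return deltas


deltas = to_delta_actions([0.0, 1.0], [[0.5, 1.0], [1.0, 2.0]])
```

The same conversion applies whether the vectors hold joint angles or end-effector coordinates; only the meaning of each dimension changes.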

🔹 Publication Date: Published on Feb 26

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2602.23408
• PDF: https://arxiv.org/pdf/2602.23408

==================================

For more data science resources:
✓ https://t.iss.one/DataScienceT

#AI #DataScience #MachineLearning #HuggingFace #Research

148 views · 09:04

ML Research Hub

✨Mario: Multimodal Graph Reasoning with Large Language Models

📝 Summary:
Mario is a unified framework that enables large language model-based reasoning on multimodal graphs by addressing cross-modal consistency and heterogeneous modality preferences through graph-condition...

🔹 Publication Date: Published on Mar 5

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.05181
• PDF: https://arxiv.org/pdf/2603.05181
• Github: https://github.com/sunyuanfu/Mario

==================================

For more data science resources:
✓ https://t.iss.one/DataScienceT

#AI #DataScience #MachineLearning #HuggingFace #Research

131 views · 10:04