ML Research Hub – Telegram

ML Research Hub

32.4K subscribers

6.27K photos

421 videos

24 files

6.8K links

Advancing research in Machine Learning – practical insights, tools, and techniques for researchers.

Admin: @HusseinSheikho || @Hussein_Sheikho

ML Research Hub

✨KAT-Coder-V2 Technical Report

📝 Summary:
KAT-Coder-V2 is an agentic coding model that uses a 'Specialize-then-Unify' approach across five expert domains. It employs novel training methods and infrastructure, achieving strong performance on SWE-bench, PinchBench, and other coding benchmarks.

🔹 Publication Date: Published on Mar 29

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.27703
• PDF: https://arxiv.org/pdf/2603.27703

==================================

For more data science resources:
✓ https://t.iss.one/DataScienceT

#AI #Coding #LLM #MachineLearning #Research

200 views · 09:40

✨ Explore Data Science 📝 Write your paper

ML Research Hub

✨Unified Number-Free Text-to-Motion Generation Via Flow Matching

📝 Summary:
Existing text-to-motion models struggle with a variable number of agents, leading to inefficiency and errors. This paper proposes Unified Motion Flow (UMF), a two-stage approach (prior and reaction) that uses P-Flow and S-Flow in a unified latent space. UMF effectively generates multi-person motion from text, mi...

🔹 Publication Date: Published on Mar 27

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.27040
• PDF: https://arxiv.org/pdf/2603.27040
• Project Page: https://githubhgh.github.io/umf/
• Github: https://github.com/Githubhgh/UMF_CVPR

#TextToMotion #FlowMatching #GenerativeAI #MotionSynthesis #DeepLearning
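
The summary above names flow matching but does not describe P-Flow or S-Flow in detail. As a rough illustration only, here is the generic flow-matching training target on a straight-line path between a noise sample and a data sample. The function names and the `predict_velocity` callback are hypothetical and are not taken from the UMF code.

```python
def flow_matching_pair(x0, x1, t):
    """Return the interpolated point x_t and the target velocity for one sample.

    Assumes the straight-line path x_t = (1 - t) * x0 + t * x1,
    whose velocity is the constant x1 - x0.
    """
    x_t = [(1.0 - t) * a + t * b for a, b in zip(x0, x1)]
    velocity = [b - a for a, b in zip(x0, x1)]
    return x_t, velocity


def flow_matching_loss(predict_velocity, x0, x1, t):
    """Mean squared error between a model's predicted velocity and the target."""
    x_t, target = flow_matching_pair(x0, x1, t)
    pred = predict_velocity(x_t, t)  # hypothetical model callback
    return sum((p - q) ** 2 for p, q in zip(pred, target)) / len(target)
```

In a real model, `x0` would be noise, `x1` a motion latent, and `predict_velocity` a network that also receives the text condition.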

201 views · 10:40

ML Research Hub

✨Text Data Integration

📝 Summary:
This paper argues for integrating textual data into data integration systems, as current approaches largely focus on structured data. It will explore the challenges, state-of-the-art, and open problems in utilizing unstructured text.

🔹 Publication Date: Published on Mar 28

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.27055
• PDF: https://arxiv.org/pdf/2603.27055
• Project Page: https://dtim.upc.edu/en
• Github: https://github.com/dtim-upc/THOR

#DataIntegration #UnstructuredData #TextData #NLP #DataScience

154 views · 13:41

ML Research Hub

✨A Comparative Study in Surgical AI: Datasets, Foundation Models, and Barriers to Med-AGI

📝 Summary:
This paper finds that even state-of-the-art multi-billion parameter AI models struggle with surgical tool detection, a seemingly simple task. Scaling models further offers diminishing returns, suggesting fundamental limitations for current Vision Language Models in surgical use cases beyond just ...

🔹 Publication Date: Published on Mar 28

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.27341
• PDF: https://arxiv.org/pdf/2603.27341

#SurgicalAI #MedicalAI #FoundationModels #VisionLanguageModels #AIHealthcare

162 views · 13:41

ML Research Hub

✨HandX: Scaling Bimanual Motion and Interaction Generation

📝 Summary:
HandX presents a new foundation for bimanual hand motion synthesis, offering a high-fidelity dataset, an LLM-driven annotation method, and new evaluation metrics. It enables high-quality dexterous motion generation, with scaling trends observed.

🔹 Publication Date: Published on Mar 30

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.28766
• PDF: https://arxiv.org/pdf/2603.28766
• Project Page: https://github.com/handx-project/HandX
• Github: https://github.com/handx-project/HandX

#MotionSynthesis #BimanualInteraction #DexterousManipulation #AIResearch #LLM

143 views · 14:41

ML Research Hub

✨AdaptToken: Entropy-based Adaptive Token Selection for MLLM Long Video Understanding

📝 Summary:
AdaptToken enables efficient long video understanding for MLLMs by using model uncertainty to dynamically select relevant tokens. It allocates a global token budget and supports early stopping, significantly improving accuracy and reducing inference time across benchmarks.

🔹 Publication Date: Published on Mar 30

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.28696
• PDF: https://arxiv.org/pdf/2603.28696
• Project Page: https://haozheqi.github.io/adapt-token
• Github: https://github.com/HaozheQi/AdaptToken

#MLLM #VideoUnderstanding #MachineLearning #AIResearch #TokenSelection
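
To make the idea in the summary above concrete, here is a minimal sketch of entropy as an uncertainty signal for splitting a global token budget, with a simple early-stopping check. This is not the AdaptToken implementation: the allocation rule, the threshold, and all function names are assumptions chosen for illustration.

```python
import math


def entropy(probs):
    """Shannon entropy (in nats) of a discrete probability distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def allocate_budget(group_probs, total_budget):
    """Split a global token budget across frame groups.

    Assumption: a group whose answer distribution has lower entropy is
    treated as more relevant and gets a larger share. Each distribution
    must have at least two classes.
    """
    max_entropy = math.log(len(group_probs[0]))
    # Confidence in [0, 1]: 1 means fully certain, 0 means uniform.
    confidence = [1.0 - entropy(p) / max_entropy for p in group_probs]
    total = sum(confidence)
    if total <= 0.0:
        # All groups equally uncertain: fall back to an even split.
        return [total_budget // len(group_probs)] * len(group_probs)
    return [int(total_budget * c / total) for c in confidence]


def should_stop_early(probs, threshold=0.2):
    """Hypothetical rule: stop processing once entropy falls below a threshold."""
    return entropy(probs) < threshold
```

Here `group_probs` stands for the model's answer distribution after seeing each group of frames. A confident group receives most of the budget, and a uniform distribution receives none.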

182 views · 14:41

ML Research Hub

✨LongTail Driving Scenarios with Reasoning Traces: The KITScenes LongTail Dataset

📝 Summary:
This paper introduces KITScenes LongTail, a new dataset for long-tail driving events. It offers multi-view video, trajectories, and multilingual expert reasoning traces. This resource improves few-shot generalization and evaluates multimodal models' instruction-following capabilities.

🔹 Publication Date: Published on Mar 24

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.23607
• PDF: https://arxiv.org/pdf/2603.23607
• Project Page: https://huggingface.co/datasets/KIT-MRT/KITScenes-LongTail

✨ Datasets citing this paper:
• https://huggingface.co/datasets/KIT-MRT/KITScenes-LongTail

#AutonomousDriving #ComputerVision #Datasets #LongTailLearning #MultimodalAI

192 views · 15:49

ML Research Hub

Forwarded from Machine Learning with Python

Follow the Machine Learning with Python channel on WhatsApp: https://whatsapp.com/channel/0029VaC7Weq29753hpcggW2A

❤1

70 views · 16:10

ML Research Hub

✨ChartNet: A Million-Scale, High-Quality Multimodal Dataset for Robust Chart Understanding

📝 Summary:
ChartNet is a 1.5 million-sample multimodal dataset. It improves AI models' chart understanding by providing diverse charts with aligned visual, textual, and numerical data. Fine-tuning on this open-source dataset significantly enhances performance in data visualization.

🔹 Publication Date: Published on Mar 28

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.27064
• PDF: https://arxiv.org/pdf/2603.27064
• Project Page: https://huggingface.co/datasets/ibm-granite/ChartNet

🔹 Models citing this paper:
• https://huggingface.co/ibm-granite/granite-4.0-3b-vision
• https://huggingface.co/beaupi/granite-4.0-3b-vision-oQ8

✨ Datasets citing this paper:
• https://huggingface.co/datasets/ibm-granite/ChartNet

#AI #DataScience #MachineLearning #HuggingFace #Research

176 views · 19:25

ML Research Hub

✨STRIDE: When to Speak Meets Sequence Denoising for Streaming Video Understanding

📝 Summary:
STRIDE enables proactive video understanding by modeling temporal activation patterns through iterative denoising within sliding windows, improving timing decisions in streaming scenarios.

🔹 Publication Date: Published on Mar 29

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.27593
• PDF: https://arxiv.org/pdf/2603.27593
• Project Page: https://interlive-team.github.io/STRIDE/
• Github: https://interlive-team.github.io/STRIDE/

🔹 Models citing this paper:
• https://huggingface.co/interlive/STRIDE-2B

#AI #DataScience #MachineLearning #HuggingFace #Research

157 views · 19:25

ML Research Hub

✨SpatialLM: Training Large Language Models for Structured Indoor Modeling

📝 Summary:
SpatialLM, a multimodal large language model, processes 3D point cloud data to generate structured scene understanding outputs, achieving state-of-the-art performance in layout estimation and competit...

🔹 Publication Date: Published on Jun 9, 2025

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2506.07491
• PDF: https://arxiv.org/pdf/2506.07491
• Project Page: https://manycore-research.github.io/SpatialLM
• Github: https://github.com/manycore-research/SpatialLM

🔹 Models citing this paper:
• https://huggingface.co/manycore-research/SpatialLM1.1-Qwen-0.5B
• https://huggingface.co/manycore-research/SpatialLM1.1-Llama-1B

✨ Datasets citing this paper:
• https://huggingface.co/datasets/Voxel51/spatial_lm_dataset
• https://huggingface.co/datasets/manycore-research/SpatialLM-Dataset
• https://huggingface.co/datasets/manycore-research/SpatialLM-Testset

#AI #DataScience #MachineLearning #HuggingFace #Research

SpatialLM: Training Large Language Models for Structured Indoor Modeling

SpatialLM is a large language model designed to process 3D point cloud data and generate structured 3D scene understanding outputs. These outputs include architectural elements like walls, doors,...

150 views · 20:25

ML Research Hub

✨INSID3: Training-Free In-Context Segmentation with DINOv3

📝 Summary:
INSID3 demonstrates training-free in-context segmentation using only frozen DINOv3 features. This minimalist approach achieves state-of-the-art results across various segmentation tasks with fewer parameters and no supervision.

🔹 Publication Date: Published on Mar 30

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.28480
• PDF: https://arxiv.org/pdf/2603.28480
• Project Page: https://visinf.github.io/INSID3/
• Github: https://github.com/visinf/INSID3

#AI #DataScience #MachineLearning #HuggingFace #Research
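
The summary above says segmentation is done with frozen features and no training. One generic way to picture that is nearest-neighbor label transfer: each target patch copies the mask label of its most similar reference patch. The sketch below assumes patch features have already been extracted. It is not the INSID3 algorithm, and the function names are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity between two feature vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm > 0.0 else 0.0


def propagate_mask(ref_feats, ref_mask, tgt_feats):
    """Label each target patch with the mask value of its most similar reference patch."""
    labels = []
    for feat in tgt_feats:
        best = max(range(len(ref_feats)), key=lambda i: cosine(feat, ref_feats[i]))
        labels.append(ref_mask[best])
    return labels
```

No parameters are learned here. The output quality depends entirely on how well the frozen features separate object from background.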

❤1

142 views · 21:26

ML Research Hub

✨LongCat-Next: Lexicalizing Modalities as Discrete Tokens

📝 Summary:
LongCat-Next introduces DiNA, a unified framework for native multimodal processing. It represents text, vision, and audio as discrete tokens in a shared space, enabling consistent autoregressive modeling. This reconciles understanding-generation conflicts and excels across many multimodal tasks.

🔹 Publication Date: Published on Mar 29

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.27538
• PDF: https://arxiv.org/pdf/2603.27538
• Project Page: https://longcat.chat/longcat-next/intro
• Github: https://github.com/meituan-longcat/LongCat-Next

🔹 Models citing this paper:
• https://huggingface.co/meituan-longcat/LongCat-Next

#AI #DataScience #MachineLearning #HuggingFace #Research

124 views · 23:26

ML Research Hub

✨A Neural Score-Based Particle Method for the Vlasov-Maxwell-Landau System

📝 Summary:
This paper introduces neural score-based transport modeling (SBTM) for the Vlasov-Maxwell-Landau system. SBTM replaces the less efficient blob method, providing linear cost, greater accuracy, faster runtime, and lower memory, correctly simulating long-term plasma relaxation.

🔹 Publication Date: Published on Mar 26

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.25832
• PDF: https://arxiv.org/pdf/2603.25832
• Github: https://github.com/Vilin97/Vlasov-Landau-SBTM

#AI #DataScience #MachineLearning #HuggingFace #Research

127 views · 01:00

ML Research Hub

✨CARLA-Air: Fly Drones Inside a CARLA World -- A Unified Infrastructure for Air-Ground Embodied Intelligence

📝 Summary:
CARLA-Air unifies high-fidelity urban driving and multirotor flight simulation within a single Unreal Engine framework. This addresses the demand for joint air-ground agent modeling in photorealistic environments, supporting embodied intelligence research.

🔹 Publication Date: Published on Mar 30

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.28032
• PDF: https://arxiv.org/pdf/2603.28032
• Project Page: https://github.com/louiszengCN/CarlaAir
• Github: https://github.com/louiszengCN/CarlaAir

🔹 Models citing this paper:
• https://huggingface.co/tianlezeng/CarlaAIr-v0.1.7

#AI #DataScience #MachineLearning #HuggingFace #Research

CARLA-Air: Fly Drones Inside a CARLA World -- A Unified...

The convergence of low-altitude economies, embodied intelligence, and air-ground cooperative systems creates growing demand for simulation infrastructure capable of jointly modeling aerial and...

116 views · 02:01

ML Research Hub

✨SeGPruner: Semantic-Geometric Visual Token Pruner for 3D Question Answering

📝 Summary:
A semantic-aware and geometry-guided token pruning framework is presented for efficient 3D question answering with multi-view images, achieving significant reductions in token budget and inference lat...

🔹 Publication Date: Published on Mar 31

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.29437
• PDF: https://arxiv.org/pdf/2603.29437
• Project Page: https://github.com/intcomp/SegPruner
• Github: https://github.com/intcomp/SegPruner

#AI #DataScience #MachineLearning #HuggingFace #Research

SeGPruner: Semantic-Geometric Visual Token Pruner for 3D Question Answering

Vision-language models (VLMs) have been widely adopted for 3D question answering (3D QA). In typical pipelines, visual tokens extracted from multiple viewpoints are concatenated with language...

117 views · 02:01

ML Research Hub

✨Lingshu-Cell: A generative cellular world model for transcriptome modeling toward virtual cells

📝 Summary:
Lingshu-Cell is a masked discrete diffusion model that learns transcriptomic state distributions and enables conditional simulation of cellular perturbations across diverse tissues and species.

🔹 Publication Date: Published on Mar 26

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.25240
• PDF: https://arxiv.org/pdf/2603.25240

#AI #DataScience #MachineLearning #HuggingFace #Research

82 views · 03:01

ML Research Hub

✨VGGRPO: Towards World-Consistent Video Generation with 4D Latent Reward

📝 Summary:
VGGRPO is a latent geometry-guided framework that enhances video diffusion models' geometric consistency through a Latent Geometry Model and latent-space reinforcement learning with camera motion and ...

🔹 Publication Date: Published on Mar 27

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.26599
• PDF: https://arxiv.org/pdf/2603.26599
• Project Page: https://zhaochongan.github.io/projects/VGGRPO/

#AI #DataScience #MachineLearning #HuggingFace #Research

76 views · 03:02

ML Research Hub

✨OptiMer: Optimal Distribution Vector Merging Is Better than Data Mixing for Continual Pre-Training

📝 Summary:
OptiMer enables flexible continual pre-training by decoupling data mixture ratio selection from training through post-hoc Bayesian optimization of distribution vectors extracted from individual datase...

🔹 Publication Date: Published on Mar 30

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.28858
• PDF: https://arxiv.org/pdf/2603.28858
• Github: https://github.com/shyyhs/optimer

#AI #DataScience #MachineLearning #HuggingFace #Research
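
The summary above is truncated, so the exact definition of a distribution vector is not given here. As a sketch only, assume each vector is a per-dataset parameter delta (trained weights minus base weights). Merging is then a weighted sum added back to the base, and the weights can be tuned after training. Plain random search stands in for the Bayesian optimization the summary mentions, and all names are hypothetical.

```python
import random


def merge(base, deltas, weights):
    """Return base + sum_i weights[i] * deltas[i], element-wise."""
    merged = list(base)
    for w, delta in zip(weights, deltas):
        for j, value in enumerate(delta):
            merged[j] += w * value
    return merged


def search_weights(base, deltas, score, trials=200, seed=0):
    """Random search over merge weights in [0, 1).

    A simple stand-in for Bayesian optimization: `score` is a
    user-supplied function where higher is better.
    """
    rng = random.Random(seed)
    best_weights, best_score = None, float("-inf")
    for _ in range(trials):
        weights = [rng.random() for _ in deltas]
        current = score(merge(base, deltas, weights))
        if current > best_score:
            best_weights, best_score = weights, current
    return best_weights, best_score
```

Because the deltas are computed once per dataset, trying a new mixture costs only a merge and an evaluation rather than a new training run. That matches the decoupling of ratio selection from training described in the summary.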

108 views · 03:02

ML Research Hub

✨AutoWeather4D: Autonomous Driving Video Weather Conversion via G-Buffer Dual-Pass Editing

📝 Summary:
AutoWeather4D is a 3D-aware weather editing framework that decouples geometry and illumination through a dual-pass mechanism, enabling efficient and physically accurate weather modification for autono...

🔹 Publication Date: Published on Mar 27

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.26546
• PDF: https://arxiv.org/pdf/2603.26546
• Project Page: https://lty2226262.github.io/autoweather4d/
• Github: https://github.com/lty2226262/AutoWeather4D

#AutonomousDriving #ComputerVision #WeatherEditing #3DGraphics #AIResearch

84 views · 03:03