ML Research Hub


✨TAPS: Task Aware Proposal Distributions for Speculative Sampling

📝 Summary:
Speculative decoding quality depends on matching draft model training data to the downstream task. Task-specific training yields specialized drafters that are best combined at inference time using confidence-based routing, outperforming averaging. Confidence is a more effective routing signal tha...
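
To make the contrast between routing and averaging concrete, here is a minimal sketch. It assumes that "confidence" means a drafter's top-token probability; the summary does not say how the paper defines it. The drafter names and probabilities are hypothetical.

```python
from typing import Dict, List

Dist = Dict[str, float]  # token -> next-token probability


def confidence(dist: Dist) -> float:
    """Assumed confidence score: the probability of the drafter's top token."""
    return max(dist.values())


def route_by_confidence(dists: List[Dist]) -> int:
    """Return the index of the drafter whose proposal is most confident."""
    return max(range(len(dists)), key=lambda i: confidence(dists[i]))


def average(dists: List[Dist]) -> Dist:
    """Baseline: uniform average of the drafters' distributions."""
    vocab = set().union(*dists)
    return {tok: sum(d.get(tok, 0.0) for d in dists) / len(dists) for tok in vocab}


# Hypothetical next-token distributions from two specialized drafters.
math_drafter = {"x": 0.90, "the": 0.05, "=": 0.05}
chat_drafter = {"x": 0.30, "the": 0.40, "=": 0.30}

chosen = route_by_confidence([math_drafter, chat_drafter])
mixed = average([math_drafter, chat_drafter])
```

Routing keeps the sharper distribution of whichever drafter is selected, while averaging flattens it by blending in the less confident drafter.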

🔹 Publication Date: Published on Mar 27

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.27027
• PDF: https://arxiv.org/pdf/2603.27027
• Github: https://github.com/Moe-Zbeeb/TAPS

🔹 Models citing this paper:
• https://huggingface.co/zbeeb/Hass-MathInstruct_20epochs
• https://huggingface.co/zbeeb/Hass-ShareGPT_20epochs
• https://huggingface.co/zbeeb/Hass-Sharegpt-Mathinstruct-20epochs

✨ Datasets citing this paper:
• https://huggingface.co/datasets/zbeeb/TAPS-Datasets

==================================

For more data science resources:
✓ https://t.iss.one/DataScienceT

#SpeculativeDecoding #LLM #MachineLearning #AIResearch #NLP

arXiv.org

TAPS: Task Aware Proposal Distributions for Speculative Sampling

Speculative decoding accelerates autoregressive generation by letting a lightweight draft model propose future tokens that a larger target model then verifies in parallel. In practice, however,...
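
The verify step can be sketched with the standard speculative sampling acceptance rule: accept a drafted token with probability min(1, p/q), otherwise resample from the normalized residual max(0, p - q). This is the generic rule, not necessarily the exact variant used in TAPS, and the function name and dictionary representation are illustrative assumptions.

```python
import random
from typing import Dict

Dist = Dict[str, float]  # token -> probability


def verify_token(token: str, draft: Dist, target: Dist, rng: random.Random) -> str:
    """Check one drafted token against the target model's distribution.

    `token` is assumed to have been sampled from `draft`, so draft[token] > 0.
    """
    p = target.get(token, 0.0)  # target probability of the drafted token
    q = draft[token]            # draft probability of the drafted token
    if rng.random() < min(1.0, p / q):
        return token  # accepted
    # Rejected: resample from the residual distribution max(0, p - q).
    residual = {t: max(0.0, target[t] - draft.get(t, 0.0)) for t in target}
    total = sum(residual.values())
    tokens = list(residual)
    weights = [residual[t] / total for t in tokens]
    return rng.choices(tokens, weights=weights, k=1)[0]


rng = random.Random(0)
same = verify_token("a", {"a": 0.5, "b": 0.5}, {"a": 0.5, "b": 0.5}, rng)
disjoint = verify_token("a", {"a": 1.0}, {"b": 1.0}, rng)
```

When the draft and target distributions agree, every drafted token is accepted; the closer the drafter matches the target on a task, the more tokens survive verification.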

132 views · 09:40

ML Research Hub

✨KAT-Coder-V2 Technical Report

📝 Summary:
KAT-Coder-V2 is an agentic coding model that uses a 'Specialize-then-Unify' approach across five expert domains. It employs novel training methods and infrastructure, achieving strong performance on SWE-bench, PinchBench, and other coding benchmarks.

🔹 Publication Date: Published on Mar 29

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.27703
• PDF: https://arxiv.org/pdf/2603.27703

==================================

For more data science resources:
✓ https://t.iss.one/DataScienceT

#AI #Coding #LLM #MachineLearning #Research

202 views · 09:40

ML Research Hub

✨Unified Number-Free Text-to-Motion Generation Via Flow Matching

📝 Summary:
Existing text-to-motion models struggle with a variable number of agents, leading to inefficiency and errors. This paper proposes Unified Motion Flow (UMF), a two-stage approach (prior and reaction) that uses P-Flow and S-Flow in a unified latent space. UMF effectively generates multi-person motion from text, mi...

🔹 Publication Date: Published on Mar 27

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.27040
• PDF: https://arxiv.org/pdf/2603.27040
• Project Page: https://githubhgh.github.io/umf/
• Github: https://github.com/Githubhgh/UMF_CVPR

==================================

For more data science resources:
✓ https://t.iss.one/DataScienceT

#TextToMotion #FlowMatching #GenerativeAI #MotionSynthesis #DeepLearning

204 views · 10:40

ML Research Hub

✨Text Data Integration

📝 Summary:
This paper argues for bringing textual data into data integration systems, since current approaches focus largely on structured data. It explores the challenges, the state of the art, and open problems in using unstructured text.

🔹 Publication Date: Published on Mar 28

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.27055
• PDF: https://arxiv.org/pdf/2603.27055
• Project Page: https://dtim.upc.edu/en
• Github: https://github.com/dtim-upc/THOR

==================================

For more data science resources:
✓ https://t.iss.one/DataScienceT

#DataIntegration #UnstructuredData #TextData #NLP #DataScience

157 views · 13:41

ML Research Hub

✨A Comparative Study in Surgical AI: Datasets, Foundation Models, and Barriers to Med-AGI

📝 Summary:
This paper finds that even state-of-the-art multi-billion parameter AI models struggle with surgical tool detection, a seemingly simple task. Scaling models further offers diminishing returns, suggesting fundamental limitations for current Vision Language Models in surgical use cases beyond just ...

🔹 Publication Date: Published on Mar 28

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.27341
• PDF: https://arxiv.org/pdf/2603.27341

==================================

For more data science resources:
✓ https://t.iss.one/DataScienceT

#SurgicalAI #MedicalAI #FoundationModels #VisionLanguageModels #AIHealthcare

164 views · 13:41

ML Research Hub


✨HandX: Scaling Bimanual Motion and Interaction Generation

📝 Summary:
HandX presents a new foundation for bimanual hand motion synthesis, offering a high-fidelity dataset, an LLM-driven annotation method, and new evaluation metrics. It enables high-quality dexterous motion generation, with scaling trends observed.

🔹 Publication Date: Published on Mar 30

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.28766
• PDF: https://arxiv.org/pdf/2603.28766
• Project Page: https://github.com/handx-project/HandX
• Github: https://github.com/handx-project/HandX

==================================

For more data science resources:
✓ https://t.iss.one/DataScienceT

#MotionSynthesis #BimanualInteraction #DexterousManipulation #AIResearch #LLM

146 views · 14:41

ML Research Hub

✨AdaptToken: Entropy-based Adaptive Token Selection for MLLM Long Video Understanding

📝 Summary:
AdaptToken enables efficient long video understanding for MLLMs by using model uncertainty to dynamically select relevant tokens. It allocates a global token budget and supports early stopping, significantly improving accuracy and reducing inference time across benchmarks.
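
As a rough illustration of uncertainty-driven budgeting, the sketch below splits a global token budget across video clips and stops early once the model is confident. The summary does not give the actual procedure, so everything here is an assumption: the helper names, the exp(-entropy) weighting, the idea that lower entropy marks a more relevant clip, and the example distributions.

```python
import math
from typing import List


def entropy(probs: List[float]) -> float:
    """Shannon entropy (in nats) of a probability distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0)


def allocate_budget(entropies: List[float], budget: int) -> List[int]:
    """Split a global token budget across clips, favoring low-entropy clips.

    Weights are exp(-entropy), normalized to sum to one (an assumed scheme).
    """
    weights = [math.exp(-h) for h in entropies]
    total = sum(weights)
    return [int(budget * w / total) for w in weights]


def clips_to_process(entropies: List[float], stop_below: float) -> int:
    """Early stopping: number of clips scanned before entropy drops below a threshold."""
    for i, h in enumerate(entropies):
        if h < stop_below:
            return i + 1
    return len(entropies)


# Hypothetical answer distributions the model produces after seeing each clip.
answer_dists = [
    [0.25, 0.25, 0.25, 0.25],  # uncertain
    [0.70, 0.10, 0.10, 0.10],  # fairly confident
    [0.97, 0.01, 0.01, 0.01],  # very confident
]
hs = [entropy(d) for d in answer_dists]
alloc = allocate_budget(hs, budget=1000)
scanned = clips_to_process(hs, stop_below=1.0)
```

Flooring each share with `int` keeps the total within the budget, at the cost of leaving a few tokens unassigned.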

🔹 Publication Date: Published on Mar 30

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.28696
• PDF: https://arxiv.org/pdf/2603.28696
• Project Page: https://haozheqi.github.io/adapt-token
• Github: https://github.com/HaozheQi/AdaptToken

==================================

For more data science resources:
✓ https://t.iss.one/DataScienceT

#MLLM #VideoUnderstanding #MachineLearning #AIResearch #TokenSelection

186 views · 14:41

ML Research Hub


✨LongTail Driving Scenarios with Reasoning Traces: The KITScenes LongTail Dataset

📝 Summary:
This paper introduces KITScenes LongTail, a new dataset for long-tail driving events. It offers multi-view video, trajectories, and multilingual expert reasoning traces. The resource improves few-shot generalization and can be used to evaluate multimodal models' instruction-following capabilities.

🔹 Publication Date: Published on Mar 24

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.23607
• PDF: https://arxiv.org/pdf/2603.23607
• Project Page: https://huggingface.co/datasets/KIT-MRT/KITScenes-LongTail

✨ Datasets citing this paper:
• https://huggingface.co/datasets/KIT-MRT/KITScenes-LongTail

==================================

For more data science resources:
✓ https://t.iss.one/DataScienceT

#AutonomousDriving #ComputerVision #Datasets #LongTailLearning #MultimodalAI

193 views · 15:49

ML Research Hub

Forwarded from Machine Learning with Python

Follow the Machine Learning with Python channel on WhatsApp: https://whatsapp.com/channel/0029VaC7Weq29753hpcggW2A

❤1

71 views · 16:10

ML Research Hub

✨ChartNet: A Million-Scale, High-Quality Multimodal Dataset for Robust Chart Understanding

📝 Summary:
ChartNet is a multimodal dataset of 1.5 million samples. It improves AI models' chart understanding by providing diverse charts with aligned visual, textual, and numerical data. Fine-tuning on this open-source dataset significantly enhances performance in data visualization.

🔹 Publication Date: Published on Mar 28

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.27064
• PDF: https://arxiv.org/pdf/2603.27064
• Project Page: https://huggingface.co/datasets/ibm-granite/ChartNet

🔹 Models citing this paper:
• https://huggingface.co/ibm-granite/granite-4.0-3b-vision
• https://huggingface.co/beaupi/granite-4.0-3b-vision-oQ8

✨ Datasets citing this paper:
• https://huggingface.co/datasets/ibm-granite/ChartNet

==================================

For more data science resources:
✓ https://t.iss.one/DataScienceT

#AI #DataScience #MachineLearning #HuggingFace #Research

178 views · 19:25

ML Research Hub

✨STRIDE: When to Speak Meets Sequence Denoising for Streaming Video Understanding

📝 Summary:
STRIDE enables proactive video understanding by modeling temporal activation patterns through iterative denoising within sliding windows, improving timing decisions in streaming scenarios.

🔹 Publication Date: Published on Mar 29

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.27593
• PDF: https://arxiv.org/pdf/2603.27593
• Project Page: https://interlive-team.github.io/STRIDE/
• Github: https://interlive-team.github.io/STRIDE/

🔹 Models citing this paper:
• https://huggingface.co/interlive/STRIDE-2B

==================================

For more data science resources:
✓ https://t.iss.one/DataScienceT

#AI #DataScience #MachineLearning #HuggingFace #Research

162 views · 19:25

ML Research Hub

✨SpatialLM: Training Large Language Models for Structured Indoor Modeling

📝 Summary:
SpatialLM, a multimodal large language model, processes 3D point cloud data to generate structured scene understanding outputs, achieving state-of-the-art performance in layout estimation and competit...

🔹 Publication Date: Published on Jun 9, 2025

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2506.07491
• PDF: https://arxiv.org/pdf/2506.07491
• Project Page: https://manycore-research.github.io/SpatialLM
• Github: https://github.com/manycore-research/SpatialLM

🔹 Models citing this paper:
• https://huggingface.co/manycore-research/SpatialLM1.1-Qwen-0.5B
• https://huggingface.co/manycore-research/SpatialLM1.1-Llama-1B

✨ Datasets citing this paper:
• https://huggingface.co/datasets/Voxel51/spatial_lm_dataset
• https://huggingface.co/datasets/manycore-research/SpatialLM-Dataset
• https://huggingface.co/datasets/manycore-research/SpatialLM-Testset

==================================

For more data science resources:
✓ https://t.iss.one/DataScienceT

#AI #DataScience #MachineLearning #HuggingFace #Research

arXiv.org

SpatialLM: Training Large Language Models for Structured Indoor Modeling

SpatialLM is a large language model designed to process 3D point cloud data and generate structured 3D scene understanding outputs. These outputs include architectural elements like walls, doors,...

158 views · 20:25


✨INSID3: Training-Free In-Context Segmentation with DINOv3

📝 Summary:
INSID3 demonstrates training-free in-context segmentation using only frozen DINOv3 features. This minimalist approach achieves state-of-the-art results across various segmentation tasks with fewer parameters and no supervision.

🔹 Publication Date: Published on Mar 30

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.28480
• PDF: https://arxiv.org/pdf/2603.28480
• Project Page: https://visinf.github.io/INSID3/
• Github: https://github.com/visinf/INSID3

==================================

For more data science resources:
✓ https://t.iss.one/DataScienceT

#AI #DataScience #MachineLearning #HuggingFace #Research

❤1

145 views · 21:26

ML Research Hub

✨LongCat-Next: Lexicalizing Modalities as Discrete Tokens

📝 Summary:
LongCat-Next introduces DiNA, a unified framework for native multimodal processing. It represents text, vision, and audio as discrete tokens in a shared space, enabling consistent autoregressive modeling. This reconciles understanding-generation conflicts and excels across many multimodal tasks.

🔹 Publication Date: Published on Mar 29

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.27538
• PDF: https://arxiv.org/pdf/2603.27538
• Project Page: https://longcat.chat/longcat-next/intro
• Github: https://github.com/meituan-longcat/LongCat-Next

🔹 Models citing this paper:
• https://huggingface.co/meituan-longcat/LongCat-Next

==================================

For more data science resources:
✓ https://t.iss.one/DataScienceT

#AI #DataScience #MachineLearning #HuggingFace #Research

129 views · 23:26

ML Research Hub

✨A Neural Score-Based Particle Method for the Vlasov-Maxwell-Landau System

📝 Summary:
This paper introduces neural score-based transport modeling (SBTM) for the Vlasov-Maxwell-Landau system. SBTM replaces the less efficient blob method, offering linear cost, greater accuracy, faster runtime, and lower memory use while correctly simulating long-term plasma relaxation.

🔹 Publication Date: Published on Mar 26

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.25832
• PDF: https://arxiv.org/pdf/2603.25832
• Github: https://github.com/Vilin97/Vlasov-Landau-SBTM

==================================

For more data science resources:
✓ https://t.iss.one/DataScienceT

#AI #DataScience #MachineLearning #HuggingFace #Research

133 views · 01:00

ML Research Hub

✨CARLA-Air: Fly Drones Inside a CARLA World -- A Unified Infrastructure for Air-Ground Embodied Intelligence

📝 Summary:
CARLA-Air unifies high-fidelity urban driving and multirotor flight simulation within a single Unreal Engine framework. This addresses the demand for joint air-ground agent modeling in photorealistic environments, supporting embodied intelligence research.

🔹 Publication Date: Published on Mar 30

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.28032
• PDF: https://arxiv.org/pdf/2603.28032
• Project Page: https://github.com/louiszengCN/CarlaAir
• Github: https://github.com/louiszengCN/CarlaAir

🔹 Models citing this paper:
• https://huggingface.co/tianlezeng/CarlaAIr-v0.1.7

==================================

For more data science resources:
✓ https://t.iss.one/DataScienceT

#AI #DataScience #MachineLearning #HuggingFace #Research

arXiv.org

CARLA-Air: Fly Drones Inside a CARLA World -- A Unified...

The convergence of low-altitude economies, embodied intelligence, and air-ground cooperative systems creates growing demand for simulation infrastructure capable of jointly modeling aerial and...

120 views · 02:01

ML Research Hub

✨SeGPruner: Semantic-Geometric Visual Token Pruner for 3D Question Answering

📝 Summary:
A semantic-aware and geometry-guided token pruning framework is presented for efficient 3D question answering with multi-view images, achieving significant reductions in token budget and inference lat...

🔹 Publication Date: Published on Mar 31

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.29437
• PDF: https://arxiv.org/pdf/2603.29437
• Project Page: https://github.com/intcomp/SegPruner
• Github: https://github.com/intcomp/SegPruner

==================================

For more data science resources:
✓ https://t.iss.one/DataScienceT

#AI #DataScience #MachineLearning #HuggingFace #Research

arXiv.org

SeGPruner: Semantic-Geometric Visual Token Pruner for 3D Question Answering

Vision-language models (VLMs) have been widely adopted for 3D question answering (3D QA). In typical pipelines, visual tokens extracted from multiple viewpoints are concatenated with language...

122 views · 02:01

ML Research Hub

✨Lingshu-Cell: A generative cellular world model for transcriptome modeling toward virtual cells

📝 Summary:
Lingshu-Cell is a masked discrete diffusion model that learns transcriptomic state distributions and enables conditional simulation of cellular perturbations across diverse tissues and species.

🔹 Publication Date: Published on Mar 26

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.25240
• PDF: https://arxiv.org/pdf/2603.25240

==================================

For more data science resources:
✓ https://t.iss.one/DataScienceT

#AI #DataScience #MachineLearning #HuggingFace #Research

86 views · 03:01

✨VGGRPO: Towards World-Consistent Video Generation with 4D Latent Reward

📝 Summary:
VGGRPO is a latent geometry-guided framework that enhances video diffusion models' geometric consistency through a Latent Geometry Model and latent-space reinforcement learning with camera motion and ...

🔹 Publication Date: Published on Mar 27

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.26599
• PDF: https://arxiv.org/pdf/2603.26599
• Project Page: https://zhaochongan.github.io/projects/VGGRPO/

==================================

For more data science resources:
✓ https://t.iss.one/DataScienceT

#AI #DataScience #MachineLearning #HuggingFace #Research

78 views · 03:02
