ML Research Hub
32.4K subscribers
6.3K photos
424 videos
24 files
6.84K links
Advancing research in Machine Learning – practical insights, tools, and techniques for researchers.

Admin: @HusseinSheikho || @Hussein_Sheikho
Download Telegram
STRIDE: When to Speak Meets Sequence Denoising for Streaming Video Understanding

📝 Summary:
STRIDE enables proactive video understanding by modeling temporal activation patterns through iterative denoising within sliding windows, improving timing decisions in streaming scenarios. AI-generate...

🔹 Publication Date: Published on Mar 29

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.27593
• PDF: https://arxiv.org/pdf/2603.27593
• Project Page: https://interlive-team.github.io/STRIDE/
• Github: https://interlive-team.github.io/STRIDE/

🔹 Models citing this paper:
https://huggingface.co/interlive/STRIDE-2B

==================================

For more data science resources:
https://t.iss.one/DataScienceT

#AI #DataScience #MachineLearning #HuggingFace #Research
This media is not supported in your browser
VIEW IN TELEGRAM
INSID3: Training-Free In-Context Segmentation with DINOv3

📝 Summary:
INSID3 demonstrates training-free in-context segmentation using only frozen DINOv3 features. This minimalist approach achieves state-of-the-art results across various segmentation tasks with fewer parameters and no supervision.

🔹 Publication Date: Published on Mar 30

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.28480
• PDF: https://arxiv.org/pdf/2603.28480
• Project Page: https://visinf.github.io/INSID3/
• Github: https://github.com/visinf/INSID3

==================================

For more data science resources:
https://t.iss.one/DataScienceT

#AI #DataScience #MachineLearning #HuggingFace #Research
1
LongCat-Next: Lexicalizing Modalities as Discrete Tokens

📝 Summary:
LongCat-Next introduces DiNA, a unified framework for native multimodal processing. It represents text, vision, and audio as discrete tokens in a shared space, enabling consistent autoregressive modeling. This reconciles understanding-generation conflicts and excels across many multimodal tasks.

🔹 Publication Date: Published on Mar 29

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.27538
• PDF: https://arxiv.org/pdf/2603.27538
• Project Page: https://longcat.chat/longcat-next/intro
• Github: https://github.com/meituan-longcat/LongCat-Next

🔹 Models citing this paper:
https://huggingface.co/meituan-longcat/LongCat-Next

==================================

For more data science resources:
https://t.iss.one/DataScienceT

#AI #DataScience #MachineLearning #HuggingFace #Research
A Neural Score-Based Particle Method for the Vlasov-Maxwell-Landau System

📝 Summary:
This paper introduces neural score-based transport modeling SBTM for the Vlasov-Maxwell-Landau system. SBTM replaces the less efficient blob method, providing linear cost, greater accuracy, faster runtime, and lower memory, correctly simulating long-term plasma relaxation.

🔹 Publication Date: Published on Mar 26

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.25832
• PDF: https://arxiv.org/pdf/2603.25832
• Github: https://github.com/Vilin97/Vlasov-Landau-SBTM

==================================

For more data science resources:
https://t.iss.one/DataScienceT

#AI #DataScience #MachineLearning #HuggingFace #Research
CARLA-Air: Fly Drones Inside a CARLA World -- A Unified Infrastructure for Air-Ground Embodied Intelligence

📝 Summary:
CARLA-Air unifies high-fidelity urban driving and multirotor flight simulation within a single Unreal Engine framework. This addresses the demand for joint air-ground agent modeling in photorealistic environments, supporting embodied intelligence research.

🔹 Publication Date: Published on Mar 30

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.28032
• PDF: https://arxiv.org/pdf/2603.28032
• Project Page: https://github.com/louiszengCN/CarlaAir
• Github: https://github.com/louiszengCN/CarlaAir

🔹 Models citing this paper:
https://huggingface.co/tianlezeng/CarlaAIr-v0.1.7

==================================

For more data science resources:
https://t.iss.one/DataScienceT

#AI #DataScience #MachineLearning #HuggingFace #Research
SeGPruner: Semantic-Geometric Visual Token Pruner for 3D Question Answering

📝 Summary:
A semantic-aware and geometry-guided token pruning framework is presented for efficient 3D question answering with multi-view images, achieving significant reductions in token budget and inference lat...

🔹 Publication Date: Published on Mar 31

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.29437
• PDF: https://arxiv.org/pdf/2603.29437
• Project Page: https://github.com/intcomp/SegPruner
• Github: https://github.com/intcomp/SegPruner

==================================

For more data science resources:
https://t.iss.one/DataScienceT

#AI #DataScience #MachineLearning #HuggingFace #Research
Lingshu-Cell: A generative cellular world model for transcriptome modeling toward virtual cells

📝 Summary:
Lingshu-Cell is a masked discrete diffusion model that learns transcriptomic state distributions and enables conditional simulation of cellular perturbations across diverse tissues and species. AI-gen...

🔹 Publication Date: Published on Mar 26

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.25240
• PDF: https://arxiv.org/pdf/2603.25240

==================================

For more data science resources:
https://t.iss.one/DataScienceT

#AI #DataScience #MachineLearning #HuggingFace #Research
Media is too big
VIEW IN TELEGRAM
VGGRPO: Towards World-Consistent Video Generation with 4D Latent Reward

📝 Summary:
VGGRPO is a latent geometry-guided framework that enhances video diffusion models' geometric consistency through a Latent Geometry Model and latent-space reinforcement learning with camera motion and ...

🔹 Publication Date: Published on Mar 27

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.26599
• PDF: https://arxiv.org/pdf/2603.26599
• Project Page: https://zhaochongan.github.io/projects/VGGRPO/

==================================

For more data science resources:
https://t.iss.one/DataScienceT

#AI #DataScience #MachineLearning #HuggingFace #Research
This media is not supported in your browser
VIEW IN TELEGRAM
OptiMer: Optimal Distribution Vector Merging Is Better than Data Mixing for Continual Pre-Training

📝 Summary:
OptiMer enables flexible continual pre-training by decoupling data mixture ratio selection from training through post-hoc Bayesian optimization of distribution vectors extracted from individual datase...

🔹 Publication Date: Published on Mar 30

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.28858
• PDF: https://arxiv.org/pdf/2603.28858
• Github: https://github.com/shyyhs/optimer

==================================

For more data science resources:
https://t.iss.one/DataScienceT

#AI #DataScience #MachineLearning #HuggingFace #Research
AutoWeather4D: Autonomous Driving Video Weather Conversion via G-Buffer Dual-Pass Editing

📝 Summary:
AutoWeather4D is a 3D-aware weather editing framework that decouples geometry and illumination through a dual-pass mechanism, enabling efficient and physically accurate weather modification for autono...

🔹 Publication Date: Published on Mar 27

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.26546
• PDF: https://arxiv.org/pdf/2603.26546
• Project Page: https://lty2226262.github.io/autoweather4d/
• Github: https://github.com/lty2226262/AutoWeather4D

==================================

For more data science resources:
https://t.iss.one/DataScienceT

#AutonomousDriving #ComputerVision #WeatherEditing #3DGraphics #AIResearch
Think Anywhere in Code Generation

📝 Summary:
Think-Anywhere is a novel reasoning mechanism that enables large language models to invoke thinking on-demand during code generation, improving performance across multiple benchmarks through adaptive ...

🔹 Publication Date: Published on Mar 31

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.29957
• PDF: https://arxiv.org/pdf/2603.29957
• Github: https://github.com/jiangxxxue/Think-Anywhere

==================================

For more data science resources:
https://t.iss.one/DataScienceT

#AI #DataScience #MachineLearning #HuggingFace #Research
Media is too big
VIEW IN TELEGRAM
CutClaw: Agentic Hours-Long Video Editing via Music Synchronization

📝 Summary:
CutClaw is an autonomous multi-agent framework that uses multimodal language models to automatically edit long video footage into rhythmic, narratively consistent short videos with synchronized audio ...

🔹 Publication Date: Published on Mar 31

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.29664
• PDF: https://arxiv.org/pdf/2603.29664
• Project Page: https://github.com/GVCLab/CutClaw
• Github: https://github.com/GVCLab/CutClaw

==================================

For more data science resources:
https://t.iss.one/DataScienceT

#AI #DataScience #MachineLearning #HuggingFace #Research
VectorGym: A Multitask Benchmark for SVG Code Generation, Sketching, and Editing

📝 Summary:
VectorGym presents a comprehensive benchmark suite for scalable vector graphics encompassing text-to-svg generation, sketch-to-svg conversion, complex svg editing, and visual understanding tasks with ...

🔹 Publication Date: Published on Feb 22

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.29852
• PDF: https://arxiv.org/pdf/2603.29852

==================================

For more data science resources:
https://t.iss.one/DataScienceT

#AI #DataScience #MachineLearning #HuggingFace #Research
Learn2Fold: Structured Origami Generation with World Model Planning

📝 Summary:
A neuro-symbolic framework called Learn2Fold generates physically valid origami folding sequences from text by combining language model semantic proposals with graph-structured world model verificatio...

🔹 Publication Date: Published on Feb 2

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.29585
• PDF: https://arxiv.org/pdf/2603.29585

==================================

For more data science resources:
https://t.iss.one/DataScienceT

#AI #DataScience #MachineLearning #HuggingFace #Research
FIPO: Eliciting Deep Reasoning with Future-KL Influenced Policy Optimization

📝 Summary:
FIPO enhances reinforcement learning for language models by using discounted future-KL divergence to improve credit assignment and extend reasoning chains, achieving better mathematical problem-solvin...

🔹 Publication Date: Published on Mar 20

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.19835
• PDF: https://arxiv.org/pdf/2603.19835
• Project Page: https://qwen-pilot.notion.site/fipo
• Github: https://github.com/qwenpilot/FIPO

🔹 Models citing this paper:
https://huggingface.co/QwenPilot/FIPO_32B

==================================

For more data science resources:
https://t.iss.one/DataScienceT

#AI #DataScience #MachineLearning #HuggingFace #Research
Unify-Agent: A Unified Multimodal Agent for World-Grounded Image Synthesis

📝 Summary:
Unify-Agent integrates agent-based modeling with multimodal understanding to enhance image synthesis through reasoning, searching, and generation processes grounded in external knowledge. AI-generated...

🔹 Publication Date: Published on Mar 31

🔹 Paper Links:
• arXiv Page: https://arxiv.org/pdf/2603.29620
• PDF: https://arxiv.org/pdf/2603.29620
• Github: https://github.com/shawn0728/Unify-Agent

==================================

For more data science resources:
https://t.iss.one/DataScienceT

#AI #DataScience #MachineLearning #HuggingFace #Research
GEMS: Agent-Native Multimodal Generation with Memory and Skills

📝 Summary:
GEMS is an agent-native multimodal generation framework that enhances model capabilities through structured multi-agent optimization, persistent memory, and domain-specific skills across general and d...

🔹 Publication Date: Published on Mar 30

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.28088
• PDF: https://arxiv.org/pdf/2603.28088
• Project Page: https://gems-gen.github.io/
• Github: https://github.com/lcqysl/GEMS

==================================

For more data science resources:
https://t.iss.one/DataScienceT

#AI #DataScience #MachineLearning #HuggingFace #Research
FlowPIE: Test-Time Scientific Idea Evolution with Flow-Guided Literature Exploration

📝 Summary:
FlowPIE is a novel retrieval-generation framework for scientific idea generation. It uses flow-guided Monte Carlo Tree Search for literature exploration and an evolutionary process to produce diverse, high-quality, and novel ideas by integrating cross-domain knowledge.

🔹 Publication Date: Published on Mar 31

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.29557
• PDF: https://arxiv.org/pdf/2603.29557
• Project Page: https://flowpie.wangqiyao.me/
• Github: https://github.com/AIforIP/FlowPIE

==================================

For more data science resources:
https://t.iss.one/DataScienceT

#AI #DataScience #MachineLearning #HuggingFace #Research
Extend3D: Town-Scale 3D Generation

📝 Summary:
An object-centric 3D generative model is extended with adaptive latent space and iterative refinement to generate complete 3D scenes from single images, incorporating noise-aware completion and 3D-awa...

🔹 Publication Date: Published on Mar 31

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.29387
• PDF: https://arxiv.org/pdf/2603.29387
• Project Page: https://seungwoo-yoon.github.io/extend3d-page/
• Github: https://github.com/SNU-VGILab/Extend3D

==================================

For more data science resources:
https://t.iss.one/DataScienceT

#AI #DataScience #MachineLearning #HuggingFace #Research