ML Research Hub
32.4K subscribers
6.31K photos
424 videos
24 files
6.85K links
Advancing research in Machine Learning – practical insights, tools, and techniques for researchers.

Admin: @HusseinSheikho || @Hussein_Sheikho
INSID3: Training-Free In-Context Segmentation with DINOv3

📝 Summary:
INSID3 demonstrates training-free in-context segmentation using only frozen DINOv3 features. This minimalist approach achieves state-of-the-art results across various segmentation tasks with fewer parameters and no supervision.

🔹 Publication Date: Published on Mar 30

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.28480
• PDF: https://arxiv.org/pdf/2603.28480
• Project Page: https://visinf.github.io/INSID3/
• Github: https://github.com/visinf/INSID3

==================================

For more data science resources:
https://t.iss.one/DataScienceT

#AI #DataScience #MachineLearning #HuggingFace #Research
LongCat-Next: Lexicalizing Modalities as Discrete Tokens

📝 Summary:
LongCat-Next introduces DiNA, a unified framework for native multimodal processing. It represents text, vision, and audio as discrete tokens in a shared space, enabling consistent autoregressive modeling. This reconciles understanding-generation conflicts and excels across many multimodal tasks.

🔹 Publication Date: Published on Mar 29

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.27538
• PDF: https://arxiv.org/pdf/2603.27538
• Project Page: https://longcat.chat/longcat-next/intro
• Github: https://github.com/meituan-longcat/LongCat-Next

🔹 Models citing this paper:
https://huggingface.co/meituan-longcat/LongCat-Next

==================================

#AI #DataScience #MachineLearning #HuggingFace #Research
A Neural Score-Based Particle Method for the Vlasov-Maxwell-Landau System

📝 Summary:
This paper introduces neural score-based transport modeling (SBTM) for the Vlasov-Maxwell-Landau system. SBTM replaces the less efficient blob method, offering linear cost, greater accuracy, faster runtime, and lower memory use while correctly simulating long-term plasma relaxation.

🔹 Publication Date: Published on Mar 26

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.25832
• PDF: https://arxiv.org/pdf/2603.25832
• Github: https://github.com/Vilin97/Vlasov-Landau-SBTM

==================================

#AI #DataScience #MachineLearning #HuggingFace #Research
CARLA-Air: Fly Drones Inside a CARLA World -- A Unified Infrastructure for Air-Ground Embodied Intelligence

📝 Summary:
CARLA-Air unifies high-fidelity urban driving and multirotor flight simulation within a single Unreal Engine framework. This addresses the demand for joint air-ground agent modeling in photorealistic environments, supporting embodied intelligence research.

🔹 Publication Date: Published on Mar 30

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.28032
• PDF: https://arxiv.org/pdf/2603.28032
• Project Page: https://github.com/louiszengCN/CarlaAir
• Github: https://github.com/louiszengCN/CarlaAir

🔹 Models citing this paper:
https://huggingface.co/tianlezeng/CarlaAIr-v0.1.7

==================================

#AI #DataScience #MachineLearning #HuggingFace #Research
SeGPruner: Semantic-Geometric Visual Token Pruner for 3D Question Answering

📝 Summary:
A semantic-aware and geometry-guided token pruning framework is presented for efficient 3D question answering with multi-view images, achieving significant reductions in token budget and inference lat...

🔹 Publication Date: Published on Mar 31

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.29437
• PDF: https://arxiv.org/pdf/2603.29437
• Project Page: https://github.com/intcomp/SegPruner
• Github: https://github.com/intcomp/SegPruner

==================================

#AI #DataScience #MachineLearning #HuggingFace #Research
Lingshu-Cell: A generative cellular world model for transcriptome modeling toward virtual cells

📝 Summary:
Lingshu-Cell is a masked discrete diffusion model that learns transcriptomic state distributions and enables conditional simulation of cellular perturbations across diverse tissues and species. AI-gen...

🔹 Publication Date: Published on Mar 26

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.25240
• PDF: https://arxiv.org/pdf/2603.25240

==================================

#AI #DataScience #MachineLearning #HuggingFace #Research
VGGRPO: Towards World-Consistent Video Generation with 4D Latent Reward

📝 Summary:
VGGRPO is a latent geometry-guided framework that enhances video diffusion models' geometric consistency through a Latent Geometry Model and latent-space reinforcement learning with camera motion and ...

🔹 Publication Date: Published on Mar 27

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.26599
• PDF: https://arxiv.org/pdf/2603.26599
• Project Page: https://zhaochongan.github.io/projects/VGGRPO/

==================================

#AI #DataScience #MachineLearning #HuggingFace #Research
OptiMer: Optimal Distribution Vector Merging Is Better than Data Mixing for Continual Pre-Training

📝 Summary:
OptiMer enables flexible continual pre-training by decoupling data mixture ratio selection from training through post-hoc Bayesian optimization of distribution vectors extracted from individual datase...

🔹 Publication Date: Published on Mar 30

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.28858
• PDF: https://arxiv.org/pdf/2603.28858
• Github: https://github.com/shyyhs/optimer

==================================

#AI #DataScience #MachineLearning #HuggingFace #Research
AutoWeather4D: Autonomous Driving Video Weather Conversion via G-Buffer Dual-Pass Editing

📝 Summary:
AutoWeather4D is a 3D-aware weather editing framework that decouples geometry and illumination through a dual-pass mechanism, enabling efficient and physically accurate weather modification for autono...

🔹 Publication Date: Published on Mar 27

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.26546
• PDF: https://arxiv.org/pdf/2603.26546
• Project Page: https://lty2226262.github.io/autoweather4d/
• Github: https://github.com/lty2226262/AutoWeather4D

==================================

#AutonomousDriving #ComputerVision #WeatherEditing #3DGraphics #AIResearch
Think Anywhere in Code Generation

📝 Summary:
Think-Anywhere is a novel reasoning mechanism that enables large language models to invoke thinking on-demand during code generation, improving performance across multiple benchmarks through adaptive ...

🔹 Publication Date: Published on Mar 31

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.29957
• PDF: https://arxiv.org/pdf/2603.29957
• Github: https://github.com/jiangxxxue/Think-Anywhere

==================================

#AI #DataScience #MachineLearning #HuggingFace #Research
CutClaw: Agentic Hours-Long Video Editing via Music Synchronization

📝 Summary:
CutClaw is an autonomous multi-agent framework that uses multimodal language models to automatically edit long video footage into rhythmic, narratively consistent short videos with synchronized audio ...

🔹 Publication Date: Published on Mar 31

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.29664
• PDF: https://arxiv.org/pdf/2603.29664
• Project Page: https://github.com/GVCLab/CutClaw
• Github: https://github.com/GVCLab/CutClaw

==================================

#AI #DataScience #MachineLearning #HuggingFace #Research
VectorGym: A Multitask Benchmark for SVG Code Generation, Sketching, and Editing

📝 Summary:
VectorGym presents a comprehensive benchmark suite for scalable vector graphics encompassing text-to-svg generation, sketch-to-svg conversion, complex svg editing, and visual understanding tasks with ...

🔹 Publication Date: Published on Feb 22

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.29852
• PDF: https://arxiv.org/pdf/2603.29852

==================================

#AI #DataScience #MachineLearning #HuggingFace #Research
Learn2Fold: Structured Origami Generation with World Model Planning

📝 Summary:
A neuro-symbolic framework called Learn2Fold generates physically valid origami folding sequences from text by combining language model semantic proposals with graph-structured world model verificatio...

🔹 Publication Date: Published on Feb 2

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.29585
• PDF: https://arxiv.org/pdf/2603.29585

==================================

#AI #DataScience #MachineLearning #HuggingFace #Research
FIPO: Eliciting Deep Reasoning with Future-KL Influenced Policy Optimization

📝 Summary:
FIPO enhances reinforcement learning for language models by using discounted future-KL divergence to improve credit assignment and extend reasoning chains, achieving better mathematical problem-solvin...

🔹 Publication Date: Published on Mar 20

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.19835
• PDF: https://arxiv.org/pdf/2603.19835
• Project Page: https://qwen-pilot.notion.site/fipo
• Github: https://github.com/qwenpilot/FIPO

🔹 Models citing this paper:
https://huggingface.co/QwenPilot/FIPO_32B

==================================

#AI #DataScience #MachineLearning #HuggingFace #Research
Unify-Agent: A Unified Multimodal Agent for World-Grounded Image Synthesis

📝 Summary:
Unify-Agent integrates agent-based modeling with multimodal understanding to enhance image synthesis through reasoning, searching, and generation processes grounded in external knowledge. AI-generated...

🔹 Publication Date: Published on Mar 31

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.29620
• PDF: https://arxiv.org/pdf/2603.29620
• Github: https://github.com/shawn0728/Unify-Agent

==================================

#AI #DataScience #MachineLearning #HuggingFace #Research
GEMS: Agent-Native Multimodal Generation with Memory and Skills

📝 Summary:
GEMS is an agent-native multimodal generation framework that enhances model capabilities through structured multi-agent optimization, persistent memory, and domain-specific skills across general and d...

🔹 Publication Date: Published on Mar 30

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.28088
• PDF: https://arxiv.org/pdf/2603.28088
• Project Page: https://gems-gen.github.io/
• Github: https://github.com/lcqysl/GEMS

==================================

#AI #DataScience #MachineLearning #HuggingFace #Research
FlowPIE: Test-Time Scientific Idea Evolution with Flow-Guided Literature Exploration

📝 Summary:
FlowPIE is a novel retrieval-generation framework for scientific idea generation. It uses flow-guided Monte Carlo Tree Search for literature exploration and an evolutionary process to produce diverse, high-quality, and novel ideas by integrating cross-domain knowledge.

🔹 Publication Date: Published on Mar 31

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.29557
• PDF: https://arxiv.org/pdf/2603.29557
• Project Page: https://flowpie.wangqiyao.me/
• Github: https://github.com/AIforIP/FlowPIE

==================================

#AI #DataScience #MachineLearning #HuggingFace #Research
Extend3D: Town-Scale 3D Generation

📝 Summary:
An object-centric 3D generative model is extended with adaptive latent space and iterative refinement to generate complete 3D scenes from single images, incorporating noise-aware completion and 3D-awa...

🔹 Publication Date: Published on Mar 31

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.29387
• PDF: https://arxiv.org/pdf/2603.29387
• Project Page: https://seungwoo-yoon.github.io/extend3d-page/
• Github: https://github.com/SNU-VGILab/Extend3D

==================================

#AI #DataScience #MachineLearning #HuggingFace #Research
MMFace-DiT: A Dual-Stream Diffusion Transformer for High-Fidelity Multimodal Face Generation

📝 Summary:
A unified dual-stream diffusion transformer model enables synergistic multimodal face synthesis by jointly processing spatial and semantic tokens through shared attention mechanisms while maintaining ...

🔹 Publication Date: Published on Mar 30

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.29029
• PDF: https://arxiv.org/pdf/2603.29029
• Project Page: https://vcbsl.github.io/MMFace-DiT/
• Github: https://github.com/Bharath-K3/MMFace-DiT

🔹 Models citing this paper:
https://huggingface.co/BharathK333/MMFace-DiT-Models

🔹 Datasets citing this paper:
https://huggingface.co/datasets/BharathK333/MMFace-DiT-Datasets

==================================

#AI #DataScience #MachineLearning #HuggingFace #Research
How Auditory Knowledge in LLM Backbones Shapes Audio Language Models: A Holistic Evaluation

📝 Summary:
This paper examines how much auditory knowledge LLMs acquire from text-only pre-training and how that knowledge affects audio language models built on them. It finds that auditory knowledge varies substantially across LLMs and that text-only results correlate strongly with audio performance.

🔹 Publication Date: Published on Mar 19

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.19195
• PDF: https://arxiv.org/pdf/2603.19195
• Project Page: https://kehanlu.github.io/AKB
• Github: https://github.com/kehanlu/AKB

==================================

#LLMs #AudioAI #NLP #DeepLearning #AIResearch
daVinci-LLM: Towards the Science of Pretraining

📝 Summary:
daVinci-LLM explores pretraining with industrial resources and an open science approach. It shows that data processing depth and adaptive curriculum strategies significantly affect model capabilities, and it releases the full process for the community.

🔹 Publication Date: Published on Mar 28

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.27164
• PDF: https://arxiv.org/pdf/2603.27164
• Github: https://github.com/GAIR-NLP/daVinci-LLM

🔹 Models citing this paper:
https://huggingface.co/SII-GAIR-NLP/davinci-llm-model

🔹 Datasets citing this paper:
https://huggingface.co/datasets/SII-GAIR-NLP/davinci-llm-data

==================================

#LLM #Pretraining #OpenScience #AI #MachineLearning