ML Research Hub
Advancing research in Machine Learning – practical insights, tools, and techniques for researchers.

Admin: @HusseinSheikho || @Hussein_Sheikho
Marco DeepResearch: Unlocking Efficient Deep Research Agents via Verification-Centric Design

📝 Summary:
A verification-centric framework for deep research agents improves performance on complex benchmarks by incorporating error checking at multiple stages of development and inference.

🔹 Publication Date: Published on Mar 30

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.28376
• PDF: https://arxiv.org/pdf/2603.28376
• Github: https://github.com/AIDC-AI/Marco-DeepResearch

==================================

For more data science resources:
https://t.iss.one/DataScienceT

#AI #DataScience #MachineLearning #HuggingFace #Research
MuSEAgent: A Multimodal Reasoning Agent with Stateful Experiences

📝 Summary:
MuSEAgent enhances multimodal reasoning through stateful experience learning that abstracts interactions into decision experiences for improved policy-driven retrieval and adaptive search strategies. ...

🔹 Publication Date: Published on Mar 29

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.27813
• PDF: https://arxiv.org/pdf/2603.27813
• Github: https://github.com/DeepExperience/MuSEAgent

==================================

#AI #DataScience #MachineLearning #HuggingFace #Research
Density-aware Soft Context Compression with Semi-Dynamic Compression Ratio

📝 Summary:
A density-aware dynamic compression framework for large language models that uses a discrete ratio selector to adaptively compress contexts based on information density, outperforming static methods i...
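
As a rough illustration of what a discrete ratio selector could look like, the sketch below maps a crude density estimate to one of a few fixed compression ratios. It is not the paper's method: the type-token-ratio proxy, the ratio set `RATIOS`, and the evenly spaced bins are all assumptions made for this example.

```python
import math

# Hypothetical discrete compression ratios; the summary does not list the real ones.
RATIOS = (2, 4, 8, 16)


def density(tokens):
    """Crude density proxy (assumption): fraction of distinct tokens."""
    return len(set(tokens)) / len(tokens) if tokens else 0.0


def select_ratio(tokens, ratios=RATIOS):
    """Map density in [0, 1] to a discrete ratio: denser text gets a smaller ratio."""
    d = density(tokens)
    # Evenly spaced bins over [0, 1]; the highest-density bin maps to ratios[0].
    idx = min(int((1.0 - d) * len(ratios)), len(ratios) - 1)
    return ratios[idx]


def compressed_length(tokens, ratios=RATIOS):
    """Number of soft-token slots the context would be compressed into."""
    return math.ceil(len(tokens) / select_ratio(tokens, ratios))
```

With these assumptions, a context of all-distinct tokens is compressed 2x, while a highly repetitive one is compressed 16x.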

🔹 Publication Date: Published on Mar 26

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.25926
• PDF: https://arxiv.org/pdf/2603.25926
• Github: https://github.com/yuyijiong/semi-dynamic-context-compress

🔹 Models citing this paper:
https://huggingface.co/yuyijiong/qwen3-semi-dynamic-soft-context-compress

🔹 Datasets citing this paper:
https://huggingface.co/datasets/yuyijiong/context_qa_sum_qwen3_synthetic

==================================

#AI #DataScience #MachineLearning #HuggingFace #Research
DreamLite: A Lightweight On-Device Unified Model for Image Generation and Editing

📝 Summary:
DreamLite is a compact unified on-device diffusion model that supports both text-to-image generation and text-guided image editing with efficient training and inference.

🔹 Publication Date: Published on Mar 30

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.28713
• PDF: https://arxiv.org/pdf/2603.28713
• Project Page: https://carlofkl.github.io/dreamlite/

==================================

#AI #DataScience #MachineLearning #HuggingFace #Research
Story2Proposal: A Scaffold for Structured Scientific Paper Writing

📝 Summary:
Story2Proposal is a contract-governed multi-agent framework that generates structured scientific manuscripts with improved consistency and visual alignment through coordinated agents operating under a...

🔹 Publication Date: Published on Mar 28

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.27065
• PDF: https://arxiv.org/pdf/2603.27065

==================================

#AI #DataScience #MachineLearning #HuggingFace #Research
Think over Trajectories: Leveraging Video Generation to Reconstruct GPS Trajectories from Cellular Signaling

📝 Summary:
Cellular signaling records are transformed into GPS trajectories through map-visual video generation, achieving superior performance over traditional methods while maintaining scalability and cross-ci...

🔹 Publication Date: Published on Mar 27

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.26610
• PDF: https://arxiv.org/pdf/2603.26610

==================================

#AI #DataScience #MachineLearning #HuggingFace #Research
Superintelligence and Law

📝 Summary:
The prospect of artificial superintelligence -- AI agents that can generally outperform humans in cognitive tasks an...

🔹 Publication Date: Published on Mar 30

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.28669
• PDF: https://arxiv.org/pdf/2603.28669

==================================

#AI #DataScience #MachineLearning #HuggingFace #Research
HISA: Efficient Hierarchical Indexing for Fine-Grained Sparse Attention

📝 Summary:
HISA improves sparse attention efficiency by replacing the traditional indexer with a hierarchical approach that reduces computational complexity from O(L²) to sub-quadratic scaling while maintaining ...
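
To make the idea of a hierarchical indexer concrete, here is a minimal two-level sketch: score coarse block summaries first, then score individual keys only inside the blocks that survive. This is a generic coarse-to-fine selection, not HISA's actual algorithm; the mean-key block summary and the parameters `block_size`, `top_blocks`, and `top_tokens` are assumptions for illustration.

```python
def dot(a, b):
    """Plain dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def hierarchical_topk(query, keys, block_size, top_blocks, top_tokens):
    """Return sorted indices of keys chosen by a block-then-token search."""
    # Level 1: partition key positions into contiguous blocks.
    blocks = [range(s, min(s + block_size, len(keys)))
              for s in range(0, len(keys), block_size)]

    def block_score(block):
        # Summarize a block by its mean key (assumption). In practice these
        # summaries would be cached rather than recomputed per query.
        mean = [sum(keys[i][d] for i in block) / len(block)
                for d in range(len(query))]
        return dot(query, mean)

    kept = sorted(blocks, key=block_score, reverse=True)[:top_blocks]

    # Level 2: score individual tokens only inside the surviving blocks.
    candidates = [i for block in kept for i in block]
    candidates.sort(key=lambda i: dot(query, keys[i]), reverse=True)
    return sorted(candidates[:top_tokens])
```

In this sketch the token-level pass touches at most `top_blocks * block_size` keys per query instead of all of them, which is where the savings over a flat indexer would come from.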

🔹 Publication Date: Published on Mar 30

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.28458
• PDF: https://arxiv.org/pdf/2603.28458

==================================

#AI #DataScience #MachineLearning #HuggingFace #Research
MolmoPoint: Better Pointing for VLMs with Grounding Tokens

📝 Summary:
A vision-language model approach for grounding that directly selects visual tokens containing target concepts through specialized pointing tokens, achieving superior performance in image, GUI, video p...

🔹 Publication Date: Published on Mar 30

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.28069
• PDF: https://arxiv.org/pdf/2603.28069

==================================

#AI #DataScience #MachineLearning #HuggingFace #Research
MOOZY: A Patient-First Foundation Model for Computational Pathology

📝 Summary:
A patient-first pathology foundation model named MOOZY uses a case transformer to model dependencies across multiple slides from the same patient, achieving superior performance on diverse clinical ta...

🔹 Publication Date: Published on Mar 27

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.27048
• PDF: https://arxiv.org/pdf/2603.27048
• Project Page: https://atlasanalyticslab.github.io/MOOZY/
• Github: https://github.com/AtlasAnalyticsLab/MOOZY

🔹 Models citing this paper:
https://huggingface.co/AtlasAnalyticsLab/MOOZY

==================================

#AI #DataScience #MachineLearning #HuggingFace #Research
EpochX: Building the Infrastructure for an Emergent Agent Civilization

📝 Summary:
EpochX is a credits-native marketplace infrastructure designed for human-agent production networks. It enables scalable task delegation and verification, generating reusable skills and workflows. This system fosters cumulative improvement and durable human-agent collaboration through economic inc...

🔹 Publication Date: Published on Mar 28

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.27304
• PDF: https://arxiv.org/pdf/2603.27304
• Project Page: https://epochx.cc
• Github: https://github.com/QuantaAlpha/EpochX

==================================

#AIAgents #HumanAICooperation #AIInfrastructure #AIEconomics #EmergentAI
On-the-fly Repulsion in the Contextual Space for Rich Diversity in Diffusion Transformers

📝 Summary:
Diffusion transformers often lack visual diversity. This paper introduces on-the-fly repulsion in the contextual space to enhance diversity. It intervenes in multimodal attention during the forward pass, yielding rich outcomes without losing quality or efficiency.

🔹 Publication Date: Published on Mar 30

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.28762
• PDF: https://arxiv.org/pdf/2603.28762
• Project Page: https://contextual-repulsion.github.io/

==================================

#DiffusionModels #DeepLearning #GenerativeAI #ComputerVision #AIResearch
SEAR: Schema-Based Evaluation and Routing for LLM Gateways

📝 Summary:
SEAR is a schema-based system for evaluating and routing LLM responses in gateways. It uses structured signals from LLM reasoning to make accurate, interpretable decisions, unifying evaluation and routing. It achieved significant cost reductions with comparable quality in production.

🔹 Publication Date: Published on Mar 20

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.26728
• PDF: https://arxiv.org/pdf/2603.26728
• Project Page: https://www.strukto.ai/

==================================

#LLM #AIGateways #AIEvaluation #AIRouting #MachineLearning
TAPS: Task Aware Proposal Distributions for Speculative Sampling

📝 Summary:
Speculative decoding quality depends on matching draft model training data to the downstream task. Task-specific training yields specialized drafters that are best combined at inference time using confidence-based routing, outperforming averaging. Confidence is a more effective routing signal tha...
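
The contrast between confidence-based routing and averaging can be sketched as follows. This is only an illustration under stated assumptions: each drafter is represented as a next-token distribution in a plain dict, "confidence" is taken to be the top-token probability, and the drafter names are hypothetical. The paper's exact routing rule may differ.

```python
def confidence(dist):
    """Confidence of a drafter (assumption): probability of its most likely token."""
    return max(dist.values())


def route_by_confidence(proposals):
    """Return the name of the drafter whose proposal is most confident."""
    return max(proposals, key=lambda name: confidence(proposals[name]))


def average_proposals(proposals):
    """Baseline: uniform mixture of all drafter distributions."""
    tokens = {t for dist in proposals.values() for t in dist}
    n = len(proposals)
    return {t: sum(d.get(t, 0.0) for d in proposals.values()) / n
            for t in tokens}


# Hypothetical drafters specialized on different data.
proposals = {
    "math": {"42": 0.9, "the": 0.1},
    "chat": {"42": 0.3, "the": 0.4, "a": 0.3},
}
chosen = route_by_confidence(proposals)
mixture = average_proposals(proposals)
```

Routing keeps the sharp proposal of the specialized drafter intact, whereas averaging dilutes it with mass from the less suitable drafter.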

🔹 Publication Date: Published on Mar 27

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.27027
• PDF: https://arxiv.org/pdf/2603.27027
• Github: https://github.com/Moe-Zbeeb/TAPS

🔹 Models citing this paper:
https://huggingface.co/zbeeb/Hass-MathInstruct_20epochs
https://huggingface.co/zbeeb/Hass-ShareGPT_20epochs
https://huggingface.co/zbeeb/Hass-Sharegpt-Mathinstruct-20epochs

🔹 Datasets citing this paper:
https://huggingface.co/datasets/zbeeb/TAPS-Datasets

==================================

#SpeculativeDecoding #LLM #MachineLearning #AIResearch #NLP
KAT-Coder-V2 Technical Report

📝 Summary:
KAT-Coder-V2 is an agentic coding model that uses a 'Specialize-then-Unify' approach across five expert domains. It employs novel training methods and infrastructure, achieving strong performance on SWE-bench, PinchBench, and other coding benchmarks.

🔹 Publication Date: Published on Mar 29

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.27703
• PDF: https://arxiv.org/pdf/2603.27703

==================================

#AI #Coding #LLM #MachineLearning #Research
Unified Number-Free Text-to-Motion Generation Via Flow Matching

📝 Summary:
Existing text-to-motion models struggle with a variable number of agents, leading to inefficiency and errors. This paper proposes Unified Motion Flow (UMF), a two-stage approach (prior and reaction) that uses P-Flow and S-Flow in a unified latent space. UMF effectively generates multi-person motion from text, mi...

🔹 Publication Date: Published on Mar 27

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.27040
• PDF: https://arxiv.org/pdf/2603.27040
• Project Page: https://githubhgh.github.io/umf/
• Github: https://github.com/Githubhgh/UMF_CVPR

==================================

#TextToMotion #FlowMatching #GenerativeAI #MotionSynthesis #DeepLearning
Text Data Integration

📝 Summary:
This paper argues for integrating textual data into data integration systems, since current approaches focus largely on structured data. It explores the challenges, the state of the art, and open problems in utilizing unstructured text.

🔹 Publication Date: Published on Mar 28

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.27055
• PDF: https://arxiv.org/pdf/2603.27055
• Project Page: https://dtim.upc.edu/en
• Github: https://github.com/dtim-upc/THOR

==================================

#DataIntegration #UnstructuredData #TextData #NLP #DataScience
A Comparative Study in Surgical AI: Datasets, Foundation Models, and Barriers to Med-AGI

📝 Summary:
This paper finds that even state-of-the-art multi-billion parameter AI models struggle with surgical tool detection, a seemingly simple task. Scaling models further offers diminishing returns, suggesting fundamental limitations for current Vision Language Models in surgical use cases beyond just ...

🔹 Publication Date: Published on Mar 28

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.27341
• PDF: https://arxiv.org/pdf/2603.27341

==================================

#SurgicalAI #MedicalAI #FoundationModels #VisionLanguageModels #AIHealthcare
HandX: Scaling Bimanual Motion and Interaction Generation

📝 Summary:
HandX presents a new foundation for bimanual hand motion synthesis, offering a high-fidelity dataset, an LLM-driven annotation method, and new evaluation metrics. It enables high-quality dexterous motion generation, with scaling trends observed.

🔹 Publication Date: Published on Mar 30

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.28766
• PDF: https://arxiv.org/pdf/2603.28766
• Project Page: https://github.com/handx-project/HandX
• Github: https://github.com/handx-project/HandX

==================================

#MotionSynthesis #BimanualInteraction #DexterousManipulation #AIResearch #LLM
AdaptToken: Entropy-based Adaptive Token Selection for MLLM Long Video Understanding

📝 Summary:
AdaptToken enables efficient long video understanding for MLLMs by using model uncertainty to dynamically select relevant tokens. It allocates a global token budget and supports early stopping, significantly improving accuracy and reducing inference time across benchmarks.
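
One way to picture entropy-driven selection with a global budget and early stopping is the sketch below. It is not AdaptToken's implementation: treating low entropy as a sign of relevance, the `exp(-entropy)` weighting, and the stopping threshold are all assumptions chosen to keep the example small.

```python
import math


def entropy(probs):
    """Shannon entropy (in nats) of a probability distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0)


def allocate_budget(group_entropies, budget):
    """Split a global token budget across frame groups.

    Assumption: groups where the model is more certain (lower entropy)
    are treated as more relevant and receive a larger share.
    """
    weights = [math.exp(-h) for h in group_entropies]
    total = sum(weights)
    return [int(budget * w / total) for w in weights]


def scan_with_early_stop(group_probs, threshold):
    """Visit groups in order; stop once one yields entropy below the threshold."""
    visited = []
    for i, probs in enumerate(group_probs):
        visited.append(i)
        if entropy(probs) < threshold:
            break
    return visited
```

Here `group_probs` stands for the model's answer distribution after seeing each group of frames; once the model is confident enough, the remaining groups are skipped, which is the source of the inference-time savings in this sketch.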

🔹 Publication Date: Published on Mar 30

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.28696
• PDF: https://arxiv.org/pdf/2603.28696
• Project Page: https://haozheqi.github.io/adapt-token
• Github: https://github.com/HaozheQi/AdaptToken

==================================

#MLLM #VideoUnderstanding #MachineLearning #AIResearch #TokenSelection