ML Research Hub
Advancing research in Machine Learning – practical insights, tools, and techniques for researchers.

Admin: @HusseinSheikho || @Hussein_Sheikho
Language on Demand, Knowledge at Core: Composing LLMs with Encoder-Decoder Translation Models for Extensible Multilinguality

📝 Summary:
XBridge combines LLMs with translation models to boost multilingual performance, especially for low-resource languages. It keeps the LLM as an English knowledge core, bridging model misalignment with lightweight mapping layers for semantic consistency without retraining the LLM.

🔹 Publication Date: Published on Mar 18

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.17512
• PDF: https://arxiv.org/pdf/2603.17512
• Github: https://github.com/ictnlp/XBridge

==================================

For more data science resources:
https://t.iss.one/DataScienceT

#LLM #MultilingualAI #NLP #LowResourceLanguages #AIResearch
Teaching an Agent to Sketch One Part at a Time

📝 Summary:
Researchers developed an agent that generates vector sketches incrementally, one part at a time. It uses a multi-modal language model and process-reward reinforcement learning with a new part-annotated dataset. This enables controllable and editable text-to-vector sketch generation.

🔹 Publication Date: Published on Mar 19

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.19500
• PDF: https://arxiv.org/pdf/2603.19500

==================================

#AI #GenerativeAI #MachineLearning #ComputerVision #ReinforcementLearning
HopChain: Multi-Hop Data Synthesis for Generalizable Vision-Language Reasoning

📝 Summary:
HopChain is a framework that synthesizes multi-hop vision-language reasoning data to improve VLMs. This data features logically dependent reasoning chains, addressing VLMs' struggle with complex reasoning. Training with HopChain data significantly enhances generalizable VLM performance across div...

🔹 Publication Date: Published on Mar 17

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.17024
• PDF: https://arxiv.org/pdf/2603.17024

==================================

#VLMs #DataSynthesis #MultiHopReasoning #AIResearch #ComputerVision
Astrolabe: Steering Forward-Process Reinforcement Learning for Distilled Autoregressive Video Models

📝 Summary:
Astrolabe is an efficient online reinforcement learning framework for distilled autoregressive video models. It improves generation quality using a forward-process RL formulation and streaming training with a multi-reward objective, avoiding expensive re-distillation or reverse-process optimization.

🔹 Publication Date: Published on Mar 17

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.17051
• PDF: https://arxiv.org/pdf/2603.17051
• Project Page: https://franklinz233.github.io/projects/astrolabe/
• Github: https://github.com/franklinz233/Astrolabe

==================================

#ReinforcementLearning #VideoGeneration #DeepLearning #AI #ModelOptimization
BEAVER: A Training-Free Hierarchical Prompt Compression Method via Structure-Aware Page Selection

📝 Summary:
BEAVER is a training-free framework that improves long-context LLM inference using structure-aware hierarchical selection and dense tensor mapping. It maintains semantic integrity, performs comparably to SOTA methods, and reduces latency by 26.4x on large contexts.

🔹 Publication Date: Published on Mar 20

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.19635
• PDF: https://arxiv.org/pdf/2603.19635
• Project Page: https://cslikai.cn/BEAVER/
• Github: https://github.com/JusperLee/BEAVER

==================================

#LLM #AI #PromptEngineering #DeepLearning #ModelOptimization
AgentDS Technical Report: Benchmarking the Future of Human-AI Collaboration in Domain-Specific Data Science

📝 Summary:
The AgentDS benchmark evaluates AI agents and human-AI collaboration on domain-specific data science tasks, revealing the continued necessity of human expertise despite advances in large language models and A...

🔹 Publication Date: Published on Mar 19

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.19005
• PDF: https://arxiv.org/pdf/2603.19005
• Project Page: https://agentds.org/

Datasets citing this paper:
https://huggingface.co/datasets/lainmn/AgentDS-Insurance
https://huggingface.co/datasets/lainmn/AgentDS-RetailBanking
https://huggingface.co/datasets/lainmn/AgentDS-Manufacturing

==================================

#AI #DataScience #MachineLearning #HuggingFace #Research
EgoForge: Goal-Directed Egocentric World Simulator

📝 Summary:
EgoForge is an egocentric goal-directed world simulator that generates coherent first-person video rollouts from minimal static inputs using trajectory-level reward-guided refinement during diffusion ...

🔹 Publication Date: Published on Mar 20

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.20169
• PDF: https://arxiv.org/pdf/2603.20169
• Project Page: https://plan-lab.github.io/projects/egoforge

==================================

#AI #DataScience #MachineLearning #HuggingFace #Research
FlowScene: Style-Consistent Indoor Scene Generation with Multimodal Graph Rectified Flow

📝 Summary:
FlowScene is a generative model that uses multimodal graph conditioning and rectified flow to create realistic, style-consistent indoor scenes. It offers fine-grained control over object shapes, textures, and relations, surpassing prior methods.

🔹 Publication Date: Published on Mar 20

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.19598
• PDF: https://arxiv.org/pdf/2603.19598

==================================

#GenerativeAI #3DSceneGeneration #MultimodalAI #DeepLearning #ComputerGraphics
TerraScope: Pixel-Grounded Visual Reasoning for Earth Observation

📝 Summary:
TerraScope is a new VLM for Earth Observation that enables pixel-grounded geospatial reasoning. It offers modality-flexible and multi-temporal capabilities and outperforms existing models on a new benchmark, with accurate and interpretable results.

🔹 Publication Date: Published on Mar 19

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.19039
• PDF: https://arxiv.org/pdf/2603.19039
• Project Page: https://shuyansy.github.io/terrascope/
• Github: https://github.com/shuyansy/Earth-Observation-VLMs

==================================

#EarthObservation #VLM #Geospatial #RemoteSensing #ComputerVision
HiMu: Hierarchical Multimodal Frame Selection for Long Video Question Answering

📝 Summary:
HiMu is a training-free framework for long video QA. It efficiently selects relevant frames using hierarchical query decomposition with lightweight multimodal experts, preserving temporal and cross-modal structure. HiMu advances the efficiency-accuracy Pareto front.
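
For readers unfamiliar with the term, an efficiency-accuracy Pareto front is the set of methods that no other method beats on both cost and accuracy at once. The sketch below is a generic illustration only: the method names and numbers are made up, and `pareto_front` is a hypothetical helper, not something taken from the HiMu paper or code.

```python
def pareto_front(points):
    """Return names of non-dominated (cost, accuracy) points.

    Lower cost is better, higher accuracy is better.
    `points` is a list of (name, cost, accuracy) tuples.
    """
    front = []
    best_acc = float("-inf")
    # Sort by cost ascending; on equal cost, try the more accurate point first.
    for name, cost, acc in sorted(points, key=lambda p: (p[1], -p[2])):
        # A point survives only if it beats every cheaper-or-equal point on accuracy.
        if acc > best_acc:
            front.append(name)
            best_acc = acc
    return front


# Hypothetical methods: (name, relative cost, accuracy). Illustrative values only.
methods = [
    ("A", 1.0, 0.60),
    ("B", 2.0, 0.58),
    ("C", 3.0, 0.70),
    ("D", 3.0, 0.65),
]
print(pareto_front(methods))  # ['A', 'C']
```

A method "advances" the front when adding it to such a list pushes at least one previous front member off it or extends the front to a new cost level.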

🔹 Publication Date: Published on Mar 19

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.18558
• PDF: https://arxiv.org/pdf/2603.18558
• Project Page: https://danbenami.github.io/HiMu.io/
• Github: https://github.com/DanBenAmi/HiMu

==================================

#VideoQA #MultimodalAI #ComputerVision #MachineLearning #AI
V-JEPA 2: Self-Supervised Video Models Enable Understanding, Prediction and Planning

📝 Summary:
V-JEPA 2 uses self-supervised learning on web videos and minimal robot data. It excels at video understanding, anticipation, Q&A, and zero-shot robotic planning. This approach yields a powerful world model for physical world planning.

🔹 Publication Date: Published on Jun 11, 2025

🔹 Paper Links:
• arXiv Page: https://arxivexplained.com/papers/v-jepa-2-self-supervised-video-models-enable-understanding-prediction-and-planning
• PDF: https://arxiv.org/pdf/2506.09985
• Github: https://github.com/facebookresearch/vjepa2

Datasets citing this paper:
https://huggingface.co/datasets/ckadirt/vjxla

Spaces citing this paper:
https://huggingface.co/spaces/vselvarajijay/vjepa2-latent-prediction
https://huggingface.co/spaces/aavi21458/vjepa2-latent-prediction

==================================

#AI #SelfSupervisedLearning #VideoAI #Robotics #WorldModels
LoopRPT: Reinforcement Pre-Training for Looped Language Models

📝 Summary:
LoopRPT is a reinforcement pre-training framework for looped language models. It directly shapes intermediate representations by assigning reinforcement signals to latent steps, improving latent reasoning. This leads to better accuracy-computation trade-offs and enhanced early-stage reasoning.

🔹 Publication Date: Published on Mar 20

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.19714
• PDF: https://arxiv.org/pdf/2603.19714

==================================

#ReinforcementLearning #LanguageModels #AI #NLP #DeepLearning
Cooperation and Exploitation in LLM Policy Synthesis for Sequential Social Dilemmas

📝 Summary:
This paper uses LLMs to synthesize agent policies for multi-agent environments. Dense feedback including social metrics consistently outperforms sparse reward-only feedback, guiding LLMs toward effective cooperative strategies in social dilemmas.

🔹 Publication Date: Published on Mar 19

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.19453
• PDF: https://arxiv.org/pdf/2603.19453
• Github: https://github.com/vicgalle/llm-policies-social-dilemmas

==================================

#LLMs #MultiAgentSystems #SocialDilemmas #ReinforcementLearning #AIResearch
ProactiveBench: Benchmarking Proactiveness in Multimodal Large Language Models

📝 Summary:
This paper introduces ProactiveBench to measure if MLLMs can proactively ask for user help on challenging tasks. It finds MLLMs generally lack this proactiveness, and conversational history can even hinder it. However, reinforcement learning shows promise for teaching models this crucial collabor...

🔹 Publication Date: Published on Mar 19

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.19466
• PDF: https://arxiv.org/pdf/2603.19466
• Project Page: https://huggingface.co/datasets/tdemin16/ProactiveBench
• Github: https://github.com/tdemin16/proactivebench

Datasets citing this paper:
https://huggingface.co/datasets/tdemin16/ProactiveBench

==================================

#MLLMs #AIProactiveness #BenchmarkingAI #ReinforcementLearning #LargeLanguageModels
The Y-Combinator for LLMs: Solving Long-Context Rot with λ-Calculus

📝 Summary:
λ-RLM replaces open-ended recursive code generation in LLMs with a typed functional runtime based on λ-calculus. This provides formal guarantees and improves long-context reasoning, outperforming standard RLMs in both accuracy and latency.
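
As background on the title, a fixed-point combinator lets a function recurse without referring to itself by name. The sketch below shows the strict-evaluation variant (often called the Z combinator) in plain Python. It is a textbook illustration, not the λ-RLM runtime; `z_combinator` and `total_length` are hypothetical names chosen for this example.

```python
def z_combinator(f):
    """Fixed-point combinator for strict languages (the Z form of Y).

    `f` takes a `recur` callable and returns the function body that uses it.
    """
    return (lambda x: f(lambda v: x(x)(v)))(lambda x: f(lambda v: x(x)(v)))


# Hypothetical example: sum the lengths of text chunks recursively,
# with no named self-reference inside the lambda.
total_length = z_combinator(
    lambda recur: lambda chunks: 0 if not chunks else len(chunks[0]) + recur(chunks[1:])
)

print(total_length(["alpha", "beta", "gamma"]))  # 14
```

The inner `lambda v: x(x)(v)` delays the self-application, which is what keeps the combinator from looping forever under Python's eager evaluation.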

🔹 Publication Date: Published on Mar 20

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.20105
• PDF: https://arxiv.org/pdf/2603.20105
• Github: https://github.com/lambda-calculus-LLM/lambda-RLM

==================================

#LLMs #LambdaCalculus #AI #NaturalLanguageProcessing #DeepLearning
Versatile Editing of Video Content, Actions, and Dynamics without Training

📝 Summary:
DynaEdit is a training-free method for versatile video editing using pretrained text-to-video models. It addresses limitations in handling complex edits, actions, and object interactions by solving technical issues like misalignment and jitter, achieving state-of-the-art results.

🔹 Publication Date: Published on Mar 18

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.17989
• PDF: https://arxiv.org/pdf/2603.17989
• Project Page: https://dynaedit.github.io

==================================

#VideoEditing #TextToVideo #GenerativeAI #ComputerVision #AIResearch
Deep Tabular Research via Continual Experience-Driven Execution

📝 Summary:
This paper introduces Deep Tabular Research (DTR), an agentic framework for complex tabular reasoning. It constructs a hierarchical meta-graph, uses expectation-aware path selection, and refines iteratively via siamese structured memory, highlighting the importance of separating planning from execu...

🔹 Publication Date: Published on Mar 10

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.09151
• PDF: https://arxiv.org/pdf/2603.09151

==================================

#DeepLearning #TabularData #AI #MachineLearning #AIagents
CurveStream: Boosting Streaming Video Understanding in MLLMs via Curvature-Aware Hierarchical Visual Memory Management

📝 Summary:
CurveStream enhances streaming video understanding in MLLMs via a curvature-aware hierarchical memory framework. It dynamically routes frames based on semantic intensity to prevent Out-of-Memory errors and achieve over 10 percent performance gains.

🔹 Publication Date: Published on Mar 20

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.19571
• PDF: https://arxiv.org/pdf/2603.19571
• Github: https://github.com/streamingvideos/CurveStream

==================================

#MLLMs #StreamingVideo #VideoUnderstanding #MemoryManagement #AI
s2n-bignum-bench: A practical benchmark for evaluating low-level code reasoning of LLMs

📝 Summary:
s2n-bignum-bench is a new benchmark evaluating LLMs on formal proof synthesis for industrial cryptographic assembly routines. It bridges the gap between competition math and real-world verification by requiring LLMs to generate HOL Light proofs for AWS s2n-bignum library code.

🔹 Publication Date: Published on Mar 15

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.14628
• PDF: https://arxiv.org/pdf/2603.14628
• Project Page: https://kings-crown.github.io/s2n-bignum-leaderboard/
• Github: https://github.com/kings-crown/s2n-bignum-bench

==================================

#AI #DataScience #MachineLearning #HuggingFace #Research