ML Research Hub
Advancing research in Machine Learning – practical insights, tools, and techniques for researchers.

Admin: @HusseinSheikho || @Hussein_Sheikho
AgentVista: Evaluating Multimodal Agents in Ultra-Challenging Realistic Visual Scenarios

📝 Summary:
AgentVista presents a comprehensive benchmark for multimodal agents requiring long-horizon tool interactions across multiple modalities and complex real-world scenarios.

🔹 Publication Date: Published on Feb 26

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2602.23166
• PDF: https://arxiv.org/pdf/2602.23166

==================================

For more data science resources:
https://t.iss.one/DataScienceT

#AI #DataScience #MachineLearning #HuggingFace #Research
RoboPocket: Improve Robot Policies Instantly with Your Phone

📝 Summary:
RoboPocket enables efficient, robot-free policy iteration via smartphones. It uses augmented reality to visualize policy weaknesses, guiding data collection, and asynchronous online finetuning to update policies quickly. This doubles data and sample efficiency.

🔹 Publication Date: Published on Mar 5

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.05504
• PDF: https://arxiv.org/pdf/2603.05504
• Project Page: https://robo-pocket.github.io

==================================

#AI #DataScience #MachineLearning #HuggingFace #Research
Large Multimodal Models as General In-Context Classifiers

📝 Summary:
Large Multimodal Models demonstrate superior performance in closed-world classification with in-context learning and excel in open-world scenarios when equipped with the proposed CIRCLE method for pse...

🔹 Publication Date: Published on Feb 26

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2602.23229
• PDF: https://arxiv.org/pdf/2602.23229
• Project Page: https://circle-lmm.github.io/
• Github: https://github.com/marco-garosi/CIRCLE

==================================

#AI #DataScience #MachineLearning #HuggingFace #Research
GLiNER2: An Efficient Multi-Task Information Extraction System with Schema-Driven Interface

📝 Summary:
GLiNER2 is a unified transformer-based framework that supports multiple NLP tasks with improved efficiency and accessibility compared to large language models.

🔹 Publication Date: Published on Jul 24, 2025

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2507.18546
• PDF: https://arxiv.org/pdf/2507.18546
• Github: https://github.com/fastino-ai/GLiNER2

🔹 Models citing this paper:
https://huggingface.co/fastino/gliner2-base-v1
https://huggingface.co/fastino/gliner2-large-v1
https://huggingface.co/fastino/gliner2-multi-v1

🔹 Spaces citing this paper:
https://huggingface.co/spaces/sitammeur/GLiNER2-Suite
https://huggingface.co/spaces/fastino/gliner2-official-demo
https://huggingface.co/spaces/sohom004/testdup

==================================

#AI #DataScience #MachineLearning #HuggingFace #Research
Mozi: Governed Autonomy for Drug Discovery LLM Agents

📝 Summary:
Mozi is a dual-layer framework for reliable drug discovery LLM agents, solving issues of tool-use governance and long-horizon reliability. It uses a control plane for isolated tool-use and replanning, plus a workflow plane for structured stages with human oversight, ensuring robust, auditable res...

🔹 Publication Date: Published on Mar 4

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.03655
• PDF: https://arxiv.org/pdf/2603.03655
• Project Page: https://ai4s.idea.edu.cn/ai4s/mozi

==================================

#LLMAgents #DrugDiscovery #AI #MachineLearning #AutonomousAgents
MASQuant: Modality-Aware Smoothing Quantization for Multimodal Large Language Models

📝 Summary:
MASQuant improves multimodal LLM quantization by resolving smoothing misalignment and cross-modal invariance. It uses modality-aware smoothing and SVD whitening for cross-modal compensation, achieving stable, competitive performance.
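
To make the idea of smoothing concrete, here is a minimal sketch of generic SmoothQuant-style smoothing, not MASQuant itself. It assumes the standard per-channel scale s_j = max|X_j|^alpha / max|W_j|^(1 - alpha), and it uses made-up toy activations in which vision tokens have much larger magnitudes than text tokens. All function and variable names are hypothetical. It shows that a single scale computed over both modalities can be set entirely by one of them, which is one plausible reading of the "smoothing misalignment" mentioned above.

```python
def channel_scales(acts, weights, alpha=0.5):
    """Per-input-channel scales: max|X_j|**alpha / max|W_j|**(1 - alpha).

    acts is tokens x in_channels, weights is in_channels x out_channels.
    Assumes every channel has a nonzero activation and weight maximum.
    """
    scales = []
    for j, w_row in enumerate(weights):
        act_max = max(abs(row[j]) for row in acts)
        w_max = max(abs(w) for w in w_row)
        scales.append(act_max ** alpha / w_max ** (1.0 - alpha))
    return scales


def apply_smoothing(acts, weights, scales):
    """Divide activations and multiply weights by the same scales."""
    acts_s = [[x / s for x, s in zip(row, scales)] for row in acts]
    weights_s = [[w * s for w in row] for row, s in zip(weights, scales)]
    return acts_s, weights_s


def matmul(a, b):
    """Plain list-of-lists matrix product."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]


# Toy data (assumed for illustration): 2 input channels, 2 output channels.
text_acts = [[1.0, 2.0], [0.5, -1.0]]
vision_acts = [[40.0, 0.1], [-90.0, 0.2]]
weights = [[0.5, -0.25], [1.0, 2.0]]

text_scales = channel_scales(text_acts, weights)
vision_scales = channel_scales(vision_acts, weights)
shared_scales = channel_scales(text_acts + vision_acts, weights)

print("text-only scales:  ", text_scales)
print("vision-only scales:", vision_scales)
print("shared scales:     ", shared_scales)

# Smoothing leaves the layer output unchanged before quantization.
acts_s, weights_s = apply_smoothing(vision_acts, weights, shared_scales)
print(matmul(vision_acts, weights))
print(matmul(acts_s, weights_s))
```

In this toy case the shared scale for channel 0 equals the vision-only scale, so text activations in that channel are divided by a factor chosen for a different modality. How MASQuant computes and compensates for per-modality scales is described in the paper, not here.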

🔹 Publication Date: Published on Mar 5

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.04800
• PDF: https://arxiv.org/pdf/2603.04800

==================================

#MultimodalAI #LLM #Quantization #DeepLearning #AIResearch
SkillNet: Create, Evaluate, and Connect AI Skills

📝 Summary:
SkillNet is an open infrastructure that systematically creates, evaluates, and organizes AI skills using a unified ontology. This addresses the lack of skill accumulation in current agents, raising rewards by 40 percent and reducing execution steps by 30 percent.

🔹 Publication Date: Published on Feb 26

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.04448
• PDF: https://arxiv.org/pdf/2603.04448
• Project Page: https://skillnet.openkg.cn/
• Github: https://github.com/zjunlp/SkillNet

==================================

#AI #AISkills #AIAgents #Ontology #MachineLearning
STMI: Segmentation-Guided Token Modulation with Cross-Modal Hypergraph Interaction for Multi-Modal Object Re-Identification

📝 Summary:
STMI is a novel multi-modal ReID framework that improves object re-identification. It uses segmentation-guided modulation for foreground enhancement, token reallocation for compact features, and cross-modal hypergraph interaction to capture high-order semantic relationships.

🔹 Publication Date: Published on Feb 28

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.00695
• PDF: https://arxiv.org/pdf/2603.00695

==================================

#ObjectReID #ComputerVision #DeepLearning #MultiModalAI #STMI
Interactive Benchmarks

📝 Summary:
Interactive Benchmarks is a new framework that assesses model intelligence by evaluating active information acquisition and reasoning under constraints. The evaluation reveals significant room for model improvement in interactive scenarios.

🔹 Publication Date: Published on Mar 5

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.04737
• PDF: https://arxiv.org/pdf/2603.04737
• Project Page: https://github.com/interactivebench/interactivebench
• Github: https://github.com/interactivebench/interactivebench

==================================

#AI #AIBenchmarks #InteractiveAI #AIIntelligence #MachineLearning
Lightweight Visual Reasoning for Socially-Aware Robots

📝 Summary:
A lightweight language-to-vision feedback module enhances VLMs for robotics. It reinterprets visual scenes under text context via a gated MLP, improving navigation, scene description, and human intention recognition with minimal parameters.
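
As a rough illustration of what a gated language-to-vision feedback step can look like, the sketch below adds a text-conditioned update to a visual feature vector through a sigmoid gate. This is one common form of gating and an assumption on my part, not the architecture from the paper. The class name, dimensions, and random initialization are all hypothetical.

```python
import math
import random


def linear(x, weight, bias):
    """y = W x + b, with W stored as a list of output rows."""
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weight, bias)]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def init_matrix(rows, cols, rng):
    """Small random weights; a stand-in for trained parameters."""
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]


class GatedFeedback:
    """Hypothetical gated MLP: vis + gate(vis, txt) * mlp(vis, txt)."""

    def __init__(self, vis_dim, txt_dim, hidden, seed=0):
        rng = random.Random(seed)
        in_dim = vis_dim + txt_dim
        self.w1 = init_matrix(hidden, in_dim, rng)
        self.b1 = [0.0] * hidden
        self.w2 = init_matrix(vis_dim, hidden, rng)
        self.b2 = [0.0] * vis_dim
        self.wg = init_matrix(vis_dim, in_dim, rng)
        self.bg = [0.0] * vis_dim

    def __call__(self, vis, txt):
        joint = vis + txt  # list concatenation of visual and text features
        hidden = [max(0.0, h) for h in linear(joint, self.w1, self.b1)]  # ReLU
        update = linear(hidden, self.w2, self.b2)
        gate = [sigmoid(g) for g in linear(joint, self.wg, self.bg)]
        # Residual form: a gate near 0 leaves the visual feature untouched.
        return [v + g * u for v, g, u in zip(vis, gate, update)]


module = GatedFeedback(vis_dim=4, txt_dim=3, hidden=8)
vis_feature = [0.5, -1.0, 2.0, 0.0]
txt_feature = [1.0, 0.0, -1.0]
print(module(vis_feature, txt_feature))
```

The residual form keeps the parameter count small (three weight matrices here) and lets the gate fall back to the original visual feature when the text context is uninformative.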

🔹 Publication Date: Published on Mar 4

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.03942
• PDF: https://arxiv.org/pdf/2603.03942
• Github: https://github.com/alessioGalatolo/VLM-Reasoning-for-Robotics

==================================

#Robotics #VLMs #VisualReasoning #AI #HumanRobotInteraction
Latent Particle World Models: Self-supervised Object-centric Stochastic Dynamics Modeling

📝 Summary:
LPWM is a self-supervised object-centric world model that autonomously discovers object representations from video data. It models stochastic particle dynamics for decision-making, achieving state-of-the-art results.

🔹 Publication Date: Published on Mar 4

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.04553
• PDF: https://arxiv.org/pdf/2603.04553
• Project Page: https://taldatech.github.io/lpwm-web/
• Github: https://github.com/taldatech/lpwm

==================================

#WorldModels #SelfSupervisedLearning #ObjectCentricAI #MachineLearning #AI
Truncated Step-Level Sampling with Process Rewards for Retrieval-Augmented Reasoning

📝 Summary:
SLATE uses truncated step-level sampling and dense LLM-as-judge rewards to train language models with search engines. This reduces gradient variance and improves performance on complex reasoning tasks by better assigning credit.

🔹 Publication Date: Published on Feb 26

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2602.23440
• PDF: https://arxiv.org/pdf/2602.23440
• Github: https://github.com/algoprog/SLATE

==================================

#LLM #RetrievalAugmentedReasoning #ReinforcementLearning #AI #MachineLearning