ML Research Hub
32.7K subscribers
5.65K photos
359 videos
24 files
6.11K links
Advancing research in Machine Learning – practical insights, tools, and techniques for researchers.

Admin: @HusseinSheikho || @Hussein_Sheikho
GLiNER2: An Efficient Multi-Task Information Extraction System with Schema-Driven Interface

📝 Summary:
GLiNER2 is a unified transformer-based framework that supports multiple information-extraction tasks through a single schema-driven interface, with improved efficiency and accessibility compared to large language models.
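As a toy illustration of what "schema-driven" means — the caller declares what to extract (a schema of labels), not how — here is a self-contained dictionary-backed sketch. GLiNER2 itself uses a trained transformer encoder; neither this schema format nor `extract()` is the library's real API.

```python
import re

# Hypothetical schema: label -> example surface forms to find.
SCHEMA = {"disease": ["diabetes", "asthma"], "drug": ["metformin"]}

def extract(text, schema):
    """Return labeled spans found in `text`, ordered by position."""
    found = []
    for label, terms in schema.items():
        for term in terms:
            for m in re.finditer(re.escape(term), text, re.IGNORECASE):
                found.append({"label": label, "text": m.group(0),
                              "start": m.start()})
    return sorted(found, key=lambda e: e["start"])

ents = extract("Metformin is first-line therapy for diabetes.", SCHEMA)
```

The point of the interface style is that adding a new extraction task is a schema change, not a model change.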

🔹 Publication Date: Published on Jul 24, 2025

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2507.18546
• PDF: https://arxiv.org/pdf/2507.18546
• Github: https://github.com/fastino-ai/GLiNER2

🔹 Models citing this paper:
https://huggingface.co/fastino/gliner2-base-v1
https://huggingface.co/fastino/gliner2-large-v1
https://huggingface.co/fastino/gliner2-multi-v1

🔹 Spaces citing this paper:
https://huggingface.co/spaces/sitammeur/GLiNER2-Suite
https://huggingface.co/spaces/fastino/gliner2-official-demo
https://huggingface.co/spaces/sohom004/testdup

==================================

For more data science resources:
https://t.iss.one/DataScienceT

#AI #DataScience #MachineLearning #HuggingFace #Research
Mozi: Governed Autonomy for Drug Discovery LLM Agents

📝 Summary:
Mozi is a dual-layer framework for reliable drug-discovery LLM agents, addressing tool-use governance and long-horizon reliability. A control plane provides isolated tool use and replanning, while a workflow plane structures stages with human oversight, ensuring robust, auditable results.

🔹 Publication Date: Published on Mar 4

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.03655
• PDF: https://arxiv.org/pdf/2603.03655
• Project Page: https://ai4s.idea.edu.cn/ai4s/mozi

==================================

#LLMAgents #DrugDiscovery #AI #MachineLearning #AutonomousAgents
MASQuant: Modality-Aware Smoothing Quantization for Multimodal Large Language Models

📝 Summary:
MASQuant improves multimodal LLM quantization by resolving smoothing misalignment and cross-modal invariance issues. It uses modality-aware smoothing and SVD whitening for cross-modal compensation, achieving stable, competitive performance.
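The "smoothing" being aligned here is the standard smoothing-quantization trick of migrating activation outliers into the weights via per-channel scales. A minimal NumPy sketch of that base operation (not MASQuant's modality-aware extension; function name and constants are illustrative):

```python
import numpy as np

def smoothing_scales(X, W, alpha=0.5, eps=1e-8):
    """SmoothQuant-style per-channel scales:
        s_j = max|X[:, j]|**alpha / max|W[j, :]|**(1 - alpha)
    Dividing activations and multiplying weights by s leaves X @ W
    unchanged while flattening activation outliers before quantization.
    """
    x_max = np.abs(X).max(axis=0) + eps  # per input channel of X
    w_max = np.abs(W).max(axis=1) + eps  # matching input channel of W
    return x_max ** alpha / w_max ** (1 - alpha)

rng = np.random.default_rng(0)
X = rng.normal(size=(4, 8))
X[:, 0] *= 50.0                          # simulate one outlier channel
W = rng.normal(size=(8, 3))

s = smoothing_scales(X, W)
Y_ref = X @ W
Y_smooth = (X / s) @ (W * s[:, None])    # mathematically identical output
```

Because the scale cancels inside the matmul, the transform is exact in full precision; the gain appears once both factors are quantized.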

🔹 Publication Date: Published on Mar 5

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.04800
• PDF: https://arxiv.org/pdf/2603.04800

==================================

#MultimodalAI #LLM #Quantization #DeepLearning #AIResearch
SkillNet: Create, Evaluate, and Connect AI Skills

📝 Summary:
SkillNet is an open infrastructure that systematically creates, evaluates, and organizes AI skills using a unified ontology. This overcomes the lack of skill accumulation in current agents, significantly boosting performance by 40 percent in rewards and reducing execution steps by 30 percent.

🔹 Publication Date: Published on Feb 26

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.04448
• PDF: https://arxiv.org/pdf/2603.04448
• Project Page: https://skillnet.openkg.cn/
• Github: https://github.com/zjunlp/SkillNet

==================================

#AI #AISkills #AIAgents #Ontology #MachineLearning
STMI: Segmentation-Guided Token Modulation with Cross-Modal Hypergraph Interaction for Multi-Modal Object Re-Identification

📝 Summary:
STMI is a novel multi-modal ReID framework that improves object re-identification. It uses segmentation-guided modulation for foreground enhancement, token reallocation for compact features, and cross-modal hypergraph interaction to capture high-order semantic relationships.

🔹 Publication Date: Published on Feb 28

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.00695
• PDF: https://arxiv.org/pdf/2603.00695

==================================

#ObjectReID #ComputerVision #DeepLearning #MultiModalAI #STMI
Interactive Benchmarks

📝 Summary:
Interactive Benchmarks propose a new framework to assess AI intelligence by evaluating active information acquisition and reasoning under constraints. This approach offers a robust assessment, revealing significant room for model improvement in interactive scenarios.

🔹 Publication Date: Published on Mar 5

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.04737
• PDF: https://arxiv.org/pdf/2603.04737
• Github: https://github.com/interactivebench/interactivebench

==================================

#AI #AIBenchmarks #InteractiveAI #AIIntelligence #MachineLearning
Lightweight Visual Reasoning for Socially-Aware Robots

📝 Summary:
A lightweight language-to-vision feedback module enhances VLMs for robotics. It reinterprets visual scenes under text context via a gated MLP, improving navigation, scene description, and human intention recognition with minimal parameters.
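The gated-MLP feedback pattern can be sketched as a residual, per-dimension gated fusion of a text-conditioned correction into the visual features. All shapes, names, and random weights below are illustrative assumptions, not the paper's trained module:

```python
import numpy as np

rng = np.random.default_rng(0)
d = 16                                   # illustrative feature width

# Random stand-ins for learned projections (hypothetical shapes).
W_corr = rng.normal(size=(2 * d, d)) * 0.1
W_gate = rng.normal(size=(2 * d, d)) * 0.1

def language_to_vision_feedback(v, t):
    """Reinterpret visual features v under text context t: a sigmoid
    gate decides, per dimension, how much text-conditioned correction
    to mix back into the visual representation (residual update)."""
    h = np.concatenate([v, t])
    gate = 1.0 / (1.0 + np.exp(-(h @ W_gate)))   # gate values in (0, 1)
    correction = np.tanh(h @ W_corr)             # text-conditioned update
    return v + gate * correction

v = rng.normal(size=d)       # visual features
t = rng.normal(size=d)       # text-context embedding
v_ctx = language_to_vision_feedback(v, t)
```

The residual form keeps the module lightweight: only the two small projections are new parameters, and a closed gate recovers the original visual features.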

🔹 Publication Date: Published on Mar 4

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.03942
• PDF: https://arxiv.org/pdf/2603.03942
• Github: https://github.com/alessioGalatolo/VLM-Reasoning-for-Robotics

==================================

#Robotics #VLMs #VisualReasoning #AI #HumanRobotInteraction
Latent Particle World Models: Self-supervised Object-centric Stochastic Dynamics Modeling

📝 Summary:
LPWM is a self-supervised object-centric world model that autonomously discovers object representations from video data. It models stochastic particle dynamics for decision-making, achieving state-of-the-art results.

🔹 Publication Date: Published on Mar 4

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.04553
• PDF: https://arxiv.org/pdf/2603.04553
• Project Page: https://taldatech.github.io/lpwm-web/
• Github: https://github.com/taldatech/lpwm

==================================

#WorldModels #SelfSupervisedLearning #ObjectCentricAI #MachineLearning #AI
Truncated Step-Level Sampling with Process Rewards for Retrieval-Augmented Reasoning

📝 Summary:
SLATE uses truncated step-level sampling and dense LLM-as-judge rewards to train language models with search engines. This reduces gradient variance and improves performance on complex reasoning tasks by better assigning credit.

🔹 Publication Date: Published on Feb 26

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2602.23440
• PDF: https://arxiv.org/pdf/2602.23440
• Github: https://github.com/algoprog/SLATE

==================================

#LLM #RetrievalAugmentedReasoning #ReinforcementLearning #AI #MachineLearning
Fast-FoundationStereo: Real-Time Zero-Shot Stereo Matching

📝 Summary:
Fast-FoundationStereo achieves real-time zero-shot stereo matching, bridging the gap between slow robust models and fast specialized ones. It employs distillation, architecture search, and pruning, running over 10x faster with accuracy similar to prior foundation models, setting a new state of the art for real-time stereo matching.

🔹 Publication Date: Published on Dec 11, 2025

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2512.11130
• PDF: https://arxiv.org/pdf/2512.11130
• Project Page: https://nvlabs.github.io/Fast-FoundationStereo/
• Github: https://github.com/NVlabs/Fast-FoundationStereo

==================================

#StereoMatching #ComputerVision #RealTimeAI #ZeroShotLearning #DeepLearning
Cautious Weight Decay

📝 Summary:
Cautious Weight Decay (CWD) is an optimizer modification that selectively applies weight decay to parameters whose signs align with the optimizer update. It improves accuracy and loss in large-scale models without additional tuning, acting as a drop-in change for common optimizers.
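The masking rule is simple enough to sketch. A minimal NumPy version, under the assumption of a decoupled update where `update` is the quantity subtracted from the parameters (function name and constants are illustrative):

```python
import numpy as np

def cwd_step(param, update, lr=0.1, wd=0.1):
    """One decoupled-weight-decay step with the cautious mask:
    decay is applied only where the parameter's sign agrees with
    the optimizer update's sign, so decay never fights the update.
    """
    mask = (np.sign(param) == np.sign(update)).astype(param.dtype)
    return param - lr * update - lr * wd * mask * param

p = np.array([1.0, -1.0, 2.0])
u = np.array([0.5, -0.2, -0.3])   # e.g. an Adam update direction
p_next = cwd_step(p, u)
```

In this toy step only the first two parameters are decayed; the third parameter's update points away from zero-ward decay, so its decay term is masked out.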

🔹 Publication Date: Published on Oct 14, 2025

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2510.12402
• PDF: https://arxiv.org/pdf/2510.12402
• Project Page: https://elm.baulab.info
• Github: https://github.com/google-deepmind/simply

==================================

#WeightDecay #Optimization #DeepLearning #MachineLearning #AI
OASIS: Open Agent Social Interaction Simulations with One Million Agents

📝 Summary:
OASIS is a scalable and generalizable social media simulator that models up to one million LLM agents. It replicates complex social phenomena like information spreading and group polarization across platforms, demonstrating enhanced group dynamics and diverse opinions with larger agent groups.

🔹 Publication Date: Published on Nov 18, 2024

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2411.11581
• PDF: https://arxiv.org/pdf/2411.11581
• Github: https://github.com/camel-ai/oasis

🔹 Spaces citing this paper:
https://huggingface.co/spaces/nguyenthanhasia/oasis-demo

==================================

#LLMAgents #SocialSimulation #AgentBasedModeling #ComputationalSocialScience #GroupDynamics
FlashPrefill: Instantaneous Pattern Discovery and Thresholding for Ultra-Fast Long-Context Prefilling

📝 Summary:
FlashPrefill drastically speeds up LLM prefilling using instantaneous pattern discovery and dynamic thresholding for sparse attention. It achieves an unprecedented 27.78x speedup on 256K sequences and maintains 1.71x on 4K contexts.
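Threshold-based sparse attention can be sketched in a few lines: per query, keep only keys whose exponentiated score is at least some fraction of the row maximum, then softmax over the survivors. This toy NumPy version illustrates the thresholding idea only, not FlashPrefill's pattern discovery or kernels:

```python
import numpy as np

def thresholded_attention(Q, K, V, keep_frac=0.05):
    """Toy dynamic-threshold sparse attention. Uses the identity
    exp(s) / exp(s_max) >= keep_frac  <=>  s >= s_max + log(keep_frac)
    to drop low-contribution keys before the softmax."""
    scores = Q @ K.T / np.sqrt(Q.shape[-1])
    row_max = scores.max(axis=-1, keepdims=True)
    mask = scores >= row_max + np.log(keep_frac)
    scores = np.where(mask, scores, -np.inf)
    w = np.exp(scores - row_max)          # masked entries become 0
    w /= w.sum(axis=-1, keepdims=True)
    return w @ V

rng = np.random.default_rng(1)
Q = rng.normal(size=(4, 8))
K = rng.normal(size=(6, 8))
V = rng.normal(size=(6, 8))
out = thresholded_attention(Q, K, V)
```

As `keep_frac` shrinks toward zero the result converges to dense attention; the speedup in a real kernel comes from skipping the masked score blocks entirely rather than materializing them.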

🔹 Publication Date: Published on Mar 6

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.06199
• PDF: https://arxiv.org/pdf/2603.06199
• Github: https://github.com/qhfan/FlashPrefill

==================================

#LLM #FlashPrefill #SparseAttention #LongContext #DeepLearning
PixARMesh: Autoregressive Mesh-Native Single-View Scene Reconstruction

📝 Summary:
PixARMesh reconstructs complete 3D indoor scene meshes from a single image. It uses a unified model with cross-attention and autoregressive generation to directly predict layout and geometry, producing high-quality, lightweight meshes.

🔹 Publication Date: Published on Mar 6

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.05888
• PDF: https://arxiv.org/pdf/2603.05888
• Project Page: https://mlpc-ucsd.github.io/PixARMesh/
• Github: https://github.com/mlpc-ucsd/PixARMesh

🔹 Models citing this paper:
https://huggingface.co/zx1239856/PixARMesh-EdgeRunner
https://huggingface.co/zx1239856/PixARMesh-BPT

🔹 Datasets citing this paper:
https://huggingface.co/datasets/zx1239856/3d-front-ar-packed
https://huggingface.co/datasets/zx1239856/PixARMesh-eval-data

==================================

#3DReconstruction #ComputerVision #DeepLearning #SingleView3D #MeshGeneration
Penguin-VL: Exploring the Efficiency Limits of VLM with LLM-based Vision Encoders

📝 Summary:
Penguin-VL introduces a vision encoder initialized from a text-only LLM, outperforming traditional contrastive pretraining. This method achieves superior visual fidelity and performance in multimodal tasks with a lightweight architecture, enabling efficient deployment on resource-constrained devices.

🔹 Publication Date: Published on Mar 6

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.06569
• PDF: https://arxiv.org/pdf/2603.06569
• Github: https://github.com/tencent-ailab/Penguin-VL

==================================

#VLM #LLM #MultimodalAI #EfficientAI #AIResearch
RoboMME: Benchmarking and Understanding Memory for Robotic Generalist Policies

📝 Summary:
RoboMME introduces a large-scale standardized benchmark for evaluating memory in vision-language-action models for long-horizon robotic manipulation. It comprises 16 tasks assessing temporal, spatial, object, and procedural memory. Experiments show that memory effectiveness is highly task-dependent.

🔹 Publication Date: Published on Mar 4

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.04639
• PDF: https://arxiv.org/pdf/2603.04639
• Project Page: https://robomme.github.io/
• Github: https://github.com/RoboMME/robomme_benchmark

==================================

#Robotics #AI #Benchmark #RoboticManipulation #Memory
Reasoning Models Struggle to Control their Chains of Thought

📝 Summary:
Reasoning models exhibit far lower control over their chain-of-thought steps than over their final outputs. Although this low controllability is not yet well understood, it suggests that CoT monitoring currently remains a reliable tool for understanding model behavior.

🔹 Publication Date: Published on Mar 5

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.05706
• PDF: https://arxiv.org/pdf/2603.05706

==================================

#AI #MachineLearning #ChainOfThought #LLMs #AIResearch
Physical Simulator In-the-Loop Video Generation

📝 Summary:
PSIVG integrates a physical simulator into the video diffusion process to generate physically consistent videos while maintaining visual quality and diversity.

🔹 Publication Date: Published on Mar 6

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.06408
• PDF: https://arxiv.org/pdf/2603.06408
• Project Page: https://vcai.mpi-inf.mpg.de/projects/PSIVG/

==================================

#AI #DataScience #MachineLearning #HuggingFace #Research