ML Research Hub
Advancing research in Machine Learning – practical insights, tools, and techniques for researchers.
Admin: @HusseinSheikho || @Hussein_Sheikho
Visual Persuasion: What Influences Decisions of Vision-Language Models?

📝 Summary:
Vision-language models' decision-making preferences are studied through controlled image-choice tasks with systematic input perturbations, revealing visual vulnerabilities and safety concerns.

🔹 Publication Date: Published on Feb 17

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2602.15278
• PDF: https://arxiv.org/pdf/2602.15278
• Project Page: https://visual-persuasion-website.vercel.app/

==================================

For more data science resources:
https://t.iss.one/DataScienceT

#AI #DataScience #MachineLearning #HuggingFace #Research
Geometry-Aware Rotary Position Embedding for Consistent Video World Model

📝 Summary:
ViewRope, a geometry-aware encoding method, enhances long-term consistency in predictive world models by injecting camera-ray directions into video transformer attention layers, addressing spatial per...
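The core idea, conditioning rotary phases on camera-ray geometry instead of a 1-D token index, can be sketched in a toy form. Everything below (the function name and the ray-to-phase mapping) is illustrative, not the paper's implementation:

```python
import numpy as np

def ray_rope(x, ray, base=10000.0):
    """Toy geometry-aware rotary embedding: rotate channel pairs of a
    token feature by phases derived from the token's camera-ray
    direction rather than its 1-D sequence index.
    x: (d,) feature with even d; ray: (3,) viewing-ray direction."""
    half = x.shape[0] // 2
    freqs = base ** (-np.arange(half) / half)   # per-pair frequencies, as in RoPE
    pos = ray[np.arange(half) % 3]              # illustrative ray-to-position map
    theta = pos * freqs
    cos, sin = np.cos(theta), np.sin(theta)
    x1, x2 = x[:half], x[half:]
    return np.concatenate([x1 * cos - x2 * sin, x1 * sin + x2 * cos])
```

Because the transform is a pure rotation, it preserves feature norms, and attention scores between tokens end up depending on their relative ray geometry, which is the property a consistency-oriented world model wants.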

🔹 Publication Date: Published on Feb 8

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2602.07854
• PDF: https://arxiv.org/pdf/2602.07854
• Project Page: https://huggingface.co/papers?q=projective%20geometry

==================================

#AI #DataScience #MachineLearning #HuggingFace #Research
STAPO: Stabilizing Reinforcement Learning for LLMs by Silencing Rare Spurious Tokens

📝 Summary:
Research identifies rare spurious tokens as the cause of training instability in reinforcement-learning fine-tuning of large language models and proposes a solution that selectively masks the problematic gradients.
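A minimal sketch of the masking idea: here "rare spurious" is approximated by a simple probability threshold, which is only illustrative (the paper's detection criterion is its own contribution):

```python
import numpy as np

def masked_pg_loss(logprobs, advantages, token_probs, eps=1e-3):
    """Toy policy-gradient objective that silences suspect tokens:
    positions whose policy probability falls below `eps` are treated
    as rare/spurious and excluded from the gradient."""
    mask = (token_probs >= eps).astype(float)
    kept = max(mask.sum(), 1.0)
    # REINFORCE-style term with spurious positions zeroed out
    return -float((mask * advantages * logprobs).sum() / kept)
```

Masking at the loss level, rather than clipping, means the rare tokens contribute exactly zero gradient instead of a bounded but still destabilizing one.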

🔹 Publication Date: Published on Feb 17

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2602.15620
• PDF: https://arxiv.org/pdf/2602.15620

==================================

#AI #DataScience #MachineLearning #HuggingFace #Research
ResearchGym: Evaluating Language Model Agents on Real-World AI Research

📝 Summary:
ResearchGym presents a benchmark environment for evaluating AI agents on end-to-end research tasks, revealing significant capability-reliability gaps in current autonomous agents despite occasional st...

🔹 Publication Date: Published on Feb 16

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2602.15112
• PDF: https://arxiv.org/pdf/2602.15112
• Github: https://github.com/Anikethh/ResearchGym

==================================

#AI #DataScience #MachineLearning #HuggingFace #Research
Understanding vs. Generation: Navigating Optimization Dilemma in Multimodal Models

📝 Summary:
The Reason-Reflect-Refine framework addresses the trade-off between generation and understanding in multimodal models by reformulating single-step generation into a multi-step process that explicitly ...

🔹 Publication Date: Published on Feb 17

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2602.15772
• PDF: https://arxiv.org/pdf/2602.15772
• Github: https://github.com/sen-ye/R3

==================================

#AI #DataScience #MachineLearning #HuggingFace #Research
ClinAlign: Scaling Healthcare Alignment from Clinician Preference

📝 Summary:
A two-stage framework aligns large language models with clinician preferences through physician-verified examples and distilled clinical principles, improving medical reasoning.

🔹 Publication Date: Published on Feb 10

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2602.09653
• PDF: https://arxiv.org/pdf/2602.09653
• Project Page: https://github.com/AQ-MedAI/ClinAlign
• Github: https://github.com/AQ-MedAI/ClinAlign

🔹 Models citing this paper:
https://huggingface.co/AQ-MedAI/ClinAligh-4B
https://huggingface.co/AQ-MedAI/ClinAligh-30B-A3B

==================================

#AI #DataScience #MachineLearning #HuggingFace #Research
Revisiting the Platonic Representation Hypothesis: An Aristotelian View

📝 Summary:
A new calibration framework corrects inflated neural-network similarity scores. It reveals that global convergence vanishes while local neighborhood similarity persists, supporting the Aristotelian Representation Hypothesis of shared local relationships.

🔹 Publication Date: Published on Feb 16

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2602.14486
• PDF: https://arxiv.org/pdf/2602.14486
• Project Page: https://brbiclab.epfl.ch/projects/aristotelian/
• Github: https://github.com/mlbio-epfl/aristotelian

==================================

#AI #DataScience #MachineLearning #HuggingFace #Research
Learning Native Continuation for Action Chunking Flow Policies

📝 Summary:
Legato improves action-chunked Vision-Language-Action models with training-time continuation methods that ensure smooth trajectories and reduce multimodal switching during real-time execution.

🔹 Publication Date: Published on Feb 13

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2602.12978
• PDF: https://arxiv.org/pdf/2602.12978
• Project Page: https://lyfeng001.github.io/Legato/

==================================

#AI #DataScience #MachineLearning #HuggingFace #Research
Prescriptive Scaling Reveals the Evolution of Language Model Capabilities

📝 Summary:
Large-scale observational analysis estimates capability boundaries and performance predictions for foundation models using quantile regression, and evaluates their temporal stability across tasks.
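Quantile "boundaries" of this kind can be estimated by minimizing the pinball loss. A self-contained sketch with a linear model and plain subgradient descent (the linear form and variable names are illustrative, not the paper's model):

```python
import numpy as np

def fit_quantile_line(x, y, q=0.9, lr=0.02, steps=20000):
    """Fit y ~ a*x + b at quantile q by subgradient descent on the
    pinball loss. Sketch: a linear capability 'boundary' over a scale
    variable x (e.g. log-compute); real analyses use richer models."""
    a, b = 0.0, 0.0
    for _ in range(steps):
        r = y - (a * x + b)                # residuals at current fit
        g = np.where(r > 0, -q, 1.0 - q)   # d(pinball)/d(prediction)
        a -= lr * (g * x).mean()
        b -= lr * g.mean()
    return a, b
```

Setting q near 1 traces an upper capability envelope; q = 0.5 reduces to median regression.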

🔹 Publication Date: Published on Feb 17

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2602.15327
• PDF: https://arxiv.org/pdf/2602.15327
• Project Page: https://jkjin.com/prescriptive-scaling

==================================

#AI #DataScience #MachineLearning #HuggingFace #Research
COMPOT: Calibration-Optimized Matrix Procrustes Orthogonalization for Transformers Compression

📝 Summary:
COMPOT is a training-free Transformer compression framework. It uses sparse dictionary learning with orthogonal dictionaries and closed-form updates, outperforming traditional low-rank methods and achieving a superior quality-compression trade-off by adaptively allocating layer-wise compression.
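The closed-form orthogonal update at the heart of such schemes is the classic orthogonal Procrustes solution. A minimal sketch of that standard SVD identity (not COMPOT's full pipeline):

```python
import numpy as np

def orthogonal_procrustes(A, B):
    """Closed-form minimizer of ||A @ Q - B||_F over orthogonal Q:
    take the SVD of A.T @ B and drop the singular values."""
    U, _, Vt = np.linalg.svd(A.T @ B)
    return U @ Vt
```

Because the update is closed-form, an alternating dictionary-learning loop needs no gradient steps for the orthogonal factor, which is what makes a training-free pipeline feasible.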

🔹 Publication Date: Published on Feb 16

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2602.15200
• PDF: https://arxiv.org/pdf/2602.15200

==================================

#Transformers #ModelCompression #DeepLearning #AIResearch #Optimization
UniT: Unified Multimodal Chain-of-Thought Test-time Scaling

📝 Summary:
UniT introduces a framework for unified multimodal models to perform iterative chain-of-thought reasoning and refinement. This test-time scaling substantially improves generation and understanding, generalizing to longer inference chains.

🔹 Publication Date: Published on Feb 12

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2602.12279
• PDF: https://arxiv.org/pdf/2602.12279

==================================

#MultimodalAI #ChainOfThought #AIResearch #MachineLearning #GenerativeAI
Causal-JEPA: Learning World Models through Object-Level Latent Interventions

📝 Summary:
C-JEPA is an object-centric world model extending masked joint-embedding prediction. It uses object-level masking to induce latent interventions, forcing interaction reasoning and preventing shortcuts. This improves visual question answering, counterfactual reasoning, and efficient agent control.

🔹 Publication Date: Published on Feb 11

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2602.11389
• PDF: https://arxiv.org/pdf/2602.11389
• Github: https://github.com/galilai-group/cjepa

==================================

#AI #MachineLearning #WorldModels #Causality #DeepLearning
Sanity Checks for Sparse Autoencoders: Do SAEs Beat Random Baselines?

📝 Summary:
Sparse autoencoders (SAEs) do not reliably decompose neural network internals despite strong reconstruction. On synthetic data, SAEs recovered only 9% of true features, and random baselines matched fully trained SAEs in interpretability and downstream tasks, suggesting current SAEs fail their core purpose.
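The kind of check involved, whether a trained dictionary recovers ground-truth feature directions any better than a random one, can be sketched as follows (the metric and threshold are illustrative, not the paper's exact protocol):

```python
import numpy as np

def feature_recovery(true_feats, dictionary, thresh=0.9):
    """Fraction of ground-truth feature directions matched by some
    dictionary atom with |cosine similarity| >= thresh."""
    T = true_feats / np.linalg.norm(true_feats, axis=1, keepdims=True)
    D = dictionary / np.linalg.norm(dictionary, axis=1, keepdims=True)
    best = np.abs(T @ D.T).max(axis=1)   # best match per true feature
    return float((best >= thresh).mean())
```

Running this on a trained SAE's decoder columns versus a random matrix of the same shape gives exactly the trained-vs-random comparison the paper's sanity checks advocate.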

🔹 Publication Date: Published on Feb 15

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2602.14111
• PDF: https://arxiv.org/pdf/2602.14111

==================================

#SparseAutoencoders #AIResearch #MachineLearning #NeuralNetworks #Interpretability
SkillsBench: Benchmarking How Well Agent Skills Work Across Diverse Tasks

📝 Summary:
SkillsBench shows curated agent skills significantly boost performance, though inconsistently. Models struggle to create useful skills themselves, as self-generated skills provide no benefit. Focused skills are more effective.

🔹 Publication Date: Published on Feb 13

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2602.12670
• PDF: https://arxiv.org/pdf/2602.12670
• Project Page: https://skillsbench.ai/
• Github: https://github.com/benchflow-ai/skillsbench

==================================

#AgentSkills #AI #Benchmarking #MachineLearning #LLMAgents
jina-embeddings-v5-text: Task-Targeted Embedding Distillation

📝 Summary:
This paper introduces a novel training regimen for compact text embedding models. It combines distillation with task-specific contrastive loss to achieve state-of-the-art performance for small models. The resulting jina-embeddings-v5-text models support long contexts and robust quantization.

🔹 Publication Date: Published on Feb 17

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2602.15547
• PDF: https://arxiv.org/pdf/2602.15547

🔹 Models citing this paper:
https://huggingface.co/jinaai/jina-embeddings-v5-text-small
https://huggingface.co/jinaai/jina-embeddings-v5-text-nano

==================================

#TextEmbeddings #MachineLearning #NLP #ModelDistillation #DeepLearning
TAROT: Test-driven and Capability-adaptive Curriculum Reinforcement Fine-tuning for Code Generation with Large Language Models

📝 Summary:
TAROT proposes a reinforcement fine-tuning method for code generation that uses a four-tier test suite and a capability-adaptive curriculum. This approach tailors curriculum progression to a model's skill, improving functional correctness and robustness.

🔹 Publication Date: Published on Feb 17

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2602.15449
• PDF: https://arxiv.org/pdf/2602.15449
• Github: https://github.com/deep-diver/TAROT

==================================

#LLM #CodeGeneration #ReinforcementLearning #AI #MachineLearning
Detecting Overflow in Compressed Token Representations for Retrieval-Augmented Generation

📝 Summary:
Soft compression for LLMs can lead to token overflow, losing vital information. This paper proposes query-aware probing classifiers that detect overflow with 0.72 AUC-ROC, improving upon query-agnostic methods. This enables low-cost pre-LLM gating to mitigate compression-induced errors.
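A query-aware probe of this kind can be sketched as a tiny logistic-regression classifier over compressed-context and query features. The feature construction and data below are synthetic stand-ins, not the paper's setup:

```python
import numpy as np

def train_overflow_probe(comp, query, labels, lr=0.1, steps=5000):
    """Fit a logistic-regression probe on [compressed-context, query,
    interaction] features to predict overflow (1 = information lost).
    Query-aware: the same compressed vector can overflow for one query
    and not another, so query features must enter the classifier."""
    X = np.hstack([comp, query, comp * query])
    w, b = np.zeros(X.shape[1]), 0.0
    n = len(labels)
    for _ in range(steps):
        p = 1.0 / (1.0 + np.exp(-(X @ w + b)))   # sigmoid
        g = p - labels                            # dLoss/dLogit
        w -= lr * X.T @ g / n
        b -= lr * g.mean()
    return w, b

def probe_predict(w, b, comp, query):
    X = np.hstack([comp, query, comp * query])
    return (1.0 / (1.0 + np.exp(-(X @ w + b))) > 0.5).astype(float)
```

The probe is cheap enough to run before the LLM call, which is what enables the low-cost gating the summary describes.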

🔹 Publication Date: Published on Feb 12

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2602.12235
• PDF: https://arxiv.org/pdf/2602.12235

==================================

#LLMs #RAG #NLP #AIResearch #TokenCompression
A Trajectory-Based Safety Audit of Clawdbot (OpenClaw)

📝 Summary:
Clawdbot, a self-hosted AI agent, exhibits a non-uniform safety profile. It reliably performs specified tasks but struggles with ambiguous inputs, open-ended goals, or jailbreaks, escalating minor misinterpretations into risky tool actions.

🔹 Publication Date: Published on Feb 16

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2602.14364
• PDF: https://arxiv.org/pdf/2602.14364
• Github: https://github.com/tychenn/clawdbot_report

🔹 Datasets citing this paper:
https://huggingface.co/datasets/tianyyuu/clawdbot_safety_testing

🔹 Spaces citing this paper:
https://huggingface.co/spaces/tianyyuu/clawdbot-safety-audit-explorer

==================================

#AISafety #AIagent #Robotics #AIaudit #MachineLearning
MIND: Benchmarking Memory Consistency and Action Control in World Models

📝 Summary:
MIND is the first open-domain, closed-loop benchmark for evaluating world-model abilities such as memory consistency and action control. It uses high-quality videos and varied action spaces, uncovering current models' struggles with long-term memory and action generalization.

🔹 Publication Date: Published on Feb 8

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2602.08025
• PDF: https://arxiv.org/pdf/2602.08025
• Project Page: https://csu-jpg.github.io/MIND.github.io/
• Github: https://github.com/CSU-JPG/MIND

🔹 Datasets citing this paper:
https://huggingface.co/datasets/CSU-JPG/MIND

==================================

#AI #DataScience #MachineLearning #HuggingFace #Research
HLE-Verified: A Systematic Verification and Structured Revision of Humanity's Last Exam

📝 Summary:
HLE-Verified systematically validates and revises the HLE benchmark, resolving noisy items through expert review and model-based checks. This improves language-model evaluation accuracy by 7-10 percentage points, especially on erroneous items, enabling more reliable measurement of model capabilities.

🔹 Publication Date: Published on Feb 15

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2602.13964
• PDF: https://arxiv.org/pdf/2602.13964

🔹 Datasets citing this paper:
https://huggingface.co/datasets/skylenage/HLE-Verified

==================================

#LLMEvaluation #Benchmarking #LanguageModels #AIResearch #NLP