ML Research Hub
32.9K subscribers
5.36K photos
332 videos
24 files
5.79K links
Advancing research in Machine Learning – practical insights, tools, and techniques for researchers.

Admin: @HusseinSheikho || @Hussein_Sheikho
Download Telegram
Surprisal-Guided Selection: Compute-Optimal Test-Time Strategies for Execution-Grounded Code Generation

📝 Summary:
Test-time training fails in verification-grounded tasks due to over-sharpening, while surprisal-guided selection improves performance by favoring diverse, low-confidence samples. AI-generated summary ...

🔹 Publication Date: Published on Feb 7

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2602.07670
• PDF: https://arxiv.org/pdf/2602.07670
• Project Page: https://jbarnes850.github.io/2026/02/02/surprisal-guided-selection/
• Github: https://jbarnes850.github.io/2026/02/02/surprisal-guided-selection/

🔹 Models citing this paper:
https://huggingface.co/Jarrodbarnes/KernelBench-RLVR-120b

==================================

For more data science resources:
https://t.iss.one/DataScienceT

#AI #DataScience #MachineLearning #HuggingFace #Research
Effective Reasoning Chains Reduce Intrinsic Dimensionality

📝 Summary:
Effective chain-of-thought reasoning strategies reduce intrinsic dimensionality, leading to better generalization by requiring fewer model parameters to achieve given accuracy thresholds. AI-generated...

🔹 Publication Date: Published on Feb 9

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2602.09276
• PDF: https://arxiv.org/pdf/2602.09276

==================================

For more data science resources:
https://t.iss.one/DataScienceT

#AI #DataScience #MachineLearning #HuggingFace #Research
ContextBench: A Benchmark for Context Retrieval in Coding Agents

📝 Summary:
ContextBench evaluates context retrieval in coding agents through detailed process analysis, revealing that advanced agent designs provide limited improvements in context usage while highlighting gaps...

🔹 Publication Date: Published on Feb 5

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2602.05892
• PDF: https://arxiv.org/pdf/2602.05892
• Project Page: https://contextbench.github.io/
• Github: https://github.com/EuniAI/ContextBench

==================================

For more data science resources:
https://t.iss.one/DataScienceT

#AI #DataScience #MachineLearning #HuggingFace #Research
Secure Code Generation via Online Reinforcement Learning with Vulnerability Reward Model

📝 Summary:
SecCoderX uses online reinforcement learning to align large language models for secure code generation while preserving functionality, addressing the functionality-security trade-off through vulnerabi...

🔹 Publication Date: Published on Feb 7

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2602.07422
• PDF: https://arxiv.org/pdf/2602.07422
• Github: https://github.com/AndrewWTY/SecCoderX

==================================

For more data science resources:
https://t.iss.one/DataScienceT

#AI #DataScience #MachineLearning #HuggingFace #Research
1
This media is not supported in your browser
VIEW IN TELEGRAM
Contact-Anchored Policies: Contact Conditioning Creates Strong Robot Utility Models

📝 Summary:
Contact-Anchored Policies CAP replace language conditioning with physical contact points, using modular utility models for robust robot manipulation. CAP achieves superior zero-shot performance with minimal demonstration data, outperforming large VLAs by 56%.

🔹 Publication Date: Published on Feb 9

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2602.09017
• PDF: https://arxiv.org/pdf/2602.09017
• Project Page: https://cap-policy.github.io/
• Github: https://github.com/jeffacce/cap-policy

==================================

For more data science resources:
https://t.iss.one/DataScienceT

#AI #DataScience #MachineLearning #HuggingFace #Research
iGRPO: Self-Feedback-Driven LLM Reasoning

📝 Summary:
iGRPO enhances LLM mathematical reasoning using a two-stage, self-feedback process. It first drafts solutions, selects the best, and then refines based on that best draft. This iterative approach significantly improves performance and achieves state-of-the-art results on math benchmarks.

🔹 Publication Date: Published on Feb 9

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2602.09000
• PDF: https://arxiv.org/pdf/2602.09000

==================================

For more data science resources:
https://t.iss.one/DataScienceT

#AI #DataScience #MachineLearning #HuggingFace #Research
👍1
CausalArmor: Efficient Indirect Prompt Injection Guardrails via Causal Attribution

📝 Summary:
CausalArmor is a selective defense against Indirect Prompt Injection in AI agents. It uses causal ablation to detect when untrusted content dominates an agents privileged actions, triggering targeted sanitization only then. This improves security utility and latency over always-on defenses.

🔹 Publication Date: Published on Feb 8

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2602.07918
• PDF: https://arxiv.org/pdf/2602.07918

==================================

For more data science resources:
https://t.iss.one/DataScienceT

#AI #DataScience #MachineLearning #HuggingFace #Research
1
AgentSys: Secure and Dynamic LLM Agents Through Explicit Hierarchical Memory Management

📝 Summary:
AgentSys defends against indirect prompt injection in LLM agents through hierarchical memory isolation and controlled data flow, significantly reducing attack success rates while maintaining performan...

🔹 Publication Date: Published on Feb 7

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2602.07398
• PDF: https://arxiv.org/pdf/2602.07398
• Github: https://github.com/ruoyaow/agentsys-memory

==================================

For more data science resources:
https://t.iss.one/DataScienceT

#AI #DataScience #MachineLearning #HuggingFace #Research
Locas: Your Models are Principled Initializers of Locally-Supported Parametric Memories

📝 Summary:
Locas, a locally-supported parametric memory mechanism, enables flexible integration with transformer models for continual learning while minimizing catastrophic forgetting through principled initiali...

🔹 Publication Date: Published on Feb 4

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2602.05085
• PDF: https://arxiv.org/pdf/2602.05085

==================================

For more data science resources:
https://t.iss.one/DataScienceT

#AI #DataScience #MachineLearning #HuggingFace #Research
SceneSmith: Agentic Generation of Simulation-Ready Indoor Scenes

📝 Summary:
SceneSmith is a hierarchical agentic framework that generates simulation-ready indoor environments from natural language prompts through multiple stages involving VLM agents and integrated asset gener...

🔹 Publication Date: Published on Feb 9

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2602.09153
• PDF: https://arxiv.org/pdf/2602.09153
• Project Page: https://scenesmith.github.io/
• Github: https://github.com/nepfaff/scenesmith

Datasets citing this paper:
https://huggingface.co/datasets/nepfaff/scenesmith-example-scenes

==================================

For more data science resources:
https://t.iss.one/DataScienceT

#AI #DataScience #MachineLearning #HuggingFace #Research
This media is not supported in your browser
VIEW IN TELEGRAM
MIND: Benchmarking Memory Consistency and Action Control in World Models

📝 Summary:
MIND is the first open-domain, closed-loop benchmark for evaluating world model abilities like memory consistency and action control. It uses high-quality videos and various action spaces, uncovering current models struggles with long-term memory and action generalization.

🔹 Publication Date: Published on Feb 8

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2602.08025
• PDF: https://arxiv.org/pdf/2602.08025
• Project Page: https://csu-jpg.github.io/MIND.github.io/
• Github: https://github.com/CSU-JPG/MIND

Datasets citing this paper:
https://huggingface.co/datasets/CSU-JPG/MIND

==================================

For more data science resources:
https://t.iss.one/DataScienceT

#AI #DataScience #MachineLearning #HuggingFace #Research
Stroke3D: Lifting 2D strokes into rigged 3D model via latent diffusion models

📝 Summary:
Stroke3D generates rigged 3D meshes from 2D strokes and text prompts through a two-stage pipeline combining controllable skeleton generation with enhanced mesh synthesis. AI-generated summary Rigged 3...

🔹 Publication Date: Published on Feb 10

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2602.09713
• PDF: https://arxiv.org/pdf/2602.09713
• Project Page: https://whalesong-zrs.github.io/Stroke3D_project_page/
• Github: https://github.com/Whalesong-zrs/Stroke3D

==================================

For more data science resources:
https://t.iss.one/DataScienceT

#AI #DataScience #MachineLearning #HuggingFace #Research
G-LNS: Generative Large Neighborhood Search for LLM-Based Automatic Heuristic Design

📝 Summary:
A generative evolutionary framework extends large language models for automated design of large neighborhood search operators in combinatorial optimization problems. AI-generated summary While Large L...

🔹 Publication Date: Published on Feb 9

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2602.08253
• PDF: https://arxiv.org/pdf/2602.08253
• Project Page: https://zboyn.github.io/G-LNS/
• Github: https://github.com/ZBoyn/G-LNS

==================================

For more data science resources:
https://t.iss.one/DataScienceT

#AI #DataScience #MachineLearning #HuggingFace #Research
Internalizing Meta-Experience into Memory for Guided Reinforcement Learning in Large Language Models

📝 Summary:
Meta-Experience Learning enhances LLM reasoning by incorporating self-distilled error representations into parametric memory through contrastive trajectory analysis and language-modeled reward signals...

🔹 Publication Date: Published on Feb 10

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2602.10224
• PDF: https://arxiv.org/pdf/2602.10224

==================================

For more data science resources:
https://t.iss.one/DataScienceT

#AI #DataScience #MachineLearning #HuggingFace #Research
Step 3.5 Flash: Open Frontier-Level Intelligence with 11B Active Parameters

📝 Summary:
Step 3.5 Flash is a sparse Mixture-of-Experts model that achieves frontier-level agentic intelligence through efficient parameter utilization and optimized attention mechanisms, demonstrating strong p...

🔹 Publication Date: Published on Feb 11

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2602.10604
• PDF: https://arxiv.org/pdf/2602.10604
• Github: https://github.com/stepfun-ai/Step-3.5-Flash

==================================

For more data science resources:
https://t.iss.one/DataScienceT

#AI #DataScience #MachineLearning #HuggingFace #Research
When the Prompt Becomes Visual: Vision-Centric Jailbreak Attacks for Large Image Editing Models

📝 Summary:
Visual-to-visual jailbreak attacks compromise image editing models through malicious visual inputs, necessitating new safety benchmarks and defense mechanisms. AI-generated summary Recent advances in ...

🔹 Publication Date: Published on Feb 10

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2602.10179
• PDF: https://arxiv.org/pdf/2602.10179
• Github: https://csu-jpg.github.io/vja.github.io/

==================================

For more data science resources:
https://t.iss.one/DataScienceT

#AI #DataScience #MachineLearning #HuggingFace #Research
ArcFlow: Unleashing 2-Step Text-to-Image Generation via High-Precision Non-Linear Flow Distillation

📝 Summary:
ArcFlow is a few-step distillation framework that uses non-linear flow trajectories to approximate teacher diffusion models, achieving fast inference with minimal quality loss through lightweight adap...

🔹 Publication Date: Published on Feb 9

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2602.09014
• PDF: https://arxiv.org/pdf/2602.09014
• Github: https://github.com/pnotp/ArcFlow

🔹 Models citing this paper:
https://huggingface.co/ymyy307/ArcFlow

==================================

For more data science resources:
https://t.iss.one/DataScienceT

#AI #DataScience #MachineLearning #HuggingFace #Research
When Actions Go Off-Task: Detecting and Correcting Misaligned Actions in Computer-Use Agents

📝 Summary:
Computer-use agents face safety risks from misaligned actions caused by external attacks or internal limitations, prompting the development of DeAction, a guardrail that detects and corrects such acti...

🔹 Publication Date: Published on Feb 9

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2602.08995
• PDF: https://arxiv.org/pdf/2602.08995
• Project Page: https://osu-nlp-group.github.io/Misaligned-Action-Detection/
• Github: https://github.com/OSU-NLP-Group/Misaligned-Action-Detection

Datasets citing this paper:
https://huggingface.co/datasets/osunlp/MisActBench

==================================

For more data science resources:
https://t.iss.one/DataScienceT

#AI #DataScience #MachineLearning #HuggingFace #Research
AgenticPay: A Multi-Agent LLM Negotiation System for Buyer-Seller Transactions

📝 Summary:
AgenticPay presents a benchmark and simulation framework for evaluating multi-agent language-mediated economic interactions, focusing on negotiation performance and strategic reasoning challenges in c...

🔹 Publication Date: Published on Feb 5

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2602.06008
• PDF: https://arxiv.org/pdf/2602.06008
• Project Page: https://agenticpay-tutorial.readthedocs.io/en/latest/
• Github: https://github.com/SafeRL-Lab/AgenticPay

==================================

For more data science resources:
https://t.iss.one/DataScienceT

#AI #DataScience #MachineLearning #HuggingFace #Research
When to Memorize and When to Stop: Gated Recurrent Memory for Long-Context Reasoning

📝 Summary:
GRU-Mem addresses long-context reasoning challenges in LLMs by incorporating text-controlled gates and reinforcement learning rewards to stabilize memory updates and improve computational efficiency. ...

🔹 Publication Date: Published on Feb 11

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2602.10560
• PDF: https://arxiv.org/pdf/2602.10560
• Project Page: https://alphalab-ustc.github.io/grumem-alphalab/

==================================

For more data science resources:
https://t.iss.one/DataScienceT

#AI #DataScience #MachineLearning #HuggingFace #Research