ML Research Hub
32.3K subscribers
6.68K photos
464 videos
24 files
7.27K links
Advancing research in Machine Learning – practical insights, tools, and techniques for researchers.

Admin: @HusseinSheikho || @Hussein_Sheikho
On the Reliability of Computer Use Agents

📝 Summary:
Computer-use agents exhibit unreliable performance due to execution stochasticity, task specification ambiguity, and behavioral variability, necessitating repeated evaluation and stable strategies for...
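The repeated-evaluation protocol the summary alludes to can be sketched in a few lines: run the same agent task several times and report the mean success rate with its spread, rather than trusting a single rollout. The `run_task` callable below is a hypothetical stand-in for one agent rollout, not the paper's actual harness.

```python
import statistics

def evaluate_repeatedly(run_task, n_trials=10):
    """Run a stochastic agent task n_trials times and summarize pass/fail outcomes."""
    outcomes = [1.0 if run_task() else 0.0 for _ in range(n_trials)]
    mean = statistics.mean(outcomes)
    # Sample std needs at least two trials; it is 0 when all trials agree.
    std = statistics.stdev(outcomes) if n_trials > 1 else 0.0
    return {"success_rate": mean, "std": std, "n_trials": n_trials}

# Deterministic stand-in for an agent rollout; real rollouts are stochastic.
result = evaluate_repeatedly(lambda: True, n_trials=5)
```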

🔹 Publication Date: Published on Apr 20

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2604.17849
• PDF: https://arxiv.org/pdf/2604.17849
• Github: https://github.com/simular-ai/cua_reliability

==================================

For more data science resources:
https://t.iss.one/DataScienceT

#AI #DataScience #MachineLearning #HuggingFace #Research
The Illusion of Certainty: Decoupling Capability and Calibration in On-Policy Distillation

📝 Summary:
On-policy distillation suffers from miscalibration due to information mismatch between training and deployment contexts, which is addressed through a calibration-aware framework that improves both per...
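Miscalibration of the kind the summary describes is commonly measured with expected calibration error (ECE): the confidence-vs-accuracy gap averaged over confidence bins. The paper's own framework is not reproduced here; this is only a sketch of the standard binned ECE metric such work typically reports.

```python
def expected_calibration_error(confidences, correct, n_bins=10):
    """Binned ECE: mean |confidence - accuracy| gap, weighted by bin population."""
    n = len(confidences)
    ece = 0.0
    for b in range(n_bins):
        lo, hi = b / n_bins, (b + 1) / n_bins
        idx = [i for i, c in enumerate(confidences) if lo < c <= hi]
        if not idx:
            continue  # empty bins contribute nothing
        avg_conf = sum(confidences[i] for i in idx) / len(idx)
        acc = sum(correct[i] for i in idx) / len(idx)
        ece += (len(idx) / n) * abs(avg_conf - acc)
    return ece

# Toy case: two bins, each slightly miscalibrated by 0.05.
ece = expected_calibration_error([0.95, 0.95, 0.55, 0.55], [1, 1, 1, 0])
```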

🔹 Publication Date: Published on Apr 18

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2604.16830
• PDF: https://arxiv.org/pdf/2604.16830
• Github: https://github.com/SalesforceAIResearch/CaOPD

==================================

Forge-UGC: FX optimization and register-graph engine for universal graph compiler

📝 Summary:
Forge-UGC is a four-phase compiler for efficient transformer deployment on heterogeneous hardware, offering faster compilation, reduced inference latency, and lower energy consumption compared to exis...

🔹 Publication Date: Published on Apr 14

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2604.16498
• PDF: https://arxiv.org/pdf/2604.16498

==================================

Protecting Language Models Against Unauthorized Distillation through Trace Rewriting

📝 Summary:
This paper presents techniques for modifying teacher-generated reasoning traces that prevent unauthorized knowledge distillation while maintaining answer correctness and enabling detectable watermarks. AI-gen...

🔹 Publication Date: Published on Apr 16

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2602.15143
• PDF: https://arxiv.org/pdf/2602.15143
• Github: https://github.com/xhOwenMa/trace-rewriting

==================================

Symbolic Guardrails for Domain-Specific Agents: Stronger Safety and Security Guarantees Without Sacrificing Utility

📝 Summary:
Symbolic guardrails provide strong safety and security guarantees for AI agents in high-stakes environments. A study found these guardrails can enforce 74% of specified policy requirements, improving safety without sacrificing utility. This makes them a practical solution for domain-specific agents.
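In general terms, a symbolic guardrail is a set of declarative policy rules checked against an agent's proposed action before it executes. The sketch below is purely illustrative (rule names, fields, and the deny-list structure are all invented for the example, not taken from the paper's system):

```python
# Declarative policy: each rule names a condition under which an action is denied.
POLICY = [
    {"name": "no-delete", "deny_if": lambda a: a["verb"] == "delete"},
    {"name": "scope-limit", "deny_if": lambda a: not a["path"].startswith("/sandbox/")},
]

def check_action(action):
    """Return the names of violated rules; an empty list means the action may run."""
    return [rule["name"] for rule in POLICY if rule["deny_if"](action)]

violations = check_action({"verb": "delete", "path": "/etc/passwd"})
```

Because the rules are symbolic rather than learned, each denial comes with an auditable reason, which is the kind of guarantee the summary refers to.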

🔹 Publication Date: Published on Apr 16

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2604.15579
• PDF: https://arxiv.org/pdf/2604.15579
• Github: https://github.com/hyn0027/agent-symbolic-guardrails

Datasets citing this paper:
https://huggingface.co/datasets/hyn0027D/agent-symbolic-guardrails

==================================

When Background Matters: Breaking Medical Vision Language Models by Transferable Attack

📝 Summary:
MedFocusLeak enables transferable black-box attacks on vision-language models for medical imaging by injecting imperceptible perturbations that redirect model attention, demonstrating significant vuln...

🔹 Publication Date: Published on Apr 19

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2604.17318
• PDF: https://arxiv.org/pdf/2604.17318
• Project Page: https://akashghosh.github.io/MedFocusLeakACL/
• Github: https://github.com/AkashGhosh/When-Background-Matters-Breaking-Medical-Vision-Language-Models-by-Transferable-Attack

==================================

MARCO: Navigating the Unseen Space of Semantic Correspondence

📝 Summary:
MARCO is a compact, fast model for semantic correspondence that excels at generalizing to unseen keypoints. Its coarse-to-fine objective and self-distillation framework improve fine-grained localization and overall accuracy.

🔹 Publication Date: Published on Apr 20

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2604.18267
• PDF: https://arxiv.org/pdf/2604.18267
• Project Page: https://visinf.github.io/MARCO
• Github: https://github.com/visinf/MARCO

==================================

River-LLM: Large Language Model Seamless Exit Based on KV Share

📝 Summary:
River-LLM enables efficient token-level early exit in LLMs by introducing a KV-Shared Exit River. This mechanism naturally generates and preserves the missing historical states, overcoming the KV Cache Absence problem. It achieves a 1.71x to 2.16x practical speedup while maintaining high generation...
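Token-level early exit, in its generic form, means halting the layer stack once an intermediate state is confident enough. The toy loop below illustrates only that generic idea; the paper's actual contribution is managing the KV-cache entries that skipped layers never produce, which this sketch deliberately does not model (layers and the confidence function are stand-ins).

```python
def forward_with_early_exit(layers, hidden, confidence_fn, threshold=0.9):
    """Apply layers in order; stop as soon as an intermediate prediction
    is confident enough. Returns the final state and the depth reached."""
    for depth, layer in enumerate(layers, start=1):
        hidden = layer(hidden)
        if confidence_fn(hidden) >= threshold:
            return hidden, depth  # early exit: remaining layers are skipped
    return hidden, len(layers)

# Toy layers that each increment a counter; confidence grows with depth.
layers = [lambda h: h + 1] * 6
hidden, depth = forward_with_early_exit(layers, 0, confidence_fn=lambda h: h / 4)
```

The speedup comes from the skipped layers: here the loop exits at depth 4 of 6 once confidence crosses the threshold.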

🔹 Publication Date: Published on Apr 20

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2604.18396
• PDF: https://arxiv.org/pdf/2604.18396

==================================

Terminal Wrench: A Dataset of 331 Reward-Hackable Environments and 3,632 Exploit Trajectories

📝 Summary:
A dataset of 331 terminal-agent environments with 3,632 reward-hacking trajectories and 2,352 legitimate baselines across four AI models is released to study adversarial exploits in system administrat...

🔹 Publication Date: Published on Apr 19

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2604.17596
• PDF: https://arxiv.org/pdf/2604.17596
• Github: https://github.com/few-sh/terminal-wrench

==================================

KWBench: Measuring Unprompted Problem Recognition in Knowledge Work

📝 Summary:
KWBench is a new benchmark for evaluating LLMs' ability to recognize underlying game-theoretic structures in professional scenarios without explicit prompting. It tests whether models can identify the problem type from raw inputs alone. Current LLMs perform poorly, failing to recognize problems even if they can a...

🔹 Publication Date: Published on Apr 17

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2604.15760
• PDF: https://arxiv.org/pdf/2604.15760
• Project Page: https://kwbench.github.io/
• Github: https://github.com/ankitmaloo/fasteval

==================================
