ML Research Hub
Advancing research in Machine Learning – practical insights, tools, and techniques for researchers.

Admin: @HusseinSheikho || @Hussein_Sheikho
Terminal Wrench: A Dataset of 331 Reward-Hackable Environments and 3,632 Exploit Trajectories

📝 Summary:
A dataset of 331 terminal-agent environments with 3,632 reward-hacking trajectories and 2,352 legitimate baselines across four AI models is released to study adversarial exploits in system administrat...

🔹 Publication Date: Published on Apr 19

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2604.17596
• PDF: https://arxiv.org/pdf/2604.17596
• Project Page: https://github.com/few-sh/terminal-wrench
• Github: https://github.com/few-sh/terminal-wrench

==================================

For more data science resources:
https://t.iss.one/DataScienceT

#AI #DataScience #MachineLearning #HuggingFace #Research
KWBench: Measuring Unprompted Problem Recognition in Knowledge Work

📝 Summary:
KWBench is a new benchmark for evaluating LLMs' ability to recognize underlying game-theoretic structures in professional scenarios without being prompted to do so. It tests whether models can identify the problem type from raw inputs alone. Current LLMs perform poorly, failing to recognize problems even if they can a...

🔹 Publication Date: Published on Apr 17

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2604.15760
• PDF: https://arxiv.org/pdf/2604.15760
• Project Page: https://kwbench.github.io/
• Github: https://github.com/ankitmaloo/fasteval

==================================

AgentSPEX: An Agent SPecification and EXecution Language

📝 Summary:
AgentSPEX is a domain-specific language and framework for creating structured, modular, and interpretable large language model agent workflows with explicit control flow and state management.

🔹 Publication Date: Published on Apr 14

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2604.13346
• PDF: https://arxiv.org/pdf/2604.13346
• Project Page: https://agentspex.ai/
• Github: https://github.com/ScaleML/AgentSPEX

==================================

CoInteract: Physically-Consistent Human-Object Interaction Video Synthesis via Spatially-Structured Co-Generation

📝 Summary:
CoInteract presents an end-to-end framework for human-object interaction video synthesis using a Diffusion Transformer backbone with specialized modules for structural stability and physical plausibil...

🔹 Publication Date: Published on Apr 21

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2604.19636
• PDF: https://arxiv.org/pdf/2604.19636
• Project Page: https://xinxiaozhe12345.github.io/CoInteract_Project/
• Github: https://github.com/luoxyhappy/CoInteract

🔹 Models citing this paper:
https://huggingface.co/georgexin/cointeract

==================================

PlayCoder: Making LLM-Generated GUI Code Playable

📝 Summary:
Large language models struggle to generate logically correct GUI applications, prompting the development of the PlayEval benchmark and the PlayCoder framework, which uses multi-agent approaches to improve funct...

🔹 Publication Date: Published on Apr 21

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2604.19742
• PDF: https://arxiv.org/pdf/2604.19742
• Github: https://github.com/Tencent/PlayCoder

==================================

Dual-View Training for Instruction-Following Information Retrieval

📝 Summary:
A dual-view data synthesis approach using polarity reversal enhances retrieval systems' ability to follow instructions by training models to distinguish between topic-relevant and instruction-complian...

🔹 Publication Date: Published on Apr 20

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2604.18845
• PDF: https://arxiv.org/pdf/2604.18845

==================================

Code-Switching Information Retrieval: Benchmarks, Analysis, and the Limits of Current Retrievers

📝 Summary:
Code-switching poses significant challenges for information retrieval systems, revealing performance bottlenecks and embedding space divergences that current multilingual approaches cannot fully addre...

🔹 Publication Date: Published on Apr 19

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2604.17632
• PDF: https://arxiv.org/pdf/2604.17632
• Github: https://github.com/paddler2022/Code-Switching-Information-Retrieval

==================================

LoopCTR: Unlocking the Loop Scaling Power for Click-Through Rate Prediction

📝 Summary:
LoopCTR introduces a loop scaling paradigm for CTR models that increases training computation through recursive layer reuse while maintaining efficient inference, achieving state-of-the-art performanc...

🔹 Publication Date: Published on Apr 21

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2604.19550
• PDF: https://arxiv.org/pdf/2604.19550

==================================

Target-Oriented Pretraining Data Selection via Neuron-Activated Graph

📝 Summary:
A novel target-oriented language model pretraining framework uses neuron activation graphs to select informative data without additional training, demonstrating superior performance across multiple be...

🔹 Publication Date: Published on Apr 17

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2604.15706
• PDF: https://arxiv.org/pdf/2604.15706
• Project Page: https://asillycat.github.io/NAG-website/
• Github: https://github.com/asillycat/NAG

==================================

Tstars-Tryon 1.0: Robust and Realistic Virtual Try-On for Diverse Fashion Items

📝 Summary:
A commercial-scale virtual try-on system achieves high success rates, photorealistic results, and real-time performance through integrated system design and multi-stage training.

🔹 Publication Date: Published on Apr 21

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2604.19748
• PDF: https://arxiv.org/pdf/2604.19748
• Project Page: https://mpage.taobao.com/hd/download.html

==================================

MM-JudgeBias: A Benchmark for Evaluating Compositional Biases in MLLM-as-a-Judge

📝 Summary:
Research identifies systematic biases in multimodal large language models used as automatic evaluators, revealing reliability issues and proposing a benchmark for measuring compositional bias through ...

🔹 Publication Date: Published on Apr 20

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2604.18164
• PDF: https://arxiv.org/pdf/2604.18164
• Project Page: https://mm-judgebias.github.io/
• Github: https://github.com/naver-ai/MM-JudgeBias

==================================

ClawNet: Human-Symbiotic Agent Network for Cross-User Autonomous Cooperation

📝 Summary:
AI agents must evolve beyond individual task automation to enable secure, governed collaboration among multiple users through a human-symbiotic paradigm with identity-based governance mechanisms.

🔹 Publication Date: Published on Apr 21

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2604.19211
• PDF: https://arxiv.org/pdf/2604.19211
• Project Page: https://www.clawnet.hk/

==================================

UniMesh: Unifying 3D Mesh Understanding and Generation

📝 Summary:
UniMesh presents a unified framework that combines 3D generation and understanding tasks through novel components including a Mesh Head, Chain of Mesh for iterative editing, and a self-reflection mech...

🔹 Publication Date: Published on Apr 19

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2604.17472
• PDF: https://arxiv.org/pdf/2604.17472
• Project Page: https://aigeeksgroup.github.io/UniMesh/
• Github: https://github.com/AIGeeksGroup/UniMesh

==================================

Evaluation-driven Scaling for Scientific Discovery

📝 Summary:
The SimpleTES framework scales evaluation-driven discovery loops for scientific problems, achieving state-of-the-art results across multiple domains through parallel exploration and feedback-driven refine...

🔹 Publication Date: Published on Apr 21

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2604.19341
• PDF: https://arxiv.org/pdf/2604.19341
• Project Page: https://www.wizardquant.com/will/simpletes

==================================

SPRITE: From Static Mockups to Engine-Ready Game UI

📝 Summary:
SPRITE enables automated conversion of game UI screenshots into editable engine assets by combining vision-language models with structured YAML representation to handle complex layouts and nesting.

🔹 Publication Date: Published on Mar 18

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2604.18591
• PDF: https://arxiv.org/pdf/2604.18591
• Project Page: https://baiyunshu.github.io/sprite.github.io/

==================================

Chat2Workflow: A Benchmark for Generating Executable Visual Workflows with Natural Language

📝 Summary:
Chat2Workflow presents a benchmark and agentic framework for automating executable visual workflow generation from natural language, revealing significant challenges in achieving industrial-grade auto...

🔹 Publication Date: Published on Apr 21

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2604.19667
• PDF: https://arxiv.org/pdf/2604.19667
• Github: https://github.com/zjunlp/Chat2Workflow

==================================

Speculative Decoding for Autoregressive Video Generation

📝 Summary:
Speculative decoding is adapted to autoregressive video diffusion through a quality-based routing mechanism that maintains high visual quality while achieving significant speedup.

🔹 Publication Date: Published on Apr 19

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2604.17397
• PDF: https://arxiv.org/pdf/2604.17397

==================================

Contrastive Attribution in the Wild: An Interpretability Analysis of LLM Failures on Realistic Benchmarks

📝 Summary:
Contrastive attribution methods for analyzing large language model failures show mixed effectiveness across different benchmarks and model sizes. Interpretability tools are increa...

🔹 Publication Date: Published on Apr 20

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2604.17761
• PDF: https://arxiv.org/pdf/2604.17761
• Project Page: https://jzxycsjzy.github.io/Debug-XAI/
• Github: https://github.com/microsoft/Debug-XAI

==================================

TEMPO: Scaling Test-time Training for Large Reasoning Models

📝 Summary:
TEMPO is a test-time training framework that alternates policy refinement with critic recalibration to sustain performance improvements in language models without diversity collapse.

🔹 Publication Date: Published on Apr 21

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2604.19295
• PDF: https://arxiv.org/pdf/2604.19295
• Project Page: https://qingyangzhang.github.io/tempo-homepage
• Github: https://github.com/QingyangZhang/TEMPO

==================================

SmartPhotoCrafter: Unified Reasoning, Generation and Optimization for Automatic Photographic Image Editing

📝 Summary:
SmartPhotoCrafter automates photographic image editing by combining image quality comprehension with targeted enhancement, using a reasoning-to-generation approach that eliminates the need for explici...

🔹 Publication Date: Published on Apr 21

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2604.19587
• PDF: https://arxiv.org/pdf/2604.19587

==================================

AnyRecon: Arbitrary-View 3D Reconstruction with Video Diffusion Model

📝 Summary:
AnyRecon enables scalable 3D reconstruction from arbitrary sparse inputs using diffusion models with persistent scene memory and geometry-aware conditioning for improved geometric consistency.

🔹 Publication Date: Published on Apr 21

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2604.19747
• PDF: https://arxiv.org/pdf/2604.19747

🔹 Models citing this paper:
https://huggingface.co/Yutian10/AnyRecon

==================================
