ML Research Hub
Advancing research in Machine Learning – practical insights, tools, and techniques for researchers.

Admin: @HusseinSheikho || @Hussein_Sheikho
ViGoR-Bench: How Far Are Visual Generative Models From Zero-Shot Visual Reasoners?

📝 Summary:
The ViGoR benchmark addresses limitations in current AIGC evaluation by introducing a comprehensive framework for assessing visual generative reasoning across multiple modalities and cognitive dimensions....

🔹 Publication Date: Published on Mar 26

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.25823
• PDF: https://arxiv.org/pdf/2603.25823
• Project Page: https://vincenthancoder.github.io/ViGoR-Bench/
• Github: https://github.com/VincentHancoder/ViGoR-Bench-Eval

==================================

For more data science resources:
https://t.iss.one/DataScienceT

#AI #DataScience #MachineLearning #HuggingFace #Research
Embarrassingly Simple Self-Distillation Improves Code Generation

📝 Summary:
Simple self-distillation improves code generation in large language models by fine-tuning on model-generated samples, effectively addressing precision-exploration trade-offs in decoding. AI-generated ...
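The recipe described here is a generate–filter–train loop: sample several candidates per prompt (exploration), keep only those that pass verification (precision), then fine-tune on the survivors. A minimal runnable sketch with toy stand-ins for the model, checker, and trainer (these are illustrative assumptions, not the paper's actual components):

```python
import random

def self_distill(sample_fn, check_fn, train_fn, prompts, n_samples=8):
    """One self-distillation round: sample candidate solutions, keep those
    that pass verification, fine-tune on the kept (prompt, sample) pairs."""
    kept = []
    for prompt in prompts:
        # Exploration: draw several candidate completions per prompt.
        candidates = [sample_fn(prompt) for _ in range(n_samples)]
        # Precision: retain only candidates that pass the checker.
        kept += [(prompt, c) for c in candidates if check_fn(prompt, c)]
    return train_fn(kept)

# Toy stand-ins: a "model" that guesses answers, an exact-match checker,
# and a "trainer" that just reports how many pairs it would train on.
random.seed(0)
n_trained = self_distill(
    sample_fn=lambda p: random.choice(["4", "5", "22"]),
    check_fn=lambda p, c: c == "4",
    train_fn=len,
    prompts=["What is 2+2?"],
)
```

In the real setting, `check_fn` would be unit-test execution and `train_fn` a supervised fine-tuning step on the model's own verified outputs.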

🔹 Publication Date: Published on Apr 1

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2604.01193
• PDF: https://arxiv.org/pdf/2604.01193
• Github: https://github.com/apple/ml-ssd

==================================


#AI #DataScience #MachineLearning #HuggingFace #Research
Revision or Re-Solving? Decomposing Second-Pass Gains in Multi-LLM Pipelines

📝 Summary:
The effectiveness of multi-LLM revision pipelines varies with task structure and draft quality; gains decompose into re-solving, scaffold, and content components rather than representing uniform error ...

🔹 Publication Date: Published on Apr 1

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2604.01029
• PDF: https://arxiv.org/pdf/2604.01029

==================================


#AI #DataScience #MachineLearning #HuggingFace #Research
MMaDA-VLA: Large Diffusion Vision-Language-Action Model with Unified Multi-Modal Instruction and Generation

📝 Summary:
A native discrete diffusion framework unifies multi-modal understanding and generation for robotic manipulation, enabling parallel action and visual outcome prediction with improved long-horizon consi...

🔹 Publication Date: Published on Mar 26

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.25406
• PDF: https://arxiv.org/pdf/2603.25406
• Project Page: https://yliu-cs.github.io/MMaDA-VLA
• Github: https://github.com/yliu-cs/MMaDA-VLA

🔹 Models citing this paper:
https://huggingface.co/yliu-cs/MMaDA-VLA

==================================


#AI #DataScience #MachineLearning #HuggingFace #Research
HippoCamp: Benchmarking Contextual Agents on Personal Computers

📝 Summary:
HippoCamp is a new multimodal benchmark evaluating agents on massive personal file management. It exposes significant performance gaps in current models for long-horizon retrieval and cross-modal reasoning in user-centric environments, revealing bottlenecks in multimodal perception.

🔹 Publication Date: Published on Apr 1

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2604.01221
• PDF: https://arxiv.org/pdf/2604.01221
• Project Page: https://hippocamp-ai.github.io/
• Github: https://github.com/Savannah-yz/HippoCamp

==================================


#AI #DataScience #MachineLearning #HuggingFace #Research
MiroEval: Benchmarking Multimodal Deep Research Agents in Process and Outcome

📝 Summary:
MiroEval is a new benchmark for deep research systems, addressing limitations of existing evaluations. It assesses adaptive synthesis, factuality, and process quality across real-user text and multimodal tasks, showing process quality predicts outcomes and multimodal tasks are very challenging.

🔹 Publication Date: Published on Mar 30

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.28407
• PDF: https://arxiv.org/pdf/2603.28407
• Project Page: https://miroeval-ai.github.io/website/
• Github: https://github.com/MiroMindAI/MiroEval

==================================


#AI #DataScience #MachineLearning #HuggingFace #Research
QuitoBench: A High-Quality Open Time Series Forecasting Benchmark

📝 Summary:
QuitoBench addresses the lack of large-scale time series benchmarks by introducing a regime-balanced dataset with eight TSF regimes, revealing that foundation models outperform deep learning at long c...

🔹 Publication Date: Published on Mar 27

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.26017
• PDF: https://arxiv.org/pdf/2603.26017

🔹 Datasets citing this paper:
https://huggingface.co/datasets/hq-bench/quitobench
https://huggingface.co/datasets/hq-bench/quito-corpus

==================================


#TimeSeriesForecasting #DataScience #MachineLearning #AI #QuitoBench
Vision2Web: A Hierarchical Benchmark for Visual Website Development with Agent Verification

📝 Summary:
Vision2Web presents a comprehensive benchmark for visual website development tasks and evaluates coding agents across static UI generation, interactive frontend reproduction, and full-stack developmen...

🔹 Publication Date: Published on Mar 27

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.26648
• PDF: https://arxiv.org/pdf/2603.26648
• Project Page: https://vision2web-bench.github.io/
• Github: https://github.com/zai-org/Vision2Web

🔹 Datasets citing this paper:
https://huggingface.co/datasets/zai-org/Vision2Web

==================================


#AI #DataScience #MachineLearning #HuggingFace #Research
Paper Reconstruction Evaluation: Evaluating Presentation and Hallucination in AI-written Papers

📝 Summary:
A systematic evaluation framework called PaperRecon is proposed to assess AI-generated papers by separating quality assessment into presentation and hallucination dimensions using a benchmark of 51 re...

🔹 Publication Date: Published on Apr 1

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2604.01128
• PDF: https://arxiv.org/pdf/2604.01128
• Project Page: https://agent4science-utokyo.github.io/PaperRecon_HP/
• Github: https://github.com/Agent4Science-UTokyo/PaperRecon

==================================


#AI #DataScience #MachineLearning #HuggingFace #Research
Proactive Agent Research Environment: Simulating Active Users to Evaluate Proactive Assistants

📝 Summary:
A framework for proactive agent research is introduced that models applications as finite state machines to enable realistic user simulation and task execution across multiple digital environments. AI...

🔹 Publication Date: Published on Apr 1

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2604.00842
• PDF: https://arxiv.org/pdf/2604.00842
• Github: https://github.com/deepakn97/pare

==================================


#AI #DataScience #MachineLearning #HuggingFace #Research
Benchmarking and Mechanistic Analysis of Vision-Language Models for Cross-Depiction Assembly Instruction Alignment

📝 Summary:
Vision Language Models struggle with aligning assembly diagrams and video feeds due to a depiction gap, with findings indicating visual encoding as the primary target for improving cross-depiction rob...

🔹 Publication Date: Published on Apr 1

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2604.00913
• PDF: https://arxiv.org/pdf/2604.00913
• Project Page: https://ryenhails.github.io/IKEA-Bench/

==================================


#AI #DataScience #MachineLearning #HuggingFace #Research
Understand and Accelerate Memory Processing Pipeline for Disaggregated LLM Inference

📝 Summary:
LLM inference faces significant memory processing overhead. This paper proposes using heterogeneous GPU-FPGA systems to accelerate these operations by offloading memory-bounded tasks to FPGAs. This achieves 1.04-2.2x speedup and 1.11-4.7x energy savings over GPU baselines, proving heterogeneous s...

🔹 Publication Date: Published on Mar 30

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.29002
• PDF: https://arxiv.org/pdf/2603.29002

==================================


#LLMInference #FPGA #HeterogeneousComputing #HardwareAcceleration #SystemArchitecture
UniMixer: A Unified Architecture for Scaling Laws in Recommendation Systems

📝 Summary:
UniMixer is a unified architecture for recommendation systems that improves scaling efficiency. It uses a generalized parameterized token mixing module to optimize mixing patterns and connect attention, TokenMixer, and factorization-machine methods. A lightweight version boosts performance further.
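The idea of a generalized parameterized token-mixing module can be illustrated in a few lines: attention and MLP-style TokenMixers both amount to multiplying token representations by a (seq × seq) mixing matrix, differing only in whether that matrix is static or input-dependent. A toy NumPy sketch of that connection (an illustration, not the paper's architecture):

```python
import numpy as np

def token_mix(x, mix_matrix):
    """Generalized token mixing: blend token representations with a
    parameterized (seq, seq) mixing matrix along the token axis."""
    return mix_matrix @ x  # (seq, seq) @ (seq, dim) -> (seq, dim)

rng = np.random.default_rng(0)
seq, dim = 4, 8
x = rng.standard_normal((seq, dim))

# TokenMixer / factorization-machine flavour: static learned weights.
static_mix = rng.standard_normal((seq, seq))

# Attention flavour: input-dependent weights via softmax(QK^T / sqrt(d)).
scores = x @ x.T / np.sqrt(dim)
attn_mix = np.exp(scores) / np.exp(scores).sum(axis=-1, keepdims=True)

y_static = token_mix(x, static_mix)
y_attn = token_mix(x, attn_mix)
```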

🔹 Publication Date: Published on Apr 1

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2604.00590
• PDF: https://arxiv.org/pdf/2604.00590

==================================


#AI #DataScience #MachineLearning #HuggingFace #Research
ClawKeeper: Comprehensive Safety Protection for OpenClaw Agents Through Skills, Plugins, and Watchers

📝 Summary:
OpenClaw agents face critical security vulnerabilities due to extensive operational privileges. ClawKeeper provides comprehensive real-time protection using skill-based, plugin-based, and novel watcher-based mechanisms for state verification and intervention.

🔹 Publication Date: Published on Mar 25

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.24414
• PDF: https://arxiv.org/pdf/2603.24414
• Project Page: https://huggingface.co/datasets/xunyoyo/clawkeeper
• Github: https://github.com/SafeAI-Lab-X/ClawKeeper

🔹 Datasets citing this paper:
https://huggingface.co/datasets/xunyoyo/clawkeeper

==================================


#AISafety #AgentSecurity #AIagents #Cybersecurity #AIResearch
MemRerank: Preference Memory for Personalized Product Reranking

📝 Summary:
MemRerank improves personalized product reranking by distilling user purchase history into concise preference signals using reinforcement learning. This framework consistently outperforms raw history and other baselines, proving explicit preference memory is effective for e-commerce personalization.
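The core idea — compress raw purchase history into an explicit preference memory, then rerank candidates against it — can be sketched with a simple frequency-based toy standing in for the paper's RL-learned distillation (all names and data here are hypothetical):

```python
from collections import Counter

def distill_preferences(history, k=2):
    """Compress raw purchase history into a compact preference memory:
    here simply the top-k most frequent categories. (The paper learns
    this distillation with reinforcement learning.)"""
    return [cat for cat, _ in Counter(history).most_common(k)]

def rerank(candidates, prefs):
    """Move items whose category matches the preference memory ahead of
    the rest, keeping the original order within each group."""
    return sorted(candidates, key=lambda item: item[1] not in prefs)

history = ["shoes", "shoes", "books", "shoes", "books", "kitchen"]
prefs = distill_preferences(history)
ranked = rerank(
    [("mug", "kitchen"), ("novel", "books"), ("boots", "shoes")], prefs
)
```

The point of the explicit memory is that downstream reranking consumes a short, distilled signal rather than the full raw history.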

🔹 Publication Date: Published on Mar 31

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.29247
• PDF: https://arxiv.org/pdf/2603.29247

==================================


#Personalization #ECommerce #ReinforcementLearning #RecommendationSystems #MachineLearning
Reasoning Shift: How Context Silently Shortens LLM Reasoning

📝 Summary:
LLMs significantly shorten their reasoning traces when problems are presented in various contexts compared to isolation. This compression reduces self-verification, potentially affecting performance on complex tasks. It highlights issues with LLM reasoning robustness and context management.

🔹 Publication Date: Published on Apr 1

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2604.01161
• PDF: https://arxiv.org/pdf/2604.01161

==================================


#LLM #AIReasoning #ContextualAI #AIRobustness #MachineLearning
A Survey of On-Policy Distillation for Large Language Models

📝 Summary:
On-Policy Distillation (OPD) lets LLMs learn from self-generated outputs and teacher feedback, addressing off-policy exposure bias. This survey unifies OPD under an f-divergence framework, organizing methods by feedback type, teacher access, and loss.
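The f-divergence view can be illustrated numerically: forward KL(teacher‖student) is the classic off-policy, mass-covering objective, while reverse KL(student‖teacher) — whose expectation is taken under the student's own samples — is the typical on-policy, mode-seeking choice. A toy computation (illustrative only, not any particular method's loss):

```python
import math

def kl(p, q):
    """KL(p || q) for discrete distributions given as probability lists."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

# Toy teacher/student next-token distributions over a 3-token vocabulary.
teacher = [0.7, 0.2, 0.1]
student = [0.5, 0.3, 0.2]

# Forward KL (teacher-weighted): mass-covering, off-policy flavour.
forward_kl = kl(teacher, student)
# Reverse KL (student-weighted): mode-seeking, on-policy flavour.
reverse_kl = kl(student, teacher)
```

The asymmetry between the two values is exactly what the f-divergence framing makes explicit: each OPD method corresponds to a choice of divergence and of the distribution the expectation is taken under.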

🔹 Publication Date: Published on Apr 1

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2604.00626
• PDF: https://arxiv.org/pdf/2604.00626

==================================


#LLMs #OnPolicyDistillation #ModelDistillation #DeepLearning #MachineLearning
AI Generalisation Gap In Comorbid Sleep Disorder Staging

📝 Summary:
AI sleep staging models trained on healthy subjects perform poorly on stroke patients due to fundamental differences in sleep architecture. This necessitates disease-specific approaches. The paper introduces iSLEEPS, a new stroke dataset, to confirm this generalization gap and highlights the need...

🔹 Publication Date: Published on Mar 24

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.23582
• PDF: https://arxiv.org/pdf/2603.23582
• Project Page: https://himalayansaswatabose.github.io/iSLEEPS_Explainability.github.io/
• Github: https://github.com/HimalayanSaswataBose/iSLEEPS_GeneralisationGapAndExplainability

==================================


#AIGeneralization #SleepStaging #StrokeResearch #MedicalAI #MachineLearning
Brevity Constraints Reverse Performance Hierarchies in Language Models

📝 Summary:
Large language models can underperform smaller ones due to verbose responses that introduce errors. Constraining output length reveals their superior latent capabilities, reversing performance hierarchies. This demands scale-aware prompt engineering for optimal performance.
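The practical takeaway — constraining output length to surface a large model's latent capability — amounts to scale-aware prompting. A trivial illustrative template (an assumption about how one might apply the finding, not taken from the paper):

```python
def brevity_prompt(question, max_words=50):
    """Scale-aware prompting: append an explicit length cap so a verbose
    large model answers concisely instead of talking itself into errors."""
    return f"{question}\nAnswer in at most {max_words} words."

prompt = brevity_prompt("Why is the sky blue?")
```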

🔹 Publication Date: Published on Mar 11

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2604.00025
• PDF: https://arxiv.org/pdf/2604.00025
• Github: https://github.com/logicsame/Brevity-Constraints-Reverse-Performance-Hierarchies-in-Language-Models

==================================


#LLM #PromptEngineering #AI #MachineLearning #NLP
Do Phone-Use Agents Respect Your Privacy?

📝 Summary:
This paper introduces MyPhoneBench, a framework to evaluate phone agents' privacy behavior. It found agents often over-share optional data, indicating current success metrics overestimate their deployment readiness due to privacy failures.

🔹 Publication Date: Published on Apr 1

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2604.00986
• PDF: https://arxiv.org/pdf/2604.00986
• Github: https://github.com/FreedomIntelligence/MyPhoneBench

==================================


#PhoneAgents #DataPrivacy #AI #PrivacyResearch #Cybersecurity
S0 Tuning: Zero-Overhead Adaptation of Hybrid Recurrent-Attention Models

📝 Summary:
S0 tuning optimizes recurrent state matrices in hybrid models, outperforming LoRA with zero inference overhead. It significantly improves performance on benchmarks like HumanEval and enables efficient task switching.
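A hypothetical illustration of the idea: in a hybrid recurrent-attention model, only the recurrent state-transition matrix is tuned in place, so unlike LoRA no adapter weights or extra matmuls are added at inference. A toy linear-recurrence sketch (names and structure are assumptions, not the paper's code):

```python
import numpy as np

class LinearRecurrence:
    """Toy linear recurrent layer: h_t = A @ h_{t-1} + B @ x_t.
    An S0-style adaptation tunes only the state matrix A in place,
    leaving B (and any attention blocks) frozen — zero parameters
    or computation added at inference time."""
    def __init__(self, dim):
        rng = np.random.default_rng(0)
        self.A = 0.1 * rng.standard_normal((dim, dim))  # tuned in place
        self.B = rng.standard_normal((dim, dim))        # frozen

    def forward(self, xs):
        h = np.zeros(self.A.shape[0])
        for x in xs:
            h = self.A @ h + self.B @ x
        return h

layer = LinearRecurrence(4)
trainable = layer.A.size  # only the state matrix's entries are tuned
out = layer.forward(np.ones((3, 4)))
```

Because the tuned weights overwrite an existing matrix rather than adding a low-rank branch, task switching reduces to swapping in a different copy of `A`.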

🔹 Publication Date: Published on Apr 1

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2604.01168
• PDF: https://arxiv.org/pdf/2604.01168
• Project Page: https://www.jackyoung.io/research/s0-tuning
• Github: https://github.com/JackYoung27/s0-tuning

==================================


#S0Tuning #DeepLearning #LLMs #ModelOptimization #MachineLearning