ML Research Hub

✨Diffutron: A Masked Diffusion Language Model for Turkish Language

📝 Summary:
Diffutron introduces a compact masked diffusion language model for Turkish. It uses resource-efficient LoRA-based pre-training and progressive instruction tuning. The model achieves competitive performance for non-autoregressive Turkish text generation despite its small size.

🔹 Publication Date: Published on Mar 20

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.20466
• PDF: https://arxiv.org/pdf/2603.20466

🔹 Models citing this paper:
• https://huggingface.co/diffutron/DiffutronLM-0.3B-Instruct
• https://huggingface.co/diffutron/DiffutronLM-0.3B-Base
• https://huggingface.co/diffutron/DiffutronLM-0.3B-1st-Stage

✨ Datasets citing this paper:
• https://huggingface.co/datasets/diffutron/DiffutronLM-Pretraining-Corpus

==================================

For more data science resources:
✓ https://t.iss.one/DataScienceT

#LanguageModels #TurkishNLP #DiffusionModels #NLP #AI

✨MedOpenClaw: Auditable Medical Imaging Agents Reasoning over Uncurated Full Studies

📝 Summary:
MEDOPENCLAW and MEDFLOWBENCH enable evaluation of medical VLMs in interactive 3D environments rather than on static 2D images. Surprisingly, top VLMs struggle with professional tools due to poor spatial grounding. This work highlights a critical gap for auditable, full-study medical agents.

🔹 Publication Date: Published on Mar 25

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.24649
• PDF: https://arxiv.org/pdf/2603.24649
• Project Page: https://jakobshen.github.io/MedOpenClaw

==================================

#MedicalAI #VLMs #MedicalImaging #AuditableAI #3DImaging

✨LongTail Driving Scenarios with Reasoning Traces: The KITScenes LongTail Dataset

📝 Summary:
This paper introduces KITScenes LongTail, a new dataset for long-tail driving events. It offers multi-view video, trajectories, and multilingual expert reasoning traces. This resource improves few-shot generalization and evaluates multimodal models' instruction-following capabilities.

🔹 Publication Date: Published on Mar 24

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.23607
• PDF: https://arxiv.org/pdf/2603.23607
• Project Page: https://huggingface.co/datasets/KIT-MRT/KITScenes-LongTail

✨ Datasets citing this paper:
• https://huggingface.co/datasets/KIT-MRT/KITScenes-LongTail

==================================

#AutonomousDriving #ComputerVision #Datasets #LongTailLearning #MultimodalAI

✨Natural-Language Agent Harnesses

📝 Summary:
Natural-Language Agent Harnesses (NLAHs) and the Intelligent Harness Runtime (IHR) enable portable, executable agent harness design through natural language. This externalizes control logic from code, making harnesses easier to transfer, compare, and study.

🔹 Publication Date: Published on Mar 26

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.25723
• PDF: https://arxiv.org/pdf/2603.25723

==================================

#NaturalLanguageProcessing #AI #AIAgents #SoftwareEngineering #CodePortability

✨RealChart2Code: Advancing Chart-to-Code Generation with Real Data and Multi-Task Evaluation

📝 Summary:
RealChart2Code is a new benchmark assessing VLM ability to generate complex, multi-panel charts from real data. It reveals significant performance gaps between proprietary and open-weight models, highlighting VLM struggles with intricate plots.

🔹 Publication Date: Published on Mar 26

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.25804
• PDF: https://arxiv.org/pdf/2603.25804
• Project Page: https://huggingface.co/datasets/zjj1233/RealChart2Code
• Github: https://github.com/Speakn0w/RealChart2Code

==================================

#VLM #ChartToCode #Benchmark #AI #DataScience

✨GenMask: Adapting DiT for Segmentation via Direct Mask

📝 Summary:
GenMask directly trains a DiT for joint image generation and segmentation using a novel timestep sampling strategy. This strategy emphasizes extreme noise for masks, enabling harmonious training. It outperforms indirect adaptation, simplifying the workflow.

🔹 Publication Date: Published on Mar 25

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.23906
• PDF: https://arxiv.org/pdf/2603.23906

==================================

#Segmentation #ImageGeneration #DiT #DeepLearning #ComputerVision

✨Learning to Commit: Generating Organic Pull Requests via Online Repository Memory

📝 Summary:
Learning to Commit improves LLM coding agent organicity using Online Repository Memory. It distills project-specific coding skills from historical commits, guiding agents to generate code that adheres to project conventions and architectural patterns, leading to more acceptable pull requests.

🔹 Publication Date: Published on Mar 27

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.26664
• PDF: https://arxiv.org/pdf/2603.26664

==================================

#LLMAgents #SoftwareEngineering #CodeGeneration #AIResearch #MachineLearning

✨Composer 2 Technical Report

📝 Summary:
Composer 2 is a specialized coding model trained via phased learning for real-world software engineering tasks. It demonstrates superior performance on new and public benchmarks, showcasing strong long-term planning and coding intelligence.

🔹 Publication Date: Published on Mar 25

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.24477
• PDF: https://arxiv.org/pdf/2603.24477

==================================

#AI #Coding #SoftwareEngineering #MachineLearning #CodeGeneration

✨Lie to Me: How Faithful Is Chain-of-Thought Reasoning in Reasoning Models?

📝 Summary:
CoT faithfulness varies widely (39.7-89.9%) across open-weight models, driven by architecture and training. Models often internally recognize a hint's influence but suppress acknowledgment of it in their verbalized CoT, which reduces the CoT's transparency.

🔹 Publication Date: Published on Mar 23

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.22582
• PDF: https://arxiv.org/pdf/2603.22582
• Github: https://github.com/ricyoung/cot-faithfulness-open-models

✨ Datasets citing this paper:
• https://huggingface.co/datasets/richardyoung/cot-faithfulness-open-models

==================================

#ChainOfThought #LLMs #AI #ModelFaithfulness #AITransparency

✨A Matter of Time: Revealing the Structure of Time in Vision-Language Models

📝 Summary:
This paper reveals that vision-language models embed temporal information in a structured way. It introduces a new dataset and methods to derive explicit timeline representations from these models, enabling efficient temporal reasoning.

🔹 Publication Date: Published on Oct 22, 2025

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2510.19559
• PDF: https://arxiv.org/pdf/2510.19559
• Project Page: https://tekayanidham.github.io/timeline-page/
• Github: https://github.com/TekayaNidham/timeline-vlm

✨ Spaces citing this paper:
• https://huggingface.co/spaces/Nidhamtek/timeline-vlm

==================================

#VLM #TemporalReasoning #AIResearch #MachineLearning #DeepLearning

✨Towards a Medical AI Scientist

📝 Summary:
Medical AI Scientist is the first autonomous AI framework for clinical research, generating evidence-based hypotheses and drafting manuscripts. It outperforms commercial LLMs in idea quality and experiment success, producing manuscripts of MICCAI-level quality.

🔹 Publication Date: Published on Mar 30

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.28589
• PDF: https://arxiv.org/pdf/2603.28589
• Project Page: https://cuhk-aim-group.github.io/Med-AI-Scientist-Homepage/
• Github: https://cuhk-aim-group.github.io/Med-AI-Scientist-Homepage/

==================================

#AI #DataScience #MachineLearning #HuggingFace #Research

✨On Token's Dilemma: Dynamic MoE with Drift-Aware Token Assignment for Continual Learning of Large Vision Language Models

📝 Summary:
LLaVA-DyMoE addresses routing-drift-induced forgetting in multimodal continual instruction tuning by dynamically expanding mixture of experts with token-level assignment guidance and routing score reg...

🔹 Publication Date: Published on Mar 29

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.27481
• PDF: https://arxiv.org/pdf/2603.27481
• Project Page: https://zhaoc5.github.io/DyMoE

==================================

#AI #DataScience #MachineLearning #HuggingFace #Research

✨ImagenWorld: Stress-Testing Image Generation Models with Explainable Human Evaluation on Open-ended Real-World Tasks

📝 Summary:
ImagenWorld is a comprehensive benchmark for image generation and editing, featuring human annotations and explainable evaluation. It reveals models struggle with editing and text-heavy content, offering a rigorous diagnostic tool.

🔹 Publication Date: Published on Mar 29

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.27862
• PDF: https://arxiv.org/pdf/2603.27862
• Project Page: https://tiger-ai-lab.github.io/ImagenWorld/
• Github: https://github.com/TIGER-AI-Lab/ImagenWorld

==================================

#AI #DataScience #MachineLearning #HuggingFace #Research

✨Kernel-Smith: A Unified Recipe for Evolutionary Kernel Optimization

📝 Summary:
Kernel-Smith is a GPU kernel generation framework that combines evolutionary algorithms with post-training reinforcement learning to optimize performance across different hardware backends.

🔹 Publication Date: Published on Mar 30

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.28342
• PDF: https://arxiv.org/pdf/2603.28342
• Project Page: https://chat.intern-ai.org.cn/kernel-smith/try
• Github: https://github.com/InternLM/Kernel-Smith

==================================

#AI #DataScience #MachineLearning #HuggingFace #Research

✨Emergent Social Intelligence Risks in Generative Multi-Agent Systems

📝 Summary:
Generative multi-agent systems exhibit emergent collective risks mirroring human societal pathologies like collusion and conformity, despite no explicit instruction. These frequent group behaviors cannot be prevented by individual agent safeguards, posing a significant social intelligence risk.

🔹 Publication Date: Published on Mar 29

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.27771
• PDF: https://arxiv.org/pdf/2603.27771
• Project Page: https://howiehwong.github.io/blogs/MAS_risk.html
• Github: https://github.com/HowieHwong/RiskLab

==================================

#AI #DataScience #MachineLearning #HuggingFace #Research

✨Gen-Searcher: Reinforcing Agentic Search for Image Generation

📝 Summary:
A search-augmented image generation agent is presented that performs multi-hop reasoning and search to collect textual knowledge and reference images for grounded generation, trained with supervised f...

🔹 Publication Date: Published on Mar 30

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.28767
• PDF: https://arxiv.org/pdf/2603.28767
• Project Page: https://gen-searcher.vercel.app/
• Github: https://github.com/tulerfeng/Gen-Searcher

🔹 Models citing this paper:
• https://huggingface.co/GenSearcher/Gen-Searcher-8B
• https://huggingface.co/GenSearcher/Gen-Searcher-SFT-8B

✨ Datasets citing this paper:
• https://huggingface.co/datasets/GenSearcher/KnowGen-Bench
• https://huggingface.co/datasets/GenSearcher/Train-Data

==================================

#AI #DataScience #MachineLearning #HuggingFace #Research

✨GEditBench v2: A Human-Aligned Benchmark for General Image Editing

📝 Summary:
A new benchmark and evaluation model for image editing are introduced to better assess visual consistency and human alignment in complex editing tasks.

🔹 Publication Date: Published on Mar 30

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.28547
• PDF: https://arxiv.org/pdf/2603.28547

🔹 Models citing this paper:
• https://huggingface.co/GEditBench-v2/PVC-Judge

✨ Datasets citing this paper:
• https://huggingface.co/datasets/GEditBench-v2/VCReward-Bench
• https://huggingface.co/datasets/GEditBench-v2/GEditBench-v2
• https://huggingface.co/datasets/GEditBench-v2/GEditBench-v2-CandidatesGallery

==================================

#AI #DataScience #MachineLearning #HuggingFace #Research

✨Make Geometry Matter for Spatial Reasoning

📝 Summary:
GeoSR enhances vision-language models' spatial reasoning capabilities by strategically incorporating geometry tokens through masking and guided fusion mechanisms.

🔹 Publication Date: Published on Mar 27

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.26639
• PDF: https://arxiv.org/pdf/2603.26639
• Project Page: https://suhzhang.github.io/GeoSR/
• Github: https://suhzhang.github.io/GeoSR/

==================================

#AI #DataScience #MachineLearning #HuggingFace #Research

✨PRBench: End-to-end Paper Reproduction in Physics Research

📝 Summary:
PRBench evaluates AI agents' ability to reproduce scientific research by requiring them to implement algorithms from published papers and match original results, revealing significant challenges in fo...

🔹 Publication Date: Published on Mar 29

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.27646
• PDF: https://arxiv.org/pdf/2603.27646
• Project Page: https://prbench.phybench.cn/

==================================

#AI #DataScience #MachineLearning #HuggingFace #Research

✨ResAdapt: Adaptive Resolution for Efficient Multimodal Reasoning

📝 Summary:
ResAdapt is an input-side adaptation framework that dynamically allocates visual resources to improve multimodal large language models' efficiency in video tasks while maintaining high performance.

🔹 Publication Date: Published on Mar 30

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.28610
• PDF: https://arxiv.org/pdf/2603.28610
• Project Page: https://xnhyacinth.github.io/projects/ResAdapt/
• Github: https://github.com/Xnhyacinth/ResAdapt

==================================

#AI #DataScience #MachineLearning #HuggingFace #Research

✨Marco DeepResearch: Unlocking Efficient Deep Research Agents via Verification-Centric Design

📝 Summary:
A verification-centric framework for deep research agents improves performance on complex benchmarks by incorporating error checking at multiple stages of development and inference.

🔹 Publication Date: Published on Mar 30

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.28376
• PDF: https://arxiv.org/pdf/2603.28376
• Github: https://github.com/AIDC-AI/Marco-DeepResearch

==================================

#AI #DataScience #MachineLearning #HuggingFace #Research
