✨ Stemming Hallucination in Language Models Using a Licensing Oracle
📝 Summary:
This study presents the Licensing Oracle, an architectural solution to eliminate language model hallucinations. It enforces truth constraints via formal validation against structured knowledge graphs, achieving perfect abstention precision and zero false answers where statistical methods fail.
🔹 Publication Date: Published on Nov 8
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2511.06073
• PDF: https://arxiv.org/pdf/2511.06073
==================================
For more data science resources:
➡️ https://t.iss.one/DataScienceT
#LLM #AIHallucination #KnowledgeGraphs #NLP #AIResearch
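The gating idea fits in a tiny sketch. Below is a hypothetical licensing gate, not the paper's API: the knowledge graph, relation names, and function are all illustrative. An answer is emitted only if a structured KG licenses the claim; otherwise the model abstains.

```python
# Hypothetical licensing gate (illustrative, not the paper's implementation):
# a generated claim is emitted only if the knowledge graph licenses it.
KG = {
    ("Paris", "capital_of"): "France",   # toy knowledge graph
    ("Au", "atomic_number"): "79",
}

def licensed_answer(subject: str, relation: str, candidate: str) -> str:
    """Return the candidate answer only if the KG licenses it; abstain otherwise."""
    fact = KG.get((subject, relation))
    if fact is None or fact != candidate:
        return "I don't know."           # abstention instead of a hallucinated answer
    return candidate

print(licensed_answer("Paris", "capital_of", "France"))   # -> France
print(licensed_answer("Paris", "capital_of", "Germany"))  # -> I don't know.
```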
✨ Motif 2 12.7B technical report
📝 Summary:
Motif-2-12.7B is an efficient LLM combining Grouped Differential Attention and system-level optimizations. It achieves competitive performance across diverse benchmarks with a smaller model size.
🔹 Publication Date: Published on Nov 7
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2511.07464
• PDF: https://arxiv.org/pdf/2511.07464
🔹 Models citing this paper:
• https://huggingface.co/Motif-Technologies/optimizer
• https://huggingface.co/Motif-Technologies/Motif-2-12.7B-Instruct
• https://huggingface.co/Motif-Technologies/Motif-2-12.7B-Base
==================================
For more data science resources:
➡️ https://t.iss.one/DataScienceT
#LLM #AI #DeepLearning #EfficientAI #AttentionMechanisms
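For intuition, here is a minimal differential-attention head in the DIFF-Transformer style that Grouped Differential Attention builds on: two softmax attention maps are computed and the second, scaled by a factor, is subtracted to cancel common-mode attention noise. The grouping of heads, causal mask, and learned lambda are omitted; all weights are random and illustrative.

```python
# Minimal single-head differential attention sketch (grouping across heads omitted).
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def diff_attention(x, Wq1, Wk1, Wq2, Wk2, Wv, lam=0.5):
    """x: (seq, d_model). Two attention maps; the second is lam-scaled and subtracted."""
    d = Wq1.shape[1]
    a1 = softmax((x @ Wq1) @ (x @ Wk1).T / np.sqrt(d))
    a2 = softmax((x @ Wq2) @ (x @ Wk2).T / np.sqrt(d))
    return (a1 - lam * a2) @ (x @ Wv)

rng = np.random.default_rng(0)
x = rng.normal(size=(8, 16))
Ws = [rng.normal(size=(16, 16)) * 0.1 for _ in range(5)]
print(diff_attention(x, *Ws).shape)  # (8, 16)
```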
✨ Superpositional Gradient Descent: Harnessing Quantum Principles for Model Training
📝 Summary:
Superpositional Gradient Descent (SGD) is a new quantum-inspired optimizer. It uses quantum superposition to enhance gradient updates, leading to faster convergence and lower final loss than AdamW in LLM training.
🔹 Publication Date: Published on Nov 1
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2511.01918
• PDF: https://arxiv.org/pdf/2511.01918
• Github: https://github.com/The-Aqua-Labs/Superpositional-Gradient-Descent
==================================
For more data science resources:
➡️ https://t.iss.one/DataScienceT
#MachineLearning #AI #LLM #QuantumInspired #Optimization
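As a rough stand-in for the flavor of the method (this is NOT the paper's update rule; see the linked repo for the real optimizer), one can perturb the gradient into several candidate copies, a crude analogue of superposed update states, and "collapse" them to an average before stepping:

```python
# Illustrative only: noisy candidate gradients averaged before the step.
import numpy as np

def superpositional_step(w, grad, lr=1e-2, n_branches=4, sigma=0.05, seed=0):
    rng = np.random.default_rng(seed)
    branches = [grad + sigma * rng.normal(size=grad.shape) for _ in range(n_branches)]
    collapsed = np.mean(branches, axis=0)   # "collapse" the superposed candidates
    return w - lr * collapsed

w = np.array([1.0, -2.0, 0.5])
g = np.array([0.2, -0.1, 0.4])
print(superpositional_step(w, g))
```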
✨ Solving a Million-Step LLM Task with Zero Errors
📝 Summary:
MAKER solves million-step LLM tasks with zero errors. It decomposes tasks into micro-steps handled by focused microagents and applies error correction at each step via multi-agent voting. This offers a new, scalable approach for complex LLM processes.
🔹 Publication Date: Published on Nov 12
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2511.09030
• PDF: https://arxiv.org/pdf/2511.09030
==================================
For more data science resources:
➡️ https://t.iss.one/DataScienceT
#LLM #AI #ErrorCorrection #MultiAgent #TaskDecomposition
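The per-step voting is easy to sketch. Assuming the decomposition has already produced a single micro-step, several microagents answer it and the majority wins; with a minority of faulty voters, errors never accumulate across steps. All names here are hypothetical.

```python
# Per-step error correction via multi-agent majority voting (toy sketch).
from collections import Counter

def vote_step(agents, step_input, k=5):
    """Ask k microagents to solve one micro-step and keep the majority answer."""
    answers = [agent(step_input) for agent in agents[:k]]
    winner, count = Counter(answers).most_common(1)[0]
    return winner if count > k // 2 else None   # None = escalate / retry

# Toy microagents: four reliable, one faulty.
agents = [lambda x: x + 1, lambda x: x + 1, lambda x: x + 2,
          lambda x: x + 1, lambda x: x + 1]
state = 0
for _ in range(10):                  # ten decomposed micro-steps
    state = vote_step(agents, state)
print(state)                         # 10: every step survived the faulty voter
```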
✨ CC30k: A Citation Contexts Dataset for Reproducibility-Oriented Sentiment Analysis
📝 Summary:
CC30k is a new dataset of 30,000 machine learning paper citation contexts, labeled with reproducibility-oriented sentiments. It enables large language models to better predict paper reproducibility, filling a crucial gap in computational reproducibility studies.
🔹 Publication Date: Published on Nov 11
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2511.07790
• PDF: https://arxiv.org/pdf/2511.07790
✨ Datasets citing this paper:
• https://huggingface.co/datasets/rochanaro/CC30k
==================================
For more data science resources:
➡️ https://t.iss.one/DataScienceT
#MachineLearning #Reproducibility #LLM #SentimentAnalysis #DataScience
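The dataset is on the Hub (repo id taken from the link above), so a first look takes three lines; the split and column names below are not assumed, we just print whatever the dataset exposes. Check the dataset card for the actual schema.

```python
# Quick look at CC30k via the Hugging Face datasets library.
from datasets import load_dataset

ds = load_dataset("rochanaro/CC30k")
print(ds)                                   # available splits and columns
sample = next(iter(ds[list(ds.keys())[0]]))
print(sample)                               # one labeled citation context
```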
✨ DiscoX: Benchmarking Discourse-Level Translation task in Expert Domains
📝 Summary:
A new benchmark, DiscoX, and evaluation system, Metric-S, are introduced for discourse-level, expert Chinese-English translation. Findings show advanced LLMs still fall short of human performance, underscoring challenges in professional machine translation.
🔹 Publication Date: Published on Nov 14
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2511.10984
• PDF: https://arxiv.org/pdf/2511.10984
==================================
For more data science resources:
➡️ https://t.iss.one/DataScienceT
#MachineTranslation #NLP #LLM #Benchmarking #AI
✨ MarsRL: Advancing Multi-Agent Reasoning System via Reinforcement Learning with Agentic Pipeline Parallelism
📝 Summary:
MarsRL enhances multi-agent reasoning systems by jointly optimizing all agents through reinforcement learning and agentic pipeline parallelism. This novel approach significantly boosts open-source LLM accuracy on complex tasks, even outperforming larger models on benchmarks like AIME2025.
🔹 Publication Date: Published on Nov 14
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2511.11373
• PDF: https://arxiv.org/pdf/2511.11373
• Github: https://github.com/liushulinle/MarsRL
🔹 Models citing this paper:
• https://huggingface.co/forestliutc/MarsRL
==================================
For more data science resources:
➡️ https://t.iss.one/DataScienceT
#ReinforcementLearning #MultiAgentSystems #LLM #AIResearch #MachineLearning
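A common shape for such multi-agent reasoning pipelines is solver → verifier → corrector; a minimal sequential sketch is below. The role names and stub functions are hypothetical; in MarsRL the roles are LLM agents optimized jointly with RL, and samples flow through the stages in pipeline-parallel fashion, both of which are omitted here.

```python
# Toy solver -> verifier -> corrector pipeline; each role stands in for an LLM agent.
def solver(problem):       return problem["x"] + problem["y"] + 1   # buggy on purpose
def verifier(problem, a):  return a == problem["x"] + problem["y"]
def corrector(problem, a): return problem["x"] + problem["y"]

def pipeline(problem):
    answer = solver(problem)
    if not verifier(problem, answer):
        answer = corrector(problem, answer)
    return answer

print(pipeline({"x": 2, "y": 3}))   # 5 after the corrector fixes the solver's slip
```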
✨ Qwen3 Technical Report
📝 Summary:
Qwen3 is a new series of large language models integrating thinking and non-thinking modes for unified performance and efficiency. It achieves state-of-the-art results across diverse tasks and expands multilingual support to 119 languages.
🔹 Publication Date: Published on May 14
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2505.09388
• Explained: https://arxivexplained.com/papers/qwen3-technical-report
• PDF: https://arxiv.org/pdf/2505.09388
• Project Page: https://qwenlm.github.io/blog/qwen3/
• Github: https://github.com/QwenLM/Qwen3
🔹 Models citing this paper:
• https://huggingface.co/Qwen/Qwen3-Coder-480B-A35B-Instruct
• https://huggingface.co/Qwen/Qwen3-235B-A22B
• https://huggingface.co/Qwen/Qwen3-Next-80B-A3B-Instruct
✨ Spaces citing this paper:
• https://huggingface.co/spaces/modelscope/DocResearch
• https://huggingface.co/spaces/enzostvs/deepsite
• https://huggingface.co/spaces/multimodalart/Eigen-Banana
==================================
For more data science resources:
➡️ https://t.iss.one/DataScienceT
#LLM #AI #MultilingualAI #NLP #Qwen3
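The thinking/non-thinking switch is exposed through the chat template. The sketch below follows the usage documented in the Qwen3 model cards (the enable_thinking flag); verify against the repo, and note the listed checkpoints are large, so smaller Qwen3 variants are a more practical choice for local runs.

```python
# Toggling Qwen3 between thinking and non-thinking modes (per the model card).
from transformers import AutoTokenizer, AutoModelForCausalLM

model_id = "Qwen/Qwen3-235B-A22B"   # any Qwen3 checkpoint; smaller variants exist
tok = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

messages = [{"role": "user", "content": "What is 17 * 23?"}]
text = tok.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True,
    enable_thinking=True,            # False -> skip the <think>...</think> phase
)
out = model.generate(**tok(text, return_tensors="pt").to(model.device),
                     max_new_tokens=256)
print(tok.decode(out[0], skip_special_tokens=True))
```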
✨ MeshCoder: LLM-Powered Structured Mesh Code Generation from Point Clouds
📝 Summary:
MeshCoder reconstructs complex 3D objects from point clouds into editable Blender Python scripts using a multimodal LLM. This enables superior shape-to-code reconstruction, intuitive editing via code, and enhances 3D shape understanding.
🔹 Publication Date: Published on Aug 20
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2508.14879
• Explained: https://arxivexplained.com/papers/meshcoder-llm-powered-structured-mesh-code-generation-from-point-clouds
• PDF: https://arxiv.org/pdf/2508.14879
• Project Page: https://daibingquan.github.io/MeshCoder
🔹 Models citing this paper:
• https://huggingface.co/InternRobotics/MeshCoder
✨ Datasets citing this paper:
• https://huggingface.co/datasets/InternRobotics/MeshCoderDataset
==================================
For more data science resources:
➡️ https://t.iss.one/DataScienceT
#MeshCoder #LLM #3DReconstruction #PointClouds #ComputerGraphics
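To see why code output makes editing intuitive, here is the kind of parameterized Blender script such a system emits; this is an illustrative hand-written example, not an actual model output. Shape parameters live in plain Python, so editing the code edits the mesh. Run it inside Blender's Python console or with Blender's bundled interpreter.

```python
# A toy "mug" as editable Blender code: a cylinder body plus a torus handle.
import bpy

body_radius, body_height = 0.4, 0.8   # tweak these to reshape the mug
bpy.ops.mesh.primitive_cylinder_add(radius=body_radius, depth=body_height,
                                    location=(0, 0, body_height / 2))
bpy.ops.mesh.primitive_torus_add(major_radius=0.25, minor_radius=0.05,
                                 location=(body_radius + 0.15, 0, body_height / 2),
                                 rotation=(1.5708, 0, 0))
```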
✨ Experience-Guided Adaptation of Inference-Time Reasoning Strategies
📝 Summary:
The Experience-Guided Reasoner (EGuR) dynamically generates and optimizes complete computational strategies at inference time using accumulated experience. It adapts LLM calls, tools, and control logic, improving accuracy by up to 14 percent and reducing costs by up to 111x.
🔹 Publication Date: Published on Nov 14
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2511.11519
• PDF: https://arxiv.org/pdf/2511.11519
==================================
For more data science resources:
➡️ https://t.iss.one/DataScienceT
#LLM #AI #Reasoning #Optimization #MachineLearning
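One way to picture the experience-guided loop: cache a strategy (an LLM configuration plus tool list and control logic) per task type, reuse it, and update its running success rate as outcomes accumulate. Everything below is a hypothetical sketch, not the paper's API.

```python
# Toy strategy cache keyed by task type, refined from accumulated outcomes.
strategies = {}   # task_type -> (strategy, running_success_rate)

def get_strategy(task_type, propose):
    if task_type not in strategies:
        strategies[task_type] = (propose(task_type), 0.0)
    return strategies[task_type][0]

def record_outcome(task_type, success, alpha=0.2):
    strat, rate = strategies[task_type]
    strategies[task_type] = (strat, (1 - alpha) * rate + alpha * float(success))

strategy = get_strategy("math", propose=lambda t: {"tools": ["calculator"], "n_samples": 3})
record_outcome("math", success=True)
print(strategies["math"])
```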
✨ From Proof to Program: Characterizing Tool-Induced Reasoning Hallucinations in Large Language Models
📝 Summary:
Tool-augmented LLMs exhibit Tool-Induced Myopia (TIM), treating tool outputs as substitutes for true reasoning. This improves final-answer accuracy but significantly degrades reasoning quality. A proposed framework realigns these models to use tools as assistive evidence, enhancing both accuracy and reasoning quality.
🔹 Publication Date: Published on Nov 14
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2511.10899
• PDF: https://arxiv.org/pdf/2511.10899
==================================
For more data science resources:
➡️ https://t.iss.one/DataScienceT
#LLM #AIResearch #Reasoning #ToolAugmentation #AIHallucinations
✨ MMaDA-Parallel: Multimodal Large Diffusion Language Models for Thinking-Aware Editing and Generation
📝 Summary:
A parallel multimodal diffusion framework, MMaDA-Parallel, enhances cross-modal alignment and semantic consistency in thinking-aware image synthesis by addressing the error propagation issues of sequential approaches.
🔹 Publication Date: Published on Nov 12
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2511.09611
• PDF: https://arxiv.org/pdf/2511.09611
• Project Page: https://tyfeld.github.io/mmadaparellel.github.io/
• Github: https://github.com/tyfeld/MMaDA-Parallel
🔹 Models citing this paper:
• https://huggingface.co/tyfeld/MMaDA-Parallel-A
• https://huggingface.co/tyfeld/MMaDA-Parallel-M
==================================
For more data science resources:
➡️ https://t.iss.one/DataScienceT
#MultimodalAI #DiffusionModels #ImageSynthesis #LLM #AIResearch
✨ WebCoach: Self-Evolving Web Agents with Cross-Session Memory Guidance
📝 Summary:
WebCoach introduces a self-evolving framework for web agents with persistent cross-session memory. It uses a WebCondenser, External Memory Store, and a Coach to learn from past experiences without retraining. This significantly improves task success and enables smaller models to match larger LLM-based agents.
🔹 Publication Date: Published on Nov 17
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2511.12997
• PDF: https://arxiv.org/pdf/2511.12997
==================================
For more data science resources:
➡️ https://t.iss.one/DataScienceT
#WebAgents #AI #MachineLearning #LLM #MemoryAI
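The cross-session loop is easy to picture as a data structure: condense each finished episode into a short note, store it, and retrieve relevant notes to coach the next session. The component names below follow the summary (Condenser / Memory Store / Coach), but the keyword-overlap retrieval is a toy stand-in, not the paper's retriever.

```python
# Hypothetical cross-session memory sketch for a web agent.
class MemoryStore:
    def __init__(self):
        self.notes = []

    def add(self, note):
        self.notes.append(note)

    def retrieve(self, task, k=2):
        # Toy relevance: keyword overlap between the task and stored notes.
        scored = sorted(self.notes,
                        key=lambda n: -len(set(n.split()) & set(task.split())))
        return scored[:k]

def condense(trajectory):
    """WebCondenser stand-in: compress an episode into one lesson string."""
    status = "succeeded" if trajectory["ok"] else "failed"
    return f"task '{trajectory['task']}' {status}: {trajectory['lesson']}"

store = MemoryStore()
store.add(condense({"task": "book flight", "ok": False,
                    "lesson": "date picker needs ISO format"}))
advice = store.retrieve("book flight to Paris")
print(advice)   # the Coach would inject this into the agent's next prompt
```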
✨ MiroThinker: Pushing the Performance Boundaries of Open-Source Research Agents via Model, Context, and Interactive Scaling
📝 Summary:
MiroThinker v1.0 is an open-source research agent introducing 'interactive scaling.' It trains models with reinforcement learning for deeper agent-environment interactions, performing up to 600 tool calls per task. This achieves state-of-the-art performance and establishes interaction depth as a third scaling dimension alongside model size and context length.
🔹 Publication Date: Published on Nov 14
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2511.11793
• PDF: https://arxiv.org/pdf/2511.11793
• Project Page: https://dr.miromind.ai/
• Github: https://github.com/MiroMindAI/MiroThinker
🔹 Models citing this paper:
• https://huggingface.co/miromind-ai/MiroThinker-v1.0-72B
• https://huggingface.co/miromind-ai/MiroThinker-v1.0-8B
• https://huggingface.co/miromind-ai/MiroThinker-v1.0-30B
==================================
For more data science resources:
➡️ https://t.iss.one/DataScienceT
#MiroThinker #ResearchAgents #ReinforcementLearning #OpenSourceAI #LLM
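"Interaction depth" is just the number of agent-environment round trips before answering; a skeletal loop with a tool-call budget makes that concrete. The 600-call figure comes from the summary; the tool, policy, and stop logic below are placeholders.

```python
# Minimal agent-environment loop with a tool-call budget (illustrative).
def run_agent(task, tools, policy, max_tool_calls=600):
    observation, calls = task, 0
    while calls < max_tool_calls:
        action = policy(observation)
        if action["type"] == "answer":
            return action["content"], calls
        observation = tools[action["type"]](action["content"])
        calls += 1
    return None, calls   # budget exhausted

tools = {"search": lambda q: f"results for {q!r}"}
policy = lambda obs: ({"type": "search", "content": obs} if "results" not in obs
                      else {"type": "answer", "content": obs})
print(run_agent("deepest lake?", tools, policy))   # (answer, calls used)
```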
✨ Assessing LLMs for Serendipity Discovery in Knowledge Graphs: A Case for Drug Repurposing
📝 Summary:
SerenQA evaluates LLMs on discovering surprising yet valuable (serendipitous) answers in scientific knowledge graphs, with a focus on drug repurposing. It introduces a new serendipity metric. Experiments show LLMs struggle to surface genuinely surprising insights.
🔹 Publication Date: Published on Nov 16
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2511.12472
• PDF: https://arxiv.org/pdf/2511.12472
• Project Page: https://cwru-db-group.github.io/serenQA
• Github: https://github.com/CWRU-DB-Group/DrugKG
==================================
For more data science resources:
➡️ https://t.iss.one/DataScienceT
#LLM #KnowledgeGraphs #DrugRepurposing #AI #Serendipity
✨ ATLAS: A High-Difficulty, Multidisciplinary Benchmark for Frontier Scientific Reasoning
📝 Summary:
ATLAS is a new, high-difficulty, multidisciplinary benchmark for LLMs, featuring 800 original problems across seven scientific fields. It addresses current benchmark limitations with complex, open-ended answers and aims to differentiate advanced scientific reasoning, serving as a ruler for AGI progress.
🔹 Publication Date: Published on Nov 18
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2511.14366
• PDF: https://arxiv.org/pdf/2511.14366
==================================
For more data science resources:
➡️ https://t.iss.one/DataScienceT
#LLM #AGI #AIResearch #ScientificReasoning #Benchmark
✨ Think-at-Hard: Selective Latent Iterations to Improve Reasoning Language Models
📝 Summary:
Think-at-Hard (TaH) improves LLM reasoning by dynamically refining only hard tokens. It uses a neural decider to identify them and LoRA for focused refinement, boosting performance with minimal overhead.
🔹 Publication Date: Published on Nov 11
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2511.08577
• PDF: https://arxiv.org/pdf/2511.08577
• Github: https://github.com/thu-nics/TaH
==================================
For more data science resources:
➡️ https://t.iss.one/DataScienceT
#LLM #AI #MachineLearning #NaturalLanguageProcessing #Reasoning
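The selective-iteration idea in one screen: a cheap decider flags "hard" positions (here approximated by high-entropy token distributions), and only those get a second refinement pass. The entropy threshold and refine() stub are placeholders standing in for the paper's neural decider and LoRA-augmented pass.

```python
# Sketch: refine only high-entropy ("hard") tokens with a second pass.
import numpy as np

def entropy(p):
    return -(p * np.log(p + 1e-9)).sum(-1)

def think_at_hard(token_probs, refine, tau=1.0):
    out = token_probs.argmax(-1)
    hard = entropy(token_probs) > tau          # neural-decider stand-in
    for i in np.where(hard)[0]:
        out[i] = refine(i, token_probs[i])     # e.g., LoRA-augmented extra pass
    return out, hard

probs = np.array([[0.97, 0.01, 0.02],          # easy token (low entropy)
                  [0.40, 0.35, 0.25]])         # hard token (high entropy)
tokens, hard = think_at_hard(probs, refine=lambda i, p: int(p.argmax()))
print(tokens, hard)                            # only the second token was refined
```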
✨ Mitigating Label Length Bias in Large Language Models
📝 Summary:
Large Language Models exhibit a label length bias with multi-token class labels. This paper introduces Normalized Contextual Calibration (NCC) to mitigate this issue by normalizing and calibrating predictions at the full-label level. NCC significantly improves performance and reliability across diverse tasks and models.
🔹 Publication Date: Published on Nov 18
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2511.14385
• PDF: https://arxiv.org/pdf/2511.14385
==================================
For more data science resources:
➡️ https://t.iss.one/DataScienceT
#LLM #AI #NLP #BiasInAI #MachineLearning
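A plausible reading of full-label calibration (check the paper for the exact procedure): score each multi-token label by its length-normalized log-probability, then subtract the same quantity computed on a content-free input, so neither label length nor prior label bias decides the prediction.

```python
# Sketch of length-normalized, contextually calibrated label scoring.
import numpy as np

def label_score(token_logps):
    return np.mean(token_logps)                 # length normalization per token

def ncc_predict(logps_given_x, logps_given_null):
    """Pick the label maximizing normalized log-prob minus the content-free baseline."""
    scores = {lab: label_score(lp) - label_score(logps_given_null[lab])
              for lab, lp in logps_given_x.items()}
    return max(scores, key=scores.get)

# Toy per-token log-probs for a 1-token vs a 3-token label.
logps_x    = {"positive": [-0.9], "not negative": [-0.7, -0.8, -0.6]}
logps_null = {"positive": [-1.0], "not negative": [-0.6, -0.7, -0.5]}
print(ncc_predict(logps_x, logps_null))         # -> positive
```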
✨ Large Language Models Meet Extreme Multi-label Classification: Scaling and Multi-modal Framework
📝 Summary:
This paper improves Extreme Multi-label Classification (XMC) by using larger decoder-only models and introduces ViXML, a vision-enhanced framework. ViXML efficiently integrates visual information, significantly outperforming text-only models and achieving a new state of the art.
🔹 Publication Date: Published on Nov 17
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2511.13189
• PDF: https://arxiv.org/pdf/2511.13189
==================================
For more data science resources:
➡️ https://t.iss.one/DataScienceT
#LLM #XMC #MultiModalAI #MachineLearning #AIResearch
✨ LLM-Powered Fully Automated Chaos Engineering: Towards Enabling Anyone to Build Resilient Software Systems at Low Cost
📝 Summary:
Manual planning and improvement hinder Chaos Engineering adoption. ChaosEater automates the entire Chaos Engineering cycle for Kubernetes using LLMs, handling tasks from requirements to debugging. This enables anyone to build resilient systems quickly and affordably.
🔹 Publication Date: Published on Nov 11
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2511.07865
• PDF: https://arxiv.org/pdf/2511.07865
• Project Page: https://ntt-dkiku.github.io/chaos-eater/
• Github: https://github.com/ntt-dkiku/chaos-eater
==================================
For more data science resources:
➡️ https://t.iss.one/DataScienceT
#ChaosEngineering #LLM #CloudNative #SoftwareResilience #DevOps
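The cycle being automated is hypothesize → inject fault → verify steady state → improve, repeated until the hypothesis holds. The skeleton below is illustrative: each llm_* stub stands in for an LLM call in the real system, and actual fault injection would target a Kubernetes cluster rather than a dict.

```python
# Skeleton of an automated Chaos Engineering loop (illustrative names and stubs).
def chaos_cycle(system_spec, llm_hypothesize, inject_fault, probe, llm_improve,
                max_rounds=3):
    for _ in range(max_rounds):
        hypothesis = llm_hypothesize(system_spec)   # fault + expected steady state
        inject_fault(hypothesis["fault"])           # e.g., kill a pod
        if probe(hypothesis["steady_state"]):
            return system_spec                      # hypothesis held; resilient
        system_spec = llm_improve(system_spec, hypothesis)  # patch manifests, retry
    return system_spec

spec = chaos_cycle(
    {"replicas": 1},
    llm_hypothesize=lambda s: {"fault": "pod-kill",
                               "steady_state": s["replicas"] >= 2},
    inject_fault=lambda f: None,
    probe=lambda ok: ok,
    llm_improve=lambda s, h: {"replicas": s["replicas"] + 1},
)
print(spec)   # {'replicas': 2} after one improvement round
```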