🔹 Title: Sample More to Think Less: Group Filtered Policy Optimization for Concise Reasoning
🔹 Publication Date: Published on Aug 13
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2508.09726
• PDF: https://arxiv.org/pdf/2508.09726
🔹 Datasets citing this paper:
No datasets found
🔹 Spaces citing this paper:
No spaces found
==================================
🔹 Title: μ-Parametrization for Mixture of Experts
🔹 Publication Date: Published on Aug 13
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2508.09752
• PDF: https://arxiv.org/pdf/2508.09752
🔹 Datasets citing this paper:
No datasets found
🔹 Spaces citing this paper:
No spaces found
==================================
🔹 Title: ObfusQAte: A Proposed Framework to Evaluate LLM Robustness on Obfuscated Factual Question Answering
🔹 Publication Date: Published on Aug 10
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2508.07321
• PDF: https://arxiv.org/pdf/2508.07321
🔹 Datasets citing this paper:
No datasets found
🔹 Spaces citing this paper:
No spaces found
==================================
🔹 Title: The Surprising Effectiveness of Membership Inference with Simple N-Gram Coverage
🔹 Publication Date: Published on Aug 13
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2508.09603
• PDF: https://arxiv.org/pdf/2508.09603
🔹 Datasets citing this paper:
No datasets found
🔹 Spaces citing this paper:
No spaces found
==================================
🔹 Title: PRELUDE: A Benchmark Designed to Require Global Comprehension and Reasoning over Long Contexts
🔹 Publication Date: Published on Aug 13
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2508.09848
• PDF: https://arxiv.org/pdf/2508.09848
• Project Page: https://gorov.github.io/prelude
• Github: https://gorov.github.io/prelude/leaderboard.html
🔹 Datasets citing this paper:
• https://huggingface.co/datasets/ttchungc/PRELUDE
🔹 Spaces citing this paper:
No spaces found
==================================
🔹 Title: From Black Box to Transparency: Enhancing Automated Interpreting Assessment with Explainable AI in College Classrooms
🔹 Publication Date: Published on Aug 14
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2508.10860
• PDF: https://arxiv.org/pdf/2508.10860
🔹 Datasets citing this paper:
No datasets found
🔹 Spaces citing this paper:
No spaces found
==================================
🔹 Title: HumanSense: From Multimodal Perception to Empathetic Context-Aware Responses through Reasoning MLLMs
🔹 Publication Date: Published on Aug 14
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2508.10576
• PDF: https://arxiv.org/pdf/2508.10576
• Project Page: https://digital-avatar.github.io/ai/HumanSense/
• Github: https://digital-avatar.github.io/ai/HumanSense/
🔹 Datasets citing this paper:
No datasets found
🔹 Spaces citing this paper:
No spaces found
==================================
🔹 Title: UI-Venus Technical Report: Building High-performance UI Agents with RFT
🔹 Publication Date: Published on Aug 14
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2508.10833
• PDF: https://arxiv.org/pdf/2508.10833
• Github: https://github.com/antgroup/UI-Venus
🔹 Datasets citing this paper:
No datasets found
🔹 Spaces citing this paper:
No spaces found
==================================
🔹 Title: NextStep-1: Toward Autoregressive Image Generation with Continuous Tokens at Scale
🔹 Publication Date: Published on Aug 14
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2508.10711
• PDF: https://arxiv.org/pdf/2508.10711
• Github: https://github.com/stepfun-ai/NextStep-1
🔹 Datasets citing this paper:
No datasets found
🔹 Spaces citing this paper:
No spaces found
==================================
🔹 Title: Pass@k Training for Adaptively Balancing Exploration and Exploitation of Large Reasoning Models
🔹 Publication Date: Published on Aug 14
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2508.10751
• PDF: https://arxiv.org/pdf/2508.10751
• Project Page: https://github.com/RUCAIBox/Passk_Training
• Github: https://github.com/RUCAIBox/Passk_Training
🔹 Datasets citing this paper:
No datasets found
🔹 Spaces citing this paper:
No spaces found
==================================
🔹 Title: AgroBench: Vision-Language Model Benchmark in Agriculture
🔹 Publication Date: Published on Jul 28
🔹 Abstract: AgroBench evaluates vision-language models across agricultural tasks, revealing areas for improvement in fine-grained identification, particularly weed identification, with expert-annotated categories. AI-generated summary: Precise automated understanding of agricultural tasks such as disease identification is essential for sustainable crop production. Recent advances in vision-language models (VLMs) are expected to further expand the range of agricultural tasks by facilitating human-model interaction through easy, text-based communication. Here, we introduce AgroBench (Agronomist AI Benchmark), a benchmark for evaluating VLMs across seven agricultural topics, covering key areas in agricultural engineering relevant to real-world farming. Unlike recent agricultural VLM benchmarks, AgroBench is annotated by expert agronomists. AgroBench covers a state-of-the-art range of categories, including 203 crop categories and 682 disease categories, to thoroughly evaluate VLM capabilities. In our evaluation on AgroBench, we reveal that VLMs have room for improvement in fine-grained identification tasks. Notably, in weed identification, most open-source VLMs perform close to random. With our wide range of topics and expert-annotated categories, we analyze the types of errors made by VLMs and suggest potential pathways for future VLM development. Our dataset and code are available at https://dahlian00.github.io/AgroBenchPage/.
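As a rough illustration of the kind of evaluation the abstract describes, the sketch below scores a classifier against the random-guess baseline on a fine-grained identification split. Everything here is hypothetical: query_vlm is a stub standing in for a real vision-language model call, and the category list and records are made up; the actual benchmark, prompts, and scoring live on the project page.

```python
import random

def query_vlm(image_path: str, categories: list[str]) -> str:
    """Placeholder for a VLM call that returns one category name for the image."""
    return random.choice(categories)  # stand-in: a real model would inspect the image

def accuracy(records: list[tuple[str, str]], categories: list[str]) -> float:
    """Fraction of (image, label) records the model labels correctly."""
    correct = sum(query_vlm(img, categories) == label for img, label in records)
    return correct / len(records)

# Hypothetical weed-identification split: with k categories, random guessing
# scores about 1/k -- the baseline the abstract says most open-source VLMs sit near.
categories = [f"weed_{i}" for i in range(20)]
records = [(f"img_{i}.jpg", random.choice(categories)) for i in range(500)]
print(f"accuracy: {accuracy(records, categories):.3f} (random baseline = {1/len(categories):.3f})")
```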
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2507.20519
• PDF: https://arxiv.org/pdf/2507.20519
• Project Page: https://dahlian00.github.io/AgroBenchPage/
• Github: https://dahlian00.github.io/AgroBenchPage/
🔹 Datasets citing this paper:
No datasets found
🔹 Spaces citing this paper:
No spaces found
==================================
🔹 Title: ToonComposer: Streamlining Cartoon Production with Generative Post-Keyframing
🔹 Publication Date: Published on Aug 14
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2508.10881
• PDF: https://arxiv.org/pdf/2508.10881
• Project Page: https://lg-li.github.io/project/tooncomposer
• Github: https://github.com/TencentARC/ToonComposer
🔹 Datasets citing this paper:
No datasets found
🔹 Spaces citing this paper:
No spaces found
==================================
🔹 Title: A Survey on Diffusion Language Models
🔹 Publication Date: Published on Aug 14
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2508.10875
• PDF: https://arxiv.org/pdf/2508.10875
🔹 Datasets citing this paper:
No datasets found
🔹 Spaces citing this paper:
No spaces found
==================================
🔹 Title: Processing and acquisition traces in visual encoders: What does CLIP know about your camera?
🔹 Publication Date: Published on Aug 14
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2508.10637
• PDF: https://arxiv.org/pdf/2508.10637
• Github: https://github.com/ryan-caesar-ramos/visual-encoder-traces
🔹 Datasets citing this paper:
No datasets found
🔹 Spaces citing this paper:
No spaces found
==================================
🔹 Title: STream3R: Scalable Sequential 3D Reconstruction with Causal Transformer
🔹 Publication Date: Published on Aug 14
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2508.10893
• PDF: https://arxiv.org/pdf/2508.10893
• Project Page: https://nirvanalan.github.io/projects/stream3r
• Github: https://github.com/NIRVANALAN/STream3R
🔹 Datasets citing this paper:
No datasets found
🔹 Spaces citing this paper:
No spaces found
==================================
🔹 Title: When Explainability Meets Privacy: An Investigation at the Intersection of Post-hoc Explainability and Differential Privacy in the Context of Natural Language Processing
🔹 Publication Date: Published on Aug 14
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2508.10482
• PDF: https://arxiv.org/pdf/2508.10482
• Github: https://github.com/dmah10/xpnlp
🔹 Datasets citing this paper:
No datasets found
🔹 Spaces citing this paper:
No spaces found
==================================
🔹 Title: Artificial Intelligence and Misinformation in Art: Can Vision Language Models Judge the Hand or the Machine Behind the Canvas?
🔹 Publication Date: Published on Aug 2
🔹 Abstract: State-of-the-art vision language models struggle with accurately attributing artists and distinguishing AI-generated images, highlighting the need for improvement to prevent misinformation. AI-generated summary: The attribution of artworks in general, and of paintings in particular, has always been an issue in art. The advent of powerful artificial intelligence models that can generate and analyze images creates new challenges for painting attribution. On the one hand, AI models can create images that mimic the style of a painter, which can then be incorrectly attributed, for example, by other AI models. On the other hand, AI models may fail to correctly identify the artist of real paintings, leading users to incorrectly attribute them. In this paper, both problems are studied experimentally using state-of-the-art AI models for image generation and analysis on a large dataset of close to 40,000 paintings from 128 artists. The results show that vision language models have limited capabilities to 1) perform canvas attribution and 2) identify AI-generated images. As users increasingly rely on queries to AI models to get information, these results show the need to improve the capabilities of VLMs to reliably perform artist attribution and detect AI-generated images in order to prevent the spread of incorrect information.
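The paper probes two failure modes, and a minimal sketch of how such probes might be issued is below. ask_vlm is a hypothetical stub, not the paper's pipeline; any vision-language model client could be swapped in, and the prompts are illustrative only.

```python
def ask_vlm(image_path: str, prompt: str) -> str:
    """Hypothetical stub standing in for a vision-language model API call."""
    return "unknown"  # a real client would return the model's answer

def attribute_artist(image_path: str, candidate_artists: list[str]) -> str:
    """Probe 1: artist attribution for a (real) painting."""
    prompt = "Which of these artists painted this work? " + ", ".join(candidate_artists)
    return ask_vlm(image_path, prompt)

def detect_ai_generated(image_path: str) -> str:
    """Probe 2: real-vs-AI detection."""
    return ask_vlm(image_path, "Is this painting human-made or AI-generated? Answer in one word.")

print(attribute_artist("canvas.jpg", ["Monet", "Renoir", "Degas"]))
print(detect_ai_generated("canvas.jpg"))
```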
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2508.01408
• PDF: https://arxiv.org/pdf/2508.01408
• Github: https://ama2210.github.io/WikiArt_VLM_Web/
🔹 Datasets citing this paper:
No datasets found
🔹 Spaces citing this paper:
No spaces found
==================================
🔹 Title: Puppeteer: Rig and Animate Your 3D Models
🔹 Publication Date: Published on Aug 14
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2508.10898
• PDF: https://arxiv.org/pdf/2508.10898
• Project Page: https://chaoyuesong.github.io/Puppeteer/
• Github: https://github.com/Seed3D/Puppeteer
🔹 Datasets citing this paper:
No datasets found
🔹 Spaces citing this paper:
No spaces found
==================================
🔹 Title: We-Math 2.0: A Versatile MathBook System for Incentivizing Visual Mathematical Reasoning
🔹 Publication Date: Published on Aug 14
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2508.10433
• PDF: https://arxiv.org/pdf/2508.10433
• Project Page: https://we-math2.github.io/
• Github: https://github.com/We-Math/We-Math2.0
🔹 Datasets citing this paper:
• https://huggingface.co/datasets/We-Math/We-Math2.0-Pro
• https://huggingface.co/datasets/We-Math/We-Math2.0-Standard
🔹 Spaces citing this paper:
No spaces found
==================================
🔹 Title: ChartCap: Mitigating Hallucination of Dense Chart Captioning
🔹 Publication Date: Published on Aug 5
🔹 Abstract: ChartCap, a large-scale dataset with dense, type-specific captions for real-world charts, improves caption accuracy and reduces hallucinations in vision language models. AI-generated summary: Generating accurate, informative, and hallucination-free captions for charts remains challenging for vision language models, primarily due to the lack of large-scale, high-quality datasets of real-world charts. Moreover, existing real-world chart datasets suffer from the inclusion of extraneous information that cannot be inferred from the chart and from failure to sufficiently capture structural elements and key insights. Therefore, we introduce ChartCap, a large-scale dataset of 565K real-world chart images paired with type-specific, dense captions that exclude extraneous information and highlight both structural elements and key insights in detail. To build ChartCap, we design a four-stage pipeline that generates captions using only the discernible data from the chart, and we employ cycle-consistency-based human verification, which accelerates quality control without sacrificing accuracy. Additionally, we propose a novel metric, the Visual Consistency Score, which evaluates caption quality by measuring the similarity between the chart regenerated from a caption and the original chart, independent of reference captions. Extensive experiments confirm that models fine-tuned on ChartCap consistently generate more accurate and informative captions with reduced hallucinations, surpassing both open-source and proprietary models and even human-annotated captions.
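The Visual Consistency Score is the most concrete idea here: regenerate a chart from the caption, then compare it to the original chart, with no reference captions involved. The sketch below shows that shape with toy stand-ins; render_chart_from_caption and the pixel-based embed are placeholder assumptions, not the paper's actual renderer or vision encoder.

```python
import numpy as np

def render_chart_from_caption(caption: str) -> np.ndarray:
    """Placeholder caption-to-chart renderer; returns a deterministic toy image."""
    rng = np.random.default_rng(abs(hash(caption)) % (2**32))
    return rng.random((64, 64))

def embed(image: np.ndarray) -> np.ndarray:
    """Toy embedding (normalized pixels); a real system would use a vision encoder."""
    v = image.flatten()
    return v / np.linalg.norm(v)

def visual_consistency_score(original_chart: np.ndarray, caption: str) -> float:
    """Cosine similarity between the original chart and the chart regenerated from its caption."""
    regenerated = render_chart_from_caption(caption)
    return float(embed(original_chart) @ embed(regenerated))

original_chart = np.random.default_rng(0).random((64, 64))
print(f"VCS = {visual_consistency_score(original_chart, 'Bar chart of quarterly revenue'):.3f}")
```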
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2508.03164
• PDF: https://arxiv.org/pdf/2508.03164
• Project Page: https://junyoung-00.github.io/ChartCap/
• Github: https://junyoung-00.github.io/ChartCap/
🔹 Datasets citing this paper:
• https://huggingface.co/datasets/junyoung-00/ChartCap
🔹 Spaces citing this paper:
No spaces found
==================================
For more data science resources:
✓ https://t.iss.one/DataScienceT