✨NAACL: Noise-AwAre Verbal Confidence Calibration for LLMs in RAG Systems
📝 Summary:
LLMs in RAG systems exhibit poor confidence calibration due to noisy contexts. This paper proposes NAACL, a noise-aware calibration framework. NAACL uses new rules and supervised fine-tuning to make LLMs intrinsically aware of noisy input, significantly improving confidence calibration.
🔹 Publication Date: Published on Jan 16
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2601.11004
• PDF: https://arxiv.org/pdf/2601.11004
• Github: https://github.com/HKUST-KnowComp/NAACL
==================================
For more data science resources:
✓ https://t.iss.one/DataScienceT
#LLMs #RAG #ConfidenceCalibration #NLP #AI
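A quick illustration of what "confidence calibration" means here. The post does not say which metric the paper reports, so the sketch below uses Expected Calibration Error (ECE), a common choice, purely as an assumption. The function name and the 10-bin default are illustrative and are not taken from the NAACL code.

```python
from typing import List


def expected_calibration_error(
    confidences: List[float], correct: List[bool], n_bins: int = 10
) -> float:
    """Weighted average gap between stated confidence and accuracy, per bin."""
    if len(confidences) != len(correct):
        raise ValueError("confidences and correct must have the same length")
    total = len(confidences)
    if total == 0:
        return 0.0

    # Group answers into equal-width confidence bins over [0, 1].
    bins = [[] for _ in range(n_bins)]
    for conf, ok in zip(confidences, correct):
        idx = min(int(conf * n_bins), n_bins - 1)  # conf == 1.0 goes in the last bin
        bins[idx].append((conf, ok))

    ece = 0.0
    for items in bins:
        if not items:
            continue
        avg_conf = sum(c for c, _ in items) / len(items)
        accuracy = sum(1 for _, ok in items if ok) / len(items)
        ece += (len(items) / total) * abs(avg_conf - accuracy)
    return ece


# Hypothetical data: the model says "90% sure" four times but is right three times.
score = expected_calibration_error([0.9, 0.9, 0.9, 0.9], [True, True, True, False])
```

A lower value means the verbalized confidence tracks actual accuracy more closely. A model that is overconfident on noisy retrieved context would show up as a large gap in the high-confidence bins.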
✨PaddleOCR 3.0 Technical Report
📝 Summary:
PaddleOCR 3.0 is an open-source toolkit offering efficient OCR and document parsing solutions. Its models achieve competitive accuracy and efficiency with fewer than 100 million parameters, rivaling much larger vision-language models.
🔹 Publication Date: Published on Jul 8, 2025
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2507.05595
• PDF: https://arxiv.org/pdf/2507.05595
• Hugging Face collection: https://huggingface.co/collections/PaddlePaddle/pp-structurev3
==================================
#AI #DataScience #MachineLearning #HuggingFace #Research
✨Multi-Agent Collaboration via Evolving Orchestration
📝 Summary:
A centralized orchestrator, trained with reinforcement learning, dynamically directs LLM agents for multi-agent collaboration. This puppeteer-style method achieves superior performance and reduced computational costs.
🔹 Publication Date: Published on May 26, 2025
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2505.19591
• PDF: https://arxiv.org/pdf/2505.19591
• Github: https://github.com/OpenBMB/ChatDev/tree/puppeteer
==================================
#AI #DataScience #MachineLearning #HuggingFace #Research
✨MemoryRewardBench: Benchmarking Reward Models for Long-Term Memory Management in Large Language Models
📝 Summary:
MemoryRewardBench is a new benchmark that evaluates reward models' ability to assess long-term memory management in LLMs across various context lengths and patterns. Evaluations show that newer RMs outperform their predecessors, open-source models are closing the gap, and current RMs still have limitations.
🔹 Publication Date: Published on Jan 17
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2601.11969
• PDF: https://arxiv.org/pdf/2601.11969
• Github: https://github.com/LCM-Lab/MemRewardBench
==================================
#AI #DataScience #MachineLearning #HuggingFace #Research
✨UniX: Unifying Autoregression and Diffusion for Chest X-Ray Understanding and Generation
📝 Summary:
UniX presents a unified medical foundation model that decouples visual understanding and generation tasks using distinct autoregressive and diffusion branches with cross-modal attention for enhanced p...
🔹 Publication Date: Published on Jan 16
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2601.11522
• PDF: https://arxiv.org/pdf/2601.11522
• Github: https://github.com/ZrH42/UniX
🔹 Models citing this paper:
• https://huggingface.co/ZrH42/UniX
==================================
#AI #DataScience #MachineLearning #HuggingFace #Research
✨Aligning Agentic World Models via Knowledgeable Experience Learning
📝 Summary:
WorldMind addresses LLM physical hallucinations by autonomously building a symbolic world knowledge repository. It unifies process and goal experiences to enforce physical feasibility and task optimality, achieving superior performance and transferability.
🔹 Publication Date: Published on Jan 19
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2601.13247
• PDF: https://arxiv.org/pdf/2601.13247
• Project Page: https://zjunlp.github.io/project/WorldMind/
• Github: https://github.com/zjunlp/WorldMind
==================================
#AI #DataScience #MachineLearning #HuggingFace #Research
✨Advances and Frontiers of LLM-based Issue Resolution in Software Engineering: A Comprehensive Survey
📝 Summary:
Large language models face significant challenges in software issue resolution, prompting the development of autonomous coding agents through various training-free and training-based methodologies.
🔹 Publication Date: Published on Jan 15
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2601.11655
• PDF: https://arxiv.org/pdf/2601.11655
• Project Page: https://deepsoftwareanalytics.github.io/Awesome-Issue-Resolution/
• Github: https://github.com/DeepSoftwareAnalytics/Awesome-Issue-Resolution
==================================
#AI #DataScience #MachineLearning #HuggingFace #Research
✨A BERTology View of LLM Orchestrations: Token- and Layer-Selective Probes for Efficient Single-Pass Classification
📝 Summary:
Lightweight probes trained on hidden states of LLMs enable efficient classification tasks without additional computational overhead, improving safety and sentiment analysis performance.
🔹 Publication Date: Published on Jan 19
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2601.13288
• PDF: https://arxiv.org/pdf/2601.13288
==================================
#AI #DataScience #MachineLearning #HuggingFace #Research
✨Being-H0.5: Scaling Human-Centric Robot Learning for Cross-Embodiment Generalization
📝 Summary:
Being-H0.5 is a Vision-Language-Action model that enables robust cross-embodiment generalization through human-centric learning and a Mixture-of-Transformers architecture with specialized embodiment h...
🔹 Publication Date: Published on Jan 19
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2601.12993
• PDF: https://arxiv.org/pdf/2601.12993
• Project Page: https://research.beingbeyond.com/being-h05
• Github: https://github.com/BeingBeyond/Being-H
🔹 Models citing this paper:
• https://huggingface.co/BeingBeyond/Being-H05-2B
• https://huggingface.co/BeingBeyond/Being-H05-2B_libero
• https://huggingface.co/BeingBeyond/Being-H05-2B_robocasa
==================================
#AI #DataScience #MachineLearning #HuggingFace #Research
✨ToolPRMBench: Evaluating and Advancing Process Reward Models for Tool-using Agents
📝 Summary:
ToolPRMBench is a large-scale benchmark for evaluating process reward models in tool-using agents, featuring step-level test cases and multi-LLM verification to ensure data quality.
🔹 Publication Date: Published on Jan 18
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2601.12294
• PDF: https://arxiv.org/pdf/2601.12294
==================================
#AI #DataScience #MachineLearning #HuggingFace #Research
✨Toward Efficient Agents: Memory, Tool learning, and Planning
📝 Summary:
Efficiency in agentic systems is examined across memory, tool learning, and planning components, analyzing trade-offs between effectiveness and computational costs through various optimization strateg...
🔹 Publication Date: Published on Jan 20
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2601.14192
• PDF: https://arxiv.org/pdf/2601.14192
• Project Page: https://efficient-agents.github.io/
• Github: https://github.com/yxf203/Awesome-Efficient-Agents
==================================
#AI #DataScience #MachineLearning #HuggingFace #Research
✨OmniTransfer: All-in-one Framework for Spatio-temporal Video Transfer
📝 Summary:
OmniTransfer presents a unified framework for spatio-temporal video transfer that enhances appearance consistency and temporal control through multi-view information and multimodal semantic guidance. ...
🔹 Publication Date: Published on Jan 20
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2601.14250
• PDF: https://arxiv.org/pdf/2601.14250
• Project Page: https://pangzecheung.github.io/OmniTransfer/
• Github: https://github.com/PangzeCheung/OmniTransfer
==================================
#AI #DataScience #MachineLearning #HuggingFace #Research
✨PRiSM: Benchmarking Phone Realization in Speech Models
📝 Summary:
PRiSM is a new open-source benchmark for evaluating phonetic perception in speech models, moving beyond surface-level transcription accuracy. It assesses utility in clinical, educational, and multilingual settings with various probes. Findings show diverse language exposure is crucial, and specia...
🔹 Publication Date: Published on Jan 20
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2601.14046
• PDF: https://arxiv.org/pdf/2601.14046
==================================
#AI #DataScience #MachineLearning #HuggingFace #Research
✨LIBERTy: A Causal Framework for Benchmarking Concept-Based Explanations of LLMs with Structural Counterfactuals
📝 Summary:
A framework for generating structured counterfactual pairs using LLMs and SCMs enables improved evaluation and analysis of concept-based explanations in high-stakes domains.
🔹 Publication Date: Published on Jan 15
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2601.10700
• PDF: https://arxiv.org/pdf/2601.10700
• Github: https://github.com/GilatToker/Liberty-benchmark
==================================
#AI #DataScience #MachineLearning #HuggingFace #Research
✨A Hybrid Protocol for Large-Scale Semantic Dataset Generation in Low-Resource Languages: The Turkish Semantic Relations Corpus
📝 Summary:
A hybrid methodology combining FastText embeddings, clustering, and AI classification generates a large-scale Turkish semantic relations dataset with high accuracy validation.
🔹 Publication Date: Published on Jan 19
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2601.13253
• PDF: https://arxiv.org/pdf/2601.13253
==================================
#AI #DataScience #MachineLearning #HuggingFace #Research
✨Beyond Cosine Similarity: Taming Semantic Drift and Antonym Intrusion in a 15-Million Node Turkish Synonym Graph
📝 Summary:
A large-scale semantic clustering system addresses the limitation of neural embeddings in distinguishing synonyms from antonyms through a specialized three-way discriminator and novel clustering algor...
🔹 Publication Date: Published on Jan 19
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2601.13251
• PDF: https://arxiv.org/pdf/2601.13251
==================================
#AI #DataScience #MachineLearning #HuggingFace #Research
PageIndex: A Reasoning-Based Document Index for Vectorless RAG
PageIndex is an open-source RAG framework that removes vector databases and chunking from the document retrieval pipeline.
Most RAG systems rely on semantic similarity: they cut a document into pieces, build embeddings, and then retrieve fragments that "resemble the query".
But similarity does not equal relevance.
In professional documents such as financial reports, legal documents, and technical manuals, multi-step parsing and domain-specific logic are often needed. Vector-based search easily gets stuck when almost every section uses the same terminology.
PageIndex does it differently.
It builds a hierarchical tree from the document, similar to a table of contents, but tailored for LLMs. Then, it uses reasoning-based tree search to "navigate" the structure like a human expert would.
The two-step process:
1. Generate a tree-based index of the document structure
2. Retrieve the needed information through reasoning-based tree search
The LLM can "think" about the document structure. Instead of matching embeddings, it reasons like: "Trends in debt are usually in the financial summary or Appendix G. Let's look there."
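The two steps above can be sketched roughly as follows. This is a minimal illustration and not PageIndex's actual API: `Node`, `tree_search`, and `keyword_chooser` are hypothetical names, and the keyword-overlap chooser is only a runnable stand-in for the step where a real system would ask an LLM which section to open next.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional


@dataclass
class Node:
    """One entry in the tree index: a section with its page range."""
    title: str
    start_page: int
    end_page: int
    summary: str = ""
    children: List["Node"] = field(default_factory=list)


# A chooser looks at the query and the candidate sections and returns the
# index of the section to descend into, or None to stop at the current node.
Chooser = Callable[[str, List[Node]], Optional[int]]


def keyword_chooser(query: str, candidates: List[Node]) -> Optional[int]:
    """Placeholder for the LLM reasoning step (assumption, not the real method)."""
    query_words = set(query.lower().split())
    best_idx, best_overlap = None, 0
    for i, node in enumerate(candidates):
        words = set(f"{node.title} {node.summary}".lower().split())
        overlap = len(query_words & words)
        if overlap > best_overlap:
            best_idx, best_overlap = i, overlap
    return best_idx


def tree_search(root: Node, query: str, choose: Chooser) -> List[Node]:
    """Walk from the root toward a leaf, recording the path for traceability."""
    path = [root]
    node = root
    while node.children:
        idx = choose(query, node.children)
        if idx is None:
            break
        node = node.children[idx]
        path.append(node)
    return path


# Step 1: a hand-built tree index for a made-up annual report.
report = Node("Annual Report", 1, 120, children=[
    Node("Business Overview", 1, 30, "products markets strategy"),
    Node("Financial Summary", 31, 80, "revenue debt trends cash flow", children=[
        Node("Income Statement", 31, 50, "revenue expenses profit"),
        Node("Debt and Liabilities", 51, 80, "debt maturity trends"),
    ]),
    Node("Appendix G", 81, 120, "supplementary tables"),
])

# Step 2: navigate the tree instead of matching chunk embeddings.
path = tree_search(report, "debt trends", keyword_chooser)
print(" > ".join(n.title for n in path), f"(pages {path[-1].start_page}-{path[-1].end_page})")
```

Because the search returns the whole path plus a page range, each answer can be traced back to a specific section, which is the "traceable retrieval" idea from the post. Swapping `keyword_chooser` for a function that prompts an LLM with the section titles and summaries is where the reasoning would come in.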
Key features:
* No vector database or embedding pipeline
* No artificial chunking that breaks context at boundaries
* Traceable retrieval with precise references down to the page level
* Navigation based on reasoning, mirroring human document analysis
PageIndex is used in Mafin 2.5 and claims 98.7% accuracy on FinanceBench for financial document analysis.
And yes, it's completely open source.
👉 @DataScienceT
✨FutureOmni: Evaluating Future Forecasting from Omni-Modal Context for Multimodal LLMs
📝 Summary:
FutureOmni is the first benchmark evaluating multimodal models' ability to forecast future events from audio-visual data. Current models struggle, particularly with speech-heavy scenarios. The paper proposes an improved training strategy, Omni-Modal Future Forecasting, which enhances performance a...
🔹 Publication Date: Published on Jan 20
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2601.13836
• PDF: https://arxiv.org/pdf/2601.13836
• Project Page: https://openmoss.github.io/FutureOmni
• Github: https://openmoss.github.io/FutureOmni
✨ Datasets citing this paper:
• https://huggingface.co/datasets/OpenMOSS-Team/FutureOmni
==================================
#MultimodalLLMs #FutureForecasting #AIResearch #DeepLearning #Benchmarking
✨Agentic-R: Learning to Retrieve for Agentic Search
📝 Summary:
This paper introduces Agentic-R, a new retriever training framework for agentic search. It uses both local relevance and global answer correctness metrics, with iterative optimization between the agent and retriever. Agentic-R consistently outperforms strong baselines on QA benchmarks.
🔹 Publication Date: Published on Jan 17
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2601.11888
• PDF: https://arxiv.org/pdf/2601.11888
🔹 Models citing this paper:
• https://huggingface.co/liuwenhan/Agentic-R_e5
==================================
#AgenticSearch #InformationRetrieval #MachineLearning #QuestionAnswering #AIResearch
✨SciCoQA: Quality Assurance for Scientific Paper-Code Alignment
📝 Summary:
SciCoQA is a dataset of 611 paper-code discrepancies for identifying mismatches between scientific publications and their code. It shows that even advanced language models struggle significantly to detect these issues, with the best model finding fewer than half of the real-world discrepancies.
🔹 Publication Date: Published on Jan 19
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2601.12910
• PDF: https://arxiv.org/pdf/2601.12910
• Project Page: https://ukplab.github.io/scicoqa
• Github: https://github.com/UKPLab/scicoqa
✨ Datasets citing this paper:
• https://huggingface.co/datasets/UKPLab/scicoqa
✨ Spaces citing this paper:
• https://huggingface.co/spaces/UKPLab/scicoqa
==================================
#SciCoQA #AcademicIntegrity #CodeQuality #NLP #ResearchData
✨Think3D: Thinking with Space for Spatial Reasoning
📝 Summary:
Think3D improves vision-language models' 3D reasoning by enabling interactive spatial exploration using 3D reconstruction and camera operations. This training-free framework significantly boosts performance on spatial reasoning tasks for models like GPT-4.1 and Gemini 2.5 Pro, offering a path to ...
🔹 Publication Date: Published on Jan 19
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2601.13029
• PDF: https://arxiv.org/pdf/2601.13029
• Github: https://github.com/zhangzaibin/spagent
==================================
#3DReasoning #SpatialAI #VisionLanguageModels #MachineLearning #ComputerVision