Data Science | Machine Learning with Python for Researchers

The Data Science and Python channel is for researchers and advanced programmers

🔹 Title: Don't Overthink It: A Survey of Efficient R1-style Large Reasoning Models

🔹 Publication Date: Published on Aug 4

🔹 Abstract: Research on efficient reasoning methods for Large Reasoning Models (LRMs) aims to reduce reasoning path length without sacrificing performance, through single-model optimization and model collaboration.

AI-generated summary: Recently, Large Reasoning Models (LRMs) have become a research hotspot due to their outstanding performance on complex tasks. Among them, DeepSeek R1 has garnered significant attention for its exceptional performance and open-source nature, driving advances in research on R1-style LRMs. Unlike traditional Large Language Models (LLMs), these models enhance logical deduction and decision-making during reasoning by incorporating mechanisms such as long chain-of-thought and self-reflection through reinforcement learning. However, with the widespread application of these models, the problem of overthinking has emerged: when generating answers, these models often construct excessively long reasoning chains with redundant or repetitive steps, which reduces reasoning efficiency and may affect the accuracy of the final answer. To this end, various efficient reasoning methods have been proposed that aim to reduce the length of reasoning paths without compromising model performance or reasoning capability. By systematically reviewing current research in the field, we categorize existing work into two main directions through the lens of single-model optimization versus model collaboration: (1) Efficient Reasoning with a Single Model, which focuses on improving the reasoning efficiency of individual models; and (2) Efficient Reasoning with Model Collaboration, which explores optimizing reasoning paths through collaboration among multiple models. In addition, we maintain a public GitHub repository that tracks the latest progress in efficient reasoning methods.

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2508.02120

• PDF: https://arxiv.org/pdf/2508.02120

• Github: https://github.com/yuelinan/Awesome-Efficient-R1-style-LRMs

🔹 Datasets citing this paper:
No datasets found

🔹 Spaces citing this paper:
No spaces found
==================================

For more data science resources:
https://t.iss.one/DataScienceT
🔹 Title: On the Generalization of SFT: A Reinforcement Learning Perspective with Reward Rectification

🔹 Publication Date: Published on Aug 7

🔹 Abstract: Dynamic Fine-Tuning (DFT) improves the generalization of Large Language Models (LLMs) by dynamically rescaling gradients, outperforming standard Supervised Fine-Tuning (SFT) and showing competitive results in offline reinforcement learning.

AI-generated summary: We present a simple yet theoretically motivated improvement to Supervised Fine-Tuning (SFT) for Large Language Models (LLMs), addressing its limited generalization compared to reinforcement learning (RL). Through mathematical analysis, we reveal that standard SFT gradients implicitly encode a problematic reward structure that may severely restrict the generalization capabilities of the model. To rectify this, we propose Dynamic Fine-Tuning (DFT), which stabilizes gradient updates for each token by dynamically rescaling the objective function with the probability of that token. Remarkably, this single-line code change significantly outperforms standard SFT across multiple challenging benchmarks and base models, demonstrating greatly improved generalization. Additionally, our approach shows competitive results in offline RL settings, offering an effective yet simpler alternative. This work bridges theoretical insight and practical solutions, substantially advancing SFT performance. The code will be available at https://github.com/yongliang-wu/DFT.
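
Since the abstract describes DFT as a single-line rescaling of the SFT objective, here is a minimal PyTorch sketch of that idea, assuming standard (batch, seq_len, vocab) logits; the function and variable names are ours, and the authors' repository remains the reference implementation:

```python
import torch
import torch.nn.functional as F

def dft_loss(logits, labels, ignore_index=-100):
    # Per-token log-probabilities over the vocabulary.
    logprobs = F.log_softmax(logits, dim=-1)
    safe_labels = labels.clamp_min(0)  # keep gather() valid on ignored positions
    tok_logp = logprobs.gather(-1, safe_labels.unsqueeze(-1)).squeeze(-1)
    # The "single-line change": rescale each token's NLL by its own
    # (detached) probability instead of weighting all tokens equally.
    loss = -(tok_logp.exp().detach() * tok_logp)
    mask = (labels != ignore_index).float()
    return (loss * mask).sum() / mask.sum()
```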

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2508.05629

• PDF: https://arxiv.org/pdf/2508.05629

• Github: https://github.com/yongliang-wu/DFT

🔹 Datasets citing this paper:
No datasets found

🔹 Spaces citing this paper:
No spaces found
==================================

For more data science resources:
https://t.iss.one/DataScienceT
🔹 Title: Genie Envisioner: A Unified World Foundation Platform for Robotic Manipulation

🔹 Publication Date: Published on Aug 7

🔹 Abstract: Genie Envisioner integrates policy learning, evaluation, and simulation using a video diffusion model and neural simulator for instruction-driven robotic manipulation.

AI-generated summary: We introduce Genie Envisioner (GE), a unified world foundation platform for robotic manipulation that integrates policy learning, evaluation, and simulation within a single video-generative framework. At its core, GE-Base is a large-scale, instruction-conditioned video diffusion model that captures the spatial, temporal, and semantic dynamics of real-world robotic interactions in a structured latent space. Built upon this foundation, GE-Act maps latent representations to executable action trajectories through a lightweight, flow-matching decoder, enabling precise and generalizable policy inference across diverse embodiments with minimal supervision. To support scalable evaluation and training, GE-Sim serves as an action-conditioned neural simulator, producing high-fidelity rollouts for closed-loop policy development. The platform is further equipped with EWMBench, a standardized benchmark suite measuring visual fidelity, physical consistency, and instruction-action alignment. Together, these components establish Genie Envisioner as a scalable and practical foundation for instruction-driven, general-purpose embodied intelligence. All code, models, and benchmarks will be released publicly.

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2508.05635

• PDF: https://arxiv.org/pdf/2508.05635

• Project Page: https://genie-envisioner.github.io/

🔹 Datasets citing this paper:
No datasets found

🔹 Spaces citing this paper:
No spaces found
==================================

For more data science resources:
https://t.iss.one/DataScienceT
🔹 Title: Beyond the Trade-off: Self-Supervised Reinforcement Learning for Reasoning Models' Instruction Following

🔹 Publication Date: Published on Aug 4

🔹 Abstract: A self-supervised RL framework enhances instruction following in reasoning models without external supervision, maintaining reasoning performance and offering scalability and cost-effectiveness.

AI-generated summary: Reasoning models excel at complex problem solving but exhibit a concerning trade-off between reasoning capability and instruction following. Existing approaches for improving instruction following rely on stronger external models, creating methodological bottlenecks and practical limitations, including increased costs and accessibility constraints. We propose a self-supervised RL framework that leverages reasoning models' own internal signals to improve instruction-following capabilities without external supervision. Extensive experiments demonstrate that our framework significantly improves instruction following while maintaining reasoning performance, offering a scalable and cost-effective approach to enhancing instruction following in reasoning models. The data and code are publicly available at https://github.com/Rainier-rq/verl-if.

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2508.02150

• PDF: https://arxiv.org/pdf/2508.02150

• Github: https://github.com/Rainier-rq/verl-if

🔹 Datasets citing this paper:
No datasets found

🔹 Spaces citing this paper:
No spaces found
==================================

For more data science resources:
https://t.iss.one/DataScienceT
🔹 Title: CRINN: Contrastive Reinforcement Learning for Approximate Nearest Neighbor Search

🔹 Publication Date: Published on Aug 4

🔹 Abstract: CRINN, a reinforcement learning-based approach, optimizes approximate nearest-neighbor search algorithms for speed while maintaining accuracy, outperforming state-of-the-art methods on several benchmarks.

AI-generated summary: Approximate nearest-neighbor search (ANNS) algorithms have become increasingly critical for recent AI applications, particularly in retrieval-augmented generation (RAG) and agent-based LLM applications. In this paper, we present CRINN, a new paradigm for ANNS algorithms. CRINN treats ANNS optimization as a reinforcement learning problem where execution speed serves as the reward signal. This approach enables the automatic generation of progressively faster ANNS implementations while maintaining accuracy constraints. Our experimental evaluation demonstrates CRINN's effectiveness across six widely used ANNS benchmark datasets. When compared against state-of-the-art open-source ANNS algorithms, CRINN achieves the best performance on three of them (GIST-960-Euclidean, MNIST-784-Euclidean, and GloVe-25-angular) and ties for first place on two of them (SIFT-128-Euclidean and GloVe-25-angular). The implications of CRINN's success reach well beyond ANNS optimization: it validates that LLMs augmented with reinforcement learning can function as an effective tool for automating sophisticated algorithmic optimizations that demand specialized knowledge and labor-intensive manual refinement. Code can be found at https://github.com/deepreinforce-ai/CRINN
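
To make the speed-as-reward formulation concrete, here is a hedged sketch of such a reward signal, assuming a candidate search_fn(query, k) implementation and exact ground-truth neighbors; the recall threshold and all names are illustrative assumptions, not CRINN's actual code:

```python
import time
import numpy as np

def speed_reward(search_fn, queries, ground_truth, k=10, min_recall=0.9):
    # Time the candidate ANNS implementation over a batch of queries.
    start = time.perf_counter()
    results = [search_fn(q, k) for q in queries]
    elapsed = time.perf_counter() - start
    # Recall@k against exact ground-truth neighbors.
    recall = np.mean([len(set(r) & set(g[:k])) / k
                      for r, g in zip(results, ground_truth)])
    # Violating the accuracy constraint yields no reward; otherwise
    # the reward is throughput (queries per second).
    return 0.0 if recall < min_recall else len(queries) / elapsed
```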

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2508.02091

• PDF: https://arxiv.org/pdf/2508.02091

🔹 Datasets citing this paper:
No datasets found

🔹 Spaces citing this paper:
No spaces found
==================================

For more data science resources:
https://t.iss.one/DataScienceT
🔹 Title: Hi3DEval: Advancing 3D Generation Evaluation with Hierarchical Validity

🔹 Publication Date: Published on Aug 7

🔹 Abstract: Hi3DEval is a hierarchical evaluation framework for 3D generative content that combines object-level and part-level assessments, including material realism, using a large-scale dataset and hybrid 3D representations.

AI-generated summary: Despite rapid advances in 3D content generation, quality assessment for generated 3D assets remains challenging. Existing methods mainly rely on image-based metrics and operate solely at the object level, limiting their ability to capture spatial coherence, material authenticity, and high-fidelity local details. To address these challenges, (1) we introduce Hi3DEval, a hierarchical evaluation framework tailored for 3D generative content. It combines object-level and part-level evaluation, enabling holistic assessment across multiple dimensions as well as fine-grained quality analysis. Additionally, we extend texture evaluation beyond aesthetic appearance by explicitly assessing material realism, focusing on attributes such as albedo, saturation, and metallicness. (2) To support this framework, we construct Hi3DBench, a large-scale dataset comprising diverse 3D assets and high-quality annotations, accompanied by a reliable multi-agent annotation pipeline. We further propose a 3D-aware automated scoring system based on hybrid 3D representations. Specifically, we leverage video-based representations for object-level and material-subject evaluations to enhance modeling of spatio-temporal consistency, and employ pretrained 3D features for part-level perception. Extensive experiments demonstrate that our approach outperforms existing image-based metrics in modeling 3D characteristics and achieves superior alignment with human preference, providing a scalable alternative to manual evaluation. The project page is available at https://zyh482.github.io/Hi3DEval/.

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2508.05609

• PDF: https://arxiv.org/pdf/2508.05609

• Project Page: https://zyh482.github.io/Hi3DEval/

🔹 Datasets citing this paper:
No datasets found

🔹 Spaces citing this paper:
https://huggingface.co/spaces/3DTopia/3DGen-Leaderboard
==================================

For more data science resources:
https://t.iss.one/DataScienceT
🔹 Title: Are Today's LLMs Ready to Explain Well-Being Concepts?

🔹 Publication Date: Published on Aug 6

🔹 Abstract: LLMs can be fine-tuned to generate high-quality, audience-tailored explanations of well-being concepts using Supervised Fine-Tuning and Direct Preference Optimization.

AI-generated summary: Well-being encompasses mental, physical, and social dimensions essential to personal growth and informed life decisions. As individuals increasingly consult Large Language Models (LLMs) to understand well-being, a key challenge emerges: can LLMs generate explanations that are not only accurate but also tailored to diverse audiences? High-quality explanations require both factual correctness and the ability to meet the expectations of users with varying expertise. In this work, we construct a large-scale dataset comprising 43,880 explanations of 2,194 well-being concepts, generated by ten diverse LLMs. We introduce a principle-guided LLM-as-a-judge evaluation framework, employing dual judges to assess explanation quality. Furthermore, we show that fine-tuning an open-source LLM using Supervised Fine-Tuning (SFT) and Direct Preference Optimization (DPO) can significantly enhance the quality of generated explanations. Our results reveal that: (1) the proposed LLM judges align well with human evaluations; (2) explanation quality varies significantly across models, audiences, and categories; and (3) DPO- and SFT-finetuned models outperform their larger counterparts, demonstrating the effectiveness of preference-based learning for specialized explanation tasks.
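
For readers unfamiliar with the DPO objective the paper fine-tunes with, here is a minimal sketch of the standard DPO loss (the textbook formulation, not this paper's training code); inputs are the summed log-probabilities of the chosen and rejected explanations under the policy and a frozen reference model:

```python
import torch
import torch.nn.functional as F

def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    # Implicit rewards: log-prob ratios of policy vs. reference model.
    chosen = policy_chosen_logp - ref_chosen_logp
    rejected = policy_rejected_logp - ref_rejected_logp
    # Push the chosen explanation's implicit reward above the rejected one's.
    return -F.logsigmoid(beta * (chosen - rejected)).mean()

# Hypothetical usage with summed log-probs for a batch of two pairs:
loss = dpo_loss(torch.tensor([-12.3, -8.1]), torch.tensor([-14.0, -9.5]),
                torch.tensor([-12.0, -8.4]), torch.tensor([-13.1, -9.2]))
```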

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2508.03990

• PDF: https://arxiv.org/pdf/2508.03990

🔹 Datasets citing this paper:
No datasets found

🔹 Spaces citing this paper:
No spaces found
==================================

For more data science resources:
https://t.iss.one/DataScienceT
🔹 Title: Tool-integrated Reinforcement Learning for Repo Deep Search

🔹 Publication Date: Published on Aug 5

🔹 Abstract: ToolTrain, a two-stage training framework combining supervised fine-tuning and reinforcement learning, enhances LLMs for issue localization by integrating repository retrieval tools, achieving state-of-the-art performance.

AI-generated summary: Issue localization, the process of identifying code locations that need modification to resolve software issues, is a critical yet challenging task in software development. The semantic gap between natural-language issue descriptions and faulty code requires complex multi-hop reasoning through code dependencies. Existing LLM-based agents attempt to address this by integrating repository retrieval tools. However, this transforms issue localization into a demanding task we call Repo Deep Search, which requires the LLM to effectively utilize various repository retrieval tools throughout a multi-step reasoning and navigation process. To tackle this challenge, we present ToolTrain, a two-stage tool-integrated training framework combining rejection-sampled supervised fine-tuning and tool-integrated reinforcement learning to enhance LLMs' ability to use retrieval tools for issue localization. Experimental results show that ToolTrain-trained models achieve state-of-the-art performance, with our 32B model even surpassing Claude-3.7 on function-level localization. The results also show that improved localization performance translates to better end-to-end issue resolution, further demonstrating that training for issue localization is a viable and effective strategy for improving automated software development.
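
As a rough illustration of the rejection-sampled SFT stage described above, here is a hedged Python sketch; the trajectory format, the sample_trajectory callable, and the correctness check are all hypothetical placeholders, not ToolTrain's actual API:

```python
import random

def rejection_sample(sample_trajectory, issues, n_samples=8):
    """Keep only tool-use trajectories that localize the correct target."""
    kept = []
    for issue in issues:
        rollouts = [sample_trajectory(issue) for _ in range(n_samples)]
        correct = [t for t in rollouts if t["predicted"] == issue["target"]]
        if correct:
            kept.append(random.choice(correct))  # one correct rollout per issue
    return kept

# Dummy usage: a stand-in "agent" that guesses among candidate functions.
issues = [{"target": "parse_config", "candidates": ["parse_config", "load_file"]}]
agent = lambda issue: {"predicted": random.choice(issue["candidates"])}
print(len(rejection_sample(agent, issues)))
```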

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2508.03012

• PDF: https://arxiv.org/pdf/2508.03012

• Github: https://github.com/Mizersy/RepoDeepSearch

🔹 Datasets citing this paper:
No datasets found

🔹 Spaces citing this paper:
No spaces found
==================================

For more data science resources:
https://t.iss.one/DataScienceT
🔹 Title: C3: A Bilingual Benchmark for Spoken Dialogue Models Exploring Challenges in Complex Conversations

🔹 Publication Date: Published on Jul 30

🔹 Abstract: A benchmark dataset for Spoken Dialogue Models (SDMs) in English and Chinese is presented to evaluate their performance in understanding and emulating human spoken conversations, addressing challenges like ambiguity and context-dependency.

AI-generated summary: Spoken Dialogue Models (SDMs) have recently attracted significant attention for their ability to generate voice responses directly to users' spoken queries. Despite their increasing popularity, there is a gap in research focused on comprehensively understanding their practical effectiveness in comprehending and emulating human conversations, especially compared to text-based Large Language Models (LLMs), which benefit from extensive benchmarking. Human voice interactions are inherently more complex than text due to characteristics unique to spoken dialogue. Ambiguity poses one challenge, stemming from semantic factors like polysemy as well as phonological aspects such as heterographs, heteronyms, and stress patterns. Additionally, context-dependency, such as omission, coreference, and multi-turn interaction, adds further complexity to human conversational dynamics. To illuminate the current state of SDM development and to address these challenges, we present a benchmark dataset comprising 1,079 instances in English and Chinese. Accompanied by an LLM-based evaluation method that closely aligns with human judgment, this dataset facilitates a comprehensive exploration of SDM performance on these practical challenges.

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2507.22968

• PDF: https://arxiv.org/pdf/2507.22968

• Project Page: https://step-out.github.io/C3-web/

• Github: https://github.com/step-out/C3

🔹 Datasets citing this paper:
https://huggingface.co/datasets/ChengqianMa/C3

🔹 Spaces citing this paper:
No spaces found
==================================

For more data science resources:
https://t.iss.one/DataScienceT
🔹 Title: EVOC2RUST: A Skeleton-guided Framework for Project-Level C-to-Rust Translation

🔹 Publication Date: Published on Aug 6

🔹 Abstract: EvoC2Rust is an automated framework that translates entire C projects to Rust using a skeleton-guided approach, combining rule-based and LLM-based methods to improve syntax, semantics, and safety.

AI-generated summary: Rust's compile-time safety guarantees make it ideal for safety-critical systems, creating demand for translating legacy C codebases to Rust. While various approaches have emerged for this task, they face inherent trade-offs: rule-based solutions struggle to meet code safety and idiomaticity requirements, while LLM-based solutions often fail to generate semantically equivalent Rust code due to heavy dependencies between modules across the codebase. Recent studies have revealed that both kinds of solutions are limited to small-scale programs. In this paper, we propose EvoC2Rust, an automated framework for converting entire C projects to equivalent Rust ones. EvoC2Rust employs a skeleton-guided strategy for project-level translation. The pipeline consists of three evolutionary stages: (1) it first decomposes the C project into functional modules, employs a feature-mapping-enhanced LLM to transform definitions and macros, and generates type-checked function stubs, which together form a compilable Rust skeleton; (2) it then incrementally translates each function, replacing the corresponding stub placeholder; (3) finally, it repairs compilation errors by integrating LLM-based fixes and static analysis. Through this evolutionary augmentation, EvoC2Rust combines the advantages of rule-based and LLM-based solutions. Our evaluation on open-source benchmarks and six industrial projects demonstrates EvoC2Rust's superior performance in project-level C-to-Rust translation. On average, it achieves 17.24% and 14.32% improvements in syntax and semantic accuracy over LLM-based approaches, along with a 96.79% higher code-safety rate than rule-based tools. At the module level, EvoC2Rust reaches 92.25% compilation and 89.53% test pass rates on industrial projects, even for complex codebases and long functions.

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2508.04295

• PDF: https://arxiv.org/pdf/2508.04295

🔹 Datasets citing this paper:
No datasets found

🔹 Spaces citing this paper:
No spaces found
==================================

For more data science resources:
https://t.iss.one/DataScienceT
🔹 Title: R-Zero: Self-Evolving Reasoning LLM from Zero Data

🔹 Publication Date: Published on Aug 7

🔹 Abstract: R-Zero is a self-evolving framework that autonomously generates and learns from its own training data, improving reasoning capabilities in LLMs without human-curated tasks.

AI-generated summary: Self-evolving Large Language Models (LLMs) offer a scalable path toward super-intelligence by autonomously generating, refining, and learning from their own experiences. However, existing methods for training such models still rely heavily on vast human-curated tasks and labels, typically via fine-tuning or reinforcement learning, which poses a fundamental bottleneck to advancing AI systems toward capabilities beyond human intelligence. To overcome this limitation, we introduce R-Zero, a fully autonomous framework that generates its own training data from scratch. Starting from a single base LLM, R-Zero initializes two independent models with distinct roles, a Challenger and a Solver. These models are optimized separately and co-evolve through interaction: the Challenger is rewarded for proposing tasks near the edge of the Solver's capability, and the Solver is rewarded for solving increasingly challenging tasks posed by the Challenger. This process yields a targeted, self-improving curriculum without any pre-existing tasks or labels. Empirically, R-Zero substantially improves reasoning capability across different backbone LLMs, e.g., boosting Qwen3-4B-Base by +6.49 on math-reasoning benchmarks and +7.54 on general-domain reasoning benchmarks.
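
A toy sketch of the co-evolution signal described above, under our own simplifying assumptions (self-consistency as the Solver's confidence, Challenger reward peaking at 50% success); none of these names or formulas are claimed to match the paper's implementation:

```python
import random

def solver_success_rate(solver, task, n_samples=8):
    # Majority vote over samples serves as a pseudo-label; agreement
    # with the majority approximates the Solver's success rate.
    answers = [solver(task) for _ in range(n_samples)]
    majority = max(set(answers), key=answers.count)
    return answers.count(majority) / n_samples

def challenger_reward(success_rate):
    # Highest reward when the Solver is maximally uncertain (rate near 0.5),
    # i.e., the task sits at the edge of the Solver's capability.
    return 1.0 - 2.0 * abs(success_rate - 0.5)

# Dummy usage: a "solver" that answers a binary question at random.
solver = lambda task: random.choice(["A", "B"])
print(challenger_reward(solver_success_rate(solver, "toy task")))
```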

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2508.05004

• PDF: https://arxiv.org/pdf/2508.05004

• Project Page: https://chengsong-huang.github.io/R-Zero.github.io/

• Github: https://github.com/Chengsong-Huang/R-Zero

🔹 Datasets citing this paper:
No datasets found

🔹 Spaces citing this paper:
No spaces found
==================================

For more data science resources:
https://t.iss.one/DataScienceT
🔹 Title: Are We on the Right Way for Assessing Document Retrieval-Augmented Generation?

🔹 Publication Date: Published on Aug 5

🔹 Abstract: Double-Bench is a large-scale, multilingual, and multimodal evaluation system for document Retrieval-Augmented Generation (RAG) systems, addressing limitations in current benchmarks and providing comprehensive assessments of system components.

AI-generated summary: Retrieval-Augmented Generation (RAG) systems using Multimodal Large Language Models (MLLMs) show great promise for complex document understanding, yet their development is critically hampered by inadequate evaluation. Current benchmarks often focus on specific parts of the document RAG system and use synthetic data with incomplete ground-truth and evidence labels, therefore failing to reflect real-world bottlenecks and challenges. To overcome these limitations, we introduce Double-Bench: a new large-scale, multilingual, and multimodal evaluation system that can produce fine-grained assessments of each component within document RAG systems. It comprises 3,276 documents (72,880 pages) and 5,168 single- and multi-hop queries across 6 languages and 4 document types, with streamlined dynamic-update support for potential data contamination issues. Queries are grounded in exhaustively scanned evidence pages and verified by human experts to ensure maximum quality and completeness. Our comprehensive experiments across 9 state-of-the-art embedding models, 4 MLLMs, and 4 end-to-end document RAG frameworks demonstrate that the gap between text and visual embedding models is narrowing, highlighting the need for stronger document retrieval models. Our findings also reveal an over-confidence dilemma in current document RAG frameworks, which tend to provide an answer even without evidence support. We hope our fully open-source Double-Bench provides a rigorous foundation for future research in advanced document RAG systems. We plan to retrieve timely corpora and release new benchmarks on an annual basis.

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2508.03644

• PDF: https://arxiv.org/pdf/2508.03644

• Project Page: https://double-bench.github.io/

• Github: https://github.com/Episoode/Double-Bench

🔹 Datasets citing this paper:
No datasets found

🔹 Spaces citing this paper:
No spaces found
==================================

For more data science resources:
https://t.iss.one/DataScienceT
🔹 Title: RoboMemory: A Brain-inspired Multi-memory Agentic Framework for Lifelong Learning in Physical Embodied Systems

🔹 Publication Date: Published on Aug 2

🔹 Abstract: RoboMemory, a brain-inspired multi-memory framework, enhances lifelong learning in physical robots by integrating cognitive neuroscience principles and achieving state-of-the-art performance in real-world tasks.

AI-generated summary: We present RoboMemory, a brain-inspired multi-memory framework for lifelong learning in physical embodied systems, addressing critical challenges in real-world environments: continuous learning, multi-module memory latency, task-correlation capture, and infinite-loop mitigation in closed-loop planning. Grounded in cognitive neuroscience, it integrates four core modules: the Information Preprocessor (thalamus-like), the Lifelong Embodied Memory System (hippocampus-like), the Closed-Loop Planning Module (prefrontal-lobe-like), and the Low-Level Executor (cerebellum-like) to enable long-term planning and cumulative learning. The Lifelong Embodied Memory System, central to the framework, alleviates inference-speed issues in complex memory frameworks via parallelized updates and retrieval across Spatial, Temporal, Episodic, and Semantic submodules. It incorporates a dynamic Knowledge Graph (KG) and a consistent architectural design to enhance memory consistency and scalability. Evaluations on EmbodiedBench show that RoboMemory outperforms the open-source baseline (Qwen2.5-VL-72B-Ins) by 25% in average success rate and surpasses the closed-source state-of-the-art (SOTA) (Claude3.5-Sonnet) by 5%, establishing a new SOTA. Ablation studies validate key components (critic, spatial memory, long-term memory), while real-world deployment confirms its lifelong learning capability with significantly improved success rates across repeated tasks. RoboMemory alleviates high-latency challenges with scalability, serving as a foundational reference for integrating multi-modal memory systems in physical robots.

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2508.01415

• PDF: https://arxiv.org/pdf/2508.01415

🔹 Datasets citing this paper:
No datasets found

🔹 Spaces citing this paper:
No spaces found
==================================

For more data science resources:
https://t.iss.one/DataScienceT
🔹 Title: On the Expressiveness of Softmax Attention: A Recurrent Neural Network Perspective

🔹 Publication Date: Published on Jul 31

🔹 Abstract: Softmax attention is more expressive than linear attention due to its recurrent form, which can be analyzed using RNN components.

AI-generated summary: Since its introduction, softmax attention has become the backbone of modern transformer architectures due to its expressiveness and scalability across a wide range of tasks. However, the main drawback of softmax attention is its quadratic memory requirement and computational complexity with respect to sequence length. By replacing the softmax nonlinearity, linear attention and similar methods have been introduced to avoid this quadratic bottleneck. Despite these linear forms of attention being derived from the original softmax formulation, they typically lag in downstream accuracy. While strong intuition about the softmax nonlinearity on the query-key inner product suggests that it has desirable properties compared to other nonlinearities, the question of why this discrepancy exists remains unanswered. This work demonstrates that linear attention is an approximation of softmax attention by deriving the recurrent form of softmax attention. Using this form, each part of softmax attention can be described in the language of recurrent neural networks (RNNs). Describing softmax attention as an RNN allows for ablation of its components to understand the importance of each part and how they interact. In this way, our work helps explain why softmax attention is more expressive than its counterparts.
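
To make the recurrent view concrete, here is a hedged NumPy sketch of causal softmax attention for a single query, computed as an online (streaming) recurrence over past key-value pairs; this is the standard online-softmax recurrence, shown only to illustrate the RNN analogy, not the paper's exact derivation:

```python
import numpy as np

def streaming_attention(q, keys, values):
    """One query's attention output, updated one (k, v) pair at a time."""
    m = -np.inf   # running max of scores, for numerical stability
    num = 0.0     # running numerator: sum of exp(score) * v
    den = 0.0     # running denominator: sum of exp(score)
    for k, v in zip(keys, values):
        s = float(q @ k)
        m_new = max(m, s)
        scale = np.exp(m - m_new) if np.isfinite(m) else 0.0
        num = num * scale + np.exp(s - m_new) * v   # RNN-like state update
        den = den * scale + np.exp(s - m_new)
        m = m_new
    return num / den

# Matches dense softmax attention on a toy example:
rng = np.random.default_rng(0)
q, K, V = rng.normal(size=4), rng.normal(size=(6, 4)), rng.normal(size=(6, 3))
w = np.exp(K @ q - (K @ q).max()); w /= w.sum()
assert np.allclose(streaming_attention(q, K, V), w @ V)
```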

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2507.23632

• PDF: https://arxiv.org/pdf/2507.23632

• Github: https://github.com/gmongaras/On-the-Expressiveness-of-Softmax-Attention-A-Recurrent-Neural-Network-Perspective

🔹 Datasets citing this paper:
No datasets found

🔹 Spaces citing this paper:
No spaces found
==================================

For more data science resources:
https://t.iss.one/DataScienceT
🔹 Title: ReasonRank: Empowering Passage Ranking with Strong Reasoning Ability

🔹 Publication Date: Published on Aug 9

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2508.07050
• PDF: https://arxiv.org/pdf/2508.07050
• Github: https://github.com/8421BCD/ReasonRank

🔹 Datasets citing this paper:
https://huggingface.co/datasets/liuwenhan/reasonrank_data_13k
https://huggingface.co/datasets/liuwenhan/reasonrank_data_rl
https://huggingface.co/datasets/liuwenhan/reasonrank_data_sft

🔹 Spaces citing this paper:
No spaces found
==================================

For more data science resources:
https://t.iss.one/DataScienceT
🔹 Title: WideSearch: Benchmarking Agentic Broad Info-Seeking

🔹 Publication Date: Published on Aug 11

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2508.07999
• PDF: https://arxiv.org/pdf/2508.07999
• Project Page: https://widesearch-seed.github.io/

🔹 Datasets citing this paper:
https://huggingface.co/datasets/ByteDance-Seed/WideSearch

🔹 Spaces citing this paper:
No spaces found
==================================

For more data science resources:
https://t.iss.one/DataScienceT
🔹 Title: Omni-Effects: Unified and Spatially-Controllable Visual Effects Generation

🔹 Publication Date: Published on Aug 11

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2508.07981
• PDF: https://arxiv.org/pdf/2508.07981
• Project Page: https://amap-ml.github.io/Omni-Effects.github.io/
• Github: https://github.com/AMAP-ML/Omni-Effects

🔹 Datasets citing this paper:
https://huggingface.co/datasets/GD-ML/Omni-VFX

🔹 Spaces citing this paper:
No spaces found
==================================

For more data science resources:
https://t.iss.one/DataScienceT
🔹 Title: Klear-Reasoner: Advancing Reasoning Capability via Gradient-Preserving Clipping Policy Optimization

🔹 Publication Date: Published on Aug 11

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2508.07629
• PDF: https://arxiv.org/pdf/2508.07629
• Github: https://github.com/suu990901/KlearReasoner

🔹 Datasets citing this paper:
No datasets found

🔹 Spaces citing this paper:
No spaces found
==================================

For more data science resources:
https://t.iss.one/DataScienceT
🔹 Title: UserBench: An Interactive Gym Environment for User-Centric Agents

🔹 Publication Date: Published on Jul 29

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2507.22034
• PDF: https://arxiv.org/pdf/2507.22034
• Github: https://github.com/SalesforceAIResearch/UserBench

🔹 Datasets citing this paper:
No datasets found

🔹 Spaces citing this paper:
No spaces found
==================================

For more data science resources:
https://t.iss.one/DataScienceT