Data Science | Machine Learning with Python for Researchers
Admin: @HusseinSheikho

The Data Science and Python channel is for researchers and advanced programmers

🔹 Title: Model-Task Alignment Drives Distinct RL Outcomes

🔹 Publication Date: Published on Aug 28

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2508.21188
• PDF: https://arxiv.org/pdf/2508.21188

🔹 Datasets citing this paper:
No datasets found

🔹 Spaces citing this paper:
No spaces found
==================================

For more data science resources:
https://t.iss.one/DataScienceT
🔹 Title: Mimicking the Physicist's Eye: A VLM-centric Approach for Physics Formula Discovery

🔹 Publication Date: Published on Aug 24

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2508.17380
• PDF: https://arxiv.org/pdf/2508.17380
• Project Page: https://jiaaqiliu.github.io/VIPER-R1/

🔹 Datasets citing this paper:
No datasets found

🔹 Spaces citing this paper:
No spaces found
==================================

🔹 Title: Deep Residual Echo State Networks: exploring residual orthogonal connections in untrained Recurrent Neural Networks

🔹 Publication Date: Published on Aug 28

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2508.21172
• PDF: https://arxiv.org/pdf/2508.21172
• Github: https://github.com/NennoMP/deepresesn

🔹 Datasets citing this paper:
No datasets found

🔹 Spaces citing this paper:
No spaces found
==================================

🔹 Title: Quantization Robustness to Input Degradations for Object Detection

🔹 Publication Date: Published on Aug 27

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2508.19600
• PDF: https://arxiv.org/pdf/2508.19600

🔹 Datasets citing this paper:
No datasets found

🔹 Spaces citing this paper:
No spaces found
==================================

🔹 Title: EduRABSA: An Education Review Dataset for Aspect-based Sentiment Analysis Tasks

🔹 Publication Date: Published on Aug 23

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2508.17008
• Collection: https://huggingface.co/collections/yhua219/edurabsa-dataset-68b59bad56a9e1384de7faf2
• PDF: https://arxiv.org/pdf/2508.17008
• Github: https://github.com/yhua219/edurabsa_dataset_and_annotation_tool

🔹 Datasets citing this paper:
https://huggingface.co/datasets/yhua219/EduRABSA_ASTE
https://huggingface.co/datasets/yhua219/EduRABSA_AOPE
https://huggingface.co/datasets/yhua219/EduRABSA_ASQE
https://huggingface.co/datasets/yhua219/EduRABSA_ACD

🔹 Spaces citing this paper:
No spaces found
==================================

🔹 Title: PVPO: Pre-Estimated Value-Based Policy Optimization for Agentic Reasoning

🔹 Publication Date: Published on Aug 28

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2508.21104
• PDF: https://arxiv.org/pdf/2508.21104

🔹 Datasets citing this paper:
No datasets found

🔹 Spaces citing this paper:
No spaces found
==================================

🔹 Title: No Label Left Behind: A Unified Surface Defect Detection Model for all Supervision Regimes

🔹 Publication Date: Published on Aug 26

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2508.19060
• PDF: https://arxiv.org/pdf/2508.19060
• Github: https://github.com/blaz-r/SuperSimplenet

🔹 Datasets citing this paper:
No datasets found

🔹 Spaces citing this paper:
No spaces found
==================================

🔹 Title: SWE-Exp: Experience-Driven Software Issue Resolution

🔹 Publication Date: Published on Jul 31

🔹 AI-generated summary: SWE-Exp enhances software issue resolution by systematically accumulating and leveraging repair expertise from past agent experiences, improving resolution rates.

🔹 Abstract: Recent advances in large language model (LLM) agents have shown remarkable progress in software issue resolution, leveraging advanced techniques such as multi-agent collaboration and Monte Carlo Tree Search (MCTS). However, current agents act as memoryless explorers, treating each problem separately without retaining or reusing knowledge from previous repair experiences. This leads to redundant exploration of failed trajectories and missed chances to adapt successful issue resolution methods to similar problems. To address this problem, we introduce SWE-Exp, an experience-enhanced approach that distills concise and actionable experience from prior agent trajectories, enabling continuous learning across issues. Our method introduces a multi-faceted experience bank that captures both successful and failed repair attempts. Specifically, it extracts reusable issue resolution knowledge at different levels, from high-level problem comprehension to specific code changes. Experiments show that SWE-Exp achieves a state-of-the-art resolution rate (41.6% Pass@1) on SWE-bench-Verified under open-source agent frameworks. Our approach establishes a new paradigm in which automated software engineering agents systematically accumulate and leverage repair expertise, fundamentally shifting from trial-and-error exploration to strategic, experience-driven issue resolution.
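
To make the "experience bank" idea in the abstract above more concrete, here is a minimal Python sketch. It is not the authors' implementation: the names Experience and ExperienceBank, the two lesson fields, and the word-overlap (Jaccard) retrieval are all assumptions made for illustration. The abstract does not say how SWE-Exp stores or retrieves experiences; see the linked repository for the real method.

```python
from dataclasses import dataclass


@dataclass
class Experience:
    """One past repair attempt (hypothetical schema, not from the paper)."""
    issue: str          # text of the past issue
    succeeded: bool     # the bank keeps both successful and failed attempts
    comprehension: str  # high-level lesson about understanding the problem
    code_change: str    # specific lesson about the edit that was tried


def _tokens(text):
    # Crude tokenization: lowercase, whitespace-separated words.
    return set(text.lower().split())


class ExperienceBank:
    """Stores experiences and retrieves those most similar to a new issue."""

    def __init__(self):
        self.items = []

    def add(self, exp):
        self.items.append(exp)

    def retrieve(self, issue, k=2):
        # Assumption: Jaccard word overlap stands in for whatever
        # similarity measure a real system would use.
        query = _tokens(issue)

        def score(exp):
            other = _tokens(exp.issue)
            union = query | other
            return len(query & other) / len(union) if union else 0.0

        return sorted(self.items, key=score, reverse=True)[:k]


bank = ExperienceBank()
bank.add(Experience(
    issue="TypeError when parsing empty config file",
    succeeded=True,
    comprehension="The parser assumed at least one section exists.",
    code_change="Return an empty mapping before indexing sections.",
))
bank.add(Experience(
    issue="Crash on unicode path in file loader",
    succeeded=False,
    comprehension="The failure was in path encoding, not file contents.",
    code_change="Re-encoding the file body did not help; avoid repeating it.",
))

# Lessons from the closest past issue can be added to the agent's prompt.
for exp in bank.retrieve("TypeError raised when parsing config", k=1):
    print(exp.succeeded, exp.comprehension, exp.code_change)
```

Keeping failed attempts alongside successful ones is what lets an agent skip trajectories that are already known not to work.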

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2507.23361
• PDF: https://arxiv.org/pdf/2507.23361
• Github: https://github.com/YerbaPage/SWE-Exp

🔹 Datasets citing this paper:
No datasets found

🔹 Spaces citing this paper:
No spaces found
==================================

🔹 Title: How Can Input Reformulation Improve Tool Usage Accuracy in a Complex Dynamic Environment? A Study on τ-bench

🔹 Publication Date: Published on Aug 28

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2508.20931
• PDF: https://arxiv.org/pdf/2508.20931

🔹 Datasets citing this paper:
No datasets found

🔹 Spaces citing this paper:
No spaces found
==================================

🔹 Title: Sculptor: Empowering LLMs with Cognitive Agency via Active Context Management

🔹 Publication Date: Published on Aug 6

🔹 AI-generated summary: Sculptor, a framework for Active Context Management, enhances LLM performance on long contexts by enabling proactive attention and memory control, reducing proactive interference and improving reasoning reliability.

🔹 Abstract: Large Language Models (LLMs) suffer from significant performance degradation when processing long contexts due to proactive interference, where irrelevant information in earlier parts of the context disrupts reasoning and memory recall. While most research focuses on external memory systems to augment LLMs' capabilities, we propose a complementary approach: empowering LLMs with Active Context Management (ACM) tools to actively sculpt their internal working memory. We introduce Sculptor, a framework that equips LLMs with three categories of tools: (1) context fragmentation, (2) summary, hide, and restore, and (3) intelligent search. Our approach enables LLMs to proactively manage their attention and working memory, analogous to how humans selectively focus on relevant information while filtering out distractions. Experimental evaluation on information-sparse benchmarks, PI-LLM (proactive interference) and NeedleBench Multi-Needle Reasoning, demonstrates that Sculptor significantly improves performance even without specific training, leveraging LLMs' inherent tool calling generalization capabilities. By enabling Active Context Management, Sculptor not only mitigates proactive interference but also provides a cognitive foundation for more reliable reasoning across diverse long-context tasks, highlighting that explicit context-control strategies, rather than merely larger token windows, are key to robustness at scale.
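
The three tool categories named in the abstract above (fragmentation; summary, hide, and restore; search) can be pictured with a small Python sketch. This is an illustration only, not Sculptor's actual tool interface: the class ContextWorkspace, its method names, the fixed-size word fragments, and the keyword search are all assumptions. In a real setup the LLM would call such tools itself and write the summaries.

```python
class ContextWorkspace:
    """Toy working memory an LLM could edit through tool calls (hypothetical)."""

    def __init__(self, text, fragment_size=50):
        # (1) Context fragmentation: split the context into addressable chunks.
        # Assumption: fixed-size word windows; a real system may split differently.
        words = text.split()
        self.fragments = [
            " ".join(words[i:i + fragment_size])
            for i in range(0, len(words), fragment_size)
        ]
        self.hidden = set()   # indices currently removed from view
        self.summaries = {}   # index -> short replacement text

    # (2) Summary, hide, and restore.
    def summarize(self, idx, summary):
        # The summary text is supplied by the caller (for example, the LLM).
        self.summaries[idx] = summary

    def hide(self, idx):
        self.hidden.add(idx)

    def restore(self, idx):
        # Bring back the original fragment, dropping any hide or summary.
        self.hidden.discard(idx)
        self.summaries.pop(idx, None)

    # (3) Search: plain keyword match over the original fragments, so that
    # hidden fragments can still be found and restored later.
    def search(self, keyword):
        kw = keyword.lower()
        return [i for i, frag in enumerate(self.fragments) if kw in frag.lower()]

    def render(self):
        # The text that would actually be placed in the model's context.
        visible = []
        for i, frag in enumerate(self.fragments):
            if i in self.hidden:
                continue
            visible.append(self.summaries.get(i, frag))
        return "\n".join(visible)


ws = ContextWorkspace("alpha beta gamma delta epsilon zeta", fragment_size=2)
ws.hide(1)                    # drop a distracting fragment from view
hits = ws.search("delta")     # the hidden fragment is still searchable
ws.restore(hits[0])           # bring it back when it becomes relevant
print(ws.render())
```

Hiding leaves the original text untouched, so nothing is lost when a fragment drops out of view; only the rendered context changes.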

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2508.04664
• PDF: https://arxiv.org/pdf/2508.04664

🔹 Datasets citing this paper:
No datasets found

🔹 Spaces citing this paper:
No spaces found
==================================

🔹 Title: UI-Level Evaluation of ALLaM 34B: Measuring an Arabic-Centric LLM via HUMAIN Chat

🔹 Publication Date: Published on Aug 24

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2508.17378
• PDF: https://arxiv.org/pdf/2508.17378

🔹 Datasets citing this paper:
No datasets found

🔹 Spaces citing this paper:
No spaces found
==================================

🔹 Title: T2R-bench: A Benchmark for Generating Article-Level Reports from Real World Industrial Tables

🔹 Publication Date: Published on Aug 27

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2508.19813
• PDF: https://arxiv.org/pdf/2508.19813

🔹 Datasets citing this paper:
https://huggingface.co/datasets/Tele-AI/TeleTableBench

🔹 Spaces citing this paper:
No spaces found
==================================

🔹 Title: From reactive to cognitive: brain-inspired spatial intelligence for embodied agents

🔹 Publication Date: Published on Aug 24

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2508.17198
• PDF: https://arxiv.org/pdf/2508.17198
• Github: https://github.com/Heathcliff-saku/BSC-Nav

🔹 Datasets citing this paper:
No datasets found

🔹 Spaces citing this paper:
No spaces found
==================================

🔹 Title: Democracy-in-Silico: Institutional Design as Alignment in AI-Governed Polities

🔹 Publication Date: Published on Aug 27

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2508.19562
• PDF: https://arxiv.org/pdf/2508.19562

🔹 Datasets citing this paper:
No datasets found

🔹 Spaces citing this paper:
No spaces found
==================================

🔹 Title: Implicit Actor Critic Coupling via a Supervised Learning Framework for RLVR

🔹 Publication Date: Published on Sep 2

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2509.02522
• PDF: https://arxiv.org/pdf/2509.02522
• Github: https://github.com/ritzz-ai/PACS

🔹 Datasets citing this paper:
No datasets found

🔹 Spaces citing this paper:
No spaces found
==================================

🔹 Title: Jointly Reinforcing Diversity and Quality in Language Model Generations

🔹 Publication Date: Published on Sep 2

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2509.02534
• PDF: https://arxiv.org/pdf/2509.02534

🔹 Datasets citing this paper:
No datasets found

🔹 Spaces citing this paper:
No spaces found
==================================

🔹 Title: POINTS-Reader: Distillation-Free Adaptation of Vision-Language Models for Document Conversion

🔹 Publication Date: Published on Sep 1

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2509.01215
• PDF: https://arxiv.org/pdf/2509.01215
• Github: https://github.com/Tencent/POINTS-Reader

🔹 Datasets citing this paper:
No datasets found

🔹 Spaces citing this paper:
No spaces found
==================================

🔹 Title: MedDINOv3: How to adapt vision foundation models for medical image segmentation?

🔹 Publication Date: Published on Sep 2

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2509.02379
• PDF: https://arxiv.org/pdf/2509.02379
• Github: https://github.com/ricklisz/MedDINOv3

🔹 Datasets citing this paper:
No datasets found

🔹 Spaces citing this paper:
No spaces found
==================================

🔹 Title: M3Ret: Unleashing Zero-shot Multimodal Medical Image Retrieval via Self-Supervision

🔹 Publication Date: Published on Sep 1

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2509.01360
• PDF: https://arxiv.org/pdf/2509.01360

🔹 Datasets citing this paper:
No datasets found

🔹 Spaces citing this paper:
No spaces found
==================================

🔹 Title: Towards More Diverse and Challenging Pre-training for Point Cloud Learning: Self-Supervised Cross Reconstruction with Decoupled Views

🔹 Publication Date: Published on Sep 1

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2509.01250
• PDF: https://arxiv.org/pdf/2509.01250
• Github: https://github.com/aHapBean/Point-PQAE

🔹 Datasets citing this paper:
No datasets found

🔹 Spaces citing this paper:
No spaces found
==================================

🔹 Title: Universal Deep Research: Bring Your Own Model and Strategy

🔹 Publication Date: Published on Aug 29

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2509.00244
• PDF: https://arxiv.org/pdf/2509.00244

🔹 Datasets citing this paper:
No datasets found

🔹 Spaces citing this paper:
No spaces found
==================================
