🔹 Title: Dress&Dance: Dress up and Dance as You Like It - Technical Preview
🔹 Publication Date: Published on Aug 28
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2508.21070
• PDF: https://arxiv.org/pdf/2508.21070
🔹 Datasets citing this paper:
No datasets found
🔹 Spaces citing this paper:
No spaces found
==================================
For more data science resources:
✅ https://t.iss.one/DataScienceT
🔹 Title: OnGoal: Tracking and Visualizing Conversational Goals in Multi-Turn Dialogue with Large Language Models
🔹 Publication Date: Published on Aug 28
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2508.21061
• PDF: https://arxiv.org/pdf/2508.21061
🔹 Datasets citing this paper:
No datasets found
🔹 Spaces citing this paper:
No spaces found
==================================
For more data science resources:
✅ https://t.iss.one/DataScienceT
🔹 Title: FakeParts: a New Family of AI-Generated DeepFakes
🔹 Publication Date: Published on Aug 28
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2508.21052
• PDF: https://arxiv.org/pdf/2508.21052
🔹 Datasets citing this paper:
No datasets found
🔹 Spaces citing this paper:
No spaces found
==================================
For more data science resources:
✅ https://t.iss.one/DataScienceT
🔹 Title: CogVLA: Cognition-Aligned Vision-Language-Action Model via Instruction-Driven Routing & Sparsification
🔹 Publication Date: Published on Aug 28
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2508.21046
• PDF: https://arxiv.org/pdf/2508.21046
• Github: https://github.com/JiuTian-VL/CogVLA
🔹 Datasets citing this paper:
No datasets found
🔹 Spaces citing this paper:
No spaces found
==================================
For more data science resources:
✅ https://t.iss.one/DataScienceT
🔹 Title: Provable Benefits of In-Tool Learning for Large Language Models
🔹 Publication Date: Published on Aug 28
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2508.20755
• PDF: https://arxiv.org/pdf/2508.20755
🔹 Datasets citing this paper:
No datasets found
🔹 Spaces citing this paper:
No spaces found
==================================
For more data science resources:
✅ https://t.iss.one/DataScienceT
🔹 Title: LaTCoder: Converting Webpage Design to Code with Layout-as-Thought
🔹 Publication Date: Published on Aug 5
🔹 Abstract (AI-generated summary): LaTCoder enhances layout preservation in design-to-code tasks by dividing webpage designs into blocks and using Chain-of-Thought reasoning with MLLMs, achieving significant improvements in metrics and human preference.
Converting webpage designs into code (design-to-code) plays a vital role in User Interface (UI) development for front-end developers, bridging the gap between visual design and functional implementation. While recent Multimodal Large Language Models (MLLMs) have shown significant potential in design-to-code tasks, they often fail to accurately preserve the layout during code generation. To this end, we draw inspiration from Chain-of-Thought (CoT) reasoning in human cognition and propose LaTCoder, a novel approach that enhances layout preservation in webpage design during code generation with Layout-as-Thought (LaT). Specifically, we first introduce a simple yet efficient algorithm to divide the webpage design into image blocks. Next, we prompt MLLMs using a CoT-based approach to generate code for each block. Finally, we apply two assembly strategies (absolute positioning and an MLLM-based method), followed by dynamic selection to determine the optimal output. We evaluate the effectiveness of LaTCoder using multiple backbone MLLMs (i.e., DeepSeek-VL2, Gemini, and GPT-4o) on both a public benchmark and a newly introduced, more challenging benchmark (CC-HARD) that features complex layouts. The experimental results on automatic metrics demonstrate significant improvements. Specifically, TreeBLEU scores increased by 66.67% and MAE decreased by 38% when using DeepSeek-VL2, compared to direct prompting. Moreover, the human preference evaluation results indicate that annotators favor the webpages generated by LaTCoder in over 60% of cases, providing strong evidence of the effectiveness of our method.
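A minimal sketch of the pipeline the abstract describes (block division, per-block CoT prompting, two assembly strategies, dynamic selection). The callables split_into_blocks, prompt_mllm, assemble_absolute, and render_similarity are hypothetical stand-ins, not the paper's actual API:

```python
# Illustrative sketch only; the helper callables stand in for LaTCoder's
# block-division algorithm, MLLM backend, assembler, and selection metric.
def latcoder(design_image, split_into_blocks, prompt_mllm,
             assemble_absolute, render_similarity):
    # 1. Divide the design into (crop, bounding_box) blocks (Layout-as-Thought).
    blocks = split_into_blocks(design_image)
    # 2. CoT-prompt the MLLM for each block's code.
    snippets = [prompt_mllm("Reason step by step, then write HTML/CSS for this block.", crop)
                for crop, _bbox in blocks]
    # 3. Assemble two candidates: absolute positioning vs. MLLM-based merging.
    by_position = assemble_absolute(snippets, [bbox for _, bbox in blocks])
    by_mllm = prompt_mllm("Merge these block snippets into one coherent page.", snippets)
    # 4. Dynamic selection: keep whichever candidate better matches the design.
    return max((by_position, by_mllm),
               key=lambda page: render_similarity(page, design_image))
```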
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2508.03560
• PDF: https://arxiv.org/pdf/2508.03560
🔹 Datasets citing this paper:
• https://huggingface.co/datasets/xcodemind/CC-HARD
🔹 Spaces citing this paper:
No spaces found
==================================
For more data science resources:
✅ https://t.iss.one/DataScienceT
🔹 Title: Turning the Spell Around: Lightweight Alignment Amplification via Rank-One Safety Injection
🔹 Publication Date: Published on Aug 28
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2508.20766
• PDF: https://arxiv.org/pdf/2508.20766
🔹 Datasets citing this paper:
No datasets found
🔹 Spaces citing this paper:
No spaces found
==================================
For more data science resources:
✅ https://t.iss.one/DataScienceT
🔹 Title: Multi-View 3D Point Tracking
🔹 Publication Date: Published on Aug 28
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2508.21060
• PDF: https://arxiv.org/pdf/2508.21060
🔹 Datasets citing this paper:
No datasets found
🔹 Spaces citing this paper:
No spaces found
==================================
For more data science resources:
✅ https://t.iss.one/DataScienceT
🔹 Title: Persuasion Dynamics in LLMs: Investigating Robustness and Adaptability in Knowledge and Safety with DuET-PD
🔹 Publication Date: Published on Aug 24
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2508.17450
• PDF: https://arxiv.org/pdf/2508.17450
• Github: https://github.com/Social-AI-Studio/DuET-PD
🔹 Datasets citing this paper:
No datasets found
🔹 Spaces citing this paper:
No spaces found
==================================
For more data science resources:
✅ https://t.iss.one/DataScienceT
🔹 Title: OneReward: Unified Mask-Guided Image Generation via Multi-Task Human Preference Learning
🔹 Publication Date: Published on Aug 28
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2508.21066
• PDF: https://arxiv.org/pdf/2508.21066
• Project Page: https://one-reward.github.io/
• Github: https://one-reward.github.io/
🔹 Datasets citing this paper:
No datasets found
🔹 Spaces citing this paper:
No spaces found
==================================
For more data science resources:
✅ https://t.iss.one/DataScienceT
🔹 Title: Social-MAE: A Transformer-Based Multimodal Autoencoder for Face and Voice
🔹 Publication Date: Published on Aug 24
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2508.17502
• PDF: https://arxiv.org/pdf/2508.17502
🔹 Datasets citing this paper:
No datasets found
🔹 Spaces citing this paper:
No spaces found
==================================
For more data science resources:
✅ https://t.iss.one/DataScienceT
🔹 Title: Learning an Efficient Multi-Turn Dialogue Evaluator from Multiple Judges
🔹 Publication Date: Published on Aug 1
🔹 Abstract (AI-generated summary): An efficient multi-turn dialogue evaluator aggregates multiple LLM judgments into a single model to assess dialogue quality with reduced computational cost.
Evaluating the conversational abilities of large language models (LLMs) remains a challenging task. Current mainstream approaches primarily rely on the "LLM-as-a-judge" paradigm, where an LLM is prompted to serve as an evaluator to assess dialogue quality. However, such methods often suffer from various biases, which undermine the reliability and consistency of the evaluation results. To mitigate these biases, recent methods employ multiple LLMs as judges and aggregate their judgments to select the optimal assessment. Although effective, this multi-judge approach incurs significant computational overhead during inference. In this paper, we propose an efficient multi-turn dialogue evaluator that captures the collective wisdom of multiple LLM judges by aggregating their preference knowledge into a single model. Our approach preserves the advantages of diverse multi-judge feedback while drastically reducing the evaluation cost, enabling fast and flexible dialogue quality assessment. Extensive experiments on seven single-rating and pairwise-comparison dialogue evaluation benchmarks demonstrate that our method outperforms existing baselines across diverse scenarios, showcasing its efficiency and robustness.
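The core idea lends itself to a short sketch: collect scores from several LLM judges offline, aggregate them into one training target per dialogue, and fine-tune a single evaluator that is used alone at inference. The code below is a hedged illustration under that assumption; the judges are placeholder scoring callables and the fine-tuning step is only indicated:

```python
# Hedged sketch: distil aggregated multi-judge preferences into targets for
# a single evaluator. Not the paper's API; `judges` are placeholder callables.
from statistics import mean

def build_distillation_targets(dialogues, judges):
    """Aggregate per-dialogue scores from several LLM judges into one target."""
    return [(dialogue, mean(judge(dialogue) for judge in judges))
            for dialogue in dialogues]

# A lightweight evaluator is then fine-tuned on these (dialogue, score) pairs,
# e.g. with a regression or pairwise-ranking loss, so test-time evaluation
# needs a single forward pass instead of querying every judge.
```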
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2508.00454
• PDF: https://arxiv.org/pdf/2508.00454
• Github: https://github.com/James-TYQ/MTDEval
🔹 Datasets citing this paper:
No datasets found
🔹 Spaces citing this paper:
No spaces found
==================================
For more data science resources:
✅ https://t.iss.one/DataScienceT
🔹 Title: LeanK: Learnable K Cache Channel Pruning for Efficient Decoding
🔹 Publication Date: Published on Aug 4
🔹 Abstract (AI-generated summary): LeanK, a learning-based method, prunes unimportant key cache channels in large language models to reduce memory usage and accelerate decoding without sacrificing accuracy.
Large language models (LLMs) enable long-context tasks but face efficiency challenges due to the growing key-value (KV) cache. We propose LeanK, a learning-based method that prunes unimportant key (K) cache channels by leveraging static channel sparsity. With a novel two-stage training process, LeanK learns a channel-wise static mask that satisfies a specific sparsity ratio and hardware alignment requirement. LeanK reduces GPU memory and accelerates decoding without sacrificing accuracy. Experiments demonstrate up to 70% K cache and 16%-18% V cache memory reduction. A custom decoding kernel enables a 1.3x speedup for attention computation. We also provide insights into model channels and attention heads during long-context inference by analyzing the learned importance distribution. Our code is available at https://aka.ms/LeanK.
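A minimal PyTorch-style sketch of what static K-channel pruning looks like at decode time. The importance scores and keep ratio below are illustrative placeholders (in LeanK the mask comes from the two-stage training procedure), and the attention scaling is simplified:

```python
# Illustrative only: random scores stand in for learned channel importance.
import torch

head_dim, keep_ratio = 128, 0.3                                  # keep ~30% of K channels
scores = torch.rand(head_dim)                                    # placeholder importance
kept = torch.topk(scores, int(head_dim * keep_ratio)).indices    # static channel mask

def compress_k(k_cache: torch.Tensor) -> torch.Tensor:
    # k_cache: [batch, heads, seq_len, head_dim]; store only the kept channels.
    return k_cache[..., kept]

def attention_scores(q: torch.Tensor, k_pruned: torch.Tensor) -> torch.Tensor:
    # Slice the query to the same channels so QK^T runs on the pruned dimension.
    return (q[..., kept] @ k_pruned.transpose(-1, -2)) / (head_dim ** 0.5)
```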
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2508.02215
• PDF: https://arxiv.org/pdf/2508.02215
• Project Page: https://aka.ms/LeanK
🔹 Datasets citing this paper:
No datasets found
🔹 Spaces citing this paper:
No spaces found
==================================
For more data science resources:
✅ https://t.iss.one/DataScienceT
🔹 Title: Efficient Agents: Building Effective Agents While Reducing Cost
🔹 Publication Date: Published on Jul 24
🔹 Abstract (AI-generated summary): A study on the efficiency-effectiveness trade-off in LLM-driven agent systems identifies optimal agent framework design to reduce costs while maintaining performance.
The remarkable capabilities of Large Language Model (LLM)-driven agents have enabled sophisticated systems to tackle complex, multi-step tasks, but their escalating costs threaten scalability and accessibility. This work presents the first systematic study of the efficiency-effectiveness trade-off in modern agent systems, addressing the critical need for cost-effective designs without sacrificing performance. We investigate three key questions: (1) How much complexity do agentic tasks inherently require? (2) When do additional modules yield diminishing returns? (3) How much efficiency can be gained through the design of efficient agent frameworks? Through an empirical analysis on the GAIA benchmark, we evaluate the impact of LLM backbone selection, agent framework designs, and test-time scaling strategies. Using the cost-of-pass metric, we quantify the efficiency-performance trade-off across these dimensions. Our findings inform the development of Efficient Agents, a novel agent framework whose complexity is matched to task requirements. Efficient Agents retains 96.7% of the performance of OWL, a leading open-source agent framework, while reducing operational costs from 0.398 to 0.228, resulting in a 28.4% improvement in cost-of-pass. Our work provides actionable insights for designing efficient, high-performing agent systems, advancing the accessibility and sustainability of AI-driven solutions.
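For intuition, a cost-of-pass style metric is often computed as the expected cost to obtain one successful run; the definition and the numbers below are illustrative assumptions, not figures from the paper:

```python
# Assumed definition for illustration: expected cost per successful run.
def cost_of_pass(avg_cost_per_run: float, success_rate: float) -> float:
    return avg_cost_per_run / success_rate if success_rate > 0 else float("inf")

# Made-up example: a leaner framework can win even with a slightly lower
# success rate, because each successful task ends up costing less overall.
print(cost_of_pass(0.40, 0.55))   # heavier framework -> ~0.73 per success
print(cost_of_pass(0.25, 0.50))   # leaner framework  -> 0.50 per success
```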
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2508.02694
• PDF: https://arxiv.org/pdf/2508.02694
• Github: https://github.com/OPPO-PersonalAI/OAgents
🔹 Datasets citing this paper:
No datasets found
🔹 Spaces citing this paper:
No spaces found
==================================
For more data science resources:
✅ https://t.iss.one/DataScienceT
Forwarded from Python | Machine Learning | Coding | R
This channel is for programmers, coders, and software engineers.
0️⃣ Python
1️⃣ Data Science
2️⃣ Machine Learning
3️⃣ Data Visualization
4️⃣ Artificial Intelligence
5️⃣ Data Analysis
6️⃣ Statistics
7️⃣ Deep Learning
8️⃣ Programming Languages
✅ https://t.iss.one/addlist/8_rRW2scgfRhOTc0
✅ https://t.iss.one/Codeprogrammer
🔹 Title: A.S.E: A Repository-Level Benchmark for Evaluating Security in AI-Generated Code
🔹 Publication Date: Published on Aug 25
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2508.18106
• PDF: https://arxiv.org/pdf/2508.18106
• Github: https://github.com/Tencent/AICGSecEval
🔹 Datasets citing this paper:
No datasets found
🔹 Spaces citing this paper:
No spaces found
==================================
For more data science resources:
✅ https://t.iss.one/DataScienceT
🔹 Title: EmbodiedOneVision: Interleaved Vision-Text-Action Pretraining for General Robot Control
🔹 Publication Date: Published on Aug 28
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2508.21112
• PDF: https://arxiv.org/pdf/2508.21112
• Project Page: https://eo-robotics.ai/eo-1
• Github: https://github.com/EO-Robotics/EO-1
🔹 Datasets citing this paper:
No datasets found
🔹 Spaces citing this paper:
No spaces found
==================================
For more data science resources:
✅ https://t.iss.one/DataScienceT
🔹 Title: R-4B: Incentivizing General-Purpose Auto-Thinking Capability in MLLMs via Bi-Mode Annealing and Reinforce Learning
🔹 Publication Date: Published on Aug 28
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2508.21113
• PDF: https://arxiv.org/pdf/2508.21113
• Github: https://github.com/yannqi/R-4B
🔹 Datasets citing this paper:
No datasets found
🔹 Spaces citing this paper:
No spaces found
==================================
For more data science resources:
✅ https://t.iss.one/DataScienceT
🔹 Title: TalkVid: A Large-Scale Diversified Dataset for Audio-Driven Talking Head Synthesis
🔹 Publication Date: Published on Aug 19
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2508.13618
• PDF: https://arxiv.org/pdf/2508.13618
• Project Page: https://freedomintelligence.github.io/talk-vid/
• Github: https://github.com/FreedomIntelligence/TalkVid
🔹 Datasets citing this paper:
No datasets found
🔹 Spaces citing this paper:
No spaces found
==================================
For more data science resources:
✅ https://t.iss.one/DataScienceT
🔹 Title: Efficient Code Embeddings from Code Generation Models
🔹 Publication Date: Published on Aug 29
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2508.21290
• PDF: https://arxiv.org/pdf/2508.21290
🔹 Datasets citing this paper:
No datasets found
🔹 Spaces citing this paper:
No spaces found
==================================
For more data science resources:
✅ https://t.iss.one/DataScienceT