Data Science | Machine Learning with Python for Researchers
Admin: @HusseinSheikho

The Data Science and Python channel is for researchers and advanced programmers

🔹 Title: SEAM: Semantically Equivalent Across Modalities Benchmark for Vision-Language Models

🔹 Publication Date: Published on Aug 25

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2508.18179
• PDF: https://arxiv.org/pdf/2508.18179
• Project Page: https://lilv98.github.io/SEAM-Website/
• Github: https://github.com/CSSLab/SEAM

🔹 Datasets citing this paper:
https://huggingface.co/datasets/lilvjosephtang/SEAM-Benchmark

🔹 Spaces citing this paper:
No spaces found
==================================

For more data science resources:
https://t.iss.one/DataScienceT
🔹 Title: Analysing Chain of Thought Dynamics: Active Guidance or Unfaithful Post-hoc Rationalisation?

🔹 Publication Date: Published on Aug 27

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2508.19827
• PDF: https://arxiv.org/pdf/2508.19827

🔹 Datasets citing this paper:
No datasets found

🔹 Spaces citing this paper:
No spaces found
==================================

🔹 Title: Training a Foundation Model for Materials on a Budget

🔹 Publication Date: Published on Aug 22

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2508.16067
• PDF: https://arxiv.org/pdf/2508.16067
• Github: https://github.com/atomicarchitects/nequix

🔹 Datasets citing this paper:
No datasets found

🔹 Spaces citing this paper:
No spaces found
==================================

🔹 Title: rStar2-Agent: Agentic Reasoning Technical Report

🔹 Publication Date: Published on Aug 28

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2508.20722
• PDF: https://arxiv.org/pdf/2508.20722

🔹 Datasets citing this paper:
https://huggingface.co/datasets/rstar2-reproduce/rStar2-Agent-RL-Data

🔹 Spaces citing this paper:
No spaces found
==================================

🔹 Title: MCP-Bench: Benchmarking Tool-Using LLM Agents with Complex Real-World Tasks via MCP Servers

🔹 Publication Date: Published on Aug 28

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2508.20453
• PDF: https://arxiv.org/pdf/2508.20453
• Github: https://github.com/Accenture/mcp-bench

🔹 Datasets citing this paper:
No datasets found

🔹 Spaces citing this paper:
No spaces found
==================================

🔹 Title: Mixture of Contexts for Long Video Generation

🔹 Publication Date: Published on Aug 28

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2508.21058
• PDF: https://arxiv.org/pdf/2508.21058
• Project Page: https://primecai.github.io/moc/
• Github: https://primecai.github.io/moc/

🔹 Datasets citing this paper:
No datasets found

🔹 Spaces citing this paper:
No spaces found
==================================

🔹 Title: Pref-GRPO: Pairwise Preference Reward-based GRPO for Stable Text-to-Image Reinforcement Learning

🔹 Publication Date: Published on Aug 28

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2508.20751
• PDF: https://arxiv.org/pdf/2508.20751
• Project Page: https://codegoat24.github.io/UnifiedReward/Pref-GRPO
• Github: https://github.com/CodeGoat24/UniGenBench

🔹 Datasets citing this paper:
No datasets found

🔹 Spaces citing this paper:
No spaces found
==================================

🔹 Title: ROSE: Remove Objects with Side Effects in Videos

🔹 Publication Date: Published on Aug 26

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2508.18633
• PDF: https://arxiv.org/pdf/2508.18633
• Project Page: https://rose2025-inpaint.github.io/
• Github: https://github.com/Kunbyte-AI/ROSE

🔹 Datasets citing this paper:
No datasets found

🔹 Spaces citing this paper:
https://huggingface.co/spaces/Kunbyte/ROSE
==================================

🔹 Title: Collaborative Multi-Modal Coding for High-Quality 3D Generation

🔹 Publication Date: Published on Aug 21

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2508.15228
• PDF: https://arxiv.org/pdf/2508.15228

🔹 Datasets citing this paper:
No datasets found

🔹 Spaces citing this paper:
No spaces found
==================================

🔹 Title: USO: Unified Style and Subject-Driven Generation via Disentangled and Reward Learning

🔹 Publication Date: Published on Aug 26

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2508.18966
• PDF: https://arxiv.org/pdf/2508.18966
• Project Page: https://bytedance.github.io/USO/
• Github: https://bytedance.github.io/USO/

🔹 Datasets citing this paper:
No datasets found

🔹 Spaces citing this paper:
https://huggingface.co/spaces/bytedance-research/USO
https://huggingface.co/spaces/bep40/USO
==================================

🔹 Title: AWorld: Orchestrating the Training Recipe for Agentic AI

🔹 Publication Date: Published on Aug 28

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2508.20404
• PDF: https://arxiv.org/pdf/2508.20404
• Github: https://github.com/inclusionAI/AWorld/tree/main

🔹 Datasets citing this paper:
No datasets found

🔹 Spaces citing this paper:
No spaces found
==================================

🔹 Title: TCIA: A Task-Centric Instruction Augmentation Method for Instruction Finetuning

🔹 Publication Date: Published on Aug 28

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2508.20374
• PDF: https://arxiv.org/pdf/2508.20374

🔹 Datasets citing this paper:
No datasets found

🔹 Spaces citing this paper:
No spaces found
==================================

🔹 Title: Dress&Dance: Dress up and Dance as You Like It - Technical Preview

🔹 Publication Date: Published on Aug 28

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2508.21070
• PDF: https://arxiv.org/pdf/2508.21070

🔹 Datasets citing this paper:
No datasets found

🔹 Spaces citing this paper:
No spaces found
==================================

🔹 Title: OnGoal: Tracking and Visualizing Conversational Goals in Multi-Turn Dialogue with Large Language Models

🔹 Publication Date: Published on Aug 28

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2508.21061
• PDF: https://arxiv.org/pdf/2508.21061

🔹 Datasets citing this paper:
No datasets found

🔹 Spaces citing this paper:
No spaces found
==================================

🔹 Title: FakeParts: a New Family of AI-Generated DeepFakes

🔹 Publication Date: Published on Aug 28

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2508.21052
• PDF: https://arxiv.org/pdf/2508.21052

🔹 Datasets citing this paper:
No datasets found

🔹 Spaces citing this paper:
No spaces found
==================================

🔹 Title: CogVLA: Cognition-Aligned Vision-Language-Action Model via Instruction-Driven Routing & Sparsification

🔹 Publication Date: Published on Aug 28

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2508.21046
• PDF: https://arxiv.org/pdf/2508.21046
• Github: https://github.com/JiuTian-VL/CogVLA

🔹 Datasets citing this paper:
No datasets found

🔹 Spaces citing this paper:
No spaces found
==================================

🔹 Title: Provable Benefits of In-Tool Learning for Large Language Models

🔹 Publication Date: Published on Aug 28

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2508.20755
• PDF: https://arxiv.org/pdf/2508.20755

🔹 Datasets citing this paper:
No datasets found

🔹 Spaces citing this paper:
No spaces found
==================================

🔹 Title: LaTCoder: Converting Webpage Design to Code with Layout-as-Thought

🔹 Publication Date: Published on Aug 5

🔹 Abstract: LaTCoder enhances layout preservation in design-to-code tasks by dividing webpage designs into blocks and using Chain-of-Thought reasoning with MLLMs, achieving significant improvements in metrics and human preference. (AI-generated summary)

Converting webpage designs into code (design-to-code) plays a vital role in User Interface (UI) development for front-end developers, bridging the gap between visual design and functional implementation. While recent Multimodal Large Language Models (MLLMs) have shown significant potential in design-to-code tasks, they often fail to accurately preserve the layout during code generation. To this end, we draw inspiration from the Chain-of-Thought (CoT) reasoning in human cognition and propose LaTCoder, a novel approach that enhances layout preservation in webpage design during code generation with Layout-as-Thought (LaT). Specifically, we first introduce a simple yet efficient algorithm to divide the webpage design into image blocks. Next, we prompt MLLMs using a CoT-based approach to generate code for each block. Finally, we apply two assembly strategies, absolute positioning and an MLLM-based method, followed by dynamic selection to determine the optimal output. We evaluate the effectiveness of LaTCoder using multiple backbone MLLMs (i.e., DeepSeek-VL2, Gemini, and GPT-4o) on both a public benchmark and a newly introduced, more challenging benchmark (CC-HARD) that features complex layouts. The experimental results on automatic metrics demonstrate significant improvements. Specifically, TreeBLEU scores increased by 66.67% and MAE decreased by 38% when using DeepSeek-VL2, compared to direct prompting. Moreover, the human preference evaluation results indicate that annotators favor the webpages generated by LaTCoder in over 60% of cases, providing strong evidence of the effectiveness of our method.
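
🔹 Illustration (not from the paper): the abstract names absolute positioning as one of the two assembly strategies. A minimal sketch of that idea, assuming each block comes with a pixel bounding box and an HTML fragment already generated for it. The `Block` class and `assemble_absolute` function are hypothetical names chosen here; the paper's actual block-splitting algorithm, prompts, and dynamic selection step are not reproduced.

```python
from dataclasses import dataclass


@dataclass
class Block:
    """Hypothetical container: one image block of the design plus its generated HTML."""
    x: int       # left edge in pixels
    y: int       # top edge in pixels
    width: int
    height: int
    html: str    # code produced for this block (e.g. by an MLLM)


def assemble_absolute(blocks, page_width, page_height):
    """Place every block's HTML at its original coordinates inside one container."""
    parts = [
        f'<div style="position:relative;width:{page_width}px;height:{page_height}px">'
    ]
    for b in blocks:
        style = (
            f"position:absolute;left:{b.x}px;top:{b.y}px;"
            f"width:{b.width}px;height:{b.height}px"
        )
        parts.append(f'  <div style="{style}">{b.html}</div>')
    parts.append("</div>")
    return "\n".join(parts)


if __name__ == "__main__":
    demo = [
        Block(0, 0, 800, 100, "<h1>Title</h1>"),
        Block(0, 100, 800, 500, "<p>Body</p>"),
    ]
    print(assemble_absolute(demo, 800, 600))
```

Because each fragment is pinned to its source coordinates, the overall layout is kept even if individual fragments are imperfect. The resulting page is not responsive, which may be one reason to pair this strategy with an alternative.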

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2508.03560
• PDF: https://arxiv.org/pdf/2508.03560

🔹 Datasets citing this paper:
https://huggingface.co/datasets/xcodemind/CC-HARD

🔹 Spaces citing this paper:
No spaces found
==================================

🔹 Title: Turning the Spell Around: Lightweight Alignment Amplification via Rank-One Safety Injection

🔹 Publication Date: Published on Aug 28

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2508.20766
• PDF: https://arxiv.org/pdf/2508.20766

🔹 Datasets citing this paper:
No datasets found

🔹 Spaces citing this paper:
No spaces found
==================================

🔹 Title: Multi-View 3D Point Tracking

🔹 Publication Date: Published on Aug 28

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2508.21060
• PDF: https://arxiv.org/pdf/2508.21060

🔹 Datasets citing this paper:
No datasets found

🔹 Spaces citing this paper:
No spaces found
==================================
