Data Science | Machine Learning with Python for Researchers
Admin: @HusseinSheikho

The Data Science and Python channel is for researchers and advanced programmers

🔹 Title: Baseer: A Vision-Language Model for Arabic Document-to-Markdown OCR

🔹 Publication Date: Published on Sep 17

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2509.18174
• PDF: https://arxiv.org/pdf/2509.18174

🔹 Datasets citing this paper:
https://huggingface.co/datasets/Misraj/KITAB_pdf_to_markdown_reviewed
https://huggingface.co/datasets/Misraj/Misraj-DocOCR

🔹 Spaces citing this paper:
No spaces found
==================================

For more data science resources:
https://t.iss.one/DataScienceT
🔹 Title: Large Language Models Discriminate Against Speakers of German Dialects

🔹 Publication Date: Published on Sep 17

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2509.13835
• PDF: https://arxiv.org/pdf/2509.13835

🔹 Datasets citing this paper:
No datasets found

🔹 Spaces citing this paper:
No spaces found
==================================

🔹 Title: VIR-Bench: Evaluating Geospatial and Temporal Understanding of MLLMs via Travel Video Itinerary Reconstruction

🔹 Publication Date: Published on Sep 23

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2509.19002
• PDF: https://arxiv.org/pdf/2509.19002
• Github: https://github.com/nlp-waseda/VIR-Bench

🔹 Datasets citing this paper:
No datasets found

🔹 Spaces citing this paper:
No spaces found
==================================

🔹 Title: GeoSVR: Taming Sparse Voxels for Geometrically Accurate Surface Reconstruction

🔹 Publication Date: Published on Sep 22

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2509.18090
• PDF: https://arxiv.org/pdf/2509.18090
• Project Page: https://fictionarry.github.io/GeoSVR-project/
• Github: https://github.com/Fictionarry/GeoSVR

🔹 Datasets citing this paper:
No datasets found

🔹 Spaces citing this paper:
No spaces found
==================================

🔹 Title: DRISHTIKON: A Multimodal Multilingual Benchmark for Testing Language Models' Understanding on Indian Culture

🔹 Publication Date: Published on Sep 23

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2509.19274
• PDF: https://arxiv.org/pdf/2509.19274

🔹 Datasets citing this paper:
No datasets found

🔹 Spaces citing this paper:
No spaces found
==================================

🔹 Title: Better Late Than Never: Evaluation of Latency Metrics for Simultaneous Speech-to-Text Translation

🔹 Publication Date: Published on Sep 22

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2509.17349
• PDF: https://arxiv.org/pdf/2509.17349
• Project Page: https://huggingface.co/collections/meetween/meetweens-research-papers-68d28369b32dbced7ff9e0df

🔹 Datasets citing this paper:
No datasets found

🔹 Spaces citing this paper:
No spaces found
==================================

🔹 Title: Soft Tokens, Hard Truths

🔹 Publication Date: Published on Sep 23

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2509.19170
• PDF: https://arxiv.org/pdf/2509.19170

🔹 Datasets citing this paper:
No datasets found

🔹 Spaces citing this paper:
No spaces found
==================================

🔹 Title: CommonForms: A Large, Diverse Dataset for Form Field Detection

🔹 Publication Date: Published on Sep 20

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2509.16506
• PDF: https://arxiv.org/pdf/2509.16506

🔹 Datasets citing this paper:
No datasets found

🔹 Spaces citing this paper:
No spaces found
==================================

🔹 Title: PEEK: Guiding and Minimal Image Representations for Zero-Shot Generalization of Robot Manipulation Policies

🔹 Publication Date: Published on Sep 22

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2509.18282
• PDF: https://arxiv.org/pdf/2509.18282
• Project Page: https://peek-robot.github.io/

🔹 Datasets citing this paper:
https://huggingface.co/datasets/jesbu1/peek_bridge_labels

🔹 Spaces citing this paper:
No spaces found
==================================

🔹 Title: RadEval: A framework for radiology text evaluation

🔹 Publication Date: Published on Sep 22

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2509.18030
• PDF: https://arxiv.org/pdf/2509.18030
• Github: https://github.com/jbdel/RadEval

🔹 Datasets citing this paper:
No datasets found

🔹 Spaces citing this paper:
https://huggingface.co/spaces/X-iZhang/RadEval
==================================

🔹 Title: EditVerse: Unifying Image and Video Editing and Generation with In-Context Learning

🔹 Publication Date: Published on Sep 24

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2509.20360
• PDF: https://arxiv.org/pdf/2509.20360

🔹 Datasets citing this paper:
No datasets found

🔹 Spaces citing this paper:
No spaces found
==================================

🔹 Title: PhysCtrl: Generative Physics for Controllable and Physics-Grounded Video Generation

🔹 Publication Date: Published on Sep 24

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2509.20358
• PDF: https://arxiv.org/pdf/2509.20358
• Project Page: https://cwchenwang.github.io/physctrl/

🔹 Datasets citing this paper:
No datasets found

🔹 Spaces citing this paper:
No spaces found
==================================

🔹 Title: EmbeddingGemma: Powerful and Lightweight Text Representations

🔹 Publication Date: Published on Sep 24

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2509.20354
• PDF: https://arxiv.org/pdf/2509.20354

🔹 Datasets citing this paper:
No datasets found

🔹 Spaces citing this paper:
No spaces found
==================================

🔹 Title: Video models are zero-shot learners and reasoners

🔹 Publication Date: Published on Sep 24

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2509.20328
• PDF: https://arxiv.org/pdf/2509.20328
• Project Page: https://video-zero-shot.github.io/

🔹 Datasets citing this paper:
No datasets found

🔹 Spaces citing this paper:
No spaces found
==================================

🔹 Title: Logics-Parsing Technical Report

🔹 Publication Date: Published on Sep 24

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2509.19760
• PDF: https://arxiv.org/pdf/2509.19760
• Github: https://github.com/alibaba/Logics-Parsing

🔹 Datasets citing this paper:
No datasets found

🔹 Spaces citing this paper:
No spaces found
==================================

🔹 Title: On the Use of Agentic Coding: An Empirical Study of Pull Requests on GitHub

🔹 Publication Date: Published on Sep 18

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2509.14745
• PDF: https://arxiv.org/pdf/2509.14745

🔹 Datasets citing this paper:
No datasets found

🔹 Spaces citing this paper:
No spaces found
==================================

🔹 Title: SIM-CoT: Supervised Implicit Chain-of-Thought

🔹 Publication Date: Published on Sep 24

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2509.20317
• PDF: https://arxiv.org/pdf/2509.20317
• Github: https://github.com/InternLM/SIM-CoT

🔹 Datasets citing this paper:
No datasets found

🔹 Spaces citing this paper:
No spaces found
==================================

🔹 Title: LLMs4All: A Review on Large Language Models for Research and Applications in Academic Disciplines

🔹 Publication Date: Published on Sep 23

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2509.19580
• PDF: https://arxiv.org/pdf/2509.19580

🔹 Datasets citing this paper:
No datasets found

🔹 Spaces citing this paper:
No spaces found
==================================

🔹 Title: Advancing Speech Understanding in Speech-Aware Language Models with GRPO

🔹 Publication Date: Published on Sep 21

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2509.16990
• PDF: https://arxiv.org/pdf/2509.16990

🔹 Datasets citing this paper:
No datasets found

🔹 Spaces citing this paper:
No spaces found
==================================

🔹 Title: ATLAS: Benchmarking and Adapting LLMs for Global Trade via Harmonized Tariff Code Classification

🔹 Publication Date: Published on Sep 22

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2509.18400
• PDF: https://arxiv.org/pdf/2509.18400
• Project Page: https://tariffpro.flexify.ai/

🔹 Datasets citing this paper:
https://huggingface.co/datasets/flexifyai/cross_rulings_hts_dataset_for_tariffs

🔹 Spaces citing this paper:
https://huggingface.co/spaces/flexifyai/atlas-llama3_3-70b-hts-demo
==================================
