Data Science | Machine Learning with Python for Researchers

The Data Science and Python channel is for researchers and advanced programmers

Scaling Agents via Continual Pre-training

📝 Summary:
Current agentic LLMs underperform due to tensions in their training objectives. This paper proposes Agentic Continual Pre-training (CPT) to build powerful agentic foundation models. The resulting AgentFounder model achieves state-of-the-art performance on agent benchmarks with strong tool use.

🔹 Publication Date: Published on Sep 16

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2509.13310
• PDF: https://arxiv.org/pdf/2509.13310
• Project Page: https://tongyi-agent.github.io/blog/

==================================

For more data science resources:
https://t.iss.one/DataScienceT

#LLMAgents #ContinualPretraining #FoundationModels #AIResearch #ToolUse
TabTune: A Unified Library for Inference and Fine-Tuning Tabular Foundation Models

📝 Summary:
TabTune is a unified library that standardizes the workflow for tabular foundation models. It provides consistent access to state-of-the-art models, diverse adaptation strategies, and integrated evaluation for performance, calibration, and fairness.

🔹 Publication Date: Published on Nov 4

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2511.02802
• PDF: https://arxiv.org/pdf/2511.02802
• Github: https://github.com/Lexsi-Labs/TabTune

==================================

#TabularData #FoundationModels #MachineLearning #DataScience #AIResearch
DINOv3

📝 Summary:
DINOv3 is a self-supervised vision model that excels across tasks. It scales up training datasets, prevents dense-feature degradation via Gram anchoring, and uses post-hoc strategies for flexibility. This versatile foundation model outperforms specialized state-of-the-art models without fine-tuning.
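
The Gram anchoring mentioned above keeps dense patch features from degrading during long training by anchoring the student's patch-similarity (Gram) matrix to that of an earlier teacher. A minimal pure-Python sketch of the idea follows; the function names, toy feature vectors, and exact loss form are illustrative assumptions, not DINOv3's actual implementation:

```python
import math

def normalize(v):
    # Scale a feature vector to unit length (cosine-style similarity).
    n = math.sqrt(sum(x * x for x in v))
    return [x / n for x in v]

def gram_matrix(feats):
    # feats: list of patch-feature vectors.
    # Returns the matrix of pairwise similarities between normalized patches.
    f = [normalize(v) for v in feats]
    return [[sum(a * b for a, b in zip(u, w)) for w in f] for u in f]

def gram_anchor_loss(student_feats, teacher_feats):
    # Mean squared difference between the student's and the (frozen)
    # teacher's Gram matrices: penalizes drift in patch-to-patch structure,
    # not the features themselves.
    gs, gt = gram_matrix(student_feats), gram_matrix(teacher_feats)
    n = len(gs)
    return sum((gs[i][j] - gt[i][j]) ** 2
               for i in range(n) for j in range(n)) / (n * n)

# Toy example: a student whose dense features drifted slightly from the teacher.
teacher = [[1.0, 0.0], [0.8, 0.6], [0.0, 1.0]]
student = [[1.0, 0.1], [0.7, 0.7], [0.1, 1.0]]
print(gram_anchor_loss(student, teacher))  # small positive penalty
```

Because the loss only constrains relative similarities between patches, the student's features can keep improving globally while their local geometric structure stays anchored.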

🔹 Publication Date: Published on Aug 13

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2508.10104
• PDF: https://arxiv.org/pdf/2508.10104
• Project Page: https://ai.meta.com/blog/dinov3-self-supervised-vision-model/
• Github: https://github.com/facebookresearch/dinov3

🔹 Models citing this paper:
https://huggingface.co/facebook/dinov3-vit7b16-pretrain-lvd1689m
https://huggingface.co/facebook/dinov3-vitb16-pretrain-lvd1689m
https://huggingface.co/facebook/dinov3-vitl16-pretrain-lvd1689m

🔹 Datasets citing this paper:
https://huggingface.co/datasets/zhuangzhe1229/test_dataset
https://huggingface.co/datasets/simon123905/vitl

🔹 Spaces citing this paper:
https://huggingface.co/spaces/atalaydenknalbant/DINOv3
https://huggingface.co/spaces/manu02/DINOv3-Interactive-Patch-Cosine-Similarity
https://huggingface.co/spaces/merve/dinov3-viz

==================================

#DINOv3 #SelfSupervisedLearning #ComputerVision #FoundationModels #AI
OlmoEarth: Stable Latent Image Modeling for Multimodal Earth Observation

📝 Summary:
OlmoEarth is a novel multimodal spatio-temporal foundation model for Earth observation data. It employs new self-supervised learning methods to achieve state-of-the-art performance across many tasks, and it is deployed as a platform for non-profits and NGOs.

🔹 Publication Date: Published on Nov 17

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2511.13655
• PDF: https://arxiv.org/pdf/2511.13655
• Project Page: https://olmoearth.allenai.org/
• Github: https://github.com/allenai/olmoearth_pretrain

==================================

#EarthObservation #FoundationModels #AI #RemoteSensing #SelfSupervisedLearning