Data Science | Machine Learning with Python for Researchers

The Data Science and Python channel is for researchers and advanced programmers

✨ VideoSSR: Video Self-Supervised Reinforcement Learning

📝 Summary:
VideoSSR is a novel self-supervised reinforcement learning framework that leverages intrinsic video information to generate high-quality training data. It uses three pretext tasks and the VideoSSR-30K dataset, improving MLLM performance across 17 benchmarks by over 5%.
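
The post does not spell out the three pretext tasks, so below is a minimal, hypothetical sketch of the general recipe such frameworks rely on: mine a question whose answer is verifiable directly from the raw video, then score rollouts with a binary reward. The frame-ordering task and function names are illustrative assumptions, not VideoSSR's actual pipeline.

```python
# Hypothetical sketch: mine a verifiable pretext sample from an unlabeled video
# by shuffling sampled frames and asking for their temporal order.
import random

def make_frame_order_sample(num_frames: int, clip_len: int = 4, seed: int = 0):
    """Build one (question, answer) pair whose answer is verifiable by construction."""
    rng = random.Random(seed)
    # In a real pipeline these indices would point into decoded frames.
    indices = sorted(rng.sample(range(num_frames), clip_len))
    shuffled = indices[:]
    rng.shuffle(shuffled)
    question = (
        "The following frames were sampled from one video but shown out of order: "
        f"{shuffled}. List them in correct temporal order."
    )
    answer = indices  # ground truth comes for free from the video itself
    return question, answer

def reward(predicted, answer):
    """Binary verifiable reward, as used in RL with verifiable rewards."""
    return 1.0 if list(predicted) == list(answer) else 0.0

if __name__ == "__main__":
    q, a = make_frame_order_sample(num_frames=300, seed=42)
    print(q)
    print("reward for correct answer:", reward(a, a))
```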

🔹 Publication Date: Published on Nov 9

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2511.06281
• PDF: https://arxiv.org/pdf/2511.06281
• Project Page: https://github.com/lcqysl/VideoSSR
• GitHub: https://github.com/lcqysl/VideoSSR

🔹 Models citing this paper:
• https://huggingface.co/yhx12/VideoSSR

==================================

For more data science resources:
✓ https://t.iss.one/DataScienceT

#ReinforcementLearning #SelfSupervisedLearning #VideoAI #MachineLearning #DeepLearning
✨ The Path Not Taken: RLVR Provably Learns Off the Principals

📝 Summary:
RLVR updates parameters off the principal directions, in low-curvature subspaces, and the updates only appear sparse because of optimization bias. This optimization regime is distinct from SFT's, suggesting that SFT-era fine-tuning methods are ill-suited to RLVR.
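
As a rough illustration of what "off the principals" means, the sketch below (not the paper's analysis) measures how much of a weight update falls inside the top singular subspace of the pretrained matrix; numpy and the toy matrices are assumptions made for the example.

```python
# Diagnostic sketch: split a fine-tuning update's energy into the part lying in
# the rank-k principal subspace of the pretrained weights and the part outside it.
import numpy as np

def update_energy_split(W0: np.ndarray, W1: np.ndarray, k: int = 8):
    """Fraction of the update's squared norm inside the rank-k principal subspace of W0."""
    U, S, Vt = np.linalg.svd(W0, full_matrices=False)
    delta = W1 - W0
    # Project the update onto the top-k left/right singular directions of W0.
    proj = U[:, :k] @ (U[:, :k].T @ delta @ Vt[:k, :].T) @ Vt[:k, :]
    total = np.linalg.norm(delta) ** 2
    inside = np.linalg.norm(proj) ** 2
    return inside / total, 1.0 - inside / total

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    W0 = rng.standard_normal((256, 256))
    W1 = W0 + 1e-2 * rng.standard_normal((256, 256))  # stand-in for an RLVR update
    on_principal, off_principal = update_energy_split(W0, W1, k=16)
    print(f"energy on principal directions: {on_principal:.3f}, off: {off_principal:.3f}")
```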

🔹 Publication Date: Published on Nov 11

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2511.08567
• PDF: https://arxiv.org/pdf/2511.08567

==================================

For more data science resources:
✓ https://t.iss.one/DataScienceT

#RLVR #MachineLearning #Optimization #DeepLearning #AIResearch
✨ TimeSearch-R: Adaptive Temporal Search for Long-Form Video Understanding via Self-Verification Reinforcement Learning

📝 Summary:
TimeSearch-R improves long-form video understanding by optimizing temporal search with reinforcement learning. It uses GRPO-CSV to verify the completeness of the searched frames, improving reasoning, and achieves state-of-the-art performance on multiple video benchmarks.

🔹 Publication Date: Published on Nov 7

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2511.05489
• PDF: https://arxiv.org/pdf/2511.05489
• GitHub: https://github.com/Time-Search/TimeSearch-R

==================================

For more data science resources:
✓ https://t.iss.one/DataScienceT

#VideoUnderstanding #ReinforcementLearning #DeepLearning #AIResearch #ComputerVision
✨ Toward the Frontiers of Reliable Diffusion Sampling via Adversarial Sinkhorn Attention Guidance

📝 Summary:
ASAG is a novel diffusion guidance method that uses optimal transport and the Sinkhorn algorithm to adversarially disrupt attention scores. It weakens misleading attention alignments by injecting an adversarial cost, improving sample quality, controllability, and fidelity without model retraining.
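
For readers unfamiliar with the primitive involved, here is a minimal sketch of Sinkhorn iterations applied to an attention score matrix; the adversarial cost injection and the actual guidance schedule are defined in the paper and not reproduced here.

```python
# Sinkhorn sketch: turn a score (negative cost) matrix into an approximately
# doubly-stochastic transport plan via alternating row/column scaling.
import numpy as np

def sinkhorn(scores: np.ndarray, eps: float = 0.1, n_iters: int = 50) -> np.ndarray:
    """Entropy-regularized OT plan from a score matrix with uniform marginals."""
    K = np.exp(scores / eps)                   # Gibbs kernel
    u = np.ones(K.shape[0])
    v = np.ones(K.shape[1])
    r = np.full(K.shape[0], 1.0 / K.shape[0])  # uniform row marginals
    c = np.full(K.shape[1], 1.0 / K.shape[1])  # uniform column marginals
    for _ in range(n_iters):
        u = r / (K @ v)
        v = c / (K.T @ u)
    return np.diag(u) @ K @ np.diag(v)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    attn_scores = rng.standard_normal((6, 6))
    plan = sinkhorn(attn_scores)
    print("row sums:", plan.sum(axis=1).round(3))  # ~uniform marginals
    print("col sums:", plan.sum(axis=0).round(3))
```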

🔹 Publication Date: Published on Nov 10

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2511.07499
• PDF: https://arxiv.org/pdf/2511.07499

==================================

For more data science resources:
✓ https://t.iss.one/DataScienceT

#DiffusionModels #AdversarialAI #OptimalTransport #GenerativeAI #DeepLearning
✨ Efficient Guided Generation for Large Language Models

📝 Summary:
This paper introduces an efficient method to guide large language model text generation. It uses regular expressions and context-free grammars with minimal added overhead, making guided generation practical.
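
The key idea is to compile the regular expression (or grammar) into a finite-state machine once and then mask the vocabulary at every decoding step. The toy sketch below hand-builds a DFA for the pattern [0-9]+ over a character-level vocabulary to show the masking logic; the Outlines library linked above automates this for real tokenizers and arbitrary regexes/grammars, which this sketch does not attempt.

```python
# Toy guided decoding: at each step only tokens that keep the output a viable
# prefix of a [0-9]+ match are allowed; <eos> becomes legal once a match exists.
VOCAB = list("0123456789abc") + ["<eos>"]

# DFA states for [0-9]+ : 0 = nothing emitted yet, 1 = at least one digit seen.
def allowed_tokens(state: int) -> set:
    allowed = {t for t in VOCAB if t.isdigit()}  # digits always keep a viable prefix
    if state == 1:                               # a full match already exists,
        allowed.add("<eos>")                     # so ending the sequence is legal
    return allowed

def step(state: int, token: str) -> int:
    return 1 if token.isdigit() else state

def guided_decode(propose, max_len: int = 8) -> str:
    """propose(allowed) stands in for sampling from a masked LLM distribution."""
    state, out = 0, []
    for _ in range(max_len):
        token = propose(allowed_tokens(state))
        if token == "<eos>":
            break
        out.append(token)
        state = step(state, token)
    return "".join(out)

if __name__ == "__main__":
    import random
    rng = random.Random(0)
    text = guided_decode(lambda allowed: rng.choice(sorted(allowed)))
    print(text)  # always a non-empty digit string
```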

🔹 Publication Date: Published on Jul 19, 2023

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2307.09702
• PDF: https://arxiv.org/pdf/2307.09702
• GitHub: https://github.com/normal-computing/outlines

==================================

For more data science resources:
✓ https://t.iss.one/DataScienceT

#LLMs #TextGeneration #NLP #AI #DeepLearning
✨ Motif 2 12.7B technical report

📝 Summary:
Motif-2-12.7B is an efficient LLM combining Grouped Differential Attention and system-level optimizations. It achieves competitive performance across diverse benchmarks with a smaller model size.
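
As context for the attention mechanism named above, below is a minimal numpy sketch of plain differential attention (two attention maps subtracted to cancel common-mode "noise" attention); how Motif-2-12.7B groups heads and allocates capacity between them is specific to the paper and not shown here.

```python
# Differential attention sketch: subtract a second attention map from the first
# before applying the result to the values.
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def differential_attention(q1, k1, q2, k2, v, lam: float = 0.5):
    d = q1.shape[-1]
    a1 = softmax(q1 @ k1.T / np.sqrt(d))  # primary attention map
    a2 = softmax(q2 @ k2.T / np.sqrt(d))  # second map acting as a reference
    return (a1 - lam * a2) @ v            # differential map applied to values

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    T, d = 5, 16
    q1, k1, q2, k2 = (rng.standard_normal((T, d)) for _ in range(4))
    v = rng.standard_normal((T, d))
    print(differential_attention(q1, k1, q2, k2, v).shape)  # (5, 16)
```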

🔹 Publication Date: Published on Nov 7

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2511.07464
• PDF: https://arxiv.org/pdf/2511.07464

🔹 Models citing this paper:
• https://huggingface.co/Motif-Technologies/optimizer
• https://huggingface.co/Motif-Technologies/Motif-2-12.7B-Instruct
• https://huggingface.co/Motif-Technologies/Motif-2-12.7B-Base

==================================

For more data science resources:
✓ https://t.iss.one/DataScienceT

#LLM #AI #DeepLearning #EfficientAI #AttentionMechanisms
✨ Black-Box On-Policy Distillation of Large Language Models

📝 Summary:
Generative Adversarial Distillation (GAD) is a new black-box, on-policy method for distilling LLMs. GAD trains a student generator together with a discriminator that provides adaptive feedback, surpassing traditional distillation and enabling student LLMs to perform comparably to proprietary teachers.
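
A toy sketch of the training pattern described above, on continuous vectors rather than text: the student generates outputs on-policy, a discriminator learns to separate them from black-box teacher outputs, and the student is trained to fool it. In the actual method the student is an LLM updated with on-policy RL (token sampling is not differentiable); the continuous toy below only shows the two-player objective. Module sizes and the tanh "teacher" are assumptions.

```python
# GAN-style on-policy distillation on toy vectors.
import torch
import torch.nn as nn

teacher = lambda x: torch.tanh(3.0 * x)            # stand-in for a black-box teacher API
student = nn.Sequential(nn.Linear(8, 32), nn.ReLU(), nn.Linear(32, 8))
disc = nn.Sequential(nn.Linear(16, 32), nn.ReLU(), nn.Linear(32, 1))

opt_s = torch.optim.Adam(student.parameters(), lr=1e-3)
opt_d = torch.optim.Adam(disc.parameters(), lr=1e-3)
bce = nn.BCEWithLogitsLoss()

for step in range(500):
    prompts = torch.randn(64, 8)
    with torch.no_grad():
        y_teacher = teacher(prompts)                # only outputs are observed (black-box)
    y_student = student(prompts)                    # on-policy: student's own outputs

    # 1) discriminator: teacher pairs -> 1, student pairs -> 0
    d_real = disc(torch.cat([prompts, y_teacher], dim=1))
    d_fake = disc(torch.cat([prompts, y_student.detach()], dim=1))
    loss_d = bce(d_real, torch.ones_like(d_real)) + bce(d_fake, torch.zeros_like(d_fake))
    opt_d.zero_grad(); loss_d.backward(); opt_d.step()

    # 2) student: maximize the discriminator's belief that its outputs are teacher-like
    d_fool = disc(torch.cat([prompts, y_student], dim=1))
    loss_s = bce(d_fool, torch.ones_like(d_fool))
    opt_s.zero_grad(); loss_s.backward(); opt_s.step()

print("final discriminator loss:", float(loss_d))
```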

🔹 Publication Date: Published on Nov 13

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2511.10643
• PDF: https://arxiv.org/pdf/2511.10643

==================================

For more data science resources:
✓ https://t.iss.one/DataScienceT

#LLMs #AIDistillation #MachineLearning #GenerativeAI #DeepLearning
✨ Virtual Width Networks

📝 Summary:
Virtual Width Networks (VWN) enhance model efficiency by expanding representational width without increasing computational cost. VWN accelerates optimization and improves loss reduction, showing a log-linear scaling relation between virtual width and loss.

🔹 Publication Date: Published on Nov 14

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2511.11238
• PDF: https://arxiv.org/pdf/2511.11238

==================================

For more data science resources:
✓ https://t.iss.one/DataScienceT

#NeuralNetworks #DeepLearning #ModelEfficiency #MachineLearning #AI
✨ DoPE: Denoising Rotary Position Embedding

📝 Summary:
DoPE improves Transformer length generalization by detecting and mitigating noisy frequency bands in positional embeddings. This training-free method enhances retrieval accuracy and reasoning stability across extended contexts up to 64K tokens.
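
To make the object of study concrete, the sketch below builds the per-dimension rotary frequency bands that RoPE uses and masks one band as a hypothetical, training-free intervention; DoPE's actual detection criterion and mitigation are defined in the paper.

```python
# RoPE frequency bands and a hypothetical band mask.
import numpy as np

def rope_inv_freq(dim: int = 128, base: float = 10000.0) -> np.ndarray:
    # One frequency per 2-D rotation plane, from high (i=0) to low (i=dim/2-1).
    return 1.0 / base ** (np.arange(0, dim, 2) / dim)

def mask_band(inv_freq: np.ndarray, lo: int, hi: int) -> np.ndarray:
    # Hypothetical mitigation: zero a band flagged as "noisy", so those planes
    # stop encoding position (rotation angle stays 0 at every position).
    out = inv_freq.copy()
    out[lo:hi] = 0.0
    return out

if __name__ == "__main__":
    f = rope_inv_freq(dim=128)
    f_denoised = mask_band(f, lo=20, hi=28)
    positions = np.arange(4096)
    angles = np.outer(positions, f_denoised)  # rotation angles actually applied
    print(f.shape, angles.shape)              # (64,) (4096, 64)
```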

🔹 Publication Date: Published on Nov 12

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2511.09146
• PDF: https://arxiv.org/pdf/2511.09146
• Project Page: https://The-physical-picture-of-LLMs.github.io

==================================

For more data science resources:
✓ https://t.iss.one/DataScienceT

#Transformers #PositionalEmbedding #LLMs #DeepLearning #AIResearch
✨ Test-Time Spectrum-Aware Latent Steering for Zero-Shot Generalization in Vision-Language Models

📝 Summary:
VLMs degrade under test-time domain shifts. Spectrum-Aware Test-Time Steering (STS) is a lightweight method that adapts VLM latent representations by steering them using textual embedding subspaces, without backpropagation. STS surpasses the state of the art while offering faster inference and a smaller memory footprint.
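
As a rough, hypothetical illustration of steering a visual latent with text embeddings and no backpropagation, the sketch below projects an image feature onto the span of class-text features and blends; STS's spectrum-aware construction of that subspace is the paper's contribution and is not reproduced here.

```python
# Gradient-free latent steering toward a text-embedding subspace.
import numpy as np

def steer(img_feat: np.ndarray, text_feats: np.ndarray, alpha: float = 0.5) -> np.ndarray:
    """text_feats: (num_classes, d). Returns a steered copy of img_feat (d,)."""
    # Orthogonal projection onto span(text_feats) via least squares (no gradients).
    B = text_feats.T                                  # (d, num_classes)
    coeffs, *_ = np.linalg.lstsq(B, img_feat, rcond=None)
    projection = B @ coeffs
    steered = (1 - alpha) * img_feat + alpha * projection
    return steered / np.linalg.norm(steered)          # keep unit norm, CLIP-style

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    img = rng.standard_normal(512); img /= np.linalg.norm(img)
    texts = rng.standard_normal((10, 512))
    print(steer(img, texts, alpha=0.3).shape)  # (512,)
```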

🔹 Publication Date: Published on Nov 12

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2511.09809
• PDF: https://arxiv.org/pdf/2511.09809

==================================

For more data science resources:
✓ https://t.iss.one/DataScienceT

#VisionLanguageModels #ZeroShotGeneralization #DomainAdaptation #DeepLearning #AI
✨ TiViBench: Benchmarking Think-in-Video Reasoning for Video Generative Models

📝 Summary:
TiViBench is a new benchmark assessing image-to-video models' reasoning across four dimensions and 24 tasks. Commercial models show stronger reasoning potential, and VideoTPO, a test-time strategy, significantly enhances performance, advancing reasoning in video generation.

🔹 Publication Date: Published on Nov 17

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2511.13704
• PDF: https://arxiv.org/pdf/2511.13704
• Project Page: https://haroldchen19.github.io/TiViBench-Page/

==================================

For more data science resources:
✓ https://t.iss.one/DataScienceT

#VideoGeneration #AIBenchmark #ComputerVision #DeepLearning #AIResearch
✨ Back to Basics: Let Denoising Generative Models Denoise

📝 Summary:
Denoising diffusion models should predict clean images directly, not noise, leveraging the data manifold assumption. The paper introduces JiT, a model using simple, large-patch Transformers that achieves competitive generative results on ImageNet.
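
For readers wondering how x0-prediction relates to the usual noise-prediction target, the snippet below verifies the standard algebraic conversion under the DDPM forward process; it says nothing about JiT's architecture or training loss.

```python
# Under x_t = sqrt(abar_t) * x0 + sqrt(1 - abar_t) * eps, predicting x0 and
# predicting eps are algebraically interchangeable.
import numpy as np

rng = np.random.default_rng(0)
x0 = rng.standard_normal((4, 4))        # "clean image"
eps = rng.standard_normal((4, 4))       # forward-process noise
abar_t = 0.3                            # cumulative alpha at some timestep t

x_t = np.sqrt(abar_t) * x0 + np.sqrt(1 - abar_t) * eps

# eps-prediction -> x0 estimate
x0_from_eps = (x_t - np.sqrt(1 - abar_t) * eps) / np.sqrt(abar_t)
# x0-prediction -> eps estimate
eps_from_x0 = (x_t - np.sqrt(abar_t) * x0) / np.sqrt(1 - abar_t)

print(np.allclose(x0_from_eps, x0), np.allclose(eps_from_x0, eps))  # True True
```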

🔹 Publication Date: Published on Nov 17

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2511.13720
• PDF: https://arxiv.org/pdf/2511.13720
• GitHub: https://github.com/LTH14/JiT

==================================

For more data science resources:
✓ https://t.iss.one/DataScienceT

#DiffusionModels #GenerativeAI #DeepLearning #ComputerVision #AIResearch
✨ UnSAMv2: Self-Supervised Learning Enables Segment Anything at Any Granularity

📝 Summary:
UnSAMv2 enables continuous segmentation-granularity control for the SAM model without human annotations. It uses self-supervised learning on unlabeled data to discover mask-granularity pairs and a novel control embedding. UnSAMv2 significantly enhances SAM-2's performance across various segmentation tasks.

🔹 Publication Date: Published on Nov 17

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2511.13714
• PDF: https://arxiv.org/pdf/2511.13714
• Project Page: https://yujunwei04.github.io/UnSAMv2-Project-Page/
• GitHub: https://github.com/yujunwei04/UnSAMv2

✨ Spaces citing this paper:
• https://huggingface.co/spaces/yujunwei04/UnSAMv2

==================================

For more data science resources:
✓ https://t.iss.one/DataScienceT

#AI #ComputerVision #SelfSupervisedLearning #ImageSegmentation #DeepLearning
✨ Error-Driven Scene Editing for 3D Grounding in Large Language Models

📝 Summary:
DEER-3D improves 3D LLM grounding by iteratively editing and retraining models. It diagnoses predicate-level errors, then generates targeted 3D scene edits as counterfactuals to enhance spatial understanding and accuracy.

🔹 Publication Date: Published on Nov 18

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2511.14086
• PDF: https://arxiv.org/pdf/2511.14086
• GitHub: https://github.com/zhangyuejoslin/Deer-3D

==================================

For more data science resources:
✓ https://t.iss.one/DataScienceT

#LLMs #3DGrounding #ComputerVision #DeepLearning #AIResearch
✨ Orion: A Unified Visual Agent for Multimodal Perception, Advanced Visual Reasoning and Execution

📝 Summary:
Orion is a visual agent framework that orchestrates specialized computer vision tools to execute complex visual workflows. It achieves competitive performance on benchmarks and enables autonomous, tool-driven visual reasoning.

🔹 Publication Date: Published on Nov 18

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2511.14210
• PDF: https://arxiv.org/pdf/2511.14210

==================================

For more data science resources:
✓ https://t.iss.one/DataScienceT

#ComputerVision #AIagents #VisualReasoning #MultimodalAI #DeepLearning
✨ A Style is Worth One Code: Unlocking Code-to-Style Image Generation with Discrete Style Space

📝 Summary:
CoTyle introduces code-to-style image generation, creating consistent visual styles from numerical codes. It is the first open-source academic method for this task, using a discrete style codebook and a text-to-image diffusion model for diverse, reproducible styles.

🔹 Publication Date: Published on Nov 13

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2511.10555
• PDF: https://arxiv.org/pdf/2511.10555
• Project Page: https://Kwai-Kolors.github.io/CoTyle/
• GitHub: https://github.com/Kwai-Kolors/CoTyle

✨ Spaces citing this paper:
• https://huggingface.co/spaces/Kwai-Kolors/CoTyle

==================================

For more data science resources:
✓ https://t.iss.one/DataScienceT

#ImageGeneration #DiffusionModels #NeuralStyle #ComputerVision #DeepLearning
✨ Agent-R1: Training Powerful LLM Agents with End-to-End Reinforcement Learning

📝 Summary:
This paper clarifies RL for LLM agents by extending the MDP framework. It introduces Agent-R1, a modular and flexible training framework, and demonstrates its effectiveness on multi-hop QA tasks.

🔹 Publication Date: Published on Nov 18

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2511.14460
• PDF: https://arxiv.org/pdf/2511.14460
• GitHub: https://github.com/0russwest0/Agent-R1

==================================

For more data science resources:
✓ https://t.iss.one/DataScienceT

#LLMAgents #ReinforcementLearning #AI #DeepLearning #NLP
✨ UniMoE-Audio: Unified Speech and Music Generation with Dynamic-Capacity MoE

📝 Summary:
UniMoE-Audio unifies speech and music generation using a novel Dynamic-Capacity Mixture-of-Experts framework. It addresses data imbalance and task conflicts through a hybrid expert design and a three-stage training scheme, achieving state-of-the-art performance and synergistic cross-domain learning.
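
For context on the building block, here is a minimal torch sketch of top-k expert routing, the standard Mixture-of-Experts primitive such frameworks extend; the dynamic-capacity allocation and the hybrid routed/shared expert design are specific to the paper, and layer sizes here are illustrative.

```python
# Top-k token routing over a set of expert MLPs.
import torch
import torch.nn as nn
import torch.nn.functional as F

class TopKMoE(nn.Module):
    def __init__(self, dim=64, num_experts=8, k=2):
        super().__init__()
        self.k = k
        self.gate = nn.Linear(dim, num_experts)
        self.experts = nn.ModuleList([
            nn.Sequential(nn.Linear(dim, 4 * dim), nn.GELU(), nn.Linear(4 * dim, dim))
            for _ in range(num_experts)
        ])

    def forward(self, x):                        # x: (tokens, dim)
        logits = self.gate(x)                    # (tokens, num_experts)
        weights, idx = logits.topk(self.k, dim=-1)
        weights = F.softmax(weights, dim=-1)     # renormalize over the chosen experts
        out = torch.zeros_like(x)
        for slot in range(self.k):               # dispatch each token to its k experts
            for e, expert in enumerate(self.experts):
                mask = idx[:, slot] == e
                if mask.any():
                    out[mask] += weights[mask, slot].unsqueeze(-1) * expert(x[mask])
        return out

if __name__ == "__main__":
    moe = TopKMoE()
    tokens = torch.randn(10, 64)
    print(moe(tokens).shape)  # torch.Size([10, 64])
```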

🔹 Publication Date: Published on Oct 15

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2510.13344
• PDF: https://arxiv.org/pdf/2510.13344
• Project Page: https://mukioxun.github.io/Uni-MoE-site/home.html
• GitHub: https://github.com/HITsz-TMG/Uni-MoE/blob/master/UniMoE-Audio

🔹 Models citing this paper:
• https://huggingface.co/HIT-TMG/UniMoE-Audio-Preview

==================================

For more data science resources:
✓ https://t.iss.one/DataScienceT

#SpeechGeneration #MusicGeneration #MixtureOfExperts #GenerativeAI #DeepLearning
✨ Uni-MoE: Scaling Unified Multimodal LLMs with Mixture of Experts

📝 Summary:
Uni-MoE introduces a sparse multimodal Mixture-of-Experts LLM that efficiently handles diverse data types. It uses modality-specific encoders and a progressive training strategy, reducing performance bias and improving collaboration across modalities.

🔹 Publication Date: Published on May 18, 2024

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2405.11273
• PDF: https://arxiv.org/pdf/2405.11273
• GitHub: https://github.com/hitsz-tmg/umoe-scaling-unified-multimodal-llms

==================================

For more data science resources:
✓ https://t.iss.one/DataScienceT

#MultimodalAI #LLMs #MixtureOfExperts #DeepLearning #AIResearch
✨ Φeat: Physically-Grounded Feature Representation

📝 Summary:
Φeat is a new self-supervised visual backbone that captures material identity like reflectance and mesostructure. It learns robust features invariant to external physical factors such as shape and lighting, promoting physics-aware perception.

🔹 Publication Date: Published on Nov 14

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2511.11270
• PDF: https://arxiv.org/pdf/2511.11270

==================================

For more data science resources:
✓ https://t.iss.one/DataScienceT

#ComputerVision #SelfSupervisedLearning #DeepLearning #FeatureLearning #PhysicsAwareAI