✨Locality-Attending Vision Transformer
📝 Summary:
Vision transformers are enhanced for segmentation tasks through a Gaussian kernel modulation that improves local attention while maintaining classification performance.
🔹 Publication Date: Published on Mar 5
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.04892
• PDF: https://arxiv.org/pdf/2603.04892
• Github: https://github.com/sinahmr/LocAtViT
==================================
For more data science resources:
✓ https://t.iss.one/DataScienceT
#AI #DataScience #MachineLearning #HuggingFace #Research
✨RealWonder: Real-Time Physical Action-Conditioned Video Generation
📝 Summary:
RealWonder enables real-time action-conditioned video generation by integrating 3D reconstruction, physics simulation, and a distilled video generator to simulate physical consequences of 3D actions. ...
🔹 Publication Date: Published on Mar 5
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.05449
• PDF: https://arxiv.org/pdf/2603.05449
• Project Page: https://liuwei283.github.io/RealWonder/
• Github: https://github.com/liuwei283/RealWonder
==================================
For more data science resources:
✓ https://t.iss.one/DataScienceT
#AI #DataScience #MachineLearning #HuggingFace #Research
✨KARL: Knowledge Agents via Reinforcement Learning
📝 Summary:
A reinforcement learning system for enterprise search agents achieves superior performance through diverse training data generation and multi-task learning approaches.
🔹 Publication Date: Published on Mar 5
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.05218
• PDF: https://arxiv.org/pdf/2603.05218
==================================
For more data science resources:
✓ https://t.iss.one/DataScienceT
#AI #DataScience #MachineLearning #HuggingFace #Research
✨Timer-S1: A Billion-Scale Time Series Foundation Model with Serial Scaling
📝 Summary:
Timer-S1 is a scalable Mixture-of-Experts time series model with 8.3B parameters that uses serial scaling and novel TimeMoE blocks to improve long-term forecasting accuracy.
🔹 Publication Date: Published on Mar 5
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.04791
• PDF: https://arxiv.org/pdf/2603.04791
==================================
For more data science resources:
✓ https://t.iss.one/DataScienceT
#AI #DataScience #MachineLearning #HuggingFace #Research
✨DreamWorld: Unified World Modeling in Video Generation
📝 Summary:
DreamWorld introduces a unified framework for video generation that integrates multiple types of world knowledge through joint modeling of temporal dynamics, spatial geometry, and semantic consistency...
🔹 Publication Date: Published on Feb 28
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.00466
• PDF: https://arxiv.org/pdf/2603.00466
• Github: https://github.com/ABU121111/DreamWorld
==================================
For more data science resources:
✓ https://t.iss.one/DataScienceT
#AI #DataScience #MachineLearning #HuggingFace #Research
✨Towards Multimodal Lifelong Understanding: A Dataset and Agentic Baseline
📝 Summary:
MM-Lifelong dataset captures natural video sequences across multiple temporal scales to evaluate multimodal lifelong understanding, revealing limitations in current approaches and introducing a recurs...
🔹 Publication Date: Published on Mar 5
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.05484
• PDF: https://arxiv.org/pdf/2603.05484
• Project Page: https://huggingface.co/datasets/CG-Bench/MM-Lifelong
• Github: https://github.com/cg1177/Recursive-Multimodal-Agent
==================================
For more data science resources:
✓ https://t.iss.one/DataScienceT
#AI #DataScience #MachineLearning #HuggingFace #Research
✨On-Policy Self-Distillation for Reasoning Compression
📝 Summary:
OPSDC enables efficient reasoning model compression by having models distill concise behavior from their own outputs, achieving significant token reduction while maintaining accuracy.
🔹 Publication Date: Published on Mar 5
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.05433
• PDF: https://arxiv.org/pdf/2603.05433
==================================
For more data science resources:
✓ https://t.iss.one/DataScienceT
#AI #DataScience #MachineLearning #HuggingFace #Research
✨UltraDexGrasp: Learning Universal Dexterous Grasping for Bimanual Robots with Synthetic Data
📝 Summary:
A bimanual robotic grasping framework is presented that generates diverse grasp data through optimization and planning, enabling effective zero-shot sim-to-real transfer with high success rates on nov...
🔹 Publication Date: Published on Mar 5
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.05312
• PDF: https://arxiv.org/pdf/2603.05312
==================================
For more data science resources:
✓ https://t.iss.one/DataScienceT
#AI #DataScience #MachineLearning #HuggingFace #Research
✨Distribution-Conditioned Transport
📝 Summary:
A distribution-conditioned transport framework enables generalization to unseen distribution pairs and supports semi-supervised learning for scientific applications.
🔹 Publication Date: Published on Mar 5
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.04736
• PDF: https://arxiv.org/pdf/2603.04736
==================================
For more data science resources:
✓ https://t.iss.one/DataScienceT
#AI #DataScience #MachineLearning #HuggingFace #Research
✨SageBwd: A Trainable Low-bit Attention
📝 Summary:
Research investigates why low-bit attention methods like SageBwd exhibit performance gaps during pre-training and identifies key factors for stable training with reduced precision.
🔹 Publication Date: Published on Mar 2
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.02170
• PDF: https://arxiv.org/pdf/2603.02170
• Project Page: https://github.com/thu-ml/SageAttention
==================================
For more data science resources:
✓ https://t.iss.one/DataScienceT
#AI #DataScience #MachineLearning #HuggingFace #Research
✨AgentVista: Evaluating Multimodal Agents in Ultra-Challenging Realistic Visual Scenarios
📝 Summary:
AgentVista presents a comprehensive benchmark for multimodal agents requiring long-horizon tool interactions across multiple modalities and complex real-world scenarios.
🔹 Publication Date: Published on Feb 26
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2602.23166
• PDF: https://arxiv.org/pdf/2602.23166
==================================
For more data science resources:
✓ https://t.iss.one/DataScienceT
#AI #DataScience #MachineLearning #HuggingFace #Research
✨RoboPocket: Improve Robot Policies Instantly with Your Phone
📝 Summary:
RoboPocket enables efficient, robot-free policy iteration via smartphones. It uses augmented reality to visualize policy weaknesses, guiding data collection, and asynchronous online finetuning to update policies quickly. This doubles data and sample efficiency.
🔹 Publication Date: Published on Mar 5
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.05504
• PDF: https://arxiv.org/pdf/2603.05504
• Project Page: https://robo-pocket.github.io
==================================
For more data science resources:
✓ https://t.iss.one/DataScienceT
#AI #DataScience #MachineLearning #HuggingFace #Research
✨Large Multimodal Models as General In-Context Classifiers
📝 Summary:
Large Multimodal Models demonstrate superior performance in closed-world classification with in-context learning and excel in open-world scenarios when equipped with the proposed CIRCLE method for pse...
🔹 Publication Date: Published on Feb 26
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2602.23229
• PDF: https://arxiv.org/pdf/2602.23229
• Project Page: https://circle-lmm.github.io/
• Github: https://github.com/marco-garosi/CIRCLE
==================================
For more data science resources:
✓ https://t.iss.one/DataScienceT
#AI #DataScience #MachineLearning #HuggingFace #Research
✨GLiNER2: An Efficient Multi-Task Information Extraction System with Schema-Driven Interface
📝 Summary:
GLiNER2 is a unified transformer-based framework that supports multiple NLP tasks with improved efficiency and accessibility compared to large language models.
🔹 Publication Date: Published on Jul 24, 2025
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2507.18546
• PDF: https://arxiv.org/pdf/2507.18546
• Github: https://github.com/fastino-ai/GLiNER2
🔹 Models citing this paper:
• https://huggingface.co/fastino/gliner2-base-v1
• https://huggingface.co/fastino/gliner2-large-v1
• https://huggingface.co/fastino/gliner2-multi-v1
🔹 Spaces citing this paper:
• https://huggingface.co/spaces/sitammeur/GLiNER2-Suite
• https://huggingface.co/spaces/fastino/gliner2-official-demo
• https://huggingface.co/spaces/sohom004/testdup
==================================
For more data science resources:
✓ https://t.iss.one/DataScienceT
#AI #DataScience #MachineLearning #HuggingFace #Research
✨Mozi: Governed Autonomy for Drug Discovery LLM Agents
📝 Summary:
Mozi is a dual-layer framework for reliable drug discovery LLM agents, solving issues of tool-use governance and long-horizon reliability. It uses a control plane for isolated tool-use and replanning, plus a workflow plane for structured stages with human oversight, ensuring robust, auditable res...
🔹 Publication Date: Published on Mar 4
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.03655
• PDF: https://arxiv.org/pdf/2603.03655
• Project Page: https://ai4s.idea.edu.cn/ai4s/mozi
==================================
For more data science resources:
✓ https://t.iss.one/DataScienceT
#LLMAgents #DrugDiscovery #AI #MachineLearning #AutonomousAgents
✨MASQuant: Modality-Aware Smoothing Quantization for Multimodal Large Language Models
📝 Summary:
MASQuant improves multimodal LLM quantization by resolving smoothing misalignment and cross-modal invariance. It uses modality-aware smoothing and SVD whitening for cross-modal compensation, achieving stable, competitive performance.
🔹 Publication Date: Published on Mar 5
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.04800
• PDF: https://arxiv.org/pdf/2603.04800
==================================
For more data science resources:
✓ https://t.iss.one/DataScienceT
#MultimodalAI #LLM #Quantization #DeepLearning #AIResearch
✨SkillNet: Create, Evaluate, and Connect AI Skills
📝 Summary:
SkillNet is an open infrastructure that systematically creates, evaluates, and organizes AI skills using a unified ontology. It addresses the lack of skill accumulation in current agents, boosting rewards by 40 percent and reducing execution steps by 30 percent.
🔹 Publication Date: Published on Feb 26
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.04448
• PDF: https://arxiv.org/pdf/2603.04448
• Project Page: https://skillnet.openkg.cn/
• Github: https://github.com/zjunlp/SkillNet
==================================
For more data science resources:
✓ https://t.iss.one/DataScienceT
#AI #AISkills #AIAgents #Ontology #MachineLearning
✨STMI: Segmentation-Guided Token Modulation with Cross-Modal Hypergraph Interaction for Multi-Modal Object Re-Identification
📝 Summary:
STMI is a novel multi-modal ReID framework that improves object re-identification. It uses segmentation-guided modulation for foreground enhancement, token reallocation for compact features, and cross-modal hypergraph interaction to capture high-order semantic relationships.
🔹 Publication Date: Published on Feb 28
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.00695
• PDF: https://arxiv.org/pdf/2603.00695
==================================
For more data science resources:
✓ https://t.iss.one/DataScienceT
#ObjectReID #ComputerVision #DeepLearning #MultiModalAI #STMI