✨GameWorld: Towards Standardized and Verifiable Evaluation of Multimodal Game Agents
📝 Summary:
GameWorld presents a standardized benchmark for evaluating multimodal large language model agents in video games, featuring diverse games and verified metrics for comprehensive assessment. AI-generate...
🔹 Publication Date: Published on Apr 8
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2604.07429
• PDF: https://arxiv.org/pdf/2604.07429
• Project Page: https://gameworld-bench.github.io/
==================================
For more data science resources:
✓ https://t.iss.one/DataScienceT
#AI #DataScience #MachineLearning #HuggingFace #Research
✨MegaStyle: Constructing Diverse and Scalable Style Dataset via Consistent Text-to-Image Style Mapping
📝 Summary:
MegaStyle presents a scalable data curation pipeline for creating high-quality, style-consistent datasets using large generative models and proposes style-supervised contrastive learning for effective...
🔹 Publication Date: Published on Apr 9
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2604.08364
• PDF: https://arxiv.org/pdf/2604.08364
• Project Page: https://jeoyal.github.io/MegaStyle/
==================================
For more data science resources:
✓ https://t.iss.one/DataScienceT
#AI #DataScience #MachineLearning #HuggingFace #Research
✨OpenSpatial: A Principled Data Engine for Empowering Spatial Intelligence
📝 Summary:
OpenSpatial presents an open-source data engine for spatial reasoning tasks using 3D bounding boxes, creating a large-scale dataset and achieving state-of-the-art performance in spatial perception ben...
🔹 Publication Date: Published on Apr 8
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2604.07296
• PDF: https://arxiv.org/pdf/2604.07296
• Github: https://github.com/VINHYU/OpenSpatial
==================================
For more data science resources:
✓ https://t.iss.one/DataScienceT
#AI #DataScience #MachineLearning #HuggingFace #Research
✨Lighting-grounded Video Generation with Renderer-based Agent Reasoning
📝 Summary:
LiVER presents a diffusion-based framework for scene-controllable video generation that disentangles 3D scene properties through explicit conditioning and automated user instruction translation. AI-ge...
🔹 Publication Date: Published on Apr 9
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2604.07966
• PDF: https://arxiv.org/pdf/2604.07966
==================================
For more data science resources:
✓ https://t.iss.one/DataScienceT
#AI #DataScience #MachineLearning #HuggingFace #Research
✨Automating Database-Native Function Code Synthesis with LLMs
📝 Summary:
Database systems incorporate an ever-growing number of functions in their kernels (a.k.a., database native functio...
🔹 Publication Date: Published on Apr 2
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2604.06231
• PDF: https://arxiv.org/pdf/2604.06231
• Project Page: https://code4db.github.io/hi-opencook/
• Github: https://github.com/weAIDB/OpenCook
==================================
For more data science resources:
✓ https://t.iss.one/DataScienceT
#AI #DataScience #MachineLearning #HuggingFace #Research
✨SIM1: Physics-Aligned Simulator as Zero-Shot Data Scaler in Deformable Worlds
📝 Summary:
A physics-aligned simulation framework enables effective robotic manipulation of deformable objects by creating metric-consistent synthetic data that matches real-world performance. AI-generated summa...
🔹 Publication Date: Published on Apr 9
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2604.08544
• PDF: https://arxiv.org/pdf/2604.08544
• Project Page: https://internrobotics.github.io/sim1.github.io/
• Github: https://github.com/InternRobotics/SIM1
==================================
For more data science resources:
✓ https://t.iss.one/DataScienceT
#AI #DataScience #MachineLearning #HuggingFace #Research
✨Structured Distillation of Web Agent Capabilities Enables Generalization
📝 Summary:
Structured synthetic trajectory generation using a frontier LLM as teacher enables open-weight web agents with superior performance and cross-environment capabilities. AI-generated summary Frontier LL...
🔹 Publication Date: Published on Apr 9
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2604.07776
• Collection: https://huggingface.co/collections/McGill-NLP/a3-agent-as-annotators
• PDF: https://arxiv.org/pdf/2604.07776
• Project Page: https://agent-as-annotators.github.io/
• Github: https://github.com/McGill-NLP/agent-as-annotators
🔹 Models citing this paper:
• https://huggingface.co/McGill-NLP/A3-Qwen3.5-9B
• https://huggingface.co/McGill-NLP/A3-Qwen3.5-4B
• https://huggingface.co/McGill-NLP/A3-Qwen3.5-2B
✨ Datasets citing this paper:
• https://huggingface.co/datasets/McGill-NLP/A3-Synth
==================================
For more data science resources:
✓ https://t.iss.one/DataScienceT
#AI #DataScience #MachineLearning #HuggingFace #Research
✨Structural Graph Probing of Vision-Language Models
📝 Summary:
Vision-language models exhibit structured neural topology where correlation graphs reveal behaviorally significant patterns and influential recurrent hub neurons that drive multimodal performance. AI-...
🔹 Publication Date: Published on Mar 28
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.27070
• PDF: https://arxiv.org/pdf/2603.27070
• Github: https://github.com/he-h/vlm-graphprobing
==================================
For more data science resources:
✓ https://t.iss.one/DataScienceT
#AI #DataScience #MachineLearning #HuggingFace #Research
✨ClawBench: Can AI Agents Complete Everyday Online Tasks?
📝 Summary:
ClawBench presents a framework of 153 real-world online tasks on live platforms to evaluate AI agents. These complex multi-step tasks require capabilities like document processing and form filling. Current frontier AI models complete only a small portion, showing significant limitations for gener...
🔹 Publication Date: Published on Apr 9
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2604.08523
• PDF: https://arxiv.org/pdf/2604.08523
• Project Page: https://claw-bench.com
• Github: https://github.com/reacher-z/ClawBench
==================================
For more data science resources:
✓ https://t.iss.one/DataScienceT
#AI #DataScience #MachineLearning #HuggingFace #Research
✨Rethinking Generalization in Reasoning SFT: A Conditional Analysis on Optimization, Data, and Model Capability
📝 Summary:
Supervised finetuning and reinforcement learning exhibit conditional cross-domain generalization in reasoning tasks, influenced by optimization dynamics, data quality, and model capability, with asymm...
🔹 Publication Date: Published on Apr 8
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2604.06628
• PDF: https://arxiv.org/pdf/2604.06628
• Github: https://github.com/Nebularaid2000/rethink_sft_generalization
==================================
For more data science resources:
✓ https://t.iss.one/DataScienceT
#AI #DataScience #MachineLearning #HuggingFace #Research
✨ViVa: A Video-Generative Value Model for Robot Reinforcement Learning
📝 Summary:
ViVa is a video-generative value model for robot reinforcement learning. It estimates values by leveraging pretrained video generators to predict future robot dynamics, moving beyond static observations. This approach improves robot manipulation and generalizes to novel objects.
🔹 Publication Date: Published on Apr 9
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2604.08168
• PDF: https://arxiv.org/pdf/2604.08168
• Project Page: https://viva-value-model.github.io/
• Github: https://github.com/GigaAI-research/ViVa
==================================
For more data science resources:
✓ https://t.iss.one/DataScienceT
#Robotics #ReinforcementLearning #GenerativeAI #MachineLearning #AI
✨POS-ISP: Pipeline Optimization at the Sequence Level for Task-aware ISP
📝 Summary:
POS-ISP presents a sequence-level reinforcement learning framework for optimizing image signal processing pipelines by predicting complete module sequences and parameters in a single forward pass, imp...
🔹 Publication Date: Published on Apr 8
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2604.06938
• PDF: https://arxiv.org/pdf/2604.06938
• Project Page: https://w1jyun.github.io/POS-ISP/
• Github: https://github.com/w1jyun/POS-ISP
==================================
For more data science resources:
✓ https://t.iss.one/DataScienceT
#AI #DataScience #MachineLearning #HuggingFace #Research
✨On the Global Photometric Alignment for Low-Level Vision
📝 Summary:
Photometric alignment loss addresses optimization pathologies in low-level vision by discounting photometric discrepancies through affine color alignment while preserving content restoration. AI-gener...
🔹 Publication Date: Published on Apr 9
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2604.08172
• PDF: https://arxiv.org/pdf/2604.08172
==================================
For more data science resources:
✓ https://t.iss.one/DataScienceT
#AI #DataScience #MachineLearning #HuggingFace #Research
✨ImplicitMemBench: Measuring Unconscious Behavioral Adaptation in Large Language Models
📝 Summary:
ImplicitMemBench presents a novel benchmark for evaluating implicit memory in LLM agents through procedural memory, priming, and classical conditioning constructs, revealing significant performance ga...
🔹 Publication Date: Published on Apr 9
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2604.08064
• PDF: https://arxiv.org/pdf/2604.08064
==================================
For more data science resources:
✓ https://t.iss.one/DataScienceT
#AI #DataScience #MachineLearning #HuggingFace #Research
✨Personalizing Text-to-Image Generation to Individual Taste
📝 Summary:
A novel dataset and predictive framework called PAMELA are introduced to model personalized image evaluations by leveraging user-specific ratings across diverse image domains, enabling more accurate p...
🔹 Publication Date: Published on Apr 8
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2604.07427
• PDF: https://arxiv.org/pdf/2604.07427
• Project Page: https://pamela-bench.github.io/
==================================
For more data science resources:
✓ https://t.iss.one/DataScienceT
#AI #DataScience #MachineLearning #HuggingFace #Research
✨AnomalyVFM -- Transforming Vision Foundation Models into Zero-Shot Anomaly Detectors
📝 Summary:
AnomalyVFM is a framework that enhances vision foundation models for zero-shot anomaly detection through synthetic dataset generation and parameter-efficient adaptation, achieving superior performance...
🔹 Publication Date: Published on Apr 9
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2601.20524
• PDF: https://arxiv.org/pdf/2601.20524
• Project Page: https://maticfuc.github.io/anomaly_vfm/
• Github: https://github.com/MaticFuc/AnomalyVFM
==================================
For more data science resources:
✓ https://t.iss.one/DataScienceT
#AI #DataScience #MachineLearning #HuggingFace #Research
✨VRAG-RL: Empower Vision-Perception-Based RAG for Visually Rich Information Understanding via Iterative Reasoning with Reinforcement Learning
📝 Summary:
VRAG-RL introduces a reinforcement learning framework to empower vision-language models for understanding visually rich information. It uses adaptive visual perception and query optimization to enhance retrieval and reasoning, overcoming limitations of current RAG methods.
🔹 Publication Date: Published on May 28, 2025
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2505.22019
• PDF: https://arxiv.org/pdf/2505.22019
• Github: https://github.com/Alibaba-NLP/VRAG
🔹 Models citing this paper:
• https://huggingface.co/Qiuchen-Wang/Qwen2.5-VL-7B-VRAG
==================================
For more data science resources:
✓ https://t.iss.one/DataScienceT
#RAG #ReinforcementLearning #VisionLanguageModels #ComputerVision #AI
✨Faithful GRPO: Improving Visual Spatial Reasoning in Multimodal Language Models via Constrained Policy Optimization
📝 Summary:
Researchers investigate how reinforcement learning with verifiable rewards can improve visual reasoning accuracy while maintaining logical consistency and visual grounding in multimodal reasoning mode...
🔹 Publication Date: Published on Apr 9
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2604.08476
• PDF: https://arxiv.org/pdf/2604.08476
==================================
For more data science resources:
✓ https://t.iss.one/DataScienceT
#AI #DataScience #MachineLearning #HuggingFace #Research
✨Small Vision-Language Models are Smart Compressors for Long Video Understanding
📝 Summary:
Tempo is an efficient framework that compresses long videos for multimodal understanding by using a small vision-language model for temporal compression and adaptive token allocation to maintain inten...
🔹 Publication Date: Published on Apr 9
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2604.08120
• PDF: https://arxiv.org/pdf/2604.08120
• Project Page: https://feielysia.github.io/tempo-page/
• Github: https://feielysia.github.io/tempo-page/
🔹 Models citing this paper:
• https://huggingface.co/Vision-CAIR/Tempo-6B
✨ Spaces citing this paper:
• https://huggingface.co/spaces/Vision-CAIR/Tempo
==================================
For more data science resources:
✓ https://t.iss.one/DataScienceT
#AI #DataScience #MachineLearning #HuggingFace #Research
✨CylinderDepth: Cylindrical Spatial Attention for Multi-View Consistent Self-Supervised Surround Depth Estimation
📝 Summary:
A geometry-guided method for multi-camera depth estimation that improves consistency across overlapping images using cylindrical spatial attention mechanisms. AI-generated summary Self-supervised surr...
🔹 Publication Date: Published on Apr 8
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2511.16428
• PDF: https://arxiv.org/pdf/2511.16428
• Project Page: https://abualhanud.github.io/CylinderDepthPage/
• Github: https://abualhanud.github.io/CylinderDepthPage/
==================================
For more data science resources:
✓ https://t.iss.one/DataScienceT
#AI #DataScience #MachineLearning #HuggingFace #Research
✨Training a Student Expert via Semi-Supervised Foundation Model Distillation
📝 Summary:
A semi-supervised distillation framework compresses vision foundation models into compact experts for instance segmentation. It uses limited labeled and abundant unlabeled data, employing a novel instance-aware contrastive loss. The student models outperform their teachers and state-of-the-art SSKD.
🔹 Publication Date: Published on Apr 4
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2604.03841
• PDF: https://arxiv.org/pdf/2604.03841
==================================
For more data science resources:
✓ https://t.iss.one/DataScienceT
#AI #DataScience #MachineLearning #HuggingFace #Research