✨HISA: Efficient Hierarchical Indexing for Fine-Grained Sparse Attention
📝 Summary:
HISA improves sparse attention efficiency by replacing the traditional indexer with a hierarchical approach that reduces computational complexity from O(L²) to sub-quadratic scaling while maintaining ...
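A minimal sketch of the coarse-to-fine idea behind a hierarchical indexer, assuming block mean-pooling and per-query top-k selection (HISA's actual design is more involved; names and sizes here are illustrative):
```python
import torch

def hierarchical_topk_index(q, k, block=64, top_blocks=4, top_tokens=16):
    """q: (d,) single query; k: (L, d) keys. Returns indices of selected keys."""
    L, d = k.shape
    pad = (-L) % block
    k_pad = torch.cat([k, torch.zeros(pad, d)], dim=0)
    blocks = k_pad.view(-1, block, d)             # (L/B, B, d)
    centroids = blocks.mean(dim=1)                # coarse key summaries
    coarse = centroids @ q                        # O(L/B) block scores, not O(L^2)
    top_b = coarse.topk(min(top_blocks, centroids.shape[0])).indices
    # fine scoring only inside the chosen blocks: O(top_blocks * B)
    cand = (top_b[:, None] * block + torch.arange(block)).flatten()
    cand = cand[cand < L]
    fine = k[cand] @ q
    return cand[fine.topk(min(top_tokens, cand.numel())).indices]

q, k = torch.randn(32), torch.randn(1024, 32)
print(hierarchical_topk_index(q, k).shape)  # at most top_tokens keys per query
```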
🔹 Publication Date: Published on Mar 30
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.28458
• PDF: https://arxiv.org/pdf/2603.28458
==================================
For more data science resources:
✓ https://t.iss.one/DataScienceT
#AI #DataScience #MachineLearning #HuggingFace #Research
✨MolmoPoint: Better Pointing for VLMs with Grounding Tokens
📝 Summary:
A vision-language model approach for grounding that directly selects visual tokens containing target concepts through specialized pointing tokens, achieving superior performance in image, GUI, video p...
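An illustrative sketch of pointing by direct visual-token selection; the function name, grid mapping, and similarity scoring are assumptions for illustration, not MolmoPoint's API:
```python
import torch

def point_from_tokens(point_embed, vis_tokens, grid=(24, 24)):
    """point_embed: (d,) hidden state of a pointing token;
    vis_tokens: (N, d) visual tokens laid out row-major on a patch grid."""
    scores = vis_tokens @ point_embed      # similarity to each visual token
    idx = scores.argmax().item()           # directly select one visual token
    row, col = divmod(idx, grid[1])
    # centre of the chosen patch in normalized image coordinates
    return ((col + 0.5) / grid[1], (row + 0.5) / grid[0])

d = 256
print(point_from_tokens(torch.randn(d), torch.randn(24 * 24, d)))
```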
🔹 Publication Date: Published on Mar 30
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.28069
• PDF: https://arxiv.org/pdf/2603.28069
==================================
For more data science resources:
✓ https://t.iss.one/DataScienceT
#AI #DataScience #MachineLearning #HuggingFace #Research
✨MOOZY: A Patient-First Foundation Model for Computational Pathology
📝 Summary:
A patient-first pathology foundation model named MOOZY uses a case transformer to model dependencies across multiple slides from the same patient, achieving superior performance on diverse clinical ta...
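A hedged sketch of what a patient-level "case transformer" over per-slide embeddings could look like; layer sizes and pooling are placeholders, not MOOZY's architecture:
```python
import torch
import torch.nn as nn

class CaseTransformer(nn.Module):
    def __init__(self, d=512, heads=8, layers=2):
        super().__init__()
        enc = nn.TransformerEncoderLayer(d, heads, batch_first=True)
        self.encoder = nn.TransformerEncoder(enc, layers)
        self.cls = nn.Parameter(torch.zeros(1, 1, d))

    def forward(self, slide_embs):
        """slide_embs: (1, S, d) embeddings of S slides from one patient."""
        x = torch.cat([self.cls, slide_embs], dim=1)  # prepend a case token
        x = self.encoder(x)                           # slides attend to each other
        return x[:, 0]                                # patient-level representation

model = CaseTransformer()
patient = torch.randn(1, 5, 512)   # e.g., five slides from the same case
print(model(patient).shape)        # torch.Size([1, 512])
```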
🔹 Publication Date: Published on Mar 27
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.27048
• PDF: https://arxiv.org/pdf/2603.27048
• Project Page: https://atlasanalyticslab.github.io/MOOZY/
• Github: https://github.com/AtlasAnalyticsLab/MOOZY
🔹 Models citing this paper:
• https://huggingface.co/AtlasAnalyticsLab/MOOZY
==================================
For more data science resources:
✓ https://t.iss.one/DataScienceT
#AI #DataScience #MachineLearning #HuggingFace #Research
✨EpochX: Building the Infrastructure for an Emergent Agent Civilization
📝 Summary:
EpochX is a credits-native marketplace infrastructure designed for human-agent production networks. It enables scalable task delegation and verification, generating reusable skills and workflows. This system fosters cumulative improvement and durable human-agent collaboration through economic inc...
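Purely illustrative: a toy credits ledger showing the delegate-escrow-verify-settle loop a credits-native marketplace implies. All names are hypothetical; the summary does not specify EpochX's actual protocol:
```python
from dataclasses import dataclass, field

@dataclass
class Ledger:
    balances: dict = field(default_factory=dict)
    escrow: dict = field(default_factory=dict)

    def post_task(self, requester, task_id, credits):
        self.balances[requester] = self.balances.get(requester, 0) - credits
        self.escrow[task_id] = credits  # credits locked until verification

    def settle(self, task_id, worker, verified, requester=None):
        credits = self.escrow.pop(task_id)
        payee = worker if verified else requester  # refund on failed verification
        self.balances[payee] = self.balances.get(payee, 0) + credits

ledger = Ledger()
ledger.post_task("human", "t1", credits=10)
ledger.settle("t1", worker="agent", verified=True)
print(ledger.balances)  # {'human': -10, 'agent': 10}
```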
🔹 Publication Date: Published on Mar 28
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.27304
• PDF: https://arxiv.org/pdf/2603.27304
• Project Page: https://epochx.cc
• Github: https://github.com/QuantaAlpha/EpochX
==================================
For more data science resources:
✓ https://t.iss.one/DataScienceT
#AIAgents #HumanAICooperation #AIInfrastructure #AIEconomics #EmergentAI
✨On-the-fly Repulsion in the Contextual Space for Rich Diversity in Diffusion Transformers
📝 Summary:
Diffusion transformers often lack visual diversity. This paper introduces on-the-fly repulsion in the contextual space to enhance diversity. It intervenes in multimodal attention during the forward pass, yielding rich outcomes without losing quality or efficiency.
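A loose sketch of batch-level repulsion on contextual features during the forward pass; the paper intervenes in multimodal attention, while this simpler stand-in nudges each sample away from its nearest neighbour in the batch:
```python
import torch
import torch.nn.functional as F

def repel(ctx, strength=0.1):
    """ctx: (B, N, d) contextual features for B parallel generations."""
    pooled = F.normalize(ctx.mean(dim=1), dim=-1)  # (B, d) per-sample summary
    sim = pooled @ pooled.T
    sim.fill_diagonal_(-float("inf"))
    nearest = sim.argmax(dim=-1)                   # most similar other sample
    # push each sample's features away from its nearest neighbour's direction
    return ctx - strength * pooled[nearest].unsqueeze(1)

ctx = torch.randn(4, 77, 64)
print(repel(ctx).shape)  # torch.Size([4, 77, 64])
```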
🔹 Publication Date: Published on Mar 30
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.28762
• PDF: https://arxiv.org/pdf/2603.28762
• Project Page: https://contextual-repulsion.github.io/
==================================
For more data science resources:
✓ https://t.iss.one/DataScienceT
#DiffusionModels #DeepLearning #GenerativeAI #ComputerVision #AIResearch
✨SEAR: Schema-Based Evaluation and Routing for LLM Gateways
📝 Summary:
SEAR is a schema-based system for evaluating and routing LLM responses in gateways. It uses structured signals from LLM reasoning to make accurate, interpretable decisions, unifying evaluation and routing. It achieved significant cost reductions with comparable quality in production.
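A minimal sketch of schema-based routing, with an invented evaluation schema and thresholds (SEAR's actual schema and routing policy are not given in this summary):
```python
import json

EVAL_SCHEMA = {  # structured signals an evaluator LLM is asked to emit
    "task_complexity": "low | medium | high",
    "needs_tools": "bool",
    "confidence": "0.0 - 1.0",
}

def route(eval_json: str) -> str:
    """Pick a backend from the evaluator's schema-conforming JSON output."""
    signals = json.loads(eval_json)
    if signals["task_complexity"] == "high" or signals["confidence"] < 0.5:
        return "large-model"  # escalate hard or uncertain requests
    return "small-model"      # cheap default keeps gateway costs down

print(route('{"task_complexity": "low", "needs_tools": false, "confidence": 0.9}'))
```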
🔹 Publication Date: Published on Mar 20
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.26728
• PDF: https://arxiv.org/pdf/2603.26728
• Project Page: https://www.strukto.ai/
==================================
For more data science resources:
✓ https://t.iss.one/DataScienceT
#LLM #AIGateways #AIEvaluation #AIRouting #MachineLearning
✨TAPS: Task Aware Proposal Distributions for Speculative Sampling
📝 Summary:
Speculative decoding quality depends on matching draft model training data to the downstream task. Task-specific training yields specialized drafters that are best combined at inference time using confidence-based routing, outperforming averaging. Confidence is a more effective routing signal tha...
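A small sketch of confidence-based routing among task-specialized drafters, using top-1 probability as the confidence signal; the drafters here are random stand-ins, not the paper's Hass-style models:
```python
import torch

def route_drafter(prefix, drafters):
    """Pick the drafter whose next-token distribution is most confident."""
    best, best_conf = None, -1.0
    for name, drafter in drafters.items():
        probs = drafter(prefix)        # (vocab,) next-token distribution
        conf = probs.max().item()      # confidence = top-1 probability
        if conf > best_conf:
            best, best_conf = name, conf
    return best

vocab = 100
drafters = {
    "math": lambda p: torch.softmax(torch.randn(vocab) * 3, dim=-1),
    "chat": lambda p: torch.softmax(torch.randn(vocab), dim=-1),
}
print(route_drafter("Solve 2x+1=5", drafters))
```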
🔹 Publication Date: Published on Mar 27
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.27027
• PDF: https://arxiv.org/pdf/2603.27027
• Github: https://github.com/Moe-Zbeeb/TAPS
🔹 Models citing this paper:
• https://huggingface.co/zbeeb/Hass-MathInstruct_20epochs
• https://huggingface.co/zbeeb/Hass-ShareGPT_20epochs
• https://huggingface.co/zbeeb/Hass-Sharegpt-Mathinstruct-20epochs
✨ Datasets citing this paper:
• https://huggingface.co/datasets/zbeeb/TAPS-Datasets
==================================
For more data science resources:
✓ https://t.iss.one/DataScienceT
#SpeculativeDecoding #LLM #MachineLearning #AIResearch #NLP
✨KAT-Coder-V2 Technical Report
📝 Summary:
KAT-Coder-V2 is an agentic coding model that uses a 'Specialize-then-Unify' approach across five expert domains. It employs novel training methods and infrastructure, achieving strong performance on SWE-bench, PinchBench, and other coding benchmarks.
🔹 Publication Date: Published on Mar 29
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.27703
• PDF: https://arxiv.org/pdf/2603.27703
==================================
For more data science resources:
✓ https://t.iss.one/DataScienceT
#AI #Coding #LLM #MachineLearning #Research
✨Unified Number-Free Text-to-Motion Generation Via Flow Matching
📝 Summary:
Existing text-to-motion models struggle with a variable number of agents, leading to inefficiency and errors. This paper proposes Unified Motion Flow (UMF), a two-stage (prior and reaction) approach that uses P-Flow and S-Flow in a unified latent space. UMF effectively generates multi-person motion from text, mi...
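For context, a generic flow-matching training step, the kind of objective UMF's P-Flow and S-Flow stages presumably build on; the network and latent size here are placeholders:
```python
import torch
import torch.nn as nn

velocity_net = nn.Sequential(nn.Linear(65, 128), nn.SiLU(), nn.Linear(128, 64))

def flow_matching_loss(x1):
    """x1: (B, 64) latent motion samples; learn v(x_t, t) = x1 - x0."""
    x0 = torch.randn_like(x1)            # noise endpoint
    t = torch.rand(x1.shape[0], 1)
    xt = (1 - t) * x0 + t * x1           # linear interpolation path
    target_v = x1 - x0                   # constant target velocity
    pred_v = velocity_net(torch.cat([xt, t], dim=-1))
    return (pred_v - target_v).pow(2).mean()

print(flow_matching_loss(torch.randn(8, 64)).item())
```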
🔹 Publication Date: Published on Mar 27
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.27040
• PDF: https://arxiv.org/pdf/2603.27040
• Project Page: https://githubhgh.github.io/umf/
• Github: https://github.com/Githubhgh/UMF_CVPR
==================================
For more data science resources:
✓ https://t.iss.one/DataScienceT
#TextToMotion #FlowMatching #GenerativeAI #MotionSynthesis #DeepLearning
✨Text Data Integration
📝 Summary:
This paper argues for integrating textual data into data integration systems, as current approaches largely focus on structured data. It explores the challenges, the state of the art, and open problems in utilizing unstructured text.
🔹 Publication Date: Published on Mar 28
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.27055
• PDF: https://arxiv.org/pdf/2603.27055
• Project Page: https://dtim.upc.edu/en
• Github: https://github.com/dtim-upc/THOR
==================================
For more data science resources:
✓ https://t.iss.one/DataScienceT
#DataIntegration #UnstructuredData #TextData #NLP #DataScience
✨A Comparative Study in Surgical AI: Datasets, Foundation Models, and Barriers to Med-AGI
📝 Summary:
This paper finds that even state-of-the-art multi-billion parameter AI models struggle with surgical tool detection, a seemingly simple task. Scaling models further offers diminishing returns, suggesting fundamental limitations for current Vision Language Models in surgical use cases beyond just ...
🔹 Publication Date: Published on Mar 28
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.27341
• PDF: https://arxiv.org/pdf/2603.27341
==================================
For more data science resources:
✓ https://t.iss.one/DataScienceT
#SurgicalAI #MedicalAI #FoundationModels #VisionLanguageModels #AIHealthcare
✨HandX: Scaling Bimanual Motion and Interaction Generation
📝 Summary:
HandX presents a new foundation for bimanual hand motion synthesis, offering a high-fidelity dataset, an LLM-driven annotation method, and new evaluation metrics. It enables high-quality dexterous motion generation, with scaling trends observed.
🔹 Publication Date: Published on Mar 30
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.28766
• PDF: https://arxiv.org/pdf/2603.28766
• Project Page: https://github.com/handx-project/HandX
• Github: https://github.com/handx-project/HandX
==================================
For more data science resources:
✓ https://t.iss.one/DataScienceT
#MotionSynthesis #BimanualInteraction #DexterousManipulation #AIResearch #LLM
✨AdaptToken: Entropy-based Adaptive Token Selection for MLLM Long Video Understanding
📝 Summary:
AdaptToken enables efficient long video understanding for MLLMs by using model uncertainty to dynamically select relevant tokens. It allocates a global token budget and supports early stopping, significantly improving accuracy and reducing inference time across benchmarks.
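A hedged sketch of entropy-driven token selection under a global budget with early stopping; the scores, thresholds, and chunked loop are illustrative rather than AdaptToken's implementation:
```python
import torch

def select_tokens(frame_logits, budget=256, stop_entropy=0.5):
    """frame_logits: list of (N, V) per-chunk token logits from a video stream.
    Keep uncertain tokens until the budget is spent or the model is confident."""
    kept = []
    for logits in frame_logits:
        probs = torch.softmax(logits, dim=-1)
        entropy = -(probs * probs.clamp_min(1e-9).log()).sum(-1)  # (N,)
        if entropy.mean() < stop_entropy:
            break                               # early stop: model is certain
        k = min(budget - sum(t.numel() for t in kept), logits.shape[0])
        if k <= 0:
            break                               # global budget exhausted
        kept.append(entropy.topk(k).indices)    # keep the most uncertain tokens
    return kept

chunks = [torch.randn(128, 50) for _ in range(4)]
print([idx.shape for idx in select_tokens(chunks)])
```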
🔹 Publication Date: Published on Mar 30
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.28696
• PDF: https://arxiv.org/pdf/2603.28696
• Project Page: https://haozheqi.github.io/adapt-token
• Github: https://github.com/HaozheQi/AdaptToken
==================================
For more data science resources:
✓ https://t.iss.one/DataScienceT
#MLLM #VideoUnderstanding #MachineLearning #AIResearch #TokenSelection
✨LongTail Driving Scenarios with Reasoning Traces: The KITScenes LongTail Dataset
📝 Summary:
This paper introduces KITScenes LongTail, a new dataset for long-tail driving events. It offers multi-view video, trajectories, and multilingual expert reasoning traces. This resource improves few-shot generalization and evaluates multimodal models' instruction-following capabilities.
🔹 Publication Date: Published on Mar 24
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.23607
• PDF: https://arxiv.org/pdf/2603.23607
• Project Page: https://huggingface.co/datasets/KIT-MRT/KITScenes-LongTail
✨ Datasets citing this paper:
• https://huggingface.co/datasets/KIT-MRT/KITScenes-LongTail
==================================
For more data science resources:
✓ https://t.iss.one/DataScienceT
#AutonomousDriving #ComputerVision #Datasets #LongTailLearning #MultimodalAI
✨ChartNet: A Million-Scale, High-Quality Multimodal Dataset for Robust Chart Understanding
📝 Summary:
ChartNet is a 1.5 million-sample multimodal dataset. It improves AI models' chart understanding by providing diverse charts with aligned visual, textual, and numerical data. Fine-tuning on this open-source dataset significantly enhances performance in data visualization.
🔹 Publication Date: Published on Mar 28
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.27064
• PDF: https://arxiv.org/pdf/2603.27064
• Project Page: https://huggingface.co/datasets/ibm-granite/ChartNet
🔹 Models citing this paper:
• https://huggingface.co/ibm-granite/granite-4.0-3b-vision
• https://huggingface.co/beaupi/granite-4.0-3b-vision-oQ8
✨ Datasets citing this paper:
• https://huggingface.co/datasets/ibm-granite/ChartNet
==================================
For more data science resources:
✓ https://t.iss.one/DataScienceT
#AI #DataScience #MachineLearning #HuggingFace #Research
✨STRIDE: When to Speak Meets Sequence Denoising for Streaming Video Understanding
📝 Summary:
STRIDE enables proactive video understanding by modeling temporal activation patterns through iterative denoising within sliding windows, improving timing decisions in streaming scenarios. AI-generate...
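A loose sketch of a sliding-window when-to-speak decision, with simple iterative smoothing standing in for STRIDE's learned sequence denoiser; window size and threshold are invented:
```python
import torch

def should_speak(act, window=16, iters=3, thresh=0.8):
    """act: (T,) raw per-frame speak activations from a streaming model."""
    w = act[-window:].clone()
    for _ in range(iters):                         # iterative denoising passes
        pad = torch.cat([w[:1], w, w[-1:]])
        w = (pad[:-2] + pad[1:-1] + pad[2:]) / 3   # simple 1-D smoothing
    return w[-1].item() > thresh  # speak once the cleaned activation is decisive

stream = torch.sigmoid(torch.randn(64).cumsum(0) / 4)
print(should_speak(stream))
```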
🔹 Publication Date: Published on Mar 29
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.27593
• PDF: https://arxiv.org/pdf/2603.27593
• Project Page: https://interlive-team.github.io/STRIDE/
• Github: https://interlive-team.github.io/STRIDE/
🔹 Models citing this paper:
• https://huggingface.co/interlive/STRIDE-2B
==================================
For more data science resources:
✓ https://t.iss.one/DataScienceT
#AI #DataScience #MachineLearning #HuggingFace #Research
✨SpatialLM: Training Large Language Models for Structured Indoor Modeling
📝 Summary:
SpatialLM, a multimodal large language model, processes 3D point cloud data to generate structured scene understanding outputs, achieving state-of-the-art performance in layout estimation and competit...
🔹 Publication Date: Published on Jun 9, 2025
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2506.07491
• PDF: https://arxiv.org/pdf/2506.07491
• Project Page: https://manycore-research.github.io/SpatialLM
• Github: https://github.com/manycore-research/SpatialLM
🔹 Models citing this paper:
• https://huggingface.co/manycore-research/SpatialLM1.1-Qwen-0.5B
• https://huggingface.co/manycore-research/SpatialLM1.1-Llama-1B
✨ Datasets citing this paper:
• https://huggingface.co/datasets/Voxel51/spatial_lm_dataset
• https://huggingface.co/datasets/manycore-research/SpatialLM-Dataset
• https://huggingface.co/datasets/manycore-research/SpatialLM-Testset
==================================
For more data science resources:
✓ https://t.iss.one/DataScienceT
#AI #DataScience #MachineLearning #HuggingFace #Research
✨INSID3: Training-Free In-Context Segmentation with DINOv3
📝 Summary:
INSID3 demonstrates training-free in-context segmentation using only frozen DINOv3 features. This minimalist approach achieves state-of-the-art results across various segmentation tasks with fewer parameters and no supervision.
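A minimal sketch of training-free in-context segmentation by feature affinity; random tensors stand in for frozen DINOv3 patch features, and the temperature is an assumption:
```python
import torch
import torch.nn.functional as F

def propagate_mask(ref_feats, ref_mask, query_feats, tau=0.07):
    """ref_feats/query_feats: (N, d) patch features; ref_mask: (N,) in {0, 1}."""
    ref = F.normalize(ref_feats, dim=-1)
    qry = F.normalize(query_feats, dim=-1)
    attn = torch.softmax(qry @ ref.T / tau, dim=-1)  # query->reference affinity
    return attn @ ref_mask.float()                   # soft mask for query patches

n, d = 24 * 24, 384
mask = propagate_mask(torch.randn(n, d), torch.randint(0, 2, (n,)), torch.randn(n, d))
print(mask.shape)  # per-patch foreground scores for the query image
```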
🔹 Publication Date: Published on Mar 30
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.28480
• PDF: https://arxiv.org/pdf/2603.28480
• Project Page: https://visinf.github.io/INSID3/
• Github: https://github.com/visinf/INSID3
==================================
For more data science resources:
✓ https://t.iss.one/DataScienceT
#AI #DataScience #MachineLearning #HuggingFace #Research