✨PPTAgent: Generating and Evaluating Presentations Beyond Text-to-Slides
📝 Summary:
PPTAgent, a two-stage approach, improves presentation generation by analyzing reference presentations and ensuring structural and content consistency, outperforming traditional methods across content, design, and coherence.
🔹 Publication Date: Published on Jan 7, 2025
🔹 Paper Links:
• Hugging Face Collection: https://huggingface.co/collections/ICIP/pptagent
• PDF: https://arxiv.org/pdf/2501.03936
• Project Page: https://github.com/icip-cas/PPTAgent
• Github: https://github.com/icip-cas/PPTAgent
✨ Datasets citing this paper:
• https://huggingface.co/datasets/Forceless/Zenodo10K
==================================
For more data science resources:
✓ https://t.iss.one/DataScienceT
#AI #DataScience #MachineLearning #HuggingFace #Research
✨Expert Threshold Routing for Autoregressive Language Modeling with Dynamic Computation Allocation and Load Balancing
📝 Summary:
Expert Threshold (ET) routing dynamically allocates computation in MoE models. Tokens route to experts based on individual scores exceeding EMA thresholds, achieving load balance without auxiliary losses. ET lowers cross-entropy loss by 0.067 compared to token-choice MoE.
🔹 Publication Date: Published on Mar 12
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.11535
• PDF: https://arxiv.org/pdf/2603.11535
• Github: https://github.com/MasterGodzilla/Expert-Threshold-Routing
==================================
For more data science resources:
✓ https://t.iss.one/DataScienceT
#AI #DataScience #MachineLearning #HuggingFace #Research
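The threshold-routing rule in the summary above can be sketched as follows. This is a toy illustration, not the paper's implementation: the function name, the load-balancing update, and `target_load` are all assumptions.

```python
import numpy as np

def et_route(scores, thresholds, ema_decay=0.99, target_load=0.25):
    """Route each token to every expert whose affinity score exceeds that
    expert's running threshold, then nudge thresholds toward a target
    per-expert load via an exponential moving average (illustrative)."""
    # scores: (num_tokens, num_experts) router affinities
    assignments = scores > thresholds          # boolean routing mask
    load = assignments.mean(axis=0)            # fraction of tokens per expert
    # Raise thresholds for overloaded experts, lower them for underloaded
    # ones, smoothing the adjustment with an EMA (no auxiliary loss needed)
    thresholds = ema_decay * thresholds + (1 - ema_decay) * (
        thresholds + (load - target_load)
    )
    return assignments, thresholds
```

Because the mask is per-token and per-expert, a token can activate zero or several experts, which is how compute allocation becomes dynamic.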
✨V-JEPA 2.1: Unlocking Dense Features in Video Self-Supervised Learning
📝 Summary:
V-JEPA 2.1 is a self-supervised model learning dense visual representations for images and videos. It combines dense predictive loss, deep self-supervision, multi-modal tokenizers, and scaling to achieve state-of-the-art performance across various benchmarks, significantly advancing visual understanding.
🔹 Publication Date: Published on Mar 15
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.14482
• PDF: https://arxiv.org/pdf/2603.14482
• Project Page: https://ai.meta.com/blog/v-jepa-2-world-model-benchmarks/
• Github: https://github.com/facebookresearch/vjepa2
==================================
For more data science resources:
✓ https://t.iss.one/DataScienceT
#SelfSupervisedLearning #ComputerVision #DeepLearning #AI #VideoUnderstanding
✨From Prior to Pro: Efficient Skill Mastery via Distribution Contractive RL Finetuning
📝 Summary:
DICE-RL refines pretrained generative robot policies via reinforcement learning distribution contraction. It boosts high-success behaviors, leading to stable, sample-efficient mastery of complex manipulation from pixels on real robots.
🔹 Publication Date: Published on Mar 10
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.10263
• PDF: https://arxiv.org/pdf/2603.10263
• Project Page: https://zhanyisun.github.io/dice.rl.2026/
==================================
For more data science resources:
✓ https://t.iss.one/DataScienceT
#AI #DataScience #MachineLearning #HuggingFace #Research
✨AdapterTune: Zero-Initialized Low-Rank Adapters for Frozen Vision Transformers
📝 Summary:
AdapterTune introduces zero-initialized low-rank adapters for Vision Transformers, addressing optimization instability and capacity issues. This method prevents representation drift and significantly improves accuracy, often outperforming full fine-tuning with fewer parameters.
🔹 Publication Date: Published on Mar 16
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.14706
• PDF: https://arxiv.org/pdf/2603.14706
• Github: https://github.com/salimkhazem/adaptertune
==================================
For more data science resources:
✓ https://t.iss.one/DataScienceT
#AI #DataScience #MachineLearning #HuggingFace #Research
✨OSM-based Domain Adaptation for Remote Sensing VLMs
📝 Summary:
A self-contained domain adaptation framework for vision-language models in remote sensing uses OpenStreetMap data and optical character recognition to generate captions without requiring external teacher models.
🔹 Publication Date: Published on Mar 12
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.11804
• PDF: https://arxiv.org/pdf/2603.11804
==================================
For more data science resources:
✓ https://t.iss.one/DataScienceT
#AI #DataScience #MachineLearning #HuggingFace #Research
✨Generation Models Know Space: Unleashing Implicit 3D Priors for Scene Understanding
📝 Summary:
A video diffusion model is repurposed as a latent world simulator to enhance multimodal large language models with implicit 3D structural priors and physical laws through spatiotemporal feature extraction.
🔹 Publication Date: Published on Mar 19
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.19235
• PDF: https://arxiv.org/pdf/2603.19235
• Project Page: https://github.com/H-EmbodVis/VEGA-3D
• Github: https://github.com/H-EmbodVis/VEGA-3D
==================================
For more data science resources:
✓ https://t.iss.one/DataScienceT
#AI #DataScience #MachineLearning #HuggingFace #Research
✨SAMA: Factorized Semantic Anchoring and Motion Alignment for Instruction-Guided Video Editing
📝 Summary:
SAMA presents a factorized approach to video editing that separates semantic anchoring from motion modeling, enabling instruction-guided edits with preserved motion through pre-trained motion restoration.
🔹 Publication Date: Published on Mar 19
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.19228
• PDF: https://arxiv.org/pdf/2603.19228
• Project Page: https://cynthiazxy123.github.io/SAMA/
• Github: https://github.com/Cynthiazxy123/SAMA
==================================
For more data science resources:
✓ https://t.iss.one/DataScienceT
#AI #DataScience #MachineLearning #HuggingFace #Research
✨Cubic Discrete Diffusion: Discrete Visual Generation on High-Dimensional Representation Tokens
📝 Summary:
CubiD is a discrete generation model for high-dimensional representations that enables fine-grained masking and learns rich correlations across spatial positions while maintaining fixed generation steps.
🔹 Publication Date: Published on Mar 19
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.19232
• PDF: https://arxiv.org/pdf/2603.19232
• Github: https://github.com/YuqingWang1029/CubiD
==================================
For more data science resources:
✓ https://t.iss.one/DataScienceT
#AI #DataScience #MachineLearning #HuggingFace #Research
✨Memento-Skills: Let Agents Design Agents
📝 Summary:
A generalist language model agent system autonomously designs and improves task-specific agents through memory-based reinforcement learning with stateful prompts and skill libraries.
🔹 Publication Date: Published on Mar 19
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.18743
• PDF: https://arxiv.org/pdf/2603.18743
• Project Page: https://memento.run/
• Github: https://github.com/Memento-Teams/Memento-Skills
==================================
For more data science resources:
✓ https://t.iss.one/DataScienceT
#AI #DataScience #MachineLearning #HuggingFace #Research
✨F2LLM-v2: Inclusive, Performant, and Efficient Embeddings for a Multilingual World
📝 Summary:
F2LLM-v2 is a multilingual embedding model family trained on 60 million samples across 200+ languages, achieving superior performance through LLM-based training, matryoshka learning, pruning, and distillation.
🔹 Publication Date: Published on Mar 19
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.19223
• PDF: https://arxiv.org/pdf/2603.19223
• Project Page: https://huggingface.co/collections/codefuse-ai/f2llm
🔹 Models citing this paper:
• https://huggingface.co/codefuse-ai/F2LLM-v2-8B-Preview
• https://huggingface.co/codefuse-ai/F2LLM-v2-0.6B-Preview
• https://huggingface.co/codefuse-ai/F2LLM-v2-1.7B
✨ Datasets citing this paper:
• https://huggingface.co/datasets/codefuse-ai/F2LLM-v2
==================================
For more data science resources:
✓ https://t.iss.one/DataScienceT
#AI #DataScience #MachineLearning #HuggingFace #Research
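Matryoshka learning, mentioned in the summary above, trains embeddings whose leading coordinates form usable lower-dimensional embeddings on their own. A minimal sketch of the downstream usage pattern (not the F2LLM-v2 API; the function name is illustrative):

```python
import numpy as np

def matryoshka_truncate(embeddings, dim):
    """Keep the first `dim` coordinates of each embedding and
    re-normalize to unit length, the usage pattern matryoshka-trained
    models are designed to support (sketch only)."""
    sub = embeddings[:, :dim]
    norms = np.linalg.norm(sub, axis=1, keepdims=True)
    return sub / np.clip(norms, 1e-12, None)  # avoid division by zero
```

The appeal is that one stored model serves several cost/quality operating points: truncating, say, 1024-dim vectors to 128 dims cuts index size roughly 8x while retaining most retrieval quality.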
✨LVOmniBench: Pioneering Long Audio-Video Understanding Evaluation for Omnimodal LLMs
📝 Summary:
A long-form audio-visual comprehension benchmark reveals significant challenges for current omnimodal large language models in handling extended multi-modal inputs.
🔹 Publication Date: Published on Mar 19
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.19217
• PDF: https://arxiv.org/pdf/2603.19217
• Project Page: https://kd-tao.github.io/LVOmniBench/
• Github: https://github.com/KD-TAO/LVOmniBench
==================================
For more data science resources:
✓ https://t.iss.one/DataScienceT
#AI #DataScience #MachineLearning #HuggingFace #Research
✨FASTER: Rethinking Real-Time Flow VLAs
📝 Summary:
Fast Action Sampling for ImmediaTE Reaction (FASTER) reduces real-time reaction latency in Vision-Language-Action models by adapting sampling schedules to prioritize immediate actions while maintaining action quality.
🔹 Publication Date: Published on Mar 19
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.19199
• PDF: https://arxiv.org/pdf/2603.19199
• Project Page: https://innovator-zero.github.io/FASTER
• Github: https://github.com/innovator-zero/FASTER
==================================
For more data science resources:
✓ https://t.iss.one/DataScienceT
#AI #DataScience #MachineLearning #HuggingFace #Research
✨Nemotron-Cascade 2: Post-Training LLMs with Cascade RL and Multi-Domain On-Policy Distillation
📝 Summary:
Nemotron-Cascade 2 is a 30B-parameter Mixture-of-Experts model with 3B activated parameters that achieves exceptional reasoning and agentic capabilities, matching frontier open models despite its compact size.
🔹 Publication Date: Published on Mar 19
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.19220
• PDF: https://arxiv.org/pdf/2603.19220
🔹 Models citing this paper:
• https://huggingface.co/nvidia/Nemotron-Cascade-2-30B-A3B
==================================
For more data science resources:
✓ https://t.iss.one/DataScienceT
#AI #DataScience #MachineLearning #HuggingFace #Research
✨MHPO: Modulated Hazard-aware Policy Optimization for Stable Reinforcement Learning
📝 Summary:
Modulated Hazard-aware Policy Optimization introduces a Log-Fidelity Modulator and Decoupled Hazard Penalty to stabilize reinforcement learning by controlling importance ratios and regulating asymmetric hazard penalties.
🔹 Publication Date: Published on Mar 14
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.16929
• PDF: https://arxiv.org/pdf/2603.16929
==================================
For more data science resources:
✓ https://t.iss.one/DataScienceT
#AI #DataScience #MachineLearning #HuggingFace #Research
✨Matryoshka Gaussian Splatting
📝 Summary:
Matryoshka Gaussian Splatting enables continuous level-of-detail rendering by training a single ordered set of Gaussians that maintains full-capacity quality while allowing smooth quality-scaling trade-offs.
🔹 Publication Date: Published on Mar 19
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.19234
• PDF: https://arxiv.org/pdf/2603.19234
• Github: https://github.com/ZhilinGuo/matryoshka-gaussian-splatting
==================================
For more data science resources:
✓ https://t.iss.one/DataScienceT
#AI #DataScience #MachineLearning #HuggingFace #Research
✨Reasoning over mathematical objects: on-policy reward modeling and test time aggregation
📝 Summary:
This paper introduces Principia, a new dataset for deriving mathematical objects, and training recipes using on-policy LLM judges. These methods significantly improve model performance and enable cross-format generalization in reasoning tasks, while also scaling test-time compute.
🔹 Publication Date: Published on Mar 19
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.18886
• PDF: https://arxiv.org/pdf/2603.18886
==================================
For more data science resources:
✓ https://t.iss.one/DataScienceT
#AI #DataScience #MachineLearning #HuggingFace #Research
✨ProRL Agent: Rollout-as-a-Service for RL Training of Multi-Turn LLM Agents
📝 Summary:
Reinforcement learning infrastructure for multi-turn LLM agents that provides scalable rollout services and standardized sandbox environments for complex interactive tasks.
🔹 Publication Date: Published on Mar 19
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.18815
• PDF: https://arxiv.org/pdf/2603.18815
• Github: https://github.com/NVIDIA-NeMo/ProRL-Agent-Server
==================================
For more data science resources:
✓ https://t.iss.one/DataScienceT
#AI #DataScience #MachineLearning #HuggingFace #Research
✨COT-FM: Cluster-wise Optimal Transport Flow Matching
📝 Summary:
COT-FM enhances Flow Matching by clustering target samples and assigning dedicated source distributions. This creates straighter probability paths, enabling faster and more reliable generation with improved quality across diverse tasks.
🔹 Publication Date: Published on Mar 11
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.13395
• PDF: https://arxiv.org/pdf/2603.13395
• Project Page: https://embodiedai-ntu.github.io/cotfm/
==================================
For more data science resources:
✓ https://t.iss.one/DataScienceT
#AI #DataScience #MachineLearning #HuggingFace #Research
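The cluster-wise pairing idea from the summary above can be illustrated in a few lines: k-means-cluster the target samples, give each cluster its own source Gaussian centered at the cluster mean, and pair each target with a source draw from its own cluster. This is a toy sketch under those assumptions, not COT-FM's actual optimal-transport construction; all names are illustrative.

```python
import numpy as np

def cluster_pairings(targets, k=4, seed=0, iters=20):
    """Pair each target sample with a source sample drawn from a
    dedicated per-cluster Gaussian. Same-cluster pairings shorten the
    source-to-target interpolation paths that flow matching trains on."""
    rng = np.random.default_rng(seed)
    # Plain k-means on the target samples
    centers = targets[rng.choice(len(targets), k, replace=False)]
    for _ in range(iters):
        labels = np.argmin(
            ((targets[:, None, :] - centers[None]) ** 2).sum(-1), axis=1)
        for j in range(k):
            if (labels == j).any():
                centers[j] = targets[labels == j].mean(axis=0)
    # Dedicated source distribution per cluster: unit Gaussian at its center
    sources = centers[labels] + rng.standard_normal(targets.shape)
    return sources, labels
```

Compared with pairing every target against a single standard-normal source, same-cluster pairs have much smaller average displacement, which is the "straighter probability paths" intuition.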
✨Bridging Semantic and Kinematic Conditions with Diffusion-based Discrete Motion Tokenizer
📝 Summary:
A three-stage framework bridges semantic and kinematic conditions using discrete tokens and diffusion synthesis. Its core MoTok tokenizer achieves compact high-fidelity tokens, significantly boosting controllability, fidelity, and reducing token usage under strong kinematic constraints.
🔹 Publication Date: Published on Mar 19
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.19227
• PDF: https://arxiv.org/pdf/2603.19227
• Project Page: https://rheallyc.github.io/projects/motok/
• Github: https://github.com/rheallyc/MoTok
==================================
For more data science resources:
✓ https://t.iss.one/DataScienceT
#AI #DataScience #MachineLearning #HuggingFace #Research
✨Cognitive Mismatch in Multimodal Large Language Models for Discrete Symbol Understanding
📝 Summary:
Top-tier MLLMs demonstrate limited capability in processing discrete symbols despite strong performance in complex reasoning, revealing a cognitive mismatch between visual perception and symbolic understanding.
🔹 Publication Date: Published on Mar 19
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.18472
• PDF: https://arxiv.org/pdf/2603.18472
==================================
For more data science resources:
✓ https://t.iss.one/DataScienceT
#AI #DataScience #MachineLearning #HuggingFace #Research