✨Diffusion Knows Transparency: Repurposing Video Diffusion for Transparent Object Depth and Normal Estimation
📝 Summary:
Transparent objects are hard for perception systems. This work observes that video diffusion models can synthesize transparent phenomena, so the authors repurpose one. Their DKT model, trained on a new dataset, achieves zero-shot SOTA for depth and normal estimation of transparent objects, proving diffusion knows tr...
🔹 Publication Date: Published on Dec 29
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2512.23705
• PDF: https://arxiv.org/pdf/2512.23705
• Project Page: https://daniellli.github.io/projects/DKT/
• Github: https://github.com/Daniellli/DKT
==================================
For more data science resources:
✓ https://t.iss.one/DataScienceT
#ComputerVision #DiffusionModels #DepthEstimation #TransparentObjects #AIResearch
✨SpotEdit: Selective Region Editing in Diffusion Transformers
📝 Summary:
SpotEdit is a training-free framework for selective image editing in diffusion transformers. It avoids reprocessing stable regions by reusing their features, combining them with edited areas. This reduces computation and preserves unchanged regions, enhancing efficiency and precision.
🔹 Publication Date: Published on Dec 26
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2512.22323
• PDF: https://arxiv.org/pdf/2512.22323
• Project Page: https://biangbiang0321.github.io/SpotEdit.github.io
==================================
For more data science resources:
✓ https://t.iss.one/DataScienceT
#ImageEditing #DiffusionModels #ComputerVision #AIResearch #DeepLearning
✨Dream-VL & Dream-VLA: Open Vision-Language and Vision-Language-Action Models with Diffusion Language Model Backbone
📝 Summary:
Dream-VL and Dream-VLA are diffusion-based vision-language and vision-language-action models. They achieve state-of-the-art performance in visual planning and robotic control, surpassing autoregressive baselines via their diffusion backbone's superior action generation.
🔹 Publication Date: Published on Dec 27
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2512.22615
• PDF: https://arxiv.org/pdf/2512.22615
• Project Page: https://hkunlp.github.io/blog/2025/dream-vlx/
• Github: https://github.com/DreamLM/Dream-VLX
==================================
For more data science resources:
✓ https://t.iss.one/DataScienceT
#VisionLanguageModels #DiffusionModels #Robotics #AI #ComputerVision
✨GRAN-TED: Generating Robust, Aligned, and Nuanced Text Embedding for Diffusion Models
📝 Summary:
GRAN-TED improves text encoders for diffusion models by addressing evaluation and adaptation challenges. It introduces TED-6K, an efficient text-only benchmark that predicts generation quality 750x faster. Using this, GRAN-TED develops a superior encoder via a two-stage training method, enhancing...
🔹 Publication Date: Published on Dec 17
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2512.15560
• PDF: https://arxiv.org/pdf/2512.15560
==================================
For more data science resources:
✓ https://t.iss.one/DataScienceT
#DiffusionModels #TextEmbeddings #AIResearch #MachineLearning #NLP
✨Act2Goal: From World Model To General Goal-conditioned Policy
📝 Summary:
Act2Goal is a new policy for robust long-horizon robotic manipulation. It uses a goal-conditioned visual world model with multi-scale temporal control to plan intermediate states and execute precisely. This allows strong generalization and rapid online adaptation, significantly boosting real-robo...
🔹 Publication Date: Published on Dec 29
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2512.23541
• PDF: https://arxiv.org/pdf/2512.23541
• Project Page: https://act2goal.github.io/
==================================
For more data science resources:
✓ https://t.iss.one/DataScienceT
#Robotics #AI #MachineLearning #WorldModels #ReinforcementLearning
✨Stream-DiffVSR: Low-Latency Streamable Video Super-Resolution via Auto-Regressive Diffusion
📝 Summary:
Diffusion-based video super-resolution (VSR) methods achieve strong perceptual quality but remain impractical for latency-sensitive settings due to reliance on future frames and expensive multi-step d...
🔹 Publication Date: Published on Dec 29
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2512.23709
• PDF: https://arxiv.org/pdf/2512.23709
• Project Page: https://jamichss.github.io/stream-diffvsr-project-page/
==================================
For more data science resources:
✓ https://t.iss.one/DataScienceT
#AI #DataScience #MachineLearning #HuggingFace #Research
✨Web World Models
📝 Summary:
Web World Models (WWMs) combine web frameworks with large language models to create controllable, open-ended persistent environments by structuring world state in web code and leveraging model-driven ...
🔹 Publication Date: Published on Dec 29
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2512.23676
• PDF: https://arxiv.org/pdf/2512.23676
• Project Page: https://github.com/Princeton-AI2-Lab/Web-World-Models
• Github: https://github.com/Princeton-AI2-Lab/Web-World-Models
==================================
For more data science resources:
✓ https://t.iss.one/DataScienceT
#AI #DataScience #MachineLearning #HuggingFace #Research
✨DiRL: An Efficient Post-Training Framework for Diffusion Language Models
📝 Summary:
DiRL is an efficient post-training framework for Diffusion Language Models, integrating online updates and introducing DiPO for unbiased policy optimization. It achieves state-of-the-art math performance for dLLMs, surpassing comparable models.
🔹 Publication Date: Published on Dec 23
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2512.22234
• PDF: https://arxiv.org/pdf/2512.22234
• Github: https://github.com/OpenMOSS/DiRL
🔹 Models citing this paper:
• https://huggingface.co/OpenMOSS-Team/DiRL-8B-Instruct
==================================
For more data science resources:
✓ https://t.iss.one/DataScienceT
#DiffusionModels #LLM #ModelOptimization #MachineLearning #AI
✨Video-BrowseComp: Benchmarking Agentic Video Research on Open Web
📝 Summary:
The paper introduces Video-BrowseComp, a benchmark for agentic video research on the open web, addressing the gap in current passive video processing. It requires navigating video timelines for answers, revealing that advanced models struggle with metadata-sparse video content, achieving only 15....
🔹 Publication Date: Published on Dec 28
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2512.23044
• PDF: https://arxiv.org/pdf/2512.23044
• Project Page: https://liang-zhengyang.github.io/video-browsecomp/
==================================
For more data science resources:
✓ https://t.iss.one/DataScienceT
#AI #DataScience #MachineLearning #HuggingFace #Research
✨Monadic Context Engineering
📝 Summary:
The proliferation of Large Language Models (LLMs) has catalyzed a shift towards autonomous agents capable of complex reasoning and tool use. However, current agent architectures are frequently constru...
🔹 Publication Date: Published on Dec 27
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2512.22431
• PDF: https://arxiv.org/pdf/2512.22431
• Project Page: https://yifanzhang-pro.github.io/monadic-context-engineering/
• Github: https://github.com/yifanzhang-pro/monadic-context-engineering
==================================
For more data science resources:
✓ https://t.iss.one/DataScienceT
#AI #DataScience #MachineLearning #HuggingFace #Research
✨Training AI Co-Scientists Using Rubric Rewards
📝 Summary:
AI co-scientists are emerging as a tool to assist human researchers in achieving their research goals. A crucial feature of these AI co-scientists is the ability to generate a research plan given a se...
🔹 Publication Date: Published on Dec 29
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2512.23707
• PDF: https://arxiv.org/pdf/2512.23707
==================================
For more data science resources:
✓ https://t.iss.one/DataScienceT
#AI #DataScience #MachineLearning #HuggingFace #Research
✨Nested Browser-Use Learning for Agentic Information Seeking
📝 Summary:
Information-seeking (IS) agents have achieved strong performance across a range of wide and deep search tasks, yet their tool use remains largely restricted to API-level snippet retrieval and URL-base...
🔹 Publication Date: Published on Dec 29
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2512.23647
• PDF: https://arxiv.org/pdf/2512.23647
==================================
For more data science resources:
✓ https://t.iss.one/DataScienceT
#AI #DataScience #MachineLearning #HuggingFace #Research
✨OmniAgent: Audio-Guided Active Perception Agent for Omnimodal Audio-Video Understanding
📝 Summary:
Omnimodal large language models have made significant strides in unifying audio and visual modalities; however, they often lack the fine-grained cross-modal understanding and have difficulty with mult...
🔹 Publication Date: Published on Dec 29
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2512.23646
• PDF: https://arxiv.org/pdf/2512.23646
• Project Page: https://kd-tao.github.io/OmniAgent/
==================================
For more data science resources:
✓ https://t.iss.one/DataScienceT
#AI #DataScience #MachineLearning #HuggingFace #Research
✨SurgWorld: Learning Surgical Robot Policies from Videos via World Modeling
📝 Summary:
Data scarcity remains a fundamental barrier to achieving fully autonomous surgical robots. While large scale vision language action (VLA) models have shown impressive generalization in household and i...
🔹 Publication Date: Published on Dec 29
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2512.23162
• PDF: https://arxiv.org/pdf/2512.23162
==================================
For more data science resources:
✓ https://t.iss.one/DataScienceT
#AI #DataScience #MachineLearning #HuggingFace #Research
✨An Information Theoretic Perspective on Agentic System Design
📝 Summary:
Agentic language model (LM) systems power modern applications like "Deep Research" and "Claude Code," and leverage multi-LM architectures to overcome context limitations. Beneath their apparent divers...
🔹 Publication Date: Published on Dec 25
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2512.21720
• PDF: https://arxiv.org/pdf/2512.21720
• Project Page: https://hazyresearch.stanford.edu/blog/2025-12-29-agentic-it
==================================
For more data science resources:
✓ https://t.iss.one/DataScienceT
#AI #DataScience #MachineLearning #HuggingFace #Research
✨VL-LN Bench: Towards Long-horizon Goal-oriented Navigation with Active Dialogs
📝 Summary:
A new benchmark for dialog-enabled navigation tasks introduces interactive learning to resolve ambiguous instructions through active dialog, enhancing real-world applicability of embodied agents.
🔹 Publication Date: Published on Dec 26
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2512.22342
• PDF: https://arxiv.org/pdf/2512.22342
==================================
For more data science resources:
✓ https://t.iss.one/DataScienceT
#AI #DataScience #MachineLearning #HuggingFace #Research
✨ProGuard: Towards Proactive Multimodal Safeguard
📝 Summary:
The rapid evolution of generative models has led to a continuous emergence of multimodal safety risks, exposing the limitations of existing defense methods. To address these challenges, we propose Pro...
🔹 Publication Date: Published on Dec 29
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2512.23573
• PDF: https://arxiv.org/pdf/2512.23573
• Project Page: https://yushaohan.github.io/ProGuard/
• Hugging Face Collection: https://huggingface.co/collections/yushaohan/proguard
==================================
For more data science resources:
✓ https://t.iss.one/DataScienceT
#AI #DataScience #MachineLearning #HuggingFace #Research
✨Quantile Rendering: Efficiently Embedding High-dimensional Feature on 3D Gaussian Splatting
📝 Summary:
Recent advancements in computer vision have successfully extended Open-vocabulary segmentation (OVS) to the 3D domain by leveraging 3D Gaussian Splatting (3D-GS). Despite this progress, efficiently re...
🔹 Publication Date: Published on Dec 24
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2512.20927
• PDF: https://arxiv.org/pdf/2512.20927
• Project Page: https://jaesung-choe.github.io/qrender/index.html
==================================
For more data science resources:
✓ https://t.iss.one/DataScienceT
#AI #DataScience #MachineLearning #HuggingFace #Research
✨Knot Forcing: Taming Autoregressive Video Diffusion Models for Real-time Infinite Interactive Portrait Animation
📝 Summary:
Real-time portrait animation is essential for interactive applications such as virtual assistants and live avatars, requiring high visual fidelity, temporal coherence, ultra-low latency, and responsiv...
🔹 Publication Date: Published on Dec 25
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2512.21734
• PDF: https://arxiv.org/pdf/2512.21734
==================================
For more data science resources:
✓ https://t.iss.one/DataScienceT
#AI #DataScience #MachineLearning #HuggingFace #Research
✨Bridging Your Imagination with Audio-Video Generation via a Unified Director
📝 Summary:
A unified director model leveraging a Mixture-of-Transformers architecture with interleaved and disentangled learning generates coherent video scripts and consistent keyframes through a single framewo...
🔹 Publication Date: Published on Dec 29
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2512.23222
• PDF: https://arxiv.org/pdf/2512.23222
• Project Page: https://kebii.github.io/UniMAGE/
==================================
For more data science resources:
✓ https://t.iss.one/DataScienceT
#AI #DataScience #MachineLearning #HuggingFace #Research
✨Reverse Personalization
📝 Summary:
A reverse personalization framework using conditional diffusion inversion enables attribute-controllable face anonymization, balancing identity removal and image quality.
🔹 Publication Date: Published on Dec 28
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2512.22984
• PDF: https://arxiv.org/pdf/2512.22984
• Github: https://github.com/hanweikung/reverse-personalization
==================================
For more data science resources:
✓ https://t.iss.one/DataScienceT
#AI #DataScience #MachineLearning #HuggingFace #Research