✨FlowAct-R1: Towards Interactive Humanoid Video Generation
📝 Summary:
FlowAct-R1 enables real-time interactive humanoid video generation with high-fidelity synthesis and low-latency responsiveness through MMDiT architecture and chunkwise diffusion forcing strategies. AI...
🔹 Publication Date: Published on Jan 15
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2601.10103
• PDF: https://arxiv.org/pdf/2601.10103
==================================
For more data science resources:
✓ https://t.iss.one/DataScienceT
#AI #DataScience #MachineLearning #HuggingFace #Research
✨Rewarding the Rare: Uniqueness-Aware RL for Creative Problem Solving in LLMs
📝 Summary:
Reinforcement learning for large language models is enhanced by a rollout-level objective that rewards rare high-level reasoning strategies, improving diverse solution discovery without sacrificing in...
🔹 Publication Date: Published on Jan 13
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2601.08763
• PDF: https://arxiv.org/pdf/2601.08763
#AI #DataScience #MachineLearning #HuggingFace #Research
✨Collaborative Multi-Agent Test-Time Reinforcement Learning for Reasoning
📝 Summary:
Multi-Agent Test-Time Reinforcement Learning (MATTRL) enhances multi-agent reasoning through structured textual experience injection and consensus-based decision making at inference time. AI-generated...
🔹 Publication Date: Published on Jan 14
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2601.09667
• PDF: https://arxiv.org/pdf/2601.09667
#AI #DataScience #MachineLearning #HuggingFace #Research
✨EvasionBench: Detecting Evasive Answers in Financial Q&A via Multi-Model Consensus and LLM-as-Judge
📝 Summary:
EvasionBench introduces a large-scale benchmark for detecting evasive responses in earnings calls using a multi-model annotation framework that leverages disagreement between advanced language models ...
🔹 Publication Date: Published on Jan 14
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2601.09142
• PDF: https://arxiv.org/pdf/2601.09142
🔹 Models citing this paper:
• https://huggingface.co/FutureMa/Eva-4B
✨ Spaces citing this paper:
• https://huggingface.co/spaces/FutureMa/financial-evasion-detection
#AI #DataScience #MachineLearning #HuggingFace #Research
✨ToolSafe: Enhancing Tool Invocation Safety of LLM-based agents via Proactive Step-level Guardrail and Feedback
📝 Summary:
A guardrail model and reasoning framework are developed to detect and prevent unsafe tool invocations in LLM agents, improving both safety and task performance under adversarial conditions. AI-generat...
🔹 Publication Date: Published on Jan 15
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2601.10156
• PDF: https://arxiv.org/pdf/2601.10156
• Github: https://github.com/MurrayTom/ToolSafe
#AI #DataScience #MachineLearning #HuggingFace #Research
✨Transition Matching Distillation for Fast Video Generation
📝 Summary:
Transition Matching Distillation enables efficient video generation by distilling diffusion models into few-step predictors using conditional flows and semantic representation decomposition. AI-genera...
🔹 Publication Date: Published on Jan 14
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2601.09881
• PDF: https://arxiv.org/pdf/2601.09881
#AI #DataScience #MachineLearning #HuggingFace #Research
✨Action100M: A Large-scale Video Action Dataset
📝 Summary:
Action100M is a large-scale video action dataset constructed from internet instructional videos using automated pipelines with V-JEPA embeddings and GPT-based reasoning for structured annotations. AI-...
🔹 Publication Date: Published on Jan 15
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2601.10592
• PDF: https://arxiv.org/pdf/2601.10592
#AI #DataScience #MachineLearning #HuggingFace #Research
✨STEP3-VL-10B Technical Report
📝 Summary:
STEP3-VL-10B is a lightweight 10B multimodal model that rivals much larger models and proprietary flagships in performance. It uses unified pre-training, scaled post-training, and Parallel Coordinated Reasoning for efficient visual reasoning. This open-source model sets a new standard for compact...
🔹 Publication Date: Published on Jan 14
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2601.09668
• PDF: https://arxiv.org/pdf/2601.09668
• Project Page: https://stepfun-ai.github.io/Step3-VL-10B
• Github: https://github.com/stepfun-ai/Step3-VL-10B
🔹 Models citing this paper:
• https://huggingface.co/stepfun-ai/Step3-VL-10B
• https://huggingface.co/stepfun-ai/Step3-VL-10B-Base
#AI #DataScience #MachineLearning #HuggingFace #Research
✨PRL: Process Reward Learning Improves LLMs' Reasoning Ability and Broadens the Reasoning Boundary
📝 Summary:
Process Reward Learning decomposes reinforcement learning objectives into intermediate steps to provide fine-grained supervision for improving large language model reasoning abilities. AI-generated su...
🔹 Publication Date: Published on Jan 15
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2601.10201
• PDF: https://arxiv.org/pdf/2601.10201
• Github: https://github.com/MaxwellJryao/Process-Reward-Learning
#AI #DataScience #MachineLearning #HuggingFace #Research
✨LaViT: Aligning Latent Visual Thoughts for Multi-modal Reasoning
📝 Summary:
LaViT addresses the perception gap in multimodal reasoning by aligning latent visual thoughts through autoregressive reconstruction of visual semantics and attention trajectories, improving visual gro...
🔹 Publication Date: Published on Jan 15
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2601.10129
• PDF: https://arxiv.org/pdf/2601.10129
• Github: https://github.com/Svardfox/LaViT
🔹 Models citing this paper:
• https://huggingface.co/Svard/LaViT-3B
#AI #DataScience #MachineLearning #HuggingFace #Research
✨Deriving Character Logic from Storyline as Codified Decision Trees
📝 Summary:
Executable and interpretable decision trees are induced from narrative data to create robust behavioral profiles for role-playing agents, outperforming traditional methods in consistency and reliabili...
🔹 Publication Date: Published on Jan 15
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2601.10080
• PDF: https://arxiv.org/pdf/2601.10080
#AI #DataScience #MachineLearning #HuggingFace #Research
✨Urban Socio-Semantic Segmentation with Vision-Language Reasoning
📝 Summary:
SocioReasoner, a vision-language AI, performs urban socio-semantic segmentation of social entities. It simulates human reasoning using reinforcement learning on a new dataset. This approach outperforms state-of-the-art models, achieving strong zero-shot generalization.
🔹 Publication Date: Published on Jan 15
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2601.10477
• PDF: https://arxiv.org/pdf/2601.10477
• Github: https://github.com/AMAP-ML/SocioReasoner
#AI #DataScience #MachineLearning #HuggingFace #Research
✨LSRIF: Logic-Structured Reinforcement Learning for Instruction Following
📝 Summary:
A logic-structured training framework explicitly models instruction logic through constraint-aware reward mechanisms, improving instruction-following and reasoning capabilities in large language model...
🔹 Publication Date: Published on Jan 10
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2601.06431
• PDF: https://arxiv.org/pdf/2601.06431
#AI #DataScience #MachineLearning #HuggingFace #Research
✨TAG-MoE: Task-Aware Gating for Unified Generative Mixture-of-Experts
📝 Summary:
A novel framework injects semantic intent into Mixture-of-Experts routing for image generation and editing, resolving task interference through hierarchical task annotation and predictive alignment re...
🔹 Publication Date: Published on Jan 12
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2601.08881
• PDF: https://arxiv.org/pdf/2601.08881
• Project Page: https://yuci-gpt.github.io/TAG-MoE/
#AI #DataScience #MachineLearning #HuggingFace #Research
✨WildRayZer: Self-supervised Large View Synthesis in Dynamic Environments
📝 Summary:
WildRayZer is a self-supervised framework for novel view synthesis in dynamic environments that uses analysis-by-synthesis to handle moving cameras and objects through motion masking and gradient gati...
🔹 Publication Date: Published on Jan 15
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2601.10716
• PDF: https://arxiv.org/pdf/2601.10716
#AI #DataScience #MachineLearning #HuggingFace #Research
✨Beyond Static Tools: Test-Time Tool Evolution for Scientific Reasoning
📝 Summary:
Existing AI agents for science struggle with static tool libraries. This paper introduces Test-Time Tool Evolution (TTE), a new method that lets agents dynamically create, verify, and evolve tools during inference. TTE achieves state-of-the-art performance and adapts tools across domains.
🔹 Publication Date: Published on Jan 12
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2601.07641
• PDF: https://arxiv.org/pdf/2601.07641
• Github: https://github.com/lujiaxuan0520/Test-Time-Tool-Evol
#AI #ScientificReasoning #ToolEvolution #AgentAI #AIResearch
✨Toward Ultra-Long-Horizon Agentic Science: Cognitive Accumulation for Machine Learning Engineering
📝 Summary:
ML-Master 2.0 enables ultra-long-horizon AI autonomy for machine learning engineering. It uses Hierarchical Cognitive Caching to accumulate knowledge from execution, decoupling short-term actions from long-term strategy, achieving state-of-the-art results.
🔹 Publication Date: Published on Jan 15
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2601.10402
• PDF: https://arxiv.org/pdf/2601.10402
• Project Page: https://sjtu-sai-agents.github.io/ML-Master/
• Github: https://github.com/sjtu-sai-agents/ML-Master
#AI #MachineLearning #AutonomousAI #AIAgents #CognitiveAI
✨CaMeLs Can Use Computers Too: System-level Security for Computer Use Agents
📝 Summary:
Computer Use Agents (CUAs) are vulnerable to prompt injection. This paper introduces Single-Shot Planning, which generates a full execution graph before UI observation to ensure control flow integrity. This secures CUAs against instruction injections while maintaining performance, though Branch Steering...
🔹 Publication Date: Published on Jan 14
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2601.09923
• PDF: https://arxiv.org/pdf/2601.09923
#AgentSecurity #PromptInjection #AIsecurity #Cybersecurity #AIagents
✨HeartMuLa: A Family of Open Sourced Music Foundation Models
📝 Summary:
HeartMuLa introduces open-source music foundation models for understanding and generation. It features an LLM-based generator creating high-fidelity music with controllable attributes. This system achieves commercial-grade quality using academic resources.
🔹 Publication Date: Published on Jan 15
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2601.10547
• PDF: https://arxiv.org/pdf/2601.10547
🔹 Models citing this paper:
• https://huggingface.co/HeartMuLa/HeartMuLa-oss-3B
• https://huggingface.co/HeartMuLa/HeartCodec-oss
• https://huggingface.co/HeartMuLa/HeartTranscriptor-oss
#MusicAI #GenerativeAI #FoundationModels #LLM #OpenSource
✨VIBE: Visual Instruction Based Editor
📝 Summary:
VIBE is a compact image editor using a 2B-parameter guidance model and a 1.6B-parameter diffusion model. It achieves high-quality, source-consistent edits with low computational cost, outperforming larger models. VIBE fits in 24GB GPU memory and generates 2K images in 4 seconds.
🔹 Publication Date: Published on Jan 5
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2601.02242
• PDF: https://arxiv.org/pdf/2601.02242
• Project Page: https://riko0.github.io/VIBE/
• Github: https://github.com/ai-forever/vibe
🔹 Models citing this paper:
• https://huggingface.co/iitolstykh/VIBE-Image-Edit
✨ Spaces citing this paper:
• https://huggingface.co/spaces/iitolstykh/VIBE-Image-Edit-DEMO
#ImageEditing #DiffusionModels #GenerativeAI #EfficientAI #AI
✨Alterbute: Editing Intrinsic Attributes of Objects in Images
📝 Summary:
Alterbute is a diffusion method for editing intrinsic object attributes like color or shape, while preserving identity and scene context. It uses a relaxed training objective and Visual Named Entities for scalable, identity-preserving supervision, outperforming existing methods.
🔹 Publication Date: Published on Jan 15
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2601.10714
• PDF: https://arxiv.org/pdf/2601.10714
• Project Page: https://talreiss.github.io/alterbute/
• Github: https://talreiss.github.io/alterbute/
#Alterbute #DiffusionModels #ImageEditing #ComputerVision #AIResearch