✨Taming Hallucinations: Boosting MLLMs' Video Understanding via Counterfactual Video Generation
📝 Summary:
MLLMs struggle with hallucinations on counterfactual videos. DualityForge synthesizes counterfactual video data and QA pairs through diffusion-based editing to address this. This method significantly reduces model hallucinations and improves general performance.
🔹 Publication Date: Published on Dec 30, 2025
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2512.24271
• PDF: https://arxiv.org/pdf/2512.24271
• Project Page: https://amap-ml.github.io/Taming-Hallucinations/
• Github: https://github.com/AMAP-ML/Taming-Hallucinations
==================================
For more data science resources:
✓ https://t.iss.one/DataScienceT
#MLLMs #VideoUnderstanding #AIHallucinations #GenerativeAI #MachineLearning
✨NeoVerse: Enhancing 4D World Model with in-the-wild Monocular Videos
📝 Summary:
NeoVerse is a 4D world model for reconstruction and video generation. It scales to in-the-wild monocular videos using pose-free feed-forward reconstruction and online degradation simulation, achieving state-of-the-art performance.
🔹 Publication Date: Published on Jan 1, 2026
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2601.00393
• PDF: https://arxiv.org/pdf/2601.00393
• Project Page: https://neoverse-4d.github.io/
• Github: https://neoverse-4d.github.io
#4DWorldModel #VideoGeneration #ComputerVision #DeepLearning #AI
✨MorphAny3D: Unleashing the Power of Structured Latent in 3D Morphing
📝 Summary:
MorphAny3D offers a training-free framework for high-quality 3D morphing, even across categories. It leverages Structured Latent representations with two novel attention mechanisms, MCA and TFSA, for structural coherence and temporal consistency. This achieves state-of-the-art results and supports advance...
🔹 Publication Date: Published on Jan 1, 2026
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2601.00204
• PDF: https://arxiv.org/pdf/2601.00204
• Project Page: https://xiaokunsun.github.io/MorphAny3D.github.io
• Github: https://github.com/XiaokunSun/MorphAny3D
#3DMorphing #ComputerGraphics #DeepLearning #StructuredLatent #AIResearch
✨Nested Learning: The Illusion of Deep Learning Architectures
📝 Summary:
Nested Learning (NL) models machine learning as a set of nested optimization problems. It enables expressive algorithms for higher-order learning and continual adaptation, introducing optimizers, self-modifying models, and continuum memory systems.
🔹 Publication Date: Published on Dec 31, 2025
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2512.24695
• PDF: https://arxiv.org/pdf/2512.24695
#NestedLearning #MachineLearning #DeepLearning #Optimization #AI
✨AdaGaR: Adaptive Gabor Representation for Dynamic Scene Reconstruction
📝 Summary:
AdaGaR reconstructs dynamic 3D scenes from monocular video. It introduces an Adaptive Gabor Representation for detail and stability, and Cubic Hermite Splines for temporal continuity. This method achieves state-of-the-art performance.
🔹 Publication Date: Published on Jan 2, 2026
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2601.00796
• PDF: https://arxiv.org/pdf/2601.00796
• Project Page: https://jiewenchan.github.io/AdaGaR/
#3DReconstruction #ComputerVision #DynamicScenes #MonocularVideo #GaborRepresentation
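As a rough illustration of the two ingredients named in the AdaGaR summary, the sketch below evaluates a textbook 1D Gabor function (a Gaussian envelope multiplied by a cosine) and a single cubic Hermite segment between two keyframes. It is not the paper's implementation: the function names, the 1D setting, and the parameters `sigma`, `freq`, `m0`, and `m1` are assumptions chosen for illustration.

```python
import math


def gabor_1d(x, sigma, freq):
    """Textbook 1D Gabor function: Gaussian envelope times a cosine carrier."""
    envelope = math.exp(-(x * x) / (2.0 * sigma * sigma))
    return envelope * math.cos(2.0 * math.pi * freq * x)


def hermite(p0, p1, m0, m1, t):
    """Cubic Hermite interpolation on the unit interval, with t in [0, 1].

    p0, p1 are the endpoint values; m0, m1 are the endpoint tangents.
    """
    t2 = t * t
    t3 = t2 * t
    h00 = 2.0 * t3 - 3.0 * t2 + 1.0   # weight of p0
    h10 = t3 - 2.0 * t2 + t           # weight of m0
    h01 = -2.0 * t3 + 3.0 * t2        # weight of p1
    h11 = t3 - t2                     # weight of m1
    return h00 * p0 + h10 * m0 + h01 * p1 + h11 * m1


if __name__ == "__main__":
    # Sample a hypothetical trajectory between two keyframe positions.
    for step in range(5):
        t = step / 4.0
        print(t, hermite(0.0, 1.0, 0.0, 0.0, t))
```

With `freq = 0` the Gabor function reduces to a plain Gaussian. The Hermite segment passes through `p0` at `t = 0` and `p1` at `t = 1` with the given tangents, so neighbouring segments that share a tangent at their joint stay continuous in both value and first derivative.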
✨InfoSynth: Information-Guided Benchmark Synthesis for LLMs
📝 Summary:
InfoSynth automatically generates novel and diverse coding benchmarks for LLMs. It uses information-theoretic metrics and genetic algorithms to create scalable self-verifying problems, overcoming manual effort and training data contamination.
🔹 Publication Date: Published on Jan 2, 2026
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2601.00575
• PDF: https://arxiv.org/pdf/2601.00575
• Project Page: https://ishirgarg.github.io/infosynth_web/
• Github: https://github.com/ishirgarg/infosynth
#LLM #AI #Benchmarking #GenerativeAI #DeepLearning
✨Diversity or Precision? A Deep Dive into Next Token Prediction
📝 Summary:
This paper proposes a pre-training objective that reshapes the token-output distribution for better RL exploration. It uses reward-shaping to balance diversity and precision in next-token prediction. Contrary to intuition, a precision-oriented prior yields a superior exploration spac...
🔹 Publication Date: Published on Dec 28, 2025
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2512.22955
• PDF: https://arxiv.org/pdf/2512.22955
#NextTokenPrediction #ReinforcementLearning #LLM #NLP #AIResearch
✨OmniVCus: Feedforward Subject-driven Video Customization with Multimodal Control Conditions
📝 Summary:
OmniVCus introduces a system for feedforward multi-subject video customization with multimodal controls. It proposes a data pipeline, VideoCus-Factory, and a diffusion Transformer framework with novel embedding mechanisms. This enables more subjects and precise editing, significantly outperformin...
🔹 Publication Date: Published on Jun 29, 2025
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2506.23361
• PDF: https://arxiv.org/pdf/2506.23361
• Project Page: https://caiyuanhao1998.github.io/project/OmniVCus/
• Github: https://github.com/caiyuanhao1998/Open-OmniVCus
🔹 Models citing this paper:
• https://huggingface.co/CaiYuanhao/OmniVCus
🔹 Datasets citing this paper:
• https://huggingface.co/datasets/CaiYuanhao/OmniVCus
• https://huggingface.co/datasets/CaiYuanhao/OmniVCus-Test
• https://huggingface.co/datasets/CaiYuanhao/OmniVCus-Train
#VideoGeneration #DiffusionModels #MultimodalAI #DeepLearning #ComputerVision
✨Bitnet.cpp: Efficient Edge Inference for Ternary LLMs
📝 Summary:
Bitnet.cpp enhances edge inference for ternary LLMs using a novel mixed-precision matrix multiplication library. This system incorporates Ternary Lookup Tables and Int2 with a Scale for efficient, lossless inference, achieving up to a 6.25x speed increase over baselines.
🔹 Publication Date: Published on Feb 17, 2025
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2502.11880
• PDF: https://arxiv.org/pdf/2502.11880
• Github: https://github.com/microsoft/BitNet/tree/paper
#LLM #EdgeAI #MachineLearning #DeepLearning #AI
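The "Int2 with a Scale" idea in the Bitnet.cpp summary can be pictured as storing each ternary weight in two bits, plus one shared floating-point scale. The sketch below is a minimal illustration under stated assumptions: the code-to-value mapping (`w + 1`), the four-codes-per-byte layout, and all helper names are hypothetical and are not taken from the bitnet.cpp kernels.

```python
def pack_ternary(weights):
    """Pack ternary weights (-1, 0, +1) into bytes, four 2-bit codes per byte."""
    if any(w not in (-1, 0, 1) for w in weights):
        raise ValueError("weights must be ternary: -1, 0, or +1")
    packed = bytearray((len(weights) + 3) // 4)
    for i, w in enumerate(weights):
        # Assumed mapping: -1 -> 0b00, 0 -> 0b01, +1 -> 0b10.
        packed[i // 4] |= (w + 1) << (2 * (i % 4))
    return bytes(packed)


def unpack_ternary(packed, count):
    """Recover the first `count` ternary weights from the packed bytes."""
    return [((packed[i // 4] >> (2 * (i % 4))) & 0b11) - 1 for i in range(count)]


def dequantize(packed, count, scale):
    """Multiply the unpacked ternary weights by the shared scale."""
    return [scale * w for w in unpack_ternary(packed, count)]


if __name__ == "__main__":
    weights = [-1, 0, 1, 1, 0, -1]
    packed = pack_ternary(weights)
    print(len(packed), unpack_ternary(packed, len(weights)))
```

Unpacking returns exactly the ternary values that were packed, which is the sense in which such a storage format is lossless with respect to the ternary model.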
✨BitNet b1.58 2B4T Technical Report
📝 Summary:
BitNet b1.58 2B4T is the first open-source 1-bit Large Language Model with 2 billion parameters. It matches full-precision LLM performance while offering significant improvements in computational efficiency like reduced memory and energy. The model weights are openly released for research.
🔹 Publication Date: Published on Apr 16, 2025
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2504.12285
• PDF: https://arxiv.org/pdf/2504.12285
• Github: https://github.com/microsoft/bitnet
🔹 Models citing this paper:
• https://huggingface.co/microsoft/bitnet-b1.58-2B-4T
• https://huggingface.co/microsoft/bitnet-b1.58-2B-4T-gguf
• https://huggingface.co/microsoft/bitnet-b1.58-2B-4T-bf16
🔹 Spaces citing this paper:
• https://huggingface.co/spaces/suayptalha/Chat-with-Bitnet-b1.58-2B-4T
• https://huggingface.co/spaces/aizip-dev/SLM-RAG-Arena
• https://huggingface.co/spaces/Tonic/Native_1-bit_LLM
#LLM #AI #Quantization #OpenSourceAI #DeepLearning
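The summary above does not say how weights end up at three levels. One scheme commonly described for b1.58-style models is absmean rounding: divide each weight by the mean absolute value of the tensor, round to the nearest integer, and clip to -1, 0, or +1. The sketch below assumes that scheme; the function name and the `eps` parameter are hypothetical, and none of this is taken from the released model code.

```python
def absmean_ternarize(weights, eps=1e-8):
    """Map float weights to {-1, 0, +1} using an absmean scale (assumed scheme)."""
    # The scale is the mean absolute weight; eps guards against division by zero.
    scale = sum(abs(w) for w in weights) / len(weights) + eps
    # Round each scaled weight to the nearest integer, then clip to [-1, 1].
    quantized = [max(-1, min(1, round(w / scale))) for w in weights]
    return quantized, scale


if __name__ == "__main__":
    q, s = absmean_ternarize([0.9, -0.05, -1.2, 0.3])
    print(q, s)
```

Small weights collapse to 0 and large ones saturate at -1 or +1, so each weight carries one of three values and the tensor keeps a single float scale for dequantization.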
✨Taming Preference Mode Collapse via Directional Decoupling Alignment in Diffusion Reinforcement Learning
📝 Summary:
This paper addresses Preference Mode Collapse (PMC) in text-to-image diffusion models, where models lose diversity despite high reward scores. It introduces D^2-Align, a framework that mitigates PMC by directionally correcting the reward signal during optimization. This novel approach maintains gen...
🔹 Publication Date: Published on Dec 30, 2025
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2512.24146
• PDF: https://arxiv.org/pdf/2512.24146
#DiffusionModels #ReinforcementLearning #GenerativeAI #MachineLearning #AIResearch
✨DreamID-V: Bridging the Image-to-Video Gap for High-Fidelity Face Swapping via Diffusion Transformer
📝 Summary:
DreamID-V is a novel video face swapping framework that uses diffusion transformers and curriculum learning. It achieves superior identity preservation and visual realism by bridging the image-to-video gap, outperforming existing methods and enhancing temporal consistency.
🔹 Publication Date: Published on Jan 4, 2026
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2601.01425
• PDF: https://arxiv.org/pdf/2601.01425
• Project Page: https://guoxu1233.github.io/DreamID-V/
• Github: https://guoxu1233.github.io/DreamID-V/
#FaceSwapping #DiffusionModels #ComputerVision #GenerativeAI #VideoAI
✨BitNet Distillation
📝 Summary:
BitNet Distillation fine-tunes LLMs to 1.58-bit precision using SubLN, attention distillation, and continual pre-training. It achieves comparable performance to full-precision models, offering 10x memory savings and 2.65x faster inference.
🔹 Publication Date: Published on Oct 15, 2025
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2510.13998
• PDF: https://arxiv.org/pdf/2510.13998
• Github: https://github.com/microsoft/BitNet
#LLM #Quantization #ModelCompression #DeepLearning #AI
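A quick back-of-the-envelope check shows where "1.58-bit" comes from and why it lines up with a roughly 10x memory saving. A weight that takes one of three values carries log2(3) bits of information. The 16-bit baseline below is an assumption; the summary does not state which full-precision format the 10x figure is measured against.

```python
import math

# Information content of a three-valued (ternary) weight.
bits_per_ternary_weight = math.log2(3)

# Assumed baseline: 16-bit weights (for example FP16 or BF16).
baseline_bits = 16
memory_ratio = baseline_bits / 1.58

print(f"bits per ternary weight: {bits_per_ternary_weight:.3f}")
print(f"16-bit vs 1.58-bit memory ratio: {memory_ratio:.1f}x")
```

This counts weight storage only. Activations, scales, and any layers kept in higher precision would reduce the end-to-end saving.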
✨NextFlow: Unified Sequential Modeling Activates Multimodal Understanding and Generation
📝 Summary:
NextFlow is a unified decoder-only transformer enabling fast multimodal understanding and generation. It uses next-token prediction for text and next-scale for images, generating 1024x1024 images in 5 seconds. It achieves state-of-the-art performance among unified models.
🔹 Publication Date: Published on Jan 5, 2026
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2601.02204
• PDF: https://arxiv.org/pdf/2601.02204
• Github: https://github.com/ByteVisionLab/NextFlow
#AI #DataScience #MachineLearning #HuggingFace #Research