✨UnityVideo: Unified Multi-Modal Multi-Task Learning for Enhancing World-Aware Video Generation
📝 Summary:
UnityVideo is a unified framework enhancing video generation by integrating multiple modalities and training paradigms. It uses dynamic noising and a modality switcher for comprehensive world understanding. This improves video quality, consistency, and zero-shot generalization to new data.
🔹 Publication Date: Published on Dec 8
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2512.07831
• PDF: https://arxiv.org/pdf/2512.07831
• Project Page: https://jackailab.github.io/Projects/UnityVideo/
• Github: https://github.com/dvlab-research/UnityVideo
==================================
For more data science resources:
✓ https://t.iss.one/DataScienceT
#VideoGeneration #MultimodalAI #GenerativeAI #DeepLearning #AIResearch
✨Decouple to Generalize: Context-First Self-Evolving Learning for Data-Scarce Vision-Language Reasoning
📝 Summary:
DoGe is a framework that addresses data scarcity in vision-language models. It decouples context learning from problem solving, using a curriculum to improve reward signals and data diversity. This enhances generalization and performance.
🔹 Publication Date: Published on Dec 7
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2512.06835
• PDF: https://arxiv.org/pdf/2512.06835
#VisionLanguage #DataScarcity #MachineLearning #AIResearch #DeepLearning
✨OmniSafeBench-MM: A Unified Benchmark and Toolbox for Multimodal Jailbreak Attack-Defense Evaluation
📝 Summary:
OmniSafeBench-MM is a unified toolbox for evaluating multi-modal jailbreak attacks and defenses in MLLMs. It integrates various attacks, defense strategies, and a diverse dataset to provide a comprehensive, standardized, and reproducible platform for research.
🔹 Publication Date: Published on Dec 6
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2512.06589
• PDF: https://arxiv.org/pdf/2512.06589
#MLLMs #AISafety #AIsecurity #Benchmark #DeepLearning
✨Predicting Time-Dependent Flow Over Complex Geometries Using Operator Networks
📝 Summary:
A Deep Operator Network (DeepONet) predicts unsteady flow velocity fields over complex geometries with up to a 1000x speedup over traditional simulations. It accurately captures near-term transients, but errors accumulate in fine-scale wakes.
🔹 Publication Date: Published on Dec 4
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2512.04434
• PDF: https://arxiv.org/pdf/2512.04434
• Github: https://github.com/baskargroup/TimeDependent-DeepONet
#DeepLearning #FluidDynamics #AI #CFD #MachineLearning
✨OneStory: Coherent Multi-Shot Video Generation with Adaptive Memory
📝 Summary:
OneStory generates coherent multi-shot videos by modeling global cross-shot context. It uses a Frame Selection module and an Adaptive Conditioner for next-shot generation, leveraging pretrained models and a new dataset. This achieves state-of-the-art narrative coherence for long-form video storyt...
🔹 Publication Date: Published on Dec 8
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2512.07802
• PDF: https://arxiv.org/pdf/2512.07802
• Project Page: https://zhaochongan.github.io/projects/OneStory/
#VideoGeneration #AI #DeepLearning #ComputerVision #GenerativeAI
✨SUCCESS-GS: Survey of Compactness and Compression for Efficient Static and Dynamic Gaussian Splatting
📝 Summary:
This survey reviews efficient 3D and 4D Gaussian Splatting. It categorizes parameter compression and restructuring compression methods, which reduce memory and computation while maintaining reconstruction quality. It also covers current limitations and future research directions.
🔹 Publication Date: Published on Dec 8
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2512.07197
• PDF: https://arxiv.org/pdf/2512.07197
• Project Page: https://cmlab-korea.github.io/Awesome-Efficient-GS/
• Github: https://cmlab-korea.github.io/Awesome-Efficient-GS/
#GaussianSplatting #3DVision #ComputerGraphics #DeepLearning #Efficiency
✨Boosting Unsupervised Video Instance Segmentation with Automatic Quality-Guided Self-Training
📝 Summary:
AutoQ-VIS is an unsupervised Video Instance Segmentation framework that bridges the synthetic-to-real domain gap. It uses quality-guided self-training with automatic quality assessment for progressive adaptation. This method achieves state-of-the-art results without requiring human annotations.
🔹 Publication Date: Published on Dec 7
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2512.06864
• PDF: https://arxiv.org/pdf/2512.06864
• Github: https://github.com/wcbup/AutoQ-VIS/
#VideoInstanceSegmentation #UnsupervisedLearning #ComputerVision #MachineLearning #DeepLearning
✨Efficiently Reconstructing Dynamic Scenes One D4RT at a Time
📝 Summary:
D4RT is a transformer-based model that efficiently reconstructs 4D scenes from videos. It uses a novel querying mechanism to infer depth and motion by flexibly probing 3D space-time points, outperforming previous methods.
🔹 Publication Date: Published on Dec 9
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2512.08924
• PDF: https://arxiv.org/pdf/2512.08924
#4DReconstruction #ComputerVision #Transformers #DynamicScenes #DeepLearning
✨VideoSSM: Autoregressive Long Video Generation with Hybrid State-Space Memory
📝 Summary:
VideoSSM proposes a hybrid state-space memory model for long video generation. It unifies autoregressive diffusion with global state-space memory and local context to achieve state-of-the-art temporal consistency and motion stability. This enables scalable, interactive minute-scale video synthesis.
🔹 Publication Date: Published on Dec 4
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2512.04519
• PDF: https://arxiv.org/pdf/2512.04519
#VideoGeneration #GenerativeAI #DiffusionModels #StateSpaceModels #DeepLearning
✨Smart Timing for Mining: A Deep Learning Framework for Bitcoin Hardware ROI Prediction
📝 Summary:
MineROI-Net is a Transformer model predicting Bitcoin ASIC hardware profitability within one year, addressing acquisition timing. It achieves 83.7% accuracy, outperforming baselines, and precisely identifies profitable or unprofitable periods to reduce financial risk.
🔹 Publication Date: Published on Dec 5
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2512.05402
• PDF: https://arxiv.org/pdf/2512.05402
#DeepLearning #Bitcoin #CryptoMining #FinancialModeling #AIResearch