✨Mitigating Intra- and Inter-modal Forgetting in Continual Learning of Unified Multimodal Models
📝 Summary:
Unified Multimodal Generative Models (UMGMs) suffer severe intra- and inter-modal forgetting in continual learning. Modality-Decoupled Experts (MoDE) is proposed to mitigate this by decoupling modality-specific updates and using knowledge distillation. MoDE effectively prevents both types of forgetting.
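💡 A minimal PyTorch sketch of the idea as summarized: a frozen shared layer, one lightweight expert per modality so updates for one modality cannot overwrite another's, plus a distillation loss against the frozen pre-update model. The class names and the adapter design are illustrative assumptions, not the authors' implementation (see the GitHub repo below for that).
```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ModalityExpert(nn.Module):
    """Lightweight per-modality adapter (illustrative stand-in for a MoDE expert)."""
    def __init__(self, dim: int, rank: int = 8):
        super().__init__()
        self.down = nn.Linear(dim, rank, bias=False)
        self.up = nn.Linear(rank, dim, bias=False)
        nn.init.zeros_(self.up.weight)  # start as an identity residual

    def forward(self, h: torch.Tensor) -> torch.Tensor:
        return h + self.up(self.down(h))

class MoDELayer(nn.Module):
    """Frozen shared weights plus decoupled experts, routed by modality tag."""
    def __init__(self, dim: int, modalities=("text", "image")):
        super().__init__()
        self.shared = nn.Linear(dim, dim)
        self.shared.requires_grad_(False)  # shared backbone stays frozen
        self.experts = nn.ModuleDict({m: ModalityExpert(dim) for m in modalities})

    def forward(self, h: torch.Tensor, modality: str) -> torch.Tensor:
        # Only the selected modality's expert is on the gradient path,
        # so a text task cannot overwrite the image expert, and vice versa.
        return self.experts[modality](self.shared(h))

def distillation_loss(student_logits, teacher_logits, T: float = 2.0):
    """KL distillation against the frozen pre-update model to curb forgetting."""
    return F.kl_div(
        F.log_softmax(student_logits / T, dim=-1),
        F.softmax(teacher_logits / T, dim=-1),
        reduction="batchmean",
    ) * T * T

# Toy usage: only the text expert receives gradients for a text task.
layer = MoDELayer(dim=32)
h = torch.randn(4, 32)
out = layer(h, modality="text")
teacher_out = out.detach()  # stand-in for the frozen model's output
loss = out.sum() + distillation_loss(out, teacher_out)
loss.backward()
```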
🔹 Publication Date: Published on Dec 2, 2025
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2512.03125
• PDF: https://arxiv.org/pdf/2512.03125
• Github: https://github.com/Christina200/MoDE-official
✨ Datasets citing this paper:
• https://huggingface.co/datasets/ChristinaW/MoDE-official
==================================
For more data science resources:
✓ https://t.iss.one/DataScienceT
#MultimodalAI #ContinualLearning #GenerativeAI #MachineLearning #AIResearch
✨End-to-End Test-Time Training for Long Context
📝 Summary:
This paper proposes End-to-End Test-Time Training (TTT-E2E) for long-context language modeling, treating it as continual learning. It uses a standard Transformer that learns at test time, with its initialization improved via meta-learning. TTT-E2E scales well and offers constant inference latency, being m...
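💡 A minimal sketch of the test-time-training loop as summarized: the model absorbs a long context into its weights via a few next-token-prediction gradient steps per chunk, which is what keeps per-token inference cost constant. The chunk size, optimizer, and the omitted meta-learned initialization are illustrative assumptions, not the paper's actual recipe.
```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def test_time_train(model: nn.Module, token_ids: torch.Tensor,
                    chunk: int = 512, lr: float = 1e-4, steps: int = 1):
    """Adapt `model` to one long sequence by next-token prediction per chunk.

    Compressing the history into weights (instead of attending over it)
    is what gives constant per-token inference cost.
    """
    opt = torch.optim.SGD(model.parameters(), lr=lr)
    for start in range(0, token_ids.size(1) - 1, chunk):
        seg = token_ids[:, start:start + chunk + 1]
        if seg.size(1) < 2:  # need at least one (input, target) pair
            break
        inputs, targets = seg[:, :-1], seg[:, 1:]
        for _ in range(steps):  # a few inner gradient steps per chunk
            logits = model(inputs)
            loss = F.cross_entropy(logits.reshape(-1, logits.size(-1)),
                                   targets.reshape(-1))
            opt.zero_grad()
            loss.backward()
            opt.step()
    return model

# Toy usage with a tiny stand-in LM (embedding -> linear head).
vocab = 100
toy_lm = nn.Sequential(nn.Embedding(vocab, 32), nn.Linear(32, vocab))
long_ctx = torch.randint(0, vocab, (1, 4096))
test_time_train(toy_lm, long_ctx)
```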
🔹 Publication Date: Published on Dec 29, 2025
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2512.23675
• PDF: https://arxiv.org/pdf/2512.23675
• Github: https://github.com/test-time-training/e2e
==================================
For more data science resources:
✓ https://t.iss.one/DataScienceT
#TestTimeTraining #LongContext #LanguageModels #Transformers #ContinualLearning