✨Aletheia tackles FirstProof autonomously
📝 Summary:
We report the performance of Aletheia (Feng et al., 2026b), a mathematics research agent powered by Gemini 3 Deep Think, on the inaugural FirstProof challenge. Within the allowed timeframe of the chal...
🔹 Publication Date: Published on Feb 24
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2602.21201
• PDF: https://arxiv.org/pdf/2602.21201
• Project Page: https://github.com/google-deepmind/superhuman/tree/main/aletheia
==================================
For more data science resources:
✓ https://t.iss.one/DataScienceT
#AI #DataScience #MachineLearning #HuggingFace #Research
✨The Diffusion Duality, Chapter II: Ψ-Samplers and Efficient Curriculum
📝 Summary:
Discrete diffusion models with predictor-corrector samplers surpass traditional methods in generation quality and efficiency, challenging assumptions about masked diffusion's necessity in language mod...
🔹 Publication Date: Published on Feb 24
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2602.21185
• PDF: https://arxiv.org/pdf/2602.21185
• Project Page: https://s-sahoo.com/duo-ch2/
==================================
For more data science resources:
✓ https://t.iss.one/DataScienceT
#AI #DataScience #MachineLearning #HuggingFace #Research
✨Test-Time Training with KV Binding Is Secretly Linear Attention
📝 Summary:
This paper reinterprets Test-Time Training (TTT) with KV binding. Instead of memorization, it shows TTT is a form of learned linear attention with enhanced representational capacity. This new perspective explains puzzling behaviors, simplifies architectures, and boosts efficiency.
🔹 Publication Date: Published on Feb 24
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2602.21204
• PDF: https://arxiv.org/pdf/2602.21204
• Project Page: https://research.nvidia.com/labs/sil/projects/tttla/
==================================
For more data science resources:
✓ https://t.iss.one/DataScienceT
#AI #DataScience #MachineLearning #HuggingFace #Research
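The TTT-as-linear-attention claim above can be made concrete with a toy recurrence: linear attention maintains a running state of key-value outer products, which is exactly the kind of "fast weight" a TTT layer updates at inference time. The sketch below is illustrative only; shapes and names are not taken from the paper's code.

```python
import numpy as np

def linear_attention(queries, keys, values):
    # state accumulates outer products v k^T, analogous to the
    # fast weights a TTT layer updates with gradient-like steps
    d_k, d_v = keys.shape[1], values.shape[1]
    state = np.zeros((d_v, d_k))
    outputs = []
    for q, k, v in zip(queries, keys, values):
        state += np.outer(v, k)    # bind key to value (one update step)
        outputs.append(state @ q)  # read out with the current query
    return np.stack(outputs)

# tiny smoke test: with orthonormal keys, each token retrieves its own value
q = k = np.eye(2)
v = np.array([[1.0, 0.0], [0.0, 2.0]])
out = linear_attention(q, k, v)
```

Because the state is a fixed-size matrix rather than a growing KV cache, this recurrence runs in constant memory per step, which is the efficiency angle the summary mentions.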
✨Generative AI and Machine Learning Collaboration for Container Dwell Time Prediction via Data Standardization
📝 Summary:
A collaborative framework integrating generative artificial intelligence with machine learning improves container dwell time prediction by standardizing unstructured text data, leading to reduced reha...
🔹 Publication Date: Published on Feb 24
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2602.20540
• PDF: https://arxiv.org/pdf/2602.20540
==================================
For more data science resources:
✓ https://t.iss.one/DataScienceT
#AI #DataScience #MachineLearning #HuggingFace #Research
✨DREAM: Deep Research Evaluation with Agentic Metrics
📝 Summary:
Deep Research Agents generate analyst-grade reports, yet evaluating them remains challenging due to the absence of a single ground truth and the multidimensional nature of research quality. Recent ben...
🔹 Publication Date: Published on Feb 21
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2602.18940
• PDF: https://arxiv.org/pdf/2602.18940
==================================
For more data science resources:
✓ https://t.iss.one/DataScienceT
#AI #DataScience #MachineLearning #HuggingFace #Research
✨Untied Ulysses: Memory-Efficient Context Parallelism via Headwise Chunking
📝 Summary:
UPipe enables efficient processing of long sequences in Transformer models through fine-grained chunking at the attention head level, significantly reducing activation memory usage while maintaining t...
🔹 Publication Date: Published on Feb 24
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2602.21196
• PDF: https://arxiv.org/pdf/2602.21196
• Project Page: https://rghadia.github.io/untied_ulysses_proj/
• Github: https://github.com/togethercomputer/Untied-Ulysses
==================================
For more data science resources:
✓ https://t.iss.one/DataScienceT
#AI #DataScience #MachineLearning #HuggingFace #Research
✨OCR-Agent: Agentic OCR with Capability and Memory Reflection
📝 Summary:
A novel iterative self-correction framework enhances vision-language models' reasoning robustness through capability reflection and memory reflection mechanisms, achieving superior performance on visu...
🔹 Publication Date: Published on Feb 24
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2602.21053
• PDF: https://arxiv.org/pdf/2602.21053
• Github: https://github.com/AIGeeksGroup/OCR-Agent
==================================
For more data science resources:
✓ https://t.iss.one/DataScienceT
#AI #DataScience #MachineLearning #HuggingFace #Research
✨OmniOCR: Generalist OCR for Ethnic Minority Languages
📝 Summary:
OmniOCR presents a universal framework for ethnic minority scripts using Dynamic LoRA and sparsity regularization to achieve state-of-the-art accuracy with improved parameter efficiency in low-resourc...
🔹 Publication Date: Published on Feb 24
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2602.21042
• PDF: https://arxiv.org/pdf/2602.21042
• Github: https://github.com/AIGeeksGroup/OmniOCR
==================================
For more data science resources:
✓ https://t.iss.one/DataScienceT
#AI #DataScience #MachineLearning #HuggingFace #Research
✨LaS-Comp: Zero-shot 3D Completion with Latent-Spatial Consistency
📝 Summary:
LaS-Comp is a zero-shot 3D shape completion method that leverages 3D foundation models. It uses a two-stage approach for faithful reconstruction and seamless boundary refinement. This training-free framework outperforms prior state-of-the-art methods.
🔹 Publication Date: Published on Feb 21
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2602.18735
• PDF: https://arxiv.org/pdf/2602.18735
• Github: https://github.com/DavidYan2001/LaS-Comp
==================================
For more data science resources:
✓ https://t.iss.one/DataScienceT
#3DCompletion #ZeroShotLearning #FoundationModels #ComputerVision #AI
✨One-step Language Modeling via Continuous Denoising
📝 Summary:
This paper introduces flow-based language models that use continuous denoising over one-hot token encodings. They surpass discrete diffusion models in quality and speed, particularly for few-step generation, challenging discrete diffusion's necessity for discrete data.
🔹 Publication Date: Published on Feb 18
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2602.16813
• PDF: https://arxiv.org/pdf/2602.16813
• Project Page: https://one-step-lm.github.io/
• Github: https://github.com/david3684/flm
==================================
For more data science resources:
✓ https://t.iss.one/DataScienceT
#LanguageModels #GenerativeAI #DeepLearning #NLP #AI
✨TextPecker: Rewarding Structural Anomaly Quantification for Enhancing Visual Text Rendering
📝 Summary:
TextPecker proposes a reinforcement learning strategy to improve visual text rendering by perceiving and mitigating structural anomalies in text-to-image generation. It uses a new annotated dataset and synthesis engine to significantly enhance structural fidelity and semantic alignment, setting a...
🔹 Publication Date: Published on Feb 24
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2602.20903
• PDF: https://arxiv.org/pdf/2602.20903
• Project Page: https://github.com/CIawevy/TextPecker
• Github: https://github.com/CIawevy/TextPecker
🔹 Models citing this paper:
• https://huggingface.co/CIawevy/TextPecker-8B-InternVL3
• https://huggingface.co/CIawevy/TextPecker-8B-Qwen3VL
• https://huggingface.co/CIawevy/QwenImage-TextPecker-SQPA
✨ Datasets citing this paper:
• https://huggingface.co/datasets/CIawevy/TextPecker-1.5M
==================================
For more data science resources:
✓ https://t.iss.one/DataScienceT
#AI #DataScience #MachineLearning #HuggingFace #Research
✨Communication-Inspired Tokenization for Structured Image Representations
📝 Summary:
COMiT introduces a framework for learning structured, object-centric visual tokens through iterative encoding and flow-matching decoding. This single-transformer approach improves compositional generalization and relational reasoning by creating interpretable token structures.
🔹 Publication Date: Published on Feb 24
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2602.20731
• PDF: https://arxiv.org/pdf/2602.20731
• Project Page: https://araachie.github.io/comit/
• Github: https://github.com/araachie/comit
🔹 Models citing this paper:
• https://huggingface.co/cvg-unibe/comit-xl
• https://huggingface.co/cvg-unibe/comit-l
• https://huggingface.co/cvg-unibe/comit-b
==================================
For more data science resources:
✓ https://t.iss.one/DataScienceT
#ComputerVision #Transformers #ImageRecognition #RepresentationLearning #AIResearch
✨Adaptive Text Anonymization: Learning Privacy-Utility Trade-offs via Prompt Optimization
📝 Summary:
This paper introduces adaptive text anonymization, a framework that uses prompt optimization to automatically adjust anonymization strategies for language models. It adapts to varying privacy-utility requirements across diverse domains, achieving a better trade-off than baselines. It is efficient...
🔹 Publication Date: Published on Feb 24
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2602.20743
• PDF: https://arxiv.org/pdf/2602.20743
==================================
For more data science resources:
✓ https://t.iss.one/DataScienceT
#TextAnonymization #Privacy #PromptOptimization #LLM #NLP
✨Query-focused and Memory-aware Reranker for Long Context Processing
📝 Summary:
This reranking framework uses attention scores from selected LLM heads to estimate passage-query relevance. It is lightweight and outperforms state-of-the-art rerankers across various domains, including long narrative datasets and the LoCoMo benchmark.
🔹 Publication Date: Published on Feb 12
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2602.12192
• PDF: https://arxiv.org/pdf/2602.12192
• Project Page: https://qdcassie-li.github.io/QRRanker/
🔹 Models citing this paper:
• https://huggingface.co/MindscapeRAG/QRRanker
==================================
For more data science resources:
✓ https://t.iss.one/DataScienceT
#Reranking #LLM #NLP #InformationRetrieval #LongContext
✨QuantVLA: Scale-Calibrated Post-Training Quantization for Vision-Language-Action Models
📝 Summary:
QuantVLA is a training-free post-training quantization framework for vision-language-action models. Through scale-calibrated components, it significantly reduces memory and speeds up inference while maintaining performance, enabling efficient deployment for embodied AI.
🔹 Publication Date: Published on Feb 23
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2602.20309
• PDF: https://arxiv.org/pdf/2602.20309
• Project Page: https://quantvla.github.io/
==================================
For more data science resources:
✓ https://t.iss.one/DataScienceT
#Quantization #VLAModels #EmbodiedAI #AIResearch #DeepLearning
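For intuition on the post-training quantization setting QuantVLA operates in, here is a generic per-channel symmetric int8 scheme with scales calibrated from weight magnitudes. This is a standard PTQ baseline, not QuantVLA's scale-calibration procedure itself; all names are illustrative.

```python
import numpy as np

def quantize_per_channel(weights, n_bits=8):
    # Symmetric per-output-channel quantization: one scale per row,
    # calibrated from that row's max magnitude (generic PTQ sketch).
    qmax = 2 ** (n_bits - 1) - 1
    scales = np.abs(weights).max(axis=1, keepdims=True) / qmax
    scales = np.where(scales == 0, 1.0, scales)  # guard all-zero rows
    q = np.clip(np.round(weights / scales), -qmax - 1, qmax).astype(np.int8)
    return q, scales

def dequantize(q, scales):
    # reconstruct approximate float weights from int8 codes
    return q.astype(np.float32) * scales
```

Per-channel scales keep the rounding error proportional to each row's own range, which is why this style of calibration preserves accuracy far better than a single tensor-wide scale.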
✨Multi-Vector Index Compression in Any Modality
📝 Summary:
This paper introduces attention-guided clustering (AGC) for compressing multi-vector document representations across various modalities. AGC consistently outperforms other compression methods in text, visual-document, and video retrieval, often matching or improving upon uncompressed indexes.
🔹 Publication Date: Published on Feb 24
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2602.21202
• PDF: https://arxiv.org/pdf/2602.21202
==================================
For more data science resources:
✓ https://t.iss.one/DataScienceT
#IndexCompression #MultiModal #InformationRetrieval #MachineLearning #VectorDatabases
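The compression idea above can be sketched as follows: cluster a document's token vectors and keep only the centroids, then score queries against centroids with MaxSim-style late interaction. Plain k-means stands in here for the paper's attention-guided variant; everything below is an assumption-laden illustration, not the paper's code.

```python
import numpy as np

def compress_multivector(doc_vectors, n_clusters=4, n_iters=10, seed=0):
    # Plain k-means over a document's token vectors, standing in for
    # attention-guided clustering; centroids replace the full vector
    # set, shrinking the index by roughly len(doc) / n_clusters.
    rng = np.random.default_rng(seed)
    centroids = doc_vectors[rng.choice(len(doc_vectors), n_clusters, replace=False)]
    for _ in range(n_iters):
        dists = ((doc_vectors[:, None] - centroids[None]) ** 2).sum(-1)
        assign = dists.argmin(axis=1)
        for c in range(n_clusters):
            members = doc_vectors[assign == c]
            if len(members):
                centroids[c] = members.mean(axis=0)
    return centroids

def maxsim_score(query_vectors, doc_centroids):
    # ColBERT-style late interaction: each query vector takes its
    # best-matching centroid; scores sum over query vectors.
    sims = query_vectors @ doc_centroids.T
    return sims.max(axis=1).sum()
```

Because late interaction only needs each query vector's best match, replacing redundant token vectors with a few centroids often changes scores very little, which is consistent with the near-lossless results the summary reports.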
✨PETS: A Principled Framework Towards Optimal Trajectory Allocation for Efficient Test-Time Self-Consistency
📝 Summary:
PETS is a principled framework for efficient test-time self-consistency that optimizes trajectory allocation. It defines a new self-consistency rate, reducing sampling requirements while maintaining accuracy. PETS significantly cuts sampling budgets by up to 75 percent offline and 55 percent onli...
🔹 Publication Date: Published on Feb 18
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2602.16745
• PDF: https://arxiv.org/pdf/2602.16745
• Github: https://github.com/ZDCSlab/PETS
==================================
For more data science resources:
✓ https://t.iss.one/DataScienceT
#SelfConsistency #MachineLearning #Optimization #AI #Efficiency
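As background for the budget savings PETS reports, test-time self-consistency draws several reasoning trajectories and majority-votes their answers; an adaptive variant stops as soon as the leader's margin cannot be overturned by the remaining budget. The sketch below shows that generic early-stopping idea, not the PETS allocation rule itself; names are illustrative.

```python
from collections import Counter

def self_consistency(sample_fn, max_samples=16):
    # Adaptive majority vote: stop once the leading answer cannot be
    # overtaken by the remaining budget (generic sketch, not PETS).
    counts = Counter()
    for drawn in range(1, max_samples + 1):
        counts[sample_fn()] += 1
        leader, lead_votes = counts.most_common(1)[0]
        remaining = max_samples - drawn
        runner_up = max((c for a, c in counts.items() if a != leader), default=0)
        if runner_up + remaining < lead_votes:
            return leader, drawn  # early exit saves the rest of the budget
    return counts.most_common(1)[0][0], max_samples
```

With a highly consistent model the vote is decided after a handful of draws, which is the mechanism behind cutting sampling budgets while keeping accuracy.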
✨DeepSeek-V3 Technical Report
📝 Summary:
DeepSeek-V3 is an efficient Mixture-of-Experts language model with 671B total parameters (37B activated per token), using MLA and DeepSeekMoE architectures. It achieves strong performance, comparable to leading models, with highly stable and cost-effective training on 14.8T tokens.
🔹 Publication Date: Published on Dec 27, 2024
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2412.19437
• PDF: https://arxiv.org/pdf/2412.19437
• Github: https://github.com/deepseek-ai/deepseek-v3
🔹 Models citing this paper:
• https://huggingface.co/deepseek-ai/DeepSeek-V3
• https://huggingface.co/deepseek-ai/DeepSeek-V3-0324
• https://huggingface.co/deepseek-ai/DeepSeek-V3-Base
✨ Spaces citing this paper:
• https://huggingface.co/spaces/nanotron/ultrascale-playbook
• https://huggingface.co/spaces/Ki-Seki/ultrascale-playbook-zh-cn
• https://huggingface.co/spaces/weege007/ultrascale-playbook
==================================
For more data science resources:
✓ https://t.iss.one/DataScienceT
#DeepSeekV3 #MoE #LLM #AI #MachineLearning
✨See and Fix the Flaws: Enabling VLMs and Diffusion Models to Comprehend Visual Artifacts via Agentic Data Synthesis
📝 Summary:
ArtiAgent automates creating artifact-annotated image datasets. It uses three agents to perceive entities, inject artifacts into real images via diffusion transformers, and curate the results. This enables training models to detect and fix visual flaws in AI-generated content.
🔹 Publication Date: Published on Feb 24
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2602.20951
• PDF: https://arxiv.org/pdf/2602.20951
• Github: https://github.com/krafton-ai/ArtiAgent
✨ Datasets citing this paper:
• https://huggingface.co/datasets/KRAFTON/ArtiBench
==================================
For more data science resources:
✓ https://t.iss.one/DataScienceT
#AI #DataScience #MachineLearning #HuggingFace #Research