飞讯 (Feixun) Crypto News Flash

🚀 DeepSeek releases DeepSeek-Prover-V2-671B, a 671-billion-parameter model

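According to BlockBeats, on April 30 DeepSeek released the DeepSeek-Prover-V2-671B model on Hugging Face. The model is distributed in the safetensors file format, supports multiple numeric precisions, and has 671 billion parameters.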

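The model is built on the DeepSeek-V3 architecture and uses a Mixture-of-Experts (MoE) design, with 61 Transformer layers and a hidden size of 7,168. It supports very long contexts, with a maximum position embedding of roughly 163,800, and uses FP8 quantization to improve inference efficiency.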
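To put the reported figures in perspective, here is a rough back-of-the-envelope sketch in Python. It is not taken from the report: the bytes-per-parameter table is a simplifying assumption (real checkpoints mix precisions and carry extra scaling data), and the position limit is assumed to be exactly 163,840 (160 × 1024), which the report rounds to about 163,800.

```python
# Figures quoted in the report.
TOTAL_PARAMS = 671_000_000_000

# Assumption: the report's rounded "163,800" is taken to be 160 * 1024.
MAX_POSITION_EMBEDDINGS = 163_840

# Assumption: every parameter is stored at a single uniform width.
BYTES_PER_PARAM = {"FP8": 1, "BF16": 2, "FP32": 4}


def weight_size_gb(num_params: int, bytes_per_param: int) -> float:
    """Approximate size of the raw weights in decimal gigabytes."""
    return num_params * bytes_per_param / 1e9


for name, width in BYTES_PER_PARAM.items():
    print(f"{name}: ~{weight_size_gb(TOTAL_PARAMS, width):,.0f} GB of weights")

print(f"Context limit: {MAX_POSITION_EMBEDDINGS // 1024}K positions")
```

Under these assumptions, storing the weights in FP8 takes about half the space of BF16. Because the model is MoE, only a subset of experts is active per token, so these numbers describe storage rather than per-token compute.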

#DeepSeek #DeepSeekProver #HuggingFace #AIModels #MachineLearning #Transformer #MoE #LongContext #FP8Quantization #InferenceEfficiency
