DeepSeek / Deepseek V3 (FP8)
Released: 12/26/2024

Deepseek V3 is an LLM. It excels in handling very large context windows and advanced reasoning tasks due to its Mixture-of-Experts (MoE) architecture, which activates 37 billion parameters per token from a total of 671 billion. This design allows for more efficient scaling, high training efficiency, and significant reductions in computational and memory cost. Its FP8 quantization further optimizes inference speed and resource usage.
Some other noteworthy features of Deepseek V3 include support for multi-token prediction, which accelerates inference, and an architecture designed to efficiently leverage modern GPU hardware for large-scale workloads.
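The MoE figures above imply that only a small fraction of the model's weights participate in each forward pass. A minimal back-of-envelope sketch, using only the parameter counts stated on this page:

```python
# Fraction of DeepSeek V3's parameters that are active per token
# under its Mixture-of-Experts routing (numbers from the model card).
TOTAL_PARAMS = 671e9   # total parameters
ACTIVE_PARAMS = 37e9   # parameters activated per token

active_fraction = ACTIVE_PARAMS / TOTAL_PARAMS
print(f"Active per token: {active_fraction:.1%}")  # → Active per token: 5.5%
```

Roughly 5.5% of the weights are used per token, which is the source of the scaling and compute-cost advantages described above.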
| Metric | Value |
|---|---|
| Parameter Count | 671 billion |
| Mixture of Experts | Yes |
| Active Parameter Count | 37 billion |
| Context Length | Unknown |
| Multilingual | Unknown |
| Quantized* | Yes |
| Precision* | FP8 |
*Quantization is specific to the inference provider and the model may be offered with different quantization levels by other providers.
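To see why the FP8 precision matters, here is a rough weight-memory estimate at different precisions. This is a hedged sketch that counts weights only, ignoring activations, KV cache, and runtime overhead:

```python
# Approximate weight memory for a 671B-parameter model at
# common precisions (weights only; excludes activations and KV cache).
PARAMS = 671e9
BYTES_PER_PARAM = {"FP8": 1, "FP16/BF16": 2, "FP32": 4}

for name, nbytes in BYTES_PER_PARAM.items():
    gb = PARAMS * nbytes / 1e9
    print(f"{name}: ~{gb:,.0f} GB of weights")
```

At FP8 the weights alone occupy roughly 671 GB, half the footprint of FP16/BF16, which is why quantization level has a direct effect on serving cost.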
DeepSeek models available on Oxen.ai
| Model | Inference provider | Input modality | Output modality | Input price (1M tokens) | Output price (1M tokens) |
|---|---|---|---|---|---|
| Unknown | Unknown | text | text | $3.00 | $8.00 |
| Unknown | Unknown | text | text | $0.55 | $2.19 |
| Unknown | Unknown | text | text | $7.00 | $7.00 |
| Unknown | Unknown | text | text | $0.59 | $0.79 |
| Unknown | Unknown | text | text | $0.27 | $1.10 |
| Unknown | Unknown | text | text | $0.75 | $3.00 |
| Unknown | Unknown | text | text | $1.25 | $1.25 |
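Prices in the table are quoted per million tokens, billed separately for input and output. A small sketch of the arithmetic, using one of the listed price pairs ($0.27 input / $1.10 output) and hypothetical request sizes:

```python
def request_cost(input_tokens: int, output_tokens: int,
                 input_price: float, output_price: float) -> float:
    """Cost in dollars, given per-1M-token prices as in the table above."""
    return (input_tokens * input_price + output_tokens * output_price) / 1_000_000

# Hypothetical request: 10k input tokens, 2k output tokens.
cost = request_cost(10_000, 2_000, input_price=0.27, output_price=1.10)
print(f"${cost:.4f}")  # → $0.0049
```

Output tokens typically carry the higher rate, so long generations dominate the bill even when the prompt is large.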