Deepseek V3 (FP8)
Inference provider: DeepSeek
Released: 12/26/2024
Modality: text input / text output
Price (1M tokens): Input $1.25 / Output $1.25

Deepseek V3 is a large language model (LLM). It excels at very large context windows and advanced reasoning tasks thanks to its Mixture-of-Experts (MoE) architecture, which activates 37 billion parameters per token out of a total of 671 billion. This design allows more efficient scaling, high training efficiency, and significant reductions in compute and memory cost. Its FP8 quantization further improves inference speed and resource usage.
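The sparse activation described above can be illustrated with a toy top-k gating layer. This is a minimal sketch, not Deepseek V3's actual routing code: the tiny dense matrices standing in for experts, the gate weights, and the `top_k=2` choice are all illustrative assumptions.

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def moe_layer(token, experts, gate_w, top_k=2):
    """Route a token through only the top-k experts (sparse activation).

    Deepseek V3 activates ~37B of 671B parameters per token this way;
    the tiny dense 'experts' here are purely for illustration.
    """
    scores = softmax(gate_w @ token)           # gating scores over all experts
    top = np.argsort(scores)[-top_k:]          # indices of the k best experts
    weights = scores[top] / scores[top].sum()  # renormalize over the selected experts
    # Weighted sum of the selected experts' outputs; the rest stay idle.
    return sum(w * (experts[i] @ token) for w, i in zip(weights, top))

rng = np.random.default_rng(0)
d, n_experts = 8, 4
experts = [rng.standard_normal((d, d)) for _ in range(n_experts)]
gate_w = rng.standard_normal((n_experts, d))
out = moe_layer(rng.standard_normal(d), experts, gate_w)
print(out.shape)  # (8,)
```

Only `top_k` expert matrices participate in the forward pass for each token, which is why the per-token compute tracks the 37B active parameters rather than the full 671B.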

Some other noteworthy features of Deepseek V3 include support for multi-token prediction, which accelerates inference, and an architecture designed to efficiently leverage modern GPU hardware for large-scale workloads.
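How multi-token prediction speeds up decoding can be sketched with a toy generation loop that emits k tokens per "forward pass". This is a conceptual sketch only: `model_step` is a hypothetical callable, and the draft-verification step a real MTP decoder performs is deliberately omitted.

```python
def generate_with_mtp(model_step, prompt, n_tokens, k=2):
    """Toy multi-token decoding loop: each call proposes k tokens at once.

    `model_step(seq, k)` is a hypothetical stand-in for a model with k
    prediction heads; real MTP inference also verifies the drafted tokens
    against the main head, which this sketch skips.
    """
    seq = list(prompt)
    while len(seq) - len(prompt) < n_tokens:
        seq.extend(model_step(seq, k))  # one "forward pass" yields k tokens
    return seq[len(prompt):len(prompt) + n_tokens]

# Dummy "model" that emits an incrementing counter, two tokens per step:
out = generate_with_mtp(lambda s, k: [len(s) + i for i in range(k)], [0], 4)
print(out)  # [1, 2, 3, 4]
```

With `k=2`, generating 4 tokens takes 2 model calls instead of 4, which is the source of the inference speedup.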

Metric                 | Value
Parameter Count        | 671 billion
Mixture of Experts     | Yes
Active Parameter Count | 37 billion
Context Length         | Unknown
Multilingual           | Unknown
Quantized*             | Yes
Precision*             | FP8

*Quantization is specific to the inference provider and the model may be offered with different quantization levels by other providers.
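A quick way to turn the listed per-1M-token rates into a per-request cost is shown below. The helper is hypothetical (not an Oxen.ai API); the default rates are the $1.25 input / $1.25 output prices listed above, and the example token counts are made up.

```python
def request_cost(input_tokens, output_tokens,
                 input_rate=1.25, output_rate=1.25):
    """Cost in USD for one request, given $/1M-token rates.

    Defaults use the Deepseek V3 (FP8) prices listed on this page;
    pass other rates for other providers.
    """
    return (input_tokens * input_rate + output_tokens * output_rate) / 1_000_000

# e.g. a 100k-token prompt with a 2k-token completion:
print(f"${request_cost(100_000, 2_000):.4f}")  # $0.1275
```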

DeepSeek models available on Oxen.ai
Inference provider | Modality (Input/Output) | Price per 1M tokens (Input/Output)
Fireworks AI       | text / text             | $3.00 / $8.00
DeepSeek           | text / text             | $0.55 / $2.19
Together.ai        | text / text             | $7.00 / $7.00
Groq               | text / text             | $0.59 / $0.79
DeepSeek           | text / text             | $0.27 / $1.10
Fireworks AI       | text / text             | $0.75 / $3.00
Together.ai        | text / text             | $1.25 / $1.25
See all models available on Oxen.ai