Alibaba / Wan-AI/Wan2.1-T2V-1.3B-Diffusers
text to video
Wan-AI/Wan2.1-T2V-1.3B-Diffusers is a text-to-video diffusion model. It generates 480P videos from text prompts efficiently on consumer-grade GPUs, requiring only 8.19 GB of VRAM while maintaining competitive video quality.
Some other noteworthy features of Wan-AI/Wan2.1-T2V-1.3B-Diffusers include multilingual support (English and Chinese), image-to-video conversion, aspect ratio control, visual text rendering inside videos, prompt enhancement, and the ability to add sound effects or background music to generated videos.
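As a rough illustration of the text-to-video use described above, the following is a minimal sketch of loading the model with the Hugging Face `diffusers` library. It assumes `torch` and a recent `diffusers` release that ships `WanPipeline` and `AutoencoderKLWan`, plus a CUDA GPU. The helper names `has_enough_vram` and `generate_video`, the resolution, frame count, guidance scale, and frame rate are illustrative choices, not values stated on this page, and the VRAM check is only a coarse heuristic built on the 8.19 GB figure quoted above.

```python
MODEL_ID = "Wan-AI/Wan2.1-T2V-1.3B-Diffusers"
REQUIRED_VRAM_GB = 8.19  # figure quoted in the model description


def has_enough_vram(available_gb: float, required_gb: float = REQUIRED_VRAM_GB) -> bool:
    """Hypothetical helper: coarse check against the quoted VRAM requirement."""
    return available_gb >= required_gb


def generate_video(prompt: str, output_path: str = "output.mp4") -> str:
    """Hypothetical helper: generate a 480P clip from a text prompt.

    Calling this downloads the model weights and requires a CUDA GPU.
    """
    # Heavy imports live inside the function so importing this module stays cheap.
    import torch
    from diffusers import AutoencoderKLWan, WanPipeline
    from diffusers.utils import export_to_video

    # Total device memory in GB; a heuristic only, since actual peak usage
    # depends on settings such as resolution and frame count.
    total_gb = torch.cuda.get_device_properties(0).total_memory / 1e9
    if not has_enough_vram(total_gb):
        raise RuntimeError(
            f"GPU reports {total_gb:.2f} GB, below the quoted {REQUIRED_VRAM_GB} GB"
        )

    # The VAE is kept in float32 and the rest of the pipeline in bfloat16.
    vae = AutoencoderKLWan.from_pretrained(
        MODEL_ID, subfolder="vae", torch_dtype=torch.float32
    )
    pipe = WanPipeline.from_pretrained(MODEL_ID, vae=vae, torch_dtype=torch.bfloat16)
    pipe.to("cuda")

    # Assumed settings: 832x480 (480P), 81 frames, guidance scale 5.0.
    frames = pipe(
        prompt=prompt,
        height=480,
        width=832,
        num_frames=81,
        guidance_scale=5.0,
    ).frames[0]

    export_to_video(frames, output_path, fps=15)
    return output_path
```

With the dependencies installed, a call such as `generate_video("A cat walks on the grass, realistic")` would write the clip to `output.mp4`. Keeping the imports inside the function means the constants and the VRAM helper can be used on machines without a GPU.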
| Metric | Value |
|---|---|
| Parameter Count | 1.3 billion |
| Mixture of Experts | No |
| Context Length | Unknown |
| Multilingual | Yes |
| Quantized* | No |
*Quantization is specific to the inference provider and the model may be offered with different quantization levels by other providers.
Alibaba models available on Oxen.ai
| Model | Input modality | Output modality | Input price (per 1M tokens) | Output price (per 1M tokens) |
|---|---|---|---|---|
| | text | video | N/A | N/A |
| | text | video | N/A | N/A |
| | text | video | N/A | N/A |
| | text | video | N/A | N/A |