Model Inference

Choose the right model, get to the perfect prompt.

83 models. New models added every week.

Anthropic AI, Black Forest Labs, DeepSeek, Google, Meta, Mistral AI, Moonshot AI, Nous Research, OpenAI, Perplexity, Qwen
Qwen qwen-image
Aug 2025
Excels at high-fidelity text rendering in English and Chinese, advanced image editing, diverse style generation, and intelligent image understanding tasks.
$0.03
Google nano-banana
Aug 2025
Generates and edits images with high detail, keeping character and background consistency, supporting style transfer, upscaling, and complex multi-step instructions.
$0.04
Qwen Qwen Image Edit
Aug 2025
Advanced image editing model enabling precise bilingual text modification, style transfer, and nuanced visual changes—all via intuitive natural language instructions.
$0.03
Black Forest Labs Flux Kontext Dev
Aug 2025
Delivers precise, iterative image editing and generation with consistent character, style, and text changes—using multimodal input for seamless scene transformations.
$0.03
Black Forest Labs FLUX.1-dev
Aug 2025
Generates highly detailed, realistic images from text or images with advanced editing, style transfer, and open-access tools for inpainting and structure guidance.
$0.03
OpenAI OpenAI GPT OSS 120B
Aug 2025
Built with a Mixture-of-Experts design, delivers efficient, transparent reasoning, tool use, and agentic capabilities, even with 128K token context windows.
Input: $0.15 / Output: $0.60
OpenAI OpenAI GPT OSS 20B
Aug 2025
Delivers strong reasoning and chain-of-thought, agentic features, and multilingual support, optimized for local deployment and efficient use on modest hardware.
Input: $0.07 / Output: $0.30
Qwen Qwen 3 Coder 480B
Jul 2025
Designed for complex coding and agentic workflows, it excels at understanding large codebases and supports 262K-token context for advanced software engineering.
Input: $2.00 / Output: $2.00
Qwen Qwen3 Coder 480B (A35B) Instruct
Jul 2025
Enables advanced agentic coding, large-scale code analysis, and tool integration with up to 256K native context for repository-level tasks and automation.
Input: $0.45 / Output: $1.80
Moonshot AI Kimi K2 Instruct
Jul 2025
Handles complex reasoning, coding, and tool use with 128K token context, enabling autonomous problem-solving and workflow automation.
Input: $0.60 / Output: $2.50
Anthropic AI Claude Opus 4
May 2025
Excels at deep reasoning, complex coding, and autonomous agent workflows with sustained performance, extended thinking, tool use, and memory across tasks.
Input: $15.00 / Output: $75.00
Anthropic AI Claude Sonnet 4
May 2025
Balances intelligence with efficiency for coding, research, and automation tasks; excels in reasoning, content generation, and nuanced instruction following.
Input: $3.00 / Output: $15.00
Google Gemini 2.5 Pro Preview
May 2025
Excels at building interactive web apps, advanced code editing and agentic workflows, with native multimodality and strong video-to-code capabilities.
Input: $1.25 / Output: $10.00
Qwen Qwen 3 235B-A22B
Apr 2025
MoE model with 22B active parameters featuring dual thinking modes for complex reasoning and efficient conversation across 100+ languages.
Input: $0.22 / Output: $0.88
Qwen Qwen 3 30B-A3B
Apr 2025
MoE architecture with 3.3B active parameters, balancing efficiency with strong reasoning, multilingual capabilities, and specialized thinking mode.
Input: $0.90 / Output: $0.90
Google Gemini 2.5 Flash Preview
Apr 2025
A thinking model offering enhanced reasoning with controllable "thinking" capabilities, balancing speed, cost, and performance for developers.
Input: $0.15 / Output: $3.50
OpenAI o4 mini
Apr 2025
Optimized for fast, affordable reasoning with strong coding and visual skills, large 200k-token context, and efficient handling of complex tasks.
Input: $1.10 / Output: $4.40
OpenAI o3
Apr 2025
Excels at advanced reasoning, coding, math, and visual tasks with simulated reasoning, tool use, web browsing, and image understanding integration.
Input: $2.00 / Output: $8.00
OpenAI GPT 4.1 nano
Apr 2025
OpenAI's fastest and most cost-effective model, with a full 1 million token context, optimized for classification, autocompletion, and real-time AI agent tasks.
Input: $0.10 / Output: $0.40
OpenAI GPT 4.1 mini
Apr 2025
Powerful mid-sized model with GPT-4o-level performance at lower cost and latency, featuring a 1 million token context window for complex tasks.
Input: $0.40 / Output: $1.60
OpenAI GPT 4.1
Apr 2025
Excels in coding and instruction following with million-token context window, enabling superior performance on complex, multi-step tasks.
Input: $2.00 / Output: $8.00
Meta Llama 4 Scout Instruct
Apr 2025
Multimodal model specializing in multilingual text and image analysis, multi-document summarization, codebase reasoning, and highly personalized task automation.
Input: $0.08 / Output: $0.30
Meta Llama 4 Maverick
Apr 2025
Multimodal model with 17B active parameters, excelling at text, image, code, and multilingual tasks. Supports 1M-token context for advanced enterprise use.
Input: $0.22 / Output: $0.88
Qwen Qwen2.5-VL 32B Instruct
Mar 2025
image → text
Handles advanced visual recognition, complex analysis of images and videos, structured data extraction, agentic tool use, and robust multilingual reasoning.
Input: $0.90 / Output: $0.90
DeepSeek Deepseek V3
Mar 2025
Delivers advanced reasoning, code generation, and mathematical skills, processes long inputs efficiently, and accelerates results with innovative Mixture-of-Experts design.
Input: $0.27 / Output: $1.10
Mistral AI Mistral Small 3.1
Mar 2025
A lightweight, versatile 24B multimodal model handling text and images with extensive multilingual support and 128k token context window.
Input: $0.10 / Output: $0.30
Google Gemma 3 27B
Mar 2025
Gemma 3 has a large, 128K context window, multilingual support in over 140 languages, and is available in more sizes than previous versions. Gemma 3 models are well-suited for a variety of text generation and image understanding tasks, including question answering, summarization, and reasoning.
Input: $0.20 / Output: $0.40
Google Gemini 2.0 Pro
Mar 2025
Google's most powerful Gemini 2.0 model, released in February 2025.
Input: $1.25 / Output: $5.00
Perplexity Perplexity Sonar Deep Research
Mar 2025
Performs exhaustive, multi-step research by autonomously searching and synthesizing hundreds of sources into detailed, expert-level reports across domains.
Input: $5.00 / Output: $15.00
Perplexity Perplexity Sonar Reasoning Pro
Mar 2025
Premium reasoning model for complex, multi-step analysis. Delivers detailed explanations, real-time web search, and double citations for thorough answers.
Input: $3.00 / Output: $10.00
Anthropic AI Claude 3.7 Sonnet
Feb 2025
A hybrid reasoning model with standard and extended thinking modes, delivering twice the speed and exceptional performance in coding and problem-solving tasks.
Input: $3.00 / Output: $15.00
Perplexity Perplexity Sonar Reasoning
Jan 2025
Fast reasoning model with real-time web search, chain-of-thought capabilities, and citation support. Excels at complex queries with quick, accurate responses.
Input: $2.00 / Output: $6.00
Perplexity Perplexity Sonar
Jan 2025
Optimized for search-augmented tasks, delivering fast, accurate answers with real-time web data and detailed citations. Excels in research and fact-checking.
Input: $2.00 / Output: $2.00
Perplexity Perplexity Sonar Pro
Jan 2025
Excels at complex, multi-step queries with real-time web search, detailed answers, extensive citations, and customizable information retrieval.
Input: $5.00 / Output: $20.00
DeepSeek Deepseek R1
Jan 2025
Employs a massive Mixture-of-Experts architecture and Multi-head Latent Attention to deliver advanced, polished reasoning and problem-solving across math, code, and more.
Input: $3.00 / Output: $8.00
DeepSeek Deepseek R1
Jan 2025
An open-source reasoning model using Mixture-of-Experts architecture, delivering powerful math and code capabilities comparable to OpenAI's o1.
Input: $0.55 / Output: $2.19
DeepSeek Deepseek V3
Dec 2024
Powers advanced reasoning, code generation, and multilingual tasks with efficient MoE architecture and enhanced multi-token prediction for faster, optimized results.
Input: $0.75 / Output: $3.00
Nous Research Hermes 3 8B
Dec 2024
Advanced agentic capabilities, roleplaying, reasoning, multi-turn conversation, long context coherence, and code generation with structured outputs.
Input: $0.03 / Output: $0.03
Nous Research Hermes 3 70B
Dec 2024
Advanced agentic capabilities with strong roleplaying, reasoning, and structured output generation for technical tasks.
Input: $0.20 / Output: $0.20
Meta Llama 3.3 70B Instruct
Dec 2024
Optimized for dialogue with strong reasoning, multilingual support, and efficient performance approaching larger models.
Input: $0.90 / Output: $0.90
Meta Llama 3.3 70B Speculative Decoding
Dec 2024
Optimized for speed via speculative decoding, excels in reasoning, coding, and complex tasks while maintaining high efficiency.
Input: $0.59 / Output: $0.59
Qwen QwQ 32B Preview
Nov 2024
Specialized in advanced reasoning and problem-solving, excelling in mathematics and programming with a 32B parameter transformer architecture.
Input: $0.90 / Output: $0.90
Qwen Qwen 2.5 Coder 32B Instruct
Nov 2024
Specializes in code generation, reasoning, and fixing with 128K token context, open-source licensing, and local deployment capabilities.
Input: $0.90 / Output: $0.90
Meta Llama 3.1 8B
Oct 2024
Multilingual dialogue model optimized for tool integration and safety, with 128K context length for extended interactions.
Input: $0.05 / Output: $0.08
Anthropic AI Claude 3.5 Sonnet
Oct 2024
Powerful AI with exceptional coding abilities, twice the speed of previous versions, and advanced reasoning for complex software development tasks.
Input: $3.00 / Output: $15.00
Anthropic AI Claude 3.5 Haiku
Oct 2024
Anthropic's fastest model offering advanced coding, tool use, and reasoning capabilities with rapid response times for real-time applications and personalized experiences.
Input: $0.80 / Output: $4.00
Mistral AI Ministral 8B
Oct 2024
Efficient edge model with native function calling and interleaved sliding-window attention for fast, memory-efficient processing in resource-constrained environments.
Input: $0.10 / Output: $0.10
Qwen Qwen2.5 1.5B Instruct
Sep 2024
text → text
Qwen2.5 is the latest series of Qwen large language models, released as base and instruction-tuned models ranging from 0.5 to 72 billion parameters.
$0.00
Meta Llama 3.2 3B Instruct
Sep 2024
Highly efficient multilingual model excels at instruction-following, dialogue, and summarization, supporting eight languages for accurate, compact text generation and retrieval.
Input: $0.02 / Output: $0.02
Google Gemini 1.5 Flash - 8B
Sep 2024
Optimized for high-volume, cost-effective tasks with multimodal input support, excelling in transcription and long-context processing.
Input: $0.04 / Output: $0.15
Qwen Qwen2.5 72B Instruct
Sep 2024
Instruction-tuned LLM excelling in long-context processing (131K tokens), multilingual support (29+ languages), and structured data handling.
Input: $0.90 / Output: $0.90
OpenAI o1 mini
Sep 2024
Optimized for coding and math with Chain-of-Thought reasoning, offering fast, cost-efficient responses for complex problem-solving.
Input: $3.00 / Output: $12.00
OpenAI o1 preview
Sep 2024
Reasoning-focused LLM for complex science, math, and coding tasks, generating detailed thought processes before responses.
Input: $15.00 / Output: $60.00
Mistral AI Pixtral 12B
Sep 2024
Multimodal model handling text and images at native resolution with 128K context window, excelling in visual reasoning tasks like document analysis and image captioning.
Input: $0.15 / Output: $0.15
Nous Research Hermes 3 405B
Aug 2024
Advanced agentic capabilities with enhanced reasoning, roleplaying, and multi-turn conversation handling. Excels in structured output and long-context coherence.
Input: $0.90 / Output: $0.90
Meta Llama 3.1 70B Instruct
Jul 2024
Multilingual LLM excelling in question answering, reasoning, code generation, and synthetic data generation.
Input: $0.90 / Output: $0.90
Meta Llama 3.3 70B Versatile 128k
Jul 2024
Excels in multilingual tasks, tool use, coding, and reasoning with improved accuracy and efficient performance.
Input: $0.59 / Output: $0.79
Meta Llama 3.1 405B Instruct
Jul 2024
Optimized for multilingual dialogue with 128k context, instruction-tuned via SFT/RLHF, and enhanced with synthetic data for safety and performance.
Input: $3.00 / Output: $3.00
Meta Llama 3.1 8B Instruct
Jul 2024
Optimized for multilingual dialogue with 128k context length, excels in chat, text generation, and language translation.
Input: $0.20 / Output: $0.20
Mistral AI Mistral Nemo
Jul 2024
Handles long-form content with 128k token context, excels in multilingual tasks, coding, and function calling via natural language.
Input: $0.15 / Output: $0.15
OpenAI GPT 4o mini
Jul 2024
Cost-efficient, fast model with 128K context window, supporting text/vision inputs and improved multilingual performance.
Input: $0.15 / Output: $0.60
Google Gemma 2 9B Instruct
Jun 2024
Efficient 9B parameter model trained on diverse web, code, and math data, excelling in coding and mathematical tasks.
Input: $0.20 / Output: $0.20
Mistral AI Codestral 2405
May 2024
Specializes in code generation with 32k token context, excelling in completion, debugging, and optimization across 80+ languages.
Input: $0.20 / Output: $0.60
Google Gemini 1.5 Pro
May 2024
Multimodal LLM with 2M token context, excels in complex reasoning, coding, and multimodal Q&A across text, images, audio, and video.
Input: $1.25 / Output: $5.00
Google Text Embedding 004
May 2024
text → embeddings
Generates vector representations capturing semantic meaning/context for tasks like semantic search, text classification, and clustering. Multilingual support with versatile applications.
Input: $0.02 / Output: $0.02
Google Gemini 1.5 Flash
May 2024
Optimized for speed and efficiency, handles high-volume tasks with multimodal processing (text, images, video, audio) for summarization, chat, and data extraction.
Input: $0.08 / Output: $0.30
OpenAI GPT 4o
May 2024
Multimodal LLM for real-time text, audio, and visual processing with multilingual support, emotional audio responses, and image generation.
Input: $2.50 / Output: $10.00
Mistral AI Mixtral 8x22B
Apr 2024
Efficient Sparse MoE architecture with 39B active parameters, excels in multilingual tasks, math, coding, and handles 64K token contexts.
Input: $2.00 / Output: $6.00
Mistral AI Mistral Large 2
Feb 2024
Powerful LLM with 123B parameters, excelling in multilingual tasks, coding, and reasoning, optimized for single-node inference and long-context applications.
Input: $2.00 / Output: $6.00
OpenAI Text Embedding 3 - Small
Jan 2024
text → embeddings
Generates compact, efficient embeddings for NLP tasks with multilingual support, balancing performance and low latency.
Input: $0.02 / Output: $0.02
OpenAI Text Embedding 3 - Large
Jan 2024
text → embeddings
Generates high-quality embeddings for complex text analysis and multilingual applications with 8,191 token context.
Input: $0.13 / Output: $0.13
Mistral AI Mixtral 8x7B
Dec 2023
Efficient Mixture of Experts (8 experts) with 13B active parameters, optimized for multilingual tasks and cost-performance balance.
Input: $0.70 / Output: $0.70
OpenAI DALL-E 3
Oct 2023
Translates nuanced text prompts into detailed, accurate images with automatic prompt rewriting, multiple aspect ratios, and ChatGPT integration for creative workflows[1][2][6].
Price not listed
Mistral AI Mistral 7B
Sep 2023
Balanced performance in natural language and code tasks, efficiently handling longer sequences with innovative attention mechanisms.
Input: $0.25 / Output: $0.25
OpenAI o3 mini
Feb 2025
Optimized for STEM reasoning and problem-solving, excelling in complex tasks like advanced math and coding with improved cost efficiency.
Input: $1.10 / Output: $4.40
DeepSeek Deepseek R1 Distill Llama 70B
Feb 2025
Delivers strong mathematical and coding abilities, matching the performance of larger models while using efficient distillation and multilingual support.
Input: $0.59 / Output: $0.79
OpenAI o1
Feb 2025
Specializes in complex reasoning through chain-of-thought processing, excelling in STEM tasks like coding, math, and scientific analysis.
Input: $15.00 / Output: $60.00
OpenAI GPT 4.5
Feb 2025
Excels in natural conversation and creative tasks with improved emotional intelligence and multilingual support, prioritizing intuitive interactions over structured reasoning.
Input: $75.00 / Output: $150.00
Mistral AI Ministral 3B
Oct 2024
Optimized for edge computing with function-calling capabilities, excelling in knowledge retrieval and commonsense reasoning with 128k token context.
Input: $0.04 / Output: $0.04
Google Gemini 2.0 Flash Lite
Feb 2025
Cost-efficient multimodal LLM for real-time tasks with 1M token input context and enhanced performance.
Input: $0.08 / Output: $0.30
Google Gemini 2.0 Flash
Feb 2025
Multimodal LLM for agentic applications, handling real-time data integration and multi-step tasks with enhanced reasoning via Thinking Mode, integrating Google tools and third-party functions.
Input: $0.10 / Output: $0.40
Mistral AI Codestral Latest
Feb 2025
Specializes in coding tasks with multilingual support for 80+ languages, excelling in code generation, fill-in-the-middle, and test creation with a 256K token context.
Input: $0.30 / Output: $0.90
Google Gemini 2.5 Pro Experimental
Mar 2025
Handles complex reasoning and coding tasks, generates and interprets multimodal content, and supports interactive visualizations with an extensive 1M token context.
Input: $2.50 / Output: $5.00
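
The listing gives separate input and output prices but does not state the billing unit. As a rough sketch, the snippet below estimates the cost of a single request under the assumption that each price is in US dollars per 1 million tokens; that unit, the `PRICES` table layout, and the `estimate_cost` helper are illustrative and not taken from the catalog. The three price pairs are copied from the entries above.

```python
from dataclasses import dataclass

# ASSUMPTION: listed prices are USD per 1,000,000 tokens.
# The catalog does not state the unit; adjust this constant if it differs.
TOKENS_PER_PRICE_UNIT = 1_000_000


@dataclass(frozen=True)
class Price:
    input_usd: float   # listed "Input" price
    output_usd: float  # listed "Output" price


# A few input/output price pairs copied from the listing above.
PRICES = {
    "Claude Sonnet 4": Price(3.00, 15.00),
    "GPT 4.1 nano": Price(0.10, 0.40),
    "Kimi K2 Instruct": Price(0.60, 2.50),
}


def estimate_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Return the estimated USD cost of one request (hypothetical helper)."""
    price = PRICES[model]
    total = input_tokens * price.input_usd + output_tokens * price.output_usd
    return total / TOKENS_PER_PRICE_UNIT


if __name__ == "__main__":
    # Compare the same workload (2,000 prompt tokens, 500 completion tokens)
    # across the sample models.
    for name in PRICES:
        print(f"{name}: ${estimate_cost(name, 2_000, 500):.6f}")
```

Because output tokens are usually priced higher than input tokens in this listing, a comparison like this is most useful when run with token counts that reflect the real prompt-to-completion ratio of the workload.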