Supported Models & Pricing
Model Catalog & Pricing
Updated: 2025-09-11
| Provider | Model | Type | Input ($/1M tokens) | Output ($/1M tokens) | Cached input ($/1M tokens) |
| --- | --- | --- | --- | --- | --- |
| Anthropic | Claude Haiku 3 | Text (Legacy) | 0.25 | 1.25 | - |
| Anthropic | Claude Haiku 3.5 | Text (Latest) | 0.8 | 4 | 1 |
| Anthropic | Claude Opus 3 | Text (Legacy) | 15 | 75 | 18.75 |
| Anthropic | Claude Opus 4 | Text (Legacy) | 15 | 75 | 18.75 |
| Anthropic | Claude Opus 4.1 | Text (Latest) | 15 | 75 | 18.75 |
| Anthropic | Claude Sonnet 3.7 | Text (Legacy) | 3 | 15 | - |
| Anthropic | Claude Sonnet 4 | Text (Latest) | 3 (≤ 200K tokens) / 6 (> 200K tokens) | 15 (≤ 200K) / 22.5 (> 200K) | 3.75 (≤ 200K) / 7.5 (> 200K) |
| DeepSeek | DeepSeek-R1 | Text | 0.07 | 1.68 | 0.56 |
| DeepSeek | DeepSeek-V3 | Text (Latest) | 0.07 | 1.68 | 0.56 |
| DeepSeek | deepseek-coder | Text | 0.07 | 0.56 | 0.14 |
| Google | Gemini 2.5 Pro | Text (Latest) | 1.25 (≤ 200K tokens) / 2.5 (> 200K tokens) | 10 (≤ 200K) / 15 (> 200K) | - |
| Google | Gemini-2.0-flash | Text | 0.15 | 0.6 | - |
| Google | Gemini-2.0-flash-lite | Text | 0.075 | 0.3 | - |
| Google | Gemini-2.5-flash | Text | 2.5 | 10 (≤ 200K) / 15 (> 200K) | - |
| Google | Gemini-2.5-flash-lite | Text | 0.1 | 0.4 | - |
| Google | Gemini-pro-vision | Image | 50 | 200 | - |
| OpenAI | gpt-3.5-turbo | Legacy text (Standard) | 0.5 | 1.5 | - |
| OpenAI | gpt-4 | Legacy text (Standard) | 30 | 60 | - |
| OpenAI | gpt-4.1 | Text (Standard) | 2 | 8 | 0.5 |
| OpenAI | gpt-4.1-mini | Text (Standard) | 0.4 | 1.6 | 0.1 |
| OpenAI | gpt-4.1-nano | Text (Standard) | 0.1 | 0.4 | 0.025 |
| OpenAI | gpt-4o | Text (Standard) | 2.5 | 10 | 1.25 |
| OpenAI | gpt-4-turbo | Legacy text (Standard) | 10 | 30 | - |
| OpenAI | gpt-5 | Text (Standard) | 1.25 | 10 | 0.125 |
| OpenAI | gpt-5-mini | Text (Standard) | 0.25 | 2 | 0.025 |
| OpenAI | gpt-5-nano | Text (Standard) | 0.05 | 0.4 | 0.005 |
| OpenAI | o3 | Text (Standard, reasoning) | 2 | 8 | 0.5 |
| OpenAI | o4-mini | Text (Standard, reasoning) | 1.1 | 4.4 | 0.275 |
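For anyone consuming this catalog programmatically, the sketch below shows one possible way to hold a few of the rows above as structured data and pick the cheapest option for a given token mix. The `CATALOG` dict, its field names, and the `cheapest` helper are illustrative assumptions, not an official schema or API.

```python
# Illustrative representation of a few catalog rows; prices are USD per 1M tokens.
# The field names (provider, type, input, output, cached_input) are an assumed schema.
# A "-" in the table is represented here as None.
CATALOG = {
    "gpt-5-nano":     {"provider": "OpenAI",    "type": "Text (Standard)", "input": 0.05, "output": 0.4,  "cached_input": 0.005},
    "gpt-4.1-nano":   {"provider": "OpenAI",    "type": "Text (Standard)", "input": 0.1,  "output": 0.4,  "cached_input": 0.025},
    "deepseek-coder": {"provider": "DeepSeek",  "type": "Text",            "input": 0.07, "output": 0.56, "cached_input": 0.14},
    "Claude Haiku 3": {"provider": "Anthropic", "type": "Text (Legacy)",   "input": 0.25, "output": 1.25, "cached_input": None},
}

def cheapest(catalog: dict, in_tokens: int, out_tokens: int) -> str:
    """Return the model with the lowest estimated cost for a given token mix."""
    def cost(p):  # input and output are priced separately, per 1M tokens
        return (in_tokens * p["input"] + out_tokens * p["output"]) / 1_000_000
    return min(catalog, key=lambda name: cost(catalog[name]))

# E.g. for a 10K-in / 2K-out request, gpt-5-nano is the cheapest of these rows.
print(cheapest(CATALOG, 10_000, 2_000))
```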
I. Recommended Models & Use Cases
(Prices shown as USD per 1M tokens — in = input, out = output)
1) Low cost / high volume
gpt-5-nano ($0.05 in / $0.4 out): Ultra-cheap default for canary/large-scale tasks.
gpt-4.1-nano ($0.1 / $0.4): A more stable, all-round “nano” tier.
Gemini-2.0-flash-lite ($0.075 / $0.3) or Gemini-2.5-flash-lite ($0.1 / $0.4): Great value for lightweight generation/understanding.
deepseek-coder ($0.07 / $0.56): High value for code generation/refactoring.
2) General chat / content generation
gpt-4.1-mini ($0.4 / $1.6): Stable, versatile “mini” workhorse.
DeepSeek-V3 ($0.07 / $1.68): Strong in Chinese and medium-to-long text with friendly cost.
Claude Haiku 3 ($0.25 / $1.25): Lightweight, high-quality alternative.
3) High quality / complex tasks
gpt-4o ($2.5 / $10) or gpt-4.1 ($2 / $8): For high-quality long-form writing and multi-turn conversation.
Gemini 2.5 Pro (tiered): $1.25–$2.5 in / $10–$15 out for complex, composite workloads.
Gemini-2.5-flash ($2.5 in / $10–$15 out): Strong performance at a higher price.
4) Reasoning / chain-of-thought
o4-mini ($1.1 / $4.4): Reasoning-capable at a lower price point.
o3 ($2 / $8): A stronger reasoning route.
DeepSeek-R1 ($0.07 / $1.68): Low-cost reasoning trials.
5) Vision / multimodal
Gemini-pro-vision ($50 in / $200 out): Use when vision understanding is required (costly—call only as needed).
II. Full Price Table (USD / 1M tokens)
The full price table is the catalog shown at the top of this page. The “Cached input” column applies only when a provider supports cache-hit discounts; where a row shows “-”, that model/route does not differentiate cache pricing.
Notes
Prices above are per million tokens. If a price shows tiering such as “≤ 200K / > 200K”, billing follows the provider’s tier active at call time.
By default, user-side settlement sums input and output tokens separately (and, where applicable, according to the route definition).
If a single request is split across providers due to fallback/segmented routing, charges are computed per segment and then merged; the sketch below illustrates this arithmetic.
For provider revenue share, protocol fees, and the user 90% discount rule, see Pricing & Rewards Allocation Model.
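To make the billing rules above concrete, here is a minimal Python sketch of the cost arithmetic under the stated assumptions: prices are USD per 1M tokens, input and output charges are summed separately, a cached-input rate applies only where the table lists one, tiered models switch rates at the provider’s threshold, and segmented requests are charged per segment and then merged. The `Price` class and `estimate_cost` function are illustrative names, not part of any published SDK; actual settlement follows the Pricing & Rewards Allocation Model.

```python
from dataclasses import dataclass
from typing import Optional

PER_MILLION = 1_000_000

@dataclass
class Price:
    """Per-1M-token prices. The *_high rates apply above tier_threshold, if one exists."""
    input: float
    output: float
    cached_input: Optional[float] = None   # None = no cache-hit discount for this route
    tier_threshold: Optional[int] = None   # e.g. 200_000 for "<=200K / >200K" tiering
    input_high: Optional[float] = None
    output_high: Optional[float] = None

def estimate_cost(price: Price, input_tokens: int, output_tokens: int,
                  cached_tokens: int = 0) -> float:
    """Estimate USD cost for one request: input and output are summed separately."""
    # Tier selection: use the rate for the tier active at call time (here, by prompt size).
    over = price.tier_threshold is not None and input_tokens > price.tier_threshold
    in_rate = price.input_high if (over and price.input_high is not None) else price.input
    out_rate = price.output_high if (over and price.output_high is not None) else price.output
    # Cached tokens bill at the cached-input rate only when the provider differentiates it.
    cached_rate = price.cached_input if price.cached_input is not None else in_rate
    billable_input = max(input_tokens - cached_tokens, 0)
    return (billable_input * in_rate
            + cached_tokens * cached_rate
            + output_tokens * out_rate) / PER_MILLION

# Example: gpt-4.1-mini ($0.4 in / $1.6 out / $0.1 cached) on a 2K-in / 500-out request.
gpt_41_mini = Price(input=0.4, output=1.6, cached_input=0.1)
print(f"{estimate_cost(gpt_41_mini, 2_000, 500):.6f}")   # 0.001600

# Fallback / segmented routing: each segment is charged on its own provider, then merged.
segments = [
    estimate_cost(Price(input=1.25, output=10, tier_threshold=200_000,
                        input_high=2.5, output_high=15), 150_000, 2_000),       # Gemini 2.5 Pro
    estimate_cost(Price(input=0.25, output=2, cached_input=0.025), 8_000, 1_000),  # gpt-5-mini
]
print(f"{sum(segments):.6f}")   # 0.211500
```

In this sketch the tier is keyed off the prompt size for simplicity; a provider may define the “active tier” differently (for example, by total context), which is why the notes above defer to the provider’s definition at call time.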