Supported Models & Pricing

Model Catalog & Pricing

Updated: 2025-09-11

| Provider | Model | Type | Input $/1M | Output $/1M | Cached input $/1M |
|---|---|---|---|---|---|
| Anthropic | Claude Haiku 3 | Text (Legacy) | 0.25 | 1.25 | - |
| Anthropic | Claude Haiku 3.5 | Text (Latest) | 0.8 | 4 | 1 |
| Anthropic | Claude Opus 3 | Text (Legacy) | 15 | 75 | 18.75 |
| Anthropic | Claude Opus 4 | Text (Legacy) | 15 | 75 | 18.75 |
| Anthropic | Claude Opus 4.1 | Text (Latest) | 15 | 75 | 18.75 |
| Anthropic | Claude Sonnet 3.7 | Text (Legacy) | 3 | 15 | - |
| Anthropic | Claude Sonnet 4 | Text (Latest) | 3 (≤200K tokens) / 6 (>200K tokens) | 15 (≤200K) / 22.5 (>200K) | 3.75 (≤200K) / 7.5 (>200K) |
| DeepSeek | DeepSeek-R1 | Text | 0.07 | 1.68 | 0.56 |
| DeepSeek | DeepSeek-V3 | Text (Latest) | 0.07 | 1.68 | 0.56 |
| DeepSeek | deepseek-coder | Text | 0.07 | 0.56 | 0.14 |
| Google | Gemini 2.5 Pro | Text (Latest) | 1.25 (≤200K tokens) / 2.5 (>200K tokens) | 10 (≤200K) / 15 (>200K) | - |
| Google | Gemini-2.0-flash | Text | 0.15 | 0.6 | - |
| Google | Gemini-2.0-flash-lite | Text | 0.075 | 0.3 | - |
| Google | Gemini-2.5-flash | Text | 2.5 | 10 (≤200K tokens) / 15 (>200K tokens) | - |
| Google | Gemini-2.5-flash-lite | Text | 0.1 | 0.4 | - |
| Google | Gemini-pro-vision | Image | 50 | 200 | - |
| OpenAI | gpt-3.5-turbo | Legacy text (Standard) | 0.5 | 1.5 | - |
| OpenAI | gpt-4 | Legacy text (Standard) | 30 | 60 | - |
| OpenAI | gpt-4.1 | Text (Standard) | 2 | 8 | 0.5 |
| OpenAI | gpt-4.1-mini | Text (Standard) | 0.4 | 1.6 | 0.1 |
| OpenAI | gpt-4.1-nano | Text (Standard) | 0.1 | 0.4 | 0.025 |
| OpenAI | gpt-4o | Text (Standard) | 2.5 | 10 | 1.25 |
| OpenAI | gpt-4-turbo | Legacy text (Standard) | 10 | 30 | - |
| OpenAI | gpt-5 | Text (Standard) | 1.25 | 10 | 0.125 |
| OpenAI | gpt-5-mini | Text (Standard) | 0.25 | 2 | 0.025 |
| OpenAI | gpt-5-nano | Text (Standard) | 0.05 | 0.4 | 0.005 |
| OpenAI | o3 | Text (Standard, reasoning) | 2 | 8 | 0.5 |
| OpenAI | o4-mini | Text (Standard, reasoning) | 1.1 | 4.4 | 0.275 |

(Prices shown as USD per 1M tokens — in = input, out = output)
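As a concrete illustration of how the per-1M prices translate into a per-request charge, here is a minimal sketch (the token counts are hypothetical; cached input tokens are billed at the "Cached input" rate when the provider offers one, otherwise at the normal input rate):

```python
def request_cost(input_tokens, output_tokens, cached_tokens,
                 in_price, out_price, cached_price=None):
    """USD cost for one request, given per-1M-token prices from the table.

    Tokens that hit the provider's cache bill at `cached_price`;
    if the model has no cached rate ("-"), they bill as plain input.
    """
    if cached_price is None:
        cached_price = in_price  # no cache discount available
    uncached = input_tokens - cached_tokens
    return (uncached * in_price
            + cached_tokens * cached_price
            + output_tokens * out_price) / 1_000_000

# gpt-4o: $2.5 in / $10 out / $1.25 cached input (per 1M tokens)
cost = request_cost(input_tokens=12_000, output_tokens=3_000,
                    cached_tokens=8_000,
                    in_price=2.5, out_price=10, cached_price=1.25)
print(f"${cost:.4f}")  # $0.0500
```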

1) Low cost / high volume

  • gpt-5-nano ($0.05 in / $0.4 out): Ultra-cheap default for canary/large-scale tasks.

  • gpt-4.1-nano ($0.1 / $0.4): A more stable, all-round “nano” tier.

  • Gemini-2.0-flash-lite ($0.075 / $0.3) or Gemini-2.5-flash-lite ($0.1 / $0.4): Great value for lightweight generation/understanding.

  • deepseek-coder ($0.07 / $0.56): High value for code generation/refactoring.

2) General chat / content generation

  • gpt-4.1-mini ($0.4 / $1.6): Stable, versatile “mini” workhorse.

  • DeepSeek-V3 ($0.07 / $1.68): Strong in Chinese and medium-to-long text with friendly cost.

  • Claude Haiku 3 ($0.25 / $1.25): Lightweight, high-quality alternative.

3) High quality / complex tasks

  • gpt-4o ($2.5 / $10) or gpt-4.1 ($2 / $8): For high-quality long-form and multi-turn.

  • Gemini 2.5 Pro (tiered): $1.25–$2.5 in / $10–$15 out for complex, composite workloads.

  • Gemini-2.5-flash ($2.5 in / $10–$15 out): Strong performance at a higher price.

4) Reasoning / chain-of-thought

  • o4-mini ($1.1 / $4.4): Reasoning-capable at a lower price point.

  • o3 ($2 / $8): A stronger reasoning route.

  • DeepSeek-R1 ($0.07 / $1.68): Low-cost reasoning trials.

5) Vision / multimodal

  • Gemini-pro-vision ($50 in / $200 out): Use when vision understanding is required (costly—call only as needed).


Full Price Table (USD / 1M tokens)

The “Cached input” column applies only where the provider supports cache-hit discounts; a “-” means that model/route has no separate cached-input price.


Notes

  • Prices above are per million tokens. Where tiering such as “≤200K / >200K” applies, the tier is determined by the request’s token count at call time, per the provider’s rules.

  • By default, user-side settlement sums input and output tokens separately (and, where applicable, by route definition).

  • If a single request is split across providers due to fallback/segmented routing, charges are computed per segment and then merged.

  • For provider revenue share, protocol fees, and the user 90% discount rule, see Pricing & Rewards Allocation Model.
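The tiering and per-segment rules above can be sketched as follows. This is a simplified illustration, not the billing engine: it assumes each segment is billed at the tier matched by its own input-token count, and that segment costs from different providers simply add after fallback/segmented routing.

```python
def tiered_price(tokens, low_price, high_price, threshold=200_000):
    """Pick the per-1M price for the tier this token count falls into."""
    return low_price if tokens <= threshold else high_price

def segmented_cost(segments):
    """Sum costs when routing splits one request across segments.

    Each segment is (input_tokens, output_tokens,
    (input_low, input_high), (output_low, output_high)),
    with low/high being the <=200K and >200K per-1M prices.
    """
    total = 0.0
    for in_tok, out_tok, in_prices, out_prices in segments:
        total += in_tok * tiered_price(in_tok, *in_prices) / 1_000_000
        total += out_tok * tiered_price(in_tok, *out_prices) / 1_000_000
    return total

# Claude Sonnet 4 rates: input $3 / $6, output $15 / $22.5 around 200K
cost = segmented_cost([
    (250_000, 4_000, (3, 6), (15, 22.5)),  # falls into the >200K tier
    (50_000, 2_000, (3, 6), (15, 22.5)),   # falls into the <=200K tier
])
```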
