Supported Models & Pricing

Model Catalog & Pricing

Updated: 2025-09-11

| Provider | Model | Type | Input $/1M | Output $/1M | Cached input $/1M |
|---|---|---|---|---|---|
| Anthropic | Claude Haiku 3 | Text (Legacy) | 0.25 | 1.25 | - |
| Anthropic | Claude Haiku 3.5 | Text (Latest) | 0.8 | 4 | 1 |
| Anthropic | Claude Opus 3 | Text (Legacy) | 15 | 75 | 18.75 |
| Anthropic | Claude Opus 4 | Text (Legacy) | 15 | 75 | 18.75 |
| Anthropic | Claude Opus 4.1 | Text (Latest) | 15 | 75 | 18.75 |
| Anthropic | Claude Sonnet 3.7 | Text (Legacy) | 3 | 15 | - |
| Anthropic | Claude Sonnet 4 | Text (Latest) | 3 (≤200K tokens) / 6 (>200K tokens) | 15 (≤200K) / 22.5 (>200K) | 3.75 (≤200K) / 7.5 (>200K) |
| DeepSeek | DeepSeek-R1 | Text | 0.07 | 1.68 | 0.56 |
| DeepSeek | DeepSeek-V3 | Text (Latest) | 0.07 | 1.68 | 0.56 |
| DeepSeek | deepseek-coder | Text | 0.07 | 0.56 | 0.14 |
| Google | Gemini 2.5 Pro | Text (Latest) | 1.25 (≤200K tokens) / 2.5 (>200K tokens) | 10 (≤200K) / 15 (>200K) | - |
| Google | Gemini-2.0-flash | Text | 0.15 | 0.6 | - |
| Google | Gemini-2.0-flash-lite | Text | 0.075 | 0.3 | - |
| Google | Gemini-2.5-flash | Text | 2.5 | 10 (≤200K tokens) / 15 (>200K tokens) | - |
| Google | Gemini-2.5-flash-lite | Text | 0.1 | 0.4 | - |
| Google | Gemini-pro-vision | Image | 50 | 200 | - |
| OpenAI | gpt-3.5-turbo | Legacy text (Standard) | 0.5 | 1.5 | - |
| OpenAI | gpt-4 | Legacy text (Standard) | 30 | 60 | - |
| OpenAI | gpt-4.1 | Text (Standard) | 2 | 8 | 0.5 |
| OpenAI | gpt-4.1-mini | Text (Standard) | 0.4 | 1.6 | 0.1 |
| OpenAI | gpt-4.1-nano | Text (Standard) | 0.1 | 0.4 | 0.025 |
| OpenAI | gpt-4o | Text (Standard) | 2.5 | 10 | 1.25 |
| OpenAI | gpt-4-turbo | Legacy text (Standard) | 10 | 30 | - |
| OpenAI | gpt-5 | Text (Standard) | 1.25 | 10 | 0.125 |
| OpenAI | gpt-5-mini | Text (Standard) | 0.25 | 2 | 0.025 |
| OpenAI | gpt-5-nano | Text (Standard) | 0.05 | 0.4 | 0.005 |
| OpenAI | o3 | Text (Standard, reasoning) | 2 | 8 | 0.5 |
| OpenAI | o4-mini | Text (Standard, reasoning) | 1.1 | 4.4 | 0.275 |

(Prices shown as USD per 1M tokens — in = input, out = output)
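As a concrete illustration of how the per-1M prices translate into a per-request charge, here is a minimal sketch (the token counts are hypothetical; cached input tokens are billed at the "Cached input" rate when the provider offers one, otherwise at the normal input rate):

```python
def request_cost(input_tokens, output_tokens, cached_tokens,
                 in_price, out_price, cached_price=None):
    """USD cost for one request, given per-1M-token prices from the table.

    Tokens that hit the provider's cache bill at `cached_price`;
    if the model has no cached rate ("-"), they bill as plain input.
    """
    if cached_price is None:
        cached_price = in_price  # no cache discount available
    uncached = input_tokens - cached_tokens
    return (uncached * in_price
            + cached_tokens * cached_price
            + output_tokens * out_price) / 1_000_000

# gpt-4o: $2.5 in / $10 out / $1.25 cached input (per 1M tokens)
cost = request_cost(input_tokens=12_000, output_tokens=3_000,
                    cached_tokens=8_000,
                    in_price=2.5, out_price=10, cached_price=1.25)
print(f"${cost:.4f}")  # $0.0500
```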

1) Low cost / high volume

  • gpt-5-nano ($0.05 in / $0.4 out): Ultra-cheap default for canary/large-scale tasks.

  • gpt-4.1-nano ($0.1 / $0.4): A more stable, all-round “nano” tier.

  • Gemini-2.0-flash-lite ($0.075 / $0.3) or Gemini-2.5-flash-lite ($0.1 / $0.4): Great value for lightweight generation/understanding.

  • deepseek-coder ($0.07 / $0.56): High value for code generation/refactoring.

2) General chat / content generation

  • gpt-4.1-mini ($0.4 / $1.6): Stable, versatile “mini” workhorse.

  • DeepSeek-V3 ($0.07 / $1.68): Strong in Chinese and medium-to-long text with friendly cost.

  • Claude Haiku 3 ($0.25 / $1.25): Lightweight, high-quality alternative.

3) High quality / complex tasks

  • gpt-4o ($2.5 / $10) or gpt-4.1 ($2 / $8): For high-quality long-form and multi-turn.

  • Gemini 2.5 Pro (tiered): $1.25–$2.5 in / $10–$15 out for complex, composite workloads.

  • Gemini-2.5-flash ($2.5 in / $10–$15 out): Strong performance at a higher price.

4) Reasoning / chain-of-thought

  • o4-mini ($1.1 / $4.4): Reasoning-capable at a lower price point.

  • o3 ($2 / $8): A stronger reasoning route.

  • DeepSeek-R1 ($0.07 / $1.68): Low-cost reasoning trials.

5) Vision / multimodal

  • Gemini-pro-vision ($50 in / $200 out): Use when vision understanding is required (costly—call only as needed).


Full Price Table (USD / 1M tokens)

The “Cached input” column applies only where the provider supports cache-hit discounts; a “-” means that model/route has no separate cached-input price.


Notes

  • Prices above are per million tokens. Where tiering such as “≤200K / >200K” applies, the tier is determined by the request’s token count at call time, per the provider’s rules.

  • By default, user-side settlement sums input and output tokens separately (and, where applicable, by route definition).

  • If a single request is split across providers due to fallback/segmented routing, charges are computed per segment and then merged.

  • For provider revenue share, protocol fees, and the user 90% discount rule, see Pricing & Rewards Allocation Model.
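The tiering and per-segment rules above can be sketched as follows. This is a simplified illustration, not the billing engine: it assumes each segment is billed at the tier matched by its own input-token count, and that segment costs from different providers simply add after fallback/segmented routing.

```python
def tiered_price(tokens, low_price, high_price, threshold=200_000):
    """Pick the per-1M price for the tier this token count falls into."""
    return low_price if tokens <= threshold else high_price

def segmented_cost(segments):
    """Sum costs when routing splits one request across segments.

    Each segment is (input_tokens, output_tokens,
    (input_low, input_high), (output_low, output_high)),
    with low/high being the <=200K and >200K per-1M prices.
    """
    total = 0.0
    for in_tok, out_tok, in_prices, out_prices in segments:
        total += in_tok * tiered_price(in_tok, *in_prices) / 1_000_000
        total += out_tok * tiered_price(in_tok, *out_prices) / 1_000_000
    return total

# Claude Sonnet 4 rates: input $3 / $6, output $15 / $22.5 around 200K
cost = segmented_cost([
    (250_000, 4_000, (3, 6), (15, 22.5)),  # falls into the >200K tier
    (50_000, 2_000, (3, 6), (15, 22.5)),   # falls into the <=200K tier
])
```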
