OpenAI
OpenAI o4-mini
Cheap reasoning — o-series math without the o3 invoice.
$ / 1M input
$1.100
What you pay per million prompt tokens.
$ / 1M output
$4.400
What you pay per million completion tokens.
Blended (70 / 30)
$2.090
Typical chat workload mix.
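The blended figure is a straight 70/30 weighting of the input and output rates (a sketch of the stated mix, not an official billing formula):

```python
input_rate = 1.10   # $ per 1M input tokens
output_rate = 4.40  # $ per 1M output tokens

# 70% input / 30% output — the "typical chat workload" mix above
blended = 0.7 * input_rate + 0.3 * output_rate
print(f"${blended:.3f} / 1M tokens")  # $2.090 / 1M tokens
```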
Ratings & benchmarks
Snapshot: 2026-04-28
Overall
4.4 / 5
Composite of intelligence + reliability.
Output speed
50 t/s
Output tokens per second under typical load.
Time to first token
4.0 s
Lower is better. Reasoning models naturally stretch this.
Value
4.0 / 5
Intelligence per dollar at typical mix.
Sources: Artificial Analysis · OpenRouter Stats · Vellum LLM Leaderboard. Rating values are blended from these public leaderboards on the snapshot date and refreshed as the underlying sources update.
Context window
200,000 tokens
Try in API
Replace $THK_KEY with your Universal Key from /dashboard. Token Harbor speaks the OpenAI Chat Completions wire format — the SDK you already use just points at our base URL.
curl https://tokenharbor.ai/v1/chat/completions \
-H "Authorization: Bearer $THK_KEY" \
-H "Content-Type: application/json" \
-d '{
"model": "openai/o4-mini",
"messages": [{"role": "user", "content": "Hello!"}]
}'

from openai import OpenAI
client = OpenAI(
api_key="$THK_KEY",
base_url="https://tokenharbor.ai/v1",
)
resp = client.chat.completions.create(
model="openai/o4-mini",
messages=[{"role": "user", "content": "Hello!"}],
)
print(resp.choices[0].message.content)

import OpenAI from "openai";
const client = new OpenAI({
apiKey: process.env.THK_KEY,
baseURL: "https://tokenharbor.ai/v1",
});
const resp = await client.chat.completions.create({
model: "openai/o4-mini",
messages: [{ role: "user", content: "Hello!" }],
});
console.log(resp.choices[0].message.content);

Smart routing
Don't want to babysit the model ID? Pass tokenharbor/auto and our smart router picks the best fit per request: health-scored across 9 upstream vendors, automatically failing over when one slows down.
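Opting into smart routing only changes the model field of the request body; everything else stays in the Chat Completions wire format. A minimal sketch of the payload:

```python
import json

# Identical to the explicit-model request above, except "model".
payload = {
    "model": "tokenharbor/auto",  # router picks the upstream per request
    "messages": [{"role": "user", "content": "Hello!"}],
}
body = json.dumps(payload)
```

The same string works anywhere a model ID is accepted, including the curl, Python, and Node samples above.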
Pricing footnote. The numbers above match OpenAI's published rates as of 2026-04-28. We don't add a markup; our margin comes from the 10% spread between gateway pricing and the discounted rate we pay our preferred provider for this model (recorded in lib/models-catalog.ts as openai).