Routing & pricing

How flux-auto routes each request, what the three pricing lanes cost, pay-as-you-go billing, and the transparency headers that show which model answered.

FluxRouter's job is to right-size each request: send simple work to a cheap, fast model and hard work to a stronger one, without you having to pick. You set model to flux-auto, FluxRouter chooses the model, and every response tells you what it chose.

How flux-auto routes

When you send a request with model: "flux-auto", FluxRouter routes it live in two steps. First, a request classifier assigns the request to a tier (fast, standard, or reasoning). Then, within that tier, an adaptive, learning router (a Thompson-sampling bandit) picks the model. The router is cost-aware: among candidates it expects to do the job equally well, it prefers the cheaper one. The same kind of request gets the same class of model and the same lane rate, so what you pay stays predictable.
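To make the second step concrete, here is a minimal sketch of Thompson sampling with a cost-aware tie-break. It is an illustration of the general technique only, not FluxRouter's implementation: the `Candidate` fields, the `tolerance` parameter, the success/failure feedback signal, and the Beta-Bernoulli model are all assumptions made for this example.

```python
import random
from dataclasses import dataclass


@dataclass
class Candidate:
    """One model in a tier. All fields are illustrative, not FluxRouter data."""
    name: str
    cost_per_request: float  # hypothetical relative cost
    successes: int = 0       # past requests judged good enough
    failures: int = 0        # past requests judged not good enough


def pick_model(candidates, tolerance=0.02, rng=random):
    """Thompson sampling with a cost-aware tie-break.

    Draw one sample per candidate from its Beta posterior, then return the
    cheapest candidate whose sample is within `tolerance` of the best sample.
    """
    samples = {
        c.name: rng.betavariate(c.successes + 1, c.failures + 1)
        for c in candidates
    }
    best = max(samples.values())
    close_enough = [c for c in candidates if samples[c.name] >= best - tolerance]
    return min(close_enough, key=lambda c: c.cost_per_request)


def record_outcome(candidate, success):
    """Update the posterior counts after observing how the request went."""
    if success:
        candidate.successes += 1
    else:
        candidate.failures += 1
```

Because each pick is a random draw from the posterior, a candidate with little history still gets tried occasionally, while a candidate with a strong record wins most draws. The tolerance band is one simple way to express "equally good, so take the cheaper one".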

You stay in control:

Want a specific class without naming a model? Use a tier alias: flux-fast, flux-standard, or flux-reasoning.
Want one exact model every time? Pin it with a flux-pinned-* id. See Models.
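The three options differ only in the `model` field of the request body. The sketch below builds that body in Python; the helper name `build_chat_request` and its local check on the model id are assumptions of this example (a convenience for catching typos, not an API rule), and no specific flux-pinned-* id is assumed.

```python
import json

# Tier aliases named in this section.
TIER_ALIASES = {"flux-fast", "flux-standard", "flux-reasoning"}


def build_chat_request(prompt, model="flux-auto"):
    """Return a JSON request body for /v1/chat/completions.

    Accepts flux-auto, a tier alias, or any id starting with flux-pinned-.
    The check is local to this sketch; the API itself decides what is valid.
    """
    allowed = (
        model == "flux-auto"
        or model in TIER_ALIASES
        or model.startswith("flux-pinned-")
    )
    if not allowed:
        raise ValueError(f"unexpected model id: {model!r}")
    return json.dumps({
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    })
```

Calling `build_chat_request("ping")` lets the router choose, while `build_chat_request("ping", model="flux-fast")` asks for the fast tier without naming a model.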

Pricing lanes

Models are priced in three lanes. You pay per token at the lane rate of whichever model served the request, with input and output priced separately. All rates are pay-as-you-go.

Lane	Input (per 1M tokens)	Output (per 1M tokens)	Best for
Fast (Express Lane)	$1	$4	Lightweight, high-volume, latency-sensitive work
Standard (Daily Driver)	$2	$8	General-purpose coding, writing, and analysis
Reasoning (Deep Thought)	$4	$15	The hardest reasoning and frontier-model work

You are charged the live rate for the model that served your request. See the pricing page for current numbers, and "How pay-as-you-go billing works" for the full breakdown.
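As a worked example of per-token pricing, the sketch below prices a single request from its token counts. The rates are copied from the table above and may differ from the live rates on the pricing page; the function name `request_cost_usd` and the token counts in the usage note are made up for illustration.

```python
from decimal import Decimal

# USD per 1M tokens as (input, output), copied from the table above.
LANE_RATES = {
    "fast": (Decimal("1"), Decimal("4")),
    "standard": (Decimal("2"), Decimal("8")),
    "reasoning": (Decimal("4"), Decimal("15")),
}

TOKENS_PER_UNIT = Decimal(1_000_000)


def request_cost_usd(lane, input_tokens, output_tokens):
    """Price one request; input and output tokens are billed at separate rates."""
    input_rate, output_rate = LANE_RATES[lane]
    total = input_rate * input_tokens + output_rate * output_tokens
    return total / TOKENS_PER_UNIT
```

For instance, a Standard-lane request with 1,200 input tokens and 300 output tokens comes to 1,200 × $2 / 1M + 300 × $8 / 1M = $0.0048. `Decimal` is used here so that small per-request amounts add up without floating-point drift.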

Pay-as-you-go

You pay for what you route. Usage carries no per-seat fee, and there is no minimum commitment to send a request. Plans set an included credit and a monthly spend ceiling, while pay-as-you-go bills usage directly. Because flux-auto sends simple requests to cheaper models, a typical mixed workload costs less than pinning everything to a single frontier model.

See the pricing page for current plans and ceilings.

Transparency headers

Every chat completion response carries X-Flux-* headers so you can see exactly what happened:

Header	Value	Meaning
`X-Flux-Model`	e.g. `claude-haiku`	The model that actually served the request
`X-Flux-Original-Model`	e.g. `flux-auto`	The model id you requested
`X-Flux-Routed`	`true` / `false`	Whether the router changed the model
`X-Flux-Request-Id`	a unique id	Identifier for support and debugging
`X-Flux-Cost-Usd`	e.g. `0.000412`	What this request cost, in USD (non-streaming responses)

Read these to confirm which model answered, see what it cost, and keep a request id for support. Inspect them with curl:

bash

curl -i https://api.fluxrouter.ai/v1/chat/completions \
  -H "Authorization: Bearer $FLUX_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "flux-auto",
    "messages": [{ "role": "user", "content": "ping" }]
  }'
# Response headers include:
# X-Flux-Model: ...
# X-Flux-Original-Model: flux-auto
# X-Flux-Routed: true
# X-Flux-Request-Id: ...
# X-Flux-Cost-Usd: 0.000412

The cost header is present on non-streaming responses. On streaming responses the final cost is only known after the headers have flushed, so it is reported on your bill rather than in the header.
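In application code you will usually want these values as a typed record rather than raw strings. The following is a minimal sketch that reads the headers from any mapping of header names to values; the `FluxRouting` class and `parse_flux_headers` function are hypothetical names for this example, and the cost is left as `None` when the header is absent, as on streaming responses.

```python
from dataclasses import dataclass
from decimal import Decimal
from typing import Mapping, Optional


@dataclass
class FluxRouting:
    model: str                   # X-Flux-Model
    original_model: str          # X-Flux-Original-Model
    routed: bool                 # X-Flux-Routed
    request_id: str              # X-Flux-Request-Id
    cost_usd: Optional[Decimal]  # X-Flux-Cost-Usd; None when absent


def parse_flux_headers(headers: Mapping[str, str]) -> FluxRouting:
    """Read the X-Flux-* headers from a response, ignoring header-name case."""
    h = {name.lower(): value.strip() for name, value in headers.items()}
    cost = h.get("x-flux-cost-usd")
    return FluxRouting(
        model=h["x-flux-model"],
        original_model=h["x-flux-original-model"],
        routed=h["x-flux-routed"].lower() == "true",
        request_id=h["x-flux-request-id"],
        cost_usd=Decimal(cost) if cost else None,
    )
```

Pass it the response headers from whichever HTTP client you use, as long as they expose `items()`. Logging `request_id` alongside `model` and `cost_usd` gives you what you need for a support ticket.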

Cost vs going direct

The point of routing is that you do not overpay for easy requests. With flux-auto, simple prompts land on Fast and Standard models and only the hard ones reach the Reasoning lane, so a mixed workload pays a blended rate instead of a frontier rate on everything. The headers above show which model served each response (and, for non-streaming responses, what it cost), and your spend rolls up to a single bill.
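To see how a blended rate compares, here is a small back-of-the-envelope calculation. The workload is entirely hypothetical: the 70/20/10 split across lanes and the 1,000 input / 500 output tokens per request are made-up numbers, the rates are copied from the Pricing lanes table, and the Reasoning lane rate stands in for "a frontier rate on everything".

```python
# USD per 1M tokens as (input, output), copied from the Pricing lanes table.
RATES = {"fast": (1, 4), "standard": (2, 8), "reasoning": (4, 15)}

# Hypothetical workload: share of requests per lane, with every request
# using 1,000 input and 500 output tokens. These numbers are made up.
MIX = {"fast": 0.70, "standard": 0.20, "reasoning": 0.10}
INPUT_TOKENS, OUTPUT_TOKENS = 1_000, 500


def cost_per_request(lane):
    """USD cost of one request of the assumed size in the given lane."""
    rate_in, rate_out = RATES[lane]
    return (rate_in * INPUT_TOKENS + rate_out * OUTPUT_TOKENS) / 1_000_000


# Weighted average across lanes versus sending everything to Reasoning.
blended = sum(share * cost_per_request(lane) for lane, share in MIX.items())
all_reasoning = cost_per_request("reasoning")

print(f"blended:       ${blended:.5f} per request")
print(f"all reasoning: ${all_reasoning:.5f} per request")
```

Under these assumptions the blended cost is about $0.00445 per request against $0.0115 for sending everything to the Reasoning lane. Your own ratio depends on how your traffic actually splits across lanes.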
