How smart routing works

What flux-auto does for you, how the live, adaptive router picks a model for each request, and how you stay in control by using a tier alias or pinning a model.

Smart routing is the reason FluxRouter exists: you send a request to flux-auto and Flux picks a sensible model for it, so you do not have to choose a model for every prompt. This page explains what that means in practice.

What does flux-auto do for me?

When you set model to flux-auto, Flux inspects your request and picks a model sized for the work. Lightweight requests go to fast, inexpensive models; harder requests go to stronger ones. You get a reasonable result without managing model selection yourself, and you pay the rate of the lane that handled the request rather than a frontier rate on everything.

```bash
curl https://api.fluxrouter.ai/v1/chat/completions \
  -H "Authorization: Bearer $FLUX_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "flux-auto",
    "messages": [{ "role": "user", "content": "Summarize this paragraph." }]
  }'
```

How does flux-auto decide which model to use?

Routing happens in two steps, live on every request:

1. A request classifier picks the lane. Flux reads your request and matches it to a tier (fast, standard, or reasoning) based on the kind of work it is. The three tiers map to the three pricing lanes described in Tiers and pricing lanes.
2. A learning router picks the model within the lane. Inside the lane, an adaptive router (a Thompson-sampling bandit) selects among the lane's models based on how they have actually been performing. It is cost-aware: among candidates it expects to do the job equally well, it prefers the cheaper one.

The lane sets what you pay, so the class of model — and the rate — stays predictable for the same kind of request, while the router keeps improving its picks within that class.

Is the router self-learning?

Yes. The lane a request lands in is decided by the classifier, and within the lane the router learns continuously from live results: models that keep winning get picked more, models that slip get picked less, and cost breaks ties among quality-equivalent candidates. Your rate is set by the lane, not by which model inside it served the request.
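To make the idea concrete, here is a toy simulation of a cost-aware Thompson-sampling bandit over a single lane. It is a sketch under stated assumptions, not Flux's implementation: the model names, win rates, relative costs, the Beta(1, 1) priors, and the `eps` tolerance used to decide that two candidates are "quality-equivalent" are all hypothetical.

```shell
# Toy cost-aware Thompson sampling over one lane (illustrative only).
# Usage: simulate_lane ROUNDS SEED   -> prints "model-name pick-count" per model
simulate_lane() {
  awk -v rounds="${1:-2000}" -v seed="${2:-42}" '
    # Gamma(k, 1) draw for integer k >= 1, as a sum of k exponential draws.
    function gamma_int(k,    i, total) {
      total = 0
      for (i = 0; i < k; i++) total -= log(1 - rand())
      return total
    }
    # Beta(a, b) draw for integer a, b >= 1, built from two gamma draws.
    function beta_draw(a, b,    x, y) {
      x = gamma_int(a)
      y = gamma_int(b)
      return x / (x + y)
    }
    BEGIN {
      srand(seed)
      # Hypothetical lane: true win rate and relative cost per model.
      n = 3
      name[1] = "model-a"; win_rate[1] = 0.90; cost[1] = 3
      name[2] = "model-b"; win_rate[2] = 0.90; cost[2] = 1
      name[3] = "model-c"; win_rate[3] = 0.60; cost[3] = 1
      for (i = 1; i <= n; i++) { wins[i] = 0; losses[i] = 0; picks[i] = 0 }
      eps = 0.05   # assumed tolerance for "equally good"

      for (t = 0; t < rounds; t++) {
        # Sample a plausible quality for each model from its posterior.
        best = 0
        for (i = 1; i <= n; i++) {
          draw[i] = beta_draw(wins[i] + 1, losses[i] + 1)
          if (draw[i] > best) best = draw[i]
        }
        # Among draws within eps of the best, prefer the cheaper model;
        # at equal cost, prefer the higher draw.
        pick = 0
        for (i = 1; i <= n; i++) {
          if (draw[i] < best - eps) continue
          if (pick == 0 || cost[i] < cost[pick] ||
              (cost[i] == cost[pick] && draw[i] > draw[pick])) pick = i
        }
        picks[pick]++
        # Observe a simulated outcome and update the posterior counts.
        if (rand() < win_rate[pick]) {
          wins[pick]++
        } else {
          losses[pick]++
        }
      }
      for (i = 1; i <= n; i++) printf "%s %d\n", name[i], picks[i]
    }'
}

simulate_lane 2000 42
```

In this toy setup, model-b should end up with the most picks: it matches model-a on quality but costs less, and it outperforms model-c. Exact counts depend on your awk implementation's random number generator.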

How do I stay in control?

You are never locked into whatever flux-auto would pick. You have two levels of control, in increasing strictness:

1. Choose a tier. Pass a tier alias instead of flux-auto: flux-fast, flux-standard, or flux-reasoning. This tells Flux which class of model you want without naming a specific one.
2. Pin an exact model. Pass a flux-pinned-* alias to lock in a single backing model every time. See flux-auto vs pinning a model.

The authoritative list of aliases you can send is always GET /v1/models.
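As a rough sketch, choosing a tier only changes the model field of the request. The helper names below (`flux_body`, `flux_send`, `flux_models`) are hypothetical conveniences, and sending the Authorization header to GET /v1/models is an assumption; the endpoints and alias names are the ones described on this page.

```shell
# Hypothetical helpers; only the endpoint paths and alias names come from this page.

# Build a minimal chat-completions body. Assumes the prompt contains no
# characters that need JSON escaping (quotes, backslashes, newlines).
flux_body() {
  printf '{"model":"%s","messages":[{"role":"user","content":"%s"}]}' "$1" "$2"
}

# Send a prompt to a chosen alias: a tier alias, a pinned alias, or flux-auto.
flux_send() {
  curl -sS https://api.fluxrouter.ai/v1/chat/completions \
    -H "Authorization: Bearer $FLUX_API_KEY" \
    -H "Content-Type: application/json" \
    -d "$(flux_body "$1" "$2")"
}

# List the aliases the API currently accepts. Whether this endpoint needs
# the Authorization header is an assumption.
flux_models() {
  curl -sS https://api.fluxrouter.ai/v1/models \
    -H "Authorization: Bearer $FLUX_API_KEY"
}

# Usage:
#   flux_models
#   flux_send flux-standard "Summarize this paragraph."
```

To pin a model, pass one of the flux-pinned-* aliases returned by the models endpoint to `flux_send` in place of the tier alias.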

How do I see what it picked?

Every response includes X-Flux-* headers that tell you which model served the request, whether the router changed your requested model, and what the request cost. See Transparency headers.
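One way to look at these headers from the command line is to have curl dump the response headers and keep only the lines that start with X-Flux-. The sketch below filters by prefix rather than assuming specific header names; `flux_headers` and `flux_inspect` are hypothetical helper names.

```shell
# Keep only the X-Flux-* lines from a raw header dump on stdin,
# stripping the carriage returns that HTTP header lines end with.
flux_headers() {
  grep -i '^x-flux-' | tr -d '\r'
}

# Send a request, discard the body, and print the X-Flux-* headers.
# -D - writes response headers to stdout; -o /dev/null drops the body.
flux_inspect() {
  curl -sS -D - -o /dev/null https://api.fluxrouter.ai/v1/chat/completions \
    -H "Authorization: Bearer $FLUX_API_KEY" \
    -H "Content-Type: application/json" \
    -d '{"model":"flux-auto","messages":[{"role":"user","content":"Summarize this paragraph."}]}' \
    | flux_headers
}

# Usage:
#   flux_inspect
```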
