Vision and image inputs
Send images to the model using the standard OpenAI multimodal message shape with image URLs or base64 data.
Vision means sending an image to the model alongside text so it can describe, read, or reason about the picture. FluxRouter accepts the standard OpenAI multimodal message shape at https://api.fluxrouter.ai/v1/chat/completions, so you pass images exactly as you would when calling OpenAI. The only Flux-specific parts are the base URL and the flux-auto model.
The multimodal message shape
Instead of a plain string, the content of a user message is an array of parts. Each part is either text or an image. An image part uses type: "image_url".
{
  "role": "user",
  "content": [
    { "type": "text", "text": "What is in this image?" },
    { "type": "image_url", "image_url": { "url": "https://example.com/cat.jpg" } }
  ]
}
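Writing the parts array by hand gets repetitive once a message carries more than one image. The following is a minimal sketch of a helper that builds the shape shown above; build_user_message is a hypothetical name used for illustration and is not part of the OpenAI SDK or FluxRouter.

```python
import json


def build_user_message(text, image_urls):
    """Return a user message with one text part plus one image_url part per URL.

    Hypothetical helper for illustration; not part of any SDK.
    """
    content = [{"type": "text", "text": text}]
    for url in image_urls:
        # Each image is its own part; the url may be an https URL or a data: URL.
        content.append({"type": "image_url", "image_url": {"url": url}})
    return {"role": "user", "content": content}


message = build_user_message(
    "What is in this image?",
    ["https://example.com/cat.jpg"],
)
print(json.dumps(message, indent=2))
```

The returned dictionary can be placed directly in the messages list of a chat completions request.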
Send an image by URL
The simplest case: point the model at a public image URL.
from openai import OpenAI

client = OpenAI(
    api_key="sk-...",  # your Flux key
    base_url="https://api.fluxrouter.ai/v1",  # the one line you change
)

response = client.chat.completions.create(
    model="flux-auto",
    messages=[
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "Describe this image in one sentence."},
                {
                    "type": "image_url",
                    "image_url": {"url": "https://example.com/cat.jpg"},
                },
            ],
        }
    ],
)

print(response.choices[0].message.content)
Send an image as base64
For local images, encode the file and pass it as a data: URL. This works the same way; only the url value changes.
import base64

from openai import OpenAI

client = OpenAI(
    api_key="sk-...",  # your Flux key
    base_url="https://api.fluxrouter.ai/v1",
)

# Read the local file as bytes and base64-encode it into an ASCII string.
with open("diagram.png", "rb") as f:
    b64 = base64.b64encode(f.read()).decode("utf-8")

response = client.chat.completions.create(
    model="flux-auto",
    messages=[
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "What does this diagram show?"},
                {
                    "type": "image_url",
                    # The MIME type in the data: URL should match the file format.
                    "image_url": {"url": f"data:image/png;base64,{b64}"},
                },
            ],
        }
    ],
)

print(response.choices[0].message.content)
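The example above hard-codes image/png in the data: URL. If you send files in several formats, the MIME type should change with the file. The sketch below guesses it from the filename using Python's standard mimetypes module; to_data_url and its default_mime fallback are hypothetical names for illustration, and which image formats are accepted depends on the model that serves the request.

```python
import base64
import mimetypes


def to_data_url(filename, data, default_mime="image/png"):
    """Encode raw image bytes as a data: URL.

    The MIME type is guessed from the filename extension; if the guess
    fails or is not an image type, default_mime is used instead.
    Hypothetical helper for illustration.
    """
    mime, _ = mimetypes.guess_type(filename)
    if mime is None or not mime.startswith("image/"):
        mime = default_mime
    b64 = base64.b64encode(data).decode("ascii")
    return f"data:{mime};base64,{b64}"


# Three bytes stand in for real file contents here.
print(to_data_url("photo.jpg", b"\xff\xd8\xff"))
```

With a real file, read the bytes first (for example with open(path, "rb")) and pass the result as the url value of an image_url part.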
curl (image URL)
curl https://api.fluxrouter.ai/v1/chat/completions \
  -H "Authorization: Bearer $FLUX_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "flux-auto",
    "messages": [
      {
        "role": "user",
        "content": [
          { "type": "text", "text": "What is in this image?" },
          { "type": "image_url", "image_url": { "url": "https://example.com/cat.jpg" } }
        ]
      }
    ]
  }'
Make sure your model can see images
Vision is a capability of the underlying model, not of FluxRouter. flux-auto routes to a sensible model for the request, and when your message includes an image the router aims for a vision-capable model, but image support ultimately depends on the model that serves the request.
If your application always sends images, pin a vision-capable model so every request lands on one that supports image inputs. Pass a flux-pinned-* id (for example flux-pinned-claude-sonnet, flux-pinned-gpt-5, or flux-pinned-gemini-3-1-pro) instead of flux-auto. See Models for the full list.
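If only some of your requests carry images, one option is to pin a vision-capable model just for those and leave the rest on flux-auto. The sketch below shows that application-side choice; has_image and choose_model are hypothetical helper names, and the pinned id is simply one of the examples listed above. This is one possible approach, not something FluxRouter requires.

```python
def has_image(messages):
    """Return True if any message carries an image_url content part."""
    for message in messages:
        content = message.get("content")
        # Plain-string content has no parts, so only lists are inspected.
        if isinstance(content, list):
            for part in content:
                if isinstance(part, dict) and part.get("type") == "image_url":
                    return True
    return False


def choose_model(messages, vision_model="flux-pinned-claude-sonnet"):
    """Pick the pinned vision model for image requests, flux-auto otherwise."""
    return vision_model if has_image(messages) else "flux-auto"


text_only = [{"role": "user", "content": "Hello"}]
with_image = [
    {
        "role": "user",
        "content": [
            {"type": "text", "text": "What is in this image?"},
            {"type": "image_url", "image_url": {"url": "https://example.com/cat.jpg"}},
        ],
    }
]
print(choose_model(text_only))
print(choose_model(with_image))
```

Pass the result as the model argument of client.chat.completions.create.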
Notes
- This page covers image inputs (the model reads an image you send). To create images from a text prompt, see Generate images.
- The Anthropic-compatible base at https://api.fluxrouter.ai/anthropic also accepts images, using Anthropic's native image content blocks (a source with type: "base64" or type: "url"), exactly as you would when calling Anthropic directly.
- Multiple images per message are allowed; add more image_url parts to the same content array.
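For the Anthropic-compatible base, the image part looks different from the OpenAI image_url part. The sketch below builds both source variants as plain dictionaries. The field names (media_type, data, url) follow Anthropic's Messages API as best understood here and should be checked against Anthropic's documentation; the helper names are hypothetical.

```python
import base64
import json


def anthropic_image_from_bytes(data, media_type="image/png"):
    """Build an Anthropic-style image block with a base64 source."""
    return {
        "type": "image",
        "source": {
            "type": "base64",
            "media_type": media_type,
            "data": base64.b64encode(data).decode("ascii"),
        },
    }


def anthropic_image_from_url(url):
    """Build an Anthropic-style image block with a url source."""
    return {"type": "image", "source": {"type": "url", "url": url}}


# A user message mixing one image block and one text block.
anthropic_message = {
    "role": "user",
    "content": [
        anthropic_image_from_url("https://example.com/cat.jpg"),
        {"type": "text", "text": "What is in this image?"},
    ],
}
print(json.dumps(anthropic_message, indent=2))
```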