Truncated or empty output

Why FluxRouter responses stop early or come back blank, and how to fix it: max_tokens, streaming handling, and client cut-offs.

A response that stops mid-sentence or comes back empty is almost always a client-side limit or a handling bug, not a routing problem. Work through these in order.

Why does my response stop mid-sentence?

Symptom: The model output is cut off before it finishes.

Cause: max_tokens is too low. The model hits the output limit and stops.

Fix:

Raise max_tokens to give the response room to finish. On the OpenAI chat path this is max_tokens (or max_completion_tokens); on the Anthropic path max_tokens is required.
Check the response finish_reason. A value of length confirms that the token limit truncated the output; a value of stop means the model finished on its own.
```json
{
  "choices": [
    { "finish_reason": "length", "message": { "role": "assistant", "content": "..." } }
  ]
}
```
If you are asking for long output (code files, long documents), set max_tokens well above the length you expect, within the model's output limit.
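
The finish_reason check above can be automated. This is a minimal sketch that assumes the OpenAI chat response shape shown in the JSON example; the helper names and the 8192 ceiling are illustrative, not part of FluxRouter.

```python
def was_truncated(response: dict) -> bool:
    """Return True if the first choice stopped because it hit the token limit."""
    choices = response.get("choices") or []
    if not choices:
        return False
    return choices[0].get("finish_reason") == "length"


def next_max_tokens(current: int, ceiling: int = 8192) -> int:
    """Double the budget for a retry, capped at a ceiling (illustrative value)."""
    return min(current * 2, ceiling)


response = {
    "choices": [
        {"finish_reason": "length", "message": {"role": "assistant", "content": "..."}}
    ]
}

if was_truncated(response):
    print("truncated; retry with max_tokens =", next_max_tokens(1024))
```

Doubling is only one possible retry policy; pick a ceiling that matches the output limit of the model you are calling.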

Why is my streamed response incomplete?

Symptom: Streaming output looks short or drops the end of the message.

Cause: The stream is not being fully consumed, or chunks are being dropped before the [DONE] event.

Fix:

Read the stream to completion. Do not break out of the loop early, and wait for the terminating [DONE] event (OpenAI path) or the message_stop event (Anthropic path).
Accumulate delta content across all chunks rather than reading only the first or last.
Make sure your HTTP client does not buffer the response under a size cap that truncates the stream.
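
As an illustration of reading a stream to completion, the sketch below accumulates delta content from server-sent event lines on the OpenAI path. It assumes each event arrives as a `data: {...}` line carrying `choices[].delta.content`, and that you already have an iterable of decoded lines from your HTTP client; `collect_stream` is a hypothetical helper name.

```python
import json
from typing import Iterable, Tuple


def collect_stream(lines: Iterable[str]) -> Tuple[str, bool]:
    """Join delta content from SSE lines and report whether [DONE] was seen."""
    parts = []
    saw_done = False
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # skip blank separators, comments, and other SSE fields
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":
            saw_done = True  # the terminator: the only valid reason to stop
            break
        chunk = json.loads(payload)
        for choice in chunk.get("choices", []):
            text = (choice.get("delta") or {}).get("content")
            if text:
                parts.append(text)
    return "".join(parts), saw_done


sample = [
    'data: {"choices": [{"delta": {"role": "assistant"}}]}',
    "",
    'data: {"choices": [{"delta": {"content": "Hel"}}]}',
    'data: {"choices": [{"delta": {"content": "lo"}, "finish_reason": null}]}',
    "data: [DONE]",
]

text, done = collect_stream(sample)
print(repr(text), done)
```

If the second return value is False, the stream ended before the terminator arrived, which points at the client or the connection rather than the model.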

Why is the response completely empty?

Symptom: The request succeeds (200) but content is blank.

Cause: Usually one of three things: max_tokens is set so low that the model produced nothing usable, a reasoning-heavy model spent its budget before emitting visible output, or your code is reading the wrong field.

Fix:

Read the right field. OpenAI chat: choices[0].message.content. OpenAI Responses: the output array. Anthropic: content[0].text.
Raise max_tokens if it was set very low (for example under 64).
Check finish_reason. A length finish with empty visible content means the budget was consumed before output was emitted; raise max_tokens.
Confirm the request actually has a user message with non-empty content.
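
The field lookups above might be wrapped in one helper, sketched here. The item and part types for the Responses path (`message`, `output_text`) and the `text` block type on the Anthropic path are assumptions about the upstream formats, and the function joins every text block instead of reading only `content[0]`, on the assumption that a non-text block can come first. `extract_text` and the `api` labels are hypothetical names.

```python
def extract_text(body: dict, api: str) -> str:
    """Return the visible text of a parsed response body, or "" if there is none."""
    if api == "chat":
        choices = body.get("choices") or []
        if not choices:
            return ""
        return (choices[0].get("message") or {}).get("content") or ""
    if api == "responses":
        parts = []
        for item in body.get("output") or []:
            if item.get("type") != "message":
                continue  # non-message items carry no visible text
            for part in item.get("content") or []:
                if part.get("type") == "output_text":
                    parts.append(part.get("text", ""))
        return "".join(parts)
    if api == "anthropic":
        return "".join(
            block.get("text", "")
            for block in body.get("content") or []
            if block.get("type") == "text"
        )
    raise ValueError(f"unknown api: {api}")


chat_body = {"choices": [{"finish_reason": "length", "message": {"content": ""}}]}
if not extract_text(chat_body, "chat"):
    print("empty content; finish_reason =", chat_body["choices"][0]["finish_reason"])
```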

Why does it cut off only in my app, but work in curl?

Symptom: The same prompt returns full output in curl but truncates in your application.

Cause: A client-side read timeout or a response size cap in your HTTP layer is cutting the connection before the full body arrives.

Fix:

Increase your client read timeout. Long generations can take longer than a default 30s timeout.
Prefer streaming for long responses so you receive output incrementally instead of waiting for one large body.
See Latency and timeouts for client timeout settings.
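
For reference, here is a rough sketch of setting an explicit timeout with Python's standard library. The endpoint URL, the Bearer authorization header, and the 120-second value are assumptions for illustration; use your real FluxRouter base URL and credentials.

```python
import json
import urllib.request

# Hypothetical endpoint; substitute your real FluxRouter base URL.
URL = "https://fluxrouter.example/v1/chat/completions"


def build_request(payload: dict, api_key: str, url: str = URL) -> urllib.request.Request:
    """Build a JSON POST request; nothing is sent until send() is called."""
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",  # assumed auth scheme
        },
        method="POST",
    )


def send(req: urllib.request.Request, timeout_s: float = 120.0) -> dict:
    """Send the request with an explicit timeout instead of the library default."""
    # timeout_s bounds each blocking socket operation (connect, read).
    with urllib.request.urlopen(req, timeout=timeout_s) as resp:
        return json.load(resp)


# Usage (not executed here because it needs network access and a real key):
# body = send(build_request({"max_tokens": 2048, "messages": [...]}, api_key="..."))
```

Most HTTP libraries expose an equivalent setting; the point is to set it deliberately so a long generation is not cut off by a default.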
