Mixed Understanding
The request is submitted as an asynchronous task. Use the returned task ID to query GET /v1/tasks/{task_id} for the final result.
Authorizations
All endpoints require Bearer Token authentication. Add to the request header:
Authorization: Bearer YOUR_API_KEY
YOUR_API_KEY is the API Token (sk-... format).
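As a minimal sketch of the header described above, the helper below builds the request headers in Python. The function name `build_headers` is hypothetical, and the `Content-Type` entry is an assumption for JSON request bodies; only the `Authorization: Bearer ...` header comes from this page.

```python
def build_headers(api_key: str) -> dict:
    """Return request headers carrying the Bearer token."""
    if not api_key:
        raise ValueError("api_key must not be empty")
    return {
        # Documented on this page: Bearer Token authentication.
        "Authorization": f"Bearer {api_key}",
        # Assumption: the body is sent as JSON.
        "Content-Type": "application/json",
    }


headers = build_headers("sk-example")  # placeholder key, not a real token
```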
Body
Request body in messages[] form (OpenAI Chat compatible). Apart from the fields listed below, other OpenAI-compatible parameters (temperature, top_p, stop, frequency_penalty, etc.) are used per the OpenAI Chat spec.
"claude-opus-4-7"
OpenAI Chat format messages array. messages[*].content may be a string or an array; array element type ∈ {text, image_url, video_url, audio_url, file_url}. A type the model does not support returns 422 model_not_support_capability.
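The content rules above could be captured in a small helper, sketched here in Python. `build_user_message` and `ATTACHMENT_TYPES` are hypothetical names; the element layout (`{"type": T, T: {"url": ...}}`) follows the shapes shown on this page.

```python
ATTACHMENT_TYPES = ("image_url", "video_url", "audio_url", "file_url")


def build_user_message(text, attachments=()):
    """Build one user message.

    attachments is an iterable of (type, url) pairs. With no attachments
    the simplest form (string content) is returned; otherwise the array form.
    """
    attachments = list(attachments)
    if not attachments:
        return {"role": "user", "content": text}
    content = [{"type": "text", "text": text}]
    for kind, url in attachments:
        if kind not in ATTACHMENT_TYPES:
            raise ValueError(f"unsupported content type: {kind}")
        # The key holding the URL object has the same name as the type.
        content.append({"type": kind, kind: {"url": url}})
    return {"role": "user", "content": content}
```

This only checks the type names listed here. Whether a given model accepts a type is decided server-side, which is where the 422 model_not_support_capability response comes from.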
Typical content shapes:
- Plain text (string content, simplest form):
[{"role": "user", "content": "count 1 to 3"}]
- Plain text (array content, the unified format when mixing with multimodal):
[{"role": "user", "content": [{"type": "text", "text": "count 1 to 3"}]}]
- text + image:
[{"role": "user", "content": [
{"type": "text", "text": "describe this image"},
{"type": "image_url", "image_url": {"url": "https://example.com/x.png"}}
]}]
- text + video:
[{"role": "user", "content": [
{"type": "text", "text": "summarize this video"},
{"type": "video_url", "video_url": {"url": "https://example.com/clip.mp4"}}
]}]
- text + file:
[{"role": "user", "content": [
{"type": "text", "text": "extract key points"},
{"type": "file_url", "file_url": {"url": "https://example.com/doc.pdf"}}
]}]
- Multi-turn conversation (system + multiple user/assistant turns):
[
{"role": "system", "content": "You are a terse assistant."},
{"role": "user", "content": "1+1?"},
{"role": "assistant", "content": "2"},
{"role": "user", "content": "3+3?"}
]
- Mixed attachments of all types (a single user message containing 2 each of image_url/video_url/audio_url/file_url):
[{"role": "user", "content": [
{"type": "text", "text": "Summarize the key information from the following images, videos, audio, and documents"},
{"type": "image_url", "image_url": {"url": "https://example.com/image-1.png"}},
{"type": "image_url", "image_url": {"url": "https://example.com/image-2.jpg"}},
{"type": "video_url", "video_url": {"url": "https://example.com/clip-1.mp4"}},
{"type": "video_url", "video_url": {"url": "https://example.com/clip-2.mp4"}},
{"type": "audio_url", "audio_url": {"url": "https://example.com/audio-1.mp3"}},
{"type": "audio_url", "audio_url": {"url": "https://example.com/audio-2.wav"}},
{"type": "file_url", "file_url": {"url": "https://example.com/doc-1.pdf"}},
{"type": "file_url", "file_url": {"url": "https://example.com/doc-2.docx"}}
]}]
[
{ "role": "user", "content": "count 1 to 3" }
]
Whether to stream.
Behavior differences:
| Value | Submit response stream field | SSE endpoint |
|---|---|---|
| false | null | Not available |
| true | {"url": "/v1/llm/generations/{task_id}/stream"} | Available; meanwhile task.data accumulates the full response |
false
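A client might use the stream field of the submit response roughly as follows. This is a sketch: `resolve_stream_url` is a hypothetical helper, and the base URL `https://api.example.com` is a placeholder, since this page only gives the relative path.

```python
from urllib.parse import urljoin


def resolve_stream_url(submit_response: dict, base_url: str):
    """Return the absolute SSE URL, or None when streaming was not requested."""
    stream = submit_response.get("stream")
    if stream is None:
        # stream=false: no SSE endpoint; fetch the result from the task instead.
        return None
    # stream=true: the field carries a relative URL to the SSE endpoint.
    return urljoin(base_url, stream["url"])


example = {"stream": {"url": "/v1/llm/generations/task-llm-1776874565-yq3szvcu/stream"}}
sse_url = resolve_stream_url(example, "https://api.example.com")  # placeholder host
```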
Maximum number of tokens to generate. Requirements vary by model family: required for claude-* models; usually at least 16 for gpt-* models; optional for gemini-* models.
64
Sampling temperature.
Nucleus sampling.
Stop sequences.
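The body fields above could be assembled as in the following sketch. The parameter names `model` and `max_tokens` are assumptions based on the OpenAI Chat convention, since field names are not spelled out next to each description here; `build_request_body` is a hypothetical helper. It enforces only the one constraint stated as mandatory (the token limit for claude-* models).

```python
def build_request_body(model, messages, *, stream=False, max_tokens=None,
                       temperature=None, top_p=None, stop=None):
    """Assemble a request body, omitting optional fields that are unset."""
    # Family-level constraint from this page: claude-* needs a token limit.
    if model.startswith("claude-") and max_tokens is None:
        raise ValueError("max_tokens is required for claude-* models")
    body = {"model": model, "messages": messages, "stream": stream}
    optional = {
        "max_tokens": max_tokens,    # assumed field name (OpenAI convention)
        "temperature": temperature,
        "top_p": top_p,
        "stop": stop,
    }
    body.update({k: v for k, v in optional.items() if v is not None})
    return body


body = build_request_body(
    "claude-opus-4-7",
    [{"role": "user", "content": "count 1 to 3"}],
    max_tokens=64,
)
```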
Response
Task created
Submit response, conforming to the unified task standard shape. results and error are always null at submit time; they are populated in the GET /v1/tasks/{task_id} response once the task completes or fails.
Task ID, formatted as task-llm-{timestamp}-{8random}. Use it to query GET /v1/tasks/{task_id} or to subscribe to the SSE stream at GET /v1/llm/generations/{task_id}/stream.
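A client that wants to sanity-check a task ID against the format above could do so as sketched below. The character set of the random suffix (lowercase letters and digits) is an assumption inferred from the example value on this page, and `parse_task_id` is a hypothetical helper.

```python
import re

# Assumption: the 8-character suffix is lowercase alphanumeric.
TASK_ID_RE = re.compile(r"^task-llm-(\d+)-([a-z0-9]{8})$")


def parse_task_id(task_id: str):
    """Split a task ID into (timestamp, random_suffix) or raise ValueError."""
    match = TASK_ID_RE.match(task_id)
    if match is None:
        raise ValueError(f"unexpected task ID format: {task_id!r}")
    return int(match.group(1)), match.group(2)


timestamp, suffix = parse_task_id("task-llm-1776874565-yq3szvcu")
```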
"task-llm-1776874565-yq3szvcu"
Object type, fixed at llm.generation.task
"llm.generation.task"
Media type, fixed at llm
"llm"
The model name submitted by the client (echoed verbatim)
"claude-opus-4-7"
Task status, fixed at pending during submit
"pending"
Progress 0-100, fixed at 0 during submit
0
Creation time (Unix seconds)
1776874565
{url: ...} when stream=true; null when stream=false. The client uses this field to decide whether to connect to the SSE endpoint.
Fixed at null during submit. After the task completes, GET /v1/tasks/{task_id} returns the results, where results[0] is the full OpenAI ChatCompletion response.
Known limitation: a thinking model's reasoning content (reasoning_content) appears only in the SSE stream's delta and is not accumulated into results[0].message.content.
null
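Because of the limitation above, a client that needs the reasoning text has to collect it from the stream itself. The sketch below shows one way to do that. It assumes OpenAI-style SSE framing (`data: {json}` lines ending with a `data: [DONE]` sentinel) and the OpenAI chunk shape (`choices[*].delta`); neither is confirmed on this page, and `collect_stream_text` is a hypothetical helper.

```python
import json


def collect_stream_text(sse_lines):
    """Accumulate (reasoning, answer) text from an iterable of SSE lines."""
    reasoning, answer = [], []
    for line in sse_lines:
        line = line.strip()
        if not line.startswith("data:"):
            continue  # skip blank lines, comments, and other SSE fields
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":  # assumed end-of-stream sentinel
            break
        chunk = json.loads(payload)
        for choice in chunk.get("choices", []):
            delta = choice.get("delta", {})
            reasoning.append(delta.get("reasoning_content") or "")
            answer.append(delta.get("content") or "")
    return "".join(reasoning), "".join(answer)
```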
Fixed at null during submit; returned via GET /v1/tasks/{task_id} when the task fails.
null
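Once a task object has been fetched from GET /v1/tasks/{task_id}, the results and error fields could be handled as in this sketch. It assumes the standard OpenAI ChatCompletion layout (`choices[0].message.content`) inside results[0]; `extract_answer` and `TaskFailed` are hypothetical names, and the shape of the error value is not specified here, so it is simply stringified.

```python
class TaskFailed(Exception):
    """Raised when the task object carries a non-null error."""


def extract_answer(task: dict):
    """Return the answer text, or None while results are still null."""
    if task.get("error") is not None:
        raise TaskFailed(str(task["error"]))
    results = task.get("results")
    if not results:
        return None  # results is null until the task completes
    # Assumption: results[0] follows the OpenAI ChatCompletion layout.
    return results[0]["choices"][0]["message"]["content"]
```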