OmniHuman 1.5 Digital Human Video Generation
- OmniHuman 1.5 audio-driven digital human video generation model
- Takes a portrait image and an audio file and generates a synchronized video of the character speaking, singing, or performing
- Supports turbo mode
- Asynchronous processing mode; use the returned task ID to query the task status
- Generated video links are valid for 24 hours; please save them promptly
Authorizations
All endpoints require Bearer Token authentication
Add the following to your request headers:
Authorization: Bearer YOUR_API_KEY
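As a minimal sketch of the header described above, the helper below builds the request headers in Python. The function name `build_auth_headers` is hypothetical, and the `Content-Type` value is an assumption (the body is presumed to be JSON); only the `Authorization: Bearer ...` line comes from this page.

```python
def build_auth_headers(api_key: str) -> dict:
    """Return request headers carrying the Bearer token."""
    if not api_key or not api_key.strip():
        raise ValueError("api_key must be a non-empty string")
    return {
        # Documented on this page: Authorization: Bearer YOUR_API_KEY
        "Authorization": f"Bearer {api_key.strip()}",
        # Assumption: the request body is sent as JSON.
        "Content-Type": "application/json",
    }
```

The returned dictionary can be passed as the headers argument of whichever HTTP client is in use.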
Body
omnihuman-1.5: Audio-driven digital human video
"omnihuman-1.5"
Public URL of the portrait image
Notes:
- Formats: JPG / PNG / WebP / GIF / AVIF
- File size up to 5MB
- Must be a publicly accessible URL
- The image must contain a human face
"https://example.com/portrait.jpg"
Public URL of the audio file
Notes:
- Formats: MP3 / WAV / OGG / M4A / AAC
- Must be a publicly accessible URL
- Audio duration determines output video duration and billing
- Duration limit: up to 30 seconds (up to 60 seconds in 720p mode)
"https://example.com/speech.mp3"
Text guidance to control character behavior and actions
Notes:
- Supports Chinese, English, Japanese, Korean, Spanish, and Indonesian
- Descriptions of dynamic actions work best; avoid restating static attributes already visible in the image
"The person speaks calmly to the camera"
Turbo mode
Notes:
- Enables faster generation with slight quality loss
- Suitable for quick iteration and previewing
Do not pass this parameter unless necessary.
true
Advanced options for additional control
Do not pass this parameter unless necessary.
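The body parameters above could be assembled and pre-checked roughly as follows. This is a sketch under stated assumptions, not the API's own schema: the field names `model`, `image_url`, `audio_url`, `prompt`, and `turbo` are hypothetical placeholders (the parameter names are not shown on this page), and the `is_720p` flag is a hypothetical stand-in for however 720p mode is selected. The extension check is only a heuristic for the documented formats; it does not verify file size, public accessibility, or the presence of a face.

```python
import posixpath
from typing import Optional
from urllib.parse import urlparse

# Documented formats; ".jpeg" is added as a common alias for JPG.
IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".png", ".webp", ".gif", ".avif"}
AUDIO_EXTENSIONS = {".mp3", ".wav", ".ogg", ".m4a", ".aac"}


def _extension(url: str) -> str:
    """Return the lowercased file extension of the URL path, or '' if none."""
    return posixpath.splitext(urlparse(url).path)[1].lower()


def build_request_body(
    image_url: str,
    audio_url: str,
    audio_seconds: float,
    prompt: Optional[str] = None,
    turbo: Optional[bool] = None,
    is_720p: bool = False,  # hypothetical flag, see lead-in
) -> dict:
    """Validate inputs against the documented limits and build the body."""
    if _extension(image_url) not in IMAGE_EXTENSIONS:
        raise ValueError("image must be JPG, PNG, WebP, GIF, or AVIF")
    if _extension(audio_url) not in AUDIO_EXTENSIONS:
        raise ValueError("audio must be MP3, WAV, OGG, M4A, or AAC")

    # Documented limit: 30 seconds, or 60 seconds in 720p mode.
    limit = 60 if is_720p else 30
    if not 0 < audio_seconds <= limit:
        raise ValueError(f"audio duration must be in (0, {limit}] seconds")

    # Field names below are hypothetical placeholders.
    body = {
        "model": "omnihuman-1.5",
        "image_url": image_url,
        "audio_url": audio_url,
    }
    if prompt:
        body["prompt"] = prompt
    # Optional parameters are omitted unless explicitly requested,
    # following the "do not pass unless necessary" note.
    if turbo is not None:
        body["turbo"] = turbo
    return body
```

Leaving `turbo` at its default of `None` keeps the key out of the body entirely, which matches the guidance not to pass optional parameters unless they are needed.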
Response
Task created successfully
Task creation timestamp
1757165031
Task ID
"task-unified-1757165031-uyujaw3d"
Actual model name used
Specific task type
video.generation.task
Task progress percentage (0-100)
0 <= x <= 100
0
Task status
pending, processing, completed, failed
"pending"
Asynchronous task information
Task output type
video
"video"
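Because processing is asynchronous, a client typically polls the task until it reaches a final status. The sketch below shows one way to do that. The status-query endpoint is not described on this page, so the caller supplies `fetch_task`, a hypothetical callable that takes a task ID and returns the task as a dictionary; the `status` key is likewise an assumption. Only the four status values come from this page.

```python
import time
from typing import Callable

IN_PROGRESS_STATUSES = {"pending", "processing"}
FINAL_STATUSES = {"completed", "failed"}


def wait_for_task(
    fetch_task: Callable[[str], dict],  # hypothetical: queries the task by ID
    task_id: str,
    interval_seconds: float = 5.0,
    max_attempts: int = 60,
    sleep: Callable[[float], None] = time.sleep,
) -> dict:
    """Poll until the task is completed or failed, then return it."""
    for attempt in range(max_attempts):
        task = fetch_task(task_id)
        status = task.get("status")  # assumed key name
        if status in FINAL_STATUSES:
            return task
        if status not in IN_PROGRESS_STATUSES:
            raise ValueError(f"unexpected task status: {status!r}")
        # Wait before the next query, but not after the last attempt.
        if attempt < max_attempts - 1:
            sleep(interval_seconds)
    raise TimeoutError(f"task {task_id} did not finish in {max_attempts} attempts")
```

The function returns on `failed` as well as `completed`, so the caller should inspect the returned status before reading any output. Since generated video links expire after 24 hours, downloading the result as soon as the task completes is the safer pattern.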