POST /v1/video/talking-avatar
Talking Avatar
curl --request POST \
  --url https://api.domoai.com/v1/video/talking-avatar \
  --header 'Authorization: Bearer <token>' \
  --header 'Content-Type: application/json' \
  --data '
{
  "prompt": "natural talking expression",
  "image": {
    "bytes_base64_encoded": "/9j/4AAQSkZJRgABAQAAAQABAAD/2w..."
  },
  "audio": {
    "bytes_base64_encoded": "SUQzAgAAAAAPdlRDTQAACABzdW1tZX..."
  },
  "callback_url": "https://example.com/callback",
  "seconds": 5
}
'

Example response:
{
  "data": {
    "task_id": "3c90c3cc-0d44-4b50-8888-8dd25736052a"
  },
  "code": 0
}
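
The curl call above can also be issued from Python. The following is a minimal sketch using only the standard library; the function names `build_request` and `create_task` are hypothetical helpers, not part of any DomoAI SDK, and the endpoint URL and headers are taken from the curl example.

```python
import json
import urllib.request

# Endpoint taken from the curl example above.
API_URL = "https://api.domoai.com/v1/video/talking-avatar"


def build_request(token: str, payload: dict) -> urllib.request.Request:
    """Build (but do not send) the POST request shown in the curl example."""
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        API_URL,
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
    )


def create_task(token: str, payload: dict, timeout: float = 30.0) -> dict:
    """Send the request and return the decoded JSON response body."""
    req = build_request(token, payload)
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

Keeping request construction separate from sending makes it possible to inspect headers and body without touching the network.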

Authorizations

Authorization
string
header
required

API Key Bearer Token

Body

application/json

Input schema for creating a talking avatar task

audio
AudioInput · object
required

Input audio that drives the avatar animation. Accepts base64-encoded audio, a URL, or a domoai_uri returned by the file upload API.

Examples:
{
  "bytes_base64_encoded": "SUQzAgAAAAAPdlRDTQAACABzdW1tZX..."
}
{
  "domoai_uri": "domoai://eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJvcmdhbml6YXRpb25faWQiOiIwNjhmZWViMy1hYmVkLTcyYTItODAwMC1hZDM1ZTg0ZGIxNDAiLCJ1cGxvYWRfYnVja2V0IjoiZW50LWFwaS10ZXN0LTEzMzYyODM0MDgiLCJ1cGxvYWRfa2V5IjoiZXBoZW1lcmFsLXVwbG9hZHMvMDY4ZmVlYjMtYWJlZC03MmEyLTgwMDAtYWQzNWU4NGRiMTQwLzI1YjhmOTgwLWM2MzEtNGQ1NC05Y2VhLTU0ZGFiYjhiMjYwNy9maWxlLm1wMyIsInR5cGUiOiJlcGhlbWVyYWwiLCJjb250ZW50X3R5cGUiOiJhdWRpby9tcGVnIiwiZmlsZV9zaXplIjoxMjU2MzIsImlhdCI6MTc2ODM4NzA3MCwiZXhwIjoxNzY4NDczNDcwLCJpc3MiOiJodHRwOi8vemhtLWFwaS5mcnAtZGV2LmRvbW8uY29vbC8ifQ.E60fIHPeDtEE0NmKaPGRYtyQJtaEQI9I0aNEwgtsLrg"
}
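
As a rough illustration of the two example shapes above, the sketch below builds a media input object from raw bytes or from a domoai_uri. The helper names `media_from_bytes` and `media_from_uri` are hypothetical; only the field names `bytes_base64_encoded` and `domoai_uri` come from the examples. The URL variant is omitted because its field name is not shown here.

```python
import base64


def media_from_bytes(raw: bytes) -> dict:
    """Wrap raw file bytes as a base64-encoded media input object."""
    encoded = base64.b64encode(raw).decode("ascii")
    return {"bytes_base64_encoded": encoded}


def media_from_uri(domoai_uri: str) -> dict:
    """Wrap a domoai_uri (as returned by the file upload API) as a media input."""
    # Assumption: URIs always use the domoai:// scheme seen in the examples.
    if not domoai_uri.startswith("domoai://"):
        raise ValueError("expected a URI starting with 'domoai://'")
    return {"domoai_uri": domoai_uri}
```

For a real file, something like `media_from_bytes(open("speech.mp3", "rb").read())` would produce the `audio` value; `speech.mp3` is a placeholder file name.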
seconds
integer
required

Output video duration in seconds.

Required range: 1 <= x <= 60
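
A client can check the documented range before sending the request. This is a small sketch; `validate_seconds` is a hypothetical helper, and the bounds are the ones stated above.

```python
def validate_seconds(seconds: int) -> int:
    """Return seconds unchanged if it is an integer in the range 1..60."""
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(seconds, bool) or not isinstance(seconds, int):
        raise TypeError("seconds must be an integer")
    if not 1 <= seconds <= 60:
        raise ValueError("seconds must satisfy 1 <= seconds <= 60")
    return seconds
```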
callback_url
string<uri> | null

Callback URL for task result notifications. If set, the server sends a notification to this URL whenever the task status changes. The notification message schema is described in the Callback Protocol section.

Required string length: 1 - 2083
Example:

"https://example.com/callback"

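The length constraint above can be checked on the client side. In the sketch below, `validate_callback_url` is a hypothetical helper; the requirement of an http or https scheme with a host is an assumption, since the schema only states `string<uri>`.

```python
from typing import Optional
from urllib.parse import urlsplit


def validate_callback_url(url: Optional[str]) -> Optional[str]:
    """Validate an optional callback URL against the documented length limits."""
    if url is None:
        return None  # the field is nullable
    if not 1 <= len(url) <= 2083:
        raise ValueError("callback_url length must be between 1 and 2083")
    parts = urlsplit(url)
    # Assumption: only http(s) URLs with a host are useful as callbacks.
    if parts.scheme not in ("http", "https") or not parts.netloc:
        raise ValueError("callback_url must be an absolute http(s) URL")
    return url
```
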
prompt
string
default:""

Optional prompt for generation guidance.

Maximum string length: 2000
image
ImageInput · object

Input image of the avatar. Accepts a base64-encoded image or a URL. Either image or video must be provided.

Example:
{
  "domoai_uri": "domoai://eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJvcmdhbml6YXRpb25faWQiOiIwNjhmZWViMy1hYmVkLTcyYTItODAwMC1hZDM1ZTg0ZGIxNDAiLCJ1cGxvYWRfYnVja2V0IjoiZW50LWFwaS10ZXN0LTEzMzYyODM0MDgiLCJ1cGxvYWRfa2V5IjoiZXBoZW1lcmFsLXVwbG9hZHMvMDY4ZmVlYjMtYWJlZC03MmEyLTgwMDAtYWQzNWU4NGRiMTQwLzI1YjhmOTgwLWM2MzEtNGQ1NC05Y2VhLTU0ZGFiYjhiMjYwNy9maWxlLm1wMyIsInR5cGUiOiJlcGhlbWVyYWwiLCJjb250ZW50X3R5cGUiOiJhdWRpby9tcGVnIiwiZmlsZV9zaXplIjoxMjU2MzIsImlhdCI6MTc2ODM4NzA3MCwiZXhwIjoxNzY4NDczNDcwLCJpc3MiOiJodHRwOi8vemhtLWFwaS5mcnAtZGV2LmRvbW8uY29vbC8ifQ.E60fIHPeDtEE0NmKaPGRYtyQJtaEQI9I0aNEwgtsLrg"
}
video
VideoInput · object

Input video of the avatar. Accepts base64-encoded video, a URL, or a domoai_uri returned by the file upload API. Either image or video must be provided.

Example:
{
  "domoai_uri": "domoai://eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJvcmdhbml6YXRpb25faWQiOiIwNjhmZWViMy1hYmVkLTcyYTItODAwMC1hZDM1ZTg0ZGIxNDAiLCJ1cGxvYWRfYnVja2V0IjoiZW50LWFwaS10ZXN0LTEzMzYyODM0MDgiLCJ1cGxvYWRfa2V5IjoiZXBoZW1lcmFsLXVwbG9hZHMvMDY4ZmVlYjMtYWJlZC03MmEyLTgwMDAtYWQzNWU4NGRiMTQwLzI1YjhmOTgwLWM2MzEtNGQ1NC05Y2VhLTU0ZGFiYjhiMjYwNy9maWxlLm1wMyIsInR5cGUiOiJlcGhlbWVyYWwiLCJjb250ZW50X3R5cGUiOiJhdWRpby9tcGVnIiwiZmlsZV9zaXplIjoxMjU2MzIsImlhdCI6MTc2ODM4NzA3MCwiZXhwIjoxNzY4NDczNDcwLCJpc3MiOiJodHRwOi8vemhtLWFwaS5mcnAtZGV2LmRvbW8uY29vbC8ifQ.E60fIHPeDtEE0NmKaPGRYtyQJtaEQI9I0aNEwgtsLrg"
}
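
The rule that either image or video must be provided can be enforced before the request is sent. A minimal sketch follows; `avatar_sources` is a hypothetical helper. Whether the API accepts both fields at once is not stated here, so the sketch only rejects the case where neither is present.

```python
def avatar_sources(payload: dict) -> tuple:
    """Return the avatar source fields present in payload, requiring at least one."""
    present = tuple(
        name for name in ("image", "video") if payload.get(name) is not None
    )
    if not present:
        raise ValueError("either 'image' or 'video' must be provided")
    return present
```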
aspect_ratio
enum<string> | null

Output video aspect ratio. If null or omitted, the system automatically detects the closest matching ratio from the input image or video.

Available options: 16:9, 9:16, 1:1, 4:3, 3:4
Example:

"16:9"

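To set aspect_ratio explicitly rather than rely on automatic detection, a client could pick the nearest supported option from the input dimensions. The sketch below is only an illustration: `closest_aspect_ratio` is a hypothetical helper, and comparing ratios on a logarithmic scale is an assumption, not necessarily how the server chooses.

```python
import math

# Supported options listed above, mapped to width / height.
ASPECT_RATIOS = {
    "16:9": 16 / 9,
    "9:16": 9 / 16,
    "1:1": 1.0,
    "4:3": 4 / 3,
    "3:4": 3 / 4,
}


def closest_aspect_ratio(width: int, height: int) -> str:
    """Return the supported aspect ratio nearest to width:height."""
    if width <= 0 or height <= 0:
        raise ValueError("width and height must be positive")
    target = width / height
    # The log of the quotient treats "twice as wide" and "twice as tall" alike.
    return min(
        ASPECT_RATIOS,
        key=lambda name: abs(math.log(ASPECT_RATIOS[name] / target)),
    )
```
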
model
enum<string>
default:talking-avatar-v1

Model version to use for generation.

Available options:
talking-avatar-v1
Example:

"talking-avatar-v1"

Response

Successful Response

data
TaskOut · object
required
code
integer
default:0
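
A caller typically needs the task_id from the response body. The following sketch reads it from the documented shape; `extract_task_id` is a hypothetical helper, and treating any non-zero code as a failure is an assumption based on the example response, where code is 0.

```python
def extract_task_id(response: dict) -> str:
    """Return data.task_id from a decoded response body."""
    code = response.get("code", 0)  # the schema lists 0 as the default
    # Assumption: a non-zero code signals an error.
    if code != 0:
        raise RuntimeError(f"request failed with code {code}")
    data = response.get("data")
    if not isinstance(data, dict) or "task_id" not in data:
        raise ValueError("response is missing data.task_id")
    return data["task_id"]
```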