Developer API
Integrate Addavox localization into your workflow. Choose the Translation API for standalone translated audio, or the Video Dubbing API for full video localization with timing, QA, and subtitles.
API Keys
Generate API keys in the Addavox app. Each key uses the same plan and included minutes as your account.
Base URL:
https://api.addavox.com/api/v1
Auth header:
X-API-Key: YOUR_KEY
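As a minimal sketch, the key can be attached to a request with Python's standard library alone. The base URL and header name are the ones given above, and the `/account` path comes from the endpoint list in the API Reference. The helper name `build_request` is hypothetical.

```python
import urllib.request

BASE_URL = "https://api.addavox.com/api/v1"


def build_request(path: str, api_key: str) -> urllib.request.Request:
    """Return a GET request for `path` with the X-API-Key header attached."""
    return urllib.request.Request(
        BASE_URL + path,
        headers={"X-API-Key": api_key},
        method="GET",
    )


if __name__ == "__main__":
    req = build_request("/account", "YOUR_KEY")
    # Sending the request needs a valid key; uncomment to perform the call.
    # with urllib.request.urlopen(req) as resp:
    #     print(resp.status, resp.read().decode())
    print(req.get_method(), req.full_url)
```

Building the request separately from sending it keeps the key handling in one place and makes the helper easy to check without network access.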
Service Overview
Two APIs for different localization needs.
| | Translation API | Video Dubbing API |
|---|---|---|
| Endpoint | POST /api/v1/localize | POST /api/v1/localize-video |
| Output | Standalone sequential audio | Video + audio + subtitles per language |
| Languages | Single per request | Multiple per request |
| QA | None | LLM QA included |
| Review | Not listed | Invite reviewers via magic link |
| Price | From $0.05/min | From $1.60/min per language |

Business annual rate.
Translation API
Single endpoint: POST /api/v1/localize — generate standalone translated audio from your source content.
Web app vs this API: when you use the Addavox product in the browser, segment editing and localization jobs keep preview audio on each segment and defer the full mixed program until you download assets from the project page. These v1 endpoints build the stitched output as part of the job so API clients can fetch results without a separate export step.
| Tier | You provide | We handle | Price (Business plan rate) |
|---|---|---|---|
| Transcribe + Translate + Voice Audio | Source audio; source + target language | Transcription; translation; speaker-matched voice audio; sequential stitching | From $0.08/min |
| Translate + Voice Audio | Source audio; transcription with timestamps by segment; source + target language | Translation; speaker-matched voice audio; sequential stitching | From $0.06/min |
| Voice Audio Only | Source audio; transcription + translation with timestamps by segment | Speaker-matched voice audio; sequential stitching | From $0.05/min |
How It Works
Output Format
The API produces a standalone sequential audio file. Each translated segment is generated and stitched together in order with brief pauses between them. The output is not time-aligned to the original source — it is a new audio file intended to be listened to independently.
Why provide source audio and timestamps?
The source audio and segment timestamps (start_time, end_time) are used to identify and match each speaker's voice in the original recording. This allows the generated audio to sound like the original speakers. These timestamps do not control the output timing.
No QA or duration matching
Unlike Video Dubbing, the Translation API does not perform text rewriting, tempo adjustment, or timing alignment. This is what makes it faster and more affordable. If you need audio synced to the original video with frame-level timing, use the Video Dubbing API.
Code Examples
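The following is a sketch of a request body for the tier in which you supply both transcription and translation. `start_time`, `end_time`, `mode`, and the consent fields are named in this document; every other field name (`audio_url`, `source_language`, `target_language`, `segments`, `text`, `translated_text`) and the `consent` key itself are assumptions to verify against the OpenAPI schema.

```python
import json
from datetime import datetime, timezone


def build_localize_payload(audio_url, source_language, target_language,
                           segments, attested_by):
    """Assemble a JSON-serializable body for POST /api/v1/localize."""
    return {
        "audio_url": audio_url,              # assumed field name
        "source_language": source_language,  # assumed field name
        "target_language": target_language,  # assumed field name
        "segments": segments,                # assumed field name
        "mode": "standard",
        "consent": {
            # Set these to True only if they hold for your content.
            "content_rights_confirmed": True,
            "eula_accepted": True,
            "attested_by": attested_by,
            "attested_at": datetime.now(timezone.utc).isoformat(),
        },
    }


if __name__ == "__main__":
    segments = [
        # start_time / end_time locate the speaker in the source audio;
        # they do not control output timing.
        {"start_time": 0.0, "end_time": 2.4,
         "text": "Hello and welcome.",                   # assumed key
         "translated_text": "Hola y bienvenidos."},      # assumed key
    ]
    body = build_localize_payload(
        "https://example.com/source.mp3", "en", "es", segments, "ops@example.com"
    )
    print(json.dumps(body, indent=2))
```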
Video Dubbing API
Endpoint: POST /api/v1/localize-video — full video localization with timing, QA, and subtitles.
Jobs started through this API produce per-language deliverables when work completes. The in-browser Addavox app uses the same localization engine but defers full mixed assets to the project Download flow unless noted otherwise.
You provide
- Video URL
- Source language
- Target languages (one or more)
We handle
- Audio/video separation
- Transcription
- Translation
- LLM QA
- Voice synthesis
- Timing alignment
- Subtitle generation
- Video rendering per language
Output Format
Per-language video + audio + subtitles. A zip download containing all languages is also available. Signed URLs expire after 24 hours.
Multi-Language Jobs
One request, multiple languages. The master job spawns child jobs per language. Poll the master job for per-language status. Status values: queued → running → completed | failed.
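A polling loop over the status values above might look like the sketch below. The fetch function is injected so the loop stays independent of any HTTP client; in practice it would call `GET /jobs/{job_id}` with the `X-API-Key` header. The `status` key in the response and the helper name `poll_job` are assumptions, since the response shape is not shown here.

```python
import time
from typing import Callable

TERMINAL = {"completed", "failed"}


def poll_job(fetch_status: Callable[[str], dict], job_id: str,
             interval_s: float = 5.0, max_polls: int = 120) -> dict:
    """Poll until the job reports a terminal status or max_polls is reached."""
    for _ in range(max_polls):
        job = fetch_status(job_id)
        if job.get("status") in TERMINAL:  # "status" key is an assumption
            return job
        time.sleep(interval_s)
    raise TimeoutError(f"job {job_id} did not finish after {max_polls} polls")


if __name__ == "__main__":
    # Stand-in for a real HTTP call, walking through the documented states.
    states = iter(["queued", "running", "completed"])
    final = poll_job(lambda _id: {"status": next(states)}, "job-123", interval_s=0)
    print(final["status"])
```

Capping the number of polls avoids waiting forever on a job that never reaches `completed` or `failed`.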
Reviewer Workflow
Invite human reviewers via the API. Each reviewer gets a magic link email to edit localization in a web editor — no account needed.
Code Examples
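As a sketch, a multi-language dubbing request body could be assembled as follows. The inputs (video URL, source language, one or more target languages) are the ones listed under "You provide"; the JSON field names `video_url`, `source_language`, `target_languages`, and `consent` are assumptions, as is the helper name.

```python
def build_video_payload(video_url: str, source_language: str,
                        target_languages: list, consent: dict,
                        mode: str = "standard") -> dict:
    """Assemble a body for POST /api/v1/localize-video."""
    # Drop duplicate languages while keeping the caller's order.
    targets = list(dict.fromkeys(target_languages))
    if not targets:
        raise ValueError("at least one target language is required")
    return {
        "video_url": video_url,              # assumed field name
        "source_language": source_language,  # assumed field name
        "target_languages": targets,         # assumed field name
        "mode": mode,
        "consent": consent,
    }


if __name__ == "__main__":
    body = build_video_payload(
        "https://example.com/talk.mp4", "en", ["es", "fr", "es"],
        consent={"content_rights_confirmed": True, "eula_accepted": True,
                 "attested_by": "ops@example.com",
                 "attested_at": "2024-01-02T03:04:05+00:00"},
    )
    print(body["target_languages"])
```

Because pricing is per language, removing accidental duplicates before submitting is a cheap safeguard.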
Consent & Authorization
All API requests must include a consent object and a top-level mode field. Together these create a per-job attestation record confirming you hold the necessary rights and speaker consents.
The mode field determines voice synthesis: "voice_matched" uses speaker voice cloning, "standard" uses synthetic TTS. Both modes are the same price — the choice is purely consent-driven.
Voice-Matched Mode — Full Consent Required
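A sketch of the fields a voice-matched request would carry, built from the Field Reference. Nesting the attestation under a key named `consent` is an assumption, and the helper name is hypothetical.

```python
from datetime import datetime, timezone


def voice_matched_fields(attested_by: str) -> dict:
    """Return the mode and consent fields to merge into a request body."""
    return {
        "mode": "voice_matched",
        "consent": {
            # Only set these to True when they are actually the case.
            "speaker_consent_obtained": True,
            "content_rights_confirmed": True,
            "eula_accepted": True,
            "attested_by": attested_by,
            # Generated at call time so it falls inside the 24-hour window.
            "attested_at": datetime.now(timezone.utc).isoformat(timespec="seconds"),
        },
    }


if __name__ == "__main__":
    print(voice_matched_fields("legal@example.com"))
```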
Standard Mode — Content Rights Only
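In standard mode the speaker-consent flag is not required, so a sketch of the same fields omits it. As before, the `consent` key name and the helper name are assumptions; the field names come from the Field Reference.

```python
import json
from datetime import datetime, timezone


def standard_mode_fields(attested_by: str, attested_at: datetime) -> dict:
    """Return the mode and consent fields for a standard (synthetic TTS) job."""
    return {
        "mode": "standard",
        "consent": {
            "content_rights_confirmed": True,
            "eula_accepted": True,
            "attested_by": attested_by,
            # Expects a timezone-aware datetime; serialized as ISO 8601 in UTC.
            "attested_at": attested_at.astimezone(timezone.utc).isoformat(timespec="seconds"),
        },
    }


if __name__ == "__main__":
    print(json.dumps(standard_mode_fields("ops@example.com",
                                          datetime.now(timezone.utc)), indent=2))
```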
Field Reference
| Field | Type | Required | Description |
|---|---|---|---|
| mode | string | Top-level | voice_matched or standard |
| speaker_consent_obtained | boolean | voice_matched only | Explicit consent from identifiable speakers |
| content_rights_confirmed | boolean | Both modes | Ownership or valid license to the content |
| eula_accepted | boolean | Both modes | Accepts the Addavox EULA |
| attested_by | string | Both modes | Email or identifier of responsible party |
| attested_at | ISO 8601 | Both modes | Within 24 hours of request time |
Consent Error Codes
| HTTP | Code | Condition |
|---|---|---|
| 403 | CONSENT_MISSING | No consent object |
| 403 | CONSENT_INCOMPLETE | Missing attested_by or invalid attested_at |
| 403 | CONSENT_NOT_AFFIRMED | Rights or EULA not affirmed |
| 403 | CONSENT_EXPIRED | attested_at older than 24 hours |
| 403 | SPEAKER_CONSENT_REQUIRED | voice_matched without speaker consent |
| 400 | INVALID_MODE | Invalid mode value |
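The conditions in the table can be checked client-side before sending a request. The sketch below mirrors them in one plausible order; the order in which the server evaluates them, the `consent` key name, and the helper name `precheck_consent` are assumptions.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional

VALID_MODES = {"voice_matched", "standard"}


def precheck_consent(body: dict, now: Optional[datetime] = None) -> Optional[str]:
    """Return the error code the body would likely trigger, or None if it passes."""
    now = now or datetime.now(timezone.utc)
    if body.get("mode") not in VALID_MODES:
        return "INVALID_MODE"
    consent = body.get("consent")
    if not isinstance(consent, dict):
        return "CONSENT_MISSING"
    if not consent.get("attested_by"):
        return "CONSENT_INCOMPLETE"
    try:
        # Note: a trailing "Z" is only accepted by fromisoformat on Python 3.11+.
        attested_at = datetime.fromisoformat(str(consent.get("attested_at")))
    except ValueError:
        return "CONSENT_INCOMPLETE"
    if attested_at.tzinfo is None:
        attested_at = attested_at.replace(tzinfo=timezone.utc)  # assume UTC
    if now - attested_at > timedelta(hours=24):
        return "CONSENT_EXPIRED"
    if not (consent.get("content_rights_confirmed") and consent.get("eula_accepted")):
        return "CONSENT_NOT_AFFIRMED"
    if body["mode"] == "voice_matched" and not consent.get("speaker_consent_obtained"):
        return "SPEAKER_CONSENT_REQUIRED"
    return None
```

A pre-check like this does not replace the server's validation; it only surfaces the most common mistakes before a job is submitted.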
API Reference
Full interactive schema and additional endpoints are available via OpenAPI.
Endpoints
| Endpoint | Description |
|---|---|
| POST /localize | Pre-translated / partial-transcript audio localization |
| POST /localize-video | Full video dubbing (multi-language) |
| GET /jobs/{job_id} | Job status |
| GET /jobs/{job_id}/result | Deliverables (URLs, zip) |
| GET /jobs | List jobs |
| DELETE /jobs/{job_id} | Cancel job |
| GET /account | Account info |
| GET /voices | TTS voice catalog |
| GET /languages | Supported languages |
| POST /projects/{id}/reviewers | Invite reviewer |
| GET, POST /jobs/{id}/webhooks | Webhook status / retry |
API Service Pricing
Per-minute rates by subscription plan. Business plan rates are shown with the annual discount (20%) applied. The listing below covers every plan and service.
Free

| Video Dubbing | Included (~min) | Base rate (annual) | Overage |
|---|---|---|---|
| Video Dubbing | 2 min | not shown | $4.00/min |

| API Service | Description | Base rate | Overage | Included (~min) |
|---|---|---|---|---|
| Audio Separation | Proprietary denoised voice track | $0.03/min | $0.04/min | not shown |
| Gender Detection | Gender classification from voice + context | $0.02/min | $0.02/min | not shown |
| Transcription (STT) | Deepgram speech-to-text with diarization | $0.05/min | $0.06/min | not shown |
| Translation | Google Translate | $0.03/min | $0.04/min | not shown |
| Text-to-Speech | Google TTS + other providers | $0.06/min | $0.08/min | not shown |
| Voice Match | Proprietary voice cloning (all-in) | $0.08/min | $0.10/min | not shown |
Starter

| Video Dubbing | Included (~min) | Base rate (annual) | Overage |
|---|---|---|---|
| Video Dubbing | 10 min | $2.80/min | $3.60/min |

| API Service | Description | Base rate | Overage | Included (~min) |
|---|---|---|---|---|
| Audio Separation | Proprietary denoised voice track | $0.02/min | $0.03/min | not shown |
| Gender Detection | Gender classification from voice + context | $0.02/min | $0.02/min | not shown |
| Transcription (STT) | Deepgram speech-to-text with diarization | $0.04/min | $0.05/min | not shown |
| Translation | Google Translate | $0.02/min | $0.03/min | not shown |
| Text-to-Speech | Google TTS + other providers | $0.05/min | $0.06/min | not shown |
| Voice Match | Proprietary voice cloning (all-in) | $0.08/min | $0.10/min | not shown |
Creator

| Video Dubbing | Included (~min) | Base rate (annual) | Overage |
|---|---|---|---|
| Video Dubbing | 30 min | $2.64/min | $3.20/min |

| API Service | Description | Base rate | Overage | Included (~min) |
|---|---|---|---|---|
| Transcribe + Translate + Voice Audio | Full Translation API bundle | $0.11/min | $0.13/min | not shown |
| Translate + Voice Audio | Translate + voice audio bundle | $0.10/min | $0.11/min | not shown |
| Voice Audio Only | Voice audio only bundle | $0.08/min | $0.10/min | not shown |
| Audio Separation | Proprietary denoised voice track | $0.02/min | $0.02/min | not shown |
| Gender Detection | Gender classification from voice + context | $0.01/min | $0.02/min | not shown |
| Transcription (STT) | Deepgram speech-to-text with diarization | $0.03/min | $0.04/min | not shown |
| Translation | Google Translate | $0.02/min | $0.03/min | not shown |
| Text-to-Speech | Google TTS + other providers | $0.04/min | $0.05/min | not shown |
| Voice Match | Proprietary voice cloning (all-in) | $0.06/min | $0.08/min | not shown |
Pro

| Video Dubbing | Included (~min) | Base rate (annual) | Overage |
|---|---|---|---|
| Video Dubbing | 120 min | $1.99/min | $2.80/min |

| API Service | Description | Base rate | Overage | Included (~min) |
|---|---|---|---|---|
| Transcribe + Translate + Voice Audio | Full Translation API bundle | $0.10/min | $0.11/min | not shown |
| Translate + Voice Audio | Translate + voice audio bundle | $0.08/min | $0.10/min | not shown |
| Voice Audio Only | Voice audio only bundle | $0.06/min | $0.08/min | not shown |
| Audio Separation | Proprietary denoised voice track | $0.02/min | $0.02/min | not shown |
| Gender Detection | Gender classification from voice + context | $0.01/min | $0.01/min | not shown |
| Transcription (STT) | Deepgram speech-to-text with diarization | $0.02/min | $0.03/min | not shown |
| Translation | Google Translate | $0.02/min | $0.02/min | not shown |
| Text-to-Speech | Google TTS + other providers | $0.03/min | $0.04/min | not shown |
| Voice Match | Proprietary voice cloning (all-in) | $0.05/min | $0.06/min | not shown |
Business

| Video Dubbing | Included (~min) | Base rate (annual) | Overage |
|---|---|---|---|
| Video Dubbing | 500 min | $1.60/min | $2.40/min |

| API Service | Description | Base rate | Overage | Included (~min) |
|---|---|---|---|---|
| Transcribe + Translate + Voice Audio | Full Translation API bundle | $0.08/min | $0.10/min | not shown |
| Translate + Voice Audio | Translate + voice audio bundle | $0.06/min | $0.08/min | not shown |
| Voice Audio Only | Voice audio only bundle | $0.05/min | $0.06/min | not shown |
| Audio Separation | Proprietary denoised voice track | $0.01/min | $0.02/min | not shown |
| Gender Detection | Gender classification from voice + context | $0.01/min | $0.01/min | not shown |
| Transcription (STT) | Deepgram speech-to-text with diarization | $0.02/min | $0.03/min | not shown |
| Translation | Google Translate | $0.02/min | $0.02/min | not shown |
| Text-to-Speech | Google TTS + other providers | $0.02/min | $0.03/min | not shown |
| Voice Match | Proprietary voice cloning (all-in) | $0.03/min | $0.05/min | not shown |
Minutes shown are equivalent — your credit pool is shared across all services. Using a lower-cost service draws fewer credits per minute.