Installation
Dependencies: requests, websockets, click, audiosample
Initialization
Constructor parameters
Your Deepdub API key. Falls back to DEEPDUB_API_KEY environment variable if not provided.
Base URL for the REST API. Falls back to DEEPDUB_BASE_URL environment variable.
Base URL for the WebSocket API. Falls back to DEEPDUB_BASE_WEBSOCKET_URL environment variable.
Base URL for the WebSocket streaming API. Falls back to DEEPDUB_BASE_WEBSOCKET_STREAMING_URL environment variable.
Use EU region endpoints (restapi.eu.deepdub.ai, wsapi.eu.deepdub.ai). Falls back to DD_EU environment variable ("1" to enable).
Region endpoints
| Region | REST API | WebSocket API |
|---|---|---|
| US (default) | https://restapi.deepdub.ai/api/v1 | wss://wsapi.deepdub.ai/open |
| EU | https://restapi.eu.deepdub.ai/api/v1 | wss://wsapi.eu.deepdub.ai/open |
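The fallback rules above can be sketched in plain Python. This is only an illustration: `resolve_endpoints` is a hypothetical helper, not part of the SDK, and the precedence it uses (explicit argument, then environment variable, then regional default) is an assumption, since the text does not say which wins when both a base URL and the EU flag are set.

```python
import os

# Endpoint values copied from the region table above: (REST, WebSocket).
REGION_ENDPOINTS = {
    "us": ("https://restapi.deepdub.ai/api/v1", "wss://wsapi.deepdub.ai/open"),
    "eu": ("https://restapi.eu.deepdub.ai/api/v1", "wss://wsapi.eu.deepdub.ai/open"),
}


def resolve_endpoints(base_url=None, base_websocket_url=None, eu=None, env=None):
    """Return (rest_url, websocket_url).

    Hypothetical helper. Assumed precedence: explicit argument,
    then environment variable, then the regional default.
    """
    env = os.environ if env is None else env
    if eu is None:
        eu = env.get("DD_EU", "0") == "1"  # "1" enables the EU region
    rest_default, ws_default = REGION_ENDPOINTS["eu" if eu else "us"]
    rest = base_url or env.get("DEEPDUB_BASE_URL") or rest_default
    ws = base_websocket_url or env.get("DEEPDUB_BASE_WEBSOCKET_URL") or ws_default
    return rest, ws


print(resolve_endpoints(env={}))
print(resolve_endpoints(env={"DD_EU": "1"}))
```

Passing `env` explicitly keeps the sketch testable without touching the real process environment.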
Text-to-Speech
tts() — Synchronous generation
Generate speech and receive the complete audio as bytes.
Returns bytes: binary audio data in the specified format.
Parameters
Text to convert to speech.
Voice prompt ID to use. Either this or voice_reference must be provided.
Audio reference for instant voice cloning. Accepts a file Path, raw bytes, or a base64-encoded string. Either this or voice_prompt_id must be provided.
Model ID. Available models: dd-etts-3.0, dd-etts-2.5.
Language locale code (e.g., en-US, fr-FR).
Audio output format: mp3, headerless-wav, opus, or mulaw.
Generation temperature (0.0–1.0). Higher values produce more varied output.
Voice variation level (0.0–1.0).
Target audio duration in seconds. Mutually exclusive with tempo.
Playback speed multiplier. Mutually exclusive with duration.
Random seed for deterministic generation.
Enhance voice prompt characteristics.
Output sample rate in Hz. Supported: 8000, 16000, 22050, 24000, 44100, 48000.
Base accent locale (e.g., en-US). Must be provided together with accent_locale and accent_ratio.
Target accent locale (e.g., fr-FR). Must be provided together with accent_base_locale and accent_ratio.
Accent blend ratio (0.0–1.0). Must be provided together with accent_base_locale and accent_locale.
Full example with all parameters
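A full argument set can be checked against the constraints listed in the parameter descriptions before it is sent. The sketch below does not call the SDK. Only `voice_prompt_id`, `voice_reference`, `duration`, `tempo`, `accent_base_locale`, `accent_locale` and `accent_ratio` are names taken from the text; the remaining keys (`text`, `model`, `locale`, `format`, `temperature`, `variance`, `seed`, `prompt_boost`, `sample_rate`) and the helper `check_tts_kwargs` are assumptions made for illustration.

```python
SUPPORTED_SAMPLE_RATES = {8000, 16000, 22050, 24000, 44100, 48000}
SUPPORTED_FORMATS = {"mp3", "headerless-wav", "opus", "mulaw"}
ACCENT_KEYS = ("accent_base_locale", "accent_locale", "accent_ratio")


def check_tts_kwargs(kwargs):
    """Return a list of constraint violations (empty when there are none)."""
    problems = []
    if kwargs.get("voice_prompt_id") is None and kwargs.get("voice_reference") is None:
        problems.append("voice_prompt_id or voice_reference is required")
    if kwargs.get("duration") is not None and kwargs.get("tempo") is not None:
        problems.append("duration and tempo are mutually exclusive")
    given = [key for key in ACCENT_KEYS if kwargs.get(key) is not None]
    if given and len(given) != len(ACCENT_KEYS):
        problems.append("accent_base_locale, accent_locale and accent_ratio go together")
    for key in ("temperature", "variance", "accent_ratio"):
        value = kwargs.get(key)
        if value is not None and not 0.0 <= value <= 1.0:
            problems.append(f"{key} must be within 0.0-1.0")
    fmt = kwargs.get("format")
    if fmt is not None and fmt not in SUPPORTED_FORMATS:
        problems.append(f"unsupported format: {fmt}")
    rate = kwargs.get("sample_rate")
    if rate is not None and rate not in SUPPORTED_SAMPLE_RATES:
        problems.append(f"unsupported sample rate: {rate}")
    return problems


# Key names other than the seven quoted in the text are hypothetical.
full_kwargs = {
    "text": "Hello, world.",
    "voice_prompt_id": "your-voice-prompt-id",  # placeholder value
    "model": "dd-etts-3.0",
    "locale": "en-US",
    "format": "mp3",
    "temperature": 0.7,
    "variance": 0.5,
    "tempo": 1.0,  # duration is omitted because the two are mutually exclusive
    "seed": 42,
    "prompt_boost": True,
    "sample_rate": 48000,
    "accent_base_locale": "en-US",
    "accent_locale": "fr-FR",
    "accent_ratio": 0.3,
}
print(check_tts_kwargs(full_kwargs))  # → []
```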
Voice cloning from audio reference
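A voice reference may arrive as a file Path, raw bytes, or a base64-encoded string. One way to bring the three forms to a single representation is sketched below. `encode_voice_reference` is a hypothetical helper, and the choice of base64 as the common form is an assumption; the text does not describe how the SDK handles the reference internally.

```python
import base64
from pathlib import Path


def encode_voice_reference(reference):
    """Return the reference as a base64 string (hypothetical helper)."""
    if isinstance(reference, Path):
        reference = reference.read_bytes()  # load the audio file
    if isinstance(reference, (bytes, bytearray)):
        return base64.b64encode(bytes(reference)).decode("ascii")
    if isinstance(reference, str):
        # Strings are taken to be base64 already; this raises on invalid input.
        base64.b64decode(reference, validate=True)
        return reference
    raise TypeError("expected a Path, bytes, or a base64 string")


print(encode_voice_reference(b"abc"))  # → YWJj
```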
tts_retro() — Retroactive generation
Submit a TTS request and receive a URL for later retrieval.
Returns a dict with a url key pointing to the generated audio.
Parameters
Text to convert to speech.
Voice prompt ID to use.
Model ID.
Language locale code.
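Once a result dict with a `url` key is in hand, the audio can be fetched with the standard library. The helper below is a sketch, not SDK code: `save_retro_audio` is a hypothetical name, and it assumes the audio is already available at the URL when called. Whether any waiting or polling is needed is not stated here.

```python
import urllib.request
from pathlib import Path


def save_retro_audio(result, destination):
    """Download the audio behind result["url"]; return the number of bytes saved."""
    url = result["url"]
    with urllib.request.urlopen(url) as response:
        data = response.read()
    Path(destination).write_bytes(data)
    return len(data)


# Usage, given a result dict returned by tts_retro():
#     save_retro_audio(result, "output.mp3")
```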
Async / WebSocket TTS
async_tts() — Streaming generation
Stream audio chunks over WebSocket for low-latency playback. Must be used within an async_connect() context.
Produces bytes: audio chunks as they are generated.
Parameters
Same as tts(), plus:
Optional UUID for request tracking. Auto-generated if not provided.
Target gender for the output voice.
Print debug information about sent/received messages.
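Because the audio arrives in chunks, a caller usually accumulates or plays them as they come in, and the same pattern extends to several requests running at once. The sketch below shows that consumption pattern with a stand-in async generator. `fake_async_tts` is not the SDK's `async_tts()`; it only mimics a chunked stream so the example runs without a connection or API key.

```python
import asyncio


async def fake_async_tts(text):
    """Stand-in for a chunked TTS stream; NOT the SDK's async_tts()."""
    for word in text.split():
        await asyncio.sleep(0)  # hand control back, as a network read would
        yield word.encode("utf-8")


async def collect(text):
    """Accumulate every chunk of one request into a single bytes object."""
    audio = bytearray()
    async for chunk in fake_async_tts(text):
        audio.extend(chunk)
    return bytes(audio)


async def main():
    # gather() runs both consumers concurrently on one event loop
    # and returns their results in argument order.
    return await asyncio.gather(collect("first request"), collect("second one"))


results = asyncio.run(main())
print(results)  # → [b'firstrequest', b'secondone']
```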
Multiple concurrent generations
The WebSocket connection supports multiplexing: multiple TTS requests can run on the same connection.
Streaming Input
For real-time text streaming (sending text incrementally), use async_stream_connect().
Gender Classification
Classify the gender of a speaker from an audio sample.
Audio data as raw bytes, base64-encoded string, or file Path. Automatically trimmed to 1 second.
Sample rate of the input audio.
Timeout in seconds for the WebSocket response.
Optional UUID for request tracking.
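To give a sense of what a one-second trim means in bytes, the arithmetic for raw PCM is sketched below. This is an illustration only: how the SDK trims its input is not described here, and the 16-bit mono layout and the helper `trim_pcm` are assumptions.

```python
def trim_pcm(data, sample_rate, seconds=1.0, sample_width=2, channels=1):
    """Keep at most `seconds` of raw PCM audio (hypothetical helper)."""
    frame_size = sample_width * channels  # bytes per sample frame
    max_bytes = int(sample_rate * seconds) * frame_size
    return data[:max_bytes]


two_seconds = bytes(2 * 16000 * 2)  # 2 s of 16-bit mono silence at 16 kHz
print(len(trim_pcm(two_seconds, 16000)))  # → 32000
```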
Voice Management
list_voices() — List all voice prompts
Returns a dict with a voicePrompts key containing a list of voice prompt objects.
add_voice() — Upload a voice sample
Returns a dict with the created voice prompt information.
Parameters
Audio data: a file Path, raw bytes, or base64-encoded string.
Display name for the voice prompt.
Speaker gender: "male" or "female".
Language locale code (e.g., en-US).
Whether to make the voice publicly available.
Speaking style descriptor.
Age of the speaker.
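The metadata described above can be gathered and checked before uploading. The dataclass below is a sketch under assumptions: the field names (`name`, `gender`, `locale`, `publish`, `speaking_style`, `age`) are hypothetical and may not match the SDK's keyword arguments. Only the allowed gender values come from the text.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class VoiceMetadata:
    """Hypothetical container for the fields described above."""

    name: str
    gender: str
    locale: str
    publish: bool
    speaking_style: Optional[str] = None
    age: Optional[int] = None

    def __post_init__(self):
        if not self.name:
            raise ValueError("name must not be empty")
        if self.gender not in ("male", "female"):
            raise ValueError('gender must be "male" or "female"')
        if self.age is not None and self.age < 0:
            raise ValueError("age must not be negative")


meta = VoiceMetadata(name="Narrator", gender="female", locale="en-US", publish=False)
print(meta.gender)  # → female
```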
CLI Reference
The SDK includes a command-line interface.
Environment Variables
| Variable | Description | Default |
|---|---|---|
| DEEPDUB_API_KEY | API key for authentication | — |
| DEEPDUB_BASE_URL | REST API base URL | https://restapi.deepdub.ai/api/v1 |
| DEEPDUB_BASE_WEBSOCKET_URL | WebSocket API base URL | wss://wsapi.deepdub.ai/open |
| DEEPDUB_BASE_WEBSOCKET_STREAMING_URL | Streaming WebSocket base URL | wss://wss.deepdub.ai/ws |
| DD_EU | Use EU endpoints ("1" to enable) | "0" |
Error Handling
API errors raise an Exception with the error message from the server.
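Since failures surface as a plain Exception, calling code can wrap a request in try/except and decide whether to retry. The sketch below uses a stand-in function in place of a real SDK call; `call_with_retry` and `flaky_request` are hypothetical names, and the retry policy is an illustration, not documented SDK behaviour.

```python
import time


def call_with_retry(func, attempts=3, delay=0.0):
    """Call func(); on Exception, retry up to `attempts` times in total."""
    last_error = None
    for attempt in range(1, attempts + 1):
        try:
            return func()
        except Exception as error:  # a plain Exception is what gets raised
            last_error = error
            print(f"attempt {attempt} failed: {error}")
            if attempt < attempts:
                time.sleep(delay)
    raise last_error


calls = {"count": 0}


def flaky_request():
    """Stand-in for an SDK call: fails twice, then succeeds."""
    calls["count"] += 1
    if calls["count"] < 3:
        raise Exception("server error (simulated)")
    return b"audio-bytes"


audio = call_with_retry(flaky_request)
print(audio)  # → b'audio-bytes'
```

Retrying every error is a simplification. Failures such as an invalid API key will not succeed on a second attempt, so real code would usually inspect the message before retrying.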
Available Models
| Model ID | Description |
|---|---|
| dd-etts-3.0 | Latest model with best quality |
| dd-etts-2.5 | Stable production model (default) |
