# Python SDK

> Install and use the Deepdub Python SDK for text-to-speech, voice management, and real-time streaming

## Installation

```bash
pip install deepdub
```

**Requirements:** Python 3.9+

**Dependencies:** `requests`, `websockets`, `click`, `audiosample`

## Initialization

```python
from deepdub import DeepdubClient

# Option 1: Pass API key directly
client = DeepdubClient(api_key="dd-your-api-key")

# Option 2: Use DEEPDUB_API_KEY environment variable
# export DEEPDUB_API_KEY=dd-your-api-key
client = DeepdubClient()
```

### Constructor parameters

- Your Deepdub API key. Falls back to the `DEEPDUB_API_KEY` environment variable if not provided.
- Base URL for the REST API. Falls back to the `DEEPDUB_BASE_URL` environment variable.
- Base URL for the WebSocket API. Falls back to the `DEEPDUB_BASE_WEBSOCKET_URL` environment variable.
- Base URL for the WebSocket streaming API. Falls back to the `DEEPDUB_BASE_WEBSOCKET_STREAMING_URL` environment variable.
- Use EU region endpoints (`restapi.eu.deepdub.ai`, `wsapi.eu.deepdub.ai`). Falls back to the `DD_EU` environment variable (`"1"` to enable).

### Region endpoints

| Region           | REST API                               | WebSocket API                    |
| ---------------- | -------------------------------------- | -------------------------------- |
| **US (default)** | `https://restapi.deepdub.ai/api/v1`    | `wss://wsapi.deepdub.ai/open`    |
| **EU**           | `https://restapi.eu.deepdub.ai/api/v1` | `wss://wsapi.eu.deepdub.ai/open` |

***

## Text-to-Speech

### `tts()`: Synchronous generation

Generate speech and receive the complete audio as bytes.
```python
audio_data = client.tts(
    text="Hello, welcome to Deepdub!",
    voice_prompt_id="your-voice-id",
    model="dd-etts-2.5",
    locale="en-US"
)

with open("output.mp3", "wb") as f:
    f.write(audio_data)
```

**Returns:** `bytes` containing the binary audio data in the specified format.

#### Parameters

- `text`: Text to convert to speech.
- `voice_prompt_id`: Voice prompt ID to use. Either this or `voice_reference` must be provided.
- `voice_reference`: Audio reference for instant voice cloning. Accepts a file `Path`, raw `bytes`, or a base64-encoded `string`. Either this or `voice_prompt_id` must be provided.
- `model`: Model ID. Available models: `dd-etts-3.0`, `dd-etts-2.5`.
- `locale`: Language locale code (e.g., `en-US`, `fr-FR`).
- `format`: Audio output format. The REST API supports `mp3`, `opus`, and `mulaw`. The WebSocket API additionally supports `wav` (default) and `s16le`.
- `temperature`: Generation temperature (0.0–1.0). Higher values produce more varied output.
- `variance`: Voice variation level (0.0–1.0).
- `duration`: Target audio duration in seconds. Mutually exclusive with `tempo`.
- `tempo`: Playback speed multiplier. Mutually exclusive with `duration`.
- `seed`: Random seed for deterministic generation.
- `prompt_boost`: Enhance voice prompt characteristics.
- `sample_rate`: Output sample rate in Hz. Supported: `8000`, `16000`, `22050`, `24000`, `44100`, `48000`.
- `accent_base_locale`: Base accent locale (e.g., `en-US`). Must be provided together with `accent_locale` and `accent_ratio`.
- `accent_locale`: Target accent locale (e.g., `fr-FR`). Must be provided together with `accent_base_locale` and `accent_ratio`.
- `accent_ratio`: Accent blend ratio (0.0–1.0). Must be provided together with `accent_base_locale` and `accent_locale`.
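The parameter constraints above (a voice source is required, `duration` and `tempo` are mutually exclusive, the three accent parameters travel together, and several values are bounded to 0.0–1.0) can be checked before a request is sent. The following is a minimal sketch, not part of the SDK: the helper name `check_tts_arguments` and its messages are hypothetical, and the SDK may perform its own validation differently.

```python
def check_tts_arguments(
    voice_prompt_id=None,
    voice_reference=None,
    duration=None,
    tempo=None,
    temperature=None,
    variance=None,
    accent_base_locale=None,
    accent_locale=None,
    accent_ratio=None,
):
    """Return a list of problems found in a set of tts() keyword arguments.

    Hypothetical helper: it only encodes the constraints stated in the
    parameter list and does not call the Deepdub API.
    """
    problems = []

    # A voice source is required: a stored prompt or an audio reference.
    if voice_prompt_id is None and voice_reference is None:
        problems.append("provide voice_prompt_id or voice_reference")

    # duration and tempo are documented as mutually exclusive.
    if duration is not None and tempo is not None:
        problems.append("duration and tempo are mutually exclusive")

    # These values are documented with a 0.0-1.0 range.
    bounded = (
        ("temperature", temperature),
        ("variance", variance),
        ("accent_ratio", accent_ratio),
    )
    for name, value in bounded:
        if value is not None and not 0.0 <= value <= 1.0:
            problems.append(f"{name} must be between 0.0 and 1.0")

    # The accent parameters must be given all together or not at all.
    accent = (accent_base_locale, accent_locale, accent_ratio)
    given = sum(value is not None for value in accent)
    if 0 < given < 3:
        problems.append(
            "accent_base_locale, accent_locale and accent_ratio "
            "must be provided together"
        )

    return problems
```

Calling `check_tts_arguments(voice_prompt_id="your-voice-id", duration=3.0, tempo=1.1)` returns a one-item list naming the `duration`/`tempo` conflict, while a valid combination returns an empty list.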
### Full example with all parameters

```python
audio_data = client.tts(
    text="This demonstrates all available TTS parameters.",
    voice_prompt_id="your-voice-id",
    model="dd-etts-2.5",
    locale="en-US",
    format="mp3",
    temperature=0.7,
    variance=0.6,
    tempo=1.1,
    seed=42,
    prompt_boost=True,
    sample_rate=44100,
    accent_base_locale="en-US",
    accent_locale="fr-FR",
    accent_ratio=0.3,
)

with open("output.mp3", "wb") as f:
    f.write(audio_data)
```

### Voice cloning from audio reference

```python
from pathlib import Path

audio_data = client.tts(
    text="Cloning a voice from an audio sample.",
    voice_reference=Path("reference_audio.mp3"),
    model="dd-etts-2.5",
    locale="en-US",
)

with open("cloned_output.mp3", "wb") as f:
    f.write(audio_data)
```

***

## Async / WebSocket TTS

### `async_tts()`: Streaming generation

Stream audio chunks over WebSocket for low-latency playback. Must be used within an `async_connect()` context.

```python
import asyncio
from deepdub import DeepdubClient

client = DeepdubClient(api_key="dd-your-api-key")

async def stream_audio():
    audio_data = bytearray()

    async with client.async_connect() as conn:
        async for chunk in conn.async_tts(
            text="Streaming audio in real time!",
            voice_prompt_id="bd1b00bb-be1c-4679-8eaa-0fcbfd4ff773",
            model="dd-etts-3.0",
            locale="en-US",
            format="wav",
            sample_rate=16000,
        ):
            audio_data.extend(chunk)
            print(f"Received chunk: {len(chunk)} bytes")

    with open("streamed.wav", "wb") as f:
        f.write(audio_data)
    print(f"Total audio: {len(audio_data)} bytes")

asyncio.run(stream_audio())
```

**Yields:** `bytes` audio chunks as they are generated.

#### Parameters

Same as `tts()`, plus:

- Optional UUID for request tracking. Auto-generated if not provided.
- Target gender for the output voice.
- Print debug information about sent/received messages.
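When streaming with `format="s16le"`, the chunks are raw PCM and need a container before most players can open them. The sketch below uses only the Python standard library and assumes the chunks are headerless, mono, signed 16-bit little-endian samples. The mono assumption and both helper names are illustrative and not confirmed by the SDK.

```python
import io
import wave


def pcm_chunks_to_wav(chunks, sample_rate=16000, channels=1):
    """Wrap raw signed 16-bit little-endian PCM chunks in a WAV container.

    Assumes headerless PCM; `channels=1` is an assumption of this sketch.
    """
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as wav_file:
        wav_file.setnchannels(channels)
        wav_file.setsampwidth(2)  # 16-bit samples are 2 bytes each
        wav_file.setframerate(sample_rate)
        for chunk in chunks:
            wav_file.writeframes(chunk)
    return buffer.getvalue()


def pcm_duration_seconds(num_bytes, sample_rate=16000, channels=1):
    """Duration of raw 16-bit PCM audio given its size in bytes."""
    return num_bytes / (sample_rate * channels * 2)
```

In a streaming loop, the chunks yielded by `conn.async_tts(...)` could be appended to a list and passed to `pcm_chunks_to_wav` once the stream ends, with `sample_rate` matching the value sent in the request.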
### Multiple concurrent generations

The WebSocket connection supports multiplexing, so multiple TTS requests can run on the same connection:

```python
import asyncio
from deepdub import DeepdubClient

client = DeepdubClient(api_key="dd-your-api-key")

async def generate_multiple():
    async with client.async_connect() as conn:

        async def generate_one(text, filename):
            audio = bytearray()
            async for chunk in conn.async_tts(
                text=text,
                voice_prompt_id="bd1b00bb-be1c-4679-8eaa-0fcbfd4ff773",
                model="dd-etts-3.0",
                locale="en-US",
                format="wav",
                sample_rate=16000,
            ):
                audio.extend(chunk)
            with open(filename, "wb") as f:
                f.write(audio)

        await asyncio.gather(
            generate_one("First sentence.", "out1.wav"),
            generate_one("Second sentence.", "out2.wav"),
            generate_one("Third sentence.", "out3.wav"),
        )

asyncio.run(generate_multiple())
```

***

## Streaming Input

For real-time text streaming (sending text incrementally), use `async_stream_connect()`:

```python
import asyncio
from deepdub import DeepdubClient

client = DeepdubClient(api_key="dd-your-api-key")

async def streaming_input():
    async with client.async_stream_connect(
        model="dd-etts-3.0",
        locale="en-US",
        voice_prompt_id="your-voice-id",
        format="wav",
        sample_rate=16000,
    ) as conn:
        await conn.async_stream_text("Hello, ")
        await conn.async_stream_text("this is streamed ")
        await conn.async_stream_text("text input.")
        await conn.async_stream_end()

        audio_data = bytearray()
        while True:
            audio = await conn.async_stream_recv_audio()
            if audio is None:
                break
            audio_data.extend(audio)
            print(f"Received chunk: {len(audio)} bytes")

        print(f"Total audio: {len(audio_data)} bytes")

asyncio.run(streaming_input())
```

***

## Gender Classification

Classify the gender of a speaker from an audio sample:

```python
import asyncio
from pathlib import Path
from deepdub import DeepdubClient

client = DeepdubClient(api_key="dd-your-api-key")

async def classify():
    async with client.async_connect() as conn:
        result = await conn.gender_classify(
            audio_data=Path("speaker_sample.wav"),
            sample_rate=16000,
            timeout=5.0,
        )
        print(result)

asyncio.run(classify())
```

- Audio data as raw bytes, a base64-encoded string, or a file `Path`. Automatically trimmed to 1 second.
- Sample rate of the input audio.
- Timeout in seconds for the WebSocket response.
- Optional UUID for request tracking.

***

## Voice Management

### `list_voices()`: List all voice prompts

```python
voices = client.list_voices()

for voice in voices.get("voicePrompts", []):
    print(f"{voice['id']}: {voice.get('name', voice.get('title', 'Untitled'))}")
```

**Returns:** `dict` with a `voicePrompts` key containing a list of voice prompt objects.

### `add_voice()`: Upload a voice sample

```python
from pathlib import Path

response = client.add_voice(
    data=Path("voice_sample.wav"),
    name="Professional Narrator",
    gender="female",
    locale="en-US",
    publish=False,
    speaking_style="Neutral",
    age=30,
)

print(f"Created voice: {response}")
```

**Returns:** `dict` with the created voice prompt information.

#### Parameters

- `data`: Audio data as a file `Path`, raw `bytes`, or a base64-encoded `string`.
- `name`: Display name for the voice prompt.
- `gender`: Speaker gender: `"male"` or `"female"`.
- `locale`: Language locale code (e.g., `en-US`).
- `publish`: Whether to make the voice publicly available.
- `speaking_style`: Speaking style descriptor.
- `age`: Age of the speaker.

***

## CLI Reference

The SDK includes a command-line interface:

```bash
# List available voices
deepdub list-voices

# Upload a new voice
deepdub add-voice \
  --file path/to/audio.mp3 \
  --name "My Voice" \
  --gender male \
  --locale en-US

# Generate text-to-speech
deepdub tts \
  --text "Hello from the CLI!" \
  --voice-prompt-id your-voice-id

# Set API key via flag or environment
deepdub --api-key dd-your-key tts --text "Hello!"
export DEEPDUB_API_KEY=dd-your-key
```

***

## Environment Variables

| Variable                               | Description                        | Default                             |
| -------------------------------------- | ---------------------------------- | ----------------------------------- |
| `DEEPDUB_API_KEY`                      | API key for authentication         | (none)                              |
| `DEEPDUB_BASE_URL`                     | REST API base URL                  | `https://restapi.deepdub.ai/api/v1` |
| `DEEPDUB_BASE_WEBSOCKET_URL`           | WebSocket API base URL             | `wss://wsapi.deepdub.ai/open`       |
| `DEEPDUB_BASE_WEBSOCKET_STREAMING_URL` | Streaming WebSocket base URL       | `wss://wss.deepdub.ai/ws`           |
| `DD_EU`                                | Use EU endpoints (`"1"` to enable) | `"0"`                               |

***

## Error Handling

```python
from deepdub import DeepdubClient
import requests

client = DeepdubClient(api_key="dd-your-api-key")

try:
    audio = client.tts(
        text="Hello!",
        voice_prompt_id="your-voice-id",
    )
except requests.exceptions.HTTPError as e:
    if e.response.status_code == 401:
        print("Invalid API key")
    elif e.response.status_code == 400:
        print("Invalid request parameters")
    else:
        print(f"API error: {e}")
except ValueError as e:
    print(f"Validation error: {e}")
```

For async operations, WebSocket errors are raised as `Exception` with the error message from the server:

```python
import asyncio

# Reuses the `client` created in the previous example.
async def main():
    try:
        async with client.async_connect() as conn:
            async for chunk in conn.async_tts(text="Hello!", voice_prompt_id="id"):
                pass
    except Exception as e:
        error_msg = str(e)
        # Possible errors: "Rate limit exceeded", "Insufficient credits", etc.
        print(f"WebSocket error: {error_msg}")

asyncio.run(main())
```

***

## Available Models

| Model ID      | Description                       |
| ------------- | --------------------------------- |
| `dd-etts-3.0` | Latest model with best quality    |
| `dd-etts-2.5` | Stable production model (default) |
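
One way to combine the model table with the error-handling pattern is to try the newest model first and fall back to the stable one when a request fails. This is a minimal sketch under stated assumptions: `tts_with_fallback` and `MODEL_PREFERENCE` are hypothetical names, and `generate` is assumed to behave like `client.tts` (keyword arguments in, audio bytes out).

```python
from typing import Callable, Sequence

# Model IDs taken from the table above, ordered by preference.
MODEL_PREFERENCE = ("dd-etts-3.0", "dd-etts-2.5")


def tts_with_fallback(
    generate: Callable[..., bytes],
    text: str,
    models: Sequence[str] = MODEL_PREFERENCE,
    retry_on: tuple = (Exception,),
    **tts_kwargs,
) -> tuple[str, bytes]:
    """Try each model in order and return (model_id, audio) for the first success.

    `generate` stands in for a callable shaped like `client.tts`; that shape
    is an assumption of this sketch, not a documented SDK contract.
    """
    last_error = None
    for model in models:
        try:
            return model, generate(text=text, model=model, **tts_kwargs)
        except retry_on as exc:
            last_error = exc  # remember the failure and try the next model
    if last_error is None:
        raise ValueError("models must not be empty")
    raise last_error
```

A call such as `tts_with_fallback(client.tts, "Hello!", voice_prompt_id="your-voice-id", locale="en-US")` would then return the model that succeeded together with its audio. Narrowing `retry_on` (for example to `requests.exceptions.HTTPError`) keeps validation errors such as `ValueError` from triggering a pointless second attempt.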