Welcome to Deepdub

Deepdub provides a powerful Text-to-Speech API for generating natural, expressive speech with voice cloning, accent control, and real-time streaming. Whether you’re building voiceover pipelines, conversational agents, or content localization workflows, Deepdub delivers studio-quality audio at scale.

Key capabilities

Text-to-Speech

Generate speech from text using state-of-the-art models with fine-grained control over tempo, variance, and duration.

Voice Cloning

Clone any voice from a short audio sample. Upload voice prompts or pass a base64-encoded audio reference for instant cloning.

Accent Control

Blend accents between locales with precise ratio control — generate an American English speaker with a French accent, or any combination.

Real-time Streaming

Stream audio in real time over HTTP or WebSocket connections for low-latency applications.
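The voice cloning capability above mentions passing a base64-encoded audio reference. As a minimal sketch of the encoding step only, the helper below reads a local audio sample and returns it as a base64 string using Python's standard library. The function name `encode_voice_reference` is hypothetical, and this page does not say which request field or client argument receives the encoded string, so that part is left out.

```python
import base64
from pathlib import Path


def encode_voice_reference(path: str) -> str:
    """Return the raw bytes of an audio file as an ASCII base64 string.

    Hypothetical helper: how the resulting string is passed to Deepdub
    is not covered here and must be taken from the API reference.
    """
    raw = Path(path).read_bytes()
    return base64.b64encode(raw).decode("ascii")
```

The result is plain text, so it can be embedded in a JSON request body without further escaping.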

Try it now

Use the free trial API key to generate speech instantly — no sign-up required:
dd-00000000000000000000000065c9cbfe
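from deepdub import DeepdubClient

# Authenticate with the free trial key shown above.
client = DeepdubClient(api_key="dd-00000000000000000000000065c9cbfe")

# Generate speech for the given text, voice prompt, model, and locale.
audio = client.tts(
    text="Welcome to Deepdub!",
    voice_prompt_id="bd1b00bb-be1c-4679-8eaa-0fcbfd4ff773",
    model="dd-etts-3.0",
    locale="en-US",
)

# Write the returned audio bytes to disk.
with open("output.mp3", "wb") as f:
    f.write(audio)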
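The snippet above hardcodes the trial key, which is convenient for a first call. Once you have your own key, one possible approach is to read it from the environment and fall back to the trial key when none is set. This is a sketch under assumptions: the variable name `DEEPDUB_API_KEY` and the helper `resolve_api_key` are illustrative, not names defined by Deepdub.

```python
import os

# Free trial key quoted in this page.
TRIAL_API_KEY = "dd-00000000000000000000000065c9cbfe"


def resolve_api_key(env=None) -> str:
    """Return the key from DEEPDUB_API_KEY (assumed name), else the trial key."""
    env = os.environ if env is None else env
    key = env.get("DEEPDUB_API_KEY", "").strip()
    return key or TRIAL_API_KEY
```

The result can then be passed as `DeepdubClient(api_key=resolve_api_key())`.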

API access

Deepdub offers two integration methods:
| Method | Endpoint | Use case |
| --- | --- | --- |
| REST API | https://restapi.deepdub.ai/api/v1 | Synchronous audio generation, voice management, retroactive generation |
| WebSocket API | wss://wsapi.deepdub.ai | Real-time streaming with chunked audio delivery |
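Chunked delivery means the audio arrives as a sequence of byte fragments rather than one payload. The sketch below shows one way a consumer might persist such a stream as it arrives. It deliberately assumes nothing about the WebSocket message format, which this page does not describe: `chunks` is any iterable of `bytes`, standing in for whatever a streaming client yields, and `write_audio_chunks` is a hypothetical name.

```python
from typing import Iterable


def write_audio_chunks(chunks: Iterable[bytes], path: str) -> int:
    """Write each chunk to `path` in arrival order; return total bytes written."""
    total = 0
    with open(path, "wb") as f:
        for chunk in chunks:
            if not chunk:
                # Skip empty fragments (for example, keep-alive frames).
                continue
            f.write(chunk)
            total += len(chunk)
    return total
```

Writing each chunk as it arrives keeps memory use flat regardless of the length of the generated audio.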