Commit bc918fe

feat!: v6 — fully generated SDK with latest APIs and WebSocket support (#640)
## Summary

This is the Deepgram Python SDK v6 release. The SDK moves to a fully Fern-generated architecture, replacing all hand-rolled WebSocket code from v5 with generated, type-safe implementations aligned with the latest API spec.

### What's changing

- **Fully generated WebSocket clients**: Listen v1/v2, Speak v1, and Agent v1 WebSocket implementations are now generated from the API spec, replacing the manually maintained code in v5. This means faster feature delivery and fewer SDK-specific bugs.
- **Latest APIs and features**: Includes all current Deepgram API capabilities: Listen v2 (conversational speech recognition with turn detection), Agent v1 (voice agents), and the latest Speak v1 features.
- **Simplified send methods**: `send_media()` now accepts raw `bytes` directly. Control messages use dedicated methods (`send_keep_alive()`, `send_finalize()`, `send_flush()`, etc.) instead of the generic `send_control()` pattern.
- **New type system**: Types are generated per-domain (`deepgram.listen.v1.types`, `deepgram.agent.v1.types`, `deepgram.types`) instead of the shared `deepgram.extensions.types.sockets` barrel import.
- **22 production-ready examples** covering authentication, transcription (file, URL, live), voice agents, TTS, text intelligence, and management APIs.
- **CI/CD improvements**: Matrix testing across Python 3.8–3.13, release-please workflow, PR title validation.

### Breaking changes

- All imports from `deepgram.extensions.types.sockets` must be updated to domain-specific type packages
- `send_control()` replaced by dedicated methods per WebSocket client
- `send_media()` now takes `bytes` instead of wrapper message types
- Agent settings types renamed to match generated schema hierarchy (e.g. `AgentV1SettingsMessage` → `AgentV1Settings`)

### Documentation

- **[Migration guide](docs/Migrating-v5-to-v6.md)**: Complete v5 to v6 migration guide
- **[API Reference](reference.md)**: Full REST and WebSocket reference with v6 types and examples

---

Co-authored-by: fern-api[bot] <115122769+fern-api[bot]@users.noreply.github.com>
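The breaking changes listed above can be located mechanically before upgrading. The following is a minimal sketch, not part of the SDK or the migration guide: a hypothetical helper, `find_v5_usages`, that scans Python source text for the v5 import path, `send_control(` calls, and the renamed `AgentV1SettingsMessage` type. The patterns and hint strings are illustrative assumptions derived only from this commit message.

```python
import re
from typing import List, Tuple

# Patterns taken from the "Breaking changes" list in the commit message.
# The hint text is illustrative, not official migration guidance.
V5_PATTERNS = [
    (re.compile(r"deepgram\.extensions\.types\.sockets"),
     "import from a domain package such as deepgram.listen.v1.types"),
    (re.compile(r"\.send_control\s*\("),
     "use a dedicated method such as send_keep_alive() or send_finalize()"),
    (re.compile(r"\bAgentV1SettingsMessage\b"),
     "renamed to AgentV1Settings"),
]


def find_v5_usages(source: str) -> List[Tuple[int, str]]:
    """Return (line_number, hint) pairs for lines that look like v5 usage."""
    findings = []
    for lineno, line in enumerate(source.splitlines(), start=1):
        for pattern, hint in V5_PATTERNS:
            if pattern.search(line):
                findings.append((lineno, hint))
    return findings


if __name__ == "__main__":
    sample = (
        "from deepgram.extensions.types.sockets import AgentV1SettingsMessage\n"
        "connection.send_control(message)\n"
        "connection.send_media(audio_bytes)\n"
    )
    for lineno, hint in find_v5_usages(sample):
        print(f"line {lineno}: {hint}")
```

A text scan like this only flags candidates; the migration guide linked in the commit message remains the authority on what each call should become.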
1 parent 7c0290d commit bc918fe

528 files changed: +17581 -24400 lines changed

.fern/metadata.json

Lines changed: 17 additions & 0 deletions
@@ -0,0 +1,17 @@
+{
+  "cliVersion": "3.77.1",
+  "generatorName": "fernapi/fern-python-sdk",
+  "generatorVersion": "4.57.2",
+  "generatorConfig": {
+    "client": {
+      "class_name": "BaseClient",
+      "filename": "base_client.py",
+      "exported_class_name": "DeepgramClient",
+      "exported_filename": "client.py"
+    },
+    "use_typeddict_requests": true,
+    "should_generate_websocket_clients": true,
+    "enable_wire_tests": true
+  },
+  "sdkVersion": "6.0.0-beta.4"
+}

.fernignore

Lines changed: 33 additions & 27 deletions
@@ -1,33 +1,39 @@
-# Development, Configuration Files & Documentation
-README.md
-CONTRIBUTING.md
-.vscode/
-.gitignore
-mypy.ini
-websockets-reference.md
-.github/
-scripts/run_examples.sh
-docs/
-pyproject.toml
-CHANGELOG.md
+# Custom client implementation extending BaseClient with additional features:
+# - access_token parameter support (Bearer token authentication)
+# - Automatic session ID generation and header injection (x-deepgram-session-id)
+# This file is manually maintained and should not be regenerated
+src/deepgram/client.py
 
-# Examples
-examples/
+# WireMock mappings: removed duplicate empty-body /v1/listen stub that causes
+# non-deterministic matching failures
+wiremock/wiremock-mappings.json
 
-# Test Files
-tests/unit/
-tests/integrations/
+# Wire test with manual fix: transcribe_file() requires request=bytes parameter
+tests/wire/test_listen_v1_media.py
 
-# Custom Extensions & Clients
-src/deepgram/client.py
-src/deepgram/extensions/
-
-# Socket Client Implementations
-src/deepgram/agent/v1/socket_client.py
+# WebSocket socket clients: optional message parameter defaults for send_flush,
+# send_close, send_clear, send_finalize, send_close_stream, send_keep_alive
+src/deepgram/speak/v1/socket_client.py
 src/deepgram/listen/v1/socket_client.py
 src/deepgram/listen/v2/socket_client.py
-src/deepgram/speak/v1/socket_client.py
+src/deepgram/agent/v1/socket_client.py
+
+# Manual standalone tests
+tests/manual
+
+# README with custom examples, migration guide links, and contributing section
+README.md
+
+# Changelog managed by release-please
+CHANGELOG.md
+
+# Contributing guide
+CONTRIBUTING.md
+
+# Reference with Fern-generated REST API docs plus manually maintained WebSocket sections
+reference.md
 
-# Bug Fixes
-src/deepgram/listen/client.py
-src/deepgram/core/client_wrapper.py
+# Folders to ignore
+.github
+docs
+examples
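
The `.fernignore` comments above describe two behaviors of the hand-maintained `src/deepgram/client.py`: Bearer-token authentication through an `access_token` parameter, and an automatically generated `x-deepgram-session-id` header. The implementation itself is not shown in this part of the diff. As a rough sketch of the idea only, the hypothetical `build_headers` function below assumes that an access token takes precedence over an API key and that API keys use the `Token` authorization scheme:

```python
import uuid
from typing import Dict, Optional


def build_headers(
    api_key: Optional[str] = None,
    access_token: Optional[str] = None,
    session_id: Optional[str] = None,
) -> Dict[str, str]:
    """Hypothetical helper: build auth and session headers for a client."""
    if access_token:
        # Assumption: an access token takes precedence and uses the Bearer scheme.
        authorization = f"Bearer {access_token}"
    elif api_key:
        # Assumption: API keys use the "Token" scheme.
        authorization = f"Token {api_key}"
    else:
        raise ValueError("either api_key or access_token is required")
    return {
        "Authorization": authorization,
        # Generate a fresh ID unless the caller supplies one.
        "x-deepgram-session-id": session_id or str(uuid.uuid4()),
    }
```

Keeping this logic in a file listed in `.fernignore` means regenerating the SDK from the API spec would not overwrite it.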

.github/.commitlintrc.json

Lines changed: 2 additions & 1 deletion
@@ -49,4 +49,5 @@
       100
     ]
   }
-}
+}
+

.github/workflows/changelog-log.yml

Lines changed: 1 addition & 0 deletions
@@ -36,3 +36,4 @@ jobs:
 # include_body_raw: "false"
 # log_level: "warn" # trace, debug, info, warn, error, fatal
 # extra_body_json: '{"custom":"field"}' # merge custom fields into payload
+

.github/workflows/ci.yml

Lines changed: 37 additions & 0 deletions
@@ -18,6 +18,9 @@ jobs:
       - name: Bootstrap poetry
         run: |
           curl -sSL https://install.python-poetry.org | python - -y --version 1.5.1
+      - name: Add poetry to PATH
+        run: |
+          echo "$HOME/.local/bin" >> $GITHUB_PATH
       - name: Install dependencies
         run: poetry install
       - name: Compile
@@ -38,8 +41,42 @@ jobs:
       - name: Bootstrap poetry
         run: |
           curl -sSL https://install.python-poetry.org | python - -y --version 1.5.1
+      - name: Add poetry to PATH
+        run: |
+          echo "$HOME/.local/bin" >> $GITHUB_PATH
       - name: Install dependencies
         run: poetry install
 
+      - name: Verify Docker is available
+        run: |
+          docker --version
+          docker compose version
+
       - name: Test
         run: poetry run pytest -rP .
+
+  publish:
+    needs: [compile, test]
+    if: github.event_name == 'push' && contains(github.ref, 'refs/tags/')
+    runs-on: ubuntu-latest
+    permissions:
+      id-token: write
+    steps:
+      - name: Checkout repo
+        uses: actions/checkout@v4
+      - name: Set up python
+        uses: actions/setup-python@v6
+        with:
+          python-version: "3.8"
+      - name: Bootstrap poetry
+        run: |
+          curl -sSL https://install.python-poetry.org | python - -y --version 1.5.1
+      - name: Add poetry to PATH
+        run: |
+          echo "$HOME/.local/bin" >> $GITHUB_PATH
+      - name: Install dependencies
+        run: poetry install
+      - name: Build package
+        run: poetry build
+      - name: Publish to PyPI
+        uses: pypa/gh-action-pypi-publish@release/v1

.github/workflows/pr-title-check.yml

Lines changed: 1 addition & 0 deletions
@@ -30,3 +30,4 @@ jobs:
           PR_TITLE: ${{ github.event.pull_request.title }}
         run: |
           echo "$PR_TITLE" | npx commitlint -g .github/.commitlintrc.json
+

.github/workflows/tests-daily.yml

Lines changed: 5 additions & 0 deletions
@@ -44,5 +44,10 @@ jobs:
       - name: Install dependencies
         run: poetry install
 
+      - name: Verify Docker is available
+        run: |
+          docker --version
+          docker compose version
+
       - name: Test
         run: poetry run pytest -rP .

.gitignore

Lines changed: 0 additions & 14 deletions
@@ -3,17 +3,3 @@
 __pycache__/
 dist/
 poetry.toml
-.env
-.pytest_cache/
-
-# ignore example output files
-examples/**/output.*
-
-# ignore venv
-venv/
-.DS_Store
-
-# ignore build artifacts and dependencies
-Pipfile
-Pipfile.lock
-deepgram_sdk.egg-info/

LICENSE

Lines changed: 1 addition & 1 deletion
@@ -1,6 +1,6 @@
 MIT License
 
-Copyright (c) 2025 Deepgram.
+Copyright (c) 2026 Deepgram.
 
 Permission is hereby granted, free of charge, to any person obtaining a copy
 of this software and associated documentation files (the "Software"), to deal

README.md

Lines changed: 34 additions & 32 deletions
@@ -13,8 +13,9 @@ Comprehensive API documentation and guides are available at [developers.deepgram
 
 ### Migrating From Earlier Versions
 
+- [v5 to v6](./docs/Migrating-v5-to-v6.md) (current)
+- [v3+ to v5](./docs/Migrating-v3-to-v5.md)
 - [v2 to v3+](./docs/Migrating-v2-to-v3.md)
-- [v3+ to v5](./docs/Migrating-v3-to-v5.md) (current)
 
 ## Installation
 
@@ -26,8 +27,7 @@ pip install deepgram-sdk
 
 ## Reference
 
-- **[API Reference](./reference.md)** - Complete reference for all SDK methods and parameters
-- **[WebSocket Reference](./websockets-reference.md)** - Detailed documentation for real-time WebSocket connections
+- **[API Reference](./reference.md)** - Complete reference for all SDK methods, parameters, and WebSocket connections
 
 ## Usage
 
@@ -37,7 +37,7 @@ The Deepgram SDK provides both synchronous and asynchronous clients for all majo
 
 #### Real-time Speech Recognition (Listen v2)
 
-Our newest and most advanced speech recognition model with contextual turn detection ([WebSocket Reference](./websockets-reference.md#listen-v2-connect)):
+Our newest and most advanced speech recognition model with contextual turn detection ([Reference](./reference.md#listen-v2-connect)):
 
 ```python
 from deepgram import DeepgramClient
@@ -48,7 +48,7 @@ client = DeepgramClient()
 with client.listen.v2.connect(
     model="flux-general-en",
     encoding="linear16",
-    sample_rate="16000"
+    sample_rate=16000
 ) as connection:
     def on_message(message):
         print(f"Received {message.type} event")
@@ -118,39 +118,45 @@ response = client.read.v1.text.analyze(
 
 #### Voice Agent (Conversational AI)
 
-Build interactive voice agents ([WebSocket Reference](./websockets-reference.md#agent-v1-connect)):
+Build interactive voice agents ([Reference](./reference.md#agent-v1-connect)):
 
 ```python
 from deepgram import DeepgramClient
-from deepgram.extensions.types.sockets import (
-    AgentV1SettingsMessage, AgentV1Agent, AgentV1AudioConfig,
-    AgentV1AudioInput, AgentV1Listen, AgentV1ListenProvider,
-    AgentV1Think, AgentV1OpenAiThinkProvider, AgentV1SpeakProviderConfig,
-    AgentV1DeepgramSpeakProvider
+from deepgram.agent.v1.types import (
+    AgentV1Settings, AgentV1SettingsAgent,
+    AgentV1SettingsAgentListen, AgentV1SettingsAgentListenProvider_V1,
+    AgentV1SettingsAudio, AgentV1SettingsAudioInput,
 )
+from deepgram.types.think_settings_v1 import ThinkSettingsV1
+from deepgram.types.think_settings_v1provider import ThinkSettingsV1Provider_OpenAi
+from deepgram.types.speak_settings_v1 import SpeakSettingsV1
+from deepgram.types.speak_settings_v1provider import SpeakSettingsV1Provider_Deepgram
 
 client = DeepgramClient()
 
 with client.agent.v1.connect() as agent:
-    settings = AgentV1SettingsMessage(
-        audio=AgentV1AudioConfig(
-            input=AgentV1AudioInput(encoding="linear16", sample_rate=44100)
+    settings = AgentV1Settings(
+        audio=AgentV1SettingsAudio(
+            input=AgentV1SettingsAudioInput(encoding="linear16", sample_rate=24000)
         ),
-        agent=AgentV1Agent(
-            listen=AgentV1Listen(
-                provider=AgentV1ListenProvider(type="deepgram", model="nova-3")
+        agent=AgentV1SettingsAgent(
+            listen=AgentV1SettingsAgentListen(
+                provider=AgentV1SettingsAgentListenProvider_V1(
+                    type="deepgram", model="nova-3"
+                )
             ),
-            think=AgentV1Think(
-                provider=AgentV1OpenAiThinkProvider(
+            think=ThinkSettingsV1(
+                provider=ThinkSettingsV1Provider_OpenAi(
                     type="open_ai", model="gpt-4o-mini"
-                )
+                ),
+                prompt="You are a helpful AI assistant.",
             ),
-            speak=AgentV1SpeakProviderConfig(
-                provider=AgentV1DeepgramSpeakProvider(
+            speak=SpeakSettingsV1(
+                provider=SpeakSettingsV1Provider_Deepgram(
                     type="deepgram", model="aura-2-asteria-en"
                 )
-            )
-        )
+            ),
+        ),
     )
 
     agent.send_settings(settings)
@@ -161,18 +167,14 @@ with client.agent.v1.connect() as agent:
 
 For comprehensive documentation of all available methods, parameters, and options:
 
-- **[API Reference](./reference.md)** - Complete reference for REST API methods including:
+- **[API Reference](./reference.md)** - Complete reference for all SDK methods including:
 
   - Listen (Speech-to-Text): File transcription, URL transcription, and media processing
   - Speak (Text-to-Speech): Audio generation and voice synthesis
   - Read (Text Intelligence): Text analysis, sentiment, summarization, and topic detection
   - Manage: Project management, API keys, and usage analytics
   - Auth: Token generation and authentication management
-
-- **[WebSocket Reference](./websockets-reference.md)** - Detailed documentation for real-time connections:
-  - Listen v1/v2: Real-time speech recognition with different model capabilities
-  - Speak v1: Real-time text-to-speech streaming
-  - Agent v1: Conversational voice agents with integrated STT, LLM, and TTS
+  - WebSocket connections: Listen v1/v2, Speak v1, and Agent v1 real-time streaming
 
 ## Authentication
 
@@ -242,7 +244,7 @@ async def main():
     async with client.listen.v2.connect(
         model="flux-general-en",
         encoding="linear16",
-        sample_rate="16000"
+        sample_rate=16000
     ) as connection:
         async def on_message(message):
             print(f"Received {message.type} event")
@@ -378,7 +380,7 @@ We welcome contributions to improve this SDK! However, please note that this lib
 
 5. **Run examples**:
    ```bash
-   python -u examples/listen/v2/connect/main.py
+   python -u examples/07-transcription-live-websocket.py
    ```
 
 ### Contribution Guidelines
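
The renamed agent types in the README's Voice Agent example follow the nesting of the settings message (`AgentV1Settings` → `AgentV1SettingsAudio` → `AgentV1SettingsAudioInput`, and so on). To make that hierarchy easier to see, here is a minimal sketch of the same values as a plain nested dictionary. It uses no SDK imports; the top-level `"type": "Settings"` discriminator and the exact serialized shape are assumptions, not something this diff confirms.

```python
import json

# Same values as the v6 README example, written as plain data.
# Assumption: the wire message carries a "type" discriminator of "Settings".
settings_payload = {
    "type": "Settings",
    "audio": {
        "input": {"encoding": "linear16", "sample_rate": 24000},
    },
    "agent": {
        "listen": {"provider": {"type": "deepgram", "model": "nova-3"}},
        "think": {
            "provider": {"type": "open_ai", "model": "gpt-4o-mini"},
            "prompt": "You are a helpful AI assistant.",
        },
        "speak": {
            "provider": {"type": "deepgram", "model": "aura-2-asteria-en"},
        },
    },
}

# Round-trip through JSON to confirm the structure is serializable.
encoded = json.dumps(settings_payload, indent=2)
decoded = json.loads(encoded)
print(encoded)
```

Note that `sample_rate` is an integer here, matching the `sample_rate="16000"` to `sample_rate=16000` correction made elsewhere in the README diff.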
