- [x] Add new API type constant for Google Gemini in uctypes.go
- [x] Create gemini directory under pkg/aiusechat/
- [x] Implement gemini-backend.go with streaming chat support
- [x] Implement gemini-convertmessage.go for message conversion
- [x] Implement gemini-types.go for Google-specific types
- [x] Add gemini backend to usechat-backend.go
- [x] Support tool calling with structured arguments
- [x] Support image upload (base64 inline data)
- [x] Support PDF upload (base64 inline data)
- [x] Support file upload (text files, directory listings)
- [x] Build verification passed
- [x] Add documentation for Gemini backend usage
- [x] Security scan passed (CodeQL found 0 issues)
- [x] Code review passed with no comments
- [x] Revert tsunami demo go.mod/go.sum files (done twice, per reviewer feedback)
- [x] Add `--gemini` flag to main-testai.go for testing
- [x] Fix schema validation for tool calling (clean unsupported fields)
- [x] Preserve non-map property values in schema cleaning (see the sketch after this list)
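The schema cleaning mentioned in the last two items works roughly as follows: a minimal sketch, assuming a recursive walk over each tool's JSON schema. The package name, function name, and exact keyword list here are illustrative, not the actual implementation.

```go
package gemini

// cleanSchemaForGemini recursively strips JSON Schema keywords that the Gemini
// API rejects from tool declarations, while preserving non-map property values
// (strings, booleans, numbers, and lists) exactly as they were.
func cleanSchemaForGemini(schema map[string]any) map[string]any {
	// Assumed keyword list; the real backend may strip a different set.
	unsupported := map[string]bool{
		"additionalProperties": true,
		"$schema":              true,
		"default":              true,
	}
	cleaned := make(map[string]any, len(schema))
	for key, val := range schema {
		if unsupported[key] {
			continue
		}
		switch v := val.(type) {
		case map[string]any:
			cleaned[key] = cleanSchemaForGemini(v) // recurse into nested schemas
		case []any:
			items := make([]any, 0, len(v))
			for _, item := range v {
				if m, ok := item.(map[string]any); ok {
					items = append(items, cleanSchemaForGemini(m))
				} else {
					items = append(items, item) // keep non-map list items as-is
				}
			}
			cleaned[key] = items
		default:
			cleaned[key] = v // preserve non-map property values untouched
		}
	}
	return cleaned
}
```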
## Summary
Successfully implemented a complete Google Gemini backend for WaveTerm's
AI chat system. The implementation:
- **Follows existing patterns**: Matches the structure of OpenAI and
Anthropic backends
- **Fully featured**: Supports all required capabilities, including tool calling, images, PDFs, and file uploads (see the message-shape sketch after this list)
- **Builds cleanly**: Compiles successfully with no errors or warnings
- **Secure**: Passed CodeQL security scanning with 0 issues
- **Well documented**: Includes comprehensive package documentation with
usage examples
- **Minimal changes**: The core implementation lives under pkg/aiusechat; outside it, only cmd/testai and the docs are touched (tsunami demo files reverted twice)
- **Testable**: Added `--gemini` flag to main-testai.go for easy testing
with SSE output
- **Schema compatible**: Cleans JSON schemas to remove fields
unsupported by Gemini API while preserving valid structure
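As background for the message-conversion and upload items above: the Gemini REST API represents each conversation turn as a `contents` entry made of `parts`, and images/PDFs travel as base64 inline data. A rough sketch of those shapes follows; the JSON field names match the public Gemini API, but the Go type names are assumptions rather than the ones actually defined in gemini-types.go.

```go
package gemini

// GeminiContent is one conversation turn ("user" or "model").
type GeminiContent struct {
	Role  string       `json:"role"`
	Parts []GeminiPart `json:"parts"`
}

// GeminiPart is one element of contents[].parts[]; exactly one field is set.
type GeminiPart struct {
	Text       string            `json:"text,omitempty"`
	InlineData *GeminiInlineData `json:"inlineData,omitempty"`
}

// GeminiInlineData carries base64-encoded bytes for images and PDFs.
type GeminiInlineData struct {
	MimeType string `json:"mimeType"`
	Data     string `json:"data"` // base64-encoded file content
}
```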
## Testing
To test the Gemini backend using main-testai.go:
```bash
export GOOGLE_APIKEY="your-api-key"
cd cmd/testai
go run main-testai.go --gemini 'What is 2+2?'
go run main-testai.go --gemini --model gemini-1.5-pro 'Explain quantum computing'
go run main-testai.go --gemini --tools 'Help me configure GitHub Actions monitoring'
```
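Under the hood the backend streams SSE from Gemini's `streamGenerateContent` endpoint. For reference, here is a minimal standalone sketch of an equivalent raw request, assuming the public Gemini REST API; it is not the backend's actual code, but it uses the same `GOOGLE_APIKEY` environment variable and model name as the examples above.

```go
package main

import (
	"bufio"
	"fmt"
	"net/http"
	"os"
	"strings"
)

func main() {
	url := "https://generativelanguage.googleapis.com/v1beta/models/gemini-1.5-pro:streamGenerateContent?alt=sse"
	body := `{"contents":[{"role":"user","parts":[{"text":"What is 2+2?"}]}]}`

	req, err := http.NewRequest("POST", url, strings.NewReader(body))
	if err != nil {
		panic(err)
	}
	req.Header.Set("Content-Type", "application/json")
	req.Header.Set("x-goog-api-key", os.Getenv("GOOGLE_APIKEY"))

	resp, err := http.DefaultClient.Do(req)
	if err != nil {
		panic(err)
	}
	defer resp.Body.Close()

	// With alt=sse, each "data:" line carries one JSON response chunk.
	scanner := bufio.NewScanner(resp.Body)
	for scanner.Scan() {
		if line := scanner.Text(); strings.HasPrefix(line, "data: ") {
			fmt.Println(strings.TrimPrefix(line, "data: "))
		}
	}
}
```

The `--gemini` flag drives this same endpoint through the new backend and prints the streamed output.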
Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
Co-authored-by: sawka <2722291+sawka@users.noreply.github.com>
## Files changed

### cmd/testai/main-testai.go

```diff
+flag.BoolVar(&gemini, "gemini", false, "Use Google Gemini API")
 flag.BoolVar(&tools, "tools", false, "Enable GitHub Actions Monitor tools for testing")
-flag.StringVar(&model, "model", "", fmt.Sprintf("AI model to use (defaults: %s for OpenAI, %s for Anthropic, %s for OpenRouter)", DefaultOpenAIModel, DefaultAnthropicModel, DefaultOpenRouterModel))
+flag.StringVar(&model, "model", "", fmt.Sprintf("AI model to use (defaults: %s for OpenAI, %s for Anthropic, %s for OpenRouter, %s for Gemini)", DefaultOpenAIModel, DefaultAnthropicModel, DefaultOpenRouterModel, DefaultGeminiModel))
```

### docs/docs/waveai-modes.mdx (+77, -14)
````diff
@@ -1,7 +1,7 @@
 ---
 sidebar_position: 1.6
 id: "waveai-modes"
-title: "Wave AI (Local Models)"
+title: "Wave AI (Local Models + BYOK)"
 ---
 
 Wave AI supports custom AI modes that allow you to use local models, custom API endpoints, and alternative AI providers. This gives you complete control over which models and providers you use with Wave's AI features.
@@ -37,10 +37,11 @@ Wave AI now supports provider-based configuration which automatically applies se
 
 ### Supported API Types
 
-Wave AI supports two OpenAI-compatible API types:
+Wave AI supports the following API types:
 
 - **`openai-chat`**: Uses the `/v1/chat/completions` endpoint (most common)
 - **`openai-responses`**: Uses the `/v1/responses` endpoint (modern API for GPT-5+ models)
+- **`google-gemini`**: Google's Gemini API format (automatically set when using `ai:provider: "google"`, not typically used directly)
 
 ## Configuration Structure
 
@@ -49,7 +50,7 @@ Wave AI supports two OpenAI-compatible API types:
 ```json
 {
   "mode-key": {
-    "display:name": "Display Name",
+    "display:name": "Qwen (OpenRouter)",
     "ai:provider": "openrouter",
     "ai:model": "qwen/qwen-2.5-coder-32b-instruct"
   }
@@ -89,10 +90,10 @@ Wave AI supports two OpenAI-compatible API types:
 | `display:icon` | No | Icon identifier for the mode |
 | `display:description` | No | Full description of the mode |
-| `ai:apitype` | No | API type: `openai-chat` or `openai-responses` (defaults to `openai-chat` if not specified) |
+| `ai:apitype` | No | API type: `openai-chat`, `openai-responses`, or `google-gemini` (defaults to `openai-chat` if not specified) |
 | `ai:model` | No | Model identifier (required for most providers) |
 | `ai:thinkinglevel` | No | Thinking level: `low`, `medium`, or `high` |
-| `ai:endpoint` | No | Full API endpoint URL (auto-set by provider when available) |
+| `ai:endpoint` | No | *Full* API endpoint URL (auto-set by provider when available) |
 | `ai:azureapiversion` | No | Azure API version (for `azure-legacy` provider, defaults to `2025-04-01-preview`) |
 | `ai:apitoken` | No | API key/token (not recommended - use secrets instead) |
 | `ai:apitokensecretname` | No | Name of secret containing API token (auto-set by provider) |
@@ -110,6 +111,14 @@ The `ai:capabilities` field specifies what features the AI mode supports:
 - **`images`** - Allows image attachments in chat (model can view uploaded images)
 - **`pdfs`** - Allows PDF file attachments in chat (model can read PDF content)
 
+**Provider-specific behavior:**
+- **OpenAI and Google providers**: Capabilities are automatically configured based on the model. You don't need to specify them.
+- **OpenRouter, Azure, Azure-Legacy, and Custom providers**: You must manually specify capabilities based on your model's features.
+
+:::warning
+If you don't include `"tools"` in the `ai:capabilities` array, the AI model will not be able to interact with your Wave terminal widgets, read/write files, or execute commands. Most AI modes should include `"tools"` for the best Wave experience.
+:::
+
 Most models support `tools` and can benefit from it. Vision-capable models should include `images`. Not all models support PDFs, so only include `pdfs` if your model can process them.
 
 ## Local Model Examples
@@ -127,7 +136,7 @@ Most models support `tools` and can benefit from it. Vision-capable models shoul
     "display:description": "Local Llama 3.3 70B model via Ollama",
@@ -198,6 +207,7 @@ The provider automatically sets:
 - `ai:endpoint` to `https://api.openai.com/v1/chat/completions`
 - `ai:apitype` to `openai-chat` (or `openai-responses` for GPT-5+ models)
 - `ai:apitokensecretname` to `OPENAI_KEY` (store your OpenAI API key with this name)
+- `ai:capabilities` to `["tools", "images", "pdfs"]` (automatically determined based on model)
 
 For newer models like GPT-4.1 or GPT-5, the API type is automatically determined:
@@ -230,6 +240,40 @@ The provider automatically sets:
 - `ai:apitype` to `openai-chat`
 - `ai:apitokensecretname` to `OPENROUTER_KEY` (store your OpenRouter API key with this name)
 
+:::note
+For OpenRouter, you must manually specify `ai:capabilities` based on your model's features. Example:
+```json
+{
+  "openrouter-qwen": {
+    "display:name": "OpenRouter - Qwen",
+    "ai:provider": "openrouter",
+    "ai:model": "qwen/qwen-2.5-coder-32b-instruct",
+    "ai:capabilities": ["tools"]
+  }
+}
+```
+:::
+
+### Google AI (Gemini)
+
+[Google AI](https://ai.google.dev) provides the Gemini family of models. Using the `google` provider simplifies configuration:
+
+```json
+{
+  "google-gemini": {
+    "display:name": "Gemini 3 Pro",
+    "ai:provider": "google",
+    "ai:model": "gemini-3-pro-preview"
+  }
+}
+```
+
+The provider automatically sets:
+- `ai:endpoint` to `https://generativelanguage.googleapis.com/v1beta/models/{model}:streamGenerateContent`
+- `ai:apitype` to `google-gemini`
+- `ai:apitokensecretname` to `GOOGLE_AI_KEY` (store your Google AI API key with this name)
+- `ai:capabilities` to `["tools", "images", "pdfs"]` (automatically configured)
+
 ### Azure OpenAI (Modern API)
 
 For the modern Azure OpenAI API, use the `azure` provider:
@@ -250,6 +294,21 @@ The provider automatically sets:
 - `ai:apitype` based on the model
 - `ai:apitokensecretname` to `AZURE_OPENAI_KEY` (store your Azure OpenAI key with this name)
 
+:::note
+For Azure providers, you must manually specify `ai:capabilities` based on your model's features. Example:
+```json
+{
+  "azure-gpt4": {
+    "display:name": "Azure GPT-4",
+    "ai:provider": "azure",
+    "ai:model": "gpt-4",
+    "ai:azureresourcename": "your-resource-name",
+    "ai:capabilities": ["tools", "images"]
+  }
+}
+```
+:::
+
 ### Azure OpenAI (Legacy Deployment API)
 
 For legacy Azure deployments, use the `azure-legacy` provider:
@@ -267,6 +326,10 @@ For legacy Azure deployments, use the `azure-legacy` provider:
 
 The provider automatically constructs the full endpoint URL and sets the API version (defaults to `2025-04-01-preview`). You can override the API version with `ai:azureapiversion` if needed.
 
+:::note
+For Azure Legacy provider, you must manually specify `ai:capabilities` based on your model's features.
+:::
+
 ## Using Secrets for API Keys
 
 Instead of storing API keys directly in the configuration, you should use Wave's secret store to keep your credentials secure. Secrets are stored encrypted using your system's native keychain.
````