Releases: basnijholt/agent-cli

v0.45.0

04 Jan 07:25
853ea9d

🎉 Gemini TTS Support

This release adds Google Gemini as a TTS provider. With this addition, Gemini is available as a provider for all three AI services (ASR, LLM, TTS).

✨ New Features

  • Gemini TTS provider - Use Gemini's native text-to-speech across all voice commands:
    agent-cli speak "Hello world" --tts-provider gemini
    agent-cli chat --tts-provider gemini
    agent-cli assistant --tts-provider gemini
    agent-cli voice-edit --tts-provider gemini
  • New CLI options: --tts-gemini-model and --tts-gemini-voice
  • Default model: gemini-2.5-flash-preview-tts
  • Available voices: Kore (default), Puck, Charon, Fenrir, and more

🔧 Improvements

  • Consolidated PCM-to-WAV conversion into a single reusable function
  • Documentation fixes for markdown formatting and emoji rendering
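
The consolidated helper itself is not shown in these notes. As a rough illustration of what PCM-to-WAV conversion involves, here is a minimal sketch built on Python's standard wave module. The function name pcm_to_wav and its defaults (24 kHz, mono, 16-bit) are assumptions for illustration, not the project's actual code.

```python
import io
import wave


def pcm_to_wav(
    pcm: bytes,
    sample_rate: int = 24000,  # assumed default, adjust to the provider's output
    channels: int = 1,
    sample_width: int = 2,  # bytes per sample (2 = 16-bit)
) -> bytes:
    """Wrap raw PCM samples in a WAV container and return the WAV bytes."""
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as wav_file:
        wav_file.setnchannels(channels)
        wav_file.setsampwidth(sample_width)
        wav_file.setframerate(sample_rate)
        wav_file.writeframes(pcm)
    # The wave module leaves a caller-supplied buffer open, so it can still be read.
    return buffer.getvalue()
```

Keeping one function like this means every provider that returns raw PCM can share the same header-writing logic.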

📚 Full Changelog

Full Changelog: v0.44.0...v0.45.0

v0.44.0

03 Jan 21:18
5d65ed6

What's New

✨ Features

  • Gemini ASR Support: Added Google Gemini as an ASR provider using native audio understanding capabilities (#177)
  • Documentation Site: New Zensical-powered documentation site with auto-generated option tables (#157, #159, #162)

📚 Documentation

  • Simplified Windows installation with native Windows support (#170)
  • Added iOS Shortcut Guide to documentation (#165)
  • Various improvements: GitHub-flavored admonitions, syntax highlighting, updated logo (#166, #169, #173-176)

🔧 Improvements

  • Fixed VAD tests on Windows to prevent torch hang (#156)
  • Cleaned up scripts directory and inlined zellij help text (#171, #172)
  • DRY refactoring in docs generation (#164)

Full Changelog: v0.43.0...v0.44.0

v0.43.0

02 Jan 20:34
455b45d

What's New

macOS Service Improvements

  • Ollama now runs as a brew service (#155): Instead of requiring a manually managed ollama serve process, Ollama now runs in the background via brew services. When it does, the Zellij dashboard shows the service status and control commands.

  • Whisper runs as a launchd service on Apple Silicon (#154): On ARM Macs, wyoming-mlx-whisper now runs as a native launchd service instead of a manual process, improving reliability and startup behavior.

Bug Fixes

  • Fixed Windows CI test hang (#152): Resolved an issue where sounddevice's Pa_Initialize could hang during tests on Windows CI.

v0.42.0

23 Dec 02:46
75cedd5

🎙️ New Feature: Continuous Transcription Daemon

This release adds transcribe-daemon, a background service that continuously captures audio and automatically transcribes speech segments using voice activity detection (VAD).

Features

  • Voice Activity Detection: Uses Silero VAD (neural network-based) for real-time speech/silence detection
  • Pre-speech Buffer: Captures 300ms of audio before speech is detected to avoid missing word beginnings
  • Automatic Segmentation: Segments audio based on configurable silence threshold (default 1s)
  • Optional LLM Processing: Cleans up and formats transcriptions using existing prompts
  • Audio Storage: Saves segments as MP3 files organized by date (~/.config/agent-cli/audio/YYYY/MM/DD/)
  • JSON Lines Logging: Logs transcriptions with timestamps, role, raw/processed text, audio file paths
  • Systemd Integration: Easy service installation for always-on transcription
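
To make the pre-speech buffer and silence-based segmentation concrete, here is a minimal sketch of that logic. It assumes frames arrive as (audio_chunk, is_speech) pairs from some VAD; the function name segment_speech, the 100 ms frame size, and the parameter names are hypothetical and not taken from the agent-cli source.

```python
from collections import deque
from typing import Iterable, Iterator, List, Tuple


def segment_speech(
    frames: Iterable[Tuple[bytes, bool]],
    frame_ms: int = 100,  # assumed duration of each frame
    pre_speech_ms: int = 300,
    silence_threshold_ms: int = 1000,
) -> Iterator[bytes]:
    """Yield one bytes object per detected speech segment."""
    # Ring buffer holding the most recent non-speech audio.
    pre_buffer = deque(maxlen=pre_speech_ms // frame_ms)
    segment: List[bytes] = []
    silence_ms = 0
    in_speech = False

    for chunk, is_speech in frames:
        if not in_speech:
            if is_speech:
                # Speech started: prepend buffered audio so word beginnings survive.
                segment = [*pre_buffer, chunk]
                pre_buffer.clear()
                in_speech = True
                silence_ms = 0
            else:
                pre_buffer.append(chunk)
            continue

        segment.append(chunk)
        silence_ms = 0 if is_speech else silence_ms + frame_ms
        if silence_ms >= silence_threshold_ms:
            # Enough trailing silence: close the segment.
            yield b"".join(segment)
            segment, in_speech, silence_ms = [], False, 0

    if in_speech:
        # Flush a segment that was still open when the stream ended.
        yield b"".join(segment)
```

A larger silence threshold merges short pauses into one segment, while a smaller one splits speech more aggressively.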

Installation

# Install with VAD dependency
pip install 'agent-cli[vad]'
# or
uv tool install 'agent-cli[vad]'

Usage

# Basic daemon
agent-cli transcribe-daemon

# With custom role and silence threshold
agent-cli transcribe-daemon --role meeting --silence-threshold 1.5

# With LLM cleanup
agent-cli transcribe-daemon --llm --role notes
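
For the date-organized audio storage and JSON Lines log listed under Features, the sketch below shows one way such a layout could be produced. The directory root comes from the notes above; the file name pattern, the JSON field names, and both function names are assumptions for illustration only.

```python
import json
from datetime import datetime
from pathlib import Path
from typing import Optional

# Root taken from the release notes; everything below it is built per date.
AUDIO_ROOT = Path.home() / ".config" / "agent-cli" / "audio"


def dated_audio_path(now: datetime, root: Path = AUDIO_ROOT) -> Path:
    """Return root/YYYY/MM/DD/<time>.mp3 (the file name pattern is hypothetical)."""
    return root / now.strftime("%Y/%m/%d") / now.strftime("%H%M%S.mp3")


def log_entry(
    now: datetime,
    role: str,
    raw: str,
    processed: Optional[str],
    audio_file: Path,
) -> str:
    """Serialize one transcription as a single JSON line (field names are assumed)."""
    record = {
        "timestamp": now.isoformat(),
        "role": role,
        "raw": raw,
        "processed": processed,
        "audio_file": str(audio_file),
    }
    return json.dumps(record, ensure_ascii=False)
```

Because each record is a single line, the log can be appended to safely and filtered with line-oriented tools.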

Full Changelog: v0.41.0...v0.42.0

v0.41.0

11 Dec 00:27
8ac0d9d

What’s Changed

  • feat(memory): use JSON mode for fact reconciliation (#146) @basnijholt
  • feat(memory): add list_all method to MemoryClient (#147) @basnijholt
  • feat(chroma): add batched upsert to prevent embedding service overload (#145) @basnijholt
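
As an illustration of the batched upsert idea, the sketch below splits a large upsert into fixed-size chunks so that no single call asks the embedding service to process everything at once. The helper name upsert_in_batches and the default batch size are assumptions; the upsert callable is expected to accept ids and documents keyword arguments, as a Chroma collection's upsert method does.

```python
from typing import Callable, Sequence


def upsert_in_batches(
    ids: Sequence[str],
    documents: Sequence[str],
    upsert: Callable[..., None],
    batch_size: int = 10,  # assumed value, tune to the embedding service
) -> int:
    """Call upsert once per batch and return the number of calls made."""
    if len(ids) != len(documents):
        raise ValueError("ids and documents must have the same length")
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")

    calls = 0
    for start in range(0, len(ids), batch_size):
        end = start + batch_size
        # Each call embeds at most batch_size documents.
        upsert(ids=list(ids[start:end]), documents=list(documents[start:end]))
        calls += 1
    return calls
```

With a real collection this would be called as upsert_in_batches(ids, docs, collection.upsert), assuming collection is a Chroma collection object.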

v0.40.0

10 Dec 08:08
d59b31e

What’s Changed

v0.39.1

10 Dec 07:04
a2077ca

What’s Changed

v0.39.0

10 Dec 05:04
61d9914

What’s Changed

v0.38.0

05 Dec 16:07
e18af87

What's Changed

  • feat(whisper): use MLX-based Whisper on macOS for Apple Silicon by @basnijholt in #123
  • feat: add stop file mechanism for graceful shutdown on Windows by @basnijholt in #127
  • refactor: simplify Windows process management with psutil by @basnijholt in #129
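
A stop file is a common way to request a graceful shutdown on platforms where POSIX-style signals are unreliable. The sketch below shows the general pattern, assuming the worker polls for a marker file between units of work; the function name run_until_stop_file and the polling interval are hypothetical and not taken from the agent-cli source.

```python
import time
from pathlib import Path
from typing import Callable


def run_until_stop_file(
    stop_file: Path,
    work: Callable[[], None],
    poll_interval: float = 0.1,  # seconds between checks (assumed value)
) -> int:
    """Run work() repeatedly until stop_file appears; return the iteration count."""
    iterations = 0
    while not stop_file.exists():
        work()
        iterations += 1
        time.sleep(poll_interval)
    # Remove the marker so the next run does not stop immediately.
    stop_file.unlink(missing_ok=True)
    return iterations
```

Another process requests shutdown simply by creating the file, for example with Path.touch(), and the worker exits after finishing its current unit of work.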

Full Changelog: v0.37.1...v0.38.0

v0.37.1

04 Dec 21:34
b980e68

What’s Changed

  • fix(docs): correct AHK v2 script and use SIGINT (#125) @basnijholt
  • fix(docs): correct AutoHotkey v2 syntax in Windows installation guide (#124) @basnijholt