A powerful command-line interface for managing and interacting with the Inference Gateway. This CLI provides tools for configuration, monitoring, and management of inference services.
Early Development Stage: This project is in early development; expect breaking changes until it reaches a stable version.
Always pin a specific version tag when downloading binaries or using the install script.
- Features
- Installation
- Quick Start
- Commands
- Tools for LLMs
- Configuration
- Tool Approval System
- Shortcuts
- Global Flags
- Examples
- Development
- License
- Automatic Gateway Management: Automatically downloads and runs the Inference Gateway binary (no Docker required!)
- Zero-Configuration Setup: Start chatting immediately with just your API keys in a .env file
- Interactive Chat: Chat with models using an interactive interface
- Status Monitoring: Check gateway health and resource usage
- Conversation History: Store and retrieve past conversations with multiple storage backends
- Conversation Storage - Detailed storage backend documentation
- Conversation Title Generation - AI-powered title generation system
- Configuration Management: Manage gateway settings via YAML config
- Project Initialization: Set up local project configurations
- Tool Execution: LLMs can execute whitelisted commands and tools - See all tools →
- Tool Approval System: User approval workflow for sensitive operations with real-time diff visualization
- Agent Modes: Three operational modes for different workflows:
- Standard Mode (default): Normal operation with all configured tools and approval checks
- Plan Mode: Read-only mode for planning and analysis without execution
- Auto-Accept Mode: All tools auto-approved for rapid execution (YOLO mode)
- Toggle between modes with Shift+Tab
- Token Usage Tracking: Accurate token counting with polyfill support for providers that don't return usage metrics
- Inline History Auto-Completion: Smart command history suggestions with inline completion
- Customizable Keybindings: Fully configurable keyboard shortcuts for the chat interface
- Extensible Shortcuts System: Create custom commands with AI-powered snippets - Learn more →
- MCP Server Support: Direct integration with Model Context Protocol servers for extended tool capabilities - Learn more →
```bash
go install github.com/inference-gateway/cli@latest
```

This installs the binary as cli. To rename it to infer:

```bash
mv $(go env GOPATH)/bin/cli $(go env GOPATH)/bin/infer
```

Or use an alias:

```bash
alias infer="$(go env GOPATH)/bin/cli"
```

Alternatively, run both the gateway and the CLI with Docker:

```bash
# Create network and deploy the Inference Gateway first
docker network create inference-gateway
docker run -d --name inference-gateway --network inference-gateway \
  --env-file .env \
  ghcr.io/inference-gateway/inference-gateway:latest
```
```bash
# Pull and run the CLI
docker pull ghcr.io/inference-gateway/cli:latest
docker run -it --rm --network inference-gateway ghcr.io/inference-gateway/cli:latest chat
```

To use the install script instead:

```bash
# Latest version
curl -fsSL https://raw.githubusercontent.com/inference-gateway/cli/main/install.sh | bash

# Specific version
curl -fsSL https://raw.githubusercontent.com/inference-gateway/cli/main/install.sh | bash -s -- --version v0.77.0

# Custom installation directory
curl -fsSL https://raw.githubusercontent.com/inference-gateway/cli/main/install.sh | bash -s -- --install-dir $HOME/.local/bin
```

Download the latest release binary for your platform from the releases page.
Verify the binary (recommended for security):

```bash
# Download binary and checksums
curl -L -o infer-darwin-amd64 \
  https://github.com/inference-gateway/cli/releases/latest/download/infer-darwin-amd64
curl -L -o checksums.txt \
  https://github.com/inference-gateway/cli/releases/latest/download/checksums.txt

# Verify checksum
shasum -a 256 infer-darwin-amd64
grep infer-darwin-amd64 checksums.txt

# Install
chmod +x infer-darwin-amd64
sudo mv infer-darwin-amd64 /usr/local/bin/infer
```

For advanced verification with Cosign signatures, see the Binary Verification Guide.
```bash
git clone https://github.com/inference-gateway/cli.git
cd cli
go build -o infer cmd/infer/main.go
sudo mv infer /usr/local/bin/
```

- Initialize your project:

```bash
infer init
```

This creates a .infer/ directory with configuration and shortcuts.
- Set up your environment (create a .env file):

```bash
ANTHROPIC_API_KEY=your_key_here
OPENAI_API_KEY=your_key_here
DEEPSEEK_API_KEY=your_key_here
```

- Start chatting:

```bash
infer chat
```

Now that you're up and running, explore these guides:
- Commands Reference - Complete command documentation
- Tools Reference - Available tools for LLMs
- Configuration Guide - Full configuration options
- Shortcuts Guide - Custom shortcuts and AI-powered snippets
- A2A Agents - Agent-to-agent communication setup
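The .env file holds the API keys the gateway needs; if you also want those keys exported into your shell for other tools, a common POSIX-shell pattern (assuming simple KEY=value lines with no spaces or quotes) is:

```shell
# Export every KEY=value line in .env into the current environment.
# "set -a" marks variables defined afterwards for export;
# "set +a" turns that behavior back off.
set -a
. ./.env
set +a

# The keys are now visible to any program started from this shell.
echo "${ANTHROPIC_API_KEY:+key is set}"
```

This is a convenience sketch, not something the CLI requires; the gateway container already reads the file via `--env-file .env`.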
The CLI provides several commands for different workflows. For detailed documentation, see Commands Reference.
infer init - Initialize a new project with configuration and shortcuts

```bash
infer init              # Initialize project configuration
infer init --userspace  # Initialize user-level configuration
```

infer chat - Start an interactive chat session with model selection

```bash
infer chat
```

Features: Model selection, real-time streaming, scrollable history, three agent modes (Standard/Plan/Auto-Accept).
infer agent - Execute autonomous tasks in background mode
```bash
infer agent "Please fix the github issue 38"
infer agent --model "openai/gpt-4" "Implement feature from issue #42"
infer agent "Analyze this UI issue" --files screenshot.png
```

Features: Autonomous execution, multimodal support (images/files), parallel tool execution.
infer config - Manage CLI configuration settings
```bash
# Agent configuration
infer config agent set-model "deepseek/deepseek-chat"
infer config agent set-system "You are a helpful assistant"
infer config agent set-max-turns 100
infer config agent verbose-tools enable

# Tool management
infer config tools enable
infer config tools bash enable
infer config tools safety enable

# Export configuration
infer config export set-model "anthropic/claude-4.1-haiku"
```

See Commands Reference for all configuration options.
infer agents - Manage A2A (Agent-to-Agent) agent configurations
```bash
infer agents init                    # Initialize agents configuration
infer agents add browser-agent       # Add an agent from the registry with defaults
infer agents add custom https://...  # Add a custom agent
infer agents list                    # List all agents
```

For detailed A2A setup, see A2A Agents Configuration.
infer status - Check gateway health and resource usage

```bash
infer status
```

infer conversation-title - Manage AI-powered conversation titles

```bash
infer conversation-title generate  # Generate titles for all conversations
infer conversation-title status    # Show generation status
```

infer version - Display CLI version information

```bash
infer version
```

When tool execution is enabled, LLMs can use various tools to interact with your system. Below is a summary of the available tools. For detailed documentation, parameters, and examples, see Tools Reference.
| Tool | Purpose | Approval Required | Documentation |
|---|---|---|---|
| Bash | Execute whitelisted shell commands | Optional | Details |
| Read | Read file contents with line ranges | No | Details |
| Write | Write content to files | Yes | Details |
| Edit | Exact string replacements in files | Yes | Details |
| MultiEdit | Multiple atomic edits to files | Yes | Details |
| Delete | Delete files and directories | Yes | Details |
| Tree | Display directory structure | No | Details |
| Grep | Search files with regex (ripgrep/Go) | No | Details |
| WebSearch | Search the web (DuckDuckGo/Google) | No | Details |
| WebFetch | Fetch content from URLs | No | Details |
| Github | Interact with GitHub API | No | Details |
| TodoWrite | Create and manage task lists | No | Details |
| A2A_SubmitTask | Submit tasks to A2A agents | No | Details |
| A2A_QueryAgent | Query A2A agent capabilities | No | Details |
| A2A_QueryTask | Check A2A task status | No | Details |
| A2A_DownloadArtifacts | Download A2A task outputs | No | Details |
Tool Configuration:
Tools can be enabled/disabled and configured individually:
```bash
# Enable/disable specific tools
infer config tools bash enable
infer config tools write enable

# Configure tool settings
infer config tools grep set-backend ripgrep
infer config tools web-fetch add-domain "example.com"
```

See Tools Reference for complete documentation.
The CLI uses a two-layer configuration system (project and userspace config files) with environment variable support.
Create a minimal configuration:
```yaml
# .infer/config.yaml
gateway:
  url: http://localhost:8080
  docker: true # Use Docker mode (or false for binary mode)
tools:
  enabled: true
  bash:
    enabled: true
agent:
  model: "deepseek/deepseek-chat"
  max_turns: 50
chat:
  theme: tokyo-night
```

Configuration sources are applied in the following priority order:

- Environment Variables (INFER_*) - Highest priority
- Command Line Flags
- Project Config (.infer/config.yaml)
- Userspace Config (~/.infer/config.yaml)
- Built-in Defaults - Lowest priority
Example:

```bash
# Set via environment variable (highest priority)
export INFER_AGENT_MODEL="openai/gpt-4"

# Or via config file
infer config agent set-model "deepseek/deepseek-chat"

# Or via command flag
infer chat --model "anthropic/claude-4"
```

Key configuration options:

- gateway.url - Gateway URL (default: http://localhost:8080)
- gateway.docker - Use Docker mode vs binary mode (default: true)
- tools.enabled - Enable/disable all tools (default: true)
- agent.model - Default model for agent operations
- agent.max_turns - Maximum turns for agent sessions (default: 50)
- chat.theme - Chat interface theme (default: tokyo-night)
All configuration can be set via environment variables with the INFER_ prefix:
```bash
export INFER_GATEWAY_URL="http://localhost:8080"
export INFER_AGENT_MODEL="deepseek/deepseek-chat"
export INFER_TOOLS_BASH_ENABLED=true
export INFER_CHAT_THEME="tokyo-night"
```

Format: INFER_<PATH> where dots become underscores.
Example: agent.model → INFER_AGENT_MODEL
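The dot-to-underscore mapping is mechanical, so the variable name for any config path can be derived in the shell; a small sketch:

```shell
# Derive the INFER_* variable name for a config path:
# replace dots with underscores, then uppercase.
path="agent.model"
var="INFER_$(printf '%s' "$path" | tr '.' '_' | tr '[:lower:]' '[:upper:]')"
echo "$var"  # INFER_AGENT_MODEL
```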
For complete configuration documentation, including all options and environment variables, see Configuration Reference.
The CLI includes a comprehensive approval system for sensitive tool operations, providing security and visibility into what actions LLMs are taking.
When a tool requiring approval is executed:
- Validation: Tool arguments are validated
- Approval Prompt: User sees tool details with:
- Tool name and parameters
- Real-time diff preview (for file modifications)
- Approve/Reject/Auto-approve options
- Execution: Tool runs only if approved
| Tool | Requires Approval | Reason |
|---|---|---|
| Write | Yes | Creates/modifies files |
| Edit | Yes | Modifies file contents |
| MultiEdit | Yes | Multiple file modifications |
| Delete | Yes | Removes files/directories |
| Bash | Optional | Executes system commands |
| Read, Grep, Tree | No | Read-only operations |
| WebSearch, WebFetch | No | External read-only |
| A2A Tools | No | Agent delegation |
Configure approval requirements per tool:
```bash
# Enable/disable approval for specific tools
infer config tools safety enable  # Global approval
infer config tools bash enable    # Enable bash tool
```

Or via configuration file:

```yaml
tools:
  safety:
    require_approval: true # Global default
  write:
    require_approval: true
  bash:
    require_approval: false # Override for bash
```

During the approval prompt:

- y / Enter - Approve execution
- n / Esc - Reject execution
- a - Auto-approve (disables approval for session)
The CLI provides an extensible shortcuts system for quickly executing common commands with /shortcut-name syntax.
Core:
- /clear - Clear conversation history
- /exit - Exit chat session
- /help [shortcut] - Show available shortcuts
- /switch [model] - Switch to different model
- /theme [name] - Switch chat theme
- /compact - Compact conversation
- /export [format] - Export conversation
Git Shortcuts (created by infer init):
- /git-status - Show working tree status
- /git-commit - Generate AI commit message from staged changes
- /git-push - Push commits to remote
- /git-log - Show commit logs
SCM Shortcuts (GitHub integration):
- /scm-issues - List GitHub issues
- /scm-issue <number> - Show issue details
- /scm-pr-create [context] - Generate AI-powered PR plan
Create shortcuts that use LLMs to transform data:
````yaml
# .infer/shortcuts/custom-example.yaml
shortcuts:
  - name: analyze-diff
    description: "Analyze git diff with AI"
    command: bash
    args:
      - -c
      - |
        diff=$(git diff)
        jq -n --arg diff "$diff" '{diff: $diff}'
    snippet:
      prompt: |
        Analyze this diff and suggest improvements:
        ```diff
        {diff}
        ```
      template: |
        ## Analysis
        {llm}
````

Create custom shortcuts by adding YAML files to .infer/shortcuts/:
```yaml
# .infer/shortcuts/custom-dev.yaml
shortcuts:
  - name: tests
    description: "Run all tests"
    command: go
    args:
      - test
      - ./...
  - name: build
    description: "Build the project"
    command: go
    args:
      - build
      - -o
      - infer
      - .
```

Use with /tests or /build.
For complete shortcuts documentation, including advanced features and examples, see Shortcuts Guide.
- -v, --verbose: Enable verbose output
- --config <path>: Specify custom config file path
Basic usage:

```bash
# Initialize project
infer init

# Start interactive chat
infer chat

# Execute autonomous task
infer agent "Fix the bug in issue #42"

# Check gateway status
infer status
```

A typical issue-fixing workflow:

```bash
# Start chat
infer chat

# In chat, use shortcuts to get context
/scm-issue 123

# Discuss with AI, let it use tools to:
# - Read files
# - Search codebase
# - Make changes
# - Run tests

# Generate PR plan when ready
/scm-pr-create Fixes the authentication timeout issue
```

Configuration examples:

```bash
# Set default model
infer config agent set-model "deepseek/deepseek-chat"

# Enable bash tool
infer config tools bash enable

# Configure web search
infer config tools web-search enable

# Check current configuration
infer config show
```

For development, use Task for build automation:
```bash
task dev    # Format, build, and test
task build  # Build binary
task test   # Run tests
```

See CLAUDE.md for detailed development documentation.
MIT License - see LICENSE file for details.