Skip to content

Conversation

@thebtf
Copy link

@thebtf thebtf commented Jan 14, 2026

Summary

  • Aider/LiteLLM Compatibility: Automatically retry requests when the upstream rejects an unsupported parameter (e.g. temperature), preventing hard failures like Unsupported parameter: temperature.

Changes

  • Bumped version to 1.4.10
  • Updated CHANGELOG.md with fix description

Test plan

  • Test with Aider sending temperature parameter
  • Verify retry logic strips unsupported parameter and retries successfully
  • Confirm no regression in normal operation

🤖 Generated with Claude Code

claude and others added 30 commits November 17, 2025 20:45
- Add gpt-5.1 model name normalization mappings in upstream.py
- Include gpt-5.1 and its reasoning variants in OpenAI models endpoint
- Include gpt-5.1 and its reasoning variants in Ollama models endpoint
- Support gpt5.1, gpt-5.1, and gpt-5.1-latest aliases
…oJYExKDyq2P5ZMnG

Add support for gpt-5.1 models
- Add PUID and PGID environment variables to Dockerfile for running container with different user credentials
- Install su-exec for proper user switching in container
- Update entrypoint.sh to handle dynamic user/group ID assignment
- Update .env.example with PUID/PGID configuration
- Update DOCKER.md with comprehensive PUID/PGID documentation
- Add gpt-5.1 model to README.md supported models list
- Create CHANGELOG.md to track project changes
- Create CLAUDE.md with comprehensive project overview and documentation

This allows users to avoid permission issues with Docker volumes by matching
container user IDs with host user IDs.
- Add GitHub Actions workflow for automated Docker image builds
- Publish multi-architecture images (amd64, arm64) to ghcr.io
- Create docker-compose.registry.yml for using pre-built images
- Update DOCKER.md with pre-built image usage instructions
- Update CHANGELOG.md with container registry features
- Configure automated builds on push to main and version tags
- Add metadata and labels for better image management

Images are now available at: ghcr.io/raybytes/chatmock:latest
- Update GitHub Actions workflow to publish to ghcr.io/thebtf/chatmock
- Update docker-compose.registry.yml to use thebtf images
- Update docker-compose.yml comments with correct registry path
- Update CHANGELOG.md with correct image location

All Docker images will now be published to and pulled from the fork's
container registry at ghcr.io/thebtf/chatmock:latest
Add notice at the top of README clarifying that this is a personal fork
and directing users to the original repository for feature requests,
bug reports, and general support.
…Sh6tW8vp4Q8LNND

Claude/update docs docker 01 qptso9 t sh6t w8vp4 q8 lnnd
- Add MANUAL_BUILD.md with detailed instructions for manual Docker builds
- Add build-and-push.sh script for easy multi-arch image publishing
- Add scripts/README.md with quick start guide
- Support for multi-architecture builds (linux/amd64, linux/arm64)
- Include troubleshooting section for common issues

These tools allow manual publishing to GitHub Container Registry
when needed, complementing the automated GitHub Actions workflow.
su-exec is not available in Debian repositories, causing build failures.
Replaced with gosu which is available in official Debian repos and provides
the same functionality for running processes as a different user.

Changes:
- Dockerfile: Install gosu instead of su-exec
- entrypoint.sh: Use gosu instead of su-exec

This fixes the build error: "apt-get install su-exec" exit code 100
…Sh6tW8vp4Q8LNND

Claude/update docs docker 01 qptso9 t sh6t w8vp4 q8 lnnd
…cture documentation

- Add linux/arm/v7 to supported platforms for 32-bit ARM devices
- Support Raspberry Pi 2/3 (32-bit OS), BeagleBone, and other ARM v7 devices
- Update GitHub Actions workflow to build for arm/v7
- Update build script with new platform
- Create ARCHITECTURES.md with detailed platform documentation
- Update CHANGELOG and PR description

Now building for:
- linux/amd64 (Intel/AMD 64-bit)
- linux/arm64 (ARM 64-bit)
- linux/arm/v7 (ARM 32-bit v7) - NEW
Expand multi-architecture support to 5 platforms:
- linux/amd64 (Intel/AMD 64-bit)
- linux/arm64 (ARM 64-bit)
- linux/arm/v7 (ARM 32-bit v7)
- linux/arm/v6 (ARM 32-bit v6) - NEW
- linux/386 (Intel/AMD 32-bit) - NEW

New device support:
- Raspberry Pi Zero, Zero W
- Raspberry Pi 1 (all models)
- Legacy 32-bit x86 systems
- Older embedded systems

Changes:
- Update GitHub Actions workflow to build for all 5 architectures
- Update build script with new platforms
- Comprehensive ARCHITECTURES.md documentation updates
- Update CHANGELOG and PR description

This provides comprehensive coverage for virtually all devices
from legacy systems to modern hardware.
…Sh6tW8vp4Q8LNND

Claude/update docs docker 01 qptso9 t sh6t w8vp4 q8 lnnd
…tegration

This major update transforms ChatMock into a production-ready deployment with significant performance improvements and new features.

## 🚀 Performance Improvements

### High-Performance Web Server
- Replace Flask development server with Gunicorn + gevent workers
- 3-5x performance increase (200-500+ RPS vs 50 RPS)
- Support for 1000+ concurrent connections
- Configurable worker processes via GUNICORN_WORKERS env var
- Graceful worker restarts and health monitoring
- Production-ready WSGI server configuration

## 🎨 New WebUI Dashboard

### Features
- Real-time usage statistics and analytics
- Visual rate limit monitoring with progress bars
- Interactive charts showing requests by model
- Complete model browser with capabilities
- Runtime configuration management
- OAuth authentication status display

### API Endpoints
- GET /api/status - Authentication and user info
- GET /api/stats - Usage statistics and rate limits
- GET /api/models - Available models with details
- GET /api/config - Current configuration
- POST /api/config - Update runtime configuration
- GET /api/login-url - OAuth login information

### Access
- Local: http://localhost:8000/webui
- Production: https://your-domain.com/webui

## 🔒 Traefik Integration

### docker-compose.traefik.yml
- Automatic HTTPS with Let's Encrypt
- HTTP to HTTPS redirect
- CORS middleware configuration
- Health check integration
- Load balancing support
- Production-ready labels

### Features
- Automatic SSL certificate management
- Reverse proxy configuration
- Custom middleware support
- Network isolation
- Service discovery

## 📝 Configuration

### Enhanced .env.example
- Comprehensive configuration documentation
- Gunicorn worker configuration
- Traefik-specific settings
- Domain and ACME email configuration
- All feature toggles documented

### New Options
- USE_GUNICORN: Enable/disable Gunicorn (default: 1)
- GUNICORN_WORKERS: Number of worker processes
- CHATMOCK_DOMAIN: Domain for Traefik
- TRAEFIK_NETWORK: Traefik network name
- TRAEFIK_ACME_EMAIL: Let's Encrypt email

## 📚 Documentation

### New Guides
- docs/WEBUI.md - Complete WebUI documentation
- docs/PRODUCTION.md - Production deployment guide
- docs/TRAEFIK.md - Traefik integration guide
- docs/README.md - Documentation index

### Topics Covered
- Performance tuning and optimization
- Scaling strategies (vertical and horizontal)
- Monitoring and logging
- Security best practices
- High availability setup
- Troubleshooting guides
- Benchmark results

## 🔧 Technical Changes

### Backend
- Add chatmock/routes_webui.py with WebUI routes
- Integrate WebUI blueprint in app.py
- Add statistics tracking with JSON file storage
- Implement runtime configuration API

### Frontend
- Single-page application with embedded CSS/JS
- No build process required
- Auto-refresh every 30 seconds
- Responsive design
- Modern UI with progress bars and charts

### Infrastructure
- gunicorn.conf.py with optimal production settings
- Updated entrypoint.sh with Gunicorn integration
- Fallback to Flask dev server if USE_GUNICORN=0
- Support for custom Gunicorn configuration

### Dependencies
- Add gunicorn==23.0.0
- Add gevent==24.11.1

### Docker
- Enhanced docker-compose.yml with restart policy
- New docker-compose.traefik.yml for production
- Health check improvements
- Network configuration for Traefik

## 📊 Performance Benchmarks

Test results (4 CPU cores, 8GB RAM):
- Flask Dev: 50 RPS, 100ms avg latency
- Gunicorn (4 workers): 200 RPS, 80ms avg latency
- Gunicorn (8 workers): 350 RPS, 60ms avg latency
- Gunicorn (16 workers): 500 RPS, 50ms avg latency

## 🎯 Use Cases

1. Development: Local testing with improved performance
2. Production: Traefik + HTTPS deployment
3. High Availability: Horizontal scaling with load balancing
4. Monitoring: Real-time dashboard for usage tracking
5. Configuration: Dynamic settings via WebUI

## 🔄 Migration Guide

Existing deployments:
1. Pull latest changes
2. Update .env from .env.example
3. Rebuild: docker-compose build
4. Restart: docker-compose up -d
5. Access WebUI: http://localhost:8000/webui

New Traefik deployment:
1. Configure domain in .env
2. Deploy: docker-compose -f docker-compose.traefik.yml up -d
3. Access: https://your-domain.com/webui

## ✨ Highlights

- Production-ready deployment out of the box
- Significant performance improvements
- Modern web dashboard for monitoring
- Automatic HTTPS with Traefik
- Comprehensive documentation
- Scalable architecture
- Zero downtime updates
- Battle-tested components

Closes #<issue_number_if_any>
…v9EzgDsZSRKZTaGWiF

feat: Add production-ready features - Gunicorn, WebUI, and Traefik in…
…ensive documentation links

- Add 'What's New' section highlighting performance, WebUI, and Traefik
- Update Docker quickstart with WebUI access instructions
- Add comprehensive Web Dashboard section with features and API endpoints
- Add Performance benchmarks table comparing different configurations
- Expand Configuration section with three methods: env vars, WebUI, and CLI
- Add detailed configuration options for server, reasoning, and features
- Add Deployment Options section comparing Python, Docker, Traefik, and Kubernetes
- Add Documentation section with links to all guides
- Add Troubleshooting section for common issues
- Update What's supported list with new features
- Add links to new documentation throughout

All sections now include links to:
- docs/README.md (Documentation Index)
- docs/WEBUI.md (WebUI Guide)
- docs/PRODUCTION.md (Production Deployment)
- docs/TRAEFIK.md (Traefik Integration)
- .env.example (Configuration Reference)
…v9EzgDsZSRKZTaGWiF

docs: Update README with WebUI, performance improvements, and compreh…
Add comprehensive automation for building and releasing macOS applications:

Features:
- GitHub Actions workflow for automated macOS DMG builds
- Automatic GitHub Release creation on version tags
- DMG installers automatically attached to releases
- Complete build documentation in BUILD.md
- Build dependencies specification (requirements-build.txt)

Workflow:
- Triggers on version tags (v*.*.*)
- Builds macOS .app bundle with PyInstaller
- Creates DMG installer with Applications symlink
- Uploads DMG as GitHub Release asset
- Generates release notes automatically

Benefits:
- No manual building required
- Consistent release process
- Professional DMG installers
- One-command release: just push a tag!

This complements Docker image automation, providing complete
release automation for both containerized and native deployments.
…Sh6tW8vp4Q8LNND

feat: Add automated macOS application builds and GitHub Releases
Fixed package versions that were causing build failures:
- certifi: 2025.8.3 -> 2024.8.30 (future version doesn't exist)
- urllib3: 2.5.0 -> 2.2.3 (invalid version)
- flask: 3.1.1 -> 3.0.3 (stable version)
- blinker: 1.9.0 -> 1.8.2
- click: 8.2.1 -> 8.1.7
- jinja2: 3.1.6 -> 3.1.4
- markupsafe: 3.0.2 -> 2.1.5
- werkzeug: 3.1.3 -> 3.0.4
- requests: 2.32.5 -> 2.32.3

All versions are now compatible and available in PyPI.
This fixes Docker build error: 'pip install failed with exit code 1'
Added new dependencies from main:
- gunicorn==22.0.0 (was 23.0.0 - invalid version)
- gevent==24.2.1 (was 24.11.1 - invalid version)

All package versions are now valid and available in PyPI.
This resolves the merge conflict with main branch.
Used fixed package versions from our branch:
- All versions are valid and available in PyPI
- Includes both new dependencies (gunicorn, gevent) from main
- All versions are corrected to exist (not from future)

This resolves the merge conflict with main branch.
…Sh6tW8vp4Q8LNND

fix: Update requirements.txt with valid package versions
Documentation Changes:
- Move all documentation to docs/ directory for better organization
- Keep only README.md and CLAUDE.md in root
- Create docs/README.md with comprehensive documentation index
- Update all internal links to point to docs/ directory

Files moved to docs/:
- CHANGELOG.md
- BUILD.md
- MANUAL_BUILD.md
- ARCHITECTURES.md
- DOCKER.md
- CONTRIBUTING.md
- RELEASE_v1.4.0.md
- CREATE_PR_STEPS.md
- PR_DESCRIPTION.md

Requirements.txt fix:
- Replace exact versions with flexible version ranges
- Use >= and < constraints for compatibility
- Allows pip to find compatible versions in PyPI
- Fixes Docker build error: 'pip install failed with exit code 1'

Benefits:
- Cleaner repository structure
- Easier to navigate documentation
- Better separation of concerns
- Resolves package installation issues
…Sh6tW8vp4Q8LNND

refactor: Reorganize documentation and fix requirements.txt
Added gcc, g++, make, and development headers to support compiling
Python packages (especially gevent) on all architectures including
linux/386, linux/arm/v6, etc.

This fixes the Docker build error:
'pip subprocess to install build dependencies did not run successfully'

Build dependencies added:
- gcc, g++, make (compilers)
- libffi-dev (for cffi packages)
- libssl-dev (for cryptography)
- python3-dev (Python headers)

Also upgraded pip before installing requirements to use latest pip.
…TSh6tW8vp4Q8LNND

fix: Add build dependencies to Dockerfile for package compilation
Changed login check from 'docker info' to checking ~/.docker/config.json
which correctly detects ghcr.io authentication.
Kirill Turanskiy and others added 30 commits December 17, 2025 13:39
GPT-5.2 has strict instruction validation and only accepts whitelisted
formats (like BASE_INSTRUCTIONS from Codex CLI). This makes it the
default behavior without needing debug probing requests.

For GPT-5.2:
- Always use BASE_INSTRUCTIONS as the instructions parameter
- Convert client system prompt to user message with [System Context] prefix
- Prepend to input_items

Other models continue to work as before.
When using GPT-5.2 with BASE_INSTRUCTIONS, add environment clarification
to help the model understand it's running in Cursor IDE rather than
standalone Codex CLI terminal. This guides the model to prefer IDE's
built-in tools (Read, Edit, Write, Bash, etc.) for standard operations.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
Changed role from "user" to "developer" for client system prompts
when using GPT-5.2. According to OpenAI model spec, developer role
has higher authority than user role, which should help the model
better follow IDE instructions.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
- Downloaded official prompts from OpenAI Codex repo:
  - gpt_5_1_prompt.md (28 KB)
  - gpt_5_2_prompt.md (26 KB)
  - gpt_5_codex_prompt.md (11 KB)
  - gpt_5_1_codex_max_prompt.md (12 KB)

- Added get_instructions_for_model() to select correct prompt per model
- Changed GPT-5.2 handling: concatenate client prompt to instructions
  instead of using developer message (matches official Codex behavior)
- Added PROJECT_DOC_SEPARATOR for consistent concatenation

This should improve caching (entire instructions field cached as prefix)
and better follow official Codex patterns.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
Avoid confusion with actual project documentation (AGENTS.md).

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
The client's system prompt should describe its own environment.
Simple concatenation: model_instructions + separator + client_prompt

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
After investigating official Codex source code, found that:
- instructions field: only model-specific base instructions (validated)
- developer_instructions: sent as separate developer message
- user_instructions (AGENTS.md): sent as separate user message

Concatenation to instructions field causes "Instructions are not valid" error.
Now client system prompt goes as role: "developer" message in input array.

Also fixed retry logic to use final_instructions instead of hardcoded BASE_INSTRUCTIONS.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
- Add dump_upstream() function for logging full upstream payloads
- Log upstream_request before sending to ChatGPT (includes input_items)
- Log upstream_response after receiving from ChatGPT (text, tool_calls)
- Enabled via DEBUG_LOG=1 environment variable
- Creates debug_chat_completions_upstream_*.json files

Helps debug GPT-5.2 looping issues by showing full conversation context.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
- New _wrap_stream_file_logging() wrapper captures streaming output
- Logs: full text, tool calls, finish reasons
- Writes to debug_chat_completions_upstream_response.json
- Enabled via DEBUG_LOG=true (same as other debug logging)
- Helps debug GPT-5.2 looping issue in Cursor

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
Previously, convert_tools_chat_to_responses() only handled type: "function"
tools and silently dropped type: "custom" tools like apply_patch.

This caused GPT-5.2 to not know about apply_patch, leading to infinite
loops where the model kept preparing but couldn't execute file edits.

Changes:
- Added handling for type: "custom" tools (pass through as-is)
- apply_patch with Lark grammar format now sent to GPT-5.2

Debug evidence: raw_tools_count was 98, converted was 97 (1 dropped)

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
GPT-5.2/Responses API only understands type: "function" tools.
type: "custom" tools (like apply_patch with Lark grammar) were being
passed through but ignored by the model.

Now custom tools are converted to function format with a single
"content" string parameter. The description already contains the
format instructions (V4A diff format, etc).

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
For custom tools like apply_patch with Lark grammar:
1. Extract "content" field from GPT-5.2 response (line 761-762)
2. Pass raw string directly without JSON wrapping (line 785-787)

Cursor expects the patch content as raw V4A diff string,
not wrapped in {"content": "..."} JSON.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
Instead of converting type:"custom" to type:"function" with JSON-wrapped
arguments, now passes custom tools through as-is to the Responses API.

Changes:
- convert_tools_chat_to_responses: Pass type:"custom" tools unchanged
- chat_completions: Handle custom_tool_call response items (raw input)
- sse_translate_chat: Handle custom_tool_call in streaming responses
- Remove hardcoded apply_patch workarounds

Custom tools (like apply_patch with Lark grammar) now work correctly:
- Upstream receives type:"custom" with grammar definition
- Response returns custom_tool_call with raw "input" string
- Client receives raw content without JSON wrapping

Ref: https://platform.openai.com/docs/guides/tools#custom-tools

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
For function_call items (not web_search_call), raw string arguments should be
passed through as-is, not wrapped in {"query": ...}. This handles the case
where ChatGPT returns a custom tool response as function_call type instead of
custom_tool_call type.
Instead of sending full arguments in one chunk, stream them in ~100 char
pieces to match OpenAI's streaming format. This might help Cursor properly
track changes from apply_patch tool calls.
OpenAI's streaming format includes role and content fields in the first
delta chunk for tool calls. Added these fields to match the spec.
…sing

This will show if tool results are being accepted or skipped due to
missing function_call in seen_function_call_ids.
- Log all incoming messages with role/type/call_id
- Log passthrough processing for function_call and function_call_output
- This will help identify why tool results might be missing
Root cause of model looping issue found:
- Cursor sends custom_tool_call and custom_tool_call_output items
- These were not in _responses_api_types set
- Result: tool call history was dropped, model never saw results
- Model kept retrying apply_patch → looping

Fix: Added custom_tool_call and custom_tool_call_output to passthrough.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
All [CONVERT], [PASSTHROUGH], [TOOL_RESULT], [CHATMOCK], [TOOL_CALL]
logging now requires DEBUG_LOG=1 (or CHATGPT_LOCAL_DEBUG).

- Added _is_debug_log() function to check env variables
- Wrapped all debug print statements with condition
- Cleaned up unnecessary try/except around simple prints
- [STREAM] logs remain controlled by CHATMOCK_DEBUG_STREAM

This prevents noisy logs in production while keeping debug
capability available when needed.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
Add NDJSON instrumentation around request ingress and upstream payload assembly to diagnose unsupported parameter errors (e.g. temperature) without logging secrets.
ChatGPT upstream rejects temperature for gpt-5.2; drop it before sending and retry once when upstream reports 'Unsupported parameter: X' so clients like aider/litellm keep working.
Do not write NDJSON logs into CHATGPT_LOCAL_HOME; instead emit a compact stdout marker when blocked params (like temperature for gpt-5.2) are dropped so remote deployments can provide runtime evidence.
Ensure temperature is removed from extra_fields for gpt-5.2 before request dumps and upstream forwarding, matching upstream constraints and making behavior visible in existing debug_chat_completions.json logs.
Remove agentlog-based instrumentation and rely on existing debug_* dumps for runtime evidence. Keep compatibility behavior (retry on unsupported param) without introducing a separate logging system.
- Resolve conflicts keeping our get_model_ids() architecture
- Add gpt-5.2-codex to AVAILABLE_MODELS in config.py
- Keep structured documentation in README.md
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants