Feature/openwebui litellm deployment #342
Add a Kubernetes-native deployment of Open WebUI with a LiteLLM proxy for chatting with Claude models. This Phase 1 implementation provides a quick, dev-friendly deployment to a Kind cluster with minimal configuration.

Components:
- Base manifests (namespace, deployments, services, PVC, RBAC)
- LiteLLM proxy configured for Claude Sonnet 4.5, 3.7, and Haiku 3.5
- Open WebUI frontend with persistent storage
- Phase 1 overlay for Kind deployment with nginx-ingress
- Comprehensive documentation (README, Phase 1 guide, Phase 2 plan)
- Makefile for deployment automation

Architecture:
- Namespace: `openwebui` (isolated from ACP)
- Ingress: `vteam.local/chat` (reuses the Kind cluster from e2e)
- Auth: disabled in Phase 1 (dev/testing only)
- Storage: 500Mi PVC for chat history
- Images: `ghcr.io/berriai/litellm`, `ghcr.io/open-webui/open-webui`

Phase 2 (planned):
- OAuth authentication via oauth2-proxy
- Long-running Claude Code service for Amber integration
- Production hardening (secrets, RBAC, monitoring)
- OpenShift compatibility (Routes, SCC compliance)

Deployment:

```bash
cd components/open-webui-llm
# Edit overlays/phase1-kind/secrets.yaml with API key
make phase1-deploy
# Access: http://vteam.local:8080/chat (Podman) or /chat (Docker)
```

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
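For context, a LiteLLM proxy is typically configured through a `model_list` in a config file mounted into the pod. The sketch below illustrates the general shape only; the exact model identifiers, port, and secret wiring in this PR may differ:

```yaml
# litellm-config.yaml (illustrative sketch, not the PR's actual file)
model_list:
  - model_name: claude-sonnet-4-5
    litellm_params:
      model: anthropic/claude-sonnet-4-5
      # API key injected from a Kubernetes Secret via an env var
      api_key: os.environ/ANTHROPIC_API_KEY
  - model_name: claude-3-7-sonnet
    litellm_params:
      model: anthropic/claude-3-7-sonnet-latest
      api_key: os.environ/ANTHROPIC_API_KEY
  - model_name: claude-3-5-haiku
    litellm_params:
      model: anthropic/claude-3-5-haiku-latest
      api_key: os.environ/ANTHROPIC_API_KEY
```

Open WebUI then points at the proxy's OpenAI-compatible endpoint, so all three models appear in its model picker.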
- Increase memory limit from 512Mi to 2Gi to prevent OOMKilled crashes
- Increase CPU limit from 500m to 1000m for better performance
- Update health probe paths to LiteLLM-specific endpoints:
  - `/health/liveliness` for the liveness probe
  - `/health/readiness` for the readiness probe
- Increase resource requests for stability

Fixes the LiteLLM pod crash loop caused by insufficient memory allocation.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
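The changes above amount to a container spec roughly like the following. This is a hedged reconstruction: the container name, port, and probe timings are assumptions, but the limits and probe paths come from the commit message:

```yaml
# Sketch of the patched LiteLLM container spec (port/timings assumed)
containers:
  - name: litellm
    image: ghcr.io/berriai/litellm
    ports:
      - containerPort: 4000
    resources:
      requests:
        memory: 1Gi
        cpu: 500m
      limits:
        memory: 2Gi      # was 512Mi; OOMKilled under load
        cpu: 1000m       # was 500m
    livenessProbe:
      httpGet:
        path: /health/liveliness
        port: 4000
      initialDelaySeconds: 30
    readinessProbe:
      httpGet:
        path: /health/readiness
        port: 4000
      initialDelaySeconds: 15
```

Using LiteLLM's dedicated `/health/liveliness` and `/health/readiness` endpoints (rather than a generic path) avoids restarting the pod while it is still loading model configuration.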
**Claude Code Review**

**Summary**

This PR adds a Phase 1 deployment of Open WebUI + LiteLLM for chatting with Claude models via a Kind cluster. The implementation is well-structured with clear documentation and follows Kubernetes best practices for a development/testing environment. However, there are several critical security issues that must be addressed before this can be considered for any production use (even internal), and some architectural decisions that need discussion.

Key finding: this is explicitly a dev/testing prototype (Phase 1), with Phase 2 planned for production hardening. The security issues identified are acceptable for local dev but must not be deployed to shared/production environments.

**Issues by Severity**

🚫 **Blocker Issues**

None - this is acceptable for Phase 1 dev/testing as documented.

🔴 **Critical Issues**

1. Missing SecurityContext in Deployments
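A typical remedy for the missing-SecurityContext finding is to add a restricted security context to each container. This is a generic sketch of the usual hardening, not a diff from this PR:

```yaml
# Illustrative restricted securityContext for the Phase 2 hardening
securityContext:
  runAsNonRoot: true
  allowPrivilegeEscalation: false
  readOnlyRootFilesystem: true
  capabilities:
    drop: ["ALL"]
  seccompProfile:
    type: RuntimeDefault
```

Settings like these also line up with the OpenShift SCC compliance goal listed for Phase 2, since restricted SCCs reject pods that run as root or request extra capabilities.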
This is a prototype UX for using Open WebUI to interact with the new Amber codebase agent added in #337.