Feature/openwebui litellm deployment #342
Add a Kubernetes-native deployment of Open WebUI with a LiteLLM proxy for chatting with Claude models. This Phase 1 implementation provides a quick, dev-friendly deployment to a Kind cluster with minimal configuration.

Components:
- Base manifests (namespace, deployments, services, PVC, RBAC)
- LiteLLM proxy configured for Claude Sonnet 4.5, 3.7, and Haiku 3.5
- Open WebUI frontend with persistent storage
- Phase 1 overlay for Kind deployment with nginx-ingress
- Comprehensive documentation (README, Phase 1 guide, Phase 2 plan)
- Makefile for deployment automation

Architecture:
- Namespace: `openwebui` (isolated from ACP)
- Ingress: `vteam.local/chat` (reuses the Kind cluster from e2e)
- Auth: disabled in Phase 1 (dev/testing only)
- Storage: 500Mi PVC for chat history
- Images: `ghcr.io/berriai/litellm`, `ghcr.io/open-webui/open-webui`

Phase 2 (planned):
- OAuth authentication via oauth2-proxy
- Long-running Claude Code service for Amber integration
- Production hardening (secrets, RBAC, monitoring)
- OpenShift compatibility (Routes, SCC compliance)

Deployment:

```bash
cd components/open-webui-llm
# Edit overlays/phase1-kind/secrets.yaml with API key
make phase1-deploy
# Access: http://vteam.local:8080/chat (Podman) or /chat (Docker)
```

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
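For context, a LiteLLM proxy is typically configured through a `model_list` in a config file mounted into the pod. The sketch below illustrates the general shape only; the exact model identifiers, port, and secret wiring in this PR may differ:

```yaml
# litellm-config.yaml (illustrative sketch, not the PR's actual file)
model_list:
  - model_name: claude-sonnet-4-5
    litellm_params:
      model: anthropic/claude-sonnet-4-5
      # API key injected from a Kubernetes Secret via an env var
      api_key: os.environ/ANTHROPIC_API_KEY
  - model_name: claude-3-7-sonnet
    litellm_params:
      model: anthropic/claude-3-7-sonnet-latest
      api_key: os.environ/ANTHROPIC_API_KEY
  - model_name: claude-3-5-haiku
    litellm_params:
      model: anthropic/claude-3-5-haiku-latest
      api_key: os.environ/ANTHROPIC_API_KEY
```

Open WebUI then points at the proxy's OpenAI-compatible endpoint, so all three models appear in its model picker.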
- Increase memory limit from 512Mi to 2Gi to prevent OOMKilled crashes
- Increase CPU limit from 500m to 1000m for better performance
- Update health probe paths to LiteLLM-specific endpoints:
  - `/health/liveliness` for the liveness probe
  - `/health/readiness` for the readiness probe
- Increase resource requests for stability

Fixes the LiteLLM pod crash loop caused by insufficient memory allocation.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
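The changes above amount to a container spec roughly like the following. This is a hedged reconstruction: the container name, port, and probe timings are assumptions, but the limits and probe paths come from the commit message:

```yaml
# Sketch of the patched LiteLLM container spec (port/timings assumed)
containers:
  - name: litellm
    image: ghcr.io/berriai/litellm
    ports:
      - containerPort: 4000
    resources:
      requests:
        memory: 1Gi
        cpu: 500m
      limits:
        memory: 2Gi      # was 512Mi; OOMKilled under load
        cpu: 1000m       # was 500m
    livenessProbe:
      httpGet:
        path: /health/liveliness
        port: 4000
      initialDelaySeconds: 30
    readinessProbe:
      httpGet:
        path: /health/readiness
        port: 4000
      initialDelaySeconds: 15
```

Using LiteLLM's dedicated `/health/liveliness` and `/health/readiness` endpoints (rather than a generic path) avoids restarting the pod while it is still loading model configuration.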
**Claude Code Review**

**Summary**

This PR adds a Phase 1 deployment of Open WebUI + LiteLLM for chatting with Claude models via a Kind cluster. The implementation is well-structured with clear documentation and follows Kubernetes best practices for a development/testing environment. However, there are several critical security issues that must be addressed before this can be considered for any production use (even internal), and some architectural decisions that need discussion.

Key finding: this is explicitly a dev/testing prototype (Phase 1), with Phase 2 planned for production hardening. The security issues identified are acceptable for local dev but must not be deployed to shared/production environments.

**Issues by Severity**

🚫 **Blocker Issues**

None - this is acceptable for Phase 1 dev/testing as documented.

🔴 **Critical Issues**

1. Missing SecurityContext in Deployments
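A typical remedy for the missing-SecurityContext finding is to add a restricted security context to each container. This is a generic sketch of the usual hardening, not a diff from this PR:

```yaml
# Illustrative restricted securityContext for the Phase 2 hardening
securityContext:
  runAsNonRoot: true
  allowPrivilegeEscalation: false
  readOnlyRootFilesystem: true
  capabilities:
    drop: ["ALL"]
  seccompProfile:
    type: RuntimeDefault
```

Settings like these also line up with the OpenShift SCC compliance goal listed for Phase 2, since restricted SCCs reject pods that run as root or request extra capabilities.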
This is a prototype UX for using Open WebUI to interact with the new Amber codebase agent added in #337.