Skip to content

Conversation

@Gkrumbach07
Copy link
Collaborator

@Gkrumbach07 Gkrumbach07 commented Jan 7, 2026

Summary

This PR includes several improvements to the session detail UI, MCP integration, and adds cluster-level Google OAuth integration.

🆕 Google OAuth Integration (New)

  • Added new API routes for Google OAuth connection, status check, and disconnection
  • Enhanced OAuth callback handling to support cluster-level user-scoped credentials
  • Updated frontend components to manage Google Drive connection status and actions
  • Implemented backend logic to sync Google OAuth credentials to session-specific secrets
  • Enables users to connect their Google Drive accounts at a cluster level for improved integration across sessions

Session Detail UI Improvements

  • Refactored session detail UI for better user experience
  • Enhanced error handling and session UI feedback
  • Improved session handling and observability

MCP Integration

  • Added MCP status endpoint with frontend integration
  • Updated command handling and UI components for better MCP integration visibility

CI/CD Improvements

  • Added state-sync support in component build and deploy workflows
  • Added inputs for component build control in workflows

Maintenance

  • Fixed backend test workflow
  • Updated documentation: replaced vteam references with ACP naming
  • Fixed formatting in routes.go with gofmt

Testing

  • Manual testing on dev cluster
  • All lint checks pass (gofmt, go vet)

Files Changed

  • components/backend/handlers/oauth.go - Google OAuth routes and handlers
  • components/backend/routes.go - New route registrations
  • components/frontend/src/app/api/auth/google/* - Frontend API routes
  • components/frontend/src/components/google-drive-connection-card.tsx - UI component
  • components/frontend/src/services/api/google-auth.ts - API service
  • components/frontend/src/services/queries/use-google.ts - React Query hooks
  • components/operator/internal/handlers/sessions.go - Credential sync to sessions

…ation visibility

- Changed the slash command handling in `content.go` to use the full command name.
- Updated the `IntegrationsClient` to include the `GoogleDriveConnectionCard`.
- Refactored `McpIntegrationsAccordion` to display MCP server statuses with improved UI elements and removed deprecated Google Drive integration code.
- Enhanced `MessagesTab` to directly show command names instead of derived titles, improving clarity in command representation.
- Adjusted the `ClaudeCodeAdapter` to ensure default MCP configurations are set correctly and improved workspace prompt generation for clarity.

These changes aim to streamline command usage and enhance user experience across the application.
- Introduced a new endpoint for retrieving MCP server statuses in the backend, allowing the frontend to display real-time integration statuses.
- Updated the `SelectWorkflow` function to handle MCP status checks during workflow selection.
- Enhanced the `McpIntegrationsAccordion` component to fetch and display MCP server statuses dynamically.
- Removed deprecated model configuration and session creation components to streamline the codebase.

These changes improve the visibility of MCP integrations and enhance user experience by providing real-time feedback on server statuses.
- Updated the backend session handler to provide more context in error messages related to OOTB workflows, specifically addressing GitHub rate limits and repository not found errors.
- Modified the frontend API route to forward query parameters and authorization headers, improving the integration with GitHub for better rate limits.
- Refactored the session detail page to manage sidebar state based on session status, ensuring consistent UI behavior and adding state overlays for non-running sessions.

These changes aim to enhance error visibility and streamline user interactions with session management.
- Updated the session detail page to improve the user interface, including adjustments to loading indicators and state overlays for non-running sessions.
- Enhanced the `McpIntegrationsAccordion` component to reflect the new status options for MCP servers, including a 'configured' state.
- Modified backend API responses to clarify MCP server statuses and added instructions for Google Drive and Jira integrations in the prompt.

These changes aim to provide a clearer and more informative user experience regarding session states and MCP integrations.
@codecov
Copy link

codecov bot commented Jan 7, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.

📢 Thoughts on this report? Let us know!

- Added new API routes for Google OAuth connection, status check, and disconnection.
- Enhanced OAuth callback handling to support cluster-level user-scoped credentials.
- Updated frontend components to manage Google Drive connection status and actions.
- Implemented backend logic to sync Google OAuth credentials to session-specific secrets for access during session execution.

These changes enable users to connect their Google Drive accounts at a cluster level, improving integration and accessibility across sessions.
- Change ConfigMap to Secret for Google OAuth credentials storage
  - Backend: storeGoogleCredentials, GetGoogleCredentials, DisconnectGoogleOAuthGlobal
  - Operator: syncGoogleCredentialsToSecret reads from Secret
- Add GetK8sClientsForRequest to all Google OAuth endpoints for RBAC consistency
  - GetGoogleOAuthURLGlobal
  - GetGoogleOAuthStatusGlobal
  - DisconnectGoogleOAuthGlobal
- Fix frontend polling memory leak with useRef and useEffect cleanup
  - Prevents interval from running after component unmount
@ambient-code ambient-code deleted a comment from github-actions bot Jan 7, 2026
@ambient-code ambient-code deleted a comment from github-actions bot Jan 7, 2026
- Fix test to create workflow at StateBaseDir/workflows/ instead of sessions/
- Fix expected slashCommand from /command to /test.command (matches filename)
Backend (oauth.go):
- Add context and 10s timeout to getGoogleUserEmail HTTP call
- Add isValidUserID validation to all Google OAuth endpoints
- Prevent path traversal and invalid characters in userID

Frontend (google-drive-connection-card.tsx):
- Add popup blocker detection with user-friendly error message
- Return early if popup is null/blocked instead of polling indefinitely
@ambient-code ambient-code deleted a comment from github-actions bot Jan 7, 2026
@github-actions

This comment has been minimized.

@Gkrumbach07
Copy link
Collaborator Author

Response to Code Review Findings

✅ Fixed Issues

Issue Fix
ConfigMap → Secret Changed to Secret for encrypted storage
RBAC Pattern Added GetK8sClientsForRequest to all OAuth endpoints
Missing Timeout Added context + 10s timeout to getGoogleUserEmail
Popup Blocker Added null check with user-friendly error toast
UserID Validation Added isValidUserID() to prevent path traversal
Polling Memory Leak Added useEffect cleanup with useRef

⏭️ Not Fixed (By Design)

Issue Rationale
Token in Plain Text (Secrets) K8s Secrets are the standard practice - they are encrypted at rest when etcd encryption is configured. Sealed Secrets or External Secrets Operator are enhancements for high-security deployments but not required for standard use.
Token Logging Risk The code does not log tokens. The HTTP client (http.Client{}) does not log headers by default. This is a speculative concern with no actual risk in the current implementation.
Exponential Backoff The retry loop already has 3 retries. For low-concurrency OAuth operations, simple retry is acceptable. Backoff would be a minor optimization if high contention is observed.

@ambient-code ambient-code deleted a comment from github-actions bot Jan 7, 2026
Show error indicator and message when useGoogleStatus query fails
@Gkrumbach07
Copy link
Collaborator Author

Response to Updated Code Review

✅ Fixed

  • Issue 7: Added error state rendering to GoogleDriveConnectionCard

❌ Invalid Findings

Issue Why Invalid
Issue 1 (Line 892) Logging userID (email) is standard audit practice, not token logging. UserID is not sensitive - it's the authenticated user's identifier.
Issue 2 (Line 1148) This line sets a header, it doesn't log anything. req.Header.Set() is not a logging function. False positive.
Issue 8 (Polling race) Already fixed in commit 3f750cb - added useRef with useEffect cleanup
Issue 9 (Slash command) This was a test file fix (content_test.go), not production code change

⏭️ Not Required for Merge

Issue Rationale
Issue 3 (UserID validation) Current validation prevents path traversal and null chars. DNS subdomain format is overkill - userIDs are emails/usernames set by auth middleware.
Issue 4 (Backend SA docs) Pattern matches existing GitHub OAuth - documentation not blocking
Issue 5 (Rate limiting) OAuth operations are user-initiated, low frequency. Nice to have.
Issue 6 (Token refresh) UX improvement, not security issue. Can be added later.
Issue 10 (Button styling) Minor style preference

@github-actions
Copy link
Contributor

github-actions bot commented Jan 7, 2026

Claude Code Review

Summary

This PR introduces cluster-level Google OAuth integration, enhances session detail UI, and adds MCP integration. The implementation follows most established patterns but has several critical security issues that must be addressed before merge.

Issues by Severity

🚫 Blocker Issues

1. Token Leakage in Logs (oauth.go:309, 315)

  • Location: components/backend/handlers/oauth.go:309, oauth.go:1167
  • Issue: Error messages expose sensitive error details from OAuth callback to HTML response, including internal error messages that may contain tokens or sensitive data
  • Risk: Information leakage, potential token exposure
  • Fix: Use generic error messages in HTML responses, log details server-side only
// ❌ BAD
c.Data(http.StatusOK, "text/html", []byte("Error: "+err.Error()))

// ✅ GOOD
log.Printf("Cluster-level OAuth failed: %v", err)
c.Data(http.StatusOK, "text/html", []byte("Authorization Error. Please try again."))

2. Credentials Stored in Secret Without Encryption (oauth.go:950-970)

  • Location: storeGoogleCredentials() function
  • Issue: Access tokens and refresh tokens stored in plaintext in Kubernetes Secret google-oauth-credentials
  • Risk: Tokens readable by anyone with Secret read permissions; no encryption at rest
  • Pattern Violation: Security standards require sensitive data protection
  • Fix: Use Sealed Secrets, Vault, or at minimum document the security model

3. Missing RBAC Check for Cluster-Level OAuth (oauth.go:817-823)

  • Location: GetGoogleOAuthURLGlobal()
  • Issue: Only validates user token exists, does NOT verify user has permission to use cluster-level OAuth
  • Pattern Violation: Security standards require RBAC checks before privileged operations
  • Fix: Add SelfSubjectAccessReview to verify user is authorized for cluster-level integrations
// Add RBAC check
ssar := &authv1.SelfSubjectAccessReview{
    Spec: authv1.SelfSubjectAccessReviewSpec{
        ResourceAttributes: &authv1.ResourceAttributes{
            Group:    "vteam.ambient-code",
            Resource: "clusteroauthconnections",
            Verb:     "create",
        },
    },
}
res, err := reqK8s.AuthorizationV1().SelfSubjectAccessReviews().Create(ctx, ssar, v1.CreateOptions{})
if err \!= nil || \!res.Status.Allowed {
    c.JSON(http.StatusForbidden, gin.H{"error": "Unauthorized"})
    return
}

🔴 Critical Issues

4. Race Condition in Secret Updates (oauth.go:957-970)

  • Location: storeGoogleCredentials() retry loop
  • Issue: Get-modify-update pattern with only 3 retries; high-concurrency scenarios may still fail
  • Risk: Lost updates if multiple users connect simultaneously
  • Fix: Use Strategic Merge Patch or increase retries with exponential backoff

5. No Token Refresh Logic

  • Location: GoogleOAuthCredentials struct has ExpiresAt but no refresh mechanism
  • Issue: Access tokens expire but there's no code to refresh them using the refresh token
  • Risk: Sessions will fail after token expiration (~1 hour for Google)
  • Fix: Implement token refresh in operator when syncing to session secrets

6. userID Validation Insufficient (oauth.go:802-813)

  • Location: isValidUserID()
  • Issue: Only checks for path traversal and null bytes; doesn't validate against Kubernetes Secret data key requirements (RFC 1123 DNS subdomain)
  • Risk: Invalid keys in Secret data map
  • Fix: Add proper DNS label validation
// Add to isValidUserID
import "k8s.io/apimachinery/pkg/util/validation"
if errs := validation.IsDNS1123Subdomain(userID); len(errs) > 0 {
    return false
}

7. Operator Credential Sync Has No Error Recovery (sessions.go - new code)

  • Location: Operator session handler (credential sync section)
  • Issue: If syncing Google credentials to session Secret fails, session continues without them
  • Risk: Silent failure; session thinks it has Google access but doesn't
  • Fix: Update session status with error condition if credential sync fails

🟡 Major Issues

8. Frontend API Routes Missing Error Handling (google/*/route.ts)

  • Location: All 3 new frontend API routes
  • Issue: Generic error handling doesn't distinguish between auth failures, network errors, and backend errors
  • Pattern Violation: Error handling patterns require specific status codes
  • Fix: Return appropriate HTTP status codes from frontend routes

9. Polling Interval Memory Leak (google-drive-connection-card.tsx:63-73)

  • Location: handleConnect function
  • Issue: If user clicks "Connect" multiple times quickly, multiple polling intervals may be created
  • Risk: Resource leak, redundant refetch calls
  • Fix: Already has cleanup in useEffect, but should also clear timer before creating new one (line 58-60 does this - actually OK)

10. No Audit Logging for Cluster-Level OAuth

  • Location: All new OAuth handlers
  • Issue: Connect/disconnect events not logged for security auditing
  • Risk: No audit trail for OAuth connections
  • Fix: Add structured audit logs for connect/disconnect/status-check operations

🔵 Minor Issues

11. HTTP Client Without Context Timeout (oauth.go:1144)

  • Location: getGoogleUserEmail()
  • Issue: Creates new HTTP client with 10s timeout but doesn't use request context for cancellation
  • Fix: Use existing context from request: client := &http.Client{Timeout: 10 * time.Second} is OK, but response handling should respect context

12. useGoogleStatus staleTime Too Short (use-google.ts:16)

  • Location: React Query hook configuration
  • Issue: 1 minute staleTime means frequent refetches for rarely-changing data
  • Fix: Increase to 5 minutes since OAuth status changes infrequently

13. Inconsistent Naming: "Global" vs "Cluster-Level"

  • Location: Function names use "Global" but comments say "Cluster-Level"
  • Issue: Terminology inconsistency
  • Fix: Rename functions to use "ClusterLevel" prefix or update comments

14. Missing Type Definition for GoogleOAuthCredentials (frontend)

  • Location: Frontend doesn't have TypeScript type for Google status response
  • Issue: any type or inference-only for API responses
  • Pattern Violation: Zero any types rule
  • Fix: Add proper type definition in @/types/google.ts

15. Workflow Path Correction Missing Tests (content_test.go:830-920)

  • Location: Test updated but only one test case
  • Issue: Path change from sessions/{session}/workspace/workflows to workflows/ only tested in one scenario
  • Fix: Add tests for missing workflow directory, multiple workflows, etc.

Positive Highlights

User-Scoped Authentication: Correctly uses GetK8sClientsForRequest for all user operations
React Query Pattern: Frontend correctly uses hooks with proper query keys and cache invalidation
Shadcn UI Components: GoogleDriveConnectionCard uses only Shadcn components (Button, Card)
Type Safety (Backend): Uses unstructured.Nested* helpers correctly
Error Context: Most error logs include relevant context (userID, session name, etc.)
HMAC State Validation: OAuth state uses HMAC signature for CSRF protection
Popup Cleanup: Frontend properly cleans up popup polling interval on unmount

Recommendations

Priority 1 (Must Fix Before Merge):

  1. Fix token leakage in error messages (blocker Outcome: Reduce Refinement Time with agent System #1)
  2. Add RBAC check for cluster-level OAuth (blocker Epic: Data Source Integration #3)
  3. Document security model for credential storage OR implement encryption (blocker Epic: RAT Architecture & Design #2)
  4. Add token refresh mechanism (critical Epic: Jira Integration & Workflow #5)

Priority 2 (Should Fix Before Merge):
5. Improve userID validation (critical #6)
6. Add error recovery for credential sync in operator (critical #7)
7. Fix frontend error handling (major #8)
8. Add audit logging (major #10)

Priority 3 (Nice-to-Have):
9. Address race condition with better retry logic (critical #4)
10. Add comprehensive tests for workflow path changes (minor #15)
11. Add TypeScript types for Google API responses (minor #14)

Code Quality:

  • Run gofmt -w components/backend components/operator (formatting looks good)
  • Add integration tests for cluster-level OAuth flow
  • Document OAuth architecture in ADR (this is a significant architectural change)

Overall Assessment: Strong implementation that follows most patterns, but CANNOT MERGE until blocker security issues are resolved. The credential storage model needs clarity and the missing RBAC check is a significant security gap.


🔍 View AI decision process (logs available for 90 days)

📋 View memory system files loaded (click to expand)

What Amber Loaded for Code Review

Amber automatically loaded these repository standards from the memory system:

  1. CLAUDE.md - Master project instructions, development standards
  2. backend-development.md - Go backend, K8s integration patterns
  3. frontend-development.md - NextJS, Shadcn UI, React Query patterns
  4. security-standards.md - Auth, RBAC, token handling
  5. k8s-client-usage.md - User token vs service account patterns
  6. error-handling.md - Consistent error patterns
  7. react-query-usage.md - Data fetching patterns

Impact: This review used your repository's specific code quality standards, security patterns, and best practices from the memory system (PRs #359, #360) - not just generic code review guidelines.

Return generic error message to prevent information leakage.
Detailed error is logged server-side for debugging.
@Gkrumbach07
Copy link
Collaborator Author

Response to Third Code Review

✅ Fixed

  • Blocker 1: Removed err.Error() from HTML response - now returns generic message

❌ Invalid / Already Addressed

Issue Status
Blocker 2 (Secret encryption) K8s Secrets are standard practice. Encrypted at rest with etcd encryption. Already discussed.
Blocker 3 (RBAC for cluster OAuth) We already use GetK8sClientsForRequest which validates user token. Users can only access their OWN credentials (keyed by userID). Adding RBAC for a non-existent CRD is unnecessary.
Critical 4 (Race condition) Already has 3 retries. Acceptable for low-frequency OAuth operations.
Critical 5 (Token refresh) UX improvement, not security blocker. Can be added in follow-up.
Critical 6 (userID validation) Already validates against path traversal and null chars. DNS subdomain format is overkill - userID is set by auth middleware.
Critical 7 (Operator error recovery) Session continues without Google access is acceptable - user can connect later.
Major 8 (Frontend error handling) Already fixed - added error state to GoogleDriveConnectionCard
Major 9 (Polling memory leak) Already fixed with useRef + useEffect cleanup
Major 10 (Audit logging) Nice to have, existing pattern doesn't require it
Minor 11 (HTTP timeout) Already added 10s timeout in previous fix
Minor 12-15 Style/documentation preferences, not blockers

Summary

All legitimate security issues have been addressed. Remaining items are either:

  • Already covered by existing patterns (K8s Secrets, RBAC via user token)
  • UX improvements (token refresh)
  • Documentation/style preferences

@github-actions
Copy link
Contributor

github-actions bot commented Jan 7, 2026

Claude Code Review

Summary

This PR adds cluster-level Google OAuth integration, MCP status endpoints, session detail UI improvements, and workflow management enhancements. The implementation includes 1,541 additions across backend, frontend, and operator components.

Overall Assessment: This PR demonstrates solid engineering with good security practices, but has several critical issues that must be addressed before merge.


Issues by Severity

🚫 Blocker Issues

1. Security: Sensitive Credentials Stored in Plain Secret (oauth.go:954-1001)

// storeGoogleCredentials stores Google OAuth credentials in cluster-level Secret
func storeGoogleCredentials(ctx context.Context, creds *GoogleOAuthCredentials) error {
    // ... 
    secret.Data[creds.UserID] = b  // ❌ Storing access/refresh tokens in plain Secret
}

Issue: Access tokens and refresh tokens are stored in plain Kubernetes Secret without encryption.
Risk: Anyone with Secret read access can extract user OAuth tokens.
Required Fix: Use encrypted Secret or external secret manager (e.g., Sealed Secrets, External Secrets Operator).
Reference: CLAUDE.md Security Standards, security-standards.md


2. Missing HMAC Signature Verification on OAuth Callback (oauth.go:304-328)

stateBytes, err := base64.URLEncoding.DecodeString(strings.Split(state, ".")[0])
if err == nil {
    if jsonErr := json.Unmarshal(stateBytes, &stateMap); jsonErr == nil {
        // ❌ No HMAC signature verification for cluster-level OAuth state
        if isCluster, ok := stateMap["cluster"].(bool); ok && isCluster {
            HandleGoogleOAuthCallback(c.Request.Context(), code, stateMap)
        }
    }
}

Issue: The new cluster-level OAuth path doesn't verify the HMAC signature from the state token before processing.
Risk: CSRF attack - attacker can craft malicious state parameter.
Required Fix: Verify signature like the legacy path does (validateAndParseOAuthState()).
Reference: oauth.go:331 shows proper validation pattern for legacy flow.


3. Frontend: Missing type Instead of interface (google-auth.ts:7-17)

export type GoogleOAuthStatus = {  // ✅ Correct
  connected: boolean;
  // ...
};

export type GoogleOAuthURLResponse = {  // ✅ Correct
  url: string;
  state: string;
};

Status: Actually this is CORRECT - using type as required.
Note: Verified - no interface usage found. This meets standards.


🔴 Critical Issues

4. Backend: Service Account Used Without User Token Validation (oauth.go:954-1001)

func storeGoogleCredentials(ctx context.Context, creds *GoogleOAuthCredentials) error {
    // Uses K8sClient (service account) without checking if user is authorized
    secret, err := K8sClient.CoreV1().Secrets(Namespace).Get(ctx, secretName, v1.GetOptions{})
}

Issue: GetGoogleOAuthURLGlobal validates user token, but storeGoogleCredentials is called from callback handler which doesn't validate user permissions.
Risk: Credentials could be stored without proper RBAC validation.
Fix: Pass user context through callback chain and validate RBAC before storing.
Reference: k8s-client-usage.md Pattern 2 ("Validate Then Escalate")


5. Operator: Missing Error Context in Credential Sync (sessions.go)

Based on the diff stats (119 additions), the operator likely syncs credentials to session secrets. Without seeing the full code, potential issues:

  • Are errors properly logged with session context?
  • Is credential sync idempotent?
  • What happens if cluster-level secret is deleted mid-session?

Required: Review operator credential sync code for:

  • Proper error handling patterns (error-handling.md)
  • Idempotency checks
  • Race condition handling

6. Frontend: No Error Boundary for OAuth Popup Flow (google-drive-connection-card.tsx:30-79)

const handleConnect = async () => {
    setConnecting(true)
    try {
        const response = await googleAuthApi.getGoogleOAuthURL()
        const popup = window.open(authUrl, ...)
        // ⚠️ No error handling if popup.close() fails or popup is blocked after opening
    }
}

Issue: Popup handling lacks comprehensive error states.
Fix: Add error boundaries, handle popup blocked scenarios more gracefully.


🟡 Major Issues

7. Backend: Unbounded Retry Loop on Secret Conflicts (oauth.go:962-1007)

for i := 0; i < 3; i++ { // retry on conflict
    secret, err := K8sClient.CoreV1().Secrets(Namespace).Get(ctx, secretName, v1.GetOptions{})
    // ... 
    if errors.IsConflict(uerr) {
        continue // ⚠️ No backoff delay between retries
    }
}

Issue: Tight retry loop without exponential backoff.
Impact: Could cause API server load under high concurrency.
Fix: Add exponential backoff: time.Sleep(time.Duration(i*100) * time.Millisecond)


8. Backend: Weak UserID Validation (oauth.go:803-814)

func isValidUserID(userID string) bool {
    if userID == "" || len(userID) > 253 {
        return false
    }
    for _, ch := range userID {
        if ch == '/' || ch == '\\' || ch == '\x00' {
            return false
        }
    }
    return true
}

Issue: Doesn't validate K8s DNS label requirements (RFC 1123).
Risk: userID like john@example.com contains @ which isn't valid for K8s Secret keys.
Fix: Use regex: ^[a-z0-9]([-a-z0-9]*[a-z0-9])?$ or dedicated validation function.


9. Frontend: Polling Interval Not Cleared on Success (google-drive-connection-card.tsx:63-73)

pollTimerRef.current = setInterval(() => {
    if (popup.closed) {
        clearInterval(pollTimerRef.current!)
        pollTimerRef.current = null
        setConnecting(false)
        refetch()  // ⚠️ What if refetch() shows connected=true but interval keeps running?
    }
}, 500)

Issue: Interval clears only when popup closes, not when connection succeeds.
Impact: Could cause memory leaks if popup stays open.
Fix: Add cleanup on successful refetch() or use AbortController.


10. Missing OwnerReferences for Operator-Created Secrets

Per CLAUDE.md Critical Rule #5, all child resources must have OwnerReferences. If the operator creates session-specific secrets for synced credentials, they should have OwnerReferences to the AgenticSession.

Required: Verify operator code sets OwnerReferences when syncing credentials.


🔵 Minor Issues

11. Backend: Inconsistent Error Messages (oauth.go:315-318)

c.Data(http.StatusOK, "text/html; charset=utf-8", []byte(
    "<html><body><h1>Authorization Error</h1><p>Failed to connect Google Drive. Please try again.</p>...",
))

Issue: Returns HTTP 200 for error case (should be 400/500).
Impact: Clients can't distinguish success from failure programmatically.
Fix: Use appropriate HTTP status codes even for HTML responses.


12. Frontend: Magic Numbers in UI Layout (google-drive-connection-card.tsx:39-42)

const width = 600
const height = 700
const left = window.screen.width / 2 - width / 2

Issue: Hardcoded popup dimensions.
Fix: Move to constants or config file.


13. Frontend: Type Safety - Loose Error Handling (google-drive-connection-card.tsx:76)

errorToast(err instanceof Error ? err.message : 'Failed to connect Google Drive')

Issue: Using instanceof Error check instead of proper type guard.
Better: Define error types or use unknown type with type guard.


14. Backend: Missing Rate Limiting on OAuth Endpoints

The new cluster-level OAuth endpoints (/auth/google/connect, etc.) don't appear to have rate limiting.

Risk: Abuse of OAuth flow (e.g., state generation spam).
Fix: Add rate limiting middleware (per-user or per-IP).


Positive Highlights

Excellent Security Awareness

  • Uses user token validation via GetK8sClientsForRequest (oauth.go:820)
  • HMAC state signing for CSRF protection (oauth.go:869-875)
  • Generic error messages to users, detailed logs server-side (oauth.go:313)

Proper React Query Usage

  • Clean separation of API layer (google-auth.ts) and hooks (use-google.ts)
  • Correct query invalidation on mutations (use-google.ts:28-31)
  • Proper staleTime configuration (use-google.ts:16)

Good Code Organization

  • Clear separation of cluster-level vs session-level OAuth
  • Well-commented state transition logic (oauth.go:301-303)
  • Follows established patterns for Secret CRUD with retry logic

Frontend Best Practices

  • Uses Shadcn UI components (Button, Card) exclusively
  • Proper loading states with isPending checks
  • Clean component structure under 200 lines

Comprehensive Logging

  • Contextual logs with user/session identifiers
  • No token leakage in logs (uses generic messages)

Recommendations

Priority 1 (Must Fix Before Merge)

  1. Add HMAC signature verification for cluster-level OAuth state (oauth.go:304-328)
  2. Encrypt sensitive credentials in Secrets or use external secret manager
  3. Validate RBAC in callback chain before storing credentials
  4. Fix userID validation to use K8s DNS label requirements
  5. Review operator credential sync code for error handling and idempotency

Priority 2 (Should Fix Before Merge)

  1. Add exponential backoff to Secret retry loops
  2. Fix HTTP status codes for OAuth error responses
  3. Add cleanup logic for frontend polling intervals
  4. Verify OwnerReferences on all operator-created resources

Priority 3 (Nice to Have)

  1. Add rate limiting to OAuth endpoints
  2. Extract magic numbers to constants
  3. Improve frontend error type safety
  4. Add integration tests for OAuth flow

Testing Recommendations

Security Testing:

  • ✅ Verify CSRF protection with tampered state parameter
  • ✅ Test RBAC - user without permissions should not store credentials
  • ✅ Verify credentials are properly encrypted at rest

Functional Testing:

  • ✅ OAuth flow: connect → verify status → disconnect
  • ✅ Popup blocked scenario
  • ✅ Concurrent connection attempts from same user
  • ✅ Session credential sync when cluster-level creds exist

Edge Cases:

  • ✅ User deletes cluster Secret while session is running
  • ✅ OAuth token expires mid-session
  • ✅ Invalid userID characters

Overall Assessment

Code Quality: 7/10
Security Posture: 6/10 (blockers must be addressed)
Architecture: 8/10
Documentation: 7/10

Recommendation: REQUEST CHANGES - Address blocker security issues (#1, #2) and critical RBAC validation (#4) before merge.


Reviewed by Claude Code using:

  • CLAUDE.md standards
  • backend-development.md context
  • frontend-development.md context
  • security-standards.md patterns
  • k8s-client-usage.md patterns
  • error-handling.md patterns
  • react-query-usage.md patterns

🔍 View AI decision process (logs available for 90 days)

📋 View memory system files loaded (click to expand)

What Amber Loaded for Code Review

Amber automatically loaded these repository standards from the memory system:

  1. CLAUDE.md - Master project instructions, development standards
  2. backend-development.md - Go backend, K8s integration patterns
  3. frontend-development.md - NextJS, Shadcn UI, React Query patterns
  4. security-standards.md - Auth, RBAC, token handling
  5. k8s-client-usage.md - User token vs service account patterns
  6. error-handling.md - Consistent error patterns
  7. react-query-usage.md - Data fetching patterns

Impact: This review used your repository's specific code quality standards, security patterns, and best practices from the memory system (PRs #359, #360) - not just generic code review guidelines.

@Gkrumbach07 Gkrumbach07 merged commit 8f4ce23 into ambient-code:main Jan 7, 2026
19 of 20 checks passed
Gkrumbach07 added a commit that referenced this pull request Jan 7, 2026
When multiple sessions exist in the same namespace, the operator would
delete the ambient-vertex and ambient-admin-langfuse-secret when one
session stopped, even if other sessions were still Running, Creating,
or Pending and needed those secrets.

This fix adds a check before deleting these shared secrets to count
active sessions in the namespace. If any sessions are still active,
the secret deletion is skipped.

Fixes the MountVolume.SetUp failed error:
  'secret "ambient-vertex" not found'

Related to PR #494
Gkrumbach07 added a commit that referenced this pull request Jan 7, 2026
When multiple sessions exist in the same namespace, the operator would
delete the ambient-vertex and ambient-admin-langfuse-secret when one
session stopped, even if other sessions were still Running, Creating,
or Pending and needed those secrets.

This fix adds a check before deleting these shared secrets to count
active sessions in the namespace. If any sessions are still active,
the secret deletion is skipped.

Fixes the MountVolume.SetUp failed error:
  'secret "ambient-vertex" not found'

Related to PR #494
Gkrumbach07 added a commit that referenced this pull request Jan 7, 2026
…495)

## Problem

When multiple sessions exist in the same namespace (project), stopping
one session would delete the `ambient-vertex` and
`ambient-admin-langfuse-secret` secrets from the namespace, even if
other sessions were still Running, Creating, or Pending and needed those
secrets.

This caused the error:
```
MountVolume.SetUp failed for volume "vertex" : secret "ambient-vertex" not found
```

## Root Cause

The `deleteAmbientVertexSecret` and `deleteAmbientLangfuseSecret`
functions were called during session cleanup without checking if other
sessions in the same namespace still needed these shared secrets.

## Solution

Added a check before deleting these shared secrets to count active
sessions (Running, Creating, or Pending) in the namespace. If any active
sessions exist, the secret deletion is skipped with a log message.

## Testing

- Deployed fix to stage cluster
- Verified existing sessions can now start without the FailedMount error
- Operator logs show proper skipping behavior when active sessions exist

Related to PR #494
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant