Skip to content

Conversation

@hiroTamada
Copy link
Contributor

@hiroTamada hiroTamada commented Jan 27, 2026

Summary

Root Cause

The validateRegistryToken function checked len(claims.Repositories) == 0 to detect registry tokens, but new tokens use repo_access instead of repos, causing all new tokens to be rejected as "not a registry token".

Changes

  • Add RepoPermission struct and RepoAccess field to RegistryTokenClaims
  • Update validateRegistryToken to check both formats with proper per-repo scope validation
  • Add comprehensive tests for both token formats

Test plan

  • Existing middleware tests pass
  • New tests for RepoAccess token format validation
  • Deploy to staging and verify builds complete successfully
  • Deploy to production

Note

Adds full support for the new two-tier registry token format while maintaining legacy compatibility, and hardens the registry auth path.

  • Extend RegistryTokenClaims with RepoAccess (via RepoPermission) and update validateRegistryToken to handle both formats with per-repo scope checks; allow /v2/ base path
  • Enhance JwtAuth for /v2 routes: accept Basic or Bearer, return 401 with WWW-Authenticate header, and expand internal IP fallback to include 172.30.*; use docker/distribution router for repo parsing
  • Inject request logger into /v2 route in cmd/api/main.go for better debug logs
  • Builder agent: write Docker config.json to both /home/builder/.docker and /root/.docker to ensure BuildKit finds creds; add logging around write attempts
  • Add comprehensive tests covering RepoAccess/legacy tokens, registry path handling, IP fallback, and error headers

Written by Cursor Bugbot for commit 122280e. This will update automatically on new commits. Configure here.

The two-tier build cache PR (#70) introduced a new token format using
`repo_access` field with per-repository scopes, but the middleware
wasn't updated to parse it.

This caused 401 Unauthorized errors when builder VMs tried to push
images to the registry, as the middleware only checked for the legacy
`repos` field which is empty in new tokens.

Changes:
- Add RepoPermission struct and RepoAccess field to RegistryTokenClaims
- Update validateRegistryToken to check both RepoAccess (new) and
  Repositories (legacy) formats
- Add per-repo scope checking for write operations
- Add comprehensive tests for both token formats

Fixes build failures in production where the new token format was
being used but not recognized by the registry auth middleware.
The IP fallback for registry authentication only allowed 10.100.x.x and
10.102.x.x subnets (staging/dev), but production uses 172.30.x.x. This
caused builds to fail in production while working in staging.

Changes:
- Add 172.30.x.x to allowed subnets in isInternalVMRequest()
- Add logger injection to /v2 routes for debug logging
- Add tests for internal VM request subnet matching
Add comprehensive tests for JwtAuth middleware on /v2/ registry paths:
- Valid token access (both legacy and RepoAccess formats)
- IP fallback for staging (10.100.x.x, 10.102.x.x) and production (172.30.x.x)
- External IP rejection without valid token
- Invalid token fallback behavior
- Bearer and Basic auth support

These tests would have caught the production subnet issue earlier.
Root cause: Registry was returning 401 without WWW-Authenticate header,
so BuildKit didn't know to send credentials from Docker config.

Changes:
- Add WWW-Authenticate: Basic realm="registry" to 401 responses
- Remove production subnet (172.30.x.x) from IP fallback
  (staging 10.100.x.x still has fallback as safety net)
- Builder agent: write Docker config to both /home/builder/.docker
  and /root/.docker to ensure BuildKit finds it
- Add tests for BuildKit auth flow simulation

This enables proper token-based registry auth in production.
Copy link

@cursor cursor bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cursor Bugbot has reviewed your changes and found 2 potential issues.

Bugbot Autofix is OFF. To automatically fix reported issues with Cloud Agents, enable Autofix in the Cursor dashboard.

log.Printf("Warning: failed to write %s: %v", configPath, err)
continue
}
log.Printf("Registry auth configured at %s", configPath)
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Registry auth config may silently fail completely

Medium Severity

The setupRegistryAuth function returns nil unconditionally even if all config writes fail. The loop logs warnings and continues on each failure, but doesn't track whether at least one write succeeded. If both /home/builder/.docker and /root/.docker writes fail, the function reports success despite no authentication being configured. The build then fails later with a cryptic 401 error instead of a clear "failed to write docker config" message.

Fix in Cursor Fix in Web

// knows to send credentials from the Docker config
if statusCode == http.StatusUnauthorized {
w.Header().Set("WWW-Authenticate", `Basic realm="registry"`)
}
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

WWW-Authenticate header incorrectly added to all API 401s

Low Severity

The OapiErrorHandler function now adds WWW-Authenticate: Basic realm="registry" to all 401 responses, not just registry endpoints. This affects non-registry API endpoints like /instances/{id}/exec, /instances/{id}/cp, and all OpenAPI-validated endpoints. These endpoints expect Bearer authentication, so returning a Basic auth challenge with a "registry" realm is misleading and could confuse API clients about how to authenticate.

Fix in Cursor Fix in Web

BuildKit with registry.insecure=true doesn't do WWW-Authenticate
challenge-response flow - it just fails on 401 without retrying
with credentials.

Re-enable IP fallback for production (172.30.x.x) until we find
a way to make BuildKit send auth proactively.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants