Contributor

@emmyzhou-db emmyzhou-db commented May 30, 2025

Motivation

Synchronous token refresh blocks the calling thread while a new token is fetched, which adds latency. The cost is most visible in high-QPS (queries per second) workloads, where blocking refreshes become bottlenecks, reduce throughput, and cause requests to queue.

Solution

This PR introduces asynchronous token refresh to improve responsiveness and reliability, with particular benefits for high-throughput applications.

What changes are proposed in this pull request?

Async Refresh Option: Added a withAsyncRefresh(boolean enabled) method to enable or disable asynchronous token refresh. When enabled, tokens are refreshed in the background once they become "stale" (close to expiry).

Token State Management

  • Three-State Token System:
    • FRESH: Token is valid and not close to expiry
    • STALE: Token is valid but will expire soon (triggers async refresh if enabled)
    • EXPIRED: Token has expired (requires blocking refresh)
  • Token class is now a pure data class, holding only token information (access token, refresh token, expiry time).
  • TokenSource implementations are responsible for token state management and refresh logic.
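
As a rough illustration of the three states, the classification could look like the sketch below. The names `TokenState`, `TokenStates`, and `classify` are hypothetical and chosen for this example only; they are not taken from the SDK source. Treating a token as expired at the exact expiry instant is also an assumption.

```java
import java.time.Duration;
import java.time.Instant;

// Hypothetical names for illustration; not the SDK's actual classes.
enum TokenState { FRESH, STALE, EXPIRED }

final class TokenStates {
    private TokenStates() {}

    // Classifies a token by comparing the current time with its expiry.
    // staleDuration is how long before expiry a token starts counting as stale.
    static TokenState classify(Instant expiry, Instant now, Duration staleDuration) {
        if (!now.isBefore(expiry)) {
            // Assumption: a token is treated as expired at the expiry instant itself.
            return TokenState.EXPIRED;
        }
        if (!now.isBefore(expiry.minus(staleDuration))) {
            // Still valid, but inside the stale window before expiry.
            return TokenState.STALE;
        }
        return TokenState.FRESH;
    }
}
```

With a 3-minute stale duration and an expiry of 12:00, this sketch would report FRESH at 11:50, STALE at 11:58, and EXPIRED from 12:00 onward.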

Performance Optimizations

  • Non-blocking for Stale Tokens: When async is enabled and the token is stale but not expired, API calls continue using the current token while a background refresh is triggered, reducing latency and avoiding thread blocking.
  • Blocking Only on Expiry: If the token is expired, calls will still block until a new token is fetched, ensuring correctness.
  • Default Stale Duration: Tokens are considered stale 3 minutes before expiry, allowing proactive refresh while maintaining validity.

Thread Safety & Reliability

  • Synchronized Refresh Logic: All async refresh operations are synchronized to prevent race conditions and redundant refreshes.
  • Refresh State Tracking: To prevent unnecessary or repeated background refreshes, the implementation tracks:
    • Whether a refresh is already in progress
    • Whether the last refresh succeeded
  • Failure Handling: If an async refresh fails, subsequent refreshes are performed synchronously until one succeeds.
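
The behavior described above (serve stale tokens while refreshing in the background, block only on expiry, and track refresh state) can be sketched roughly as follows. This is a simplified model of the async-enabled path only, not the code in this PR: `SketchToken`, `AsyncRefreshSketch`, the injected `Supplier<Instant>` time source, and the injected `Executor` are all assumptions made for the example.

```java
import java.time.Duration;
import java.time.Instant;
import java.util.concurrent.Executor;
import java.util.function.Supplier;

// Hypothetical pure data holder, standing in for a token class.
final class SketchToken {
    final String accessToken;
    final Instant expiry;

    SketchToken(String accessToken, Instant expiry) {
        this.accessToken = accessToken;
        this.expiry = expiry;
    }
}

// Hypothetical token source modelling the async-enabled path.
final class AsyncRefreshSketch {
    private final Supplier<SketchToken> fetcher; // performs the (slow) refresh
    private final Supplier<Instant> now;         // injectable time source
    private final Duration staleDuration;
    private final Executor executor;             // runs background refreshes

    // All mutable state below is guarded by the instance lock.
    private SketchToken token;
    private boolean refreshInProgress = false;
    private boolean lastRefreshSucceeded = true;

    AsyncRefreshSketch(Supplier<SketchToken> fetcher, Supplier<Instant> now,
                       Duration staleDuration, Executor executor) {
        this.fetcher = fetcher;
        this.now = now;
        this.staleDuration = staleDuration;
        this.executor = executor;
    }

    synchronized SketchToken getToken() {
        Instant t = now.get();
        if (token == null || !t.isBefore(token.expiry)) {
            // EXPIRED (or no token yet): block the caller until a new token is fetched.
            token = fetcher.get();
            lastRefreshSucceeded = true;
            return token;
        }
        if (!t.isBefore(token.expiry.minus(staleDuration))) {
            // STALE: keep serving the current token and refresh in the background.
            SketchToken current = token;
            triggerAsyncRefresh();
            return current;
        }
        return token; // FRESH
    }

    // Called with the lock held. Skips the refresh if one is already running
    // or if the previous background refresh failed.
    private void triggerAsyncRefresh() {
        if (refreshInProgress || !lastRefreshSucceeded) {
            return;
        }
        refreshInProgress = true;
        executor.execute(() -> {
            try {
                SketchToken fresh = fetcher.get(); // fetched outside the lock on a real executor
                synchronized (this) {
                    token = fresh;
                    lastRefreshSucceeded = true;
                    refreshInProgress = false;
                }
            } catch (RuntimeException e) {
                synchronized (this) {
                    // After a failure, only the blocking path refreshes until it succeeds.
                    lastRefreshSucceeded = false;
                    refreshInProgress = false;
                }
            }
        });
    }
}
```

In this sketch the blocking path holds the lock while fetching, so concurrent callers wait for a single refresh instead of each starting their own. A production implementation might make that trade-off differently.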

Configuration Options

Experimental Features (may change in future releases):

  • withClockSupplier() - Custom clock supplier for testing
  • withAsyncRefresh() - Enable/disable asynchronous token refresh
  • withExpiryBuffer() - Configure the expiry buffer duration

Testing

  • Unit tests cover async refresh triggering, correct token usage during background refresh, and thread safety.
  • Time Control for Testing:
    The RefreshableTokenSource class can be configured with a custom clock supplier via withClockSupplier(ClockSupplier clockSupplier), allowing tests to precisely control and simulate token expiry and refresh timing. This enables deterministic testing of token state transitions and refresh behavior without relying on real system time.
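
A controllable clock for such tests might look like the sketch below. The `SketchClockSupplier` interface is an assumed shape (a single method returning a `java.time.Clock`) and may not match the SDK's actual `ClockSupplier`; `FakeClockSupplier` and `advance` are hypothetical names.

```java
import java.time.Clock;
import java.time.Duration;
import java.time.Instant;
import java.time.ZoneOffset;

// Assumed shape of a clock supplier; the SDK's actual ClockSupplier may differ.
interface SketchClockSupplier {
    Clock getClock();
}

// Hypothetical test helper: time only moves when the test calls advance().
final class FakeClockSupplier implements SketchClockSupplier {
    private Instant now;

    FakeClockSupplier(Instant start) {
        this.now = start;
    }

    // Moves the simulated time forward by the given amount.
    void advance(Duration amount) {
        now = now.plus(amount);
    }

    @Override
    public Clock getClock() {
        // A fixed clock pinned to the current simulated instant.
        return Clock.fixed(now, ZoneOffset.UTC);
    }
}
```

A test could start the clock well before a token's expiry, call `advance` to step into the stale window and then past expiry, and assert on the refresh behavior at each step without sleeping.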

NO_CHANGELOG=true

@emmyzhou-db emmyzhou-db temporarily deployed to test-trigger-is June 3, 2025 12:24 — with GitHub Actions Inactive
Contributor

@parthban-db parthban-db left a comment

LGTM modulo comments.

@github-actions

If integration tests don't run automatically, an authorized user can run them manually by following the instructions below:

Trigger:
go/deco-tests-run/sdk-java

Inputs:

  • PR number: 455
  • Commit SHA: f6f0dd0f2a26f95ac267a27ad3f6aa27f2c7b4ba

Checks will be approved automatically on success.

@emmyzhou-db emmyzhou-db added this pull request to the merge queue Jun 17, 2025
Merged via the queue into main with commit 580c015 Jun 17, 2025
15 checks passed
@emmyzhou-db emmyzhou-db deleted the emmyzhou-db/async_token_cache branch June 17, 2025 16:27
github-merge-queue bot pushed a commit that referenced this pull request Jun 18, 2025
## What changes are proposed in this pull request?
This PR introduces a new environment variable
`DATABRICKS_ENABLE_EXPERIMENTAL_ASYNC_TOKEN_REFRESH` to enable
asynchronous token refresh in the Databricks SDK for Java. This feature
improves performance by allowing token refresh operations to happen in
the background, reducing latency for API calls.

This change activates the asynchronous refresh capability that was
previously added in #455. When enabled, stale tokens will trigger a
background refresh while expired tokens will still block until a new
token is fetched.

### How to Enable Async Token Refresh
Set the environment variable:
```bash
export DATABRICKS_ENABLE_EXPERIMENTAL_ASYNC_TOKEN_REFRESH=true
```
This setting will be automatically picked up by the SDK and applied to
all token refresh operations.
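
For illustration only, application code could read the same flag along these lines. How the SDK itself parses the value is not stated here, so the `Boolean.parseBoolean` handling and the `AsyncRefreshEnv` helper are assumptions.

```java
import java.util.Map;

// Hypothetical helper; not part of the SDK.
final class AsyncRefreshEnv {
    static final String VAR = "DATABRICKS_ENABLE_EXPERIMENTAL_ASYNC_TOKEN_REFRESH";

    private AsyncRefreshEnv() {}

    // Returns true only when the variable is set to "true" (case-insensitive).
    // An unset variable yields false because Boolean.parseBoolean(null) is false.
    static boolean isEnabled(Map<String, String> env) {
        return Boolean.parseBoolean(env.get(VAR));
    }
}
```

Calling `AsyncRefreshEnv.isEnabled(System.getenv())` checks the real process environment; passing a map keeps the check easy to unit test.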

## How is this tested?
Manual verification that existing unit tests and integration tests pass
with both async refresh disabled and enabled.