Support stream-level rebalancing for long-running RPCs (MaxStreamAge/MaxStreamGrace)

### Is your feature request related to a problem?

At our scale (1000+ client streams across 150 servers), long-running streaming RPCs (VStream CDC, Pub/Sub, Bigtable watch) create a load balancing problem with ORCA metrics: load balancing decisions are made when an RPC starts, so a long-lived stream stays on the server it was first assigned to, however the reported load changes afterwards. Using MaxConnectionAge to force rebalancing causes connection churn, even for idle streams that only receive ORCA metrics. We need per-stream lifecycle management for efficient L7 load balancing.

Today, every gRPC server or client with long-running streams must implement its own stream termination logic to use ORCA metrics and L7 load balancing effectively. See the best practices described in https://github.com/grpc/grpc-java/issues/12525#issuecomment-3564341605.

### Describe the solution you'd like

Similar to server-side connection management ([gRPC A9](https://github.com/grpc/proposal/blob/master/A9-server-side-conn-mgt.md)), which provides MaxConnectionAge and MaxConnectionGrace for L4 load balancers, we propose adding MaxStreamAge and MaxStreamGrace options that:
- terminate a stream once it reaches the given age, allowing up to the grace period for it to finish (with jitter to avoid a thundering herd)
- work at L7 (stream level) instead of L4 (connection level)
- allow ORCA metrics to continue flowing on the connection
- send a status code that clients can handle by immediately retrying on the connection (possibly reaching another server based on gRPC metrics)

This would prevent every application from re-implementing the same interval/jitter/status logic for long-running streams.

### Describe alternatives you've considered

- MaxConnectionAge: Inefficient with L7 LB; closes connection even for idle ORCA streams
- Client-side timers: Every client must implement jitter/retry logic differently
- Server-side timers: Requires custom code per service (repeated development effort); no standard status codes
- MaxConnectionIdle: Only triggers when ALL streams are idle, not per-stream

### Additional context

Our production environment uses:
- Server: grpc-go (Vitess vttablet)
- Client: grpc-java (application clients)

We need feature parity in both implementations for this to work. We're willing to implement both and contribute the code if the design is accepted. If we gain support on this issue, we can work on a gRFC as this would likely benefit from formal design review.


