feat: Reduce streaming DNS failures with stale-fallback DNS cache #323
Open
abelonogov-ld wants to merge 6 commits into main from …
Resolved review threads:
- launchdarkly-android-client-sdk/src/main/java/com/launchdarkly/sdk/android/CachingDns.java (3 threads)
- launchdarkly-android-client-sdk/src/main/java/com/launchdarkly/sdk/android/ComponentsImpl.java (1 thread)
… improved clarity and performance.
Summary
- Adds CachingDns, a thread-safe OkHttp Dns wrapper that caches successful lookups (10-minute TTL) and returns stale cached addresses when a fresh resolution fails, preventing UnknownHostException from killing the stream during network transitions.
- Reuses the ConnectionPool and the DNS resolver across StreamingDataSource restarts (context switches, foreground/background toggles, network changes) via StreamingDataSourceBuilderImpl, so cached state survives data source recreation.
- Enables retryOnConnectionFailure(true) on the streaming OkHttpClient on all API levels.
Background
We observed a high percentage of DNS failures in streaming connections. The root cause is that StreamingDataSource creates a new OkHttpClient on every start() call, and ConnectivityManager restarts the data source on every network change, exactly when DNS is most fragile. OkHttp uses Dns.SYSTEM (InetAddress.getAllByName) with no caching, and Android's system DNS cache has very short TTLs (sometimes ~2 seconds) and is cleared on network transitions.
This pattern of application-level DNS caching with stale fallback is well established: Alibaba's HTTPDNS SDK, gRPC-Java's DnsNameResolver, and Square's own DnsOverHttps module all implement similar approaches. Google validated the pattern by adding DnsOptions.StaleDnsOptions to the Android framework in API 34.
Test plan
- Unit tests for CachingDns: fresh resolution, cache hits within TTL, TTL expiry refresh, stale fallback on failure, cold failure propagation, per-hostname isolation, expiration boundary
- StreamingDataSourceTest passes (builder creates data source with new constructor args transparently)
- Verify that CachingDns warns on stale fallback and that stream reconnects succeed during network changes
Note
Medium Risk
Touches streaming network connection setup and DNS resolution behavior; incorrect caching/pooling could cause connectivity regressions or use stale IPs longer than intended.
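To make this risk concrete, the stale-fallback lookup described in the Summary can be sketched roughly as follows. This is a minimal illustration and not the PR's CachingDns: the class name StaleFallbackDnsSketch, the nested Resolver interface (a hypothetical stand-in with the same shape as okhttp3.Dns#lookup), and the injected clock are all assumptions, chosen so the example runs on the JDK alone.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;
import java.util.List;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.LongSupplier;

/** Illustrative sketch only; names and structure are assumptions, not the PR's code. */
final class StaleFallbackDnsSketch {

    /** Hypothetical stand-in with the same shape as okhttp3.Dns#lookup. */
    interface Resolver {
        List<InetAddress> lookup(String hostname) throws UnknownHostException;
    }

    /** Immutable cache entry: the resolved addresses and when they were stored. */
    private static final class Entry {
        final List<InetAddress> addresses;
        final long storedAtMillis;

        Entry(List<InetAddress> addresses, long storedAtMillis) {
            this.addresses = addresses;
            this.storedAtMillis = storedAtMillis;
        }
    }

    private final Resolver delegate;
    private final long ttlMillis;
    private final LongSupplier clock; // injected so expiry can be tested deterministically
    private final ConcurrentHashMap<String, Entry> cache = new ConcurrentHashMap<>();

    StaleFallbackDnsSketch(Resolver delegate, long ttlMillis, LongSupplier clock) {
        this.delegate = delegate;
        this.ttlMillis = ttlMillis;
        this.clock = clock;
    }

    List<InetAddress> lookup(String hostname) throws UnknownHostException {
        long now = clock.getAsLong();
        Entry cached = cache.get(hostname);

        // 1. Fresh hit: the entry is younger than the TTL, so skip the resolver.
        if (cached != null && now - cached.storedAtMillis < ttlMillis) {
            return cached.addresses;
        }

        try {
            // 2. Miss or expired: resolve again and remember the result.
            List<InetAddress> fresh = delegate.lookup(hostname);
            cache.put(hostname, new Entry(fresh, now));
            return fresh;
        } catch (UnknownHostException e) {
            // 3. Resolution failed: serve the expired entry if one exists.
            if (cached != null) {
                return cached.addresses;
            }
            // 4. Cold failure: nothing cached for this host, so propagate.
            throw e;
        }
    }
}
```

In this sketch an expired entry is served for as long as fresh lookups keep failing, which is the "stale IPs longer than intended" exposure mentioned above; an implementation could bound that by capping the stale age. With OkHttp, a resolver of this shape would implement okhttp3.Dns and be installed through OkHttpClient.Builder.dns(...).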
Overview
Adds CachingDns, an OkHttp Dns wrapper that caches successful lookups with a TTL and falls back to stale cached addresses when fresh resolution fails, reducing UnknownHostException disruptions during mobile network transitions.
Updates streaming to reuse a shared DNS resolver and ConnectionPool across StreamingDataSource restarts, and wires these into the EventSource OkHttp client configuration (including enabling retryOnConnectionFailure(true)). Includes unit tests covering cache hit/expiry behavior, stale fallback, per-host caching, and eviction behavior when exceeding MAX_ENTRIES.
Written by Cursor Bugbot for commit 0a6a23e. This will update automatically on new commits.
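
The overview mentions eviction once the cache exceeds MAX_ENTRIES but does not say which entry is dropped. One common way to bound such a cache is least-recently-used eviction; the sketch below assumes that policy purely for illustration. The class name BoundedCacheSketch and its constructor parameter are hypothetical, and the PR may evict differently.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Illustrative only: a size-bounded map that evicts its least recently used entry. */
final class BoundedCacheSketch<K, V> {
    private final Map<K, V> entries;

    BoundedCacheSketch(final int maxEntries) {
        // accessOrder = true keeps iteration order from least to most recently accessed.
        this.entries = new LinkedHashMap<K, V>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
                // Called after each insertion; returning true drops the eldest entry.
                return size() > maxEntries;
            }
        };
    }

    // LinkedHashMap is not thread-safe (even get() reorders entries in access
    // order), so every operation is guarded by the same lock.
    synchronized V get(K key) {
        return entries.get(key);
    }

    synchronized void put(K key, V value) {
        entries.put(key, value);
    }

    synchronized int size() {
        return entries.size();
    }
}
```

Bounding the map keeps memory use predictable if an application resolves many distinct hostnames, at the cost of losing the stale fallback for whichever host was evicted.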