Add WebSocket failover counter metric and URL change logging #661
NPM Publishing labels 🏷️🛑 This PR needs labels to indicate how to increase the current package version in the automated workflows. Please add one of the following labels: |
logger.info('Websocket URL has changed, closing connection to reconnect...')
censorLogs(() =>
  logger.debug(
    `Websocket url has changed from ${this.currentUrl} to ${urlFromConfig}, closing connection...`,
  ),
)
Does this no longer close the connection?
It does. That message moved to the info-level log, so it is always visible when info logging is enabled.
}),
wsConnectionFailoverCount: new client.Gauge({
  name: 'ws_connection_failover_count',
  help: 'The number of consecutive connection issues (unresponsive/no data, abnormal closures), used to trigger URL failover. Resets to 0 when data flows successfully.',
I don't see where this is reset to 0.
It doesn't. The underlying variable it is meant to expose, streamHandlerInvocationsWithNoConnection, also never resets. It just increments forever, and Tiingo uses modulo arithmetic on it. You can see that usage in this PR (it predates my changes):
https://github.com/smartcontractkit/external-adapters-js/pull/4543/files
Open to resetting it if there is a good reason.
I don't know if it should reset, but the description says "Resets to 0 when data flows successfully."
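For readers following the thread, here is a minimal sketch of how modulo arithmetic can drive URL failover from a counter that never resets. It is not the Tiingo adapter's code: the function name, the parameters, and the example URLs are all hypothetical, and the real thresholds may differ.

```typescript
// Hypothetical sketch, not taken from the PR or the Tiingo adapter.
// Picks a URL from a counter that only ever increases.
function selectUrl(
  primaryUrl: string,
  secondaryUrl: string,
  invocationsWithNoConnection: number, // monotonically increasing, never reset
  cycleLength: number, // failed invocations in one full primary + secondary cycle
  primaryAttempts: number, // how many of those attempts use the primary URL
): string {
  if (cycleLength <= 0) {
    throw new Error('cycleLength must be positive')
  }
  // The position inside the current cycle wraps around, so no reset is needed.
  const positionInCycle = invocationsWithNoConnection % cycleLength
  return positionInCycle < primaryAttempts ? primaryUrl : secondaryUrl
}

// With a cycle of 5 and 3 primary attempts, counter values 0-2 map to the
// primary URL, 3-4 to the secondary, and 5 wraps back to the primary.
const chosen = selectUrl('wss://primary.example', 'wss://secondary.example', 3, 5, 3)
console.log(chosen)
```

Because the selection depends only on the remainder, the counter can grow without bound. That is consistent with the reply above, but it also means a gauge exposing the raw value will not drop back to 0, contrary to what the help string says.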
Summary
Adds observability for the WebSocket failover mechanism to help diagnose connection issues.
Problem
During a Tiingo incident (2026-01-13 03:19-03:32 UTC), we could not determine if failover triggered:
- streamHandlerInvocationsWithNoConnection counter not exposed as metric
- CENSOR_SENSITIVE_LOGS=true

This made it impossible to answer:
Changes
1. New Prometheus Metric
- ws_connection_failover_count gauge metric
- streamHandlerInvocationsWithNoConnection value in real-time
- transport_name for per-transport tracking
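As a rough illustration of the change described above, the sketch below mirrors a per-transport counter into a gauge labelled with transport_name on every increment. The GaugeLike interface only assumes a set(labels, value) method, which is the shape prom-client's Gauge offers. InMemoryGauge, NoConnectionTracker, and the transport name are hypothetical stand-ins so the example runs without a metrics library.

```typescript
// Minimal structural view of a gauge: set a value for a given label set.
interface GaugeLike {
  set(labels: Record<string, string>, value: number): void
}

// Hypothetical in-memory stand-in for a real gauge, keyed by transport_name.
class InMemoryGauge implements GaugeLike {
  readonly values = new Map<string, number>()

  set(labels: Record<string, string>, value: number): void {
    this.values.set(labels['transport_name'] ?? '', value)
  }
}

// Hypothetical tracker: counts stream handler invocations that found no
// connection and publishes the running total to the gauge each time.
class NoConnectionTracker {
  private streamHandlerInvocationsWithNoConnection = 0
  private readonly transportName: string
  private readonly gauge: GaugeLike

  constructor(transportName: string, gauge: GaugeLike) {
    this.transportName = transportName
    this.gauge = gauge
  }

  recordInvocationWithNoConnection(): number {
    this.streamHandlerInvocationsWithNoConnection += 1
    // Mirror the raw counter value. It only increases in this sketch.
    this.gauge.set(
      { transport_name: this.transportName },
      this.streamHandlerInvocationsWithNoConnection,
    )
    return this.streamHandlerInvocationsWithNoConnection
  }
}

const demoGauge = new InMemoryGauge()
const demoTracker = new NoConnectionTracker('example-ws', demoGauge)
demoTracker.recordInvocationWithNoConnection()
demoTracker.recordInvocationWithNoConnection()
console.log(demoGauge.values.get('example-ws'))
```

Keeping the label on each set call means one gauge can serve several transports, each reported under its own transport_name.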