@tpoliaw tpoliaw commented Dec 12, 2025

If a callback raises an exception, it shouldn't prevent other callbacks
receiving the same event. Instead, catch any exceptions and re-raise
them after all callbacks have been called as a single ExceptionGroup.
This allows the task to be aborted if any of the callbacks fail but
still allows callbacks that require the "plan failed" events to run.

@tpoliaw tpoliaw requested a review from a team as a code owner December 12, 2025 15:52

codecov bot commented Dec 12, 2025

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 95.00%. Comparing base (ae766ca) to head (7d6cd80).

@@            Coverage Diff             @@
##             main    #1302      +/-   ##
==========================================
+ Coverage   94.99%   95.00%   +0.01%     
==========================================
  Files          42       42              
  Lines        2755     2763       +8     
==========================================
+ Hits         2617     2625       +8     
  Misses        138      138              

@tpoliaw tpoliaw force-pushed the failsafe-event-handling branch from 6eda306 to 73206be Compare December 12, 2025 16:43
@tpoliaw tpoliaw changed the title from "Ignore (but log) errors from event stream callbacks" to "fix: Handle errors from event stream callbacks" Dec 12, 2025

tpoliaw commented Dec 12, 2025

The main symptom was that if a scan failed due to, e.g., a stomp connection error, any subsequent scans would also fail, even after reconnecting: the unsubscribe for the tiled writer callback was never called, so two tiled writers tried to write the same events and tiled returned 409 Conflict errors.

@tpoliaw tpoliaw force-pushed the failsafe-event-handling branch from 73206be to 67c37d1 Compare January 6, 2026 14:40
abbiemery previously approved these changes Jan 6, 2026
@abbiemery abbiemery left a comment

Demo'd in person

@abbiemery

As it stands, this will keep a scan running even if RabbitMQ is down, so we will lose data from the nexus file. Looking instead at a different way to get the errors to shut things down correctly.

@abbiemery abbiemery dismissed their stale review January 13, 2026 13:44

Not desired behaviour.

@tpoliaw tpoliaw force-pushed the failsafe-event-handling branch from 67c37d1 to deed376 Compare January 13, 2026 14:07
If a callback raises an exception, it shouldn't prevent other callbacks
receiving the same event. It should also not raise the exception at the
call site that published the event.
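
A rough sketch of the behaviour this commit message describes, not the code from this PR: `publish_isolated` is a hypothetical name, and returning the collected failures (so that something else can decide whether to stop the scan) is an assumption made for illustration.

```python
import logging
from typing import Any, Callable, Iterable

LOGGER = logging.getLogger(__name__)


def publish_isolated(
    callbacks: Iterable[Callable[[Any], None]], event: Any
) -> list[Exception]:
    """Deliver `event` to every callback without raising at the call site.

    Hypothetical helper for illustration; the real event stream may differ.
    """
    failures: list[Exception] = []
    for callback in list(callbacks):
        try:
            callback(event)
        except Exception as exc:
            # Log with traceback so the failure is not silently lost.
            LOGGER.exception("Callback %r failed for event %r", callback, event)
            failures.append(exc)
    # Assumption: hand the failures back instead of raising them.
    return failures
```

The publisher is never interrupted here, so whatever needs to shut things down has to inspect the returned failures or the log.
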
@tpoliaw tpoliaw force-pushed the failsafe-event-handling branch 2 times, most recently from deed376 to 7d6cd80 Compare January 13, 2026 14:38
@tpoliaw tpoliaw requested a review from abbiemery January 13, 2026 14:39