[Bugfix] Saga recovery: sagas that failed before any step completed #47
Saga recovery: sagas that failed before any step completed
Problem: Sagas that failed on the first step (no steps with act+COMPLETED in the log) were still selected by the recovery task on every run. For them, compensation was never executed (there were no completed steps to compensate), but they stayed in a "recoverable" state and were picked up again on the next run, causing repeated "recovered" logs and unnecessary DB load.

Changes:
Exclude FAILED sagas from recovery
get_sagas_for_recovery() now returns only sagas in RUNNING or COMPENSATING status. Sagas already in FAILED are no longer returned, so they are not retried every minute. This applies to SqlAlchemySagaStorage, MemorySagaStorage, and the storage protocol docstring.

Explicit handling when there are no steps to compensate
When compensation runs with an empty list of completed steps (the saga failed before completing any step):
- compensation.py: At the start of compensate_steps(), if completed_steps is empty, the saga status is set to FAILED, a short info log is written, and the function returns (no compensate() calls).
- saga.py: When recovering a saga in COMPENSATING/FAILED with no completed steps, a warning is logged that the saga failed before any step completed and is being marked FAILED without calling compensate().

Result: Sagas that fail on the first step are marked FAILED once and no longer appear in recovery. Sagas that fail in the middle (e.g. external service down) continue to be recovered as RUNNING or COMPENSATING until compensation finishes or the saga is marked FAILED.
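
For reviewers who want the shape of the early return without opening compensation.py, here is a minimal sketch of the guard described above. It is not the code from this PR: the Saga dataclass, the SagaStatus enum, and the compensate_steps(saga, completed_steps) signature are assumptions made for illustration, and completed steps are modelled as plain callables.

```python
import logging
from dataclasses import dataclass
from enum import Enum
from typing import Callable, List

logger = logging.getLogger(__name__)


class SagaStatus(Enum):
    # Hypothetical enum; only the status names come from the PR text.
    RUNNING = "RUNNING"
    COMPENSATING = "COMPENSATING"
    FAILED = "FAILED"


@dataclass
class Saga:
    # Hypothetical stand-in for the real saga object.
    saga_id: str
    status: SagaStatus = SagaStatus.COMPENSATING


def compensate_steps(saga: Saga, completed_steps: List[Callable[[], None]]) -> None:
    """Run compensations in reverse order, with an early exit for an empty list."""
    if not completed_steps:
        # Nothing completed, so nothing to undo: mark FAILED once and stop,
        # so the saga is no longer eligible for recovery.
        saga.status = SagaStatus.FAILED
        logger.info("Saga %s has no completed steps; marked FAILED", saga.saga_id)
        return

    # Undo the most recently completed step first.
    for compensate in reversed(completed_steps):
        compensate()
    # The final status after a successful compensation is left to the caller
    # in this sketch; the PR text does not specify it.
```

The guard sits at the top of the function so that the empty case never reaches the compensation loop and the status change happens exactly once.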
Tests: Integration tests for get_sagas_for_recovery now assert that only RUNNING and COMPENSATING sagas are returned, and that FAILED sagas are excluded.
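
As a rough illustration of the status filter and the kind of assertion the tests make, the sketch below uses a hypothetical in-memory storage. The class name InMemoryStorageSketch, the save() helper, the COMPLETED status, and the choice to return a list of saga IDs are all assumptions; the real MemorySagaStorage and SqlAlchemySagaStorage may differ in signature and return type.

```python
from enum import Enum
from typing import Dict, List


class SagaStatus(Enum):
    # Hypothetical enum; COMPLETED is assumed to exist for contrast.
    RUNNING = "RUNNING"
    COMPENSATING = "COMPENSATING"
    COMPLETED = "COMPLETED"
    FAILED = "FAILED"


# Only these statuses are eligible for recovery; FAILED is deliberately absent.
RECOVERABLE_STATUSES = frozenset({SagaStatus.RUNNING, SagaStatus.COMPENSATING})


class InMemoryStorageSketch:
    """Hypothetical storage that keeps saga statuses in a dict."""

    def __init__(self) -> None:
        self._statuses: Dict[str, SagaStatus] = {}

    def save(self, saga_id: str, status: SagaStatus) -> None:
        self._statuses[saga_id] = status

    def get_sagas_for_recovery(self) -> List[str]:
        # Return IDs of sagas whose status is RUNNING or COMPENSATING.
        return [
            saga_id
            for saga_id, status in self._statuses.items()
            if status in RECOVERABLE_STATUSES
        ]


storage = InMemoryStorageSketch()
storage.save("running-saga", SagaStatus.RUNNING)
storage.save("compensating-saga", SagaStatus.COMPENSATING)
storage.save("failed-saga", SagaStatus.FAILED)
storage.save("completed-saga", SagaStatus.COMPLETED)

recovered = storage.get_sagas_for_recovery()
assert "failed-saga" not in recovered
assert set(recovered) == {"running-saga", "compensating-saga"}
```

An allow-list of recoverable statuses, rather than a deny-list of excluded ones, means any status added later stays out of recovery unless it is opted in explicitly.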