@vadikko2
Owner

Saga recovery: sagas that failed before any step completed

Problem: Sagas that failed on the first step (no steps with act + COMPLETED in the log) were still selected by the recovery task on every run. Compensation never ran for them because there were no completed steps to compensate, yet they stayed in a "recoverable" state and were picked up again on each subsequent run, producing repeated "recovered" log entries and unnecessary DB load.

Changes:

  1. Exclude FAILED sagas from recovery
    get_sagas_for_recovery() now returns only sagas in RUNNING or COMPENSATING status. Sagas already in FAILED status are no longer returned, so the recovery task does not pick them up again every minute.

    • Updated in: SqlAlchemySagaStorage, MemorySagaStorage, and the storage protocol docstring.

  2. Explicit handling when there are no steps to compensate
    When compensation runs with an empty list of completed steps (saga failed before completing any step):

    • compensation.py: At the start of compensate_steps(), if completed_steps is empty, the saga status is set to FAILED, a short info log is written, and the function returns (no compensate() calls).
    • saga.py: When recovering a saga in COMPENSATING/FAILED with no completed steps, a warning is logged that the saga failed before any step completed and is being marked FAILED without calling compensate().
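
A minimal sketch of the early return described for compensate_steps(), under stated assumptions: the `Saga` dataclass, the `SagaStatus` enum, the string step names, and the synchronous signature are all hypothetical stand-ins and not the library's actual API. Setting the status to FAILED after a full compensation pass is also an assumption made only to keep the example complete.

```python
import logging
from dataclasses import dataclass, field
from enum import Enum

logger = logging.getLogger("saga.compensation")


class SagaStatus(Enum):
    # Hypothetical status enum; the real one may define other members.
    RUNNING = "running"
    COMPENSATING = "compensating"
    COMPLETED = "completed"
    FAILED = "failed"


@dataclass
class Saga:
    # Hypothetical saga record used only for this illustration.
    saga_id: str
    status: SagaStatus
    compensated: list = field(default_factory=list)


def compensate_steps(saga: Saga, completed_steps: list) -> None:
    """Compensate completed steps in reverse order, or fail fast if there are none."""
    if not completed_steps:
        # Nothing was completed, so there is nothing to undo:
        # mark the saga FAILED once and skip every compensate() call.
        saga.status = SagaStatus.FAILED
        logger.info("Saga %s has no completed steps; marked FAILED", saga.saga_id)
        return

    saga.status = SagaStatus.COMPENSATING
    for step in reversed(completed_steps):
        # Stand-in for calling the step's compensate() handler.
        saga.compensated.append(step)
    # Assumption: a fully compensated saga ends up FAILED.
    saga.status = SagaStatus.FAILED


first_step_failure = Saga("saga-1", SagaStatus.RUNNING)
compensate_steps(first_step_failure, [])

mid_failure = Saga("saga-2", SagaStatus.RUNNING)
compensate_steps(mid_failure, ["reserve", "charge"])
```

In this sketch the saga that failed on its first step leaves the function in FAILED status with no compensations recorded, while the saga with two completed steps has them undone in reverse order.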

Result: Sagas that fail on the first step are marked FAILED once and no longer appear in recovery. Sagas that fail mid-execution (e.g. because an external service is down) continue to be recovered as RUNNING or COMPENSATING until compensation finishes or the saga is marked FAILED.

Tests: Integration tests for get_sagas_for_recovery now assert that only RUNNING and COMPENSATING sagas are returned, and that FAILED sagas are excluded.
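
The filtering rule and the kind of assertion the tests make could look roughly like the sketch below. `ToySagaStorage`, its `set_status()` helper, and the `RECOVERABLE_STATUSES` constant are hypothetical names for illustration; they do not reproduce MemorySagaStorage or SqlAlchemySagaStorage, and the real get_sagas_for_recovery() may take additional parameters.

```python
from enum import Enum


class SagaStatus(Enum):
    # Hypothetical status enum; the real one may define other members.
    RUNNING = "running"
    COMPENSATING = "compensating"
    COMPLETED = "completed"
    FAILED = "failed"


# Only these statuses make a saga a recovery candidate; FAILED is excluded.
RECOVERABLE_STATUSES = frozenset({SagaStatus.RUNNING, SagaStatus.COMPENSATING})


class ToySagaStorage:
    """Hypothetical in-memory storage mapping saga id -> status."""

    def __init__(self) -> None:
        self._statuses = {}

    def set_status(self, saga_id: str, status: SagaStatus) -> None:
        self._statuses[saga_id] = status

    def get_sagas_for_recovery(self) -> list:
        # Return ids of sagas that are still in progress or mid-compensation.
        return [
            saga_id
            for saga_id, status in self._statuses.items()
            if status in RECOVERABLE_STATUSES
        ]


storage = ToySagaStorage()
storage.set_status("running-saga", SagaStatus.RUNNING)
storage.set_status("compensating-saga", SagaStatus.COMPENSATING)
storage.set_status("failed-saga", SagaStatus.FAILED)
storage.set_status("completed-saga", SagaStatus.COMPLETED)

candidates = storage.get_sagas_for_recovery()
assert set(candidates) == {"running-saga", "compensating-saga"}
assert "failed-saga" not in candidates
```

Keeping the recoverable statuses in one constant means a SQL-backed storage could apply the same set in its query filter, so both implementations stay aligned.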

@codspeed-hq
Contributor

codspeed-hq bot commented Jan 28, 2026

Merging this PR will not alter performance

✅ 11 untouched benchmarks


Comparing bugfix-fix-fetching-recovery-saga-candidates (6a04f32) with master (a1427f3)


@vadikko2 vadikko2 merged commit 05ee3e0 into master Jan 28, 2026
8 checks passed