Description
Overview
Implement a configurable retry_strategy parameter for the Agent class to replace hardcoded retry logic with a flexible, hook-based retry system. This will allow users to configure retry behavior and implement custom retry strategies.
Current State
The SDK currently has hardcoded retry logic in event_loop.py:
- MAX_ATTEMPTS = 6
- INITIAL_DELAY = 4 seconds
- MAX_DELAY = 240 seconds (4 minutes)
- Exponential backoff for ModelThrottledException
- Emits EventLoopThrottleEvent during retries
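To make the current defaults concrete, here is a minimal sketch of the delay schedule they imply. It assumes the delay doubles after each throttled attempt and that no sleep follows the final attempt; the issue only says "exponential backoff", so the factor of 2 and the helper name backoff_schedule are illustrative assumptions, not the SDK's code.

```python
MAX_ATTEMPTS = 6
INITIAL_DELAY = 4  # seconds
MAX_DELAY = 240  # seconds (4 minutes)


def backoff_schedule(max_attempts=MAX_ATTEMPTS, initial_delay=INITIAL_DELAY,
                     max_delay=MAX_DELAY, factor=2):
    """Return the sleep (in seconds) before each retry.

    Hypothetical helper: factor=2 is an assumption, not taken from the SDK.
    """
    delays = []
    delay = initial_delay
    # max_attempts total calls means at most max_attempts - 1 sleeps.
    for _ in range(max_attempts - 1):
        delays.append(min(delay, max_delay))
        delay *= factor
    return delays


print(backoff_schedule())  # [4, 8, 16, 32, 64]
```

Under these assumptions the MAX_DELAY cap is never reached with the default 6 attempts; it only matters once max_attempts is raised.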
Related Issues and PRs
- [FEATURE] make event loop settings configurable strands-agents/sdk-python#283
- [FEATURE] Make event loop try/retry logic sleep times configurable strands-agents/sdk-python#527
- [FEATURE] Retries for ServiceUnavailableException (503 errors) strands-agents/sdk-python#370
- feat: allow hooks to retry model invocations on exceptions strands-agents/sdk-python#1405 (Hook-based retry support)
Implementation Requirements
1. Create Retry Strategy Classes
Location: src/strands/agent/retry.py (or similar appropriate location)
Create a base retry strategy and built-in implementations:
ModelRetryStrategy (Default)
- Implements HookProvider protocol
- Configurable parameters:
  - max_attempts (default: 6)
  - initial_delay (default: 4 seconds)
  - max_delay (default: 240 seconds)
- Implements exponential backoff for ModelThrottledException
- Registers callback for AfterModelCallEvent
- Sets event.retry = True on throttling exceptions (respecting max_attempts)
- Includes logging for retry attempts (using SDK logging standards)
- Supports async sleep during backoff delays
- Must emit EventLoopThrottleEvent for backwards compatibility (this might be hardcoded to the agent loop if needed since hooks cannot emit events)
Naming: Choose a better name than "ModelRetrys" - something that represents model retries but isn't throttling-specific (e.g., ModelRetryStrategy, RetryStrategy, etc.)
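A rough sketch of how such a strategy might look follows. The SDK types are replaced by small stand-ins so the block runs on its own: the shapes of AfterModelCallEvent (exception and retry fields) and HookRegistry.add_callback are assumptions made for illustration, as are the doubling factor and the attempt counter. It does not cover EventLoopThrottleEvent emission.

```python
import asyncio
import logging
from dataclasses import dataclass
from typing import Optional

logger = logging.getLogger(__name__)


# --- Stand-ins for SDK types (hypothetical shapes, for illustration only) ---
class ModelThrottledException(Exception):
    pass


@dataclass
class AfterModelCallEvent:
    exception: Optional[Exception] = None
    retry: bool = False


class HookRegistry:
    def __init__(self):
        self.callbacks = {}

    def add_callback(self, event_type, callback):
        self.callbacks.setdefault(event_type, []).append(callback)


# --- Sketch of the default strategy ---
class ModelRetryStrategy:
    def __init__(self, max_attempts=6, initial_delay=4, max_delay=240):
        self.max_attempts = max_attempts
        self.initial_delay = initial_delay
        self.max_delay = max_delay
        self._attempt = 0  # throttled attempts seen for the current call

    def register_hooks(self, registry, **kwargs):
        registry.add_callback(AfterModelCallEvent, self._on_after_model_call)

    async def _on_after_model_call(self, event):
        if not isinstance(event.exception, ModelThrottledException):
            self._attempt = 0  # success or unrelated error: start fresh
            return
        self._attempt += 1
        if self._attempt >= self.max_attempts:
            self._attempt = 0
            return  # attempts exhausted: leave retry False so the error surfaces
        # Assumed doubling backoff, capped at max_delay.
        delay = min(self.initial_delay * 2 ** (self._attempt - 1), self.max_delay)
        logger.debug("attempt=<%d>, delay=<%s> | throttled, retrying", self._attempt, delay)
        await asyncio.sleep(delay)
        event.retry = True
```

Keeping the attempt counter on the strategy instance is the simplest option for a sketch; a real implementation would need to decide how that state behaves when one strategy is shared across concurrent invocations.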
NoopRetryStrategy
- Implements HookProvider protocol
- No-op implementation for users who want to explicitly disable retries
- register_hooks() does nothing
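The no-op variant is small enough to sketch in full. The HookRegistry below is a stand-in with an assumed add_callback method, included only so the example can show that nothing gets registered.

```python
class HookRegistry:
    """Stand-in for the SDK registry (hypothetical shape)."""

    def __init__(self):
        self.callbacks = {}

    def add_callback(self, event_type, callback):
        self.callbacks.setdefault(event_type, []).append(callback)


class NoopRetryStrategy:
    """Registers no callbacks, so nothing ever sets event.retry."""

    def register_hooks(self, registry, **kwargs):
        return None


registry = HookRegistry()
NoopRetryStrategy().register_hooks(registry)
print(registry.callbacks)  # {}
```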
2. Update Agent Class
Location: src/strands/agent/agent.py
Add retry_strategy parameter to Agent.__init__():
    def __init__(
        self,
        # ... existing parameters ...
        retry_strategy: Optional[HookProvider] = None,
        # ... other parameters ...
    ):
Behavior:
- If retry_strategy is None: Default to ModelRetryStrategy() with current defaults (6 attempts, 4s initial, 240s max)
- Store as read-only property: self._retry_strategy
- Register retry_strategy as a hook like any other HookProvider
- Type hint as HookProvider (or create more specific RetryStrategy protocol if needed)
Integration:
- May need to access retry_strategy from event loop for backwards compatibility (emitting EventLoopThrottleEvent)
- Works alongside other hooks - retry_strategy is just another registered hook
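The behavior described above could be wired up roughly as follows. This is a sketch under assumptions: HookRegistry.add_hook and the public retry_strategy property name are illustrative, and ModelRetryStrategy is reduced to a stub that only carries its configuration.

```python
from typing import Optional, Protocol


class HookProvider(Protocol):
    def register_hooks(self, registry, **kwargs) -> None: ...


class HookRegistry:
    """Stand-in registry; add_hook is an assumed method name."""

    def __init__(self):
        self.providers = []

    def add_hook(self, provider):
        self.providers.append(provider)
        provider.register_hooks(self)


class ModelRetryStrategy:
    """Stub that only holds the configuration."""

    def __init__(self, max_attempts=6, initial_delay=4, max_delay=240):
        self.max_attempts = max_attempts
        self.initial_delay = initial_delay
        self.max_delay = max_delay

    def register_hooks(self, registry, **kwargs) -> None:
        pass


class Agent:
    def __init__(self, retry_strategy: Optional[HookProvider] = None):
        self.hooks = HookRegistry()
        # None falls back to the built-in strategy with the current defaults.
        self._retry_strategy = (
            retry_strategy if retry_strategy is not None else ModelRetryStrategy()
        )
        # Registered like any other HookProvider.
        self.hooks.add_hook(self._retry_strategy)

    @property
    def retry_strategy(self) -> HookProvider:
        """Read-only access; there is deliberately no setter."""
        return self._retry_strategy


agent = Agent()
print(type(agent.retry_strategy).__name__)  # ModelRetryStrategy
```

Comparing against None explicitly, rather than using `retry_strategy or ModelRetryStrategy()`, avoids silently replacing a user-supplied strategy object that happens to be falsy.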
3. Refactor Event Loop
Location: src/strands/event_loop/event_loop.py
Remove:
- MAX_ATTEMPTS = 6
- INITIAL_DELAY = 4
- MAX_DELAY = 240
- Hardcoded throttling retry logic in _handle_model_execution()
Refactor:
- Move retry logic from event loop to ModelRetryStrategy hook
- Keep the retry loop structure but rely on hooks setting AfterModelCallEvent.retry
- Ensure EventLoopThrottleEvent is still emitted (may need special handling for built-in ModelRetryStrategy)
- The event loop should be simpler - just invoke hooks and respect the retry field
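One way to picture the simplified loop: call the model, hand the outcome to the hooks, and go around again only if a hook set retry. The sketch below uses a stand-in AfterModelCallEvent and a plain list of async callbacks in place of the SDK's hook registry; call_model_with_retries is a hypothetical name, and EventLoopThrottleEvent emission is not shown.

```python
import asyncio
from dataclasses import dataclass
from typing import Optional


@dataclass
class AfterModelCallEvent:
    """Stand-in for the SDK event (assumed fields)."""

    exception: Optional[Exception] = None
    retry: bool = False


async def call_model_with_retries(invoke_model, hooks):
    """Invoke the model until no hook asks for another attempt."""
    while True:
        try:
            result = await invoke_model()
            event = AfterModelCallEvent()
        except Exception as exc:
            result = None
            event = AfterModelCallEvent(exception=exc)

        for hook in hooks:
            await hook(event)

        if event.retry:
            continue  # a hook (e.g. the retry strategy) requested a retry
        if event.exception is not None:
            raise event.exception  # nobody asked to retry: surface the error
        return result


# Usage: a model that fails twice, and a hook that always retries on error.
calls = {"n": 0}


async def flaky_model():
    calls["n"] += 1
    if calls["n"] < 3:
        raise RuntimeError("throttled")
    return "ok"


async def retry_on_error(event):
    if event.exception is not None:
        event.retry = True


print(asyncio.run(call_model_with_retries(flaky_model, [retry_on_error])))  # ok
```

Note that the loop itself has no attempt limit; bounding the number of retries becomes the responsibility of whichever hook sets the retry field.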
4. Backwards Compatibility
Critical Requirement: Existing code relying on EventLoopThrottleEvent must continue to work.
Approach:
- ModelRetryStrategy must emit EventLoopThrottleEvent during retries
- May need to check if retry_strategy is the built-in ModelRetryStrategy for special event handling
- Default behavior (when retry_strategy=None) must be identical to current behavior
5. Testing
Location: tests/strands/agent/ and tests/strands/event_loop/
Required Test Scenarios:
- Default behavior: Verify that not specifying retry_strategy uses default ModelRetryStrategy with 6 attempts
- Custom retry strategy: Test a user-implemented custom retry strategy
- Backwards compatibility: Verify that EventLoopThrottleEvent is emitted as before
- NoopRetryStrategy: Test that retries can be disabled
- Configured parameters: Test ModelRetryStrategy with custom max_attempts, initial_delay, max_delay
- Integration with other hooks: Verify retry_strategy works alongside other hooks (no special interaction tests needed, just basic compatibility)
Files to Modify
- src/strands/hooks/retry.py (new file)
  - Create ModelRetryStrategy class
  - Create NoopRetryStrategy class
  - Implement HookProvider protocol
  - Handle retry logic and event emission
- src/strands/agent/agent.py
  - Add retry_strategy parameter to __init__()
  - Add _retry_strategy read-only property
  - Register retry_strategy as hook
- src/strands/event_loop/event_loop.py
  - Remove hardcoded constants
  - Refactor _handle_model_execution() to rely on hooks
  - Simplify retry loop logic
- tests/strands/hooks/test_retry.py (new file)
  - Test ModelRetryStrategy with default and custom parameters
  - Test NoopRetryStrategy
  - Test custom retry strategy implementation
- tests/strands/agent/test_agent_retry_strategy.py (new file or add to existing)
  - Test Agent initialization with retry_strategy
  - Test backwards compatibility
  - Test EventLoopThrottleEvent emission
- tests/strands/event_loop/test_event_loop_retry.py (update existing)
  - Update existing retry tests to work with new system
  - Test backwards compatibility
- Documentation (location TBD)
  - User guide for retry_strategy feature
  - Examples of custom retry strategies
Acceptance Criteria
- ModelRetryStrategy class implements HookProvider and handles throttling retries
- NoopRetryStrategy class implements HookProvider with no-op behavior
- Agent accepts retry_strategy parameter (defaults to ModelRetryStrategy)
- Hardcoded retry constants removed from event_loop.py
- Event loop refactored to rely on hook-based retries
- EventLoopThrottleEvent still emitted for backwards compatibility
- Tests pass for default behavior with same retry characteristics as before
- Tests pass for custom retry strategy implementation
- Tests verify backwards compatibility (EventLoopThrottleEvent emission)
- Tests pass for NoopRetryStrategy
- Tests pass for configured retry parameters
- Documentation created for retry_strategy feature
- All existing tests continue to pass
- Code follows SDK patterns (logging, type hints, docstrings)
- Pre-commit hooks pass (formatting, linting, type checking)
Technical Approach
Implementation Strategy
- Create retry strategy classes with HookProvider protocol
- Integrate into Agent by adding retry_strategy parameter and registering as hook
- Refactor event loop to remove hardcoded logic and rely on hooks
- Ensure backwards compatibility by emitting EventLoopThrottleEvent
- Write comprehensive tests covering all scenarios
- Document the feature with examples
Key Design Decisions
- Hook-based approach: Retry strategies are HookProviders, registered like any other hook
- Read-only property: Store as _retry_strategy for potential backwards compat access
- Default behavior preserved: None defaults to ModelRetryStrategy with current settings
- Explicit disable: Use NoopRetryStrategy instead of None to disable retries
- Backwards compatible: EventLoopThrottleEvent emission preserved
Integration Points
- Retry strategies integrate via the existing hook system
- No special handling needed for interaction with other hooks
- Event loop continues to respect AfterModelCallEvent.retry field
- ModelRetryStrategy sets retry field based on its configuration
Notes
- The name "ModelRetrys" should be improved to something more generic that represents model retries without being throttling-specific
- This feature enables users to implement sophisticated retry logic beyond throttling (rate limiting, circuit breakers, custom backoff strategies, etc.)
- The hook-based approach maintains consistency with SDK patterns and provides maximum flexibility
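As an illustration of the kind of custom strategy this would enable, here is a sketch that retries a fixed number of times with a constant delay on a 503-style error. Everything in it is hypothetical: ServiceUnavailableError, FixedDelayRetryStrategy, and the stand-in AfterModelCallEvent and HookRegistry shapes are assumptions chosen so the block runs on its own.

```python
import asyncio
from dataclasses import dataclass
from typing import Optional


class ServiceUnavailableError(Exception):
    """Hypothetical stand-in for a 503-style model error."""


@dataclass
class AfterModelCallEvent:
    """Stand-in for the SDK event (assumed fields)."""

    exception: Optional[Exception] = None
    retry: bool = False


class HookRegistry:
    """Stand-in registry (assumed add_callback method)."""

    def __init__(self):
        self.callbacks = {}

    def add_callback(self, event_type, callback):
        self.callbacks.setdefault(event_type, []).append(callback)


class FixedDelayRetryStrategy:
    """Retry up to max_retries times, sleeping a constant delay in between."""

    def __init__(self, max_retries=3, delay=1.0):
        self.max_retries = max_retries
        self.delay = delay
        self._retries = 0

    def register_hooks(self, registry, **kwargs):
        registry.add_callback(AfterModelCallEvent, self._maybe_retry)

    async def _maybe_retry(self, event):
        if not isinstance(event.exception, ServiceUnavailableError):
            self._retries = 0  # success or unrelated error: reset the budget
            return
        if self._retries >= self.max_retries:
            self._retries = 0
            return  # budget spent: let the error propagate
        self._retries += 1
        await asyncio.sleep(self.delay)
        event.retry = True
```

Under the proposed API, such a strategy would presumably be passed as Agent(retry_strategy=FixedDelayRetryStrategy(max_retries=2)), replacing the default throttling behavior rather than adding to it.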