Commit ccdb24a

fix: add litellm retry with exponential backoff for rate limit errors
- Add `num_retries=3` default to LLMConfig so litellm retries OpenAI 429 rate limit errors with its built-in exponential backoff.
- Increase Temporal DEFAULT_RETRY_POLICY from 1 attempt (no retries) to 3 attempts with exponential backoff (1s, 2s, 4s, ... up to 30s).

This complements the HTTPX connection limit reduction in the agentex backend (scaleapi/scale-agentex#144) to address OpenAI rate limiting under high concurrent load.
1 parent a277f10 commit ccdb24a
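The capped exponential backoff described in the commit message (1s, 2s, 4s, ... up to 30s) can be sketched as a small helper. `backoff_schedule` is a hypothetical name for illustration, not code from this commit:

```python
def backoff_schedule(
    max_attempts: int = 3,
    initial: float = 1.0,
    coefficient: float = 2.0,
    cap: float = 30.0,
) -> list[float]:
    """Delays (in seconds) between attempts: initial * coefficient**i, capped at `cap`.

    With max_attempts=3 there are two waits: after attempt 1 and after attempt 2.
    """
    return [min(initial * coefficient**i, cap) for i in range(max_attempts - 1)]


print(backoff_schedule())   # delays for the 3-attempt policy in this commit: [1.0, 2.0]
print(backoff_schedule(8))  # longer schedules hit the 30s cap
```

With the defaults from this commit the worst case adds only 1s + 2s of waiting before the final attempt, while the 30s cap keeps longer schedules bounded.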

File tree

2 files changed: +8 −1 lines changed

src/agentex/lib/adk/providers/_modules/litellm.py

Lines changed: 7 additions & 1 deletion
@@ -26,7 +26,13 @@
 logger = make_logger(__name__)

 # Default retry policy for all LiteLLM operations
-DEFAULT_RETRY_POLICY = RetryPolicy(maximum_attempts=1)
+# Retries with exponential backoff: 1s, 2s, 4s, ... up to 30s between attempts
+DEFAULT_RETRY_POLICY = RetryPolicy(
+    maximum_attempts=3,
+    initial_interval=timedelta(seconds=1),
+    backoff_coefficient=2.0,
+    maximum_interval=timedelta(seconds=30),
+)


 class LiteLLMModule:
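As an illustration of the semantics the new RetryPolicy encodes, here is a minimal, hypothetical retry wrapper (not from this repo; Temporal's server-side retries do the real work) that retries a callable with the same capped exponential backoff:

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def call_with_retries(
    fn: Callable[[], T],
    max_attempts: int = 3,
    initial: float = 1.0,
    coefficient: float = 2.0,
    cap: float = 30.0,
) -> T:
    """Invoke fn, retrying on any exception with capped exponential backoff."""
    delay = initial
    for attempt in range(1, max_attempts + 1):
        try:
            return fn()
        except Exception:
            if attempt == max_attempts:
                raise  # out of attempts: surface the last error
            time.sleep(min(delay, cap))  # wait before the next attempt
            delay *= coefficient
    raise AssertionError("unreachable")
```

A real policy would also distinguish retryable errors (e.g. 429 rate limits) from non-retryable ones, which Temporal's RetryPolicy supports via `non_retryable_error_types`.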

src/agentex/lib/types/llm_messages.py

Lines changed: 1 addition & 0 deletions
@@ -58,6 +58,7 @@ class LLMConfig(BaseModel):
     parallel_tool_calls: bool | None = None
     logprobs: bool | None = None
     top_logprobs: int | None = None
+    num_retries: int | None = 3


 class ContentPartText(BaseModel):

0 commit comments
