@dicksontsai
Collaborator

Summary

  • Add configurable timeout (5s default) for task group cleanup in Query.close()
  • Prevents close() from hanging indefinitely at 100% CPU when tasks don't respond to cancellation

Problem

The Query.close() method could hang indefinitely when tasks in the anyio task group didn't properly respond to cancellation (issue #378). This caused anyio's _deliver_cancellation() to spin at 100% CPU.

Affected scenarios:

  • Browser disconnection mid-stream (SSE with FastAPI)
  • Subprocess OOM-killed (exit code -9 / SIGKILL)
  • JupyterLab extension after calling client.disconnect()

Solution

Set a deadline on the task group's own cancel scope during close():

self._tg.cancel_scope.deadline = (
    anyio.current_time() + self._task_group_close_timeout
)

This approach avoids cancel scope nesting issues that occur when wrapping with anyio.move_on_after() or anyio.fail_after().

Configuration

A new environment variable CLAUDE_CODE_TASK_GROUP_CLOSE_TIMEOUT (milliseconds, default 5000) allows users to adjust the timeout if needed.
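One way the variable might be read is sketched below. The function name read_close_timeout_seconds and the fallback to the default on missing, non-numeric, or non-positive values are assumptions made for illustration; the PR description only specifies the variable name, the unit (milliseconds), and the default (5000).

```python
import math
import os
from typing import Mapping, Optional

ENV_VAR = "CLAUDE_CODE_TASK_GROUP_CLOSE_TIMEOUT"
DEFAULT_TIMEOUT_MS = 5000.0


def read_close_timeout_seconds(environ: Optional[Mapping[str, str]] = None) -> float:
    """Read the close timeout (milliseconds) and return it in seconds."""
    env = os.environ if environ is None else environ
    raw = env.get(ENV_VAR)
    try:
        timeout_ms = float(raw) if raw is not None else DEFAULT_TIMEOUT_MS
    except ValueError:
        # Assumption: non-numeric values fall back to the default.
        timeout_ms = DEFAULT_TIMEOUT_MS
    if not math.isfinite(timeout_ms) or timeout_ms <= 0:
        # Assumption: non-positive or non-finite values also fall back.
        timeout_ms = DEFAULT_TIMEOUT_MS
    # anyio deadlines are expressed in seconds, so convert from milliseconds.
    return timeout_ms / 1000.0


if __name__ == "__main__":
    print(read_close_timeout_seconds())
```

Converting to seconds once at read time keeps the value directly usable in anyio.current_time() + timeout.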

Test plan

  • Added unit test for close() timeout behavior
  • All existing tests pass (118 tests)
  • Linting passes

Closes #378

The Query.close() method could hang indefinitely when tasks in the anyio
task group didn't properly respond to cancellation. This caused anyio's
_deliver_cancellation() to spin at 100% CPU, affecting production
deployments (FastAPI SSE, ECS/Fargate, JupyterLab).

Changes:
- Add configurable timeout (5s default) for task group cleanup
- Set deadline on task group's cancel scope to avoid scope nesting issues
- Log warning when timeout occurs for debugging
- Add CLAUDE_CODE_TASK_GROUP_CLOSE_TIMEOUT env var for configurability

The fix uses the task group's own cancel scope deadline rather than
wrapping with a new cancel scope, which avoids the "Attempted to exit
a cancel scope that isn't the current task's current cancel scope" error.

Closes #378

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Development

Successfully merging this pull request may close these issues.

Query.close() can hang indefinitely causing 100% CPU usage due to missing timeout on task group cleanup
