Skip to content

Fix AsyncioConnection race conditions causing EBADF errors (#614)#697

Open
dkropachev wants to merge 1 commit intomasterfrom
fix/asyncio-close-race-614
Open

Fix AsyncioConnection race conditions causing EBADF errors (#614)#697
dkropachev wants to merge 1 commit intomasterfrom
fix/asyncio-close-race-614

Conversation

@dkropachev
Copy link
Collaborator

@dkropachev dkropachev commented Feb 12, 2026

Summary

  • Fix close() to wait for _close() completion from non-event-loop threads, eliminating the window where is_closed=True but the socket fd is still open (root cause of EBADF)
  • Set last_error on server EOF in handle_read() so factory() detects dead connections instead of returning them
  • Add is_closed/is_defunct guard in push() to reject writes on closed connections
  • Treat BrokenPipeError/ConnectionResetError in handle_write() as clean peer disconnections instead of defuncting, and skip defunct() in both I/O handlers if the connection is already shutting down

Test plan

  • New unit tests in tests/unit/io/test_asyncio_race_614.py cover all four race conditions using real socket pairs
  • Full unit test suite passes with no regressions
  • Integration tests with TLS and node restart scenarios

@dkropachev dkropachev force-pushed the fix/asyncio-close-race-614 branch 3 times, most recently from 46524ac to 7b815c8 Compare February 16, 2026 13:09
@dkropachev dkropachev self-assigned this Feb 16, 2026
@dkropachev dkropachev marked this pull request as ready for review February 16, 2026 13:26
@dkropachev dkropachev force-pushed the fix/asyncio-close-race-614 branch from 2b9d555 to 85d08e1 Compare February 17, 2026 04:52
@dkropachev dkropachev force-pushed the fix/asyncio-close-race-614 branch from 85d08e1 to 776d2d3 Compare February 17, 2026 12:55
Fix race conditions in AsyncioConnection that cause "[Errno 9] Bad
file descriptor" errors during node restarts, especially with TLS:

1. close() now waits for _close() to complete when called from outside
   the event loop thread, eliminating the window where is_closed=True
   but the socket fd is still open.

2. handle_read() sets last_error on server EOF so factory() detects dead
   connections instead of returning them to callers.

3. handle_write() treats peer disconnections as clean close instead of
   defuncting, and both I/O handlers skip defunct() if the connection
   is already shutting down.

Peer-disconnect detection is extracted into _is_peer_disconnect() helper
covering platform-specific behaviors:

- Windows: ProactorEventLoop raises plain OSError with winerror 10054
  (WSAECONNRESET) or 10053 (WSAECONNABORTED) instead of
  ConnectionResetError. Detection uses ConnectionError base class plus
  winerror check.

- macOS: Raises OSError(57) ENOTCONN when writing to a
  peer-disconnected socket, which is not a ConnectionError subclass.
  Detection uses errno-based checks for ENOTCONN, ESHUTDOWN,
  ECONNRESET, and ECONNABORTED.

- Windows _close(): ProactorEventLoop does not support
  remove_reader/remove_writer (raises NotImplementedError). These calls
  are wrapped so the socket is always closed regardless, and try/finally
  ensures connected_event is always set even if cleanup fails.
@dkropachev dkropachev force-pushed the fix/asyncio-close-race-614 branch from d73e85d to 9915064 Compare February 17, 2026 20:40
@dkropachev dkropachev requested a review from Lorak-mmk February 17, 2026 20:40
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants