
Conversation

@sithhell
Contributor

@sithhell sithhell commented Jan 8, 2026

The changes introduced in #1683 (changing run_loop from a condition variable to atomics) introduced a stack-use-after-free when task completion happens on a different thread: `run_loop::finish` might cause `sync_wait_t::apply_sender` to return before the noop task has been scheduled, ending the lifetime of the queue before the noop task push could be finalized:

  • Thread 1 sets __finishing_ to true.
  • Thread 2 observes __finishing_ is true while __noop_task hasn't been pushed yet, finishes successfully and returns.
  • Thread 1 tries to push __noop_task to the queue, which is now out of scope.

To avoid this situation, the sync_wait state is additionally synchronized on `finish` returning successfully, avoiding the aforementioned race condition.
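
To make the intended ordering concrete, here is a minimal sketch of the scheme (hypothetical names such as __sync_wait_state, __complete and __drive; an illustration, not the actual stdexec code): the completing thread publishes __done_ only after run_loop::finish() has returned, and the waiting thread lets the loop and state go out of scope only after observing __done_.

```cpp
#include <atomic>

// Hypothetical state owned by the thread calling sync_wait (lives on its stack).
struct __sync_wait_state {
  std::atomic<bool> __done_{false};
};

// Runs on the thread that completes the operation.
template <class _RunLoop>
void __complete(_RunLoop& __loop, __sync_wait_state& __state) noexcept {
  __loop.finish();  // may still push the noop task into the queue
  // Only after finish() has fully returned do we allow the waiter to proceed.
  __state.__done_.store(true, std::memory_order_release);
  __state.__done_.notify_one();
}

// Runs on the thread calling sync_wait.
template <class _RunLoop>
void __drive(_RunLoop& __loop, __sync_wait_state& __state) noexcept {
  __loop.run();  // drains the queue until finish() has been requested
  // Block until the completing thread is done touching the queue.
  __state.__done_.wait(false, std::memory_order_acquire);
}  // only now may the caller end the lifetime of __loop and __state
```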

@copy-pr-bot

copy-pr-bot bot commented Jan 8, 2026

This pull request requires additional validation before any workflows can run on NVIDIA's runners.



```cpp
    void wait() noexcept {
      // Account for spurious wakeups
      while (!__done_.load(stdexec::__std::memory_order_acquire)) {
```
Contributor

There's still a use-after-free bug: a background thread can store true to __done_, and then, before it calls notify, the waiter thread may see __done_ is true and break (say, after a spurious wakeup, or because the wait never reached the first iteration of the loop); sync_wait then returns and the state is destroyed.
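
A minimal standalone sketch of that window (hypothetical code, not the stdexec implementation), with the problematic interleaving spelled out in comments:

```cpp
#include <atomic>
#include <thread>

struct __state {
  std::atomic<bool> __done_{false};

  void complete() noexcept {
    __done_.store(true, std::memory_order_release);
    // If the waiter observes __done_ == true between the store above and the
    // notify below (spurious wakeup, or it never blocked at all), it may have
    // already destroyed *this -- the notify then touches freed stack memory.
    __done_.notify_one();
  }

  void wait() noexcept {
    __done_.wait(false, std::memory_order_acquire);
  }
};

int main() {
  std::thread worker;
  {
    __state st;  // on the waiting thread's stack, like the sync_wait state
    worker = std::thread([&st] { st.complete(); });
    st.wait();
  }  // st is destroyed here; worker may still be inside notify_one()
  worker.join();
}
```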

Btw, I don't think you need to loop here .. std::atomic::wait only returns when the value has changed, accounting for spurious wakeups (it has a loop internally).

I was able to detect this with Relacy, which I am in the process of updating to work with stdexec. The test is https://github.com/NVIDIA/stdexec/compare/main...ccotter:stdexec:sync-wait-rrd-bug?expand=1

To run the test, follow these instructions, except use this branch instead of relacy's main branch: dvyukov/relacy#33

Contributor

I'm actually not sure how this can be addressed. In code like

```cpp
exec::static_thread_pool pool{1};
{
  auto s = ex::schedule(pool.get_scheduler()) | ex::then([] { return 0; });
  ex::sync_wait(s);
}
```

when the background thread notifies the foreground thread to wake up and unblock, the communication channel (condition variable, atomic, or something else) may be destroyed once the sync_wait objects on the stack created by the foreground thread go out of scope. Whether the notifier uses done.store(true); done.notify_one(); or done = true; cv.notify_one();, I'm not sure how to ensure the foreground thread will wake up and continue, but only after the background thread has finished doing the notification...

Contributor

@lewissbaker - wondering if you might be able to shed some light on this .. I seem to have hit a mental dead end in not understanding how it's possible to have a sync_wait() without allocating the internal run_loop on the heap and giving a strong reference to the receiver to guarantee lifetime.

Contributor Author

@ccotter The receiver is the one allocating the shared state (currently on the stack).
In fact, the code you mentioned above is very similar to the code where I detected the bug this fix is for (see here: https://github.com/olympus-robotics/hephaestus/blob/main/modules/concurrency/tests/context_tests.cpp#L96).
The solution I came up with should actually be safe. I don't have a full proof, but I ran my test case around a million times.

Let me try to explain why the situation you are describing should be safe:

  • Both the state and the receiver are owned by the thread calling sync_wait.
  • The execution context on a different thread merely holds a reference to the shared state.

The execution context signals completion by performing the following steps (in a strict "happens before" order):
1.1. Set the finishing flag of the run loop to true.
1.2. (optional) Signal the run loop.
1.3. Set done to true.
1.4. Signal done.

The thread waiting on completion (driving the run loop) performs the following steps:
2.1. If finishing is true, go to 2.5.
2.2. Wait for a new task.
2.3. Execute the task.
2.4. Go to 2.1.
2.5. Wait for done to be set to true.
2.6. Return (ends the lifetime of the state).

Without the 'done' steps, we run into the situation that step 1.2 might access memory that is already invalid, because we reached step 2.6 too early.
Previously, this was ensured by holding the mutex while signalling the condition variable, which had the same effect.
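
A rough sketch of that earlier condition-variable scheme (assumed names, not the removed stdexec code): because the flag is set and the notification issued while the mutex is held, the waiter cannot observe the flag and tear down the state until the notifier has released the lock.

```cpp
#include <condition_variable>
#include <mutex>

struct __cv_state {
  std::mutex __mtx_;
  std::condition_variable __cv_;
  bool __done_ = false;

  // Completing thread: flag and notification both happen under the lock.
  void __complete() {
    std::unique_lock __lk{__mtx_};
    __done_ = true;
    __cv_.notify_one();
  }

  // Waiting thread: it can only see __done_ == true after acquiring the
  // mutex, i.e. after __complete() has finished touching the shared state.
  void __wait() {
    std::unique_lock __lk{__mtx_};
    __cv_.wait(__lk, [this] { return __done_; });
  }
};
```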

I can see how there is still a tiny window which can cause a use after free as you described (even after removing the loop).

I think the only way this can be truly fixed is to introduce a count of 'tasks in flight' for the run_loop (rough sketch below):

  • Each run_loop::__push_back call increments this count.
  • Each run_loop::__execute_all decrements this count by the amount of tasks it handled.
  • run_loop::finish increments the count before setting the finishing flag and decrements it before returning.
  • run_loop::run keeps looping until finishing is set to true and the count has dropped to zero.

I wanted to avoid this, since it sounded expensive and the solution I propose here at least seems to work...
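
A rough sketch of that counter (assumed member names; the elided parts are marked): run() may only return once finish() has been requested and no concurrent __push_back()/finish() call is still touching the loop.

```cpp
#include <atomic>
#include <cstddef>

class __run_loop_base {
  std::atomic<std::size_t> __in_flight_{0};
  std::atomic<bool> __finishing_{false};

 public:
  void __push_back(/* __task* __tsk */) {
    __in_flight_.fetch_add(1, std::memory_order_relaxed);
    // ... enqueue the task and wake up run() ...
  }

  void finish() {
    __in_flight_.fetch_add(1, std::memory_order_relaxed);
    __finishing_.store(true, std::memory_order_release);
    // ... wake up run() ...
    __in_flight_.fetch_sub(1, std::memory_order_release);
  }

  void run() {
    for (;;) {
      // ... pop and execute all queued tasks, decrementing __in_flight_
      //     once per task handled (__execute_all above) ...
      if (__finishing_.load(std::memory_order_acquire)
          && __in_flight_.load(std::memory_order_acquire) == 0) {
        return;  // no push_back()/finish() call is still in flight
      }
      // ... block until new work arrives or finish() is called ...
    }
  }
};
```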

@ccotter ccotter mentioned this pull request Jan 9, 2026
@sithhell
Contributor Author

sithhell commented Jan 9, 2026

> Btw, I don't think you need to loop here .. std::atomic::wait only returns when the value has changed, accounting for spurious wakeups (it has a loop internally).

Thanks for pointing this out!

sithhell added a commit to sithhell/stdexec that referenced this pull request Jan 9, 2026
We need to synchronize returning from `__run_loop_base::run` with
potentially concurrent calls to `__run_loop_base::finish`. This is done
by introducing a counter, ensuring proper completion of all tasks in
flight.

Also see NVIDIA#1742 for additional
information.
@sithhell
Contributor Author

Closing in favor of #1746

@sithhell sithhell closed this Jan 10, 2026
sithhell added a commit to sithhell/stdexec that referenced this pull request Jan 10, 2026
We need to synchronize returning from `__run_loop_base::run` with
potentially concurrent calls to `__run_loop_base::finish`. This is done
by introducing a counter, ensuring proper completion of all tasks in
flight.

Also see NVIDIA#1742 for additional
information.
@sithhell sithhell deleted the fix/sync_wait_stack_use_after_free branch January 11, 2026 08:34