Implement single reactor scheduler for epoll backend #49

sgerbino · 2026-01-21T21:56:29Z

Implement a single reactor thread coordination model for the epoll scheduler, matching Windows IOCP semantics for handler parallelism.

The original epoll scheduler had all threads block on epoll_wait() simultaneously. When wakeup() was called, the eventfd write caused ALL waiting threads to wake up, but only one had actual work.

The new single reactor model:

ONE thread runs epoll_wait() at a time (the reactor)
OTHER threads wait on a condition variable for handler work
notify_one() wakes exactly one worker when handlers are posted
eventfd interrupts the reactor only when no idle workers exist

scheduler.hpp:

Add condition_variable, reactor flags, idle thread count
Add run_reactor(), wake_one_thread_and_unlock(), interrupt_reactor()
Update class documentation to describe single reactor model

scheduler.cpp:

Rewrite do_one() with single reactor loop logic
Add run_reactor() for epoll_wait and I/O completion processing
Add wake_one_thread_and_unlock() for smart wake coordination
Fix errno corruption by saving it before timer processing
Fix memory leak in post() using unique_ptr
Fix over-counting bug in batched wake logic
Fix work_finished() to wake all threads when work reaches 0

op.hpp:

Remove unused 'events' field from epoll_op
Remove unused get_epoll_op() function

sockets.hpp/sockets.cpp:

Move implementations from header to new .cpp file
Change vectors to unordered_map for O(1) destroy_impl()
Remove redundant unregister_fd() calls in close_socket()

test/unit/io_context.cpp:

Add testMultithreaded() for concurrent post and run
Add testMultithreadedStress() for repeated iterations

Closes #33.

Summary by CodeRabbit

New Features
- Epoll-based socket backend added: non-blocking connect, read, write, accept, shutdown, and cancellation support.
Refactor
- Scheduler reworked to a single-reactor model with improved thread coordination for more efficient I/O handling.
- Socket/acceptor ownership tracking redesigned for faster lookup and cleanup.
Tests
- Added multithreaded executor tests using atomic-counter coroutines for stress verification.

_{✏️ Tip: You can customize this high-level summary in your review settings.}

coderabbitai · 2026-01-21T21:56:51Z

📝 Walkthrough

🚥 Pre-merge checks | ✅ 3 | ❌ 2

❌ Failed checks (1 warning, 1 inconclusive)

Check name	Status	Explanation	Resolution
Docstring Coverage	⚠️ Warning	Docstring coverage is 7.55% which is insufficient. The required threshold is 80.00%.	Write docstrings for the functions missing them to satisfy the coverage threshold.
Linked Issues check	❓ Inconclusive	The PR implements a single-reactor model but does not include EPOLLEXCLUSIVE flag implementation, which was the core investigation goal in issue `#33`.	Clarify whether the single-reactor model is intended as an alternative solution to EPOLLEXCLUSIVE, or if EPOLLEXCLUSIVE adoption was deferred to a future PR.

✅ Passed checks (3 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.
Title check	✅ Passed	The title 'Implement single reactor scheduler for epoll backend' accurately and concisely summarizes the main change in the changeset—replacing the all-threads epoll_wait model with a single-reactor coordination model.
Out of Scope Changes check	✅ Passed	All changes align with implementing the single-reactor scheduler model: scheduler refactor, socket backend implementation, test coverage, and internal header cleanup.

_{✏️ Tip: You can configure your own custom pre-merge checks in the settings.}

✨ Finishing touches

📝 Generate docstrings

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

coderabbitai

Actionable comments posted: 2

🤖 Fix all issues with AI agents

In `@src/corosio/src/detail/epoll/scheduler.cpp`:
- Around line 627-633: The condvar wait in the reactor path does not honor the
requested timeout because after wakeup_event_.wait_for(lock,
std::chrono::microseconds(timeout_us)) returns we never check elapsed time or
propagate a timeout; update the logic in the function that implements wait_one
(or the block using timeout_us and idle_thread_count_) to record the start time
before calling wakeup_event_.wait/_for, compute elapsed time after the wait, and
if elapsed >= timeout_us return a timeout result (or adjust remaining timeout
and return accordingly) so the caller observing wait_one(timeout_us) will not
block beyond the intended timeout; reference idle_thread_count_,
wakeup_event_.wait_for, wakeup_event_.wait and timeout_us to locate and modify
the code path.

In `@src/corosio/src/detail/epoll/sockets.cpp`:
- Around line 346-357: In epoll_acceptor_impl::cancel() ensure the acceptor's
shared ownership is captured before posting the cancel op: call
shared_from_this() and assign acc_.impl_ptr (handling std::bad_weak_ptr) prior
to checking was_registered and before svc_.post(&acc_); if shared_from_this()
fails, do not post the operation. Keep acc_.request_cancel() and the
unregister_fd call, but move the shared_from_this() capture earlier (or capture
into a local shared_ptr then assign acc_.impl_ptr) so svc_.post(&acc_) cannot
run without a valid impl_ptr, preventing a potential use-after-free.

🧹 Nitpick comments (3)

test/unit/io_context.cpp (1)
422-450: Stress test may benefit from additional diagnostic on failure.

The stress test runs 10 iterations, which is good for catching intermittent races. Consider adding iteration index to the failure message for easier debugging if a specific iteration fails.
💡 Optional: Add iteration context to assertion
-            BOOST_TEST(counter.load() == handlers_per_iteration);
+            BOOST_TEST_MESSAGE("Iteration " << iter);
+            BOOST_TEST(counter.load() == handlers_per_iteration);
src/corosio/src/detail/epoll/scheduler.cpp (2)
428-439: Potential redundant lock acquisition in work_finished().

The lock is acquired, then notify_all() is called, but if the reactor needs interruption, the lock is unlocked before calling interrupt_reactor(). This is correct, but consider whether notify_all() alone is sufficient when outstanding_work_ reaches zero, since all threads should exit anyway.

578-581: Consider batching notify_one() calls.

Calling notify_one() in a loop may have performance overhead. Since the lock is held, consider using a single notify_all() if completions_queued equals or exceeds idle_thread_count_, falling back to the loop otherwise.
💡 Optional: Batch wakeups when all idle threads needed
     // Wake idle workers if we queued I/O completions
     int to_wake = (std::min)(idle_thread_count_, completions_queued);
-    for (int i = 0; i < to_wake; ++i)
-        wakeup_event_.notify_one();
+    if (to_wake > 0)
+    {
+        if (to_wake >= idle_thread_count_)
+            wakeup_event_.notify_all();
+        else
+            for (int i = 0; i < to_wake; ++i)
+                wakeup_event_.notify_one();
+    }

src/corosio/src/detail/epoll/scheduler.cpp

src/corosio/src/detail/epoll/sockets.cpp

cppalliance-bot · 2026-01-21T22:02:08Z

An automated preview of the documentation is available at https://49.corosio.prtest3.cppalliance.org/index.html

If more commits are pushed to the pull request, the docs will rebuild at the same URL.

2026-01-21 22:26:52 UTC

Implement a single reactor thread coordination model for the epoll scheduler, matching Windows IOCP semantics for handler parallelism. The original epoll scheduler had all threads block on epoll_wait() simultaneously. When wakeup() was called, the eventfd write caused ALL waiting threads to wake up, but only one had actual work. The new single reactor model: - ONE thread runs epoll_wait() at a time (the reactor) - OTHER threads wait on a condition variable for handler work - notify_one() wakes exactly one worker when handlers are posted - eventfd interrupts the reactor only when no idle workers exist scheduler.hpp: - Add condition_variable, reactor flags, idle thread count - Add run_reactor(), wake_one_thread_and_unlock(), interrupt_reactor() - Update class documentation to describe single reactor model scheduler.cpp: - Rewrite do_one() with single reactor loop logic - Add run_reactor() for epoll_wait and I/O completion processing - Add wake_one_thread_and_unlock() for smart wake coordination - Fix errno corruption by saving it before timer processing - Fix memory leak in post() using unique_ptr - Fix over-counting bug in batched wake logic - Fix work_finished() to wake all threads when work reaches 0 op.hpp: - Remove unused 'events' field from epoll_op - Remove unused get_epoll_op() function sockets.hpp/sockets.cpp: - Move implementations from header to new .cpp file - Change vectors to unordered_map for O(1) destroy_impl() - Remove redundant unregister_fd() calls in close_socket() test/unit/io_context.cpp: - Add testMultithreaded() for concurrent post and run - Add testMultithreadedStress() for repeated iterations

coderabbitai bot reviewed Jan 21, 2026

View reviewed changes

src/corosio/src/detail/epoll/scheduler.cpp Show resolved Hide resolved

src/corosio/src/detail/epoll/sockets.cpp Show resolved Hide resolved

sgerbino force-pushed the feature/single-reactor branch from 7325c8d to 8cd5f14 Compare January 21, 2026 22:21

sgerbino merged commit 2539302 into cppalliance:develop Jan 21, 2026
26 of 27 checks passed

This was referenced Jan 22, 2026

Implement async DNS resolution for epoll backend #54

Closed

Enhance TLS testing with SNI, mTLS, certificate chain, and cancellation tests - part 1 #67

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Implement single reactor scheduler for epoll backend #49

Implement single reactor scheduler for epoll backend #49

Uh oh!

sgerbino commented Jan 21, 2026 •

edited by coderabbitai bot

Loading

Uh oh!

coderabbitai bot commented Jan 21, 2026 •

edited

Loading

Uh oh!

coderabbitai bot left a comment

Uh oh!

Uh oh!

Uh oh!

cppalliance-bot commented Jan 21, 2026 •

edited

Loading

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Implement single reactor scheduler for epoll backend #49

Implement single reactor scheduler for epoll backend #49

Uh oh!

Conversation

sgerbino commented Jan 21, 2026 • edited by coderabbitai bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary by CodeRabbit

Uh oh!

coderabbitai bot commented Jan 21, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

cppalliance-bot commented Jan 21, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

sgerbino commented Jan 21, 2026 •

edited by coderabbitai bot

Loading

coderabbitai bot commented Jan 21, 2026 •

edited

Loading

cppalliance-bot commented Jan 21, 2026 •

edited

Loading