[/
  Copyright Oliver Kowalke, Nat Goodspeed 2015.
  Distributed under the Boost Software License, Version 1.0.
  (See accompanying file LICENSE_1_0.txt or copy at
  http://www.boost.org/LICENSE_1_0.txt
]

[/ import path is relative to this .qbk file]

[#integration]
[section:integration Sharing a Thread with Another Main Loop]

[section Overview]

As always with cooperative concurrency, it is important not to let any one
fiber monopolize the processor too long: that could ["starve] other ready
fibers. This section discusses a couple of solutions.

[endsect]
[section Event-Driven Program]

Consider a classic event-driven program, organized around a main loop that
fetches and dispatches incoming I/O events. You are introducing
__boost_fiber__ because certain asynchronous I/O sequences are logically
sequential, and for those you want to write and maintain code that looks and
acts sequential.

You are launching fibers on the application[s] main thread because certain of
their actions will affect its user interface, and the application[s] UI
framework permits UI operations only on the main thread. Or perhaps those
fibers need access to main-thread data, and it would be too expensive in
runtime (or development time) to robustly defend every such data item with
thread synchronization primitives.

You must ensure that the application[s] main loop ['itself] doesn[t] monopolize
the processor: that the fibers it launches will get the CPU cycles they need.

The solution is the same as for any fiber that might claim the CPU for an
extended time: introduce calls to [ns_function_link this_fiber..yield]. The
most straightforward approach is to call `yield()` on every iteration of your
existing main loop. In effect, this unifies the application[s] main loop with
__boost_fiber__[s] internal main loop.
`yield()` allows the fiber manager to
run any fibers that have become ready since the previous iteration of the
application[s] main loop. When these fibers have had a turn, control passes to
the thread[s] main fiber, which returns from `yield()` and resumes the
application[s] main loop.

[endsect]
[#embedded_main_loop]
[section Embedded Main Loop]

More challenging is when the application[s] main loop is embedded in some other
library or framework. Such an application will typically, after performing all
necessary setup, pass control to some form of `run()` function from which
control does not return until application shutdown.

A __boost_asio__ program might call
[@http://www.boost.org/doc/libs/release/doc/html/boost_asio/reference/io_service/run.html
`io_service::run()`] in this way.

In general, the trick is to arrange to pass control to [ns_function_link
this_fiber..yield] frequently. You could use an
[@http://www.boost.org/doc/libs/release/doc/html/boost_asio/reference/high_resolution_timer.html
Asio timer] for that purpose: instantiate the timer, arranging to call a
handler function when the timer expires. The handler function calls
`yield()`, then resets the timer and arranges to wake up again on its next
expiration.

Since, in this thought experiment, we always pass control to the fiber manager
via `yield()`, the calling fiber is never blocked. Therefore there is always
at least one ready fiber. Therefore the fiber manager never calls [member_link
algorithm..suspend_until].

Using
[@http://www.boost.org/doc/libs/release/doc/html/boost_asio/reference/io_service/post.html
`io_service::post()`] instead of setting a timer for some nonzero interval
would be unfriendly to other threads. When all I/O is pending and all fibers
are blocked, the `io_service` and the fiber manager would simply spin the CPU,
passing control back and forth to each other.
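
The timer-based ping described above might be sketched as follows. This is
illustrative only: the `ping()` function name and the 10-millisecond period
are inventions of this sketch, not requirements of either library, and error
handling is minimal.

```cpp
// Sketch: periodically yield to the fiber manager from within an
// Asio-driven main loop. Assumes Boost.Fiber and Boost.Asio.
#include <boost/asio.hpp>
#include <boost/fiber/all.hpp>
#include <chrono>

void ping(boost::asio::steady_timer& timer) {
    timer.expires_from_now(std::chrono::milliseconds(10));
    timer.async_wait([&timer](boost::system::error_code const& ec) {
        if (!ec) {
            boost::this_fiber::yield(); // run any fibers that became ready
            ping(timer);                // re-arm for the next wakeup
        }
    });
}
```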
Using a timer allows tuning the
responsiveness of this thread relative to others.

[endsect]
[section Deeper Dive into __boost_asio__]

By now the alert reader is thinking: but surely, with Asio in particular, we
ought to be able to do much better than periodic polling pings!

[/ @path link is relative to (eventual) doc/html/index.html, hence ../..]
This turns out to be surprisingly tricky. We present a possible approach in
[@../../examples/asio/round_robin.hpp `examples/asio/round_robin.hpp`].

[import ../examples/asio/round_robin.hpp]
[import ../examples/asio/autoecho.cpp]

One consequence of using __boost_asio__ is that you must always let Asio
suspend the running thread. Since Asio is aware of pending I/O requests, it
can arrange to suspend the thread in such a way that the OS will wake it on
I/O completion. No one else has sufficient knowledge.

So the fiber scheduler must depend on Asio for suspension and resumption. It
requires Asio handler calls to wake it.

One dismaying implication is that we cannot support multiple threads calling
[@http://www.boost.org/doc/libs/release/doc/html/boost_asio/reference/io_service/run.html
`io_service::run()`] on the same `io_service` instance. The reason is that
Asio provides no way to constrain a particular handler to be called only on a
specified thread. A fiber scheduler instance is locked to a particular thread:
that instance cannot manage any other thread[s] fibers. Yet if we allow
multiple threads to call `io_service::run()` on the same `io_service`
instance, a fiber scheduler which needs to sleep can have no guarantee that it
will reawaken in a timely manner. It can set an Asio timer, as described above
[mdash] but that timer[s] handler may well execute on a different thread!

Another implication is that since an Asio-aware fiber scheduler (not to
mention [link callbacks_asio `boost::fibers::asio::yield`]) depends on handler
calls from the `io_service`, it is the application[s] responsibility to ensure
that
[@http://www.boost.org/doc/libs/release/doc/html/boost_asio/reference/io_service/stop.html
`io_service::stop()`] is not called until every fiber has terminated.

It is easier to reason about the behavior of the presented `asio::round_robin`
scheduler if we require that, after initial setup, the thread[s] main fiber is
the fiber that calls `io_service::run()`, so let[s] impose that requirement.

Naturally, the first thing we must do on each thread using a custom fiber
scheduler is call [function_link use_scheduling_algorithm]. However, since
`asio::round_robin` requires an `io_service` instance, we must first declare
that.

[asio_rr_setup]

`use_scheduling_algorithm()` instantiates `asio::round_robin`, which naturally
calls its constructor:

[asio_rr_ctor]

`asio::round_robin` binds the passed `io_service` pointer and initializes a
[@http://www.boost.org/doc/libs/release/doc/html/boost_asio/reference/steady_timer.html
`boost::asio::steady_timer`]:

[asio_rr_suspend_timer]

Then it calls
[@http://www.boost.org/doc/libs/release/doc/html/boost_asio/reference/add_service.html
`boost::asio::add_service()`] with a nested `service` struct:

[asio_rr_service_top]
        ...
[asio_rr_service_bottom]

The `service` struct has a couple of roles.

Its foremost role is to manage a
[^std::unique_ptr<[@http://www.boost.org/doc/libs/release/doc/html/boost_asio/reference/io_service__work.html
`boost::asio::io_service::work`]>]. We want the `io_service` instance to
continue its main loop even when there is no pending Asio I/O.
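
Holding the `work` object is what keeps `run()` from returning while no
handlers happen to be queued. A minimal sketch of the idea (variable names
are illustrative, not taken from the example):

```cpp
// Sketch: a work guard keeps io_service::run() from returning early.
#include <boost/asio.hpp>
#include <memory>

boost::asio::io_service io_svc;
std::unique_ptr<boost::asio::io_service::work> keep_running{
    new boost::asio::io_service::work(io_svc)};
// io_svc.run() now blocks even with no pending I/O or handlers...
// keep_running.reset();  // ...until the guard is discarded.
```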

But when
[@http://www.boost.org/doc/libs/release/doc/html/boost_asio/reference/io_service__service/shutdown_service.html
`boost::asio::io_service::service::shutdown_service()`] is called, we discard
the `io_service::work` instance so the `io_service` can shut down properly.

Its other purpose is to
[@http://www.boost.org/doc/libs/release/doc/html/boost_asio/reference/io_service/post.html
`post()`] a lambda (not yet shown).
Let[s] walk further through the example program before coming back to explain
that lambda.

The `service` constructor returns to `asio::round_robin`[s] constructor,
which returns to `use_scheduling_algorithm()`, which returns to the
application code.

Once it has called `use_scheduling_algorithm()`, the application may now
launch some number of fibers:

[asio_rr_launch_fibers]

Since we don[t] specify a [class_link launch], these fibers are ready
to run, but have not yet been entered.

Having set everything up, the application calls
[@http://www.boost.org/doc/libs/release/doc/html/boost_asio/reference/io_service/run.html
`io_service::run()`]:

[asio_rr_run]

Now what?

Because this `io_service` instance owns an `io_service::work` instance,
`run()` does not immediately return. But [mdash] none of the fibers that will
perform actual work has even been entered yet!

Without that initial `post()` call in `service`[s] constructor, ['nothing]
would happen. The application would hang right here.

So, what should the `post()` handler execute? Simply [ns_function_link
this_fiber..yield]?

That would be a promising start. But we have no guarantee that any of the
other fibers will initiate any Asio operations to keep the ball rolling. For
all we know, every other fiber could reach a similar `boost::this_fiber::yield()`
call first. Control would return to the `post()` handler, which would return
to Asio, and...
the application would hang.

The `post()` handler could `post()` itself again. But as discussed in [link
embedded_main_loop the previous section], once there are actual I/O operations
in flight [mdash] once we reach a state in which no fiber is ready [mdash]
that would cause the thread to spin.

We could, of course, set an Asio timer [mdash] again as [link
embedded_main_loop previously discussed]. But in this ["deeper dive,] we[,]re
trying to do a little better.

The key to doing better is that since we[,]re in a fiber, we can run an actual
loop [mdash] not just a chain of callbacks. We can wait for ["something to
happen] by calling
[@http://www.boost.org/doc/libs/release/doc/html/boost_asio/reference/io_service/run_one.html
`io_service::run_one()`] [mdash] or we can execute already-queued Asio
handlers by calling
[@http://www.boost.org/doc/libs/release/doc/html/boost_asio/reference/io_service/poll.html
`io_service::poll()`].

Here[s] the body of the lambda passed to the `post()` call.

[asio_rr_service_lambda]

We want this loop to exit once the `io_service` instance has been
[@http://www.boost.org/doc/libs/release/doc/html/boost_asio/reference/io_service/stopped.html
`stopped()`].

As long as there are ready fibers, we interleave running ready Asio handlers
with running ready fibers.

If there are no ready fibers, we wait by calling `run_one()`. Once any Asio
handler has been called [mdash] no matter which [mdash] `run_one()` returns.
That handler may have transitioned some fiber to ready state, so we loop back
to check again.

(We won[t] describe `awakened()`, `pick_next()` or `has_ready_fibers()`, as
these are just like [member_link round_robin..awakened], [member_link
round_robin..pick_next] and [member_link round_robin..has_ready_fibers].)

That leaves `suspend_until()` and `notify()`.
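
Before turning to those, it may help to restate the lambda[s] control flow in
plain form. The following is a paraphrase for orientation, not the exact code
in `round_robin.hpp`:

```cpp
// Paraphrase: interleave ready Asio handlers with ready fibers;
// when no fiber is ready, block in run_one() until some handler runs.
while (!io_svc->stopped()) {
    if (has_ready_fibers()) {
        while (io_svc->poll());      // run all ready Asio handlers
        boost::this_fiber::yield();  // then let the ready fibers run
    } else if (!io_svc->run_one()) { // block until one handler has run
        break;                       // run_one() ran nothing: stopped
    }
}
```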

Doubtless you have been asking yourself: why are we calling
`io_service::run_one()` in the lambda loop? Why not call it in
`suspend_until()`, whose very API was designed for just such a purpose?

Under normal circumstances, when the fiber manager finds no ready fibers, it
calls [member_link algorithm..suspend_until]. Why test
`has_ready_fibers()` in the lambda loop? Why not leverage the normal
mechanism?

The answer is: it matters who[s] asking.

Consider the lambda loop shown above. The only __boost_fiber__ APIs it engages
are `has_ready_fibers()` and [ns_function_link this_fiber..yield]. `yield()`
does not ['block] the calling fiber: the calling fiber does not become
unready. It is immediately passed back to [member_link
algorithm..awakened], to be resumed in its turn when all other ready
fibers have had a chance to run. In other words: during a `yield()` call,
['there is always at least one ready fiber.]

As long as this lambda loop is still running, the fiber manager does not call
`suspend_until()` because it always has a fiber ready to run.

However, the lambda loop ['itself] can detect the case when no ['other] fibers are
ready to run: the running fiber is not ['ready] but ['running.]

That said, `suspend_until()` and `notify()` are in fact called during orderly
shutdown processing, so let[s] try a plausible implementation.

[asio_rr_suspend_until]

As you might expect, `suspend_until()` sets an
[@http://www.boost.org/doc/libs/release/doc/html/boost_asio/reference/steady_timer.html
`asio::steady_timer`] to
[@http://www.boost.org/doc/libs/release/doc/html/boost_asio/reference/basic_waitable_timer/expires_at.html
`expires_at()`] the passed
[@http://en.cppreference.com/w/cpp/chrono/steady_clock
`std::chrono::steady_clock::time_point`]. Usually.
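
The trailing ["Usually] can be captured by a small guard: skip re-arming the
timer when the requested expiry equals the one already set. Here is a
standard-library-only illustration; the `FakeTimer` struct and
`rearm_if_changed()` helper are invented for this sketch, whereas the real
code operates on the `steady_timer` itself:

```cpp
#include <cassert>
#include <chrono>

// Invented stand-in for the timer: records only its current expiry.
struct FakeTimer {
    std::chrono::steady_clock::time_point expiry{};
};

// Re-arm only when the expiry actually changes: re-setting the same
// expiry would cancel (and thus fire) the pending async_wait().
bool rearm_if_changed(FakeTimer& t,
                      std::chrono::steady_clock::time_point tp) {
    if (t.expiry == tp) {
        return false;  // already armed for this instant; nothing to do
    }
    t.expiry = tp;     // real code: timer.expires_at(tp) + async_wait()
    return true;
}
```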

As indicated in comments, we avoid setting `suspend_timer_` multiple times to
the ['same] `time_point` value, since every `expires_at()` call cancels any
previous
[@http://www.boost.org/doc/libs/release/doc/html/boost_asio/reference/basic_waitable_timer/async_wait.html
`async_wait()`] call. Without that guard, we could spin: reaching
`suspend_until()` means the fiber manager intends to yield the processor to
Asio; cancelling the previous `async_wait()` call would fire its handler,
causing `run_one()` to return, potentially causing the fiber manager to call
`suspend_until()` again with the same `time_point` value...

Given that we suspend the thread by calling `io_service::run_one()`, what[s]
important is that our `async_wait()` call will cause a handler to run, which
will cause `run_one()` to return. It[s] not so important specifically what
that handler does.

[asio_rr_notify]

Since an `expires_at()` call cancels any previous `async_wait()` call, we can
make `notify()` simply call `steady_timer::expires_at()`. That should cause
the `io_service` to call the `async_wait()` handler with `operation_aborted`.

The comments in `notify()` explain why we call `expires_at()` rather than
[@http://www.boost.org/doc/libs/release/doc/html/boost_asio/reference/basic_waitable_timer/cancel.html
`cancel()`].

[/ @path link is relative to (eventual) doc/html/index.html, hence ../..]
This `boost::fibers::asio::round_robin` implementation is used in
[@../../examples/asio/autoecho.cpp `examples/asio/autoecho.cpp`].

It seems possible that you could put together a more elegant Fiber / Asio
integration. But as noted at the outset: it[s] tricky.

[endsect]
[endsect]