# Considerations around Event Loops

Much of the software we use is written around an **event loop**. Some examples:

 - Chrome / Chromium, transmission, tmux, ntp SNTP... [libevent](https://libevent.org/)
 - node.js / cjdns / Julia / cmake ... [libuv](https://archive.is/64pOt)
 - Gstreamer, Gnome / GTK apps ... [glib](https://people.gnome.org/~desrt/glib-docs/glib-The-Main-Event-Loop.html)
 - SystemD ... sdevent
 - OpenWRT ... uloop

Many applications roll their own event loop using poll() or epoll() or similar,
using the same techniques. Another set of apps use message dispatchers that
take the same approach, but are for cases that don't need to support sockets.
Event libraries provide crossplatform abstractions for this functionality, and
automagically provide the best backend for their event waits on the platform.

libwebsockets networking operations require an event loop. It provides a
default one for the platform (based on poll() for Unix) if needed, but can
also natively use any of the event loop libraries listed above, including
"foreign" loops already created and managed by the application.

## What is an 'event loop'?

Event loops have the following characteristics:

 - they have a **single thread**, therefore they do not require locking
 - they are **not threadsafe**
 - they require **nonblocking IO**
 - they **sleep** while there are no events (aka the "event wait")
 - if one or more events are seen, they call back into user code to handle
   each in turn and then return to the wait (ie, "loop")

### They have a single thread

By doing everything in turn on a single thread, there can be no possibility of
conflicting access to resources from different threads... if the single thread
is in callback A, it cannot be in two places at the same time and also in
callback B accessing the same thing: it can never run any other code
concurrently, only sequentially, by design.

It means that all mutexes and other synchronization and locking can be
eliminated, along with the many kinds of bugs related to them.

### They are not threadsafe

Event loops mandate doing everything in a single thread. You cannot call their
apis from other threads, since there is no protection against reentrancy.

Lws apis cannot be called safely from any thread other than the event loop one,
with the sole exception of `lws_cancel_service()`.

### They have nonblocking IO

With blocking IO, you have to create threads in order to block them to learn
when your IO could proceed. In an event loop, all descriptors are set to use
nonblocking mode; we only attempt to read or write when we have been informed
by an event that there is something to read, or that it is possible to write.

So sacrificial, blocking discrete IO threads are also eliminated; we just do
what we should do sequentially, when we get the event indicating that we
should do it.

### They sleep while there are no events

An OS "wait" of some kind is used to sleep the event loop thread until there
is something to do. There's an explicit wait on file descriptors that have
pending read or write, and also an implicit wait for the next scheduled event.
Even if idle for descriptor events, the event loop will wake and handle
scheduled events at the right time.

In an idle system, the event loop stays in the wait and takes 0% CPU.

### If one or more events, they handle them and then return to sleep

As you can expect from "event loop", it is an infinite loop alternating
between sleeping in the event wait and sequentially servicing pending events,
by calling callbacks for each event on each object.

The callbacks handle the event and then "return to the event loop".
The state
of things in the loop itself is guaranteed to stay consistent while in a user
callback; only when you return from the callback to the event loop may socket
closes be processed and lead to object destruction.

Event libraries like libevent operate the same way: once you start the event
loop, it sits in an infinite loop in the library, calling back on events until
you "stop" or "break" the loop by calling apis.

## Why are event libraries popular?

Developers prefer an external library solution for the event loop because:

 - the quality is generally higher than self-rolled ones. Someone else is
   maintaining it, a fulltime team in some cases.
 - the event libraries are crossplatform; they will pick the most effective
   event wait for the platform without the developer having to know the
   details. For example, most libs can conceal whether the platform is Windows
   or Unix, and use native waits like epoll() or WSA accordingly.
 - if your application uses an event library, it is possible to integrate very
   cleanly with other libraries like lws that can use the same event library.
   That is extremely messy or downright impossible to do with hand-rolled
   loops.

Compared to just throwing threads at it:

 - thread lifecycle has to be closely managed; threads must start and must be
   brought to an end in a controlled way. Event loops may end and destroy
   objects they control at any time a callback returns to the event loop.

 - threads may do things sequentially or genuinely concurrently; this requires
   locking and careful management so that only deterministic and expected
   things happen to the user data.

 - threads do not scale well to, eg, serving tens of thousands of connections;
   web servers use event loops.

## Multiple codebases cooperating on one event loop

The ideal situation is all your code operating via a single event loop thread.
For lws-only code, including lws_protocols callbacks, this is the normal state
of affairs.

When there is other code that also needs to handle events, say already-existing
application code, or code handling a protocol not supported by lws, there are
a few options to allow them to work together. Which is "best" depends on the
details of what you're trying to do and what the existing code looks like.
In descending order of desirability:

### 1) Use a common event library for both lws and application code

This is the best choice for Linux-class devices. If you write your application
to use, eg, a libevent loop, then you only need to configure lws to also use
your libevent loop for them to be able to interoperate perfectly. Lws will
operate as a guest on this "foreign loop", and can cleanly create and destroy
its context on the loop without disturbing the loop.

In addition, your application can merge and interoperate with any other
libevent-capable libraries the same way, and compared to hand-rolled loops,
the quality will be higher.

### 2) Use lws native wsi semantics in the other code too

Lws supports raw socket and file fd abstractions inside the event loop. So if
your other code fits into that model, one way is to express your connections
as "RAW" wsis and handle them using lws_protocols callback semantics.

This ties the application code to lws, but it has the advantage that the
resulting code is unaware of the underlying event loop implementation and will
work no matter what it is.

### 3) Make a custom lws event lib shim for your custom loop

Lws provides an ops struct abstraction in order to integrate with event
libraries; you can find it in ./includes/libwebsockets/lws-eventlib-exports.h.

Lws uses this interface to implement its own event library plugins, but you
can also use it to make your own customized event loop shim, in the case there
is too much written for your custom event loop for it to be practical to
change.

In other words, this is a way to write a customized event lib "plugin" and tell
the lws_context to use it at creation time. See [minimal-http-server.c](https://libwebsockets.org/git/libwebsockets/tree/minimal-examples/http-server/minimal-http-server-eventlib-custom/minimal-http-server.c)

### 4) Cooperate at thread level

This is less desirable because it gives up on unifying the code to run from a
single thread; it means the codebases cannot call each other's apis directly.

In this scheme the existing threads do their own thing, locking a shared area
of memory and listing what they want done from the lws thread context, before
calling `lws_cancel_service()` to break the lws event wait. Lws will then
broadcast a `LWS_CALLBACK_EVENT_WAIT_CANCELLED` protocol callback, the handler
for which can lock the shared area and perform the requested operations from
the lws thread context.

### 5) Glue the loops together to wait sequentially (don't do this)

If you have two or more chunks of code with their own waits, it may be
tempting to have them wait sequentially in an outer event loop. (This is only
possible with the lws default loop and not the event library support; event
libraries have this loop inside their own `...run(loop)` apis.)
179 180``` 181 while (1) { 182 do_lws_wait(); /* interrupted at short intervals */ 183 do_app_wait(); /* interrupted at short intervals */ 184 } 185``` 186 187This never works well, either: 188 189 - the whole thing spins at 100% CPU when idle, or 190 191 - the waits have timeouts where they sleep for short periods, but then the 192 latency to service on set of events is increased by the idle timeout period 193 of the wait for other set of events 194 195## Common Misunderstandings 196 197### "Real Men Use Threads" 198 199Sometimes you need threads or child processes. But typically, whatever you're 200trying to do does not literally require threads. Threads are an architectural 201choice that can go either way depending on the goal and the constraints. 202 203Any thread you add should have a clear reason to specifically be a thread and 204not done on the event loop, without a new thread or the consequent locking (and 205bugs). 206 207### But blocking IO is faster and simpler 208 209No, blocking IO has a lot of costs to conceal the event wait by blocking. 210 211For any IO that may wait, you must spawn an IO thread for it, purely to handle 212the situation you get blocked in read() or write() for an arbitrary amount of 213time. It buys you a simple story in one place, that you will proceed on the 214thread if read() or write() has completed, but costs threads and locking to get 215to that. 216 217Event loops dispense with the threads and locking, and still provide a simple 218story, you will get called back when data arrives or you may send. 219 220Event loops can scale much better, a busy server with 50,000 connections active 221does not have to pay the overhead of 50,000 threads and their competing for 222locking. 223 224With blocked threads, the thread can do no useful work at all while it is stuck 225waiting. With event loops the thread can service other events until something 226happens on the fd. 
227 228### Threads are inexpensive 229 230In the cases you really need threads, you must have them, or fork off another 231process. But if you don't really need them, they bring with them a lot of 232expense, some you may only notice when your code runs on constrained targets 233 234 - threads have an OS-side footprint both as objects and in the scheduler 235 236 - thread context switches are not slow on modern CPUs, but have side effects 237 like cache flushing 238 239 - threads are designed to be blocked for arbitrary amounts of time if you use 240 blocking IO apis like write() or read(). Then how much concurrency is really 241 happening? Since blocked threads just go away silently, it is hard to know 242 when in fact your thread is almost always blocked and not doing useful work. 243 244 - threads require their own stack, which is on embedded is typically suffering 245 from a dedicated worst-case allocation where the headroom is usually idle 246 247 - locking must be handled, and missed locking or lock order bugs found 248 249### But... what about latency if only one thing happens at a time? 250 251 - Typically, at CPU speeds, nothing is happening at any given time on most 252 systems, the event loop is spending most of its time in the event wait 253 asleep at 0% cpu. 254 255 - The POSIX sockets layer is disjoint from the actual network device driver. 256 It means that once you hand off the packet to the networking stack, the POSIX 257 api just returns and leaves the rest of the scheduling, retries etc to the 258 networking stack and device, descriptor queuing is driven by interrupts in 259 the driver part completely unaffected by the event loop part. 260 261 - Passing data around via POSIX apis between the user code and the networking 262 stack tends to return almost immediately since its onward path is managed 263 later in another, usually interrupt, context. 

 - So long as enough packets' worth of data are in the network stack ready to
   be handed to descriptors, actual throughput is completely insensitive to
   jitter or latency at the application event loop

 - The network device itself is inherently serializing packets; it can only
   send one thing at a time. The networking stack locking also introduces
   hidden serialization by blocking multiple threads.

 - Many user systems are decoupled like the network stack and POSIX... the
   user event loop and its latencies do not affect backend processes occurring
   in interrupt or internal thread or other process contexts

## Conclusion

Event loops have been around for a very long time and are in wide use today
due to their advantages. Working with them successfully requires understanding
how to use them and why they have the advantages and restrictions they do.

The best results come from all the participants joining the same loop
directly. Using a common event library in the participating codebases allows
completely different code to call each other's apis safely, without locking.