<?xml version='1.0' encoding='utf-8' ?>
<!DOCTYPE chapter PUBLIC "-//OASIS//DTD DocBook XML V4.5//EN" "http://www.oasis-open.org/docbook/xml/4.5/docbookx.dtd" [
<!ENTITY % BOOK_ENTITIES SYSTEM "Wayland.ent">
%BOOK_ENTITIES;
]>
<chapter id="chap-Wayland-Architecture">
  <title>Wayland Architecture</title>
  <section id="sect-Wayland-Architecture-wayland_architecture">
    <title>X vs. Wayland Architecture</title>
    <para>
      A good way to understand the Wayland architecture and how it
      is different from X is to follow an event from the input
      device to the point where the change it effects appears on
      screen.
    </para>
    <para>
      This is where we are now with X:
    </para>
    <figure>
      <title>X architecture diagram</title>
      <mediaobjectco>
        <imageobjectco>
          <areaspec id="map1" units="other" otherunits="imagemap">
            <area id="area1_1" linkends="x_flow_1" x_steal="#step_1"/>
            <area id="area1_2" linkends="x_flow_2" x_steal="#step_2"/>
            <area id="area1_3" linkends="x_flow_3" x_steal="#step_3"/>
            <area id="area1_4" linkends="x_flow_4" x_steal="#step_4"/>
            <area id="area1_5" linkends="x_flow_5" x_steal="#step_5"/>
            <area id="area1_6" linkends="x_flow_6" x_steal="#step_6"/>
          </areaspec>
          <imageobject>
            <imagedata fileref="images/x-architecture.png" format="PNG" />
          </imageobject>
        </imageobjectco>
      </mediaobjectco>
    </figure>
    <para>
      <orderedlist>
        <listitem id="x_flow_1">
          <para>
            The kernel gets an event from an input device and sends
            it to X through the evdev input driver. The kernel does
            all the hard work here by driving the device and
            translating the different device-specific event
            protocols to the Linux evdev input event standard.
          </para>
        </listitem>
        <listitem id="x_flow_2">
          <para>
            The X server determines which window the event affects
            and sends it to the clients that have selected for the
            event in question on that window. The X server doesn't
            actually know how to do this right, since the window
            location on screen is controlled by the compositor and
            may be transformed in a number of ways that the X
            server doesn't understand (scaled down, rotated,
            wobbling, etc).
          </para>
        </listitem>
        <listitem id="x_flow_3">
          <para>
            The client looks at the event and decides what to do.
            Often the UI will have to change in response to the
            event - perhaps a check box was clicked or the pointer
            entered a button that must be highlighted. Thus the
            client sends a rendering request back to the X server.
          </para>
        </listitem>
        <listitem id="x_flow_4">
          <para>
            When the X server receives the rendering request, it
            sends it to the driver to let it program the hardware
            to do the rendering. The X server also calculates the
            bounding region of the rendering and sends that to the
            compositor as a damage event.
          </para>
        </listitem>
        <listitem id="x_flow_5">
          <para>
            The damage event tells the compositor that something
            changed in the window and that it has to recomposite
            the part of the screen where that window is visible.
            The compositor is responsible for rendering the entire
            screen contents based on its scenegraph and the
            contents of the X windows. Yet, it has to go through
            the X server to render this.
          </para>
        </listitem>
        <listitem id="x_flow_6">
          <para>
            The X server receives the rendering requests from the
            compositor and either copies the compositor back buffer
            to the front buffer or does a pageflip. In the general
            case, the X server has to do this step so it can
            account for overlapping windows, which may require
            clipping, and determine whether or not it can page
            flip. However, for a compositor, which is always
            fullscreen, this is another unnecessary context switch.
          </para>
        </listitem>
      </orderedlist>
    </para>
    <para>
      As suggested above, there are a few problems with this
      approach. The X server doesn't have the information to decide
      which window should receive the event, nor can it transform
      the screen coordinates to window-local coordinates. And even
      though X has handed responsibility for the final painting of
      the screen to the compositing manager, X still controls the
      front buffer and modesetting. Most of the complexity that the
      X server used to handle is now available in the kernel or
      self-contained libraries (KMS, evdev, mesa, fontconfig,
      freetype, cairo, Qt, etc). In general, the X server is now
      just a middle man that introduces an extra step between
      applications and the compositor and an extra step between the
      compositor and the hardware.
    </para>
    <para>
      In Wayland the compositor is the display server. We transfer
      the control of KMS and evdev to the compositor. The Wayland
      protocol lets the compositor send the input events directly
      to the clients and lets the client send the damage event
      directly to the compositor:
    </para>
    <figure>
      <title>Wayland architecture diagram</title>
      <mediaobjectco>
        <imageobjectco>
          <areaspec id="mapB" units="other" otherunits="imagemap">
            <area id="areaB_1" linkends="wayland_flow_1" x_steal="#step_1"/>
            <area id="areaB_2" linkends="wayland_flow_2" x_steal="#step_2"/>
            <area id="areaB_3" linkends="wayland_flow_3" x_steal="#step_3"/>
            <area id="areaB_4" linkends="wayland_flow_4" x_steal="#step_4"/>
          </areaspec>
          <imageobject>
            <imagedata fileref="images/wayland-architecture.png" format="PNG" />
          </imageobject>
        </imageobjectco>
      </mediaobjectco>
    </figure>
    <para>
      <orderedlist>
        <listitem id="wayland_flow_1">
          <para>
            The kernel gets an event and sends it to the
            compositor. This is similar to the X case, which is
            great, since we get to reuse all the input drivers in
            the kernel.
          </para>
        </listitem>
        <listitem id="wayland_flow_2">
          <para>
            The compositor looks through its scenegraph to
            determine which window should receive the event. The
            scenegraph corresponds to what's on screen and the
            compositor understands the transformations that it may
            have applied to the elements in the scenegraph. Thus,
            the compositor can pick the right window and transform
            the screen coordinates to window-local coordinates by
            applying the inverse transformations. The types of
            transformation that can be applied to a window are
            restricted only by what the compositor can do, as long
            as it can compute the inverse transformation for the
            input events.
          </para>
        </listitem>
        <listitem id="wayland_flow_3">
          <para>
            As in the X case, when the client receives the event,
            it updates the UI in response. But in the Wayland case,
            the rendering happens in the client, and the client
            just sends a request to the compositor to indicate the
            region that was updated.
          </para>
        </listitem>
        <listitem id="wayland_flow_4">
          <para>
            The compositor collects damage requests from its
            clients and then recomposites the screen. The
            compositor can then directly issue an ioctl to schedule
            a pageflip with KMS (see the sketch after this list).
          </para>
        </listitem>
      </orderedlist>
    </para>
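    <para>
      As an illustration of that last step, here is a minimal
      sketch of how a compositor might queue the pageflip through
      the KMS API in libdrm. The drm_fd, crtc_id and fb_id values
      are assumed to come from earlier modesetting and framebuffer
      setup; this is an illustration, not a complete compositor
      backend.
    </para>
    <programlisting><![CDATA[
#include <stdint.h>
#include <xf86drm.h>
#include <xf86drmMode.h>

static int
schedule_pageflip(int drm_fd, uint32_t crtc_id, uint32_t fb_id,
                  void *user_data)
{
	/* Queue an asynchronous flip to the newly composited
	 * framebuffer. DRM_MODE_PAGE_FLIP_EVENT asks the kernel to
	 * deliver a completion event on drm_fd, which the compositor
	 * picks up with drmHandleEvent() in its event loop and uses
	 * to pace the next repaint. */
	return drmModePageFlip(drm_fd, crtc_id, fb_id,
	                       DRM_MODE_PAGE_FLIP_EVENT, user_data);
}
]]></programlisting>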
  </section>
  <section id="sect-Wayland-Architecture-wayland_rendering">
    <title>Wayland Rendering</title>
    <para>
      One of the details I left out in the above overview is how
      clients actually render under Wayland. By removing the X
      server from the picture we also removed the mechanism by
      which X clients typically render. But there's another
      mechanism that we're already using with DRI2 under X: direct
      rendering. With direct rendering, the client and the server
      share a video memory buffer. The client links to a rendering
      library such as OpenGL that knows how to program the hardware
      and renders directly into the buffer. The compositor in turn
      can take the buffer and use it as a texture when it
      composites the desktop. After the initial setup, the client
      only needs to tell the compositor which buffer to use and
      when and where it has rendered new content into it.
    </para>
    <para>
      This leaves an application with two ways to update its window contents:
    </para>
    <para>
      <orderedlist>
        <listitem>
          <para>
            Render the new content into a new buffer and tell the
            compositor to use that instead of the old buffer. The
            application can allocate a new buffer every time it
            needs to update the window contents or it can keep two
            (or more) buffers around and cycle between them. The
            buffer management is entirely under application
            control.
          </para>
        </listitem>
        <listitem>
          <para>
            Render the new content into the buffer that it
            previously told the compositor to use. While it's
            possible to just render directly into the buffer shared
            with the compositor, this might race with the
            compositor. What can happen is that repainting the
            window contents could be interrupted by the compositor
            repainting the desktop. If the application gets
            interrupted just after clearing the window but before
            rendering the contents, the compositor will texture
            from a blank buffer. The result is that the application
            window will flicker between a blank window and
            half-rendered content. The traditional way to avoid
            this is to render the new content into a back buffer
            and then copy from there into the compositor surface.
            The back buffer can be allocated on the fly and just
            big enough to hold the new content, or the application
            can keep a buffer around. Again, this is under
            application control.
          </para>
        </listitem>
      </orderedlist>
    </para>
    <para>
      In either case, the application must tell the compositor
      which area of the surface holds new contents. When the
      application renders directly to the shared buffer, the
      compositor needs to be notified that there is new content.
      But also when exchanging buffers, the compositor doesn't
      assume anything changed, and needs a request from the
      application before it will repaint the desktop. The idea is
      that even if an application passes a new buffer to the
      compositor, only a small part of the buffer may be different,
      like a blinking cursor or a spinner.
    </para>
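    <para>
      At the protocol level, both update styles end in the same
      small sequence of core wl_surface requests: attach the buffer
      that holds the new content, mark the region that changed, and
      commit. A minimal sketch, assuming a surface and buffer
      created during setup:
    </para>
    <programlisting><![CDATA[
#include <wayland-client.h>

/* Assumed to exist from earlier setup code. */
extern struct wl_surface *surface;
extern struct wl_buffer *buffer;

static void
update_window(int32_t x, int32_t y, int32_t width, int32_t height)
{
	/* Point the surface at the buffer with the new content; when
	 * rendering into the same shared buffer, this is the buffer
	 * the compositor already knows about. */
	wl_surface_attach(surface, buffer, 0, 0);

	/* Tell the compositor which area of the surface holds new
	 * contents, so it only repaints that part of the desktop. */
	wl_surface_damage(surface, x, y, width, height);

	/* Apply the pending attach and damage state atomically. */
	wl_surface_commit(surface);
}
]]></programlisting>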
  </section>
  <section id="sect-Wayland-Architecture-wayland_hw_enabling">
    <title>Hardware Enabling for Wayland</title>
    <para>
      Typically, hardware enabling includes modesetting/display and
      EGL/GLES2. On top of that, Wayland needs a way to share
      buffers efficiently between processes. There are two sides to
      that, the client side and the server side.
    </para>
    <para>
      On the client side we've defined a Wayland EGL platform. In
      the EGL model, that consists of the native types
      (EGLNativeDisplayType, EGLNativeWindowType and
      EGLNativePixmapType) and a way to create those types. In
      other words, it's the glue code that binds the EGL stack and
      its buffer sharing mechanism to the generic Wayland API. The
      EGL stack is expected to provide an implementation of the
      Wayland EGL platform. The full API is in the wayland-egl.h
      header. The open source implementation in the mesa EGL stack
      is in wayland-egl.c and platform_wayland.c.
    </para>
    <para>
      Under the hood, the EGL stack is expected to define a
      vendor-specific protocol extension that lets the client side
      EGL stack communicate buffer details with the compositor in
      order to share buffers. The point of the wayland-egl.h API is
      to abstract that away and just let the client create an
      EGLSurface for a Wayland surface and start rendering. The
      open source stack uses the drm Wayland extension, which lets
      the client discover the drm device to use and authenticate,
      and then share drm (GEM) buffers with the compositor.
    </para>
    <para>
      The server side of Wayland is the compositor and core UX for
      the vertical, typically integrating the task switcher, app
      launcher and lock screen in one monolithic application. The
      server runs on top of a modesetting API (kernel modesetting,
      OpenWF Display or similar) and composites the final UI using
      a mix of EGL/GLES2 rendering and hardware overlays, if
      available. Enabling modesetting, EGL/GLES2 and overlays is
      something that should be part of standard hardware bringup.
      The extra requirement for Wayland enabling is the
      EGL_WL_bind_wayland_display extension that lets the
      compositor create an EGLImage from a generic Wayland shared
      buffer. It's similar to the EGL_KHR_image_pixmap extension
      for creating an EGLImage from an X pixmap.
    </para>
    <para>
      The extension has a setup step where you have to bind the EGL
      display to a Wayland display. Then as the compositor receives
      generic Wayland buffers from the clients (typically when the
      client calls eglSwapBuffers), it will be able to pass the
      struct wl_buffer pointer to eglCreateImageKHR as the
      EGLClientBuffer argument and with EGL_WAYLAND_BUFFER_WL as
      the target. This will create an EGLImage, which can then be
      used by the compositor as a texture or passed to the
      modesetting code to use as an overlay plane. Again, this is
      implemented by the vendor-specific protocol extension, which
      on the server side will receive the driver-specific details
      about the shared buffer and turn that into an EGLImage when
      the user calls eglCreateImageKHR.
    </para>
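    <para>
      A minimal sketch of both steps, assuming an initialized
      egl_display and the compositor's wl_display. The entry points
      come from the EGL_WL_bind_wayland_display, EGL_KHR_image_base
      and GL_OES_EGL_image extensions and are resolved with
      eglGetProcAddress as usual; the helper names and the lack of
      error handling here are illustrative only:
    </para>
    <programlisting><![CDATA[
#include <EGL/egl.h>
#include <EGL/eglext.h>
#include <GLES2/gl2.h>
#include <GLES2/gl2ext.h>

struct wl_display;      /* from <wayland-server.h> */

/* Extension entry points, resolved once at startup. */
static EGLBoolean (*bind_wayland_display)(EGLDisplay, struct wl_display *);
static PFNEGLCREATEIMAGEKHRPROC create_image;
static PFNGLEGLIMAGETARGETTEXTURE2DOESPROC image_target_texture_2d;

static void
bind_display(EGLDisplay egl_display, struct wl_display *wl_display)
{
	bind_wayland_display = (void *)
		eglGetProcAddress("eglBindWaylandDisplayWL");
	create_image = (void *)
		eglGetProcAddress("eglCreateImageKHR");
	image_target_texture_2d = (void *)
		eglGetProcAddress("glEGLImageTargetTexture2DOES");

	/* The setup step: bind the EGL display to the Wayland display. */
	bind_wayland_display(egl_display, wl_display);
}

/* For each generic Wayland buffer received from a client, create an
 * EGLImage and use it as the contents of a GL texture; "buffer" is
 * the struct wl_buffer pointer cast to EGLClientBuffer. */
static GLuint
texture_from_wayland_buffer(EGLDisplay egl_display, EGLClientBuffer buffer)
{
	GLuint texture;
	EGLImageKHR image;

	image = create_image(egl_display, EGL_NO_CONTEXT,
	                     EGL_WAYLAND_BUFFER_WL, buffer, NULL);

	glGenTextures(1, &texture);
	glBindTexture(GL_TEXTURE_2D, texture);
	image_target_texture_2d(GL_TEXTURE_2D, image);

	return texture;
}
]]></programlisting>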
  </section>
</chapter>