// Copyright 2022-2024 The Khronos Group Inc.
//
// SPDX-License-Identifier: CC-BY-4.0

# Proposal: `VK_EXT_frame_boundary`
:toc: left

`VK_EXT_frame_boundary` is a device extension that helps *tools* (such as
debuggers) to group queue submissions by frame in non-trivial scenarios,
typically when `vkQueuePresentKHR` is not a relevant frame boundary delimiter.

See also the discussion in https://gitlab.khronos.org/vulkan/vulkan/-/issues/2436

## Problem Statement

Various Vulkan tools (e.g. debuggers) use a layer to monitor all Vulkan
commands, and need to know where a frame starts and ends. In general,
`vkQueuePresentKHR` is a natural frame boundary delimiter: a frame is made up of
all commands between two `vkQueuePresentKHR` calls. However, there are scenarios
where `vkQueuePresentKHR` is not a suitable frame boundary delimiter. The
`VK_EXT_frame_boundary` device extension lets application developers indicate to
tools where the frames start and end.

Note: here, "`frame`" is understood as "`a unit of workload that spans one or more
queue submissions`". The notion of frame is application-dependent. In graphical
applications, a frame is typically the work needed to render into the image that
is eventually presented. In a compute-only application, a frame could be a set
of compute kernel dispatches that process one unit of work.

There are a number of cases where `vkQueuePresentKHR` is not a suitable frame
boundary delimiter.

### Graphics applications bypassing the Khronos swapchain

Graphics applications are not required to use the Khronos swapchain, and they may
interact directly with a platform presentation engine. In this case, they will
not call `vkQueuePresentKHR`.

### Compute-only applications

Compute-only applications typically do not interact with a presentation engine at
all, so they would not call `vkQueuePresentKHR`.

### Overlapping frames

A graphics application may pipeline its frame preparation such that work for
different frames is submitted in an interleaved way, a scenario we call
here "`overlapping frames`".

For instance, consider a graphics app where each frame is executed via two
`vkQueueSubmit` calls followed by a `vkQueuePresentKHR`, and frame preparation is
pipelined, such that the serialized commands look like:

....
vkQueueSubmit();     // 1st submit of Frame N
vkQueueSubmit();     // 2nd submit of Frame N-1
vkQueuePresentKHR(); // present Frame N-1
vkQueueSubmit();     // 1st submit of Frame N+1
vkQueueSubmit();     // 2nd submit of Frame N
vkQueuePresentKHR(); // present Frame N
....

Here, relying on `vkQueuePresentKHR` as a frame boundary delimiter would lead to
the erroneous grouping of queue submissions of different frames as the work of a
single frame.

## Solution Space

* Debug utilities: existing debug utilities let us tag Vulkan objects. Could we
  use that to identify which work belongs to which frame? We could tag the
  submitted command buffers with a frame identifier. However, some
  command buffers may be used concurrently by several frames, so that is not a
  viable approach. In the same vein, queue debug label regions are not
  satisfactory since they cannot handle overlapping frames.

What we need is a way to associate a frame identifier with the one or more queue
submissions that submit the work for this frame. This is what the
`VK_EXT_frame_boundary` extension does.

## Proposal

### Overview

We want applications to be able to group queue submissions by frames. To this
end, we let applications tag queue submissions with a `uint64_t` frame
identifier. However, this is not sufficient: given that a frame may span more
than one queue submission, in order to know when a frame ends, tools also need
to know which queue submission is the last one for a given frame.
So in addition
to the frame identifier, we also want to be able to tag queue submissions with a
"`frame end`" flag, to mark the last submission for a given frame identifier.

There is one clarification left to make: the "`frame end`" submission is the
"`logical last`" queue submission, but in the presence of timeline semaphores it
may not be the last one to be submitted. Since timeline semaphores permit queue
submissions to wait on semaphores whose signal is not yet submitted, the
submission meant to be the last part of work for a given frame may not be the
last one to be submitted. In this context, we want to mark the "`frame end`"
submission as the one that is logically the last submission for the frame: if
this submission waits on semaphores whose signal is not yet submitted, then all
subsequent submissions with the same frame identifier until the submission that
signals these semaphores are also associated with that frame.

To illustrate this with a small example, consider the following serialized
Vulkan commands:

....
// At this point, the latest signal of timeline semaphore TLS set its value to 1

// Logical last submission for frame N, wait on TLS value 2
vkQueueSubmit( frameID:N, frameEnd, wait:TLS(2) )

// The actual final submission, which unblocks the previous one, is also part
// of the work for frame N, even if in submit order it comes after the frameEnd
// submission.
vkQueueSubmit( frameID:N, signal:TLS(2) )
....

So we want a way to tag queue submissions with a `uint64_t` frame identifier,
and a frameEnd flag. To this end, the `VK_EXT_frame_boundary` device extension
defines the new `VkFrameBoundaryEXT` type that is meant to be passed in queue
submission pNext chains.
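
The "`frame end`" rule described above can be modelled from a tool's point of
view. The following is only a minimal sketch, not part of the extension: the
`Submission` record and the `frame_close_index` function are hypothetical
names, and it assumes a single timeline semaphore whose waits and signals the
tool has already extracted from each submission.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical record of one queue submission as a tool's layer might log it.
 * None of these names come from the Vulkan API. */
typedef struct Submission {
    uint64_t frameID;     /* frame identifier tag */
    bool     frameEnd;    /* "frame end" flag */
    uint64_t waitValue;   /* timeline value waited on, 0 if no wait */
    uint64_t signalValue; /* timeline value signaled, 0 if no signal */
} Submission;

/* Returns the index, in submit order, at which frame `frameID` is known to be
 * fully submitted, or -1 if the log does not complete it.
 * `submittedValue` is the highest timeline value whose signal was submitted
 * before the first logged submission (assumption: one timeline semaphore). */
ptrdiff_t frame_close_index(const Submission *log, size_t count,
                            uint64_t frameID, uint64_t submittedValue)
{
    bool     endSeen = false;
    uint64_t pendingWait = 0;

    assert(log != NULL || count == 0);

    for (size_t i = 0; i < count; ++i) {
        const Submission *s = &log[i];

        /* Track the highest value for which a signal has been submitted. */
        if (s->signalValue > submittedValue)
            submittedValue = s->signalValue;

        /* Remember the logical last submission and what it waits on. */
        if (s->frameID == frameID && s->frameEnd) {
            endSeen = true;
            pendingWait = s->waitValue;
        }

        /* The frame closes once the frameEnd wait is covered by a signal. */
        if (endSeen && pendingWait <= submittedValue)
            return (ptrdiff_t)i;
    }
    return -1;
}
```

With the two submissions of the example above and an initial timeline value of
1, this sketch reports index 1: the frame closes at the submission that signals
the awaited value, not at the frameEnd submission.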

### The `VkFrameBoundaryEXT` type

The `VK_EXT_frame_boundary` device extension defines a new
`VkFrameBoundaryEXT` type that is meant to be added to pNext chains of queue
submissions, such as `VkSubmitInfo`, `VkSubmitInfo2`, `VkBindSparseInfo`
or `VkPresentInfoKHR`. This type looks like:

....
// Flags
typedef enum VkFrameBoundaryFlagBitsEXT {
    VK_FRAME_BOUNDARY_FRAME_END_BIT_EXT = 0x00000001,
} VkFrameBoundaryFlagBitsEXT;
typedef VkFlags VkFrameBoundaryFlagsEXT;

// VkFrameBoundaryEXT can be passed in any queue submission's pNext chain
typedef struct VkFrameBoundaryEXT {
    VkStructureType            sType;
    const void*                pNext;

    // Necessary members:
    // flags is necessary to mark the last submission of a frame
    VkFrameBoundaryFlagsEXT    flags;
    // frameID is necessary to disambiguate overlapping frames
    uint64_t                   frameID;

    // Extra members: provide a list of objects which contain the end result
    // of the frame. No need to pass the layout as
    // trace-replay tools will track the layout anyway.
    uint32_t                   imageCount;
    const VkImage*             pImages;
    uint32_t                   bufferCount;
    const VkBuffer*            pBuffers;

    // Extra info can be passed with an arbitrary tag payload, typically
    // a tool-specific struct.
    uint64_t                   tagName;
    size_t                     tagSize;
    const void*                pTag;
} VkFrameBoundaryEXT;
....

Where:

. `flags` provides a way to tag submissions with a frameEnd flag.

. `frameID` provides a way to tag submissions with a frame identifier.

In addition to these two necessary members, we have a few extras:

. a list of VkImage: this makes this extension as expressive as
  `vkQueuePresentKHR`, the classic frame boundary delimiter. For the classic
  frame-oriented graphics workloads, it is convenient to have a list of images
  storing the final frame renderings. We do not need the image layout as
  trace-replay tools already have to track image layouts anyway.

. a list of VkBuffer: this allows applications that do not produce their
  final result as an image (e.g. compute applications) to provide the final
  result of the frame.

. a way to attach a binary payload: this can be used to pass tool-specific
  extra information.

### Validation

Since the concept of a frame is application-dependent, there is no way to
validate that frame identifiers are used in a relevant way. As such, there are
no restrictions imposed on frame identifiers, and it is the responsibility of
the application to use them in a relevant way.

In practice it is advised that applications use a single monotonically
increasing counter to base their frame identifiers on and not to reuse
identifiers between separate frames.

However, there is no way for the validation layer to detect an application
not adhering to these rules, since the validation layer has no idea which
submissions should be grouped together. Any attempt to do so could flag a
valid grouping like the following as invalid, because the application uses
wait-before-signal:

....
vkQueueSubmit( frame:0 ) // start of a frame
vkQueueSubmit( frame:0 ) // part of the frame
vkQueueSubmit( frame:0, frameEnd, wait:TLS(42) ) // logical end, waiting on a not-yet-signaled TLS
vkQueueSubmit( frame:0, signal:TLS(42) ) // this is still part of the current frame, after the frameEnd marker.
....

## Examples

### Compute-only

Compute-only applications that want to split their work into frames can do so
with:

....
vkQueueSubmit( frame:N )             // Zero or more submits for frame N
vkQueueSubmit( frame:N, frameEnd )   // Last submit for frame N

vkQueueSubmit( frame:N+1 )           // Zero or more submits for frame N+1
vkQueueSubmit( frame:N+1, frameEnd ) // Last submit for frame N+1
....
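
The sequential tagging pattern above, combined with the advice to base frame
identifiers on a single monotonically increasing counter, could be sketched as
follows. This is an illustration under stated assumptions: `FrameTag`,
`FrameCounter`, `FRAME_END_BIT` and `next_submit_tag` are hypothetical
stand-ins so that the sketch compiles without the Vulkan headers, and it only
covers non-overlapping frames.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in for VK_FRAME_BOUNDARY_FRAME_END_BIT_EXT, using the value shown in
 * the enum earlier; a real application would use the Vulkan header instead. */
#define FRAME_END_BIT 0x00000001u

/* Hypothetical pair holding the two necessary members of VkFrameBoundaryEXT. */
typedef struct FrameTag {
    uint32_t flags;
    uint64_t frameID;
} FrameTag;

/* Hypothetical application-side counter for sequential frames. */
typedef struct FrameCounter {
    uint64_t current;
} FrameCounter;

/* Returns the tag for the next queue submission. When the submission is the
 * last one of the current frame, the frame end bit is set and the counter
 * advances, so identifiers are never reused. */
FrameTag next_submit_tag(FrameCounter *counter, bool lastSubmitOfFrame)
{
    FrameTag tag;

    assert(counter != NULL);

    tag.flags = lastSubmitOfFrame ? FRAME_END_BIT : 0u;
    tag.frameID = counter->current;
    if (lastSubmitOfFrame)
        counter->current += 1;
    return tag;
}
```

In a real application, the two returned values would be copied into the `flags`
and `frameID` members of a `VkFrameBoundaryEXT` chained to the submission's
pNext.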

### Graphics, sequential frames, not using the KHR swapchain

A graphics application that prepares frames in sequence (as opposed to
overlapping frames), but makes no use of the KHR swapchain, can group
submissions with:

....
vkQueueSubmit( frame:N ) // Zero or more submits for frame N
vkQueueSubmit( frame:N, frameEnd, imageCount:1, pImages:0x12345 ) // Last submit for frame N
// here code that passes pImages to the presentation engine

vkQueueSubmit( frame:N+1 ) // Zero or more submits for frame N+1
vkQueueSubmit( frame:N+1, frameEnd, imageCount:1, pImages:0x54321 ) // Last submit for frame N+1
// here code that passes pImages to the presentation engine
....

### Overlapping frames with wait-before-signal

A graphics application with overlapping frames and wait-before-signal (which may
be due to multithreading; here we look at a serialized view of Vulkan commands)
can group queue submissions per frame with:

....
vkQueueSubmit( frame:N );   // 1st submit of frame N

vkQueueSubmit( frame:N-1 ); // Some other submissions for another frame
vkQueueSubmit( frame:N+1 ); // Some other submissions for another frame

// 2nd submit of frame N, logically the last one, but waits on a TLS not yet
// signalled for that value
vkQueueSubmit( frame:N, frameEnd, wait:TLS(42) );

vkQueueSubmit( frame:... ); // Some other submissions for other frames

// 3rd submit of frame N, not the logical last one, but the last one in submit
// order (here serialized) since it signals the TLS on which the logical last
// submission waits
vkQueueSubmit( frame:N, signal:TLS(42) );
....

## Issues

### RESOLVED: What should this extension be named?

`VK_EXT_frame_boundary`.

"`Frame`" is still the best word to convey the meaning of "`a unit of workload
spanning one or more queue submissions`".
"`Boundary`" might be seen as too
specific since this can be seen more generally as tagging queue submissions
with frame identifiers, but really the goal of this tagging is precisely to
know when a frame starts and ends, i.e. to know its boundaries.

### RESOLVED: What information should be included in VkFrameBoundaryEXT?

Beyond the necessary flags and frameID, we keep only a list of objects that
contain the end result of the frame, and a binary blob where other extra info
can be provided.

The lists of VkImage and VkBuffer objects allow the application to provide the
end result of the frame. There is no need to provide extra information about
these objects, such as the layout of the images, since capture-replay tools
track the Vulkan state while the application is running.

The list of VkImage lets this extension be as expressive as
`vkQueuePresentKHR`, which has a list of swapchain images.

A binary blob (called "`tag`" to be consistent with
VkDebugUtilsObjectTagInfoEXT) allows tools to define their own data containing
any extra information that is required, and to update it without having to
change the Vulkan specification.

### RESOLVED: How should frame identifiers be validated?

Do not impose conditions on frame identifiers.

Frame identifiers are just a way to indicate to tools how to group queue
submissions, so there are no grounds to impose any kind of monotonic
increase. Frame identifiers may be reused and the application is responsible for
reusing them in a "`safe`" way. In practice it is advised that applications do not
reuse frame identifiers, but if the application is not careful when reusing
frame identifiers, it only makes a difference for tools, so it should not have
a semantic impact.