// Copyright 2021-2022 The Khronos Group Inc.
//
// SPDX-License-Identifier: CC-BY-4.0

= VK_QCOM_tile_properties
:toc: left
:refpage: https://registry.khronos.org/vulkan/specs/1.3-extensions/man/html/
:sectnums:

This document details API design ideas for the VK_QCOM_tile_properties extension, which allows applications to query the tile properties. This extension supports both renderpasses and dynamic rendering.

== Background

Adreno GPUs use a rendering technique called tiled rendering. In this technique, the attachments are divided into a uniform grid of small regions or "tiles". Tile size is shared by and affects all attachments in use. Splitting a render target into multiple chunks or tiles, and rendering each tile individually in order to reconstruct the full render target, can be faster and more power-efficient.

Typically, a tile is contained completely within the attachment, but tiles can also span outside the attachment's extent. This is because the number of tiles along an axis is rounded up: _Number of tiles = ceil(attachment_width / tile_width)_. Tiles that extend past the attachment are called partially filled tiles and are less efficient to render.

In the case of a fragment density map, a "local framebuffer region" and all fragments within it will share a value for "fragment area" determined from a corresponding texel in the fragment density map, as described in the "Fragment Area Conversion" section in the "Fragment Density Map Operations" chapter of the Vulkan specification. Implementations are also free to fetch additional texels within an implementation-defined window, as described in the "Fragment Area Filter" section of the "Fragment Density Map Operations" chapter of the Vulkan specification. Adreno implementations utilize this behavior and will perform both windowing and fragment area assignment within a region defined by a "tile".

The "tiles" exposed in this extension also each define a "framebuffer-local region" as described in the "Framebuffer Region Dependencies" section in the "Synchronization and Cache Control" chapter of the Vulkan specification.

== Problem Statement

Currently, developers do not know how the implementation is tiling their applications. Several application-controlled factors implicitly influence the tile size, such as attachment resolution, number of attachments, formats, number of samples, etc.

Adreno implementations will window and apply fragment density map values on a per-tile basis. Currently, applications are unable to determine the size and location of tiles, preventing them from knowing how their fragment density map will be applied and the final fragment areas that will result.

With regard to framebuffer-local dependencies, applications are unable to determine the size of the framebuffer-local region and thus must assume it is the size of a single fragment or sample. Due to this, applications must use framebuffer-global dependencies outside of single pixel or sample sized regions, possibly at the cost of efficiency.

Another problem is that, currently, applications are unable to align the renderArea with tile boundaries, which would achieve the most efficient rendering. The command link:{refpage}vkGetRenderAreaGranularity.html[vkGetRenderAreaGranularity] does not allow implementations to fully describe the tiling grid, and the reported granularity is based solely on a renderpass.

== Solution Space

Create a new extension with API entrypoints that allow developers to query the tile properties.

With the knowledge from this extension, applications can create fragment density maps that will apply in a more direct way to the final fragment areas in use, allowing more purposeful creation of maps.

Information from this extension can also be used to determine the size and location of framebuffer-local regions, allowing applications to use local-region dependencies in place of framebuffer-global ones for potential increases in efficiency.

== Proposal

This extension introduces new API calls and a new struct:

[source,c]
----
VKAPI_ATTR QGLENTRY_ATTR VkResult VKAPI_CALL vkGetFramebufferTilePropertiesQCOM(
    VkDevice               device,
    VkFramebuffer          vkFramebuffer,
    uint32_t*              pPropertiesCount,
    VkTilePropertiesQCOM*  pProperties);
----

When using renderpasses, use the above command after framebuffer creation to query the tile properties from the framebuffer. `pPropertiesCount` is a pointer to an integer related to the number of tile properties available or queried. `pProperties` is a pointer to an array of `VkTilePropertiesQCOM` structures that holds the returned properties.
If `pProperties` is NULL, then the total number of tile properties available is returned in `pPropertiesCount`. Otherwise, `pPropertiesCount` must point to a variable set by the user to the number of elements in the `pProperties` array, and on return the variable is overwritten with the number of properties actually written to `pProperties`. If the value pointed to by `pPropertiesCount` is less than the number of tile properties available, at most that many structures will be written, and `VK_INCOMPLETE` will be returned instead of `VK_SUCCESS` to indicate that not all the available tile properties were returned.

The number of tile properties available is determined by the number of merged subpasses, and each set of tile properties is associated with a merged subpass. There will be at most as many sets of properties as there are subpasses within the render pass. To obtain the tile properties for a given merged subpass, the `pProperties` array can be indexed using the `postMergeIndex` value provided in link:{refpage}VkRenderPassSubpassFeedbackInfoEXT.html[VkRenderPassSubpassFeedbackInfoEXT].

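
The count/array contract described above can be exercised with the usual two-call pattern: query the count, allocate, then fetch. The sketch below illustrates only the caller's side of that contract and does not depend on the Vulkan headers. `TileProps`, `QueryResult`, and `query_tile_properties` are hypothetical stand-ins for `VkTilePropertiesQCOM`, `VkResult`, and `vkGetFramebufferTilePropertiesQCOM`; the stand-in pretends that three sets of properties are available and fills them with placeholder values.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-ins; a real application would use the Vulkan types. */
typedef struct { uint32_t width, height, depth; } TileProps;
typedef enum { QUERY_SUCCESS = 0, QUERY_INCOMPLETE = 1 } QueryResult;

/* Assumption for illustration: three sets of tile properties are available. */
#define AVAILABLE_COUNT 3u

/* Mimics the pPropertiesCount / pProperties contract described in the text. */
static QueryResult query_tile_properties(uint32_t *pCount, TileProps *pProps)
{
    if (pProps == NULL) {
        *pCount = AVAILABLE_COUNT;          /* report how many are available */
        return QUERY_SUCCESS;
    }
    uint32_t capacity = *pCount;
    uint32_t written = (capacity < AVAILABLE_COUNT) ? capacity : AVAILABLE_COUNT;
    for (uint32_t i = 0; i < written; ++i) {
        pProps[i] = (TileProps){ 768u, 480u, 1u };  /* placeholder values */
    }
    *pCount = written;                      /* number actually written */
    return (capacity < AVAILABLE_COUNT) ? QUERY_INCOMPLETE : QUERY_SUCCESS;
}

/* Caller side: query the count, allocate, then fetch every entry.
 * Returns a heap array that the caller must free, or NULL on failure. */
static TileProps *fetch_all_tile_properties(uint32_t *pCountOut)
{
    uint32_t count = 0;
    if (query_tile_properties(&count, NULL) != QUERY_SUCCESS || count == 0) {
        return NULL;
    }
    TileProps *props = calloc(count, sizeof *props);
    if (props == NULL) {
        return NULL;
    }
    if (query_tile_properties(&count, props) != QUERY_SUCCESS) {
        free(props);
        return NULL;
    }
    *pCountOut = count;
    return props;
}
```

With the real entry point, each array element would additionally need its `sType` member set before the second call, and an undersized array would yield `VK_INCOMPLETE` rather than the stand-in's `QUERY_INCOMPLETE`.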

For dynamic rendering, a new API entrypoint is introduced because dynamic rendering does not use a framebuffer object:

[source,c]
----
VKAPI_ATTR QGLENTRY_ATTR VkResult VKAPI_CALL vkGetDynamicRenderingTilePropertiesQCOM(
    VkDevice                device,
    const VkRenderingInfo*  pRenderingInfo,
    VkTilePropertiesQCOM*   pProperties);
----

When using dynamic rendering, use the above command to query the tile properties. `pRenderingInfo` is a pointer to the `VkRenderingInfo` structure specifying details of the render pass instance in dynamic rendering. Tile properties are returned in `pProperties`, which is a pointer to a `VkTilePropertiesQCOM` structure that holds the returned properties.

Support for querying tile properties is indicated by a feature bit in a
structure that extends
link:{refpage}VkPhysicalDeviceFeatures2.html[VkPhysicalDeviceFeatures2].

[source,c]
----
typedef struct VkPhysicalDeviceTilePropertiesFeaturesQCOM {
    VkStructureType    sType;
    void*              pNext;
    VkBool32           tileProperties;
} VkPhysicalDeviceTilePropertiesFeaturesQCOM;
----

`tileProperties` indicates that the implementation supports queries for tile
properties.

A new structure is introduced to hold the tile properties.

[source,c]
----
typedef struct VkTilePropertiesQCOM {
    VkStructureType    sType;
    void*              pNext;
    VkExtent3D         tileSize;
    VkExtent2D         apronSize;
    VkOffset2D         origin;
} VkTilePropertiesQCOM;
----

The reported value for `apronSize` will be zero and its functionality will be described in a future extension.

`tileSize` describes the dimensions of a tile, with width and height describing the width and height of a tile
in pixels, and depth corresponding to the number of slices the tile spans. All attachments share the same tile
width and height. The tile depth value reflects the maximum slice count of all in-use attachments.

`origin` is the top-left corner of the first tile in attachment space.

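
As a rough illustration of how `tileSize` and `origin` relate to the tile grid, the sketch below counts the tiles that cover an attachment along one axis and checks whether a given tile is partially filled. It assumes that tile edges lie at `origin + k * tileSize` for integer `k >= 0`, that `origin` is less than or equal to zero, and that the tile dimension is non-zero. The helper names `tile_count_1d` and `tile_is_partial_1d` are hypothetical and not part of the extension.

```c
#include <stdbool.h>
#include <stdint.h>

/* Number of tiles covering [0, extent) along one axis when the first tile
 * starts at `origin` (assumed <= 0) and each tile is `tile` pixels wide
 * (assumed > 0). Integer form of ceil((extent - origin) / tile). */
static uint32_t tile_count_1d(int32_t origin, uint32_t tile, uint32_t extent)
{
    int64_t span = (int64_t)extent - (int64_t)origin; /* origin to far edge */
    return (uint32_t)((span + (int64_t)tile - 1) / (int64_t)tile);
}

/* True if tile `index` extends outside [0, extent) along this axis. */
static bool tile_is_partial_1d(int32_t origin, uint32_t tile, uint32_t extent,
                               uint32_t index)
{
    int64_t begin = (int64_t)origin + (int64_t)index * (int64_t)tile;
    int64_t end   = begin + (int64_t)tile;
    return begin < 0 || end > (int64_t)extent;
}
```

Applying the helpers per axis to a 1920 x 1080 attachment with a (672, 576) tile size and origin (0, 0) gives a 3 x 2 grid, with the last tile along each axis reported as partially filled.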

All tiles will be tightly packed around the first tile, with edges being multiples of tile width and/or height from the origin.

== Examples


=== Query tile properties when using render pass

[source,c]
----
uint32_t subpassCount = 2;

VkTilePropertiesQCOM* tileProperties =
    malloc(sizeof(VkTilePropertiesQCOM) * subpassCount);

// Each element must identify its structure type before the query
for (uint32_t i = 0; i < subpassCount; i++) {
    tileProperties[i].sType = VK_STRUCTURE_TYPE_TILE_PROPERTIES_QCOM;
    tileProperties[i].pNext = NULL;
}

// `device` is a valid VkDevice handle
// `hFramebuffer` is a handle to a valid VkFramebuffer object that we want to query
// The count pointer comes before the properties array
vkGetFramebufferTilePropertiesQCOM(device, hFramebuffer, &subpassCount, tileProperties);
----

=== Query tile properties when using dynamic rendering

[source,c]
----
VkRenderingInfoKHR renderingInfo = {
    .sType = VK_STRUCTURE_TYPE_RENDERING_INFO_KHR,
    .pNext = NULL,
    .flags = 0,
    .renderArea = { ... },
    .layerCount = 1,
    .colorAttachmentCount = 2,
    .pColorAttachments = colorAttachments,
    .pDepthAttachment = &depthStencilAttachment,
    .pStencilAttachment = &depthStencilAttachment };

// The remaining members are outputs written by the implementation
VkTilePropertiesQCOM tileProperties = {
    .sType = VK_STRUCTURE_TYPE_TILE_PROPERTIES_QCOM,
    .pNext = NULL };

// `device` is a valid VkDevice handle
// `renderingInfo` is the `VkRenderingInfoKHR` struct that is passed to `vkCmdBeginRenderingKHR`
vkGetDynamicRenderingTilePropertiesQCOM(device, &renderingInfo, &tileProperties);
----

=== Interpreting tile size values

 . If attachment dimensions are (768, 1440) and tile size returned is (768, 480) then it implies that there are three tiles in a (1 x 3) tile-grid. All tiles are full tiles contained within the attachment.

 . If attachment dimensions are (720, 1440) and tile size returned is (768, 480) then it implies that there are three tiles in a (1 x 3) tile-grid. All tiles are _partially filled_ tiles as they span outside the attachment extent.

 . If attachment dimensions are (1920, 1080) and tile size returned is (672, 576) then it implies that there are six tiles in a (3 x 2) tile-grid. The last tiles in each row and column are _partially filled_ tiles as they span outside the attachment extent.

=== Interpreting origin values

 . If returned origin is (0, 0) then the first tile's top-left corner is at the attachment's origin (0, 0).

 . If returned origin is (-32, -64) and tile size is (768, 480), then tile boundaries in x will lie at -32, 736, 1504, ... and tile boundaries in y will lie at -64, 416, 896, ...

== Issues

This section describes issues that came up during discussion and their resolution.


=== RESOLVED: How to handle dynamic rendering?

Since the extension should support both renderpasses and dynamic rendering, dedicated API entrypoints were added for both.

=== RESOLVED: This extension returns only one set of dimensions for tile size so how to handle the case of non-merged subpasses where each subpass can have a different tile size?

The extension was modified to return an array of tile properties, which holds properties for all requested or available subpasses, instead of a single set of tile properties.

=== RESOLVED: Adreno implementation may decide to execute certain workloads in direct rendering mode a.k.a. Flex render. What is the interaction of this extension with Flex render?

In those cases, the information returned by this extension may not indicate the true execution mode of the GPU.