// Copyright 2021-2022 The Khronos Group Inc.
//
// SPDX-License-Identifier: CC-BY-4.0

= VK_QCOM_tile_properties
:toc: left
:refpage: https://registry.khronos.org/vulkan/specs/1.3-extensions/man/html/
:sectnums:

This document details API design ideas for the VK_QCOM_tile_properties extension, which allows applications to query tile properties. The extension supports both render passes and dynamic rendering.

== Background

Adreno GPUs use a rendering technique called tiled rendering. In this technique, the attachments are divided into a uniform grid of small regions or "tiles". Tile size is shared by and affects all attachments in use. Splitting a render target into multiple chunks or tiles, and rendering each tile individually in order to reconstruct the full render target, can be faster and more power-efficient.

A tile is typically contained completely within the attachment, but tiles can also span outside the attachment's extent. This happens because the tile count along each axis is rounded up: _Number of tiles = ceil(attachment_width / tile_width)_. Tiles that span outside the attachment are called partially filled tiles and are less efficient to render.
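
The rounding in that formula can be sketched in integer arithmetic. The helper names `tiles_along_axis` and `tile_overhang` are hypothetical and not part of the extension, and the sketch assumes the tile grid starts at the attachment's origin:

```c
#include <stdint.h>

/* Number of tiles needed to cover `attachment_size` pixels along one axis,
 * i.e. ceil(attachment_size / tile_size) in integer arithmetic.
 * Assumes tile_size > 0 and a tile grid that starts at the attachment origin. */
static uint32_t tiles_along_axis(uint32_t attachment_size, uint32_t tile_size)
{
    return (attachment_size + tile_size - 1u) / tile_size;
}

/* Pixels by which the last tile extends past the attachment edge.
 * A result of 0 means every tile along this axis is completely filled. */
static uint32_t tile_overhang(uint32_t attachment_size, uint32_t tile_size)
{
    return tiles_along_axis(attachment_size, tile_size) * tile_size - attachment_size;
}
```

For example, a 1920-pixel-wide attachment with 672-pixel-wide tiles needs three tile columns, and the last column extends 96 pixels past the attachment edge.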

In the case of a fragment density map, a "local framebuffer region" and all fragments within it share a value for "fragment area", determined from a corresponding texel in the fragment density map as described in the "Fragment Area Conversion" section in the "Fragment Density Map Operations" chapter of the Vulkan specification. Implementations are also free to fetch additional texels within an implementation-defined window as described in the "Fragment Area Filter" section of the "Fragment Density Map Operations" chapter of the Vulkan specification. Adreno implementations utilize this behavior and will perform both windowing and fragment area assignment within a region defined by a "tile".

The "tiles" exposed in this extension also each define a "framebuffer-local region" as described in the "Framebuffer Region Dependencies" section in the "Synchronization and Cache Control" chapter of the Vulkan specification.

== Problem Statement

Currently, developers do not know how the implementation is tiling their applications. Several application-controlled factors implicitly influence the tile size, such as attachment resolution, number of attachments, formats, and number of samples.

Adreno implementations will window and apply fragment density map values on a per-tile basis. Currently, applications are unable to determine the size and location of tiles, preventing them from knowing how their fragment density map will be applied and the final fragment areas that will result.

With regard to framebuffer-local dependencies, applications are unable to determine the size of the framebuffer-local region and thus must assume it is the size of a single fragment or sample. Due to this, applications must use framebuffer-global dependencies outside of single pixel or sample sized regions, possibly at the cost of efficiency.

Another problem is that applications are currently unable to align the `renderArea` with tile boundaries, which would achieve the most efficient rendering. The command link:{refpage}vkGetRenderAreaGranularity.html[vkGetRenderAreaGranularity] does not allow implementations to fully describe the tiling grid, and the reported granularity is based solely on a render pass.

== Solution Space

Create a new extension with API entrypoints that allow developers to query the tile properties.

With the knowledge from this extension, applications can create fragment density maps that will apply in a more direct way to the final fragment areas in use, allowing more purposeful creation of maps.

Information from this extension can also be used to determine the size and location of framebuffer-local regions, allowing applications to use local-region dependencies in place of framebuffer-global ones for potential increases in efficiency.

== Proposal

This extension introduces new API calls and new structures. The first command queries tile properties from a framebuffer:

[source,c]
----
VKAPI_ATTR QGLENTRY_ATTR VkResult VKAPI_CALL vkGetFramebufferTilePropertiesQCOM(
    VkDevice                     device,
    VkFramebuffer                vkFramebuffer,
    uint32_t*                    pPropertiesCount,
    VkTilePropertiesQCOM*        pProperties);
----

When using render passes, use the above command after framebuffer creation to query the tile properties from the framebuffer. `pPropertiesCount` is a pointer to an integer related to the number of tile properties available or queried. `pProperties` is either NULL or a pointer to an array of `VkTilePropertiesQCOM` structures that holds the returned properties.
If `pProperties` is NULL, then the total number of tile properties available is returned in `pPropertiesCount`. Otherwise, `pPropertiesCount` must point to a variable set by the user to the number of elements in the `pProperties` array, and on return the variable is overwritten with the number of properties actually written to `pProperties`. If the value pointed to by `pPropertiesCount` is less than the number of tile properties available, at most that many structures will be written, and `VK_INCOMPLETE` will be returned instead of `VK_SUCCESS` to indicate that not all the available tile properties were returned.

The number of tile properties available is determined by the number of merged subpasses, and each tile property is associated with a merged subpass. There will be at most as many properties as there are subpasses within the render pass. To obtain the tile properties for a given merged subpass, the `pProperties` array can be indexed using the `postMergeIndex` value provided in link:{refpage}VkRenderPassSubpassFeedbackInfoEXT.html[VkRenderPassSubpassFeedbackInfoEXT].

Dynamic rendering does not use a framebuffer object, so a separate API entrypoint is introduced for it:

[source,c]
----
VKAPI_ATTR QGLENTRY_ATTR VkResult VKAPI_CALL vkGetDynamicRenderingTilePropertiesQCOM(
    VkDevice                     device,
    const VkRenderingInfo*       pRenderingInfo,
    VkTilePropertiesQCOM*        pProperties);
----

When using dynamic rendering, use the above command to query the tile properties. `pRenderingInfo` is a pointer to the `VkRenderingInfo` structure specifying details of the render pass instance in dynamic rendering. Tile properties are returned in `pProperties`, which is a pointer to a `VkTilePropertiesQCOM` structure that holds the available properties.

Support for querying tile properties is indicated by a feature bit in a
structure that extends
link:{refpage}VkPhysicalDeviceFeatures2.html[VkPhysicalDeviceFeatures2].

[source,c]
----
typedef struct VkPhysicalDeviceTilePropertiesFeaturesQCOM {
    VkStructureType    sType;
    void*              pNext;
    VkBool32           tileProperties;
} VkPhysicalDeviceTilePropertiesFeaturesQCOM;
----

`tileProperties` indicates that the implementation supports queries for tile
properties.

A new structure is introduced to hold the tile properties.

[source,c]
----
typedef struct VkTilePropertiesQCOM {
    VkStructureType       sType;
    void*                 pNext;
    VkExtent3D            tileSize;
    VkExtent2D            apronSize;
    VkOffset2D            origin;
} VkTilePropertiesQCOM;
----

The reported value for `apronSize` will be zero and its functionality will be described in a future extension.

`tileSize` describes the dimensions of a tile, with width and height describing the width and height of a tile
in pixels, and depth corresponding to the number of slices the tile spans. All attachments share the same tile
width and height. The tile depth value reflects the maximum slice count of all in-use attachments.

`origin` is the top-left corner of the first tile in attachment space.

All tiles will be tightly packed around the first tile, with edges being multiples of tile width and/or height from the origin.
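
Because the tiles are tightly packed, the tile containing a given pixel can be found with a floor division relative to `origin`. The following is a minimal sketch; `floor_div`, `tile_index`, and `same_tile` are hypothetical helper names, and coordinates are passed as plain integers rather than Vulkan structures:

```c
#include <stdbool.h>
#include <stdint.h>

/* Floor division for a positive divisor. C's `/` truncates toward zero,
 * which gives the wrong tile for coordinates that lie before the origin. */
static int32_t floor_div(int32_t a, int32_t b)
{
    int32_t q = a / b;
    return (a % b != 0 && a < 0) ? q - 1 : q;
}

/* Index of the tile containing coordinate `p` along one axis.
 * Tile i covers [origin + i * tile_size, origin + (i + 1) * tile_size). */
static int32_t tile_index(int32_t p, int32_t origin, int32_t tile_size)
{
    return floor_div(p - origin, tile_size);
}

/* True if pixels (x0, y0) and (x1, y1) fall inside the same tile. */
static bool same_tile(int32_t x0, int32_t y0, int32_t x1, int32_t y1,
                      int32_t origin_x, int32_t origin_y,
                      int32_t tile_w, int32_t tile_h)
{
    return tile_index(x0, origin_x, tile_w) == tile_index(x1, origin_x, tile_w) &&
           tile_index(y0, origin_y, tile_h) == tile_index(y1, origin_y, tile_h);
}
```

Two pixels for which `same_tile` returns true lie in the same tile, and therefore within the same framebuffer-local region as described in the Background section.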
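
== Examples

=== Query tile properties when using a render pass

[source,c]
----
// One entry per subpass is always enough; this example assumes 2 subpasses
uint32_t propertiesCount = 2;

// calloc zero-fills the array; sType must still be set on every element
VkTilePropertiesQCOM* tileProperties =
  calloc(propertiesCount, sizeof(VkTilePropertiesQCOM));
for (uint32_t i = 0; i < propertiesCount; ++i) {
    tileProperties[i].sType = VK_STRUCTURE_TYPE_TILE_PROPERTIES_QCOM;
}

// `device` is a valid VkDevice handle
// `hFramebuffer` is a handle to a valid VkFramebuffer object that we want to query
// On return, `propertiesCount` holds the number of entries actually written
vkGetFramebufferTilePropertiesQCOM(device, hFramebuffer, &propertiesCount, tileProperties);
----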

=== Query tile properties when using dynamic rendering

[source,c]
----
VkRenderingInfoKHR renderingInfo = {
    .sType = VK_STRUCTURE_TYPE_RENDERING_INFO_KHR,
    .pNext = NULL,
    .flags = 0,
    .renderArea = { ... },
    .layerCount = 1,
    .colorAttachmentCount = 2,
    .pColorAttachments = colorAttachments,
    .pDepthAttachment = &depthStencilAttachment,
    .pStencilAttachment = &depthStencilAttachment };

VkTilePropertiesQCOM tileProperties = {
    .sType = VK_STRUCTURE_TYPE_TILE_PROPERTIES_QCOM,
    .pNext = NULL,
    .... };

// `device` is a valid VkDevice handle
// `renderingInfo` is the same `VkRenderingInfoKHR` struct that is passed to `vkCmdBeginRenderingKHR`
vkGetDynamicRenderingTilePropertiesQCOM(device, &renderingInfo, &tileProperties);
----

=== Interpreting tile size values

  . If attachment dimensions are (768, 1440) and the tile size returned is (768, 480), then there are three tiles in a (1 x 3) tile-grid. All tiles are full tiles contained within the attachment.

  . If attachment dimensions are (720, 1440) and the tile size returned is (768, 480), then there are three tiles in a (1 x 3) tile-grid. All tiles are _partially filled_ tiles as they span outside the attachment extent.

  . If attachment dimensions are (1920, 1080) and the tile size returned is (672, 576), then there are six tiles in a (3 x 2) tile-grid. The last tiles in each row and column are _partially filled_ tiles as they span outside the attachment extent.
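
The three cases above can be checked with a short sketch. The `TileGrid` type and the `describe_tile_grid` function are hypothetical, and the sketch assumes the returned origin is (0, 0):

```c
#include <stdbool.h>
#include <stdint.h>

/* Summary of the tile grid covering one attachment (hypothetical type). */
typedef struct TileGrid {
    uint32_t columns;        /* number of tiles along x */
    uint32_t rows;           /* number of tiles along y */
    bool     partial_column; /* last column spans outside the attachment width */
    bool     partial_row;    /* last row spans outside the attachment height */
} TileGrid;

/* Assumes a tile origin of (0, 0) and non-zero tile dimensions. */
static TileGrid describe_tile_grid(uint32_t attachment_w, uint32_t attachment_h,
                                   uint32_t tile_w, uint32_t tile_h)
{
    TileGrid grid;
    grid.columns        = (attachment_w + tile_w - 1u) / tile_w; /* ceil division */
    grid.rows           = (attachment_h + tile_h - 1u) / tile_h;
    grid.partial_column = (attachment_w % tile_w) != 0u;
    grid.partial_row    = (attachment_h % tile_h) != 0u;
    return grid;
}
```

In the second case only the column is partial: because the grid has a single column, all three tiles extend past the right edge of the attachment.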

=== Interpreting origin values

  . If the returned origin is (0, 0), then the first tile's top-left corner is at the attachment's origin (0, 0).

  . If the returned origin is (-32, -64) and the tile size is (768, 480), then tile boundaries in x will lie at -32, 736, 1504, ... and tile boundaries in y will lie at -64, 416, 896, ...
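
Tile boundaries for an arbitrary origin can be computed as sketched below, for instance when trying to align a `renderArea` with the tile grid. `snap_down_to_tile` and `snap_up_to_tile` are hypothetical helpers that work on one axis at a time:

```c
#include <stdint.h>

/* Largest tile boundary that is <= p along one axis.
 * Boundaries lie at origin + k * tile_size for integer k; tile_size > 0. */
static int32_t snap_down_to_tile(int32_t p, int32_t origin, int32_t tile_size)
{
    int32_t rem = (p - origin) % tile_size;
    if (rem < 0) {
        rem += tile_size; /* C's % can be negative; normalize to [0, tile_size) */
    }
    return p - rem;
}

/* Smallest tile boundary that is >= p along one axis. */
static int32_t snap_up_to_tile(int32_t p, int32_t origin, int32_t tile_size)
{
    int32_t down = snap_down_to_tile(p, origin, tile_size);
    return (down == p) ? p : down + tile_size;
}
```

With an origin of -32 and a tile width of 768, an x coordinate of 1000 snaps down to 736 and up to 1504. A snapped value can fall outside the attachment (for example -32), so an application would likely still need to clamp it before using it in a render area.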

== Issues

This section describes issues that came up during discussion and their resolution.

=== RESOLVED: How to handle dynamic rendering?

Since the extension should support both render passes and dynamic rendering, dedicated API entrypoints were added for both.

=== RESOLVED: This extension returns only one set of dimensions for tile size. How should non-merged subpasses be handled, where each subpass can have a different tile size?

The extension was modified to return an array of tile properties, which holds properties for all requested or available subpasses instead of a single value for tile properties.

=== RESOLVED: Adreno implementations may decide to execute certain workloads in direct rendering mode, a.k.a. Flex render. What is the interaction of this extension with Flex render?

In those cases, the information returned by this extension may not indicate the true execution mode of the GPU.