.. _context:

Context
=======

A Gallium rendering context encapsulates the state which affects 3D
rendering such as blend state, depth/stencil state, texture samplers,
etc.

Note that resource/texture allocation is not per-context but per-screen.


Methods
-------

CSO State
^^^^^^^^^

All Constant State Object (CSO) state is created, bound, and destroyed
with triplets of methods that all follow a specific naming scheme.
For example, ``create_blend_state``, ``bind_blend_state``, and
``destroy_blend_state``.

CSO objects handled by the context object:

* :ref:`Blend`: ``*_blend_state``
* :ref:`Sampler`: Texture sampler states are bound separately for fragment,
  vertex, geometry and compute shaders with the ``bind_sampler_states``
  function. The ``start`` and ``num_samplers`` parameters indicate a range
  of samplers to change. NOTE: at this time, start is always zero and
  the CSO module will always replace all samplers at once (no sub-ranges).
  This may change in the future.
* :ref:`Rasterizer`: ``*_rasterizer_state``
* :ref:`depth-stencil-alpha`: ``*_depth_stencil_alpha_state``
* :ref:`Shader`: These are create, bind and destroy methods for vertex,
  fragment and geometry shaders.
* :ref:`vertexelements`: ``*_vertex_elements_state``


Resource Binding State
^^^^^^^^^^^^^^^^^^^^^^

This state describes how resources in various flavors (textures,
buffers, surfaces) are bound to the driver.


* ``set_constant_buffer`` sets a constant buffer to be used for a given shader
  type. ``index`` is used to indicate which buffer to set (some APIs may allow
  multiple ones to be set, and binding a specific one later, though drivers
  are mostly restricted to the first one right now).

* ``set_inlinable_constants`` sets inlinable constants for constant buffer 0.

These are constants that the driver would like to inline in the IR
of the current shader and recompile it.
Drivers can determine which
constants they prefer to inline in finalize_nir and store that
information in ``shader_info::*inlinable_uniform*``. When the state tracker
or frontend uploads constants to a constant buffer, it can pass
inlinable constants separately via this call.

Any ``set_constant_buffer`` call invalidates inlinable constants, so
``set_inlinable_constants`` must be called after it. Binding a shader also
invalidates this state.

There is no ``PIPE_CAP`` for this. Drivers shouldn't set the shader_info
fields if they don't implement ``set_inlinable_constants``.

* ``set_framebuffer_state``

* ``set_vertex_buffers``


Non-CSO State
^^^^^^^^^^^^^

These pieces of state are too small, variable, and/or trivial to have CSO
objects. They all follow simple, one-method binding calls, e.g.
``set_blend_color``.

* ``set_stencil_ref`` sets the stencil front and back reference values
  which are used as comparison values in stencil test.
* ``set_blend_color``
* ``set_sample_mask`` sets the per-context multisample sample mask. Note
  that this takes effect even if multisampling is not explicitly enabled if
  the framebuffer surface(s) are multisampled. Also, this mask is AND-ed
  with the optional fragment shader sample mask output (when emitted).
* ``set_sample_locations`` sets the sample locations used for rasterization.
  ``get_sample_position`` still returns the default locations. When NULL,
  the default locations are used.
* ``set_min_samples`` sets the minimum number of samples that must be run.
* ``set_clip_state``
* ``set_polygon_stipple``
* ``set_scissor_states`` sets the bounds for the scissor test, which culls
  pixels before blending to render targets. If the :ref:`Rasterizer` does
  not have the scissor test enabled, then the scissor bounds never need to
  be set since they will not be used. Note that scissor xmin and ymin are
  inclusive, but xmax and ymax are exclusive.
  The inclusive ranges in x
  and y would be [xmin..xmax-1] and [ymin..ymax-1]. The number of scissors
  should be the same as the number of set viewports and can be up to
  PIPE_MAX_VIEWPORTS.
* ``set_viewport_states``
* ``set_window_rectangles`` sets the window rectangles to be used for
  rendering, as defined by GL_EXT_window_rectangles. There are two
  modes - include and exclude, which define whether the supplied
  rectangles are to be used for including fragments or excluding
  them. All of the rectangles are ORed together, so in exclude mode,
  any fragment inside any rectangle would be culled, while in include
  mode, any fragment outside all rectangles would be culled. xmin/ymin
  are inclusive, while xmax/ymax are exclusive (same as scissor states
  above). Note that this only applies to draws, not clears or
  blits. (Blits have their own way to pass the requisite rectangles
  in.)
* ``set_tess_state`` configures the default tessellation parameters:

  * ``default_outer_level`` is the default value for the outer tessellation
    levels. This corresponds to GL's ``PATCH_DEFAULT_OUTER_LEVEL``.
  * ``default_inner_level`` is the default value for the inner tessellation
    levels. This corresponds to GL's ``PATCH_DEFAULT_INNER_LEVEL``.

* ``set_debug_callback`` sets the callback to be used for reporting
  various debug messages, eventually reported via KHR_debug and
  similar mechanisms.

Samplers
^^^^^^^^

pipe_sampler_state objects control how textures are sampled (coordinate
wrap modes, interpolation modes, etc). Note that samplers are not used
for texture buffer objects. That is, pipe_context::bind_sampler_views()
will not bind a sampler if the corresponding sampler view refers to a
PIPE_BUFFER resource.

Sampler Views
^^^^^^^^^^^^^

These are the means to bind textures to shader stages.
To create one, specify
its format, swizzle and LOD range in the sampler view template.

If the texture format differs from the template format, the texture is said
to be cast to another format. Casting can be done only between compatible
formats, that is, formats that have matching component order and sizes.

Swizzle fields specify the way in which fetched texel components are placed
in the result register. For example, ``swizzle_r`` specifies what is going to be
placed in the first component of the result register.

The ``first_level`` and ``last_level`` fields of the sampler view template
specify the LOD range the texture is going to be constrained to. Note that
these values are in addition to the respective min_lod, max_lod values in the
pipe_sampler_state (that is, if min_lod is 2.0 and first_level is 3, the first
mip level used for sampling from the resource is effectively level 5).

The ``first_layer`` and ``last_layer`` fields specify the layer range the
texture is going to be constrained to. Similar to the LOD range, this is added
to the array index which is used for sampling.

* ``set_sampler_views`` binds an array of sampler views to a shader stage.
  Every binding point acquires a reference
  to a respective sampler view and releases a reference to the previous
  sampler view.

  Sampler views outside of ``[start_slot, start_slot + num_views)`` are
  unmodified. If ``views`` is NULL, the behavior is the same as if
  ``views[n]`` was NULL for the entire range, i.e. releasing the reference
  for all the sampler views in the specified range.

* ``create_sampler_view`` creates a new sampler view. ``texture`` is associated
  with the sampler view, which results in the sampler view holding a reference
  to the texture. The format specified in the template must be compatible
  with the texture format.

* ``sampler_view_destroy`` destroys a sampler view and releases its reference
  to the associated texture.

Hardware Atomic buffers
^^^^^^^^^^^^^^^^^^^^^^^

Buffers containing hw atomics are required to support the feature
on some drivers.

Drivers that require this need to fill the ``set_hw_atomic_buffers`` method.

Shader Resources
^^^^^^^^^^^^^^^^

Shader resources are textures or buffers that may be read or written
from a shader without an associated sampler. This means that they
have no support for floating point coordinates, address wrap modes or
filtering.

There are 2 types of shader resources: buffers and images.

Buffers are specified using the ``set_shader_buffers`` method.

Images are specified using the ``set_shader_images`` method. When binding
images, the ``level``, ``first_layer`` and ``last_layer`` pipe_image_view
fields specify the mipmap level and the range of layers the image will be
constrained to.

Surfaces
^^^^^^^^

These are the means to use resources as color render targets or depth/stencil
attachments. To create one, specify the mip level, the range of layers, and
the bind flags (either PIPE_BIND_DEPTH_STENCIL or PIPE_BIND_RENDER_TARGET).
Note that layer values are in addition to what is indicated by the geometry
shader output variable XXX_FIXME (that is, if first_layer is 3 and the geometry
shader indicates index 2, layer 5 of the resource will be used). These
first_layer and last_layer parameters will only be used for 1d array, 2d array,
cube, and 3d textures; otherwise they are 0.

* ``create_surface`` creates a new surface.

* ``surface_destroy`` destroys a surface and releases its reference to the
  associated resource.

Stream output targets
^^^^^^^^^^^^^^^^^^^^^

Stream output, also known as transform feedback, allows writing the primitives
produced by the vertex pipeline to buffers.
This is done after the geometry
shader or vertex shader if no geometry shader is present.

The stream output targets are views into buffer resources which can be bound
as stream outputs and specify a memory range where it's valid to write
primitives. The pipe driver must implement memory protection such that any
primitives written outside of the specified memory range are discarded.

Two stream output targets can use the same resource at the same time, but
with a disjoint memory range.

Additionally, the stream output target internally maintains the offset
into the buffer which is incremented every time something is written to it.
The internal offset is equal to how much data has already been written.
It can be stored in device memory and the CPU actually doesn't have to query
it.

The stream output target can be used in a draw command to provide
the vertex count. The vertex count is derived from the internal offset
discussed above.

* ``create_stream_output_target`` creates a new target.

* ``stream_output_target_destroy`` destroys a target. Users of this should
  use pipe_so_target_reference instead.

* ``set_stream_output_targets`` binds stream output targets. The parameter
  offset is an array which specifies the internal offset of the buffer. The
  internal offset is, besides writing, used for reading the data during the
  draw_auto stage, i.e. it specifies how much data there is in the buffer
  for the purposes of the draw_auto stage. -1 means the buffer should
  be appended to, and everything else sets the internal offset.

NOTE: The currently-bound vertex or geometry shader must be compiled with
the properly-filled-in structure pipe_stream_output_info describing which
outputs should be written to buffers and how. The structure is part of
pipe_shader_state.
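
The offset rules above can be modelled in a few lines of plain C. This is a
minimal sketch and not Gallium code: ``struct ex_so_target`` and the ``ex_*``
functions are hypothetical names introduced for illustration. It assumes that
an out-of-range write is discarded as a whole, and that the vertex count is the
internal offset divided by a fixed per-vertex stride; the text only says the
count is "derived from" the offset, so that division is an assumption.

```c
#include <stdint.h>

/* Hypothetical marker for "append": the -1 offset described in the text. */
#define EX_SO_APPEND ((unsigned)-1)

/* Hypothetical model of a stream output target, not the Gallium struct. */
struct ex_so_target {
   unsigned buffer_size;     /* size in bytes of the writable range */
   unsigned internal_offset; /* bytes already written into the range */
};

/* Binding: -1 keeps the current offset (append), anything else sets it. */
static void
ex_bind_so_target(struct ex_so_target *t, unsigned offset)
{
   if (offset != EX_SO_APPEND)
      t->internal_offset = offset;
}

/*
 * Attempt to write `bytes` of primitive data. Returns 1 and advances the
 * internal offset if the data fits, or 0 if it would land outside the
 * range, in which case it is discarded (assumed all-or-nothing here).
 */
static int
ex_so_write(struct ex_so_target *t, unsigned bytes)
{
   if (t->internal_offset > t->buffer_size ||
       bytes > t->buffer_size - t->internal_offset)
      return 0;

   t->internal_offset += bytes;
   return 1;
}

/* Assumed derivation of the vertex count: bytes written / vertex stride. */
static unsigned
ex_so_vertex_count(const struct ex_so_target *t, unsigned stride)
{
   return stride ? t->internal_offset / stride : 0;
}
```

Binding with ``EX_SO_APPEND`` after a draw leaves the data written so far in
place, while binding with 0 restarts the target from the beginning of its
range.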

Clearing
^^^^^^^^

Clear is one of the most difficult concepts to nail down to a single
interface (due to both different requirements from APIs and also driver/hw
specific differences).

``clear`` initializes some or all of the surfaces currently bound to
the framebuffer to particular RGBA, depth, or stencil values.
Currently, this does not take into account color or stencil write masks (as
used by GL), and always clears the whole surfaces (no scissoring as used by
GL clear or explicit rectangles like d3d9 uses). It can, however, also clear
only depth or stencil in a combined depth/stencil surface.
If a surface includes several layers then all layers will be cleared.

``clear_render_target`` clears a single color rendertarget with the specified
color value. While it is only possible to clear one surface at a time (which can
include several layers), this surface need not be bound to the framebuffer.
If render_condition_enabled is false, any current rendering condition is ignored
and the clear will be unconditional.

``clear_depth_stencil`` clears a single depth, stencil or depth/stencil surface
with the specified depth and stencil values (for combined depth/stencil buffers,
it is also possible to only clear one or the other part). While it is only
possible to clear one surface at a time (which can include several layers),
this surface need not be bound to the framebuffer.
If render_condition_enabled is false, any current rendering condition is ignored
and the clear will be unconditional.

``clear_texture`` clears a non-PIPE_BUFFER resource's specified level
and bounding box with a clear value provided in that resource's native
format.

``clear_buffer`` clears a PIPE_BUFFER resource with the specified clear value
(which may be multiple bytes in length).
Logically this is a memset with a
multi-byte element value starting at offset bytes from resource start, going
for size bytes. It is guaranteed that size % clear_value_size == 0.

Evaluating Depth Buffers
^^^^^^^^^^^^^^^^^^^^^^^^

``evaluate_depth_buffer`` is a hint to decompress the current depth buffer
assuming the current sample locations to avoid problems that could arise when
using programmable sample locations.

If a depth buffer is rendered with different sample location state than
what is current at the time of reading the depth buffer, the values may differ
because depth buffer compression can depend on the sample locations.


Uploading
^^^^^^^^^

For simple single-use uploads, use ``pipe_context::stream_uploader`` or
``pipe_context::const_uploader``. The latter should be used for uploading
constants, while the former should be used for uploading everything else.
PIPE_USAGE_STREAM is implied in both cases, so don't use the uploaders
for static allocations.

Usage:

Call u_upload_alloc or u_upload_data as many times as you want. After you are
done, call u_upload_unmap. If the driver doesn't support persistent mappings,
u_upload_unmap makes sure the previously mapped memory is unmapped.

Gotchas:

- Always fill the memory immediately after u_upload_alloc. Any following call
  to u_upload_alloc and u_upload_data can unmap memory returned by previous
  u_upload_alloc.
- Don't interleave calls using stream_uploader and const_uploader. If you use
  one of them, do the upload, unmap, and only then can you use the other one.


Drawing
^^^^^^^

``draw_vbo`` draws a specified primitive. The primitive mode and other
properties are described by ``pipe_draw_info``.

The ``mode``, ``start``, and ``count`` fields of ``pipe_draw_info`` specify
the mode of the primitive and the vertices to be fetched, in the range from
``start`` to ``start + count - 1``, inclusive.

Every instance with instanceID in the range from ``start_instance`` to
``start_instance + instance_count - 1``, inclusive, will be drawn.

If ``index_size`` != 0, all vertex indices will be looked up from the index
buffer.

In indexed draw, ``min_index`` and ``max_index`` respectively provide a lower
and upper bound of the indices contained in the index buffer inside the range
from ``start`` to ``start + count - 1``. This allows the driver to
determine which subset of vertices will be referenced during the draw call
without having to scan the index buffer. Providing an over-estimation of
the true bounds, for example, a ``min_index`` and ``max_index`` of 0 and
0xffffffff respectively, must give exactly the same rendering, albeit with less
performance due to unreferenced vertex buffers being unnecessarily DMA'ed or
processed. Providing an underestimation of the true bounds will result in
undefined behavior, but should not result in program or system failure.

In case of non-indexed draw, ``min_index`` should be set to
``start`` and ``max_index`` should be set to ``start + count - 1``.

``index_bias`` is a value added to every vertex index after lookup and before
fetching vertex attributes.

When drawing indexed primitives, the primitive restart index can be
used to draw disjoint primitive strips. For example, several separate
line strips can be drawn by designating a special index value as the
restart index. The ``primitive_restart`` flag enables/disables this
feature. The ``restart_index`` field specifies the restart index value.

When primitive restart is in use, array indexes are compared to the
restart index before adding the index_bias offset.
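
The ordering described above (restart comparison first, bias second) can be
sketched in plain C. This is a minimal model rather than driver code:
``struct ex_draw_params`` and ``ex_resolve_index`` are hypothetical names
introduced for illustration, and only the field names ``primitive_restart``,
``restart_index`` and ``index_bias`` are taken from the text.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the pipe_draw_info fields discussed above. */
struct ex_draw_params {
   bool primitive_restart; /* enables the restart comparison */
   uint32_t restart_index; /* raw index value that ends the current strip */
   int32_t index_bias;     /* added to every index after lookup */
};

/*
 * Resolve one raw index read from the index buffer.
 * Returns false if the raw index is the restart marker, in which case no
 * vertex is fetched. Otherwise stores the biased vertex index in *vertex
 * and returns true.
 */
static bool
ex_resolve_index(const struct ex_draw_params *p, uint32_t raw, int64_t *vertex)
{
   /* The comparison uses the raw value, before index_bias is applied. */
   if (p->primitive_restart && raw == p->restart_index)
      return false;

   /* Widen before adding so a large raw index plus bias cannot overflow. */
   *vertex = (int64_t)raw + p->index_bias;
   return true;
}
```

Because the comparison happens on the raw value, an index that only equals
the restart index after the bias has been added is still fetched as a normal
vertex.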

If a given vertex element has ``instance_divisor`` set to 0, it is said
it contains per-vertex data and effective vertex attribute address needs
to be recalculated for every index.

    attribAddr = ``stride`` * index + ``src_offset``

If a given vertex element has ``instance_divisor`` set to non-zero,
it is said it contains per-instance data and effective vertex attribute
address needs to be recalculated for every ``instance_divisor``-th instance.

    attribAddr = ``stride`` * instanceID / ``instance_divisor`` + ``src_offset``

In the above formulas, ``src_offset`` is taken from the given vertex element
and ``stride`` is taken from a vertex buffer associated with the given
vertex element.

The calculated attribAddr is used as an offset into the vertex buffer to
fetch the attribute data.

The value of ``instanceID`` can be read in a vertex shader through a system
value register declared with INSTANCEID semantic name.


Queries
^^^^^^^

Queries gather some statistic from the 3D pipeline over one or more
draws. Queries may be nested, though not all gallium frontends exercise this.

Queries can be created with ``create_query`` and deleted with
``destroy_query``. To start a query, use ``begin_query``, and when finished,
use ``end_query`` to end the query.

``create_query`` takes a query type (``PIPE_QUERY_*``), as well as an index,
which is the vertex stream for ``PIPE_QUERY_PRIMITIVES_GENERATED`` and
``PIPE_QUERY_PRIMITIVES_EMITTED``, and allocates a query structure.

``begin_query`` will clear/reset previous query results.

``get_query_result`` is used to retrieve the results of a query. If
the ``wait`` parameter is TRUE, then the ``get_query_result`` call
will block until the results of the query are ready (and TRUE will be
returned).
Otherwise, if the ``wait`` parameter is FALSE, the call
will not block and the return value will be TRUE if the query has
completed or FALSE otherwise.

``get_query_result_resource`` is used to store the result of a query into
a resource without synchronizing with the CPU. This write will optionally
wait for the query to complete, and will optionally write whether the value
is available instead of the value itself.

``set_active_query_state`` sets whether all current non-driver queries except
TIME_ELAPSED are active or paused.

The interface currently includes the following types of queries:

``PIPE_QUERY_OCCLUSION_COUNTER`` counts the number of fragments which
are written to the framebuffer without being culled by
:ref:`depth-stencil-alpha` testing or shader KILL instructions.
The result is an unsigned 64-bit integer.
This query can be used with ``render_condition``.

In cases where a boolean result of an occlusion query is enough,
``PIPE_QUERY_OCCLUSION_PREDICATE`` should be used. It is just like
``PIPE_QUERY_OCCLUSION_COUNTER`` except that the result is a boolean
value of FALSE for cases where COUNTER would result in 0 and TRUE
for all other cases.
This query can be used with ``render_condition``.

In cases where a conservative approximation of an occlusion query is enough,
``PIPE_QUERY_OCCLUSION_PREDICATE_CONSERVATIVE`` should be used. It behaves
like ``PIPE_QUERY_OCCLUSION_PREDICATE``, except that it may return TRUE in
additional, implementation-dependent cases.
This query can be used with ``render_condition``.

``PIPE_QUERY_TIME_ELAPSED`` returns the amount of time, in nanoseconds,
the context takes to perform operations.
The result is an unsigned 64-bit integer.

``PIPE_QUERY_TIMESTAMP`` returns a device/driver internal timestamp,
scaled to nanoseconds, recorded after all commands issued prior to
``end_query`` have been processed.
This query does not require a call to ``begin_query``.
The result is an unsigned 64-bit integer.

``PIPE_QUERY_TIMESTAMP_DISJOINT`` can be used to check the
internal timer resolution and whether the timestamp counter has become
unreliable due to things like throttling etc. - only if this is FALSE
a timestamp query (within the timestamp_disjoint query) should be trusted.
The result is a 64-bit integer specifying the timer resolution in Hz,
followed by a boolean value indicating whether the timestamp counter
is discontinuous or disjoint.

``PIPE_QUERY_PRIMITIVES_GENERATED`` returns a 64-bit integer indicating
the number of primitives processed by the pipeline (regardless of whether
stream output is active or not).

``PIPE_QUERY_PRIMITIVES_EMITTED`` returns a 64-bit integer indicating
the number of primitives written to stream output buffers.

``PIPE_QUERY_SO_STATISTICS`` returns 2 64-bit integers corresponding to
the result of
``PIPE_QUERY_PRIMITIVES_EMITTED`` and
the number of primitives that would have been written to stream output buffers
if they had infinite space available (primitives_storage_needed), in this order.
XXX the 2nd value is equivalent to ``PIPE_QUERY_PRIMITIVES_GENERATED`` but it is
unclear if it should be increased if stream output is not active.

``PIPE_QUERY_SO_OVERFLOW_PREDICATE`` returns a boolean value indicating
whether a selected stream output target has overflowed as a result of the
commands issued between ``begin_query`` and ``end_query``.
This query can be used with ``render_condition``. The output stream is
selected by the stream number passed to ``create_query``.

``PIPE_QUERY_SO_OVERFLOW_ANY_PREDICATE`` returns a boolean value indicating
whether any stream output target has overflowed as a result of the commands
issued between ``begin_query`` and ``end_query``.
This query can be used
with ``render_condition``, and its result is the logical OR of multiple
``PIPE_QUERY_SO_OVERFLOW_PREDICATE`` queries, one for each stream output
target.

``PIPE_QUERY_GPU_FINISHED`` returns a boolean value indicating whether
all commands issued before ``end_query`` have completed. However, this
does not imply serialization.
This query does not require a call to ``begin_query``.

``PIPE_QUERY_PIPELINE_STATISTICS`` returns an array of the following
64-bit integers:

* Number of vertices read from vertex buffers.
* Number of primitives read from vertex buffers.
* Number of vertex shader threads launched.
* Number of geometry shader threads launched.
* Number of primitives generated by geometry shaders.
* Number of primitives forwarded to the rasterizer.
* Number of primitives rasterized.
* Number of fragment shader threads launched.
* Number of tessellation control shader threads launched.
* Number of tessellation evaluation shader threads launched.

If a shader type is not supported by the device/driver,
the corresponding values should be set to 0.

``PIPE_QUERY_PIPELINE_STATISTICS_SINGLE`` returns a single counter from
the ``PIPE_QUERY_PIPELINE_STATISTICS`` group. The specific counter must
be selected when calling ``create_query`` by passing one of the
``PIPE_STAT_QUERY`` enums as the query's ``index``.

Gallium does not guarantee the availability of any query types; one must
always check the capabilities of the :ref:`Screen` first.


Conditional Rendering
^^^^^^^^^^^^^^^^^^^^^

A drawing command can be skipped depending on the outcome of a query
(typically an occlusion query, or streamout overflow predicate).
The ``render_condition`` function specifies the query which should be checked
prior to rendering anything. Functions always honoring render_condition include
(and are limited to) draw_vbo and clear.
The blit, clear_render_target and clear_depth_stencil functions (but
not resource_copy_region, which seems inconsistent) can also optionally honor
the current render condition.

If ``render_condition`` is called with ``query`` = NULL, conditional
rendering is disabled and drawing takes place normally.

If ``render_condition`` is called with a non-null ``query``, subsequent
drawing commands will be predicated on the outcome of the query.
Commands will be skipped if ``condition`` is equal to the predicate result
(for non-boolean queries such as OCCLUSION_QUERY, zero counts as FALSE,
non-zero as TRUE).

If ``mode`` is PIPE_RENDER_COND_WAIT the driver will wait for the
query to complete before deciding whether to render.

If ``mode`` is PIPE_RENDER_COND_NO_WAIT and the query has not yet
completed, the drawing command will be executed normally. If the query
has completed, drawing will be predicated on the outcome of the query.

If ``mode`` is PIPE_RENDER_COND_BY_REGION_WAIT or
PIPE_RENDER_COND_BY_REGION_NO_WAIT rendering will be predicated as above
for the non-REGION modes but in the case that an occlusion query returns
a non-zero result, regions which were occluded may be omitted by subsequent
drawing commands. This can result in better performance with some GPUs.
Normally, if the occlusion query returned a non-zero result subsequent
drawing happens normally so fragments may be generated, shaded and
processed even where they're known to be obscured.


Flushing
^^^^^^^^

``flush``

PIPE_FLUSH_END_OF_FRAME: Whether the flush marks the end of frame.

PIPE_FLUSH_DEFERRED: It is not required to flush right away, but it is required
to return a valid fence.
If fence_finish is called with the returned fence
and the context is still unflushed, and the ctx parameter of fence_finish is
equal to the context where the fence was created, fence_finish will flush
the context.

PIPE_FLUSH_ASYNC: The flush is allowed to be asynchronous. Unlike
``PIPE_FLUSH_DEFERRED``, the driver must still ensure that the returned fence
will finish in finite time. However, subsequent operations in other contexts of
the same screen are no longer guaranteed to happen after the flush. Drivers
which use this flag must implement pipe_context::fence_server_sync.

PIPE_FLUSH_HINT_FINISH: Hints to the driver that the caller will immediately
wait for the returned fence.

Additional flags may be set together with ``PIPE_FLUSH_DEFERRED`` for even
finer-grained fences. Note that as a general rule, GPU caches may not have been
flushed yet when these fences are signaled. Drivers are free to ignore these
flags and create normal fences instead. At most one of the following flags can
be specified:

PIPE_FLUSH_TOP_OF_PIPE: The fence should be signaled as soon as the next
command is ready to start executing at the top of the pipeline, before any of
its data is actually read (including indirect draw parameters).

PIPE_FLUSH_BOTTOM_OF_PIPE: The fence should be signaled as soon as the previous
command has finished executing on the GPU entirely (but data written by the
command may still be in caches and inaccessible to the CPU).


``flush_resource``

Flush the resource cache, so that the resource can be used
by an external client. Possible usage:

- flushing a resource before presenting it on the screen
- flushing a resource if some other process or device wants to use it

This shouldn't be used to flush caches if the resource is only managed
by a single pipe_screen and is not shared with another process.
(i.e.
you shouldn't use it to flush caches explicitly if you want to e.g.
use the resource for texturing)

Fences
^^^^^^

``pipe_fence_handle``, and related methods, are used to synchronize
execution between multiple parties. Examples include CPU <-> GPU synchronization,
renderer <-> windowing system, multiple external APIs, etc.

A ``pipe_fence_handle`` can either be 'one time use' or 're-usable'. A 'one time use'
fence behaves like a traditional GPU fence. Once it reaches the signaled state it
is forever considered to be signaled.

Once a re-usable ``pipe_fence_handle`` becomes signaled, it can be reset
back into an unsignaled state. The ``pipe_fence_handle`` will be reset to
the unsignaled state by performing a wait operation on said object, i.e.
``fence_server_sync``. As a corollary to this behavior, a re-usable
``pipe_fence_handle`` can only have one waiter.

This behavior is useful in producer <-> consumer chains. It helps avoid
unnecessarily sharing a new ``pipe_fence_handle`` each time a new frame is
ready. Instead, the fences are exchanged once ahead of time, and access is synchronized
through GPU signaling instead of direct producer <-> consumer communication.

``fence_server_sync`` inserts a wait command into the GPU's command stream.

``fence_server_signal`` inserts a signal command into the GPU's command stream.

There are no guarantees that the wait/signal commands will be flushed when
calling ``fence_server_sync`` or ``fence_server_signal``. An explicit
call to ``flush`` is required to make sure the commands are emitted to the GPU.

The Gallium implementation may implicitly ``flush`` the command stream during a
``fence_server_sync`` or ``fence_server_signal`` call if necessary.

Resource Busy Queries
^^^^^^^^^^^^^^^^^^^^^

``is_resource_referenced``



Blitting
^^^^^^^^

These methods emulate classic blitter controls.

These methods operate directly on ``pipe_resource`` objects, and stand
apart from any 3D state in the context. Blitting functionality may be
moved to a separate abstraction at some point in the future.

``resource_copy_region`` blits a region of a resource to a region of another
resource, provided that both resources have the same format, or compatible
formats, i.e., formats for which copying the bytes from the source resource
unmodified to the destination resource will achieve the same effect as a
textured quad blitter. The source and destination may be the same resource,
but overlapping blits are not permitted.
This can be considered the equivalent of a CPU memcpy.

``blit`` blits a region of a resource to a region of another resource, including
scaling, format conversion, and up-/downsampling, as well as a destination clip
rectangle (scissors) and window rectangles. It can also optionally honor the
current render condition (but either way the blit itself never contributes
anything to queries currently gathering data).
As opposed to manually drawing a textured quad, this lets the pipe driver choose
the optimal method for blitting (like using a special 2D engine), and usually
offers, for example, accelerated stencil-only copies even where
PIPE_CAP_SHADER_STENCIL_EXPORT is not available.


Transfers
^^^^^^^^^

These methods are used to get data to/from a resource.

``transfer_map`` creates a memory mapping and the transfer object
associated with it.
The returned pointer points to the start of the mapped range according to
the box region, not the beginning of the resource. If transfer_map fails,
the returned pointer to the buffer memory is NULL, and the pointer
to the transfer object remains unchanged (i.e. it can be non-NULL).

``transfer_unmap`` removes the memory mapping for and destroys
the transfer object.
The pointer into the resource should be considered
invalid and discarded.

``texture_subdata`` and ``buffer_subdata`` perform a simplified
transfer for simple writes. Each is equivalent to ``transfer_map``, a data
write, and ``transfer_unmap`` in a single call.


The box parameter to some of these functions defines a 1D, 2D or 3D
region of pixels. This is self-explanatory for 1D, 2D and 3D texture
targets.

For PIPE_TEXTURE_1D_ARRAY and PIPE_TEXTURE_2D_ARRAY, the box::z and box::depth
fields refer to the array dimension of the texture.

For PIPE_TEXTURE_CUBE, the box::z and box::depth fields refer to the
faces of the cube map (z + depth <= 6).

For PIPE_TEXTURE_CUBE_ARRAY, the box::z and box::depth fields refer to both
the face and array dimension of the texture (face = z % 6, array = z / 6).


.. _transfer_flush_region:

transfer_flush_region
%%%%%%%%%%%%%%%%%%%%%

If a transfer was created with ``FLUSH_EXPLICIT``, it will not automatically
be flushed on write or unmap. Flushes must be requested with
``transfer_flush_region``. Flush ranges are relative to the mapped range, not
the beginning of the resource.



.. _texture_barrier:

texture_barrier
%%%%%%%%%%%%%%%

This function flushes all pending writes to the currently-set surfaces and
invalidates all read caches of the currently-set samplers. This can be used
for both regular textures as well as for framebuffers read via FBFETCH.



.. _memory_barrier:

memory_barrier
%%%%%%%%%%%%%%

This function flushes caches according to which of the PIPE_BARRIER_* flags
are set.



.. _resource_commit:

resource_commit
%%%%%%%%%%%%%%%

This function changes the commit state of a part of a sparse resource. Sparse
resources are created by setting the ``PIPE_RESOURCE_FLAG_SPARSE`` flag when
calling ``resource_create``.
Initially, sparse resources only reserve a virtual
memory region that is not backed by memory (i.e., it is uncommitted). The
``resource_commit`` function can be called to commit or uncommit parts (or all)
of a resource. The driver manages the underlying backing memory.

The contents of newly committed memory regions are undefined. Calling this
function to commit an already committed memory region is allowed and leaves its
content unchanged. Similarly, calling this function to uncommit an already
uncommitted memory region is allowed.

For buffers, the given box must be aligned to multiples of
``PIPE_CAP_SPARSE_BUFFER_PAGE_SIZE``. As an exception to this rule, if the size
of the buffer is not a multiple of the page size, changing the commit state of
the last (partial) page requires a box that ends at the end of the buffer
(i.e., box->x + box->width == buffer->width0).



.. _pipe_transfer:

PIPE_MAP
^^^^^^^^

These flags control the behavior of a transfer object.

``PIPE_MAP_READ``
   Resource contents are read back (or accessed directly) at transfer create time.

``PIPE_MAP_WRITE``
   Resource contents will be written back at transfer_unmap time (or modified
   as a result of being accessed directly).

``PIPE_MAP_DIRECTLY``
   A transfer should directly map the resource. May return NULL if not supported.

``PIPE_MAP_DISCARD_RANGE``
   The memory within the mapped region is discarded. Cannot be used with
   ``PIPE_MAP_READ``.

``PIPE_MAP_DISCARD_WHOLE_RESOURCE``
   Discards all memory backing the resource. It should not be used with
   ``PIPE_MAP_READ``.

``PIPE_MAP_DONTBLOCK``
   Fail if the resource cannot be mapped immediately.

``PIPE_MAP_UNSYNCHRONIZED``
   Do not synchronize pending operations on the resource when mapping. The
   interaction of any writes to the map and any operations pending on the
   resource is undefined.
   Cannot be used with ``PIPE_MAP_READ``.

``PIPE_MAP_FLUSH_EXPLICIT``
   Written ranges will be notified later with :ref:`transfer_flush_region`.
   Cannot be used with ``PIPE_MAP_READ``.

``PIPE_MAP_PERSISTENT``
   Allows the resource to be used for rendering while mapped.
   ``PIPE_RESOURCE_FLAG_MAP_PERSISTENT`` must be set when creating
   the resource.
   If COHERENT is not set, ``memory_barrier(PIPE_BARRIER_MAPPED_BUFFER)``
   must be called to ensure the device can see what the CPU has written.

``PIPE_MAP_COHERENT``
   If PERSISTENT is set, this ensures any writes done by the device are
   immediately visible to the CPU and vice versa.
   ``PIPE_RESOURCE_FLAG_MAP_COHERENT`` must be set when creating
   the resource.

Compute kernel execution
^^^^^^^^^^^^^^^^^^^^^^^^

A compute program can be defined, bound or destroyed using
``create_compute_state``, ``bind_compute_state`` or
``destroy_compute_state`` respectively.

Any of the subroutines contained within the compute program can be
executed on the device using the ``launch_grid`` method. This method
will execute as many instances of the program as elements in the
specified N-dimensional grid, hopefully in parallel.

The compute program has access to four special resources:

* ``GLOBAL`` represents a memory space shared among all the threads
  running on the device. An arbitrary buffer created with the
  ``PIPE_BIND_GLOBAL`` flag can be mapped into it using the
  ``set_global_binding`` method.

* ``LOCAL`` represents a memory space shared among all the threads
  running in the same working group. The initial contents of this
  resource are undefined.

* ``PRIVATE`` represents a memory space local to a single thread.
  The initial contents of this resource are undefined.

* ``INPUT`` represents a read-only memory space that can be
  initialized at ``launch_grid`` time.

These resources use a byte-based addressing scheme, and they can be
accessed from the compute program by means of the LOAD/STORE TGSI
opcodes. Additional resources to be accessed using the same opcodes
may be specified by the user with the ``set_compute_resources``
method.

In addition, normal texture sampling is allowed from the compute
program: ``bind_sampler_states`` may be used to set up texture
samplers for the compute stage and ``set_sampler_views`` may
be used to bind a number of sampler views to it.

Mipmap generation
^^^^^^^^^^^^^^^^^

If PIPE_CAP_GENERATE_MIPMAP is true, ``generate_mipmap`` can be used
to generate mipmaps for the specified texture resource.
It replaces texel image levels base_level+1 through
last_level for the layer range first_layer through last_layer.
It returns TRUE if mipmap generation succeeds, otherwise it
returns FALSE. Mipmap generation may fail when it is not supported
for particular texture types or formats.

Device resets
^^^^^^^^^^^^^

Gallium frontends can query, or request notification of, GPU resets,
whatever their cause (application error, driver error). When
a GPU reset happens, the context becomes unusable and all related state
should be considered lost and undefined. Despite that, context
notifications are single-shot, i.e. subsequent calls to
``get_device_reset_status`` will return PIPE_NO_RESET.

* ``get_device_reset_status`` queries whether a device reset has happened
  since the last call or since the last notification by callback.
* ``set_device_reset_callback`` sets a callback which will be called when
  a device reset is detected. The callback is only called synchronously.
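
The single-shot behavior described above can be modelled with a small
standalone sketch. It does not use the Gallium headers: ``toy_context``,
``toy_reset_status``, ``toy_notify_reset`` and
``toy_get_device_reset_status`` are hypothetical names invented for
illustration, and treating a delivered callback as consuming the pending
notification is only one reading of the text, not a statement about how any
driver implements it.

.. code-block:: c

   #include <stdbool.h>
   #include <stddef.h>

   /* Hypothetical stand-in for the reset status values. Only the idea of a
    * "no reset" value (PIPE_NO_RESET in the text) is taken from the docs. */
   enum toy_reset_status {
      TOY_NO_RESET = 0,
      TOY_RESET_DETECTED = 1,
   };

   /* Hypothetical context holding just the reset-tracking state. */
   struct toy_context {
      bool reset_pending; /* a reset happened and was not yet reported */
      void (*reset_cb)(void *data, enum toy_reset_status status);
      void *reset_cb_data;
   };

   /* Called by the toy driver when it detects a reset. If a callback is
    * installed it is invoked synchronously and the notification is consumed;
    * otherwise the reset is remembered for the next status query. */
   void toy_notify_reset(struct toy_context *ctx)
   {
      if (ctx->reset_cb != NULL) {
         ctx->reset_cb(ctx->reset_cb_data, TOY_RESET_DETECTED);
         ctx->reset_pending = false;
      } else {
         ctx->reset_pending = true;
      }
   }

   /* Single-shot query: reports a pending reset once, then clears it. */
   enum toy_reset_status toy_get_device_reset_status(struct toy_context *ctx)
   {
      if (!ctx->reset_pending)
         return TOY_NO_RESET;

      ctx->reset_pending = false;
      return TOY_RESET_DETECTED;
   }

In this model a frontend that polls sees the reset reported exactly once and
must remember it itself, since every later query returns the "no reset" value
even though the context is still unusable.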

Bindless
^^^^^^^^

If PIPE_CAP_BINDLESS_TEXTURE is TRUE, the following ``pipe_context`` functions
are used to create/delete bindless handles, and to make them resident in the
current context when they are going to be used by shaders.

* ``create_texture_handle`` creates a 64-bit unsigned integer texture handle
  that is going to be directly used in shaders.
* ``delete_texture_handle`` deletes a 64-bit unsigned integer texture handle.
* ``make_texture_handle_resident`` makes a 64-bit unsigned integer texture
  handle resident in the current context to be accessible by shaders for
  texture mapping.
* ``create_image_handle`` creates a 64-bit unsigned integer image handle that
  is going to be directly used in shaders.
* ``delete_image_handle`` deletes a 64-bit unsigned integer image handle.
* ``make_image_handle_resident`` makes a 64-bit unsigned integer image handle
  resident in the current context to be accessible by shaders for image loads,
  stores and atomic operations.

Using several contexts
----------------------

Several contexts from the same screen can be used at the same time. Objects
created on one context cannot be used in another context, but the objects
created by the screen methods can be used by all contexts.

Transfers
^^^^^^^^^

A transfer on one context is not expected to synchronize properly with
rendering on other contexts, thus only areas not yet used for rendering should
be locked.

A flush is required after ``transfer_unmap`` before other contexts can be
expected to see the uploaded data, unless:

* Using persistent mapping. When combined with coherent mapping, unmapping the
  resource is also not required to use it in other contexts. Without coherent
  mapping, ``memory_barrier(PIPE_BARRIER_MAPPED_BUFFER)`` should be called on
  the context that has mapped the resource. No flush is required.

* Mapping the resource with ``PIPE_MAP_DIRECTLY``.