Allow early z-test and early-lrz (if applicable)

Disable early z-test and early-lrz test (if applicable)

A special mode that allows early-lrz test but disables early-z test. This might sound a bit funny, since lrz-test happens before z-test. But as long as a couple of conditions are maintained, this allows using lrz-test in cases where the fragment shader has kill/discard:

1) Disable lrz-write in cases where it is uncertain during the binning pass that a fragment will pass, ie. if the frag shader has-kill, writes-z, or alpha/stencil test is enabled. (For correctness, lrz-write must be disabled when blend is enabled.) This is analogous to how a z-prepass works.

2) Disable lrz-write and test if a depth-test direction reversal is detected.

Due to condition (1), the contents of the lrz buffer are a conservative estimation of the depth buffer during the draw pass. This means that geometry that we know for certain will not be visible will not pass lrz-test, but geometry which may be visible (or contributes to blend) will pass the lrz-test.

This allows us to keep early-lrz-test in cases where the frag shader does not write-z (ie. we know the z-value before FS) and does not have side-effects (image/ssbo writes, etc), but does have kill/discard. This turns out to be a common enough case that it is useful to keep early-lrz test against the conservative lrz buffer to discard fragments that we know will definitely not be visible.

b0..7 seems to contain the size of buffered but not yet processed RB level cmdstream. It's possible that it is a low threshold and b8..15 is a high threshold?

b16..23 identifies where IB1 data starts (and RB data ends?)

b24..31 identifies where IB2 data starts (and IB1 data ends)

low bits identify where CP_SET_DRAW_STATE stateobj processing starts (and IB2 data ends).
I'm guessing b8 is part of this since (from downstream kgsl):

    /* ROQ sizes are twice as big on a640/a680 than on a630 */
    if (adreno_is_a640(adreno_dev) || adreno_is_a680(adreno_dev)) {
        kgsl_regwrite(device, A6XX_CP_ROQ_THRESHOLDS_2, 0x02000140);
        kgsl_regwrite(device, A6XX_CP_ROQ_THRESHOLDS_1, 0x8040362C);
    }
    ...

number of remaining dwords incl current dword being consumed?

number of remaining dwords incl current dword being consumed?

number of dwords that have already been read but haven't been consumed by $addr

Configures the mapping between VSC_PIPE buffer and bin. X/Y specify the bin index in the horiz/vert direction (0,0 is upper left, 0,1 is leftmost bin on second row, and so on). W/H specify the number of bins assigned to this VSC_PIPE in the horiz/vert dimension.

Seems to be a bitmap of which tiles mapped to the VSC pipe contain geometry. I suppose we can connect a maximum of 32 tiles to a single VSC pipe.

Has the size of data written to corresponding VSC_PRIM_STRM buffer.

Has the size of data written to corresponding VSC pipe, ie. same thing that is written out to VSC_DRAW_STRM_SIZE_ADDRESS_LO/HI

LRZ write also disabled for blend/etc.

update MAX instead of MIN value, ie. GL_GREATER/GL_GEQUAL

Z_TEST_ENABLE bit is set for zfunc other than GL_ALWAYS or GL_NEVER

also set when Z_BOUNDS_ENABLE is set

For clearing depth/stencil
  1 - depth
  2 - stencil
  3 - depth+stencil
For clearing color buffer:
  then probably a component mask, I always see 0xf

num of varyings plus four for gl_Position (plus one if gl_PointSize) plus # of transform-feedback (streamout) varyings if using the hw streamout (rather than stg instructions in shader)

The number of extra copies of POSITION, i.e. number of views minus one when multi-position output is enabled, otherwise 0.

This VPC location will be overwritten with ViewID when multiview is enabled. It's used when fragment shaders read ViewID.
It's only strictly required for multi-position output, where the same VS invocation is used for all the views at once, but it can be used when multi-pos output is disabled too, to avoid having to pass ViewID through the VS.

num of varyings plus four for gl_Position (plus one if gl_PointSize) plus # of transform-feedback (streamout) varyings if using the hw streamout (rather than stg instructions in shader)

geometry shader

size in vec4s of per-primitive storage for gs. TODO: not actually in VPC

Multi-position output lets the last geometry stage shader write multiple copies of gl_Position. If disabled then the VS is run once for each view, and ViewID is passed as a register to the VS.

bit 0 seems to toggle between 2k and 32k of shared storage

the ldl/stl offset seems to be rewritten to 0 when it is beyond this limit. This is different from ldlw/stlw, which wraps at 64k (and has 36k of storage on A640 - reads between 36k-64k always return 0)

per MRT

This register clears pending loads queued up by CP_LOAD_STATE6. Each bit resets a particular kind(s) of CP_LOAD_STATE6.

Shared constants are intended to be used for Vulkan push constants. When enabled, 8 vec4's are reserved in the FS const pool and 16 in the geometry const pool although only 8 are actually used (why?) and they are mapped to c504-c511 in each stage. Both VS and FS shared consts are written using ST6_CONSTANTS/SB6_IBO, so that both the geometry and FS shared consts can be written at once by using CP_LOAD_STATE6 rather than CP_LOAD_STATE6_FRAG/CP_LOAD_STATE6_GEOM. In addition DST_OFF and NUM_UNIT are in units of dwords instead of vec4's. There is also a separate shared constant pool for CS, which is loaded through CP_LOAD_STATE6_FRAG with ST6_UBO/ST6_IBO. However the only real difference for CS is the dword units.

Texture sampler dwords

Texture constant dwords

Pitch in bytes (so actually stride)

Pitch in bytes (so actually stride)
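
The lrz conditions described in the early-lrz/late-z notes at the top can be sketched as a small decision function. This is only an illustration of the stated rules, not driver code: every name below (z_mode_sketch, draw_state_sketch, lrz_decision_sketch, choose_lrz_sketch) is hypothetical. Falling back to the fully-late mode when the frag shader writes z or has side effects is an assumption; the notes only say that the special mode does not apply in those cases.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical names: the three modes mirror the three descriptions. */
enum z_mode_sketch {
    Z_MODE_EARLY_Z,          /* early z-test and early-lrz allowed */
    Z_MODE_LATE_Z,           /* early z-test and early-lrz disabled */
    Z_MODE_EARLY_LRZ_LATE_Z, /* early-lrz test allowed, early z-test disabled */
};

/* Hypothetical per-draw inputs named after the conditions in the notes. */
struct draw_state_sketch {
    bool fs_has_kill;         /* frag shader uses kill/discard */
    bool fs_writes_z;         /* frag shader writes depth */
    bool fs_has_side_effects; /* image/ssbo writes, etc */
    bool alpha_test;
    bool stencil_test;
    bool blend;
    bool depth_dir_reversed;  /* depth-test direction reversal detected */
};

struct lrz_decision_sketch {
    enum z_mode_sketch mode;
    bool lrz_test;
    bool lrz_write;
};

struct lrz_decision_sketch
choose_lrz_sketch(const struct draw_state_sketch *s)
{
    struct lrz_decision_sketch d;

    assert(s);

    if (s->fs_writes_z || s->fs_has_side_effects) {
        /* Assumption: the special mode is unavailable, so go fully late. */
        d.mode = Z_MODE_LATE_Z;
    } else if (s->fs_has_kill) {
        /* z is known before the FS runs, but the fragment may be killed. */
        d.mode = Z_MODE_EARLY_LRZ_LATE_Z;
    } else {
        d.mode = Z_MODE_EARLY_Z;
    }

    d.lrz_test = (d.mode != Z_MODE_LATE_Z);

    /* Condition (1): no lrz-write when it is uncertain during binning
     * that a fragment will pass, and never with blend enabled. */
    d.lrz_write = d.lrz_test &&
                  !(s->fs_has_kill || s->fs_writes_z ||
                    s->alpha_test || s->stencil_test || s->blend);

    /* Condition (2): a direction reversal disables both write and test. */
    if (s->depth_dir_reversed) {
        d.lrz_test = false;
        d.lrz_write = false;
    }

    return d;
}
```

With a discard-only frag shader the sketch keeps lrz-test against the conservative lrz buffer while leaving lrz-write off, which is the common case the notes call out.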