Lines Matching +full:flat +full:- +full:cache
8 D.u = abs(S0.i - S1.i) + S2.u.
15 ABS_DIFF (A,B) = (A>B) ? (A-B) : (B-A)
21 `v_sad_u32(-5, 0, 0)` would return `4294967291` (`-5` interpreted as unsigned),
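
The difference between the two formulas quoted in the hits above can be checked with a small host-side model. This is only a sketch: `abs_diff_u32`, `model_v_sad_u32` and `model_signed_reading` are hypothetical helper names for illustration, not part of any driver or ISA header, and the model covers only the arithmetic, not the instruction's encoding or modifiers.

```c
#include <stdint.h>

/* Unsigned absolute difference, following the quoted definition:
 * ABS_DIFF (A,B) = (A>B) ? (A-B) : (B-A) */
static uint32_t abs_diff_u32(uint32_t a, uint32_t b)
{
    return (a > b) ? (a - b) : (b - a);
}

/* Hypothetical model of the unsigned reading:
 * D.u = ABS_DIFF(S0.u, S1.u) + S2.u (addition wraps modulo 2^32). */
static uint32_t model_v_sad_u32(uint32_t s0, uint32_t s1, uint32_t s2)
{
    return abs_diff_u32(s0, s1) + s2;
}

/* Hypothetical model of the signed reading:
 * D.u = abs(S0.i - S1.i) + S2.u.
 * The subtraction is done in 64 bits to avoid signed overflow. */
static uint32_t model_signed_reading(int32_t s0, int32_t s1, uint32_t s2)
{
    int64_t d = (int64_t)s0 - (int64_t)s1;
    if (d < 0)
        d = -d;
    return (uint32_t)d + s2;
}
```

With these helpers, `model_v_sad_u32((uint32_t)-5, 0, 0)` gives `4294967291`, matching the value quoted above, while `model_signed_reading(-5, 0, 0)` gives `5`.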
78 > and sent to the texture cache. Any texture or buffer resources and samplers
79 > are also sent immediately. However, write-data is not immediately sent to the
80 > texture cache.
102 ## FLAT, Scratch, Global instructions
126 ## RDNA L0, L1 cache and DLC, GLC bits
128 The old L1 cache was renamed to L0, and a new L1 cache was added to RDNA. The
129 L1 cache is 1 cache per shader array. Some instruction encodings have DLC and
130 GLC bits that interact with the cache.
132 * DLC ("device level coherent") bit: controls the L1 cache
133 * GLC ("globally coherent") bit: controls the L0 cache
139 Stores and atomics always bypass the L1 cache, so they don't support the DLC bit,
184 …tps://github.com/llvm/llvm-project/blob/acb089e12ae48b82c0b05c42326196a030df9b82/llvm/lib/Target/A…
193 …VM source.](https://github.com/llvm/llvm-project/blob/main/llvm/lib/Target/AMDGPU/Utils/AMDGPUBase…
219 VMEM/FLAT/GLOBAL/SCRATCH/DS instruction reads an SGPR (or EXEC, or M0).
231 Any non-SOPP SALU instruction (except `s_setvskip`, `s_version`, and any non-lgkmcnt `s_waitcnt`).
245 When there is a misaligned multi-dword FLAT load/store instruction in WGP mode,
246 it needs to be split into multiple single-dword FLAT instructions.
248 ACO doesn't use FLAT load/store on GFX10, so is unaffected.
252 The 12-bit immediate OFFSET field of FLAT instructions must always be 0.
255 ACO doesn't use FLAT load/store on GFX10, so is unaffected.
268 Any non-VALU instruction reads the EXEC mask. Then, any VALU instruction writes the EXEC mask.
286 "MIMG-NSA in a hard clause has unpredictable results on GFX10.1"