Lines Matching refs:aco
132 - radv/aco: "Failed to allocate registers" in AC:Valhalla
177 - radv,aco: CTS image robustness tests fail to compile
191 - [aco] problem compiling compute pipeline
808 - aco/ra: use get_reg_specified() for p_extract_vector
809 - aco: don't create dead exec mask phis on merge blocks
810 - aco: fix DCE of rematerializable phi operands
811 - aco/spill: only prevent rematerializable vars from being DCE'd if they haven't been renamed
812 - aco/ra: fix phi operand renaming
814 - aco: don't emit parallelcopy when switching to WQM.
815 - aco: make pred_by_exec_mask() accessible in other files
816 - aco: allow to schedule SALU/SMEM through exec changes
817 - aco: fix def-use distance calculation when scheduling.
818 - aco: schedule position exports in the same pass as memory operations
819 - aco: create VMEM clauses slightly more aggressive
833 - aco: simplify and fix operand/definition sizes
834 - aco/ra: fix infinite recursion in get_reg_simple() with subdword registers
835 - aco: fix VOP3P assembly, VN and validation
836 - aco/RA: fix subdword operands on VOP3P instructions
837 - aco: allow constants/literals on every src position for VOP3P
838 - aco: allow SGPRs on every src position for VOP3P
839 - aco: change usesModifiers() considering opsel_hi on packed instructions
840 - aco: create helpers to emit vop3p instructions
841 - aco: emit packed 16bit instructions
843 - aco: simplify multiply-add combining
844 - aco: optimize packed mul+add to v_pk_fma_f16
845 - aco: optimize packed clamp
846 - aco: optimize packed fneg
847 - aco: optimize v_pk_fma_f16 -> v_pk_fmac_f16 on GFX10
848 - aco: propagate swizzles when optimizing packed clamp & fma
849 - aco: remove divergent branches which only jump over very few instructions
850 - aco/optimizer: don't propagate subdword temps of different size
851 - aco/optimizer: don't copy-prop logical phis
852 - aco: fix nir_intrinsic_ballot with wave32
853 - aco: fix shared VGPR allocation on RDNA2
1637 - aco: Define NOMINMAX in Meson build file
1638 - aco: Fix warnings about unsafe integer/bool mix
1639 - aco: Add missing C++ includes
1640 - aco: Remove nonstandard parentheses
1641 - aco: Declare num_reduce_ops for array size
1642 - aco: Const correct aco_compiler_statistics
1643 - aco: Replace indexed array initialization
1644 - aco: Use u_memstream instead of POSIX memstream
1645 - aco: Initialize union within Operand for MSVC
1646 - aco: Fix warnings for bools in bitwise logic
1647 - aco: Stub sections that don't have _WIN32 support
1648 - aco: Avoid extra bitfield padding
1857 - aco: use UINT64_C on 64 bit constant arguments
2677 - aco: don't combine precise max(min()) to med3
2678 - aco: fix combine_constant_comparison_ordering() NaN check with 16/64-bit
2679 - aco: disallow various v_add_u32 opts if modifiers are used
2680 - aco/tests: initialize debug function
2681 - aco/tests: expand optimize.const_comparison_ordering tests
2682 - aco/tests: add some more clamp combining tests
2685 - aco: disable omod if the sign of zeros should be preserved
2686 - aco: fix fp16 \*0.5 omod
2687 - aco/tests: add output modifier tests
2688 - aco: don't use SMEM for SSBO stores
2689 - aco: create v_mad_u32_u24
2698 - aco: don't create v_mov_b32 in v_mul_imm()
2699 - aco: count v_mul_lo_u32 as 16 cycles
2700 - aco: create vgpr constant copies using v_bfrev_b32
2701 - aco: copy constant to sgpr in Builder::v_mul_imm()
2702 - aco: try harder to not create v_mul_lo_u32
2703 - aco: use v_mul_imm() for some nir_op_imul
2704 - aco/tests: add Builder::v_mul_imm() tests
2705 - aco: fix v_mul_hi_u32_u24 format
2708 - radv/llvm,aco/ngg: fix large shift exponent in ngg_gs_vertex_lds_addr
2709 - aco: fix GS with no outputs
2710 - aco/ngg: fix division-by-zero in assertion
2730 - aco: use binding chasing helpers
2732 - aco: use FALLTHROUGH macro
2734 - aco: don't assume src=lower when splitting self-intersecting copies
2735 - aco: test self-intersecting copies when src=higher
2736 - aco: remove sign-extension in constantValue64()
2737 - aco: allow 64-bit literals if they can be sign/zero-extended from 32-bit
2738 - aco: add get_const/is_constant_representable helpers
2739 - aco: use v_lshrrev_b64 for 64-bit VGPR copies on GFX10+
2740 - aco: coalesce constant copies
2741 - aco: clear operands in update_renames()
2742 - aco: don't fill killed operands in update_renames()
2743 - aco: remove rollback code in get_reg_create_vector()
2744 - aco: repeat get_reg_create_vector() with increased register demand if fail
2745 - aco: use clear() helper instead of writing reg file directly
2746 - aco: simplify get_reg_impl()
2747 - aco: remove rollback code around parallelcopy creation
2748 - aco: remove rollback code for blocked fixed definitions
2749 - aco: move update_renames() out of get_reg()
2750 - aco: remove rollback code when making an instruction vop3
2763 - aco: fix various s_subb_u32 operands to SCC
2764 - aco: rename s_subb_u32 operands to borrow
2766 - aco: fix mbcnt_amd with wave32
2767 - aco: allow divergent mbcnt_amd masks
2768 - aco: add block to worklist in mark_block_wqm()
2770 - aco: fix incorrect address calculation for load_barycentric_at_sample
2788 - aco: fix unreachable() for uniform 8/16-bit nir_op_mov from VGPR
2789 - aco: fix MIMG_instruction::lwe comment
2790 - aco: move MIMG VDATA to its own operand
2791 - aco: implement nir_op_vec5
2792 - aco: implement sparse texture fetches
2793 - aco: implement sparse image loads
2794 - aco: form sparse load clauses
2802 - aco: try to better align 8+ dword SGPR vectors
2803 - aco: remove can_reorder semantic in get_sync_info_with_hack
2806 - aco: improve nir_op_vec with constant operands
2807 - aco/tests: don't rely on argument evaluation order
2809 - aco: fix convert_to_SDWA() check in add_subdword_definition()
2810 - radv,aco: don't use MUBUF for multi-channel loads on GFX8 with robustness2
2811 - aco: don't consider a phi trivial if same's register doesn't match the def
2815 - aco: always set exec_live=false
2816 - aco: do not flag all blocks WQM to ensure we enter all nested loops in WQM
2817 - aco: add fallback algorithm in get_reg()
2818 - aco/lower_phis: fix all_preds_uniform with continue_or_break
2819 - aco: add missing usable_read2 check
2823 - aco: calculate all p_as_uniform and v_readfirstlane_b32 sources in WQM
2950 - aco: fix combining add/sub to b2i if a new dest needs to be allocated
2952 - aco/tests: add some tests for combining s_add+s_lshl to s_lshl<n>_add
2953 - aco: combine more s_add+s_lshl to s_lshl<n>_add by ignoring uses
2954 - aco: introduce a generic label for labelling instructions
2955 - aco: add a new Operand flag to indicate that is 16-bit
2956 - aco: optimize v_mad_u32_u16 with acc=0 to v_mul_u32_u24
2957 - aco: select v_mad_u32_u16 for 16-bit multiplications on GFX9+
2958 - aco: select v_mul_lo_u16 for 16-bit multiplications that can't overflow
2959 - aco: optimize v_add_u32(v_mul_lo_u16) -> v_mad_u32_u16
2960 - aco: optimize v_add(v_bcnt(a, 0), b) to v_bcnt(a, b)
2963 - aco: remove v_{add,sub,subrev}_u32 on GFX8
2968 - aco: fix combining max(-min(a, b), c) if a or b uses the neg modifier
2982 - aco/tests: extend the optimize.add_lshl tests to GFX8
2983 - aco: add a new Operand flag to indicate that is 24-bit
2984 - aco: allow to use the range analysis UB in emit_{sop2,vop2}_instruction()
2985 - aco: optimize v_add+s_lshl to v_mad_u32_u24 on GFX6-8
2986 - aco: optimize v_add+v_lshlrev to v_mad_u32_u24 on GFX6-8
2999 - radv/llvm,aco: always split typed vertex buffer loads on GFX6 and GFX10+
3001 - Revert "radv/llvm,aco: always split typed vertex buffer loads on GFX6 and GFX10+"
3021 - aco: implement fragment shading rate
3024 - aco: implement a workaround for gl_FragCoord.z with VRS on GFX10.3
3052 - aco: fix creating the dest vector when 16-bit vertex fetches are splitted
3053 - radv/llvm,aco: always split typed vertex buffer loads on GFX6 and GFX10+
3070 - aco: fix inserting expcnt for MIMG on GFX6
3098 - radv,aco: fix shifting input VGPRs for the LS VGPR init bug on GFX9
3156 - aco/optimizer: Only set scc_needed when it is actually needed.
3157 - aco/optimizer: Propagate scc_needed label through p_wqm.
3158 - aco: Fix NGG GS assert failure from the WG scan.
3159 - aco: Skip TCS s_barrier when VS outputs are not stored in the LDS.
3160 - aco: Use program->num_waves as maximum in scheduler.
3161 - aco: Keep live-though variables and constants spilled.
3162 - aco: Spill more optimally before loops.
3163 - aco: Note if rasterization can start early.
3164 - aco: Wait for stores when NGG or legacy VS can finish early.
3169 - aco: Disallow LSHS temp-only I/O when VS output is written indirectly.
3170 - aco: Fix LDS statistics of tess control shaders.
3183 - aco: Fix -Wshadow warnings
3184 - aco/tests: Fix -Wshadow warnings
3185 - aco/tests: Fix -Wunused warnings in release mode
3187 - radv,aco: Compile with -Wshadow when available
3190 - aco: Annotate switch fallthroughs
3191 - radv,aco: Compile with -Wimplicit-fallthrough when available
3193 - aco/ra: Add policy parameter to select implementation details for testing
3194 - aco/tests: Fix GFX10_3 being printed as gfx11
3195 - aco/tests: Allow specifiying the test subvariant in setup_cs
3196 - aco/tests: Fix deadlock for too large test lists
3197 - aco: Add tests for subdword register allocation
3198 - aco/ra: Add some documentation
3199 - aco/ra: Fix register allocation for subdword operands
3200 - aco/ra: Avoid redundant RegisterFile copies in get_reg_impl
3201 - aco: Fix vector::reserve() being called with the wrong size
3263 - aco: Initialize ds_state.front.writeMask.