Lines Matching refs:aco
48 - New compiler backend "ACO" for RADV (RADV_PERFTEST=aco)
58 - radv/aco Jedi Fallen Order hair rendering buggy
484 - amd: Build aco only if radv is enabled
782 - aco: Initial commit of independent AMD compiler
783 - radv/aco: Setup alternate path in RADV to support the experimental
786 - radv/aco: enable VK_EXT_shader_demote_to_helper_invocation
788 - aco: only emit waitcnt on loop continues if we there was some load or
793 - radv/aco: Don't lower subtractions
794 - aco: call nir_opt_algebraic_late() exhaustively
796 - aco: re-use existing phi instruction when lowering boolean phis
797 - aco: don't reorder instructions in order to lower boolean phis
798 - aco: don't combine minmax3 if there is a neg or abs modifier in
800 - aco: ensure that uniform booleans are computed in WQM if their uses
802 - aco: refactor value numbering
803 - aco: restrict scheduling depending on max_waves
804 - aco: only skip RAR dependencies if the variable is killed somewhere
805 - aco: add can_reorder flags to load_ubo and load_constant
806 - aco: don't schedule instructions through depending VMEM instructions
807 - aco: Lower to CSSA
808 - aco: improve live variable analysis
809 - aco: remove potential critical edge on loops.
810 - aco: fix live-range splits of phis
811 - aco: fix transitive affinities of spilled variables
812 - aco: don't insert the exec mask into set of live-out variables when
814 - aco: consider loop_exit blocks like merge blocks, even if they have
816 - aco: don't add interferences between spilled phi operands
817 - aco: simplify calculation of target register pressure when spilling
818 - aco: ensure that spilled VGPR reloads are done after p_logical_start
819 - aco: omit linear VGPRs as spill variables
820 - aco: always set scratch_offset in startpgm
821 - aco: implement VGPR spilling
823 - aco: fix immediate offset for spills if scratch is used
824 - aco: only use single-dword loads/stores for spilling
825 - aco: fix accidental reordering of instructions when scheduling
826 - aco: workaround Tonga/Iceland hardware bug
827 - aco: fix invalid access on Pseudo_instructions
828 - aco: preserve kill flag on moved operands during RA
829 - aco: don't split live-ranges of linear VGPRs
830 - aco: fix a couple of value numbering issues
2539 - android: aco: fix undefined template 'std::__1::array' build errors
2541 - android: aco: add support for libmesa_aco
2543 - android: aco: fix Lower to CSSA
2554 - aco: Cleanup insert_before_logical_end
2798 - aco: run nir_lower_int64() before nir_lower_idiv()
2799 - aco: implement 64-bit ineg
2800 - aco: fix GFX9 opcode for v_xad_u32
2801 - aco: fix v_subrev_co_u32_e64 opcode
2802 - aco: fix opcode for s_mul_hi_i32
2803 - aco: check for duplicate opcode numbers
2804 - radv/aco: actually disable ACO when unsupported
2805 - aco,radv/aco: get disassembly for release builds if requested
2806 - aco: store printed backend IR in binary
2807 - radv/aco: return a correct name and description for the backend IR
2808 - aco,radv: rename record_llvm_ir/llvm_ir_string to record_ir/ir_string
2809 - aco: don't CSE v_readlane_b32/v_readfirstlane_b32
2810 - aco: CSE readlane/readfirstlane/permute/reduce with the same exec
2812 - aco: set loop_info::has_discard for demotes
2813 - aco: don't remove the loop exec mask in transition_to_Exact()
2814 - radv/aco,aco: set lower_fmod
2816 - aco: fix load_constant with multiple arrays
2819 - aco: move s_andn2_b64 instructions out of the p_discard_if
2820 - aco: enable nir_opt_sink
2821 - aco: Allow literals on VOP3 instructions.
2822 - aco: Assemble opsel in VOP3 instructions.
2823 - aco: workaround GFX10 0x3f branch bug
2824 - aco: pad code with s_code_end on GFX10
2825 - aco: Initial work to avoid GFX10 hazards.
2826 - aco: Use the VOP3-only add/sub GFX10 instructions if needed.
2827 - aco: Have s_waitcnt_vscnt write to NULL.
2828 - radv/aco: disable NGG when ACO is used
2829 - aco/gfx10: fix inline uniform blocks
2830 - aco/gfx10: disable GFX9 1D texture workarounds
2831 - aco: rework scratch resource code
2832 - aco: update print_ir
2835 - aco: don't apply sgprs/constants to read/write lane instructions
2836 - aco: use can_accept_constant in valu_can_accept_literal
2837 - aco: readfirstlane vgpr pointers in convert_pointer_to_64_bit()
2838 - aco: implement divergent vulkan_resource_index
2839 - aco: don't use p_as_uniform for vgpr sampler/image indices
2840 - aco: fix scheduling with s_memtime/s_memrealtime
2841 - aco: don't CSE s_memtime
2842 - aco: emit_split_vector() s_memtime results
2844 - aco: use nir_lower_idiv_precise
2845 - aco: run opt_algebraic in a loop
2846 - aco: small stage corrections
2847 - aco: fix 64-bit p_extract_vector on 32-bit p_create_vector
2848 - aco: create load_lds/store_lds helpers
2849 - aco: fix sparse store_lds()
2850 - aco: properly combine additions into ds_write2_b64/ds_read2_b64
2851 - aco: use ds_read2_b64/ds_write2_b64
2852 - aco: add a few missing checks in value numbering
2853 - aco: keep can_reorder/barrier when combining addition into SMEM
2854 - aco: add missing bld.scc()
2855 - Revert "aco: only emit waitcnt on loop continues if we there was some
2858 - aco: increase accuracy of SGPR limits
2859 - aco: take LDS into account when calculating num_waves
2860 - aco: Fix reductions on GFX10.
2861 - aco: Remove dead code in reduction lowering.
2862 - aco: try to group together VMEM loads of the same resource
2863 - aco: a couple loop handling fixes for GFX10 hazard pass
2864 - aco: rename README to README.md
2865 - aco: fix new_demand calculation for first instructions
2866 - aco: fix shuffle with uniform operands
2867 - aco: fix read_invocation with VGPR lane index
2868 - aco: don't propagate vgprs into v_readlane/v_writelane
2869 - aco: don't combine literals into v_cndmask_b32/v_subb/v_addc
2870 - aco: fix 64-bit fsign with 0
2871 - aco: propagate p_wqm on an image_sample's coordinate p_create_vector
2872 - aco: fix i2i64
2873 - aco: add v_nop in between exec write and VMEM/DS/FLAT
3280 - aco: Set +wavefrontsize64 for LLVM disassembler in GFX10 wave64 mode.
3281 - aco: Add missing GFX10 specific fields and some README notes.
3282 - aco: Support GFX10 SMEM in aco_assembler.
3283 - aco: Support GFX10 VINTRP in aco_assembler.
3284 - aco: Support GFX10 DS in aco_assembler.
3285 - aco: Support GFX10 MUBUF in aco_assembler.
3287 - aco: Link ACO with amd/common.
3288 - aco: Support GFX10 MTBUF in aco_assembler.
3289 - aco: Support GFX10 MIMG and GFX9 D16 in aco_assembler.
3290 - aco: Fix GFX9 FLAT, SCRATCH, GLOBAL instructions, add GFX10 support.
3291 - aco: Support GFX10 EXP in aco_assembler.
3292 - aco: Support GFX10 VOP3 and VOP1 as VOP3 in aco_assembler.
3293 - aco: Set GFX10 DLC bit properly.
3294 - aco: Use ac_get_sampler_dim, delete duplicate code.
3295 - aco: Set GFX10 dimensionality on the instructions that need it.
3296 - aco: Support subvector loops in aco_assembler.
3297 - aco: Fix VS input VGPRs on GFX10.
3298 - aco: Fix s_dcache_wb on GFX10.
3299 - aco: Add extra assertion for number of FS input VGPRs.
3300 - aco: Clean up usages of PhysReg::reg from aco_assembler.
3301 - aco/gfx10: Wait for pending SMEM stores before loads
3302 - aco/gfx10: Fix PS exports for SPI_SHADER_32_AR.
3303 - aco/gfx10: Update constant addresses in fix_branches_gfx10.
3304 - aco/gfx10: Add notes about some GFX10 hazards.
3305 - aco/gfx10: Mitigate VcmpxPermlaneHazard.
3306 - aco/gfx10: Mitigate VcmpxExecWARHazard.
3307 - aco/gfx10: Mitigate SMEMtoVectorWriteHazard.
3308 - aco/gfx10: Mitigate LdsBranchVmemWARHazard.
3309 - aco/gfx10: Fix mitigation of VMEMtoScalarWriteHazard.
3310 - aco: Refactor hazard mitigations, separate pass for GFX10.
3313 - aco: Implement subgroup shuffle in GFX10 wave64 mode.
3314 - aco: Introduce vgpr_limit to keep track of available VGPRs.