===========================
LLVM Branch Weight Metadata
===========================

.. contents::
   :local:

Introduction
============

Branch Weight Metadata represents the likelihood of each branch being taken
(see :doc:`BlockFrequencyTerminology`). Metadata is attached to a
``TerminatorInst`` as an ``MDNode`` of the ``MD_prof`` kind. The first operand
is always an ``MDString`` node with the string "branch_weights". The number of
operands depends on the terminator type.

Branch weights may be fetched from a profiling file, or generated based on the
`__builtin_expect`_ instruction.

All weights are represented as unsigned 32-bit values, where a higher value
indicates a greater chance of being taken.

Supported Instructions
======================

``BranchInst``
^^^^^^^^^^^^^^

Metadata is only assigned to conditional branches. There are two extra
operands, one for the true branch and one for the false branch.

.. code-block:: none

  !0 = metadata !{
    metadata !"branch_weights",
    i32 <TRUE_BRANCH_WEIGHT>,
    i32 <FALSE_BRANCH_WEIGHT>
  }

``SwitchInst``
^^^^^^^^^^^^^^

Branch weights are assigned to every case (including the ``default`` case,
which is always case #0).

.. code-block:: none

  !0 = metadata !{
    metadata !"branch_weights",
    i32 <DEFAULT_BRANCH_WEIGHT>
    [ , i32 <CASE_BRANCH_WEIGHT> ... ]
  }

``IndirectBrInst``
^^^^^^^^^^^^^^^^^^

Branch weights are assigned to every destination.

.. code-block:: none

  !0 = metadata !{
    metadata !"branch_weights",
    i32 <LABEL_BRANCH_WEIGHT>
    [ , i32 <LABEL_BRANCH_WEIGHT> ... ]
  }

``CallInst``
^^^^^^^^^^^^

Calls may have branch weight metadata containing the execution count of
the call. It is currently used in SamplePGO mode only, to augment the
block and entry counts, which may not be accurate with sampling.

.. code-block:: none

  !0 = metadata !{
    metadata !"branch_weights",
    i32 <CALL_BRANCH_WEIGHT>
  }

Other
^^^^^

Other terminator instructions are not allowed to contain Branch Weight Metadata.

.. _\__builtin_expect:

Built-in ``expect`` Instructions
================================

The ``__builtin_expect(long exp, long c)`` instruction provides branch
prediction information. The return value is the value of ``exp``.

It is especially useful in conditional statements. Currently Clang supports two
conditional statements:

``if`` statement
^^^^^^^^^^^^^^^^

The ``exp`` parameter is the condition. The ``c`` parameter is the expected
comparison value. If it is equal to 1 (true), the condition is likely to be
true; otherwise the condition is likely to be false. For example:

.. code-block:: c++

  if (__builtin_expect(x > 0, 1)) {
    // This block is likely to be taken.
  }

``switch`` statement
^^^^^^^^^^^^^^^^^^^^

The ``exp`` parameter is the value being switched on. The ``c`` parameter is
the expected value. If the expected value does not appear in the list of cases,
the ``default`` case is assumed to be the likely one.

.. code-block:: c++

  switch (__builtin_expect(x, 5)) {
  default: break;
  case 0:  // ...
  case 3:  // ...
  case 5:  // This case is likely to be taken.
  }

CFG Modifications
=================

Branch Weight Metadata is not proof against CFG changes. If a terminator's
operands are changed, the branch weights should be updated to match. Otherwise,
misoptimizations may occur due to incorrect branch prediction information.

Function Entry Counts
=====================

To allow comparing different functions during inter-procedural analysis and
optimization, ``MD_prof`` nodes can also be assigned to a function definition.
The first operand is a string indicating the name of the associated counter.

Currently, one counter is supported: "function_entry_count". The second operand
is a 64-bit counter that indicates the number of times that this function was
invoked (in the case of instrumentation-based profiles). In the case of
sampling-based profiles, this operand is an approximation of how many times
the function was invoked.

For example, in the code below, the instrumentation for function ``foo()``
indicates that it was called 2,590 times at runtime.

.. code-block:: llvm

  define i32 @foo() !prof !1 {
    ret i32 0
  }
  !1 = !{!"function_entry_count", i64 2590}

If "function_entry_count" has more than 2 operands, the remaining operands are
the GUIDs of the functions that need to be imported by ThinLTO. These are only
set by sampling-based profiles. They are needed because the sampling-based
profile was collected on a binary that had already imported and inlined these
functions, and we need to ensure the IR matches in the ThinLTO backends for
profile annotation. The reason we cannot annotate this on the callsite is that
a callsite annotation can only go down one level in the call chain. For cases
such as ``foo_in_a_cc()->bar_in_b_cc()->baz_in_c_cc()``, we need to go down two
levels in the call chain to import both ``bar_in_b_cc`` and ``baz_in_c_cc``.
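
As a rough illustration of the extra operands, the sketch below shows what such
a node might look like for ``foo_in_a_cc``. The entry count and the two
trailing ``i64`` values are made up for this example; they stand in for the
GUIDs of ``bar_in_b_cc`` and ``baz_in_c_cc`` and are not real hashes.

.. code-block:: llvm

  ; Hypothetical sampling-based annotation: operand 1 is the approximate
  ; entry count, operands 2 and 3 are placeholder GUIDs of functions that
  ; ThinLTO should import.
  define i32 @foo_in_a_cc() !prof !1 {
    ret i32 0
  }
  !1 = !{!"function_entry_count", i64 2590, i64 1311768467463790320, i64 81985529216486895}

Listing both callees on the function itself is what lets the importer see the
whole chain at once, rather than only the first level reachable from a callsite.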