14 type safety, low-level operations, flexibility, and the capability of
15 representing 'all' high-level languages cleanly. It is the common code
23 forms: as an in-memory compiler IR, as an on-disk bitcode representation
24 (suitable for fast loading by a Just-In-Time compiler), and as a human
32 The LLVM representation aims to be light-weight and low-level while
35 high-level ideas may be cleanly mapped to it (similar to how
45 Well-Formedness
46 ---------------
53 .. code-block:: llvm
78 '``[%@][-a-zA-Z$._][-a-zA-Z$._0-9]*``'. Identifiers that require other
107 .. code-block:: llvm
113 .. code-block:: llvm
119 .. code-block:: llvm
131 #. Unnamed temporaries are numbered sequentially (using a per-function
145 ----------------
154 .. code-block:: llvm
190 -------------
266 "one definition rule" --- "ODR"). Such languages can use the
269 types are otherwise the same as their non-``odr`` versions.
281 -------------------
290 "``ccc``" - The C calling convention
296 "``fastcc``" - The fast calling convention
306 "``coldcc``" - The cold calling convention
315 "``cc 10``" - GHC convention
326 - On *X86-32* only supports up to 4 bit type parameters. No
328 - On *X86-64* only supports up to 10 bit type parameters and 6
334 "``cc 11``" - The HiPE calling convention
336 the `High-Performance Erlang
341 convention and defines no callee-saved registers. The calling
349 "``webkit_jscc``" - WebKit's JavaScript calling convention
354 "``anyregcc``" - Dynamic calling convention for code patching
361 "``preserve_mostcc``" - The `PreserveMost` calling convention
365 uses a different set of caller/callee-saved registers. This alleviates the
367 call in the caller. If the arguments are passed in callee-saved registers,
369 apply for values returned in callee-saved registers.
371 - On X86-64 the callee preserves all general purpose registers, except for
372 R11. R11 can be used as a scratch register. Floating-point registers
378 another function and therefore only needs to preserve the caller-saved
381 convention in terms of caller/callee-saved registers, but they are used for
392 supports X86-64, but the intention is to support more architectures in the
394 "``preserve_allcc``" - The `PreserveAll` calling convention
399 caller/callee-saved registers. This removes the burden of saving and
401 the arguments are passed in callee-saved registers, then they will be
403 returned in callee-saved registers.
405 - On X86-64 the callee preserves all general purpose registers, except for
407 all floating-point registers (XMMs/YMMs).
415 "``cxx_fast_tlscc``" - The `CXX_FAST_TLS` calling convention for access functions
416 Clang generates an access function to access C++-style TLS. The access
419 a few TLS IR variables, each access will be lowered to a platform-specific
428 caller/callee-saved registers.
433 - On X86-64 the callee preserves all general purpose registers, except for
435 "``swiftcc``" - This calling convention is used for the Swift language.
436 - On X86-64 RCX and R8 are available for additional integer returns, and
438 - On iOS platforms, we use the AAPCS-VFP calling convention.
439 "``cc <n>``" - Numbered convention
441 target-specific calling conventions to be used. Target specific
444 More calling conventions can be added/defined on an as-needed basis, to
445 support Pascal conventions or any other well-known target-independent
451 -----------------
456 "``default``" - Default style
463 "``hidden``" - Hidden style
469 "``protected``" - Protected style
481 -------------------
502 ---------------------------
506 variable). Not all targets support thread-local variables. Optionally, a
519 Thread-Local Storage <http://people.redhat.com/drepper/tls.pdf>`_ for
527 For platforms without linker support of ELF TLS model, the -femulated-tls
533 ---------------
542 .. code-block:: llvm
552 ----------------
555 instead of run-time.
567 optimization, allowing the global data to be placed in the read-only
595 A global variable may be declared to reside in a target-specific
619 to over-align the global if the global has an assigned section. In this
643 .. code-block:: llvm
649 .. code-block:: llvm
653 The following example defines a thread-local global with the
656 .. code-block:: llvm
663 ---------
743 -------
786 -------
805 -------
836 Note that the Mach-O platform doesn't support COMDATs and ELF only supports
842 .. code-block:: llvm
854 .. code-block:: llvm
869 The contents and size of this object may be used during link-time to determine
878 .. code-block:: llvm
893 in individual sections (e.g. when `-data-sections` or `-function-sections`
899 --------------
922 --------------------
935 .. code-block:: llvm
948 value should be zero-extended to the extent required by the target's
952 value should be sign-extended to the extent required by the target's
953 ABI (which is usually 32 bits) by the caller (for a parameter) or
957 in a special target-dependent fashion while emitting code for
961 target-specific.
978 makes a target-specific assumption.
1075 passed in is non-null, or the callee must ensure that the returned pointer
1076 is non-null.
1091 non-null and non-dereferenceable (up to ``<n>`` bytes) at the same
1092 time. All non-null pointers tagged with
1110 pointer-sized alloca. At the call site, the actual argument that corresponds
1120 on a parameter is not ABI-compatible with one which does not.
1129 --------------------------------
1134 .. code-block:: llvm
1139 <builtin-gc-strategies>` and any provided by loaded plugins. Specifying a GC
1148 -----------
1153 language-specific runtime metadata with specific functions and make it
1159 index -1. This implies that the IR symbol points just past the end of
1163 .. code-block:: llvm
1169 .. code-block:: llvm
1172 %a = getelementptr inbounds i32, i32* %0, i32 -1
1190 -------------
1194 function hot-patching and instrumentation.
1208 .. code-block:: llvm
1216 .. code-block:: llvm
1229 --------------------
1237 ----------------
1245 An attribute group is a module-level object. To use an attribute group, an
1253 .. code-block:: llvm
1255 ; Target-independent attributes:
1258 ; Target-dependent attributes:
1259 attributes #1 = { "no-sse" }
1261 ; Function @f has attributes: alwaysinline, alignstack=4, and "no-sse".
1267 -------------------
1278 .. code-block:: llvm
1292 least a given number of bytes (or null). Its arguments are zero-indexed
1305 recognized as a built-in function, even though the function's declaration
1311 computing edge weights, basic blocks post-dominated by a cold
1316 made control-dependent on any additional values. We call such operations
1321 this function should not be made control-dependent on additional values.
1323 calls to this intrinsic cannot be made control-dependent on additional
1329 may treat such calls as though the target is non-convergent.
1350 jump-instruction table at code-generation time, and that all address-taken
1352 appropriate jump-instruction-table function pointer. Note that this creates
1354 on function-pointer identity can break. So, any function annotated with
1363 function. This can have very system-specific consequences.
1366 a built-in function. LLVM will retain the original call and not replace it
1367 with equivalent code based on the semantics of the built-in function, unless
1393 red zone, even if the target-specific ABI normally permits it.
1426 ``"patchable-function"``
1433 * ``"prologue-short-redirect"`` - This style of patchable
1439 fully changed via an atomic compare-and-swap instruction.
1441 enough NOP, LLVM can and will try to re-purpose an existing
1445 ``"prologue-short-redirect"`` is currently only supported on
1446 x86-64.
1449 inter-procedural optimizations. All of the semantic effects the
1487 loads and stores from objects pointed to by its pointer-typed arguments,
1518 smashing protector. It is in the form of a "canary" --- a random value
1524 - Character arrays larger than ``ssp-buffer-size`` (default 8).
1525 - Aggregates containing character arrays larger than ``ssp-buffer-size``.
1526 - Calls to alloca() with variable sizes or constant sizes greater than
1527 ``ssp-buffer-size``.
1545 (``>= ssp-buffer-size``) are closest to the stack protector.
1547 (``< ssp-buffer-size``) are 2nd closest to the protector.
1561 - Arrays of any size and type
1562 - Aggregates containing an array of any size and type.
1563 - Calls to alloca().
1564 - Local variables that have had their address taken.
1571 (``>= ssp-buffer-size``) are closest to the stack protector.
1573 (``< ssp-buffer-size``) are 2nd closest to the protector.
1591 the ELF x86-64 ABI, but it can be disabled for some compilation
1597 Operand Bundles
1598 ---------------
1600 Note: operand bundles are a work in progress, and they should be
1603 Operand bundles are tagged sets of SSA values that can be associated
1610 operand bundle set ::= '[' operand bundle (, operand bundle )* ']'
1611 operand bundle ::= tag '(' [ bundle operand ] (, bundle operand )* ')'
1612 bundle operand ::= SSA value
1615 Operand bundles are **not** part of a function's signature, and a
1617 of operand bundles. This reflects the fact that the operand bundles
1621 Operand bundles are a generic mechanism intended to support
1622 runtime-introspection-like functionality for managed languages. While
1623 the exact semantics of an operand bundle depend on the bundle tag,
1624 there are certain limitations to how much the presence of an operand
1626 are described as the semantics of an "unknown" operand bundle. As
1627 long as the behavior of an operand bundle is describable within these
1629 operand bundle to not miscompile programs containing it.
1631 - The bundle operands for an unknown operand bundle escape in unknown
1633 - Calls and invokes with operand bundles have unknown read / write
1637 - An operand bundle at a call site cannot change the implementation
1638 of the called function. Inter-procedural optimizations work as
1641 More specific types of operand bundles are described below.
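
As a minimal sketch of the grammar given earlier, the call site in the
following function carries a single operand bundle. The tag
``"example-tag"`` and the function names are hypothetical and chosen only
for illustration; an unknown tag such as this one is subject to the
conservative rules listed above.

.. code-block:: llvm

    declare void @callee(i32)

    define void @caller(i32 %x) {
    entry:
      ; One bundle, tagged "example-tag", with two bundle operands.
      ; The bundle is not part of @callee's signature.
      call void @callee(i32 %x) [ "example-tag"(i32 %x, i32 42) ]
      ret void
    }
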
1645 Deoptimization Operand Bundles
1648 Deoptimization operand bundles are characterized by the ``"deopt"``
1649 operand bundle tag. These operand bundles represent an alternate
1652 specified call site. There can be at most one ``"deopt"`` operand
1657 From the compiler's perspective, deoptimization operand bundles make
1661 operand bundles do not capture their operands except during
1666 operand bundles. Just like inlining through a normal call site
1668 through a call site with a deoptimization operand bundle needs to
1674 .. code-block:: llvm
1690 .. code-block:: llvm
1707 Funclet Operand Bundles
1710 Funclet operand bundles are characterized by the ``"funclet"``
1711 operand bundle tag. These operand bundles indicate that a call site
1713 ``"funclet"`` operand bundle attached to a call site and it must have
1714 exactly one bundle operand.
1717 `description in the EH doc\ <ExceptionHandling.html#wineh-constraints>`_),
1722 * has a ``"funclet"`` bundle whose operand is not the most-recently-entered
1723 not-yet-exited funclet EH pad.
1725 Similarly, if no funclet EH pads have been entered-but-not-yet-exited,
1728 GC Transition Operand Bundles
1731 GC transition operand bundles are characterized by the
1732 ``"gc-transition"`` operand bundle tag. These operand bundles mark a
1742 Module-Level Inline Assembly
1743 ----------------------------
1745 Modules may contain "module-level inline asm" blocks, which correspond
1746 to the GCC "file scope inline asm" blocks. These blocks are internally
1750 .. code-block:: llvm
1752 module asm "inline asm code goes here"
1753 module asm "more can go here"
1755 The strings can contain any character by escaping non-printable
1765 -----------
1771 .. code-block:: llvm
1776 separated by the minus sign character ('-'). Each specification starts
1782 Specifies that the target lays out data in big-endian form. That is,
1786 Specifies that the target lays out data in little-endian form. That
1793 must be a multiple of 8 bits. If omitted, the natural stack
1822 * ``o``: Mach-O mangling: Private symbols get ``L`` prefix. Other
1824 * ``w``: Windows COFF prefix: Similar to Mach-O, but stdcall and fastcall
1829 This specifies a set of native integer widths for the target CPU in
1830 bits. For example, it might contain ``n32`` for 32-bit PowerPC,
1831 ``n32:64`` for PowerPC 64, or ``n8:16:32:64`` for X86-64. Elements of
1844 - ``E`` - big endian
1845 - ``p:64:64:64`` - 64-bit pointers with 64-bit alignment.
1846 - ``p[n]:64:64:64`` - Other address spaces are assumed to be the
1848 - ``S0`` - natural stack alignment is unspecified
1849 - ``i1:8:8`` - i1 is 8-bit (byte) aligned
1850 - ``i8:8:8`` - i8 is 8-bit (byte) aligned
1851 - ``i16:16:16`` - i16 is 16-bit aligned
1852 - ``i32:32:32`` - i32 is 32-bit aligned
1853 - ``i64:32:64`` - i64 has ABI alignment of 32 bits but preferred
1854 alignment of 64 bits
1855 - ``f16:16:16`` - half is 16-bit aligned
1856 - ``f32:32:32`` - float is 32-bit aligned
1857 - ``f64:64:64`` - double is 64-bit aligned
1858 - ``f128:128:128`` - quad is 128-bit aligned
1859 - ``v64:64:64`` - 64-bit vector is 64-bit aligned
1860 - ``v128:128:128`` - 128-bit vector is 128-bit aligned
1861 - ``a:0:64`` - aggregates are 64-bit aligned
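
To illustrate how individual specifications combine into one string, the
following sketch resembles the layout commonly used for x86-64 ELF targets.
It is an example only; the exact string for a given target should be taken
from that target, not from this sketch.

.. code-block:: llvm

    ; e            little-endian
    ; m:e          ELF name mangling
    ; i64:64       i64 is 64-bit aligned
    ; f80:128      x86_fp80 is 128-bit aligned
    ; n8:16:32:64  native integer widths of 8, 16, 32, and 64 bits
    ; S128         natural stack alignment of 128 bits
    target datalayout = "e-m:e-i64:64-f80:128-n8:16:32:64-S128"
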
1886 mid-level optimizers to improve code, and this only works if it matches
1888 that does not embed this target-specific detail into the IR. If you
1897 -------------
1902 .. code-block:: llvm
1904 target triple = "x86_64-apple-macosx10.7.0"
1907 by the minus sign character ('-'). The canonical forms are:
1911 ARCHITECTURE-VENDOR-OPERATING_SYSTEM
1912 ARCHITECTURE-VENDOR-OPERATING_SYSTEM-ENVIRONMENT
1916 command line with the ``-mtriple`` command line option.
1921 ----------------------
1928 - A pointer value is associated with the addresses associated with any
1930 - An address of a global variable is associated with the address range
1932 - The result value of an allocation instruction is associated with the
1934 - A null pointer in the default address-space is associated with no
1936 - An integer constant other than zero or a pointer value returned from
1945 - A pointer value formed from a ``getelementptr`` operation is *based*
1946 on the first value operand of the ``getelementptr``.
1947 - The result value of a ``bitcast`` is *based* on the operand of the
1949 - A pointer value formed by an ``inttoptr`` is *based* on all pointer
1952 - The "*based* on" relationship is transitive.
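
A short sketch of the *based on* rules, using hypothetical function and value
names: ``%q`` is based on ``%p`` through ``getelementptr``, and ``%r`` is
based on ``%q`` through ``bitcast``, so by transitivity ``%r`` is also based
on ``%p``.

.. code-block:: llvm

    define i8 @based_on_example(i32* %p) {
    entry:
      %q = getelementptr i32, i32* %p, i64 1   ; %q is based on %p
      %r = bitcast i32* %q to i8*              ; %r is based on %q, hence on %p
      %v = load i8, i8* %r
      ret i8 %v
    }
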
1960 operand type of a ``store`` similarly only indicates the size and
1963 Consequently, type-based alias analysis, aka TBAA, aka
1964 ``-fstrict-aliasing``, is not applicable to general unadorned LLVM IR.
1966 which specialized optimization passes may use to implement type-based
1972 ------------------------
1979 operations relative to non-volatile operations. This is not Java's
1980 "volatile" and has no cross-thread synchronization behavior.
1982 IR-level volatile loads and stores cannot safely be optimized into
1985 target-legal volatile load/store instructions.
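
As an illustration, the two volatile loads in the following sketch must both
be performed, in order; an optimizer may not merge them into a single load.
The global ``@device_reg`` and the function name are hypothetical.

.. code-block:: llvm

    @device_reg = global i32 0

    define i32 @read_twice() {
    entry:
      %a = load volatile i32, i32* @device_reg
      %b = load volatile i32, i32* @device_reg   ; not merged with %a
      %sum = add i32 %a, %b
      ret i32 %sum
    }
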
1991 this holds for an l-value of volatile primitive type with native
2000 --------------------------------------
2004 platform-specific ways to create them, and we define LLVM IR's behavior
2009 We define a *happens-before* partial order as the least partial order
2012 - Is a superset of single-thread program order, and
2013 - When a *synchronizes-with* ``b``, includes an edge from ``a`` to
2014 ``b``. *Synchronizes-with* pairs are introduced by platform-specific
2019 Note that program order does not introduce *happens-before* edges
2023 loads/read-modify-writes, etc.) R reads a series of bytes written by
2025 stores/read-modify-writes, memcpy, etc.). For the purposes of this
2031 - If write\ :sub:`1` happens before write\ :sub:`2`, and
2034 - If R\ :sub:`byte` happens before write\ :sub:`3`, then
2039 - If R is volatile, the result is target-dependent. (Volatile is
2042 like normal memory. It does not generally provide cross-thread
2044 - Otherwise, if there is no write to the same byte that happens before
2046 - Otherwise, if R\ :sub:`byte` may see exactly one write,
2048 - Otherwise, if R is atomic, and all the writes R\ :sub:`byte` may
2052 - Otherwise R\ :sub:`byte` returns ``undef``.
2062 is required for single-threaded execution: introducing a store to a byte
2071 ----------------------------------
2089 The set of values that can be read is governed by the happens-before
2092 Java's non-volatile shared variables. This ordering cannot be
2093 specified for read-modify-write operations; it is not strong enough
2099 happens-before order. There is no guarantee that the modification
2102 read-modify-write operation (:ref:`cmpxchg <i_cmpxchg>` and
2109 ``monotonic``-ally by one thread, and other threads ``monotonic``-ally
2115 *synchronizes-with* edge may be formed with a ``release`` operation.
2120 operation, it *synchronizes-with* that operation. (This isn't a
2131 sequentially-consistent operations on all addresses, which is
2132 consistent with the *happens-before* partial order and with the
2134 sequentially-consistent read sees the last preceding write to the
2147 Fast-Math Flags
2148 ---------------
2150 LLVM IR floating-point binary ops (:ref:`fadd <i_fadd>`,
2156 No NaNs - Allow optimizations to assume the arguments and result are not
2161 No Infs - Allow optimizations to assume the arguments and result are not
2162 +/-Inf. Such optimizations are required to retain defined behavior over
2163 +/-Inf, but the value of the result is undefined.
2166 No Signed Zeros - Allow optimizations to treat the sign of a zero
2170 Allow Reciprocal - Allow optimizations to use the reciprocal of an
2174 Fast - Allow algebraically equivalent transformations that may
2180 Use-list Order Directives
2181 -------------------------
2183 Use-list directives encode the in-memory order of each use-list, allowing the
2184 order to be recreated. ``<order-indexes>`` is a comma-separated list of
2186 value's use-list is immediately sorted by these indexes.
2188 Use-list directives may appear at function scope or global scope. They are not
2193 ``uselistorder_bb`` can be used to reorder their use-lists from outside their
2200 uselistorder <ty> <value>, { <order-indexes> }
2201 uselistorder_bb @function, %block { <order-indexes> }
2227 ---------------
2240 .. code-block:: llvm
2260 ---------
2278 -------------
2285 type is a void type or first class type --- except for :ref:`label <t_label>`
2294 ...where '``<parameter list>``' is a comma-separated list of type
2303 …---------------------------------+----------------------------------------------------------------…
2305 …---------------------------------+----------------------------------------------------------------…
2307 …---------------------------------+----------------------------------------------------------------…
2309 …---------------------------------+----------------------------------------------------------------…
2311 …---------------------------------+----------------------------------------------------------------…
2316 -----------------
2338 bit to 2\ :sup:`23`\ -1 (about 8 million) can be specified.
2352 +----------------+------------------------------------------------+
2353 | ``i1`` | a single-bit integer. |
2354 +----------------+------------------------------------------------+
2355 | ``i32`` | a 32-bit integer. |
2356 +----------------+------------------------------------------------+
2358 +----------------+------------------------------------------------+
2365 .. list-table::
2366 :header-rows: 1
2368 * - Type
2369 - Description
2371 * - ``half``
2372 - 16-bit floating point value
2374 * - ``float``
2375 - 32-bit floating point value
2377 * - ``double``
2378 - 64-bit floating point value
2380 * - ``fp128``
2381 - 128-bit floating point value (112-bit mantissa)
2383 * - ``x86_fp80``
2384 - 80-bit floating point value (X87)
2386 * - ``ppc_fp128``
2387 - 128-bit floating point value (two 64-bit values)
2396 return values, load and store, and bitcast. User-specified MMX
2397 instructions are represented as intrinsic or asm calls with arguments
2419 numbered address space where the pointed-to object resides. The default
2420 address space is number zero. The semantics of non-zero address spaces
2421 are target-specific.
2434 +-------------------------+------------------------------------------------------------------------…
2436 +-------------------------+------------------------------------------------------------------------…
2438 +-------------------------+------------------------------------------------------------------------…
2440 +-------------------------+------------------------------------------------------------------------…
2467 +-------------------+--------------------------------------------------+
2468 | ``<4 x i32>`` | Vector of 4 32-bit integer values. |
2469 +-------------------+--------------------------------------------------+
2470 | ``<8 x float>`` | Vector of 8 32-bit floating-point values. |
2471 +-------------------+--------------------------------------------------+
2472 | ``<2 x i64>`` | Vector of 2 64-bit integer values. |
2473 +-------------------+--------------------------------------------------+
2474 | ``<4 x i64*>`` | Vector of 4 pointers to 64-bit integer values. |
2475 +-------------------+--------------------------------------------------+
2560 +------------------+--------------------------------------+
2561 | ``[40 x i32]`` | Array of 40 32-bit integer values. |
2562 +------------------+--------------------------------------+
2563 | ``[41 x i32]`` | Array of 41 32-bit integer values. |
2564 +------------------+--------------------------------------+
2565 | ``[4 x i8]`` | Array of 4 8-bit integer values. |
2566 +------------------+--------------------------------------+
2570 +-----------------------------+----------------------------------------------------------+
2571 | ``[3 x [4 x i32]]`` | 3x4 array of 32-bit integer values. |
2572 +-----------------------------+----------------------------------------------------------+
2574 +-----------------------------+----------------------------------------------------------+
2575 | ``[2 x [3 x [4 x i16]]]`` | 2x3x4 array of 16-bit integer values. |
2576 +-----------------------------+----------------------------------------------------------+
2581 single-dimension 'variable sized array' addressing can be implemented in
2604 between the elements. In non-packed structs, padding between field types
2624 …------------------------------+-------------------------------------------------------------------…
2626 …------------------------------+-------------------------------------------------------------------…
2628 …------------------------------+-------------------------------------------------------------------…
2630 …------------------------------+-------------------------------------------------------------------…
2652 +--------------+-------------------+
2654 +--------------+-------------------+
2665 ----------------
2678 decimal value of a floating-point constant. For example, the
2689 The one non-intuitive notation for constants is the hexadecimal form of
2701 double are represented using the 16-digit form shown above (which
2705 double, and there are three forms of long double. The 80-bit format used
2707 128-bit format used by PowerPC (two adjacent doubles) is represented by
2708 ``0xM`` followed by 32 hexadecimal digits. The IEEE 128-bit format is
2711 The IEEE 16-bit format (half precision) is represented by ``0xH``
2712 followed by 4 hexadecimal digits. All hexadecimal formats are big-endian
2720 -----------------
2740 constants may also be represented as a double-quoted string using the ``c``
2745 less-than/greater-than's (``<>``)). For example:
2764 --------------------------------------
2768 (link-time) constants. These constants are explicitly referenced when
2773 .. code-block:: llvm
2782 ----------------
2786 bit-pattern. Undefined values may be of any type (other than '``label``'
2794 .. code-block:: llvm
2807 .. code-block:: llvm
2812 %A = -1
2825 all the bits of the '``undef``' operand to the '``or``' could be set,
2826 allowing the '``or``' to be folded to -1.
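
Written out with explicit types, the two safe folds look as follows. This is
a sketch with hypothetical function names; the comments state what an
optimizer is permitted to do, not what any particular pass does.

.. code-block:: llvm

    define i32 @or_with_undef(i32 %x) {
    entry:
      %a = or i32 %x, undef     ; may be folded to -1 (all undef bits set)
      ret i32 %a
    }

    define i32 @and_with_undef(i32 %x) {
    entry:
      %b = and i32 %x, undef    ; may be folded to 0 (all undef bits clear)
      ret i32 %b
    }
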
2828 .. code-block:: llvm
2847 allowed to assume that the '``undef``' operand could be the same as
2850 .. code-block:: llvm
2881 .. code-block:: llvm
2891 allowed to have an arbitrary bit-pattern. This means that the ``%A``
2902 .. code-block:: llvm
2904 a: store undef -> %X
2905 b: store %X -> undef
2919 -------------
2932 - Values other than :ref:`phi <i_phi>` nodes depend on their operands.
2933 - :ref:`Phi <i_phi>` nodes depend on the operand corresponding to
2935 - Function arguments depend on the corresponding actual argument values
2937 - :ref:`Call <i_call>` instructions depend on the :ref:`ret <i_ret>`
2939 - :ref:`Invoke <i_invoke>` instructions depend on the
2940 :ref:`ret <i_ret>`, :ref:`resume <i_resume>`, or exception-throwing
2942 - Non-volatile loads and stores depend on the most recent stores to all
2946 - An instruction with externally visible side effects depends on the
2950 - An instruction *control-depends* on a :ref:`terminator
2955 - Additionally, an instruction also *control-depends* on a terminator
2959 - Dependence is transitive.
2967 .. code-block:: llvm
2989 store volatile i32 0, i32* @g ; This is control-dependent on %cmp, so
2996 ; control-dependent on %cmp, so this
3014 ; control-equivalent to %end, so this is
3015 ; well-defined (ignoring earlier undefined
3021 -------------------------
3029 This value only has defined behavior when used as an operand to the
3032 undefined behavior --- though, again, comparison against null is ok, and
3040 as the operand to an inline assembly, but that is target specific.
3045 --------------------
3155 ----------------------------
3157 LLVM supports inline assembler expressions (as opposed to :ref:`Module-Level
3160 instructions to emit), a list of operand constraints (stored as a string), a
3161 flag that indicates whether or not the inline asm expression has side effects,
3162 and a flag indicating whether the function containing the asm needs to align its
3168 be used, where ``MODIFIER`` is a target-specific annotation for how to print the
3169 operand (See :ref:`inline-asm-modifiers`).
3175 disabled -- even when emitting a ``.s`` file -- and thus must contain assembly
3178 LLVM's support for inline asm is modeled closely on the requirements of Clang's
3179 GCC-compatible inline-asm support. Thus, the feature-set and the constraint and
3180 modifier codes listed here are similar or identical to those in GCC's inline asm
3183 while most constraint letters are passed through as-is by Clang, some get
3189 .. code-block:: llvm
3191 i32 (i32) asm "bswap $0", "=r,r"
3193 Inline assembler expressions may **only** be used as the callee operand
3197 .. code-block:: llvm
3199 %X = call i32 asm "bswap $0", "=r,r"(i32 %Y)
3205 .. code-block:: llvm
3207 call void asm sideeffect "eieio", ""()
3211 x86, yet will not contain code that does that alignment within the asm.
3212 The compiler should make conservative assumptions about what the asm
3216 .. code-block:: llvm
3218 call void asm alignstack "eieio", ""()
3220 Inline asms also support using non-standard assembly dialects. The
3222 the inline asm is using the Intel dialect. Currently, ATT and Intel are
3225 .. code-block:: llvm
3227 call void asm inteldialect "eieio", ""()
3233 Inline Asm Constraint String
3236 The constraint list is a comma-separated string, each element containing one or
3240 operand will be chosen, and it will be made available to assembly template
3251 - Register constraint. This is either a register class, or a fixed physical
3254 - Memory constraint. This kind of constraint is for use with an instruction
3255 taking a memory operand. Different constraints allow for different addressing
3257 - Immediate value constraint. This kind of constraint is for an integer or other
3259 various target-specific constraints allow the selection of a value in the
3266 indicates that the assembly will write to this operand, and the operand will
3267 then be made available as a return value of the ``asm`` expression. Output
3277 "early-clobber" output. Marking an output as "early-clobber" ensures that LLVM
3284 Input constraints do not have a prefix -- just the constraint codes. Each input
3286 permitted for the asm to write to any input register or memory location (unless
3294 take up a position in the asm template numbering as is usual -- they will simply
3300 It is permitted to tie an input to an "early-clobber" output. In that case, no
3301 *other* input may share the same register as the input tied to the early-clobber
3309 type operand provided as input, the input value will be split into multiple
3310 registers, and all of them passed to the inline asm.
3316 instructions, this is not an appropriate way to support them. (e.g. the 32-bit
3317 SparcV8 has a 64-bit load instruction that takes a single 32-bit register. The
3319 feature of inline asm would not be useful to support that.)
3322 to the second register of a two-register operand (e.g. MIPS ``L``, ``M``, and
3334 (which goes after the "``=``" in case of an output). This indicates that the asm
3339 "output" only in that the asm is expected to write to the contents of the input
3348 input, after the provided inline asm. (It's not clear what value this
3349 functionality provides, compared to writing the store explicitly after the asm
3358 consume an input operand, nor generate an output. Clobbers cannot use any of the
3359 general constraint code letters -- they may use only explicit register
3362 memory locations -- not only the memory pointed to by a declared indirect
3371 followed by two letters (e.g. "``^wc``"), or "``{``" register-name "``}``"
3379 compatibility with the translation of GCC inline asm coming from clang.
3382 inline asm constraint list:
3394 Putting those together, you might have a two operand constraint string like
3395 ``"rm|r,ri|rm"``. This indicates that if operand 0 is ``r`` or ``m``, then
3396 operand 1 may be one of ``r`` or ``i``. If operand 0 is ``r``, then operand 1
3397 may be one of ``r`` or ``m``. However, operands 0 and 1 cannot both be of type ``m``.
3415 GCC. LLVM's support is often implemented on an 'as-needed' basis, to support C
3416 inline asm code which was supported by GCC. A mismatch in behavior between LLVM
3421 - ``r``: A register in the target's general purpose register class.
3422 - ``m``: A memory address operand. It is target-specific what addressing modes
3424 or register + immediate offset (of some target-specific size).
3425 - ``i``: An integer constant (of target-specific width). Allows either a simple
3427 - ``n``: An integer constant -- *not* including relocatable values.
3428 - ``s``: An integer constant, but allowing *only* relocatable values.
3429 - ``X``: Allows an operand of any kind, no constraint whatsoever. Typically
3430 useful to pass a label for an asm branch or call.
3432 .. FIXME: but that surely isn't actually okay to jump out of an asm
3435 - ``{register-name}``: Requires exactly the named physical register.
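
A minimal sketch combining a register-class constraint with a named physical
register. It assumes an x86-64 target and the default AT&T dialect; the
function name is hypothetical.

.. code-block:: llvm

    define i32 @copy_via_ecx(i32 %x) {
    entry:
      ; "=r"    : output, operand $0, any general purpose register
      ; "{ecx}" : input, operand $1, must be placed in the ECX register
      %r = call i32 asm "movl $1, $0", "=r,{ecx}"(i32 %x)
      ret i32 %r
    }
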
3437 Other constraints are target-specific:
3441 - ``z``: An immediate integer 0. Outputs ``WZR`` or ``XZR``, as appropriate.
3442 - ``I``: An immediate integer valid for an ``ADD`` or ``SUB`` instruction,
3444 - ``J``: An immediate integer that, when negated, is valid for an ``ADD`` or
3445 ``SUB`` instruction, i.e. -1 to -4095 with optional left shift by 12.
3446 - ``K``: An immediate integer that is valid for the 'bitmask immediate 32' of a
3447 logical instruction like ``AND``, ``EOR``, or ``ORR`` with a 32-bit register.
3448 - ``L``: An immediate integer that is valid for the 'bitmask immediate 64' of a
3449 logical instruction like ``AND``, ``EOR``, or ``ORR`` with a 64-bit register.
3450 - ``M``: An immediate integer for use with the ``MOV`` assembly alias on a
3451 32-bit register. This is a superset of ``K``: in addition to the bitmask
3454 - ``N``: An immediate integer for use with the ``MOV`` assembly alias on a
3455 64-bit register. This is a superset of ``L``.
3456 - ``Q``: Memory address operand must be in a single register (no
3459 - ``r``: A 32 or 64-bit integer register (W* or X*).
3460 - ``w``: A 32, 64, or 128-bit floating-point/SIMD register.
3461 - ``x``: A lower 128-bit floating-point/SIMD register (``V0`` to ``V15``).
3465 - ``r``: A 32 or 64-bit integer register.
3466 - ``[0-9]v``: The 32-bit VGPR register, number 0-9.
3467 - ``[0-9]s``: The 32-bit SGPR register, number 0-9.
3472 - ``Q``, ``Um``, ``Un``, ``Uq``, ``Us``, ``Ut``, ``Uv``, ``Uy``: Memory address
3473 operand. Treated the same as operand ``m``, at the moment.
3477 - ``j``: An immediate integer between 0 and 65535 (valid for ``MOVW``)
3478 - ``I``: An immediate integer valid for a data-processing instruction.
3479 - ``J``: An immediate integer between -4095 and 4095.
3480 - ``K``: An immediate integer whose bitwise inverse is valid for a
3481 data-processing instruction. (Can be used with template modifier "``B``" to
3483 - ``L``: An immediate integer whose negation is valid for a data-processing
3486 - ``M``: A power of two or an integer between 0 and 32.
3487 - ``N``: Invalid immediate constraint.
3488 - ``O``: Invalid immediate constraint.
3489 - ``r``: A general-purpose 32-bit integer register (``r0-r15``).
3490 - ``l``: In Thumb2 mode, low 32-bit GPR registers (``r0-r7``). In ARM mode, same
3492 - ``h``: In Thumb2 mode, a high 32-bit GPR register (``r8-r15``). In ARM mode,
3494 - ``w``: A 32, 64, or 128-bit floating-point/SIMD register: ``s0-s31``,
3495 ``d0-d31``, or ``q0-q15``.
3496 - ``x``: A 32, 64, or 128-bit floating-point/SIMD register: ``s0-s15``,
3497 ``d0-d7``, or ``q0-q3``.
3498 - ``t``: A floating-point/SIMD register, only supports 32-bit values:
3499 ``s0-s31``.
3503 - ``I``: An immediate integer between 0 and 255.
3504 - ``J``: An immediate integer between -255 and -1.
3505 - ``K``: An immediate integer between 0 and 255, with optional left-shift by
3507 - ``L``: An immediate integer between -7 and 7.
3508 - ``M``: An immediate integer which is a multiple of 4 between 0 and 1020.
3509 - ``N``: An immediate integer between 0 and 31.
3510 - ``O``: An immediate integer which is a multiple of 4 between -508 and 508.
3511 - ``r``: A low 32-bit GPR register (``r0-r7``).
3512 - ``l``: A low 32-bit GPR register (``r0-r7``).
3513 - ``h``: A high GPR register (``r8-r15``).
3514 - ``w``: A 32, 64, or 128-bit floating-point/SIMD register: ``s0-s31``,
3515 ``d0-d31``, or ``q0-q15``.
3516 - ``x``: A 32, 64, or 128-bit floating-point/SIMD register: ``s0-s15``,
3517 ``d0-d7``, or ``q0-q3``.
3518 - ``t``: A floating-point/SIMD register, only supports 32-bit values:
3519 ``s0-s31``.
3524 - ``o``, ``v``: A memory address operand, treated the same as constraint ``m``,
3526 - ``r``: A 32 or 64-bit register.
3530 - ``r``: An 8 or 16-bit register.
3534 - ``I``: An immediate signed 16-bit integer.
3535 - ``J``: An immediate integer zero.
3536 - ``K``: An immediate unsigned 16-bit integer.
3537 - ``L``: An immediate 32-bit integer, where the lower 16 bits are 0.
3538 - ``N``: An immediate integer between -65535 and -1.
3539 - ``O``: An immediate signed 15-bit integer.
3540 - ``P``: An immediate integer between 1 and 65535.
3541 - ``m``: A memory address operand. In MIPS-SE mode, allows a base address
3542 register plus 16-bit immediate offset. In MIPS mode, just a base register.
3543 - ``R``: A memory address operand. In MIPS-SE mode, allows a base address
3544 register plus a 9-bit signed offset. In MIPS mode, the same as constraint
3546 - ``ZC``: A memory address operand, suitable for use in a ``pref``, ``ll``, or
3548 - ``r``, ``d``, ``y``: A 32 or 64-bit GPR register.
3549 - ``f``: A 32 or 64-bit FPU register (``F0-F31``), or a 128-bit MSA register
3550 (``W0-W31``). In the case of MSA registers, it is recommended to use the ``w``
3552 - ``c``: A 32-bit or 64-bit GPR register suitable for indirect jump (always
3554 - ``l``: The ``lo`` register, 32 or 64-bit.
3555 - ``x``: Invalid.
3559 - ``b``: A 1-bit integer register.
3560 - ``c`` or ``h``: A 16-bit integer register.
3561 - ``r``: A 32-bit integer register.
3562 - ``l`` or ``N``: A 64-bit integer register.
3563 - ``f``: A 32-bit float register.
3564 - ``d``: A 64-bit float register.
3569 - ``I``: An immediate signed 16-bit integer.
3570 - ``J``: An immediate unsigned 16-bit integer, shifted left 16 bits.
3571 - ``K``: An immediate unsigned 16-bit integer.
3572 - ``L``: An immediate signed 16-bit integer, shifted left 16 bits.
3573 - ``M``: An immediate integer greater than 31.
3574 - ``N``: An immediate integer that is an exact power of 2.
3575 - ``O``: The immediate integer constant 0.
3576 - ``P``: An immediate integer constant whose negation is a signed 16-bit
3578 - ``es``, ``o``, ``Q``, ``Z``, ``Zy``: A memory address operand, currently
3580 - ``r``: A 32 or 64-bit integer register.
3581 - ``b``: A 32 or 64-bit integer register, excluding ``R0`` (that is:
3582 ``R1-R31``).
3583 - ``f``: A 32 or 64-bit float register (``F0-F31``), or when QPX is enabled, a
3584 128 or 256-bit QPX register (``Q0-Q31``; aliases the ``F`` registers).
3585 - ``v``: For ``4 x f32`` or ``4 x f64`` types, when QPX is enabled, a
3586 128 or 256-bit QPX register (``Q0-Q31``), otherwise a 128-bit
3587 altivec vector register (``V0-V31``).
3592 - ``y``: Condition register (``CR0-CR7``).
3593 - ``wc``: An individual CR bit in a CR register.
3594 - ``wa``, ``wd``, ``wf``: Any 128-bit VSX vector register, from the full VSX
3595 register set (overlapping both the floating-point and vector register files).
3596 - ``ws``: A 32 or 64-bit floating point register, from the full VSX register
3601 - ``I``: An immediate 13-bit signed integer.
3602 - ``r``: A 32-bit integer register.
3606 - ``I``: An immediate unsigned 8-bit integer.
3607 - ``J``: An immediate unsigned 12-bit integer.
3608 - ``K``: An immediate signed 16-bit integer.
3609 - ``L``: An immediate signed 20-bit integer.
3610 - ``M``: An immediate integer 0x7fffffff.
3611 - ``Q``: A memory address operand with a base address and a 12-bit immediate
3613 - ``R``: A memory address operand with a base address, a 12-bit immediate
3615 - ``S``: A memory address operand with a base address and a 20-bit immediate
3617 - ``T``: A memory address operand with a base address, a 20-bit immediate
3619 - ``r`` or ``d``: A 32, 64, or 128-bit integer register.
3620 - ``a``: A 32, 64, or 128-bit integer address register (excludes R0, which in an
3622 - ``h``: A 32-bit value in the high part of a 64-bit data register
3623 (LLVM-specific)
3624 - ``f``: A 32, 64, or 128-bit floating point register.
3628 - ``I``: An immediate integer between 0 and 31.
3629 - ``J``: An immediate integer between 0 and 64.
3630 - ``K``: An immediate signed 8-bit integer.
3631 - ``L``: An immediate integer, 0xff or 0xffff or (in 64-bit mode only)
3633 - ``M``: An immediate integer between 0 and 3.
3634 - ``N``: An immediate unsigned 8-bit integer.
3635 - ``O``: An immediate integer between 0 and 127.
3636 - ``e``: An immediate 32-bit signed integer.
3637 - ``Z``: An immediate 32-bit unsigned integer.
3638 - ``o``, ``v``: Treated the same as ``m``, at the moment.
3639 - ``q``: An 8, 16, 32, or 64-bit register which can be accessed as an 8-bit
3640 ``l`` integer register. On X86-32, this is the ``a``, ``b``, ``c``, and ``d``
3641 registers, and on X86-64, it is all of the integer registers.
3642 - ``Q``: An 8, 16, 32, or 64-bit register which can be accessed as an 8-bit
3644 - ``r`` or ``l``: An 8, 16, 32, or 64-bit integer register.
3645 - ``R``: An 8, 16, 32, or 64-bit "legacy" integer register -- one which has
3647 - ``f``: A 32, 64, or 80-bit '387 FPU stack pseudo-register.
3648 - ``y``: A 64-bit MMX register, if MMX is enabled.
3649 - ``x``: If SSE is enabled: a 32 or 64-bit scalar operand, or 128-bit vector
3650 operand in a SSE register. If AVX is also enabled, can also be a 256-bit
3651 vector operand in an AVX register. If AVX-512 is also enabled, can also be a
3652 512-bit vector operand in an AVX512 register. Otherwise, an error.
3653 - ``Y``: The same as ``x``, if *SSE2* is enabled, otherwise an error.
3654 - ``A``: Special case: allocates EAX first, then EDX, for a single operand (in
3655 32-bit mode, a 64-bit integer operand will get split into two registers). It
3656 is not recommended to use this constraint, as in 64-bit mode, the 64-bit
3657 operand will get allocated only to RAX -- if two 32-bit operands are needed,
3658 you're better off splitting it yourself, before passing it to the asm
3663 - ``r``: A 32-bit integer register.
3666 .. _inline-asm-modifiers:
3668 Asm template argument modifiers
3671 In the asm template string, modifiers can be used on the operand reference, like
3675 GCC. LLVM's support is often implemented on an 'as-needed' basis, to support C
3676 inline asm code which was supported by GCC. A mismatch in behavior between LLVM
3679 Target-independent:
3681 - ``c``: Print an immediate integer constant unadorned, without
3682 the target-specific immediate punctuation (e.g. no ``$`` prefix).
3683 - ``n``: Negate and print immediate integer constant unadorned, without the
3684 target-specific immediate punctuation (e.g. no ``$`` prefix).
3685 - ``l``: Print as an unadorned label, without the target-specific label
3690 - ``w``: Print a GPR register with a ``w*`` name instead of ``x*`` name. E.g.,
3692 - ``x``: Print a GPR register with an ``x*`` name. (This is the default, anyhow.)
3693 - ``b``, ``h``, ``s``, ``d``, ``q``: Print a floating-point/SIMD register with a
3699 - ``r``: No effect.
3703 - ``a``: Print an operand as an address (with ``[`` and ``]`` surrounding a
3705 - ``P``: No effect.
3706 - ``q``: No effect.
3707 - ``y``: Print a VFP single-precision register as an indexed double (e.g. print
3709 - ``B``: Bitwise invert and print an immediate integer constant without ``#``
3711 - ``L``: Print the low 16 bits of an immediate integer constant.
3712 - ``M``: Print as a register set suitable for ldm/stm. Also prints *all*
3714 - ``Q``: Print the low-order register of a register-pair, or the low-order
3715 register of a two-register operand.
3716 - ``R``: Print the high-order register of a register-pair, or the high-order
3717 register of a two-register operand.
3718 - ``H``: Print the second register of a register-pair. (On a big-endian system,
3719 ``H`` is equivalent to ``Q``, and on a little-endian system, ``H`` is equivalent
3723 of a two-register operand.
3725 - ``e``: Print the low doubleword register of a NEON quad register.
3726 - ``f``: Print the high doubleword register of a NEON quad register.
3727 - ``m``: Print the base register of a memory operand without the ``[`` and ``]``
3732 - ``L``: Print the second register of a two-register operand. Requires that it
3738 - ``I``: Print the letter 'i' if the operand is an integer constant, otherwise
3747 - ``X``: Print an immediate integer as hexadecimal
3748 - ``x``: Print the low 16 bits of an immediate integer as hexadecimal.
3749 - ``d``: Print an immediate integer as decimal.
3750 - ``m``: Subtract one and print an immediate integer as decimal.
3751 - ``z``: Print $0 if an immediate zero, otherwise print normally.
3752 - ``L``: Print the low-order register of a two-register operand, or prints the
3753 address of the low-order word of a double-word memory operand.
3755 .. FIXME: L seems to be missing memory operand support.
3757 - ``M``: Print the high-order register of a two-register operand, or prints the
3758 address of the high-order word of a double-word memory operand.
3760 .. FIXME: M seems to be missing memory operand support.
3762 - ``D``: Print the second register of a two-register operand, or prints the
3763 second word of a double-word memory operand. (On a big-endian system, ``D`` is
3764 equivalent to ``L``, and on a little-endian system, ``D`` is equivalent to
3766 - ``w``: No effect. Provided for compatibility with GCC which requires this
3767 modifier in order to print MSA registers (``W0-W31``) with the ``f``
3772 - ``r``: No effect.
3776 - ``L``: Print the second register of a two-register operand. Requires that it
3782 - ``I``: Print the letter 'i' if the operand is an integer constant, otherwise
3784 - ``y``: For a memory operand, prints formatter for a two-register X-form
3785 instruction. (Currently always prints ``r0,OPERAND``).
3786 - ``U``: Prints 'u' if the memory operand is an update form, and nothing
3789 - ``X``: Prints 'x' if the memory operand is an indexed form. (NOTE: LLVM does
3794 - ``r``: No effect.
3799 target-independent modifiers.
3803 - ``c``: Print an unadorned integer or symbol name. (The latter is
3804 target-specific behavior for this typically target-independent modifier).
3805 - ``A``: Print a register name with a '``*``' before it.
3806 - ``b``: Print an 8-bit register name (e.g. ``al``); do nothing on a memory
3807 operand.
3808 - ``h``: Print the upper 8-bit register name (e.g. ``ah``); do nothing on a
3809 memory operand.
3810 - ``w``: Print the 16-bit register name (e.g. ``ax``); do nothing on a memory
3811 operand.
3812 - ``k``: Print the 32-bit register name (e.g. ``eax``); do nothing on a memory
3813 operand.
3814 - ``q``: Print the 64-bit register name (e.g. ``rax``), if 64-bit registers are
3815 available, otherwise the 32-bit register name; do nothing on a memory operand.
3816 - ``n``: Negate and print an unadorned integer, or, for operands other than an
3817 immediate integer (e.g. a relocatable symbol expression), print a '-' before
3818 the operand. (The behavior for relocatable symbol expressions is a
3819 target-specific behavior for this typically target-independent modifier)
3820 - ``H``: Print a memory reference with additional offset +8.
3821 - ``P``: Print a memory reference or operand for use as the argument of a call
3822 instruction. (E.g. omit ``(rip)``, even though it's PC-relative.)
3829 Inline Asm Metadata
3832 The call instructions that wrap inline asm nodes may have a
3836 error reporting mechanisms. This allows a front-end to correlate backend
3837 errors that occur with inline asm back to the source code that produced
3840 .. code-block:: llvm
3842 call void asm sideeffect "something bad", ""(), !srcloc !42
3846 It is up to the front-end to make sense of the magic numbers it places
3848 will use the one that corresponds to the line of the asm that the error
3858 code generator. One example application of metadata is source-level
3866 .. _metadata-string:
3869 -----------------------------------
3872 contain any character by escaping non-printable characters with
3879 their operand. For example:
3881 .. code-block:: llvm
3887 .. code-block:: llvm
3899 .. code-block:: llvm
3906 .. code-block:: llvm
3913 .. code-block:: llvm
3920 .. code-block:: llvm
3929 .. _specialized-metadata:
3952 .. code-block:: llvm
3955 isOptimized: true, flags: "-O2", runtimeVersion: 2,
3973 .. code-block:: llvm
3988 .. code-block:: llvm
3997 .. code-block:: llvm
4013 refers to a tuple; the first operand is the return type, while the rest are the
4014 types of the formal arguments in order. If the first operand is ``null``, that
4017 .. code-block:: llvm
4031 .. code-block:: llvm
4040 .. code-block:: llvm
4084 derived types <DIDerivedTypeMember>` that reference the ODR-type in their
4092 .. code-block:: llvm
4096 !2 = !DIEnumerator(name: "NegEightKind", value: -8)
4103 .. code-block:: llvm
4133 :ref:`DICompositeType`. ``count: -1`` indicates an empty array.
4135 .. code-block:: llvm
4139 !2 = !DISubrange(count: -1) ; empty array.
4149 .. code-block:: llvm
4153 !2 = !DIEnumerator(name: "NegEightKind", value: -8)
4162 .. code-block:: llvm
4175 .. code-block:: llvm
4184 .. code-block:: llvm
4193 .. code-block:: llvm
4222 .. code-block:: llvm
4247 .. code-block:: llvm
4264 .. code-block:: llvm
4279 .. code-block:: llvm
4289 the ``arg:`` field is set to non-zero, then this variable is a subprogram
4293 .. code-block:: llvm
4311 - ``DW_OP_deref`` dereferences the working expression.
4312 - ``DW_OP_plus, 93`` adds ``93`` to the working expression.
4313 - ``DW_OP_bit_piece, 16, 8`` specifies the offset and size (``16`` and ``8``
4316 .. code-block:: llvm
4326 ``DIObjCProperty`` nodes represent Objective-C property nodes.
4328 .. code-block:: llvm
4339 .. code-block:: llvm
4349 defining a function-like macro, and the ``value`` field is the token-string
4352 .. code-block:: llvm
4365 .. code-block:: llvm
4382 .. code-block:: llvm
4399 from multiple front-ends is handled conservatively.
4421 For each group of three, the first operand gives the byte offset of a
4425 .. code-block:: llvm
4441 noalias memory-access sets. This means that some collection of memory access
4442 instructions (loads, stores, memory-accessing calls, etc.) that carry
4463 self-reference can be used to create globally unique domain names. A
4469 self-reference can be used to create globally unique scope names. A metadata
4475 .. code-block:: llvm
4516 floating-point numbers ``a`` and ``b``, without being equal to one
4517 of them, then ``ulp(x) = |b - a|``, otherwise ``ulp(x)`` is the
4518 distance between the two non-equal finite floating-point numbers
4524 .. code-block:: llvm
4528 .. _range-metadata:
4540 - The type must match the type loaded by the instruction.
4541 - The pair ``a,b`` represents the range ``[a,b)``.
4542 - Both ``a`` and ``b`` are constants.
4543 - The range is allowed to wrap.
4544 - The range should not represent the full or empty set. That is,
4548 they must be non-contiguous.
4552 .. code-block:: llvm
4555 %b = load i8, i8* %y, align 1, !range !1 ; Can only be 255 (-1), 0 or 1
4558 unwind label %lpad, !range !3 ; Can only be -2, -1, 3, 4 or 5
4563 !3 = !{ i8 -2, i8 0, i8 3, i8 6 }
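The wrapping rule for ``[a,b)`` pairs can be modelled in a few lines of C. This is a minimal sketch for ``i8`` values only; the helper names ``in_range_i8`` and ``in_ranges_i8`` are hypothetical, and the pair ``(255, 2)`` used in the note below is an assumed operand chosen to produce the "255 (-1), 0 or 1" set mentioned in the comments above.

.. code-block:: c

   #include <stdbool.h>
   #include <stdint.h>

   /* True if the 8-bit value x lies in the half-open, possibly wrapping
    * range [a, b). Assumes a != b, because a range may not describe the
    * full or the empty set. Hypothetical helper for illustration. */
   bool in_range_i8(uint8_t x, uint8_t a, uint8_t b)
   {
       /* Subtraction modulo 256 measures the distance travelled from a. */
       return (uint8_t)(x - a) < (uint8_t)(b - a);
   }

   /* True if x lies in any of n_pairs ranges stored as consecutive
    * (a, b) pairs, mirroring the flat operand list of the metadata node. */
   bool in_ranges_i8(uint8_t x, const uint8_t *pairs, int n_pairs)
   {
       for (int i = 0; i < n_pairs; i++)
           if (in_range_i8(x, pairs[2 * i], pairs[2 * i + 1]))
               return true;
       return false;
   }

With the pair ``(255, 2)`` the accepted values are 255, 0 and 1. With the four operands of ``!3`` read as the pairs ``(-2, 0)`` and ``(3, 6)``, the accepted values are -2, -1, 3, 4 and 5.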
4591 .. code-block:: llvm
4597 per-loop metadata. Any operands after the first operand can be treated
4598 as user-defined metadata. For example the ``llvm.loop.unroll.count``
4601 .. code-block:: llvm
4612 used to control per-loop vectorization and interleaving parameters such as
4618 which contains information about loop-carried memory dependencies can be helpful
4625 The first operand is the string ``llvm.loop.interleave.count`` and the
4626 second operand is an integer specifying the interleave count. For
4629 .. code-block:: llvm
4641 first operand is the string ``llvm.loop.vectorize.enable`` and the second operand
4642 is a bit. If the bit operand value is 1 vectorization is enabled. A value of
4645 .. code-block:: llvm
4654 operand is the string ``llvm.loop.vectorize.width`` and the second
4655 operand is an integer specifying the width. For example:
4657 .. code-block:: llvm
4680 first operand is the string ``llvm.loop.unroll.count`` and the second
4681 operand is a positive integer specifying the unroll factor. For
4684 .. code-block:: llvm
4694 This metadata disables loop unrolling. The metadata has a single operand
4697 .. code-block:: llvm
4705 operand which is the string ``llvm.loop.unroll.runtime.disable``. For example:
4707 .. code-block:: llvm
4716 at compile time. The metadata has a single operand which is the string
4719 .. code-block:: llvm
4727 metadata has a single operand which is the string ``llvm.loop.unroll.full``.
4730 .. code-block:: llvm
4738 of enabling loop-invariant code motion (LICM). The metadata has a single operand
4741 .. code-block:: llvm
4754 loop. The first operand is the string ``llvm.loop.distribute.enable`` and the
4755 second operand is a bit. If the bit operand value is 1 distribution is
4758 .. code-block:: llvm
4804 .. code-block:: llvm
4822 .. code-block:: llvm
4855 the optimizer that every ``load`` and ``store`` to the same pointer operand
4862 .. code-block:: llvm
4900 this. These flags are in the form of key / value pairs --- much like a
4901 dictionary --- making it easy for any subsystem that cares about a flag to
4907 - The first element is a *behavior* flag, which specifies the behavior
4911 - The second element is a metadata string that is a unique ID for the
4914 - The third element is the value of the flag.
4925 .. list-table::
4926 :header-rows: 1
4927 :widths: 10 90
4929 * - Value
4930 - Behavior
4932 * - 1
4933 - **Error**
4937 * - 2
4938 - **Warning**
4940 operand for the flag from the first module being linked.
4942 * - 3
4943 - **Require**
4952 * - 4
4953 - **Override**
4958 * - 5
4959 - **Append**
4962 * - 6
4963 - **AppendUnique**
4974 .. code-block:: llvm
4986 - Metadata ``!0`` has the ID ``!"foo"`` and the value '1'. The behavior
4990 - Metadata ``!1`` has the ID ``!"bar"`` and the value '37'. The
4994 - Metadata ``!2`` has the ID ``!"qux"`` and the value '42'. The
4998 - Metadata ``!3`` has the ID ``!"qux"`` and the value:
5008 Objective-C Garbage Collection Module Flags Metadata
5009 ----------------------------------------------------
5011 On the Mach-O platform, Objective-C stores metadata about garbage
5018 The Objective-C garbage collection module flags metadata consists of the
5019 following key-value pairs:
5021 .. list-table::
5022 :header-rows: 1
5023 :widths: 30 70
5025 * - Key
5026 - Value
5028 * - ``Objective-C Version``
5029 - **[Required]** --- The Objective-C ABI version. Valid values are 1 and 2.
5031 * - ``Objective-C Image Info Version``
5032 - **[Required]** --- The version of the image info section. Currently
5035 * - ``Objective-C Image Info Section``
5036 - **[Required]** --- The section to place the metadata. Valid values are
5037 ``"__OBJC, __image_info, regular"`` for Objective-C ABI version 1, and
5039 Objective-C ABI version 2.
5041 * - ``Objective-C Garbage Collection``
5042 - **[Required]** --- Specifies whether garbage collection is supported or
5046 * - ``Objective-C GC Only``
5047 - **[Optional]** --- Specifies that only garbage collection is supported.
5049 ``Objective-C Garbage Collection`` flag have the value 2.
5053 - If a module with ``Objective-C Garbage Collection`` set to 0 is
5054 merged with a module with ``Objective-C Garbage Collection`` set to
5056 ``Objective-C Garbage Collection`` flag set to 0.
5057 - A module with ``Objective-C Garbage Collection`` set to 0 cannot be
5058 merged with a module with ``Objective-C GC Only`` set to 6.
5061 --------------------------------------------
5080 !{ !"-lz" },
5081 !{ !"-framework", !"Cocoa" } } }
5095 ----------------------------------
5098 options that it was compiled with (in a compiler-independent way) to prevent
5104 flags metadata, using the following key-value pairs:
5106 .. list-table::
5107 :header-rows: 1
5108 :widths: 30 70
5110 * - Key
5111 - Value
5113 * - short_wchar
5114 - * 0 --- sizeof(wchar_t) == 4
5115 * 1 --- sizeof(wchar_t) == 2
5117 * - short_enum
5118 - * 0 --- Enums are at least as large as an ``int``.
5119 * 1 --- Enums are stored in the smallest integer type which can
5144 -----------------------------------
5152 .. code-block:: llvm
5177 --------------------------------------------
5191 -------------------------------------------
5193 .. code-block:: llvm
5204 If the third field is present, non-null, and points to a global variable
5211 -------------------------------------------
5213 .. code-block:: llvm
5224 If the third field is present, non-null, and points to a global variable
5240 -----------------------
5267 ret <type> <value> ; Return a value from a non-void function
5287 A function is not :ref:`well formed <wellformed>` if it has a non-void
5308 .. code-block:: llvm
5353 .. code-block:: llvm
5411 .. code-block:: llvm
5472 .. code-block:: llvm
5487 [operand bundles] to label <normal label> unwind label <exception label>
5506 as its first non-PHI instruction. The restrictions on the
5545 #. The optional :ref:`operand bundles <opbundles>` list.
5558 '``catch``' clauses in high-level languages that support them.
5568 .. code-block:: llvm
5604 (in-flight) exception whose unwinding was interrupted with a
5610 .. code-block:: llvm
5639 this operand may be the token ``none``.
5644 the `exception handling documentation\ <ExceptionHandling.html#wineh-constraints>`_.
5657 it must be both the first non-phi instruction and last instruction in the basic
5658 block. Therefore, it must be the only non-phi instruction in the block.
5663 .. code-block:: llvm
5700 The '``catchret``' instruction ends an existing (in-flight) exception whose
5707 If the specified ``catchpad`` is not the most-recently-entered not-yet-exited
5708 funclet pad (as described in the `EH documentation\ <ExceptionHandling.html#wineh-constraints>`_),
5714 .. code-block:: llvm
5743 If the specified ``cleanuppad`` is not the most-recently-entered not-yet-exited
5744 funclet pad (as described in the `EH documentation\ <ExceptionHandling.html#wineh-constraints>`_),
5751 `exception handling documentation\ <ExceptionHandling.html#wineh-constraints>`_.
5764 .. code-block:: llvm
5787 after a no-return function cannot be reached, and other facts.
5797 -----------------
5854 .. code-block:: llvm
5868 <result> = fadd [fast-math flags]* <ty> <op1>, <op2> ; yields ty:result
5886 instruction can also take any number of :ref:`fast-math flags <fastmath>`,
5893 .. code-block:: llvm
5945 .. code-block:: llvm
5947 <result> = sub i32 4, %var ; yields i32:result = 4 - %var
5948 <result> = sub i32 0, %val ; yields i32:result = -%val
5960 <result> = fsub [fast-math flags]* <ty> <op1>, <op2> ; yields ty:result
5981 This instruction can also take any number of :ref:`fast-math
5988 .. code-block:: llvm
5990 <result> = fsub float 4.0, %var ; yields float:result = 4.0 - %var
5991 <result> = fsub float -0.0, %val ; yields float:result = -%val
6030 (e.g. ``i32`` * ``i32`` -> ``i64``) is needed, the operands should be
6031 sign-extended or zero-extended as appropriate to the width of the full
6042 .. code-block:: llvm
6056 <result> = fmul [fast-math flags]* <ty> <op1>, <op2> ; yields ty:result
6074 This instruction can also take any number of :ref:`fast-math
6081 .. code-block:: llvm
6125 .. code-block:: llvm
6163 doing a 32-bit division of -2147483648 by -1.
6171 .. code-block:: llvm
6185 <result> = fdiv [fast-math flags]* <ty> <op1>, <op2> ; yields ty:result
6203 This instruction can also take any number of :ref:`fast-math
6210 .. code-block:: llvm
6252 .. code-block:: llvm
6299 occur, for example, by taking the remainder of a 32-bit division of
6300 -2147483648 by -1. (The remainder doesn't actually overflow, but this
6307 .. code-block:: llvm
6321 <result> = frem [fast-math flags]* <ty> <op1>, <op2> ; yields ty:result
6341 number of :ref:`fast-math flags <fastmath>`, which are optimization hints
6347 .. code-block:: llvm
6354 -------------------------
6356 Bitwise binary operators are used to do various forms of bit-twiddling
6378 The '``shl``' instruction returns the first operand shifted to the left
6399 value <poisonvalues>` if it shifts out any non-zero bits. If the
6409 .. code-block:: llvm
6432 operand shifted to the right a specified number of bits with zero fill.
6453 non-zero.
6458 .. code-block:: llvm
6463 <result> = lshr i8 -2, 1 ; yields i8:result = 0x7F
6465 …<result> = lshr <2 x i32> < i32 -2, i32 4>, < i32 1, i32 2> ; yields: result=<2 x i32> < i32 0x7…
6482 operand shifted to the right a specified number of bits with sign
6504 non-zero.
6509 .. code-block:: llvm
6514 <result> = ashr i8 -2, 1 ; yields i8:result = -1
6516 …<result> = ashr <2 x i32> < i32 -2, i32 4>, < i32 1, i32 3> ; yields: result=<2 x i32> < i32 -1,…
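The difference between the zero fill of ``lshr`` and the sign fill of ``ashr`` can be reproduced on 8-bit values in C. This is a sketch under stated assumptions: the names ``lshr8`` and ``ashr8`` are hypothetical, and the shift amount is assumed to be smaller than 8, since larger amounts do not produce a defined result in either instruction.

.. code-block:: c

   #include <stdint.h>

   /* Logical shift right: vacated high bits are filled with zero. */
   uint8_t lshr8(uint8_t value, unsigned amount)
   {
       return (uint8_t)(value >> amount);
   }

   /* Arithmetic shift right: vacated high bits are filled with copies of
    * the sign bit. Written with unsigned operations because >> on a
    * negative signed value is implementation-defined in C. */
   uint8_t ashr8(uint8_t value, unsigned amount)
   {
       uint8_t shifted = (uint8_t)(value >> amount);
       if (value & 0x80u)
           shifted |= (uint8_t)~(0xFFu >> amount); /* set the top bits */
       return shifted;
   }

For the bit pattern ``0xFE`` (``i8 -2``) shifted by 1, ``lshr8`` gives ``0x7F`` and ``ashr8`` gives ``0xFF`` (``-1``), in line with the scalar examples above.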
6546 +-----+-----+-----+
6548 +-----+-----+-----+
6550 +-----+-----+-----+
6552 +-----+-----+-----+
6554 +-----+-----+-----+
6556 +-----+-----+-----+
6561 .. code-block:: llvm
6595 +-----+-----+-----+
6597 +-----+-----+-----+
6599 +-----+-----+-----+
6601 +-----+-----+-----+
6603 +-----+-----+-----+
6605 +-----+-----+-----+
6645 +-----+-----+-----+
6647 +-----+-----+-----+
6649 +-----+-----+-----+
6651 +-----+-----+-----+
6653 +-----+-----+-----+
6655 +-----+-----+-----+
6660 .. code-block:: llvm
6665 <result> = xor i32 %V, -1 ; yields i32:result = ~%V
6668 -----------------
6671 target-independent manner. These instructions cover the element-access
6672 and vector-specific operations needed to process vectors effectively.
6674 sophisticated algorithms will want to use target-specific intrinsics to
6698 The first operand of an '``extractelement``' instruction is a value of
6699 :ref:`vector <t_vector>` type. The second operand is an index indicating
6713 .. code-block:: llvm
6738 The first operand of an '``insertelement``' instruction is a value of
6739 :ref:`vector <t_vector>` type. The second operand is a scalar value whose
6740 type must equal the element type of the first operand. The third operand
6755 .. code-block:: llvm
6787 The shuffle mask operand is required to be a constant vector with either
6794 across both of the vectors. The shuffle mask operand specifies, for each
6797 care") and the second operand may be undef if performing a shuffle from
6803 .. code-block:: llvm
6808 … <4 x i32> <i32 0, i32 1, i32 2, i32 3> ; yields <4 x i32> - Identity shuffle.
6815 --------------------
6841 The first operand of an '``extractvalue``' instruction is a value of
6848 - Since the value being indexed is not a pointer, the first index is
6850 - At least one index must be specified.
6851 - Not only struct indices but also array indices must be in bounds.
6862 .. code-block:: llvm
6887 The first operand of an '``insertvalue``' instruction is a value of
6888 :ref:`struct <t_struct>` or :ref:`array <t_array>` type. The second operand is
6889 a first-class value to insert. The following operands are constant
6905 .. code-block:: llvm
6914 ---------------------------------------
6916 A key design point of an SSA-based representation is how it represents
6972 .. code-block:: llvm
7014 stores. The type of the pointee must be an integer, pointer, or floating-point
7016 than or equal to a target-specific size limit. ``align`` must be explicitly
7047 operand to this load points to memory which can be assumed unchanged.
7104 .. code-block:: llvm
7132 address at which to store it. The type of the ``<pointer>`` operand must be a
7134 operand. If the ``store`` is marked as ``volatile``, then the optimizer is not
7144 stores. The type of the pointee must be an integer, pointer, or floating-point
7146 than or equal to a target-specific size limit. ``align`` must be explicitly
7182 location specified by the ``<pointer>`` operand. If ``<value>`` is
7193 .. code-block:: llvm
7214 The '``fence``' instruction is used to introduce happens-before edges
7221 defines what *synchronizes-with* edges they add. They can only be given
7233 *happens-before* dependency between A and B. Rather than an explicit
7236 still *synchronize-with* the explicit ``fence`` and establish the
7237 *happens-before* edge.
7250 .. code-block:: llvm
7282 than or equal to a target-specific size limit. '<cmp>' and '<new>' must
7300 equal to the size in memory of the operand.
7305 The contents of memory at the location specified by the '``<pointer>``' operand
7317 A successful ``cmpxchg`` is a read-modify-write instruction for the purpose of
7324 .. code-block:: llvm
7365 - xchg
7366 - add
7367 - sub
7368 - and
7369 - nand
7370 - or
7371 - xor
7372 - max
7373 - min
7374 - umax
7375 - umin
7379 target-specific size limit. The type of the '``<pointer>``' operand must
7389 operand are atomically read, modified, and written back. The original
7393 - xchg: ``*ptr = val``
7394 - add: ``*ptr = *ptr + val``
7395 - sub: ``*ptr = *ptr - val``
7396 - and: ``*ptr = *ptr & val``
7397 - nand: ``*ptr = ~(*ptr & val)``
7398 - or: ``*ptr = *ptr | val``
7399 - xor: ``*ptr = *ptr ^ val``
7400 - max: ``*ptr = *ptr > val ? *ptr : val`` (using a signed comparison)
7401 - min: ``*ptr = *ptr < val ? *ptr : val`` (using a signed comparison)
7402 - umax: ``*ptr = *ptr > val ? *ptr : val`` (using an unsigned
7404 - umin: ``*ptr = *ptr < val ? *ptr : val`` (using an unsigned
7410 .. code-block:: llvm
7447 can be non-zero), etc. The first type indexed into must be a pointer
7463 .. code-block:: c
7482 .. code-block:: llvm
7511 .. code-block:: llvm
7531 ``inbounds`` keyword applies to each of the computations element-wise.
7534 base address with silently-wrapping two's complement arithmetic. If the
7535 offsets have a different width from the pointer, they are sign-extended
7549 .. code-block:: llvm
7568 .. code-block:: llvm
7586 .. code-block:: llvm
7600 .. code-block:: c
7608 .. code-block:: llvm
7617 ---------------------
7620 (casting) which all take a single operand and a type. They perform
7621 various bit conversions on the operand.
7636 The '``trunc``' instruction truncates its operand to the type ``ty2``.
7652 be larger than the destination size, ``trunc`` cannot be a *no-op cast*.
7658 .. code-block:: llvm
7678 The '``zext``' instruction zero extends its operand to type ``ty2``.
7699 .. code-block:: llvm
7735 When sign extending from i1, the extension always results in -1 or 0.
7740 .. code-block:: llvm
7742 %X = sext i8 -1 to i16 ; yields i16:65535
7743 %Y = sext i1 true to i32 ; yields i32:-1
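The two ``sext`` examples can be checked with a small C model of sign extension. This is a sketch only: ``sext_model`` is a hypothetical helper, it assumes source widths of 1 to 64 bits, and it relies on the usual two's complement conversion to ``int64_t``. It also shows that the ``65535`` in the first example is the unsigned reading of the bit pattern ``0xFFFF``, which is ``-1`` when read as a signed ``i16``.

.. code-block:: c

    #include <stdint.h>

    /* Hypothetical helper: sign-extend the low from_bits bits of value. */
    int64_t sext_model(uint64_t value, unsigned from_bits) {
        uint64_t mask = (from_bits == 64) ? ~0ULL : ((1ULL << from_bits) - 1);
        uint64_t sign = 1ULL << (from_bits - 1);  /* sign bit of the source type */
        value &= mask;
        /* Flip the sign bit, then subtract it: inputs with the sign bit set
           wrap around so that all higher bits become ones. */
        return (int64_t)((value ^ sign) - sign);
    }

For instance, ``sext_model(0xFF, 8)`` models ``sext i8 -1``, and ``sext_model(1, 1)`` models ``sext i1 true``.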
7767 implies that ``fptrunc`` cannot be used to make a *no-op cast*.
7782 .. code-block:: llvm
7816 *no-op cast* because it always changes bits. Use ``bitcast`` to make a
7817 *no-op cast* for a floating point cast.
7822 .. code-block:: llvm
7856 point <t_floating>` operand into the nearest (rounding towards zero)
7863 .. code-block:: llvm
7898 point <t_floating>` operand into the nearest (rounding towards zero)
7905 .. code-block:: llvm
7907 %X = fptosi double -123.0 to i32 ; yields i32:-123
7908 %Y = fptosi float 1.0E-247 to i1 ; yields undefined:1
7939 The '``uitofp``' instruction interprets its operand as an unsigned
7947 .. code-block:: llvm
7950 %Y = uitofp i8 -1 to double ; yields double:255.0
7980 The '``sitofp``' instruction interprets its operand as a signed integer
7988 .. code-block:: llvm
7991 %Y = sitofp i8 -1 to double ; yields double:-1.0
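The difference between the ``uitofp`` and ``sitofp`` examples comes down to how the same ``i8`` bit pattern is interpreted before the conversion. A rough C analogy, using hypothetical helper names and assuming ``uint8_t`` and ``int8_t`` stand in for ``i8``:

.. code-block:: c

    #include <stdint.h>

    /* Hypothetical helpers modeling i8 -> double conversions. */

    /* uitofp: read the bits as an unsigned integer. */
    double uitofp_i8(uint8_t bits) { return (double)bits; }

    /* sitofp: read the same bits as a signed (two's complement) integer. */
    double sitofp_i8(uint8_t bits) { return (double)(int8_t)bits; }

With the all-ones pattern ``0xFF`` (written ``i8 -1`` in the examples), the first helper gives 255.0 and the second gives -1.0.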
8027 the same size, then nothing is done (*no-op cast*) other than a type
8033 .. code-block:: llvm
8035 %X = ptrtoint i32* %P to i8 ; yields truncation on 32-bit architecture
8036 … %Y = ptrtoint i32* %P to i64 ; yields zero extension on 32-bit architecture
8037 …32*> %P to <4 x i64>; yields vector zero extension for a vector of addresses on 32-bit architecture
8072 nothing is done (*no-op cast*).
8077 .. code-block:: llvm
8079 %X = inttoptr i32 255 to i32* ; yields zero extension on 64-bit architecture
8080 %Y = inttoptr i32 255 to i32* ; yields no-op on 32-bit architecture
8081 %Z = inttoptr i64 0 to i32* ; yields truncation on 32-bit architecture
8106 non-aggregate first class value, and a type to cast it to, which must
8107 also be a non-aggregate :ref:`first class <t_firstclass>` type. The
8118 is always a *no-op cast* because no bits change with this
8129 .. code-block:: llvm
8131 %X = bitcast i8 255 to i8 ; yields i8:-1
8165 ``ptrval`` to type ``pty2``. It can be a *no-op cast* or a complex
8169 conversion is legal then both result and operand refer to the same memory
8175 .. code-block:: llvm
8184 ----------------
8211 The '``icmp``' instruction takes three operands. The first operand is
8268 .. code-block:: llvm
8274 <result> = icmp ule i16 -4, 5 ; yields: result=false
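The ``ule`` example yields ``false`` because the predicate compares the operands as unsigned values, so ``i16 -4`` is read as 65532. A minimal C sketch of the distinction, with hypothetical helper names and ``uint16_t`` standing in for ``i16``:

.. code-block:: c

    #include <stdbool.h>
    #include <stdint.h>

    /* Hypothetical helpers modeling two icmp predicates on i16 operands. */

    /* ule: unsigned less-than-or-equal. */
    bool icmp_ule_i16(uint16_t a, uint16_t b) { return a <= b; }

    /* sle: signed less-than-or-equal on the same bit patterns. */
    bool icmp_sle_i16(uint16_t a, uint16_t b) {
        return (int16_t)a <= (int16_t)b;
    }

The same pair of operands therefore compares differently under the signed predicate: ``icmp_sle_i16((uint16_t)-4, 5)`` is true.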
8287 <result> = fcmp [fast-math flags]* <cond> <ty> <op1>, <op2> ; yields i1 or <N x i1>:result
8305 The '``fcmp``' instruction takes three operands. The first operand is
8326 *Ordered* means that neither operand is a QNAN while *unordered* means
8327 that either operand may be a QNAN.
8355 #. ``ueq``: yields ``true`` if either operand is a QNAN or ``op1`` is
8357 #. ``ugt``: yields ``true`` if either operand is a QNAN or ``op1`` is
8359 #. ``uge``: yields ``true`` if either operand is a QNAN or ``op1`` is
8361 #. ``ult``: yields ``true`` if either operand is a QNAN or ``op1`` is
8363 #. ``ule``: yields ``true`` if either operand is a QNAN or ``op1`` is
8365 #. ``une``: yields ``true`` if either operand is a QNAN or ``op1`` is
8367 #. ``uno``: yields ``true`` if either operand is a QNAN.
8371 :ref:`fast-math flags <fastmath>`, which are optimization hints to enable
8374 Any set of fast-math flags is legal on an ``fcmp`` instruction, but the
8382 .. code-block:: llvm
8417 There must be no non-phi instructions between the start of a basic block
8436 .. code-block:: llvm
8461 condition, without IR-level branching.
8486 .. code-block:: llvm
8500 …<result> = [tail | musttail | notail ] call [fast-math flags] [cconv] [ret attrs] <ty>|<fnty> <fnp…
8501 [ operand bundles ]
8528 - The call must immediately precede a :ref:`ret <i_ret>` instruction,
8530 - The ret instruction must return the (possibly bitcasted) value
8532 - The caller and callee prototypes must match. Pointer types of
8535 - The calling conventions of the caller and callee must match.
8536 - All ABI-impacting function attributes, such as sret, byval, inreg,
8538 - The callee must be varargs iff the caller is varargs. Bitcasting a
8539 non-varargs function to the appropriate varargs type is legal so
8540 long as the non-varargs prefixes obey the other rules.
8545 - Caller and callee both have the calling convention ``fastcc``.
8546 - The call is in tail position (ret immediately follows call and ret
8548 - Option ``-tailcallopt`` is enabled, or
8550 - `Platform-specific constraints are
8557 #. The optional ``fast-math flags`` marker indicates that the call has one or more
8558 :ref:`fast-math flags <fastmath>`, which are optimization hints to enable
8559 otherwise unsafe floating-point optimizations. Fast-math flags are only valid
8560 for calls that return a floating-point scalar or vector type.
8588 #. The optional :ref:`operand bundles <opbundles>` list.
8602 .. code-block:: llvm
8621 support for freestanding environments and non-C-based languages.
8695 is a landing pad --- one where the exception lands, and corresponds to the
8698 re-entry to the function. The ``resultval`` has the type ``resultty``.
8706 A ``clause`` begins with the clause type --- ``catch`` or ``filter`` --- and
8718 :ref:`personality function <personalityfn>` upon re-entry to the function, and
8733 - A landing pad block is a basic block which is the unwind destination
8735 - A landing pad block must have a '``landingpad``' instruction as its
8736 first non-PHI instruction.
8737 - There can be only one '``landingpad``' instruction within the landing
8739 - A basic block that is not a landing pad block may not include a
8745 .. code-block:: llvm
8775 begins a catch handler --- one where a personality routine attempts to transfer
8781 The ``catchswitch`` operand must always be a token produced by a
8801 entirely target and personality function-specific.
8804 instruction must be the first non-phi of its parent basic block.
8811 described in the `EH documentation\ <ExceptionHandling.html#wineh-constraints>`_),
8818 .. code-block:: llvm
8843 is a cleanup block --- one where a personality routine attempts to
8852 this operand may be the token ``none``.
8865 ``cleanuppad`` with the aid of the personality-specific arguments.
8871 - A cleanup block is a basic block which is the unwind destination of
8873 - A cleanup block must have a '``cleanuppad``' instruction as its
8874 first non-PHI instruction.
8875 - There can be only one '``cleanuppad``' instruction within the
8877 - A basic block that is not a cleanup block may not include a
8881 described in the `EH documentation\ <ExceptionHandling.html#wineh-constraints>`_),
8888 .. code-block:: llvm
8943 -------------------------------------
8950 All of these functions operate on arguments that use a target-specific
8958 .. code-block:: llvm
9018 available in C. In a target-dependent way, it initializes the
9050 available in C. In a target-dependent way, it destroys the ``va_list``
9084 available in C. In a target-dependent way, it copies the source
9090 --------------------------------------
9100 Frontends for type-safe garbage collected languages should generate
9138 a global value address) contains the meta-data to be associated with the
9145 "ptrloc" location. At compile-time, the code generator generates
9223 -------------------------
9242 target-specific value indicating the return address of the current
9264 of the obvious source-language caller.
9280 target-specific frame pointer value for the specified stack frame.
9301 of the obvious source-language caller.
9337 pointer in platform-specific ways.
9340 '``llvm.localescape``' to recover. It is zero-indexed.
9387 bare-metal programs including OS kernels.
9483 These intrinsics return a non-negative integer value that can be used to
9493 compile-time-known constant value.
9521 ``locality`` is a temporal locality specifier ranging from (0) - no
9522 locality, to (3) - extremely local, keep in cache. The ``cache type``
9596 Note that runtime support may be conditional on the privilege-level code is
9614 targets with non-unified instruction and data cache, the implementation
9621 intrinsic is a nop. On platforms with non-coherent instruction and data
9641 i32 <num-counters>, i32 <index>)
9648 lowered by the ``-instrprof`` pass to generate execution counts of a
9661 error if ``hash`` or ``num-counters`` differ between two instances of
9665 be incremented. It should be a value between 0 and ``num-counters``.
9671 cause the ``-instrprof`` pass to generate the appropriate data
9674 the ``llvm-profdata`` tool.
9693 lowered by the ``-instrprof`` pass to find out the target values,
9709 expression's value should be representable as an unsigned 64-bit value. The
9720 should be inserted for value profiling of target expressions. ``-instrprof``
9753 -----------------------------
9756 functions. These intrinsics allow source-language front-ends to pass
9770 support all bit widths however.
9823 bit widths however.
9878 support all bit widths.
9942 The '``llvm.sqrt``' intrinsics return the sqrt of the specified operand,
9945 negative numbers other than -0.0 (which allows for better optimization,
9947 ``llvm.sqrt(-0.0)`` is defined to return -0.0 like IEEE sqrt.
9958 This function returns the sqrt of the specified operand if it is a
9982 The '``llvm.powi.*``' intrinsics return the first operand raised to the
10020 The '``llvm.sin.*``' intrinsics return the sine of the operand.
10031 This function returns the sine of the specified operand, returning the
10056 The '``llvm.cos.*``' intrinsics return the cosine of the operand.
10067 This function returns the cosine of the specified operand, returning the
10092 The '``llvm.pow.*``' intrinsics return the first operand raised to the
10304 The '``llvm.fma.*``' intrinsics perform the fused multiply-add
10341 operand.
10389 Follows the IEEE-754 semantics for minNum, which also match libm's
10392 If either operand is a NaN, returns the other non-NaN operand. Returns
10395 fmin(+/-0.0, +/-0.0) could return either -0.0 or 0.0.
10430 Follows the IEEE-754 semantics for maxNum, which also match libm's
10433 If either operand is a NaN, returns the other non-NaN operand. Returns
10436 fmax(+/-0.0, +/-0.0) could return either -0.0 or 0.0.
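The NaN rule shared by the minimum and maximum intrinsics (if one operand is a NaN, the other operand is returned) can be sketched in C as follows. ``minnum_model`` and ``maxnum_model`` are hypothetical names, not LLVM functions, and the sketch makes no promise about which zero is returned for mixed-sign zero inputs, in line with the text above.

.. code-block:: c

    #include <math.h>

    /* Hypothetical model of the minNum rule. */
    double minnum_model(double a, double b) {
        if (isnan(a)) return b;  /* if b is also NaN, the result is NaN */
        if (isnan(b)) return a;
        return a < b ? a : b;
    }

    /* Hypothetical model of the maxNum rule. */
    double maxnum_model(double a, double b) {
        if (isnan(a)) return b;  /* if b is also NaN, the result is NaN */
        if (isnan(b)) return a;
        return a > b ? a : b;
    }

A NaN result is only possible when both inputs are NaN; a single NaN input is effectively ignored.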
10460 first operand and the sign of the second operand.
10495 The '``llvm.floor.*``' intrinsics return the floor of the operand.
10530 The '``llvm.ceil.*``' intrinsics return the ceiling of the operand.
10565 The '``llvm.trunc.*``' intrinsics return the operand rounded to the
10566 nearest integer not larger in magnitude than the operand.
10601 The '``llvm.rint.*``' intrinsics return the operand rounded to the
10602 nearest integer. It may raise an inexact floating-point exception if the
10603 operand isn't an integer.
10638 The '``llvm.nearbyint.*``' intrinsics return the operand rounded to the
10674 The '``llvm.round.*``' intrinsics return the operand rounded to the
10690 ---------------------------
10721 ``M`` in the input moved to bit ``N-M-1`` in the output.
10755 concept to additional even-byte lengths (6 bytes, 8 bytes and more,
10766 support all bit widths or vector types, however.
10804 targets support all bit widths or vector types, however.
10832 now predicated on avoiding zero-value inputs.
10851 support all bit widths or vector types, however.
10879 now predicated on avoiding zero-value inputs.
10893 -----------------------------------
10897 Each of these intrinsics returns a two-element struct. The first
10902 result of a 32-bit ``add`` instruction with the same operands, where
10913 The behavior of these intrinsics is well-defined for all argument
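The two-element result can be pictured with a small C model of the signed 32-bit addition case. This is a sketch under stated assumptions: the struct ``i32_with_overflow`` and the function ``sadd_with_overflow_i32`` are hypothetical C names chosen for illustration, and the narrowing conversion to ``int32_t`` is assumed to wrap in the usual two's complement way.

.. code-block:: c

    #include <stdbool.h>
    #include <stdint.h>

    /* Hypothetical model of a {i32, i1} result struct. */
    struct i32_with_overflow {
        int32_t result;   /* same bits as a plain wrapping 32-bit add */
        bool overflow;    /* set when the exact sum does not fit in 32 bits */
    };

    struct i32_with_overflow sadd_with_overflow_i32(int32_t a, int32_t b) {
        struct i32_with_overflow r;
        int64_t wide = (int64_t)a + (int64_t)b;   /* exact sum, cannot overflow */
        /* Wrapped result: add in unsigned arithmetic, then reinterpret. */
        r.result = (int32_t)((uint32_t)a + (uint32_t)b);
        r.overflow = (wide != (int64_t)r.result);
        return r;
    }

Both fields are defined for every pair of inputs: adding 1 to the largest ``int32_t`` gives the smallest ``int32_t`` with the overflow flag set, rather than an undefined result.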
10951 a signed addition of the two variables. They return a structure --- the
10959 .. code-block:: llvm
11001 an unsigned addition of the two arguments. They return a structure --- the
11008 .. code-block:: llvm
11050 a signed subtraction of the two arguments. They return a structure --- the
11058 .. code-block:: llvm
11100 an unsigned subtraction of the two arguments. They return a structure ---
11108 .. code-block:: llvm
11150 a signed multiplication of the two arguments. They return a structure ---
11158 .. code-block:: llvm
11200 an unsigned multiplication of the two arguments. They return a structure ---
11208 .. code-block:: llvm
11216 ---------------------------------
11235 defined by IEEE-754-2008 to be:
11239 2.1.8 canonical encoding: The preferred encoding of a floating-point
11243 This operation can also be considered equivalent to the IEEE-754-2008
11244 conversion of a floating-point value to the same format. NaNs are handled
11247 Examples of non-canonical encodings:
11249 - x87 pseudo denormals, pseudo NaNs, pseudo Infinity, Unnormals. These are
11250 converted to a canonical representation per hardware-specific protocol.
11251 - Many normal decimal floating point numbers have non-canonical alternative
11253 - Some machines, like GPUs or ARMv7 NEON, do not support subnormal values.
11254 These are treated as non-canonical encodings of zero and will be flushed to
11257 Note that per IEEE-754-2008 6.2, systems that support signaling NaNs with
11264 -0.0 is also sufficient provided that the rounding mode is not -Infinity.
11268 - ``(@llvm.canonicalize(x) == x)`` is equivalent to ``(x == x)``
11269 - ``(@llvm.canonicalize(x) == @llvm.canonicalize(y))`` is equivalent to
11273 ``@llvm.canonicalize(-0.0) = -0.0`` and ``@llvm.canonicalize(+0.0) = +0.0``
11282 - The input is known to be canonical. For example, it was produced by a
11283 floating-point operation that is required by the standard to be canonical.
11284 - The result is consumed only by (or fused with) other floating-point
11301 The '``llvm.fmuladd.*``' intrinsic functions represent multiply-add
11325 the target platform supports it. If a fused multiply-add is required the
11332 .. code-block:: llvm
11337 ----------------------------------------
11340 storage-only format. This means that it is a dense encoding (in memory)
11343 This means that code must first load the half-precision floating point
11374 The intrinsic function takes a single argument - the value to be
11387 .. code-block:: llvm
11415 The intrinsic function takes a single argument - the value to be
11423 precision floating point format. The input half-float value is
11429 .. code-block:: llvm
11437 -------------------
11445 -----------------------------
11454 ---------------------
11458 callable function pointer lacking the nest parameter - the caller does
11469 .. code-block:: llvm
11504 intrinsic. Note that the size and the alignment are target-specific -
11506 front-end that generates this intrinsic needs to have some
11507 target-specific knowledge. The ``func`` argument must hold a function
11541 This performs any required machine-specific adjustment to the address of
11563 ---------------------------------------
11565 …d vector load and store operations. The predicate is specified by a mask operand, which holds one …
11588 …s to the masked-off lanes. The masked-off lanes in the result vector are taken from the correspond…
11594 …operand is the base pointer for the load. The second operand is the alignment of the source locati…
11601 … same mask. However, using this intrinsic prevents exceptions on memory access to masked-off lanes.
11633 …k holds a bit for each vector lane, and is used to prevent memory accesses to the masked-off lanes.
11638 …operand is the vector value to be written to memory. The second operand is the base pointer for th…
11645 …uivalent to a load-modify-store sequence. However, using this intrinsic prevents exceptions and da…
11658 -------------------------------------------
11660 …r than sequential memory accesses. Gather and scatter also employ a mask operand, which holds one …
11680 …s to the masked-off lanes. The masked-off lanes in the result vector are taken from the correspond…
11686 …operand is a vector of pointers which holds all memory addresses to read. The second operand is an…
11700 ;; The gather with all-true mask is equivalent to the following instruction sequence
11723 … with overlapping addresses is guaranteed to be ordered from least-significant to most-significant…
11734 …k holds a bit for each vector lane, and is used to prevent memory accesses to the masked-off lanes.
11739 …operand is a vector value to be written to memory. The second operand is a vector of pointers, poi…
11769 ------------------
11796 object, or -1 if it is variable sized. The second argument is a pointer
11829 object, or -1 if it is variable sized. The second argument is a pointer
11860 object, or -1 if it is variable sized. The second argument is a pointer
11891 object, or -1 if it is variable sized and the third argument is a
11930 ------------------
12164 Currently some platforms have IR-level customized stack guard loading (e.g.
12195 or -1 (if false) when the object size is unknown. The second argument
12203 compile time, ``llvm.objectsize`` returns ``i32/i64 -1 or 0`` (depending
12342 - If the given pointer is associated with the given type metadata identifier,
12346 - If the given pointer is not associated with the given type metadata
12353 2. If the function has a non-void return type, a pointer to a function that
12402 This intrinsic, together with :ref:`deoptimization operand bundles
12404 frame-local state from the currently executing (typically more specialized,
12425 operand bundle <deopt_opbundles>`) and returns the value returned by
12427 the continuation itself is out of scope of the language reference --
12432 Deoptimization continuations expressed using ``"deopt"`` operand bundles always
12436 - ``@llvm.experimental.deoptimize`` cannot be invoked.
12437 - The call must immediately precede a :ref:`ret <i_ret>` instruction.
12438 - The ``ret`` instruction must return the value produced by the
12477 This intrinsic, together with :ref:`deoptimization operand bundles
12481 ``@llvm.experimental.deoptimize`` -- its body is defined to be
12484 .. code-block:: llvm
12526 This intrinsic loads a 32-bit value from the address ``%ptr + %offset``,
12530 ``i32 trunc(x - %ptr)``, the intrinsic call is folded to ``x``.
12539 --------------------