1==============================
2LLVM Language Reference Manual
3==============================
4
5.. contents::
6   :local:
7   :depth: 4
8
9Abstract
10========
11
12This document is a reference manual for the LLVM assembly language. LLVM
13is a Static Single Assignment (SSA) based representation that provides
14type safety, low-level operations, flexibility, and the capability of
15representing 'all' high-level languages cleanly. It is the common code
16representation used throughout all phases of the LLVM compilation
17strategy.
18
19Introduction
20============
21
22The LLVM code representation is designed to be used in three different
23forms: as an in-memory compiler IR, as an on-disk bitcode representation
24(suitable for fast loading by a Just-In-Time compiler), and as a human
25readable assembly language representation. This allows LLVM to provide a
26powerful intermediate representation for efficient compiler
27transformations and analysis, while providing a natural means to debug
28and visualize the transformations. The three different forms of LLVM are
29all equivalent. This document describes the human readable
30representation and notation.
31
32The LLVM representation aims to be light-weight and low-level while
33being expressive, typed, and extensible at the same time. It aims to be
34a "universal IR" of sorts, by being at a low enough level that
35high-level ideas may be cleanly mapped to it (similar to how
36microprocessors are "universal IR's", allowing many source languages to
37be mapped to them). By providing type information, LLVM can be used as
38the target of optimizations: for example, through pointer analysis, it
39can be proven that a C automatic variable is never accessed outside of
40the current function, allowing it to be promoted to a simple SSA value
41instead of a memory location.
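
As a rough illustration of the promotion described above, the sketch below
shows a C automatic variable first as a stack slot and then as a plain SSA
value. The function names ``@before`` and ``@after`` are hypothetical, and the
second function is only what a promotion pass could plausibly produce, not the
recorded output of any particular tool:

.. code-block:: llvm

    ; The local variable lives in memory: an alloca, a store and a load.
    define i32 @before(i32 %a) {
      %slot = alloca i32
      store i32 %a, i32* %slot
      %v = load i32, i32* %slot
      %r = add i32 %v, 1
      ret i32 %r
    }

    ; The address of %slot never escapes, so the memory traffic can be
    ; replaced by direct use of the SSA value %a.
    define i32 @after(i32 %a) {
      %r = add i32 %a, 1
      ret i32 %r
    }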
42
43.. _wellformed:
44
45Well-Formedness
46---------------
47
48It is important to note that this document describes 'well formed' LLVM
49assembly language. There is a difference between what the parser accepts
50and what is considered 'well formed'. For example, the following
51instruction is syntactically okay, but not well formed:
52
53.. code-block:: llvm
54
55    %x = add i32 1, %x
56
57because the definition of ``%x`` does not dominate all of its uses. The
58LLVM infrastructure provides a verification pass that may be used to
59verify that an LLVM module is well formed. This pass is automatically
60run by the parser after parsing input assembly and by the optimizer
61before it outputs bitcode. The violations pointed out by the verifier
62pass indicate bugs in transformation passes or input to the parser.
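
For contrast, a value may legitimately depend on itself across a loop when the
cycle goes through a ``phi`` node, because each incoming value only has to
dominate the end of its corresponding predecessor block. The following is a
minimal sketch with hypothetical names (``@count``, ``%x.next``):

.. code-block:: llvm

    define i32 @count(i32 %n) {
    entry:
      br label %loop

    loop:
      ; %x.next is defined later in this block, but it is only consumed
      ; along the back edge from %loop, which its definition dominates.
      %x = phi i32 [ 0, %entry ], [ %x.next, %loop ]
      %x.next = add i32 %x, 1
      %done = icmp eq i32 %x.next, %n
      br i1 %done, label %exit, label %loop

    exit:
      ret i32 %x.next
    }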
63
64.. _identifiers:
65
66Identifiers
67===========
68
69LLVM identifiers come in two basic types: global and local. Global
70identifiers (functions, global variables) begin with the ``'@'``
71character. Local identifiers (register names, types) begin with the
72``'%'`` character. Additionally, there are three different formats for
73identifiers, for different purposes:
74
75#. Named values are represented as a string of characters with their
76   prefix. For example, ``%foo``, ``@DivisionByZero``,
77   ``%a.really.long.identifier``. The actual regular expression used is
78   '``[%@][-a-zA-Z$._][-a-zA-Z$._0-9]*``'. Identifiers that require other
79   characters in their names can be surrounded with quotes. Special
80   characters may be escaped using ``"\xx"`` where ``xx`` is the ASCII
81   code for the character in hexadecimal. In this way, any character can
82   be used in a name value, even quotes themselves. The ``"\01"`` prefix
83   can be used on global values to suppress mangling.
84#. Unnamed values are represented as an unsigned numeric value with
85   their prefix. For example, ``%12``, ``@2``, ``%44``.
86#. Constants, which are described in the section Constants_ below.
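
The identifier formats above can be sketched as follows. Apart from
``@DivisionByZero`` and ``%a.really.long.identifier``, which come from the
text, all names here are made up for illustration:

.. code-block:: llvm

    @DivisionByZero = global i32 0        ; named global
    @"name with spaces" = global i32 0    ; quoted name with other characters
    @"quote\22inside" = global i32 0      ; \22 is the hex ASCII code of '"'
    @"\01_raw_symbol" = global i32 0      ; \01 prefix suppresses mangling
    @0 = global i32 0                     ; unnamed global

    define i32 @use(i32 %foo) {
      %a.really.long.identifier = add i32 %foo, 1
      ret i32 %a.really.long.identifier
    }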
87
88LLVM requires that values start with a prefix for two reasons: Compilers
89don't need to worry about name clashes with reserved words, and the set
90of reserved words may be expanded in the future without penalty.
91Additionally, unnamed identifiers allow a compiler to quickly come up
92with a temporary variable without having to avoid symbol table
93conflicts.
94
95Reserved words in LLVM are very similar to reserved words in other
96languages. There are keywords for different opcodes ('``add``',
97'``bitcast``', '``ret``', etc...), for primitive type names ('``void``',
98'``i32``', etc...), and others. These reserved words cannot conflict
99with variable names, because none of them start with a prefix character
100(``'%'`` or ``'@'``).
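
Because of the prefix rule, names that spell out keywords are still ordinary
identifiers. The deliberately contrived sketch below (not a recommended naming
style) uses ``@ret``, ``%add``, ``%i32`` and ``%void`` as value names:

.. code-block:: llvm

    define i32 @ret(i32 %add, i32 %i32) {
      ; 'add', 'i32' and 'ret' without a prefix are keywords; with a
      ; prefix they are just names.
      %void = add i32 %add, %i32
      ret i32 %void
    }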
101
102Here is an example of LLVM code to multiply the integer variable
103'``%X``' by 8:
104
105The easy way:
106
107.. code-block:: llvm
108
109    %result = mul i32 %X, 8
110
111After strength reduction:
112
113.. code-block:: llvm
114
115    %result = shl i32 %X, 3
116
117And the hard way:
118
119.. code-block:: llvm
120
121    %0 = add i32 %X, %X           ; yields i32:%0
122    %1 = add i32 %0, %0           ; yields i32:%1
123    %result = add i32 %1, %1
124
125This last way of multiplying ``%X`` by 8 illustrates several important
126lexical features of LLVM:
127
128#. Comments are delimited with a '``;``' and go until the end of line.
129#. Unnamed temporaries are created when the result of a computation is
130   not assigned to a named value.
131#. Unnamed temporaries are numbered sequentially (using a per-function
132   incrementing counter, starting with 0). Note that basic blocks and unnamed
133   function parameters are included in this numbering. For example, if the
134   entry basic block is not given a label name and all function parameters are
135   named, then it will get number 0.
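
The numbering rule in the last item can be sketched with two hypothetical
functions, one with a named parameter and one with an unnamed parameter:

.. code-block:: llvm

    ; All parameters are named, so the unlabeled entry block takes
    ; number 0 and the first unnamed temporary is %1.
    define i32 @named_param(i32 %x) {
      %1 = add i32 %x, %x
      ret i32 %1
    }

    ; The unnamed parameter is %0 and the unlabeled entry block is %1,
    ; so the first unnamed temporary is %2.
    define i32 @unnamed_param(i32) {
      %2 = add i32 %0, %0
      ret i32 %2
    }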
136
It also shows a convention that we follow in this document. When
demonstrating instructions, we will follow an instruction with a comment
that defines the type and name of the value produced.
140
141High Level Structure
142====================
143
144Module Structure
145----------------
146
147LLVM programs are composed of ``Module``'s, each of which is a
148translation unit of the input programs. Each module consists of
149functions, global variables, and symbol table entries. Modules may be
150combined together with the LLVM linker, which merges function (and
151global variable) definitions, resolves forward declarations, and merges
152symbol table entries. Here is an example of the "hello world" module:
153
154.. code-block:: llvm
155
156    ; Declare the string constant as a global constant.
157    @.str = private unnamed_addr constant [13 x i8] c"hello world\0A\00"
158
159    ; External declaration of the puts function
160    declare i32 @puts(i8* nocapture) nounwind
161
162    ; Definition of main function
163    define i32 @main() {   ; i32()*
164      ; Convert [13 x i8]* to i8*...
165      %cast210 = getelementptr [13 x i8], [13 x i8]* @.str, i64 0, i64 0
166
167      ; Call puts function to write out the string to stdout.
168      call i32 @puts(i8* %cast210)
169      ret i32 0
170    }
171
172    ; Named metadata
173    !0 = !{i32 42, null, !"string"}
174    !foo = !{!0}
175
176This example is made up of a :ref:`global variable <globalvars>` named
177"``.str``", an external declaration of the "``puts``" function, a
178:ref:`function definition <functionstructure>` for "``main``" and
179:ref:`named metadata <namedmetadatastructure>` "``foo``".
180
181In general, a module is made up of a list of global values (where both
182functions and global variables are global values). Global values are
183represented by a pointer to a memory location (in this case, a pointer
184to an array of char, and a pointer to a function), and have one of the
185following :ref:`linkage types <linkage>`.
186
187.. _linkage:
188
189Linkage Types
190-------------
191
192All Global Variables and Functions have one of the following types of
193linkage:
194
195``private``
196    Global values with "``private``" linkage are only directly
197    accessible by objects in the current module. In particular, linking
198    code into a module with a private global value may cause the
199    private to be renamed as necessary to avoid collisions. Because the
200    symbol is private to the module, all references can be updated. This
201    doesn't show up in any symbol table in the object file.
202``internal``
203    Similar to private, but the value shows as a local symbol
204    (``STB_LOCAL`` in the case of ELF) in the object file. This
205    corresponds to the notion of the '``static``' keyword in C.
206``available_externally``
207    Globals with "``available_externally``" linkage are never emitted into
208    the object file corresponding to the LLVM module. From the linker's
209    perspective, an ``available_externally`` global is equivalent to
210    an external declaration. They exist to allow inlining and other
211    optimizations to take place given knowledge of the definition of the
212    global, which is known to be somewhere outside the module. Globals
213    with ``available_externally`` linkage are allowed to be discarded at
214    will, and allow inlining and other optimizations. This linkage type is
215    only allowed on definitions, not declarations.
216``linkonce``
217    Globals with "``linkonce``" linkage are merged with other globals of
218    the same name when linkage occurs. This can be used to implement
219    some forms of inline functions, templates, or other code which must
220    be generated in each translation unit that uses it, but where the
221    body may be overridden with a more definitive definition later.
222    Unreferenced ``linkonce`` globals are allowed to be discarded. Note
223    that ``linkonce`` linkage does not actually allow the optimizer to
224    inline the body of this function into callers because it doesn't
225    know if this definition of the function is the definitive definition
226    within the program or whether it will be overridden by a stronger
227    definition. To enable inlining and other optimizations, use
228    "``linkonce_odr``" linkage.
229``weak``
230    "``weak``" linkage has the same merging semantics as ``linkonce``
231    linkage, except that unreferenced globals with ``weak`` linkage may
232    not be discarded. This is used for globals that are declared "weak"
233    in C source code.
234``common``
    "``common``" linkage is most similar to "``weak``" linkage, but it
    is used for tentative definitions in C, such as "``int X;``" at
    global scope. Symbols with "``common``" linkage are merged in the
    same way as ``weak`` symbols, and they may not be deleted if
239    unreferenced. ``common`` symbols may not have an explicit section,
240    must have a zero initializer, and may not be marked
241    ':ref:`constant <globalvars>`'. Functions and aliases may not have
242    common linkage.
243
244.. _linkage_appending:
245
246``appending``
247    "``appending``" linkage may only be applied to global variables of
248    pointer to array type. When two global variables with appending
249    linkage are linked together, the two global arrays are appended
250    together. This is the LLVM, typesafe, equivalent of having the
251    system linker append together "sections" with identical names when
252    .o files are linked.
253
    Unfortunately this doesn't correspond to any feature in .o files, so it
    can only be used for variables like ``llvm.global_ctors`` which LLVM
    interprets specially.
257
258``extern_weak``
    The semantics of this linkage follow the ELF object file model: the
    symbol is weak until linked; if it is not linked, the symbol becomes
    null instead of being an undefined reference.
262``linkonce_odr``, ``weak_odr``
263    Some languages allow differing globals to be merged, such as two
264    functions with different semantics. Other languages, such as
265    ``C++``, ensure that only equivalent globals are ever merged (the
266    "one definition rule" --- "ODR"). Such languages can use the
267    ``linkonce_odr`` and ``weak_odr`` linkage types to indicate that the
268    global will only be merged with equivalent globals. These linkage
269    types are otherwise the same as their non-``odr`` versions.
270``external``
271    If none of the above identifiers are used, the global is externally
272    visible, meaning that it participates in linkage and can be used to
273    resolve external symbol references.
274
275It is illegal for a function *declaration* to have any linkage type
276other than ``external`` or ``extern_weak``.
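
A small module sketching several of these linkage types might look like the
following. All symbol names are hypothetical; the point is only where the
linkage keyword goes and which forms are definitions versus declarations:

.. code-block:: llvm

    @.msg = private unnamed_addr constant [3 x i8] c"hi\00"
    @counter = internal global i32 0          ; like a C 'static' variable
    @tentative = common global i32 0          ; like 'int tentative;' in C
    @maybe_present = extern_weak global i32   ; null if never linked

    define linkonce_odr i32 @inline_candidate() {
      ret i32 1
    }

    define weak i32 @overridable() {
      ret i32 0
    }

    define available_externally i32 @known_elsewhere() {
      ret i32 7
    }

    ; No linkage keyword: external linkage.
    define i32 @visible() {
      ret i32 0
    }

    ; Declarations may only be external or extern_weak.
    declare extern_weak void @optional_hook()
    declare void @defined_elsewhere()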
277
278.. _callingconv:
279
280Calling Conventions
281-------------------
282
283LLVM :ref:`functions <functionstructure>`, :ref:`calls <i_call>` and
284:ref:`invokes <i_invoke>` can all have an optional calling convention
285specified for the call. The calling convention of any pair of dynamic
286caller/callee must match, or the behavior of the program is undefined.
287The following calling conventions are supported by LLVM, and more may be
288added in the future:
289
290"``ccc``" - The C calling convention
291    This calling convention (the default if no other calling convention
292    is specified) matches the target C calling conventions. This calling
293    convention supports varargs function calls and tolerates some
294    mismatch in the declared prototype and implemented declaration of
295    the function (as does normal C).
296"``fastcc``" - The fast calling convention
297    This calling convention attempts to make calls as fast as possible
298    (e.g. by passing things in registers). This calling convention
299    allows the target to use whatever tricks it wants to produce fast
300    code for the target, without having to conform to an externally
301    specified ABI (Application Binary Interface). `Tail calls can only
302    be optimized when this, the GHC or the HiPE convention is
303    used. <CodeGenerator.html#id80>`_ This calling convention does not
304    support varargs and requires the prototype of all callees to exactly
305    match the prototype of the function definition.
306"``coldcc``" - The cold calling convention
307    This calling convention attempts to make code in the caller as
308    efficient as possible under the assumption that the call is not
309    commonly executed. As such, these calls often preserve all registers
310    so that the call does not break any live ranges in the caller side.
311    This calling convention does not support varargs and requires the
312    prototype of all callees to exactly match the prototype of the
313    function definition. Furthermore the inliner doesn't consider such function
314    calls for inlining.
315"``cc 10``" - GHC convention
316    This calling convention has been implemented specifically for use by
317    the `Glasgow Haskell Compiler (GHC) <http://www.haskell.org/ghc>`_.
318    It passes everything in registers, going to extremes to achieve this
319    by disabling callee save registers. This calling convention should
320    not be used lightly but only for specific situations such as an
321    alternative to the *register pinning* performance technique often
322    used when implementing functional programming languages. At the
323    moment only X86 supports this convention and it has the following
324    limitations:
325
326    -  On *X86-32* only supports up to 4 bit type parameters. No
327       floating-point types are supported.
328    -  On *X86-64* only supports up to 10 bit type parameters and 6
329       floating-point parameters.
330
    This calling convention supports `tail call
    optimization <CodeGenerator.html#id80>`_ but requires that both the
    caller and the callee use it.
334"``cc 11``" - The HiPE calling convention
335    This calling convention has been implemented specifically for use by
336    the `High-Performance Erlang
337    (HiPE) <http://www.it.uu.se/research/group/hipe/>`_ compiler, *the*
338    native code compiler of the `Ericsson's Open Source Erlang/OTP
339    system <http://www.erlang.org/download.shtml>`_. It uses more
340    registers for argument passing than the ordinary C calling
341    convention and defines no callee-saved registers. The calling
342    convention properly supports `tail call
343    optimization <CodeGenerator.html#id80>`_ but requires that both the
344    caller and the callee use it. It uses a *register pinning*
345    mechanism, similar to GHC's convention, for keeping frequently
346    accessed runtime components pinned to specific hardware registers.
347    At the moment only X86 supports this convention (both 32 and 64
348    bit).
349"``webkit_jscc``" - WebKit's JavaScript calling convention
350    This calling convention has been implemented for `WebKit FTL JIT
351    <https://trac.webkit.org/wiki/FTLJIT>`_. It passes arguments on the
352    stack right to left (as cdecl does), and returns a value in the
353    platform's customary return register.
354"``anyregcc``" - Dynamic calling convention for code patching
355    This is a special convention that supports patching an arbitrary code
356    sequence in place of a call site. This convention forces the call
357    arguments into registers but allows them to be dynamically
    allocated. This can currently only be used with calls to
    ``llvm.experimental.patchpoint`` because only this intrinsic records
    the location of its arguments in a side table. See :doc:`StackMaps`.
361"``preserve_mostcc``" - The `PreserveMost` calling convention
362    This calling convention attempts to make the code in the caller as
363    unintrusive as possible. This convention behaves identically to the `C`
364    calling convention on how arguments and return values are passed, but it
365    uses a different set of caller/callee-saved registers. This alleviates the
366    burden of saving and recovering a large register set before and after the
367    call in the caller. If the arguments are passed in callee-saved registers,
368    then they will be preserved by the callee across the call. This doesn't
369    apply for values returned in callee-saved registers.
370
371    - On X86-64 the callee preserves all general purpose registers, except for
372      R11. R11 can be used as a scratch register. Floating-point registers
373      (XMMs/YMMs) are not preserved and need to be saved by the caller.
374
375    The idea behind this convention is to support calls to runtime functions
376    that have a hot path and a cold path. The hot path is usually a small piece
377    of code that doesn't use many registers. The cold path might need to call out to
378    another function and therefore only needs to preserve the caller-saved
379    registers, which haven't already been saved by the caller. The
380    `PreserveMost` calling convention is very similar to the `cold` calling
381    convention in terms of caller/callee-saved registers, but they are used for
382    different types of function calls. `coldcc` is for function calls that are
383    rarely executed, whereas `preserve_mostcc` function calls are intended to be
384    on the hot path and definitely executed a lot. Furthermore `preserve_mostcc`
385    doesn't prevent the inliner from inlining the function call.
386
    This calling convention will be used by a future version of the Objective-C
    runtime and should therefore still be considered experimental at this time.
    Although this convention was created to optimize certain runtime calls to
    the Objective-C runtime, it is not limited to this runtime and might be used
    by other runtimes in the future too. The current implementation only
    supports X86-64, but the intention is to support more architectures in the
    future.
394"``preserve_allcc``" - The `PreserveAll` calling convention
395    This calling convention attempts to make the code in the caller even less
396    intrusive than the `PreserveMost` calling convention. This calling
    convention also behaves identically to the `C` calling convention on how
398    arguments and return values are passed, but it uses a different set of
399    caller/callee-saved registers. This removes the burden of saving and
400    recovering a large register set before and after the call in the caller. If
401    the arguments are passed in callee-saved registers, then they will be
402    preserved by the callee across the call. This doesn't apply for values
403    returned in callee-saved registers.
404
405    - On X86-64 the callee preserves all general purpose registers, except for
406      R11. R11 can be used as a scratch register. Furthermore it also preserves
407      all floating-point registers (XMMs/YMMs).
408
409    The idea behind this convention is to support calls to runtime functions
410    that don't need to call out to any other functions.
411
    This calling convention, like the `PreserveMost` calling convention, will be
    used by a future version of the Objective-C runtime and should be considered
    experimental at this time.
415"``cxx_fast_tlscc``" - The `CXX_FAST_TLS` calling convention for access functions
    Clang generates an access function to access C++-style TLS. The access
    function generally has an entry block, an exit block and an initialization
    block that is run the first time. The entry and exit blocks can access
    a few TLS IR variables; each access will be lowered to a platform-specific
    sequence.
421
422    This calling convention aims to minimize overhead in the caller by
423    preserving as many registers as possible (all the registers that are
    preserved on the fast path, composed of the entry and exit blocks).
425
    This calling convention behaves identically to the `C` calling convention on
427    how arguments and return values are passed, but it uses a different set of
428    caller/callee-saved registers.
429
430    Given that each platform has its own lowering sequence, hence its own set
431    of preserved registers, we can't use the existing `PreserveMost`.
432
433    - On X86-64 the callee preserves all general purpose registers, except for
434      RDI and RAX.
"``swiftcc``" - This calling convention is used for the Swift language.
436    - On X86-64 RCX and R8 are available for additional integer returns, and
437      XMM2 and XMM3 are available for additional FP/vector returns.
438    - On iOS platforms, we use AAPCS-VFP calling convention.
439"``cc <n>``" - Numbered convention
440    Any calling convention may be specified by number, allowing
441    target-specific calling conventions to be used. Target specific
442    calling conventions start at 64.
443
444More calling conventions can be added/defined on an as-needed basis, to
445support Pascal conventions or any other well-known target-independent
446convention.
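
The sketch below shows where a calling convention appears on definitions,
declarations and call sites. The function names are hypothetical, and whether
a given backend accepts each convention depends on the target:

.. code-block:: llvm

    define fastcc i32 @add_one(i32 %x) {
      %r = add i32 %x, 1
      ret i32 %r
    }

    define i32 @caller(i32 %x) {
      ; The call site names the same convention as the callee; a
      ; mismatch between dynamic caller and callee is undefined behavior.
      %r = call fastcc i32 @add_one(i32 %x)
      ret i32 %r
    }

    declare coldcc void @report_error(i32)
    declare preserve_mostcc void @runtime_slow_path(i8*)

    ; A convention given by number; 10 is the GHC convention listed above.
    declare cc 10 void @ghc_style(i64)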
447
448.. _visibilitystyles:
449
450Visibility Styles
451-----------------
452
453All Global Variables and Functions have one of the following visibility
454styles:
455
456"``default``" - Default style
457    On targets that use the ELF object file format, default visibility
458    means that the declaration is visible to other modules and, in
459    shared libraries, means that the declared entity may be overridden.
460    On Darwin, default visibility means that the declaration is visible
461    to other modules. Default visibility corresponds to "external
462    linkage" in the language.
463"``hidden``" - Hidden style
464    Two declarations of an object with hidden visibility refer to the
465    same object if they are in the same shared object. Usually, hidden
466    visibility indicates that the symbol will not be placed into the
467    dynamic symbol table, so no other module (executable or shared
468    library) can reference it directly.
469"``protected``" - Protected style
470    On ELF, protected visibility indicates that the symbol will be
471    placed in the dynamic symbol table, but that references within the
472    defining module will bind to the local symbol. That is, the symbol
473    cannot be overridden by another module.
474
475A symbol with ``internal`` or ``private`` linkage must have ``default``
476visibility.
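
As a brief sketch with made-up symbol names, the visibility keyword follows
the linkage (and any preemption specifier) in a global's definition:

.. code-block:: llvm

    @api_value = default global i32 0     ; same as writing no visibility
    @impl_detail = hidden global i32 0    ; kept out of the dynamic symbol table

    define protected i32 @bound_locally() {
      ret i32 0
    }

    ; internal and private symbols keep default visibility.
    @file_local = internal global i32 0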
477
478.. _dllstorageclass:
479
480DLL Storage Classes
481-------------------
482
All Global Variables, Functions and Aliases can have one of the following
DLL storage classes:
485
486``dllimport``
487    "``dllimport``" causes the compiler to reference a function or variable via
488    a global pointer to a pointer that is set up by the DLL exporting the
489    symbol. On Microsoft Windows targets, the pointer name is formed by
490    combining ``__imp_`` and the function or variable name.
491``dllexport``
492    "``dllexport``" causes the compiler to provide a global pointer to a pointer
493    in a DLL, so that it can be referenced with the ``dllimport`` attribute. On
494    Microsoft Windows targets, the pointer name is formed by combining
495    ``__imp_`` and the function or variable name. Since this storage class
496    exists for defining a dll interface, the compiler, assembler and linker know
497    it is externally referenced and must refrain from deleting the symbol.
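
A minimal sketch of both storage classes, using hypothetical symbol names, is
shown here. It only illustrates placement of the keywords and says nothing
about the code a particular Windows target will generate:

.. code-block:: llvm

    ; Symbols provided by some other DLL.
    declare dllimport i32 @imported_fn(i32)
    @imported_var = external dllimport global i32

    ; Symbols this module makes available to DLL clients.
    @exported_var = dllexport global i32 0

    define dllexport void @exported_fn() {
      ret void
    }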
498
499.. _tls_model:
500
501Thread Local Storage Models
502---------------------------
503
A variable may be defined as ``thread_local``, which means that it will
not be shared by threads (each thread will have a separate copy of the
variable). Not all targets support thread-local variables. Optionally, a
TLS model may be specified:
508
509``localdynamic``
510    For variables that are only used within the current shared library.
511``initialexec``
512    For variables in modules that will not be loaded dynamically.
513``localexec``
514    For variables defined in the executable and only used within it.
515
516If no explicit model is given, the "general dynamic" model is used.
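
The following sketch lists one thread-local variable per model. The variable
names are hypothetical:

.. code-block:: llvm

    @tls_default = thread_local global i32 0                 ; general dynamic
    @tls_ld = thread_local(localdynamic) global i32 0
    @tls_ie = thread_local(initialexec) global i32 0
    @tls_le = thread_local(localexec) global i32 0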
517
518The models correspond to the ELF TLS models; see `ELF Handling For
519Thread-Local Storage <http://people.redhat.com/drepper/tls.pdf>`_ for
520more information on under which circumstances the different models may
521be used. The target may choose a different TLS model if the specified
522model is not supported, or if a better choice of model can be made.
523
524A model can also be specified in an alias, but then it only governs how
525the alias is accessed. It will not have any effect in the aliasee.
526
For platforms without linker support for the ELF TLS model, the
``-femulated-tls`` flag can be used to generate GCC-compatible emulated TLS
code.
529
530.. _runtime_preemption_model:
531
532Runtime Preemption Specifiers
533-----------------------------
534
535Global variables, functions and aliases may have an optional runtime preemption
536specifier. If a preemption specifier isn't given explicitly, then a
537symbol is assumed to be ``dso_preemptable``.
538
539``dso_preemptable``
540    Indicates that the function or variable may be replaced by a symbol from
541    outside the linkage unit at runtime.
542
543``dso_local``
544    The compiler may assume that a function or variable marked as ``dso_local``
545    will resolve to a symbol within the same linkage unit. Direct access will
546    be generated even if the definition is not within this compilation unit.
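
As a small sketch with hypothetical names, the specifier sits between the
linkage and the visibility of a global, or directly after ``define`` or
``declare`` when no linkage is written:

.. code-block:: llvm

    @config = dso_local global i32 0
    @interposable = dso_preemptable global i32 0   ; same as the default

    ; A declaration may be dso_local too: the definition is assumed to
    ; live in the same linkage unit.
    declare dso_local i32 @helper(i32)

    define dso_local i32 @read_config() {
      %v = load i32, i32* @config
      ret i32 %v
    }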
547
548.. _namedtypes:
549
550Structure Types
551---------------
552
553LLVM IR allows you to specify both "identified" and "literal" :ref:`structure
554types <t_struct>`. Literal types are uniqued structurally, but identified types
555are never uniqued. An :ref:`opaque structural type <t_opaque>` can also be used
556to forward declare a type that is not yet available.
557
558An example of an identified structure specification is:
559
560.. code-block:: llvm
561
562    %mytype = type { %mytype*, i32 }
563
564Prior to the LLVM 3.0 release, identified types were structurally uniqued. Only
565literal types are uniqued in recent versions of LLVM.
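
To make the distinction concrete, the sketch below declares an opaque type,
two identified types and one literal type. The type and variable names are
invented for this example:

.. code-block:: llvm

    %handle = type opaque                    ; forward declaration, no body yet
    %point = type { i32, i32 }               ; identified structure type
    %node = type { %node*, %handle*, i32 }   ; may refer to itself by name

    @origin = global %point zeroinitializer

    ; A literal structure type is written inline. It has the same layout
    ; as %point but is a distinct type.
    @anon = global { i32, i32 } { i32 1, i32 2 }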
566
567.. _nointptrtype:
568
569Non-Integral Pointer Type
570-------------------------
571
572Note: non-integral pointer types are a work in progress, and they should be
573considered experimental at this time.
574
575LLVM IR optionally allows the frontend to denote pointers in certain address
576spaces as "non-integral" via the :ref:`datalayout string<langref_datalayout>`.
577Non-integral pointer types represent pointers that have an *unspecified* bitwise
578representation; that is, the integral representation may be target dependent or
579unstable (not backed by a fixed integer).
580
581``inttoptr`` instructions converting integers to non-integral pointer types are
582ill-typed, and so are ``ptrtoint`` instructions converting values of
583non-integral pointer types to integers.  Vector versions of said instructions
584are ill-typed as well.
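
As a hedged sketch, the module below assumes that address space 1 is the one
the frontend wants to treat as non-integral and marks it with an ``ni``
component in the datalayout string. The function name is hypothetical, and the
rejected instructions are shown only as comments:

.. code-block:: llvm

    target datalayout = "ni:1"

    define i8 addrspace(1)* @advance(i8 addrspace(1)* %p) {
      ; Address computations on the pointer are still allowed.
      %q = getelementptr i8, i8 addrspace(1)* %p, i64 4
      ret i8 addrspace(1)* %q
    }

    ; Ill-typed for a non-integral address space:
    ;   %i = ptrtoint i8 addrspace(1)* %p to i64
    ;   %r = inttoptr i64 %i to i8 addrspace(1)*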
585
586.. _globalvars:
587
588Global Variables
589----------------
590
591Global variables define regions of memory allocated at compilation time
592instead of run-time.
593
594Global variable definitions must be initialized.
595
596Global variables in other translation units can also be declared, in which
597case they don't have an initializer.
598
599Either global variable definitions or declarations may have an explicit section
600to be placed in and may have an optional explicit alignment specified. If there
601is a mismatch between the explicit or inferred section information for the
602variable declaration and its definition the resulting behavior is undefined.
603
604A variable may be defined as a global ``constant``, which indicates that
605the contents of the variable will **never** be modified (enabling better
606optimization, allowing the global data to be placed in the read-only
607section of an executable, etc). Note that variables that need runtime
608initialization cannot be marked ``constant`` as there is a store to the
609variable.
610
611LLVM explicitly allows *declarations* of global variables to be marked
612constant, even if the final definition of the global is not. This
613capability can be used to enable slightly better optimization of the
614program, but requires the language definition to guarantee that
615optimizations based on the 'constantness' are valid for the translation
616units that do not include the definition.
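
The difference between definitions, declarations and ``constant`` globals can
be sketched like this, with hypothetical names:

.. code-block:: llvm

    @defined_here = global i32 42              ; definition: has an initializer
    @defined_elsewhere = external global i32   ; declaration: no initializer
    @limit = constant i32 100                  ; contents never modified

    ; A declaration may be marked constant even if the defining module
    ; does not mark it so; the frontend must guarantee this is valid.
    @table = external constant [4 x i32]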
617
618As SSA values, global variables define pointer values that are in scope
619(i.e. they dominate) all basic blocks in the program. Global variables
620always define a pointer to their "content" type because they describe a
621region of memory, and all memory objects in LLVM are accessed through
622pointers.
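
Put differently, the name of a global of content type ``i32`` is itself a
value of type ``i32*``. The sketch below, with hypothetical names, stores that
address in another global and accesses the memory through it:

.. code-block:: llvm

    @g = global i32 7
    @g_addr = global i32* @g     ; @g is a constant of type i32*

    define i32 @read() {
      %v = load i32, i32* @g
      ret i32 %v
    }

    define void @write(i32 %v) {
      store i32 %v, i32* @g
      ret void
    }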
623
624Global variables can be marked with ``unnamed_addr`` which indicates
625that the address is not significant, only the content. Constants marked
626like this can be merged with other constants if they have the same
627initializer. Note that a constant with significant address *can* be
628merged with a ``unnamed_addr`` constant, the result being a constant
629whose address is significant.
630
631If the ``local_unnamed_addr`` attribute is given, the address is known to
632not be significant within the module.
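
For instance, in the sketch below (names are illustrative) ``@a`` and ``@b``
are candidates for merging because only their contents matter. Whether they
are actually merged is up to the optimizer:

.. code-block:: llvm

    @a = private unnamed_addr constant [4 x i8] c"abc\00"
    @b = private unnamed_addr constant [4 x i8] c"abc\00"

    ; The address is only known to be insignificant within this module.
    @c = local_unnamed_addr constant i32 1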
633
634A global variable may be declared to reside in a target-specific
635numbered address space. For targets that support them, address spaces
636may affect how optimizations are performed and/or what target
637instructions are used to access the variable. The default address space
638is zero. The address space qualifier must precede any other attributes.
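
A short sketch of a global in a numbered address space follows. The address
space number 5 and the names are arbitrary choices for illustration, and what
the address space means is target-specific:

.. code-block:: llvm

    @in_as5 = addrspace(5) global float 1.0

    define float @read_as5() {
      ; The pointer type carries the address space of the global.
      %v = load float, float addrspace(5)* @in_as5
      ret float %v
    }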
639
640LLVM allows an explicit section to be specified for globals. If the
641target supports it, it will emit globals to the section specified.
Additionally, the global can be placed in a comdat if the target has the necessary
support.
644
645External declarations may have an explicit section specified. Section
646information is retained in LLVM IR for targets that make use of this
647information. Attaching section information to an external declaration is an
648assertion that its definition is located in the specified section. If the
649definition is located in a different section, the behavior is undefined.
650
651By default, global initializers are optimized by assuming that global
652variables defined within the module are not modified from their
653initial values before the start of the global initializer. This is
654true even for variables potentially accessible from outside the
655module, including those with external linkage or appearing in
656``@llvm.used`` or dllexported variables. This assumption may be suppressed
657by marking the variable with ``externally_initialized``.
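
For illustration, a hypothetical global ``@device_table`` whose contents may
be changed from outside the module before the module's global initializers
run could be declared as follows. This is only a sketch; the name and type
are assumptions, not taken from any real program:

.. code-block:: llvm

    ; Optimizers may not assume @device_table still holds zeroinitializer.
    @device_table = externally_initialized global [4 x i32] zeroinitializer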
658
An explicit alignment may be specified for a global, which must be a
power of 2. If not present, or if the alignment is set to zero, the
alignment of the global is set by the target to whatever it finds
convenient. If an explicit alignment is specified, the global is forced
to have exactly that alignment. Targets and optimizers are not allowed
to over-align the global if the global has an assigned section. In this
case, the extra alignment could be observable: for example, code could
assume that the globals are densely packed in their section and try to
iterate over them as an array, and alignment padding would break this
iteration. The maximum alignment is ``1 << 29``.
669
Globals can also have a :ref:`DLL storage class <dllstorageclass>`,
an optional :ref:`runtime preemption specifier <runtime_preemption_model>`,
optional :ref:`global attributes <glattrs>`, and
an optional list of attached :ref:`metadata <metadata>`.
674
675Variables and aliases can have a
676:ref:`Thread Local Storage Model <tls_model>`.
677
Syntax::

      @<GlobalVarName> = [Linkage] [PreemptionSpecifier] [Visibility]
                         [DLLStorageClass] [ThreadLocal]
                         [(unnamed_addr|local_unnamed_addr)] [AddrSpace]
                         [ExternallyInitialized]
                         <global | constant> <Type> [<InitializerConstant>]
                         [, section "name"] [, comdat [($name)]]
                         [, align <Alignment>] (, !name !N)*
687
688For example, the following defines a global in a numbered address space
689with an initializer, section, and alignment:
690
.. code-block:: llvm

    @G = addrspace(5) constant float 1.0, section "foo", align 4
694
The following example just declares a global variable:

.. code-block:: llvm

   @G = external global i32
700
701The following example defines a thread-local global with the
702``initialexec`` TLS model:
703
.. code-block:: llvm

    @G = thread_local(initialexec) global i32 0, align 4
707
708.. _functionstructure:
709
710Functions
711---------
712
LLVM function definitions consist of the "``define``" keyword, an
optional :ref:`linkage type <linkage>`, an optional :ref:`runtime preemption
specifier <runtime_preemption_model>`, an optional :ref:`visibility
style <visibility>`, an optional :ref:`DLL storage class <dllstorageclass>`,
an optional :ref:`calling convention <callingconv>`,
an optional ``unnamed_addr`` or ``local_unnamed_addr`` attribute, a return
type, an optional :ref:`parameter attribute <paramattrs>` for the return type,
a function name, a (possibly empty) argument list (each with optional
:ref:`parameter attributes <paramattrs>`), optional :ref:`function attributes
<fnattrs>`,
an optional section, an optional alignment,
an optional :ref:`comdat <langref_comdats>`,
an optional :ref:`garbage collector name <gc>`, an optional :ref:`prefix <prefixdata>`,
an optional :ref:`prologue <prologuedata>`,
an optional :ref:`personality <personalityfn>`,
an optional list of attached :ref:`metadata <metadata>`,
an opening curly brace, a list of basic blocks, and a closing curly brace.
729
730LLVM function declarations consist of the "``declare``" keyword, an
731optional :ref:`linkage type <linkage>`, an optional :ref:`visibility style
732<visibility>`, an optional :ref:`DLL storage class <dllstorageclass>`, an
733optional :ref:`calling convention <callingconv>`, an optional ``unnamed_addr``
734or ``local_unnamed_addr`` attribute, a return type, an optional :ref:`parameter
735attribute <paramattrs>` for the return type, a function name, a possibly
736empty list of arguments, an optional alignment, an optional :ref:`garbage
737collector name <gc>`, an optional :ref:`prefix <prefixdata>`, and an optional
738:ref:`prologue <prologuedata>`.
739
740A function definition contains a list of basic blocks, forming the CFG (Control
741Flow Graph) for the function. Each basic block may optionally start with a label
742(giving the basic block a symbol table entry), contains a list of instructions,
743and ends with a :ref:`terminator <terminators>` instruction (such as a branch or
744function return). If an explicit label is not provided, a block is assigned an
745implicit numbered label, using the next value from the same counter as used for
746unnamed temporaries (:ref:`see above<identifiers>`). For example, if a function
747entry block does not have an explicit label, it will be assigned label "%0",
748then the first unnamed temporary in that block will be "%1", etc.
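
The numbering rule can be sketched with a small, hypothetical function
(``@add_one`` is an illustrative name). Its entry block has no explicit
label, so the block takes ``%0`` and the first unnamed temporary has to be
``%1``:

.. code-block:: llvm

    define i32 @add_one(i32 %x) {
      ; No explicit label here: the entry block is implicitly %0.
      %1 = add i32 %x, 1
      ret i32 %1
    }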
749
The first basic block in a function is special in two ways: it is
immediately executed on entrance to the function, and it is not allowed
to have predecessor basic blocks (i.e. there cannot be any branches to
the entry block of a function). Because the block can have no
predecessors, it also cannot have any :ref:`PHI nodes <i_phi>`.
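
One practical consequence is that a loop cannot branch back to the entry
block. A common way to express a loop, sketched here with made-up names, is
to give the loop its own header block that holds the PHI node:

.. code-block:: llvm

    define i32 @count_to(i32 %n) {
    entry:
      ; The entry block has no predecessors and therefore no PHI nodes.
      br label %loop

    loop:
      %i = phi i32 [ 0, %entry ], [ %i.next, %loop ]
      %i.next = add i32 %i, 1
      %again = icmp slt i32 %i.next, %n
      br i1 %again, label %loop, label %exit

    exit:
      ret i32 %i.next
    }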
755
756LLVM allows an explicit section to be specified for functions. If the
757target supports it, it will emit functions to the section specified.
758Additionally, the function can be placed in a COMDAT.
759
An explicit alignment may be specified for a function. If not present,
or if the alignment is set to zero, the alignment of the function is set
by the target to whatever it finds convenient. If an explicit alignment
is specified, the function is forced to have at least that much
alignment. All alignments must be a power of 2.
765
766If the ``unnamed_addr`` attribute is given, the address is known to not
767be significant and two identical functions can be merged.
768
769If the ``local_unnamed_addr`` attribute is given, the address is known to
770not be significant within the module.
771
Syntax::

    define [linkage] [PreemptionSpecifier] [visibility] [DLLStorageClass]
           [cconv] [ret attrs]
           <ResultType> @<FunctionName> ([argument list])
           [(unnamed_addr|local_unnamed_addr)] [fn Attrs] [section "name"]
           [comdat [($name)]] [align N] [gc] [prefix Constant]
           [prologue Constant] [personality Constant] (!name !N)* { ... }
780
781The argument list is a comma separated sequence of arguments where each
782argument is of the following form:
783
Syntax::

   <type> [parameter Attrs] [name]
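
As a sketch of how several of the optional pieces combine, the hypothetical
definition below uses a linkage type, a calling convention, a return
attribute, ``unnamed_addr``, a function attribute, a section, and an
alignment, in the order given by the syntax above. The function name and the
section name are illustrative assumptions:

.. code-block:: llvm

    define internal fastcc zeroext i8 @low_byte(i32 %v)
        unnamed_addr nounwind section ".text.hot" align 16 {
      %t = trunc i32 %v to i8
      ret i8 %t
    }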
787
788
789.. _langref_aliases:
790
791Aliases
792-------
793
Aliases, unlike functions or variables, don't create any new data. They
are just a new symbol and metadata for an existing position.
796
797Aliases have a name and an aliasee that is either a global value or a
798constant expression.
799
800Aliases may have an optional :ref:`linkage type <linkage>`, an optional
801:ref:`runtime preemption specifier <runtime_preemption_model>`, an optional
802:ref:`visibility style <visibility>`, an optional :ref:`DLL storage class
803<dllstorageclass>` and an optional :ref:`tls model <tls_model>`.
804
Syntax::

    @<Name> = [Linkage] [PreemptionSpecifier] [Visibility] [DLLStorageClass]
              [ThreadLocal] [(unnamed_addr|local_unnamed_addr)]
              alias <AliaseeTy>, <AliaseeTy>* @<Aliasee>
808
809The linkage must be one of ``private``, ``internal``, ``linkonce``, ``weak``,
810``linkonce_odr``, ``weak_odr``, ``external``. Note that some system linkers
811might not correctly handle dropping a weak symbol that is aliased.
812
813Aliases that are not ``unnamed_addr`` are guaranteed to have the same address as
814the aliasee expression. ``unnamed_addr`` ones are only guaranteed to point
815to the same content.
816
817If the ``local_unnamed_addr`` attribute is given, the address is known to
818not be significant within the module.
819
820Since aliases are only a second name, some restrictions apply, of which
821some can only be checked when producing an object file:
822
823* The expression defining the aliasee must be computable at assembly
824  time. Since it is just a name, no relocations can be used.
825
826* No alias in the expression can be weak as the possibility of the
827  intermediate alias being overridden cannot be represented in an
828  object file.
829
830* No global value in the expression can be a declaration, since that
831  would require a relocation, which is not possible.
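
The following sketch shows an alias to a global variable and an alias to a
function. All names are hypothetical. Both aliasees are definitions in the
same module, in line with the restrictions listed above:

.. code-block:: llvm

    @counter = global i32 0

    define void @bump() {
      %v = load i32, i32* @counter
      %v.next = add i32 %v, 1
      store i32 %v.next, i32* @counter
      ret void
    }

    ; A second name for the variable, with the same address as @counter.
    @counter_alias = alias i32, i32* @counter

    ; A weak, hidden alias for the function.
    @bump_alias = weak hidden alias void (), void ()* @bump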
832
833.. _langref_ifunc:
834
835IFuncs
836-------
837
IFuncs, like aliases, don't create any new data or functions. They are just a
new symbol that the dynamic linker resolves at runtime by calling a resolver
function.

IFuncs have a name and a resolver, which is a function called by the dynamic
linker that returns the address of another function associated with the name.

IFuncs may have an optional :ref:`linkage type <linkage>` and an optional
:ref:`visibility style <visibility>`.
846
Syntax::

    @<Name> = [Linkage] [Visibility] ifunc <IFuncTy>, <ResolverTy>* @<Resolver>
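
A minimal sketch of an IFunc and its resolver might look like the following.
The names ``@impl``, ``@impl_generic`` and ``@resolve_impl`` are assumptions
made for this example; a real resolver would typically pick between several
implementations based on the CPU it is running on:

.. code-block:: llvm

    @impl = ifunc i32 (i32), i32 (i32)* ()* @resolve_impl

    define internal i32 @impl_generic(i32 %x) {
      ret i32 %x
    }

    ; Called by the dynamic linker; returns the function to bind @impl to.
    define internal i32 (i32)* @resolve_impl() {
      ret i32 (i32)* @impl_generic
    }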
850
851
852.. _langref_comdats:
853
854Comdats
855-------
856
857Comdat IR provides access to COFF and ELF object file COMDAT functionality.
858
859Comdats have a name which represents the COMDAT key. All global objects that
860specify this key will only end up in the final object file if the linker chooses
861that key over some other key. Aliases are placed in the same COMDAT that their
862aliasee computes to, if any.
863
864Comdats have a selection kind to provide input on how the linker should
865choose between keys in two different object files.
866
Syntax::

    $<Name> = comdat SelectionKind
870
871The selection kind must be one of the following:
872
``any``
    The linker may choose any COMDAT key; the choice is arbitrary.
``exactmatch``
    The linker may choose any COMDAT key but the sections must contain the
    same data.
``largest``
    The linker will choose the section containing the largest COMDAT key.
``noduplicates``
    The linker requires that only one section with this COMDAT key exists.
``samesize``
    The linker may choose any COMDAT key but the sections must contain the
    same amount of data.
885
886Note that the Mach-O platform doesn't support COMDATs, and ELF and WebAssembly
887only support ``any`` as a selection kind.
888
889Here is an example of a COMDAT group where a function will only be selected if
890the COMDAT key's section is the largest:
891
.. code-block:: text

   $foo = comdat largest
   @foo = global i32 2, comdat($foo)

   define void @bar() comdat($foo) {
     ret void
   }
900
As syntactic sugar, the ``$name`` can be omitted if the name is the same as
the global name:

.. code-block:: text

  $foo = comdat any
  @foo = global i32 2, comdat
908
909
In a COFF object file, the first example above (the one using ``largest``)
will create a COMDAT section with selection kind
``IMAGE_COMDAT_SELECT_LARGEST`` containing the contents of the ``@foo`` symbol
and another COMDAT section with selection kind
``IMAGE_COMDAT_SELECT_ASSOCIATIVE`` which is associated with the first COMDAT
section and contains the contents of the ``@bar`` symbol.
915
916There are some restrictions on the properties of the global object.
917It, or an alias to it, must have the same name as the COMDAT group when
918targeting COFF.
919The contents and size of this object may be used during link-time to determine
920which COMDAT groups get selected depending on the selection kind.
921Because the name of the object must match the name of the COMDAT group, the
922linkage of the global object must not be local; local symbols can get renamed
923if a collision occurs in the symbol table.
924
The combined use of COMDATs and section attributes may yield surprising results.
For example:
927
.. code-block:: text

   $foo = comdat any
   $bar = comdat any
   @g1 = global i32 42, section "sec", comdat($foo)
   @g2 = global i32 42, section "sec", comdat($bar)
934
935From the object file perspective, this requires the creation of two sections
936with the same name. This is necessary because both globals belong to different
937COMDAT groups and COMDATs, at the object file level, are represented by
938sections.
939
Note that certain IR constructs like global variables and functions may
create COMDATs in the object file in addition to any which are specified using
COMDAT IR. This arises when the code generator is configured to emit globals
in individual sections (e.g. when ``-data-sections`` or ``-function-sections``
is supplied to ``llc``).
945
946.. _namedmetadatastructure:
947
948Named Metadata
949--------------
950
951Named metadata is a collection of metadata. :ref:`Metadata
952nodes <metadata>` (but not metadata strings) are the only valid
953operands for a named metadata.
954
955#. Named metadata are represented as a string of characters with the
956   metadata prefix. The rules for metadata names are the same as for
957   identifiers, but quoted names are not allowed. ``"\xx"`` type escapes
958   are still valid, which allows any character to be part of a name.
959
Syntax::

    ; Some unnamed metadata nodes, which are referenced by the named metadata.
    !0 = !{!"zero"}
    !1 = !{!"one"}
    !2 = !{!"two"}
    ; A named metadata.
    !name = !{!0, !1, !2}
968
969.. _paramattrs:
970
971Parameter Attributes
972--------------------
973
974The return type and each parameter of a function type may have a set of
975*parameter attributes* associated with them. Parameter attributes are
976used to communicate additional information about the result or
977parameters of a function. Parameter attributes are considered to be part
978of the function, not of the function type, so functions with different
979parameter attributes can have the same function type.
980
981Parameter attributes are simple keywords that follow the type specified.
982If multiple parameter attributes are needed, they are space separated.
983For example:
984
.. code-block:: llvm

    declare i32 @printf(i8* noalias nocapture, ...)
    declare i32 @atoi(i8 zeroext)
    declare signext i8 @returns_signed_char()
990
Note that any attributes for the function itself (``nounwind``,
``readonly``) come immediately after the argument list.
993
994Currently, only the following parameter attributes are defined:
995
996``zeroext``
997    This indicates to the code generator that the parameter or return
998    value should be zero-extended to the extent required by the target's
999    ABI by the caller (for a parameter) or the callee (for a return value).
``signext``
    This indicates to the code generator that the parameter or return
    value should be sign-extended to the extent required by the target's
    ABI (which is usually 32 bits) by the caller (for a parameter) or
    the callee (for a return value).
1005``inreg``
1006    This indicates that this parameter or return value should be treated
1007    in a special target-dependent fashion while emitting code for
1008    a function call or return (usually, by putting it in a register as
1009    opposed to memory, though some targets use it to distinguish between
1010    two different kinds of registers). Use of this attribute is
1011    target-specific.
``byval``
    This indicates that the pointer parameter should really be passed by
    value to the function. The attribute implies that a hidden copy of
    the pointee is made between the caller and the callee, so the callee
    is unable to modify the value in the caller. This attribute is only
    valid on LLVM pointer arguments. It is generally used to pass
    structs and arrays by value, but is also valid on pointers to
    scalars. The copy is considered to belong to the caller, not the
    callee (for example, ``readonly`` functions should not write to
    ``byval`` parameters). This is not a valid attribute for return
    values.

    The ``byval`` attribute also supports specifying an alignment with the
    ``align`` attribute. It indicates the alignment of the stack slot to
    form and the known alignment of the pointer specified to the call
    site. If the alignment is not specified, then the code generator
    makes a target-specific assumption.
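
    As a small sketch (the type ``%struct.Point`` and the function names
    are hypothetical), a struct can be passed by value with an explicit
    stack slot alignment like this:

    .. code-block:: llvm

        %struct.Point = type { i32, i32 }

        declare void @draw(%struct.Point* byval align 4)

        define void @caller(%struct.Point* %p) {
          ; @draw works on a hidden copy; it cannot modify the caller's %p.
          call void @draw(%struct.Point* byval align 4 %p)
          ret void
        }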
1029
1030.. _attr_inalloca:
1031
1032``inalloca``
1033
    The ``inalloca`` argument attribute allows the caller to take the
    address of outgoing stack arguments. An ``inalloca`` argument must
    be a pointer to stack memory produced by an ``alloca`` instruction.
    The alloca, or argument allocation, must also be tagged with the
    ``inalloca`` keyword. Only the last argument may have the ``inalloca``
    attribute, and that argument is guaranteed to be passed in memory.
1040
1041    An argument allocation may be used by a call at most once because
1042    the call may deallocate it. The ``inalloca`` attribute cannot be
1043    used in conjunction with other attributes that affect argument
1044    storage, like ``inreg``, ``nest``, ``sret``, or ``byval``. The
1045    ``inalloca`` attribute also disables LLVM's implicit lowering of
1046    large aggregate return values, which means that frontend authors
1047    must lower them with ``sret`` pointers.
1048
1049    When the call site is reached, the argument allocation must have
1050    been the most recent stack allocation that is still live, or the
1051    behavior is undefined. It is possible to allocate additional stack
1052    space after an argument allocation and before its call site, but it
1053    must be cleared off with :ref:`llvm.stackrestore
1054    <int_stackrestore>`.
1055
1056    See :doc:`InAlloca` for more information on how to use this
1057    attribute.
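
    The rules above can be sketched as follows. The type ``%args`` and the
    function names are assumptions made for this example; the stack is
    saved before the argument allocation and restored after the call:

    .. code-block:: llvm

        %args = type <{ i32 }>

        declare void @callee(%args* inalloca)
        declare i8* @llvm.stacksave()
        declare void @llvm.stackrestore(i8*)

        define void @caller() {
          %sp = call i8* @llvm.stacksave()
          ; The argument allocation must itself be tagged inalloca.
          %mem = alloca inalloca %args
          %slot = getelementptr %args, %args* %mem, i32 0, i32 0
          store i32 42, i32* %slot
          ; %mem is the most recent live stack allocation at this point.
          call void @callee(%args* inalloca %mem)
          call void @llvm.stackrestore(i8* %sp)
          ret void
        }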
1058
1059``sret``
1060    This indicates that the pointer parameter specifies the address of a
1061    structure that is the return value of the function in the source
1062    program. This pointer must be guaranteed by the caller to be valid:
1063    loads and stores to the structure may be assumed by the callee not
1064    to trap and to be properly aligned. This is not a valid attribute
1065    for return values.
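
    For illustration, a hypothetical function that returns a large struct
    through a caller-provided slot might be declared and called like this
    (``%struct.Big`` and the function names are made up):

    .. code-block:: llvm

        %struct.Big = type { [4 x i64] }

        declare void @make_big(%struct.Big* sret)

        define void @use() {
          ; The caller provides valid, properly aligned storage.
          %out = alloca %struct.Big
          call void @make_big(%struct.Big* sret %out)
          ret void
        }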
1066
1067.. _attr_align:
1068
1069``align <n>``
1070    This indicates that the pointer value may be assumed by the optimizer to
1071    have the specified alignment.
1072
1073    Note that this attribute has additional semantics when combined with the
1074    ``byval`` attribute.
1075
1076.. _noalias:
1077
1078``noalias``
1079    This indicates that objects accessed via pointer values
1080    :ref:`based <pointeraliasing>` on the argument or return value are not also
1081    accessed, during the execution of the function, via pointer values not
1082    *based* on the argument or return value. The attribute on a return value
1083    also has additional semantics described below. The caller shares the
1084    responsibility with the callee for ensuring that these requirements are met.
1085    For further details, please see the discussion of the NoAlias response in
1086    :ref:`alias analysis <Must, May, or No>`.
1087
1088    Note that this definition of ``noalias`` is intentionally similar
1089    to the definition of ``restrict`` in C99 for function arguments.
1090
1091    For function return values, C99's ``restrict`` is not meaningful,
1092    while LLVM's ``noalias`` is. Furthermore, the semantics of the ``noalias``
1093    attribute on return values are stronger than the semantics of the attribute
1094    when used on function arguments. On function return values, the ``noalias``
1095    attribute indicates that the function acts like a system memory allocation
1096    function, returning a pointer to allocated storage disjoint from the
1097    storage for any other object accessible to the caller.
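
    A brief sketch of both uses, with hypothetical function names: on the
    arguments of ``@copy_one``, ``noalias`` lets the optimizer assume that
    the store through ``%dst`` does not change the memory read through
    ``%src``; on the return value of ``@my_alloc``, it states that the
    function behaves like an allocator.

    .. code-block:: llvm

        declare noalias i8* @my_alloc(i64)

        define void @copy_one(i32* noalias %dst, i32* noalias %src) {
          %v = load i32, i32* %src
          store i32 %v, i32* %dst
          ret void
        }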
1098
1099``nocapture``
1100    This indicates that the callee does not make any copies of the
1101    pointer that outlive the callee itself. This is not a valid
1102    attribute for return values.  Addresses used in volatile operations
1103    are considered to be captured.
1104
1105.. _nest:
1106
1107``nest``
1108    This indicates that the pointer parameter can be excised using the
1109    :ref:`trampoline intrinsics <int_trampoline>`. This is not a valid
1110    attribute for return values and can only be applied to one parameter.
1111
1112``returned``
1113    This indicates that the function always returns the argument as its return
1114    value. This is a hint to the optimizer and code generator used when
1115    generating the caller, allowing value propagation, tail call optimization,
1116    and omission of register saves and restores in some cases; it is not
1117    checked or enforced when generating the callee. The parameter and the
1118    function return type must be valid operands for the
1119    :ref:`bitcast instruction <i_bitcast>`. This is not a valid attribute for
1120    return values and can only be applied to one parameter.
1121
1122``nonnull``
1123    This indicates that the parameter or return pointer is not null. This
1124    attribute may only be applied to pointer typed parameters. This is not
1125    checked or enforced by LLVM; if the parameter or return pointer is null,
1126    the behavior is undefined.
1127
``dereferenceable(<n>)``
    This indicates that the parameter or return pointer is dereferenceable. This
    attribute may only be applied to pointer typed parameters. A pointer that
    is dereferenceable can be loaded from speculatively without a risk of
    trapping. The number of bytes known to be dereferenceable must be provided
    in parentheses. It is legal for the number of bytes to be less than the
    size of the pointee type. The ``nonnull`` attribute does not imply
    dereferenceability (consider a pointer to one element past the end of an
    array); however, ``dereferenceable(<n>)`` does imply ``nonnull`` in
    ``addrspace(0)`` (which is the default address space).
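
    As a sketch of what speculative loading means in practice, consider the
    hypothetical function below. Because ``%p`` is known to be
    dereferenceable for 4 bytes, an optimizer may choose to move the load
    out of ``%then`` and execute it unconditionally, since doing so cannot
    trap:

    .. code-block:: llvm

        define i32 @read_if(i32* dereferenceable(4) %p, i1 %c) {
        entry:
          br i1 %c, label %then, label %done

        then:
          %v = load i32, i32* %p
          br label %done

        done:
          %r = phi i32 [ %v, %then ], [ 0, %entry ]
          ret i32 %r
        }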
1138
1139``dereferenceable_or_null(<n>)``
1140    This indicates that the parameter or return value isn't both
1141    non-null and non-dereferenceable (up to ``<n>`` bytes) at the same
1142    time. All non-null pointers tagged with
1143    ``dereferenceable_or_null(<n>)`` are ``dereferenceable(<n>)``.
1144    For address space 0 ``dereferenceable_or_null(<n>)`` implies that
1145    a pointer is exactly one of ``dereferenceable(<n>)`` or ``null``,
1146    and in other address spaces ``dereferenceable_or_null(<n>)``
1147    implies that a pointer is at least one of ``dereferenceable(<n>)``
1148    or ``null`` (i.e. it may be both ``null`` and
1149    ``dereferenceable(<n>)``). This attribute may only be applied to
1150    pointer typed parameters.
1151
1152``swiftself``
1153    This indicates that the parameter is the self/context parameter. This is not
1154    a valid attribute for return values and can only be applied to one
1155    parameter.
1156
``swifterror``
    This attribute exists to model and optimize Swift error handling. It
    can be applied to a parameter with pointer to pointer type or a
    pointer-sized alloca. At the call site, the actual argument that corresponds
    to a ``swifterror`` parameter has to come from a ``swifterror`` alloca or
    the ``swifterror`` parameter of the caller. A ``swifterror`` value (either
    the parameter or the alloca) can only be loaded from, stored to, or used as
    a ``swifterror`` argument. This is not a valid attribute for return values
    and can only be applied to one parameter.
1166
1167    These constraints allow the calling convention to optimize access to
1168    ``swifterror`` variables by associating them with a specific register at
1169    call boundaries rather than placing them in memory. Since this does change
1170    the calling convention, a function which uses the ``swifterror`` attribute
1171    on a parameter is not ABI-compatible with one which does not.
1172
1173    These constraints also allow LLVM to assume that a ``swifterror`` argument
1174    does not alias any other memory visible within a function and that a
1175    ``swifterror`` alloca passed as an argument does not escape.
1176
1177.. _gc:
1178
1179Garbage Collector Strategy Names
1180--------------------------------
1181
1182Each function may specify a garbage collector strategy name, which is simply a
1183string:
1184
.. code-block:: llvm

    define void @f() gc "name" { ... }
1188
The supported values of *name* include those :ref:`built in to LLVM
<builtin-gc-strategies>` and any provided by loaded plugins. Specifying a GC
strategy will cause the compiler to alter its output in order to support the
named garbage collection algorithm. Note that LLVM itself does not contain a
garbage collector; this functionality is restricted to generating machine code
which can interoperate with a collector provided externally.
1195
1196.. _prefixdata:
1197
1198Prefix Data
1199-----------
1200
1201Prefix data is data associated with a function which the code
1202generator will emit immediately before the function's entrypoint.
1203The purpose of this feature is to allow frontends to associate
1204language-specific runtime metadata with specific functions and make it
1205available through the function pointer while still allowing the
1206function pointer to be called.
1207
To access the data for a given function, a program may bitcast the
function pointer to a pointer to the constant's type and dereference
index -1. This implies that the IR symbol points just past the end of
the prefix data. For instance, take the example of a function annotated
with a single ``i32``:

.. code-block:: llvm

    define void @f() prefix i32 123 { ... }

The prefix data can be referenced as:

.. code-block:: llvm

    %0 = bitcast void ()* @f to i32*
    %a = getelementptr inbounds i32, i32* %0, i32 -1
    %b = load i32, i32* %a
1225
1226Prefix data is laid out as if it were an initializer for a global variable
1227of the prefix data's type. The function will be placed such that the
1228beginning of the prefix data is aligned. This means that if the size
1229of the prefix data is not a multiple of the alignment size, the
1230function's entrypoint will not be aligned. If alignment of the
1231function's entrypoint is desired, padding must be added to the prefix
1232data.
1233
1234A function may have prefix data but no body. This has similar semantics
1235to the ``available_externally`` linkage in that the data may be used by the
1236optimizers but will not be emitted in the object file.
1237
1238.. _prologuedata:
1239
1240Prologue Data
1241-------------
1242
1243The ``prologue`` attribute allows arbitrary code (encoded as bytes) to
1244be inserted prior to the function body. This can be used for enabling
1245function hot-patching and instrumentation.
1246
1247To maintain the semantics of ordinary function calls, the prologue data must
1248have a particular format. Specifically, it must begin with a sequence of
1249bytes which decode to a sequence of machine instructions, valid for the
1250module's target, which transfer control to the point immediately succeeding
1251the prologue data, without performing any other visible action. This allows
1252the inliner and other passes to reason about the semantics of the function
1253definition without needing to reason about the prologue data. Obviously this
1254makes the format of the prologue data highly target dependent.
1255
1256A trivial example of valid prologue data for the x86 architecture is ``i8 144``,
1257which encodes the ``nop`` instruction:
1258
.. code-block:: text

    define void @f() prologue i8 144 { ... }
1262
1263Generally prologue data can be formed by encoding a relative branch instruction
1264which skips the metadata, as in this example of valid prologue data for the
1265x86_64 architecture, where the first two bytes encode ``jmp .+10``:
1266
.. code-block:: text

    %0 = type <{ i8, i8, i8* }>

    define void @f() prologue %0 <{ i8 235, i8 8, i8* @md}> { ... }
1272
1273A function may have prologue data but no body. This has similar semantics
1274to the ``available_externally`` linkage in that the data may be used by the
1275optimizers but will not be emitted in the object file.
1276
1277.. _personalityfn:
1278
1279Personality Function
1280--------------------
1281
1282The ``personality`` attribute permits functions to specify what function
1283to use for exception handling.
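
As an illustrative sketch, a function that uses the Itanium C++ ABI
personality routine ``__gxx_personality_v0`` could be written as shown below.
The callee ``@may_throw`` and the block names are assumptions made for this
example:

.. code-block:: llvm

    declare i32 @__gxx_personality_v0(...)
    declare void @may_throw()

    define void @f() personality i8* bitcast (i32 (...)* @__gxx_personality_v0 to i8*) {
    entry:
      invoke void @may_throw()
              to label %ok unwind label %cleanup

    ok:
      ret void

    cleanup:
      ; Run cleanups, then continue unwinding.
      %lp = landingpad { i8*, i32 }
              cleanup
      resume { i8*, i32 } %lp
    }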
1284
1285.. _attrgrp:
1286
1287Attribute Groups
1288----------------
1289
Attribute groups are groups of attributes that are referenced by objects within
the IR. They are important for keeping ``.ll`` files readable, because a lot of
functions will use the same set of attributes. In the degenerate case of a
``.ll`` file that corresponds to a single ``.c`` file, the single attribute
group will capture the important command line flags used to build that file.
1295
1296An attribute group is a module-level object. To use an attribute group, an
1297object references the attribute group's ID (e.g. ``#37``). An object may refer
1298to more than one attribute group. In that situation, the attributes from the
1299different groups are merged.
1300
1301Here is an example of attribute groups for a function that should always be
1302inlined, has a stack alignment of 4, and which shouldn't use SSE instructions:
1303
.. code-block:: llvm

   ; Target-independent attributes:
   attributes #0 = { alwaysinline alignstack=4 }

   ; Target-dependent attributes:
   attributes #1 = { "no-sse" }

   ; Function @f has attributes: alwaysinline, alignstack=4, and "no-sse".
   define void @f() #0 #1 { ... }
1314
1315.. _fnattrs:
1316
1317Function Attributes
1318-------------------
1319
1320Function attributes are set to communicate additional information about
1321a function. Function attributes are considered to be part of the
1322function, not of the function type, so functions with different function
1323attributes can have the same function type.
1324
Function attributes are simple keywords that follow the argument list.
If multiple attributes are needed, they are space separated. For
example:
1328
.. code-block:: llvm

    define void @f() noinline { ... }
    define void @f() alwaysinline { ... }
    define void @f() alwaysinline optsize { ... }
    define void @f() optsize { ... }
1335
1336``alignstack(<n>)``
1337    This attribute indicates that, when emitting the prologue and
1338    epilogue, the backend should forcibly align the stack pointer.
1339    Specify the desired alignment, which must be a power of two, in
1340    parentheses.
``allocsize(<EltSizeParam>[, <NumEltsParam>])``
    This attribute indicates that the annotated function will always return at
    least a given number of bytes (or null). Its arguments are zero-indexed
    parameter numbers; if one argument is provided, then it's assumed that at
    least ``CallSite.Args[EltSizeParam]`` bytes will be available at the
    returned pointer. If two are provided, then it's assumed that
    ``CallSite.Args[EltSizeParam] * CallSite.Args[NumEltsParam]`` bytes are
    available. The referenced parameters must be integer types. No assumptions
    are made about the contents of the returned block of memory.
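
    The two forms might look like this for a pair of allocator-style
    declarations. The names ``@my_malloc`` and ``@my_calloc`` are
    hypothetical and chosen only for illustration:

    .. code-block:: llvm

        ; One argument: at least %size bytes are available at the returned
        ; pointer (or the result is null).
        declare i8* @my_malloc(i64 %size) allocsize(0)

        ; Two arguments: at least %eltsize * %numelts bytes are available.
        declare i8* @my_calloc(i64 %eltsize, i64 %numelts) allocsize(0, 1)

    In both declarations the indices refer to integer parameters, as the
    attribute requires.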
``alwaysinline``
    This attribute indicates that the inliner should attempt to inline
    this function into callers whenever possible, ignoring any active
    inlining size threshold for this caller.
``builtin``
    This indicates that the callee function at a call site should be
    recognized as a built-in function, even though the function's declaration
    uses the ``nobuiltin`` attribute. This is only valid at call sites for
    direct calls to functions that are declared with the ``nobuiltin``
    attribute.
``cold``
    This attribute indicates that this function is rarely called. When
    computing edge weights, basic blocks post-dominated by a cold
    function call are also considered to be cold; and, thus, given low
    weight.
``convergent``
    In some parallel execution models, there exist operations that cannot be
    made control-dependent on any additional values.  We call such operations
    ``convergent``, and mark them with this attribute.

    The ``convergent`` attribute may appear on functions or call/invoke
    instructions.  When it appears on a function, it indicates that calls to
    this function should not be made control-dependent on additional values.
    For example, the intrinsic ``llvm.nvvm.barrier0`` is ``convergent``, so
    calls to this intrinsic cannot be made control-dependent on additional
    values.

    When it appears on a call/invoke, the ``convergent`` attribute indicates
    that we should treat the call as though we're calling a convergent
    function.  This is particularly useful on indirect calls; without this we
    may treat such calls as though the target is non-convergent.

    The optimizer may remove the ``convergent`` attribute on functions when it
    can prove that the function does not execute any convergent operations.
    Similarly, the optimizer may remove ``convergent`` on calls/invokes when it
    can prove that the call/invoke cannot call a convergent function.
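
    A small sketch of both placements follows. The names ``@my_barrier`` and
    ``@kernel`` are hypothetical and stand in for a real barrier operation
    and its caller:

    .. code-block:: llvm

        ; A barrier-like function marked convergent on its declaration.
        declare void @my_barrier() convergent

        define void @kernel(void ()* %fp) convergent {
          call void @my_barrier()
          ; Indirect call: the call-site attribute asks the optimizer to treat
          ; the unknown target as convergent.
          call void %fp() convergent
          ret void
        }

    Without the call-site attribute, the indirect call through ``%fp`` may be
    treated as non-convergent.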
``inaccessiblememonly``
    This attribute indicates that the function may only access memory that
    is not accessible by the module being compiled. This is a weaker form
    of ``readnone``. If the function reads or writes other memory, the
    behavior is undefined.
``inaccessiblemem_or_argmemonly``
    This attribute indicates that the function may only access memory that is
    either not accessible by the module being compiled, or is pointed to
    by its pointer arguments. This is a weaker form of ``argmemonly``. If the
    function reads or writes other memory, the behavior is undefined.
``inlinehint``
    This attribute indicates that the source code contained a hint that
    inlining this function is desirable (such as the "inline" keyword in
    C/C++). It is just a hint; it imposes no requirements on the
    inliner.
``jumptable``
    This attribute indicates that the function should be added to a
    jump-instruction table at code-generation time, and that all address-taken
    references to this function should be replaced with a reference to the
    appropriate jump-instruction-table function pointer. Note that this creates
    a new pointer for the original function, which means that code that depends
    on function-pointer identity can break. So, any function annotated with
    ``jumptable`` must also be ``unnamed_addr``.
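
    A minimal sketch of a definition that satisfies this pairing, using the
    hypothetical name ``@handler``:

    .. code-block:: llvm

        ; unnamed_addr states that the address is not significant, which is
        ; what makes the jump-table indirection acceptable.
        define void @handler() unnamed_addr jumptable {
          ret void
        }

    Here ``unnamed_addr`` precedes the function attributes in the definition.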
``minsize``
    This attribute suggests that optimization passes and code generator
    passes make choices that keep the code size of this function as small
    as possible and perform optimizations that may sacrifice runtime
    performance in order to minimize the size of the generated code.
``naked``
    This attribute disables prologue / epilogue emission for the
    function. This can have very system-specific consequences.
``no-jump-tables``
    When this attribute is set to true, the jump tables and lookup tables that
    can be generated from a switch case lowering are disabled.
``nobuiltin``
    This indicates that the callee function at a call site is not recognized as
    a built-in function. LLVM will retain the original call and not replace it
    with equivalent code based on the semantics of the built-in function, unless
    the call site uses the ``builtin`` attribute. This is valid at call sites
    and on function declarations and definitions.
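
    The interaction between ``nobuiltin`` and ``builtin`` can be sketched as
    follows. The wrapper ``@alloc16`` is hypothetical:

    .. code-block:: llvm

        ; The declaration opts out of built-in recognition by default.
        declare i8* @malloc(i64) nobuiltin

        define i8* @alloc16() {
          ; This direct call opts back in, so it may again be treated as a
          ; call to the built-in function.
          %p = call i8* @malloc(i64 16) builtin
          ret i8* %p
        }

    A call to ``@malloc`` without ``builtin`` would be retained as written.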
``noduplicate``
    This attribute indicates that calls to the function cannot be
    duplicated. A call to a ``noduplicate`` function may be moved
    within its parent function, but may not be duplicated within
    its parent function.

    A function containing a ``noduplicate`` call may still
    be an inlining candidate, provided that the call is not
    duplicated by inlining. That implies that the function has
    internal linkage and only has one call site, so the original
    call is dead after inlining.
``noimplicitfloat``
    This attribute disables implicit floating-point instructions.
``noinline``
    This attribute indicates that the inliner should never inline this
    function in any situation. This attribute may not be used together
    with the ``alwaysinline`` attribute.
``nonlazybind``
    This attribute suppresses lazy symbol binding for the function. This
    may make calls to the function faster, at the cost of extra program
    startup time if the function is not called during program startup.
``noredzone``
    This attribute indicates that the code generator should not use a
    red zone, even if the target-specific ABI normally permits it.
``noreturn``
    This function attribute indicates that the function never returns
    normally. This produces undefined behavior at runtime if the
    function ever does dynamically return.
``norecurse``
    This function attribute indicates that the function does not call itself
    either directly or indirectly down any possible call path. This produces
    undefined behavior at runtime if the function ever does recurse.
``nounwind``
    This function attribute indicates that the function never raises an
    exception. If the function does raise an exception, its runtime
    behavior is undefined. However, functions marked nounwind may still
    trap or generate asynchronous exceptions. Exception handling schemes
    that are recognized by LLVM to handle asynchronous exceptions, such
    as SEH, will still provide their implementation defined semantics.
``"null-pointer-is-valid"``
   If ``"null-pointer-is-valid"`` is set to ``"true"``, then the ``null``
   address in address-space 0 is considered to be a valid address for memory
   loads and stores. Any analysis or optimization should not treat
   dereferencing a pointer to ``null`` as undefined behavior in this function.
   Note: Comparing the address of a global variable to ``null`` may still
   evaluate to false because of a limitation in querying this attribute inside
   constant expressions.
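
   As a sketch, a function that is allowed to load from address zero could
   be written as follows. The name ``@read_address_zero`` is hypothetical:

   .. code-block:: llvm

       define i32 @read_address_zero() "null-pointer-is-valid"="true" {
         ; With the attribute set, this load is not treated as undefined
         ; behavior in this function.
         %v = load i32, i32* null
         ret i32 %v
       }

   The attribute is a string attribute, so both its name and its value are
   quoted.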
``optforfuzzing``
    This attribute indicates that this function should be optimized
    for maximum fuzzing signal.
``optnone``
    This function attribute indicates that most optimization passes will skip
    this function, with the exception of interprocedural optimization passes.
    Code generation defaults to the "fast" instruction selector.
    This attribute cannot be used together with the ``alwaysinline``
    attribute; this attribute is also incompatible
    with the ``minsize`` attribute and the ``optsize`` attribute.

    This attribute requires the ``noinline`` attribute to be specified on
    the function as well, so the function is never inlined into any caller.
    Only functions with the ``alwaysinline`` attribute are valid
    candidates for inlining into the body of this function.
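
    A minimal sketch of the required pairing, using the hypothetical name
    ``@debug_me``:

    .. code-block:: llvm

        ; optnone must be accompanied by noinline.
        define void @debug_me() optnone noinline {
          ret void
        }

    Adding ``alwaysinline``, ``minsize`` or ``optsize`` to this definition
    would conflict with the restrictions described above.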
``optsize``
    This attribute suggests that optimization passes and code generator
    passes make choices that keep the code size of this function low,
    and otherwise do optimizations specifically to reduce code size as
    long as they do not significantly impact runtime performance.
``"patchable-function"``
    This attribute tells the code generator that the code
    generated for this function needs to follow certain conventions that
    make it possible for a runtime function to patch over it later.
    The exact effect of this attribute depends on its string value,
    for which there currently is one legal possibility:

     * ``"prologue-short-redirect"`` - This style of patchable
       function is intended to support patching a function prologue to
       redirect control away from the function in a thread safe
       manner.  It guarantees that the first instruction of the
       function will be large enough to accommodate a short jump
       instruction, and will be sufficiently aligned to allow being
       fully changed via an atomic compare-and-swap instruction.
       While the first requirement can be satisfied by inserting a large
       enough NOP, LLVM can and will try to re-purpose an existing
       instruction (i.e. one that would have to be emitted anyway) as
       the patchable instruction larger than a short jump.

       ``"prologue-short-redirect"`` is currently only supported on
       x86-64.

    This attribute by itself does not imply restrictions on
    inter-procedural optimizations.  All of the semantic effects the
    patching may have must be separately conveyed via the linkage type.
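
    As a sketch, the attribute is attached as a quoted key/value pair. The
    name ``@hot_patchable`` is hypothetical:

    .. code-block:: llvm

        define void @hot_patchable() "patchable-function"="prologue-short-redirect" {
          ret void
        }

    Any semantic consequences of patching still have to be expressed through
    the function's linkage, as noted above.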
``"probe-stack"``
    This attribute indicates that the function will trigger a guard region
    at the end of the stack. It ensures that accesses to the stack must be
    no further apart than the size of the guard region to a previous
    access of the stack. It takes one required string value, the name of
    the stack probing function that will be called.

    If a function that has a ``"probe-stack"`` attribute is inlined into
    a function with another ``"probe-stack"`` attribute, the resulting
    function has the ``"probe-stack"`` attribute of the caller. If a
    function that has a ``"probe-stack"`` attribute is inlined into a
    function that has no ``"probe-stack"`` attribute at all, the resulting
    function has the ``"probe-stack"`` attribute of the callee.
``readnone``
    On a function, this attribute indicates that the function computes its
    result (or decides to unwind an exception) based strictly on its arguments,
    without dereferencing any pointer arguments or otherwise accessing
    any mutable state (e.g. memory, control registers, etc) visible to
    caller functions. It does not write through any pointer arguments
    (including ``byval`` arguments) and never changes any state visible
    to callers. This means while it cannot unwind exceptions by calling
    the ``C++`` exception throwing methods (since they write to memory), there may
    be non-``C++`` mechanisms that throw exceptions without writing to LLVM
    visible memory.

    On an argument, this attribute indicates that the function does not
    dereference that pointer argument, even though it may read or write the
    memory that the pointer points to if accessed through other pointers.

    If a readnone function reads or writes memory visible to the program, or
    has other side-effects, the behavior is undefined. If a function reads from
    or writes to a readnone pointer argument, the behavior is undefined.
``readonly``
    On a function, this attribute indicates that the function does not write
    through any pointer arguments (including ``byval`` arguments) or otherwise
    modify any state (e.g. memory, control registers, etc) visible to
    caller functions. It may dereference pointer arguments and read
    state that may be set in the caller. A readonly function always
    returns the same value (or unwinds an exception identically) when
    called with the same set of arguments and global state.  This means while it
    cannot unwind exceptions by calling the ``C++`` exception throwing methods
    (since they write to memory), there may be non-``C++`` mechanisms that throw
    exceptions without writing to LLVM visible memory.

    On an argument, this attribute indicates that the function does not write
    through this pointer argument, even though it may write to the memory that
    the pointer points to.

    If a readonly function writes memory visible to the program, or
    has other side-effects, the behavior is undefined. If a function writes to
    a readonly pointer argument, the behavior is undefined.
``"stack-probe-size"``
    This attribute controls the behavior of stack probes: either
    the ``"probe-stack"`` attribute, or ABI-required stack probes, if any.
    It defines the size of the guard region. It ensures that if the function
    may use more stack space than the size of the guard region, a stack probing
    sequence will be emitted. It takes one required integer value, which
    is 4096 by default.

    If a function that has a ``"stack-probe-size"`` attribute is inlined into
    a function with another ``"stack-probe-size"`` attribute, the resulting
    function has the ``"stack-probe-size"`` attribute that has the lower
    numeric value. If a function that has a ``"stack-probe-size"`` attribute is
    inlined into a function that has no ``"stack-probe-size"`` attribute
    at all, the resulting function has the ``"stack-probe-size"`` attribute
    of the callee.
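
    The inlining rule can be illustrated with a small sketch. The probe
    function name ``__my_probestack`` and the functions ``@callee`` and
    ``@caller`` are hypothetical:

    .. code-block:: llvm

        define void @callee() "probe-stack"="__my_probestack" "stack-probe-size"="8192" {
          ret void
        }

        define void @caller() "probe-stack"="__my_probestack" "stack-probe-size"="4096" {
          call void @callee()
          ret void
        }

    If ``@callee`` is inlined into ``@caller``, the rule above means the
    result keeps ``"stack-probe-size"="4096"``, the lower of the two values.
    Had ``@caller`` carried no ``"stack-probe-size"`` attribute at all, it
    would have taken the callee's value of 8192 instead.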
``"no-stack-arg-probe"``
    This attribute disables ABI-required stack probes, if any.
``writeonly``
    On a function, this attribute indicates that the function may write to but
    does not read from memory.

    On an argument, this attribute indicates that the function may write to but
    does not read through this pointer argument (even though it may read from
    the memory that the pointer points to).

    If a writeonly function reads memory visible to the program, or
    has other side-effects, the behavior is undefined. If a function reads
    from a writeonly pointer argument, the behavior is undefined.
``argmemonly``
    This attribute indicates that the only memory accesses inside the function
    are loads and stores from objects pointed to by its pointer-typed
    arguments, with arbitrary offsets. In other words, all memory operations
    in the function can refer to memory only using pointers based on its
    function arguments.

    Note that ``argmemonly`` can be used together with the ``readonly``
    attribute in order to specify that the function reads only from its
    arguments.

    If an argmemonly function reads or writes memory other than the pointer
    arguments, or has other side-effects, the behavior is undefined.
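
    The memory attributes can be combined on declarations, as in this
    sketch. The names ``@my_strlen`` and ``@my_fill`` are hypothetical:

    .. code-block:: llvm

        ; Only reads memory, and only through its pointer argument.
        declare i64 @my_strlen(i8* readonly %s) argmemonly readonly

        ; Only writes memory, and only through its pointer argument.
        declare void @my_fill(i8* writeonly %dst, i64 %n) argmemonly writeonly

    The attribute after the parameter type applies to that argument, while
    the attributes after the parameter list apply to the function as a whole.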
``returns_twice``
    This attribute indicates that this function can return twice. The C
    ``setjmp`` is an example of such a function. The compiler disables
    some optimizations (like tail calls) in the caller of these
    functions.
``safestack``
    This attribute indicates that
    `SafeStack <http://clang.llvm.org/docs/SafeStack.html>`_
    protection is enabled for this function.

    If a function that has a ``safestack`` attribute is inlined into a
    function that doesn't have a ``safestack`` attribute or which has an
    ``ssp``, ``sspstrong`` or ``sspreq`` attribute, then the resulting
    function will have a ``safestack`` attribute.
``sanitize_address``
    This attribute indicates that AddressSanitizer checks
    (dynamic address safety analysis) are enabled for this function.
``sanitize_memory``
    This attribute indicates that MemorySanitizer checks (dynamic detection
    of accesses to uninitialized memory) are enabled for this function.
``sanitize_thread``
    This attribute indicates that ThreadSanitizer checks
    (dynamic thread safety analysis) are enabled for this function.
``sanitize_hwaddress``
    This attribute indicates that HWAddressSanitizer checks
    (dynamic address safety analysis based on tagged pointers) are enabled for
    this function.
``speculatable``
    This function attribute indicates that the function does not have any
    effects besides calculating its result and does not have undefined behavior.
    Note that ``speculatable`` is not enough to conclude that along any
    particular execution path the number of calls to this function will not be
    externally observable. This attribute is only valid on functions
    and declarations, not on individual call sites. If a function is
    incorrectly marked as speculatable and really does exhibit
    undefined behavior, the undefined behavior may be observed even
    if the call site is dead code.
``ssp``
    This attribute indicates that the function should emit a stack
    smashing protector. It is in the form of a "canary" --- a random value
    placed on the stack before the local variables that's checked upon
    return from the function to see if it has been overwritten. A
    heuristic is used to determine if a function needs stack protectors
    or not. The heuristic used will enable protectors for functions with:

    - Character arrays larger than ``ssp-buffer-size`` (default 8).
    - Aggregates containing character arrays larger than ``ssp-buffer-size``.
    - Calls to alloca() with variable sizes or constant sizes greater than
      ``ssp-buffer-size``.

    Variables that are identified as requiring a protector will be arranged
    on the stack such that they are adjacent to the stack protector guard.

    If a function that has an ``ssp`` attribute is inlined into a
    function that doesn't have an ``ssp`` attribute, then the resulting
    function will have an ``ssp`` attribute.
``sspreq``
    This attribute indicates that the function should *always* emit a
    stack smashing protector. This overrides the ``ssp`` function
    attribute.

    Variables that are identified as requiring a protector will be arranged
    on the stack such that they are adjacent to the stack protector guard.
    The specific layout rules are:

    #. Large arrays and structures containing large arrays
       (``>= ssp-buffer-size``) are closest to the stack protector.
    #. Small arrays and structures containing small arrays
       (``< ssp-buffer-size``) are 2nd closest to the protector.
    #. Variables that have had their address taken are 3rd closest to the
       protector.

    If a function that has an ``sspreq`` attribute is inlined into a
    function that doesn't have an ``sspreq`` attribute or which has an
    ``ssp`` or ``sspstrong`` attribute, then the resulting function will have
    an ``sspreq`` attribute.
``sspstrong``
    This attribute indicates that the function should emit a stack smashing
    protector. This attribute causes a strong heuristic to be used when
    determining if a function needs stack protectors. The strong heuristic
    will enable protectors for functions with:

    - Arrays of any size and type
    - Aggregates containing an array of any size and type.
    - Calls to alloca().
    - Local variables that have had their address taken.

    Variables that are identified as requiring a protector will be arranged
    on the stack such that they are adjacent to the stack protector guard.
    The specific layout rules are:

    #. Large arrays and structures containing large arrays
       (``>= ssp-buffer-size``) are closest to the stack protector.
    #. Small arrays and structures containing small arrays
       (``< ssp-buffer-size``) are 2nd closest to the protector.
    #. Variables that have had their address taken are 3rd closest to the
       protector.

    This overrides the ``ssp`` function attribute.

    If a function that has an ``sspstrong`` attribute is inlined into a
    function that doesn't have an ``sspstrong`` attribute, then the
    resulting function will have an ``sspstrong`` attribute.
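
    The difference between the ``ssp`` and ``sspstrong`` heuristics can be
    sketched with two small functions. All names here are hypothetical, and
    the comments describe what the heuristics above would be expected to
    select rather than verified code generator output:

    .. code-block:: llvm

        declare void @use(i32*)

        ; A 16-byte character array is larger than the default
        ; ssp-buffer-size of 8, so the ssp heuristic applies.
        define void @with_buffer() ssp {
          %buf = alloca [16 x i8]
          ret void
        }

        ; No arrays, but a local variable has its address taken, which only
        ; the strong heuristic takes into account.
        define void @with_escaped_local() sspstrong {
          %x = alloca i32
          call void @use(i32* %x)
          ret void
        }

    Marking either function ``sspreq`` instead would request a protector
    regardless of its contents.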
``strictfp``
    This attribute indicates that the function was called from a scope that
    requires strict floating-point semantics.  LLVM will not attempt any
    optimizations that require assumptions about the floating-point rounding
    mode or that might alter the state of floating-point status flags that
    might otherwise be set or cleared by calling this function.
``"thunk"``
    This attribute indicates that the function will delegate to some other
    function with a tail call. The prototype of a thunk should not be used for
    optimization purposes. The caller is expected to cast the thunk prototype to
    match the thunk target prototype.
``uwtable``
    This attribute indicates that the ABI being targeted requires that
    an unwind table entry be produced for this function even if we can
    show that no exceptions pass by it. This is normally the case for
    the ELF x86-64 ABI, but it can be disabled for some compilation
    units.
``nocf_check``
    This attribute indicates that no control-flow check will be performed on
    the attributed entity. It disables ``-fcf-protection=<>`` for a specific
    entity, allowing finer-grained control over the hardware control-flow
    protection mechanism. The flag is target independent and currently
    applies to a function or function pointer.
``shadowcallstack``
    This attribute indicates that the ShadowCallStack checks are enabled for
    the function. The instrumentation checks that the return address for the
    function has not changed between the function prolog and epilog. It is
    currently x86_64-specific.

.. _glattrs:

Global Attributes
-----------------

Attributes may be set to communicate additional information about a global variable.
Unlike :ref:`function attributes <fnattrs>`, attributes on a global variable
are grouped into a single :ref:`attribute group <attrgrp>`.

.. _opbundles:

Operand Bundles
---------------

Operand bundles are tagged sets of SSA values that can be associated
with certain LLVM instructions (currently only ``call`` s and
``invoke`` s).  In a way they are like metadata, but dropping them is
incorrect and will change program semantics.

Syntax::

    operand bundle set ::= '[' operand bundle (, operand bundle )* ']'
    operand bundle ::= tag '(' [ bundle operand ] (, bundle operand )* ')'
    bundle operand ::= SSA value
    tag ::= string constant

Operand bundles are **not** part of a function's signature, and a
given function may be called from multiple places with different kinds
of operand bundles.  This reflects the fact that the operand bundles
are conceptually a part of the ``call`` (or ``invoke``), not the
callee being dispatched to.

Operand bundles are a generic mechanism intended to support
runtime-introspection-like functionality for managed languages.  While
the exact semantics of an operand bundle depend on the bundle tag,
there are certain limitations to how much the presence of an operand
bundle can influence the semantics of a program.  These restrictions
are described as the semantics of an "unknown" operand bundle.  As
long as the behavior of an operand bundle is describable within these
restrictions, LLVM does not need to have special knowledge of the
operand bundle to not miscompile programs containing it.

- The bundle operands for an unknown operand bundle escape in unknown
  ways before control is transferred to the callee or invokee.
- Calls and invokes with operand bundles have unknown read / write
  effect on the heap on entry and exit (even if the call target is
  ``readnone`` or ``readonly``), unless they're overridden with
  callsite specific attributes.
- An operand bundle at a call site cannot change the implementation
  of the called function.  Inter-procedural optimizations work as
  usual as long as they take into account the first two properties.
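
As a sketch of the syntax, the same function can be called with and
without an operand bundle. The tag ``"my-tag"`` and the functions
``@callee`` and ``@caller`` are hypothetical; since LLVM has no special
knowledge of this tag, it would be treated as an "unknown" operand bundle:

.. code-block:: llvm

    declare void @callee(i32)

    define void @caller(i32 %x, i8* %p) {
      ; A call carrying one bundle with two bundle operands.
      call void @callee(i32 %x) [ "my-tag"(i32 %x, i8* %p) ]
      ; A second call to the same function without any bundle.
      call void @callee(i32 %x)
      ret void
    }

Under the rules above, ``%p`` must be assumed to escape at the first call
even though it is not passed as an argument.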

More specific types of operand bundles are described below.

.. _deopt_opbundles:

Deoptimization Operand Bundles
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Deoptimization operand bundles are characterized by the ``"deopt"``
operand bundle tag.  These operand bundles represent an alternate
"safe" continuation for the call site they're attached to, and can be
used by a suitable runtime to deoptimize the compiled frame at the
specified call site.  There can be at most one ``"deopt"`` operand
bundle attached to a call site.  Exact details of deoptimization are
out of scope for the language reference, but it usually involves
rewriting a compiled frame into a set of interpreted frames.

From the compiler's perspective, deoptimization operand bundles make
the call sites they're attached to at least ``readonly``.  They read
through all of their pointer typed operands (even if they're not
otherwise escaped) and the entire visible heap.  Deoptimization
operand bundles do not capture their operands except during
deoptimization, in which case control will not be returned to the
compiled frame.

The inliner knows how to inline through calls that have deoptimization
operand bundles.  Just like inlining through a normal call site
involves composing the normal and exceptional continuations, inlining
through a call site with a deoptimization operand bundle needs to
appropriately compose the "safe" deoptimization continuation.  The
inliner does this by prepending the parent's deoptimization
continuation to every deoptimization continuation in the inlined body.
E.g. inlining ``@f`` into ``@g`` in the following example

.. code-block:: llvm

    define void @f() {
      call void @x()  ;; no deopt state
      call void @y() [ "deopt"(i32 10) ]
      call void @y() [ "deopt"(i32 10), "unknown"(i8* null) ]
      ret void
    }

    define void @g() {
      call void @f() [ "deopt"(i32 20) ]
      ret void
    }

will result in

.. code-block:: llvm

    define void @g() {
      call void @x()  ;; still no deopt state
      call void @y() [ "deopt"(i32 20, i32 10) ]
      call void @y() [ "deopt"(i32 20, i32 10), "unknown"(i8* null) ]
      ret void
    }

It is the frontend's responsibility to structure or encode the
deoptimization state in a way that syntactically prepending the
caller's deoptimization state to the callee's deoptimization state is
semantically equivalent to composing the caller's deoptimization
continuation after the callee's deoptimization continuation.

.. _ob_funclet:

Funclet Operand Bundles
^^^^^^^^^^^^^^^^^^^^^^^

Funclet operand bundles are characterized by the ``"funclet"``
operand bundle tag.  These operand bundles indicate that a call site
is within a particular funclet.  There can be at most one
``"funclet"`` operand bundle attached to a call site and it must have
exactly one bundle operand.

If any funclet EH pads have been "entered" but not "exited" (per the
`description in the EH doc\ <ExceptionHandling.html#wineh-constraints>`_),
it is undefined behavior to execute a ``call`` or ``invoke`` which:

* does not have a ``"funclet"`` bundle and is not a ``call`` to a nounwind
  intrinsic, or
* has a ``"funclet"`` bundle whose operand is not the most-recently-entered
  not-yet-exited funclet EH pad.

Similarly, if no funclet EH pads have been entered-but-not-yet-exited,
executing a ``call`` or ``invoke`` with a ``"funclet"`` bundle is undefined behavior.
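
A minimal sketch of a call inside a cleanup funclet follows. The function
names ``@f`` and ``@g`` are hypothetical, and the use of
``@__CxxFrameHandler3`` as the personality routine is an assumption made
for illustration:

.. code-block:: llvm

    declare void @g()
    declare i32 @__CxxFrameHandler3(...)

    define void @f() personality i8* bitcast (i32 (...)* @__CxxFrameHandler3 to i8*) {
    entry:
      ; No funclet EH pad has been entered here, so no "funclet" bundle.
      invoke void @g() to label %exit unwind label %cleanup

    cleanup:
      %cp = cleanuppad within none []
      ; %cp is the most-recently-entered, not-yet-exited funclet EH pad.
      call void @g() [ "funclet"(token %cp) ]
      cleanupret from %cp unwind to caller

    exit:
      ret void
    }

Omitting the bundle on the call inside ``%cleanup`` would fall under the
first undefined-behavior case listed above.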

GC Transition Operand Bundles
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

GC transition operand bundles are characterized by the
``"gc-transition"`` operand bundle tag. These operand bundles mark a
call as a transition between a function with one GC strategy and a
function with a different GC strategy. If coordinating the transition
between GC strategies requires additional code generation at the call
site, these bundles may contain any values that are needed by the
generated code.  For more details, see :ref:`GC Transitions
<gc_transition_args>`.

.. _moduleasm:

Module-Level Inline Assembly
----------------------------

Modules may contain "module-level inline asm" blocks, which correspond
to the GCC "file scope inline asm" blocks. These blocks are internally
concatenated by LLVM and treated as a single unit, but may be separated
in the ``.ll`` file if desired. The syntax is very simple:

.. code-block:: llvm

    module asm "inline asm code goes here"
    module asm "more can go here"

The strings can contain any character by escaping non-printable
characters. The escape sequence used is simply "\\xx" where "xx" is the
two digit hex code for the number.

Note that the assembly string *must* be parseable by LLVM's integrated assembler
(unless it is disabled), even when emitting a ``.s`` file.
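
As a sketch of the escape syntax, the second block below starts with an
escaped tab character. The symbol names are hypothetical, and whether a
given directive is accepted depends on the target's assembler:

.. code-block:: llvm

    module asm ".globl my_symbol"
    ; \09 is the two digit hex escape for a tab character.
    module asm "\09.globl my_other_symbol"

Both blocks are concatenated and handed to the assembler as one unit.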
1914
1915.. _langref_datalayout:
1916
1917Data Layout
1918-----------
1919
1920A module may specify a target specific data layout string that specifies
1921how data is to be laid out in memory. The syntax for the data layout is
1922simply:
1923
1924.. code-block:: llvm
1925
1926    target datalayout = "layout specification"
1927
1928The *layout specification* consists of a list of specifications
1929separated by the minus sign character ('-'). Each specification starts
1930with a letter and may include other information after the letter to
1931define some aspect of the data layout. The specifications accepted are
1932as follows:
1933
1934``E``
1935    Specifies that the target lays out data in big-endian form. That is,
1936    the bits with the most significance have the lowest address
1937    location.
1938``e``
1939    Specifies that the target lays out data in little-endian form. That
1940    is, the bits with the least significance have the lowest address
1941    location.
1942``S<size>``
1943    Specifies the natural alignment of the stack in bits. Alignment
1944    promotion of stack variables is limited to the natural stack
1945    alignment to avoid dynamic stack realignment. The stack alignment
1946    must be a multiple of 8-bits. If omitted, the natural stack
1947    alignment defaults to "unspecified", which does not prevent any
1948    alignment promotions.
1949``P<address space>``
1950    Specifies the address space that corresponds to program memory.
1951    Harvard architectures can use this to specify what space LLVM
1952    should place things such as functions into. If omitted, the
1953    program memory space defaults to the default address space of 0,
1954    which corresponds to a Von Neumann architecture that has code
1955    and data in the same space.
1956``A<address space>``
1957    Specifies the address space of objects created by '``alloca``'.
1958    Defaults to the default address space of 0.
1959``p[n]:<size>:<abi>:<pref>:<idx>``
1960    This specifies the *size* of a pointer and its ``<abi>`` and
1961    ``<pref>``\erred alignments for address space ``n``. The fourth parameter
    ``<idx>`` is the size of the index that is used for address calculation. If not
1963    specified, the default index size is equal to the pointer size. All sizes
1964    are in bits. The address space, ``n``, is optional, and if not specified,
1965    denotes the default address space 0. The value of ``n`` must be
1966    in the range [1,2^23).
1967``i<size>:<abi>:<pref>``
1968    This specifies the alignment for an integer type of a given bit
1969    ``<size>``. The value of ``<size>`` must be in the range [1,2^23).
1970``v<size>:<abi>:<pref>``
1971    This specifies the alignment for a vector type of a given bit
1972    ``<size>``.
1973``f<size>:<abi>:<pref>``
1974    This specifies the alignment for a floating-point type of a given bit
1975    ``<size>``. Only values of ``<size>`` that are supported by the target
1976    will work. 32 (float) and 64 (double) are supported on all targets; 80
1977    or 128 (different flavors of long double) are also supported on some
1978    targets.
1979``a:<abi>:<pref>``
1980    This specifies the alignment for an object of aggregate type.
1981``m:<mangling>``
    If present, specifies that LLVM names are mangled in the output. Symbols
1983    prefixed with the mangling escape character ``\01`` are passed through
1984    directly to the assembler without the escape character. The mangling style
1985    options are
1986
1987    * ``e``: ELF mangling: Private symbols get a ``.L`` prefix.
1988    * ``m``: Mips mangling: Private symbols get a ``$`` prefix.
    * ``o``: Mach-O mangling: Private symbols get an ``L`` prefix. Other
1990      symbols get a ``_`` prefix.
1991    * ``x``: Windows x86 COFF mangling: Private symbols get the usual prefix.
1992      Regular C symbols get a ``_`` prefix. Functions with ``__stdcall``,
1993      ``__fastcall``, and ``__vectorcall`` have custom mangling that appends
1994      ``@N`` where N is the number of bytes used to pass parameters. C++ symbols
1995      starting with ``?`` are not mangled in any way.
1996    * ``w``: Windows COFF mangling: Similar to ``x``, except that normal C
1997      symbols do not receive a ``_`` prefix.
1998``n<size1>:<size2>:<size3>...``
1999    This specifies a set of native integer widths for the target CPU in
2000    bits. For example, it might contain ``n32`` for 32-bit PowerPC,
2001    ``n32:64`` for PowerPC 64, or ``n8:16:32:64`` for X86-64. Elements of
2002    this set are considered to support most general arithmetic operations
2003    efficiently.
2004``ni:<address space0>:<address space1>:<address space2>...``
    This specifies pointer types with the specified address spaces
    as :ref:`Non-Integral Pointer Types <nointptrtype>`. The ``0``
    address space cannot be specified as non-integral.
2008
2009On every specification that takes a ``<abi>:<pref>``, specifying the
2010``<pref>`` alignment is optional. If omitted, the preceding ``:``
2011should be omitted too and ``<pref>`` will be equal to ``<abi>``.
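
For illustration, the following string resembles the layout commonly used for
x86-64 ELF targets. Treat the exact value as an assumption and consult the
target's own definition for the authoritative string. The comments map each
component back to the specifications listed above:

.. code-block:: llvm

    ; e             little-endian
    ; m:e           ELF mangling (private symbols get a .L prefix)
    ; i64:64        i64 has 64-bit ABI alignment; <pref> is omitted, so it
    ;               equals <abi>
    ; f80:128       80-bit floating-point values are 128-bit aligned
    ; n8:16:32:64   native integer widths are 8, 16, 32, and 64 bits
    ; S128          natural stack alignment is 128 bits
    target datalayout = "e-m:e-i64:64-f80:128-n8:16:32:64-S128"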
2012
2013When constructing the data layout for a given target, LLVM starts with a
2014default set of specifications which are then (possibly) overridden by
2015the specifications in the ``datalayout`` keyword. The default
2016specifications are given in this list:
2017
2018-  ``E`` - big endian
2019-  ``p:64:64:64`` - 64-bit pointers with 64-bit alignment.
2020-  ``p[n]:64:64:64`` - Other address spaces are assumed to be the
2021   same as the default address space.
2022-  ``S0`` - natural stack alignment is unspecified
2023-  ``i1:8:8`` - i1 is 8-bit (byte) aligned
2024-  ``i8:8:8`` - i8 is 8-bit (byte) aligned
2025-  ``i16:16:16`` - i16 is 16-bit aligned
2026-  ``i32:32:32`` - i32 is 32-bit aligned
2027-  ``i64:32:64`` - i64 has ABI alignment of 32-bits but preferred
2028   alignment of 64-bits
2029-  ``f16:16:16`` - half is 16-bit aligned
2030-  ``f32:32:32`` - float is 32-bit aligned
2031-  ``f64:64:64`` - double is 64-bit aligned
2032-  ``f128:128:128`` - quad is 128-bit aligned
2033-  ``v64:64:64`` - 64-bit vector is 64-bit aligned
2034-  ``v128:128:128`` - 128-bit vector is 128-bit aligned
2035-  ``a:0:64`` - aggregates are 64-bit aligned
2036
2037When LLVM is determining the alignment for a given type, it uses the
2038following rules:
2039
2040#. If the type sought is an exact match for one of the specifications,
2041   that specification is used.
2042#. If no match is found, and the type sought is an integer type, then
2043   the smallest integer type that is larger than the bitwidth of the
2044   sought type is used. If none of the specifications are larger than
2045   the bitwidth then the largest integer type is used. For example,
2046   given the default specifications above, the i7 type will use the
2047   alignment of i8 (next largest) while both i65 and i256 will use the
2048   alignment of i64 (largest specified).
2049#. If no match is found, and the type sought is a vector type, then the
2050   largest vector type that is smaller than the sought vector type will
2051   be used as a fall back. This happens because <128 x double> can be
2052   implemented in terms of 64 <2 x double>, for example.
2053
2054The function of the data layout string may not be what you expect.
2055Notably, this is not a specification from the frontend of what alignment
2056the code generator should use.
2057
2058Instead, if specified, the target data layout is required to match what
2059the ultimate *code generator* expects. This string is used by the
2060mid-level optimizers to improve code, and this only works if it matches
2061what the ultimate code generator uses. There is no way to generate IR
2062that does not embed this target-specific detail into the IR. If you
2063don't specify the string, the default specifications will be used to
2064generate a Data Layout and the optimization phases will operate
2065accordingly and introduce target specificity into the IR with respect to
2066these default specifications.
2067
2068.. _langref_triple:
2069
2070Target Triple
2071-------------
2072
2073A module may specify a target triple string that describes the target
2074host. The syntax for the target triple is simply:
2075
2076.. code-block:: llvm
2077
2078    target triple = "x86_64-apple-macosx10.7.0"
2079
2080The *target triple* string consists of a series of identifiers delimited
2081by the minus sign character ('-'). The canonical forms are:
2082
2083::
2084
2085    ARCHITECTURE-VENDOR-OPERATING_SYSTEM
2086    ARCHITECTURE-VENDOR-OPERATING_SYSTEM-ENVIRONMENT
2087
2088This information is passed along to the backend so that it generates
2089code for the proper architecture. It's possible to override this on the
2090command line with the ``-mtriple`` command line option.
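
As a sketch, a triple in the four-component form might look as follows. The
specific value is only an example:

.. code-block:: llvm

    ; ARCHITECTURE = x86_64, VENDOR = unknown,
    ; OPERATING_SYSTEM = linux, ENVIRONMENT = gnu
    target triple = "x86_64-unknown-linux-gnu"

Assuming the ``llc`` tool is used as the backend driver, the same value could
instead be supplied as ``-mtriple=x86_64-unknown-linux-gnu``.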
2091
2092.. _pointeraliasing:
2093
2094Pointer Aliasing Rules
2095----------------------
2096
2097Any memory access must be done through a pointer value associated with
2098an address range of the memory access, otherwise the behavior is
2099undefined. Pointer values are associated with address ranges according
2100to the following rules:
2101
2102-  A pointer value is associated with the addresses associated with any
2103   value it is *based* on.
2104-  An address of a global variable is associated with the address range
2105   of the variable's storage.
2106-  The result value of an allocation instruction is associated with the
2107   address range of the allocated storage.
2108-  A null pointer in the default address-space is associated with no
2109   address.
2110-  An integer constant other than zero or a pointer value returned from
2111   a function not defined within LLVM may be associated with address
2112   ranges allocated through mechanisms other than those provided by
2113   LLVM. Such ranges shall not overlap with any ranges of addresses
2114   allocated by mechanisms provided by LLVM.
2115
2116A pointer value is *based* on another pointer value according to the
2117following rules:
2118
2119-  A pointer value formed from a scalar ``getelementptr`` operation is *based* on
2120   the pointer-typed operand of the ``getelementptr``.
2121-  The pointer in lane *l* of the result of a vector ``getelementptr`` operation
2122   is *based* on the pointer in lane *l* of the vector-of-pointers-typed operand
2123   of the ``getelementptr``.
2124-  The result value of a ``bitcast`` is *based* on the operand of the
2125   ``bitcast``.
2126-  A pointer value formed by an ``inttoptr`` is *based* on all pointer
2127   values that contribute (directly or indirectly) to the computation of
2128   the pointer's value.
2129-  The "*based* on" relationship is transitive.
2130
2131Note that this definition of *"based"* is intentionally similar to the
2132definition of *"based"* in C99, though it is slightly weaker.
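
The *based* on rules can be traced through a small function. This is an
illustrative sketch, and the function and value names are hypothetical:

.. code-block:: llvm

    define i8* @based_example(i32* %p) {
      ; %q is based on %p (scalar getelementptr rule).
      %q = getelementptr i32, i32* %p, i64 1
      ; %r is based on %q (bitcast rule) and, by transitivity, on %p.
      %r = bitcast i32* %q to i8*
      ; %r contributes to the integer %j, so %s is based on %r
      ; (inttoptr rule) and therefore also on %q and %p.
      %i = ptrtoint i8* %r to i64
      %j = add i64 %i, 1
      %s = inttoptr i64 %j to i8*
      ret i8* %s
    }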
2133
2134LLVM IR does not associate types with memory. The result type of a
2135``load`` merely indicates the size and alignment of the memory from
2136which to load, as well as the interpretation of the value. The first
2137operand type of a ``store`` similarly only indicates the size and
2138alignment of the store.
2139
2140Consequently, type-based alias analysis, aka TBAA, aka
2141``-fstrict-aliasing``, is not applicable to general unadorned LLVM IR.
2142:ref:`Metadata <metadata>` may be used to encode additional information
2143which specialized optimization passes may use to implement type-based
2144alias analysis.
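
Because memory is untyped, the same bytes may be written as one type and read
back as another. A minimal sketch (the function name is hypothetical):

.. code-block:: llvm

    define float @reinterpret(i32* %p) {
      ; 1065353216 is 0x3F800000, the IEEE-754 binary32 bit pattern of 1.0.
      store i32 1065353216, i32* %p
      ; The load's type only selects the size, alignment, and interpretation
      ; of the bytes; nothing records that they were stored as an i32.
      %fp = bitcast i32* %p to float*
      %f = load float, float* %fp
      ret float %f
    }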
2145
2146.. _volatile:
2147
2148Volatile Memory Accesses
2149------------------------
2150
Certain memory accesses, such as :ref:`loads <i_load>`,
:ref:`stores <i_store>`, and :ref:`llvm.memcpy <int_memcpy>` calls, may be
marked ``volatile``. The optimizers must not change the number of
2154volatile operations or change their order of execution relative to other
2155volatile operations. The optimizers *may* change the order of volatile
2156operations relative to non-volatile operations. This is not Java's
2157"volatile" and has no cross-thread synchronization behavior.
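
A short sketch of volatile accesses, assuming ``%reg`` points at something
like a memory-mapped register (the function name is hypothetical):

.. code-block:: llvm

    define i32 @poll(i32* %reg) {
      ; Both loads must be kept, and kept in this order, even though they
      ; read the same address with no intervening store.
      %a = load volatile i32, i32* %reg
      %b = load volatile i32, i32* %reg
      ; The volatile store may not be removed or moved above the loads.
      store volatile i32 0, i32* %reg
      %sum = add i32 %a, %b
      ret i32 %sum
    }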
2158
2159IR-level volatile loads and stores cannot safely be optimized into
2160llvm.memcpy or llvm.memmove intrinsics even when those intrinsics are
2161flagged volatile. Likewise, the backend should never split or merge
2162target-legal volatile load/store instructions.
2163
2164.. admonition:: Rationale
2165
2166 Platforms may rely on volatile loads and stores of natively supported
 data width to be executed as a single instruction. For example, in C
2168 this holds for an l-value of volatile primitive type with native
2169 hardware support, but not necessarily for aggregate types. The
2170 frontend upholds these expectations, which are intentionally
2171 unspecified in the IR. The rules above ensure that IR transformations
2172 do not violate the frontend's contract with the language.
2173
2174.. _memmodel:
2175
2176Memory Model for Concurrent Operations
2177--------------------------------------
2178
2179The LLVM IR does not define any way to start parallel threads of
2180execution or to register signal handlers. Nonetheless, there are
2181platform-specific ways to create them, and we define LLVM IR's behavior
2182in their presence. This model is inspired by the C++0x memory model.
2183
2184For a more informal introduction to this model, see the :doc:`Atomics`.
2185
2186We define a *happens-before* partial order as the least partial order
2187that
2188
2189-  Is a superset of single-thread program order, and
2190-  When a *synchronizes-with* ``b``, includes an edge from ``a`` to
2191   ``b``. *Synchronizes-with* pairs are introduced by platform-specific
2192   techniques, like pthread locks, thread creation, thread joining,
2193   etc., and by atomic instructions. (See also :ref:`Atomic Memory Ordering
2194   Constraints <ordering>`).
2195
2196Note that program order does not introduce *happens-before* edges
2197between a thread and signals executing inside that thread.
2198
2199Every (defined) read operation (load instructions, memcpy, atomic
2200loads/read-modify-writes, etc.) R reads a series of bytes written by
2201(defined) write operations (store instructions, atomic
2202stores/read-modify-writes, memcpy, etc.). For the purposes of this
2203section, initialized globals are considered to have a write of the
2204initializer which is atomic and happens before any other read or write
2205of the memory in question. For each byte of a read R, R\ :sub:`byte`
2206may see any write to the same byte, except:
2207
2208-  If write\ :sub:`1`  happens before write\ :sub:`2`, and
2209   write\ :sub:`2` happens before R\ :sub:`byte`, then
2210   R\ :sub:`byte` does not see write\ :sub:`1`.
2211-  If R\ :sub:`byte` happens before write\ :sub:`3`, then
2212   R\ :sub:`byte` does not see write\ :sub:`3`.
2213
2214Given that definition, R\ :sub:`byte` is defined as follows:
2215
2216-  If R is volatile, the result is target-dependent. (Volatile is
2217   supposed to give guarantees which can support ``sig_atomic_t`` in
2218   C/C++, and may be used for accesses to addresses that do not behave
2219   like normal memory. It does not generally provide cross-thread
2220   synchronization.)
2221-  Otherwise, if there is no write to the same byte that happens before
2222   R\ :sub:`byte`, R\ :sub:`byte` returns ``undef`` for that byte.
2223-  Otherwise, if R\ :sub:`byte` may see exactly one write,
2224   R\ :sub:`byte` returns the value written by that write.
2225-  Otherwise, if R is atomic, and all the writes R\ :sub:`byte` may
2226   see are atomic, it chooses one of the values written. See the :ref:`Atomic
2227   Memory Ordering Constraints <ordering>` section for additional
2228   constraints on how the choice is made.
2229-  Otherwise R\ :sub:`byte` returns ``undef``.
2230
2231R returns the value composed of the series of bytes it read. This
2232implies that some bytes within the value may be ``undef`` **without**
2233the entire value being ``undef``. Note that this only defines the
2234semantics of the operation; it doesn't mean that targets will emit more
2235than one instruction to read the series of bytes.
2236
2237Note that in cases where none of the atomic intrinsics are used, this
2238model places only one restriction on IR transformations on top of what
2239is required for single-threaded execution: introducing a store to a byte
2240which might not otherwise be stored is not allowed in general.
2241(Specifically, in the case where another thread might write to and read
2242from an address, introducing a store can change a load that may see
2243exactly one write into a load that may see multiple writes.)
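
The restriction on introducing stores can be illustrated with a pair of
functions. This is a sketch with hypothetical names; ``@after`` is
well-formed IR, but it is not a valid replacement for ``@before`` in general:

.. code-block:: llvm

    ; Original: stores to %p only when %c is true.
    define void @before(i1 %c, i32* %p) {
    entry:
      br i1 %c, label %then, label %done
    then:
      store i32 1, i32* %p
      br label %done
    done:
      ret void
    }

    ; Not a valid transformation of @before in general: it always stores to
    ; %p. When %c is false and another thread writes %p concurrently, a
    ; load in that thread that could see exactly one write may now see two.
    define void @after(i1 %c, i32* %p) {
      %old = load i32, i32* %p
      %new = select i1 %c, i32 1, i32 %old
      store i32 %new, i32* %p
      ret void
    }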
2244
2245.. _ordering:
2246
2247Atomic Memory Ordering Constraints
2248----------------------------------
2249
2250Atomic instructions (:ref:`cmpxchg <i_cmpxchg>`,
2251:ref:`atomicrmw <i_atomicrmw>`, :ref:`fence <i_fence>`,
2252:ref:`atomic load <i_load>`, and :ref:`atomic store <i_store>`) take
2253ordering parameters that determine which other atomic instructions on
2254the same address they *synchronize with*. These semantics are borrowed
2255from Java and C++0x, but are somewhat more colloquial. If these
2256descriptions aren't precise enough, check those specs (see spec
2257references in the :doc:`atomics guide <Atomics>`).
2258:ref:`fence <i_fence>` instructions treat these orderings somewhat
2259differently since they don't take an address. See that instruction's
2260documentation for details.
2261
2262For a simpler introduction to the ordering constraints, see the
2263:doc:`Atomics`.
2264
2265``unordered``
2266    The set of values that can be read is governed by the happens-before
2267    partial order. A value cannot be read unless some operation wrote
2268    it. This is intended to provide a guarantee strong enough to model
2269    Java's non-volatile shared variables. This ordering cannot be
2270    specified for read-modify-write operations; it is not strong enough
2271    to make them atomic in any interesting way.
2272``monotonic``
2273    In addition to the guarantees of ``unordered``, there is a single
2274    total order for modifications by ``monotonic`` operations on each
2275    address. All modification orders must be compatible with the
2276    happens-before order. There is no guarantee that the modification
2277    orders can be combined to a global total order for the whole program
2278    (and this often will not be possible). The read in an atomic
2279    read-modify-write operation (:ref:`cmpxchg <i_cmpxchg>` and
2280    :ref:`atomicrmw <i_atomicrmw>`) reads the value in the modification
2281    order immediately before the value it writes. If one atomic read
2282    happens before another atomic read of the same address, the later
2283    read must see the same value or a later value in the address's
2284    modification order. This disallows reordering of ``monotonic`` (or
2285    stronger) operations on the same address. If an address is written
2286    ``monotonic``-ally by one thread, and other threads ``monotonic``-ally
2287    read that address repeatedly, the other threads must eventually see
2288    the write. This corresponds to the C++0x/C1x
2289    ``memory_order_relaxed``.
2290``acquire``
2291    In addition to the guarantees of ``monotonic``, a
2292    *synchronizes-with* edge may be formed with a ``release`` operation.
2293    This is intended to model C++'s ``memory_order_acquire``.
2294``release``
2295    In addition to the guarantees of ``monotonic``, if this operation
2296    writes a value which is subsequently read by an ``acquire``
2297    operation, it *synchronizes-with* that operation. (This isn't a
2298    complete description; see the C++0x definition of a release
2299    sequence.) This corresponds to the C++0x/C1x
2300    ``memory_order_release``.
2301``acq_rel`` (acquire+release)
2302    Acts as both an ``acquire`` and ``release`` operation on its
2303    address. This corresponds to the C++0x/C1x ``memory_order_acq_rel``.
2304``seq_cst`` (sequentially consistent)
2305    In addition to the guarantees of ``acq_rel`` (``acquire`` for an
2306    operation that only reads, ``release`` for an operation that only
2307    writes), there is a global total order on all
2308    sequentially-consistent operations on all addresses, which is
2309    consistent with the *happens-before* partial order and with the
2310    modification orders of all the affected addresses. Each
2311    sequentially-consistent read sees the last preceding write to the
2312    same address in this global order. This corresponds to the C++0x/C1x
2313    ``memory_order_seq_cst`` and Java volatile.
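
The ``release`` and ``acquire`` orderings are typically paired to publish
data from one thread to another. The following is a minimal sketch with
hypothetical global and function names:

.. code-block:: llvm

    @data = global i32 0
    @flag = global i32 0

    define void @producer() {
      ; Ordinary store, followed by a release store that publishes it.
      store i32 42, i32* @data
      store atomic i32 1, i32* @flag release, align 4
      ret void
    }

    define i32 @consumer() {
    entry:
      br label %spin
    spin:
      ; Once this acquire load reads the value written by the release store,
      ; that store synchronizes-with it.
      %f = load atomic i32, i32* @flag acquire, align 4
      %ready = icmp eq i32 %f, 1
      br i1 %ready, label %read, label %spin
    read:
      ; The store of 42 happens before this load, so the load sees it.
      %v = load i32, i32* @data
      ret i32 %v
    }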
2314
2315.. _syncscope:
2316
2317If an atomic operation is marked ``syncscope("singlethread")``, it only
2318*synchronizes with* and only participates in the seq\_cst total orderings of
2319other operations running in the same thread (for example, in signal handlers).
2320
2321If an atomic operation is marked ``syncscope("<target-scope>")``, where
2322``<target-scope>`` is a target specific synchronization scope, then it is target
2323dependent if it *synchronizes with* and participates in the seq\_cst total
2324orderings of other operations.
2325
2326Otherwise, an atomic operation that is not marked ``syncscope("singlethread")``
2327or ``syncscope("<target-scope>")`` *synchronizes with* and participates in the
2328seq\_cst total orderings of other operations that are not marked
2329``syncscope("singlethread")`` or ``syncscope("<target-scope>")``.
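
As a sketch of how the scope is written on atomic operations (the function
name is hypothetical):

.. code-block:: llvm

    define i32 @with_scopes(i32* %p) {
      ; Ordered only against other operations running in the same thread,
      ; such as a signal handler.
      fence syncscope("singlethread") seq_cst
      %v = load atomic i32, i32* %p syncscope("singlethread") monotonic, align 4
      ; No syncscope given: the default scope applies.
      %w = load atomic i32, i32* %p seq_cst, align 4
      %s = add i32 %v, %w
      ret i32 %s
    }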
2330
2331.. _floatenv:
2332
2333Floating-Point Environment
2334--------------------------
2335
2336The default LLVM floating-point environment assumes that floating-point
2337instructions do not have side effects. Results assume the round-to-nearest
2338rounding mode. No floating-point exception state is maintained in this
2339environment. Therefore, there is no attempt to create or preserve invalid
2340operation (SNaN) or division-by-zero exceptions in these examples:
2341
2342.. code-block:: llvm
2343
2344      %A = fdiv 0x7ff0000000000001, %X  ; 64-bit SNaN hex value
2345      %B = fdiv %X, 0.0
2346    Safe:
2347      %A = NaN
2348      %B = NaN
2349
2350The benefit of this exception-free assumption is that floating-point
2351operations may be speculated freely without any other fast-math relaxations
2352to the floating-point model.
2353
2354Code that requires different behavior than this should use the
2355:ref:`Constrained Floating-Point Intrinsics <constrainedfp>`.
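
For illustration only, a division that must honor the dynamic rounding mode
and preserve exceptions might be written as below. This assumes the
experimental constrained intrinsics are available in the LLVM version at
hand; the referenced section is authoritative for their exact form:

.. code-block:: llvm

    declare double @llvm.experimental.constrained.fdiv.f64(double, double, metadata, metadata)

    define double @strict_div(double %x, double %y) {
      ; The metadata arguments request the dynamic rounding mode and strict
      ; exception semantics.
      %q = call double @llvm.experimental.constrained.fdiv.f64(
               double %x, double %y,
               metadata !"round.dynamic", metadata !"fpexcept.strict")
      ret double %q
    }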
2356
2357.. _fastmath:
2358
2359Fast-Math Flags
2360---------------
2361
2362LLVM IR floating-point operations (:ref:`fadd <i_fadd>`,
2363:ref:`fsub <i_fsub>`, :ref:`fmul <i_fmul>`, :ref:`fdiv <i_fdiv>`,
2364:ref:`frem <i_frem>`, :ref:`fcmp <i_fcmp>`) and :ref:`call <i_call>`
2365may use the following flags to enable otherwise unsafe
2366floating-point transformations.
2367
2368``nnan``
2369   No NaNs - Allow optimizations to assume the arguments and result are not
   NaN. If an argument is a NaN, or the result would be a NaN, it produces
2371   a :ref:`poison value <poisonvalues>` instead.
2372
2373``ninf``
2374   No Infs - Allow optimizations to assume the arguments and result are not
2375   +/-Inf. If an argument is +/-Inf, or the result would be +/-Inf, it
2376   produces a :ref:`poison value <poisonvalues>` instead.
2377
2378``nsz``
2379   No Signed Zeros - Allow optimizations to treat the sign of a zero
2380   argument or result as insignificant.
2381
2382``arcp``
2383   Allow Reciprocal - Allow optimizations to use the reciprocal of an
2384   argument rather than perform division.
2385
2386``contract``
2387   Allow floating-point contraction (e.g. fusing a multiply followed by an
2388   addition into a fused multiply-and-add).
2389
2390``afn``
2391   Approximate functions - Allow substitution of approximate calculations for
   functions (sin, log, sqrt, etc.). See floating-point intrinsic definitions
2393   for places where this can apply to LLVM's intrinsic math functions.
2394
2395``reassoc``
2396   Allow reassociation transformations for floating-point instructions.
   This may dramatically change floating-point results.
2398
2399``fast``
2400   This flag implies all of the others.
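
The flags are written between the opcode and the type. A minimal sketch (the
function name is hypothetical):

.. code-block:: llvm

    declare float @llvm.sqrt.f32(float)

    define float @fmf(float %a, float %b, float %c) {
      ; contract on both operations permits fusing them into a single
      ; fused multiply-and-add.
      %m = fmul contract float %a, %b
      %s = fadd contract float %m, %c
      ; arcp permits rewriting the division as a multiplication by the
      ; reciprocal of %b.
      %d = fdiv arcp float %s, %b
      ; Several flags may be combined on one operation.
      %e = fadd nnan ninf float %d, %a
      ; afn on a call permits an approximate sqrt.
      %r = call afn float @llvm.sqrt.f32(float %e)
      ret float %r
    }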
2401
2402.. _uselistorder:
2403
2404Use-list Order Directives
2405-------------------------
2406
2407Use-list directives encode the in-memory order of each use-list, allowing the
2408order to be recreated. ``<order-indexes>`` is a comma-separated list of
2409indexes that are assigned to the referenced value's uses. The referenced
2410value's use-list is immediately sorted by these indexes.
2411
2412Use-list directives may appear at function scope or global scope. They are not
2413instructions, and have no effect on the semantics of the IR. When they're at
2414function scope, they must appear after the terminator of the final basic block.
2415
2416If basic blocks have their address taken via ``blockaddress()`` expressions,
2417``uselistorder_bb`` can be used to reorder their use-lists from outside their
2418function's scope.
2419
2420:Syntax:
2421
2422::
2423
2424    uselistorder <ty> <value>, { <order-indexes> }
    uselistorder_bb @function, %block, { <order-indexes> }
2426
2427:Examples:
2428
2429::
2430
2431    define void @foo(i32 %arg1, i32 %arg2) {
2432    entry:
2433      ; ... instructions ...
2434    bb:
2435      ; ... instructions ...
2436
2437      ; At function scope.
2438      uselistorder i32 %arg1, { 1, 0, 2 }
2439      uselistorder label %bb, { 1, 0 }
2440    }
2441
2442    ; At global scope.
2443    uselistorder i32* @global, { 1, 2, 0 }
2444    uselistorder i32 7, { 1, 0 }
2445    uselistorder i32 (i32) @bar, { 1, 0 }
2446    uselistorder_bb @foo, %bb, { 5, 1, 3, 2, 0, 4 }
2447
2448.. _source_filename:
2449
2450Source Filename
2451---------------
2452
2453The *source filename* string is set to the original module identifier,
2454which will be the name of the compiled source file when compiling from
2455source through the clang front end, for example. It is then preserved through
2456the IR and bitcode.
2457
2458This is currently necessary to generate a consistent unique global
2459identifier for local functions used in profile data, which prepends the
2460source file name to the local function name.
2461
2462The syntax for the source file name is simply:
2463
2464.. code-block:: text
2465
2466    source_filename = "/path/to/source.c"
2467
2468.. _typesystem:
2469
2470Type System
2471===========
2472
2473The LLVM type system is one of the most important features of the
2474intermediate representation. Being typed enables a number of
2475optimizations to be performed on the intermediate representation
2476directly, without having to do extra analyses on the side before the
2477transformation. A strong type system makes it easier to read the
2478generated code and enables novel analyses and transformations that are
2479not feasible to perform on normal three address code representations.
2480
2481.. _t_void:
2482
2483Void Type
2484---------
2485
2486:Overview:
2487
2488
2489The void type does not represent any value and has no size.
2490
2491:Syntax:
2492
2493
2494::
2495
2496      void
2497
2498
2499.. _t_function:
2500
2501Function Type
2502-------------
2503
2504:Overview:
2505
2506
2507The function type can be thought of as a function signature. It consists of a
2508return type and a list of formal parameter types. The return type of a function
2509type is a void type or first class type --- except for :ref:`label <t_label>`
2510and :ref:`metadata <t_metadata>` types.
2511
2512:Syntax:
2513
2514::
2515
2516      <returntype> (<parameter list>)
2517
2518...where '``<parameter list>``' is a comma-separated list of type
2519specifiers. Optionally, the parameter list may include a type ``...``, which
2520indicates that the function takes a variable number of arguments. Variable
2521argument functions can access their arguments with the :ref:`variable argument
2522handling intrinsic <int_varargs>` functions. '``<returntype>``' is any type
2523except :ref:`label <t_label>` and :ref:`metadata <t_metadata>`.
2524
2525:Examples:
2526
2527+---------------------------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------+
2528| ``i32 (i32)``                   | function taking an ``i32``, returning an ``i32``                                                                                                                    |
2529+---------------------------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------+
2530| ``float (i16, i32 *) *``        | :ref:`Pointer <t_pointer>` to a function that takes an ``i16`` and a :ref:`pointer <t_pointer>` to ``i32``, returning ``float``.                                    |
2531+---------------------------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------+
2532| ``i32 (i8*, ...)``              | A vararg function that takes at least one :ref:`pointer <t_pointer>` to ``i8`` (char in C), which returns an integer. This is the signature for ``printf`` in LLVM. |
2533+---------------------------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------+
2534| ``{i32, i32} (i32)``            | A function taking an ``i32``, returning a :ref:`structure <t_struct>` containing two ``i32`` values                                                                 |
2535+---------------------------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------+
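
As a sketch of how a function type appears in use, a call to a vararg
function spells out the full function type. The names ``@fmt`` and ``@show``
are hypothetical:

.. code-block:: llvm

    ; "%d\n" followed by a terminating null byte: four bytes in total.
    @fmt = private constant [4 x i8] c"%d\0A\00"

    declare i32 @printf(i8*, ...)

    define i32 @show(i32 %n) {
      %p = getelementptr [4 x i8], [4 x i8]* @fmt, i64 0, i64 0
      ; The function type i32 (i8*, ...) is written out at the call site.
      %r = call i32 (i8*, ...) @printf(i8* %p, i32 %n)
      ret i32 %r
    }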
2536
2537.. _t_firstclass:
2538
2539First Class Types
2540-----------------
2541
2542The :ref:`first class <t_firstclass>` types are perhaps the most important.
2543Values of these types are the only ones which can be produced by
2544instructions.
2545
2546.. _t_single_value:
2547
2548Single Value Types
2549^^^^^^^^^^^^^^^^^^
2550
2551These are the types that are valid in registers from CodeGen's perspective.
2552
2553.. _t_integer:
2554
2555Integer Type
2556""""""""""""
2557
2558:Overview:
2559
The integer type is a very simple type that specifies an arbitrary bit
width for the desired integer. Any bit width from 1
2562bit to 2\ :sup:`23`\ -1 (about 8 million) can be specified.
2563
2564:Syntax:
2565
2566::
2567
2568      iN
2569
2570The number of bits the integer will occupy is specified by the ``N``
2571value.
2572
2573Examples:
2574*********
2575
2576+----------------+------------------------------------------------+
2577| ``i1``         | a single-bit integer.                          |
2578+----------------+------------------------------------------------+
2579| ``i32``        | a 32-bit integer.                              |
2580+----------------+------------------------------------------------+
2581| ``i1942652``   | a really big integer of over 1 million bits.   |
2582+----------------+------------------------------------------------+
2583
2584.. _t_floating:
2585
2586Floating-Point Types
2587""""""""""""""""""""
2588
2589.. list-table::
2590   :header-rows: 1
2591
2592   * - Type
2593     - Description
2594
2595   * - ``half``
2596     - 16-bit floating-point value
2597
2598   * - ``float``
2599     - 32-bit floating-point value
2600
2601   * - ``double``
2602     - 64-bit floating-point value
2603
2604   * - ``fp128``
2605     - 128-bit floating-point value (112-bit mantissa)
2606
2607   * - ``x86_fp80``
     - 80-bit floating-point value (X87)
2609
2610   * - ``ppc_fp128``
2611     - 128-bit floating-point value (two 64-bits)
2612
The binary formats of ``half``, ``float``, ``double``, and ``fp128``
correspond to the IEEE-754-2008 specifications for binary16, binary32,
binary64, and binary128, respectively.
2616
2617X86_mmx Type
2618""""""""""""
2619
2620:Overview:
2621
2622The x86_mmx type represents a value held in an MMX register on an x86
2623machine. The operations allowed on it are quite limited: parameters and
2624return values, load and store, and bitcast. User-specified MMX
2625instructions are represented as intrinsic or asm calls with arguments
2626and/or results of this type. There are no arrays, vectors or constants
2627of this type.
2628
2629:Syntax:
2630
2631::
2632
2633      x86_mmx
2634
2635
2636.. _t_pointer:
2637
2638Pointer Type
2639""""""""""""
2640
2641:Overview:
2642
2643The pointer type is used to specify memory locations. Pointers are
2644commonly used to reference objects in memory.
2645
2646Pointer types may have an optional address space attribute defining the
2647numbered address space where the pointed-to object resides. The default
2648address space is number zero. The semantics of non-zero address spaces
2649are target-specific.
2650
2651Note that LLVM does not permit pointers to void (``void*``) nor does it
2652permit pointers to labels (``label*``). Use ``i8*`` instead.
2653
2654:Syntax:
2655
2656::
2657
2658      <type> *
2659
2660:Examples:
2661
2662+-------------------------+--------------------------------------------------------------------------------------------------------------+
2663| ``[4 x i32]*``          | A :ref:`pointer <t_pointer>` to :ref:`array <t_array>` of four ``i32`` values.                               |
2664+-------------------------+--------------------------------------------------------------------------------------------------------------+
2665| ``i32 (i32*) *``        | A :ref:`pointer <t_pointer>` to a :ref:`function <t_function>` that takes an ``i32*``, returning an ``i32``. |
2666+-------------------------+--------------------------------------------------------------------------------------------------------------+
2667| ``i32 addrspace(5)*``   | A :ref:`pointer <t_pointer>` to an ``i32`` value that resides in address space #5.                           |
2668+-------------------------+--------------------------------------------------------------------------------------------------------------+
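
As a minimal sketch of how pointer types appear in practice, the following
module loads and stores through pointers in the default address space and in
address space 5. The names ``@counter``, ``@scratch`` and ``@bump`` are
hypothetical and chosen only for illustration; the meaning of address space 5
is target-specific.

.. code-block:: llvm

    ; A global in the default address space (0); its address has type i32*.
    @counter = global i32 0
    ; A global in address space 5; its address has type i32 addrspace(5)*.
    @scratch = addrspace(5) global i32 0

    define i32 @bump() {
    entry:
      %old = load i32, i32* @counter
      %new = add i32 %old, 1
      store i32 %new, i32* @counter
      ; The pointer operand type must name the address space explicitly.
      store i32 %new, i32 addrspace(5)* @scratch
      ret i32 %new
    }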
2669
2670.. _t_vector:
2671
2672Vector Type
2673"""""""""""
2674
2675:Overview:
2676
2677A vector type is a simple derived type that represents a vector of
elements. Vector types are used when multiple primitive data values are
operated on in parallel using a single instruction (SIMD). A vector type
2680requires a size (number of elements) and an underlying primitive data
2681type. Vector types are considered :ref:`first class <t_firstclass>`.
2682
2683:Syntax:
2684
2685::
2686
2687      < <# elements> x <elementtype> >
2688
2689The number of elements is a constant integer value larger than 0;
2690elementtype may be any integer, floating-point or pointer type. Vectors
2691of size zero are not allowed.
2692
2693:Examples:
2694
2695+-------------------+--------------------------------------------------+
2696| ``<4 x i32>``     | Vector of 4 32-bit integer values.               |
2697+-------------------+--------------------------------------------------+
2698| ``<8 x float>``   | Vector of 8 32-bit floating-point values.        |
2699+-------------------+--------------------------------------------------+
2700| ``<2 x i64>``     | Vector of 2 64-bit integer values.               |
2701+-------------------+--------------------------------------------------+
2702| ``<4 x i64*>``    | Vector of 4 pointers to 64-bit integer values.   |
2703+-------------------+--------------------------------------------------+
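
The following sketch shows a vector type used as a first class value: two
``<4 x i32>`` arguments are added element-wise and one lane of the result is
extracted. The function name ``@sum_lane0`` is hypothetical.

.. code-block:: llvm

    define i32 @sum_lane0(<4 x i32> %a, <4 x i32> %b) {
    entry:
      ; A single add operates on all four lanes.
      %sum = add <4 x i32> %a, %b
      ; Read lane 0 of the result as a scalar.
      %lane0 = extractelement <4 x i32> %sum, i32 0
      ret i32 %lane0
    }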
2704
2705.. _t_label:
2706
2707Label Type
2708^^^^^^^^^^
2709
2710:Overview:
2711
2712The label type represents code labels.
2713
2714:Syntax:
2715
2716::
2717
2718      label
2719
2720.. _t_token:
2721
2722Token Type
2723^^^^^^^^^^
2724
2725:Overview:
2726
2727The token type is used when a value is associated with an instruction
2728but all uses of the value must not attempt to introspect or obscure it.
2729As such, it is not appropriate to have a :ref:`phi <i_phi>` or
2730:ref:`select <i_select>` of type token.
2731
2732:Syntax:
2733
2734::
2735
2736      token
2737
2738
2739
2740.. _t_metadata:
2741
2742Metadata Type
2743^^^^^^^^^^^^^
2744
2745:Overview:
2746
2747The metadata type represents embedded metadata. No derived types may be
2748created from metadata except for :ref:`function <t_function>` arguments.
2749
2750:Syntax:
2751
2752::
2753
2754      metadata
2755
2756.. _t_aggregate:
2757
2758Aggregate Types
2759^^^^^^^^^^^^^^^
2760
2761Aggregate Types are a subset of derived types that can contain multiple
2762member types. :ref:`Arrays <t_array>` and :ref:`structs <t_struct>` are
2763aggregate types. :ref:`Vectors <t_vector>` are not considered to be
2764aggregate types.
2765
2766.. _t_array:
2767
2768Array Type
2769""""""""""
2770
2771:Overview:
2772
2773The array type is a very simple derived type that arranges elements
2774sequentially in memory. The array type requires a size (number of
2775elements) and an underlying data type.
2776
2777:Syntax:
2778
2779::
2780
2781      [<# elements> x <elementtype>]
2782
2783The number of elements is a constant integer value; ``elementtype`` may
2784be any type with a size.
2785
2786:Examples:
2787
2788+------------------+--------------------------------------+
2789| ``[40 x i32]``   | Array of 40 32-bit integer values.   |
2790+------------------+--------------------------------------+
2791| ``[41 x i32]``   | Array of 41 32-bit integer values.   |
2792+------------------+--------------------------------------+
2793| ``[4 x i8]``     | Array of 4 8-bit integer values.     |
2794+------------------+--------------------------------------+
2795
2796Here are some examples of multidimensional arrays:
2797
2798+-----------------------------+----------------------------------------------------------+
2799| ``[3 x [4 x i32]]``         | 3x4 array of 32-bit integer values.                      |
2800+-----------------------------+----------------------------------------------------------+
2801| ``[12 x [10 x float]]``     | 12x10 array of single precision floating-point values.   |
2802+-----------------------------+----------------------------------------------------------+
2803| ``[2 x [3 x [4 x i16]]]``   | 2x3x4 array of 16-bit integer values.                    |
2804+-----------------------------+----------------------------------------------------------+
2805
2806There is no restriction on indexing beyond the end of the array implied
2807by a static type (though there are restrictions on indexing beyond the
2808bounds of an allocated object in some cases). This means that
2809single-dimension 'variable sized array' addressing can be implemented in
LLVM with a zero length array type. An implementation of 'Pascal style
arrays' in LLVM could use the type "``{ i32, [0 x float]}``", for
example.
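
The variable sized array idiom described above can be sketched as follows. The
type name ``%pascal_array`` and the function ``@get_element`` are hypothetical,
and the sketch assumes the caller allocated enough storage after the length
field to hold element ``%i``.

.. code-block:: llvm

    ; A length field followed by a zero length array of floats.
    %pascal_array = type { i32, [0 x float] }

    define float @get_element(%pascal_array* %arr, i64 %i) {
    entry:
      ; Index past the static bound of [0 x float]; the static type places
      ; no restriction on %i.
      %eltptr = getelementptr %pascal_array, %pascal_array* %arr, i64 0, i32 1, i64 %i
      %elt = load float, float* %eltptr
      ret float %elt
    }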
2813
2814.. _t_struct:
2815
2816Structure Type
2817""""""""""""""
2818
2819:Overview:
2820
2821The structure type is used to represent a collection of data members
2822together in memory. The elements of a structure may be any type that has
2823a size.
2824
2825Structures in memory are accessed using '``load``' and '``store``' by
2826getting a pointer to a field with the '``getelementptr``' instruction.
2827Structures in registers are accessed using the '``extractvalue``' and
2828'``insertvalue``' instructions.
2829
2830Structures may optionally be "packed" structures, which indicate that
2831the alignment of the struct is one byte, and that there is no padding
2832between the elements. In non-packed structs, padding between field types
2833is inserted as defined by the DataLayout string in the module, which is
2834required to match what the underlying code generator expects.
2835
2836Structures can either be "literal" or "identified". A literal structure
2837is defined inline with other types (e.g. ``{i32, i32}*``) whereas
2838identified types are always defined at the top level with a name.
2839Literal types are uniqued by their contents and can never be recursive
2840or opaque since there is no way to write one. Identified types can be
2841recursive, can be opaqued, and are never uniqued.
2842
2843:Syntax:
2844
2845::
2846
2847      %T1 = type { <type list> }     ; Identified normal struct type
2848      %T2 = type <{ <type list> }>   ; Identified packed struct type
2849
2850:Examples:
2851
2852+------------------------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
2853| ``{ i32, i32, i32 }``        | A triple of three ``i32`` values                                                                                                                                                      |
2854+------------------------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
2855| ``{ float, i32 (i32) * }``   | A pair, where the first element is a ``float`` and the second element is a :ref:`pointer <t_pointer>` to a :ref:`function <t_function>` that takes an ``i32``, returning an ``i32``.  |
2856+------------------------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
2857| ``<{ i8, i32 }>``            | A packed struct known to be 5 bytes in size.                                                                                                                                          |
2858+------------------------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
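
As an illustrative sketch, the module below defines identified, recursive and
packed structure types, then reads the second field of a structure both from
memory and from a register. All type and function names (``%pair``, ``%node``,
``%packed``, ``@second_in_memory``, ``@second_in_register``) are hypothetical.

.. code-block:: llvm

    %pair   = type { i32, float }      ; Identified normal struct type
    %node   = type { i32, %node* }     ; Identified types may be recursive
    %packed = type <{ i8, i32 }>       ; Identified packed struct type

    ; Structure in memory: compute a field pointer, then load from it.
    define float @second_in_memory(%pair* %p) {
    entry:
      %fieldptr = getelementptr inbounds %pair, %pair* %p, i32 0, i32 1
      %v = load float, float* %fieldptr
      ret float %v
    }

    ; Structure in a register: extract the field directly.
    define float @second_in_register(%pair %p) {
    entry:
      %v = extractvalue %pair %p, 1
      ret float %v
    }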
2859
2860.. _t_opaque:
2861
2862Opaque Structure Types
2863""""""""""""""""""""""
2864
2865:Overview:
2866
2867Opaque structure types are used to represent named structure types that
2868do not have a body specified. This corresponds (for example) to the C
2869notion of a forward declared structure.
2870
2871:Syntax:
2872
2873::
2874
2875      %X = type opaque
2876      %52 = type opaque
2877
2878:Examples:
2879
2880+--------------+-------------------+
2881| ``opaque``   | An opaque type.   |
2882+--------------+-------------------+
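
A short sketch of how an opaque type might be used, analogous to passing around
a pointer to a forward declared C structure. The names ``%handle``,
``@open_handle``, ``@close_handle`` and ``@use_handle`` are hypothetical.

.. code-block:: llvm

    ; The body of %handle is not specified in this module.
    %handle = type opaque

    declare %handle* @open_handle()
    declare void @close_handle(%handle*)

    define void @use_handle() {
    entry:
      ; Pointers to the opaque type can be passed around freely, but the
      ; pointee cannot be loaded, stored or indexed into here.
      %h = call %handle* @open_handle()
      call void @close_handle(%handle* %h)
      ret void
    }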
2883
2884.. _constants:
2885
2886Constants
2887=========
2888
2889LLVM has several different basic types of constants. This section
2890describes them all and their syntax.
2891
2892Simple Constants
2893----------------
2894
2895**Boolean constants**
2896    The two strings '``true``' and '``false``' are both valid constants
2897    of the ``i1`` type.
2898**Integer constants**
2899    Standard integers (such as '4') are constants of the
2900    :ref:`integer <t_integer>` type. Negative numbers may be used with
2901    integer types.
2902**Floating-point constants**
2903    Floating-point constants use standard decimal notation (e.g.
2904    123.421), exponential notation (e.g. 1.23421e+2), or a more precise
2905    hexadecimal notation (see below). The assembler requires the exact
2906    decimal value of a floating-point constant. For example, the
2907    assembler accepts 1.25 but rejects 1.3 because 1.3 is a repeating
2908    decimal in binary. Floating-point constants must have a
2909    :ref:`floating-point <t_floating>` type.
2910**Null pointer constants**
2911    The identifier '``null``' is recognized as a null pointer constant
2912    and must be of :ref:`pointer type <t_pointer>`.
2913**Token constants**
2914    The identifier '``none``' is recognized as an empty token constant
2915    and must be of :ref:`token type <t_token>`.
2916
2917The one non-intuitive notation for constants is the hexadecimal form of
2918floating-point constants. For example, the form
'``double 0x432ff973cafa8000``' is equivalent to (but harder to read
than) '``double 4.5e+15``'. The only time hexadecimal floating-point
constants are required (and the only time that they are generated by the
disassembler) is when a floating-point constant must be emitted but it
cannot be represented as a decimal floating-point number in a reasonable
number of digits. For example, NaNs, infinities, and other special
2925values are represented in their IEEE hexadecimal format so that assembly
2926and disassembly do not cause any bits to change in the constants.
2927
2928When using the hexadecimal form, constants of types half, float, and
2929double are represented using the 16-digit form shown above (which
matches the IEEE 754 representation for double); half and float values
2931must, however, be exactly representable as IEEE 754 half and single
2932precision, respectively. Hexadecimal format is always used for long
2933double, and there are three forms of long double. The 80-bit format used
2934by x86 is represented as ``0xK`` followed by 20 hexadecimal digits. The
2935128-bit format used by PowerPC (two adjacent doubles) is represented by
2936``0xM`` followed by 32 hexadecimal digits. The IEEE 128-bit format is
2937represented by ``0xL`` followed by 32 hexadecimal digits. Long doubles
2938will only work if they match the long double format on your target.
2939The IEEE 16-bit format (half precision) is represented by ``0xH``
2940followed by 4 hexadecimal digits. All hexadecimal formats are big-endian
2941(sign bit at the left).
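
The notations described in this section can be sketched with a few global
initializers. The global names are hypothetical; the decimal values were chosen
because they are exactly representable in their types.

.. code-block:: llvm

    @flag      = global i1 true
    @count     = global i32 -7
    @nothing   = global i32* null

    ; Decimal and exponential notation.
    @ratio     = global float 1.25
    @big       = global double 4.5e+15

    ; The same two values in the 16-digit hexadecimal (double) form.
    @ratio_hex = global float 0x3FF4000000000000
    @big_hex   = global double 0x432ff973cafa8000

    ; Type-specific hexadecimal prefixes: both constants encode 1.0.
    @h_one     = global half 0xH3C00
    @x87_one   = global x86_fp80 0xK3FFF8000000000000000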
2942
2943There are no constants of type x86_mmx.
2944
2945.. _complexconstants:
2946
2947Complex Constants
2948-----------------
2949
2950Complex constants are a (potentially recursive) combination of simple
2951constants and smaller complex constants.
2952
2953**Structure constants**
2954    Structure constants are represented with notation similar to
2955    structure type definitions (a comma separated list of elements,
2956    surrounded by braces (``{}``)). For example:
2957    "``{ i32 4, float 17.0, i32* @G }``", where "``@G``" is declared as
2958    "``@G = external global i32``". Structure constants must have
2959    :ref:`structure type <t_struct>`, and the number and types of elements
2960    must match those specified by the type.
2961**Array constants**
2962    Array constants are represented with notation similar to array type
2963    definitions (a comma separated list of elements, surrounded by
2964    square brackets (``[]``)). For example:
2965    "``[ i32 42, i32 11, i32 74 ]``". Array constants must have
2966    :ref:`array type <t_array>`, and the number and types of elements must
2967    match those specified by the type. As a special case, character array
2968    constants may also be represented as a double-quoted string using the ``c``
2969    prefix. For example: "``c"Hello World\0A\00"``".
2970**Vector constants**
2971    Vector constants are represented with notation similar to vector
2972    type definitions (a comma separated list of elements, surrounded by
2973    less-than/greater-than's (``<>``)). For example:
2974    "``< i32 42, i32 11, i32 74, i32 100 >``". Vector constants
2975    must have :ref:`vector type <t_vector>`, and the number and types of
2976    elements must match those specified by the type.
2977**Zero initialization**
    The string '``zeroinitializer``' can be used to zero initialize a
    value of *any* type, including scalar and
2980    :ref:`aggregate <t_aggregate>` types. This is often used to avoid
2981    having to print large zero initializers (e.g. for large arrays) and
2982    is always exactly equivalent to using explicit zero initializers.
2983**Metadata node**
2984    A metadata node is a constant tuple without types. For example:
2985    "``!{!0, !{!2, !0}, !"test"}``". Metadata can reference constant values,
2986    for example: "``!{!0, i32 0, i8* @global, i64 (i64)* @function, !"str"}``".
2987    Unlike other typed constants that are meant to be interpreted as part of
2988    the instruction stream, metadata is a place to attach additional
2989    information such as debug info.
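
The complex constant forms above can be combined into a small module, sketched
here with hypothetical global names. The initializers reuse the example
constants from this section.

.. code-block:: llvm

    @G = external global i32

    ; Structure constant; element types match the structure type.
    @s   = global { i32, float, i32* } { i32 4, float 17.0, i32* @G }

    ; Array constant, and a character array written with the c prefix
    ; (11 characters plus the \0A and \00 escapes gives 13 bytes).
    @a   = global [3 x i32] [ i32 42, i32 11, i32 74 ]
    @str = constant [13 x i8] c"Hello World\0A\00"

    ; Vector constant.
    @v   = global <4 x i32> < i32 42, i32 11, i32 74, i32 100 >

    ; Zero initialization of a large aggregate.
    @z   = global [1024 x i32] zeroinitializer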
2990
2991Global Variable and Function Addresses
2992--------------------------------------
2993
2994The addresses of :ref:`global variables <globalvars>` and
2995:ref:`functions <functionstructure>` are always implicitly valid
2996(link-time) constants. These constants are explicitly referenced when
2997the :ref:`identifier for the global <identifiers>` is used and always have
2998:ref:`pointer <t_pointer>` type. For example, the following is a legal LLVM
2999file:
3000
3001.. code-block:: llvm
3002
3003    @X = global i32 17
3004    @Y = global i32 42
3005    @Z = global [2 x i32*] [ i32* @X, i32* @Y ]
3006
3007.. _undefvalues:
3008
3009Undefined Values
3010----------------
3011
3012The string '``undef``' can be used anywhere a constant is expected, and
3013indicates that the user of the value may receive an unspecified
3014bit-pattern. Undefined values may be of any type (other than '``label``'
3015or '``void``') and be used anywhere a constant is permitted.
3016
3017Undefined values are useful because they indicate to the compiler that
3018the program is well defined no matter what value is used. This gives the
3019compiler more freedom to optimize. Here are some examples of
3020(potentially surprising) transformations that are valid (in pseudo IR):
3021
3022.. code-block:: llvm
3023
3024      %A = add %X, undef
3025      %B = sub %X, undef
3026      %C = xor %X, undef
3027    Safe:
3028      %A = undef
3029      %B = undef
3030      %C = undef
3031
3032This is safe because all of the output bits are affected by the undef
3033bits. Any output bit can have a zero or one depending on the input bits.
3034
3035.. code-block:: llvm
3036
3037      %A = or %X, undef
3038      %B = and %X, undef
3039    Safe:
3040      %A = -1
3041      %B = 0
3042    Safe:
3043      %A = %X  ;; By choosing undef as 0
3044      %B = %X  ;; By choosing undef as -1
3045    Unsafe:
3046      %A = undef
3047      %B = undef
3048
3049These logical operations have bits that are not always affected by the
3050input. For example, if ``%X`` has a zero bit, then the output of the
3051'``and``' operation will always be a zero for that bit, no matter what
3052the corresponding bit from the '``undef``' is. As such, it is unsafe to
3053optimize or assume that the result of the '``and``' is '``undef``'.
3054However, it is safe to assume that all bits of the '``undef``' could be
30550, and optimize the '``and``' to 0. Likewise, it is safe to assume that
3056all the bits of the '``undef``' operand to the '``or``' could be set,
3057allowing the '``or``' to be folded to -1.
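
To make the pseudo IR above concrete, here is a sketch in fully typed IR. The
function names ``@before`` and ``@after`` are hypothetical, and ``@after`` shows
only one of the results an optimizer would be permitted to produce, not what
any particular pass actually emits.

.. code-block:: llvm

    define i32 @before(i32 %X) {
    entry:
      %A = or i32 %X, undef
      %B = and i32 %X, undef
      %R = xor i32 %A, %B
      ret i32 %R
    }

    ; One permitted result: fold the 'or' to -1 and the 'and' to 0, so the
    ; xor of the two becomes the constant -1.
    define i32 @after(i32 %X) {
    entry:
      ret i32 -1
    }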
3058
3059.. code-block:: llvm
3060
3061      %A = select undef, %X, %Y
3062      %B = select undef, 42, %Y
3063      %C = select %X, %Y, undef
3064    Safe:
3065      %A = %X     (or %Y)
3066      %B = 42     (or %Y)
3067      %C = %Y
3068    Unsafe:
3069      %A = undef
3070      %B = undef
3071      %C = undef
3072
3073This set of examples shows that undefined '``select``' (and conditional
3074branch) conditions can go *either way*, but they have to come from one
3075of the two operands. In the ``%A`` example, if ``%X`` and ``%Y`` were
3076both known to have a clear low bit, then ``%A`` would have to have a
3077cleared low bit. However, in the ``%C`` example, the optimizer is
3078allowed to assume that the '``undef``' operand could be the same as
3079``%Y``, allowing the whole '``select``' to be eliminated.
3080
3081.. code-block:: text
3082
3083      %A = xor undef, undef
3084
3085      %B = undef
3086      %C = xor %B, %B
3087
3088      %D = undef
3089      %E = icmp slt %D, 4
      %F = icmp sge %D, 4
3091
3092    Safe:
3093      %A = undef
3094      %B = undef
3095      %C = undef
3096      %D = undef
3097      %E = undef
3098      %F = undef
3099
This example points out that two '``undef``' operands are not
necessarily the same. This can surprise people who assume that
"``X^X``" is always zero, even if ``X`` is undefined (the behavior also
matches C semantics). This isn't true for a number of reasons, but the
3104short answer is that an '``undef``' "variable" can arbitrarily change
3105its value over its "live range". This is true because the variable
3106doesn't actually *have a live range*. Instead, the value is logically
3107read from arbitrary registers that happen to be around when needed, so
3108the value is not necessarily consistent over time. In fact, ``%A`` and
3109``%C`` need to have the same semantics or the core LLVM "replace all
3110uses with" concept would not hold.
3111
3112.. code-block:: llvm
3113
3114      %A = sdiv undef, %X
3115      %B = sdiv %X, undef
3116    Safe:
3117      %A = 0
3118    b: unreachable
3119
3120These examples show the crucial difference between an *undefined value*
3121and *undefined behavior*. An undefined value (like '``undef``') is
3122allowed to have an arbitrary bit-pattern. This means that the ``%A``
3123operation can be constant folded to '``0``', because the '``undef``'
3124could be zero, and zero divided by any value is zero.
3125However, in the second example, we can make a more aggressive
3126assumption: because the ``undef`` is allowed to be an arbitrary value,
3127we are allowed to assume that it could be zero. Since a divide by zero
3128has *undefined behavior*, we are allowed to assume that the operation
3129does not execute at all. This allows us to delete the divide and all
3130code after it. Because the undefined operation "can't happen", the
3131optimizer can assume that it occurs in dead code.
3132
3133.. code-block:: text
3134
3135    a:  store undef -> %X
3136    b:  store %X -> undef
3137    Safe:
3138    a: <deleted>
3139    b: unreachable
3140
3141A store *of* an undefined value can be assumed to not have any effect;
3142we can assume that the value is overwritten with bits that happen to
match what was already there. However, a store *to* an undefined
location could clobber arbitrary memory; therefore, it has undefined
behavior.
3146
3147.. _poisonvalues:
3148
3149Poison Values
3150-------------
3151
Poison values are similar to :ref:`undef values <undefvalues>`; however,
they also represent the fact that an instruction or constant expression
3154that cannot evoke side effects has nevertheless detected a condition
3155that results in undefined behavior.
3156
3157There is currently no way of representing a poison value in the IR; they
3158only exist when produced by operations such as :ref:`add <i_add>` with
3159the ``nsw`` flag.
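
As a small illustration of such an operation, the sketch below uses a
hypothetical function ``@next``. The addition itself has no side effects, but
its result is a poison value whenever signed overflow occurs.

.. code-block:: llvm

    define i32 @next(i32 %x) {
    entry:
      ; If %x is 2147483647, the signed addition overflows and %r is a
      ; poison value; for every other input %r is simply %x + 1.
      %r = add nsw i32 %x, 1
      ret i32 %r
    }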
3160
3161Poison value behavior is defined in terms of value *dependence*:
3162
3163-  Values other than :ref:`phi <i_phi>` nodes depend on their operands.
3164-  :ref:`Phi <i_phi>` nodes depend on the operand corresponding to
3165   their dynamic predecessor basic block.
3166-  Function arguments depend on the corresponding actual argument values
3167   in the dynamic callers of their functions.
3168-  :ref:`Call <i_call>` instructions depend on the :ref:`ret <i_ret>`
3169   instructions that dynamically transfer control back to them.
3170-  :ref:`Invoke <i_invoke>` instructions depend on the
3171   :ref:`ret <i_ret>`, :ref:`resume <i_resume>`, or exception-throwing
3172   call instructions that dynamically transfer control back to them.
3173-  Non-volatile loads and stores depend on the most recent stores to all
3174   of the referenced memory addresses, following the order in the IR
   (including loads and stores implied by intrinsics such as
   :ref:`@llvm.memcpy <int_memcpy>`).
3177-  An instruction with externally visible side effects depends on the
3178   most recent preceding instruction with externally visible side
3179   effects, following the order in the IR. (This includes :ref:`volatile
3180   operations <volatile>`.)
3181-  An instruction *control-depends* on a :ref:`terminator
3182   instruction <terminators>` if the terminator instruction has
3183   multiple successors and the instruction is always executed when
3184   control transfers to one of the successors, and may not be executed
3185   when control is transferred to another.
3186-  Additionally, an instruction also *control-depends* on a terminator
3187   instruction if the set of instructions it otherwise depends on would
3188   be different if the terminator had transferred control to a different
3189   successor.
3190-  Dependence is transitive.
3191
3192Poison values have the same behavior as :ref:`undef values <undefvalues>`,
3193with the additional effect that any instruction that has a *dependence*
3194on a poison value has undefined behavior.
3195
3196Here are some examples:
3197
3198.. code-block:: llvm
3199
3200    entry:
3201      %poison = sub nuw i32 0, 1           ; Results in a poison value.
3202      %still_poison = and i32 %poison, 0   ; 0, but also poison.
3203      %poison_yet_again = getelementptr i32, i32* @h, i32 %still_poison
3204      store i32 0, i32* %poison_yet_again  ; memory at @h[0] is poisoned
3205
3206      store i32 %poison, i32* @g           ; Poison value stored to memory.
3207      %poison2 = load i32, i32* @g         ; Poison value loaded back from memory.
3208
3209      store volatile i32 %poison, i32* @g  ; External observation; undefined behavior.
3210
3211      %narrowaddr = bitcast i32* @g to i16*
3212      %wideaddr = bitcast i32* @g to i64*
3213      %poison3 = load i16, i16* %narrowaddr ; Returns a poison value.
3214      %poison4 = load i64, i64* %wideaddr  ; Returns a poison value.
3215
3216      %cmp = icmp slt i32 %poison, 0       ; Returns a poison value.
3217      br i1 %cmp, label %true, label %end  ; Branch to either destination.
3218
3219    true:
3220      store volatile i32 0, i32* @g        ; This is control-dependent on %cmp, so
3221                                           ; it has undefined behavior.
3222      br label %end
3223
3224    end:
3225      %p = phi i32 [ 0, %entry ], [ 1, %true ]
3226                                           ; Both edges into this PHI are
3227                                           ; control-dependent on %cmp, so this
3228                                           ; always results in a poison value.
3229
3230      store volatile i32 0, i32* @g        ; This would depend on the store in %true
3231                                           ; if %cmp is true, or the store in %entry
3232                                           ; otherwise, so this is undefined behavior.
3233
3234      br i1 %cmp, label %second_true, label %second_end
3235                                           ; The same branch again, but this time the
3236                                           ; true block doesn't have side effects.
3237
3238    second_true:
3239      ; No side effects!
3240      ret void
3241
3242    second_end:
3243      store volatile i32 0, i32* @g        ; This time, the instruction always depends
3244                                           ; on the store in %end. Also, it is
3245                                           ; control-equivalent to %end, so this is
3246                                           ; well-defined (ignoring earlier undefined
3247                                           ; behavior in this example).
3248
3249.. _blockaddress:
3250
3251Addresses of Basic Blocks
3252-------------------------
3253
3254``blockaddress(@function, %block)``
3255
3256The '``blockaddress``' constant computes the address of the specified
3257basic block in the specified function, and always has an ``i8*`` type.
3258Taking the address of the entry block is illegal.
3259
3260This value only has defined behavior when used as an operand to the
3261':ref:`indirectbr <i_indirectbr>`' instruction, or for comparisons
against null. Pointer equality tests between label addresses result in
3263undefined behavior --- though, again, comparison against null is ok, and
3264no label is equal to the null pointer. This may be passed around as an
opaque pointer-sized value as long as the bits are not inspected. This
3266allows ``ptrtoint`` and arithmetic to be performed on these values so
3267long as the original value is reconstituted before the ``indirectbr``
3268instruction.
3269
3270Finally, some targets may provide defined semantics when using the value
3271as the operand to an inline assembly, but that is target specific.
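
A minimal sketch of the intended use with ``indirectbr`` follows. The names
``@targets`` and ``@dispatch`` are hypothetical, and the sketch assumes the
caller only passes ``0`` or ``1`` for ``%i``.

.. code-block:: llvm

    ; A jump table holding the addresses of two non-entry blocks.
    @targets = constant [2 x i8*] [ i8* blockaddress(@dispatch, %case0),
                                    i8* blockaddress(@dispatch, %case1) ]

    define i32 @dispatch(i64 %i) {
    entry:
      %slot = getelementptr inbounds [2 x i8*], [2 x i8*]* @targets, i64 0, i64 %i
      %dest = load i8*, i8** %slot
      ; Every block whose address may reach here must be listed.
      indirectbr i8* %dest, [ label %case0, label %case1 ]

    case0:
      ret i32 10

    case1:
      ret i32 20
    }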
3272
3273.. _constantexprs:
3274
3275Constant Expressions
3276--------------------
3277
3278Constant expressions are used to allow expressions involving other
3279constants to be used as constants. Constant expressions may be of any
3280:ref:`first class <t_firstclass>` type and may involve any LLVM operation
3281that does not have side effects (e.g. load and call are not supported).
3282The following is the syntax for constant expressions:
3283
3284``trunc (CST to TYPE)``
3285    Perform the :ref:`trunc operation <i_trunc>` on constants.
3286``zext (CST to TYPE)``
3287    Perform the :ref:`zext operation <i_zext>` on constants.
3288``sext (CST to TYPE)``
3289    Perform the :ref:`sext operation <i_sext>` on constants.
3290``fptrunc (CST to TYPE)``
3291    Truncate a floating-point constant to another floating-point type.
3292    The size of CST must be larger than the size of TYPE. Both types
3293    must be floating-point.
3294``fpext (CST to TYPE)``
3295    Floating-point extend a constant to another type. The size of CST
    must be smaller than or equal to the size of TYPE. Both types must be
3297    floating-point.
3298``fptoui (CST to TYPE)``
3299    Convert a floating-point constant to the corresponding unsigned
3300    integer constant. TYPE must be a scalar or vector integer type. CST
3301    must be of scalar or vector floating-point type. Both CST and TYPE
3302    must be scalars, or vectors of the same number of elements. If the
3303    value won't fit in the integer type, the result is a
3304    :ref:`poison value <poisonvalues>`.
3305``fptosi (CST to TYPE)``
3306    Convert a floating-point constant to the corresponding signed
3307    integer constant. TYPE must be a scalar or vector integer type. CST
3308    must be of scalar or vector floating-point type. Both CST and TYPE
3309    must be scalars, or vectors of the same number of elements. If the
3310    value won't fit in the integer type, the result is a
3311    :ref:`poison value <poisonvalues>`.
3312``uitofp (CST to TYPE)``
3313    Convert an unsigned integer constant to the corresponding
3314    floating-point constant. TYPE must be a scalar or vector floating-point
3315    type.  CST must be of scalar or vector integer type. Both CST and TYPE must
3316    be scalars, or vectors of the same number of elements.
3317``sitofp (CST to TYPE)``
3318    Convert a signed integer constant to the corresponding floating-point
3319    constant. TYPE must be a scalar or vector floating-point type.
3320    CST must be of scalar or vector integer type. Both CST and TYPE must
3321    be scalars, or vectors of the same number of elements.
3322``ptrtoint (CST to TYPE)``
3323    Perform the :ref:`ptrtoint operation <i_ptrtoint>` on constants.
3324``inttoptr (CST to TYPE)``
3325    Perform the :ref:`inttoptr operation <i_inttoptr>` on constants.
3326    This one is *really* dangerous!
3327``bitcast (CST to TYPE)``
3328    Convert a constant, CST, to another TYPE.
3329    The constraints of the operands are the same as those for the
3330    :ref:`bitcast instruction <i_bitcast>`.
3331``addrspacecast (CST to TYPE)``
    Convert a constant pointer or constant vector of pointers, CST, to another
3333    TYPE in a different address space. The constraints of the operands are the
3334    same as those for the :ref:`addrspacecast instruction <i_addrspacecast>`.
3335``getelementptr (TY, CSTPTR, IDX0, IDX1, ...)``, ``getelementptr inbounds (TY, CSTPTR, IDX0, IDX1, ...)``
3336    Perform the :ref:`getelementptr operation <i_getelementptr>` on
3337    constants. As with the :ref:`getelementptr <i_getelementptr>`
3338    instruction, the index list may have one or more indexes, which are
3339    required to make sense for the type of "pointer to TY".
3340``select (COND, VAL1, VAL2)``
3341    Perform the :ref:`select operation <i_select>` on constants.
3342``icmp COND (VAL1, VAL2)``
3343    Perform the :ref:`icmp operation <i_icmp>` on constants.
3344``fcmp COND (VAL1, VAL2)``
3345    Perform the :ref:`fcmp operation <i_fcmp>` on constants.
3346``extractelement (VAL, IDX)``
3347    Perform the :ref:`extractelement operation <i_extractelement>` on
3348    constants.
3349``insertelement (VAL, ELT, IDX)``
3350    Perform the :ref:`insertelement operation <i_insertelement>` on
3351    constants.
3352``shufflevector (VEC1, VEC2, IDXMASK)``
3353    Perform the :ref:`shufflevector operation <i_shufflevector>` on
3354    constants.
3355``extractvalue (VAL, IDX0, IDX1, ...)``
3356    Perform the :ref:`extractvalue operation <i_extractvalue>` on
3357    constants. The index list is interpreted in a similar manner as
3358    indices in a ':ref:`getelementptr <i_getelementptr>`' operation. At
3359    least one index value must be specified.
3360``insertvalue (VAL, ELT, IDX0, IDX1, ...)``
3361    Perform the :ref:`insertvalue operation <i_insertvalue>` on constants.
3362    The index list is interpreted in a similar manner as indices in a
3363    ':ref:`getelementptr <i_getelementptr>`' operation. At least one index
3364    value must be specified.
3365``OPCODE (LHS, RHS)``
    Perform the specified operation on the LHS and RHS constants. OPCODE
3367    may be any of the :ref:`binary <binaryops>` or :ref:`bitwise
3368    binary <bitwiseops>` operations. The constraints on operands are
3369    the same as those for the corresponding instruction (e.g. no bitwise
3370    operations on floating-point values are allowed).
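
As a brief illustration, constant expressions typically appear as initializers
of global variables. The following is only a sketch: the global names
(``@arr``, ``@third``, ``@addr``, ``@past``) are invented for this example, and
it assumes the typed-pointer syntax used throughout this document.

.. code-block:: llvm

    @arr   = global [4 x i32] [i32 1, i32 2, i32 3, i32 4]

    ; Address of the third element, computed with a constant getelementptr.
    @third = global i32* getelementptr inbounds ([4 x i32], [4 x i32]* @arr, i64 0, i64 2)

    ; The address of @arr reinterpreted as an integer.
    @addr  = global i64 ptrtoint ([4 x i32]* @arr to i64)

    ; A binary operation applied to two constant operands.
    @past  = global i64 add (i64 ptrtoint ([4 x i32]* @arr to i64), i64 16)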
3371
3372Other Values
3373============
3374
3375.. _inlineasmexprs:
3376
3377Inline Assembler Expressions
3378----------------------------
3379
3380LLVM supports inline assembler expressions (as opposed to :ref:`Module-Level
3381Inline Assembly <moduleasm>`) through the use of a special value. This value
3382represents the inline assembler as a template string (containing the
3383instructions to emit), a list of operand constraints (stored as a string), a
3384flag that indicates whether or not the inline asm expression has side effects,
3385and a flag indicating whether the function containing the asm needs to align its
3386stack conservatively.
3387
3388The template string supports argument substitution of the operands using "``$``"
3389followed by a number, to indicate substitution of the given register/memory
3390location, as specified by the constraint string. "``${NUM:MODIFIER}``" may also
3391be used, where ``MODIFIER`` is a target-specific annotation for how to print the
operand (see :ref:`inline-asm-modifiers`).
3393
3394A literal "``$``" may be included by using "``$$``" in the template. To include
3395other special characters into the output, the usual "``\XX``" escapes may be
3396used, just as in other strings. Note that after template substitution, the
3397resulting assembly string is parsed by LLVM's integrated assembler unless it is
3398disabled -- even when emitting a ``.s`` file -- and thus must contain assembly
3399syntax known to LLVM.
3400
LLVM also supports a few more substitutions useful for writing inline assembly:
3402
3403- ``${:uid}``: Expands to a decimal integer unique to this inline assembly blob.
3404  This substitution is useful when declaring a local label. Many standard
3405  compiler optimizations, such as inlining, may duplicate an inline asm blob.
3406  Adding a blob-unique identifier ensures that the two labels will not conflict
3407  during assembly. This is used to implement `GCC's %= special format
3408  string <https://gcc.gnu.org/onlinedocs/gcc/Extended-Asm.html>`_.
3409- ``${:comment}``: Expands to the comment character of the current target's
3410  assembly dialect. This is usually ``#``, but many targets use other strings,
3411  such as ``;``, ``//``, or ``!``.
3412- ``${:private}``: Expands to the assembler private label prefix. Labels with
3413  this prefix will not appear in the symbol table of the assembled object.
3414  Typically the prefix is ``L``, but targets may use other strings. ``.L`` is
3415  relatively popular.
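
The following sketch shows how these substitutions might be combined to build
a local loop label. It assumes an x86 target with AT&T syntax; the label stem
``spin`` and the value ``%n`` are hypothetical names chosen for illustration.

.. code-block:: llvm

    ; Counts the operand down to zero. The label is private (kept out of the
    ; symbol table) and unique per asm blob, so a duplicated copy of this
    ; call (e.g. after inlining) gets its own label.
    %z = call i32 asm sideeffect "${:private}spin${:uid}:\0A\09decl $0 ${:comment} one step\0A\09jnz ${:private}spin${:uid}", "=r,0,~{flags}"(i32 %n)

On an ELF x86 target the label would be expected to expand to something of the
form ``.Lspin<N>``, where ``<N>`` is the blob-unique integer.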
3416
3417LLVM's support for inline asm is modeled closely on the requirements of Clang's
3418GCC-compatible inline-asm support. Thus, the feature-set and the constraint and
3419modifier codes listed here are similar or identical to those in GCC's inline asm
3420support. However, to be clear, the syntax of the template and constraint strings
3421described here is *not* the same as the syntax accepted by GCC and Clang, and,
3422while most constraint letters are passed through as-is by Clang, some get
3423translated to other codes when converting from the C source to the LLVM
3424assembly.
3425
3426An example inline assembler expression is:
3427
3428.. code-block:: llvm
3429
3430    i32 (i32) asm "bswap $0", "=r,r"
3431
3432Inline assembler expressions may **only** be used as the callee operand
3433of a :ref:`call <i_call>` or an :ref:`invoke <i_invoke>` instruction.
3434Thus, typically we have:
3435
3436.. code-block:: llvm
3437
3438    %X = call i32 asm "bswap $0", "=r,r"(i32 %Y)
3439
3440Inline asms with side effects not visible in the constraint list must be
3441marked as having side effects. This is done through the use of the
3442'``sideeffect``' keyword, like so:
3443
3444.. code-block:: llvm
3445
3446    call void asm sideeffect "eieio", ""()
3447
3448In some cases inline asms will contain code that will not work unless
3449the stack is aligned in some way, such as calls or SSE instructions on
3450x86, yet will not contain code that does that alignment within the asm.
3451The compiler should make conservative assumptions about what the asm
3452might contain and should generate its usual stack alignment code in the
3453prologue if the '``alignstack``' keyword is present:
3454
3455.. code-block:: llvm
3456
3457    call void asm alignstack "eieio", ""()
3458
3459Inline asms also support using non-standard assembly dialects. The
3460assumed dialect is ATT. When the '``inteldialect``' keyword is present,
3461the inline asm is using the Intel dialect. Currently, ATT and Intel are
3462the only supported dialects. An example is:
3463
3464.. code-block:: llvm
3465
3466    call void asm inteldialect "eieio", ""()
3467
If multiple keywords appear, the '``sideeffect``' keyword must come
3469first, the '``alignstack``' keyword second and the '``inteldialect``'
3470keyword last.
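
For instance, a call using all three keywords would be written in this order.
This is a sketch assuming an x86 target; the instruction is arbitrary.

.. code-block:: llvm

    ; sideeffect, then alignstack, then inteldialect.
    call void asm sideeffect alignstack inteldialect "mov eax, 1", "~{eax}"()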
3471
3472Inline Asm Constraint String
3473^^^^^^^^^^^^^^^^^^^^^^^^^^^^
3474
3475The constraint list is a comma-separated string, each element containing one or
3476more constraint codes.
3477
3478For each element in the constraint list an appropriate register or memory
3479operand will be chosen, and it will be made available to assembly template
3480string expansion as ``$0`` for the first constraint in the list, ``$1`` for the
3481second, etc.
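
The numbering can be seen in the following sketch, which assumes an x86-64
target with AT&T syntax; ``%a`` and ``%b`` are hypothetical ``i64`` values.

.. code-block:: llvm

    ; $0 names the first constraint (the output); $1 and $2 name the second
    ; and third constraints (the two inputs, %a and %b).
    %sum = call i64 asm "leaq ($1,$2), $0", "=r,r,r"(i64 %a, i64 %b)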
3482
3483There are three different types of constraints, which are distinguished by a
3484prefix symbol in front of the constraint code: Output, Input, and Clobber. The
3485constraints must always be given in that order: outputs first, then inputs, then
3486clobbers. They cannot be intermingled.
3487
3488There are also three different categories of constraint codes:
3489
3490- Register constraint. This is either a register class, or a fixed physical
3491  register. This kind of constraint will allocate a register, and if necessary,
3492  bitcast the argument or result to the appropriate type.
3493- Memory constraint. This kind of constraint is for use with an instruction
3494  taking a memory operand. Different constraints allow for different addressing
3495  modes used by the target.
3496- Immediate value constraint. This kind of constraint is for an integer or other
3497  immediate value which can be rendered directly into an instruction. The
3498  various target-specific constraints allow the selection of a value in the
3499  proper range for the instruction you wish to use it with.
3500
3501Output constraints
3502""""""""""""""""""
3503
3504Output constraints are specified by an "``=``" prefix (e.g. "``=r``"). This
3505indicates that the assembly will write to this operand, and the operand will
3506then be made available as a return value of the ``asm`` expression. Output
3507constraints do not consume an argument from the call instruction. (Except, see
3508below about indirect outputs).
3509
3510Normally, it is expected that no output locations are written to by the assembly
3511expression until *all* of the inputs have been read. As such, LLVM may assign
3512the same register to an output and an input. If this is not safe (e.g. if the
3513assembly contains two instructions, where the first writes to one output, and
3514the second reads an input and writes to a second output), then the "``&``"
3515modifier must be used (e.g. "``=&r``") to specify that the output is an
3516"early-clobber" output. Marking an output as "early-clobber" ensures that LLVM
3517will not use the same register for any inputs (other than an input tied to this
3518output).
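
A minimal sketch of a case that needs early-clobber, assuming an x86 target
with AT&T syntax and hypothetical ``i32`` values ``%a`` and ``%b``:

.. code-block:: llvm

    ; The first instruction writes $0 before the second one reads $2.
    ; Without '&', LLVM could assign $0 and $2 the same register, and the
    ; value of %b would be destroyed before it is used.
    %r = call i32 asm "movl $1, $0\0A\09addl $2, $0", "=&r,r,r,~{flags}"(i32 %a, i32 %b)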
3519
3520Input constraints
3521"""""""""""""""""
3522
3523Input constraints do not have a prefix -- just the constraint codes. Each input
3524constraint will consume one argument from the call instruction. It is not
3525permitted for the asm to write to any input register or memory location (unless
3526that input is tied to an output). Note also that multiple inputs may all be
3527assigned to the same register, if LLVM can determine that they necessarily all
3528contain the same value.
3529
3530Instead of providing a Constraint Code, input constraints may also "tie"
3531themselves to an output constraint, by providing an integer as the constraint
3532string. Tied inputs still consume an argument from the call instruction, and
3533take up a position in the asm template numbering as is usual -- they will simply
3534be constrained to always use the same register as the output they've been tied
3535to. For example, a constraint string of "``=r,0``" says to assign a register for
3536output, and use that register as an input as well (it being the 0'th
3537constraint).
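
As a sketch (assuming an x86 target with AT&T syntax; ``%x`` is a hypothetical
``i32`` value), a tied input lets an instruction update a register in place:

.. code-block:: llvm

    ; The input %x is tied to output 0, so both share one register and the
    ; asm only needs to refer to $0.
    %inc = call i32 asm "incl $0", "=r,0,~{flags}"(i32 %x)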
3538
3539It is permitted to tie an input to an "early-clobber" output. In that case, no
3540*other* input may share the same register as the input tied to the early-clobber
3541(even when the other input has the same value).
3542
3543You may only tie an input to an output which has a register constraint, not a
3544memory constraint. Only a single input may be tied to an output.
3545
3546There is also an "interesting" feature which deserves a bit of explanation: if a
3547register class constraint allocates a register which is too small for the value
3548type operand provided as input, the input value will be split into multiple
3549registers, and all of them passed to the inline asm.
3550
3551However, this feature is often not as useful as you might think.
3552
Firstly, the registers are *not* guaranteed to be consecutive. So, on those
architectures that have instructions which operate on multiple consecutive
registers, this is not an appropriate way to support them. (e.g. the 32-bit
SparcV8 has a 64-bit load instruction which names a single 32-bit register; the
hardware then loads into both the named register and the next register. This
feature of inline asm would not be useful to support that.)
3559
3560A few of the targets provide a template string modifier allowing explicit access
3561to the second register of a two-register operand (e.g. MIPS ``L``, ``M``, and
3562``D``). On such an architecture, you can actually access the second allocated
3563register (yet, still, not any subsequent ones). But, in that case, you're still
3564probably better off simply splitting the value into two separate operands, for
clarity. (e.g. see the description of the ``A`` constraint on X86, which,
despite existing only for use with this feature, is not really a good idea to
use.)
3568
3569Indirect inputs and outputs
3570"""""""""""""""""""""""""""
3571
3572Indirect output or input constraints can be specified by the "``*``" modifier
3573(which goes after the "``=``" in case of an output). This indicates that the asm
3574will write to or read from the contents of an *address* provided as an input
3575argument. (Note that in this way, indirect outputs act more like an *input* than
3576an output: just like an input, they consume an argument of the call expression,
3577rather than producing a return value. An indirect output constraint is an
3578"output" only in that the asm is expected to write to the contents of the input
3579memory location, instead of just read from it).
3580
This is most typically used for a memory constraint, e.g. "``=*m``", to pass the
3582address of a variable as a value.
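
A sketch of an indirect memory output, assuming an x86 target with AT&T syntax
and the typed-pointer syntax used in this document; ``%slot`` (an ``i32*``) and
``%v`` (an ``i32``) are hypothetical values.

.. code-block:: llvm

    ; The indirect output consumes the pointer argument %slot, and the asm
    ; writes %v to the memory it addresses. There is no direct output, so
    ; the call returns void.
    call void asm "movl $1, $0", "=*m,r"(i32* %slot, i32 %v)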
3583
3584It is also possible to use an indirect *register* constraint, but only on output
3585(e.g. "``=*r``"). This will cause LLVM to allocate a register for an output
3586value normally, and then, separately emit a store to the address provided as
input, after the provided inline asm. (It's not clear what value this
functionality provides, compared to writing the store explicitly after the asm
statement, and it can only produce worse code, since it bypasses many
optimization passes. Using it is not recommended.)
3591
3592
3593Clobber constraints
3594"""""""""""""""""""
3595
3596A clobber constraint is indicated by a "``~``" prefix. A clobber does not
3597consume an input operand, nor generate an output. Clobbers cannot use any of the
3598general constraint code letters -- they may use only explicit register
3599constraints, e.g. "``~{eax}``". The one exception is that a clobber string of
3600"``~{memory}``" indicates that the assembly writes to arbitrary undeclared
3601memory locations -- not only the memory pointed to by a declared indirect
3602output.
3603
3604Note that clobbering named registers that are also present in output
3605constraints is not legal.
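
Two sketches of clobber lists, assuming an x86 target with AT&T syntax:

.. code-block:: llvm

    ; A fence with no operands; memory is declared as clobbered.
    call void asm sideeffect "mfence", "~{memory}"()

    ; cpuid overwrites eax, ebx, ecx and edx. Here eax is an output (with a
    ; tied input selecting leaf 0), so only the other three are clobbers.
    %eax = call i32 asm sideeffect "cpuid", "={eax},0,~{ebx},~{ecx},~{edx}"(i32 0)

Note that ``eax`` does not appear in the clobber list of the second call,
because it is already named by an output constraint.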
3606
3607
3608Constraint Codes
3609""""""""""""""""

After a potential prefix comes the constraint code, or codes.
3611
3612A Constraint Code is either a single letter (e.g. "``r``"), a "``^``" character
3613followed by two letters (e.g. "``^wc``"), or "``{``" register-name "``}``"
3614(e.g. "``{eax}``").
3615
3616The one and two letter constraint codes are typically chosen to be the same as
3617GCC's constraint codes.
3618
A single constraint may include more than one constraint code, leaving
it up to LLVM to choose which one to use. This is included mainly for
compatibility with the translation of GCC inline asm coming from Clang.
3622
3623There are two ways to specify alternatives, and either or both may be used in an
3624inline asm constraint list:
3625
36261) Append the codes to each other, making a constraint code set. E.g. "``im``"
3627   or "``{eax}m``". This means "choose any of the options in the set". The
3628   choice of constraint is made independently for each constraint in the
3629   constraint list.
3630
36312) Use "``|``" between constraint code sets, creating alternatives. Every
3632   constraint in the constraint list must have the same number of alternative
3633   sets. With this syntax, the same alternative in *all* of the items in the
3634   constraint list will be chosen together.
3635
3636Putting those together, you might have a two operand constraint string like
3637``"rm|r,ri|rm"``. This indicates that if operand 0 is ``r`` or ``m``, then
3638operand 1 may be one of ``r`` or ``i``. If operand 0 is ``r``, then operand 1
may be one of ``r`` or ``m``. But, operand 0 and 1 cannot both be of type ``m``.
3640
3641However, the use of either of the alternatives features is *NOT* recommended, as
3642LLVM is not able to make an intelligent choice about which one to use. (At the
3643point it currently needs to choose, not enough information is available to do so
in a smart way.) Thus, it simply tries to make a choice that's most likely to
compile, not one that will give optimal performance. (e.g., given "``rm``", it'll
3646always choose to use memory, not registers). And, if given multiple registers,
3647or multiple register classes, it will simply choose the first one. (In fact, it
3648doesn't currently even ensure explicitly specified physical registers are
3649unique, so specifying multiple physical registers as alternatives, like
3650``{r11}{r12},{r11}{r12}``, will assign r11 to both operands, not at all what was
3651intended.)
3652
3653Supported Constraint Code List
3654""""""""""""""""""""""""""""""
3655
3656The constraint codes are, in general, expected to behave the same way they do in
3657GCC. LLVM's support is often implemented on an 'as-needed' basis, to support C
3658inline asm code which was supported by GCC. A mismatch in behavior between LLVM
3659and GCC likely indicates a bug in LLVM.
3660
3661Some constraint codes are typically supported by all targets:
3662
3663- ``r``: A register in the target's general purpose register class.
3664- ``m``: A memory address operand. It is target-specific what addressing modes
3665  are supported, typical examples are register, or register + register offset,
3666  or register + immediate offset (of some target-specific size).
3667- ``i``: An integer constant (of target-specific width). Allows either a simple
3668  immediate, or a relocatable value.
3669- ``n``: An integer constant -- *not* including relocatable values.
3670- ``s``: An integer constant, but allowing *only* relocatable values.
3671- ``X``: Allows an operand of any kind, no constraint whatsoever. Typically
3672  useful to pass a label for an asm branch or call.
3673
3674  .. FIXME: but that surely isn't actually okay to jump out of an asm
3675     block without telling llvm about the control transfer???)
3676
3677- ``{register-name}``: Requires exactly the named physical register.
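
As a sketch of some of these general codes, assuming an x86 target with AT&T
syntax and hypothetical ``i32`` values ``%x`` and ``%y``:

.. code-block:: llvm

    ; 'i' renders the constant 42 as an immediate operand of the add.
    %r = call i32 asm "addl $1, $0", "=r,i,0,~{flags}"(i32 42, i32 %x)

    ; '{eax}' pins the result to a named physical register; the tied input
    ; places %y in that same register.
    %s = call i32 asm "bswap $0", "={eax},0"(i32 %y)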
3678
3679Other constraints are target-specific:
3680
3681AArch64:
3682
3683- ``z``: An immediate integer 0. Outputs ``WZR`` or ``XZR``, as appropriate.
3684- ``I``: An immediate integer valid for an ``ADD`` or ``SUB`` instruction,
3685  i.e. 0 to 4095 with optional shift by 12.
3686- ``J``: An immediate integer that, when negated, is valid for an ``ADD`` or
3687  ``SUB`` instruction, i.e. -1 to -4095 with optional left shift by 12.
3688- ``K``: An immediate integer that is valid for the 'bitmask immediate 32' of a
3689  logical instruction like ``AND``, ``EOR``, or ``ORR`` with a 32-bit register.
3690- ``L``: An immediate integer that is valid for the 'bitmask immediate 64' of a
3691  logical instruction like ``AND``, ``EOR``, or ``ORR`` with a 64-bit register.
3692- ``M``: An immediate integer for use with the ``MOV`` assembly alias on a
3693  32-bit register. This is a superset of ``K``: in addition to the bitmask
3694  immediate, also allows immediate integers which can be loaded with a single
  ``MOVZ`` or ``MOVN`` instruction.
3696- ``N``: An immediate integer for use with the ``MOV`` assembly alias on a
3697  64-bit register. This is a superset of ``L``.
3698- ``Q``: Memory address operand must be in a single register (no
3699  offsets). (However, LLVM currently does this for the ``m`` constraint as
3700  well.)
3701- ``r``: A 32 or 64-bit integer register (W* or X*).
3702- ``w``: A 32, 64, or 128-bit floating-point/SIMD register.
3703- ``x``: A lower 128-bit floating-point/SIMD register (``V0`` to ``V15``).
3704
3705AMDGPU:
3706
3707- ``r``: A 32 or 64-bit integer register.
3708- ``[0-9]v``: The 32-bit VGPR register, number 0-9.
3709- ``[0-9]s``: The 32-bit SGPR register, number 0-9.
3710
3711
3712All ARM modes:
3713
3714- ``Q``, ``Um``, ``Un``, ``Uq``, ``Us``, ``Ut``, ``Uv``, ``Uy``: Memory address
3715  operand. Treated the same as operand ``m``, at the moment.
3716
3717ARM and ARM's Thumb2 mode:
3718
- ``j``: An immediate integer between 0 and 65535 (valid for ``MOVW``).
3720- ``I``: An immediate integer valid for a data-processing instruction.
3721- ``J``: An immediate integer between -4095 and 4095.
3722- ``K``: An immediate integer whose bitwise inverse is valid for a
3723  data-processing instruction. (Can be used with template modifier "``B``" to
3724  print the inverted value).
3725- ``L``: An immediate integer whose negation is valid for a data-processing
3726  instruction. (Can be used with template modifier "``n``" to print the negated
3727  value).
- ``M``: A power of two or an integer between 0 and 32.
3729- ``N``: Invalid immediate constraint.
3730- ``O``: Invalid immediate constraint.
3731- ``r``: A general-purpose 32-bit integer register (``r0-r15``).
3732- ``l``: In Thumb2 mode, low 32-bit GPR registers (``r0-r7``). In ARM mode, same
3733  as ``r``.
3734- ``h``: In Thumb2 mode, a high 32-bit GPR register (``r8-r15``). In ARM mode,
3735  invalid.
3736- ``w``: A 32, 64, or 128-bit floating-point/SIMD register: ``s0-s31``,
3737  ``d0-d31``, or ``q0-q15``.
3738- ``x``: A 32, 64, or 128-bit floating-point/SIMD register: ``s0-s15``,
3739  ``d0-d7``, or ``q0-q3``.
3740- ``t``: A low floating-point/SIMD register: ``s0-s31``, ``d0-d16``, or
3741  ``q0-q8``.
3742
3743ARM's Thumb1 mode:
3744
3745- ``I``: An immediate integer between 0 and 255.
3746- ``J``: An immediate integer between -255 and -1.
3747- ``K``: An immediate integer between 0 and 255, with optional left-shift by
3748  some amount.
3749- ``L``: An immediate integer between -7 and 7.
3750- ``M``: An immediate integer which is a multiple of 4 between 0 and 1020.
3751- ``N``: An immediate integer between 0 and 31.
3752- ``O``: An immediate integer which is a multiple of 4 between -508 and 508.
3753- ``r``: A low 32-bit GPR register (``r0-r7``).
3754- ``l``: A low 32-bit GPR register (``r0-r7``).
- ``h``: A high GPR register (``r8-r15``).
3756- ``w``: A 32, 64, or 128-bit floating-point/SIMD register: ``s0-s31``,
3757  ``d0-d31``, or ``q0-q15``.
3758- ``x``: A 32, 64, or 128-bit floating-point/SIMD register: ``s0-s15``,
3759  ``d0-d7``, or ``q0-q3``.
3760- ``t``: A low floating-point/SIMD register: ``s0-s31``, ``d0-d16``, or
3761  ``q0-q8``.
3762
3763
3764Hexagon:
3765
3766- ``o``, ``v``: A memory address operand, treated the same as constraint ``m``,
3767  at the moment.
3768- ``r``: A 32 or 64-bit register.
3769
3770MSP430:
3771
3772- ``r``: An 8 or 16-bit register.
3773
3774MIPS:
3775
3776- ``I``: An immediate signed 16-bit integer.
3777- ``J``: An immediate integer zero.
3778- ``K``: An immediate unsigned 16-bit integer.
3779- ``L``: An immediate 32-bit integer, where the lower 16 bits are 0.
3780- ``N``: An immediate integer between -65535 and -1.
3781- ``O``: An immediate signed 15-bit integer.
3782- ``P``: An immediate integer between 1 and 65535.
3783- ``m``: A memory address operand. In MIPS-SE mode, allows a base address
3784  register plus 16-bit immediate offset. In MIPS mode, just a base register.
3785- ``R``: A memory address operand. In MIPS-SE mode, allows a base address
3786  register plus a 9-bit signed offset. In MIPS mode, the same as constraint
3787  ``m``.
3788- ``ZC``: A memory address operand, suitable for use in a ``pref``, ``ll``, or
3789  ``sc`` instruction on the given subtarget (details vary).
3790- ``r``, ``d``,  ``y``: A 32 or 64-bit GPR register.
3791- ``f``: A 32 or 64-bit FPU register (``F0-F31``), or a 128-bit MSA register
3792  (``W0-W31``). In the case of MSA registers, it is recommended to use the ``w``
3793  argument modifier for compatibility with GCC.
3794- ``c``: A 32-bit or 64-bit GPR register suitable for indirect jump (always
3795  ``25``).
3796- ``l``: The ``lo`` register, 32 or 64-bit.
3797- ``x``: Invalid.
3798
3799NVPTX:
3800
3801- ``b``: A 1-bit integer register.
3802- ``c`` or ``h``: A 16-bit integer register.
3803- ``r``: A 32-bit integer register.
3804- ``l`` or ``N``: A 64-bit integer register.
3805- ``f``: A 32-bit float register.
3806- ``d``: A 64-bit float register.
3807
3808
3809PowerPC:
3810
3811- ``I``: An immediate signed 16-bit integer.
3812- ``J``: An immediate unsigned 16-bit integer, shifted left 16 bits.
3813- ``K``: An immediate unsigned 16-bit integer.
3814- ``L``: An immediate signed 16-bit integer, shifted left 16 bits.
3815- ``M``: An immediate integer greater than 31.
3816- ``N``: An immediate integer that is an exact power of 2.
3817- ``O``: The immediate integer constant 0.
3818- ``P``: An immediate integer constant whose negation is a signed 16-bit
3819  constant.
3820- ``es``, ``o``, ``Q``, ``Z``, ``Zy``: A memory address operand, currently
3821  treated the same as ``m``.
3822- ``r``: A 32 or 64-bit integer register.
3823- ``b``: A 32 or 64-bit integer register, excluding ``R0`` (that is:
3824  ``R1-R31``).
3825- ``f``: A 32 or 64-bit float register (``F0-F31``), or when QPX is enabled, a
3826  128 or 256-bit QPX register (``Q0-Q31``; aliases the ``F`` registers).
3827- ``v``: For ``4 x f32`` or ``4 x f64`` types, when QPX is enabled, a
3828  128 or 256-bit QPX register (``Q0-Q31``), otherwise a 128-bit
3829  altivec vector register (``V0-V31``).
3830
3831  .. FIXME: is this a bug that v accepts QPX registers? I think this
3832     is supposed to only use the altivec vector registers?
3833
3834- ``y``: Condition register (``CR0-CR7``).
3835- ``wc``: An individual CR bit in a CR register.
3836- ``wa``, ``wd``, ``wf``: Any 128-bit VSX vector register, from the full VSX
3837  register set (overlapping both the floating-point and vector register files).
3838- ``ws``: A 32 or 64-bit floating-point register, from the full VSX register
3839  set.
3840
3841Sparc:
3842
3843- ``I``: An immediate 13-bit signed integer.
3844- ``r``: A 32-bit integer register.
3845- ``f``: Any floating-point register on SparcV8, or a floating-point
3846  register in the "low" half of the registers on SparcV9.
3847- ``e``: Any floating-point register. (Same as ``f`` on SparcV8.)
3848
3849SystemZ:
3850
3851- ``I``: An immediate unsigned 8-bit integer.
3852- ``J``: An immediate unsigned 12-bit integer.
3853- ``K``: An immediate signed 16-bit integer.
3854- ``L``: An immediate signed 20-bit integer.
3855- ``M``: An immediate integer 0x7fffffff.
3856- ``Q``: A memory address operand with a base address and a 12-bit immediate
3857  unsigned displacement.
3858- ``R``: A memory address operand with a base address, a 12-bit immediate
3859  unsigned displacement, and an index register.
3860- ``S``: A memory address operand with a base address and a 20-bit immediate
3861  signed displacement.
3862- ``T``: A memory address operand with a base address, a 20-bit immediate
3863  signed displacement, and an index register.
3864- ``r`` or ``d``: A 32, 64, or 128-bit integer register.
3865- ``a``: A 32, 64, or 128-bit integer address register (excludes R0, which in an
3866  address context evaluates as zero).
- ``h``: A 32-bit value in the high part of a 64-bit data register
  (LLVM-specific).
3869- ``f``: A 32, 64, or 128-bit floating-point register.
3870
3871X86:
3872
3873- ``I``: An immediate integer between 0 and 31.
- ``J``: An immediate integer between 0 and 63.
3875- ``K``: An immediate signed 8-bit integer.
3876- ``L``: An immediate integer, 0xff or 0xffff or (in 64-bit mode only)
3877  0xffffffff.
3878- ``M``: An immediate integer between 0 and 3.
3879- ``N``: An immediate unsigned 8-bit integer.
3880- ``O``: An immediate integer between 0 and 127.
3881- ``e``: An immediate 32-bit signed integer.
3882- ``Z``: An immediate 32-bit unsigned integer.
3883- ``o``, ``v``: Treated the same as ``m``, at the moment.
3884- ``q``: An 8, 16, 32, or 64-bit register which can be accessed as an 8-bit
3885  ``l`` integer register. On X86-32, this is the ``a``, ``b``, ``c``, and ``d``
3886  registers, and on X86-64, it is all of the integer registers.
3887- ``Q``: An 8, 16, 32, or 64-bit register which can be accessed as an 8-bit
3888  ``h`` integer register. This is the ``a``, ``b``, ``c``, and ``d`` registers.
3889- ``r`` or ``l``: An 8, 16, 32, or 64-bit integer register.
3890- ``R``: An 8, 16, 32, or 64-bit "legacy" integer register -- one which has
3891  existed since i386, and can be accessed without the REX prefix.
3892- ``f``: A 32, 64, or 80-bit '387 FPU stack pseudo-register.
3893- ``y``: A 64-bit MMX register, if MMX is enabled.
3894- ``x``: If SSE is enabled: a 32 or 64-bit scalar operand, or 128-bit vector
3895  operand in a SSE register. If AVX is also enabled, can also be a 256-bit
3896  vector operand in an AVX register. If AVX-512 is also enabled, can also be a
  512-bit vector operand in an AVX512 register. Otherwise, an error.
3898- ``Y``: The same as ``x``, if *SSE2* is enabled, otherwise an error.
3899- ``A``: Special case: allocates EAX first, then EDX, for a single operand (in
3900  32-bit mode, a 64-bit integer operand will get split into two registers). It
3901  is not recommended to use this constraint, as in 64-bit mode, the 64-bit
3902  operand will get allocated only to RAX -- if two 32-bit operands are needed,
3903  you're better off splitting it yourself, before passing it to the asm
3904  statement.
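
A sketch of the ``x`` constraint, assuming an x86 target with SSE enabled and
AT&T syntax; ``%a`` and ``%b`` are hypothetical ``<4 x float>`` values.

.. code-block:: llvm

    ; Both vectors are placed in SSE registers; %a is tied to the output so
    ; that addps can accumulate into it.
    %v = call <4 x float> asm "addps $1, $0", "=x,x,0"(<4 x float> %b, <4 x float> %a)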
3905
3906XCore:
3907
3908- ``r``: A 32-bit integer register.
3909
3910
3911.. _inline-asm-modifiers:
3912
3913Asm template argument modifiers
3914^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
3915
3916In the asm template string, modifiers can be used on the operand reference, like
3917"``${0:n}``".
3918
3919The modifiers are, in general, expected to behave the same way they do in
3920GCC. LLVM's support is often implemented on an 'as-needed' basis, to support C
3921inline asm code which was supported by GCC. A mismatch in behavior between LLVM
3922and GCC likely indicates a bug in LLVM.
3923
3924Target-independent:
3925
3926- ``c``: Print an immediate integer constant unadorned, without
3927  the target-specific immediate punctuation (e.g. no ``$`` prefix).
3928- ``n``: Negate and print immediate integer constant unadorned, without the
3929  target-specific immediate punctuation (e.g. no ``$`` prefix).
3930- ``l``: Print as an unadorned label, without the target-specific label
3931  punctuation (e.g. no ``$`` prefix).
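
For example, the ``c`` modifier allows an immediate to be used where the
target's immediate punctuation would be invalid, such as an address
displacement. This sketch assumes an x86-64 target with AT&T syntax; ``%base``
is a hypothetical ``i64`` value.

.. code-block:: llvm

    ; Operand 1 is printed as a bare "8" rather than "$8", giving an
    ; instruction of the form "leaq 8(<base>), <result>".
    %p = call i64 asm "leaq ${1:c}($2), $0", "=r,i,r"(i64 8, i64 %base)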
3932
3933AArch64:
3934
3935- ``w``: Print a GPR register with a ``w*`` name instead of ``x*`` name. E.g.,
3936  instead of ``x30``, print ``w30``.
- ``x``: Print a GPR register with a ``x*`` name. (This is the default, anyhow.)
3938- ``b``, ``h``, ``s``, ``d``, ``q``: Print a floating-point/SIMD register with a
3939  ``b*``, ``h*``, ``s*``, ``d*``, or ``q*`` name, rather than the default of
3940  ``v*``.
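
A sketch of the ``w`` modifier, assuming an AArch64 target; ``%a`` and ``%b``
are hypothetical ``i32`` values.

.. code-block:: llvm

    ; The operands are 32-bit, so each register is printed with its w* name
    ; to select the 32-bit form of the add instruction.
    %sum = call i32 asm "add ${0:w}, ${1:w}, ${2:w}", "=r,r,r"(i32 %a, i32 %b)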
3941
3942AMDGPU:
3943
3944- ``r``: No effect.
3945
3946ARM:
3947
3948- ``a``: Print an operand as an address (with ``[`` and ``]`` surrounding a
3949  register).
3950- ``P``: No effect.
3951- ``q``: No effect.
3952- ``y``: Print a VFP single-precision register as an indexed double (e.g. print
  as ``d4[1]`` instead of ``s9``).
3954- ``B``: Bitwise invert and print an immediate integer constant without ``#``
3955  prefix.
- ``L``: Print the low 16 bits of an immediate integer constant.
3957- ``M``: Print as a register set suitable for ldm/stm. Also prints *all*
3958  register operands subsequent to the specified one (!), so use carefully.
3959- ``Q``: Print the low-order register of a register-pair, or the low-order
3960  register of a two-register operand.
3961- ``R``: Print the high-order register of a register-pair, or the high-order
3962  register of a two-register operand.
- ``H``: Print the second register of a register-pair. (On a big-endian system,
  ``H`` is equivalent to ``Q``, and on a little-endian system, ``H`` is
  equivalent to ``R``.)
3966
3967  .. FIXME: H doesn't currently support printing the second register
3968     of a two-register operand.
3969
3970- ``e``: Print the low doubleword register of a NEON quad register.
3971- ``f``: Print the high doubleword register of a NEON quad register.
3972- ``m``: Print the base register of a memory operand without the ``[`` and ``]``
3973  adornment.
3974
3975Hexagon:
3976
3977- ``L``: Print the second register of a two-register operand. Requires that it
3978  has been allocated consecutively to the first.
3979
3980  .. FIXME: why is it restricted to consecutive ones? And there's
3981     nothing that ensures that happens, is there?
3982
3983- ``I``: Print the letter 'i' if the operand is an integer constant, otherwise
3984  nothing. Used to print 'addi' vs 'add' instructions.
3985
3986MSP430:
3987
3988No additional modifiers.
3989
3990MIPS:
3991
- ``X``: Print an immediate integer as hexadecimal.
3993- ``x``: Print the low 16 bits of an immediate integer as hexadecimal.
3994- ``d``: Print an immediate integer as decimal.
3995- ``m``: Subtract one and print an immediate integer as decimal.
- ``z``: Print ``$0`` if an immediate zero, otherwise print normally.
- ``L``: Print the low-order register of a two-register operand, or print the
  address of the low-order word of a double-word memory operand.
3999
4000  .. FIXME: L seems to be missing memory operand support.
4001
- ``M``: Print the high-order register of a two-register operand, or print the
  address of the high-order word of a double-word memory operand.
4004
4005  .. FIXME: M seems to be missing memory operand support.
4006
- ``D``: Print the second register of a two-register operand, or print the
  second word of a double-word memory operand. (On a big-endian system, ``D`` is
  equivalent to ``L``, and on a little-endian system, ``D`` is equivalent to
  ``M``.)
4011- ``w``: No effect. Provided for compatibility with GCC which requires this
4012  modifier in order to print MSA registers (``W0-W31``) with the ``f``
4013  constraint.
4014
4015NVPTX:
4016
4017- ``r``: No effect.
4018
4019PowerPC:
4020
4021- ``L``: Print the second register of a two-register operand. Requires that it
4022  has been allocated consecutively to the first.
4023
4024  .. FIXME: why is it restricted to consecutive ones? And there's
4025     nothing that ensures that happens, is there?
4026
4027- ``I``: Print the letter 'i' if the operand is an integer constant, otherwise
4028  nothing. Used to print 'addi' vs 'add' instructions.
- ``y``: For a memory operand, print the operand formatted for a two-register
  X-form instruction. (Currently always prints ``r0,OPERAND``.)
- ``U``: Print 'u' if the memory operand is an update form, and nothing
  otherwise. (NOTE: LLVM does not support update form, so this will currently
  always print nothing.)
- ``X``: Print 'x' if the memory operand is an indexed form. (NOTE: LLVM does
  not support indexed form, so this will currently always print nothing.)
4036
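The ``I`` modifier is usually paired with a constraint that accepts either a
register or an immediate. A minimal sketch, assuming the PowerPC ``I``
constraint (a signed 16-bit constant) and hypothetical function names:

.. code-block:: llvm

    define i32 @add_imm(i32 %a) {
      ; Operand 2 is a constant, so ${2:I} is expected to print "i",
      ; selecting the "addi" mnemonic. The "b" constraint keeps operand 1
      ; out of r0, which "addi" would treat as a literal zero.
      %r = call i32 asm "add${2:I} $0, $1, $2", "=r,b,rI"(i32 %a, i32 42)
      ret i32 %r
    }

    define i32 @add_reg(i32 %a, i32 %b) {
      ; Operand 2 lives in a register, so ${2:I} prints nothing, leaving
      ; the plain "add" mnemonic.
      %r = call i32 asm "add${2:I} $0, $1, $2", "=r,b,rI"(i32 %a, i32 %b)
      ret i32 %r
    }
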
4037Sparc:
4038
4039- ``r``: No effect.
4040
4041SystemZ:
4042
4043SystemZ implements only ``n``, and does *not* support any of the other
4044target-independent modifiers.
4045
4046X86:
4047
4048- ``c``: Print an unadorned integer or symbol name. (The latter is
4049  target-specific behavior for this typically target-independent modifier).
4050- ``A``: Print a register name with a '``*``' before it.
4051- ``b``: Print an 8-bit register name (e.g. ``al``); do nothing on a memory
4052  operand.
4053- ``h``: Print the upper 8-bit register name (e.g. ``ah``); do nothing on a
4054  memory operand.
4055- ``w``: Print the 16-bit register name (e.g. ``ax``); do nothing on a memory
4056  operand.
4057- ``k``: Print the 32-bit register name (e.g. ``eax``); do nothing on a memory
4058  operand.
4059- ``q``: Print the 64-bit register name (e.g. ``rax``), if 64-bit registers are
4060  available, otherwise the 32-bit register name; do nothing on a memory operand.
4061- ``n``: Negate and print an unadorned integer, or, for operands other than an
4062  immediate integer (e.g. a relocatable symbol expression), print a '-' before
  the operand. (The behavior for relocatable symbol expressions is a
  target-specific behavior for this typically target-independent modifier.)
4065- ``H``: Print a memory reference with additional offset +8.
4066- ``P``: Print a memory reference or operand for use as the argument of a call
4067  instruction. (E.g. omit ``(rip)``, even though it's PC-relative.)
4068
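As a brief illustration of the register-size modifiers, the sketch below
zero-extends the low byte of a value. The function name is hypothetical and
the asm string uses AT&T operand order:

.. code-block:: llvm

    define i32 @zext_low_byte(i32 %x) {
      ; ${1:b} prints the 8-bit name of the register holding %x (e.g. "al")
      ; and ${0:k} the 32-bit name of the result register (e.g. "eax").
      ; The "q" constraint keeps %x in a register that has an 8-bit name.
      %r = call i32 asm "movzbl ${1:b}, ${0:k}", "=r,q"(i32 %x)
      ret i32 %r
    }
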
4069XCore:
4070
4071No additional modifiers.
4072
4073
4074Inline Asm Metadata
4075^^^^^^^^^^^^^^^^^^^
4076
The call instructions that wrap inline asm nodes may have a
"``!srcloc``" MDNode attached to them that contains a list of constant
integers. If present, the code generator will use the integer as the
location cookie value when reporting errors through the ``LLVMContext``
error reporting mechanisms. This allows a front-end to correlate backend
errors that occur with inline asm back to the source code that produced
it. For example:
4084
4085.. code-block:: llvm
4086
4087    call void asm sideeffect "something bad", ""(), !srcloc !42
4088    ...
4089    !42 = !{ i32 1234567 }
4090
4091It is up to the front-end to make sense of the magic numbers it places
4092in the IR. If the MDNode contains multiple constants, the code generator
4093will use the one that corresponds to the line of the asm that the error
4094occurs on.
4095
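For instance, a two-line asm string could carry one cookie per line. This is
a sketch only; the cookie values are arbitrary placeholders chosen by a
hypothetical front-end:

.. code-block:: llvm

    define void @f() {
      ; "\0A" separates the two lines of the asm string.
      call void asm sideeffect "nop\0Asomething bad", ""(), !srcloc !0
      ret void
    }

    ; One cookie per asm line: an error on the second line would be
    ; reported with the cookie 1234570.
    !0 = !{i32 1234567, i32 1234570}
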
4096.. _metadata:
4097
4098Metadata
4099========
4100
4101LLVM IR allows metadata to be attached to instructions in the program
4102that can convey extra information about the code to the optimizers and
4103code generator. One example application of metadata is source-level
4104debug information. There are two metadata primitives: strings and nodes.
4105
4106Metadata does not have a type, and is not a value. If referenced from a
4107``call`` instruction, it uses the ``metadata`` type.
4108
All metadata are identified in syntax by an exclamation point ('``!``').
4110
4111.. _metadata-string:
4112
4113Metadata Nodes and Metadata Strings
4114-----------------------------------
4115
4116A metadata string is a string surrounded by double quotes. It can
4117contain any character by escaping non-printable characters with
4118"``\xx``" where "``xx``" is the two digit hex code. For example:
4119"``!"test\00"``".
4120
Metadata nodes are represented with notation similar to structure
constants (a comma separated list of elements, surrounded by braces and
preceded by an exclamation point). Metadata nodes can have any values as
their operands. For example:
4125
4126.. code-block:: llvm
4127
4128    !{ !"test\00", i32 10}
4129
4130Metadata nodes that aren't uniqued use the ``distinct`` keyword. For example:
4131
4132.. code-block:: text
4133
4134    !0 = distinct !{!"test\00", i32 10}
4135
4136``distinct`` nodes are useful when nodes shouldn't be merged based on their
4137content. They can also occur when transformations cause uniquing collisions
4138when metadata operands change.
4139
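The difference can be seen by defining the same tuple several times. In this
sketch (node numbers are arbitrary), the first two definitions describe one
uniqued node, while each ``distinct`` definition creates a separate node:

.. code-block:: text

    !0 = !{!"test\00", i32 10}
    !1 = !{!"test\00", i32 10}          ; same content as !0: uniqued with it
    !2 = distinct !{!"test\00", i32 10} ; a node of its own
    !3 = distinct !{!"test\00", i32 10} ; never merged with !2
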
4140A :ref:`named metadata <namedmetadatastructure>` is a collection of
4141metadata nodes, which can be looked up in the module symbol table. For
4142example:
4143
4144.. code-block:: llvm
4145
4146    !foo = !{!4, !3}
4147
4148Metadata can be used as function arguments. Here the ``llvm.dbg.value``
4149intrinsic is using three metadata arguments:
4150
4151.. code-block:: llvm
4152
4153    call void @llvm.dbg.value(metadata !24, metadata !25, metadata !26)
4154
4155Metadata can be attached to an instruction. Here metadata ``!21`` is attached
4156to the ``add`` instruction using the ``!dbg`` identifier:
4157
4158.. code-block:: llvm
4159
4160    %indvar.next = add i64 %indvar, 1, !dbg !21
4161
Metadata can also be attached to a function or a global variable. Here metadata
``!22`` is attached to the ``f1`` and ``f2`` functions, and the globals ``g1``
and ``g2`` using the ``!dbg`` identifier:
4165
4166.. code-block:: llvm
4167
4168    declare !dbg !22 void @f1()
4169    define void @f2() !dbg !22 {
4170      ret void
4171    }
4172
4173    @g1 = global i32 0, !dbg !22
4174    @g2 = external global i32, !dbg !22
4175
A transformation is required to drop any metadata attachment that it does not
know or knows it can't preserve. Currently there is an exception for metadata
attachments to globals for ``!type`` and ``!absolute_symbol``, which can't be
unconditionally dropped unless the global is itself deleted.
4180
4181Metadata attached to a module using named metadata may not be dropped, with
4182the exception of debug metadata (named metadata with the name ``!llvm.dbg.*``).
4183
4184More information about specific metadata nodes recognized by the
4185optimizers and code generator is found below.
4186
4187.. _specialized-metadata:
4188
4189Specialized Metadata Nodes
4190^^^^^^^^^^^^^^^^^^^^^^^^^^
4191
4192Specialized metadata nodes are custom data structures in metadata (as opposed
4193to generic tuples). Their fields are labelled, and can be specified in any
4194order.
4195
4196These aren't inherently debug info centric, but currently all the specialized
4197metadata nodes are related to debug info.
4198
4199.. _DICompileUnit:
4200
4201DICompileUnit
4202"""""""""""""
4203
4204``DICompileUnit`` nodes represent a compile unit. The ``enums:``,
4205``retainedTypes:``, ``globals:``, ``imports:`` and ``macros:`` fields are tuples
4206containing the debug info to be emitted along with the compile unit, regardless
4207of code optimizations (some nodes are only emitted if there are references to
4208them from instructions). The ``debugInfoForProfiling:`` field is a boolean
4209indicating whether or not line-table discriminators are updated to provide
4210more-accurate debug info for profiling results.
4211
4212.. code-block:: text
4213
4214    !0 = !DICompileUnit(language: DW_LANG_C99, file: !1, producer: "clang",
4215                        isOptimized: true, flags: "-O2", runtimeVersion: 2,
4216                        splitDebugFilename: "abc.debug", emissionKind: FullDebug,
4217                        enums: !2, retainedTypes: !3, globals: !4, imports: !5,
4218                        macros: !6, dwoId: 0x0abcd)
4219
4220Compile unit descriptors provide the root scope for objects declared in a
4221specific compilation unit. File descriptors are defined using this scope.  These
4222descriptors are collected by a named metadata node ``!llvm.dbg.cu``. They keep
4223track of global variables, type information, and imported entities (declarations
4224and namespaces).
4225
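A minimal sketch of how a compile unit is registered follows. The file name
and producer string are placeholders; the ``"Debug Info Version"`` module flag
is included because LLVM discards debug info whose version it does not
recognize:

.. code-block:: text

    !llvm.dbg.cu = !{!0}
    !llvm.module.flags = !{!2}

    !0 = distinct !DICompileUnit(language: DW_LANG_C99, file: !1,
                                 producer: "clang", isOptimized: false,
                                 runtimeVersion: 0, emissionKind: FullDebug)
    !1 = !DIFile(filename: "a.c", directory: "/path/to/dir")
    !2 = !{i32 2, !"Debug Info Version", i32 3}
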
4226.. _DIFile:
4227
4228DIFile
4229""""""
4230
4231``DIFile`` nodes represent files. The ``filename:`` can include slashes.
4232
4233.. code-block:: none
4234
4235    !0 = !DIFile(filename: "path/to/file", directory: "/path/to/dir",
4236                 checksumkind: CSK_MD5,
4237                 checksum: "000102030405060708090a0b0c0d0e0f")
4238
Files are sometimes used in ``scope:`` fields, and are the only valid target
for ``file:`` fields.
Valid values for the ``checksumkind:`` field are ``CSK_None``, ``CSK_MD5`` and
``CSK_SHA1``.
4242
4243.. _DIBasicType:
4244
4245DIBasicType
4246"""""""""""
4247
4248``DIBasicType`` nodes represent primitive types, such as ``int``, ``bool`` and
4249``float``. ``tag:`` defaults to ``DW_TAG_base_type``.
4250
4251.. code-block:: text
4252
4253    !0 = !DIBasicType(name: "unsigned char", size: 8, align: 8,
4254                      encoding: DW_ATE_unsigned_char)
4255    !1 = !DIBasicType(tag: DW_TAG_unspecified_type, name: "decltype(nullptr)")
4256
4257The ``encoding:`` describes the details of the type. Usually it's one of the
4258following:
4259
4260.. code-block:: text
4261
4262  DW_ATE_address       = 1
4263  DW_ATE_boolean       = 2
4264  DW_ATE_float         = 4
4265  DW_ATE_signed        = 5
4266  DW_ATE_signed_char   = 6
4267  DW_ATE_unsigned      = 7
4268  DW_ATE_unsigned_char = 8
4269
4270.. _DISubroutineType:
4271
4272DISubroutineType
4273""""""""""""""""
4274
4275``DISubroutineType`` nodes represent subroutine types. Their ``types:`` field
4276refers to a tuple; the first operand is the return type, while the rest are the
4277types of the formal arguments in order. If the first operand is ``null``, that
4278represents a function with no return value (such as ``void foo() {}`` in C++).
4279
4280.. code-block:: text
4281
    !0 = !DIBasicType(name: "int", size: 32, align: 32, encoding: DW_ATE_signed)
    !1 = !DIBasicType(name: "char", size: 8, align: 8,
                      encoding: DW_ATE_signed_char)
4284    !2 = !DISubroutineType(types: !{null, !0, !1}) ; void (int, char)
4285
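When the function does return a value, the first operand refers to the return
type instead. A small sketch, using the same two basic types:

.. code-block:: text

    !0 = !DIBasicType(name: "int", size: 32, encoding: DW_ATE_signed)
    !1 = !DIBasicType(name: "char", size: 8, encoding: DW_ATE_signed_char)
    !2 = !DISubroutineType(types: !{!0, !1})   ; int (char)
    !3 = !DISubroutineType(types: !{!0})       ; int ()
    !4 = !DISubroutineType(types: !{null})     ; void ()
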
4286.. _DIDerivedType:
4287
4288DIDerivedType
4289"""""""""""""
4290
4291``DIDerivedType`` nodes represent types derived from other types, such as
4292qualified types.
4293
4294.. code-block:: text
4295
4296    !0 = !DIBasicType(name: "unsigned char", size: 8, align: 8,
4297                      encoding: DW_ATE_unsigned_char)
4298    !1 = !DIDerivedType(tag: DW_TAG_pointer_type, baseType: !0, size: 32,
4299                        align: 32)
4300
4301The following ``tag:`` values are valid:
4302
4303.. code-block:: text
4304
4305  DW_TAG_member             = 13
4306  DW_TAG_pointer_type       = 15
4307  DW_TAG_reference_type     = 16
4308  DW_TAG_typedef            = 22
4309  DW_TAG_inheritance        = 28
4310  DW_TAG_ptr_to_member_type = 31
4311  DW_TAG_const_type         = 38
4312  DW_TAG_friend             = 42
4313  DW_TAG_volatile_type      = 53
4314  DW_TAG_restrict_type      = 55
4315  DW_TAG_atomic_type        = 71
4316
4317.. _DIDerivedTypeMember:
4318
``DW_TAG_member`` is used to define a member of a :ref:`composite type
<DICompositeType>`. The type of the member is the ``baseType:``. The
``offset:`` is the member's bit offset.  If the composite type has an ODR
``identifier:`` and does not set ``flags: DIFlagFwdDecl``, then the member is
uniqued based only on its ``name:`` and ``scope:``.
4324
4325``DW_TAG_inheritance`` and ``DW_TAG_friend`` are used in the ``elements:``
4326field of :ref:`composite types <DICompositeType>` to describe parents and
4327friends.
4328
4329``DW_TAG_typedef`` is used to provide a name for the ``baseType:``.
4330
4331``DW_TAG_pointer_type``, ``DW_TAG_reference_type``, ``DW_TAG_const_type``,
4332``DW_TAG_volatile_type``, ``DW_TAG_restrict_type`` and ``DW_TAG_atomic_type``
4333are used to qualify the ``baseType:``.
4334
4335Note that the ``void *`` type is expressed as a type derived from NULL.
4336
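The qualifier and typedef tags can be chained. The following sketch describes
``void *``, ``const int *`` and a hypothetical ``typedef const int *cip;``,
assuming 64-bit pointers; the file node and line number are placeholders:

.. code-block:: text

    !0 = !DIFile(filename: "a.c", directory: "/path/to/dir")
    !1 = !DIBasicType(name: "int", size: 32, encoding: DW_ATE_signed)
    ; void *: a pointer type with a null baseType
    !2 = !DIDerivedType(tag: DW_TAG_pointer_type, baseType: null, size: 64)
    ; const int
    !3 = !DIDerivedType(tag: DW_TAG_const_type, baseType: !1)
    ; const int *
    !4 = !DIDerivedType(tag: DW_TAG_pointer_type, baseType: !3, size: 64)
    ; typedef const int *cip;
    !5 = !DIDerivedType(tag: DW_TAG_typedef, name: "cip", file: !0, line: 3,
                        baseType: !4)
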
4337.. _DICompositeType:
4338
4339DICompositeType
4340"""""""""""""""
4341
4342``DICompositeType`` nodes represent types composed of other types, like
4343structures and unions. ``elements:`` points to a tuple of the composed types.
4344
4345If the source language supports ODR, the ``identifier:`` field gives the unique
4346identifier used for type merging between modules.  When specified,
4347:ref:`subprogram declarations <DISubprogramDeclaration>` and :ref:`member
4348derived types <DIDerivedTypeMember>` that reference the ODR-type in their
4349``scope:`` change uniquing rules.
4350
4351For a given ``identifier:``, there should only be a single composite type that
4352does not have  ``flags: DIFlagFwdDecl`` set.  LLVM tools that link modules
4353together will unique such definitions at parse time via the ``identifier:``
4354field, even if the nodes are ``distinct``.
4355
4356.. code-block:: text
4357
    !0 = !DIEnumerator(name: "SixKind", value: 6)
4359    !1 = !DIEnumerator(name: "SevenKind", value: 7)
4360    !2 = !DIEnumerator(name: "NegEightKind", value: -8)
4361    !3 = !DICompositeType(tag: DW_TAG_enumeration_type, name: "Enum", file: !12,
4362                          line: 2, size: 32, align: 32, identifier: "_M4Enum",
4363                          elements: !{!0, !1, !2})
4364
4365The following ``tag:`` values are valid:
4366
4367.. code-block:: text
4368
4369  DW_TAG_array_type       = 1
4370  DW_TAG_class_type       = 2
4371  DW_TAG_enumeration_type = 4
4372  DW_TAG_structure_type   = 19
4373  DW_TAG_union_type       = 23
4374
4375For ``DW_TAG_array_type``, the ``elements:`` should be :ref:`subrange
4376descriptors <DISubrange>`, each representing the range of subscripts at that
4377level of indexing. The ``DIFlagVector`` flag to ``flags:`` indicates that an
4378array type is a native packed vector.
4379
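For example, a two-dimensional C array such as ``int a[3][5]`` could be
described with one subrange per dimension. This is a sketch; the node numbers
are arbitrary:

.. code-block:: text

    !0 = !DIBasicType(name: "int", size: 32, encoding: DW_ATE_signed)
    ; 3 * 5 elements of 32 bits each: 480 bits in total
    !1 = !DICompositeType(tag: DW_TAG_array_type, baseType: !0, size: 480,
                          elements: !2)
    !2 = !{!3, !4}
    !3 = !DISubrange(count: 3)   ; outer dimension
    !4 = !DISubrange(count: 5)   ; inner dimension
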
4380For ``DW_TAG_enumeration_type``, the ``elements:`` should be :ref:`enumerator
4381descriptors <DIEnumerator>`, each representing the definition of an enumeration
4382value for the set. All enumeration type descriptors are collected in the
4383``enums:`` field of the :ref:`compile unit <DICompileUnit>`.
4384
4385For ``DW_TAG_structure_type``, ``DW_TAG_class_type``, and
4386``DW_TAG_union_type``, the ``elements:`` should be :ref:`derived types
4387<DIDerivedType>` with ``tag: DW_TAG_member``, ``tag: DW_TAG_inheritance``, or
4388``tag: DW_TAG_friend``; or :ref:`subprograms <DISubprogram>` with
4389``isDefinition: false``.
4390
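As an illustration, a C structure ``struct S { int a; char b; };`` might be
described as follows. The sizes and offsets are in bits and assume a typical
layout; the file node and line numbers are placeholders:

.. code-block:: text

    !0 = !DIFile(filename: "s.c", directory: "/path/to/dir")
    !1 = !DIBasicType(name: "int", size: 32, encoding: DW_ATE_signed)
    !2 = !DIBasicType(name: "char", size: 8, encoding: DW_ATE_signed_char)
    !3 = distinct !DICompositeType(tag: DW_TAG_structure_type, name: "S",
                                   file: !0, line: 1, size: 64, elements: !4)
    !4 = !{!5, !6}
    !5 = !DIDerivedType(tag: DW_TAG_member, name: "a", scope: !3, file: !0,
                        line: 1, baseType: !1, size: 32)
    !6 = !DIDerivedType(tag: DW_TAG_member, name: "b", scope: !3, file: !0,
                        line: 1, baseType: !2, size: 8, offset: 32)
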
4391.. _DISubrange:
4392
4393DISubrange
4394""""""""""
4395
4396``DISubrange`` nodes are the elements for ``DW_TAG_array_type`` variants of
4397:ref:`DICompositeType`.
4398
4399- ``count: -1`` indicates an empty array.
- ``count: !10`` describes the count with a :ref:`DILocalVariable`.
- ``count: !12`` describes the count with a :ref:`DIGlobalVariable`.
4402
4403.. code-block:: llvm
4404
4405    !0 = !DISubrange(count: 5, lowerBound: 0) ; array counting from 0
4406    !1 = !DISubrange(count: 5, lowerBound: 1) ; array counting from 1
4407    !2 = !DISubrange(count: -1) ; empty array.
4408
4409    ; Scopes used in rest of example
4410    !6 = !DIFile(filename: "vla.c", directory: "/path/to/file")
4411    !7 = distinct !DICompileUnit(language: DW_LANG_C99, ...
4412    !8 = distinct !DISubprogram(name: "foo", scope: !7, file: !6, line: 5, ...
4413
4414    ; Use of local variable as count value
4415    !9 = !DIBasicType(name: "int", size: 32, encoding: DW_ATE_signed)
4416    !10 = !DILocalVariable(name: "count", scope: !8, file: !6, line: 42, type: !9)
    !11 = !DISubrange(count: !10, lowerBound: 0)
4418
4419    ; Use of global variable as count value
4420    !12 = !DIGlobalVariable(name: "count", scope: !8, file: !6, line: 22, type: !9)
    !13 = !DISubrange(count: !12, lowerBound: 0)
4422
4423.. _DIEnumerator:
4424
4425DIEnumerator
4426""""""""""""
4427
4428``DIEnumerator`` nodes are the elements for ``DW_TAG_enumeration_type``
4429variants of :ref:`DICompositeType`.
4430
4431.. code-block:: llvm
4432
    !0 = !DIEnumerator(name: "SixKind", value: 6)
4434    !1 = !DIEnumerator(name: "SevenKind", value: 7)
4435    !2 = !DIEnumerator(name: "NegEightKind", value: -8)
4436
4437DITemplateTypeParameter
4438"""""""""""""""""""""""
4439
4440``DITemplateTypeParameter`` nodes represent type parameters to generic source
4441language constructs. They are used (optionally) in :ref:`DICompositeType` and
4442:ref:`DISubprogram` ``templateParams:`` fields.
4443
4444.. code-block:: llvm
4445
4446    !0 = !DITemplateTypeParameter(name: "Ty", type: !1)
4447
4448DITemplateValueParameter
4449""""""""""""""""""""""""
4450
4451``DITemplateValueParameter`` nodes represent value parameters to generic source
4452language constructs. ``tag:`` defaults to ``DW_TAG_template_value_parameter``,
4453but if specified can also be set to ``DW_TAG_GNU_template_template_param`` or
4454``DW_TAG_GNU_template_param_pack``. They are used (optionally) in
4455:ref:`DICompositeType` and :ref:`DISubprogram` ``templateParams:`` fields.
4456
4457.. code-block:: llvm
4458
4459    !0 = !DITemplateValueParameter(name: "Ty", type: !1, value: i32 7)
4460
4461DINamespace
4462"""""""""""
4463
4464``DINamespace`` nodes represent namespaces in the source language.
4465
4466.. code-block:: llvm
4467
4468    !0 = !DINamespace(name: "myawesomeproject", scope: !1, file: !2, line: 7)
4469
4470.. _DIGlobalVariable:
4471
4472DIGlobalVariable
4473""""""""""""""""
4474
4475``DIGlobalVariable`` nodes represent global variables in the source language.
4476
4477.. code-block:: llvm
4478
4479    !0 = !DIGlobalVariable(name: "foo", linkageName: "foo", scope: !1,
4480                           file: !2, line: 7, type: !3, isLocal: true,
4481                           isDefinition: false, variable: i32* @foo,
4482                           declaration: !4)
4483
All global variables should be referenced by the ``globals:`` field of a
:ref:`compile unit <DICompileUnit>`.
4486
4487.. _DISubprogram:
4488
4489DISubprogram
4490""""""""""""
4491
4492``DISubprogram`` nodes represent functions from the source language. A
4493``DISubprogram`` may be attached to a function definition using ``!dbg``
4494metadata. The ``variables:`` field points at :ref:`variables <DILocalVariable>`
4495that must be retained, even if their IR counterparts are optimized out of
the IR. The ``type:`` field must point at a :ref:`DISubroutineType`.
4497
4498.. _DISubprogramDeclaration:
4499
When ``isDefinition: false``, subprograms describe a declaration in the type
tree as opposed to a definition of a function.  If the scope is a composite
type with an ODR ``identifier:`` that does not set ``flags: DIFlagFwdDecl``,
then the subprogram declaration is uniqued based only on its ``linkageName:``
and ``scope:``.
4505
4506.. code-block:: text
4507
4508    define void @_Z3foov() !dbg !0 {
4509      ...
4510    }
4511
    !0 = distinct !DISubprogram(name: "foo", linkageName: "_Z3foov", scope: !1,
4513                                file: !2, line: 7, type: !3, isLocal: true,
4514                                isDefinition: true, scopeLine: 8,
4515                                containingType: !4,
4516                                virtuality: DW_VIRTUALITY_pure_virtual,
4517                                virtualIndex: 10, flags: DIFlagPrototyped,
4518                                isOptimized: true, unit: !5, templateParams: !6,
4519                                declaration: !7, variables: !8, thrownTypes: !9)
4520
4521.. _DILexicalBlock:
4522
4523DILexicalBlock
4524""""""""""""""
4525
``DILexicalBlock`` nodes describe nested blocks within a :ref:`subprogram
<DISubprogram>`. The line and column numbers are used to distinguish
two lexical blocks at the same depth. They are valid targets for ``scope:``
fields.
4530
4531.. code-block:: text
4532
4533    !0 = distinct !DILexicalBlock(scope: !1, file: !2, line: 7, column: 35)
4534
4535Usually lexical blocks are ``distinct`` to prevent node merging based on
4536operands.
4537
4538.. _DILexicalBlockFile:
4539
4540DILexicalBlockFile
4541""""""""""""""""""
4542
4543``DILexicalBlockFile`` nodes are used to discriminate between sections of a
4544:ref:`lexical block <DILexicalBlock>`. The ``file:`` field can be changed to
4545indicate textual inclusion, or the ``discriminator:`` field can be used to
4546discriminate between control flow within a single block in the source language.
4547
4548.. code-block:: llvm
4549
4550    !0 = !DILexicalBlock(scope: !3, file: !4, line: 7, column: 35)
4551    !1 = !DILexicalBlockFile(scope: !0, file: !4, discriminator: 0)
4552    !2 = !DILexicalBlockFile(scope: !0, file: !4, discriminator: 1)
4553
4554.. _DILocation:
4555
4556DILocation
4557""""""""""
4558
``DILocation`` nodes represent source debug locations. The ``scope:`` field is
mandatory, and points at a :ref:`DILexicalBlockFile`, a
:ref:`DILexicalBlock`, or a :ref:`DISubprogram`.
4562
4563.. code-block:: llvm
4564
4565    !0 = !DILocation(line: 2900, column: 42, scope: !1, inlinedAt: !2)
4566
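The ``inlinedAt:`` field links a location inside an inlined callee to the
call site in the caller. A sketch with hypothetical scopes ``!10`` (the
callee's :ref:`DISubprogram`) and ``!11`` (the caller's); the values ``%a``
and ``%b`` are placeholders as well:

.. code-block:: text

    ; Code from line 3 of the callee, inlined at the call on line 9 of the
    ; caller.
    %sum = add i32 %a, %b, !dbg !0

    !0 = !DILocation(line: 3, column: 12, scope: !10, inlinedAt: !1)
    !1 = distinct !DILocation(line: 9, column: 3, scope: !11)
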
4567.. _DILocalVariable:
4568
4569DILocalVariable
4570"""""""""""""""
4571
4572``DILocalVariable`` nodes represent local variables in the source language. If
4573the ``arg:`` field is set to non-zero, then this variable is a subprogram
4574parameter, and it will be included in the ``variables:`` field of its
4575:ref:`DISubprogram`.
4576
4577.. code-block:: text
4578
4579    !0 = !DILocalVariable(name: "this", arg: 1, scope: !3, file: !2, line: 7,
4580                          type: !3, flags: DIFlagArtificial)
4581    !1 = !DILocalVariable(name: "x", arg: 2, scope: !4, file: !2, line: 7,
4582                          type: !3)
4583    !2 = !DILocalVariable(name: "y", scope: !5, file: !2, line: 7, type: !3)
4584
4585DIExpression
4586""""""""""""
4587
4588``DIExpression`` nodes represent expressions that are inspired by the DWARF
4589expression language. They are used in :ref:`debug intrinsics<dbg_intrinsics>`
4590(such as ``llvm.dbg.declare`` and ``llvm.dbg.value``) to describe how the
4591referenced LLVM variable relates to the source language variable. Debug
4592intrinsics are interpreted left-to-right: start by pushing the value/address
4593operand of the intrinsic onto a stack, then repeatedly push and evaluate
4594opcodes from the DIExpression until the final variable description is produced.
4595
The currently supported opcode vocabulary is limited:
4597
4598- ``DW_OP_deref`` dereferences the top of the expression stack.
4599- ``DW_OP_plus`` pops the last two entries from the expression stack, adds
4600  them together and appends the result to the expression stack.
4601- ``DW_OP_minus`` pops the last two entries from the expression stack, subtracts
4602  the last entry from the second last entry and appends the result to the
4603  expression stack.
4604- ``DW_OP_plus_uconst, 93`` adds ``93`` to the working expression.
4605- ``DW_OP_LLVM_fragment, 16, 8`` specifies the offset and size (``16`` and ``8``
4606  here, respectively) of the variable fragment from the working expression. Note
4607  that contrary to DW_OP_bit_piece, the offset is describing the location
4608  within the described source variable.
- ``DW_OP_swap`` swaps the top two stack entries.
- ``DW_OP_xderef`` provides an extended dereference mechanism. The entry at the
  top of the stack is treated as an address. The second stack entry is treated
  as an address space identifier.
4613- ``DW_OP_stack_value`` marks a constant value.
4614
DWARF specifies three kinds of simple location descriptions: register, memory,
and implicit location descriptions.  Note that a location description is
defined over certain ranges of a program, i.e., the location of a variable may
4618change over the course of the program. Register and memory location
4619descriptions describe the *concrete location* of a source variable (in the
4620sense that a debugger might modify its value), whereas *implicit locations*
4621describe merely the actual *value* of a source variable which might not exist
4622in registers or in memory (see ``DW_OP_stack_value``).
4623
4624A ``llvm.dbg.addr`` or ``llvm.dbg.declare`` intrinsic describes an indirect
4625value (the address) of a source variable. The first operand of the intrinsic
4626must be an address of some kind. A DIExpression attached to the intrinsic
4627refines this address to produce a concrete location for the source variable.
4628
A ``llvm.dbg.value`` intrinsic describes the direct value of a source variable.
The first operand of the intrinsic may be a direct or indirect value. A
DIExpression attached to the intrinsic refines the first operand to produce a
direct value. For example, if the first operand is an indirect value, it may be
necessary to insert ``DW_OP_deref`` into the DIExpression in order to produce a
valid debug intrinsic.
4635
4636.. note::
4637
4638   A DIExpression is interpreted in the same way regardless of which kind of
4639   debug intrinsic it's attached to.
4640
4641.. code-block:: text
4642
    !0 = !DIExpression(DW_OP_deref)
    !1 = !DIExpression(DW_OP_plus_uconst, 3)
    !2 = !DIExpression(DW_OP_constu, 3, DW_OP_plus)
    !3 = !DIExpression(DW_OP_LLVM_fragment, 3, 7)
    !4 = !DIExpression(DW_OP_deref, DW_OP_constu, 3, DW_OP_plus, DW_OP_LLVM_fragment, 3, 7)
    !5 = !DIExpression(DW_OP_constu, 2, DW_OP_swap, DW_OP_xderef)
    !6 = !DIExpression(DW_OP_constu, 42, DW_OP_stack_value)
4650
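To make the stack evaluation concrete, the sketch below attaches expressions
to debug intrinsics. All names are hypothetical: ``%p`` and ``%x.minus.one``
are IR values, ``!20`` and ``!21`` stand for :ref:`DILocalVariable` nodes, and
``!30`` stands for a :ref:`DILocation`:

.. code-block:: text

    ; %p holds the address of the variable (an indirect value), so the
    ; expression dereferences it to yield a direct value.
    ;   stack: [%p] -> DW_OP_deref -> [*%p]
    call void @llvm.dbg.value(metadata i32* %p, metadata !20,
                              metadata !DIExpression(DW_OP_deref)), !dbg !30

    ; The variable's value is %x.minus.one + 1. That value does not exist in
    ; a register or in memory, so it is marked as an implicit value.
    ;   stack: [%x.minus.one] -> DW_OP_plus_uconst, 1 -> [%x.minus.one + 1]
    call void @llvm.dbg.value(metadata i32 %x.minus.one, metadata !21,
                              metadata !DIExpression(DW_OP_plus_uconst, 1,
                                                     DW_OP_stack_value)), !dbg !30
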
4651DIObjCProperty
4652""""""""""""""
4653
4654``DIObjCProperty`` nodes represent Objective-C property nodes.
4655
4656.. code-block:: llvm
4657
4658    !3 = !DIObjCProperty(name: "foo", file: !1, line: 7, setter: "setFoo",
4659                         getter: "getFoo", attributes: 7, type: !2)
4660
4661DIImportedEntity
4662""""""""""""""""
4663
4664``DIImportedEntity`` nodes represent entities (such as modules) imported into a
4665compile unit.
4666
4667.. code-block:: text
4668
4669   !2 = !DIImportedEntity(tag: DW_TAG_imported_module, name: "foo", scope: !0,
4670                          entity: !1, line: 7)
4671
4672DIMacro
4673"""""""
4674
``DIMacro`` nodes represent the definition or undefinition of a macro
identifier. The ``name:`` field is the macro identifier, followed by macro
parameters when defining a function-like macro, and the ``value:`` field is the
token-string used to expand the macro identifier.
4679
4680.. code-block:: text
4681
4682   !2 = !DIMacro(macinfo: DW_MACINFO_define, line: 7, name: "foo(x)",
4683                 value: "((x) + 1)")
4684   !3 = !DIMacro(macinfo: DW_MACINFO_undef, line: 30, name: "foo")
4685
4686DIMacroFile
4687"""""""""""
4688
4689``DIMacroFile`` nodes represent inclusion of source files.
4690The ``nodes:`` field is a list of ``DIMacro`` and ``DIMacroFile`` nodes that
4691appear in the included source file.
4692
4693.. code-block:: text
4694
   !2 = !DIMacroFile(macinfo: DW_MACINFO_start_file, line: 7, file: !1,
                     nodes: !3)
4697
4698'``tbaa``' Metadata
4699^^^^^^^^^^^^^^^^^^^
4700
4701In LLVM IR, memory does not have types, so LLVM's own type system is not
4702suitable for doing type based alias analysis (TBAA). Instead, metadata is
4703added to the IR to describe a type system of a higher level language. This
4704can be used to implement C/C++ strict type aliasing rules, but it can also
4705be used to implement custom alias analysis behavior for other languages.
4706
4707This description of LLVM's TBAA system is broken into two parts:
4708:ref:`Semantics<tbaa_node_semantics>` talks about high level issues, and
4709:ref:`Representation<tbaa_node_representation>` talks about the metadata
4710encoding of various entities.
4711
4712It is always possible to trace any TBAA node to a "root" TBAA node (details
4713in the :ref:`Representation<tbaa_node_representation>` section).  TBAA
4714nodes with different roots have an unknown aliasing relationship, and LLVM
4715conservatively infers ``MayAlias`` between them.  The rules mentioned in
4716this section only pertain to TBAA nodes living under the same root.
4717
4718.. _tbaa_node_semantics:
4719
4720Semantics
4721"""""""""
4722
4723The TBAA metadata system, referred to as "struct path TBAA" (not to be
4724confused with ``tbaa.struct``), consists of the following high level
4725concepts: *Type Descriptors*, further subdivided into scalar type
4726descriptors and struct type descriptors; and *Access Tags*.
4727
4728**Type descriptors** describe the type system of the higher level language
4729being compiled.  **Scalar type descriptors** describe types that do not
4730contain other types.  Each scalar type has a parent type, which must also
4731be a scalar type or the TBAA root.  Via this parent relation, scalar types
4732within a TBAA root form a tree.  **Struct type descriptors** denote types
4733that contain a sequence of other type descriptors, at known offsets.  These
4734contained type descriptors can either be struct type descriptors themselves
4735or scalar type descriptors.
4736
4737**Access tags** are metadata nodes attached to load and store instructions.
4738Access tags use type descriptors to describe the *location* being accessed
4739in terms of the type system of the higher level language.  Access tags are
4740tuples consisting of a base type, an access type and an offset.  The base
4741type is a scalar type descriptor or a struct type descriptor, the access
4742type is a scalar type descriptor, and the offset is a constant integer.
4743
4744The access tag ``(BaseTy, AccessTy, Offset)`` can describe one of two
4745things:
4746
4747 * If ``BaseTy`` is a struct type, the tag describes a memory access (load
4748   or store) of a value of type ``AccessTy`` contained in the struct type
4749   ``BaseTy`` at offset ``Offset``.
4750
4751 * If ``BaseTy`` is a scalar type, ``Offset`` must be 0 and ``BaseTy`` and
4752   ``AccessTy`` must be the same; and the access tag describes a scalar
4753   access with scalar type ``AccessTy``.
4754
4755We first define an ``ImmediateParent`` relation on ``(BaseTy, Offset)``
4756tuples this way:
4757
4758 * If ``BaseTy`` is a scalar type then ``ImmediateParent(BaseTy, 0)`` is
4759   ``(ParentTy, 0)`` where ``ParentTy`` is the parent of the scalar type as
4760   described in the TBAA metadata.  ``ImmediateParent(BaseTy, Offset)`` is
4761   undefined if ``Offset`` is non-zero.
4762
4763 * If ``BaseTy`` is a struct type then ``ImmediateParent(BaseTy, Offset)``
4764   is ``(NewTy, NewOffset)`` where ``NewTy`` is the type contained in
4765   ``BaseTy`` at offset ``Offset`` and ``NewOffset`` is ``Offset`` adjusted
4766   to be relative within that inner type.
4767
4768A memory access with an access tag ``(BaseTy1, AccessTy1, Offset1)``
4769aliases a memory access with an access tag ``(BaseTy2, AccessTy2,
Offset2)`` if either ``(BaseTy1, Offset1)`` is reachable from ``(BaseTy2,
Offset2)`` via the ``ImmediateParent`` relation or vice versa.
4772
4773As a concrete example, the type descriptor graph for the following program
4774
4775.. code-block:: c
4776
4777    struct Inner {
4778      int i;    // offset 0
4779      float f;  // offset 4
4780    };
4781
4782    struct Outer {
4783      float f;  // offset 0
4784      double d; // offset 4
4785      struct Inner inner_a;  // offset 12
4786    };
4787
4788    void f(struct Outer* outer, struct Inner* inner, float* f, int* i, char* c) {
4789      outer->f = 0;            // tag0: (OuterStructTy, FloatScalarTy, 0)
4790      outer->inner_a.i = 0;    // tag1: (OuterStructTy, IntScalarTy, 12)
4791      outer->inner_a.f = 0.0;  // tag2: (OuterStructTy, FloatScalarTy, 16)
4792      *f = 0.0;                // tag3: (FloatScalarTy, FloatScalarTy, 0)
4793    }
4794
4795is (note that in C and C++, ``char`` can be used to access any arbitrary
4796type):
4797
4798.. code-block:: text
4799
4800    Root = "TBAA Root"
4801    CharScalarTy = ("char", Root, 0)
4802    FloatScalarTy = ("float", CharScalarTy, 0)
4803    DoubleScalarTy = ("double", CharScalarTy, 0)
4804    IntScalarTy = ("int", CharScalarTy, 0)
    InnerStructTy = {"Inner", (IntScalarTy, 0), (FloatScalarTy, 4)}
4806    OuterStructTy = {"Outer", (FloatScalarTy, 0), (DoubleScalarTy, 4),
4807                     (InnerStructTy, 12)}
4808
4809
4810with (e.g.) ``ImmediateParent(OuterStructTy, 12)`` = ``(InnerStructTy,
48110)``, ``ImmediateParent(InnerStructTy, 0)`` = ``(IntScalarTy, 0)``, and
4812``ImmediateParent(IntScalarTy, 0)`` = ``(CharScalarTy, 0)``.
4813
4814.. _tbaa_node_representation:
4815
4816Representation
4817""""""""""""""
4818
4819The root node of a TBAA type hierarchy is an ``MDNode`` with 0 operands or
4820with exactly one ``MDString`` operand.
4821
Scalar type descriptors are represented as ``MDNode`` s with two
operands.  The first operand is an ``MDString`` denoting the name of the
scalar type.  LLVM does not assign meaning to the value of this operand; it
only cares about it being an ``MDString``.  The second operand is an
``MDNode`` which points to the parent for said scalar type descriptor,
which is either another scalar type descriptor or the TBAA root.  Scalar
type descriptors can have an optional third operand, but that must be the
constant integer zero.
4830
4831Struct type descriptors are represented as ``MDNode`` s with an odd number
4832of operands greater than 1.  The first operand is an ``MDString`` denoting
4833the name of the struct type.  Like in scalar type descriptors the actual
4834value of this name operand is irrelevant to LLVM.  After the name operand,
4835the struct type descriptors have a sequence of alternating ``MDNode`` and
``ConstantInt`` operands.  With N starting from 1 and the name operand
counted as operand 0, operand 2N - 1, an ``MDNode``, denotes a contained
field, and operand 2N, a ``ConstantInt``, is the offset of that field.  The
offsets must be in non-decreasing order.
4840
4841Access tags are represented as ``MDNode`` s with either 3 or 4 operands.
4842The first operand is an ``MDNode`` pointing to the node representing the
4843base type.  The second operand is an ``MDNode`` pointing to the node
4844representing the access type.  The third operand is a ``ConstantInt`` that
4845states the offset of the access.  If a fourth field is present, it must be
4846a ``ConstantInt`` valued at 0 or 1.  If it is 1 then the access tag states
4847that the location being accessed is "constant" (meaning
4848``pointsToConstantMemory`` should return true; see `other useful
4849AliasAnalysis methods <AliasAnalysis.html#OtherItfs>`_).  The TBAA root of
4850the access type and the base type of an access tag must be the same, and
4851that is the TBAA root of the access tag.
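
As an illustration, the type descriptors and access tags from the C example
above could be written as the following metadata.  This is a sketch rather
than the output of any particular frontend: the node numbering and the
``%outer.f``, ``%inner.i``, ``%inner.f`` and ``%f`` pointer names are
assumptions made for this example, and the offsets follow the annotations in
the C source.

.. code-block:: llvm

    ; outer->f = 0;            (tag0)
    store float 0.000000e+00, float* %outer.f, align 4, !tbaa !7
    ; outer->inner_a.i = 0;    (tag1)
    store i32 0, i32* %inner.i, align 4, !tbaa !8
    ; outer->inner_a.f = 0.0;  (tag2)
    store float 0.000000e+00, float* %inner.f, align 4, !tbaa !9
    ; *f = 0.0;                (tag3)
    store float 0.000000e+00, float* %f, align 4, !tbaa !10
    ...
    !0 = !{!"TBAA Root"}                                 ; root node
    !1 = !{!"char", !0, i64 0}                           ; CharScalarTy
    !2 = !{!"float", !1, i64 0}                          ; FloatScalarTy
    !3 = !{!"double", !1, i64 0}                         ; DoubleScalarTy
    !4 = !{!"int", !1, i64 0}                            ; IntScalarTy
    !5 = !{!"Inner", !4, i64 0, !2, i64 4}               ; InnerStructTy
    !6 = !{!"Outer", !2, i64 0, !3, i64 4, !5, i64 12}   ; OuterStructTy
    !7 = !{!6, !2, i64 0}                                ; tag0
    !8 = !{!6, !4, i64 12}                               ; tag1
    !9 = !{!6, !2, i64 16}                               ; tag2
    !10 = !{!2, !2, i64 0}                               ; tag3

Under the aliasing rule given earlier, tag3 aliases tag0 because
``ImmediateParent(OuterStructTy, 0)`` is ``(FloatScalarTy, 0)``, and it aliases
tag2 because ``(OuterStructTy, 16)`` reaches ``(FloatScalarTy, 0)`` through
``(InnerStructTy, 4)``.  tag3 does not alias tag1: the chain from
``(OuterStructTy, 12)`` passes through ``(InnerStructTy, 0)``,
``(IntScalarTy, 0)`` and ``(CharScalarTy, 0)`` without reaching
``(FloatScalarTy, 0)``, and ``(OuterStructTy, 12)`` is not reachable from
``(FloatScalarTy, 0)``.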
4852
4853'``tbaa.struct``' Metadata
4854^^^^^^^^^^^^^^^^^^^^^^^^^^
4855
The :ref:`llvm.memcpy <int_memcpy>` intrinsic is often used to implement
aggregate assignment operations in C and similar languages. However, it
is defined to copy a contiguous region of memory, which is more than
strictly necessary for aggregate types that contain holes due to
padding. It also carries no TBAA information about the fields
of the aggregate.
4862
4863``!tbaa.struct`` metadata can describe which memory subregions in a
4864memcpy are padding and what the TBAA tags of the struct are.
4865
The current metadata format is very simple. ``!tbaa.struct`` metadata
nodes are a list of operands which are in conceptual groups of three.
For each group of three, the first operand gives the offset of a
field in bytes, the second gives its size in bytes, and the third gives
its TBAA tag. For example:
4871
4872.. code-block:: llvm
4873
4874    !4 = !{ i64 0, i64 4, !1, i64 8, i64 4, !2 }
4875
4876This describes a struct with two fields. The first is at offset 0 bytes
4877with size 4 bytes, and has tbaa tag !1. The second is at offset 8 bytes
4878and has size 4 bytes and has tbaa tag !2.
4879
4880Note that the fields need not be contiguous. In this example, there is a
48814 byte gap between the two fields. This gap represents padding which
4882does not carry useful data and need not be preserved.
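
A sketch of how such a node might be attached to a copy is shown below.  The
struct layout, the ``%dst`` and ``%src`` names and the node numbering are
assumptions made for illustration, and the exact ``llvm.memcpy`` signature
depends on the LLVM version in use.

.. code-block:: llvm

    ; Hypothetical source type: struct S { int i; double d; };
    ; Bytes 4-7 are padding, so no group describes them.
    call void @llvm.memcpy.p0i8.p0i8.i64(i8* align 8 %dst, i8* align 8 %src,
                                         i64 16, i1 false), !tbaa.struct !0
    ...
    !0 = !{ i64 0, i64 4, !1, i64 8, i64 8, !3 }
    !1 = !{!2, !2, i64 0}            ; access tag for the int field
    !2 = !{!"int", !5, i64 0}
    !3 = !{!4, !4, i64 0}            ; access tag for the double field
    !4 = !{!"double", !5, i64 0}
    !5 = !{!"char", !6, i64 0}
    !6 = !{!"TBAA Root"}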
4883
4884'``noalias``' and '``alias.scope``' Metadata
4885^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
4886
``noalias`` and ``alias.scope`` metadata provide the ability to specify generic
noalias memory-access sets. This means that some collection of memory access
instructions (loads, stores, memory-accessing calls, etc.) that carry
``noalias`` metadata can be declared not to alias with some other
collection of memory access instructions that carry ``alias.scope`` metadata.
Each type of metadata specifies a list of scopes where each scope has an id and
a domain.
4894
4895When evaluating an aliasing query, if for some domain, the set
4896of scopes with that domain in one instruction's ``alias.scope`` list is a
4897subset of (or equal to) the set of scopes for that domain in another
4898instruction's ``noalias`` list, then the two memory accesses are assumed not to
4899alias.
4900
4901Because scopes in one domain don't affect scopes in other domains, separate
4902domains can be used to compose multiple independent noalias sets.  This is
4903used for example during inlining.  As the noalias function parameters are
4904turned into noalias scope metadata, a new domain is used every time the
4905function is inlined.
4906
4907The metadata identifying each domain is itself a list containing one or two
4908entries. The first entry is the name of the domain. Note that if the name is a
4909string then it can be combined across functions and translation units. A
4910self-reference can be used to create globally unique domain names. A
4911descriptive string may optionally be provided as a second list entry.
4912
4913The metadata identifying each scope is also itself a list containing two or
4914three entries. The first entry is the name of the scope. Note that if the name
4915is a string then it can be combined across functions and translation units. A
4916self-reference can be used to create globally unique scope names. A metadata
4917reference to the scope's domain is the second entry. A descriptive string may
4918optionally be provided as a third list entry.
4919
4920For example,
4921
4922.. code-block:: llvm
4923
4924    ; Two scope domains:
4925    !0 = !{!0}
4926    !1 = !{!1}
4927
4928    ; Some scopes in these domains:
4929    !2 = !{!2, !0}
4930    !3 = !{!3, !0}
4931    !4 = !{!4, !1}
4932
4933    ; Some scope lists:
4934    !5 = !{!4} ; A list containing only scope !4
4935    !6 = !{!4, !3, !2}
4936    !7 = !{!3}
4937
4938    ; These two instructions don't alias:
4939    %0 = load float, float* %c, align 4, !alias.scope !5
4940    store float %0, float* %arrayidx.i, align 4, !noalias !5
4941
4942    ; These two instructions also don't alias (for domain !1, the set of scopes
4943    ; in the !alias.scope equals that in the !noalias list):
    %1 = load float, float* %c, align 4, !alias.scope !5
    store float %1, float* %arrayidx.i2, align 4, !noalias !6
4946
4947    ; These two instructions may alias (for domain !0, the set of scopes in
4948    ; the !noalias list is not a superset of, or equal to, the scopes in the
4949    ; !alias.scope list):
4950    %2 = load float, float* %c, align 4, !alias.scope !6
4951    store float %0, float* %arrayidx.i, align 4, !noalias !7
4952
4953'``fpmath``' Metadata
4954^^^^^^^^^^^^^^^^^^^^^
4955
4956``fpmath`` metadata may be attached to any instruction of floating-point
4957type. It can be used to express the maximum acceptable error in the
4958result of that instruction, in ULPs, thus potentially allowing the
4959compiler to use a more efficient but less accurate method of computing
4960it. ULP is defined as follows:
4961
4962    If ``x`` is a real number that lies between two finite consecutive
4963    floating-point numbers ``a`` and ``b``, without being equal to one
4964    of them, then ``ulp(x) = |b - a|``, otherwise ``ulp(x)`` is the
4965    distance between the two non-equal finite floating-point numbers
4966    nearest ``x``. Moreover, ``ulp(NaN)`` is ``NaN``.
4967
4968The metadata node shall consist of a single positive float type number
4969representing the maximum relative error, for example:
4970
4971.. code-block:: llvm
4972
4973    !0 = !{ float 2.5 } ; maximum acceptable inaccuracy is 2.5 ULPs
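
The node is then referenced from the instruction whose accuracy it relaxes.
A minimal sketch, with ``%x``, ``%y`` and ``%q`` as placeholder names:

.. code-block:: llvm

    %q = fdiv float %x, %y, !fpmath !0 ; result may be off by up to 2.5 ULPs
    ...
    !0 = !{ float 2.5 }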
4974
4975.. _range-metadata:
4976
4977'``range``' Metadata
4978^^^^^^^^^^^^^^^^^^^^
4979
``range`` metadata may be attached only to ``load``, ``call`` and ``invoke`` of
integer types. It expresses the possible ranges that the loaded value, or the
value returned by the called function at this call site, lies in. If the loaded
or returned value is not in the specified range, the behavior is undefined. The
ranges are represented with a flattened list of integers. The loaded value or
the value returned is known to be in the union of the ranges defined by each
consecutive pair. Each pair has the following properties:
4987
4988-  The type must match the type loaded by the instruction.
4989-  The pair ``a,b`` represents the range ``[a,b)``.
4990-  Both ``a`` and ``b`` are constants.
4991-  The range is allowed to wrap.
4992-  The range should not represent the full or empty set. That is,
4993   ``a!=b``.
4994
4995In addition, the pairs must be in signed order of the lower bound and
4996they must be non-contiguous.
4997
4998Examples:
4999
5000.. code-block:: llvm
5001
5002      %a = load i8, i8* %x, align 1, !range !0 ; Can only be 0 or 1
5003      %b = load i8, i8* %y, align 1, !range !1 ; Can only be 255 (-1), 0 or 1
5004      %c = call i8 @foo(),       !range !2 ; Can only be 0, 1, 3, 4 or 5
5005      %d = invoke i8 @bar() to label %cont
5006             unwind label %lpad, !range !3 ; Can only be -2, -1, 3, 4 or 5
5007    ...
5008    !0 = !{ i8 0, i8 2 }
5009    !1 = !{ i8 255, i8 2 }
5010    !2 = !{ i8 0, i8 2, i8 3, i8 6 }
5011    !3 = !{ i8 -2, i8 0, i8 3, i8 6 }
5012
5013'``absolute_symbol``' Metadata
5014^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
5015
5016``absolute_symbol`` metadata may be attached to a global variable
5017declaration. It marks the declaration as a reference to an absolute symbol,
5018which causes the backend to use absolute relocations for the symbol even
5019in position independent code, and expresses the possible ranges that the
5020global variable's *address* (not its value) is in, in the same format as
5021``range`` metadata, with the extension that the pair ``all-ones,all-ones``
5022may be used to represent the full set.
5023
5024Example (assuming 64-bit pointers):
5025
5026.. code-block:: llvm
5027
5028      @a = external global i8, !absolute_symbol !0 ; Absolute symbol in range [0,256)
5029      @b = external global i8, !absolute_symbol !1 ; Absolute symbol in range [0,2^64)
5030
5031    ...
5032    !0 = !{ i64 0, i64 256 }
5033    !1 = !{ i64 -1, i64 -1 }
5034
5035'``callees``' Metadata
5036^^^^^^^^^^^^^^^^^^^^^^
5037
5038``callees`` metadata may be attached to indirect call sites. If ``callees``
5039metadata is attached to a call site, and any callee is not among the set of
5040functions provided by the metadata, the behavior is undefined. The intent of
5041this metadata is to facilitate optimizations such as indirect-call promotion.
5042For example, in the code below, the call instruction may only target the
5043``add`` or ``sub`` functions:
5044
5045.. code-block:: llvm
5046
5047    %result = call i64 %binop(i64 %x, i64 %y), !callees !0
5048
5049    ...
5050    !0 = !{i64 (i64, i64)* @add, i64 (i64, i64)* @sub}
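
To illustrate the kind of transformation this enables, an optimizer could
rewrite the indirect call above into direct calls guarded by a comparison.
This is a sketch of one possible outcome, not the output of a specific pass;
the block and value names are invented for the example.

.. code-block:: llvm

      ; The callee can only be @add or @sub, so one comparison suffices.
      %is.add = icmp eq i64 (i64, i64)* %binop, @add
      br i1 %is.add, label %call.add, label %call.sub

    call.add:
      %r.add = call i64 @add(i64 %x, i64 %y)
      br label %merge

    call.sub:
      %r.sub = call i64 @sub(i64 %x, i64 %y)
      br label %merge

    merge:
      %result = phi i64 [ %r.add, %call.add ], [ %r.sub, %call.sub ]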
5051
5052'``unpredictable``' Metadata
5053^^^^^^^^^^^^^^^^^^^^^^^^^^^^
5054
5055``unpredictable`` metadata may be attached to any branch or switch
5056instruction. It can be used to express the unpredictability of control
flow. Similar to the ``llvm.expect`` intrinsic, it may be used to alter
5058optimizations related to compare and branch instructions. The metadata
5059is treated as a boolean value; if it exists, it signals that the branch
5060or switch that it is attached to is completely unpredictable.
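
For example, in the sketch below the branch is marked as unpredictable by
attaching an empty metadata node (the value and label names are placeholders):

.. code-block:: llvm

    br i1 %cmp, label %if.then, label %if.else, !unpredictable !0
    ...
    !0 = !{}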
5061
5062'``llvm.loop``'
5063^^^^^^^^^^^^^^^
5064
It is sometimes useful to attach information to loop constructs. Currently,
loop metadata is implemented as metadata attached to the branch instruction
in the loop latch block. This type of metadata refers to a metadata node that
is guaranteed to be separate for each loop. The loop identifier metadata is
specified with the name ``llvm.loop``.

The loop identifier metadata is implemented using a metadata node that refers
to itself to avoid merging it with any other identifier metadata, e.g.,
during module linkage or function inlining. That is, each loop should refer
to its own identification metadata even if the loops reside in separate
functions. The following example contains loop identifier metadata for two
separate loop constructs:
5077
5078.. code-block:: llvm
5079
5080    !0 = !{!0}
5081    !1 = !{!1}
5082
5083The loop identifier metadata can be used to specify additional
5084per-loop metadata. Any operands after the first operand can be treated
as user-defined metadata. For example, the ``llvm.loop.unroll.count``
metadata suggests an unroll factor to the loop unroller:
5087
5088.. code-block:: llvm
5089
5090      br i1 %exitcond, label %._crit_edge, label %.lr.ph, !llvm.loop !0
5091    ...
5092    !0 = !{!0, !1}
5093    !1 = !{!"llvm.loop.unroll.count", i32 4}
5094
5095'``llvm.loop.vectorize``' and '``llvm.loop.interleave``'
5096^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
5097
5098Metadata prefixed with ``llvm.loop.vectorize`` or ``llvm.loop.interleave`` are
5099used to control per-loop vectorization and interleaving parameters such as
5100vectorization width and interleave count. These metadata should be used in
5101conjunction with ``llvm.loop`` loop identification metadata. The
5102``llvm.loop.vectorize`` and ``llvm.loop.interleave`` metadata are only
5103optimization hints and the optimizer will only interleave and vectorize loops if
5104it believes it is safe to do so. The ``llvm.mem.parallel_loop_access`` metadata
5105which contains information about loop-carried memory dependencies can be helpful
5106in determining the safety of these transformations.
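
As a sketch of how these hints combine with the loop identifier, the loop
below requests a vectorization width of 4 and an interleave count of 2.  The
branch operands and the node numbering are placeholders.

.. code-block:: llvm

      br i1 %exitcond, label %for.end, label %for.body, !llvm.loop !0
    ...
    !0 = !{!0, !1, !2}
    !1 = !{!"llvm.loop.vectorize.width", i32 4}
    !2 = !{!"llvm.loop.interleave.count", i32 2}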
5107
5108'``llvm.loop.interleave.count``' Metadata
5109^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
5110
5111This metadata suggests an interleave count to the loop interleaver.
5112The first operand is the string ``llvm.loop.interleave.count`` and the
5113second operand is an integer specifying the interleave count. For
5114example:
5115
5116.. code-block:: llvm
5117
5118   !0 = !{!"llvm.loop.interleave.count", i32 4}
5119
5120Note that setting ``llvm.loop.interleave.count`` to 1 disables interleaving
5121multiple iterations of the loop. If ``llvm.loop.interleave.count`` is set to 0
5122then the interleave count will be determined automatically.
5123
5124'``llvm.loop.vectorize.enable``' Metadata
5125^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
5126
5127This metadata selectively enables or disables vectorization for the loop. The
5128first operand is the string ``llvm.loop.vectorize.enable`` and the second operand
is a bit. If the bit operand value is 1, vectorization is enabled. A value of
0 disables vectorization:
5131
5132.. code-block:: llvm
5133
5134   !0 = !{!"llvm.loop.vectorize.enable", i1 0}
5135   !1 = !{!"llvm.loop.vectorize.enable", i1 1}
5136
5137'``llvm.loop.vectorize.width``' Metadata
5138^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
5139
5140This metadata sets the target width of the vectorizer. The first
5141operand is the string ``llvm.loop.vectorize.width`` and the second
5142operand is an integer specifying the width. For example:
5143
5144.. code-block:: llvm
5145
5146   !0 = !{!"llvm.loop.vectorize.width", i32 4}
5147
5148Note that setting ``llvm.loop.vectorize.width`` to 1 disables
5149vectorization of the loop. If ``llvm.loop.vectorize.width`` is set to
51500 or if the loop does not have this metadata the width will be
5151determined automatically.
5152
5153'``llvm.loop.unroll``'
5154^^^^^^^^^^^^^^^^^^^^^^
5155
5156Metadata prefixed with ``llvm.loop.unroll`` are loop unrolling
5157optimization hints such as the unroll factor. ``llvm.loop.unroll``
5158metadata should be used in conjunction with ``llvm.loop`` loop
5159identification metadata. The ``llvm.loop.unroll`` metadata are only
5160optimization hints and the unrolling will only be performed if the
5161optimizer believes it is safe to do so.
5162
5163'``llvm.loop.unroll.count``' Metadata
5164^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
5165
5166This metadata suggests an unroll factor to the loop unroller. The
5167first operand is the string ``llvm.loop.unroll.count`` and the second
5168operand is a positive integer specifying the unroll factor. For
5169example:
5170
5171.. code-block:: llvm
5172
5173   !0 = !{!"llvm.loop.unroll.count", i32 4}
5174
5175If the trip count of the loop is less than the unroll count the loop
5176will be partially unrolled.
5177
5178'``llvm.loop.unroll.disable``' Metadata
5179^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
5180
5181This metadata disables loop unrolling. The metadata has a single operand
5182which is the string ``llvm.loop.unroll.disable``. For example:
5183
5184.. code-block:: llvm
5185
5186   !0 = !{!"llvm.loop.unroll.disable"}
5187
5188'``llvm.loop.unroll.runtime.disable``' Metadata
5189^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
5190
5191This metadata disables runtime loop unrolling. The metadata has a single
5192operand which is the string ``llvm.loop.unroll.runtime.disable``. For example:
5193
5194.. code-block:: llvm
5195
5196   !0 = !{!"llvm.loop.unroll.runtime.disable"}
5197
5198'``llvm.loop.unroll.enable``' Metadata
5199^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
5200
5201This metadata suggests that the loop should be fully unrolled if the trip count
5202is known at compile time and partially unrolled if the trip count is not known
5203at compile time. The metadata has a single operand which is the string
5204``llvm.loop.unroll.enable``.  For example:
5205
5206.. code-block:: llvm
5207
5208   !0 = !{!"llvm.loop.unroll.enable"}
5209
5210'``llvm.loop.unroll.full``' Metadata
5211^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
5212
5213This metadata suggests that the loop should be unrolled fully. The
5214metadata has a single operand which is the string ``llvm.loop.unroll.full``.
5215For example:
5216
5217.. code-block:: llvm
5218
5219   !0 = !{!"llvm.loop.unroll.full"}
5220
5221'``llvm.loop.unroll_and_jam``'
5222^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
5223
This metadata is treated very similarly to the ``llvm.loop.unroll`` metadata
above, but affects the unroll and jam pass. In addition, any loop with
``llvm.loop.unroll`` metadata but no ``llvm.loop.unroll_and_jam`` metadata will
disable unroll and jam (so ``llvm.loop.unroll`` metadata will be left to the
unroller, and ``llvm.loop.unroll.disable`` metadata will disable unroll and jam
too).
5230
5231The metadata for unroll and jam otherwise is the same as for ``unroll``.
5232``llvm.loop.unroll_and_jam.enable``, ``llvm.loop.unroll_and_jam.disable`` and
5233``llvm.loop.unroll_and_jam.count`` do the same as for unroll.
5234``llvm.loop.unroll_and_jam.full`` is not supported. Again these are only hints
5235and the normal safety checks will still be performed.
5236
5237'``llvm.loop.unroll_and_jam.count``' Metadata
5238^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
5239
5240This metadata suggests an unroll and jam factor to use, similarly to
5241``llvm.loop.unroll.count``. The first operand is the string
5242``llvm.loop.unroll_and_jam.count`` and the second operand is a positive integer
5243specifying the unroll factor. For example:
5244
5245.. code-block:: llvm
5246
5247   !0 = !{!"llvm.loop.unroll_and_jam.count", i32 4}
5248
If the trip count of the loop is less than the unroll count the loop
will be partially unrolled and jammed.
5251
5252'``llvm.loop.unroll_and_jam.disable``' Metadata
5253^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
5254
5255This metadata disables loop unroll and jamming. The metadata has a single
5256operand which is the string ``llvm.loop.unroll_and_jam.disable``. For example:
5257
5258.. code-block:: llvm
5259
5260   !0 = !{!"llvm.loop.unroll_and_jam.disable"}
5261
5262'``llvm.loop.unroll_and_jam.enable``' Metadata
5263^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
5264
This metadata suggests that the loop should be fully unrolled and jammed if the
5266trip count is known at compile time and partially unrolled if the trip count is
5267not known at compile time. The metadata has a single operand which is the
5268string ``llvm.loop.unroll_and_jam.enable``.  For example:
5269
5270.. code-block:: llvm
5271
5272   !0 = !{!"llvm.loop.unroll_and_jam.enable"}
5273
5274'``llvm.loop.licm_versioning.disable``' Metadata
5275^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
5276
5277This metadata indicates that the loop should not be versioned for the purpose
5278of enabling loop-invariant code motion (LICM). The metadata has a single operand
5279which is the string ``llvm.loop.licm_versioning.disable``. For example:
5280
5281.. code-block:: llvm
5282
5283   !0 = !{!"llvm.loop.licm_versioning.disable"}
5284
5285'``llvm.loop.distribute.enable``' Metadata
5286^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
5287
5288Loop distribution allows splitting a loop into multiple loops.  Currently,
5289this is only performed if the entire loop cannot be vectorized due to unsafe
5290memory dependencies.  The transformation will attempt to isolate the unsafe
5291dependencies into their own loop.
5292
5293This metadata can be used to selectively enable or disable distribution of the
5294loop.  The first operand is the string ``llvm.loop.distribute.enable`` and the
second operand is a bit. If the bit operand value is 1, distribution is
enabled. A value of 0 disables distribution:
5297
5298.. code-block:: llvm
5299
5300   !0 = !{!"llvm.loop.distribute.enable", i1 0}
5301   !1 = !{!"llvm.loop.distribute.enable", i1 1}
5302
5303This metadata should be used in conjunction with ``llvm.loop`` loop
5304identification metadata.
5305
5306'``llvm.mem``'
5307^^^^^^^^^^^^^^^
5308
5309Metadata types used to annotate memory accesses with information helpful
5310for optimizations are prefixed with ``llvm.mem``.
5311
5312'``llvm.mem.parallel_loop_access``' Metadata
5313^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
5314
The ``llvm.mem.parallel_loop_access`` metadata refers to a loop identifier,
or metadata containing a list of loop identifiers for nested loops.
The metadata is attached to memory accessing instructions and denotes that
no loop carried memory dependence exists between it and other instructions
denoted with the same loop identifier. The metadata on memory reads also
implies that if-conversion (i.e. speculative execution within a loop
iteration) is safe.
5321
5322Precisely, given two instructions ``m1`` and ``m2`` that both have the
5323``llvm.mem.parallel_loop_access`` metadata, with ``L1`` and ``L2`` being the
5324set of loops associated with that metadata, respectively, then there is no loop
5325carried dependence between ``m1`` and ``m2`` for loops in both ``L1`` and
5326``L2``.
5327
5328As a special case, if all memory accessing instructions in a loop have
5329``llvm.mem.parallel_loop_access`` metadata that refers to that loop, then the
5330loop has no loop carried memory dependences and is considered to be a parallel
5331loop.
5332
5333Note that if not all memory access instructions have such metadata referring to
the loop, then the loop is not considered trivially parallel. Additional
5335memory dependence analysis is required to make that determination. As a fail
5336safe mechanism, this causes loops that were originally parallel to be considered
5337sequential (if optimization passes that are unaware of the parallel semantics
5338insert new memory instructions into the loop body).
5339
5340Example of a loop that is considered parallel due to its correct use of
5341both ``llvm.loop`` and ``llvm.mem.parallel_loop_access``
5342metadata types that refer to the same loop identifier metadata.
5343
5344.. code-block:: llvm
5345
5346   for.body:
5347     ...
5348     %val0 = load i32, i32* %arrayidx, !llvm.mem.parallel_loop_access !0
5349     ...
5350     store i32 %val0, i32* %arrayidx1, !llvm.mem.parallel_loop_access !0
5351     ...
5352     br i1 %exitcond, label %for.end, label %for.body, !llvm.loop !0
5353
5354   for.end:
5355   ...
5356   !0 = !{!0}
5357
5358It is also possible to have nested parallel loops. In that case the
5359memory accesses refer to a list of loop identifier metadata nodes instead of
5360the loop identifier metadata node directly:
5361
5362.. code-block:: llvm
5363
5364   outer.for.body:
5365     ...
5366     %val1 = load i32, i32* %arrayidx3, !llvm.mem.parallel_loop_access !2
5367     ...
5368     br label %inner.for.body
5369
5370   inner.for.body:
5371     ...
5372     %val0 = load i32, i32* %arrayidx1, !llvm.mem.parallel_loop_access !0
5373     ...
5374     store i32 %val0, i32* %arrayidx2, !llvm.mem.parallel_loop_access !0
5375     ...
5376     br i1 %exitcond, label %inner.for.end, label %inner.for.body, !llvm.loop !1
5377
5378   inner.for.end:
5379     ...
5380     store i32 %val1, i32* %arrayidx4, !llvm.mem.parallel_loop_access !2
5381     ...
5382     br i1 %exitcond, label %outer.for.end, label %outer.for.body, !llvm.loop !2
5383
   outer.for.end:
5385   ...
5386   !0 = !{!1, !2} ; a list of loop identifiers
5387   !1 = !{!1} ; an identifier for the inner loop
5388   !2 = !{!2} ; an identifier for the outer loop
5389
5390'``irr_loop``' Metadata
5391^^^^^^^^^^^^^^^^^^^^^^^
5392
5393``irr_loop`` metadata may be attached to the terminator instruction of a basic
block that's an irreducible loop header (note that an irreducible loop has more
than one header basic block). If ``irr_loop`` metadata is attached to the
5396terminator instruction of a basic block that is not really an irreducible loop
5397header, the behavior is undefined. The intent of this metadata is to improve the
5398accuracy of the block frequency propagation. For example, in the code below, the
5399block ``header0`` may have a loop header weight (relative to the other headers of
5400the irreducible loop) of 100:
5401
5402.. code-block:: llvm
5403
5404    header0:
5405    ...
5406    br i1 %cmp, label %t1, label %t2, !irr_loop !0
5407
5408    ...
    !0 = !{!"loop_header_weight", i64 100}
5410
5411Irreducible loop header weights are typically based on profile data.
5412
5413'``invariant.group``' Metadata
5414^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
5415
5416The experimental ``invariant.group`` metadata may be attached to
5417``load``/``store`` instructions referencing a single metadata with no entries.
5418The existence of the ``invariant.group`` metadata on the instruction tells
5419the optimizer that every ``load`` and ``store`` to the same pointer operand
5420can be assumed to load or store the same
5421value (but see the ``llvm.launder.invariant.group`` intrinsic which affects
5422when two pointers are considered the same). Pointers returned by bitcast or
5423getelementptr with only zero indices are considered the same.
5424
5425Examples:
5426
5427.. code-block:: llvm
5428
5429   @unknownPtr = external global i8
5430   ...
5431   %ptr = alloca i8
5432   store i8 42, i8* %ptr, !invariant.group !0
5433   call void @foo(i8* %ptr)
5434
5435   %a = load i8, i8* %ptr, !invariant.group !0 ; Can assume that value under %ptr didn't change
5436   call void @foo(i8* %ptr)
5437
5438   %newPtr = call i8* @getPointer(i8* %ptr)
5439   %c = load i8, i8* %newPtr, !invariant.group !0 ; Can't assume anything, because we only have information about %ptr
5440
5441   %unknownValue = load i8, i8* @unknownPtr
5442   store i8 %unknownValue, i8* %ptr, !invariant.group !0 ; Can assume that %unknownValue == 42
5443
5444   call void @foo(i8* %ptr)
5445   %newPtr2 = call i8* @llvm.launder.invariant.group(i8* %ptr)
5446   %d = load i8, i8* %newPtr2, !invariant.group !0  ; Can't step through launder.invariant.group to get value of %ptr
5447
5448   ...
5449   declare void @foo(i8*)
5450   declare i8* @getPointer(i8*)
5451   declare i8* @llvm.launder.invariant.group(i8*)
5452
5453   !0 = !{}
5454
The ``invariant.group`` metadata must be dropped when replacing one pointer by
another based on aliasing information. This is because ``invariant.group`` is
tied to the SSA value of the pointer operand.
5458
5459.. code-block:: llvm
5460
5461  %v = load i8, i8* %x, !invariant.group !0
5462  ; if %x mustalias %y then we can replace the above instruction with
5463  %v = load i8, i8* %y
5464
5465Note that this is an experimental feature, which means that its semantics might
5466change in the future.
5467
5468'``type``' Metadata
5469^^^^^^^^^^^^^^^^^^^
5470
5471See :doc:`TypeMetadata`.
5472
5473'``associated``' Metadata
5474^^^^^^^^^^^^^^^^^^^^^^^^^
5475
5476The ``associated`` metadata may be attached to a global object
5477declaration with a single argument that references another global object.
5478
5479This metadata prevents discarding of the global object in linker GC
5480unless the referenced object is also discarded. The linker support for
5481this feature is spotty. For best compatibility, globals carrying this
5482metadata may also:
5483
5484- Be in a comdat with the referenced global.
5485- Be in @llvm.compiler.used.
5486- Have an explicit section with a name which is a valid C identifier.
5487
5488It does not have any effect on non-ELF targets.
5489
5490Example:
5491
5492.. code-block:: text
5493
5494    $a = comdat any
5495    @a = global i32 1, comdat $a
5496    @b = internal global i32 2, comdat $a, section "abc", !associated !0
5497    !0 = !{i32* @a}
5498
5499
5500'``prof``' Metadata
5501^^^^^^^^^^^^^^^^^^^
5502
5503The ``prof`` metadata is used to record profile data in the IR.
5504The first operand of the metadata node indicates the profile metadata
5505type. There are currently 3 types:
5506:ref:`branch_weights<prof_node_branch_weights>`,
5507:ref:`function_entry_count<prof_node_function_entry_count>`, and
5508:ref:`VP<prof_node_VP>`.
5509
5510.. _prof_node_branch_weights:
5511
5512branch_weights
5513""""""""""""""
5514
5515Branch weight metadata attached to a branch, select, switch or call instruction
5516represents the likeliness of the associated branch being taken.
5517For more information, see :doc:`BranchWeightMetadata`.
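
As a minimal sketch (the function name, block names, and weights below are
illustrative, not real profile data), a conditional branch annotated with
branch weights might look like:

.. code-block:: llvm

    define i32 @select_path(i1 %cond) {
    entry:
      ; The first weight applies to the true successor, the second to the
      ; false successor.
      br i1 %cond, label %likely, label %unlikely, !prof !0
    likely:
      ret i32 1
    unlikely:
      ret i32 0
    }

    !0 = !{!"branch_weights", i32 64, i32 4}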
5518
5519.. _prof_node_function_entry_count:
5520
5521function_entry_count
5522""""""""""""""""""""
5523
Function entry count metadata can be attached to function definitions
to record the number of times the function is called. Combined with block
frequency information (BFI), it is also used to derive basic block profile
counts.
5527For more information, see :doc:`BranchWeightMetadata`.
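
For illustration only (the function name and the count are made up), a
definition carrying an entry count could be written as:

.. code-block:: llvm

    ; !prof is attached to the function definition itself.
    define void @hot_function() !prof !0 {
      ret void
    }

    ; The function was entered 100 times in the profiled run.
    !0 = !{!"function_entry_count", i64 100}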
5528
5529.. _prof_node_VP:
5530
5531VP
5532""
5533
5534VP (value profile) metadata can be attached to instructions that have
5535value profile information. Currently this is indirect calls (where it
5536records the hottest callees) and calls to memory intrinsics such as memcpy,
5537memmove, and memset (where it records the hottest byte lengths).
5538
Each VP metadata node contains the string "VP", then a uint32_t value for the
value profiling kind, a uint64_t value for the total number of times the
instruction is executed, followed by pairs of a uint64_t value and its
execution count.
The value profiling kind is 0 for indirect call targets and 1 for memory
operations. For indirect call targets, each profile value is a hash
of the callee function name, and for memory operations each value is the
byte length.
5546
5547Note that the value counts do not need to add up to the total count
5548listed in the third operand (in practice only the top hottest values
5549are tracked and reported).
5550
5551Indirect call example:
5552
5553.. code-block:: llvm
5554
5555    call void %f(), !prof !1
5556    !1 = !{!"VP", i32 0, i64 1600, i64 7651369219802541373, i64 1030, i64 -4377547752858689819, i64 410}
5557
Note that the VP type is 0 (the second operand), which indicates that this is
indirect call value profile data. The third operand indicates that the
indirect call executed 1600 times. The 4th and 6th operands give the
hashes of the 2 hottest target functions' names (this is the same hash used
to represent function names in the profile database), and the 5th and 7th
operands give the number of times each of the respective target functions
was called.
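
A memory operation profile follows the same layout with a kind of 1. The
sketch below is hypothetical: the function name and counts are invented, and
the four-argument ``llvm.memcpy`` declaration is an assumption that may need
adjusting for the LLVM version in use.

.. code-block:: llvm

    declare void @llvm.memcpy.p0i8.p0i8.i64(i8* nocapture writeonly, i8* nocapture readonly, i64, i1)

    define void @copy(i8* %dst, i8* %src, i64 %n) {
      call void @llvm.memcpy.p0i8.p0i8.i64(i8* %dst, i8* %src, i64 %n, i1 false), !prof !0
      ret void
    }

    ; Kind 1 (memory operation), 1000 total executions:
    ; byte length 8 was seen 600 times, byte length 16 was seen 300 times.
    !0 = !{!"VP", i32 1, i64 1000, i64 8, i64 600, i64 16, i64 300}

Here the listed counts sum to 900 rather than 1000, which is permitted
because only the hottest values are recorded.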
5565
5566Module Flags Metadata
5567=====================
5568
Information about the module as a whole is difficult to convey to LLVM's
subsystems. The LLVM IR isn't sufficient to transmit this information.
The ``llvm.module.flags`` named metadata exists in order to facilitate
this. These flags are in the form of key / value pairs --- much like a
dictionary --- making it easy for any subsystem that cares about a flag to
look it up.
5575
5576The ``llvm.module.flags`` metadata contains a list of metadata triplets.
5577Each triplet has the following form:
5578
-  The first element is a *behavior* flag, which specifies the behavior
   when two (or more) modules are merged together and the linker
   encounters two (or more) metadata entries with the same ID. The
   supported behaviors are described below.
5583-  The second element is a metadata string that is a unique ID for the
5584   metadata. Each module may only have one flag entry for each unique ID (not
5585   including entries with the **Require** behavior).
5586-  The third element is the value of the flag.
5587
When two (or more) modules are merged together, the resulting
``llvm.module.flags`` metadata is the union of the modules' flags. That is, for
each unique metadata ID string, there will be exactly one entry in the merged
module's ``llvm.module.flags`` metadata table, and the value for that entry will
be determined by the merge behavior flag, as described below. The only exception
is that entries with the **Require** behavior are always preserved.
5594
5595The following behaviors are supported:
5596
5597.. list-table::
5598   :header-rows: 1
5599   :widths: 10 90
5600
5601   * - Value
5602     - Behavior
5603
5604   * - 1
5605     - **Error**
5606           Emits an error if two values disagree, otherwise the resulting value
5607           is that of the operands.
5608
5609   * - 2
5610     - **Warning**
5611           Emits a warning if two values disagree. The result value will be the
5612           operand for the flag from the first module being linked.
5613
5614   * - 3
5615     - **Require**
5616           Adds a requirement that another module flag be present and have a
5617           specified value after linking is performed. The value must be a
5618           metadata pair, where the first element of the pair is the ID of the
5619           module flag to be restricted, and the second element of the pair is
5620           the value the module flag should be restricted to. This behavior can
5621           be used to restrict the allowable results (via triggering of an
5622           error) of linking IDs with the **Override** behavior.
5623
5624   * - 4
5625     - **Override**
5626           Uses the specified value, regardless of the behavior or value of the
5627           other module. If both modules specify **Override**, but the values
5628           differ, an error will be emitted.
5629
5630   * - 5
5631     - **Append**
5632           Appends the two values, which are required to be metadata nodes.
5633
5634   * - 6
5635     - **AppendUnique**
5636           Appends the two values, which are required to be metadata
5637           nodes. However, duplicate entries in the second list are dropped
5638           during the append operation.
5639
5640   * - 7
5641     - **Max**
5642           Takes the max of the two values, which are required to be integers.
5643
5644It is an error for a particular unique flag ID to have multiple behaviors,
5645except in the case of **Require** (which adds restrictions on another metadata
5646value) or **Override**.
5647
5648An example of module flags:
5649
5650.. code-block:: llvm
5651
5652    !0 = !{ i32 1, !"foo", i32 1 }
5653    !1 = !{ i32 4, !"bar", i32 37 }
5654    !2 = !{ i32 2, !"qux", i32 42 }
5655    !3 = !{ i32 3, !"qux",
5656      !{
5657        !"foo", i32 1
5658      }
5659    }
5660    !llvm.module.flags = !{ !0, !1, !2, !3 }
5661
5662-  Metadata ``!0`` has the ID ``!"foo"`` and the value '1'. The behavior
5663   if two or more ``!"foo"`` flags are seen is to emit an error if their
5664   values are not equal.
5665
5666-  Metadata ``!1`` has the ID ``!"bar"`` and the value '37'. The
5667   behavior if two or more ``!"bar"`` flags are seen is to use the value
5668   '37'.
5669
5670-  Metadata ``!2`` has the ID ``!"qux"`` and the value '42'. The
5671   behavior if two or more ``!"qux"`` flags are seen is to emit a
5672   warning if their values are not equal.
5673
5674-  Metadata ``!3`` has the ID ``!"qux"`` and the value:
5675
5676   ::
5677
5678       !{ !"foo", i32 1 }
5679
5680   The behavior is to emit an error if the ``llvm.module.flags`` does not
5681   contain a flag with the ID ``!"foo"`` that has the value '1' after linking is
5682   performed.
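
To make the merge behaviors concrete, the following sketch shows the
**Max** and **AppendUnique** behaviors applied to two modules. The flag
names ``example-level`` and ``example-list`` are hypothetical, and the three
fragments belong to three separate modules, so they are not meant to be
assembled as a single file.

.. code-block:: llvm

    ; Module A
    !llvm.module.flags = !{ !0, !1 }
    !0 = !{ i32 7, !"example-level", i32 2 }
    !1 = !{ i32 6, !"example-list", !{ !"a", !"b" } }

    ; Module B
    !llvm.module.flags = !{ !0, !1 }
    !0 = !{ i32 7, !"example-level", i32 5 }
    !1 = !{ i32 6, !"example-list", !{ !"b", !"c" } }

    ; Expected flags after linking A and B:
    ; Max keeps the larger integer, AppendUnique drops the duplicate "b".
    !llvm.module.flags = !{ !0, !1 }
    !0 = !{ i32 7, !"example-level", i32 5 }
    !1 = !{ i32 6, !"example-list", !{ !"a", !"b", !"c" } }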
5683
5684Objective-C Garbage Collection Module Flags Metadata
5685----------------------------------------------------
5686
5687On the Mach-O platform, Objective-C stores metadata about garbage
5688collection in a special section called "image info". The metadata
5689consists of a version number and a bitmask specifying what types of
5690garbage collection are supported (if any) by the file. If two or more
5691modules are linked together their garbage collection metadata needs to
5692be merged rather than appended together.
5693
5694The Objective-C garbage collection module flags metadata consists of the
5695following key-value pairs:
5696
5697.. list-table::
5698   :header-rows: 1
5699   :widths: 30 70
5700
5701   * - Key
5702     - Value
5703
5704   * - ``Objective-C Version``
5705     - **[Required]** --- The Objective-C ABI version. Valid values are 1 and 2.
5706
5707   * - ``Objective-C Image Info Version``
5708     - **[Required]** --- The version of the image info section. Currently
5709       always 0.
5710
5711   * - ``Objective-C Image Info Section``
5712     - **[Required]** --- The section to place the metadata. Valid values are
5713       ``"__OBJC, __image_info, regular"`` for Objective-C ABI version 1, and
5714       ``"__DATA,__objc_imageinfo, regular, no_dead_strip"`` for
5715       Objective-C ABI version 2.
5716
5717   * - ``Objective-C Garbage Collection``
5718     - **[Required]** --- Specifies whether garbage collection is supported or
5719       not. Valid values are 0, for no garbage collection, and 2, for garbage
5720       collection supported.
5721
5722   * - ``Objective-C GC Only``
5723     - **[Optional]** --- Specifies that only garbage collection is supported.
5724       If present, its value must be 6. This flag requires that the
5725       ``Objective-C Garbage Collection`` flag have the value 2.
5726
5727Some important flag interactions:
5728
5729-  If a module with ``Objective-C Garbage Collection`` set to 0 is
5730   merged with a module with ``Objective-C Garbage Collection`` set to
5731   2, then the resulting module has the
5732   ``Objective-C Garbage Collection`` flag set to 0.
5733-  A module with ``Objective-C Garbage Collection`` set to 0 cannot be
5734   merged with a module with ``Objective-C GC Only`` set to 6.
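
As a sketch, a module built for Objective-C ABI version 2 without garbage
collection might carry the flags below. The keys and values come from the
table above; the merge behavior chosen for each flag (the first element of
each triplet) is an assumption made for this example, with **Override** used
for the garbage collection flag so that a value of 0 wins when merging.

.. code-block:: llvm

    !llvm.module.flags = !{!0, !1, !2, !3}
    !0 = !{i32 1, !"Objective-C Version", i32 2}
    !1 = !{i32 1, !"Objective-C Image Info Version", i32 0}
    !2 = !{i32 1, !"Objective-C Image Info Section", !"__DATA,__objc_imageinfo, regular, no_dead_strip"}
    ; Assumed behavior: Override (4), so "no garbage collection" takes
    ; precedence over another module's value.
    !3 = !{i32 4, !"Objective-C Garbage Collection", i32 0}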
5735
5736C type width Module Flags Metadata
5737----------------------------------
5738
5739The ARM backend emits a section into each generated object file describing the
5740options that it was compiled with (in a compiler-independent way) to prevent
5741linking incompatible objects, and to allow automatic library selection. Some
5742of these options are not visible at the IR level, namely wchar_t width and enum
5743width.
5744
5745To pass this information to the backend, these options are encoded in module
5746flags metadata, using the following key-value pairs:
5747
5748.. list-table::
5749   :header-rows: 1
5750   :widths: 30 70
5751
5752   * - Key
5753     - Value
5754
5755   * - short_wchar
5756     - * 0 --- sizeof(wchar_t) == 4
5757       * 1 --- sizeof(wchar_t) == 2
5758
5759   * - short_enum
5760     - * 0 --- Enums are at least as large as an ``int``.
       * 1 --- Enums are stored in the smallest integer type which can
         represent all of their values.
5763
5764For example, the following metadata section specifies that the module was
5765compiled with a ``wchar_t`` width of 4 bytes, and the underlying type of an
5766enum is the smallest type which can represent all of its values::
5767
    !llvm.module.flags = !{!0, !1}
    !0 = !{i32 1, !"short_wchar", i32 0}
    !1 = !{i32 1, !"short_enum", i32 1}
5771
5772Automatic Linker Flags Named Metadata
5773=====================================
5774
5775Some targets support embedding flags to the linker inside individual object
5776files. Typically this is used in conjunction with language extensions which
5777allow source files to explicitly declare the libraries they depend on, and have
5778these automatically be transmitted to the linker via object files.
5779
5780These flags are encoded in the IR using named metadata with the name
5781``!llvm.linker.options``. Each operand is expected to be a metadata node
5782which should be a list of other metadata nodes, each of which should be a
5783list of metadata strings defining linker options.
5784
5785For example, the following metadata section specifies two separate sets of
5786linker options, presumably to link against ``libz`` and the ``Cocoa``
5787framework::
5788
    !0 = !{ !"-lz" }
    !1 = !{ !"-framework", !"Cocoa" }
    !llvm.linker.options = !{ !0, !1 }
5792
5793The metadata encoding as lists of lists of options, as opposed to a collapsed
5794list of options, is chosen so that the IR encoding can use multiple option
5795strings to specify e.g., a single library, while still having that specifier be
5796preserved as an atomic element that can be recognized by a target specific
5797assembly writer or object file emitter.
5798
5799Each individual option is required to be either a valid option for the target's
5800linker, or an option that is reserved by the target specific assembly writer or
5801object file emitter. No other aspect of these options is defined by the IR.
5802
5803.. _summary:
5804
5805ThinLTO Summary
5806===============
5807
5808Compiling with `ThinLTO <https://clang.llvm.org/docs/ThinLTO.html>`_
5809causes the building of a compact summary of the module that is emitted into
5810the bitcode. The summary is emitted into the LLVM assembly and identified
5811in syntax by a caret ('``^``').
5812
*Note that the summary entries are currently skipped when parsing the
assembly, although parsing support is actively being implemented. The
following describes when the summary entries will be parsed once that support
is implemented.*
The summary will be parsed into a ModuleSummaryIndex object under the
same conditions where a summary index is currently built from bitcode:
in tools that test the Thin Link portion of a ThinLTO compile
(i.e. llvm-lto and llvm-lto2), or when parsing a combined index
for a distributed ThinLTO backend via clang's "``-fthinlto-index=<>``" flag.
Additionally, it will be parsed into a bitcode output, along with the Module
IR, via the "``llvm-as``" tool. Tools that parse the Module IR for the purposes
of optimization (e.g. "``clang -x ir``" and "``opt``") will ignore the
summary entries (just as they currently ignore summary entries in a bitcode
input file).
5826
5827There are currently 3 types of summary entries in the LLVM assembly:
5828:ref:`module paths<module_path_summary>`,
5829:ref:`global values<gv_summary>`, and
5830:ref:`type identifiers<typeid_summary>`.
5831
5832.. _module_path_summary:
5833
5834Module Path Summary Entry
5835-------------------------
5836
5837Each module path summary entry lists a module containing global values included
5838in the summary. For a single IR module there will be one such entry, but
5839in a combined summary index produced during the thin link, there will be
5840one module path entry per linked module with summary.
5841
5842Example:
5843
5844.. code-block:: llvm
5845
5846    ^0 = module: (path: "/path/to/file.o", hash: (2468601609, 1329373163, 1565878005, 638838075, 3148790418))
5847
5848The ``path`` field is a string path to the bitcode file, and the ``hash``
5849field is the 160-bit SHA-1 hash of the IR bitcode contents, used for
5850incremental builds and caching.
5851
5852.. _gv_summary:
5853
5854Global Value Summary Entry
5855--------------------------
5856
5857Each global value summary entry corresponds to a global value defined or
5858referenced by a summarized module.
5859
5860Example:
5861
5862.. code-block:: llvm
5863
5864    ^4 = gv: (name: "f"[, summaries: (Summary)[, (Summary)]*]?) ; guid = 14740650423002898831
5865
5866For declarations, there will not be a summary list. For definitions, a
5867global value will contain a list of summaries, one per module containing
5868a definition. There can be multiple entries in a combined summary index
5869for symbols with weak linkage.
5870
5871Each ``Summary`` format will depend on whether the global value is a
5872:ref:`function<function_summary>`, :ref:`variable<variable_summary>`, or
5873:ref:`alias<alias_summary>`.
5874
5875.. _function_summary:
5876
5877Function Summary
5878^^^^^^^^^^^^^^^^
5879
5880If the global value is a function, the ``Summary`` entry will look like:
5881
5882.. code-block:: llvm
5883
    function: (module: ^0, flags: (linkage: external, notEligibleToImport: 0, live: 0, dsoLocal: 0), insts: 2[, FuncFlags]?[, Calls]?[, TypeIdInfo]?[, Refs]?)
5885
5886The ``module`` field includes the summary entry id for the module containing
5887this definition, and the ``flags`` field contains information such as
5888the linkage type, a flag indicating whether it is legal to import the
5889definition, whether it is globally live and whether the linker resolved it
5890to a local definition (the latter two are populated during the thin link).
5891The ``insts`` field contains the number of IR instructions in the function.
Finally, there are several optional fields: :ref:`FuncFlags<funcflags_summary>`,
:ref:`Calls<calls_summary>`, :ref:`TypeIdInfo<typeidinfo_summary>`, and
:ref:`Refs<refs_summary>`.
5895
5896.. _variable_summary:
5897
5898Global Variable Summary
5899^^^^^^^^^^^^^^^^^^^^^^^
5900
5901If the global value is a variable, the ``Summary`` entry will look like:
5902
5903.. code-block:: llvm
5904
    variable: (module: ^0, flags: (linkage: external, notEligibleToImport: 0, live: 0, dsoLocal: 0)[, Refs]?)
5906
5907The variable entry contains a subset of the fields in a
5908:ref:`function summary <function_summary>`, see the descriptions there.
5909
5910.. _alias_summary:
5911
5912Alias Summary
5913^^^^^^^^^^^^^
5914
5915If the global value is an alias, the ``Summary`` entry will look like:
5916
5917.. code-block:: llvm
5918
5919    alias: (module: ^0, flags: (linkage: external, notEligibleToImport: 0, live: 0, dsoLocal: 0), aliasee: ^2)
5920
5921The ``module`` and ``flags`` fields are as described for a
5922:ref:`function summary <function_summary>`. The ``aliasee`` field
5923contains a reference to the global value summary entry of the aliasee.
5924
5925.. _funcflags_summary:
5926
5927Function Flags
5928^^^^^^^^^^^^^^
5929
5930The optional ``FuncFlags`` field looks like:
5931
5932.. code-block:: llvm
5933
5934    funcFlags: (readNone: 0, readOnly: 0, noRecurse: 0, returnDoesNotAlias: 0)
5935
5936If unspecified, flags are assumed to hold the conservative ``false`` value of
5937``0``.
5938
5939.. _calls_summary:
5940
5941Calls
5942^^^^^
5943
5944The optional ``Calls`` field looks like:
5945
5946.. code-block:: llvm
5947
5948    calls: ((Callee)[, (Callee)]*)
5949
5950where each ``Callee`` looks like:
5951
5952.. code-block:: llvm
5953
5954    callee: ^1[, hotness: None]?[, relbf: 0]?
5955
5956The ``callee`` refers to the summary entry id of the callee. At most one
5957of ``hotness`` (which can take the values ``Unknown``, ``Cold``, ``None``,
5958``Hot``, and ``Critical``), and ``relbf`` (which holds the integer
5959branch frequency relative to the entry frequency, scaled down by 2^8)
5960may be specified. The defaults are ``Unknown`` and ``0``, respectively.
5961
5962.. _refs_summary:
5963
5964Refs
5965^^^^
5966
5967The optional ``Refs`` field looks like:
5968
5969.. code-block:: llvm
5970
5971    refs: ((Ref)[, (Ref)]*)
5972
5973where each ``Ref`` contains a reference to the summary id of the referenced
5974value (e.g. ``^1``).
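
Putting the pieces above together, a small per-module summary might look
like the following sketch. The entry ids, names, path, and the all-zero hash
are placeholders chosen for illustration, and the ``; guid = ...`` comments
that accompany real output are omitted.

.. code-block:: llvm

    ^0 = module: (path: "example.o", hash: (0, 0, 0, 0, 0))
    ; A defined global variable with no references of its own.
    ^1 = gv: (name: "counter", summaries: (variable: (module: ^0, flags: (linkage: external, notEligibleToImport: 0, live: 0, dsoLocal: 0))))
    ; A declaration: no summary list.
    ^2 = gv: (name: "helper")
    ; A function that calls ^2 and references ^1.
    ^3 = gv: (name: "main", summaries: (function: (module: ^0, flags: (linkage: external, notEligibleToImport: 0, live: 0, dsoLocal: 0), insts: 3, calls: ((callee: ^2)), refs: (^1))))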
5975
5976.. _typeidinfo_summary:
5977
5978TypeIdInfo
5979^^^^^^^^^^
5980
5981The optional ``TypeIdInfo`` field, used for
5982`Control Flow Integrity <http://clang.llvm.org/docs/ControlFlowIntegrity.html>`_,
5983looks like:
5984
5985.. code-block:: llvm
5986
5987    typeIdInfo: [(TypeTests)]?[, (TypeTestAssumeVCalls)]?[, (TypeCheckedLoadVCalls)]?[, (TypeTestAssumeConstVCalls)]?[, (TypeCheckedLoadConstVCalls)]?
5988
5989These optional fields have the following forms:
5990
5991TypeTests
5992"""""""""
5993
5994.. code-block:: llvm
5995
5996    typeTests: (TypeIdRef[, TypeIdRef]*)
5997
5998Where each ``TypeIdRef`` refers to a :ref:`type id<typeid_summary>`
5999by summary id or ``GUID``.
6000
6001TypeTestAssumeVCalls
6002""""""""""""""""""""
6003
6004.. code-block:: llvm
6005
6006    typeTestAssumeVCalls: (VFuncId[, VFuncId]*)
6007
6008Where each VFuncId has the format:
6009
6010.. code-block:: llvm
6011
6012    vFuncId: (TypeIdRef, offset: 16)
6013
Where each ``TypeIdRef`` refers to a :ref:`type id<typeid_summary>`
by summary id or ``GUID`` preceded by a ``guid:`` tag.
6016
6017TypeCheckedLoadVCalls
6018"""""""""""""""""""""
6019
6020.. code-block:: llvm
6021
6022    typeCheckedLoadVCalls: (VFuncId[, VFuncId]*)
6023
6024Where each VFuncId has the format described for ``TypeTestAssumeVCalls``.
6025
6026TypeTestAssumeConstVCalls
6027"""""""""""""""""""""""""
6028
6029.. code-block:: llvm
6030
6031    typeTestAssumeConstVCalls: (ConstVCall[, ConstVCall]*)
6032
6033Where each ConstVCall has the format:
6034
6035.. code-block:: llvm
6036
6037    VFuncId, args: (Arg[, Arg]*)
6038
6039and where each VFuncId has the format described for ``TypeTestAssumeVCalls``,
6040and each Arg is an integer argument number.
6041
6042TypeCheckedLoadConstVCalls
6043""""""""""""""""""""""""""
6044
6045.. code-block:: llvm
6046
6047    typeCheckedLoadConstVCalls: (ConstVCall[, ConstVCall]*)
6048
6049Where each ConstVCall has the format described for
6050``TypeTestAssumeConstVCalls``.
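
As an illustration of how these pieces nest, a function summary that
performs type tests against two type ids and one assumed virtual call might
carry a field such as the one below. The summary ids ``^4`` and ``^5`` are
hypothetical, and the enclosing parentheses follow the convention used by
the other summary fields, which is an assumption of this sketch.

.. code-block:: llvm

    typeIdInfo: (typeTests: (^4, ^5), typeTestAssumeVCalls: (vFuncId: (^4, offset: 16)))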
6051
6052.. _typeid_summary:
6053
6054Type ID Summary Entry
6055---------------------
6056
6057Each type id summary entry corresponds to a type identifier resolution
6058which is generated during the LTO link portion of the compile when building
6059with `Control Flow Integrity <http://clang.llvm.org/docs/ControlFlowIntegrity.html>`_,
6060so these are only present in a combined summary index.
6061
6062Example:
6063
6064.. code-block:: llvm
6065
6066    ^4 = typeid: (name: "_ZTS1A", summary: (typeTestRes: (kind: allOnes, sizeM1BitWidth: 7[, alignLog2: 0]?[, sizeM1: 0]?[, bitMask: 0]?[, inlineBits: 0]?)[, WpdResolutions]?)) ; guid = 7004155349499253778
6067
6068The ``typeTestRes`` gives the type test resolution ``kind`` (which may
6069be ``unsat``, ``byteArray``, ``inline``, ``single``, or ``allOnes``), and
6070the ``size-1`` bit width. It is followed by optional flags, which default to 0,
6071and an optional WpdResolutions (whole program devirtualization resolution)
6072field that looks like:
6073
6074.. code-block:: llvm
6075
    wpdResolutions: ((offset: 0, WpdRes)[, (offset: 1, WpdRes)]*)
6077
6078where each entry is a mapping from the given byte offset to the whole-program
6079devirtualization resolution WpdRes, that has one of the following formats:
6080
6081.. code-block:: llvm
6082
6083    wpdRes: (kind: branchFunnel)
6084    wpdRes: (kind: singleImpl, singleImplName: "_ZN1A1nEi")
6085    wpdRes: (kind: indir)
6086
6087Additionally, each wpdRes has an optional ``resByArg`` field, which
6088describes the resolutions for calls with all constant integer arguments:
6089
6090.. code-block:: llvm
6091
6092    resByArg: (ResByArg[, ResByArg]*)
6093
6094where ResByArg is:
6095
6096.. code-block:: llvm
6097
6098    args: (Arg[, Arg]*), byArg: (kind: UniformRetVal[, info: 0][, byte: 0][, bit: 0])
6099
6100Where the ``kind`` can be ``Indir``, ``UniformRetVal``, ``UniqueRetVal``
6101or ``VirtualConstProp``. The ``info`` field is only used if the kind
6102is ``UniformRetVal`` (indicates the uniform return value), or
6103``UniqueRetVal`` (holds the return value associated with the unique vtable
6104(0 or 1)). The ``byte`` and ``bit`` fields are only used if the target does
6105not support the use of absolute symbols to store constants.
6106
6107.. _intrinsicglobalvariables:
6108
6109Intrinsic Global Variables
6110==========================
6111
6112LLVM has a number of "magic" global variables that contain data that
6113affect code generation or other IR semantics. These are documented here.
6114All globals of this sort should have a section specified as
6115"``llvm.metadata``". This section and all globals that start with
6116"``llvm.``" are reserved for use by LLVM.
6117
6118.. _gv_llvmused:
6119
6120The '``llvm.used``' Global Variable
6121-----------------------------------
6122
6123The ``@llvm.used`` global is an array which has
6124:ref:`appending linkage <linkage_appending>`. This array contains a list of
6125pointers to named global variables, functions and aliases which may optionally
6126have a pointer cast formed of bitcast or getelementptr. For example, a legal
6127use of it is:
6128
6129.. code-block:: llvm
6130
6131    @X = global i8 4
6132    @Y = global i32 123
6133
6134    @llvm.used = appending global [2 x i8*] [
6135       i8* @X,
6136       i8* bitcast (i32* @Y to i8*)
6137    ], section "llvm.metadata"
6138
If a symbol appears in the ``@llvm.used`` list, then the compiler, assembler,
and linker are required to treat the symbol as if there is a reference to the
symbol that they cannot see (which is why the symbols have to be named). For
example, if a variable has internal linkage and no references other than that
from the ``@llvm.used`` list, it cannot be deleted. This is commonly used to
represent references from inline asms and other things the compiler cannot
"see", and corresponds to "``__attribute__((used))``" in GNU C.
6146
On some targets, the code generator must emit a directive to the
assembler or object file to prevent the assembler and linker from
removing or modifying the symbol.
6150
6151.. _gv_llvmcompilerused:
6152
6153The '``llvm.compiler.used``' Global Variable
6154--------------------------------------------
6155
6156The ``@llvm.compiler.used`` directive is the same as the ``@llvm.used``
6157directive, except that it only prevents the compiler from touching the
6158symbol. On targets that support it, this allows an intelligent linker to
6159optimize references to the symbol without being impeded as it would be
6160by ``@llvm.used``.
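
The syntax mirrors that of ``@llvm.used``. A minimal sketch, using a
hypothetical internal global ``@Y``:

.. code-block:: llvm

    @Y = internal global i32 123

    ; The compiler must keep @Y, but the linker remains free to discard it.
    @llvm.compiler.used = appending global [1 x i8*] [
       i8* bitcast (i32* @Y to i8*)
    ], section "llvm.metadata"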
6161
This construct should only be used in rare circumstances, and should not be
exposed to source languages.
6164
6165.. _gv_llvmglobalctors:
6166
6167The '``llvm.global_ctors``' Global Variable
6168-------------------------------------------
6169
6170.. code-block:: llvm
6171
6172    %0 = type { i32, void ()*, i8* }
6173    @llvm.global_ctors = appending global [1 x %0] [%0 { i32 65535, void ()* @ctor, i8* @data }]
6174
6175The ``@llvm.global_ctors`` array contains a list of constructor
6176functions, priorities, and an optional associated global or function.
6177The functions referenced by this array will be called in ascending order
6178of priority (i.e. lowest first) when the module is loaded. The order of
6179functions with the same priority is not defined.
6180
6181If the third field is present, non-null, and points to a global variable
6182or function, the initializer function will only run if the associated
6183data from the current module is not discarded.
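
As a sketch of how priorities order the calls, the following array registers
two constructors with no associated data. The function names and the
priority 101 are chosen for illustration only.

.. code-block:: llvm

    define internal void @init_early() {
      ret void
    }

    define internal void @init_late() {
      ret void
    }

    ; Ascending priority order: @init_early (101) runs before
    ; @init_late (65535). A null third field means no associated data.
    @llvm.global_ctors = appending global [2 x { i32, void ()*, i8* }] [
      { i32, void ()*, i8* } { i32 101, void ()* @init_early, i8* null },
      { i32, void ()*, i8* } { i32 65535, void ()* @init_late, i8* null }
    ]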
6184
6185.. _llvmglobaldtors:
6186
6187The '``llvm.global_dtors``' Global Variable
6188-------------------------------------------
6189
6190.. code-block:: llvm
6191
6192    %0 = type { i32, void ()*, i8* }
6193    @llvm.global_dtors = appending global [1 x %0] [%0 { i32 65535, void ()* @dtor, i8* @data }]
6194
6195The ``@llvm.global_dtors`` array contains a list of destructor
6196functions, priorities, and an optional associated global or function.
6197The functions referenced by this array will be called in descending
6198order of priority (i.e. highest first) when the module is unloaded. The
6199order of functions with the same priority is not defined.
6200
6201If the third field is present, non-null, and points to a global variable
6202or function, the destructor function will only run if the associated
6203data from the current module is not discarded.
6204
6205Instruction Reference
6206=====================
6207
6208The LLVM instruction set consists of several different classifications
6209of instructions: :ref:`terminator instructions <terminators>`, :ref:`binary
6210instructions <binaryops>`, :ref:`bitwise binary
6211instructions <bitwiseops>`, :ref:`memory instructions <memoryops>`, and
6212:ref:`other instructions <otherops>`.
6213
6214.. _terminators:
6215
6216Terminator Instructions
6217-----------------------
6218
6219As mentioned :ref:`previously <functionstructure>`, every basic block in a
6220program ends with a "Terminator" instruction, which indicates which
6221block should be executed after the current block is finished. These
6222terminator instructions typically yield a '``void``' value: they produce
6223control flow, not values (the one exception being the
6224':ref:`invoke <i_invoke>`' instruction).
6225
6226The terminator instructions are: ':ref:`ret <i_ret>`',
6227':ref:`br <i_br>`', ':ref:`switch <i_switch>`',
6228':ref:`indirectbr <i_indirectbr>`', ':ref:`invoke <i_invoke>`',
6229':ref:`resume <i_resume>`', ':ref:`catchswitch <i_catchswitch>`',
6230':ref:`catchret <i_catchret>`',
6231':ref:`cleanupret <i_cleanupret>`',
6232and ':ref:`unreachable <i_unreachable>`'.
6233
6234.. _i_ret:
6235
6236'``ret``' Instruction
6237^^^^^^^^^^^^^^^^^^^^^
6238
6239Syntax:
6240"""""""
6241
6242::
6243
6244      ret <type> <value>       ; Return a value from a non-void function
6245      ret void                 ; Return from void function
6246
6247Overview:
6248"""""""""
6249
6250The '``ret``' instruction is used to return control flow (and optionally
6251a value) from a function back to the caller.
6252
There are two forms of the '``ret``' instruction: one that returns a
value and then transfers control flow, and one that only transfers
control flow.
6256
6257Arguments:
6258""""""""""
6259
6260The '``ret``' instruction optionally accepts a single argument, the
6261return value. The type of the return value must be a ':ref:`first
6262class <t_firstclass>`' type.
6263
A function is not :ref:`well formed <wellformed>` if it has a non-void
return type and contains a '``ret``' instruction with no return value or
a return value with a type that does not match its type, or if it has a
void return type and contains a '``ret``' instruction with a return
value.
6269
6270Semantics:
6271""""""""""
6272
6273When the '``ret``' instruction is executed, control flow returns back to
6274the calling function's context. If the caller is a
6275":ref:`call <i_call>`" instruction, execution continues at the
6276instruction after the call. If the caller was an
6277":ref:`invoke <i_invoke>`" instruction, execution continues at the
6278beginning of the "normal" destination block. If the instruction returns
6279a value, that value shall set the call or invoke instruction's return
6280value.
6281
6282Example:
6283""""""""
6284
6285.. code-block:: llvm
6286
6287      ret i32 5                       ; Return an integer value of 5
6288      ret void                        ; Return from a void function
6289      ret { i32, i8 } { i32 4, i8 2 } ; Return a struct of values 4 and 2
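
To show the semantics in context, the following sketch (with hypothetical
function names) returns a value to a ``call`` instruction, where it becomes
the result of the call:

.. code-block:: llvm

    define i32 @five() {
      ret i32 5                ; Control returns to the caller with value 5
    }

    define i32 @caller() {
      %r = call i32 @five()    ; %r is set by the callee's 'ret'
      %s = add i32 %r, 1       ; Execution continues after the call
      ret i32 %s
    }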
6290
6291.. _i_br:
6292
6293'``br``' Instruction
6294^^^^^^^^^^^^^^^^^^^^
6295
6296Syntax:
6297"""""""
6298
6299::
6300
6301      br i1 <cond>, label <iftrue>, label <iffalse>
6302      br label <dest>          ; Unconditional branch
6303
6304Overview:
6305"""""""""
6306
6307The '``br``' instruction is used to cause control flow to transfer to a
6308different basic block in the current function. There are two forms of
6309this instruction, corresponding to a conditional branch and an
6310unconditional branch.
6311
6312Arguments:
6313""""""""""
6314
6315The conditional branch form of the '``br``' instruction takes a single
6316'``i1``' value and two '``label``' values. The unconditional form of the
6317'``br``' instruction takes a single '``label``' value as a target.
6318
6319Semantics:
6320""""""""""
6321
6322Upon execution of a conditional '``br``' instruction, the '``i1``'
6323argument is evaluated. If the value is ``true``, control flows to the
6324'``iftrue``' ``label`` argument. If "cond" is ``false``, control flows
6325to the '``iffalse``' ``label`` argument.
6326
6327Example:
6328""""""""
6329
6330.. code-block:: llvm
6331
6332    Test:
6333      %cond = icmp eq i32 %a, %b
6334      br i1 %cond, label %IfEqual, label %IfUnequal
6335    IfEqual:
6336      ret i32 1
6337    IfUnequal:
6338      ret i32 0
6339
6340.. _i_switch:
6341
6342'``switch``' Instruction
6343^^^^^^^^^^^^^^^^^^^^^^^^
6344
6345Syntax:
6346"""""""
6347
6348::
6349
6350      switch <intty> <value>, label <defaultdest> [ <intty> <val>, label <dest> ... ]
6351
6352Overview:
6353"""""""""
6354
6355The '``switch``' instruction is used to transfer control flow to one of
6356several different places. It is a generalization of the '``br``'
6357instruction, allowing a branch to occur to one of many possible
6358destinations.
6359
6360Arguments:
6361""""""""""
6362
6363The '``switch``' instruction uses three parameters: an integer
6364comparison value '``value``', a default '``label``' destination, and an
6365array of pairs of comparison value constants and '``label``'s. The table
6366is not allowed to contain duplicate constant entries.
6367
6368Semantics:
6369""""""""""
6370
6371The ``switch`` instruction specifies a table of values and destinations.
6372When the '``switch``' instruction is executed, this table is searched
6373for the given value. If the value is found, control flow is transferred
6374to the corresponding destination; otherwise, control flow is transferred
6375to the default destination.
6376
6377Implementation:
6378"""""""""""""""
6379
6380Depending on properties of the target machine and the particular
6381``switch`` instruction, this instruction may be code generated in
6382different ways. For example, it could be generated as a series of
6383chained conditional branches or with a lookup table.
6384
6385Example:
6386""""""""
6387
6388.. code-block:: llvm
6389
6390     ; Emulate a conditional br instruction
6391     %Val = zext i1 %value to i32
6392     switch i32 %Val, label %truedest [ i32 0, label %falsedest ]
6393
6394     ; Emulate an unconditional br instruction
6395     switch i32 0, label %dest [ ]
6396
6397     ; Implement a jump table:
6398     switch i32 %val, label %otherwise [ i32 0, label %onzero
6399                                         i32 1, label %onone
6400                                         i32 2, label %ontwo ]
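
The following sketch places a '``switch``' in a complete function and,
next to it, one possible lowering as a chain of conditional branches, as
mentioned under Implementation. The function and block names are
illustrative only; an actual code generator is free to choose a
different strategy.

.. code-block:: llvm

    define i32 @classify(i32 %val) {
    entry:
      switch i32 %val, label %otherwise [ i32 0, label %onzero
                                          i32 1, label %onone ]
    onzero:
      ret i32 10
    onone:
      ret i32 11
    otherwise:                      ; reached when no table entry matches
      ret i32 -1
    }

    ; The same control flow written as chained conditional branches.
    define i32 @classify_chain(i32 %val) {
    entry:
      %is0 = icmp eq i32 %val, 0
      br i1 %is0, label %onzero, label %test1
    test1:
      %is1 = icmp eq i32 %val, 1
      br i1 %is1, label %onone, label %otherwise
    onzero:
      ret i32 10
    onone:
      ret i32 11
    otherwise:
      ret i32 -1
    }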
6401
6402.. _i_indirectbr:
6403
6404'``indirectbr``' Instruction
6405^^^^^^^^^^^^^^^^^^^^^^^^^^^^
6406
6407Syntax:
6408"""""""
6409
6410::
6411
6412      indirectbr <somety>* <address>, [ label <dest1>, label <dest2>, ... ]
6413
6414Overview:
6415"""""""""
6416
The '``indirectbr``' instruction implements an indirect branch to a
label within the current function, whose address is specified by
"``address``". The address must be derived from a
:ref:`blockaddress <blockaddress>` constant.
6421
6422Arguments:
6423""""""""""
6424
6425The '``address``' argument is the address of the label to jump to. The
6426rest of the arguments indicate the full set of possible destinations
6427that the address may point to. Blocks are allowed to occur multiple
6428times in the destination list, though this isn't particularly useful.
6429
6430This destination list is required so that dataflow analysis has an
6431accurate understanding of the CFG.
6432
6433Semantics:
6434""""""""""
6435
6436Control transfers to the block specified in the address argument. All
6437possible destination blocks must be listed in the label list, otherwise
6438this instruction has undefined behavior. This implies that jumps to
6439labels defined in other functions have undefined behavior as well.
6440
6441Implementation:
6442"""""""""""""""
6443
6444This is typically implemented with a jump through a register.
6445
6446Example:
6447""""""""
6448
6449.. code-block:: llvm
6450
6451     indirectbr i8* %Addr, [ label %bb1, label %bb2, label %bb3 ]
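
A slightly fuller sketch, assuming a hypothetical function ``@dispatch``,
shows where the address can come from: each candidate is a
``blockaddress`` constant naming a block of the same function, and every
block that may be selected appears in the destination list.

.. code-block:: llvm

    define i32 @dispatch(i1 %pick) {
    entry:
      ; Both candidate addresses are blockaddress constants of type i8*.
      %addr = select i1 %pick, i8* blockaddress(@dispatch, %bb1),
                               i8* blockaddress(@dispatch, %bb2)
      indirectbr i8* %addr, [ label %bb1, label %bb2 ]
    bb1:
      ret i32 1
    bb2:
      ret i32 2
    }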
6452
6453.. _i_invoke:
6454
6455'``invoke``' Instruction
6456^^^^^^^^^^^^^^^^^^^^^^^^
6457
6458Syntax:
6459"""""""
6460
6461::
6462
6463      <result> = invoke [cconv] [ret attrs] <ty>|<fnty> <fnptrval>(<function args>) [fn attrs]
6464                    [operand bundles] to label <normal label> unwind label <exception label>
6465
6466Overview:
6467"""""""""
6468
6469The '``invoke``' instruction causes control to transfer to a specified
6470function, with the possibility of control flow transfer to either the
6471'``normal``' label or the '``exception``' label. If the callee function
6472returns with the "``ret``" instruction, control flow will return to the
6473"normal" label. If the callee (or any indirect callees) returns via the
6474":ref:`resume <i_resume>`" instruction or other exception handling
6475mechanism, control is interrupted and continued at the dynamically
6476nearest "exception" label.
6477
The '``exception``' label is a `landing
pad <ExceptionHandling.html#overview>`_ for the exception. As such, the
'``exception``' label is required to have the
":ref:`landingpad <i_landingpad>`" instruction, which contains the
information about the behavior of the program after unwinding happens,
as its first non-PHI instruction. The restrictions on the
"``landingpad``" instruction tightly couple it to the "``invoke``"
instruction, so that the important information contained within the
"``landingpad``" instruction can't be lost through normal code motion.
6487
6488Arguments:
6489""""""""""
6490
6491This instruction requires several arguments:
6492
6493#. The optional "cconv" marker indicates which :ref:`calling
6494   convention <callingconv>` the call should use. If none is
6495   specified, the call defaults to using C calling conventions.
6496#. The optional :ref:`Parameter Attributes <paramattrs>` list for return
6497   values. Only '``zeroext``', '``signext``', and '``inreg``' attributes
6498   are valid here.
6499#. '``ty``': the type of the call instruction itself which is also the
6500   type of the return value. Functions that return no value are marked
6501   ``void``.
6502#. '``fnty``': shall be the signature of the function being invoked. The
6503   argument types must match the types implied by this signature. This
6504   type can be omitted if the function is not varargs.
6505#. '``fnptrval``': An LLVM value containing a pointer to a function to
6506   be invoked. In most cases, this is a direct function invocation, but
6507   indirect ``invoke``'s are just as possible, calling an arbitrary pointer
6508   to function value.
6509#. '``function args``': argument list whose types match the function
6510   signature argument types and parameter attributes. All arguments must
6511   be of :ref:`first class <t_firstclass>` type. If the function signature
6512   indicates the function accepts a variable number of arguments, the
6513   extra arguments can be specified.
6514#. '``normal label``': the label reached when the called function
6515   executes a '``ret``' instruction.
6516#. '``exception label``': the label reached when a callee returns via
6517   the :ref:`resume <i_resume>` instruction or other exception handling
6518   mechanism.
6519#. The optional :ref:`function attributes <fnattrs>` list.
6520#. The optional :ref:`operand bundles <opbundles>` list.
6521
6522Semantics:
6523""""""""""
6524
6525This instruction is designed to operate as a standard '``call``'
6526instruction in most regards. The primary difference is that it
6527establishes an association with a label, which is used by the runtime
6528library to unwind the stack.
6529
6530This instruction is used in languages with destructors to ensure that
6531proper cleanup is performed in the case of either a ``longjmp`` or a
6532thrown exception. Additionally, this is important for implementation of
6533'``catch``' clauses in high-level languages that support them.
6534
6535For the purposes of the SSA form, the definition of the value returned
6536by the '``invoke``' instruction is deemed to occur on the edge from the
6537current block to the "normal" label. If the callee unwinds then no
6538return value is available.
6539
6540Example:
6541""""""""
6542
6543.. code-block:: llvm
6544
6545      %retval = invoke i32 @Test(i32 15) to label %Continue
6546                  unwind label %TestCleanup              ; i32:retval set
6547      %retval = invoke coldcc i32 %Testfnptr(i32 15) to label %Continue
6548                  unwind label %TestCleanup              ; i32:retval set
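
To show how the pieces fit together, here is a minimal sketch of a
function whose '``invoke``' unwinds to a cleanup-only landing pad. The
callee ``@may_throw`` is hypothetical, and the example assumes the
Itanium C++ personality routine ``@__gxx_personality_v0``; other
languages and platforms use different personality functions.

.. code-block:: llvm

    declare void @may_throw()
    declare i32 @__gxx_personality_v0(...)

    define i32 @caller() personality i8* bitcast (i32 (...)* @__gxx_personality_v0 to i8*) {
    entry:
      invoke void @may_throw()
              to label %cont unwind label %lpad
    cont:                             ; reached when @may_throw returns
      ret i32 0
    lpad:                             ; reached when @may_throw unwinds
      %exn = landingpad { i8*, i32 }
              cleanup
      ; Cleanup code would go here before unwinding continues.
      resume { i8*, i32 } %exn
    }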
6549
6550.. _i_resume:
6551
6552'``resume``' Instruction
6553^^^^^^^^^^^^^^^^^^^^^^^^
6554
6555Syntax:
6556"""""""
6557
6558::
6559
6560      resume <type> <value>
6561
6562Overview:
6563"""""""""
6564
6565The '``resume``' instruction is a terminator instruction that has no
6566successors.
6567
6568Arguments:
6569""""""""""
6570
6571The '``resume``' instruction requires one argument, which must have the
6572same type as the result of any '``landingpad``' instruction in the same
6573function.
6574
6575Semantics:
6576""""""""""
6577
6578The '``resume``' instruction resumes propagation of an existing
6579(in-flight) exception whose unwinding was interrupted with a
6580:ref:`landingpad <i_landingpad>` instruction.
6581
6582Example:
6583""""""""
6584
6585.. code-block:: llvm
6586
6587      resume { i8*, i32 } %exn
6588
6589.. _i_catchswitch:
6590
6591'``catchswitch``' Instruction
6592^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
6593
6594Syntax:
6595"""""""
6596
6597::
6598
6599      <resultval> = catchswitch within <parent> [ label <handler1>, label <handler2>, ... ] unwind to caller
6600      <resultval> = catchswitch within <parent> [ label <handler1>, label <handler2>, ... ] unwind label <default>
6601
6602Overview:
6603"""""""""
6604
6605The '``catchswitch``' instruction is used by `LLVM's exception handling system
6606<ExceptionHandling.html#overview>`_ to describe the set of possible catch handlers
6607that may be executed by the :ref:`EH personality routine <personalityfn>`.
6608
6609Arguments:
6610""""""""""
6611
6612The ``parent`` argument is the token of the funclet that contains the
6613``catchswitch`` instruction. If the ``catchswitch`` is not inside a funclet,
6614this operand may be the token ``none``.
6615
6616The ``default`` argument is the label of another basic block beginning with
6617either a ``cleanuppad`` or ``catchswitch`` instruction.  This unwind destination
6618must be a legal target with respect to the ``parent`` links, as described in
6619the `exception handling documentation\ <ExceptionHandling.html#wineh-constraints>`_.
6620
6621The ``handlers`` are a nonempty list of successor blocks that each begin with a
6622:ref:`catchpad <i_catchpad>` instruction.
6623
6624Semantics:
6625""""""""""
6626
6627Executing this instruction transfers control to one of the successors in
6628``handlers``, if appropriate, or continues to unwind via the unwind label if
6629present.
6630
6631The ``catchswitch`` is both a terminator and a "pad" instruction, meaning that
6632it must be both the first non-phi instruction and last instruction in the basic
6633block. Therefore, it must be the only non-phi instruction in the block.
6634
6635Example:
6636""""""""
6637
6638.. code-block:: text
6639
6640    dispatch1:
6641      %cs1 = catchswitch within none [label %handler0, label %handler1] unwind to caller
6642    dispatch2:
6643      %cs2 = catchswitch within %parenthandler [label %handler0] unwind label %cleanup
6644
6645.. _i_catchret:
6646
6647'``catchret``' Instruction
6648^^^^^^^^^^^^^^^^^^^^^^^^^^
6649
6650Syntax:
6651"""""""
6652
6653::
6654
6655      catchret from <token> to label <normal>
6656
6657Overview:
6658"""""""""
6659
6660The '``catchret``' instruction is a terminator instruction that has a
6661single successor.
6662
6663
6664Arguments:
6665""""""""""
6666
6667The first argument to a '``catchret``' indicates which ``catchpad`` it
6668exits.  It must be a :ref:`catchpad <i_catchpad>`.
6669The second argument to a '``catchret``' specifies where control will
6670transfer to next.
6671
6672Semantics:
6673""""""""""
6674
6675The '``catchret``' instruction ends an existing (in-flight) exception whose
6676unwinding was interrupted with a :ref:`catchpad <i_catchpad>` instruction.  The
6677:ref:`personality function <personalityfn>` gets a chance to execute arbitrary
6678code to, for example, destroy the active exception.  Control then transfers to
6679``normal``.
6680
6681The ``token`` argument must be a token produced by a ``catchpad`` instruction.
6682If the specified ``catchpad`` is not the most-recently-entered not-yet-exited
6683funclet pad (as described in the `EH documentation\ <ExceptionHandling.html#wineh-constraints>`_),
6684the ``catchret``'s behavior is undefined.
6685
6686Example:
6687""""""""
6688
6689.. code-block:: text
6690
      catchret from %catch to label %continue
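
The sketch below shows how the token flows from '``catchswitch``' through
'``catchpad``' to '``catchret``' in one function. It assumes the MSVC C++
personality ``@__CxxFrameHandler3`` and uses the operand list that this
personality treats as a catch-all handler; ``@may_throw`` and the block
names are made up for illustration.

.. code-block:: llvm

    declare void @may_throw()
    declare i32 @__CxxFrameHandler3(...)

    define void @catch_all() personality i8* bitcast (i32 (...)* @__CxxFrameHandler3 to i8*) {
    entry:
      invoke void @may_throw()
              to label %done unwind label %dispatch
    dispatch:
      %cs = catchswitch within none [label %handler] unwind to caller
    handler:
      %cp = catchpad within %cs [i8* null, i32 64, i8* null]
      ; Handler code would go here.
      catchret from %cp to label %done
    done:
      ret void
    }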
6692
6693.. _i_cleanupret:
6694
6695'``cleanupret``' Instruction
6696^^^^^^^^^^^^^^^^^^^^^^^^^^^^
6697
6698Syntax:
6699"""""""
6700
6701::
6702
6703      cleanupret from <value> unwind label <continue>
6704      cleanupret from <value> unwind to caller
6705
6706Overview:
6707"""""""""
6708
6709The '``cleanupret``' instruction is a terminator instruction that has
6710an optional successor.
6711
6712
6713Arguments:
6714""""""""""
6715
6716The '``cleanupret``' instruction requires one argument, which indicates
6717which ``cleanuppad`` it exits, and must be a :ref:`cleanuppad <i_cleanuppad>`.
6718If the specified ``cleanuppad`` is not the most-recently-entered not-yet-exited
6719funclet pad (as described in the `EH documentation\ <ExceptionHandling.html#wineh-constraints>`_),
6720the ``cleanupret``'s behavior is undefined.
6721
6722The '``cleanupret``' instruction also has an optional successor, ``continue``,
6723which must be the label of another basic block beginning with either a
6724``cleanuppad`` or ``catchswitch`` instruction.  This unwind destination must
6725be a legal target with respect to the ``parent`` links, as described in the
6726`exception handling documentation\ <ExceptionHandling.html#wineh-constraints>`_.
6727
6728Semantics:
6729""""""""""
6730
6731The '``cleanupret``' instruction indicates to the
6732:ref:`personality function <personalityfn>` that one
6733:ref:`cleanuppad <i_cleanuppad>` it transferred control to has ended.
6734It transfers control to ``continue`` or unwinds out of the function.
6735
6736Example:
6737""""""""
6738
6739.. code-block:: text
6740
6741      cleanupret from %cleanup unwind to caller
6742      cleanupret from %cleanup unwind label %continue
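
For context, a minimal sketch of a cleanup funclet that ends with
'``cleanupret``' is given below. The functions ``@may_throw`` and
``@release`` are hypothetical, and the MSVC C++ personality
``@__CxxFrameHandler3`` is assumed. The call inside the funclet carries a
``"funclet"`` operand bundle naming the enclosing pad.

.. code-block:: llvm

    declare void @may_throw()
    declare void @release()
    declare i32 @__CxxFrameHandler3(...)

    define void @with_cleanup() personality i8* bitcast (i32 (...)* @__CxxFrameHandler3 to i8*) {
    entry:
      invoke void @may_throw()
              to label %done unwind label %cleanup
    cleanup:                          ; runs only while unwinding
      %cp = cleanuppad within none []
      call void @release() [ "funclet"(token %cp) ]
      cleanupret from %cp unwind to caller
    done:                             ; normal path
      call void @release()
      ret void
    }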
6743
6744.. _i_unreachable:
6745
6746'``unreachable``' Instruction
6747^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
6748
6749Syntax:
6750"""""""
6751
6752::
6753
6754      unreachable
6755
6756Overview:
6757"""""""""
6758
6759The '``unreachable``' instruction has no defined semantics. This
6760instruction is used to inform the optimizer that a particular portion of
6761the code is not reachable. This can be used to indicate that the code
6762after a no-return function cannot be reached, and other facts.
6763
6764Semantics:
6765""""""""""
6766
6767The '``unreachable``' instruction has no defined semantics.
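
A typical use, sketched here with a hypothetical function ``@checked``,
is to terminate a block that ends in a call to a function declared
``noreturn``:

.. code-block:: llvm

    declare void @abort() noreturn

    define i32 @checked(i32 %x) {
    entry:
      %neg = icmp slt i32 %x, 0
      br i1 %neg, label %fail, label %ok
    fail:
      call void @abort()
      unreachable                     ; @abort never returns to this point
    ok:
      ret i32 %x
    }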
6768
6769.. _binaryops:
6770
6771Binary Operations
6772-----------------
6773
6774Binary operators are used to do most of the computation in a program.
6775They require two operands of the same type, execute an operation on
6776them, and produce a single value. The operands might represent multiple
6777data, as is the case with the :ref:`vector <t_vector>` data type. The
6778result value has the same type as its operands.
6779
6780There are several different binary operators:
6781
6782.. _i_add:
6783
6784'``add``' Instruction
6785^^^^^^^^^^^^^^^^^^^^^
6786
6787Syntax:
6788"""""""
6789
6790::
6791
6792      <result> = add <ty> <op1>, <op2>          ; yields ty:result
6793      <result> = add nuw <ty> <op1>, <op2>      ; yields ty:result
6794      <result> = add nsw <ty> <op1>, <op2>      ; yields ty:result
6795      <result> = add nuw nsw <ty> <op1>, <op2>  ; yields ty:result
6796
6797Overview:
6798"""""""""
6799
6800The '``add``' instruction returns the sum of its two operands.
6801
6802Arguments:
6803""""""""""
6804
6805The two arguments to the '``add``' instruction must be
6806:ref:`integer <t_integer>` or :ref:`vector <t_vector>` of integer values. Both
6807arguments must have identical types.
6808
6809Semantics:
6810""""""""""
6811
6812The value produced is the integer sum of the two operands.
6813
6814If the sum has unsigned overflow, the result returned is the
6815mathematical result modulo 2\ :sup:`n`\ , where n is the bit width of
6816the result.
6817
6818Because LLVM integers use a two's complement representation, this
6819instruction is appropriate for both signed and unsigned integers.
6820
6821``nuw`` and ``nsw`` stand for "No Unsigned Wrap" and "No Signed Wrap",
6822respectively. If the ``nuw`` and/or ``nsw`` keywords are present, the
6823result value of the ``add`` is a :ref:`poison value <poisonvalues>` if
6824unsigned and/or signed overflow, respectively, occurs.
6825
6826Example:
6827""""""""
6828
6829.. code-block:: text
6830
6831      <result> = add i32 4, %var          ; yields i32:result = 4 + %var
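
The effect of the wrap flags can be sketched on ``i8``, where the
boundary values are easy to see. The function name ``@add_flags`` is made
up for this example.

.. code-block:: llvm

    define i8 @add_flags(i8 %x) {
      %wrap = add i8 %x, 1        ; %x = -1 (255 unsigned) yields 0
      %nuw  = add nuw i8 %x, 1    ; poison when %x = -1 (255 unsigned)
      %nsw  = add nsw i8 %x, 1    ; poison when %x = 127
      ret i8 %wrap
    }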
6832
6833.. _i_fadd:
6834
6835'``fadd``' Instruction
6836^^^^^^^^^^^^^^^^^^^^^^
6837
6838Syntax:
6839"""""""
6840
6841::
6842
6843      <result> = fadd [fast-math flags]* <ty> <op1>, <op2>   ; yields ty:result
6844
6845Overview:
6846"""""""""
6847
6848The '``fadd``' instruction returns the sum of its two operands.
6849
6850Arguments:
6851""""""""""
6852
6853The two arguments to the '``fadd``' instruction must be
6854:ref:`floating-point <t_floating>` or :ref:`vector <t_vector>` of
6855floating-point values. Both arguments must have identical types.
6856
6857Semantics:
6858""""""""""
6859
The value produced is the floating-point sum of the two operands.
This instruction is assumed to execute in the default :ref:`floating-point
environment <floatenv>`.
This instruction can also take any number of :ref:`fast-math
flags <fastmath>`, which are optimization hints to enable otherwise
unsafe floating-point optimizations.
6866
6867Example:
6868""""""""
6869
6870.. code-block:: text
6871
6872      <result> = fadd float 4.0, %var          ; yields float:result = 4.0 + %var
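
As an illustration of fast-math flags, the sketch below marks both
additions ``fast``. With that flag the optimizer is permitted, though not
required, to reassociate the sum, for example to compute ``%b + %c``
first. The function name ``@sum3`` is hypothetical.

.. code-block:: llvm

    define float @sum3(float %a, float %b, float %c) {
      %t = fadd fast float %a, %b
      %r = fadd fast float %t, %c
      ret float %r
    }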
6873
6874'``sub``' Instruction
6875^^^^^^^^^^^^^^^^^^^^^
6876
6877Syntax:
6878"""""""
6879
6880::
6881
6882      <result> = sub <ty> <op1>, <op2>          ; yields ty:result
6883      <result> = sub nuw <ty> <op1>, <op2>      ; yields ty:result
6884      <result> = sub nsw <ty> <op1>, <op2>      ; yields ty:result
6885      <result> = sub nuw nsw <ty> <op1>, <op2>  ; yields ty:result
6886
6887Overview:
6888"""""""""
6889
6890The '``sub``' instruction returns the difference of its two operands.
6891
6892Note that the '``sub``' instruction is used to represent the '``neg``'
6893instruction present in most other intermediate representations.
6894
6895Arguments:
6896""""""""""
6897
6898The two arguments to the '``sub``' instruction must be
6899:ref:`integer <t_integer>` or :ref:`vector <t_vector>` of integer values. Both
6900arguments must have identical types.
6901
6902Semantics:
6903""""""""""
6904
6905The value produced is the integer difference of the two operands.
6906
6907If the difference has unsigned overflow, the result returned is the
6908mathematical result modulo 2\ :sup:`n`\ , where n is the bit width of
6909the result.
6910
6911Because LLVM integers use a two's complement representation, this
6912instruction is appropriate for both signed and unsigned integers.
6913
6914``nuw`` and ``nsw`` stand for "No Unsigned Wrap" and "No Signed Wrap",
6915respectively. If the ``nuw`` and/or ``nsw`` keywords are present, the
6916result value of the ``sub`` is a :ref:`poison value <poisonvalues>` if
6917unsigned and/or signed overflow, respectively, occurs.
6918
6919Example:
6920""""""""
6921
6922.. code-block:: text
6923
6924      <result> = sub i32 4, %var          ; yields i32:result = 4 - %var
      <result> = sub i32 0, %var          ; yields i32:result = -%var
6926
6927.. _i_fsub:
6928
6929'``fsub``' Instruction
6930^^^^^^^^^^^^^^^^^^^^^^
6931
6932Syntax:
6933"""""""
6934
6935::
6936
6937      <result> = fsub [fast-math flags]* <ty> <op1>, <op2>   ; yields ty:result
6938
6939Overview:
6940"""""""""
6941
6942The '``fsub``' instruction returns the difference of its two operands.
6943
6944Note that the '``fsub``' instruction is used to represent the '``fneg``'
6945instruction present in most other intermediate representations.
6946
6947Arguments:
6948""""""""""
6949
6950The two arguments to the '``fsub``' instruction must be
6951:ref:`floating-point <t_floating>` or :ref:`vector <t_vector>` of
6952floating-point values. Both arguments must have identical types.
6953
6954Semantics:
6955""""""""""
6956
The value produced is the floating-point difference of the two operands.
This instruction is assumed to execute in the default :ref:`floating-point
environment <floatenv>`.
This instruction can also take any number of :ref:`fast-math
flags <fastmath>`, which are optimization hints to enable otherwise
unsafe floating-point optimizations.
6963
6964Example:
6965""""""""
6966
6967.. code-block:: text
6968
6969      <result> = fsub float 4.0, %var           ; yields float:result = 4.0 - %var
      <result> = fsub float -0.0, %var          ; yields float:result = -%var
6971
6972'``mul``' Instruction
6973^^^^^^^^^^^^^^^^^^^^^
6974
6975Syntax:
6976"""""""
6977
6978::
6979
6980      <result> = mul <ty> <op1>, <op2>          ; yields ty:result
6981      <result> = mul nuw <ty> <op1>, <op2>      ; yields ty:result
6982      <result> = mul nsw <ty> <op1>, <op2>      ; yields ty:result
6983      <result> = mul nuw nsw <ty> <op1>, <op2>  ; yields ty:result
6984
6985Overview:
6986"""""""""
6987
6988The '``mul``' instruction returns the product of its two operands.
6989
6990Arguments:
6991""""""""""
6992
6993The two arguments to the '``mul``' instruction must be
6994:ref:`integer <t_integer>` or :ref:`vector <t_vector>` of integer values. Both
6995arguments must have identical types.
6996
6997Semantics:
6998""""""""""
6999
7000The value produced is the integer product of the two operands.
7001
7002If the result of the multiplication has unsigned overflow, the result
7003returned is the mathematical result modulo 2\ :sup:`n`\ , where n is the
7004bit width of the result.
7005
7006Because LLVM integers use a two's complement representation, and the
7007result is the same width as the operands, this instruction returns the
7008correct result for both signed and unsigned integers. If a full product
7009(e.g. ``i32`` * ``i32`` -> ``i64``) is needed, the operands should be
7010sign-extended or zero-extended as appropriate to the width of the full
7011product.
7012
7013``nuw`` and ``nsw`` stand for "No Unsigned Wrap" and "No Signed Wrap",
7014respectively. If the ``nuw`` and/or ``nsw`` keywords are present, the
7015result value of the ``mul`` is a :ref:`poison value <poisonvalues>` if
7016unsigned and/or signed overflow, respectively, occurs.
7017
7018Example:
7019""""""""
7020
7021.. code-block:: text
7022
7023      <result> = mul i32 4, %var          ; yields i32:result = 4 * %var
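
The full-product case described above can be sketched as follows, with
one function per signedness. The function names are illustrative. The
``nuw`` and ``nsw`` flags are optional here; they are valid because the
extended operands are too small for the 64-bit product to wrap.

.. code-block:: llvm

    define i64 @umul_wide(i32 %a, i32 %b) {
      %a.ext = zext i32 %a to i64
      %b.ext = zext i32 %b to i64
      %prod = mul nuw i64 %a.ext, %b.ext   ; both factors are below 2^32
      ret i64 %prod
    }

    define i64 @smul_wide(i32 %a, i32 %b) {
      %a.ext = sext i32 %a to i64
      %b.ext = sext i32 %b to i64
      %prod = mul nsw i64 %a.ext, %b.ext   ; magnitude is at most 2^62
      ret i64 %prod
    }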
7024
7025.. _i_fmul:
7026
7027'``fmul``' Instruction
7028^^^^^^^^^^^^^^^^^^^^^^
7029
7030Syntax:
7031"""""""
7032
7033::
7034
7035      <result> = fmul [fast-math flags]* <ty> <op1>, <op2>   ; yields ty:result
7036
7037Overview:
7038"""""""""
7039
7040The '``fmul``' instruction returns the product of its two operands.
7041
7042Arguments:
7043""""""""""
7044
7045The two arguments to the '``fmul``' instruction must be
7046:ref:`floating-point <t_floating>` or :ref:`vector <t_vector>` of
7047floating-point values. Both arguments must have identical types.
7048
7049Semantics:
7050""""""""""
7051
The value produced is the floating-point product of the two operands.
This instruction is assumed to execute in the default :ref:`floating-point
environment <floatenv>`.
This instruction can also take any number of :ref:`fast-math
flags <fastmath>`, which are optimization hints to enable otherwise
unsafe floating-point optimizations.
7058
7059Example:
7060""""""""
7061
7062.. code-block:: text
7063
7064      <result> = fmul float 4.0, %var          ; yields float:result = 4.0 * %var
7065
7066'``udiv``' Instruction
7067^^^^^^^^^^^^^^^^^^^^^^
7068
7069Syntax:
7070"""""""
7071
7072::
7073
7074      <result> = udiv <ty> <op1>, <op2>         ; yields ty:result
7075      <result> = udiv exact <ty> <op1>, <op2>   ; yields ty:result
7076
7077Overview:
7078"""""""""
7079
7080The '``udiv``' instruction returns the quotient of its two operands.
7081
7082Arguments:
7083""""""""""
7084
7085The two arguments to the '``udiv``' instruction must be
7086:ref:`integer <t_integer>` or :ref:`vector <t_vector>` of integer values. Both
7087arguments must have identical types.
7088
7089Semantics:
7090""""""""""
7091
7092The value produced is the unsigned integer quotient of the two operands.
7093
7094Note that unsigned integer division and signed integer division are
7095distinct operations; for signed integer division, use '``sdiv``'.
7096
7097Division by zero is undefined behavior. For vectors, if any element
7098of the divisor is zero, the operation has undefined behavior.
7099
If the ``exact`` keyword is present, the result value of the ``udiv`` is
a :ref:`poison value <poisonvalues>` if ``op1`` is not a multiple of ``op2`` (as
such, "((a udiv exact b) mul b) == a").
7104
7105Example:
7106""""""""
7107
7108.. code-block:: text
7109
7110      <result> = udiv i32 4, %var          ; yields i32:result = 4 / %var
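
A small sketch of ``exact`` in use is shown below. The function name
``@bytes_to_words`` is hypothetical, and the example assumes the caller
guarantees that ``%bytes`` is a multiple of 4; if it is not, the result
is a poison value.

.. code-block:: llvm

    define i32 @bytes_to_words(i32 %bytes) {
      ; No remainder is discarded, so multiplying the result by 4
      ; gives back %bytes.
      %words = udiv exact i32 %bytes, 4
      ret i32 %words
    }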
7111
7112'``sdiv``' Instruction
7113^^^^^^^^^^^^^^^^^^^^^^
7114
7115Syntax:
7116"""""""
7117
7118::
7119
7120      <result> = sdiv <ty> <op1>, <op2>         ; yields ty:result
7121      <result> = sdiv exact <ty> <op1>, <op2>   ; yields ty:result
7122
7123Overview:
7124"""""""""
7125
7126The '``sdiv``' instruction returns the quotient of its two operands.
7127
7128Arguments:
7129""""""""""
7130
7131The two arguments to the '``sdiv``' instruction must be
7132:ref:`integer <t_integer>` or :ref:`vector <t_vector>` of integer values. Both
7133arguments must have identical types.
7134
7135Semantics:
7136""""""""""
7137
7138The value produced is the signed integer quotient of the two operands
7139rounded towards zero.
7140
7141Note that signed integer division and unsigned integer division are
7142distinct operations; for unsigned integer division, use '``udiv``'.
7143
7144Division by zero is undefined behavior. For vectors, if any element
7145of the divisor is zero, the operation has undefined behavior.
7146Overflow also leads to undefined behavior; this is a rare case, but can
7147occur, for example, by doing a 32-bit division of -2147483648 by -1.
7148
7149If the ``exact`` keyword is present, the result value of the ``sdiv`` is
7150a :ref:`poison value <poisonvalues>` if the result would be rounded.
7151
7152Example:
7153""""""""
7154
7155.. code-block:: text
7156
7157      <result> = sdiv i32 4, %var          ; yields i32:result = 4 / %var
7158
7159.. _i_fdiv:
7160
7161'``fdiv``' Instruction
7162^^^^^^^^^^^^^^^^^^^^^^
7163
7164Syntax:
7165"""""""
7166
7167::
7168
7169      <result> = fdiv [fast-math flags]* <ty> <op1>, <op2>   ; yields ty:result
7170
7171Overview:
7172"""""""""
7173
7174The '``fdiv``' instruction returns the quotient of its two operands.
7175
7176Arguments:
7177""""""""""
7178
7179The two arguments to the '``fdiv``' instruction must be
7180:ref:`floating-point <t_floating>` or :ref:`vector <t_vector>` of
7181floating-point values. Both arguments must have identical types.
7182
7183Semantics:
7184""""""""""
7185
The value produced is the floating-point quotient of the two operands.
This instruction is assumed to execute in the default :ref:`floating-point
environment <floatenv>`.
This instruction can also take any number of :ref:`fast-math
flags <fastmath>`, which are optimization hints to enable otherwise
unsafe floating-point optimizations.
7192
7193Example:
7194""""""""
7195
7196.. code-block:: text
7197
7198      <result> = fdiv float 4.0, %var          ; yields float:result = 4.0 / %var
7199
7200'``urem``' Instruction
7201^^^^^^^^^^^^^^^^^^^^^^
7202
7203Syntax:
7204"""""""
7205
7206::
7207
7208      <result> = urem <ty> <op1>, <op2>   ; yields ty:result
7209
7210Overview:
7211"""""""""
7212
7213The '``urem``' instruction returns the remainder from the unsigned
7214division of its two arguments.
7215
7216Arguments:
7217""""""""""
7218
7219The two arguments to the '``urem``' instruction must be
7220:ref:`integer <t_integer>` or :ref:`vector <t_vector>` of integer values. Both
7221arguments must have identical types.
7222
7223Semantics:
7224""""""""""
7225
7226This instruction returns the unsigned integer *remainder* of a division.
7227This instruction always performs an unsigned division to get the
7228remainder.
7229
7230Note that unsigned integer remainder and signed integer remainder are
7231distinct operations; for signed integer remainder, use '``srem``'.
7232
7233Taking the remainder of a division by zero is undefined behavior.
7234For vectors, if any element of the divisor is zero, the operation has
7235undefined behavior.
7236
7237Example:
7238""""""""
7239
7240.. code-block:: text
7241
7242      <result> = urem i32 4, %var          ; yields i32:result = 4 % %var
7243
7244'``srem``' Instruction
7245^^^^^^^^^^^^^^^^^^^^^^
7246
7247Syntax:
7248"""""""
7249
7250::
7251
7252      <result> = srem <ty> <op1>, <op2>   ; yields ty:result
7253
7254Overview:
7255"""""""""
7256
7257The '``srem``' instruction returns the remainder from the signed
7258division of its two operands. This instruction can also take
7259:ref:`vector <t_vector>` versions of the values in which case the elements
7260must be integers.
7261
7262Arguments:
7263""""""""""
7264
7265The two arguments to the '``srem``' instruction must be
7266:ref:`integer <t_integer>` or :ref:`vector <t_vector>` of integer values. Both
7267arguments must have identical types.
7268
7269Semantics:
7270""""""""""
7271
7272This instruction returns the *remainder* of a division (where the result
7273is either zero or has the same sign as the dividend, ``op1``), not the
7274*modulo* operator (where the result is either zero or has the same sign
7275as the divisor, ``op2``) of a value. For more information about the
7276difference, see `The Math
7277Forum <http://mathforum.org/dr.math/problems/anne.4.28.99.html>`_. For a
7278table of how this is implemented in various languages, please see
7279`Wikipedia: modulo
7280operation <http://en.wikipedia.org/wiki/Modulo_operation>`_.
7281
7282Note that signed integer remainder and unsigned integer remainder are
7283distinct operations; for unsigned integer remainder, use '``urem``'.
7284
7285Taking the remainder of a division by zero is undefined behavior.
7286For vectors, if any element of the divisor is zero, the operation has
7287undefined behavior.
7288Overflow also leads to undefined behavior; this is a rare case, but can
7289occur, for example, by taking the remainder of a 32-bit division of
-2147483648 by -1. (The remainder doesn't actually overflow, but this
rule lets ``srem`` be implemented using instructions that return both the
result of the division and the remainder.)
7293
7294Example:
7295""""""""
7296
7297.. code-block:: text
7298
7299      <result> = srem i32 4, %var          ; yields i32:result = 4 % %var
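
The sign rule can be checked on small constants. In the sketch below
(the function name ``@srem_sign`` is made up), the quotient is rounded
towards zero and the remainder takes the sign of the dividend. A floored
modulo operator would instead give 1 for the second line and -1 for the
third.

.. code-block:: llvm

    define i32 @srem_sign() {
      %q = sdiv i32 -7, 2      ; -3, quotient rounded towards zero
      %r = srem i32 -7, 2      ; -1, same sign as the dividend
      %s = srem i32 7, -2      ;  1, same sign as the dividend
      ret i32 %r
    }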
7300
7301.. _i_frem:
7302
7303'``frem``' Instruction
7304^^^^^^^^^^^^^^^^^^^^^^
7305
7306Syntax:
7307"""""""
7308
7309::
7310
7311      <result> = frem [fast-math flags]* <ty> <op1>, <op2>   ; yields ty:result
7312
7313Overview:
7314"""""""""
7315
7316The '``frem``' instruction returns the remainder from the division of
7317its two operands.
7318
7319Arguments:
7320""""""""""
7321
7322The two arguments to the '``frem``' instruction must be
7323:ref:`floating-point <t_floating>` or :ref:`vector <t_vector>` of
7324floating-point values. Both arguments must have identical types.
7325
7326Semantics:
7327""""""""""
7328
The value produced is the floating-point remainder of the two operands.
This is the same output as a libm '``fmod``' function, but without any
possibility of setting ``errno``. The remainder has the same sign as the
dividend.
This instruction is assumed to execute in the default :ref:`floating-point
environment <floatenv>`.
This instruction can also take any number of :ref:`fast-math
flags <fastmath>`, which are optimization hints to enable otherwise
unsafe floating-point optimizations.
7338
7339Example:
7340""""""""
7341
7342.. code-block:: text
7343
7344      <result> = frem float 4.0, %var          ; yields float:result = 4.0 % %var
7345
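The sign rule above can be illustrated with a small sketch. The function
names are hypothetical, and the ``nnan`` flag in the second function is only
a valid hint under the assumption that neither operand nor the result is
ever NaN.

.. code-block:: llvm

      define float @rem_examples() {
        %a = frem float -5.5, 2.0      ; -1.5: sign follows the dividend
        %b = frem float 5.5, -2.0      ;  1.5: sign follows the dividend
        %s = fadd float %a, %b         ;  0.0
        ret float %s
      }

      ; Same operation with a fast-math flag attached.
      define float @rem_nnan(float %x, float %y) {
        %r = frem nnan float %x, %y
        ret float %r
      }
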
7346.. _bitwiseops:
7347
7348Bitwise Binary Operations
7349-------------------------
7350
7351Bitwise binary operators are used to do various forms of bit-twiddling
7352in a program. They are generally very efficient instructions and can
7353commonly be strength reduced from other instructions. They require two
7354operands of the same type, execute an operation on them, and produce a
7355single value. The resulting value is the same type as its operands.
7356
7357'``shl``' Instruction
7358^^^^^^^^^^^^^^^^^^^^^
7359
7360Syntax:
7361"""""""
7362
7363::
7364
7365      <result> = shl <ty> <op1>, <op2>           ; yields ty:result
7366      <result> = shl nuw <ty> <op1>, <op2>       ; yields ty:result
7367      <result> = shl nsw <ty> <op1>, <op2>       ; yields ty:result
7368      <result> = shl nuw nsw <ty> <op1>, <op2>   ; yields ty:result
7369
7370Overview:
7371"""""""""
7372
7373The '``shl``' instruction returns the first operand shifted to the left
7374a specified number of bits.
7375
7376Arguments:
7377""""""""""
7378
7379Both arguments to the '``shl``' instruction must be the same
7380:ref:`integer <t_integer>` or :ref:`vector <t_vector>` of integer type.
7381'``op2``' is treated as an unsigned value.
7382
7383Semantics:
7384""""""""""
7385
7386The value produced is ``op1`` \* 2\ :sup:`op2` mod 2\ :sup:`n`,
7387where ``n`` is the width of the result. If ``op2`` is (statically or
7388dynamically) equal to or larger than the number of bits in
7389``op1``, this instruction returns a :ref:`poison value <poisonvalues>`.
7390If the arguments are vectors, each vector element of ``op1`` is shifted
7391by the corresponding shift amount in ``op2``.
7392
7393If the ``nuw`` keyword is present, then the shift produces a poison
7394value if it shifts out any non-zero bits.
7395If the ``nsw`` keyword is present, then the shift produces a poison
7396value if it shifts out any bits that disagree with the resultant sign bit.
7397
7398Example:
7399""""""""
7400
7401.. code-block:: text
7402
7403      <result> = shl i32 4, %var   ; yields i32: 4 << %var
7404      <result> = shl i32 4, 2      ; yields i32: 16
7405      <result> = shl i32 1, 10     ; yields i32: 1024
      <result> = shl i32 1, 32     ; yields a poison value (shift amount >= bit width)
7407      <result> = shl <2 x i32> < i32 1, i32 1>, < i32 1, i32 2>   ; yields: result=<2 x i32> < i32 2, i32 4>
7408
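The following sketch works through the ``nuw`` and ``nsw`` rules on ``i8``
constants and shows one way to keep a variable shift amount in range. The
function names are hypothetical, and masking the amount with 31 is just one
possible policy for avoiding the poison result, not something this
instruction requires.

.. code-block:: llvm

      define void @shl_flags() {
        %a = shl nuw i8 64, 1      ; 0x80: the bit shifted out is zero
        %b = shl nuw i8 -128, 1    ; poison: a one bit is shifted out
        %c = shl nsw i8 -64, 1     ; -128: shifted-out bit matches the sign bit
        %d = shl nsw i8 64, 1      ; poison: shifted-out 0 disagrees with sign bit 1
        ret void
      }

      define i32 @shl_masked(i32 %x, i32 %amt) {
        %amt.m = and i32 %amt, 31  ; shift amount is now always below 32
        %r = shl i32 %x, %amt.m
        ret i32 %r
      }
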
7409'``lshr``' Instruction
7410^^^^^^^^^^^^^^^^^^^^^^
7411
7412Syntax:
7413"""""""
7414
7415::
7416
7417      <result> = lshr <ty> <op1>, <op2>         ; yields ty:result
7418      <result> = lshr exact <ty> <op1>, <op2>   ; yields ty:result
7419
7420Overview:
7421"""""""""
7422
7423The '``lshr``' instruction (logical shift right) returns the first
7424operand shifted to the right a specified number of bits with zero fill.
7425
7426Arguments:
7427""""""""""
7428
7429Both arguments to the '``lshr``' instruction must be the same
7430:ref:`integer <t_integer>` or :ref:`vector <t_vector>` of integer type.
7431'``op2``' is treated as an unsigned value.
7432
7433Semantics:
7434""""""""""
7435
7436This instruction always performs a logical shift right operation. The
7437most significant bits of the result will be filled with zero bits after
7438the shift. If ``op2`` is (statically or dynamically) equal to or larger
7439than the number of bits in ``op1``, this instruction returns a :ref:`poison
7440value <poisonvalues>`. If the arguments are vectors, each vector element
7441of ``op1`` is shifted by the corresponding shift amount in ``op2``.
7442
7443If the ``exact`` keyword is present, the result value of the ``lshr`` is
7444a poison value if any of the bits shifted out are non-zero.
7445
7446Example:
7447""""""""
7448
7449.. code-block:: text
7450
7451      <result> = lshr i32 4, 1   ; yields i32:result = 2
7452      <result> = lshr i32 4, 2   ; yields i32:result = 1
7453      <result> = lshr i8  4, 3   ; yields i8:result = 0
7454      <result> = lshr i8 -2, 1   ; yields i8:result = 0x7F
      <result> = lshr i32 1, 32  ; yields a poison value (shift amount >= bit width)
7456      <result> = lshr <2 x i32> < i32 -2, i32 4>, < i32 1, i32 2>   ; yields: result=<2 x i32> < i32 0x7FFFFFFF, i32 1>
7457
7458'``ashr``' Instruction
7459^^^^^^^^^^^^^^^^^^^^^^
7460
7461Syntax:
7462"""""""
7463
7464::
7465
7466      <result> = ashr <ty> <op1>, <op2>         ; yields ty:result
7467      <result> = ashr exact <ty> <op1>, <op2>   ; yields ty:result
7468
7469Overview:
7470"""""""""
7471
7472The '``ashr``' instruction (arithmetic shift right) returns the first
7473operand shifted to the right a specified number of bits with sign
7474extension.
7475
7476Arguments:
7477""""""""""
7478
7479Both arguments to the '``ashr``' instruction must be the same
7480:ref:`integer <t_integer>` or :ref:`vector <t_vector>` of integer type.
7481'``op2``' is treated as an unsigned value.
7482
7483Semantics:
7484""""""""""
7485
This instruction always performs an arithmetic shift right operation.
The most significant bits of the result will be filled with the sign bit
7488of ``op1``. If ``op2`` is (statically or dynamically) equal to or larger
7489than the number of bits in ``op1``, this instruction returns a :ref:`poison
7490value <poisonvalues>`. If the arguments are vectors, each vector element
7491of ``op1`` is shifted by the corresponding shift amount in ``op2``.
7492
7493If the ``exact`` keyword is present, the result value of the ``ashr`` is
7494a poison value if any of the bits shifted out are non-zero.
7495
7496Example:
7497""""""""
7498
7499.. code-block:: text
7500
7501      <result> = ashr i32 4, 1   ; yields i32:result = 2
7502      <result> = ashr i32 4, 2   ; yields i32:result = 1
7503      <result> = ashr i8  4, 3   ; yields i8:result = 0
7504      <result> = ashr i8 -2, 1   ; yields i8:result = -1
      <result> = ashr i32 1, 32  ; yields a poison value (shift amount >= bit width)
7506      <result> = ashr <2 x i32> < i32 -2, i32 4>, < i32 1, i32 3>   ; yields: result=<2 x i32> < i32 -1, i32 0>
7507
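To contrast the two right shifts and the ``exact`` keyword, here is a short
sketch on ``i8`` constants. The function names are hypothetical, and
``@bytes_to_words`` assumes its argument is a multiple of 4.

.. code-block:: llvm

      define void @shr_compare() {
        %l  = lshr i8 -16, 2         ; 0xF0 >> 2 with zero fill = 0x3C (60)
        %a  = ashr i8 -16, 2         ; 0xF0 >> 2 with sign fill = 0xFC (-4)
        %e1 = lshr exact i8 12, 2    ; 3: both bits shifted out are zero
        %e2 = lshr exact i8 13, 2    ; poison: a one bit is shifted out
        %e3 = ashr exact i8 -8, 3    ; -1: all bits shifted out are zero
        ret void
      }

      define i32 @bytes_to_words(i32 %bytes) {
        ; If %bytes is not a multiple of 4, the result is poison.
        %words = ashr exact i32 %bytes, 2
        ret i32 %words
      }
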
7508'``and``' Instruction
7509^^^^^^^^^^^^^^^^^^^^^
7510
7511Syntax:
7512"""""""
7513
7514::
7515
7516      <result> = and <ty> <op1>, <op2>   ; yields ty:result
7517
7518Overview:
7519"""""""""
7520
7521The '``and``' instruction returns the bitwise logical and of its two
7522operands.
7523
7524Arguments:
7525""""""""""
7526
7527The two arguments to the '``and``' instruction must be
7528:ref:`integer <t_integer>` or :ref:`vector <t_vector>` of integer values. Both
7529arguments must have identical types.
7530
7531Semantics:
7532""""""""""
7533
7534The truth table used for the '``and``' instruction is:
7535
7536+-----+-----+-----+
7537| In0 | In1 | Out |
7538+-----+-----+-----+
7539|   0 |   0 |   0 |
7540+-----+-----+-----+
7541|   0 |   1 |   0 |
7542+-----+-----+-----+
7543|   1 |   0 |   0 |
7544+-----+-----+-----+
7545|   1 |   1 |   1 |
7546+-----+-----+-----+
7547
7548Example:
7549""""""""
7550
7551.. code-block:: text
7552
7553      <result> = and i32 4, %var         ; yields i32:result = 4 & %var
7554      <result> = and i32 15, 40          ; yields i32:result = 8
7555      <result> = and i32 4, 8            ; yields i32:result = 0
7556
7557'``or``' Instruction
7558^^^^^^^^^^^^^^^^^^^^
7559
7560Syntax:
7561"""""""
7562
7563::
7564
7565      <result> = or <ty> <op1>, <op2>   ; yields ty:result
7566
7567Overview:
7568"""""""""
7569
7570The '``or``' instruction returns the bitwise logical inclusive or of its
7571two operands.
7572
7573Arguments:
7574""""""""""
7575
7576The two arguments to the '``or``' instruction must be
7577:ref:`integer <t_integer>` or :ref:`vector <t_vector>` of integer values. Both
7578arguments must have identical types.
7579
7580Semantics:
7581""""""""""
7582
7583The truth table used for the '``or``' instruction is:
7584
7585+-----+-----+-----+
7586| In0 | In1 | Out |
7587+-----+-----+-----+
7588|   0 |   0 |   0 |
7589+-----+-----+-----+
7590|   0 |   1 |   1 |
7591+-----+-----+-----+
7592|   1 |   0 |   1 |
7593+-----+-----+-----+
7594|   1 |   1 |   1 |
7595+-----+-----+-----+
7596
7597Example:
7598""""""""
7599
.. code-block:: text
7601
7602      <result> = or i32 4, %var         ; yields i32:result = 4 | %var
7603      <result> = or i32 15, 40          ; yields i32:result = 47
7604      <result> = or i32 4, 8            ; yields i32:result = 12
7605
7606'``xor``' Instruction
7607^^^^^^^^^^^^^^^^^^^^^
7608
7609Syntax:
7610"""""""
7611
7612::
7613
7614      <result> = xor <ty> <op1>, <op2>   ; yields ty:result
7615
7616Overview:
7617"""""""""
7618
7619The '``xor``' instruction returns the bitwise logical exclusive or of
7620its two operands. The ``xor`` is used to implement the "one's
7621complement" operation, which is the "~" operator in C.
7622
7623Arguments:
7624""""""""""
7625
7626The two arguments to the '``xor``' instruction must be
7627:ref:`integer <t_integer>` or :ref:`vector <t_vector>` of integer values. Both
7628arguments must have identical types.
7629
7630Semantics:
7631""""""""""
7632
7633The truth table used for the '``xor``' instruction is:
7634
7635+-----+-----+-----+
7636| In0 | In1 | Out |
7637+-----+-----+-----+
7638|   0 |   0 |   0 |
7639+-----+-----+-----+
7640|   0 |   1 |   1 |
7641+-----+-----+-----+
7642|   1 |   0 |   1 |
7643+-----+-----+-----+
7644|   1 |   1 |   0 |
7645+-----+-----+-----+
7646
7647Example:
7648""""""""
7649
7650.. code-block:: text
7651
7652      <result> = xor i32 4, %var         ; yields i32:result = 4 ^ %var
7653      <result> = xor i32 15, 40          ; yields i32:result = 39
7654      <result> = xor i32 4, 8            ; yields i32:result = 12
7655      <result> = xor i32 %V, -1          ; yields i32:result = ~%V
7656
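The three logical instructions are often combined to manipulate individual
bits. The sketch below sets, clears, and toggles bits in a flags word; the
function name and the bit positions are arbitrary choices for illustration.

.. code-block:: llvm

      define i32 @update_flags(i32 %flags) {
        %set     = or  i32 %flags, 4       ; set bit 2
        %cleared = and i32 %set, -2        ; clear bit 0 (-2 is ~1)
        %toggled = xor i32 %cleared, 8     ; toggle bit 3
        ret i32 %toggled                   ; for %flags = 1 this is 12
      }
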
7657Vector Operations
7658-----------------
7659
7660LLVM supports several instructions to represent vector operations in a
7661target-independent manner. These instructions cover the element-access
7662and vector-specific operations needed to process vectors effectively.
7663While LLVM does directly support these vector operations, many
7664sophisticated algorithms will want to use target-specific intrinsics to
7665take full advantage of a specific target.
7666
7667.. _i_extractelement:
7668
7669'``extractelement``' Instruction
7670^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
7671
7672Syntax:
7673"""""""
7674
7675::
7676
7677      <result> = extractelement <n x <ty>> <val>, <ty2> <idx>  ; yields <ty>
7678
7679Overview:
7680"""""""""
7681
7682The '``extractelement``' instruction extracts a single scalar element
7683from a vector at a specified index.
7684
7685Arguments:
7686""""""""""
7687
7688The first operand of an '``extractelement``' instruction is a value of
7689:ref:`vector <t_vector>` type. The second operand is an index indicating
7690the position from which to extract the element. The index may be a
7691variable of any integer type.
7692
7693Semantics:
7694""""""""""
7695
7696The result is a scalar of the same type as the element type of ``val``.
7697Its value is the value at position ``idx`` of ``val``. If ``idx``
7698exceeds the length of ``val``, the result is a
7699:ref:`poison value <poisonvalues>`.
7700
7701Example:
7702""""""""
7703
7704.. code-block:: text
7705
7706      <result> = extractelement <4 x i32> %vec, i32 0    ; yields i32
7707
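A slightly larger sketch: the first function sums the four lanes of a
vector using constant indices, and the second reads a lane selected at run
time. Both function names are hypothetical, and masking the index with 3 is
an assumed policy that keeps it within the vector length.

.. code-block:: llvm

      define i32 @hsum(<4 x i32> %v) {
        %e0 = extractelement <4 x i32> %v, i32 0
        %e1 = extractelement <4 x i32> %v, i32 1
        %e2 = extractelement <4 x i32> %v, i32 2
        %e3 = extractelement <4 x i32> %v, i32 3
        %s01 = add i32 %e0, %e1
        %s23 = add i32 %e2, %e3
        %sum = add i32 %s01, %s23
        ret i32 %sum
      }

      define i32 @lane(<4 x i32> %v, i64 %i) {
        %i.m = and i64 %i, 3                           ; index is now 0..3
        %e = extractelement <4 x i32> %v, i64 %i.m     ; index type may differ from i32
        ret i32 %e
      }
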
7708.. _i_insertelement:
7709
7710'``insertelement``' Instruction
7711^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
7712
7713Syntax:
7714"""""""
7715
7716::
7717
7718      <result> = insertelement <n x <ty>> <val>, <ty> <elt>, <ty2> <idx>    ; yields <n x <ty>>
7719
7720Overview:
7721"""""""""
7722
7723The '``insertelement``' instruction inserts a scalar element into a
7724vector at a specified index.
7725
7726Arguments:
7727""""""""""
7728
7729The first operand of an '``insertelement``' instruction is a value of
7730:ref:`vector <t_vector>` type. The second operand is a scalar value whose
7731type must equal the element type of the first operand. The third operand
7732is an index indicating the position at which to insert the value. The
7733index may be a variable of any integer type.
7734
7735Semantics:
7736""""""""""
7737
7738The result is a vector of the same type as ``val``. Its element values
7739are those of ``val`` except at position ``idx``, where it gets the value
7740``elt``. If ``idx`` exceeds the length of ``val``, the result
7741is a :ref:`poison value <poisonvalues>`.
7742
7743Example:
7744""""""""
7745
7746.. code-block:: text
7747
7748      <result> = insertelement <4 x i32> %vec, i32 1, i32 0    ; yields <4 x i32>
7749
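Because the result is a new vector value rather than an in-place update,
several insertions are chained through intermediate values. As a sketch,
the hypothetical function below swaps lanes 0 and 1 of its argument.

.. code-block:: llvm

      define <4 x i32> @swap01(<4 x i32> %v) {
        %e0 = extractelement <4 x i32> %v, i32 0
        %e1 = extractelement <4 x i32> %v, i32 1
        %t  = insertelement <4 x i32> %v, i32 %e1, i32 0   ; lane 0 gets old lane 1
        %r  = insertelement <4 x i32> %t, i32 %e0, i32 1   ; lane 1 gets old lane 0
        ret <4 x i32> %r
      }
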
7750.. _i_shufflevector:
7751
7752'``shufflevector``' Instruction
7753^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
7754
7755Syntax:
7756"""""""
7757
7758::
7759
7760      <result> = shufflevector <n x <ty>> <v1>, <n x <ty>> <v2>, <m x i32> <mask>    ; yields <m x <ty>>
7761
7762Overview:
7763"""""""""
7764
7765The '``shufflevector``' instruction constructs a permutation of elements
7766from two input vectors, returning a vector with the same element type as
7767the input and length that is the same as the shuffle mask.
7768
7769Arguments:
7770""""""""""
7771
7772The first two operands of a '``shufflevector``' instruction are vectors
7773with the same type. The third argument is a shuffle mask whose element
7774type is always 'i32'. The result of the instruction is a vector whose
7775length is the same as the shuffle mask and whose element type is the
7776same as the element type of the first two operands.
7777
7778The shuffle mask operand is required to be a constant vector with either
7779constant integer or undef values.
7780
7781Semantics:
7782""""""""""
7783
7784The elements of the two input vectors are numbered from left to right
7785across both of the vectors. The shuffle mask operand specifies, for each
7786element of the result vector, which element of the two input vectors the
7787result element gets. If the shuffle mask is undef, the result vector is
7788undef. If any element of the mask operand is undef, that element of the
7789result is undef. If the shuffle mask selects an undef element from one
7790of the input vectors, the resulting element is undef.
7791
7792Example:
7793""""""""
7794
7795.. code-block:: text
7796
7797      <result> = shufflevector <4 x i32> %v1, <4 x i32> %v2,
7798                              <4 x i32> <i32 0, i32 4, i32 1, i32 5>  ; yields <4 x i32>
7799      <result> = shufflevector <4 x i32> %v1, <4 x i32> undef,
7800                              <4 x i32> <i32 0, i32 1, i32 2, i32 3>  ; yields <4 x i32> - Identity shuffle.
7801      <result> = shufflevector <8 x i32> %v1, <8 x i32> undef,
7802                              <4 x i32> <i32 0, i32 1, i32 2, i32 3>  ; yields <4 x i32>
7803      <result> = shufflevector <4 x i32> %v1, <4 x i32> %v2,
7804                              <8 x i32> <i32 0, i32 1, i32 2, i32 3, i32 4, i32 5, i32 6, i32 7 >  ; yields <8 x i32>
7805
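Three common mask patterns, given as a sketch with hypothetical function
names. With two ``<4 x i32>`` inputs, mask values 0 to 3 select from the
first vector and 4 to 7 select from the second.

.. code-block:: llvm

      ; Reverse the lanes of one vector; the second input is unused.
      define <4 x i32> @reverse(<4 x i32> %v) {
        %r = shufflevector <4 x i32> %v, <4 x i32> undef,
                           <4 x i32> <i32 3, i32 2, i32 1, i32 0>
        ret <4 x i32> %r
      }

      ; Take the low half of %a followed by the low half of %b.
      define <4 x i32> @low_halves(<4 x i32> %a, <4 x i32> %b) {
        %r = shufflevector <4 x i32> %a, <4 x i32> %b,
                           <4 x i32> <i32 0, i32 1, i32 4, i32 5>
        ret <4 x i32> %r
      }

      ; Broadcast a scalar: insert it into lane 0, then select lane 0 everywhere.
      define <4 x i32> @splat(i32 %x) {
        %ins = insertelement <4 x i32> undef, i32 %x, i32 0
        %s = shufflevector <4 x i32> %ins, <4 x i32> undef,
                           <4 x i32> zeroinitializer
        ret <4 x i32> %s
      }
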
7806Aggregate Operations
7807--------------------
7808
7809LLVM supports several instructions for working with
7810:ref:`aggregate <t_aggregate>` values.
7811
7812.. _i_extractvalue:
7813
7814'``extractvalue``' Instruction
7815^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
7816
7817Syntax:
7818"""""""
7819
7820::
7821
7822      <result> = extractvalue <aggregate type> <val>, <idx>{, <idx>}*
7823
7824Overview:
7825"""""""""
7826
7827The '``extractvalue``' instruction extracts the value of a member field
7828from an :ref:`aggregate <t_aggregate>` value.
7829
7830Arguments:
7831""""""""""
7832
7833The first operand of an '``extractvalue``' instruction is a value of
7834:ref:`struct <t_struct>` or :ref:`array <t_array>` type. The other operands are
7835constant indices to specify which value to extract in a similar manner
7836as indices in a '``getelementptr``' instruction.
7837
7838The major differences to ``getelementptr`` indexing are:
7839
7840-  Since the value being indexed is not a pointer, the first index is
7841   omitted and assumed to be zero.
7842-  At least one index must be specified.
7843-  Not only struct indices but also array indices must be in bounds.
7844
7845Semantics:
7846""""""""""
7847
7848The result is the value at the position in the aggregate specified by
7849the index operands.
7850
7851Example:
7852""""""""
7853
7854.. code-block:: text
7855
7856      <result> = extractvalue {i32, float} %agg, 0    ; yields i32
7857
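Two further sketches with hypothetical function names: the first uses two
indices to reach into a nested aggregate, and the second unpacks the
``{i32, i1}`` pair returned by the ``llvm.sadd.with.overflow.i32``
intrinsic. Returning 0 on overflow is an arbitrary choice for illustration.

.. code-block:: llvm

      define float @second_float({i32, [2 x float]} %agg) {
        ; Index 1 selects the array member, then index 1 selects its second element.
        %f = extractvalue {i32, [2 x float]} %agg, 1, 1
        ret float %f
      }

      declare {i32, i1} @llvm.sadd.with.overflow.i32(i32, i32)

      define i32 @add_or_zero(i32 %a, i32 %b) {
        %res = call {i32, i1} @llvm.sadd.with.overflow.i32(i32 %a, i32 %b)
        %sum = extractvalue {i32, i1} %res, 0    ; the wrapped sum
        %ovf = extractvalue {i32, i1} %res, 1    ; the overflow bit
        %out = select i1 %ovf, i32 0, i32 %sum
        ret i32 %out
      }
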
7858.. _i_insertvalue:
7859
7860'``insertvalue``' Instruction
7861^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
7862
7863Syntax:
7864"""""""
7865
7866::
7867
7868      <result> = insertvalue <aggregate type> <val>, <ty> <elt>, <idx>{, <idx>}*    ; yields <aggregate type>
7869
7870Overview:
7871"""""""""
7872
7873The '``insertvalue``' instruction inserts a value into a member field in
7874an :ref:`aggregate <t_aggregate>` value.
7875
7876Arguments:
7877""""""""""
7878
7879The first operand of an '``insertvalue``' instruction is a value of
7880:ref:`struct <t_struct>` or :ref:`array <t_array>` type. The second operand is
7881a first-class value to insert. The following operands are constant
7882indices indicating the position at which to insert the value in a
similar manner as indices in an '``extractvalue``' instruction. The value
7884to insert must have the same type as the value identified by the
7885indices.
7886
7887Semantics:
7888""""""""""
7889
7890The result is an aggregate of the same type as ``val``. Its value is
7891that of ``val`` except that the value at the position specified by the
7892indices is that of ``elt``.
7893
7894Example:
7895""""""""
7896
7897.. code-block:: llvm
7898
7899      %agg1 = insertvalue {i32, float} undef, i32 1, 0              ; yields {i32 1, float undef}
7900      %agg2 = insertvalue {i32, float} %agg1, float %val, 1         ; yields {i32 1, float %val}
7901      %agg3 = insertvalue {i32, {float}} undef, float %val, 1, 0    ; yields {i32 undef, {float %val}}
7902
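A typical use is building a struct to return by value, one member at a
time, starting from ``undef``. The function name ``@make_pair`` is a
hypothetical example.

.. code-block:: llvm

      define {i32, float} @make_pair(i32 %i, float %f) {
        %p0 = insertvalue {i32, float} undef, i32 %i, 0     ; {i32 %i, float undef}
        %p1 = insertvalue {i32, float} %p0, float %f, 1     ; {i32 %i, float %f}
        ret {i32, float} %p1
      }
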
7903.. _memoryops:
7904
7905Memory Access and Addressing Operations
7906---------------------------------------
7907
7908A key design point of an SSA-based representation is how it represents
7909memory. In LLVM, no memory locations are in SSA form, which makes things
7910very simple. This section describes how to read, write, and allocate
7911memory in LLVM.
7912
7913.. _i_alloca:
7914
7915'``alloca``' Instruction
7916^^^^^^^^^^^^^^^^^^^^^^^^
7917
7918Syntax:
7919"""""""
7920
7921::
7922
7923      <result> = alloca [inalloca] <type> [, <ty> <NumElements>] [, align <alignment>] [, addrspace(<num>)]     ; yields type addrspace(num)*:result
7924
7925Overview:
7926"""""""""
7927
7928The '``alloca``' instruction allocates memory on the stack frame of the
7929currently executing function, to be automatically released when this
7930function returns to its caller. The object is always allocated in the
7931address space for allocas indicated in the datalayout.
7932
7933Arguments:
7934""""""""""
7935
The '``alloca``' instruction allocates ``sizeof(<type>)*NumElements``
bytes of memory on the runtime stack, returning a pointer of the
appropriate type to the program. If "NumElements" is specified, it is
the number of elements allocated; otherwise "NumElements" defaults
to one. If a constant alignment is specified, the result of the
allocation is guaranteed to be aligned to at least that boundary. The
alignment may not be greater than ``1 << 29``. If not specified, or if
zero, the target can choose to align the allocation on any convenient
boundary compatible with the type.
7945
7946'``type``' may be any sized type.
7947
7948Semantics:
7949""""""""""
7950
Memory is allocated; a pointer is returned. The operation is undefined
if there is insufficient stack space for the allocation. The
'``alloca``' instruction is commonly used to represent automatic
variables that must have an address available. When the function returns
(either with the ``ret`` or ``resume`` instructions), the memory is
automatically reclaimed. Allocating zero bytes is legal, but the returned
pointer may not be unique. The order in which memory is allocated (i.e.,
which way the stack grows) is not specified.
7960
7961Example:
7962""""""""
7963
7964.. code-block:: llvm
7965
7966      %ptr = alloca i32                             ; yields i32*:ptr
7967      %ptr = alloca i32, i32 4                      ; yields i32*:ptr
7968      %ptr = alloca i32, i32 4, align 1024          ; yields i32*:ptr
7969      %ptr = alloca i32, align 1024                 ; yields i32*:ptr
7970
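As a sketch of the "automatic variable with an address" case, the function
below allocates a stack slot, passes its address to a callee, and reads the
slot back. ``@fill`` is a hypothetical external function, and ``align 4``
assumes a target where that is the natural alignment of ``i32``.

.. code-block:: llvm

      declare void @fill(i32*)                 ; hypothetical: may write through its argument

      define i32 @use_local() {
        %slot = alloca i32, align 4            ; one i32 in this function's frame
        store i32 0, i32* %slot, align 4       ; initialize before the address escapes
        call void @fill(i32* %slot)
        %v = load i32, i32* %slot, align 4
        ret i32 %v                             ; the slot is released on return
      }
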
7971.. _i_load:
7972
7973'``load``' Instruction
7974^^^^^^^^^^^^^^^^^^^^^^
7975
7976Syntax:
7977"""""""
7978
7979::
7980
7981      <result> = load [volatile] <ty>, <ty>* <pointer>[, align <alignment>][, !nontemporal !<index>][, !invariant.load !<index>][, !invariant.group !<index>][, !nonnull !<index>][, !dereferenceable !<deref_bytes_node>][, !dereferenceable_or_null !<deref_bytes_node>][, !align !<align_node>]
7982      <result> = load atomic [volatile] <ty>, <ty>* <pointer> [syncscope("<target-scope>")] <ordering>, align <alignment> [, !invariant.group !<index>]
7983      !<index> = !{ i32 1 }
7984      !<deref_bytes_node> = !{i64 <dereferenceable_bytes>}
7985      !<align_node> = !{ i64 <value_alignment> }
7986
7987Overview:
7988"""""""""
7989
7990The '``load``' instruction is used to read from memory.
7991
7992Arguments:
7993""""""""""
7994
7995The argument to the ``load`` instruction specifies the memory address from which
7996to load. The type specified must be a :ref:`first class <t_firstclass>` type of
7997known size (i.e. not containing an :ref:`opaque structural type <t_opaque>`). If
7998the ``load`` is marked as ``volatile``, then the optimizer is not allowed to
7999modify the number or order of execution of this ``load`` with other
8000:ref:`volatile operations <volatile>`.
8001
8002If the ``load`` is marked as ``atomic``, it takes an extra :ref:`ordering
8003<ordering>` and optional ``syncscope("<target-scope>")`` argument. The
8004``release`` and ``acq_rel`` orderings are not valid on ``load`` instructions.
8005Atomic loads produce :ref:`defined <memmodel>` results when they may see
8006multiple atomic stores. The type of the pointee must be an integer, pointer, or
8007floating-point type whose bit width is a power of two greater than or equal to
8008eight and less than or equal to a target-specific size limit.  ``align`` must be
8009explicitly specified on atomic loads, and the load has undefined behavior if the
8010alignment is not set to a value which is at least the size in bytes of the
8011pointee. ``!nontemporal`` does not have any defined semantics for atomic loads.
8012
8013The optional constant ``align`` argument specifies the alignment of the
8014operation (that is, the alignment of the memory address). A value of 0
8015or an omitted ``align`` argument means that the operation has the ABI
8016alignment for the target. It is the responsibility of the code emitter
8017to ensure that the alignment information is correct. Overestimating the
8018alignment results in undefined behavior. Underestimating the alignment
8019may produce less efficient code. An alignment of 1 is always safe. The
8020maximum possible alignment is ``1 << 29``. An alignment value higher
8021than the size of the loaded type implies memory up to the alignment
8022value bytes can be safely loaded without trapping in the default
address space. Accessing the high bytes can interfere with debugging
tools, so they should not be accessed if the function has the
``sanitize_thread`` or ``sanitize_address`` attributes.
8026
8027The optional ``!nontemporal`` metadata must reference a single
8028metadata name ``<index>`` corresponding to a metadata node with one
8029``i32`` entry of value 1. The existence of the ``!nontemporal``
8030metadata on the instruction tells the optimizer and code generator
8031that this load is not expected to be reused in the cache. The code
8032generator may select special instructions to save cache bandwidth, such
8033as the ``MOVNT`` instruction on x86.
8034
8035The optional ``!invariant.load`` metadata must reference a single
8036metadata name ``<index>`` corresponding to a metadata node with no
8037entries. If a load instruction tagged with the ``!invariant.load``
8038metadata is executed, the optimizer may assume the memory location
8039referenced by the load contains the same value at all points in the
8040program where the memory location is known to be dereferenceable;
8041otherwise, the behavior is undefined.
8042
The optional ``!invariant.group`` metadata must reference a single metadata name
``<index>`` corresponding to a metadata node with no entries.
See ``invariant.group`` metadata.
8046
8047The optional ``!nonnull`` metadata must reference a single
8048metadata name ``<index>`` corresponding to a metadata node with no
8049entries. The existence of the ``!nonnull`` metadata on the
8050instruction tells the optimizer that the value loaded is known to
8051never be null. If the value is null at runtime, the behavior is undefined.
8052This is analogous to the ``nonnull`` attribute on parameters and return
8053values. This metadata can only be applied to loads of a pointer type.
8054
8055The optional ``!dereferenceable`` metadata must reference a single metadata
8056name ``<deref_bytes_node>`` corresponding to a metadata node with one ``i64``
8057entry. The existence of the ``!dereferenceable`` metadata on the instruction
8058tells the optimizer that the value loaded is known to be dereferenceable.
8059The number of bytes known to be dereferenceable is specified by the integer
value in the metadata node. This is analogous to the ``dereferenceable``
8061attribute on parameters and return values. This metadata can only be applied
8062to loads of a pointer type.
8063
8064The optional ``!dereferenceable_or_null`` metadata must reference a single
8065metadata name ``<deref_bytes_node>`` corresponding to a metadata node with one
8066``i64`` entry. The existence of the ``!dereferenceable_or_null`` metadata on the
8067instruction tells the optimizer that the value loaded is known to be either
8068dereferenceable or null.
8069The number of bytes known to be dereferenceable is specified by the integer
value in the metadata node. This is analogous to the ``dereferenceable_or_null``
8071attribute on parameters and return values. This metadata can only be applied
8072to loads of a pointer type.
8073
8074The optional ``!align`` metadata must reference a single metadata name
8075``<align_node>`` corresponding to a metadata node with one ``i64`` entry.
8076The existence of the ``!align`` metadata on the instruction tells the
8077optimizer that the value loaded is known to be aligned to a boundary specified
8078by the integer value in the metadata node. The alignment must be a power of 2.
This is analogous to the ``align`` attribute on parameters and return values.
8080This metadata can only be applied to loads of a pointer type. If the returned
8081value is not appropriately aligned at runtime, the behavior is undefined.
8082
8083Semantics:
8084""""""""""
8085
8086The location of memory pointed to is loaded. If the value being loaded
8087is of scalar type then the number of bytes read does not exceed the
8088minimum number of bytes needed to hold all bits of the type. For
8089example, loading an ``i24`` reads at most three bytes. When loading a
8090value of a type like ``i20`` with a size that is not an integral number
8091of bytes, the result is undefined if the value was not originally
8092written using a store of the same type.
8093
8094Examples:
8095"""""""""
8096
8097.. code-block:: llvm
8098
8099      %ptr = alloca i32                               ; yields i32*:ptr
8100      store i32 3, i32* %ptr                          ; yields void
8101      %val = load i32, i32* %ptr                      ; yields i32:val = i32 3
8102
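The sketch below puts several of the forms described above side by side.
The function name and the metadata node numbers are hypothetical, and the
alignments assume a target where ``i32`` is 4-byte aligned and pointers are
8-byte aligned.

.. code-block:: llvm

      define i32 @load_forms(i32* %p, i32** %pp) {
        %a = load i32, i32* %p, align 4                    ; plain load
        %b = load volatile i32, i32* %p, align 4           ; not reordered with other volatile operations
        %c = load atomic i32, i32* %p acquire, align 4     ; atomic: ordering and align are required
        ; Pointer load annotated as non-null and dereferenceable for 4 bytes.
        %q = load i32*, i32** %pp, align 8, !nonnull !0, !dereferenceable !1
        %d = load i32, i32* %q, align 4, !nontemporal !2   ; hint: not expected to be reused in cache
        %ab = add i32 %a, %b
        %cd = add i32 %c, %d
        %r  = add i32 %ab, %cd
        ret i32 %r
      }

      !0 = !{}            ; empty node for !nonnull
      !1 = !{i64 4}       ; number of dereferenceable bytes
      !2 = !{i32 1}       ; node for !nontemporal
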
8103.. _i_store:
8104
8105'``store``' Instruction
8106^^^^^^^^^^^^^^^^^^^^^^^
8107
8108Syntax:
8109"""""""
8110
8111::
8112
8113      store [volatile] <ty> <value>, <ty>* <pointer>[, align <alignment>][, !nontemporal !<index>][, !invariant.group !<index>]        ; yields void
8114      store atomic [volatile] <ty> <value>, <ty>* <pointer> [syncscope("<target-scope>")] <ordering>, align <alignment> [, !invariant.group !<index>] ; yields void
8115
8116Overview:
8117"""""""""
8118
8119The '``store``' instruction is used to write to memory.
8120
8121Arguments:
8122""""""""""
8123
8124There are two arguments to the ``store`` instruction: a value to store and an
8125address at which to store it. The type of the ``<pointer>`` operand must be a
8126pointer to the :ref:`first class <t_firstclass>` type of the ``<value>``
8127operand. If the ``store`` is marked as ``volatile``, then the optimizer is not
8128allowed to modify the number or order of execution of this ``store`` with other
8129:ref:`volatile operations <volatile>`.  Only values of :ref:`first class
8130<t_firstclass>` types of known size (i.e. not containing an :ref:`opaque
8131structural type <t_opaque>`) can be stored.
8132
8133If the ``store`` is marked as ``atomic``, it takes an extra :ref:`ordering
8134<ordering>` and optional ``syncscope("<target-scope>")`` argument. The
8135``acquire`` and ``acq_rel`` orderings aren't valid on ``store`` instructions.
8136Atomic loads produce :ref:`defined <memmodel>` results when they may see
8137multiple atomic stores. The type of the pointee must be an integer, pointer, or
8138floating-point type whose bit width is a power of two greater than or equal to
8139eight and less than or equal to a target-specific size limit.  ``align`` must be
8140explicitly specified on atomic stores, and the store has undefined behavior if
8141the alignment is not set to a value which is at least the size in bytes of the
8142pointee. ``!nontemporal`` does not have any defined semantics for atomic stores.
8143
8144The optional constant ``align`` argument specifies the alignment of the
8145operation (that is, the alignment of the memory address). A value of 0
8146or an omitted ``align`` argument means that the operation has the ABI
8147alignment for the target. It is the responsibility of the code emitter
8148to ensure that the alignment information is correct. Overestimating the
8149alignment results in undefined behavior. Underestimating the
8150alignment may produce less efficient code. An alignment of 1 is always
8151safe. The maximum possible alignment is ``1 << 29``. An alignment
8152value higher than the size of the stored type implies memory up to the
8153alignment value bytes can be stored to without trapping in the default
8154address space. Storing to the higher bytes however may result in data
8155races if another thread can access the same address. Introducing a
8156data race is not allowed. Storing to the extra bytes is not allowed
8157even in situations where a data race is known to not exist if the
8158function has the ``sanitize_address`` attribute.
8159
The optional ``!nontemporal`` metadata must reference a single metadata
name ``<index>`` corresponding to a metadata node with one ``i32`` entry of
value 1. The existence of the ``!nontemporal`` metadata on the instruction
tells the optimizer and code generator that this store is not expected to
be reused in the cache. The code generator may select special
instructions to save cache bandwidth, such as the ``MOVNT`` instruction on
x86.
8167
8168The optional ``!invariant.group`` metadata must reference a
8169single metadata name ``<index>``. See ``invariant.group`` metadata.
8170
8171Semantics:
8172""""""""""
8173
8174The contents of memory are updated to contain ``<value>`` at the
8175location specified by the ``<pointer>`` operand. If ``<value>`` is
8176of scalar type then the number of bytes written does not exceed the
8177minimum number of bytes needed to hold all bits of the type. For
8178example, storing an ``i24`` writes at most three bytes. When writing a
8179value of a type like ``i20`` with a size that is not an integral number
8180of bytes, it is unspecified what happens to the extra bits that do not
8181belong to the type, but they will typically be overwritten.
8182
8183Example:
8184""""""""
8185
8186.. code-block:: llvm
8187
8188      %ptr = alloca i32                               ; yields i32*:ptr
8189      store i32 3, i32* %ptr                          ; yields void
8190      %val = load i32, i32* %ptr                      ; yields i32:val = i32 3
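
As a further illustration, the following sketch combines the optional
``align`` argument and the ``!nontemporal`` metadata described above. The
function name ``@store_examples`` and the chosen alignments are hypothetical
and only serve the example.

.. code-block:: llvm

    ; Hypothetical function; the alignment values are illustrative assumptions.
    define void @store_examples(i32* %p, i8* %q) {
    entry:
      ; Asserts that %p is 4-byte aligned. If it is not, behavior is undefined.
      store i32 42, i32* %p, align 4
      ; An alignment of 1 is always safe.
      store i8 7, i8* %q, align 1
      ; Hints that the stored data is not expected to be reused in the cache.
      store i32 0, i32* %p, align 4, !nontemporal !0
      ret void
    }

    !0 = !{i32 1}

Here ``!0`` is the metadata node with a single ``i32`` entry of value 1 that
``!nontemporal`` is required to reference.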
8191
8192.. _i_fence:
8193
8194'``fence``' Instruction
8195^^^^^^^^^^^^^^^^^^^^^^^
8196
8197Syntax:
8198"""""""
8199
8200::
8201
8202      fence [syncscope("<target-scope>")] <ordering>  ; yields void
8203
8204Overview:
8205"""""""""
8206
8207The '``fence``' instruction is used to introduce happens-before edges
8208between operations.
8209
8210Arguments:
8211""""""""""
8212
8213'``fence``' instructions take an :ref:`ordering <ordering>` argument which
8214defines what *synchronizes-with* edges they add. They can only be given
8215``acquire``, ``release``, ``acq_rel``, and ``seq_cst`` orderings.
8216
8217Semantics:
8218""""""""""
8219
8220A fence A which has (at least) ``release`` ordering semantics
8221*synchronizes with* a fence B with (at least) ``acquire`` ordering
8222semantics if and only if there exist atomic operations X and Y, both
8223operating on some atomic object M, such that A is sequenced before X, X
8224modifies M (either directly or through some side effect of a sequence
8225headed by X), Y is sequenced before B, and Y observes M. This provides a
8226*happens-before* dependency between A and B. Rather than an explicit
8227``fence``, one (but not both) of the atomic operations X or Y might
8228provide a ``release`` or ``acquire`` (resp.) ordering constraint and
8229still *synchronize-with* the explicit ``fence`` and establish the
8230*happens-before* edge.
8231
8232A ``fence`` which has ``seq_cst`` ordering, in addition to having both
8233``acquire`` and ``release`` semantics specified above, participates in
8234the global program order of other ``seq_cst`` operations and/or fences.
8235
8236A ``fence`` instruction can also take an optional
8237":ref:`syncscope <syncscope>`" argument.
8238
8239Example:
8240""""""""
8241
8242.. code-block:: text
8243
8244      fence acquire                                        ; yields void
8245      fence syncscope("singlethread") seq_cst              ; yields void
8246      fence syncscope("agent") seq_cst                     ; yields void
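
The synchronization rule in the Semantics section can be sketched as a
message-passing pattern. The globals ``@data`` and ``@flag`` and the
functions ``@producer`` and ``@consumer`` are hypothetical names, and the
two functions are assumed to run on different threads.

.. code-block:: llvm

    @data = global i32 0
    @flag = global i32 0

    define void @producer() {
    entry:
      store i32 42, i32* @data, align 4
      fence release                                        ; fence A
      store atomic i32 1, i32* @flag monotonic, align 4    ; X modifies M (@flag)
      ret void
    }

    define i32 @consumer() {
    entry:
      br label %spin

    spin:
      %f = load atomic i32, i32* @flag monotonic, align 4  ; Y observes M (@flag)
      %ready = icmp eq i32 %f, 1
      br i1 %ready, label %done, label %spin

    done:
      fence acquire                                        ; fence B
      %v = load i32, i32* @data, align 4
      ret i32 %v
    }

Once the load in ``@consumer`` observes the value written by the atomic store
in ``@producer``, fence A *synchronizes with* fence B, so the non-atomic
store to ``@data`` *happens before* the final load and no data race occurs.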
8247
8248.. _i_cmpxchg:
8249
8250'``cmpxchg``' Instruction
8251^^^^^^^^^^^^^^^^^^^^^^^^^
8252
8253Syntax:
8254"""""""
8255
8256::
8257
8258      cmpxchg [weak] [volatile] <ty>* <pointer>, <ty> <cmp>, <ty> <new> [syncscope("<target-scope>")] <success ordering> <failure ordering> ; yields  { ty, i1 }
8259
8260Overview:
8261"""""""""
8262
8263The '``cmpxchg``' instruction is used to atomically modify memory. It
8264loads a value in memory and compares it to a given value. If they are
8265equal, it tries to store a new value into the memory.
8266
8267Arguments:
8268""""""""""
8269
There are three arguments to the '``cmpxchg``' instruction: an address
to operate on, a value to compare to the value currently at that
address, and a new value to place at that address if the compared values
are equal. The type of '``<cmp>``' must be an integer or pointer type whose
bit width is a power of two greater than or equal to eight and less
than or equal to a target-specific size limit. '``<cmp>``' and '``<new>``' must
have the same type, and the type of '``<pointer>``' must be a pointer to
that type. If the ``cmpxchg`` is marked as ``volatile``, then the
optimizer is not allowed to modify the number or order of execution of
this ``cmpxchg`` with other :ref:`volatile operations <volatile>`.
8280
8281The success and failure :ref:`ordering <ordering>` arguments specify how this
8282``cmpxchg`` synchronizes with other atomic operations. Both ordering parameters
8283must be at least ``monotonic``, the ordering constraint on failure must be no
8284stronger than that on success, and the failure ordering cannot be either
8285``release`` or ``acq_rel``.
8286
8287A ``cmpxchg`` instruction can also take an optional
8288":ref:`syncscope <syncscope>`" argument.
8289
8290The pointer passed into cmpxchg must have alignment greater than or
8291equal to the size in memory of the operand.
8292
8293Semantics:
8294""""""""""
8295
The contents of memory at the location specified by the '``<pointer>``' operand
are read and compared to '``<cmp>``'; if the values are equal, '``<new>``' is
written to the location. The original value at the location is returned,
together with a flag indicating success (true) or failure (false).
8300
8301If the cmpxchg operation is marked as ``weak`` then a spurious failure is
8302permitted: the operation may not write ``<new>`` even if the comparison
8303matched.
8304
8305If the cmpxchg operation is strong (the default), the i1 value is 1 if and only
8306if the value loaded equals ``cmp``.
8307
A successful ``cmpxchg`` is a read-modify-write instruction for the purpose of
identifying release sequences. A failed ``cmpxchg`` is equivalent to an atomic
load with an ordering parameter determined by the second ordering parameter.
8311
8312Example:
8313""""""""
8314
8315.. code-block:: llvm
8316
8317    entry:
8318      %orig = load atomic i32, i32* %ptr unordered, align 4                      ; yields i32
8319      br label %loop
8320
8321    loop:
8322      %cmp = phi i32 [ %orig, %entry ], [%value_loaded, %loop]
8323      %squared = mul i32 %cmp, %cmp
8324      %val_success = cmpxchg i32* %ptr, i32 %cmp, i32 %squared acq_rel monotonic ; yields  { i32, i1 }
8325      %value_loaded = extractvalue { i32, i1 } %val_success, 0
8326      %success = extractvalue { i32, i1 } %val_success, 1
8327      br i1 %success, label %done, label %loop
8328
8329    done:
8330      ...
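
Because a ``weak`` ``cmpxchg`` may fail spuriously, it is normally placed in
a retry loop. The following is a minimal sketch of a spin lock built this
way. The function names and the convention that 0 means "unlocked" and 1
means "locked" are assumptions made for the example.

.. code-block:: llvm

    ; Hypothetical lock word convention: 0 = unlocked, 1 = locked.
    define void @spin_lock(i32* %lock) {
    entry:
      br label %try

    try:
      ; Attempt to change 0 to 1. A weak cmpxchg may fail even if the lock
      ; word was 0, so the loop simply retries.
      %pair = cmpxchg weak i32* %lock, i32 0, i32 1 acquire monotonic
      %ok = extractvalue { i32, i1 } %pair, 1
      br i1 %ok, label %locked, label %try

    locked:
      ret void
    }

    define void @spin_unlock(i32* %lock) {
    entry:
      store atomic i32 0, i32* %lock release, align 4
      ret void
    }

The failure ordering is ``monotonic``, which satisfies the constraints that
it is no stronger than the success ordering and is neither ``release`` nor
``acq_rel``.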
8331
8332.. _i_atomicrmw:
8333
8334'``atomicrmw``' Instruction
8335^^^^^^^^^^^^^^^^^^^^^^^^^^^
8336
8337Syntax:
8338"""""""
8339
8340::
8341
8342      atomicrmw [volatile] <operation> <ty>* <pointer>, <ty> <value> [syncscope("<target-scope>")] <ordering>                   ; yields ty
8343
8344Overview:
8345"""""""""
8346
8347The '``atomicrmw``' instruction is used to atomically modify memory.
8348
8349Arguments:
8350""""""""""
8351
There are three arguments to the '``atomicrmw``' instruction: an
operation to apply, an address whose value to modify, and an argument to
the operation. The operation must be one of the following keywords:
8355
8356-  xchg
8357-  add
8358-  sub
8359-  and
8360-  nand
8361-  or
8362-  xor
8363-  max
8364-  min
8365-  umax
8366-  umin
8367
8368The type of '<value>' must be an integer type whose bit width is a power
8369of two greater than or equal to eight and less than or equal to a
8370target-specific size limit. The type of the '``<pointer>``' operand must
8371be a pointer to that type. If the ``atomicrmw`` is marked as
8372``volatile``, then the optimizer is not allowed to modify the number or
8373order of execution of this ``atomicrmw`` with other :ref:`volatile
8374operations <volatile>`.
8375
An ``atomicrmw`` instruction can also take an optional
8377":ref:`syncscope <syncscope>`" argument.
8378
8379Semantics:
8380""""""""""
8381
8382The contents of memory at the location specified by the '``<pointer>``'
8383operand are atomically read, modified, and written back. The original
8384value at the location is returned. The modification is specified by the
8385operation argument:
8386
8387-  xchg: ``*ptr = val``
8388-  add: ``*ptr = *ptr + val``
8389-  sub: ``*ptr = *ptr - val``
8390-  and: ``*ptr = *ptr & val``
8391-  nand: ``*ptr = ~(*ptr & val)``
8392-  or: ``*ptr = *ptr | val``
8393-  xor: ``*ptr = *ptr ^ val``
8394-  max: ``*ptr = *ptr > val ? *ptr : val`` (using a signed comparison)
8395-  min: ``*ptr = *ptr < val ? *ptr : val`` (using a signed comparison)
8396-  umax: ``*ptr = *ptr > val ? *ptr : val`` (using an unsigned
8397   comparison)
8398-  umin: ``*ptr = *ptr < val ? *ptr : val`` (using an unsigned
8399   comparison)
8400
8401Example:
8402""""""""
8403
8404.. code-block:: llvm
8405
8406      %old = atomicrmw add i32* %ptr, i32 1 acquire                        ; yields i32
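
Two further uses are sketched below: a ticket counter that relies on the
returned original value, and a running maximum that ignores it. The function
names are hypothetical.

.. code-block:: llvm

    ; Returns the value of the counter before the increment.
    define i32 @next_ticket(i32* %counter) {
    entry:
      %old = atomicrmw add i32* %counter, i32 1 seq_cst
      ret i32 %old
    }

    ; Raises *%max to %sample if %sample is larger (unsigned comparison).
    define void @record_max(i32* %max, i32 %sample) {
    entry:
      %prev = atomicrmw umax i32* %max, i32 %sample monotonic
      ret void
    }

In both cases the instruction yields the value that was in memory before the
modification, whether or not the caller uses it.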
8407
8408.. _i_getelementptr:
8409
8410'``getelementptr``' Instruction
8411^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
8412
8413Syntax:
8414"""""""
8415
8416::
8417
8418      <result> = getelementptr <ty>, <ty>* <ptrval>{, [inrange] <ty> <idx>}*
8419      <result> = getelementptr inbounds <ty>, <ty>* <ptrval>{, [inrange] <ty> <idx>}*
8420      <result> = getelementptr <ty>, <ptr vector> <ptrval>, [inrange] <vector index type> <idx>
8421
8422Overview:
8423"""""""""
8424
8425The '``getelementptr``' instruction is used to get the address of a
8426subelement of an :ref:`aggregate <t_aggregate>` data structure. It performs
8427address calculation only and does not access memory. The instruction can also
8428be used to calculate a vector of such addresses.
8429
8430Arguments:
8431""""""""""
8432
8433The first argument is always a type used as the basis for the calculations.
8434The second argument is always a pointer or a vector of pointers, and is the
8435base address to start from. The remaining arguments are indices
8436that indicate which of the elements of the aggregate object are indexed.
8437The interpretation of each index is dependent on the type being indexed
8438into. The first index always indexes the pointer value given as the
8439second argument, the second index indexes a value of the type pointed to
8440(not necessarily the value directly pointed to, since the first index
8441can be non-zero), etc. The first type indexed into must be a pointer
8442value, subsequent types can be arrays, vectors, and structs. Note that
8443subsequent types being indexed into can never be pointers, since that
8444would require loading the pointer before continuing calculation.
8445
8446The type of each index argument depends on the type it is indexing into.
When indexing into an (optionally packed) structure, only ``i32`` integer
8448**constants** are allowed (when using a vector of indices they must all
8449be the **same** ``i32`` integer constant). When indexing into an array,
8450pointer or vector, integers of any width are allowed, and they are not
8451required to be constant. These integers are treated as signed values
8452where relevant.
8453
8454For example, let's consider a C code fragment and how it gets compiled
8455to LLVM:
8456
8457.. code-block:: c
8458
8459    struct RT {
8460      char A;
8461      int B[10][20];
8462      char C;
8463    };
8464    struct ST {
8465      int X;
8466      double Y;
8467      struct RT Z;
8468    };
8469
8470    int *foo(struct ST *s) {
8471      return &s[1].Z.B[5][13];
8472    }
8473
8474The LLVM code generated by Clang is:
8475
8476.. code-block:: llvm
8477
8478    %struct.RT = type { i8, [10 x [20 x i32]], i8 }
8479    %struct.ST = type { i32, double, %struct.RT }
8480
8481    define i32* @foo(%struct.ST* %s) nounwind uwtable readnone optsize ssp {
8482    entry:
8483      %arrayidx = getelementptr inbounds %struct.ST, %struct.ST* %s, i64 1, i32 2, i32 1, i64 5, i64 13
8484      ret i32* %arrayidx
8485    }
8486
8487Semantics:
8488""""""""""
8489
8490In the example above, the first index is indexing into the
8491'``%struct.ST*``' type, which is a pointer, yielding a '``%struct.ST``'
8492= '``{ i32, double, %struct.RT }``' type, a structure. The second index
8493indexes into the third element of the structure, yielding a
8494'``%struct.RT``' = '``{ i8 , [10 x [20 x i32]], i8 }``' type, another
8495structure. The third index indexes into the second element of the
8496structure, yielding a '``[10 x [20 x i32]]``' type, an array. The two
8497dimensions of the array are subscripted into, yielding an '``i32``'
8498type. The '``getelementptr``' instruction returns a pointer to this
8499element, thus computing a value of '``i32*``' type.
8500
8501Note that it is perfectly legal to index partially through a structure,
8502returning a pointer to an inner element. Because of this, the LLVM code
8503for the given testcase is equivalent to:
8504
8505.. code-block:: llvm
8506
8507    define i32* @foo(%struct.ST* %s) {
8508      %t1 = getelementptr %struct.ST, %struct.ST* %s, i32 1                        ; yields %struct.ST*:%t1
8509      %t2 = getelementptr %struct.ST, %struct.ST* %t1, i32 0, i32 2                ; yields %struct.RT*:%t2
8510      %t3 = getelementptr %struct.RT, %struct.RT* %t2, i32 0, i32 1                ; yields [10 x [20 x i32]]*:%t3
8511      %t4 = getelementptr [10 x [20 x i32]], [10 x [20 x i32]]* %t3, i32 0, i32 5  ; yields [20 x i32]*:%t4
8512      %t5 = getelementptr [20 x i32], [20 x i32]* %t4, i32 0, i32 13               ; yields i32*:%t5
8513      ret i32* %t5
8514    }
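
The same address can also be expressed as a single byte offset, which makes
the arithmetic explicit. This sketch assumes a typical x86-64 data layout
(``i32`` aligned to 4 bytes, ``double`` aligned to 8 bytes, no packing);
under that assumption ``%struct.RT`` occupies 808 bytes with field 1 at
offset 4, and ``%struct.ST`` occupies 824 bytes with field 2 at offset 16.
The total offset is then ``1*824 + 16 + 4 + 5*80 + 13*4 = 1296`` bytes. The
function name ``@foo_bytes`` is hypothetical, and the constant would differ
on targets with another layout.

.. code-block:: llvm

    %struct.RT = type { i8, [10 x [20 x i32]], i8 }
    %struct.ST = type { i32, double, %struct.RT }

    ; Assumes the x86-64 layout described above; 1296 is not portable.
    define i32* @foo_bytes(%struct.ST* %s) {
    entry:
      %base = bitcast %struct.ST* %s to i8*
      %addr = getelementptr i8, i8* %base, i64 1296
      %res = bitcast i8* %addr to i32*
      ret i32* %res
    }

Writing the indices symbolically, as in the earlier forms, leaves the layout
computation to LLVM and keeps the code independent of the target.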
8515
8516If the ``inbounds`` keyword is present, the result value of the
8517``getelementptr`` is a :ref:`poison value <poisonvalues>` if the base
8518pointer is not an *in bounds* address of an allocated object, or if any
8519of the addresses that would be formed by successive addition of the
8520offsets implied by the indices to the base address with infinitely
8521precise signed arithmetic are not an *in bounds* address of that
8522allocated object. The *in bounds* addresses for an allocated object are
8523all the addresses that point into the object, plus the address one byte
8524past the end. The only *in bounds* address for a null pointer in the
8525default address-space is the null pointer itself. In cases where the
8526base is a vector of pointers the ``inbounds`` keyword applies to each
8527of the computations element-wise.
8528
8529If the ``inbounds`` keyword is not present, the offsets are added to the
8530base address with silently-wrapping two's complement arithmetic. If the
8531offsets have a different width from the pointer, they are sign-extended
8532or truncated to the width of the pointer. The result value of the
8533``getelementptr`` may be outside the object pointed to by the base
8534pointer. The result value may not necessarily be used to access memory
8535though, even if it happens to point into allocated storage. See the
8536:ref:`Pointer Aliasing Rules <pointeraliasing>` section for more
8537information.
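
The difference between the two forms can be sketched with a small array. The
global ``@arr`` and the function name are hypothetical.

.. code-block:: llvm

    @arr = global [4 x i32] zeroinitializer

    define void @inbounds_examples() {
    entry:
      ; One past the end of @arr is still an in bounds address.
      %end = getelementptr inbounds [4 x i32], [4 x i32]* @arr, i64 0, i64 4
      ; Two past the end is not in bounds, so this result is a poison value.
      %bad = getelementptr inbounds [4 x i32], [4 x i32]* @arr, i64 0, i64 5
      ; Without inbounds the same computation yields a well-defined address,
      ; although it may not be used to access memory.
      %far = getelementptr [4 x i32], [4 x i32]* @arr, i64 0, i64 5
      ret void
    }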
8538
8539If the ``inrange`` keyword is present before any index, loading from or
8540storing to any pointer derived from the ``getelementptr`` has undefined
8541behavior if the load or store would access memory outside of the bounds of
8542the element selected by the index marked as ``inrange``. The result of a
8543pointer comparison or ``ptrtoint`` (including ``ptrtoint``-like operations
8544involving memory) involving a pointer derived from a ``getelementptr`` with
8545the ``inrange`` keyword is undefined, with the exception of comparisons
8546in the case where both operands are in the range of the element selected
8547by the ``inrange`` keyword, inclusive of the address one past the end of
8548that element. Note that the ``inrange`` keyword is currently only allowed
8549in constant ``getelementptr`` expressions.
8550
8551The getelementptr instruction is often confusing. For some more insight
8552into how it works, see :doc:`the getelementptr FAQ <GetElementPtr>`.
8553
8554Example:
8555""""""""
8556
8557.. code-block:: llvm
8558
8559        ; yields [12 x i8]*:aptr
8560        %aptr = getelementptr {i32, [12 x i8]}, {i32, [12 x i8]}* %saptr, i64 0, i32 1
8561        ; yields i8*:vptr
8562        %vptr = getelementptr {i32, <2 x i8>}, {i32, <2 x i8>}* %svptr, i64 0, i32 1, i32 1
8563        ; yields i8*:eptr
8564        %eptr = getelementptr [12 x i8], [12 x i8]* %aptr, i64 0, i32 1
8565        ; yields i32*:iptr
8566        %iptr = getelementptr [10 x i32], [10 x i32]* @arr, i16 0, i16 0
8567
8568Vector of pointers:
8569"""""""""""""""""""
8570
8571The ``getelementptr`` returns a vector of pointers, instead of a single address,
8572when one or more of its arguments is a vector. In such cases, all vector
8573arguments should have the same number of elements, and every scalar argument
8574will be effectively broadcast into a vector during address calculation.
8575
8576.. code-block:: llvm
8577
8578     ; All arguments are vectors:
8579     ;   A[i] = ptrs[i] + offsets[i]*sizeof(i8)
8580     %A = getelementptr i8, <4 x i8*> %ptrs, <4 x i64> %offsets
8581
8582     ; Add the same scalar offset to each pointer of a vector:
8583     ;   A[i] = ptrs[i] + offset*sizeof(i8)
8584     %A = getelementptr i8, <4 x i8*> %ptrs, i64 %offset
8585
8586     ; Add distinct offsets to the same pointer:
8587     ;   A[i] = ptr + offsets[i]*sizeof(i8)
8588     %A = getelementptr i8, i8* %ptr, <4 x i64> %offsets
8589
8590     ; In all cases described above the type of the result is <4 x i8*>
8591
8592The two following instructions are equivalent:
8593
8594.. code-block:: llvm
8595
8596     getelementptr  %struct.ST, <4 x %struct.ST*> %s, <4 x i64> %ind1,
8597       <4 x i32> <i32 2, i32 2, i32 2, i32 2>,
8598       <4 x i32> <i32 1, i32 1, i32 1, i32 1>,
8599       <4 x i32> %ind4,
8600       <4 x i64> <i64 13, i64 13, i64 13, i64 13>
8601
8602     getelementptr  %struct.ST, <4 x %struct.ST*> %s, <4 x i64> %ind1,
8603       i32 2, i32 1, <4 x i32> %ind4, i64 13
8604
8605Let's look at the C code, where the vector version of ``getelementptr``
8606makes sense:
8607
8608.. code-block:: c
8609
8610    // Let's assume that we vectorize the following loop:
8611    double *A, *B; int *C;
8612    for (int i = 0; i < size; ++i) {
8613      A[i] = B[C[i]];
8614    }
8615
8616.. code-block:: llvm
8617
8618    ; get pointers for 8 elements from array B
8619    %ptrs = getelementptr double, double* %B, <8 x i32> %C
8620    ; load 8 elements from array B into A
8621    %A = call <8 x double> @llvm.masked.gather.v8f64.v8p0f64(<8 x double*> %ptrs,
8622         i32 8, <8 x i1> %mask, <8 x double> %passthru)
8623
8624Conversion Operations
8625---------------------
8626
8627The instructions in this category are the conversion instructions
8628(casting) which all take a single operand and a type. They perform
8629various bit conversions on the operand.
8630
8631.. _i_trunc:
8632
8633'``trunc .. to``' Instruction
8634^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
8635
8636Syntax:
8637"""""""
8638
8639::
8640
8641      <result> = trunc <ty> <value> to <ty2>             ; yields ty2
8642
8643Overview:
8644"""""""""
8645
8646The '``trunc``' instruction truncates its operand to the type ``ty2``.
8647
8648Arguments:
8649""""""""""
8650
The '``trunc``' instruction takes a value to truncate and a type to
truncate it to. Both types must be :ref:`integer <t_integer>` types, or
vectors of the same number of integers. The bit size of the ``value`` must be
larger than the bit size of the destination type, ``ty2``. Equal sized
types are not allowed.
8656
8657Semantics:
8658""""""""""
8659
8660The '``trunc``' instruction truncates the high order bits in ``value``
8661and converts the remaining bits to ``ty2``. Since the source size must
8662be larger than the destination size, ``trunc`` cannot be a *no-op cast*.
8663It will always truncate bits.
8664
8665Example:
8666""""""""
8667
8668.. code-block:: llvm
8669
8670      %X = trunc i32 257 to i8                        ; yields i8:1
8671      %Y = trunc i32 123 to i1                        ; yields i1:true
8672      %Z = trunc i32 122 to i1                        ; yields i1:false
8673      %W = trunc <2 x i16> <i16 8, i16 7> to <2 x i8> ; yields <i8 8, i8 7>
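
Put differently, ``trunc`` keeps exactly the low-order bits of its operand.
The following sketch, with a hypothetical function name, checks this for an
``i32`` to ``i8`` truncation by comparing against a mask of the low 8 bits.

.. code-block:: llvm

    ; Returns true for every %x: truncating to i8 and zero extending back
    ; gives the same value as masking with 255.
    define i1 @trunc_keeps_low_bits(i32 %x) {
    entry:
      %t = trunc i32 %x to i8
      %back = zext i8 %t to i32
      %masked = and i32 %x, 255
      %same = icmp eq i32 %back, %masked
      ret i1 %same
    }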
8674
8675.. _i_zext:
8676
8677'``zext .. to``' Instruction
8678^^^^^^^^^^^^^^^^^^^^^^^^^^^^
8679
8680Syntax:
8681"""""""
8682
8683::
8684
8685      <result> = zext <ty> <value> to <ty2>             ; yields ty2
8686
8687Overview:
8688"""""""""
8689
8690The '``zext``' instruction zero extends its operand to type ``ty2``.
8691
8692Arguments:
8693""""""""""
8694
The '``zext``' instruction takes a value to cast, and a type to cast it
to. Both types must be :ref:`integer <t_integer>` types, or vectors of
the same number of integers. The bit size of the ``value`` must be
smaller than the bit size of the destination type, ``ty2``.
8699
8700Semantics:
8701""""""""""
8702
8703The ``zext`` fills the high order bits of the ``value`` with zero bits
8704until it reaches the size of the destination type, ``ty2``.
8705
8706When zero extending from i1, the result will always be either 0 or 1.
8707
8708Example:
8709""""""""
8710
8711.. code-block:: llvm
8712
8713      %X = zext i32 257 to i64              ; yields i64:257
8714      %Y = zext i1 true to i32              ; yields i32:1
8715      %Z = zext <2 x i16> <i16 8, i16 7> to <2 x i32> ; yields <i32 8, i32 7>
8716
8717.. _i_sext:
8718
8719'``sext .. to``' Instruction
8720^^^^^^^^^^^^^^^^^^^^^^^^^^^^
8721
8722Syntax:
8723"""""""
8724
8725::
8726
8727      <result> = sext <ty> <value> to <ty2>             ; yields ty2
8728
8729Overview:
8730"""""""""
8731
The '``sext``' instruction sign extends ``value`` to the type ``ty2``.
8733
8734Arguments:
8735""""""""""
8736
The '``sext``' instruction takes a value to cast, and a type to cast it
to. Both types must be :ref:`integer <t_integer>` types, or vectors of
the same number of integers. The bit size of the ``value`` must be
smaller than the bit size of the destination type, ``ty2``.
8741
8742Semantics:
8743""""""""""
8744
8745The '``sext``' instruction performs a sign extension by copying the sign
8746bit (highest order bit) of the ``value`` until it reaches the bit size
8747of the type ``ty2``.
8748
8749When sign extending from i1, the extension always results in -1 or 0.
8750
8751Example:
8752""""""""
8753
8754.. code-block:: llvm
8755
      %X = sext i8  -1 to i16              ; yields i16:-1 (65535 if read as unsigned)
8757      %Y = sext i1 true to i32             ; yields i32:-1
8758      %Z = sext <2 x i16> <i16 8, i16 7> to <2 x i32> ; yields <i32 8, i32 7>
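
The contrast with ``zext`` is easiest to see on an operand whose sign bit is
set. In this sketch the function names are hypothetical.

.. code-block:: llvm

    ; The i8 bit pattern 0xFF is filled with zero bits: the result is 255.
    define i32 @zext_all_ones() {
    entry:
      %z = zext i8 -1 to i32
      ret i32 %z
    }

    ; The same bit pattern is filled with copies of the sign bit: the
    ; result is -1.
    define i32 @sext_all_ones() {
    entry:
      %s = sext i8 -1 to i32
      ret i32 %s
    }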
8759
8760'``fptrunc .. to``' Instruction
8761^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
8762
8763Syntax:
8764"""""""
8765
8766::
8767
8768      <result> = fptrunc <ty> <value> to <ty2>             ; yields ty2
8769
8770Overview:
8771"""""""""
8772
8773The '``fptrunc``' instruction truncates ``value`` to type ``ty2``.
8774
8775Arguments:
8776""""""""""
8777
8778The '``fptrunc``' instruction takes a :ref:`floating-point <t_floating>`
8779value to cast and a :ref:`floating-point <t_floating>` type to cast it to.
8780The size of ``value`` must be larger than the size of ``ty2``. This
8781implies that ``fptrunc`` cannot be used to make a *no-op cast*.
8782
8783Semantics:
8784""""""""""
8785
8786The '``fptrunc``' instruction casts a ``value`` from a larger
8787:ref:`floating-point <t_floating>` type to a smaller :ref:`floating-point
8788<t_floating>` type.
8789This instruction is assumed to execute in the default :ref:`floating-point
8790environment <floatenv>`.
8791
8792Example:
8793""""""""
8794
8795.. code-block:: llvm
8796
8797      %X = fptrunc double 16777217.0 to float    ; yields float:16777216.0
8798      %Y = fptrunc double 1.0E+300 to half       ; yields half:+infinity
8799
8800'``fpext .. to``' Instruction
8801^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
8802
8803Syntax:
8804"""""""
8805
8806::
8807
8808      <result> = fpext <ty> <value> to <ty2>             ; yields ty2
8809
8810Overview:
8811"""""""""
8812
The '``fpext``' instruction extends a floating-point ``value`` to a larger
floating-point value.
8815
8816Arguments:
8817""""""""""
8818
8819The '``fpext``' instruction takes a :ref:`floating-point <t_floating>`
8820``value`` to cast, and a :ref:`floating-point <t_floating>` type to cast it
8821to. The source type must be smaller than the destination type.
8822
8823Semantics:
8824""""""""""
8825
8826The '``fpext``' instruction extends the ``value`` from a smaller
8827:ref:`floating-point <t_floating>` type to a larger :ref:`floating-point
8828<t_floating>` type. The ``fpext`` cannot be used to make a
8829*no-op cast* because it always changes bits. Use ``bitcast`` to make a
8830*no-op cast* for a floating-point cast.
8831
8832Example:
8833""""""""
8834
8835.. code-block:: llvm
8836
8837      %X = fpext float 3.125 to double         ; yields double:3.125000e+00
8838      %Y = fpext double %X to fp128            ; yields fp128:0xL00000000000000004000900000000000
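
Since every ``float`` value is exactly representable as a ``double``,
extending and then truncating returns the original value. The sketch below
uses a hypothetical function name and is intended for non-NaN inputs in the
default floating-point environment; for a NaN input the ordered comparison
yields false.

.. code-block:: llvm

    ; Returns true for any non-NaN %f.
    define i1 @roundtrip_is_exact(float %f) {
    entry:
      %wide = fpext float %f to double
      %narrow = fptrunc double %wide to float
      %same = fcmp oeq float %f, %narrow
      ret i1 %same
    }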
8839
8840'``fptoui .. to``' Instruction
8841^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
8842
8843Syntax:
8844"""""""
8845
8846::
8847
8848      <result> = fptoui <ty> <value> to <ty2>             ; yields ty2
8849
8850Overview:
8851"""""""""
8852
The '``fptoui``' instruction converts a floating-point ``value`` to its
unsigned integer equivalent of type ``ty2``.
8855
8856Arguments:
8857""""""""""
8858
The '``fptoui``' instruction takes a value to cast, which must be a
scalar or vector :ref:`floating-point <t_floating>` value, and a type to
cast it to, ``ty2``, which must be an :ref:`integer <t_integer>` type. If
``ty`` is a vector floating-point type, ``ty2`` must be a vector integer
type with the same number of elements as ``ty``.
8864
8865Semantics:
8866""""""""""
8867
8868The '``fptoui``' instruction converts its :ref:`floating-point
8869<t_floating>` operand into the nearest (rounding towards zero)
8870unsigned integer value. If the value cannot fit in ``ty2``, the result
8871is a :ref:`poison value <poisonvalues>`.
8872
8873Example:
8874""""""""
8875
8876.. code-block:: llvm
8877
8878      %X = fptoui double 123.0 to i32      ; yields i32:123
      %Y = fptoui float 1.0E+300 to i1     ; yields poison: value does not fit in i1
      %Z = fptoui float 1.04E+17 to i8     ; yields poison: value does not fit in i8
8881
8882'``fptosi .. to``' Instruction
8883^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
8884
8885Syntax:
8886"""""""
8887
8888::
8889
8890      <result> = fptosi <ty> <value> to <ty2>             ; yields ty2
8891
8892Overview:
8893"""""""""
8894
8895The '``fptosi``' instruction converts :ref:`floating-point <t_floating>`
8896``value`` to type ``ty2``.
8897
8898Arguments:
8899""""""""""
8900
The '``fptosi``' instruction takes a value to cast, which must be a
scalar or vector :ref:`floating-point <t_floating>` value, and a type to
cast it to, ``ty2``, which must be an :ref:`integer <t_integer>` type. If
``ty`` is a vector floating-point type, ``ty2`` must be a vector integer
type with the same number of elements as ``ty``.
8906
8907Semantics:
8908""""""""""
8909
8910The '``fptosi``' instruction converts its :ref:`floating-point
8911<t_floating>` operand into the nearest (rounding towards zero)
8912signed integer value. If the value cannot fit in ``ty2``, the result
8913is a :ref:`poison value <poisonvalues>`.
8914
8915Example:
8916""""""""
8917
8918.. code-block:: llvm
8919
8920      %X = fptosi double -123.0 to i32      ; yields i32:-123
8921      %Y = fptosi float 1.0E-247 to i1      ; yields undefined:1
      %Z = fptosi float 1.04E+17 to i8      ; yields poison: value does not fit in i8
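
Because an out-of-range operand produces a poison value, front ends that
need a defined result usually check the range first. The following is a
minimal sketch of such a guard for ``float`` to ``i8``. The function name and
the choice of 0 as the fallback value are assumptions. The check is
deliberately conservative: it rejects a few fractional inputs just outside
``[-128.0, 127.0]`` that would still fit after rounding towards zero.

.. code-block:: llvm

    ; Hypothetical helper: returns 0 when %x is NaN or outside [-128.0, 127.0].
    define i8 @checked_fptosi_i8(float %x) {
    entry:
      %ge = fcmp oge float %x, -128.0
      %le = fcmp ole float %x, 127.0
      %ok = and i1 %ge, %le
      br i1 %ok, label %convert, label %fallback

    convert:
      ; Only reached when the operand is known to fit in i8.
      %v = fptosi float %x to i8
      ret i8 %v

    fallback:
      ret i8 0
    }

The ordered comparisons are false for NaN, so NaN inputs also take the
fallback path.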
8923
8924'``uitofp .. to``' Instruction
8925^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
8926
8927Syntax:
8928"""""""
8929
8930::
8931
8932      <result> = uitofp <ty> <value> to <ty2>             ; yields ty2
8933
8934Overview:
8935"""""""""
8936
8937The '``uitofp``' instruction regards ``value`` as an unsigned integer
8938and converts that value to the ``ty2`` type.
8939
8940Arguments:
8941""""""""""
8942
The '``uitofp``' instruction takes a value to cast, which must be a
scalar or vector :ref:`integer <t_integer>` value, and a type to cast it
to, ``ty2``, which must be a :ref:`floating-point <t_floating>` type. If
``ty`` is a vector integer type, ``ty2`` must be a vector floating-point
type with the same number of elements as ``ty``.
8948
8949Semantics:
8950""""""""""
8951
8952The '``uitofp``' instruction interprets its operand as an unsigned
8953integer quantity and converts it to the corresponding floating-point
8954value. If the value cannot be exactly represented, it is rounded using
8955the default rounding mode.
8956
8957
8958Example:
8959""""""""
8960
8961.. code-block:: llvm
8962
8963      %X = uitofp i32 257 to float         ; yields float:257.0
8964      %Y = uitofp i8 -1 to double          ; yields double:255.0
8965
8966'``sitofp .. to``' Instruction
8967^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
8968
8969Syntax:
8970"""""""
8971
8972::
8973
8974      <result> = sitofp <ty> <value> to <ty2>             ; yields ty2
8975
8976Overview:
8977"""""""""
8978
8979The '``sitofp``' instruction regards ``value`` as a signed integer and
8980converts that value to the ``ty2`` type.
8981
8982Arguments:
8983""""""""""
8984
The '``sitofp``' instruction takes a value to cast, which must be a
scalar or vector :ref:`integer <t_integer>` value, and a type to cast it
to, ``ty2``, which must be a :ref:`floating-point <t_floating>` type. If
``ty`` is a vector integer type, ``ty2`` must be a vector floating-point
type with the same number of elements as ``ty``.
8990
8991Semantics:
8992""""""""""
8993
8994The '``sitofp``' instruction interprets its operand as a signed integer
8995quantity and converts it to the corresponding floating-point value. If the
8996value cannot be exactly represented, it is rounded using the default rounding
8997mode.
8998
8999Example:
9000""""""""
9001
9002.. code-block:: llvm
9003
9004      %X = sitofp i32 257 to float         ; yields float:257.0
9005      %Y = sitofp i8 -1 to double          ; yields double:-1.0
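
The rounding case can be sketched with integers that have more significant
bits than ``float`` can hold. The function names are hypothetical, and the
results in the comments assume the default rounding mode is
round-to-nearest, ties-to-even.

.. code-block:: llvm

    ; 16777217 (2^24 + 1) is not representable as float; under the assumed
    ; rounding mode the result is 16777216.0.
    define float @sitofp_rounds() {
    entry:
      %f = sitofp i32 16777217 to float
      ret float %f
    }

    ; The same i32 bit pattern read as unsigned: -1 is 4294967295, which
    ; rounds to 4294967296.0 under the assumed rounding mode.
    define float @uitofp_rounds() {
    entry:
      %f = uitofp i32 -1 to float
      ret float %f
    }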
9006
9007.. _i_ptrtoint:
9008
9009'``ptrtoint .. to``' Instruction
9010^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
9011
9012Syntax:
9013"""""""
9014
9015::
9016
9017      <result> = ptrtoint <ty> <value> to <ty2>             ; yields ty2
9018
9019Overview:
9020"""""""""
9021
9022The '``ptrtoint``' instruction converts the pointer or a vector of
9023pointers ``value`` to the integer (or vector of integers) type ``ty2``.
9024
9025Arguments:
9026""""""""""
9027
The '``ptrtoint``' instruction takes a ``value`` to cast, which must be
a value of type :ref:`pointer <t_pointer>` or a vector of pointers, and a
type to cast it to, ``ty2``, which must be an :ref:`integer <t_integer>`
type or a vector of integers.
9032
9033Semantics:
9034""""""""""
9035
9036The '``ptrtoint``' instruction converts ``value`` to integer type
9037``ty2`` by interpreting the pointer value as an integer and either
9038truncating or zero extending that value to the size of the integer type.
9039If ``value`` is smaller than ``ty2`` then a zero extension is done. If
9040``value`` is larger than ``ty2`` then a truncation is done. If they are
9041the same size, then nothing is done (*no-op cast*) other than a type
9042change.
9043
9044Example:
9045""""""""
9046
9047.. code-block:: llvm
9048
9049      %X = ptrtoint i32* %P to i8                         ; yields truncation on 32-bit architecture
9050      %Y = ptrtoint i32* %P to i64                        ; yields zero extension on 32-bit architecture
      %Z = ptrtoint <4 x i32*> %P to <4 x i64>            ; yields vector zero extension for a vector of addresses on 32-bit architecture
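
A common use is computing the distance in bytes between two addresses. This
sketch uses a hypothetical function name and assumes pointers are at most 64
bits wide, so that no truncation takes place.

.. code-block:: llvm

    ; Returns %b - %a in bytes, assuming pointers fit in i64.
    define i64 @byte_distance(i8* %a, i8* %b) {
    entry:
      %ai = ptrtoint i8* %a to i64
      %bi = ptrtoint i8* %b to i64
      %d = sub i64 %bi, %ai
      ret i64 %d
    }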
9052
9053.. _i_inttoptr:
9054
9055'``inttoptr .. to``' Instruction
9056^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
9057
9058Syntax:
9059"""""""
9060
9061::
9062
9063      <result> = inttoptr <ty> <value> to <ty2>             ; yields ty2
9064
9065Overview:
9066"""""""""
9067
9068The '``inttoptr``' instruction converts an integer ``value`` to a
9069pointer type, ``ty2``.
9070
9071Arguments:
9072""""""""""
9073
9074The '``inttoptr``' instruction takes an :ref:`integer <t_integer>` value to
9075cast, and a type to cast it to, which must be a :ref:`pointer <t_pointer>`
9076type.
9077
9078Semantics:
9079""""""""""
9080
9081The '``inttoptr``' instruction converts ``value`` to type ``ty2`` by
9082applying either a zero extension or a truncation depending on the size
9083of the integer ``value``. If ``value`` is larger than the size of a
9084pointer then a truncation is done. If ``value`` is smaller than the size
9085of a pointer then a zero extension is done. If they are the same size,
9086nothing is done (*no-op cast*).
9087
9088Example:
9089""""""""
9090
9091.. code-block:: llvm
9092
      %X = inttoptr i32 255 to i32*            ; yields zero extension on 64-bit architecture
      %Y = inttoptr i32 255 to i32*            ; yields no-op on 32-bit architecture
      %Z = inttoptr i64 0 to i32*              ; yields truncation on 32-bit architecture
      %W = inttoptr <4 x i32> %G to <4 x i8*>  ; yields a vector of four pointers, one per element of G
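
The sketch below pairs ``inttoptr`` with ``ptrtoint`` to round an address
down to a 16-byte boundary. It assumes a target with 64-bit pointers, so
that the round trip through ``i64`` loses no bits; the function name
``@align_down_16`` is hypothetical.

.. code-block:: llvm

      define i8* @align_down_16(i8* %p) {
      entry:
        %addr = ptrtoint i8* %p to i64
        ; -16 is ...11110000 in binary: clear the low four bits.
        %masked = and i64 %addr, -16
        ; No-op cast under the 64-bit pointer assumption.
        %aligned = inttoptr i64 %masked to i8*
        ret i8* %aligned
      }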
9097
9098.. _i_bitcast:
9099
9100'``bitcast .. to``' Instruction
9101^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
9102
9103Syntax:
9104"""""""
9105
9106::
9107
9108      <result> = bitcast <ty> <value> to <ty2>             ; yields ty2
9109
9110Overview:
9111"""""""""
9112
9113The '``bitcast``' instruction converts ``value`` to type ``ty2`` without
9114changing any bits.
9115
9116Arguments:
9117""""""""""
9118
9119The '``bitcast``' instruction takes a value to cast, which must be a
9120non-aggregate first class value, and a type to cast it to, which must
9121also be a non-aggregate :ref:`first class <t_firstclass>` type. The
9122bit sizes of ``value`` and the destination type, ``ty2``, must be
9123identical. If the source type is a pointer, the destination type must
9124also be a pointer of the same size. This instruction supports bitwise
9125conversion of vectors to integers and to vectors of other types (as
9126long as they have the same size).
9127
9128Semantics:
9129""""""""""
9130
9131The '``bitcast``' instruction converts ``value`` to type ``ty2``. It
9132is always a *no-op cast* because no bits change with this
9133conversion. The conversion is done as if the ``value`` had been stored
9134to memory and read back as type ``ty2``. Pointer (or vector of
9135pointers) types may only be converted to other pointer (or vector of
9136pointers) types with the same address space through this instruction.
9137To convert pointers to other types, use the :ref:`inttoptr <i_inttoptr>`
9138or :ref:`ptrtoint <i_ptrtoint>` instructions first.
9139
9140Example:
9141""""""""
9142
9143.. code-block:: text
9144
      %X = bitcast i8 255 to i8                 ; yields i8 :-1
      %Y = bitcast i32* %x to i16*              ; yields i16*:%x
      %Z = bitcast <2 x i32> %V to i64          ; yields i64: %V
      %W = bitcast <2 x i32*> %V to <2 x i64*>  ; yields <2 x i64*>
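
Two common uses can be sketched as complete functions: reinterpreting the
bits of a ``float`` as an integer, and viewing an integer as a vector of
the same total width. The function names are hypothetical.

.. code-block:: llvm

      ; Returns the 32-bit pattern of %f unchanged.
      define i32 @float_bits(float %f) {
      entry:
        %bits = bitcast float %f to i32
        ret i32 %bits
      }

      ; Same 32 bits, viewed as two 16-bit lanes. Which half lands in
      ; lane 0 follows the store-and-reload rule, so it depends on the
      ; target's endianness.
      define <2 x i16> @split_halves(i32 %x) {
      entry:
        %v = bitcast i32 %x to <2 x i16>
        ret <2 x i16> %v
      }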
9149
9150.. _i_addrspacecast:
9151
9152'``addrspacecast .. to``' Instruction
9153^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
9154
9155Syntax:
9156"""""""
9157
9158::
9159
9160      <result> = addrspacecast <pty> <ptrval> to <pty2>       ; yields pty2
9161
9162Overview:
9163"""""""""
9164
9165The '``addrspacecast``' instruction converts ``ptrval`` from ``pty`` in
9166address space ``n`` to type ``pty2`` in address space ``m``.
9167
9168Arguments:
9169""""""""""
9170
9171The '``addrspacecast``' instruction takes a pointer or vector of pointer value
9172to cast and a pointer type to cast it to, which must have a different
9173address space.
9174
9175Semantics:
9176""""""""""
9177
9178The '``addrspacecast``' instruction converts the pointer value
9179``ptrval`` to type ``pty2``. It can be a *no-op cast* or a complex
9180value modification, depending on the target and the address space
9181pair. Pointer conversions within the same address space must be
9182performed with the ``bitcast`` instruction. Note that if the address space
9183conversion is legal then both result and operand refer to the same memory
9184location.
9185
9186Example:
9187""""""""
9188
9189.. code-block:: llvm
9190
9191      %X = addrspacecast i32* %x to i32 addrspace(1)*    ; yields i32 addrspace(1)*:%x
9192      %Y = addrspacecast i32 addrspace(1)* %y to i64 addrspace(2)*    ; yields i64 addrspace(2)*:%y
9193      %Z = addrspacecast <4 x i32*> %z to <4 x float addrspace(3)*>   ; yields <4 x float addrspace(3)*>:%z
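
A minimal sketch of a typical use: converting a pointer from a numbered
address space to the default address space before loading through it.
What address space 1 means is target specific and is only an assumption
here; the function name ``@load_via_default`` is hypothetical.

.. code-block:: llvm

      define i32 @load_via_default(i32 addrspace(1)* %gp) {
      entry:
        ; If the conversion is legal, %p and %gp refer to the same memory.
        %p = addrspacecast i32 addrspace(1)* %gp to i32*
        %v = load i32, i32* %p
        ret i32 %v
      }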
9194
9195.. _otherops:
9196
9197Other Operations
9198----------------
9199
9200The instructions in this category are the "miscellaneous" instructions,
9201which defy better classification.
9202
9203.. _i_icmp:
9204
9205'``icmp``' Instruction
9206^^^^^^^^^^^^^^^^^^^^^^
9207
9208Syntax:
9209"""""""
9210
9211::
9212
9213      <result> = icmp <cond> <ty> <op1>, <op2>   ; yields i1 or <N x i1>:result
9214
9215Overview:
9216"""""""""
9217
9218The '``icmp``' instruction returns a boolean value or a vector of
9219boolean values based on comparison of its two integer, integer vector,
9220pointer, or pointer vector operands.
9221
9222Arguments:
9223""""""""""
9224
9225The '``icmp``' instruction takes three operands. The first operand is
9226the condition code indicating the kind of comparison to perform. It is
9227not a value, just a keyword. The possible condition codes are:
9228
9229#. ``eq``: equal
9230#. ``ne``: not equal
9231#. ``ugt``: unsigned greater than
9232#. ``uge``: unsigned greater or equal
9233#. ``ult``: unsigned less than
9234#. ``ule``: unsigned less or equal
9235#. ``sgt``: signed greater than
9236#. ``sge``: signed greater or equal
9237#. ``slt``: signed less than
9238#. ``sle``: signed less or equal
9239
The remaining two arguments must be of :ref:`integer <t_integer>` or
:ref:`pointer <t_pointer>` type, or a :ref:`vector <t_vector>` of integers
or pointers. They must also be identical types.
9243
9244Semantics:
9245""""""""""
9246
9247The '``icmp``' compares ``op1`` and ``op2`` according to the condition
9248code given as ``cond``. The comparison performed always yields either an
9249:ref:`i1 <t_integer>` or vector of ``i1`` result, as follows:
9250
9251#. ``eq``: yields ``true`` if the operands are equal, ``false``
9252   otherwise. No sign interpretation is necessary or performed.
9253#. ``ne``: yields ``true`` if the operands are unequal, ``false``
9254   otherwise. No sign interpretation is necessary or performed.
9255#. ``ugt``: interprets the operands as unsigned values and yields
9256   ``true`` if ``op1`` is greater than ``op2``.
9257#. ``uge``: interprets the operands as unsigned values and yields
9258   ``true`` if ``op1`` is greater than or equal to ``op2``.
9259#. ``ult``: interprets the operands as unsigned values and yields
9260   ``true`` if ``op1`` is less than ``op2``.
9261#. ``ule``: interprets the operands as unsigned values and yields
9262   ``true`` if ``op1`` is less than or equal to ``op2``.
9263#. ``sgt``: interprets the operands as signed values and yields ``true``
9264   if ``op1`` is greater than ``op2``.
9265#. ``sge``: interprets the operands as signed values and yields ``true``
9266   if ``op1`` is greater than or equal to ``op2``.
9267#. ``slt``: interprets the operands as signed values and yields ``true``
9268   if ``op1`` is less than ``op2``.
9269#. ``sle``: interprets the operands as signed values and yields ``true``
9270   if ``op1`` is less than or equal to ``op2``.
9271
9272If the operands are :ref:`pointer <t_pointer>` typed, the pointer values
9273are compared as if they were integers.
9274
9275If the operands are integer vectors, then they are compared element by
9276element. The result is an ``i1`` vector with the same number of elements
9277as the values being compared. Otherwise, the result is an ``i1``.
9278
9279Example:
9280""""""""
9281
9282.. code-block:: text
9283
9284      <result> = icmp eq i32 4, 5          ; yields: result=false
9285      <result> = icmp ne float* %X, %X     ; yields: result=false
9286      <result> = icmp ult i16  4, 5        ; yields: result=true
9287      <result> = icmp sgt i16  4, 5        ; yields: result=false
9288      <result> = icmp ule i16 -4, 5        ; yields: result=false
9289      <result> = icmp sge i16  4, 5        ; yields: result=false
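
The vector and pointer forms can be sketched as complete functions. The
function names below are hypothetical.

.. code-block:: llvm

      ; Element-by-element signed comparison: lane i of the result is
      ; true when lane i of %a is less than lane i of %b.
      define <4 x i1> @lanes_less_than(<4 x i32> %a, <4 x i32> %b) {
      entry:
        %m = icmp slt <4 x i32> %a, %b
        ret <4 x i1> %m
      }

      ; Pointer operands are compared as if they were integers.
      define i1 @same_address(i8* %p, i8* %q) {
      entry:
        %eq = icmp eq i8* %p, %q
        ret i1 %eq
      }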
9290
9291.. _i_fcmp:
9292
9293'``fcmp``' Instruction
9294^^^^^^^^^^^^^^^^^^^^^^
9295
9296Syntax:
9297"""""""
9298
9299::
9300
9301      <result> = fcmp [fast-math flags]* <cond> <ty> <op1>, <op2>     ; yields i1 or <N x i1>:result
9302
9303Overview:
9304"""""""""
9305
9306The '``fcmp``' instruction returns a boolean value or vector of boolean
9307values based on comparison of its operands.
9308
9309If the operands are floating-point scalars, then the result type is a
9310boolean (:ref:`i1 <t_integer>`).
9311
9312If the operands are floating-point vectors, then the result type is a
9313vector of boolean with the same number of elements as the operands being
9314compared.
9315
9316Arguments:
9317""""""""""
9318
9319The '``fcmp``' instruction takes three operands. The first operand is
9320the condition code indicating the kind of comparison to perform. It is
9321not a value, just a keyword. The possible condition codes are:
9322
9323#. ``false``: no comparison, always returns false
9324#. ``oeq``: ordered and equal
9325#. ``ogt``: ordered and greater than
9326#. ``oge``: ordered and greater than or equal
9327#. ``olt``: ordered and less than
9328#. ``ole``: ordered and less than or equal
9329#. ``one``: ordered and not equal
9330#. ``ord``: ordered (no nans)
9331#. ``ueq``: unordered or equal
9332#. ``ugt``: unordered or greater than
9333#. ``uge``: unordered or greater than or equal
9334#. ``ult``: unordered or less than
9335#. ``ule``: unordered or less than or equal
9336#. ``une``: unordered or not equal
9337#. ``uno``: unordered (either nans)
9338#. ``true``: no comparison, always returns true
9339
9340*Ordered* means that neither operand is a QNAN while *unordered* means
9341that either operand may be a QNAN.
9342
The ``op1`` and ``op2`` arguments must each be of either a
:ref:`floating-point <t_floating>` type or a :ref:`vector <t_vector>` of
floating-point type. They must have identical types.
9346
9347Semantics:
9348""""""""""
9349
9350The '``fcmp``' instruction compares ``op1`` and ``op2`` according to the
9351condition code given as ``cond``. If the operands are vectors, then the
9352vectors are compared element by element. Each comparison performed
9353always yields an :ref:`i1 <t_integer>` result, as follows:
9354
9355#. ``false``: always yields ``false``, regardless of operands.
9356#. ``oeq``: yields ``true`` if both operands are not a QNAN and ``op1``
9357   is equal to ``op2``.
9358#. ``ogt``: yields ``true`` if both operands are not a QNAN and ``op1``
9359   is greater than ``op2``.
9360#. ``oge``: yields ``true`` if both operands are not a QNAN and ``op1``
9361   is greater than or equal to ``op2``.
9362#. ``olt``: yields ``true`` if both operands are not a QNAN and ``op1``
9363   is less than ``op2``.
9364#. ``ole``: yields ``true`` if both operands are not a QNAN and ``op1``
9365   is less than or equal to ``op2``.
9366#. ``one``: yields ``true`` if both operands are not a QNAN and ``op1``
9367   is not equal to ``op2``.
9368#. ``ord``: yields ``true`` if both operands are not a QNAN.
9369#. ``ueq``: yields ``true`` if either operand is a QNAN or ``op1`` is
9370   equal to ``op2``.
9371#. ``ugt``: yields ``true`` if either operand is a QNAN or ``op1`` is
9372   greater than ``op2``.
9373#. ``uge``: yields ``true`` if either operand is a QNAN or ``op1`` is
9374   greater than or equal to ``op2``.
9375#. ``ult``: yields ``true`` if either operand is a QNAN or ``op1`` is
9376   less than ``op2``.
9377#. ``ule``: yields ``true`` if either operand is a QNAN or ``op1`` is
9378   less than or equal to ``op2``.
9379#. ``une``: yields ``true`` if either operand is a QNAN or ``op1`` is
9380   not equal to ``op2``.
9381#. ``uno``: yields ``true`` if either operand is a QNAN.
9382#. ``true``: always yields ``true``, regardless of operands.
9383
9384The ``fcmp`` instruction can also optionally take any number of
9385:ref:`fast-math flags <fastmath>`, which are optimization hints to enable
9386otherwise unsafe floating-point optimizations.
9387
Any set of fast-math flags is legal on an ``fcmp`` instruction, but the
9389only flags that have any effect on its semantics are those that allow
9390assumptions to be made about the values of input arguments; namely
9391``nnan``, ``ninf``, and ``reassoc``. See :ref:`fastmath` for more information.
9392
9393Example:
9394""""""""
9395
9396.. code-block:: text
9397
9398      <result> = fcmp oeq float 4.0, 5.0    ; yields: result=false
9399      <result> = fcmp one float 4.0, 5.0    ; yields: result=true
9400      <result> = fcmp olt float 4.0, 5.0    ; yields: result=true
9401      <result> = fcmp ueq double 1.0, 2.0   ; yields: result=false
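
The difference between the ordered and unordered condition codes shows up
only when an operand is a NaN. The following sketch, with hypothetical
function names, relies on that difference.

.. code-block:: llvm

      ; A NaN compares unordered with itself, so this is true exactly
      ; when %x is a NaN.
      define i1 @is_nan(double %x) {
      entry:
        %r = fcmp uno double %x, %x
        ret i1 %r
      }

      ; Logical negation of 'fcmp olt': true when %a >= %b, and also
      ; true when either operand is a NaN.
      define i1 @not_less_than(double %a, double %b) {
      entry:
        %r = fcmp uge double %a, %b
        ret i1 %r
      }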
9402
9403.. _i_phi:
9404
9405'``phi``' Instruction
9406^^^^^^^^^^^^^^^^^^^^^
9407
9408Syntax:
9409"""""""
9410
9411::
9412
9413      <result> = phi <ty> [ <val0>, <label0>], ...
9414
9415Overview:
9416"""""""""
9417
9418The '``phi``' instruction is used to implement the φ node in the SSA
9419graph representing the function.
9420
9421Arguments:
9422""""""""""
9423
9424The type of the incoming values is specified with the first type field.
9425After this, the '``phi``' instruction takes a list of pairs as
9426arguments, with one pair for each predecessor basic block of the current
9427block. Only values of :ref:`first class <t_firstclass>` type may be used as
9428the value arguments to the PHI node. Only labels may be used as the
9429label arguments.
9430
9431There must be no non-phi instructions between the start of a basic block
9432and the PHI instructions: i.e. PHI instructions must be first in a basic
9433block.
9434
9435For the purposes of the SSA form, the use of each incoming value is
9436deemed to occur on the edge from the corresponding predecessor block to
9437the current block (but after any definition of an '``invoke``'
9438instruction's return value on the same edge).
9439
9440Semantics:
9441""""""""""
9442
9443At runtime, the '``phi``' instruction logically takes on the value
9444specified by the pair corresponding to the predecessor basic block that
9445executed just prior to the current block.
9446
9447Example:
9448""""""""
9449
9450.. code-block:: llvm
9451
9452    Loop:       ; Infinite loop that counts from 0 on up...
9453      %indvar = phi i32 [ 0, %LoopHeader ], [ %nextindvar, %Loop ]
9454      %nextindvar = add i32 %indvar, 1
9455      br label %Loop
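
A ``phi`` is also what merges values at the join point of an
``if``/``else``. The sketch below, with the hypothetical function name
``@abs_diff``, has one incoming pair per predecessor of ``%merge``.

.. code-block:: llvm

      define i32 @abs_diff(i32 %a, i32 %b) {
      entry:
        %a_larger = icmp sgt i32 %a, %b
        br i1 %a_larger, label %then, label %else

      then:
        %d1 = sub i32 %a, %b
        br label %merge

      else:
        %d2 = sub i32 %b, %a
        br label %merge

      merge:
        ; Takes %d1 when control arrived from %then, %d2 when from %else.
        %d = phi i32 [ %d1, %then ], [ %d2, %else ]
        ret i32 %d
      }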
9456
9457.. _i_select:
9458
9459'``select``' Instruction
9460^^^^^^^^^^^^^^^^^^^^^^^^
9461
9462Syntax:
9463"""""""
9464
9465::
9466
9467      <result> = select selty <cond>, <ty> <val1>, <ty> <val2>             ; yields ty
9468
9469      selty is either i1 or {<N x i1>}
9470
9471Overview:
9472"""""""""
9473
9474The '``select``' instruction is used to choose one value based on a
9475condition, without IR-level branching.
9476
9477Arguments:
9478""""""""""
9479
9480The '``select``' instruction requires an 'i1' value or a vector of 'i1'
9481values indicating the condition, and two values of the same :ref:`first
9482class <t_firstclass>` type.
9483
9484Semantics:
9485""""""""""
9486
9487If the condition is an i1 and it evaluates to 1, the instruction returns
9488the first value argument; otherwise, it returns the second value
9489argument.
9490
9491If the condition is a vector of i1, then the value arguments must be
9492vectors of the same size, and the selection is done element by element.
9493
9494If the condition is an i1 and the value arguments are vectors of the
9495same size, then an entire vector is selected.
9496
9497Example:
9498""""""""
9499
9500.. code-block:: llvm
9501
9502      %X = select i1 true, i8 17, i8 42          ; yields i8:17
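
A common idiom pairs ``select`` with a comparison to compute a maximum
without branching. The sketch below shows the scalar form and the
element-by-element vector form; the function names are hypothetical.

.. code-block:: llvm

      define i32 @smax(i32 %a, i32 %b) {
      entry:
        %gt = icmp sgt i32 %a, %b
        %max = select i1 %gt, i32 %a, i32 %b
        ret i32 %max
      }

      ; With a vector condition, each lane of the result is chosen
      ; independently from the corresponding lanes of %a and %b.
      define <4 x i32> @lane_smax(<4 x i32> %a, <4 x i32> %b) {
      entry:
        %gt = icmp sgt <4 x i32> %a, %b
        %max = select <4 x i1> %gt, <4 x i32> %a, <4 x i32> %b
        ret <4 x i32> %max
      }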
9503
9504.. _i_call:
9505
9506'``call``' Instruction
9507^^^^^^^^^^^^^^^^^^^^^^
9508
9509Syntax:
9510"""""""
9511
9512::
9513
9514      <result> = [tail | musttail | notail ] call [fast-math flags] [cconv] [ret attrs] <ty>|<fnty> <fnptrval>(<function args>) [fn attrs]
9515                   [ operand bundles ]
9516
9517Overview:
9518"""""""""
9519
9520The '``call``' instruction represents a simple function call.
9521
9522Arguments:
9523""""""""""
9524
9525This instruction requires several arguments:
9526
9527#. The optional ``tail`` and ``musttail`` markers indicate that the optimizers
9528   should perform tail call optimization. The ``tail`` marker is a hint that
9529   `can be ignored <CodeGenerator.html#sibcallopt>`_. The ``musttail`` marker
9530   means that the call must be tail call optimized in order for the program to
9531   be correct. The ``musttail`` marker provides these guarantees:
9532
9533   #. The call will not cause unbounded stack growth if it is part of a
9534      recursive cycle in the call graph.
9535   #. Arguments with the :ref:`inalloca <attr_inalloca>` attribute are
9536      forwarded in place.
9537
9538   Both markers imply that the callee does not access allocas from the caller.
9539   The ``tail`` marker additionally implies that the callee does not access
9540   varargs from the caller, while ``musttail`` implies that varargs from the
9541   caller are passed to the callee. Calls marked ``musttail`` must obey the
   following additional rules:
9543
9544   - The call must immediately precede a :ref:`ret <i_ret>` instruction,
9545     or a pointer bitcast followed by a ret instruction.
9546   - The ret instruction must return the (possibly bitcasted) value
9547     produced by the call or void.
9548   - The caller and callee prototypes must match. Pointer types of
9549     parameters or return types may differ in pointee type, but not
9550     in address space.
9551   - The calling conventions of the caller and callee must match.
9552   - All ABI-impacting function attributes, such as sret, byval, inreg,
9553     returned, and inalloca, must match.
9554   - The callee must be varargs iff the caller is varargs. Bitcasting a
9555     non-varargs function to the appropriate varargs type is legal so
9556     long as the non-varargs prefixes obey the other rules.
9557
9558   Tail call optimization for calls marked ``tail`` is guaranteed to occur if
9559   the following conditions are met:
9560
9561   -  Caller and callee both have the calling convention ``fastcc``.
9562   -  The call is in tail position (ret immediately follows call and ret
9563      uses value of call or is void).
9564   -  Option ``-tailcallopt`` is enabled, or
9565      ``llvm::GuaranteedTailCallOpt`` is ``true``.
9566   -  `Platform-specific constraints are
9567      met. <CodeGenerator.html#tailcallopt>`_
9568
9569#. The optional ``notail`` marker indicates that the optimizers should not add
9570   ``tail`` or ``musttail`` markers to the call. It is used to prevent tail
9571   call optimization from being performed on the call.
9572
9573#. The optional ``fast-math flags`` marker indicates that the call has one or more
9574   :ref:`fast-math flags <fastmath>`, which are optimization hints to enable
9575   otherwise unsafe floating-point optimizations. Fast-math flags are only valid
9576   for calls that return a floating-point scalar or vector type.
9577
9578#. The optional "cconv" marker indicates which :ref:`calling
9579   convention <callingconv>` the call should use. If none is
9580   specified, the call defaults to using C calling conventions. The
9581   calling convention of the call must match the calling convention of
9582   the target function, or else the behavior is undefined.
9583#. The optional :ref:`Parameter Attributes <paramattrs>` list for return
9584   values. Only '``zeroext``', '``signext``', and '``inreg``' attributes
9585   are valid here.
9586#. '``ty``': the type of the call instruction itself which is also the
9587   type of the return value. Functions that return no value are marked
9588   ``void``.
9589#. '``fnty``': shall be the signature of the function being called. The
9590   argument types must match the types implied by this signature. This
9591   type can be omitted if the function is not varargs.
9592#. '``fnptrval``': An LLVM value containing a pointer to a function to
9593   be called. In most cases, this is a direct function call, but
9594   indirect ``call``'s are just as possible, calling an arbitrary pointer
9595   to function value.
9596#. '``function args``': argument list whose types match the function
9597   signature argument types and parameter attributes. All arguments must
9598   be of :ref:`first class <t_firstclass>` type. If the function signature
9599   indicates the function accepts a variable number of arguments, the
9600   extra arguments can be specified.
9601#. The optional :ref:`function attributes <fnattrs>` list.
9602#. The optional :ref:`operand bundles <opbundles>` list.
9603
9604Semantics:
9605""""""""""
9606
9607The '``call``' instruction is used to cause control flow to transfer to
9608a specified function, with its incoming arguments bound to the specified
9609values. Upon a '``ret``' instruction in the called function, control
9610flow continues with the instruction after the function call, and the
9611return value of the function is bound to the result argument.
9612
9613Example:
9614""""""""
9615
9616.. code-block:: llvm
9617
      %retval = call i32 @test(i32 %argc)
      call i32 (i8*, ...) @printf(i8* %msg, i32 12, i8 42)         ; yields i32
      %X = tail call i32 @foo()                                    ; yields i32
      %Y = tail call fastcc i32 @foo()                             ; yields i32
      call void %foo(i8 signext 97)

      %struct.A = type { i32, i8 }
      %r = call %struct.A @foo()                        ; yields { i32, i8 }
      %gr = extractvalue %struct.A %r, 0                ; yields i32
      %gr1 = extractvalue %struct.A %r, 1               ; yields i8
      call void @foo() noreturn                         ; indicates that @foo never returns normally
      %ZZ = call zeroext i32 @bar()                     ; return value is zero extended
9630
LLVM treats calls to some functions with names and arguments that match
9632the standard C99 library as being the C99 library functions, and may
9633perform optimizations or generate code for them under that assumption.
9634This is something we'd like to change in the future to provide better
9635support for freestanding environments and non-C-based languages.
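
The ``musttail`` rules listed under the arguments can be illustrated with
a minimal sketch. The names ``@callee``, ``@forward``, and ``@apply`` are
hypothetical. The caller and callee prototypes and calling conventions
match, and the call immediately precedes a ``ret`` that returns its value.

.. code-block:: llvm

      declare i32 @callee(i32)

      define i32 @forward(i32 %x) {
      entry:
        %r = musttail call i32 @callee(i32 %x)
        ret i32 %r
      }

      ; An indirect call through a function pointer value.
      define i32 @apply(i32 (i32)* %fn, i32 %x) {
      entry:
        %r = call i32 %fn(i32 %x)
        ret i32 %r
      }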
9636
9637.. _i_va_arg:
9638
9639'``va_arg``' Instruction
9640^^^^^^^^^^^^^^^^^^^^^^^^
9641
9642Syntax:
9643"""""""
9644
9645::
9646
9647      <resultval> = va_arg <va_list*> <arglist>, <argty>
9648
9649Overview:
9650"""""""""
9651
9652The '``va_arg``' instruction is used to access arguments passed through
9653the "variable argument" area of a function call. It is used to implement
9654the ``va_arg`` macro in C.
9655
9656Arguments:
9657""""""""""
9658
9659This instruction takes a ``va_list*`` value and the type of the
9660argument. It returns a value of the specified argument type and
9661increments the ``va_list`` to point to the next argument. The actual
9662type of ``va_list`` is target specific.
9663
9664Semantics:
9665""""""""""
9666
9667The '``va_arg``' instruction loads an argument of the specified type
9668from the specified ``va_list`` and causes the ``va_list`` to point to
9669the next argument. For more information, see the variable argument
9670handling :ref:`Intrinsic Functions <int_varargs>`.
9671
It is legal for this instruction to be used in a function that does
not take a variable number of arguments, for example, the ``vfprintf``
function.
9675
9676``va_arg`` is an LLVM instruction instead of an :ref:`intrinsic
9677function <intrinsics>` because it takes a type as an argument.
9678
9679Example:
9680""""""""
9681
9682See the :ref:`variable argument processing <int_varargs>` section.
9683
Note that the code generator does not yet fully support ``va_arg`` on many
targets. Also, it does not currently support ``va_arg`` with aggregate
types on any target.
9687
9688.. _i_landingpad:
9689
9690'``landingpad``' Instruction
9691^^^^^^^^^^^^^^^^^^^^^^^^^^^^
9692
9693Syntax:
9694"""""""
9695
9696::
9697
9698      <resultval> = landingpad <resultty> <clause>+
9699      <resultval> = landingpad <resultty> cleanup <clause>*
9700
9701      <clause> := catch <type> <value>
9702      <clause> := filter <array constant type> <array constant>
9703
9704Overview:
9705"""""""""
9706
9707The '``landingpad``' instruction is used by `LLVM's exception handling
9708system <ExceptionHandling.html#overview>`_ to specify that a basic block
9709is a landing pad --- one where the exception lands, and corresponds to the
9710code found in the ``catch`` portion of a ``try``/``catch`` sequence. It
9711defines values supplied by the :ref:`personality function <personalityfn>` upon
9712re-entry to the function. The ``resultval`` has the type ``resultty``.
9713
9714Arguments:
9715""""""""""
9716
The optional ``cleanup`` flag indicates that the landing pad block is a
cleanup.
9719
9720A ``clause`` begins with the clause type --- ``catch`` or ``filter`` --- and
9721contains the global variable representing the "type" that may be caught
9722or filtered respectively. Unlike the ``catch`` clause, the ``filter``
9723clause takes an array constant as its argument. Use
9724"``[0 x i8**] undef``" for a filter which cannot throw. The
9725'``landingpad``' instruction must contain *at least* one ``clause`` or
9726the ``cleanup`` flag.
9727
9728Semantics:
9729""""""""""
9730
9731The '``landingpad``' instruction defines the values which are set by the
9732:ref:`personality function <personalityfn>` upon re-entry to the function, and
9733therefore the "result type" of the ``landingpad`` instruction. As with
9734calling conventions, how the personality function results are
9735represented in LLVM IR is target specific.
9736
9737The clauses are applied in order from top to bottom. If two
9738``landingpad`` instructions are merged together through inlining, the
9739clauses from the calling function are appended to the list of clauses.
9740When the call stack is being unwound due to an exception being thrown,
9741the exception is compared against each ``clause`` in turn. If it doesn't
9742match any of the clauses, and the ``cleanup`` flag is not set, then
9743unwinding continues further up the call stack.
9744
9745The ``landingpad`` instruction has several restrictions:
9746
9747-  A landing pad block is a basic block which is the unwind destination
9748   of an '``invoke``' instruction.
9749-  A landing pad block must have a '``landingpad``' instruction as its
9750   first non-PHI instruction.
9751-  There can be only one '``landingpad``' instruction within the landing
9752   pad block.
9753-  A basic block that is not a landing pad block may not include a
9754   '``landingpad``' instruction.
9755
9756Example:
9757""""""""
9758
9759.. code-block:: llvm
9760
9761      ;; A landing pad which can catch an integer.
9762      %res = landingpad { i8*, i32 }
9763               catch i8** @_ZTIi
9764      ;; A landing pad that is a cleanup.
9765      %res = landingpad { i8*, i32 }
9766               cleanup
9767      ;; A landing pad which can catch an integer and can only throw a double.
9768      %res = landingpad { i8*, i32 }
9769               catch i8** @_ZTIi
9770               filter [1 x i8**] [@_ZTId]
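
For context, the following sketch places a cleanup landing pad in a
complete function. It assumes the Itanium C++ personality routine
``@__gxx_personality_v0``; ``@may_throw``, ``@release``, and ``@guarded``
are hypothetical names.

.. code-block:: llvm

      declare void @may_throw()
      declare void @release()
      declare i32 @__gxx_personality_v0(...)

      define void @guarded() personality i8* bitcast (i32 (...)* @__gxx_personality_v0 to i8*) {
      entry:
        invoke void @may_throw()
                to label %done unwind label %lpad

      done:
        ret void

      lpad:
        ; The landingpad is the first non-PHI instruction of the block.
        %res = landingpad { i8*, i32 }
                 cleanup
        call void @release()
        ; Continue unwinding further up the call stack.
        resume { i8*, i32 } %res
      }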
9771
9772.. _i_catchpad:
9773
9774'``catchpad``' Instruction
9775^^^^^^^^^^^^^^^^^^^^^^^^^^
9776
9777Syntax:
9778"""""""
9779
9780::
9781
9782      <resultval> = catchpad within <catchswitch> [<args>*]
9783
9784Overview:
9785"""""""""
9786
9787The '``catchpad``' instruction is used by `LLVM's exception handling
9788system <ExceptionHandling.html#overview>`_ to specify that a basic block
9789begins a catch handler --- one where a personality routine attempts to transfer
9790control to catch an exception.
9791
9792Arguments:
9793""""""""""
9794
9795The ``catchswitch`` operand must always be a token produced by a
9796:ref:`catchswitch <i_catchswitch>` instruction in a predecessor block. This
9797ensures that each ``catchpad`` has exactly one predecessor block, and it always
9798terminates in a ``catchswitch``.
9799
9800The ``args`` correspond to whatever information the personality routine
9801requires to know if this is an appropriate handler for the exception. Control
9802will transfer to the ``catchpad`` if this is the first appropriate handler for
9803the exception.
9804
9805The ``resultval`` has the type :ref:`token <t_token>` and is used to match the
9806``catchpad`` to corresponding :ref:`catchrets <i_catchret>` and other nested EH
9807pads.
9808
9809Semantics:
9810""""""""""
9811
9812When the call stack is being unwound due to an exception being thrown, the
9813exception is compared against the ``args``. If it doesn't match, control will
9814not reach the ``catchpad`` instruction.  The representation of ``args`` is
9815entirely target and personality function-specific.
9816
9817Like the :ref:`landingpad <i_landingpad>` instruction, the ``catchpad``
9818instruction must be the first non-phi of its parent basic block.
9819
9820The meaning of the tokens produced and consumed by ``catchpad`` and other "pad"
9821instructions is described in the
9822`Windows exception handling documentation\ <ExceptionHandling.html#wineh>`_.
9823
9824When a ``catchpad`` has been "entered" but not yet "exited" (as
9825described in the `EH documentation\ <ExceptionHandling.html#wineh-constraints>`_),
9826it is undefined behavior to execute a :ref:`call <i_call>` or :ref:`invoke <i_invoke>`
9827that does not carry an appropriate :ref:`"funclet" bundle <ob_funclet>`.
9828
9829Example:
9830""""""""
9831
9832.. code-block:: text
9833
9834    dispatch:
9835      %cs = catchswitch within none [label %handler0] unwind to caller
9836      ;; A catch block which can catch an integer.
9837    handler0:
9838      %tok = catchpad within %cs [i8** @_ZTIi]
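
A fuller sketch places the ``catchpad`` in a complete function. It assumes
the MSVC C++ personality routine ``@__CxxFrameHandler3`` and uses the
argument list ``[i8* null, i32 64, i8* null]``, which is assumed here to
denote a catch-all handler for that personality. The names ``@may_throw``
and ``@f`` are hypothetical.

.. code-block:: text

      declare void @may_throw()
      declare i32 @__CxxFrameHandler3(...)

      define void @f() personality i8* bitcast (i32 (...)* @__CxxFrameHandler3 to i8*) {
      entry:
        invoke void @may_throw()
                to label %done unwind label %dispatch

      dispatch:
        %cs = catchswitch within none [label %handler] unwind to caller

      handler:
        ; First non-phi instruction of the handler block.
        %tok = catchpad within %cs [i8* null, i32 64, i8* null]
        ; Exit the catch handler and resume normal control flow.
        catchret from %tok to label %done

      done:
        ret void
      }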
9839
9840.. _i_cleanuppad:
9841
9842'``cleanuppad``' Instruction
9843^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
9844
9845Syntax:
9846"""""""
9847
9848::
9849
9850      <resultval> = cleanuppad within <parent> [<args>*]
9851
9852Overview:
9853"""""""""
9854
9855The '``cleanuppad``' instruction is used by `LLVM's exception handling
9856system <ExceptionHandling.html#overview>`_ to specify that a basic block
9857is a cleanup block --- one where a personality routine attempts to
9858transfer control to run cleanup actions.
9859The ``args`` correspond to whatever additional
9860information the :ref:`personality function <personalityfn>` requires to
9861execute the cleanup.
9862The ``resultval`` has the type :ref:`token <t_token>` and is used to
9863match the ``cleanuppad`` to corresponding :ref:`cleanuprets <i_cleanupret>`.
9864The ``parent`` argument is the token of the funclet that contains the
9865``cleanuppad`` instruction. If the ``cleanuppad`` is not inside a funclet,
9866this operand may be the token ``none``.
9867
9868Arguments:
9869""""""""""
9870
9871The instruction takes a list of arbitrary values which are interpreted
9872by the :ref:`personality function <personalityfn>`.
9873
9874Semantics:
9875""""""""""
9876
9877When the call stack is being unwound due to an exception being thrown,
9878the :ref:`personality function <personalityfn>` transfers control to the
9879``cleanuppad`` with the aid of the personality-specific arguments.
9880As with calling conventions, how the personality function results are
9881represented in LLVM IR is target specific.
9882
9883The ``cleanuppad`` instruction has several restrictions:
9884
9885-  A cleanup block is a basic block which is the unwind destination of
9886   an exceptional instruction.
9887-  A cleanup block must have a '``cleanuppad``' instruction as its
9888   first non-PHI instruction.
9889-  There can be only one '``cleanuppad``' instruction within the
9890   cleanup block.
9891-  A basic block that is not a cleanup block may not include a
9892   '``cleanuppad``' instruction.
9893
9894When a ``cleanuppad`` has been "entered" but not yet "exited" (as
9895described in the `EH documentation\ <ExceptionHandling.html#wineh-constraints>`_),
9896it is undefined behavior to execute a :ref:`call <i_call>` or :ref:`invoke <i_invoke>`
9897that does not carry an appropriate :ref:`"funclet" bundle <ob_funclet>`.
9898
9899Example:
9900""""""""
9901
9902.. code-block:: text
9903
9904      %tok = cleanuppad within %cs []
9905
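As a slightly fuller sketch of the restrictions above, the following
function places a ``cleanuppad`` as the first non-PHI instruction of a
cleanup block and tags the call made inside the funclet with a
``"funclet"`` bundle. The names ``@f``, ``@may_throw`` and ``@run_cleanup``
are hypothetical, and the choice of ``__CxxFrameHandler3`` as the
personality is an assumption made only for illustration.

.. code-block:: llvm

    declare void @may_throw()
    declare void @run_cleanup()
    declare i32 @__CxxFrameHandler3(...)

    define void @f() personality i8* bitcast (i32 (...)* @__CxxFrameHandler3 to i8*) {
    entry:
      invoke void @may_throw()
              to label %done unwind label %cleanup

    cleanup:
      ; Unwind destination of the invoke, hence a cleanup block.
      %tok = cleanuppad within none []
      ; Calls made while the pad is entered carry a "funclet" bundle.
      call void @run_cleanup() [ "funclet"(token %tok) ]
      cleanupret from %tok unwind to caller

    done:
      ret void
    }
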
9906.. _intrinsics:
9907
9908Intrinsic Functions
9909===================
9910
LLVM supports the notion of an "intrinsic function". These functions
have well-known names and semantics and are required to follow certain
restrictions. Overall, these intrinsics represent an extension mechanism
for the LLVM language that does not require changing all of the
transformations in LLVM when adding to the language (or the bitcode
reader/writer, the parser, etc.).
9917
Intrinsic function names must all start with an "``llvm.``" prefix. This
prefix is reserved in LLVM for intrinsic names; thus, other function names
may not begin with this prefix. Intrinsic functions must always be external
functions: you cannot define the body of intrinsic functions. Intrinsic
functions may only be used in call or invoke instructions: it is illegal
to take the address of an intrinsic function. Additionally, because
intrinsic functions are part of the LLVM language, any that are added
must be documented here.
9926
9927Some intrinsic functions can be overloaded, i.e., the intrinsic
9928represents a family of functions that perform the same operation but on
9929different data types. Because LLVM can represent over 8 million
9930different integer types, overloading is used commonly to allow an
9931intrinsic function to operate on any integer type. One or more of the
9932argument types or the result type can be overloaded to accept any
9933integer type. Argument types may also be defined as exactly matching a
9934previous argument's type or the result type. This allows an intrinsic
9935function which accepts multiple arguments, but needs all of them to be
9936of the same type, to only be overloaded with respect to a single
9937argument or the result.
9938
An overloaded intrinsic has the names of its overloaded argument
types encoded into its function name, each preceded by a period. Only
those types which are overloaded result in a name suffix. Arguments
whose type is matched against another type do not. For example, the
``llvm.ctpop`` function can take an integer of any width and returns an
integer of exactly the same integer width. This leads to a family of
functions such as ``i8 @llvm.ctpop.i8(i8 %val)`` and
``i29 @llvm.ctpop.i29(i29 %val)``. Only one type, the return type, is
overloaded, and only one type suffix is required. Because the argument's
type is matched against the return type, it does not require its own
name suffix.
9950
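The naming rule can be illustrated with a minimal sketch that declares and
calls two members of the ``llvm.ctpop`` family. The wrapper functions
``@popcount8`` and ``@popcount29`` are hypothetical names introduced here.

.. code-block:: llvm

    declare i8  @llvm.ctpop.i8(i8)
    declare i29 @llvm.ctpop.i29(i29)

    define i8 @popcount8(i8 %val) {
      ; The ".i8" suffix names the single overloaded type.
      %n = call i8 @llvm.ctpop.i8(i8 %val)
      ret i8 %n
    }

    define i29 @popcount29(i29 %val) {
      ; Same operation, different member of the family.
      %n = call i29 @llvm.ctpop.i29(i29 %val)
      ret i29 %n
    }
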
9951To learn how to add an intrinsic function, please see the `Extending
9952LLVM Guide <ExtendingLLVM.html>`_.
9953
9954.. _int_varargs:
9955
9956Variable Argument Handling Intrinsics
9957-------------------------------------
9958
9959Variable argument support is defined in LLVM with the
9960:ref:`va_arg <i_va_arg>` instruction and these three intrinsic
9961functions. These functions are related to the similarly named macros
9962defined in the ``<stdarg.h>`` header file.
9963
9964All of these functions operate on arguments that use a target-specific
9965value type "``va_list``". The LLVM assembly language reference manual
9966does not define what this type is, so all transformations should be
9967prepared to handle these functions regardless of the type used.
9968
9969This example shows how the :ref:`va_arg <i_va_arg>` instruction and the
9970variable argument handling intrinsic functions are used.
9971
9972.. code-block:: llvm
9973
9974    ; This struct is different for every platform. For most platforms,
9975    ; it is merely an i8*.
9976    %struct.va_list = type { i8* }
9977
9978    ; For Unix x86_64 platforms, va_list is the following struct:
9979    ; %struct.va_list = type { i32, i32, i8*, i8* }
9980
9981    define i32 @test(i32 %X, ...) {
9982      ; Initialize variable argument processing
9983      %ap = alloca %struct.va_list
9984      %ap2 = bitcast %struct.va_list* %ap to i8*
9985      call void @llvm.va_start(i8* %ap2)
9986
9987      ; Read a single integer argument
9988      %tmp = va_arg i8* %ap2, i32
9989
9990      ; Demonstrate usage of llvm.va_copy and llvm.va_end
9991      %aq = alloca i8*
9992      %aq2 = bitcast i8** %aq to i8*
9993      call void @llvm.va_copy(i8* %aq2, i8* %ap2)
9994      call void @llvm.va_end(i8* %aq2)
9995
9996      ; Stop processing of arguments.
9997      call void @llvm.va_end(i8* %ap2)
9998      ret i32 %tmp
9999    }
10000
10001    declare void @llvm.va_start(i8*)
10002    declare void @llvm.va_copy(i8*, i8*)
10003    declare void @llvm.va_end(i8*)
10004
10005.. _int_va_start:
10006
10007'``llvm.va_start``' Intrinsic
10008^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
10009
10010Syntax:
10011"""""""
10012
10013::
10014
10015      declare void @llvm.va_start(i8* <arglist>)
10016
10017Overview:
10018"""""""""
10019
10020The '``llvm.va_start``' intrinsic initializes ``*<arglist>`` for
10021subsequent use by ``va_arg``.
10022
10023Arguments:
10024""""""""""
10025
10026The argument is a pointer to a ``va_list`` element to initialize.
10027
10028Semantics:
10029""""""""""
10030
10031The '``llvm.va_start``' intrinsic works just like the ``va_start`` macro
10032available in C. In a target-dependent way, it initializes the
10033``va_list`` element to which the argument points, so that the next call
10034to ``va_arg`` will produce the first variable argument passed to the
10035function. Unlike the C ``va_start`` macro, this intrinsic does not need
10036to know the last argument of the function as the compiler can figure
10037that out.
10038
10039'``llvm.va_end``' Intrinsic
10040^^^^^^^^^^^^^^^^^^^^^^^^^^^
10041
10042Syntax:
10043"""""""
10044
10045::
10046
10047      declare void @llvm.va_end(i8* <arglist>)
10048
10049Overview:
10050"""""""""
10051
10052The '``llvm.va_end``' intrinsic destroys ``*<arglist>``, which has been
10053initialized previously with ``llvm.va_start`` or ``llvm.va_copy``.
10054
10055Arguments:
10056""""""""""
10057
10058The argument is a pointer to a ``va_list`` to destroy.
10059
10060Semantics:
10061""""""""""
10062
10063The '``llvm.va_end``' intrinsic works just like the ``va_end`` macro
10064available in C. In a target-dependent way, it destroys the ``va_list``
10065element to which the argument points. Calls to
10066:ref:`llvm.va_start <int_va_start>` and
10067:ref:`llvm.va_copy <int_va_copy>` must be matched exactly with calls to
10068``llvm.va_end``.
10069
10070.. _int_va_copy:
10071
10072'``llvm.va_copy``' Intrinsic
10073^^^^^^^^^^^^^^^^^^^^^^^^^^^^
10074
10075Syntax:
10076"""""""
10077
10078::
10079
10080      declare void @llvm.va_copy(i8* <destarglist>, i8* <srcarglist>)
10081
10082Overview:
10083"""""""""
10084
10085The '``llvm.va_copy``' intrinsic copies the current argument position
10086from the source argument list to the destination argument list.
10087
10088Arguments:
10089""""""""""
10090
10091The first argument is a pointer to a ``va_list`` element to initialize.
10092The second argument is a pointer to a ``va_list`` element to copy from.
10093
10094Semantics:
10095""""""""""
10096
The '``llvm.va_copy``' intrinsic works just like the ``va_copy`` macro
available in C. In a target-dependent way, it copies the source
``va_list`` element into the destination ``va_list`` element. This
intrinsic is necessary because the ``llvm.va_start`` intrinsic may be
arbitrarily complex and require, for example, memory allocation.
10102
10103Accurate Garbage Collection Intrinsics
10104--------------------------------------
10105
10106LLVM's support for `Accurate Garbage Collection <GarbageCollection.html>`_
10107(GC) requires the frontend to generate code containing appropriate intrinsic
10108calls and select an appropriate GC strategy which knows how to lower these
10109intrinsics in a manner which is appropriate for the target collector.
10110
10111These intrinsics allow identification of :ref:`GC roots on the
10112stack <int_gcroot>`, as well as garbage collector implementations that
10113require :ref:`read <int_gcread>` and :ref:`write <int_gcwrite>` barriers.
10114Frontends for type-safe garbage collected languages should generate
10115these intrinsics to make use of the LLVM garbage collectors. For more
10116details, see `Garbage Collection with LLVM <GarbageCollection.html>`_.
10117
10118Experimental Statepoint Intrinsics
10119^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
10120
LLVM provides a second, experimental set of intrinsics for describing garbage
collection safepoints in compiled code. These intrinsics are an alternative
to the ``llvm.gcroot`` intrinsics, but are compatible with the ones for
:ref:`read <int_gcread>` and :ref:`write <int_gcwrite>` barriers. The
differences in approach are covered in the `Garbage Collection with LLVM
<GarbageCollection.html>`_ documentation. The intrinsics themselves are
described in :doc:`Statepoints`.
10128
10129.. _int_gcroot:
10130
10131'``llvm.gcroot``' Intrinsic
10132^^^^^^^^^^^^^^^^^^^^^^^^^^^
10133
10134Syntax:
10135"""""""
10136
10137::
10138
10139      declare void @llvm.gcroot(i8** %ptrloc, i8* %metadata)
10140
10141Overview:
10142"""""""""
10143
10144The '``llvm.gcroot``' intrinsic declares the existence of a GC root to
10145the code generator, and allows some metadata to be associated with it.
10146
10147Arguments:
10148""""""""""
10149
The first argument specifies the address of a stack object that contains
the root pointer. The second argument (which must be either a constant or
a global value address) contains the metadata to be associated with the
root.
10154
10155Semantics:
10156""""""""""
10157
10158At runtime, a call to this intrinsic stores a null pointer into the
10159"ptrloc" location. At compile-time, the code generator generates
10160information to allow the runtime to find the pointer at GC safe points.
10161The '``llvm.gcroot``' intrinsic may only be used in a function which
10162:ref:`specifies a GC algorithm <gc>`.
10163
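A minimal sketch of declaring a root follows. It assumes the built-in
``"shadow-stack"`` GC strategy and passes ``null`` as the metadata; the
function name ``@with_root`` is hypothetical.

.. code-block:: llvm

    declare void @llvm.gcroot(i8**, i8*)

    define void @with_root() gc "shadow-stack" {
    entry:
      ; The root lives in a stack slot allocated in the entry block.
      %root = alloca i8*
      call void @llvm.gcroot(i8** %root, i8* null)
      ; %root now holds null; live references would be stored into it here.
      ret void
    }
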
10164.. _int_gcread:
10165
10166'``llvm.gcread``' Intrinsic
10167^^^^^^^^^^^^^^^^^^^^^^^^^^^
10168
10169Syntax:
10170"""""""
10171
10172::
10173
10174      declare i8* @llvm.gcread(i8* %ObjPtr, i8** %Ptr)
10175
10176Overview:
10177"""""""""
10178
10179The '``llvm.gcread``' intrinsic identifies reads of references from heap
10180locations, allowing garbage collector implementations that require read
10181barriers.
10182
10183Arguments:
10184""""""""""
10185
The second argument is the address to read from, which should be an
address allocated from the garbage collector. The first argument is a
pointer to the start of the referenced object, if needed by the language
runtime (otherwise null).
10190
10191Semantics:
10192""""""""""
10193
10194The '``llvm.gcread``' intrinsic has the same semantics as a load
10195instruction, but may be replaced with substantially more complex code by
10196the garbage collector runtime, as needed. The '``llvm.gcread``'
10197intrinsic may only be used in a function which :ref:`specifies a GC
10198algorithm <gc>`.
10199
10200.. _int_gcwrite:
10201
10202'``llvm.gcwrite``' Intrinsic
10203^^^^^^^^^^^^^^^^^^^^^^^^^^^^
10204
10205Syntax:
10206"""""""
10207
10208::
10209
10210      declare void @llvm.gcwrite(i8* %P1, i8* %Obj, i8** %P2)
10211
10212Overview:
10213"""""""""
10214
10215The '``llvm.gcwrite``' intrinsic identifies writes of references to heap
10216locations, allowing garbage collector implementations that require write
10217barriers (such as generational or reference counting collectors).
10218
10219Arguments:
10220""""""""""
10221
10222The first argument is the reference to store, the second is the start of
10223the object to store it to, and the third is the address of the field of
10224Obj to store to. If the runtime does not require a pointer to the
10225object, Obj may be null.
10226
10227Semantics:
10228""""""""""
10229
10230The '``llvm.gcwrite``' intrinsic has the same semantics as a store
10231instruction, but may be replaced with substantially more complex code by
10232the garbage collector runtime, as needed. The '``llvm.gcwrite``'
10233intrinsic may only be used in a function which :ref:`specifies a GC
10234algorithm <gc>`.
10235
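As an illustration of the two barrier intrinsics together, the sketch below
copies a reference from a field of one heap object to a field of another.
The function ``@copy_ref`` and its parameter names are hypothetical, and the
``"shadow-stack"`` strategy is assumed only so that the function specifies
a GC algorithm.

.. code-block:: llvm

    declare i8* @llvm.gcread(i8*, i8**)
    declare void @llvm.gcwrite(i8*, i8*, i8**)

    define void @copy_ref(i8* %src_obj, i8** %src_field,
                          i8* %dst_obj, i8** %dst_field) gc "shadow-stack" {
      ; Read barrier: behaves like "load i8*, i8** %src_field".
      %ref = call i8* @llvm.gcread(i8* %src_obj, i8** %src_field)
      ; Write barrier: behaves like "store i8* %ref, i8** %dst_field".
      call void @llvm.gcwrite(i8* %ref, i8* %dst_obj, i8** %dst_field)
      ret void
    }
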
10236Code Generator Intrinsics
10237-------------------------
10238
10239These intrinsics are provided by LLVM to expose special features that
10240may only be implemented with code generator support.
10241
10242'``llvm.returnaddress``' Intrinsic
10243^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
10244
10245Syntax:
10246"""""""
10247
10248::
10249
10250      declare i8* @llvm.returnaddress(i32 <level>)
10251
10252Overview:
10253"""""""""
10254
10255The '``llvm.returnaddress``' intrinsic attempts to compute a
10256target-specific value indicating the return address of the current
10257function or one of its callers.
10258
10259Arguments:
10260""""""""""
10261
10262The argument to this intrinsic indicates which function to return the
10263address for. Zero indicates the calling function, one indicates its
10264caller, etc. The argument is **required** to be a constant integer
10265value.
10266
10267Semantics:
10268""""""""""
10269
10270The '``llvm.returnaddress``' intrinsic either returns a pointer
10271indicating the return address of the specified call frame, or zero if it
10272cannot be identified. The value returned by this intrinsic is likely to
10273be incorrect or 0 for arguments other than zero, so it should only be
10274used for debugging purposes.
10275
10276Note that calling this intrinsic does not prevent function inlining or
10277other aggressive transformations, so the value returned may not be that
10278of the obvious source-language caller.
10279
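A short sketch of the common case, asking for the return address of the
current function with a constant level of zero. The function name
``@my_return_address`` is hypothetical.

.. code-block:: llvm

    declare i8* @llvm.returnaddress(i32)

    define i8* @my_return_address() {
      ; Level 0 refers to the function containing this call.
      %ra = call i8* @llvm.returnaddress(i32 0)
      ret i8* %ra
    }
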
10280'``llvm.addressofreturnaddress``' Intrinsic
10281^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
10282
10283Syntax:
10284"""""""
10285
10286::
10287
10288      declare i8* @llvm.addressofreturnaddress()
10289
10290Overview:
10291"""""""""
10292
10293The '``llvm.addressofreturnaddress``' intrinsic returns a target-specific
10294pointer to the place in the stack frame where the return address of the
10295current function is stored.
10296
10297Semantics:
10298""""""""""
10299
10300Note that calling this intrinsic does not prevent function inlining or
10301other aggressive transformations, so the value returned may not be that
10302of the obvious source-language caller.
10303
10304This intrinsic is only implemented for x86.
10305
10306'``llvm.frameaddress``' Intrinsic
10307^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
10308
10309Syntax:
10310"""""""
10311
10312::
10313
10314      declare i8* @llvm.frameaddress(i32 <level>)
10315
10316Overview:
10317"""""""""
10318
10319The '``llvm.frameaddress``' intrinsic attempts to return the
10320target-specific frame pointer value for the specified stack frame.
10321
10322Arguments:
10323""""""""""
10324
10325The argument to this intrinsic indicates which function to return the
10326frame pointer for. Zero indicates the calling function, one indicates
10327its caller, etc. The argument is **required** to be a constant integer
10328value.
10329
10330Semantics:
10331""""""""""
10332
10333The '``llvm.frameaddress``' intrinsic either returns a pointer
10334indicating the frame address of the specified call frame, or zero if it
10335cannot be identified. The value returned by this intrinsic is likely to
10336be incorrect or 0 for arguments other than zero, so it should only be
10337used for debugging purposes.
10338
10339Note that calling this intrinsic does not prevent function inlining or
10340other aggressive transformations, so the value returned may not be that
10341of the obvious source-language caller.
10342
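The following sketch reads the frame address of the current function. The
function name ``@my_frame_address`` is hypothetical; as noted above, levels
other than zero are unlikely to be reliable.

.. code-block:: llvm

    declare i8* @llvm.frameaddress(i32)

    define i8* @my_frame_address() {
      ; The level must be a constant integer; 0 is the current frame.
      %fp = call i8* @llvm.frameaddress(i32 0)
      ret i8* %fp
    }
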
10343'``llvm.localescape``' and '``llvm.localrecover``' Intrinsics
10344^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
10345
10346Syntax:
10347"""""""
10348
10349::
10350
10351      declare void @llvm.localescape(...)
10352      declare i8* @llvm.localrecover(i8* %func, i8* %fp, i32 %idx)
10353
10354Overview:
10355"""""""""
10356
10357The '``llvm.localescape``' intrinsic escapes offsets of a collection of static
10358allocas, and the '``llvm.localrecover``' intrinsic applies those offsets to a
10359live frame pointer to recover the address of the allocation. The offset is
10360computed during frame layout of the caller of ``llvm.localescape``.
10361
10362Arguments:
10363""""""""""
10364
10365All arguments to '``llvm.localescape``' must be pointers to static allocas or
10366casts of static allocas. Each function can only call '``llvm.localescape``'
10367once, and it can only do so from the entry block.
10368
10369The ``func`` argument to '``llvm.localrecover``' must be a constant
10370bitcasted pointer to a function defined in the current module. The code
10371generator cannot determine the frame allocation offset of functions defined in
10372other modules.
10373
10374The ``fp`` argument to '``llvm.localrecover``' must be a frame pointer of a
10375call frame that is currently live. The return value of '``llvm.localaddress``'
10376is one way to produce such a value, but various runtimes also expose a suitable
10377pointer in platform-specific ways.
10378
10379The ``idx`` argument to '``llvm.localrecover``' indicates which alloca passed to
10380'``llvm.localescape``' to recover. It is zero-indexed.
10381
10382Semantics:
10383""""""""""
10384
These intrinsics allow a group of functions to share access to a set of local
stack allocations of one parent function. The parent function may call the
'``llvm.localescape``' intrinsic once from the function entry block, and the
child functions can use '``llvm.localrecover``' to access the escaped allocas.
The '``llvm.localescape``' intrinsic blocks inlining, as inlining changes where
the escaped allocas are allocated, which would break attempts to use
'``llvm.localrecover``'.
10392
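The parent/child relationship might look like the sketch below. The
functions ``@parent`` and ``@child`` are hypothetical, and passing the
result of ``llvm.localaddress`` to the child is just one possible way of
supplying a live frame pointer.

.. code-block:: llvm

    declare void @llvm.localescape(...)
    declare i8* @llvm.localrecover(i8*, i8*, i32)
    declare i8* @llvm.localaddress()

    define void @parent() {
    entry:
      %x = alloca i32
      ; Escape the static alloca once, from the entry block.
      call void (...) @llvm.localescape(i32* %x)
      store i32 0, i32* %x
      %fp = call i8* @llvm.localaddress()
      call void @child(i8* %fp)
      ret void
    }

    define internal void @child(i8* %fp) {
      ; Recover escaped alloca number 0 of @parent from its live frame.
      %p = call i8* @llvm.localrecover(i8* bitcast (void ()* @parent to i8*),
                                       i8* %fp, i32 0)
      %x = bitcast i8* %p to i32*
      store i32 42, i32* %x
      ret void
    }
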
10393.. _int_read_register:
10394.. _int_write_register:
10395
10396'``llvm.read_register``' and '``llvm.write_register``' Intrinsics
10397^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
10398
10399Syntax:
10400"""""""
10401
10402::
10403
      declare i32 @llvm.read_register.i32(metadata)
      declare i64 @llvm.read_register.i64(metadata)
      declare void @llvm.write_register.i32(metadata, i32 <value>)
      declare void @llvm.write_register.i64(metadata, i64 <value>)
      !0 = !{!"sp\00"}
10409
10410Overview:
10411"""""""""
10412
The '``llvm.read_register``' and '``llvm.write_register``' intrinsics
provide access to the named register. The register must be valid on
the architecture being compiled to. The type needs to be compatible
with the register being read or written.
10417
10418Semantics:
10419""""""""""
10420
10421The '``llvm.read_register``' intrinsic returns the current value of the
10422register, where possible. The '``llvm.write_register``' intrinsic sets
10423the current value of the register, where possible.
10424
10425This is useful to implement named register global variables that need
10426to always be mapped to a specific register, as is common practice on
10427bare-metal programs including OS kernels.
10428
The compiler doesn't check for register availability or for use of the
named register in surrounding code, including inline assembly. Because of
that, allocatable registers are not supported.

Warning: So far it only works with the stack pointer on selected
architectures (ARM, AArch64, PowerPC and x86_64). A significant amount of
work is needed to support other registers and, even more so, allocatable
registers.
10437
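A minimal sketch of reading the stack pointer follows. It assumes a 64-bit
target on which the stack pointer is named ``sp`` (for example AArch64); the
function name ``@get_sp`` is hypothetical.

.. code-block:: llvm

    declare i64 @llvm.read_register.i64(metadata)

    define i64 @get_sp() {
      ; The register is selected by the string in the metadata node.
      %sp = call i64 @llvm.read_register.i64(metadata !0)
      ret i64 %sp
    }

    !0 = !{!"sp\00"}
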
10438.. _int_stacksave:
10439
10440'``llvm.stacksave``' Intrinsic
10441^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
10442
10443Syntax:
10444"""""""
10445
10446::
10447
10448      declare i8* @llvm.stacksave()
10449
10450Overview:
10451"""""""""
10452
10453The '``llvm.stacksave``' intrinsic is used to remember the current state
10454of the function stack, for use with
10455:ref:`llvm.stackrestore <int_stackrestore>`. This is useful for
10456implementing language features like scoped automatic variable sized
10457arrays in C99.
10458
10459Semantics:
10460""""""""""
10461
This intrinsic returns an opaque pointer value that can be passed to
:ref:`llvm.stackrestore <int_stackrestore>`. When an
``llvm.stackrestore`` intrinsic is executed with a value saved from
``llvm.stacksave``, it effectively restores the state of the stack to
the state it was in when the ``llvm.stacksave`` intrinsic executed. In
practice, this pops any :ref:`alloca <i_alloca>` blocks from the stack that
were allocated after the ``llvm.stacksave`` was executed.
10469
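The save/restore pairing can be sketched as follows for a scoped,
variable sized buffer. The function ``@scoped_vla`` and the external
consumer ``@use_buffer`` are hypothetical names.

.. code-block:: llvm

    declare i8* @llvm.stacksave()
    declare void @llvm.stackrestore(i8*)
    declare void @use_buffer(i32*, i32)   ; hypothetical consumer

    define void @scoped_vla(i32 %n) {
    entry:
      ; Remember the stack state before the dynamic allocation.
      %saved = call i8* @llvm.stacksave()
      %buf = alloca i32, i32 %n
      call void @use_buffer(i32* %buf, i32 %n)
      ; Pop %buf by restoring the saved state.
      call void @llvm.stackrestore(i8* %saved)
      ret void
    }
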
10470.. _int_stackrestore:
10471
10472'``llvm.stackrestore``' Intrinsic
10473^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
10474
10475Syntax:
10476"""""""
10477
10478::
10479
10480      declare void @llvm.stackrestore(i8* %ptr)
10481
10482Overview:
10483"""""""""
10484
10485The '``llvm.stackrestore``' intrinsic is used to restore the state of
10486the function stack to the state it was in when the corresponding
10487:ref:`llvm.stacksave <int_stacksave>` intrinsic executed. This is
10488useful for implementing language features like scoped automatic variable
10489sized arrays in C99.
10490
10491Semantics:
10492""""""""""
10493
10494See the description for :ref:`llvm.stacksave <int_stacksave>`.
10495
10496.. _int_get_dynamic_area_offset:
10497
10498'``llvm.get.dynamic.area.offset``' Intrinsic
10499^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
10500
10501Syntax:
10502"""""""
10503
10504::
10505
10506      declare i32 @llvm.get.dynamic.area.offset.i32()
10507      declare i64 @llvm.get.dynamic.area.offset.i64()
10508
10509Overview:
10510"""""""""
10511
The '``llvm.get.dynamic.area.offset.*``' intrinsic family is used to
get the offset from the native stack pointer to the address of the most
recent dynamic alloca on the caller's stack. These intrinsics are
intended for use in combination with
:ref:`llvm.stacksave <int_stacksave>` to get a
pointer to the most recent dynamic alloca. This is useful, for example,
for AddressSanitizer's stack unpoisoning routines.
10519
10520Semantics:
10521""""""""""
10522
These intrinsics return a non-negative integer value that can be used to
get the address of the most recent dynamic alloca, allocated by
:ref:`alloca <i_alloca>` on the caller's stack. In particular, for targets
where the stack grows downwards, adding this offset to the native stack
pointer yields the address of the most recent dynamic alloca. For targets
where the stack grows upwards, the situation is a bit more complicated,
because subtracting this value from the stack pointer yields the address
one past the end of the most recent dynamic alloca.

Although for most targets
:ref:`llvm.get.dynamic.area.offset <int_get_dynamic_area_offset>` returns
just a zero, for others, such as PowerPC and PowerPC64, it returns a
compile-time-known constant value.

The return value type of
:ref:`llvm.get.dynamic.area.offset <int_get_dynamic_area_offset>` must match
the target's default address space's (address space 0) pointer type.
10537
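The combination with ``llvm.stacksave`` could be sketched as below. This
assumes a target with 64-bit pointers whose stack grows downwards; the
function ``@inspect_latest_alloca`` and the external ``@inspect`` are
hypothetical names.

.. code-block:: llvm

    declare i8* @llvm.stacksave()
    declare i64 @llvm.get.dynamic.area.offset.i64()
    declare void @inspect(i8*)   ; hypothetical consumer

    define void @inspect_latest_alloca(i64 %n) {
    entry:
      %buf = alloca i8, i64 %n
      ; Native stack pointer plus the offset gives the most recent
      ; dynamic alloca on a downward-growing stack.
      %sp = call i8* @llvm.stacksave()
      %off = call i64 @llvm.get.dynamic.area.offset.i64()
      %addr = getelementptr i8, i8* %sp, i64 %off
      call void @inspect(i8* %addr)
      ret void
    }
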
10538'``llvm.prefetch``' Intrinsic
10539^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
10540
10541Syntax:
10542"""""""
10543
10544::
10545
10546      declare void @llvm.prefetch(i8* <address>, i32 <rw>, i32 <locality>, i32 <cache type>)
10547
10548Overview:
10549"""""""""
10550
10551The '``llvm.prefetch``' intrinsic is a hint to the code generator to
10552insert a prefetch instruction if supported; otherwise, it is a noop.
10553Prefetches have no effect on the behavior of the program but can change
10554its performance characteristics.
10555
10556Arguments:
10557""""""""""
10558
10559``address`` is the address to be prefetched, ``rw`` is the specifier
10560determining if the fetch should be for a read (0) or write (1), and
10561``locality`` is a temporal locality specifier ranging from (0) - no
10562locality, to (3) - extremely local keep in cache. The ``cache type``
10563specifies whether the prefetch is performed on the data (1) or
10564instruction (0) cache. The ``rw``, ``locality`` and ``cache type``
10565arguments must be constant integers.
10566
10567Semantics:
10568""""""""""
10569
10570This intrinsic does not modify the behavior of the program. In
10571particular, prefetches cannot trap and do not produce a value. On
10572targets that support this intrinsic, the prefetch can provide hints to
10573the processor cache for better performance.
10574
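For instance, a read prefetch into the data cache with maximal temporal
locality could be written as in the sketch below. The function name
``@warm`` is hypothetical.

.. code-block:: llvm

    declare void @llvm.prefetch(i8*, i32, i32, i32)

    define void @warm(i8* %p) {
      ; rw = 0 (read), locality = 3 (keep in cache), cache type = 1 (data)
      call void @llvm.prefetch(i8* %p, i32 0, i32 3, i32 1)
      ret void
    }
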
10575'``llvm.pcmarker``' Intrinsic
10576^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
10577
10578Syntax:
10579"""""""
10580
10581::
10582
10583      declare void @llvm.pcmarker(i32 <id>)
10584
10585Overview:
10586"""""""""
10587
10588The '``llvm.pcmarker``' intrinsic is a method to export a Program
10589Counter (PC) in a region of code to simulators and other tools. The
10590method is target specific, but it is expected that the marker will use
10591exported symbols to transmit the PC of the marker. The marker makes no
10592guarantees that it will remain with any specific instruction after
10593optimizations. It is possible that the presence of a marker will inhibit
10594optimizations. The intended use is to be inserted after optimizations to
10595allow correlations of simulation runs.
10596
10597Arguments:
10598""""""""""
10599
10600``id`` is a numerical id identifying the marker.
10601
10602Semantics:
10603""""""""""
10604
10605This intrinsic does not modify the behavior of the program. Backends
10606that do not support this intrinsic may ignore it.
10607
10608'``llvm.readcyclecounter``' Intrinsic
10609^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
10610
10611Syntax:
10612"""""""
10613
10614::
10615
10616      declare i64 @llvm.readcyclecounter()
10617
10618Overview:
10619"""""""""
10620
The '``llvm.readcyclecounter``' intrinsic provides access to the cycle
counter register (or similar low latency, high accuracy clocks) on those
targets that support it. On X86, it should map to RDTSC. On Alpha, it
should map to RPCC. As the backing counters overflow quickly (on the
order of 9 seconds on Alpha), this should only be used for small
timings.
10627
10628Semantics:
10629""""""""""
10630
When directly supported, reading the cycle counter should not modify any
memory. Implementations are allowed to return either an
application-specific value or a system-wide value. On backends without
support, this is lowered to a constant 0.

Note that runtime support may be conditional on the privilege level the
code is running at and on the host platform.
10638
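A small timing sketch follows; it measures the cycles spent in a
hypothetical external function ``@work``. On backends without support both
reads are lowered to 0, so the result is 0.

.. code-block:: llvm

    declare i64 @llvm.readcyclecounter()
    declare void @work()   ; hypothetical code under measurement

    define i64 @time_work() {
      %start = call i64 @llvm.readcyclecounter()
      call void @work()
      %end = call i64 @llvm.readcyclecounter()
      ; Elapsed cycles, valid only if the counter did not overflow.
      %elapsed = sub i64 %end, %start
      ret i64 %elapsed
    }
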
10639'``llvm.clear_cache``' Intrinsic
10640^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
10641
10642Syntax:
10643"""""""
10644
10645::
10646
10647      declare void @llvm.clear_cache(i8*, i8*)
10648
10649Overview:
10650"""""""""
10651
10652The '``llvm.clear_cache``' intrinsic ensures visibility of modifications
10653in the specified range to the execution unit of the processor. On
10654targets with non-unified instruction and data cache, the implementation
10655flushes the instruction cache.
10656
10657Semantics:
10658""""""""""
10659
10660On platforms with coherent instruction and data caches (e.g. x86), this
10661intrinsic is a nop. On platforms with non-coherent instruction and data
10662cache (e.g. ARM, MIPS), the intrinsic is lowered either to appropriate
10663instructions or a system call, if cache flushing requires special
10664privileges.
10665
10666The default behavior is to emit a call to ``__clear_cache`` from the run
10667time library.
10668
This intrinsic does *not* empty the instruction pipeline. Modifications
of the current function are outside the scope of the intrinsic.
10671
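A sketch of a call site is shown below. It assumes the two pointer
arguments are the start and end of the modified range, mirroring the
``__clear_cache`` runtime routine; the function name ``@publish_code`` and
its parameter names are hypothetical.

.. code-block:: llvm

    declare void @llvm.clear_cache(i8*, i8*)

    define void @publish_code(i8* %begin, i8* %end) {
      ; Make code written to [%begin, %end) visible for execution.
      call void @llvm.clear_cache(i8* %begin, i8* %end)
      ret void
    }
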
10672'``llvm.instrprof.increment``' Intrinsic
10673^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
10674
10675Syntax:
10676"""""""
10677
10678::
10679
10680      declare void @llvm.instrprof.increment(i8* <name>, i64 <hash>,
10681                                             i32 <num-counters>, i32 <index>)
10682
10683Overview:
10684"""""""""
10685
10686The '``llvm.instrprof.increment``' intrinsic can be emitted by a
10687frontend for use with instrumentation based profiling. These will be
10688lowered by the ``-instrprof`` pass to generate execution counts of a
10689program at runtime.
10690
10691Arguments:
10692""""""""""
10693
10694The first argument is a pointer to a global variable containing the
10695name of the entity being instrumented. This should generally be the
10696(mangled) function name for a set of counters.
10697
10698The second argument is a hash value that can be used by the consumer
10699of the profile data to detect changes to the instrumented source, and
10700the third is the number of counters associated with ``name``. It is an
10701error if ``hash`` or ``num-counters`` differ between two instances of
10702``instrprof.increment`` that refer to the same name.
10703
10704The last argument refers to which of the counters for ``name`` should
10705be incremented. It should be a value between 0 and ``num-counters``.
10706
10707Semantics:
10708""""""""""
10709
10710This intrinsic represents an increment of a profiling counter. It will
10711cause the ``-instrprof`` pass to generate the appropriate data
10712structures and the code to increment the appropriate value, in a
10713format that can be written out by a compiler runtime and consumed via
10714the ``llvm-profdata`` tool.
10715
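What a frontend might emit for a function with a single counter is
sketched below. The global ``@__profn_foo``, the function ``@foo`` and the
hash value 0 are placeholders chosen for illustration.

.. code-block:: llvm

    @__profn_foo = private constant [3 x i8] c"foo"

    declare void @llvm.instrprof.increment(i8*, i64, i32, i32)

    define void @foo() {
      ; name = "foo", hash = 0, num-counters = 1, index = 0
      call void @llvm.instrprof.increment(
          i8* getelementptr inbounds ([3 x i8], [3 x i8]* @__profn_foo, i32 0, i32 0),
          i64 0, i32 1, i32 0)
      ret void
    }
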
10716'``llvm.instrprof.increment.step``' Intrinsic
10717^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
10718
10719Syntax:
10720"""""""
10721
10722::
10723
10724      declare void @llvm.instrprof.increment.step(i8* <name>, i64 <hash>,
10725                                                  i32 <num-counters>,
10726                                                  i32 <index>, i64 <step>)
10727
10728Overview:
10729"""""""""
10730
10731The '``llvm.instrprof.increment.step``' intrinsic is an extension to
10732the '``llvm.instrprof.increment``' intrinsic with an additional fifth
10733argument to specify the step of the increment.
10734
Arguments:
""""""""""

The first four arguments are the same as for the
'``llvm.instrprof.increment``' intrinsic.

The last argument specifies the value of the increment of the counter variable.

Semantics:
""""""""""

See the description of the '``llvm.instrprof.increment``' intrinsic.
10745
10746
10747'``llvm.instrprof.value.profile``' Intrinsic
10748^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
10749
10750Syntax:
10751"""""""
10752
10753::
10754
10755      declare void @llvm.instrprof.value.profile(i8* <name>, i64 <hash>,
10756                                                 i64 <value>, i32 <value_kind>,
10757                                                 i32 <index>)
10758
10759Overview:
10760"""""""""
10761
The '``llvm.instrprof.value.profile``' intrinsic can be emitted by a
frontend for use with instrumentation based profiling. This will be
lowered by the ``-instrprof`` pass to find out the target values that
instrumented expressions take in a program at runtime.
10766
10767Arguments:
10768""""""""""
10769
10770The first argument is a pointer to a global variable containing the
10771name of the entity being instrumented. ``name`` should generally be the
10772(mangled) function name for a set of counters.
10773
10774The second argument is a hash value that can be used by the consumer
10775of the profile data to detect changes to the instrumented source. It
10776is an error if ``hash`` differs between two instances of
10777``llvm.instrprof.*`` that refer to the same name.
10778
10779The third argument is the value of the expression being profiled. The profiled
10780expression's value should be representable as an unsigned 64-bit value. The
10781fourth argument represents the kind of value profiling that is being done. The
10782supported value profiling kinds are enumerated through the
10783``InstrProfValueKind`` type declared in the
10784``<include/llvm/ProfileData/InstrProf.h>`` header file. The last argument is the
10785index of the instrumented expression within ``name``. It should be >= 0.
10786
10787Semantics:
10788""""""""""
10789
This intrinsic represents the point where a call to a runtime routine
should be inserted for value profiling of target expressions. The
``-instrprof`` pass will generate the appropriate data structures and
replace the ``llvm.instrprof.value.profile`` intrinsic with a call to the
profile runtime library with the proper arguments.
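
As a minimal sketch of how a frontend might emit this intrinsic for an
indirect call, consider the following. The symbol ``@__profn_foo``, the
hash value, and the function ``@foo`` are hypothetical, and value kind
``0`` is assumed here to denote indirect call targets in
``InstrProfValueKind``.

.. code-block:: llvm

      ; Hypothetical name variable; real frontends choose their own symbol.
      @__profn_foo = private constant [3 x i8] c"foo"

      declare void @llvm.instrprof.value.profile(i8*, i64, i64, i32, i32)

      define void @foo(void ()* %callee) {
        ; The profiled value is the callee address, widened to 64 bits.
        %target = ptrtoint void ()* %callee to i64
        ; Arguments: name, hash (made up), value, value kind, site index.
        call void @llvm.instrprof.value.profile(
            i8* getelementptr inbounds ([3 x i8], [3 x i8]* @__profn_foo, i32 0, i32 0),
            i64 24680, i64 %target, i32 0, i32 0)
        call void %callee()
        ret void
      }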
10795
10796'``llvm.thread.pointer``' Intrinsic
10797^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
10798
10799Syntax:
10800"""""""
10801
10802::
10803
10804      declare i8* @llvm.thread.pointer()
10805
10806Overview:
10807"""""""""
10808
10809The '``llvm.thread.pointer``' intrinsic returns the value of the thread
10810pointer.
10811
10812Semantics:
10813""""""""""
10814
10815The '``llvm.thread.pointer``' intrinsic returns a pointer to the TLS area
10816for the current thread.  The exact semantics of this value are target
10817specific: it may point to the start of TLS area, to the end, or somewhere
10818in the middle.  Depending on the target, this intrinsic may read a register,
10819call a helper function, read from an alternate memory space, or perform
10820other operations necessary to locate the TLS area.  Not all targets support
10821this intrinsic.
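
A minimal sketch of reading the thread pointer is shown below; the
wrapper name ``@get_thread_pointer`` is hypothetical, and the returned
value is only meaningful on targets that support the intrinsic.

.. code-block:: llvm

      declare i8* @llvm.thread.pointer()

      define i8* @get_thread_pointer() {
        ; The interpretation of the returned pointer is target specific.
        %tp = call i8* @llvm.thread.pointer()
        ret i8* %tp
      }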
10822
10823Standard C Library Intrinsics
10824-----------------------------
10825
10826LLVM provides intrinsics for a few important standard C library
10827functions. These intrinsics allow source-language front-ends to pass
10828information about the alignment of the pointer arguments to the code
10829generator, providing opportunity for more efficient code generation.
10830
10831.. _int_memcpy:
10832
10833'``llvm.memcpy``' Intrinsic
10834^^^^^^^^^^^^^^^^^^^^^^^^^^^
10835
10836Syntax:
10837"""""""
10838
10839This is an overloaded intrinsic. You can use ``llvm.memcpy`` on any
10840integer bit width and for different address spaces. Not all targets
10841support all bit widths however.
10842
10843::
10844
10845      declare void @llvm.memcpy.p0i8.p0i8.i32(i8* <dest>, i8* <src>,
10846                                              i32 <len>, i1 <isvolatile>)
10847      declare void @llvm.memcpy.p0i8.p0i8.i64(i8* <dest>, i8* <src>,
10848                                              i64 <len>, i1 <isvolatile>)
10849
10850Overview:
10851"""""""""
10852
10853The '``llvm.memcpy.*``' intrinsics copy a block of memory from the
10854source location to the destination location.
10855
Note that, unlike the standard libc function, the ``llvm.memcpy.*``
intrinsics do not return a value, take an extra ``isvolatile``
argument, and the pointers can be in specified address spaces.
10859
10860Arguments:
10861""""""""""
10862
10863The first argument is a pointer to the destination, the second is a
10864pointer to the source. The third argument is an integer argument
10865specifying the number of bytes to copy, and the fourth is a
10866boolean indicating a volatile access.
10867
10868The :ref:`align <attr_align>` parameter attribute can be provided
10869for the first and second arguments.
10870
10871If the ``isvolatile`` parameter is ``true``, the ``llvm.memcpy`` call is
10872a :ref:`volatile operation <volatile>`. The detailed access behavior is not
10873very cleanly specified and it is unwise to depend on it.
10874
10875Semantics:
10876""""""""""
10877
The '``llvm.memcpy.*``' intrinsics copy a block of memory from the
source location to the destination location, which are not allowed to
overlap. They copy "len" bytes of memory over. If an argument is known
to be aligned to some boundary, this can be specified with the ``align``
parameter attribute on that argument.
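
A minimal sketch of a fixed-size, non-volatile copy follows, assuming
both pointers are known to be 4-byte aligned and do not overlap. The
function name ``@copy16`` is hypothetical; the alignment is conveyed with
the ``align`` parameter attribute at the call site.

.. code-block:: llvm

      declare void @llvm.memcpy.p0i8.p0i8.i64(i8*, i8*, i64, i1)

      define void @copy16(i8* %dst, i8* %src) {
        ; Copy 16 bytes from %src to %dst; the final 'i1 false' means non-volatile.
        call void @llvm.memcpy.p0i8.p0i8.i64(i8* align 4 %dst, i8* align 4 %src,
                                             i64 16, i1 false)
        ret void
      }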
10883
10884.. _int_memmove:
10885
10886'``llvm.memmove``' Intrinsic
10887^^^^^^^^^^^^^^^^^^^^^^^^^^^^
10888
10889Syntax:
10890"""""""
10891
This is an overloaded intrinsic. You can use ``llvm.memmove`` on any
integer bit width and for different address spaces. Not all targets
support all bit widths however.
10895
10896::
10897
10898      declare void @llvm.memmove.p0i8.p0i8.i32(i8* <dest>, i8* <src>,
10899                                               i32 <len>, i1 <isvolatile>)
10900      declare void @llvm.memmove.p0i8.p0i8.i64(i8* <dest>, i8* <src>,
10901                                               i64 <len>, i1 <isvolatile>)
10902
10903Overview:
10904"""""""""
10905
10906The '``llvm.memmove.*``' intrinsics move a block of memory from the
10907source location to the destination location. It is similar to the
10908'``llvm.memcpy``' intrinsic but allows the two memory locations to
10909overlap.
10910
Note that, unlike the standard libc function, the ``llvm.memmove.*``
intrinsics do not return a value, take an extra ``isvolatile``
argument, and the pointers can be in specified address spaces.
10914
10915Arguments:
10916""""""""""
10917
10918The first argument is a pointer to the destination, the second is a
10919pointer to the source. The third argument is an integer argument
10920specifying the number of bytes to copy, and the fourth is a
10921boolean indicating a volatile access.
10922
10923The :ref:`align <attr_align>` parameter attribute can be provided
10924for the first and second arguments.
10925
10926If the ``isvolatile`` parameter is ``true``, the ``llvm.memmove`` call
10927is a :ref:`volatile operation <volatile>`. The detailed access behavior is
10928not very cleanly specified and it is unwise to depend on it.
10929
10930Semantics:
10931""""""""""
10932
The '``llvm.memmove.*``' intrinsics copy a block of memory from the
source location to the destination location, which may overlap. They
copy "len" bytes of memory over. If an argument is known to be
aligned to some boundary, this can be specified with the ``align``
parameter attribute on that argument.
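
A minimal sketch of an overlapping move follows, assuming ``%buf`` points
to at least 16 accessible bytes. The function name ``@shift_left_one`` is
hypothetical. The source and destination ranges overlap, which is why
``llvm.memmove`` is used rather than ``llvm.memcpy``.

.. code-block:: llvm

      declare void @llvm.memmove.p0i8.p0i8.i64(i8*, i8*, i64, i1)

      define void @shift_left_one(i8* %buf) {
        ; Source starts one byte past the destination, so the ranges overlap.
        %src = getelementptr inbounds i8, i8* %buf, i64 1
        call void @llvm.memmove.p0i8.p0i8.i64(i8* %buf, i8* %src,
                                              i64 15, i1 false)
        ret void
      }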
10938
10939.. _int_memset:
10940
10941'``llvm.memset.*``' Intrinsics
10942^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
10943
10944Syntax:
10945"""""""
10946
This is an overloaded intrinsic. You can use ``llvm.memset`` on any
integer bit width and for different address spaces. However, not all
targets support all bit widths.
10950
10951::
10952
10953      declare void @llvm.memset.p0i8.i32(i8* <dest>, i8 <val>,
10954                                         i32 <len>, i1 <isvolatile>)
10955      declare void @llvm.memset.p0i8.i64(i8* <dest>, i8 <val>,
10956                                         i64 <len>, i1 <isvolatile>)
10957
10958Overview:
10959"""""""""
10960
10961The '``llvm.memset.*``' intrinsics fill a block of memory with a
10962particular byte value.
10963
Note that, unlike the standard libc function, the ``llvm.memset``
intrinsic does not return a value and takes an extra ``isvolatile``
argument. Also, the destination can be in an arbitrary address space.
10967
10968Arguments:
10969""""""""""
10970
10971The first argument is a pointer to the destination to fill, the second
10972is the byte value with which to fill it, the third argument is an
10973integer argument specifying the number of bytes to fill, and the fourth
10974is a boolean indicating a volatile access.
10975
The :ref:`align <attr_align>` parameter attribute can be provided
for the first argument.
10978
10979If the ``isvolatile`` parameter is ``true``, the ``llvm.memset`` call is
10980a :ref:`volatile operation <volatile>`. The detailed access behavior is not
10981very cleanly specified and it is unwise to depend on it.
10982
10983Semantics:
10984""""""""""
10985
10986The '``llvm.memset.*``' intrinsics fill "len" bytes of memory starting
10987at the destination location.
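
A minimal sketch of zeroing a buffer follows, assuming ``%buf`` is 8-byte
aligned and at least 32 bytes long; the function name ``@zero32`` is
hypothetical.

.. code-block:: llvm

      declare void @llvm.memset.p0i8.i64(i8*, i8, i64, i1)

      define void @zero32(i8* %buf) {
        ; Fill 32 bytes starting at %buf with the byte value 0, non-volatile.
        call void @llvm.memset.p0i8.i64(i8* align 8 %buf, i8 0,
                                        i64 32, i1 false)
        ret void
      }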
10988
10989'``llvm.sqrt.*``' Intrinsic
10990^^^^^^^^^^^^^^^^^^^^^^^^^^^
10991
10992Syntax:
10993"""""""
10994
10995This is an overloaded intrinsic. You can use ``llvm.sqrt`` on any
10996floating-point or vector of floating-point type. Not all targets support
10997all types however.
10998
10999::
11000
11001      declare float     @llvm.sqrt.f32(float %Val)
11002      declare double    @llvm.sqrt.f64(double %Val)
11003      declare x86_fp80  @llvm.sqrt.f80(x86_fp80 %Val)
11004      declare fp128     @llvm.sqrt.f128(fp128 %Val)
11005      declare ppc_fp128 @llvm.sqrt.ppcf128(ppc_fp128 %Val)
11006
11007Overview:
11008"""""""""
11009
11010The '``llvm.sqrt``' intrinsics return the square root of the specified value.
11011
11012Arguments:
11013""""""""""
11014
11015The argument and return value are floating-point numbers of the same type.
11016
11017Semantics:
11018""""""""""
11019
11020Return the same value as a corresponding libm '``sqrt``' function but without
11021trapping or setting ``errno``. For types specified by IEEE-754, the result
11022matches a conforming libm implementation.
11023
11024When specified with the fast-math-flag 'afn', the result may be approximated
11025using a less accurate calculation.
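
As an illustration, the sketch below applies the 'afn' flag to a single
call; the function name ``@length2d`` is hypothetical. Without the flag,
the call would be expected to produce the libm-equivalent result.

.. code-block:: llvm

      declare float @llvm.sqrt.f32(float)

      define float @length2d(float %x, float %y) {
        %xx = fmul float %x, %x
        %yy = fmul float %y, %y
        %sum = fadd float %xx, %yy
        ; 'afn' permits an approximate square root for this call only.
        %r = call afn float @llvm.sqrt.f32(float %sum)
        ret float %r
      }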
11026
11027'``llvm.powi.*``' Intrinsic
11028^^^^^^^^^^^^^^^^^^^^^^^^^^^
11029
11030Syntax:
11031"""""""
11032
11033This is an overloaded intrinsic. You can use ``llvm.powi`` on any
11034floating-point or vector of floating-point type. Not all targets support
11035all types however.
11036
11037::
11038
11039      declare float     @llvm.powi.f32(float  %Val, i32 %power)
11040      declare double    @llvm.powi.f64(double %Val, i32 %power)
11041      declare x86_fp80  @llvm.powi.f80(x86_fp80  %Val, i32 %power)
11042      declare fp128     @llvm.powi.f128(fp128 %Val, i32 %power)
11043      declare ppc_fp128 @llvm.powi.ppcf128(ppc_fp128  %Val, i32 %power)
11044
11045Overview:
11046"""""""""
11047
11048The '``llvm.powi.*``' intrinsics return the first operand raised to the
11049specified (positive or negative) power. The order of evaluation of
11050multiplications is not defined. When a vector of floating-point type is
11051used, the second argument remains a scalar integer value.
11052
11053Arguments:
11054""""""""""
11055
11056The second argument is an integer power, and the first is a value to
11057raise to that power.
11058
11059Semantics:
11060""""""""""
11061
11062This function returns the first value raised to the second power with an
11063unspecified sequence of rounding operations.
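
A minimal sketch of the vector form follows, assuming the overload suffix
``v4f32`` for a ``<4 x float>`` operand; the function name ``@cube`` is
hypothetical. The power remains a single scalar ``i32`` applied to every
lane.

.. code-block:: llvm

      declare <4 x float> @llvm.powi.v4f32(<4 x float>, i32)

      define <4 x float> @cube(<4 x float> %v) {
        ; Each lane of %v is raised to the same scalar power, 3.
        %r = call <4 x float> @llvm.powi.v4f32(<4 x float> %v, i32 3)
        ret <4 x float> %r
      }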
11064
11065'``llvm.sin.*``' Intrinsic
11066^^^^^^^^^^^^^^^^^^^^^^^^^^
11067
11068Syntax:
11069"""""""
11070
11071This is an overloaded intrinsic. You can use ``llvm.sin`` on any
11072floating-point or vector of floating-point type. Not all targets support
11073all types however.
11074
11075::
11076
11077      declare float     @llvm.sin.f32(float  %Val)
11078      declare double    @llvm.sin.f64(double %Val)
11079      declare x86_fp80  @llvm.sin.f80(x86_fp80  %Val)
11080      declare fp128     @llvm.sin.f128(fp128 %Val)
11081      declare ppc_fp128 @llvm.sin.ppcf128(ppc_fp128  %Val)
11082
11083Overview:
11084"""""""""
11085
11086The '``llvm.sin.*``' intrinsics return the sine of the operand.
11087
11088Arguments:
11089""""""""""
11090
11091The argument and return value are floating-point numbers of the same type.
11092
11093Semantics:
11094""""""""""
11095
11096Return the same value as a corresponding libm '``sin``' function but without
11097trapping or setting ``errno``.
11098
11099When specified with the fast-math-flag 'afn', the result may be approximated
11100using a less accurate calculation.
11101
11102'``llvm.cos.*``' Intrinsic
11103^^^^^^^^^^^^^^^^^^^^^^^^^^
11104
11105Syntax:
11106"""""""
11107
11108This is an overloaded intrinsic. You can use ``llvm.cos`` on any
11109floating-point or vector of floating-point type. Not all targets support
11110all types however.
11111
11112::
11113
11114      declare float     @llvm.cos.f32(float  %Val)
11115      declare double    @llvm.cos.f64(double %Val)
11116      declare x86_fp80  @llvm.cos.f80(x86_fp80  %Val)
11117      declare fp128     @llvm.cos.f128(fp128 %Val)
11118      declare ppc_fp128 @llvm.cos.ppcf128(ppc_fp128  %Val)
11119
11120Overview:
11121"""""""""
11122
11123The '``llvm.cos.*``' intrinsics return the cosine of the operand.
11124
11125Arguments:
11126""""""""""
11127
11128The argument and return value are floating-point numbers of the same type.
11129
11130Semantics:
11131""""""""""
11132
11133Return the same value as a corresponding libm '``cos``' function but without
11134trapping or setting ``errno``.
11135
11136When specified with the fast-math-flag 'afn', the result may be approximated
11137using a less accurate calculation.
11138
11139'``llvm.pow.*``' Intrinsic
11140^^^^^^^^^^^^^^^^^^^^^^^^^^
11141
11142Syntax:
11143"""""""
11144
11145This is an overloaded intrinsic. You can use ``llvm.pow`` on any
11146floating-point or vector of floating-point type. Not all targets support
11147all types however.
11148
11149::
11150
      declare float     @llvm.pow.f32(float  %Val, float %Power)
      declare double    @llvm.pow.f64(double %Val, double %Power)
      declare x86_fp80  @llvm.pow.f80(x86_fp80  %Val, x86_fp80 %Power)
      declare fp128     @llvm.pow.f128(fp128 %Val, fp128 %Power)
      declare ppc_fp128 @llvm.pow.ppcf128(ppc_fp128  %Val, ppc_fp128 %Power)
11156
11157Overview:
11158"""""""""
11159
11160The '``llvm.pow.*``' intrinsics return the first operand raised to the
11161specified (positive or negative) power.
11162
11163Arguments:
11164""""""""""
11165
11166The arguments and return value are floating-point numbers of the same type.
11167
11168Semantics:
11169""""""""""
11170
11171Return the same value as a corresponding libm '``pow``' function but without
11172trapping or setting ``errno``.
11173
11174When specified with the fast-math-flag 'afn', the result may be approximated
11175using a less accurate calculation.
11176
11177'``llvm.exp.*``' Intrinsic
11178^^^^^^^^^^^^^^^^^^^^^^^^^^
11179
11180Syntax:
11181"""""""
11182
11183This is an overloaded intrinsic. You can use ``llvm.exp`` on any
11184floating-point or vector of floating-point type. Not all targets support
11185all types however.
11186
11187::
11188
11189      declare float     @llvm.exp.f32(float  %Val)
11190      declare double    @llvm.exp.f64(double %Val)
11191      declare x86_fp80  @llvm.exp.f80(x86_fp80  %Val)
11192      declare fp128     @llvm.exp.f128(fp128 %Val)
11193      declare ppc_fp128 @llvm.exp.ppcf128(ppc_fp128  %Val)
11194
11195Overview:
11196"""""""""
11197
11198The '``llvm.exp.*``' intrinsics compute the base-e exponential of the specified
11199value.
11200
11201Arguments:
11202""""""""""
11203
11204The argument and return value are floating-point numbers of the same type.
11205
11206Semantics:
11207""""""""""
11208
11209Return the same value as a corresponding libm '``exp``' function but without
11210trapping or setting ``errno``.
11211
11212When specified with the fast-math-flag 'afn', the result may be approximated
11213using a less accurate calculation.
11214
11215'``llvm.exp2.*``' Intrinsic
11216^^^^^^^^^^^^^^^^^^^^^^^^^^^
11217
11218Syntax:
11219"""""""
11220
11221This is an overloaded intrinsic. You can use ``llvm.exp2`` on any
11222floating-point or vector of floating-point type. Not all targets support
11223all types however.
11224
11225::
11226
11227      declare float     @llvm.exp2.f32(float  %Val)
11228      declare double    @llvm.exp2.f64(double %Val)
11229      declare x86_fp80  @llvm.exp2.f80(x86_fp80  %Val)
11230      declare fp128     @llvm.exp2.f128(fp128 %Val)
11231      declare ppc_fp128 @llvm.exp2.ppcf128(ppc_fp128  %Val)
11232
11233Overview:
11234"""""""""
11235
11236The '``llvm.exp2.*``' intrinsics compute the base-2 exponential of the
11237specified value.
11238
11239Arguments:
11240""""""""""
11241
11242The argument and return value are floating-point numbers of the same type.
11243
11244Semantics:
11245""""""""""
11246
11247Return the same value as a corresponding libm '``exp2``' function but without
11248trapping or setting ``errno``.
11249
11250When specified with the fast-math-flag 'afn', the result may be approximated
11251using a less accurate calculation.
11252
11253'``llvm.log.*``' Intrinsic
11254^^^^^^^^^^^^^^^^^^^^^^^^^^
11255
11256Syntax:
11257"""""""
11258
11259This is an overloaded intrinsic. You can use ``llvm.log`` on any
11260floating-point or vector of floating-point type. Not all targets support
11261all types however.
11262
11263::
11264
11265      declare float     @llvm.log.f32(float  %Val)
11266      declare double    @llvm.log.f64(double %Val)
11267      declare x86_fp80  @llvm.log.f80(x86_fp80  %Val)
11268      declare fp128     @llvm.log.f128(fp128 %Val)
11269      declare ppc_fp128 @llvm.log.ppcf128(ppc_fp128  %Val)
11270
11271Overview:
11272"""""""""
11273
11274The '``llvm.log.*``' intrinsics compute the base-e logarithm of the specified
11275value.
11276
11277Arguments:
11278""""""""""
11279
11280The argument and return value are floating-point numbers of the same type.
11281
11282Semantics:
11283""""""""""
11284
11285Return the same value as a corresponding libm '``log``' function but without
11286trapping or setting ``errno``.
11287
11288When specified with the fast-math-flag 'afn', the result may be approximated
11289using a less accurate calculation.
11290
11291'``llvm.log10.*``' Intrinsic
11292^^^^^^^^^^^^^^^^^^^^^^^^^^^^
11293
11294Syntax:
11295"""""""
11296
11297This is an overloaded intrinsic. You can use ``llvm.log10`` on any
11298floating-point or vector of floating-point type. Not all targets support
11299all types however.
11300
11301::
11302
11303      declare float     @llvm.log10.f32(float  %Val)
11304      declare double    @llvm.log10.f64(double %Val)
11305      declare x86_fp80  @llvm.log10.f80(x86_fp80  %Val)
11306      declare fp128     @llvm.log10.f128(fp128 %Val)
11307      declare ppc_fp128 @llvm.log10.ppcf128(ppc_fp128  %Val)
11308
11309Overview:
11310"""""""""
11311
11312The '``llvm.log10.*``' intrinsics compute the base-10 logarithm of the
11313specified value.
11314
11315Arguments:
11316""""""""""
11317
11318The argument and return value are floating-point numbers of the same type.
11319
11320Semantics:
11321""""""""""
11322
11323Return the same value as a corresponding libm '``log10``' function but without
11324trapping or setting ``errno``.
11325
11326When specified with the fast-math-flag 'afn', the result may be approximated
11327using a less accurate calculation.
11328
11329'``llvm.log2.*``' Intrinsic
11330^^^^^^^^^^^^^^^^^^^^^^^^^^^
11331
11332Syntax:
11333"""""""
11334
11335This is an overloaded intrinsic. You can use ``llvm.log2`` on any
11336floating-point or vector of floating-point type. Not all targets support
11337all types however.
11338
11339::
11340
11341      declare float     @llvm.log2.f32(float  %Val)
11342      declare double    @llvm.log2.f64(double %Val)
11343      declare x86_fp80  @llvm.log2.f80(x86_fp80  %Val)
11344      declare fp128     @llvm.log2.f128(fp128 %Val)
11345      declare ppc_fp128 @llvm.log2.ppcf128(ppc_fp128  %Val)
11346
11347Overview:
11348"""""""""
11349
11350The '``llvm.log2.*``' intrinsics compute the base-2 logarithm of the specified
11351value.
11352
11353Arguments:
11354""""""""""
11355
11356The argument and return value are floating-point numbers of the same type.
11357
11358Semantics:
11359""""""""""
11360
11361Return the same value as a corresponding libm '``log2``' function but without
11362trapping or setting ``errno``.
11363
11364When specified with the fast-math-flag 'afn', the result may be approximated
11365using a less accurate calculation.
11366
11367'``llvm.fma.*``' Intrinsic
11368^^^^^^^^^^^^^^^^^^^^^^^^^^
11369
11370Syntax:
11371"""""""
11372
11373This is an overloaded intrinsic. You can use ``llvm.fma`` on any
11374floating-point or vector of floating-point type. Not all targets support
11375all types however.
11376
11377::
11378
11379      declare float     @llvm.fma.f32(float  %a, float  %b, float  %c)
11380      declare double    @llvm.fma.f64(double %a, double %b, double %c)
11381      declare x86_fp80  @llvm.fma.f80(x86_fp80 %a, x86_fp80 %b, x86_fp80 %c)
11382      declare fp128     @llvm.fma.f128(fp128 %a, fp128 %b, fp128 %c)
11383      declare ppc_fp128 @llvm.fma.ppcf128(ppc_fp128 %a, ppc_fp128 %b, ppc_fp128 %c)
11384
11385Overview:
11386"""""""""
11387
11388The '``llvm.fma.*``' intrinsics perform the fused multiply-add operation.
11389
11390Arguments:
11391""""""""""
11392
11393The arguments and return value are floating-point numbers of the same type.
11394
11395Semantics:
11396""""""""""
11397
11398Return the same value as a corresponding libm '``fma``' function but without
11399trapping or setting ``errno``.
11400
11401When specified with the fast-math-flag 'afn', the result may be approximated
11402using a less accurate calculation.
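
A minimal sketch of a fused multiply-add follows; the function name
``@axpy`` is hypothetical. As with the libm ``fma`` function, the product
and sum are computed as one operation with a single rounding.

.. code-block:: llvm

      declare double @llvm.fma.f64(double, double, double)

      define double @axpy(double %a, double %x, double %y) {
        ; Computes (%a * %x) + %y as one fused operation.
        %r = call double @llvm.fma.f64(double %a, double %x, double %y)
        ret double %r
      }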
11403
11404'``llvm.fabs.*``' Intrinsic
11405^^^^^^^^^^^^^^^^^^^^^^^^^^^
11406
11407Syntax:
11408"""""""
11409
11410This is an overloaded intrinsic. You can use ``llvm.fabs`` on any
11411floating-point or vector of floating-point type. Not all targets support
11412all types however.
11413
11414::
11415
11416      declare float     @llvm.fabs.f32(float  %Val)
11417      declare double    @llvm.fabs.f64(double %Val)
11418      declare x86_fp80  @llvm.fabs.f80(x86_fp80 %Val)
11419      declare fp128     @llvm.fabs.f128(fp128 %Val)
11420      declare ppc_fp128 @llvm.fabs.ppcf128(ppc_fp128 %Val)
11421
11422Overview:
11423"""""""""
11424
11425The '``llvm.fabs.*``' intrinsics return the absolute value of the
11426operand.
11427
11428Arguments:
11429""""""""""
11430
11431The argument and return value are floating-point numbers of the same
11432type.
11433
11434Semantics:
11435""""""""""
11436
11437This function returns the same values as the libm ``fabs`` functions
11438would, and handles error conditions in the same way.
11439
11440'``llvm.minnum.*``' Intrinsic
11441^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
11442
11443Syntax:
11444"""""""
11445
11446This is an overloaded intrinsic. You can use ``llvm.minnum`` on any
11447floating-point or vector of floating-point type. Not all targets support
11448all types however.
11449
11450::
11451
11452      declare float     @llvm.minnum.f32(float %Val0, float %Val1)
11453      declare double    @llvm.minnum.f64(double %Val0, double %Val1)
11454      declare x86_fp80  @llvm.minnum.f80(x86_fp80 %Val0, x86_fp80 %Val1)
11455      declare fp128     @llvm.minnum.f128(fp128 %Val0, fp128 %Val1)
11456      declare ppc_fp128 @llvm.minnum.ppcf128(ppc_fp128 %Val0, ppc_fp128 %Val1)
11457
11458Overview:
11459"""""""""
11460
11461The '``llvm.minnum.*``' intrinsics return the minimum of the two
11462arguments.
11463
11464
11465Arguments:
11466""""""""""
11467
11468The arguments and return value are floating-point numbers of the same
11469type.
11470
11471Semantics:
11472""""""""""
11473
Follows the IEEE-754 semantics for minNum, which also match those of
libm's ``fmin``.
11476
11477If either operand is a NaN, returns the other non-NaN operand. Returns
11478NaN only if both operands are NaN. If the operands compare equal,
11479returns a value that compares equal to both operands. This means that
11480fmin(+/-0.0, +/-0.0) could return either -0.0 or 0.0.
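
The NaN rule above can be sketched as follows; the function name
``@min_with_nan`` is hypothetical. The constant ``0x7FF8000000000000``
denotes a quiet NaN.

.. code-block:: llvm

      declare float @llvm.minnum.f32(float, float)

      define float @min_with_nan(float %x) {
        ; One operand is a NaN, so the call returns %x when %x is not a NaN.
        %r = call float @llvm.minnum.f32(float %x, float 0x7FF8000000000000)
        ret float %r
      }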
11481
11482'``llvm.maxnum.*``' Intrinsic
11483^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
11484
11485Syntax:
11486"""""""
11487
11488This is an overloaded intrinsic. You can use ``llvm.maxnum`` on any
11489floating-point or vector of floating-point type. Not all targets support
11490all types however.
11491
11492::
11493
      declare float     @llvm.maxnum.f32(float  %Val0, float  %Val1)
      declare double    @llvm.maxnum.f64(double %Val0, double %Val1)
      declare x86_fp80  @llvm.maxnum.f80(x86_fp80  %Val0, x86_fp80  %Val1)
      declare fp128     @llvm.maxnum.f128(fp128 %Val0, fp128 %Val1)
      declare ppc_fp128 @llvm.maxnum.ppcf128(ppc_fp128  %Val0, ppc_fp128  %Val1)
11499
11500Overview:
11501"""""""""
11502
11503The '``llvm.maxnum.*``' intrinsics return the maximum of the two
11504arguments.
11505
11506
11507Arguments:
11508""""""""""
11509
11510The arguments and return value are floating-point numbers of the same
11511type.
11512
11513Semantics:
11514""""""""""
Follows the IEEE-754 semantics for maxNum, which also match those of
libm's ``fmax``.
11517
11518If either operand is a NaN, returns the other non-NaN operand. Returns
11519NaN only if both operands are NaN. If the operands compare equal,
11520returns a value that compares equal to both operands. This means that
11521fmax(+/-0.0, +/-0.0) could return either -0.0 or 0.0.
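
As a sketch of how the two intrinsics combine, the hypothetical function
``@clamp01`` below clamps a value to the range [0.0, 1.0]. Because a NaN
operand is ignored in favor of the other operand, a NaN input yields 0.0
here.

.. code-block:: llvm

      declare float @llvm.minnum.f32(float, float)
      declare float @llvm.maxnum.f32(float, float)

      define float @clamp01(float %x) {
        ; Raise values below 0.0 up to 0.0 (a NaN %x also becomes 0.0).
        %lo = call float @llvm.maxnum.f32(float %x, float 0.000000e+00)
        ; Lower values above 1.0 down to 1.0.
        %r = call float @llvm.minnum.f32(float %lo, float 1.000000e+00)
        ret float %r
      }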
11522
11523'``llvm.copysign.*``' Intrinsic
11524^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
11525
11526Syntax:
11527"""""""
11528
11529This is an overloaded intrinsic. You can use ``llvm.copysign`` on any
11530floating-point or vector of floating-point type. Not all targets support
11531all types however.
11532
11533::
11534
11535      declare float     @llvm.copysign.f32(float  %Mag, float  %Sgn)
11536      declare double    @llvm.copysign.f64(double %Mag, double %Sgn)
11537      declare x86_fp80  @llvm.copysign.f80(x86_fp80  %Mag, x86_fp80  %Sgn)
11538      declare fp128     @llvm.copysign.f128(fp128 %Mag, fp128 %Sgn)
11539      declare ppc_fp128 @llvm.copysign.ppcf128(ppc_fp128  %Mag, ppc_fp128  %Sgn)
11540
11541Overview:
11542"""""""""
11543
11544The '``llvm.copysign.*``' intrinsics return a value with the magnitude of the
11545first operand and the sign of the second operand.
11546
11547Arguments:
11548""""""""""
11549
11550The arguments and return value are floating-point numbers of the same
11551type.
11552
11553Semantics:
11554""""""""""
11555
11556This function returns the same values as the libm ``copysign``
11557functions would, and handles error conditions in the same way.
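
A minimal sketch of extracting the sign of a value follows; the function
name ``@sign_of`` is hypothetical. The result has magnitude 1.0 and the
sign of ``%x``, so a negative zero input yields -1.0.

.. code-block:: llvm

      declare double @llvm.copysign.f64(double, double)

      define double @sign_of(double %x) {
        ; Magnitude comes from the first operand, sign from the second.
        %r = call double @llvm.copysign.f64(double 1.000000e+00, double %x)
        ret double %r
      }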
11558
11559'``llvm.floor.*``' Intrinsic
11560^^^^^^^^^^^^^^^^^^^^^^^^^^^^
11561
11562Syntax:
11563"""""""
11564
11565This is an overloaded intrinsic. You can use ``llvm.floor`` on any
11566floating-point or vector of floating-point type. Not all targets support
11567all types however.
11568
11569::
11570
11571      declare float     @llvm.floor.f32(float  %Val)
11572      declare double    @llvm.floor.f64(double %Val)
11573      declare x86_fp80  @llvm.floor.f80(x86_fp80  %Val)
11574      declare fp128     @llvm.floor.f128(fp128 %Val)
11575      declare ppc_fp128 @llvm.floor.ppcf128(ppc_fp128  %Val)
11576
11577Overview:
11578"""""""""
11579
11580The '``llvm.floor.*``' intrinsics return the floor of the operand.
11581
11582Arguments:
11583""""""""""
11584
11585The argument and return value are floating-point numbers of the same
11586type.
11587
11588Semantics:
11589""""""""""
11590
11591This function returns the same values as the libm ``floor`` functions
11592would, and handles error conditions in the same way.
11593
11594'``llvm.ceil.*``' Intrinsic
11595^^^^^^^^^^^^^^^^^^^^^^^^^^^
11596
11597Syntax:
11598"""""""
11599
11600This is an overloaded intrinsic. You can use ``llvm.ceil`` on any
11601floating-point or vector of floating-point type. Not all targets support
11602all types however.
11603
11604::
11605
11606      declare float     @llvm.ceil.f32(float  %Val)
11607      declare double    @llvm.ceil.f64(double %Val)
11608      declare x86_fp80  @llvm.ceil.f80(x86_fp80  %Val)
11609      declare fp128     @llvm.ceil.f128(fp128 %Val)
11610      declare ppc_fp128 @llvm.ceil.ppcf128(ppc_fp128  %Val)
11611
11612Overview:
11613"""""""""
11614
11615The '``llvm.ceil.*``' intrinsics return the ceiling of the operand.
11616
11617Arguments:
11618""""""""""
11619
11620The argument and return value are floating-point numbers of the same
11621type.
11622
11623Semantics:
11624""""""""""
11625
11626This function returns the same values as the libm ``ceil`` functions
11627would, and handles error conditions in the same way.
11628
11629'``llvm.trunc.*``' Intrinsic
11630^^^^^^^^^^^^^^^^^^^^^^^^^^^^
11631
11632Syntax:
11633"""""""
11634
11635This is an overloaded intrinsic. You can use ``llvm.trunc`` on any
11636floating-point or vector of floating-point type. Not all targets support
11637all types however.
11638
11639::
11640
11641      declare float     @llvm.trunc.f32(float  %Val)
11642      declare double    @llvm.trunc.f64(double %Val)
11643      declare x86_fp80  @llvm.trunc.f80(x86_fp80  %Val)
11644      declare fp128     @llvm.trunc.f128(fp128 %Val)
11645      declare ppc_fp128 @llvm.trunc.ppcf128(ppc_fp128  %Val)
11646
11647Overview:
11648"""""""""
11649
The '``llvm.trunc.*``' intrinsics return the operand rounded to the
nearest integer not larger in magnitude than the operand.
11652
11653Arguments:
11654""""""""""
11655
11656The argument and return value are floating-point numbers of the same
11657type.
11658
11659Semantics:
11660""""""""""
11661
11662This function returns the same values as the libm ``trunc`` functions
11663would, and handles error conditions in the same way.
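
The difference between '``llvm.floor``', '``llvm.ceil``', and
'``llvm.trunc``' shows up for negative non-integer inputs. The sketch
below applies all three to -2.5; the function name ``@round_directions``
is hypothetical, and ``%out`` is assumed to point to at least three
floats.

.. code-block:: llvm

      declare float @llvm.floor.f32(float)
      declare float @llvm.ceil.f32(float)
      declare float @llvm.trunc.f32(float)

      define void @round_directions(float* %out) {
        %f = call float @llvm.floor.f32(float -2.500000e+00) ; -3.0, toward -infinity
        %c = call float @llvm.ceil.f32(float -2.500000e+00)  ; -2.0, toward +infinity
        %t = call float @llvm.trunc.f32(float -2.500000e+00) ; -2.0, toward zero
        store float %f, float* %out
        %p1 = getelementptr inbounds float, float* %out, i64 1
        store float %c, float* %p1
        %p2 = getelementptr inbounds float, float* %out, i64 2
        store float %t, float* %p2
        ret void
      }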
11664
11665'``llvm.rint.*``' Intrinsic
11666^^^^^^^^^^^^^^^^^^^^^^^^^^^
11667
11668Syntax:
11669"""""""
11670
11671This is an overloaded intrinsic. You can use ``llvm.rint`` on any
11672floating-point or vector of floating-point type. Not all targets support
11673all types however.
11674
11675::
11676
11677      declare float     @llvm.rint.f32(float  %Val)
11678      declare double    @llvm.rint.f64(double %Val)
11679      declare x86_fp80  @llvm.rint.f80(x86_fp80  %Val)
11680      declare fp128     @llvm.rint.f128(fp128 %Val)
11681      declare ppc_fp128 @llvm.rint.ppcf128(ppc_fp128  %Val)
11682
11683Overview:
11684"""""""""
11685
The '``llvm.rint.*``' intrinsics return the operand rounded to the
nearest integer. They may raise an inexact floating-point exception if
the operand isn't an integer.
11689
11690Arguments:
11691""""""""""
11692
11693The argument and return value are floating-point numbers of the same
11694type.
11695
11696Semantics:
11697""""""""""
11698
11699This function returns the same values as the libm ``rint`` functions
11700would, and handles error conditions in the same way.
11701
11702'``llvm.nearbyint.*``' Intrinsic
11703^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
11704
11705Syntax:
11706"""""""
11707
11708This is an overloaded intrinsic. You can use ``llvm.nearbyint`` on any
11709floating-point or vector of floating-point type. Not all targets support
11710all types however.
11711
11712::
11713
11714      declare float     @llvm.nearbyint.f32(float  %Val)
11715      declare double    @llvm.nearbyint.f64(double %Val)
11716      declare x86_fp80  @llvm.nearbyint.f80(x86_fp80  %Val)
11717      declare fp128     @llvm.nearbyint.f128(fp128 %Val)
11718      declare ppc_fp128 @llvm.nearbyint.ppcf128(ppc_fp128  %Val)
11719
11720Overview:
11721"""""""""
11722
The '``llvm.nearbyint.*``' intrinsics return the operand rounded to the
nearest integer. Like the libm ``nearbyint`` functions, they do not raise
an inexact floating-point exception.
11725
11726Arguments:
11727""""""""""
11728
11729The argument and return value are floating-point numbers of the same
11730type.
11731
11732Semantics:
11733""""""""""
11734
11735This function returns the same values as the libm ``nearbyint``
11736functions would, and handles error conditions in the same way.
11737
11738'``llvm.round.*``' Intrinsic
11739^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
11740
11741Syntax:
11742"""""""
11743
11744This is an overloaded intrinsic. You can use ``llvm.round`` on any
11745floating-point or vector of floating-point type. Not all targets support
11746all types however.
11747
11748::
11749
11750      declare float     @llvm.round.f32(float  %Val)
11751      declare double    @llvm.round.f64(double %Val)
11752      declare x86_fp80  @llvm.round.f80(x86_fp80  %Val)
11753      declare fp128     @llvm.round.f128(fp128 %Val)
11754      declare ppc_fp128 @llvm.round.ppcf128(ppc_fp128  %Val)
11755
11756Overview:
11757"""""""""
11758
The '``llvm.round.*``' intrinsics return the operand rounded to the
nearest integer, with halfway cases rounded away from zero as in libm
``round``.
11761
11762Arguments:
11763""""""""""
11764
11765The argument and return value are floating-point numbers of the same
11766type.
11767
11768Semantics:
11769""""""""""
11770
11771This function returns the same values as the libm ``round``
11772functions would, and handles error conditions in the same way.
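
The tie-breaking rule is what separates ``round`` from ``nearbyint``. The
following Python sketch is only a reference model of the libm behaviour, not
how LLVM lowers the intrinsic; ``round_half_away`` is a hypothetical helper
name. It assumes that libm ``round`` rounds halfway cases away from zero, and
that ``nearbyint`` in the default rounding mode rounds them to even, which is
also what Python's built-in ``round`` does.

.. code-block:: python

      import math

      def round_half_away(x):
          """Model of libm round(): nearest integer, ties away from zero."""
          if not math.isfinite(x):
              return x                       # NaN and infinities pass through
          t = math.trunc(x)                  # integer part, toward zero
          if abs(x - t) >= 0.5:              # the fractional part is exact
              t += 1 if x > 0 else -1
          return math.copysign(float(t), x)  # keep the sign of zero

      # Ties away from zero versus ties to even.
      print(round_half_away(2.5), float(round(2.5)))    # 3.0 2.0
      print(round_half_away(-2.5), float(round(-2.5)))  # -3.0 -2.0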
11773
11774Bit Manipulation Intrinsics
11775---------------------------
11776
11777LLVM provides intrinsics for a few important bit manipulation
11778operations. These allow efficient code generation for some algorithms.
11779
11780'``llvm.bitreverse.*``' Intrinsics
11781^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
11782
11783Syntax:
11784"""""""
11785
This is an overloaded intrinsic function. You can use ``llvm.bitreverse``
on any integer type.
11788
11789::
11790
11791      declare i16 @llvm.bitreverse.i16(i16 <id>)
11792      declare i32 @llvm.bitreverse.i32(i32 <id>)
11793      declare i64 @llvm.bitreverse.i64(i64 <id>)
11794
11795Overview:
11796"""""""""
11797
11798The '``llvm.bitreverse``' family of intrinsics is used to reverse the
11799bitpattern of an integer value; for example ``0b10110110`` becomes
11800``0b01101101``.
11801
11802Semantics:
11803""""""""""
11804
The ``llvm.bitreverse.iN`` intrinsic returns an iN value that has bit
``M`` in the input moved to bit ``N-M-1`` in the output.
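
As an illustration, a bit reversal of an N-bit value swaps bit 0 with bit
N-1, so each bit ``M`` lands at index ``N-M-1``. The Python sketch below is a
reference model under that reading, not LLVM's implementation; the helper
name ``bitreverse`` and its ``bits`` parameter are assumptions made for this
example.

.. code-block:: python

      def bitreverse(value, bits):
          """Move bit M of an unsigned iN value to bit N-M-1 (N == bits)."""
          result = 0
          for m in range(bits):
              if (value >> m) & 1:
                  result |= 1 << (bits - m - 1)
          return result

      # The example from the overview.
      print(format(bitreverse(0b10110110, 8), "08b"))  # 01101101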
11807
11808'``llvm.bswap.*``' Intrinsics
11809^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
11810
11811Syntax:
11812"""""""
11813
This is an overloaded intrinsic function. You can use ``llvm.bswap`` on any
integer type that is an even number of bytes (i.e. ``BitWidth % 16 == 0``).
11816
11817::
11818
11819      declare i16 @llvm.bswap.i16(i16 <id>)
11820      declare i32 @llvm.bswap.i32(i32 <id>)
11821      declare i64 @llvm.bswap.i64(i64 <id>)
11822
11823Overview:
11824"""""""""
11825
11826The '``llvm.bswap``' family of intrinsics is used to byte swap integer
11827values with an even number of bytes (positive multiple of 16 bits).
11828These are useful for performing operations on data that is not in the
11829target's native byte order.
11830
11831Semantics:
11832""""""""""
11833
11834The ``llvm.bswap.i16`` intrinsic returns an i16 value that has the high
11835and low byte of the input i16 swapped. Similarly, the ``llvm.bswap.i32``
11836intrinsic returns an i32 value that has the four bytes of the input i32
11837swapped, so that if the input bytes are numbered 0, 1, 2, 3 then the
11838returned i32 will have its bytes in 3, 2, 1, 0 order. The
11839``llvm.bswap.i48``, ``llvm.bswap.i64`` and other intrinsics extend this
11840concept to additional even-byte lengths (6 bytes, 8 bytes and more,
11841respectively).
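
The byte reordering can be modelled in a few lines of Python. This is a
sketch of the semantics only; the helper name ``bswap`` and the explicit
``bits`` parameter are assumptions of the example, and values are treated as
unsigned.

.. code-block:: python

      def bswap(value, bits):
          """Reverse the byte order of an unsigned iN value (N == bits)."""
          if bits <= 0 or bits % 16 != 0:
              raise ValueError("bswap needs a positive multiple of 16 bits")
          # Serialize least significant byte first, then read it back
          # most significant byte first.
          return int.from_bytes(value.to_bytes(bits // 8, "little"), "big")

      print(hex(bswap(0x1234, 16)))      # 0x3412
      print(hex(bswap(0x01020304, 32)))  # 0x4030201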
11842
11843'``llvm.ctpop.*``' Intrinsic
11844^^^^^^^^^^^^^^^^^^^^^^^^^^^^
11845
11846Syntax:
11847"""""""
11848
This is an overloaded intrinsic. You can use ``llvm.ctpop`` on any integer
bit width, or on any vector with integer elements. Not all targets
support all bit widths or vector types, however.
11852
11853::
11854
11855      declare i8 @llvm.ctpop.i8(i8  <src>)
11856      declare i16 @llvm.ctpop.i16(i16 <src>)
11857      declare i32 @llvm.ctpop.i32(i32 <src>)
11858      declare i64 @llvm.ctpop.i64(i64 <src>)
11859      declare i256 @llvm.ctpop.i256(i256 <src>)
11860      declare <2 x i32> @llvm.ctpop.v2i32(<2 x i32> <src>)
11861
11862Overview:
11863"""""""""
11864
11865The '``llvm.ctpop``' family of intrinsics counts the number of bits set
11866in a value.
11867
11868Arguments:
11869""""""""""
11870
11871The only argument is the value to be counted. The argument may be of any
11872integer type, or a vector with integer elements. The return type must
11873match the argument type.
11874
11875Semantics:
11876""""""""""
11877
11878The '``llvm.ctpop``' intrinsic counts the 1's in a variable, or within
11879each element of a vector.
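
A minimal Python model of the scalar and per-element behaviour follows. The
helper names ``ctpop`` and ``ctpop_vector`` are hypothetical, and a vector is
represented as a plain list of unsigned element values.

.. code-block:: python

      def ctpop(value, bits):
          """Count the set bits of an unsigned iN value (N == bits)."""
          return bin(value & ((1 << bits) - 1)).count("1")

      def ctpop_vector(values, bits):
          """Apply ctpop to each element, as for a vector operand."""
          return [ctpop(v, bits) for v in values]

      print(ctpop(0b10110110, 8))        # 5
      print(ctpop_vector([0, 255], 32))  # [0, 8]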
11880
11881'``llvm.ctlz.*``' Intrinsic
11882^^^^^^^^^^^^^^^^^^^^^^^^^^^
11883
11884Syntax:
11885"""""""
11886
11887This is an overloaded intrinsic. You can use ``llvm.ctlz`` on any
11888integer bit width, or any vector whose elements are integers. Not all
11889targets support all bit widths or vector types, however.
11890
11891::
11892
11893      declare i8   @llvm.ctlz.i8  (i8   <src>, i1 <is_zero_undef>)
11894      declare i16  @llvm.ctlz.i16 (i16  <src>, i1 <is_zero_undef>)
11895      declare i32  @llvm.ctlz.i32 (i32  <src>, i1 <is_zero_undef>)
11896      declare i64  @llvm.ctlz.i64 (i64  <src>, i1 <is_zero_undef>)
11897      declare i256 @llvm.ctlz.i256(i256 <src>, i1 <is_zero_undef>)
11898      declare <2 x i32> @llvm.ctlz.v2i32(<2 x i32> <src>, i1 <is_zero_undef>)
11899
11900Overview:
11901"""""""""
11902
11903The '``llvm.ctlz``' family of intrinsic functions counts the number of
11904leading zeros in a variable.
11905
11906Arguments:
11907""""""""""
11908
11909The first argument is the value to be counted. This argument may be of
11910any integer type, or a vector with integer element type. The return
11911type must match the first argument type.
11912
The second argument must be a constant and is a flag that indicates
whether a zero first argument may produce an undefined result. If the
flag is 0, a zero input produces a defined result; if it is 1, the
result for a zero input is ``undef``. Historically some architectures
did not provide a defined result for zero values as efficiently, and
many algorithms are now predicated on avoiding zero-value inputs.
11918
11919Semantics:
11920""""""""""
11921
11922The '``llvm.ctlz``' intrinsic counts the leading (most significant)
11923zeros in a variable, or within each element of the vector. If
11924``src == 0`` then the result is the size in bits of the type of ``src``
11925if ``is_zero_undef == 0`` and ``undef`` otherwise. For example,
11926``llvm.ctlz(i32 2) = 30``.
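
The rule above, including the zero case, can be sketched in Python. This is a
reference model only; the helper name ``ctlz`` is an assumption, and ``None``
is used here as a stand-in for ``undef``, which has no Python equivalent.

.. code-block:: python

      def ctlz(value, bits, is_zero_undef=False):
          """Count leading zeros of an unsigned iN value (N == bits)."""
          value &= (1 << bits) - 1
          if value == 0:
              # A zero input yields the bit width, or undef (None here).
              return None if is_zero_undef else bits
          return bits - value.bit_length()

      print(ctlz(2, 32))        # 30
      print(ctlz(0, 32))        # 32
      print(ctlz(0, 32, True))  # None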
11927
11928'``llvm.cttz.*``' Intrinsic
11929^^^^^^^^^^^^^^^^^^^^^^^^^^^
11930
11931Syntax:
11932"""""""
11933
11934This is an overloaded intrinsic. You can use ``llvm.cttz`` on any
11935integer bit width, or any vector of integer elements. Not all targets
11936support all bit widths or vector types, however.
11937
11938::
11939
11940      declare i8   @llvm.cttz.i8  (i8   <src>, i1 <is_zero_undef>)
11941      declare i16  @llvm.cttz.i16 (i16  <src>, i1 <is_zero_undef>)
11942      declare i32  @llvm.cttz.i32 (i32  <src>, i1 <is_zero_undef>)
11943      declare i64  @llvm.cttz.i64 (i64  <src>, i1 <is_zero_undef>)
11944      declare i256 @llvm.cttz.i256(i256 <src>, i1 <is_zero_undef>)
11945      declare <2 x i32> @llvm.cttz.v2i32(<2 x i32> <src>, i1 <is_zero_undef>)
11946
11947Overview:
11948"""""""""
11949
11950The '``llvm.cttz``' family of intrinsic functions counts the number of
11951trailing zeros.
11952
11953Arguments:
11954""""""""""
11955
11956The first argument is the value to be counted. This argument may be of
11957any integer type, or a vector with integer element type. The return
11958type must match the first argument type.
11959
The second argument must be a constant and is a flag that indicates
whether a zero first argument may produce an undefined result. If the
flag is 0, a zero input produces a defined result; if it is 1, the
result for a zero input is ``undef``. Historically some architectures
did not provide a defined result for zero values as efficiently, and
many algorithms are now predicated on avoiding zero-value inputs.
11965
11966Semantics:
11967""""""""""
11968
11969The '``llvm.cttz``' intrinsic counts the trailing (least significant)
11970zeros in a variable, or within each element of a vector. If ``src == 0``
11971then the result is the size in bits of the type of ``src`` if
11972``is_zero_undef == 0`` and ``undef`` otherwise. For example,
11973``llvm.cttz(2) = 1``.
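
A Python sketch of the same rule for trailing zeros is shown below. It is a
reference model, not LLVM's implementation; the helper name ``cttz`` is an
assumption, and ``None`` stands in for ``undef``.

.. code-block:: python

      def cttz(value, bits, is_zero_undef=False):
          """Count trailing zeros of an unsigned iN value (N == bits)."""
          value &= (1 << bits) - 1
          if value == 0:
              # A zero input yields the bit width, or undef (None here).
              return None if is_zero_undef else bits
          lowest_set = value & -value    # isolates the lowest set bit
          return lowest_set.bit_length() - 1

      print(cttz(2, 32))        # 1
      print(cttz(0, 32))        # 32
      print(cttz(0, 32, True))  # None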
11974
11975.. _int_overflow:
11976
11977'``llvm.fshl.*``' Intrinsic
11978^^^^^^^^^^^^^^^^^^^^^^^^^^^
11979
11980Syntax:
11981"""""""
11982
11983This is an overloaded intrinsic. You can use ``llvm.fshl`` on any
11984integer bit width or any vector of integer elements. Not all targets
11985support all bit widths or vector types, however.
11986
11987::
11988
11989      declare i8  @llvm.fshl.i8 (i8 %a, i8 %b, i8 %c)
11990      declare i67 @llvm.fshl.i67(i67 %a, i67 %b, i67 %c)
11991      declare <2 x i32> @llvm.fshl.v2i32(<2 x i32> %a, <2 x i32> %b, <2 x i32> %c)
11992
11993Overview:
11994"""""""""
11995
11996The '``llvm.fshl``' family of intrinsic functions performs a funnel shift left:
11997the first two values are concatenated as { %a : %b } (%a is the most significant
11998bits of the wide value), the combined value is shifted left, and the most
11999significant bits are extracted to produce a result that is the same size as the
12000original arguments. If the first 2 arguments are identical, this is equivalent
12001to a rotate left operation. For vector types, the operation occurs for each
12002element of the vector. The shift argument is treated as an unsigned amount
12003modulo the element size of the arguments.
12004
12005Arguments:
12006""""""""""
12007
12008The first two arguments are the values to be concatenated. The third
12009argument is the shift amount. The arguments may be any integer type or a
12010vector with integer element type. All arguments and the return value must
12011have the same type.
12012
12013Example:
12014""""""""
12015
12016.. code-block:: text
12017
12018      %r = call i8 @llvm.fshl.i8(i8 %x, i8 %y, i8 %z)  ; %r = i8: msb_extract((concat(x, y) << (z % 8)), 8)
12019      %r = call i8 @llvm.fshl.i8(i8 255, i8 0, i8 15)  ; %r = i8: 128 (0b10000000)
12020      %r = call i8 @llvm.fshl.i8(i8 15, i8 15, i8 11)  ; %r = i8: 120 (0b01111000)
12021      %r = call i8 @llvm.fshl.i8(i8 0, i8 255, i8 8)   ; %r = i8: 0   (0b00000000)
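
The ``msb_extract`` and ``concat`` notation used in these examples can be
written out as a short Python reference model. The helper name ``fshl`` and
the explicit ``bits`` parameter are assumptions of this sketch, and values
are treated as unsigned.

.. code-block:: python

      def fshl(a, b, c, bits):
          """Funnel shift left of unsigned iN values (N == bits)."""
          mask = (1 << bits) - 1
          shift = c % bits                          # amount is taken modulo N
          wide = ((a & mask) << bits) | (b & mask)  # concat(a, b), a on top
          shifted = (wide << shift) & ((1 << (2 * bits)) - 1)
          return shifted >> bits                    # most significant N bits

      print(fshl(255, 0, 15, 8))  # 128
      print(fshl(15, 15, 11, 8))  # 120
      print(fshl(0, 255, 8, 8))   # 0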
12022
12023'``llvm.fshr.*``' Intrinsic
12024^^^^^^^^^^^^^^^^^^^^^^^^^^^
12025
12026Syntax:
12027"""""""
12028
12029This is an overloaded intrinsic. You can use ``llvm.fshr`` on any
12030integer bit width or any vector of integer elements. Not all targets
12031support all bit widths or vector types, however.
12032
12033::
12034
12035      declare i8  @llvm.fshr.i8 (i8 %a, i8 %b, i8 %c)
12036      declare i67 @llvm.fshr.i67(i67 %a, i67 %b, i67 %c)
12037      declare <2 x i32> @llvm.fshr.v2i32(<2 x i32> %a, <2 x i32> %b, <2 x i32> %c)
12038
12039Overview:
12040"""""""""
12041
12042The '``llvm.fshr``' family of intrinsic functions performs a funnel shift right:
12043the first two values are concatenated as { %a : %b } (%a is the most significant
12044bits of the wide value), the combined value is shifted right, and the least
12045significant bits are extracted to produce a result that is the same size as the
12046original arguments. If the first 2 arguments are identical, this is equivalent
12047to a rotate right operation. For vector types, the operation occurs for each
12048element of the vector. The shift argument is treated as an unsigned amount
12049modulo the element size of the arguments.
12050
12051Arguments:
12052""""""""""
12053
12054The first two arguments are the values to be concatenated. The third
12055argument is the shift amount. The arguments may be any integer type or a
12056vector with integer element type. All arguments and the return value must
12057have the same type.
12058
12059Example:
12060""""""""
12061
12062.. code-block:: text
12063
12064      %r = call i8 @llvm.fshr.i8(i8 %x, i8 %y, i8 %z)  ; %r = i8: lsb_extract((concat(x, y) >> (z % 8)), 8)
12065      %r = call i8 @llvm.fshr.i8(i8 255, i8 0, i8 15)  ; %r = i8: 254 (0b11111110)
12066      %r = call i8 @llvm.fshr.i8(i8 15, i8 15, i8 11)  ; %r = i8: 225 (0b11100001)
12067      %r = call i8 @llvm.fshr.i8(i8 0, i8 255, i8 8)   ; %r = i8: 255 (0b11111111)
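
The ``lsb_extract`` and ``concat`` notation used in these examples can be
written out as a short Python reference model. The helper name ``fshr`` and
the explicit ``bits`` parameter are assumptions of this sketch, and values
are treated as unsigned.

.. code-block:: python

      def fshr(a, b, c, bits):
          """Funnel shift right of unsigned iN values (N == bits)."""
          mask = (1 << bits) - 1
          shift = c % bits                          # amount is taken modulo N
          wide = ((a & mask) << bits) | (b & mask)  # concat(a, b), a on top
          return (wide >> shift) & mask             # least significant N bits

      print(fshr(255, 0, 15, 8))  # 254
      print(fshr(15, 15, 11, 8))  # 225
      print(fshr(0, 255, 8, 8))   # 255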
12068
12069Arithmetic with Overflow Intrinsics
12070-----------------------------------
12071
12072LLVM provides intrinsics for fast arithmetic overflow checking.
12073
12074Each of these intrinsics returns a two-element struct. The first
12075element of this struct contains the result of the corresponding
12076arithmetic operation modulo 2\ :sup:`n`\ , where n is the bit width of
12077the result. Therefore, for example, the first element of the struct
12078returned by ``llvm.sadd.with.overflow.i32`` is always the same as the
12079result of a 32-bit ``add`` instruction with the same operands, where
12080the ``add`` is *not* modified by an ``nsw`` or ``nuw`` flag.
12081
12082The second element of the result is an ``i1`` that is 1 if the
12083arithmetic operation overflowed and 0 otherwise. An operation
12084overflows if, for any values of its operands ``A`` and ``B`` and for
12085any ``N`` larger than the operands' width, ``ext(A op B) to iN`` is
12086not equal to ``(ext(A) to iN) op (ext(B) to iN)`` where ``ext`` is
12087``sext`` for signed overflow and ``zext`` for unsigned overflow, and
12088``op`` is the underlying arithmetic operation.
12089
12090The behavior of these intrinsics is well-defined for all argument
12091values.
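
One way to read the definition is that the overflow bit is set exactly when
the mathematically exact result does not fit in the operand width. The
Python sketch below models the result pair on that reading. It relies on
Python's unbounded integers to play the role of the wider ``iN`` type, and
the helper names ``to_signed`` and ``with_overflow`` are hypothetical.

.. code-block:: python

      import operator

      def to_signed(value, bits):
          """Interpret the low `bits` bits of value as two's complement."""
          value &= (1 << bits) - 1
          return value - (1 << bits) if value >> (bits - 1) else value

      def with_overflow(op, a, b, bits, signed):
          """Return (result modulo 2**bits, overflow flag) for op(a, b)."""
          mask = (1 << bits) - 1
          if signed:
              exact = op(to_signed(a, bits), to_signed(b, bits))
              lo, hi = -(1 << (bits - 1)), (1 << (bits - 1)) - 1
          else:
              exact = op(a & mask, b & mask)
              lo, hi = 0, mask
          return exact & mask, not (lo <= exact <= hi)

      # sadd: INT32_MAX + 1 wraps and overflows.
      print(with_overflow(operator.add, 0x7FFFFFFF, 1, 32, signed=True))
      # (2147483648, True)
      # usub: 0 - 1 wraps and overflows.
      print(with_overflow(operator.sub, 0, 1, 32, signed=False))
      # (4294967295, True)
      # umul: 3 * 4 fits.
      print(with_overflow(operator.mul, 3, 4, 32, signed=False))
      # (12, False)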
12092
12093'``llvm.sadd.with.overflow.*``' Intrinsics
12094^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
12095
12096Syntax:
12097"""""""
12098
12099This is an overloaded intrinsic. You can use ``llvm.sadd.with.overflow``
12100on any integer bit width.
12101
12102::
12103
12104      declare {i16, i1} @llvm.sadd.with.overflow.i16(i16 %a, i16 %b)
12105      declare {i32, i1} @llvm.sadd.with.overflow.i32(i32 %a, i32 %b)
12106      declare {i64, i1} @llvm.sadd.with.overflow.i64(i64 %a, i64 %b)
12107
12108Overview:
12109"""""""""
12110
12111The '``llvm.sadd.with.overflow``' family of intrinsic functions perform
12112a signed addition of the two arguments, and indicate whether an overflow
12113occurred during the signed summation.
12114
12115Arguments:
12116""""""""""
12117
12118The arguments (%a and %b) and the first element of the result structure
12119may be of integer types of any bit width, but they must have the same
12120bit width. The second element of the result structure must be of type
12121``i1``. ``%a`` and ``%b`` are the two values that will undergo signed
12122addition.
12123
12124Semantics:
12125""""""""""
12126
12127The '``llvm.sadd.with.overflow``' family of intrinsic functions perform
12128a signed addition of the two variables. They return a structure --- the
12129first element of which is the signed summation, and the second element
12130of which is a bit specifying if the signed summation resulted in an
12131overflow.
12132
12133Examples:
12134"""""""""
12135
12136.. code-block:: llvm
12137
12138      %res = call {i32, i1} @llvm.sadd.with.overflow.i32(i32 %a, i32 %b)
12139      %sum = extractvalue {i32, i1} %res, 0
12140      %obit = extractvalue {i32, i1} %res, 1
12141      br i1 %obit, label %overflow, label %normal
12142
12143'``llvm.uadd.with.overflow.*``' Intrinsics
12144^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
12145
12146Syntax:
12147"""""""
12148
12149This is an overloaded intrinsic. You can use ``llvm.uadd.with.overflow``
12150on any integer bit width.
12151
12152::
12153
12154      declare {i16, i1} @llvm.uadd.with.overflow.i16(i16 %a, i16 %b)
12155      declare {i32, i1} @llvm.uadd.with.overflow.i32(i32 %a, i32 %b)
12156      declare {i64, i1} @llvm.uadd.with.overflow.i64(i64 %a, i64 %b)
12157
12158Overview:
12159"""""""""
12160
12161The '``llvm.uadd.with.overflow``' family of intrinsic functions perform
12162an unsigned addition of the two arguments, and indicate whether a carry
12163occurred during the unsigned summation.
12164
12165Arguments:
12166""""""""""
12167
12168The arguments (%a and %b) and the first element of the result structure
12169may be of integer types of any bit width, but they must have the same
12170bit width. The second element of the result structure must be of type
12171``i1``. ``%a`` and ``%b`` are the two values that will undergo unsigned
12172addition.
12173
12174Semantics:
12175""""""""""
12176
12177The '``llvm.uadd.with.overflow``' family of intrinsic functions perform
12178an unsigned addition of the two arguments. They return a structure --- the
12179first element of which is the sum, and the second element of which is a
12180bit specifying if the unsigned summation resulted in a carry.
12181
12182Examples:
12183"""""""""
12184
12185.. code-block:: llvm
12186
12187      %res = call {i32, i1} @llvm.uadd.with.overflow.i32(i32 %a, i32 %b)
12188      %sum = extractvalue {i32, i1} %res, 0
12189      %obit = extractvalue {i32, i1} %res, 1
12190      br i1 %obit, label %carry, label %normal
12191
12192'``llvm.ssub.with.overflow.*``' Intrinsics
12193^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
12194
12195Syntax:
12196"""""""
12197
12198This is an overloaded intrinsic. You can use ``llvm.ssub.with.overflow``
12199on any integer bit width.
12200
12201::
12202
12203      declare {i16, i1} @llvm.ssub.with.overflow.i16(i16 %a, i16 %b)
12204      declare {i32, i1} @llvm.ssub.with.overflow.i32(i32 %a, i32 %b)
12205      declare {i64, i1} @llvm.ssub.with.overflow.i64(i64 %a, i64 %b)
12206
12207Overview:
12208"""""""""
12209
12210The '``llvm.ssub.with.overflow``' family of intrinsic functions perform
12211a signed subtraction of the two arguments, and indicate whether an
12212overflow occurred during the signed subtraction.
12213
12214Arguments:
12215""""""""""
12216
12217The arguments (%a and %b) and the first element of the result structure
12218may be of integer types of any bit width, but they must have the same
12219bit width. The second element of the result structure must be of type
12220``i1``. ``%a`` and ``%b`` are the two values that will undergo signed
12221subtraction.
12222
12223Semantics:
12224""""""""""
12225
The '``llvm.ssub.with.overflow``' family of intrinsic functions perform
a signed subtraction of the two arguments. They return a structure whose
first element is the difference and whose second element is a bit
specifying whether the signed subtraction resulted in an overflow.
12231
12232Examples:
12233"""""""""
12234
12235.. code-block:: llvm
12236
12237      %res = call {i32, i1} @llvm.ssub.with.overflow.i32(i32 %a, i32 %b)
      %diff = extractvalue {i32, i1} %res, 0
12239      %obit = extractvalue {i32, i1} %res, 1
12240      br i1 %obit, label %overflow, label %normal
12241
12242'``llvm.usub.with.overflow.*``' Intrinsics
12243^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
12244
12245Syntax:
12246"""""""
12247
12248This is an overloaded intrinsic. You can use ``llvm.usub.with.overflow``
12249on any integer bit width.
12250
12251::
12252
12253      declare {i16, i1} @llvm.usub.with.overflow.i16(i16 %a, i16 %b)
12254      declare {i32, i1} @llvm.usub.with.overflow.i32(i32 %a, i32 %b)
12255      declare {i64, i1} @llvm.usub.with.overflow.i64(i64 %a, i64 %b)
12256
12257Overview:
12258"""""""""
12259
12260The '``llvm.usub.with.overflow``' family of intrinsic functions perform
12261an unsigned subtraction of the two arguments, and indicate whether an
12262overflow occurred during the unsigned subtraction.
12263
12264Arguments:
12265""""""""""
12266
12267The arguments (%a and %b) and the first element of the result structure
12268may be of integer types of any bit width, but they must have the same
12269bit width. The second element of the result structure must be of type
12270``i1``. ``%a`` and ``%b`` are the two values that will undergo unsigned
12271subtraction.
12272
12273Semantics:
12274""""""""""
12275
The '``llvm.usub.with.overflow``' family of intrinsic functions perform
an unsigned subtraction of the two arguments. They return a structure
whose first element is the difference and whose second element is a bit
specifying whether the unsigned subtraction resulted in an overflow.
12281
12282Examples:
12283"""""""""
12284
12285.. code-block:: llvm
12286
12287      %res = call {i32, i1} @llvm.usub.with.overflow.i32(i32 %a, i32 %b)
      %diff = extractvalue {i32, i1} %res, 0
12289      %obit = extractvalue {i32, i1} %res, 1
12290      br i1 %obit, label %overflow, label %normal
12291
12292'``llvm.smul.with.overflow.*``' Intrinsics
12293^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
12294
12295Syntax:
12296"""""""
12297
12298This is an overloaded intrinsic. You can use ``llvm.smul.with.overflow``
12299on any integer bit width.
12300
12301::
12302
12303      declare {i16, i1} @llvm.smul.with.overflow.i16(i16 %a, i16 %b)
12304      declare {i32, i1} @llvm.smul.with.overflow.i32(i32 %a, i32 %b)
12305      declare {i64, i1} @llvm.smul.with.overflow.i64(i64 %a, i64 %b)
12306
12307Overview:
12308"""""""""
12309
12310The '``llvm.smul.with.overflow``' family of intrinsic functions perform
12311a signed multiplication of the two arguments, and indicate whether an
12312overflow occurred during the signed multiplication.
12313
12314Arguments:
12315""""""""""
12316
12317The arguments (%a and %b) and the first element of the result structure
12318may be of integer types of any bit width, but they must have the same
12319bit width. The second element of the result structure must be of type
12320``i1``. ``%a`` and ``%b`` are the two values that will undergo signed
12321multiplication.
12322
12323Semantics:
12324""""""""""
12325
The '``llvm.smul.with.overflow``' family of intrinsic functions perform
a signed multiplication of the two arguments. They return a structure
whose first element is the product and whose second element is a bit
specifying whether the signed multiplication resulted in an overflow.
12331
12332Examples:
12333"""""""""
12334
12335.. code-block:: llvm
12336
12337      %res = call {i32, i1} @llvm.smul.with.overflow.i32(i32 %a, i32 %b)
      %prod = extractvalue {i32, i1} %res, 0
12339      %obit = extractvalue {i32, i1} %res, 1
12340      br i1 %obit, label %overflow, label %normal
12341
12342'``llvm.umul.with.overflow.*``' Intrinsics
12343^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
12344
12345Syntax:
12346"""""""
12347
12348This is an overloaded intrinsic. You can use ``llvm.umul.with.overflow``
12349on any integer bit width.
12350
12351::
12352
12353      declare {i16, i1} @llvm.umul.with.overflow.i16(i16 %a, i16 %b)
12354      declare {i32, i1} @llvm.umul.with.overflow.i32(i32 %a, i32 %b)
12355      declare {i64, i1} @llvm.umul.with.overflow.i64(i64 %a, i64 %b)
12356
12357Overview:
12358"""""""""
12359
The '``llvm.umul.with.overflow``' family of intrinsic functions perform
an unsigned multiplication of the two arguments, and indicate whether an
overflow occurred during the unsigned multiplication.
12363
12364Arguments:
12365""""""""""
12366
12367The arguments (%a and %b) and the first element of the result structure
12368may be of integer types of any bit width, but they must have the same
12369bit width. The second element of the result structure must be of type
12370``i1``. ``%a`` and ``%b`` are the two values that will undergo unsigned
12371multiplication.
12372
12373Semantics:
12374""""""""""
12375
The '``llvm.umul.with.overflow``' family of intrinsic functions perform
an unsigned multiplication of the two arguments. They return a structure
whose first element is the product and whose second element is a bit
specifying whether the unsigned multiplication resulted in an overflow.
12381
12382Examples:
12383"""""""""
12384
12385.. code-block:: llvm
12386
12387      %res = call {i32, i1} @llvm.umul.with.overflow.i32(i32 %a, i32 %b)
      %prod = extractvalue {i32, i1} %res, 0
12389      %obit = extractvalue {i32, i1} %res, 1
12390      br i1 %obit, label %overflow, label %normal
12391
12392Specialised Arithmetic Intrinsics
12393---------------------------------
12394
12395'``llvm.canonicalize.*``' Intrinsic
12396^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
12397
12398Syntax:
12399"""""""
12400
12401::
12402
12403      declare float @llvm.canonicalize.f32(float %a)
12404      declare double @llvm.canonicalize.f64(double %b)
12405
12406Overview:
12407"""""""""
12408
The '``llvm.canonicalize.*``' intrinsic returns the platform-specific canonical
encoding of a floating-point number. This canonicalization is useful for
implementing certain numeric primitives such as ``frexp``. The canonical
encoding is defined by IEEE-754-2008 to be:
12413
12414::
12415
12416      2.1.8 canonical encoding: The preferred encoding of a floating-point
12417      representation in a format. Applied to declets, significands of finite
12418      numbers, infinities, and NaNs, especially in decimal formats.
12419
12420This operation can also be considered equivalent to the IEEE-754-2008
12421conversion of a floating-point value to the same format. NaNs are handled
12422according to section 6.2.
12423
12424Examples of non-canonical encodings:
12425
12426- x87 pseudo denormals, pseudo NaNs, pseudo Infinity, Unnormals. These are
12427  converted to a canonical representation per hardware-specific protocol.
12428- Many normal decimal floating-point numbers have non-canonical alternative
12429  encodings.
12430- Some machines, like GPUs or ARMv7 NEON, do not support subnormal values.
12431  These are treated as non-canonical encodings of zero and will be flushed to
12432  a zero of the same sign by this operation.
12433
12434Note that per IEEE-754-2008 6.2, systems that support signaling NaNs with
12435default exception handling must signal an invalid exception, and produce a
12436quiet NaN result.
12437
12438This function should always be implementable as multiplication by 1.0, provided
12439that the compiler does not constant fold the operation. Likewise, division by
124401.0 and ``llvm.minnum(x, x)`` are possible implementations. Addition with
12441-0.0 is also sufficient provided that the rounding mode is not -Infinity.
12442
12443``@llvm.canonicalize`` must preserve the equality relation. That is:
12444
12445- ``(@llvm.canonicalize(x) == x)`` is equivalent to ``(x == x)``
- ``(@llvm.canonicalize(x) == @llvm.canonicalize(y))`` is equivalent to
  ``(x == y)``
12448
Additionally, the sign of zero must be conserved:
``@llvm.canonicalize(-0.0) = -0.0`` and ``@llvm.canonicalize(+0.0) = +0.0``.
12451
12452The payload bits of a NaN must be conserved, with two exceptions.
12453First, environments which use only a single canonical representation of NaN
12454must perform said canonicalization. Second, SNaNs must be quieted per the
12455usual methods.
12456
12457The canonicalization operation may be optimized away if:
12458
12459- The input is known to be canonical. For example, it was produced by a
12460  floating-point operation that is required by the standard to be canonical.
12461- The result is consumed only by (or fused with) other floating-point
12462  operations. That is, the bits of the floating-point value are not examined.
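
Two of the rules above, quieting signaling NaNs and flushing subnormals to a
zero of the same sign, can be illustrated on raw binary32 bit patterns. The
Python sketch below is a model for a hypothetical target, not LLVM's
lowering. It assumes the common convention that the most significant
significand bit marks a quiet NaN, and the ``flush_subnormals`` switch and
the helper name ``canonicalize_f32_bits`` are inventions of this example.

.. code-block:: python

      QUIET_BIT = 0x00400000  # assumed quiet-NaN bit of binary32

      def canonicalize_f32_bits(bits, flush_subnormals=False):
          """Canonicalize a binary32 value given as a 32-bit pattern."""
          sign = bits & 0x80000000
          exponent = (bits >> 23) & 0xFF
          fraction = bits & 0x007FFFFF
          if exponent == 0xFF and fraction != 0:
              return bits | QUIET_BIT  # quiet the NaN, keep its payload
          if flush_subnormals and exponent == 0 and fraction != 0:
              return sign              # flush to a zero of the same sign
          return bits                  # zeros, normals, infinities unchanged

      print(hex(canonicalize_f32_bits(0x7F800001)))        # 0x7fc00001
      print(hex(canonicalize_f32_bits(0x80000001, True)))  # 0x80000000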
12463
12464'``llvm.fmuladd.*``' Intrinsic
12465^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
12466
12467Syntax:
12468"""""""
12469
12470::
12471
12472      declare float @llvm.fmuladd.f32(float %a, float %b, float %c)
12473      declare double @llvm.fmuladd.f64(double %a, double %b, double %c)
12474
12475Overview:
12476"""""""""
12477
The '``llvm.fmuladd.*``' intrinsic functions represent multiply-add
expressions that can be fused if the code generator determines that (a) the
target instruction set has support for a fused operation, and (b) the
fused operation is more efficient than the equivalent, separate pair of mul
and add instructions.
12483
12484Arguments:
12485""""""""""
12486
12487The '``llvm.fmuladd.*``' intrinsics each take three arguments: two
12488multiplicands, a and b, and an addend c.
12489
12490Semantics:
12491""""""""""
12492
12493The expression:
12494
12495::
12496
      %0 = call float @llvm.fmuladd.f32(float %a, float %b, float %c)
12498
is equivalent to the expression ``a * b + c``, except that rounding will
not be performed between the multiplication and addition steps if the
code generator fuses the operations. Fusion is not guaranteed, even if
the target platform supports it. If a fused multiply-add is required, the
corresponding '``llvm.fma.*``' intrinsic function should be used
instead. Like '``llvm.fma.*``', this intrinsic never sets errno.
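
The effect of the intermediate rounding can be seen with a small Python
model. It is a sketch for finite double-precision inputs only: the fused
path is emulated with exact rational arithmetic followed by one rounding,
and the helper names ``mul_add_unfused`` and ``mul_add_fused`` are
hypothetical.

.. code-block:: python

      from fractions import Fraction

      def mul_add_unfused(a, b, c):
          """Round after the multiply and again after the add."""
          return a * b + c

      def mul_add_fused(a, b, c):
          """Compute a * b + c exactly, then round once."""
          exact = Fraction(a) * Fraction(b) + Fraction(c)
          return float(exact)

      # The exact product is 1 - 2**-54, which rounds to 1.0 as a double.
      a = 1.0 + 2.0**-27
      b = 1.0 - 2.0**-27
      c = -1.0
      print(mul_add_unfused(a, b, c) == 0.0)         # True
      print(mul_add_fused(a, b, c) == -(2.0**-54))   # True

Because either result is permitted for ``llvm.fmuladd``, code that depends on
one of them should not rely on this intrinsic to select it.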
12505
12506Examples:
12507"""""""""
12508
12509.. code-block:: llvm
12510
12511      %r2 = call float @llvm.fmuladd.f32(float %a, float %b, float %c) ; yields float:r2 = (a * b) + c
12512
12513
12514Experimental Vector Reduction Intrinsics
12515----------------------------------------
12516
12517Horizontal reductions of vectors can be expressed using the following
12518intrinsics. Each one takes a vector operand as an input and applies its
12519respective operation across all elements of the vector, returning a single
12520scalar result of the same element type.
12521
12522
12523'``llvm.experimental.vector.reduce.add.*``' Intrinsic
12524^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
12525
12526Syntax:
12527"""""""
12528
12529::
12530
12531      declare i32 @llvm.experimental.vector.reduce.add.i32.v4i32(<4 x i32> %a)
12532      declare i64 @llvm.experimental.vector.reduce.add.i64.v2i64(<2 x i64> %a)
12533
12534Overview:
12535"""""""""
12536
12537The '``llvm.experimental.vector.reduce.add.*``' intrinsics do an integer ``ADD``
12538reduction of a vector, returning the result as a scalar. The return type matches
12539the element-type of the vector input.
12540
12541Arguments:
12542""""""""""
12543The argument to this intrinsic must be a vector of integer values.
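
A Python sketch of the reduction follows, with a vector represented as a
list of unsigned element values. It assumes that the sum wraps modulo
2\ :sup:`n` in the same way as the ``add`` instruction, which this section
does not state explicitly; the helper name ``vector_reduce_add`` is
hypothetical.

.. code-block:: python

      from functools import reduce
      import operator

      def vector_reduce_add(elements, bits):
          """Sum all elements, keeping the low `bits` bits of the result."""
          mask = (1 << bits) - 1
          return reduce(operator.add, elements, 0) & mask

      print(vector_reduce_add([1, 2, 3, 4], 32))     # 10
      print(vector_reduce_add([0xFFFFFFFF, 1], 32))  # 0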
12544
12545'``llvm.experimental.vector.reduce.fadd.*``' Intrinsic
12546^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
12547
12548Syntax:
12549"""""""
12550
12551::
12552
12553      declare float @llvm.experimental.vector.reduce.fadd.f32.v4f32(float %acc, <4 x float> %a)
12554      declare double @llvm.experimental.vector.reduce.fadd.f64.v2f64(double %acc, <2 x double> %a)
12555
12556Overview:
12557"""""""""
12558
12559The '``llvm.experimental.vector.reduce.fadd.*``' intrinsics do a floating-point
12560``ADD`` reduction of a vector, returning the result as a scalar. The return type
12561matches the element-type of the vector input.
12562
12563If the intrinsic call has fast-math flags, then the reduction will not preserve
12564the associativity of an equivalent scalarized counterpart. If it does not have
12565fast-math flags, then the reduction will be *ordered*, implying that the
12566operation respects the associativity of a scalarized reduction.
12567
12568
12569Arguments:
12570""""""""""
12571The first argument to this intrinsic is a scalar accumulator value, which is
12572only used when there are no fast-math flags attached. This argument may be undef
12573when fast-math flags are used.
12574
12575The second argument must be a vector of floating-point values.
12576
12577Examples:
12578"""""""""
12579
12580.. code-block:: llvm
12581
12582      %fast = call fast float @llvm.experimental.vector.reduce.fadd.f32.v4f32(float undef, <4 x float> %input) ; fast reduction
12583      %ord = call float @llvm.experimental.vector.reduce.fadd.f32.v4f32(float %acc, <4 x float> %input) ; ordered reduction
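
The difference between the two forms shows up once rounding is involved. The
Python sketch below uses doubles and is a model only. It assumes that the
ordered form adds the elements to ``%acc`` from first to last, and it uses a
pairwise tree as one example of a reassociation that fast-math flags would
permit. The helper names are hypothetical.

.. code-block:: python

      def reduce_fadd_ordered(acc, elements):
          """Sequential reduction, starting from the accumulator."""
          for x in elements:
              acc = acc + x
          return acc

      def reduce_fadd_pairwise(elements):
          """One possible reassociated reduction; the accumulator is unused."""
          values = list(elements)
          while len(values) > 1:
              paired = [values[i] + values[i + 1]
                        for i in range(0, len(values) - 1, 2)]
              if len(values) % 2:
                  paired.append(values[-1])  # odd element carries over
              values = paired
          return values[0]

      v = [1e16, 1.0, -1e16, 1.0]
      print(reduce_fadd_ordered(0.0, v))  # 1.0
      print(reduce_fadd_pairwise(v))      # 0.0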
12584
12585
12586'``llvm.experimental.vector.reduce.mul.*``' Intrinsic
12587^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
12588
12589Syntax:
12590"""""""
12591
12592::
12593
12594      declare i32 @llvm.experimental.vector.reduce.mul.i32.v4i32(<4 x i32> %a)
12595      declare i64 @llvm.experimental.vector.reduce.mul.i64.v2i64(<2 x i64> %a)
12596
12597Overview:
12598"""""""""
12599
12600The '``llvm.experimental.vector.reduce.mul.*``' intrinsics do an integer ``MUL``
12601reduction of a vector, returning the result as a scalar. The return type matches
12602the element-type of the vector input.
12603
12604Arguments:
12605""""""""""
12606The argument to this intrinsic must be a vector of integer values.
12607
12608'``llvm.experimental.vector.reduce.fmul.*``' Intrinsic
12609^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
12610
12611Syntax:
12612"""""""
12613
12614::
12615
12616      declare float @llvm.experimental.vector.reduce.fmul.f32.v4f32(float %acc, <4 x float> %a)
12617      declare double @llvm.experimental.vector.reduce.fmul.f64.v2f64(double %acc, <2 x double> %a)
12618
12619Overview:
12620"""""""""
12621
12622The '``llvm.experimental.vector.reduce.fmul.*``' intrinsics do a floating-point
12623``MUL`` reduction of a vector, returning the result as a scalar. The return type
12624matches the element-type of the vector input.
12625
12626If the intrinsic call has fast-math flags, then the reduction will not preserve
12627the associativity of an equivalent scalarized counterpart. If it does not have
12628fast-math flags, then the reduction will be *ordered*, implying that the
12629operation respects the associativity of a scalarized reduction.
12630
12631
12632Arguments:
12633""""""""""
12634The first argument to this intrinsic is a scalar accumulator value, which is
12635only used when there are no fast-math flags attached. This argument may be undef
12636when fast-math flags are used.
12637
12638The second argument must be a vector of floating-point values.
12639
12640Examples:
12641"""""""""
12642
12643.. code-block:: llvm
12644
12645      %fast = call fast float @llvm.experimental.vector.reduce.fmul.f32.v4f32(float undef, <4 x float> %input) ; fast reduction
12646      %ord = call float @llvm.experimental.vector.reduce.fmul.f32.v4f32(float %acc, <4 x float> %input) ; ordered reduction
12647
12648'``llvm.experimental.vector.reduce.and.*``' Intrinsic
12649^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
12650
12651Syntax:
12652"""""""
12653
12654::
12655
12656      declare i32 @llvm.experimental.vector.reduce.and.i32.v4i32(<4 x i32> %a)
12657
12658Overview:
12659"""""""""
12660
12661The '``llvm.experimental.vector.reduce.and.*``' intrinsics do a bitwise ``AND``
12662reduction of a vector, returning the result as a scalar. The return type matches
12663the element-type of the vector input.
12664
12665Arguments:
12666""""""""""
12667The argument to this intrinsic must be a vector of integer values.
12668
12669'``llvm.experimental.vector.reduce.or.*``' Intrinsic
12670^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
12671
12672Syntax:
12673"""""""
12674
12675::
12676
12677      declare i32 @llvm.experimental.vector.reduce.or.i32.v4i32(<4 x i32> %a)
12678
12679Overview:
12680"""""""""
12681
12682The '``llvm.experimental.vector.reduce.or.*``' intrinsics do a bitwise ``OR`` reduction
12683of a vector, returning the result as a scalar. The return type matches the
12684element-type of the vector input.
12685
12686Arguments:
12687""""""""""
12688The argument to this intrinsic must be a vector of integer values.
12689
12690'``llvm.experimental.vector.reduce.xor.*``' Intrinsic
12691^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
12692
12693Syntax:
12694"""""""
12695
12696::
12697
12698      declare i32 @llvm.experimental.vector.reduce.xor.i32.v4i32(<4 x i32> %a)
12699
12700Overview:
12701"""""""""
12702
12703The '``llvm.experimental.vector.reduce.xor.*``' intrinsics do a bitwise ``XOR``
12704reduction of a vector, returning the result as a scalar. The return type matches
12705the element-type of the vector input.
12706
12707Arguments:
12708""""""""""
12709The argument to this intrinsic must be a vector of integer values.
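
The three bitwise reductions (``and``, ``or`` and ``xor``) differ only in the
operator and its identity value. The C sketch below is a scalar model with
hypothetical function names, not a description of how any target lowers the
intrinsics.

.. code-block:: c

    #include <stddef.h>
    #include <stdint.h>

    static uint32_t reduce_and_i32(const uint32_t *v, size_t n) {
        uint32_t r = UINT32_MAX; /* identity for AND: all bits set */
        for (size_t i = 0; i < n; ++i)
            r &= v[i];
        return r;
    }

    static uint32_t reduce_or_i32(const uint32_t *v, size_t n) {
        uint32_t r = 0; /* identity for OR */
        for (size_t i = 0; i < n; ++i)
            r |= v[i];
        return r;
    }

    static uint32_t reduce_xor_i32(const uint32_t *v, size_t n) {
        uint32_t r = 0; /* identity for XOR */
        for (size_t i = 0; i < n; ++i)
            r ^= v[i];
        return r;
    }

Because these operators are associative and commutative, the lane order does
not affect the result.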
12710
12711'``llvm.experimental.vector.reduce.smax.*``' Intrinsic
12712^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
12713
12714Syntax:
12715"""""""
12716
12717::
12718
12719      declare i32 @llvm.experimental.vector.reduce.smax.i32.v4i32(<4 x i32> %a)
12720
12721Overview:
12722"""""""""
12723
12724The '``llvm.experimental.vector.reduce.smax.*``' intrinsics do a signed integer
12725``MAX`` reduction of a vector, returning the result as a scalar. The return type
12726matches the element-type of the vector input.
12727
12728Arguments:
12729""""""""""
12730The argument to this intrinsic must be a vector of integer values.
12731
12732'``llvm.experimental.vector.reduce.smin.*``' Intrinsic
12733^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
12734
12735Syntax:
12736"""""""
12737
12738::
12739
12740      declare i32 @llvm.experimental.vector.reduce.smin.i32.v4i32(<4 x i32> %a)
12741
12742Overview:
12743"""""""""
12744
12745The '``llvm.experimental.vector.reduce.smin.*``' intrinsics do a signed integer
12746``MIN`` reduction of a vector, returning the result as a scalar. The return type
12747matches the element-type of the vector input.
12748
12749Arguments:
12750""""""""""
12751The argument to this intrinsic must be a vector of integer values.
12752
12753'``llvm.experimental.vector.reduce.umax.*``' Intrinsic
12754^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
12755
12756Syntax:
12757"""""""
12758
12759::
12760
12761      declare i32 @llvm.experimental.vector.reduce.umax.i32.v4i32(<4 x i32> %a)
12762
12763Overview:
12764"""""""""
12765
12766The '``llvm.experimental.vector.reduce.umax.*``' intrinsics do an unsigned
12767integer ``MAX`` reduction of a vector, returning the result as a scalar. The
12768return type matches the element-type of the vector input.
12769
12770Arguments:
12771""""""""""
12772The argument to this intrinsic must be a vector of integer values.
12773
12774'``llvm.experimental.vector.reduce.umin.*``' Intrinsic
12775^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
12776
12777Syntax:
12778"""""""
12779
12780::
12781
12782      declare i32 @llvm.experimental.vector.reduce.umin.i32.v4i32(<4 x i32> %a)
12783
12784Overview:
12785"""""""""
12786
12787The '``llvm.experimental.vector.reduce.umin.*``' intrinsics do an unsigned
12788integer ``MIN`` reduction of a vector, returning the result as a scalar. The
12789return type matches the element-type of the vector input.
12790
12791Arguments:
12792""""""""""
12793The argument to this intrinsic must be a vector of integer values.
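
The signed and unsigned variants differ only in how the element bits are
interpreted. The C sketch below models ``smax`` and ``umax`` on scalars; the
function names are hypothetical, and the ``smin``/``umin`` forms would follow by
flipping the comparison.

.. code-block:: c

    #include <assert.h>
    #include <stddef.h>
    #include <stdint.h>

    /* Signed MAX: lanes are compared as two's-complement integers. */
    static int32_t reduce_smax_i32(const int32_t *v, size_t n) {
        assert(n > 0);
        int32_t m = v[0];
        for (size_t i = 1; i < n; ++i)
            if (v[i] > m)
                m = v[i];
        return m;
    }

    /* Unsigned MAX: the same bit patterns are compared as unsigned integers. */
    static uint32_t reduce_umax_i32(const uint32_t *v, size_t n) {
        assert(n > 0);
        uint32_t m = v[0];
        for (size_t i = 1; i < n; ++i)
            if (v[i] > m)
                m = v[i];
        return m;
    }

A lane holding the bit pattern ``0xFFFFFFFF`` is the smallest candidate for
``smax`` (it is -1) but the largest for ``umax``.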
12794
12795'``llvm.experimental.vector.reduce.fmax.*``' Intrinsic
12796^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
12797
12798Syntax:
12799"""""""
12800
12801::
12802
12803      declare float @llvm.experimental.vector.reduce.fmax.f32.v4f32(<4 x float> %a)
12804      declare double @llvm.experimental.vector.reduce.fmax.f64.v2f64(<2 x double> %a)
12805
12806Overview:
12807"""""""""
12808
12809The '``llvm.experimental.vector.reduce.fmax.*``' intrinsics do a floating-point
12810``MAX`` reduction of a vector, returning the result as a scalar. The return type
12811matches the element-type of the vector input.
12812
12813If the intrinsic call has the ``nnan`` fast-math flag then the operation can
12814assume that NaNs are not present in the input vector.
12815
12816Arguments:
12817""""""""""
12818The argument to this intrinsic must be a vector of floating-point values.
12819
12820'``llvm.experimental.vector.reduce.fmin.*``' Intrinsic
12821^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
12822
12823Syntax:
12824"""""""
12825
12826::
12827
12828      declare float @llvm.experimental.vector.reduce.fmin.f32.v4f32(<4 x float> %a)
12829      declare double @llvm.experimental.vector.reduce.fmin.f64.v2f64(<2 x double> %a)
12830
12831Overview:
12832"""""""""
12833
12834The '``llvm.experimental.vector.reduce.fmin.*``' intrinsics do a floating-point
12835``MIN`` reduction of a vector, returning the result as a scalar. The return type
12836matches the element-type of the vector input.
12837
12838If the intrinsic call has the ``nnan`` fast-math flag then the operation can
12839assume that NaNs are not present in the input vector.
12840
12841Arguments:
12842""""""""""
12843The argument to this intrinsic must be a vector of floating-point values.
12844
12845Half Precision Floating-Point Intrinsics
12846----------------------------------------
12847
12848For most target platforms, half precision floating-point is a
12849storage-only format. This means that it is a dense encoding (in memory)
12850but does not support computation in the format.
12851
Code must therefore first load the half-precision floating-point
value as an i16, then convert it to float with
:ref:`llvm.convert.from.fp16 <int_convert_from_fp16>`. Computation can
then be performed on the float value (including extending to double,
etc.). To store the value back to memory, it is first converted to float
if needed, then converted to i16 with
:ref:`llvm.convert.to.fp16 <int_convert_to_fp16>`, and then stored as an
i16 value.
12860
12861.. _int_convert_to_fp16:
12862
12863'``llvm.convert.to.fp16``' Intrinsic
12864^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
12865
12866Syntax:
12867"""""""
12868
12869::
12870
12871      declare i16 @llvm.convert.to.fp16.f32(float %a)
12872      declare i16 @llvm.convert.to.fp16.f64(double %a)
12873
12874Overview:
12875"""""""""
12876
12877The '``llvm.convert.to.fp16``' intrinsic function performs a conversion from a
12878conventional floating-point type to half precision floating-point format.
12879
12880Arguments:
12881""""""""""
12882
The intrinsic function takes a single argument: the value to be
converted.
12885
12886Semantics:
12887""""""""""
12888
12889The '``llvm.convert.to.fp16``' intrinsic function performs a conversion from a
12890conventional floating-point format to half precision floating-point format. The
12891return value is an ``i16`` which contains the converted number.
12892
12893Examples:
12894"""""""""
12895
12896.. code-block:: llvm
12897
12898      %res = call i16 @llvm.convert.to.fp16.f32(float %a)
12899      store i16 %res, i16* @x, align 2
12900
12901.. _int_convert_from_fp16:
12902
12903'``llvm.convert.from.fp16``' Intrinsic
12904^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
12905
12906Syntax:
12907"""""""
12908
12909::
12910
12911      declare float @llvm.convert.from.fp16.f32(i16 %a)
12912      declare double @llvm.convert.from.fp16.f64(i16 %a)
12913
12914Overview:
12915"""""""""
12916
12917The '``llvm.convert.from.fp16``' intrinsic function performs a
12918conversion from half precision floating-point format to single precision
12919floating-point format.
12920
12921Arguments:
12922""""""""""
12923
The intrinsic function takes a single argument: the value to be
converted.
12926
12927Semantics:
12928""""""""""
12929
The '``llvm.convert.from.fp16``' intrinsic function performs a
conversion from half precision floating-point format to single
precision floating-point format. The input half-float value is
represented by an ``i16`` value.
12934
12935Examples:
12936"""""""""
12937
12938.. code-block:: llvm
12939
12940      %a = load i16, i16* @x, align 2
      %res = call float @llvm.convert.from.fp16.f32(i16 %a)
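
The bit-level meaning of the ``i16`` operand can be illustrated with a small
scalar model. The C sketch below decodes an IEEE 754 binary16 bit pattern into
a ``float``; the function name ``half_bits_to_float`` is hypothetical, and the
sketch assumes that ``float`` is IEEE 754 binary32. It models the value the
conversion produces, not how any particular target lowers the intrinsic.

.. code-block:: c

    #include <stdint.h>
    #include <string.h>

    static float half_bits_to_float(uint16_t h) {
        uint32_t sign = (uint32_t)(h & 0x8000u) << 16;
        uint32_t exp = (h >> 10) & 0x1Fu; /* 5-bit biased exponent */
        uint32_t man = h & 0x3FFu;        /* 10-bit mantissa */
        uint32_t bits;

        if (exp == 0x1Fu) {
            /* Infinity or NaN: all-ones exponent, mantissa widened. */
            bits = sign | 0x7F800000u | (man << 13);
        } else if (exp != 0) {
            /* Normal number: rebias the exponent from 15 to 127. */
            bits = sign | ((exp + 112u) << 23) | (man << 13);
        } else if (man == 0) {
            /* Signed zero. */
            bits = sign;
        } else {
            /* Subnormal half: normalize it, since float can represent it
               as a normal number. */
            uint32_t e = 113u;
            while ((man & 0x400u) == 0) {
                man <<= 1;
                --e;
            }
            bits = sign | (e << 23) | ((man & 0x3FFu) << 13);
        }

        float f;
        memcpy(&f, &bits, sizeof f); /* reinterpret the bits as a float */
        return f;
    }

Every half-precision value is exactly representable in single precision, so
this direction of the conversion never rounds.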
12942
12943.. _dbg_intrinsics:
12944
12945Debugger Intrinsics
12946-------------------
12947
The LLVM debugger intrinsics (which all start with the ``llvm.dbg.``
prefix) are described in the `LLVM Source Level
Debugging <SourceLevelDebugging.html#format-common-intrinsics>`_
document.
12952
12953Exception Handling Intrinsics
12954-----------------------------
12955
The LLVM exception handling intrinsics (which all start with the
``llvm.eh.`` prefix) are described in the `LLVM Exception
Handling <ExceptionHandling.html#format-common-intrinsics>`_ document.
12959
12960.. _int_trampoline:
12961
12962Trampoline Intrinsics
12963---------------------
12964
12965These intrinsics make it possible to excise one parameter, marked with
12966the :ref:`nest <nest>` attribute, from a function. The result is a
12967callable function pointer lacking the nest parameter - the caller does
12968not need to provide a value for it. Instead, the value to use is stored
12969in advance in a "trampoline", a block of memory usually allocated on the
12970stack, which also contains code to splice the nest value into the
12971argument list. This is used to implement the GCC nested function address
12972extension.
12973
12974For example, if the function is ``i32 f(i8* nest %c, i32 %x, i32 %y)``
12975then the resulting function pointer has signature ``i32 (i32, i32)*``.
12976It can be created as follows:
12977
12978.. code-block:: llvm
12979
12980      %tramp = alloca [10 x i8], align 4 ; size and alignment only correct for X86
12981      %tramp1 = getelementptr [10 x i8], [10 x i8]* %tramp, i32 0, i32 0
      call void @llvm.init.trampoline(i8* %tramp1, i8* bitcast (i32 (i8*, i32, i32)* @f to i8*), i8* %nval)
12983      %p = call i8* @llvm.adjust.trampoline(i8* %tramp1)
12984      %fp = bitcast i8* %p to i32 (i32, i32)*
12985
The call ``%val = call i32 %fp(i32 %x, i32 %y)`` is then equivalent to
``%val = call i32 @f(i8* %nval, i32 %x, i32 %y)``.
12988
12989.. _int_it:
12990
12991'``llvm.init.trampoline``' Intrinsic
12992^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
12993
12994Syntax:
12995"""""""
12996
12997::
12998
12999      declare void @llvm.init.trampoline(i8* <tramp>, i8* <func>, i8* <nval>)
13000
13001Overview:
13002"""""""""
13003
13004This fills the memory pointed to by ``tramp`` with executable code,
13005turning it into a trampoline.
13006
13007Arguments:
13008""""""""""
13009
13010The ``llvm.init.trampoline`` intrinsic takes three arguments, all
13011pointers. The ``tramp`` argument must point to a sufficiently large and
13012sufficiently aligned block of memory; this memory is written to by the
13013intrinsic. Note that the size and the alignment are target-specific -
13014LLVM currently provides no portable way of determining them, so a
13015front-end that generates this intrinsic needs to have some
13016target-specific knowledge. The ``func`` argument must hold a function
13017bitcast to an ``i8*``.
13018
13019Semantics:
13020""""""""""
13021
13022The block of memory pointed to by ``tramp`` is filled with target
13023dependent code, turning it into a function. Then ``tramp`` needs to be
13024passed to :ref:`llvm.adjust.trampoline <int_at>` to get a pointer which can
13025be :ref:`bitcast (to a new function) and called <int_trampoline>`. The new
13026function's signature is the same as that of ``func`` with any arguments
13027marked with the ``nest`` attribute removed. At most one such ``nest``
13028argument is allowed, and it must be of pointer type. Calling the new
13029function is equivalent to calling ``func`` with the same argument list,
13030but with ``nval`` used for the missing ``nest`` argument. If, after
13031calling ``llvm.init.trampoline``, the memory pointed to by ``tramp`` is
13032modified, then the effect of any later call to the returned function
13033pointer is undefined.
13034
13035.. _int_at:
13036
13037'``llvm.adjust.trampoline``' Intrinsic
13038^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
13039
13040Syntax:
13041"""""""
13042
13043::
13044
13045      declare i8* @llvm.adjust.trampoline(i8* <tramp>)
13046
13047Overview:
13048"""""""""
13049
13050This performs any required machine-specific adjustment to the address of
13051a trampoline (passed as ``tramp``).
13052
13053Arguments:
13054""""""""""
13055
13056``tramp`` must point to a block of memory which already has trampoline
13057code filled in by a previous call to
13058:ref:`llvm.init.trampoline <int_it>`.
13059
13060Semantics:
13061""""""""""
13062
On some architectures the address of the code to be executed needs to be
different from the address where the trampoline is actually stored. This
intrinsic returns the executable address corresponding to ``tramp``
after performing the required machine-specific adjustments. The pointer
returned can then be :ref:`bitcast and executed <int_trampoline>`.
13068
13069.. _int_mload_mstore:
13070
13071Masked Vector Load and Store Intrinsics
13072---------------------------------------
13073
13074LLVM provides intrinsics for predicated vector load and store operations. The predicate is specified by a mask operand, which holds one bit per vector element, switching the associated vector lane on or off. The memory addresses corresponding to the "off" lanes are not accessed. When all bits of the mask are on, the intrinsic is identical to a regular vector load or store. When all bits are off, no memory is accessed.
13075
13076.. _int_mload:
13077
13078'``llvm.masked.load.*``' Intrinsics
13079^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
13080
13081Syntax:
13082"""""""
13083This is an overloaded intrinsic. The loaded data is a vector of any integer, floating-point or pointer data type.
13084
13085::
13086
13087      declare <16 x float>  @llvm.masked.load.v16f32.p0v16f32 (<16 x float>* <ptr>, i32 <alignment>, <16 x i1> <mask>, <16 x float> <passthru>)
13088      declare <2 x double>  @llvm.masked.load.v2f64.p0v2f64  (<2 x double>* <ptr>, i32 <alignment>, <2 x i1>  <mask>, <2 x double> <passthru>)
13089      ;; The data is a vector of pointers to double
13090      declare <8 x double*> @llvm.masked.load.v8p0f64.p0v8p0f64    (<8 x double*>* <ptr>, i32 <alignment>, <8 x i1> <mask>, <8 x double*> <passthru>)
13091      ;; The data is a vector of function pointers
13092      declare <8 x i32 ()*> @llvm.masked.load.v8p0f_i32f.p0v8p0f_i32f (<8 x i32 ()*>* <ptr>, i32 <alignment>, <8 x i1> <mask>, <8 x i32 ()*> <passthru>)
13093
13094Overview:
13095"""""""""
13096
13097Reads a vector from memory according to the provided mask. The mask holds a bit for each vector lane, and is used to prevent memory accesses to the masked-off lanes. The masked-off lanes in the result vector are taken from the corresponding lanes of the '``passthru``' operand.
13098
13099
13100Arguments:
13101""""""""""
13102
13103The first operand is the base pointer for the load. The second operand is the alignment of the source location. It must be a constant integer value. The third operand, mask, is a vector of boolean values with the same number of elements as the return type. The fourth is a pass-through value that is used to fill the masked-off lanes of the result. The return type, underlying type of the base pointer and the type of the '``passthru``' operand are the same vector types.
13104
13105
13106Semantics:
13107""""""""""
13108
13109The '``llvm.masked.load``' intrinsic is designed for conditional reading of selected vector elements in a single IR operation. It is useful for targets that support vector masked loads and allows vectorizing predicated basic blocks on these targets. Other targets may support this intrinsic differently, for example by lowering it into a sequence of branches that guard scalar load operations.
13110The result of this operation is equivalent to a regular vector load instruction followed by a 'select' between the loaded and the passthru values, predicated on the same mask. However, using this intrinsic prevents exceptions on memory access to masked-off lanes.
13111
13112
13113::
13114
       %res = call <16 x float> @llvm.masked.load.v16f32.p0v16f32 (<16 x float>* %ptr, i32 4, <16 x i1> %mask, <16 x float> %passthru)
13116
13117       ;; The result of the two following instructions is identical aside from potential memory access exception
13118       %loadlal = load <16 x float>, <16 x float>* %ptr, align 4
13119       %res = select <16 x i1> %mask, <16 x float> %loadlal, <16 x float> %passthru
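
The per-lane behavior can also be written as a scalar reference loop. The C
sketch below is a model with a hypothetical function name, operating on plain
arrays rather than vector types; it is not how any target lowers the
intrinsic.

.. code-block:: c

    #include <stdbool.h>
    #include <stddef.h>

    /* Lane i of the result comes from memory when mask[i] is set and from
       passthru[i] otherwise. ptr[i] is only read for enabled lanes. */
    static void masked_load_f32(const float *ptr, const bool *mask,
                                const float *passthru, float *res, size_t n) {
        for (size_t i = 0; i < n; ++i)
            res[i] = mask[i] ? ptr[i] : passthru[i];
    }

Because the conditional operator evaluates only the selected operand, a
masked-off lane never touches memory, which corresponds to the guarantee that
distinguishes the intrinsic from a plain load followed by a ``select``.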
13120
13121.. _int_mstore:
13122
13123'``llvm.masked.store.*``' Intrinsics
13124^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
13125
13126Syntax:
13127"""""""
13128This is an overloaded intrinsic. The data stored in memory is a vector of any integer, floating-point or pointer data type.
13129
13130::
13131
13132       declare void @llvm.masked.store.v8i32.p0v8i32  (<8  x i32>   <value>, <8  x i32>*   <ptr>, i32 <alignment>,  <8  x i1> <mask>)
13133       declare void @llvm.masked.store.v16f32.p0v16f32 (<16 x float> <value>, <16 x float>* <ptr>, i32 <alignment>,  <16 x i1> <mask>)
13134       ;; The data is a vector of pointers to double
13135       declare void @llvm.masked.store.v8p0f64.p0v8p0f64    (<8 x double*> <value>, <8 x double*>* <ptr>, i32 <alignment>, <8 x i1> <mask>)
13136       ;; The data is a vector of function pointers
13137       declare void @llvm.masked.store.v4p0f_i32f.p0v4p0f_i32f (<4 x i32 ()*> <value>, <4 x i32 ()*>* <ptr>, i32 <alignment>, <4 x i1> <mask>)
13138
13139Overview:
13140"""""""""
13141
13142Writes a vector to memory according to the provided mask. The mask holds a bit for each vector lane, and is used to prevent memory accesses to the masked-off lanes.
13143
13144Arguments:
13145""""""""""
13146
The first operand is the vector value to be written to memory. The second operand is the base pointer for the store; it has the same underlying type as the value operand. The third operand is the alignment of the destination location. The fourth operand, mask, is a vector of boolean values. The types of the mask and the value operand must have the same number of vector elements.
13148
13149
13150Semantics:
13151""""""""""
13152
The '``llvm.masked.store``' intrinsic is designed for conditional writing of selected vector elements in a single IR operation. It is useful for targets that support vector masked stores and allows vectorizing predicated basic blocks on these targets. Other targets may support this intrinsic differently, for example by lowering it into a sequence of branches that guard scalar store operations.
13154The result of this operation is equivalent to a load-modify-store sequence. However, using this intrinsic prevents exceptions and data races on memory access to masked-off lanes.
13155
13156::
13157
13158       call void @llvm.masked.store.v16f32.p0v16f32(<16 x float> %value, <16 x float>* %ptr, i32 4,  <16 x i1> %mask)
13159
13160       ;; The result of the following instructions is identical aside from potential data races and memory access exceptions
13161       %oldval = load <16 x float>, <16 x float>* %ptr, align 4
13162       %res = select <16 x i1> %mask, <16 x float> %value, <16 x float> %oldval
13163       store <16 x float> %res, <16 x float>* %ptr, align 4
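
A scalar reference loop for the same behavior is sketched below in C. The
function name is hypothetical and the vectors are modeled as plain arrays.

.. code-block:: c

    #include <stdbool.h>
    #include <stddef.h>

    /* Writes value[i] to ptr[i] only for enabled lanes. Memory behind a
       masked-off lane is neither read nor written. */
    static void masked_store_f32(const float *value, float *ptr,
                                 const bool *mask, size_t n) {
        for (size_t i = 0; i < n; ++i)
            if (mask[i])
                ptr[i] = value[i];
    }

Unlike the load, ``select`` and store sequence, this loop never reads the old
contents of memory, so it cannot introduce a data race on a masked-off lane.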
13164
13165
13166Masked Vector Gather and Scatter Intrinsics
13167-------------------------------------------
13168
13169LLVM provides intrinsics for vector gather and scatter operations. They are similar to :ref:`Masked Vector Load and Store <int_mload_mstore>`, except they are designed for arbitrary memory accesses, rather than sequential memory accesses. Gather and scatter also employ a mask operand, which holds one bit per vector element, switching the associated vector lane on or off. The memory addresses corresponding to the "off" lanes are not accessed. When all bits are off, no memory is accessed.
13170
13171.. _int_mgather:
13172
13173'``llvm.masked.gather.*``' Intrinsics
13174^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
13175
13176Syntax:
13177"""""""
13178This is an overloaded intrinsic. The loaded data are multiple scalar values of any integer, floating-point or pointer data type gathered together into one vector.
13179
13180::
13181
13182      declare <16 x float> @llvm.masked.gather.v16f32.v16p0f32   (<16 x float*> <ptrs>, i32 <alignment>, <16 x i1> <mask>, <16 x float> <passthru>)
13183      declare <2 x double> @llvm.masked.gather.v2f64.v2p1f64     (<2 x double addrspace(1)*> <ptrs>, i32 <alignment>, <2 x i1>  <mask>, <2 x double> <passthru>)
13184      declare <8 x float*> @llvm.masked.gather.v8p0f32.v8p0p0f32 (<8 x float**> <ptrs>, i32 <alignment>, <8 x i1>  <mask>, <8 x float*> <passthru>)
13185
13186Overview:
13187"""""""""
13188
13189Reads scalar values from arbitrary memory locations and gathers them into one vector. The memory locations are provided in the vector of pointers '``ptrs``'. The memory is accessed according to the provided mask. The mask holds a bit for each vector lane, and is used to prevent memory accesses to the masked-off lanes. The masked-off lanes in the result vector are taken from the corresponding lanes of the '``passthru``' operand.
13190
13191
13192Arguments:
13193""""""""""
13194
13195The first operand is a vector of pointers which holds all memory addresses to read. The second operand is an alignment of the source addresses. It must be a constant integer value. The third operand, mask, is a vector of boolean values with the same number of elements as the return type. The fourth is a pass-through value that is used to fill the masked-off lanes of the result. The return type, underlying type of the vector of pointers and the type of the '``passthru``' operand are the same vector types.
13196
13197
13198Semantics:
13199""""""""""
13200
13201The '``llvm.masked.gather``' intrinsic is designed for conditional reading of multiple scalar values from arbitrary memory locations in a single IR operation. It is useful for targets that support vector masked gathers and allows vectorizing basic blocks with data and control divergence. Other targets may support this intrinsic differently, for example by lowering it into a sequence of scalar load operations.
The semantics of this operation are equivalent to a sequence of conditional scalar loads followed by gathering all loaded values into a single vector. The mask restricts memory access to certain lanes and facilitates vectorization of predicated basic blocks.
13203
13204
13205::
13206
13207       %res = call <4 x double> @llvm.masked.gather.v4f64.v4p0f64 (<4 x double*> %ptrs, i32 8, <4 x i1> <i1 true, i1 true, i1 true, i1 true>, <4 x double> undef)
13208
13209       ;; The gather with all-true mask is equivalent to the following instruction sequence
13210       %ptr0 = extractelement <4 x double*> %ptrs, i32 0
13211       %ptr1 = extractelement <4 x double*> %ptrs, i32 1
13212       %ptr2 = extractelement <4 x double*> %ptrs, i32 2
13213       %ptr3 = extractelement <4 x double*> %ptrs, i32 3
13214
13215       %val0 = load double, double* %ptr0, align 8
13216       %val1 = load double, double* %ptr1, align 8
13217       %val2 = load double, double* %ptr2, align 8
13218       %val3 = load double, double* %ptr3, align 8
13219
       %vec0    = insertelement <4 x double> undef, double %val0, i32 0
       %vec01   = insertelement <4 x double> %vec0, double %val1, i32 1
       %vec012  = insertelement <4 x double> %vec01, double %val2, i32 2
       %vec0123 = insertelement <4 x double> %vec012, double %val3, i32 3
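
The sequence above covers the all-true mask. For a general mask, the behavior
can be modeled by the following C sketch, in which the function name is
hypothetical and the vector of pointers is represented as an array of
pointers.

.. code-block:: c

    #include <stdbool.h>
    #include <stddef.h>

    /* Each enabled lane loads through its own pointer; a masked-off lane
       takes its value from passthru and its pointer is never dereferenced. */
    static void masked_gather_f64(const double *const *ptrs, const bool *mask,
                                  const double *passthru, double *res,
                                  size_t n) {
        for (size_t i = 0; i < n; ++i)
            res[i] = mask[i] ? *ptrs[i] : passthru[i];
    }

Since a masked-off pointer is never dereferenced, it may be null or otherwise
invalid in this model.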
13224
13225.. _int_mscatter:
13226
13227'``llvm.masked.scatter.*``' Intrinsics
13228^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
13229
13230Syntax:
13231"""""""
13232This is an overloaded intrinsic. The data stored in memory is a vector of any integer, floating-point or pointer data type. Each vector element is stored in an arbitrary memory address. Scatter with overlapping addresses is guaranteed to be ordered from least-significant to most-significant element.
13233
13234::
13235
13236       declare void @llvm.masked.scatter.v8i32.v8p0i32     (<8 x i32>     <value>, <8 x i32*>     <ptrs>, i32 <alignment>, <8 x i1>  <mask>)
13237       declare void @llvm.masked.scatter.v16f32.v16p1f32   (<16 x float>  <value>, <16 x float addrspace(1)*>  <ptrs>, i32 <alignment>, <16 x i1> <mask>)
13238       declare void @llvm.masked.scatter.v4p0f64.v4p0p0f64 (<4 x double*> <value>, <4 x double**> <ptrs>, i32 <alignment>, <4 x i1>  <mask>)
13239
13240Overview:
13241"""""""""
13242
13243Writes each element from the value vector to the corresponding memory address. The memory addresses are represented as a vector of pointers. Writing is done according to the provided mask. The mask holds a bit for each vector lane, and is used to prevent memory accesses to the masked-off lanes.
13244
13245Arguments:
13246""""""""""
13247
13248The first operand is a vector value to be written to memory. The second operand is a vector of pointers, pointing to where the value elements should be stored. It has the same underlying type as the value operand. The third operand is an alignment of the destination addresses. The fourth operand, mask, is a vector of boolean values. The types of the mask and the value operand must have the same number of vector elements.
13249
13250
13251Semantics:
13252""""""""""
13253
The '``llvm.masked.scatter``' intrinsic is designed for writing selected vector elements to arbitrary memory addresses in a single IR operation. The operation is conditional when not all bits in the mask are switched on. It is useful for targets that support vector masked scatters and allows vectorizing basic blocks with data and control divergence. Other targets may support this intrinsic differently, for example by lowering it into a sequence of branches that guard scalar store operations.
13255
13256::
13257
13258       ;; This instruction unconditionally stores data vector in multiple addresses
       call void @llvm.masked.scatter.v8i32.v8p0i32 (<8 x i32> %value, <8 x i32*> %ptrs, i32 4,  <8 x i1>  <true, true, .. true>)
13260
13261       ;; It is equivalent to a list of scalar stores
13262       %val0 = extractelement <8 x i32> %value, i32 0
13263       %val1 = extractelement <8 x i32> %value, i32 1
13264       ..
13265       %val7 = extractelement <8 x i32> %value, i32 7
13266       %ptr0 = extractelement <8 x i32*> %ptrs, i32 0
13267       %ptr1 = extractelement <8 x i32*> %ptrs, i32 1
13268       ..
13269       %ptr7 = extractelement <8 x i32*> %ptrs, i32 7
13270       ;; Note: the order of the following stores is important when they overlap:
13271       store i32 %val0, i32* %ptr0, align 4
13272       store i32 %val1, i32* %ptr1, align 4
13273       ..
13274       store i32 %val7, i32* %ptr7, align 4
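
The ordering guarantee for overlapping addresses is easiest to see in a
scalar model. The C sketch below, with a hypothetical function name, walks the
lanes from the lowest to the highest index, so when two enabled lanes point at
the same address the higher-numbered lane's value is the one left in memory.

.. code-block:: c

    #include <stdbool.h>
    #include <stddef.h>
    #include <stdint.h>

    /* Stores value[i] through ptrs[i] for each enabled lane, in increasing
       lane order. Masked-off lanes do not access memory. */
    static void masked_scatter_i32(const int32_t *value, int32_t *const *ptrs,
                                   const bool *mask, size_t n) {
        for (size_t i = 0; i < n; ++i)
            if (mask[i])
                *ptrs[i] = value[i];
    }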
13275
13276
13277Masked Vector Expanding Load and Compressing Store Intrinsics
13278-------------------------------------------------------------
13279
LLVM provides intrinsics for expanding load and compressing store operations. Data selected from a vector according to a mask is stored in consecutive memory addresses (compressed store), and vice-versa (expanding load). These operations effectively map to "if (cond.i) a[j++] = v.i" and "if (cond.i) v.i = a[j++]" patterns, respectively. Note that when the mask starts with '1' bits followed by '0' bits, these operations are identical to :ref:`llvm.masked.store <int_mstore>` and :ref:`llvm.masked.load <int_mload>`.
13281
13282.. _int_expandload:
13283
13284'``llvm.masked.expandload.*``' Intrinsics
13285^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
13286
13287Syntax:
13288"""""""
13289This is an overloaded intrinsic. Several values of integer, floating point or pointer data type are loaded from consecutive memory addresses and stored into the elements of a vector according to the mask.
13290
13291::
13292
13293      declare <16 x float>  @llvm.masked.expandload.v16f32 (float* <ptr>, <16 x i1> <mask>, <16 x float> <passthru>)
13294      declare <2 x i64>     @llvm.masked.expandload.v2i64 (i64* <ptr>, <2 x i1>  <mask>, <2 x i64> <passthru>)
13295
13296Overview:
13297"""""""""
13298
Reads a number of scalar values sequentially from the memory location provided in '``ptr``' and spreads them in a vector. The '``mask``' holds a bit for each vector lane. The number of elements read from memory is equal to the number of '1' bits in the mask. The loaded elements are positioned in the destination vector according to the sequence of '1' and '0' bits in the mask. For example, if the mask vector is '10010001', "expandload" reads 3 values from memory addresses ptr, ptr+1, ptr+2 and places them in lanes 0, 3 and 7 accordingly. The masked-off lanes are filled by elements from the corresponding lanes of the '``passthru``' operand.
13300
13301
13302Arguments:
13303""""""""""
13304
13305The first operand is the base pointer for the load. It has the same underlying type as the element of the returned vector. The second operand, mask, is a vector of boolean values with the same number of elements as the return type. The third is a pass-through value that is used to fill the masked-off lanes of the result. The return type and the type of the '``passthru``' operand have the same vector type.
13306
13307Semantics:
13308""""""""""
13309
The '``llvm.masked.expandload``' intrinsic is designed for reading multiple scalar values from adjacent memory addresses into possibly non-adjacent vector lanes. It is useful for targets that support vector expanding loads and allows vectorizing loops with cross-iteration dependencies, as in the following example:
13311
13312.. code-block:: c
13313
13314    // In this loop we load from B and spread the elements into array A.
    double *A, *B; int *C; int j = 0;
13316    for (int i = 0; i < size; ++i) {
13317      if (C[i] != 0)
13318        A[i] = B[j++];
13319    }
13320
13321
13322.. code-block:: llvm
13323
13324    ; Load several elements from array B and expand them in a vector.
13325    ; The number of loaded elements is equal to the number of '1' elements in the Mask.
13326    %Tmp = call <8 x double> @llvm.masked.expandload.v8f64(double* %Bptr, <8 x i1> %Mask, <8 x double> undef)
13327    ; Store the result in A
13328    call void @llvm.masked.store.v8f64.p0v8f64(<8 x double> %Tmp, <8 x double>* %Aptr, i32 8, <8 x i1> %Mask)
13329
13330    ; %Bptr should be increased on each iteration according to the number of '1' elements in the Mask.
13331    %MaskI = bitcast <8 x i1> %Mask to i8
13332    %MaskIPopcnt = call i8 @llvm.ctpop.i8(i8 %MaskI)
13333    %MaskI64 = zext i8 %MaskIPopcnt to i64
13334    %BNextInd = add i64 %BInd, %MaskI64
13335
13336
13337Other targets may support this intrinsic differently, for example, by lowering it into a sequence of conditional scalar load operations and shuffles.
13338If all mask elements are '1', the intrinsic behavior is equivalent to the regular unmasked vector load.
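
The lane-by-lane behavior described above can be modeled in scalar C. This is only an illustrative sketch and not how any target lowers the intrinsic; the function name ``expandload_f64`` and the one-byte-per-lane mask representation are assumptions made for the example.

.. code-block:: c

    #include <assert.h>
    #include <stddef.h>

    /* Hypothetical scalar model of llvm.masked.expandload for double elements.
     * mask[i] != 0 means lane i is active. Returns the number of elements
     * read from ptr, i.e. how far the caller should advance the pointer. */
    size_t expandload_f64(const double *ptr, const unsigned char *mask,
                          const double *passthru, double *result, size_t lanes)
    {
        size_t next = 0; /* index of the next consecutive element to read */

        assert(ptr && mask && passthru && result);
        for (size_t lane = 0; lane < lanes; ++lane) {
            if (mask[lane])
                result[lane] = ptr[next++];    /* active lane: consume one element */
            else
                result[lane] = passthru[lane]; /* masked-off lane: pass through */
        }
        return next;
    }

With the mask '10010001' from the overview (lane 0 listed first), this model reads three elements and returns 3, which plays the same role as the ``ctpop`` computation in the IR example.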
13339
13340.. _int_compressstore:
13341
13342'``llvm.masked.compressstore.*``' Intrinsics
13343^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
13344
13345Syntax:
13346"""""""
13347This is an overloaded intrinsic. A number of scalar values of integer, floating point or pointer data type are collected from an input vector and stored into adjacent memory addresses. A mask defines which elements to collect from the vector.
13348
13349::
13350
13351      declare void @llvm.masked.compressstore.v8i32  (<8  x i32>   <value>, i32*   <ptr>, <8  x i1> <mask>)
13352      declare void @llvm.masked.compressstore.v16f32 (<16 x float> <value>, float* <ptr>, <16 x i1> <mask>)
13353
13354Overview:
13355"""""""""
13356
Selects elements from input vector '``value``' according to the '``mask``'. All selected elements are written into adjacent memory addresses starting at address '``ptr``', from lower to higher. The mask holds a bit for each vector lane, and is used to select elements to be stored. The number of elements to be stored is equal to the number of active bits in the mask.
13358
13359Arguments:
13360""""""""""
13361
The first operand is the input vector, from which elements are collected and written to memory. The second operand is the base pointer for the store; it has the same underlying type as the element of the input vector operand. The third operand is the mask, a vector of boolean values. The mask and the input vector must have the same number of vector elements.
13363
13364
13365Semantics:
13366""""""""""
13367
The '``llvm.masked.compressstore``' intrinsic is designed for compressing data in memory. It allows collecting elements from possibly non-adjacent lanes of a vector and storing them contiguously in memory in one IR operation. It is useful for targets that support compressing store operations and allows vectorizing loops with cross-iteration dependences, as in the following example:
13369
13370.. code-block:: c
13371
13372    // In this loop we load elements from A and store them consecutively in B
    double *A, *B; int *C; int j = 0;
    for (int i = 0; i < size; ++i) {
      if (C[i] != 0)
        B[j++] = A[i];
13377    }
13378
13379
13380.. code-block:: llvm
13381
13382    ; Load elements from A.
13383    %Tmp = call <8 x double> @llvm.masked.load.v8f64.p0v8f64(<8 x double>* %Aptr, i32 8, <8 x i1> %Mask, <8 x double> undef)
13384    ; Store all selected elements consecutively in array B
    call void @llvm.masked.compressstore.v8f64(<8 x double> %Tmp, double* %Bptr, <8 x i1> %Mask)
13386
13387    ; %Bptr should be increased on each iteration according to the number of '1' elements in the Mask.
13388    %MaskI = bitcast <8 x i1> %Mask to i8
13389    %MaskIPopcnt = call i8 @llvm.ctpop.i8(i8 %MaskI)
13390    %MaskI64 = zext i8 %MaskIPopcnt to i64
13391    %BNextInd = add i64 %BInd, %MaskI64
13392
13393
13394Other targets may support this intrinsic differently, for example, by lowering it into a sequence of branches that guard scalar store operations.
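
The selection and packing described above can be modeled in scalar C. This is only an illustrative sketch under assumptions: the function name ``compressstore_f64`` and the one-byte-per-lane mask representation are invented for the example and do not describe any particular lowering.

.. code-block:: c

    #include <assert.h>
    #include <stddef.h>

    /* Hypothetical scalar model of llvm.masked.compressstore for double
     * elements. mask[i] != 0 means lane i is selected. Returns the number of
     * elements written to ptr, i.e. how far the caller should advance it. */
    size_t compressstore_f64(const double *value, double *ptr,
                             const unsigned char *mask, size_t lanes)
    {
        size_t next = 0; /* index of the next consecutive slot to write */

        assert(value && ptr && mask);
        for (size_t lane = 0; lane < lanes; ++lane) {
            if (mask[lane])
                ptr[next++] = value[lane]; /* selected lane: store contiguously */
        }
        return next;
    }

Only selected lanes touch memory, so the destination needs room for just as many elements as there are active mask bits.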
13395
13396
13397Memory Use Markers
13398------------------
13399
13400This class of intrinsics provides information about the lifetime of
13401memory objects and ranges where variables are immutable.
13402
13403.. _int_lifestart:
13404
13405'``llvm.lifetime.start``' Intrinsic
13406^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
13407
13408Syntax:
13409"""""""
13410
13411::
13412
13413      declare void @llvm.lifetime.start(i64 <size>, i8* nocapture <ptr>)
13414
13415Overview:
13416"""""""""
13417
13418The '``llvm.lifetime.start``' intrinsic specifies the start of a memory
13419object's lifetime.
13420
13421Arguments:
13422""""""""""
13423
13424The first argument is a constant integer representing the size of the
13425object, or -1 if it is variable sized. The second argument is a pointer
13426to the object.
13427
13428Semantics:
13429""""""""""
13430
13431This intrinsic indicates that before this point in the code, the value
13432of the memory pointed to by ``ptr`` is dead. This means that it is known
13433to never be used and has an undefined value. A load from the pointer
13434that precedes this intrinsic can be replaced with ``'undef'``.
13435
13436.. _int_lifeend:
13437
13438'``llvm.lifetime.end``' Intrinsic
13439^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
13440
13441Syntax:
13442"""""""
13443
13444::
13445
13446      declare void @llvm.lifetime.end(i64 <size>, i8* nocapture <ptr>)
13447
13448Overview:
13449"""""""""
13450
13451The '``llvm.lifetime.end``' intrinsic specifies the end of a memory
13452object's lifetime.
13453
13454Arguments:
13455""""""""""
13456
13457The first argument is a constant integer representing the size of the
13458object, or -1 if it is variable sized. The second argument is a pointer
13459to the object.
13460
13461Semantics:
13462""""""""""
13463
13464This intrinsic indicates that after this point in the code, the value of
13465the memory pointed to by ``ptr`` is dead. This means that it is known to
13466never be used and has an undefined value. Any stores into the memory
13467object following this intrinsic may be removed as dead.
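
As an illustration of where such markers might come from, consider a C function with a block-scoped buffer. A front end may choose to emit ``llvm.lifetime.start`` where ``scratch`` comes into scope and ``llvm.lifetime.end`` where it goes out of scope; this is an assumption about typical front-end behavior, not something this document requires. The function name ``sum_of_squares`` is hypothetical.

.. code-block:: c

    #include <stddef.h>

    /* Hypothetical example: sums the squares of the first n (at most 8) inputs. */
    int sum_of_squares(const int *in, size_t n)
    {
        int total = 0;

        if (n > 8)
            n = 8;
        {
            int scratch[8]; /* the lifetime of scratch would start here */

            for (size_t i = 0; i < n; ++i)
                scratch[i] = in[i] * in[i];
            for (size_t i = 0; i < n; ++i)
                total += scratch[i];
        } /* ... and end here; the stack slot could then be reused */
        return total;
    }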
13468
13469'``llvm.invariant.start``' Intrinsic
13470^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
13471
13472Syntax:
13473"""""""
13474This is an overloaded intrinsic. The memory object can belong to any address space.
13475
13476::
13477
13478      declare {}* @llvm.invariant.start.p0i8(i64 <size>, i8* nocapture <ptr>)
13479
13480Overview:
13481"""""""""
13482
13483The '``llvm.invariant.start``' intrinsic specifies that the contents of
13484a memory object will not change.
13485
13486Arguments:
13487""""""""""
13488
13489The first argument is a constant integer representing the size of the
13490object, or -1 if it is variable sized. The second argument is a pointer
13491to the object.
13492
13493Semantics:
13494""""""""""
13495
13496This intrinsic indicates that until an ``llvm.invariant.end`` that uses
13497the return value, the referenced memory location is constant and
13498unchanging.
13499
13500'``llvm.invariant.end``' Intrinsic
13501^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
13502
13503Syntax:
13504"""""""
13505This is an overloaded intrinsic. The memory object can belong to any address space.
13506
13507::
13508
13509      declare void @llvm.invariant.end.p0i8({}* <start>, i64 <size>, i8* nocapture <ptr>)
13510
13511Overview:
13512"""""""""
13513
13514The '``llvm.invariant.end``' intrinsic specifies that the contents of a
13515memory object are mutable.
13516
13517Arguments:
13518""""""""""
13519
The first argument is the value returned by the matching
``llvm.invariant.start`` intrinsic. The second argument is a constant
integer representing the size of the object, or -1 if it is variable
sized, and the third argument is a pointer to the object.
13524
13525Semantics:
13526""""""""""
13527
13528This intrinsic indicates that the memory is mutable again.
13529
13530'``llvm.launder.invariant.group``' Intrinsic
13531^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
13532
13533Syntax:
13534"""""""
13535This is an overloaded intrinsic. The memory object can belong to any address
13536space. The returned pointer must belong to the same address space as the
13537argument.
13538
13539::
13540
13541      declare i8* @llvm.launder.invariant.group.p0i8(i8* <ptr>)
13542
13543Overview:
13544"""""""""
13545
13546The '``llvm.launder.invariant.group``' intrinsic can be used when an invariant
13547established by ``invariant.group`` metadata no longer holds, to obtain a new
13548pointer value that carries fresh invariant group information. It is an
13549experimental intrinsic, which means that its semantics might change in the
13550future.
13551
13552
13553Arguments:
13554""""""""""
13555
13556The ``llvm.launder.invariant.group`` takes only one argument, which is a pointer
13557to the memory.
13558
13559Semantics:
13560""""""""""
13561
13562Returns another pointer that aliases its argument but which is considered different
13563for the purposes of ``load``/``store`` ``invariant.group`` metadata.
13564It does not read any accessible memory and the execution can be speculated.
13565
13566'``llvm.strip.invariant.group``' Intrinsic
13567^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
13568
13569Syntax:
13570"""""""
13571This is an overloaded intrinsic. The memory object can belong to any address
13572space. The returned pointer must belong to the same address space as the
13573argument.
13574
13575::
13576
13577      declare i8* @llvm.strip.invariant.group.p0i8(i8* <ptr>)
13578
13579Overview:
13580"""""""""
13581
13582The '``llvm.strip.invariant.group``' intrinsic can be used when an invariant
13583established by ``invariant.group`` metadata no longer holds, to obtain a new pointer
13584value that does not carry the invariant information. It is an experimental
13585intrinsic, which means that its semantics might change in the future.
13586
13587
13588Arguments:
13589""""""""""
13590
13591The ``llvm.strip.invariant.group`` takes only one argument, which is a pointer
13592to the memory.
13593
13594Semantics:
13595""""""""""
13596
13597Returns another pointer that aliases its argument but which has no associated
13598``invariant.group`` metadata.
13599It does not read any memory and can be speculated.
13600
13601
13602
13603.. _constrainedfp:
13604
13605Constrained Floating-Point Intrinsics
13606-------------------------------------
13607
13608These intrinsics are used to provide special handling of floating-point
13609operations when specific rounding mode or floating-point exception behavior is
13610required.  By default, LLVM optimization passes assume that the rounding mode is
13611round-to-nearest and that floating-point exceptions will not be monitored.
13612Constrained FP intrinsics are used to support non-default rounding modes and
13613accurately preserve exception behavior without compromising LLVM's ability to
13614optimize FP code when the default behavior is used.
13615
13616Each of these intrinsics corresponds to a normal floating-point operation.  The
13617first two arguments and the return value are the same as the corresponding FP
13618operation.
13619
13620The third argument is a metadata argument specifying the rounding mode to be
13621assumed. This argument must be one of the following strings:
13622
13623::
13624
13625      "round.dynamic"
13626      "round.tonearest"
13627      "round.downward"
13628      "round.upward"
13629      "round.towardzero"
13630
If this argument is "round.dynamic", optimization passes must assume that the
13632rounding mode is unknown and may change at runtime.  No transformations that
13633depend on rounding mode may be performed in this case.
13634
13635The other possible values for the rounding mode argument correspond to the
13636similarly named IEEE rounding modes.  If the argument is any of these values
13637optimization passes may perform transformations as long as they are consistent
13638with the specified rounding mode.
13639
13640For example, 'x-0'->'x' is not a valid transformation if the rounding mode is
13641"round.downward" or "round.dynamic" because if the value of 'x' is +0 then
13642'x-0' should evaluate to '-0' when rounding downward.  However, this
13643transformation is legal for all other rounding modes.
13644
For values other than "round.dynamic", optimization passes may assume that the
13646actual runtime rounding mode (as defined in a target-specific manner) matches
13647the specified rounding mode, but this is not guaranteed.  Using a specific
13648non-dynamic rounding mode which does not match the actual rounding mode at
13649runtime results in undefined behavior.
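
The 'x-0' case above can be observed at runtime with the C floating-point environment. The sketch below is an analogy only: unlike the constrained intrinsics, it changes the runtime rounding mode with ``fesetround``. The helper name is hypothetical, and the ``volatile`` qualifiers are an assumption about what is needed to keep a C compiler from folding or moving the subtraction.

.. code-block:: c

    #include <fenv.h>
    #include <math.h>

    /* Hypothetical helper: computes (+0.0) - (+0.0) under the given rounding
     * mode and reports whether the result is negative zero.
     * Returns 1 for -0, 0 for +0, and -1 if the mode cannot be set. */
    int zero_minus_zero_is_negative(int rounding_mode)
    {
        volatile double x = 0.0;
        volatile double zero = 0.0;
        volatile double result;
        int saved = fegetround();
        int negative;

        if (fesetround(rounding_mode) != 0)
            return -1;
        result = x - zero;               /* evaluated under rounding_mode */
        negative = signbit(result) != 0;
        fesetround(saved);               /* restore the caller's mode */
        return negative;
    }

On an IEEE 754 conforming target, the helper is expected to report a negative zero for ``FE_DOWNWARD`` and a positive zero for ``FE_TONEAREST``, which is why folding 'x-0' to 'x' is not valid when rounding downward.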
13650
13651The fourth argument to the constrained floating-point intrinsics specifies the
13652required exception behavior.  This argument must be one of the following
13653strings:
13654
13655::
13656
13657      "fpexcept.ignore"
13658      "fpexcept.maytrap"
13659      "fpexcept.strict"
13660
13661If this argument is "fpexcept.ignore" optimization passes may assume that the
13662exception status flags will not be read and that floating-point exceptions will
13663be masked.  This allows transformations to be performed that may change the
13664exception semantics of the original code.  For example, FP operations may be
13665speculatively executed in this case whereas they must not be for either of the
13666other possible values of this argument.
13667
13668If the exception behavior argument is "fpexcept.maytrap" optimization passes
13669must avoid transformations that may raise exceptions that would not have been
13670raised by the original code (such as speculatively executing FP operations), but
13671passes are not required to preserve all exceptions that are implied by the
13672original code.  For example, exceptions may be potentially hidden by constant
13673folding.
13674
13675If the exception behavior argument is "fpexcept.strict" all transformations must
13676strictly preserve the floating-point exception semantics of the original code.
13677Any FP exception that would have been raised by the original code must be raised
13678by the transformed code, and the transformed code must not raise any FP
13679exceptions that would not have been raised by the original code.  This is the
13680exception behavior argument that will be used if the code being compiled reads
13681the FP exception status flags, but this mode can also be used with code that
13682unmasks FP exceptions.
13683
The number and order of floating-point exceptions are NOT guaranteed.  For
13685example, a series of FP operations that each may raise exceptions may be
13686vectorized into a single instruction that raises each unique exception a single
13687time.
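
Source code that reads the exception status flags is the kind of code the "fpexcept.strict" setting is meant to support. The following C sketch shows such a read using the standard ``<fenv.h>`` interface; the helper name is hypothetical, and the ``volatile`` qualifiers are an assumption about what is needed to force the division to happen at runtime.

.. code-block:: c

    #include <fenv.h>

    /* Hypothetical helper: divides 1.0 by 0.0 and reports whether the
     * divide-by-zero status flag was raised by that operation. */
    int division_raises_divbyzero(void)
    {
        volatile double one = 1.0;
        volatile double zero = 0.0;
        volatile double quotient;

        feclearexcept(FE_ALL_EXCEPT);  /* start from a clean set of flags */
        quotient = one / zero;         /* raises FE_DIVBYZERO at runtime */
        (void)quotient;
        return fetestexcept(FE_DIVBYZERO) != 0;
    }

If the division were constant folded or speculated past the flag test, the observable flag state could differ, which is the kind of change strict exception semantics rule out.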
13688
13689
13690'``llvm.experimental.constrained.fadd``' Intrinsic
13691^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
13692
13693Syntax:
13694"""""""
13695
13696::
13697
13698      declare <type>
13699      @llvm.experimental.constrained.fadd(<type> <op1>, <type> <op2>,
13700                                          metadata <rounding mode>,
13701                                          metadata <exception behavior>)
13702
13703Overview:
13704"""""""""
13705
13706The '``llvm.experimental.constrained.fadd``' intrinsic returns the sum of its
13707two operands.
13708
13709
13710Arguments:
13711""""""""""
13712
13713The first two arguments to the '``llvm.experimental.constrained.fadd``'
13714intrinsic must be :ref:`floating-point <t_floating>` or :ref:`vector <t_vector>`
13715of floating-point values. Both arguments must have identical types.
13716
13717The third and fourth arguments specify the rounding mode and exception
13718behavior as described above.
13719
13720Semantics:
13721""""""""""
13722
13723The value produced is the floating-point sum of the two value operands and has
13724the same type as the operands.
13725
13726
13727'``llvm.experimental.constrained.fsub``' Intrinsic
13728^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
13729
13730Syntax:
13731"""""""
13732
13733::
13734
13735      declare <type>
13736      @llvm.experimental.constrained.fsub(<type> <op1>, <type> <op2>,
13737                                          metadata <rounding mode>,
13738                                          metadata <exception behavior>)
13739
13740Overview:
13741"""""""""
13742
13743The '``llvm.experimental.constrained.fsub``' intrinsic returns the difference
13744of its two operands.
13745
13746
13747Arguments:
13748""""""""""
13749
13750The first two arguments to the '``llvm.experimental.constrained.fsub``'
13751intrinsic must be :ref:`floating-point <t_floating>` or :ref:`vector <t_vector>`
13752of floating-point values. Both arguments must have identical types.
13753
13754The third and fourth arguments specify the rounding mode and exception
13755behavior as described above.
13756
13757Semantics:
13758""""""""""
13759
13760The value produced is the floating-point difference of the two value operands
13761and has the same type as the operands.
13762
13763
13764'``llvm.experimental.constrained.fmul``' Intrinsic
13765^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
13766
13767Syntax:
13768"""""""
13769
13770::
13771
13772      declare <type>
13773      @llvm.experimental.constrained.fmul(<type> <op1>, <type> <op2>,
13774                                          metadata <rounding mode>,
13775                                          metadata <exception behavior>)
13776
13777Overview:
13778"""""""""
13779
13780The '``llvm.experimental.constrained.fmul``' intrinsic returns the product of
13781its two operands.
13782
13783
13784Arguments:
13785""""""""""
13786
13787The first two arguments to the '``llvm.experimental.constrained.fmul``'
13788intrinsic must be :ref:`floating-point <t_floating>` or :ref:`vector <t_vector>`
13789of floating-point values. Both arguments must have identical types.
13790
13791The third and fourth arguments specify the rounding mode and exception
13792behavior as described above.
13793
13794Semantics:
13795""""""""""
13796
13797The value produced is the floating-point product of the two value operands and
13798has the same type as the operands.
13799
13800
13801'``llvm.experimental.constrained.fdiv``' Intrinsic
13802^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
13803
13804Syntax:
13805"""""""
13806
13807::
13808
13809      declare <type>
13810      @llvm.experimental.constrained.fdiv(<type> <op1>, <type> <op2>,
13811                                          metadata <rounding mode>,
13812                                          metadata <exception behavior>)
13813
13814Overview:
13815"""""""""
13816
13817The '``llvm.experimental.constrained.fdiv``' intrinsic returns the quotient of
13818its two operands.
13819
13820
13821Arguments:
13822""""""""""
13823
13824The first two arguments to the '``llvm.experimental.constrained.fdiv``'
13825intrinsic must be :ref:`floating-point <t_floating>` or :ref:`vector <t_vector>`
13826of floating-point values. Both arguments must have identical types.
13827
13828The third and fourth arguments specify the rounding mode and exception
13829behavior as described above.
13830
13831Semantics:
13832""""""""""
13833
13834The value produced is the floating-point quotient of the two value operands and
13835has the same type as the operands.
13836
13837
13838'``llvm.experimental.constrained.frem``' Intrinsic
13839^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
13840
13841Syntax:
13842"""""""
13843
13844::
13845
13846      declare <type>
13847      @llvm.experimental.constrained.frem(<type> <op1>, <type> <op2>,
13848                                          metadata <rounding mode>,
13849                                          metadata <exception behavior>)
13850
13851Overview:
13852"""""""""
13853
13854The '``llvm.experimental.constrained.frem``' intrinsic returns the remainder
13855from the division of its two operands.
13856
13857
13858Arguments:
13859""""""""""
13860
13861The first two arguments to the '``llvm.experimental.constrained.frem``'
13862intrinsic must be :ref:`floating-point <t_floating>` or :ref:`vector <t_vector>`
13863of floating-point values. Both arguments must have identical types.
13864
13865The third and fourth arguments specify the rounding mode and exception
13866behavior as described above.  The rounding mode argument has no effect, since
13867the result of frem is never rounded, but the argument is included for
13868consistency with the other constrained floating-point intrinsics.
13869
13870Semantics:
13871""""""""""
13872
13873The value produced is the floating-point remainder from the division of the two
13874value operands and has the same type as the operands.  The remainder has the
13875same sign as the dividend.
13876
13877'``llvm.experimental.constrained.fma``' Intrinsic
13878^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
13879
13880Syntax:
13881"""""""
13882
13883::
13884
13885      declare <type>
      @llvm.experimental.constrained.fma(<type> <op1>, <type> <op2>, <type> <op3>,
                                         metadata <rounding mode>,
                                         metadata <exception behavior>)
13889
13890Overview:
13891"""""""""
13892
13893The '``llvm.experimental.constrained.fma``' intrinsic returns the result of a
13894fused-multiply-add operation on its operands.
13895
13896Arguments:
13897""""""""""
13898
13899The first three arguments to the '``llvm.experimental.constrained.fma``'
13900intrinsic must be :ref:`floating-point <t_floating>` or :ref:`vector
13901<t_vector>` of floating-point values. All arguments must have identical types.
13902
13903The fourth and fifth arguments specify the rounding mode and exception behavior
13904as described above.
13905
13906Semantics:
13907""""""""""
13908
13909The result produced is the product of the first two operands added to the third
13910operand computed with infinite precision, and then rounded to the target
13911precision.
13912
13913Constrained libm-equivalent Intrinsics
13914--------------------------------------
13915
13916In addition to the basic floating-point operations for which constrained
13917intrinsics are described above, there are constrained versions of various
13918operations which provide equivalent behavior to a corresponding libm function.
13919These intrinsics allow the precise behavior of these operations with respect to
13920rounding mode and exception behavior to be controlled.
13921
13922As with the basic constrained floating-point intrinsics, the rounding mode
13923and exception behavior arguments only control the behavior of the optimizer.
13924They do not change the runtime floating-point environment.
13925
13926
13927'``llvm.experimental.constrained.sqrt``' Intrinsic
13928^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
13929
13930Syntax:
13931"""""""
13932
13933::
13934
13935      declare <type>
13936      @llvm.experimental.constrained.sqrt(<type> <op1>,
13937                                          metadata <rounding mode>,
13938                                          metadata <exception behavior>)
13939
13940Overview:
13941"""""""""
13942
13943The '``llvm.experimental.constrained.sqrt``' intrinsic returns the square root
13944of the specified value, returning the same value as the libm '``sqrt``'
13945functions would, but without setting ``errno``.
13946
13947Arguments:
13948""""""""""
13949
13950The first argument and the return type are floating-point numbers of the same
13951type.
13952
13953The second and third arguments specify the rounding mode and exception
13954behavior as described above.
13955
13956Semantics:
13957""""""""""
13958
13959This function returns the nonnegative square root of the specified value.
13960If the value is less than negative zero, a floating-point exception occurs
13961and the return value is architecture specific.
13962
13963
13964'``llvm.experimental.constrained.pow``' Intrinsic
13965^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
13966
13967Syntax:
13968"""""""
13969
13970::
13971
13972      declare <type>
13973      @llvm.experimental.constrained.pow(<type> <op1>, <type> <op2>,
13974                                         metadata <rounding mode>,
13975                                         metadata <exception behavior>)
13976
13977Overview:
13978"""""""""
13979
13980The '``llvm.experimental.constrained.pow``' intrinsic returns the first operand
13981raised to the (positive or negative) power specified by the second operand.
13982
13983Arguments:
13984""""""""""
13985
13986The first two arguments and the return value are floating-point numbers of the
13987same type.  The second argument specifies the power to which the first argument
13988should be raised.
13989
13990The third and fourth arguments specify the rounding mode and exception
13991behavior as described above.
13992
13993Semantics:
13994""""""""""
13995
13996This function returns the first value raised to the second power,
13997returning the same values as the libm ``pow`` functions would, and
13998handles error conditions in the same way.
13999
14000
14001'``llvm.experimental.constrained.powi``' Intrinsic
14002^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
14003
14004Syntax:
14005"""""""
14006
14007::
14008
14009      declare <type>
14010      @llvm.experimental.constrained.powi(<type> <op1>, i32 <op2>,
14011                                          metadata <rounding mode>,
14012                                          metadata <exception behavior>)
14013
14014Overview:
14015"""""""""
14016
14017The '``llvm.experimental.constrained.powi``' intrinsic returns the first operand
14018raised to the (positive or negative) power specified by the second operand. The
14019order of evaluation of multiplications is not defined. When a vector of
14020floating-point type is used, the second argument remains a scalar integer value.
14021
14022
14023Arguments:
14024""""""""""
14025
14026The first argument and the return value are floating-point numbers of the same
14027type.  The second argument is a 32-bit signed integer specifying the power to
14028which the first argument should be raised.
14029
14030The third and fourth arguments specify the rounding mode and exception
14031behavior as described above.
14032
14033Semantics:
14034""""""""""
14035
14036This function returns the first value raised to the second power with an
14037unspecified sequence of rounding operations.
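
One possible evaluation order is exponentiation by squaring, sketched below in C. This is an assumption chosen for illustration: the function name ``powi_model`` is hypothetical, and because the sequence of rounding operations is unspecified, an implementation may multiply in a different order and produce a slightly different result.

.. code-block:: c

    /* Hypothetical model of powi: base raised to a signed 32-bit power using
     * exponentiation by squaring. This is only one of many valid orders. */
    double powi_model(double base, int power)
    {
        /* Take the magnitude in unsigned arithmetic so that the most
         * negative integer does not overflow when negated. */
        unsigned int n = power < 0 ? 0u - (unsigned int)power
                                   : (unsigned int)power;
        double result = 1.0;

        while (n != 0) {
            if (n & 1u)
                result *= base; /* this bit of the exponent contributes */
            base *= base;       /* square for the next bit */
            n >>= 1;
        }
        return power < 0 ? 1.0 / result : result;
    }

For a negative power, this sketch computes the positive power first and then takes the reciprocal, which is again just one choice among several.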
14038
14039
14040'``llvm.experimental.constrained.sin``' Intrinsic
14041^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
14042
14043Syntax:
14044"""""""
14045
14046::
14047
14048      declare <type>
14049      @llvm.experimental.constrained.sin(<type> <op1>,
14050                                         metadata <rounding mode>,
14051                                         metadata <exception behavior>)
14052
14053Overview:
14054"""""""""
14055
14056The '``llvm.experimental.constrained.sin``' intrinsic returns the sine of the
14057first operand.
14058
14059Arguments:
14060""""""""""
14061
14062The first argument and the return type are floating-point numbers of the same
14063type.
14064
14065The second and third arguments specify the rounding mode and exception
14066behavior as described above.
14067
14068Semantics:
14069""""""""""
14070
14071This function returns the sine of the specified operand, returning the
14072same values as the libm ``sin`` functions would, and handles error
14073conditions in the same way.
14074
14075
14076'``llvm.experimental.constrained.cos``' Intrinsic
14077^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
14078
14079Syntax:
14080"""""""
14081
14082::
14083
14084      declare <type>
14085      @llvm.experimental.constrained.cos(<type> <op1>,
14086                                         metadata <rounding mode>,
14087                                         metadata <exception behavior>)
14088
14089Overview:
14090"""""""""
14091
14092The '``llvm.experimental.constrained.cos``' intrinsic returns the cosine of the
14093first operand.
14094
14095Arguments:
14096""""""""""
14097
14098The first argument and the return type are floating-point numbers of the same
14099type.
14100
14101The second and third arguments specify the rounding mode and exception
14102behavior as described above.
14103
14104Semantics:
14105""""""""""
14106
14107This function returns the cosine of the specified operand, returning the
14108same values as the libm ``cos`` functions would, and handles error
14109conditions in the same way.
14110
14111
14112'``llvm.experimental.constrained.exp``' Intrinsic
14113^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
14114
14115Syntax:
14116"""""""
14117
14118::
14119
14120      declare <type>
14121      @llvm.experimental.constrained.exp(<type> <op1>,
14122                                         metadata <rounding mode>,
14123                                         metadata <exception behavior>)
14124
14125Overview:
14126"""""""""
14127
14128The '``llvm.experimental.constrained.exp``' intrinsic computes the base-e
14129exponential of the specified value.
14130
14131Arguments:
14132""""""""""
14133
14134The first argument and the return value are floating-point numbers of the same
14135type.
14136
14137The second and third arguments specify the rounding mode and exception
14138behavior as described above.
14139
14140Semantics:
14141""""""""""
14142
14143This function returns the same values as the libm ``exp`` functions
14144would, and handles error conditions in the same way.
14145
14146
14147'``llvm.experimental.constrained.exp2``' Intrinsic
14148^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
14149
14150Syntax:
14151"""""""
14152
14153::
14154
14155      declare <type>
14156      @llvm.experimental.constrained.exp2(<type> <op1>,
14157                                          metadata <rounding mode>,
14158                                          metadata <exception behavior>)
14159
14160Overview:
14161"""""""""
14162
14163The '``llvm.experimental.constrained.exp2``' intrinsic computes the base-2
14164exponential of the specified value.
14165
14166
14167Arguments:
14168""""""""""
14169
14170The first argument and the return value are floating-point numbers of the same
14171type.
14172
14173The second and third arguments specify the rounding mode and exception
14174behavior as described above.
14175
14176Semantics:
14177""""""""""
14178
14179This function returns the same values as the libm ``exp2`` functions
14180would, and handles error conditions in the same way.
14181
14182
14183'``llvm.experimental.constrained.log``' Intrinsic
14184^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
14185
14186Syntax:
14187"""""""
14188
14189::
14190
14191      declare <type>
14192      @llvm.experimental.constrained.log(<type> <op1>,
14193                                         metadata <rounding mode>,
14194                                         metadata <exception behavior>)
14195
14196Overview:
14197"""""""""
14198
14199The '``llvm.experimental.constrained.log``' intrinsic computes the base-e
14200logarithm of the specified value.
14201
14202Arguments:
14203""""""""""
14204
14205The first argument and the return value are floating-point numbers of the same
14206type.
14207
14208The second and third arguments specify the rounding mode and exception
14209behavior as described above.
14210
14211
14212Semantics:
14213""""""""""
14214
14215This function returns the same values as the libm ``log`` functions
14216would, and handles error conditions in the same way.
14217
14218
14219'``llvm.experimental.constrained.log10``' Intrinsic
14220^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
14221
14222Syntax:
14223"""""""
14224
14225::
14226
14227      declare <type>
14228      @llvm.experimental.constrained.log10(<type> <op1>,
14229                                           metadata <rounding mode>,
14230                                           metadata <exception behavior>)
14231
14232Overview:
14233"""""""""
14234
14235The '``llvm.experimental.constrained.log10``' intrinsic computes the base-10
14236logarithm of the specified value.
14237
14238Arguments:
14239""""""""""
14240
14241The first argument and the return value are floating-point numbers of the same
14242type.
14243
14244The second and third arguments specify the rounding mode and exception
14245behavior as described above.
14246
14247Semantics:
14248""""""""""
14249
14250This function returns the same values as the libm ``log10`` functions
14251would, and handles error conditions in the same way.
14252
14253
14254'``llvm.experimental.constrained.log2``' Intrinsic
14255^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
14256
14257Syntax:
14258"""""""
14259
14260::
14261
14262      declare <type>
14263      @llvm.experimental.constrained.log2(<type> <op1>,
14264                                          metadata <rounding mode>,
14265                                          metadata <exception behavior>)
14266
14267Overview:
14268"""""""""
14269
14270The '``llvm.experimental.constrained.log2``' intrinsic computes the base-2
14271logarithm of the specified value.
14272
14273Arguments:
14274""""""""""
14275
14276The first argument and the return value are floating-point numbers of the same
14277type.
14278
14279The second and third arguments specify the rounding mode and exception
14280behavior as described above.
14281
14282Semantics:
14283""""""""""
14284
14285This function returns the same values as the libm ``log2`` functions
14286would, and handles error conditions in the same way.
14287
14288
14289'``llvm.experimental.constrained.rint``' Intrinsic
14290^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
14291
14292Syntax:
14293"""""""
14294
14295::
14296
14297      declare <type>
14298      @llvm.experimental.constrained.rint(<type> <op1>,
14299                                          metadata <rounding mode>,
14300                                          metadata <exception behavior>)
14301
14302Overview:
14303"""""""""
14304
14305The '``llvm.experimental.constrained.rint``' intrinsic returns the first
14306operand rounded to the nearest integer. It may raise an inexact floating-point
14307exception if the operand is not an integer.
14308
14309Arguments:
14310""""""""""
14311
14312The first argument and the return value are floating-point numbers of the same
14313type.
14314
14315The second and third arguments specify the rounding mode and exception
14316behavior as described above.
14317
14318Semantics:
14319""""""""""
14320
14321This function returns the same values as the libm ``rint`` functions
14322would, and handles error conditions in the same way.  The rounding mode is
14323described, not determined, by the rounding mode argument.  The actual rounding
14324mode is determined by the runtime floating-point environment.  The rounding
14325mode argument is only intended as information to the compiler.
14326
14327
14328'``llvm.experimental.constrained.nearbyint``' Intrinsic
14329^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
14330
14331Syntax:
14332"""""""
14333
14334::
14335
14336      declare <type>
14337      @llvm.experimental.constrained.nearbyint(<type> <op1>,
14338                                               metadata <rounding mode>,
14339                                               metadata <exception behavior>)
14340
14341Overview:
14342"""""""""
14343
14344The '``llvm.experimental.constrained.nearbyint``' intrinsic returns the first
14345operand rounded to the nearest integer. It will not raise an inexact
14346floating-point exception if the operand is not an integer.
14347
14348
14349Arguments:
14350""""""""""
14351
14352The first argument and the return value are floating-point numbers of the same
14353type.
14354
14355The second and third arguments specify the rounding mode and exception
14356behavior as described above.
14357
14358Semantics:
14359""""""""""
14360
14361This function returns the same values as the libm ``nearbyint`` functions
14362would, and handles error conditions in the same way.  The rounding mode is
14363described, not determined, by the rounding mode argument.  The actual rounding
14364mode is determined by the runtime floating-point environment.  The rounding
14365mode argument is only intended as information to the compiler.
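
The difference between ``rint`` and ``nearbyint`` can be sketched with two
otherwise identical calls. The function names below are hypothetical, and the
metadata strings are assumed to be the values described above.

.. code-block:: llvm

    declare float @llvm.experimental.constrained.rint.f32(float, metadata, metadata)
    declare float @llvm.experimental.constrained.nearbyint.f32(float, metadata, metadata)

    ; May raise the inexact exception when %x is not an integer.
    define float @round_rint(float %x) {
    entry:
      %r = call float @llvm.experimental.constrained.rint.f32(
                          float %x,
                          metadata !"round.dynamic",
                          metadata !"fpexcept.strict")
      ret float %r
    }

    ; Same rounding, but the inexact exception is not raised.
    define float @round_nearbyint(float %x) {
    entry:
      %r = call float @llvm.experimental.constrained.nearbyint.f32(
                          float %x,
                          metadata !"round.dynamic",
                          metadata !"fpexcept.strict")
      ret float %r
    }

In both cases ``!"round.dynamic"`` only tells the compiler that the rounding
mode is not known statically; the runtime floating-point environment decides
the actual rounding.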
14366
14367
14368General Intrinsics
14369------------------
14370
14371This class of intrinsics is designed to be generic and has no specific
14372purpose.
14373
14374'``llvm.var.annotation``' Intrinsic
14375^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
14376
14377Syntax:
14378"""""""
14379
14380::
14381
14382      declare void @llvm.var.annotation(i8* <val>, i8* <str>, i8* <str>, i32  <int>)
14383
14384Overview:
14385"""""""""
14386
The '``llvm.var.annotation``' intrinsic annotates a local variable with
arbitrary strings.
14388
14389Arguments:
14390""""""""""
14391
14392The first argument is a pointer to a value, the second is a pointer to a
14393global string, the third is a pointer to a global string which is the
14394source file name, and the last argument is the line number.
14395
14396Semantics:
14397""""""""""
14398
14399This intrinsic allows annotation of local variables with arbitrary
14400strings. This can be useful for special purpose optimizations that want
14401to look for these annotations. These have no other defined use; they are
14402ignored by code generation and optimization.
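
A minimal sketch of annotating a local variable follows. The global names, the
annotation text, the file name, and the line number are all made up for
illustration.

.. code-block:: llvm

    @.str.annot = private unnamed_addr constant [9 x i8] c"my_annot\00", section "llvm.metadata"
    @.str.file  = private unnamed_addr constant [7 x i8] c"test.c\00", section "llvm.metadata"

    declare void @llvm.var.annotation(i8*, i8*, i8*, i32)

    define void @f() {
    entry:
      %x = alloca i32, align 4
      ; The intrinsic takes an i8*, so the variable's address is cast first.
      %x.i8 = bitcast i32* %x to i8*
      call void @llvm.var.annotation(i8* %x.i8,
          i8* getelementptr inbounds ([9 x i8], [9 x i8]* @.str.annot, i32 0, i32 0),
          i8* getelementptr inbounds ([7 x i8], [7 x i8]* @.str.file, i32 0, i32 0),
          i32 3)
      ret void
    }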
14403
14404'``llvm.ptr.annotation.*``' Intrinsic
14405^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
14406
14407Syntax:
14408"""""""
14409
14410This is an overloaded intrinsic. You can use '``llvm.ptr.annotation``' on a
14411pointer to an integer of any width. *NOTE* you must specify an address space for
14412the pointer. The identifier for the default address space is the integer
14413'``0``'.
14414
14415::
14416
14417      declare i8*   @llvm.ptr.annotation.p<address space>i8(i8* <val>, i8* <str>, i8* <str>, i32  <int>)
14418      declare i16*  @llvm.ptr.annotation.p<address space>i16(i16* <val>, i8* <str>, i8* <str>, i32  <int>)
14419      declare i32*  @llvm.ptr.annotation.p<address space>i32(i32* <val>, i8* <str>, i8* <str>, i32  <int>)
14420      declare i64*  @llvm.ptr.annotation.p<address space>i64(i64* <val>, i8* <str>, i8* <str>, i32  <int>)
14421      declare i256* @llvm.ptr.annotation.p<address space>i256(i256* <val>, i8* <str>, i8* <str>, i32  <int>)
14422
14423Overview:
14424"""""""""
14425
The '``llvm.ptr.annotation``' intrinsic annotates a pointer to an integer with
arbitrary strings and returns that pointer.
14427
14428Arguments:
14429""""""""""
14430
14431The first argument is a pointer to an integer value of arbitrary bitwidth
14432(result of some expression), the second is a pointer to a global string, the
14433third is a pointer to a global string which is the source file name, and the
14434last argument is the line number. It returns the value of the first argument.
14435
14436Semantics:
14437""""""""""
14438
14439This intrinsic allows annotation of a pointer to an integer with arbitrary
14440strings. This can be useful for special purpose optimizations that want to look
14441for these annotations. These have no other defined use; they are ignored by code
14442generation and optimization.
14443
14444'``llvm.annotation.*``' Intrinsic
14445^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
14446
14447Syntax:
14448"""""""
14449
14450This is an overloaded intrinsic. You can use '``llvm.annotation``' on
14451any integer bit width.
14452
14453::
14454
14455      declare i8 @llvm.annotation.i8(i8 <val>, i8* <str>, i8* <str>, i32  <int>)
14456      declare i16 @llvm.annotation.i16(i16 <val>, i8* <str>, i8* <str>, i32  <int>)
14457      declare i32 @llvm.annotation.i32(i32 <val>, i8* <str>, i8* <str>, i32  <int>)
14458      declare i64 @llvm.annotation.i64(i64 <val>, i8* <str>, i8* <str>, i32  <int>)
14459      declare i256 @llvm.annotation.i256(i256 <val>, i8* <str>, i8* <str>, i32  <int>)
14460
14461Overview:
14462"""""""""
14463
The '``llvm.annotation``' intrinsic annotates an integer expression with
arbitrary strings and returns the value of that expression.
14465
14466Arguments:
14467""""""""""
14468
14469The first argument is an integer value (result of some expression), the
14470second is a pointer to a global string, the third is a pointer to a
14471global string which is the source file name, and the last argument is
14472the line number. It returns the value of the first argument.
14473
14474Semantics:
14475""""""""""
14476
14477This intrinsic allows annotations to be put on arbitrary expressions
14478with arbitrary strings. This can be useful for special purpose
14479optimizations that want to look for these annotations. These have no
14480other defined use; they are ignored by code generation and optimization.
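
As a sketch, the ``i32`` overload could be used as shown below. The global
names, strings, and line number are illustrative assumptions only.

.. code-block:: llvm

    @.str.tag  = private unnamed_addr constant [4 x i8] c"hot\00", section "llvm.metadata"
    @.str.file = private unnamed_addr constant [7 x i8] c"test.c\00", section "llvm.metadata"

    declare i32 @llvm.annotation.i32(i32, i8*, i8*, i32)

    define i32 @tagged_add(i32 %a, i32 %b) {
    entry:
      %sum = add i32 %a, %b
      ; %sum.annot carries the same value as %sum, plus the annotation.
      %sum.annot = call i32 @llvm.annotation.i32(i32 %sum,
          i8* getelementptr inbounds ([4 x i8], [4 x i8]* @.str.tag, i32 0, i32 0),
          i8* getelementptr inbounds ([7 x i8], [7 x i8]* @.str.file, i32 0, i32 0),
          i32 12)
      ret i32 %sum.annot
    }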
14481
14482'``llvm.codeview.annotation``' Intrinsic
14483^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
14484
14485Syntax:
14486"""""""
14487
14488This annotation emits a label at its program point and an associated
14489``S_ANNOTATION`` codeview record with some additional string metadata. This is
14490used to implement MSVC's ``__annotation`` intrinsic. It is marked
14491``noduplicate``, so calls to this intrinsic prevent inlining and should be
14492considered expensive.
14493
14494::
14495
14496      declare void @llvm.codeview.annotation(metadata)
14497
14498Arguments:
14499""""""""""
14500
14501The argument should be an MDTuple containing any number of MDStrings.
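
A minimal sketch of a call site, with made-up string contents, might be:

.. code-block:: llvm

    declare void @llvm.codeview.annotation(metadata)

    define void @f() {
    entry:
      ; !0 is an MDTuple of MDStrings, as required by the argument rules.
      call void @llvm.codeview.annotation(metadata !0)
      ret void
    }

    !0 = !{!"tag", !"value"}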
14502
14503'``llvm.trap``' Intrinsic
14504^^^^^^^^^^^^^^^^^^^^^^^^^
14505
14506Syntax:
14507"""""""
14508
14509::
14510
14511      declare void @llvm.trap() noreturn nounwind
14512
14513Overview:
14514"""""""""
14515
The '``llvm.trap``' intrinsic causes an execution trap.
14517
14518Arguments:
14519""""""""""
14520
14521None.
14522
14523Semantics:
14524""""""""""
14525
14526This intrinsic is lowered to the target dependent trap instruction. If
14527the target does not have a trap instruction, this intrinsic will be
14528lowered to a call of the ``abort()`` function.
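
Because the intrinsic is ``noreturn``, a call to it is normally followed by
``unreachable``. The sketch below uses a hypothetical function
``@checked_div`` that traps on division by zero.

.. code-block:: llvm

    declare void @llvm.trap() noreturn nounwind

    define i32 @checked_div(i32 %a, i32 %b) {
    entry:
      %is.zero = icmp eq i32 %b, 0
      br i1 %is.zero, label %fail, label %ok

    fail:
      ; Control never continues past the trap.
      call void @llvm.trap()
      unreachable

    ok:
      %q = udiv i32 %a, %b
      ret i32 %q
    }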
14529
14530'``llvm.debugtrap``' Intrinsic
14531^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
14532
14533Syntax:
14534"""""""
14535
14536::
14537
14538      declare void @llvm.debugtrap() nounwind
14539
14540Overview:
14541"""""""""
14542
The '``llvm.debugtrap``' intrinsic causes an execution trap that is intended
to request the attention of a debugger.
14544
14545Arguments:
14546""""""""""
14547
14548None.
14549
14550Semantics:
14551""""""""""
14552
14553This intrinsic is lowered to code which is intended to cause an
14554execution trap with the intention of requesting the attention of a
14555debugger.
14556
14557'``llvm.stackprotector``' Intrinsic
14558^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
14559
14560Syntax:
14561"""""""
14562
14563::
14564
14565      declare void @llvm.stackprotector(i8* <guard>, i8** <slot>)
14566
14567Overview:
14568"""""""""
14569
14570The ``llvm.stackprotector`` intrinsic takes the ``guard`` and stores it
14571onto the stack at ``slot``. The stack slot is adjusted to ensure that it
14572is placed on the stack before local variables.
14573
14574Arguments:
14575""""""""""
14576
The ``llvm.stackprotector`` intrinsic requires two pointer arguments.
The first argument is the value loaded from the stack guard
``@__stack_chk_guard``. The second argument is an ``alloca`` that has
enough space to hold the value of the guard.
14581
14582Semantics:
14583""""""""""
14584
14585This intrinsic causes the prologue/epilogue inserter to force the position of
14586the ``AllocaInst`` stack slot to be before local variables on the stack. This is
14587to ensure that if a local variable on the stack is overwritten, it will destroy
14588the value of the guard. When the function exits, the guard on the stack is
14589checked against the original guard by ``llvm.stackprotectorcheck``. If they are
14590different, then ``llvm.stackprotectorcheck`` causes the program to abort by
14591calling the ``__stack_chk_fail()`` function.
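
The prologue half of this scheme can be sketched as follows. This is only an
illustration: such code is normally inserted by the compiler rather than
written by hand, the names ``@protected``, ``%guard.slot`` and ``%buf`` are
hypothetical, and the epilogue check is omitted.

.. code-block:: llvm

    @__stack_chk_guard = external global i8*

    declare void @llvm.stackprotector(i8*, i8**)

    define void @protected() {
    entry:
      %guard.slot = alloca i8*        ; slot that will hold the guard copy
      %buf = alloca [16 x i8]         ; local buffer that could be overrun
      %guard = load i8*, i8** @__stack_chk_guard
      call void @llvm.stackprotector(i8* %guard, i8** %guard.slot)
      ; ... code that writes into %buf would go here ...
      ret void
    }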
14592
14593'``llvm.stackguard``' Intrinsic
14594^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
14595
14596Syntax:
14597"""""""
14598
14599::
14600
14601      declare i8* @llvm.stackguard()
14602
14603Overview:
14604"""""""""
14605
14606The ``llvm.stackguard`` intrinsic returns the system stack guard value.
14607
Frontends should not generate it, since it is intended only for internal use.
It exists because the IR form of the stack protector is still supported in
FastISel.
14611
14612Arguments:
14613""""""""""
14614
14615None.
14616
14617Semantics:
14618""""""""""
14619
14620On some platforms, the value returned by this intrinsic remains unchanged
14621between loads in the same thread. On other platforms, it returns the same
14622global variable value, if any, e.g. ``@__stack_chk_guard``.
14623
Currently, some platforms (e.g. X86 Linux) use IR-level customized stack guard
loading that ``llvm.stackguard()`` does not handle, although it should in the
future.
14627
14628'``llvm.objectsize``' Intrinsic
14629^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
14630
14631Syntax:
14632"""""""
14633
14634::
14635
14636      declare i32 @llvm.objectsize.i32(i8* <object>, i1 <min>, i1 <nullunknown>)
14637      declare i64 @llvm.objectsize.i64(i8* <object>, i1 <min>, i1 <nullunknown>)
14638
14639Overview:
14640"""""""""
14641
14642The ``llvm.objectsize`` intrinsic is designed to provide information to
14643the optimizers to determine at compile time whether a) an operation
14644(like memcpy) will overflow a buffer that corresponds to an object, or
14645b) that a runtime check for overflow isn't necessary. An object in this
14646context means an allocation of a specific class, structure, array, or
14647other object.
14648
14649Arguments:
14650""""""""""
14651
14652The ``llvm.objectsize`` intrinsic takes three arguments. The first argument is
14653a pointer to or into the ``object``. The second argument determines whether
14654``llvm.objectsize`` returns 0 (if true) or -1 (if false) when the object size
14655is unknown. The third argument controls how ``llvm.objectsize`` acts when
14656``null`` in address space 0 is used as its pointer argument. If it's ``false``,
14657``llvm.objectsize`` reports 0 bytes available when given ``null``. Otherwise, if
14658the ``null`` is in a non-zero address space or if ``true`` is given for the
14659third argument of ``llvm.objectsize``, we assume its size is unknown.
14660
14661The second and third arguments only accept constants.
14662
14663Semantics:
14664""""""""""
14665
14666The ``llvm.objectsize`` intrinsic is lowered to a constant representing
14667the size of the object concerned. If the size cannot be determined at
14668compile time, ``llvm.objectsize`` returns ``i32/i64 -1 or 0`` (depending
14669on the ``min`` argument).
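
As a sketch, querying the space remaining behind a pointer into a fixed-size
stack object could look like this. The function name is hypothetical, and the
intrinsic name follows the declaration shown above.

.. code-block:: llvm

    declare i64 @llvm.objectsize.i64(i8*, i1, i1)

    define i64 @bytes_left() {
    entry:
      %buf = alloca [32 x i8], align 1
      ; Point 8 bytes into the 32-byte object.
      %p = getelementptr inbounds [32 x i8], [32 x i8]* %buf, i64 0, i64 8
      ; min = false: return -1 if the size turns out to be unknown.
      ; nullunknown = true: treat a null pointer as having unknown size.
      %n = call i64 @llvm.objectsize.i64(i8* %p, i1 false, i1 true)
      ret i64 %n
    }

Here the size is known at compile time, so the call is expected to fold to the
number of bytes between ``%p`` and the end of ``%buf``.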
14670
14671'``llvm.expect``' Intrinsic
14672^^^^^^^^^^^^^^^^^^^^^^^^^^^
14673
14674Syntax:
14675"""""""
14676
14677This is an overloaded intrinsic. You can use ``llvm.expect`` on any
14678integer bit width.
14679
14680::
14681
14682      declare i1 @llvm.expect.i1(i1 <val>, i1 <expected_val>)
14683      declare i32 @llvm.expect.i32(i32 <val>, i32 <expected_val>)
14684      declare i64 @llvm.expect.i64(i64 <val>, i64 <expected_val>)
14685
14686Overview:
14687"""""""""
14688
14689The ``llvm.expect`` intrinsic provides information about expected (the
14690most probable) value of ``val``, which can be used by optimizers.
14691
14692Arguments:
14693""""""""""
14694
The ``llvm.expect`` intrinsic takes two arguments. The first argument is
a value. The second argument is the expected value; it must be a
constant, and variables are not allowed.
14698
14699Semantics:
14700""""""""""
14701
14702This intrinsic is lowered to the ``val``.
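
A typical use, sketched here with hypothetical names, wraps a branch condition
so that the optimizer knows which successor is the likely one:

.. code-block:: llvm

    declare i1 @llvm.expect.i1(i1, i1)

    define i32 @f(i32 %x) {
    entry:
      %is.zero = icmp eq i32 %x, 0
      ; Tell the optimizer that %is.zero is most probably false.
      %cond = call i1 @llvm.expect.i1(i1 %is.zero, i1 false)
      br i1 %cond, label %rare, label %common

    rare:
      ret i32 -1

    common:
      %y = add i32 %x, 1
      ret i32 %y
    }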
14703
14704.. _int_assume:
14705
14706'``llvm.assume``' Intrinsic
14707^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
14708
14709Syntax:
14710"""""""
14711
14712::
14713
14714      declare void @llvm.assume(i1 %cond)
14715
14716Overview:
14717"""""""""
14718
The ``llvm.assume`` intrinsic allows the optimizer to assume that the provided
condition is true. This information can then be used in simplifying other parts
of the code.
14722
14723Arguments:
14724""""""""""
14725
14726The condition which the optimizer may assume is always true.
14727
14728Semantics:
14729""""""""""
14730
14731The intrinsic allows the optimizer to assume that the provided condition is
14732always true whenever the control flow reaches the intrinsic call. No code is
14733generated for this intrinsic, and instructions that contribute only to the
14734provided condition are not used for code generation. If the condition is
14735violated during execution, the behavior is undefined.
14736
14737Note that the optimizer might limit the transformations performed on values
14738used by the ``llvm.assume`` intrinsic in order to preserve the instructions
14739only used to form the intrinsic's input argument. This might prove undesirable
14740if the extra information provided by the ``llvm.assume`` intrinsic does not cause
14741sufficient overall improvement in code quality. For this reason,
14742``llvm.assume`` should not be used to document basic mathematical invariants
14743that the optimizer can otherwise deduce or facts that are of little use to the
14744optimizer.
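
One fact the optimizer usually cannot deduce on its own is pointer alignment.
The sketch below, with a hypothetical function name, asserts that ``%p`` is
16-byte aligned before loading through it:

.. code-block:: llvm

    declare void @llvm.assume(i1)

    define i32 @load_aligned(i32* %p) {
    entry:
      %ptrint = ptrtoint i32* %p to i64
      %lowbits = and i64 %ptrint, 15
      %aligned = icmp eq i64 %lowbits, 0
      ; Behavior is undefined if %p is not 16-byte aligned at this point.
      call void @llvm.assume(i1 %aligned)
      %v = load i32, i32* %p, align 4
      ret i32 %v
    }

The three instructions feeding the condition exist only for the assumption, so
no code is generated for them.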
14745
14746.. _int_ssa_copy:
14747
14748'``llvm.ssa_copy``' Intrinsic
14749^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
14750
14751Syntax:
14752"""""""
14753
14754::
14755
14756      declare type @llvm.ssa_copy(type %operand) returned(1) readnone
14757
14758Arguments:
14759""""""""""
14760
14761The first argument is an operand which is used as the returned value.
14762
14763Overview:
14764""""""""""
14765
14766The ``llvm.ssa_copy`` intrinsic can be used to attach information to
14767operations by copying them and giving them new names.  For example,
14768the PredicateInfo utility uses it to build Extended SSA form, and
14769attach various forms of information to operands that dominate specific
14770uses.  It is not meant for general use, only for building temporary
14771renaming forms that require value splits at certain points.
14772
14773.. _type.test:
14774
14775'``llvm.type.test``' Intrinsic
14776^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
14777
14778Syntax:
14779"""""""
14780
14781::
14782
14783      declare i1 @llvm.type.test(i8* %ptr, metadata %type) nounwind readnone
14784
14785
14786Arguments:
14787""""""""""
14788
14789The first argument is a pointer to be tested. The second argument is a
14790metadata object representing a :doc:`type identifier <TypeMetadata>`.
14791
14792Overview:
14793"""""""""
14794
14795The ``llvm.type.test`` intrinsic tests whether the given pointer is associated
14796with the given type identifier.
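
A control flow integrity style check built on this intrinsic might be sketched
as follows. The type identifier string ``"_ZTSFvvE"`` and the function name are
assumptions chosen for illustration.

.. code-block:: llvm

    declare i1 @llvm.type.test(i8*, metadata) nounwind readnone
    declare void @llvm.trap() noreturn nounwind

    define void @call_checked(void ()* %fp) {
    entry:
      %fp.i8 = bitcast void ()* %fp to i8*
      %ok = call i1 @llvm.type.test(i8* %fp.i8, metadata !"_ZTSFvvE")
      br i1 %ok, label %cont, label %fail

    cont:
      ; The pointer is associated with the type identifier; call through it.
      call void %fp()
      ret void

    fail:
      call void @llvm.trap()
      unreachable
    }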
14797
14798'``llvm.type.checked.load``' Intrinsic
14799^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
14800
14801Syntax:
14802"""""""
14803
14804::
14805
14806      declare {i8*, i1} @llvm.type.checked.load(i8* %ptr, i32 %offset, metadata %type) argmemonly nounwind readonly
14807
14808
14809Arguments:
14810""""""""""
14811
14812The first argument is a pointer from which to load a function pointer. The
14813second argument is the byte offset from which to load the function pointer. The
14814third argument is a metadata object representing a :doc:`type identifier
14815<TypeMetadata>`.
14816
14817Overview:
14818"""""""""
14819
14820The ``llvm.type.checked.load`` intrinsic safely loads a function pointer from a
14821virtual table pointer using type metadata. This intrinsic is used to implement
14822control flow integrity in conjunction with virtual call optimization. The
14823virtual call optimization pass will optimize away ``llvm.type.checked.load``
14824intrinsics associated with devirtualized calls, thereby removing the type
14825check in cases where it is not needed to enforce the control flow integrity
14826constraint.
14827
14828If the given pointer is associated with a type metadata identifier, this
14829function returns true as the second element of its return value. (Note that
14830the function may also return true if the given pointer is not associated
14831with a type metadata identifier.) If the function's return value's second
14832element is true, the following rules apply to the first element:
14833
14834- If the given pointer is associated with the given type metadata identifier,
14835  it is the function pointer loaded from the given byte offset from the given
14836  pointer.
14837
14838- If the given pointer is not associated with the given type metadata
14839  identifier, it is one of the following (the choice of which is unspecified):
14840
14841  1. The function pointer that would have been loaded from an arbitrarily chosen
14842     (through an unspecified mechanism) pointer associated with the type
14843     metadata.
14844
14845  2. If the function has a non-void return type, a pointer to a function that
14846     returns an unspecified value without causing side effects.
14847
14848If the function's return value's second element is false, the value of the
14849first element is undefined.
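
The rules above can be illustrated with a checked virtual call. This is a
sketch only: the type identifier ``"_ZTS1A"``, the function name, and the
callee signature are hypothetical.

.. code-block:: llvm

    declare {i8*, i1} @llvm.type.checked.load(i8*, i32, metadata)
    declare void @llvm.trap() noreturn nounwind

    define void @vcall(i8* %vtable, i8* %obj) {
    entry:
      ; Load the function pointer at byte offset 0 of the virtual table.
      %pair = call {i8*, i1} @llvm.type.checked.load(i8* %vtable, i32 0, metadata !"_ZTS1A")
      %ok = extractvalue {i8*, i1} %pair, 1
      br i1 %ok, label %cont, label %fail

    cont:
      ; The first element is only meaningful when the second is true.
      %fptr.i8 = extractvalue {i8*, i1} %pair, 0
      %fptr = bitcast i8* %fptr.i8 to void (i8*)*
      call void %fptr(i8* %obj)
      ret void

    fail:
      call void @llvm.trap()
      unreachable
    }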
14850
14851
14852'``llvm.donothing``' Intrinsic
14853^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
14854
14855Syntax:
14856"""""""
14857
14858::
14859
14860      declare void @llvm.donothing() nounwind readnone
14861
14862Overview:
14863"""""""""
14864
The ``llvm.donothing`` intrinsic doesn't perform any operation. It is one of
only three intrinsics (the others being ``llvm.experimental.patchpoint`` and
``llvm.experimental.gc.statepoint``) that can be called with an invoke
instruction.
14869
14870Arguments:
14871""""""""""
14872
14873None.
14874
14875Semantics:
14876""""""""""
14877
14878This intrinsic does nothing, and it's removed by optimizers and ignored
14879by codegen.
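
A sketch of invoking the intrinsic is shown below. The personality function
``@__gxx_personality_v0`` is an assumption; any valid personality would do.

.. code-block:: llvm

    declare void @llvm.donothing() nounwind readnone
    declare i32 @__gxx_personality_v0(...)

    define void @f() personality i8* bitcast (i32 (...)* @__gxx_personality_v0 to i8*) {
    entry:
      invoke void @llvm.donothing()
              to label %cont unwind label %lpad

    cont:
      ret void

    lpad:
      %lp = landingpad { i8*, i32 }
              cleanup
      resume { i8*, i32 } %lp
    }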
14880
14881'``llvm.experimental.deoptimize``' Intrinsic
14882^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
14883
14884Syntax:
14885"""""""
14886
14887::
14888
14889      declare type @llvm.experimental.deoptimize(...) [ "deopt"(...) ]
14890
14891Overview:
14892"""""""""
14893
This intrinsic, together with :ref:`deoptimization operand bundles
<deopt_opbundles>`, allows frontends to express transfer of control and
frame-local state from the currently executing (typically more specialized,
hence faster) version of a function into another (typically more generic, hence
slower) version.
14899
In languages with a fully integrated managed runtime, like Java and JavaScript,
this intrinsic can be used to implement "uncommon trap" or "side exit" like
functionality.  In unmanaged languages like C and C++, this intrinsic can be
used to represent the slow paths of specialized functions.
14904
14905
14906Arguments:
14907""""""""""
14908
14909The intrinsic takes an arbitrary number of arguments, whose meaning is
14910decided by the :ref:`lowering strategy<deoptimize_lowering>`.
14911
14912Semantics:
14913""""""""""
14914
14915The ``@llvm.experimental.deoptimize`` intrinsic executes an attached
14916deoptimization continuation (denoted using a :ref:`deoptimization
14917operand bundle <deopt_opbundles>`) and returns the value returned by
14918the deoptimization continuation.  Defining the semantic properties of
14919the continuation itself is out of scope of the language reference --
14920as far as LLVM is concerned, the deoptimization continuation can
14921invoke arbitrary side effects, including reading from and writing to
14922the entire heap.
14923
14924Deoptimization continuations expressed using ``"deopt"`` operand bundles always
14925continue execution to the end of the physical frame containing them, so all
14926calls to ``@llvm.experimental.deoptimize`` must be in "tail position":
14927
14928   - ``@llvm.experimental.deoptimize`` cannot be invoked.
14929   - The call must immediately precede a :ref:`ret <i_ret>` instruction.
14930   - The ``ret`` instruction must return the value produced by the
14931     ``@llvm.experimental.deoptimize`` call if there is one, or void.
14932
14933Note that the above restrictions imply that the return type for a call to
14934``@llvm.experimental.deoptimize`` will match the return type of its immediate
14935caller.
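
A call in tail position that satisfies these restrictions might be sketched as
follows. The function name and the contents of the ``"deopt"`` bundle are
hypothetical.

.. code-block:: llvm

    declare i32 @llvm.experimental.deoptimize.i32(...)

    define i32 @specialized(i32 %x) {
    entry:
      %unexpected = icmp slt i32 %x, 0
      br i1 %unexpected, label %slow, label %fast

    slow:
      ; The call immediately precedes the ret, and the ret returns its value.
      %r = call i32 (...) @llvm.experimental.deoptimize.i32(i32 %x) [ "deopt"(i32 %x) ]
      ret i32 %r

    fast:
      ret i32 %x
    }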
14936
14937The inliner composes the ``"deopt"`` continuations of the caller into the
14938``"deopt"`` continuations present in the inlinee, and also updates calls to this
14939intrinsic to return directly from the frame of the function it inlined into.
14940
14941All declarations of ``@llvm.experimental.deoptimize`` must share the
14942same calling convention.
14943
14944.. _deoptimize_lowering:
14945
14946Lowering:
14947"""""""""
14948
14949Calls to ``@llvm.experimental.deoptimize`` are lowered to calls to the
14950symbol ``__llvm_deoptimize`` (it is the frontend's responsibility to
14951ensure that this symbol is defined).  The call arguments to
14952``@llvm.experimental.deoptimize`` are lowered as if they were formal
14953arguments of the specified types, and not as varargs.
14954
14955
14956'``llvm.experimental.guard``' Intrinsic
14957^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
14958
14959Syntax:
14960"""""""
14961
14962::
14963
14964      declare void @llvm.experimental.guard(i1, ...) [ "deopt"(...) ]
14965
14966Overview:
14967"""""""""
14968
14969This intrinsic, together with :ref:`deoptimization operand bundles
14970<deopt_opbundles>`, allows frontends to express guards or checks on
14971optimistic assumptions made during compilation.  The semantics of
14972``@llvm.experimental.guard`` is defined in terms of
14973``@llvm.experimental.deoptimize`` -- its body is defined to be
14974equivalent to:
14975
14976.. code-block:: text
14977
14978  define void @llvm.experimental.guard(i1 %pred, <args...>) {
14979    %realPred = and i1 %pred, undef
14980    br i1 %realPred, label %continue, label %leave [, !make.implicit !{}]
14981
14982  leave:
14983    call void @llvm.experimental.deoptimize(<args...>) [ "deopt"() ]
14984    ret void
14985
14986  continue:
14987    ret void
14988  }
14989
14990
14991with the optional ``[, !make.implicit !{}]`` present if and only if it
14992is present on the call site.  For more details on ``!make.implicit``,
14993see :doc:`FaultMaps`.
14994
In words, ``@llvm.experimental.guard`` executes the attached
``"deopt"`` continuation if (but **not** only if) its first argument
is ``false``.  Since the optimizer is allowed to replace the ``undef``
with an arbitrary value, it can optimize a guard to fail "spuriously",
i.e. without the original condition being false (hence the "not only
if"); this allows for "check widening" type optimizations.
15001
15002``@llvm.experimental.guard`` cannot be invoked.
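
A guarded load can be sketched as shown below. The function name and the
``%bci`` value carried in the ``"deopt"`` bundle are hypothetical.

.. code-block:: llvm

    declare void @llvm.experimental.guard(i1, ...)

    define i32 @load_checked(i32* %p, i32 %bci) {
    entry:
      %nonnull = icmp ne i32* %p, null
      ; Deoptimize if %p is null; the guard may also fail spuriously.
      call void (i1, ...) @llvm.experimental.guard(i1 %nonnull) [ "deopt"(i32 %bci) ]
      %v = load i32, i32* %p
      ret i32 %v
    }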
15003
15004
15005'``llvm.load.relative``' Intrinsic
15006^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
15007
15008Syntax:
15009"""""""
15010
15011::
15012
15013      declare i8* @llvm.load.relative.iN(i8* %ptr, iN %offset) argmemonly nounwind readonly
15014
15015Overview:
15016"""""""""
15017
15018This intrinsic loads a 32-bit value from the address ``%ptr + %offset``,
15019adds ``%ptr`` to that value and returns it. The constant folder specifically
15020recognizes the form of this intrinsic and the constant initializers it may
15021load from; if a loaded constant initializer is known to have the form
15022``i32 trunc(x - %ptr)``, the intrinsic call is folded to ``x``.
15023
15024LLVM provides that the calculation of such a constant initializer will
15025not overflow at link time under the medium code model if ``x`` is an
15026``unnamed_addr`` function. However, it does not provide this guarantee for
15027a constant initializer folded into a function body. This intrinsic can be
15028used to avoid the possibility of overflows when loading from such a constant.
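
The constant initializer form that the folder recognizes can be sketched with a
small table of relative offsets. All names are hypothetical, and 64-bit
pointers are assumed for the ``ptrtoint`` casts.

.. code-block:: llvm

    declare i8* @llvm.load.relative.i32(i8*, i32)

    define void @f0() unnamed_addr {
      ret void
    }

    define void @f1() unnamed_addr {
      ret void
    }

    ; Each entry has the form i32 trunc(x - %ptr), with %ptr being @table.
    @table = private unnamed_addr constant [2 x i32] [
      i32 trunc (i64 sub (i64 ptrtoint (void ()* @f0 to i64),
                          i64 ptrtoint ([2 x i32]* @table to i64)) to i32),
      i32 trunc (i64 sub (i64 ptrtoint (void ()* @f1 to i64),
                          i64 ptrtoint ([2 x i32]* @table to i64)) to i32)
    ]

    define i8* @second_entry() {
    entry:
      ; Byte offset 4 selects the second i32; the result is the address of @f1.
      %p = call i8* @llvm.load.relative.i32(
               i8* bitcast ([2 x i32]* @table to i8*), i32 4)
      ret i8* %p
    }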
15029
15030'``llvm.sideeffect``' Intrinsic
15031^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
15032
15033Syntax:
15034"""""""
15035
15036::
15037
15038      declare void @llvm.sideeffect() inaccessiblememonly nounwind
15039
15040Overview:
15041"""""""""
15042
15043The ``llvm.sideeffect`` intrinsic doesn't perform any operation. Optimizers
15044treat it as having side effects, so it can be inserted into a loop to
15045indicate that the loop shouldn't be assumed to terminate (which could
15046potentially lead to the loop being optimized away entirely), even if it's
15047an infinite loop with no other side effects.
15048
15049Arguments:
15050""""""""""
15051
15052None.
15053
15054Semantics:
15055""""""""""
15056
15057This intrinsic actually does nothing, but optimizers must assume that it
15058has externally observable side effects.
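
For instance, a deliberately infinite loop can be protected from removal as in
this sketch (the function name is hypothetical):

.. code-block:: llvm

    declare void @llvm.sideeffect() inaccessiblememonly nounwind

    define void @spin_forever() {
    entry:
      br label %loop

    loop:
      ; Without this call the loop has no side effects.
      call void @llvm.sideeffect()
      br label %loop
    }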
15059
15060Stack Map Intrinsics
15061--------------------
15062
15063LLVM provides experimental intrinsics to support runtime patching
15064mechanisms commonly desired in dynamic language JITs. These intrinsics
15065are described in :doc:`StackMaps`.
15066
15067Element Wise Atomic Memory Intrinsics
15068-------------------------------------
15069
15070These intrinsics are similar to the standard library memory intrinsics except
15071that they perform memory transfer as a sequence of atomic memory accesses.
15072
15073.. _int_memcpy_element_unordered_atomic:
15074
15075'``llvm.memcpy.element.unordered.atomic``' Intrinsic
15076^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
15077
15078Syntax:
15079"""""""
15080
This is an overloaded intrinsic. You can use ``llvm.memcpy.element.unordered.atomic`` on
any integer bit width and for different address spaces. Not all targets
support all bit widths, however.
15084
15085::
15086
15087      declare void @llvm.memcpy.element.unordered.atomic.p0i8.p0i8.i32(i8* <dest>,
15088                                                                       i8* <src>,
15089                                                                       i32 <len>,
15090                                                                       i32 <element_size>)
15091      declare void @llvm.memcpy.element.unordered.atomic.p0i8.p0i8.i64(i8* <dest>,
15092                                                                       i8* <src>,
15093                                                                       i64 <len>,
15094                                                                       i32 <element_size>)
15095
15096Overview:
15097"""""""""
15098
15099The '``llvm.memcpy.element.unordered.atomic.*``' intrinsic is a specialization of the
15100'``llvm.memcpy.*``' intrinsic. It differs in that the ``dest`` and ``src`` are treated
15101as arrays with elements that are exactly ``element_size`` bytes, and the copy between
15102buffers uses a sequence of :ref:`unordered atomic <ordering>` load/store operations
15103that are a positive integer multiple of the ``element_size`` in size.
15104
15105Arguments:
15106""""""""""
15107
15108The first three arguments are the same as they are in the :ref:`@llvm.memcpy <int_memcpy>`
15109intrinsic, with the added constraint that ``len`` is required to be a positive integer
15110multiple of the ``element_size``. If ``len`` is not a positive integer multiple of
15111``element_size``, then the behaviour of the intrinsic is undefined.
15112
``element_size`` must be a compile-time constant positive power of two no greater than
the target-specific atomic access size limit.

For each of the input pointers, the ``align`` parameter attribute must be specified. It
must be a power of two no less than the ``element_size``. The caller guarantees that
both the source and destination pointers are aligned to that boundary.

Semantics:
""""""""""

The '``llvm.memcpy.element.unordered.atomic.*``' intrinsic copies ``len`` bytes of
memory from the source location to the destination location. These locations are not
allowed to overlap. The memory copy is performed as a sequence of load/store operations
where each access is guaranteed to be a multiple of ``element_size`` bytes wide and
aligned at an ``element_size`` boundary.

The order of the copy is unspecified. The same value may be read from the source
buffer many times, but only one write is issued to the destination buffer per
element. It is well defined to have concurrent reads and writes to both source and
destination provided those reads and writes are unordered atomic when specified.

This intrinsic does not provide any additional ordering guarantees over those
provided by a set of unordered loads from the source location and stores to the
destination.
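
To make the semantics concrete, here is a sketch of one sequence of accesses
that a copy with ``len`` = 8 and ``element_size`` = 4 could be equivalent to.
This is only one possible expansion under the rules above, not a required one.
The function name ``@copy8_expanded`` and all value names are hypothetical.

.. code-block:: llvm

      ; One possible expansion of a copy with len = 8 and element_size = 4.
      ; %dst and %src are assumed to be 4-byte aligned and non-overlapping.
      define void @copy8_expanded(i8* %dst, i8* %src) {
        %s0 = bitcast i8* %src to i32*
        %d0 = bitcast i8* %dst to i32*
        %s1 = getelementptr inbounds i32, i32* %s0, i64 1
        %d1 = getelementptr inbounds i32, i32* %d0, i64 1
        ; Element 0: one unordered atomic load and one unordered atomic store.
        %v0 = load atomic i32, i32* %s0 unordered, align 4
        store atomic i32 %v0, i32* %d0 unordered, align 4
        ; Element 1: likewise; the two elements could be copied in either order.
        %v1 = load atomic i32, i32* %s1 unordered, align 4
        store atomic i32 %v1, i32* %d1 unordered, align 4
        ret void
      }
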

Lowering:
"""""""""

In the most general case, a call to the '``llvm.memcpy.element.unordered.atomic.*``'
intrinsic is lowered to a call to the symbol ``__llvm_memcpy_element_unordered_atomic_*``,
where '*' is replaced with the actual element size.

The optimizer is allowed to inline the memory copy when it's profitable to do so.
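
As a rough sketch of the naming scheme, a copy with ``element_size`` = 4 might
end up as a call like the one below. Only the symbol name pattern comes from the
text above. The parameter list (destination, source, length in bytes) and the
function name ``@copy_lowered`` are assumptions made for illustration.

.. code-block:: llvm

      ; Assumed prototype: the symbol name follows the pattern above with '*'
      ; replaced by 4, but the parameter list is an assumption of this sketch.
      declare void @__llvm_memcpy_element_unordered_atomic_4(i8*, i8*, i64)

      define void @copy_lowered(i8* %dst, i8* %src, i64 %len) {
        ; Stands in for a call to the intrinsic with element_size = 4.
        call void @__llvm_memcpy_element_unordered_atomic_4(i8* %dst, i8* %src, i64 %len)
        ret void
      }
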

'``llvm.memmove.element.unordered.atomic``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic. You can use
``llvm.memmove.element.unordered.atomic`` on any integer bit width and for
different address spaces. Not all targets support all bit widths, however.

::

      declare void @llvm.memmove.element.unordered.atomic.p0i8.p0i8.i32(i8* <dest>,
                                                                        i8* <src>,
                                                                        i32 <len>,
                                                                        i32 <element_size>)
      declare void @llvm.memmove.element.unordered.atomic.p0i8.p0i8.i64(i8* <dest>,
                                                                        i8* <src>,
                                                                        i64 <len>,
                                                                        i32 <element_size>)

Overview:
"""""""""

The '``llvm.memmove.element.unordered.atomic.*``' intrinsic is a specialization
of the '``llvm.memmove.*``' intrinsic. It differs in that the ``dest`` and
``src`` are treated as arrays with elements that are exactly ``element_size``
bytes, and the copy between buffers uses a sequence of
:ref:`unordered atomic <ordering>` load/store operations that are a positive
integer multiple of the ``element_size`` in size.

Arguments:
""""""""""

The first three arguments are the same as they are in the
:ref:`@llvm.memmove <int_memmove>` intrinsic, with the added constraint that
``len`` is required to be a positive integer multiple of the ``element_size``.
If ``len`` is not a positive integer multiple of ``element_size``, then the
behaviour of the intrinsic is undefined.

``element_size`` must be a compile-time constant positive power of two no
greater than a target-specific atomic access size limit.

For each of the input pointers, the ``align`` parameter attribute must be
specified. It must be a power of two no less than the ``element_size``. The
caller guarantees that both the source and destination pointers are aligned to
that boundary.

Semantics:
""""""""""

The '``llvm.memmove.element.unordered.atomic.*``' intrinsic copies ``len`` bytes
of memory from the source location to the destination location. These locations
are allowed to overlap. The memory copy is performed as a sequence of load/store
operations where each access is guaranteed to be a multiple of ``element_size``
bytes wide and aligned at an ``element_size`` boundary.

The order of the copy is unspecified. The same value may be read from the source
buffer many times, but only one write is issued to the destination buffer per
element. It is well defined to have concurrent reads and writes to both source
and destination provided those reads and writes are unordered atomic when
specified.

This intrinsic does not provide any additional ordering guarantees over those
provided by a set of unordered loads from the source location and stores to the
destination.
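
Because the locations are allowed to overlap, the intrinsic can move elements
within a single buffer. The sketch below shifts three 4-byte elements up by one
element. The function name ``@shift_up`` and the buffer layout are hypothetical.

.. code-block:: llvm

      declare void @llvm.memmove.element.unordered.atomic.p0i8.p0i8.i32(i8*, i8*, i32, i32)

      ; Hypothetical helper: moves three 4-byte elements up by one element
      ; within the same buffer, so the source and destination overlap.
      ; %buf is assumed to be 4-byte aligned and at least 16 bytes long.
      define void @shift_up(i8* %buf) {
        %dst = getelementptr inbounds i8, i8* %buf, i64 4
        call void @llvm.memmove.element.unordered.atomic.p0i8.p0i8.i32(
            i8* align 4 %dst, i8* align 4 %buf, i32 12, i32 4)
        ret void
      }
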

Lowering:
"""""""""

In the most general case, a call to the
'``llvm.memmove.element.unordered.atomic.*``' intrinsic is lowered to a call to
the symbol ``__llvm_memmove_element_unordered_atomic_*``, where '*' is replaced
with the actual element size.

The optimizer is allowed to inline the memory copy when it's profitable to do so.

.. _int_memset_element_unordered_atomic:

'``llvm.memset.element.unordered.atomic``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic. You can use ``llvm.memset.element.unordered.atomic`` on
any integer bit width and for different address spaces. Not all targets
support all bit widths, however.

::

      declare void @llvm.memset.element.unordered.atomic.p0i8.i32(i8* <dest>,
                                                                  i8 <value>,
                                                                  i32 <len>,
                                                                  i32 <element_size>)
      declare void @llvm.memset.element.unordered.atomic.p0i8.i64(i8* <dest>,
                                                                  i8 <value>,
                                                                  i64 <len>,
                                                                  i32 <element_size>)

Overview:
"""""""""

The '``llvm.memset.element.unordered.atomic.*``' intrinsic is a specialization of the
'``llvm.memset.*``' intrinsic. It differs in that the ``dest`` is treated as an array
with elements that are exactly ``element_size`` bytes, and the assignment to that array
uses a sequence of :ref:`unordered atomic <ordering>` store operations
that are a positive integer multiple of the ``element_size`` in size.

Arguments:
""""""""""

The first three arguments are the same as they are in the :ref:`@llvm.memset <int_memset>`
intrinsic, with the added constraint that ``len`` is required to be a positive integer
multiple of the ``element_size``. If ``len`` is not a positive integer multiple of
``element_size``, then the behaviour of the intrinsic is undefined.

``element_size`` must be a compile-time constant positive power of two no greater than
a target-specific atomic access size limit.

The ``dest`` input pointer must have the ``align`` parameter attribute specified. It
must be a power of two no less than the ``element_size``. The caller guarantees that
the destination pointer is aligned to that boundary.
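
As a hedged illustration of these constraints, the following sketch zeroes 32
bytes as four 8-byte elements. The function name ``@zero32`` is hypothetical.
``len`` (32) is a positive multiple of ``element_size`` (8), and ``dest``
carries ``align 8``.

.. code-block:: llvm

      declare void @llvm.memset.element.unordered.atomic.p0i8.i64(i8*, i8, i64, i32)

      ; Hypothetical caller: %dst is assumed to be 8-byte aligned and to
      ; refer to at least 32 writable bytes.
      define void @zero32(i8* %dst) {
        call void @llvm.memset.element.unordered.atomic.p0i8.i64(
            i8* align 8 %dst, i8 0, i64 32, i32 8)
        ret void
      }
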

Semantics:
""""""""""

The '``llvm.memset.element.unordered.atomic.*``' intrinsic sets the ``len`` bytes of
memory starting at the destination location to the given ``value``. The memory is
set with a sequence of store operations where each access is guaranteed to be a
multiple of ``element_size`` bytes wide and aligned at an ``element_size`` boundary.

The order of the assignment is unspecified. Only one write is issued to the
destination buffer per element. It is well defined to have concurrent reads and
writes to the destination provided those reads and writes are unordered atomic
when specified.

This intrinsic does not provide any additional ordering guarantees over those
provided by a set of unordered stores to the destination.
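
To make the semantics concrete, here is a sketch of one sequence of stores that
a call with ``value`` = 1, ``len`` = 8 and ``element_size`` = 4 could be
equivalent to. This is only one possible expansion, and the function name
``@set8_expanded`` and value names are hypothetical.

.. code-block:: llvm

      ; One possible expansion of a memset with value = 1, len = 8 and
      ; element_size = 4. %dst is assumed to be 4-byte aligned.
      define void @set8_expanded(i8* %dst) {
        %d0 = bitcast i8* %dst to i32*
        %d1 = getelementptr inbounds i32, i32* %d0, i64 1
        ; 16843009 is 0x01010101, the byte value 1 repeated four times.
        store atomic i32 16843009, i32* %d0 unordered, align 4
        store atomic i32 16843009, i32* %d1 unordered, align 4
        ret void
      }
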

Lowering:
"""""""""

In the most general case, a call to the '``llvm.memset.element.unordered.atomic.*``'
intrinsic is lowered to a call to the symbol ``__llvm_memset_element_unordered_atomic_*``,
where '*' is replaced with the actual element size.

The optimizer is allowed to inline the memory assignment when it's profitable to do so.
