==============================
PNaCl Bitcode Reference Manual
==============================

.. contents::
   :local:
   :backlinks: none
   :depth: 3

Introduction
============

This document is a reference manual for the PNaCl bitcode format. It describes
the bitcode on a *semantic* level; the physical encoding level will be described
elsewhere. For the purpose of this document, the textual form of LLVM IR is
used to describe instructions and other bitcode constructs.

Since the PNaCl bitcode is based to a large extent on LLVM IR as of
version 3.3, many sections in this document point to a relevant section
of the LLVM language reference manual. Only the changes, restrictions
and variations specific to PNaCl are described; full semantic
descriptions are not duplicated from the LLVM reference manual.

High Level Structure
====================

A PNaCl portable executable (**pexe** for short) is a single LLVM IR module.

Data Model
----------

The data model for PNaCl bitcode is fixed at little-endian ILP32: pointers are
32 bits in size. 64-bit integer types are also supported natively via the i64
type (for example, a front-end can generate these from the C/C++ type
``long long``).

Floating point support is fixed at IEEE 754 32-bit and 64-bit values (f32 and
f64, respectively).

.. _bitcode_linkagetypes:

Linkage Types
-------------

`LLVM LangRef: Linkage Types
<http://llvm.org/releases/3.3/docs/LangRef.html#linkage>`_

The linkage types supported by PNaCl bitcode are ``internal`` and ``external``.
A single function in the pexe, named ``_start``, has the linkage type
``external``. All the other functions and globals have the linkage type
``internal``.
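
As a rough illustration of the linkage rule above, the following sketch shows a
module with one ``internal`` helper and the single ``external`` entry point. The
helper name ``@helper`` and the ``i32`` parameter of ``_start`` are assumptions
made for the example; this document does not specify the signature of
``_start``.

.. naclcode::
   :prettyprint: 0

   ; Hypothetical helper: every function other than _start is internal.
   define internal i32 @helper(i32 %x) {
     %r = add i32 %x, 1
     ret i32 %r
   }

   ; The single external function. External linkage is the default in
   ; textual LLVM IR, so no linkage keyword is written. The parameter
   ; list is an assumption for this sketch.
   define void @_start(i32 %info) {
     %unused = call i32 @helper(i32 %info)
     ret void
   }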

Calling Conventions
-------------------

`LLVM LangRef: Calling Conventions
<http://llvm.org/releases/3.3/docs/LangRef.html#callingconv>`_

The only calling convention supported by PNaCl bitcode is ``ccc``, the C
calling convention.

Visibility Styles
-----------------

`LLVM LangRef: Visibility Styles
<http://llvm.org/releases/3.3/docs/LangRef.html#visibility-styles>`_

PNaCl bitcode does not support visibility styles.

.. _bitcode_globalvariables:

Global Variables
----------------

`LLVM LangRef: Global Variables
<http://llvm.org/releases/3.3/docs/LangRef.html#globalvars>`_

Restrictions on global variables:

* PNaCl bitcode does not support LLVM IR TLS models. See
  :ref:`language_support_threading` for more details.
* The restrictions on :ref:`linkage types <bitcode_linkagetypes>` apply.
* The ``addrspace``, ``section``, ``unnamed_addr`` and
  ``externally_initialized`` attributes are not supported.

Every global variable must have an initializer. Each initializer must be
either a *SimpleElement* or a *CompoundElement*, defined as follows.

A *SimpleElement* is one of the following:

1) An i8 array literal or ``zeroinitializer``:

.. naclcode::
   :prettyprint: 0

   [SIZE x i8] c"DATA"
   [SIZE x i8] zeroinitializer

2) A reference to a *GlobalValue* (a function or global variable) with an
   optional 32-bit byte offset added to it (the addend, which may be
   negative):

.. naclcode::
   :prettyprint: 0

   ptrtoint (TYPE* @GLOBAL to i32)
   add (i32 ptrtoint (TYPE* @GLOBAL to i32), i32 ADDEND)

A *CompoundElement* is an unnamed, packed struct containing more than one
*SimpleElement*.

Functions
---------

`LLVM LangRef: Functions
<http://llvm.org/releases/3.3/docs/LangRef.html#functionstructure>`_

The restrictions on :ref:`linkage types <bitcode_linkagetypes>`, calling
conventions and visibility styles apply to functions.
In addition, the following
are not supported for functions:

* Function attributes (for the function itself, its parameters or its
  return type).
* Garbage collector name (``gc``).
* Functions with a variable number of arguments (*vararg*).
* Alignment (``align``).

Aliases
-------

`LLVM LangRef: Aliases
<http://llvm.org/releases/3.3/docs/LangRef.html#aliases>`_

PNaCl bitcode does not support aliases.

Named Metadata
--------------

`LLVM LangRef: Named Metadata
<http://llvm.org/releases/3.3/docs/LangRef.html#namedmetadatastructure>`_

While PNaCl bitcode has provisions for debugging metadata, it is not considered
part of the stable ABI. It exists for tool support and should not appear in
distributed pexes.

Other kinds of LLVM metadata are not supported.

Module-Level Inline Assembly
----------------------------

`LLVM LangRef: Module-Level Inline Assembly
<http://llvm.org/releases/3.3/docs/LangRef.html#moduleasm>`_

PNaCl bitcode does not support inline assembly.

Volatile Memory Accesses
------------------------

`LLVM LangRef: Volatile Memory Accesses
<http://llvm.org/releases/3.3/docs/LangRef.html#volatile>`_

PNaCl bitcode does not support volatile memory accesses. The
``volatile`` attribute on loads and stores is not supported. See
:doc:`pnacl-c-cpp-language-support` for more details.

Memory Model for Concurrent Operations
--------------------------------------

`LLVM LangRef: Memory Model for Concurrent Operations
<http://llvm.org/releases/3.3/docs/LangRef.html#memmodel>`_

See the `PNaCl Developer's Guide <PNaClDeveloperGuide.html>`_ for more
details.

Fast-Math Flags
---------------

`LLVM LangRef: Fast-Math Flags
<http://llvm.org/releases/3.3/docs/LangRef.html#fastmath>`_

Fast-math mode is not currently supported by PNaCl bitcode.

Type System
===========

`LLVM LangRef: Type System
<http://llvm.org/releases/3.3/docs/LangRef.html#typesystem>`_

The LLVM types allowed in PNaCl bitcode are restricted, as follows:

Scalar types
------------

* The only scalar types allowed are integer, float (32-bit floating point),
  double (64-bit floating point) and void.

  * The only integer sizes allowed are i1, i8, i16, i32 and i64.
  * The only integer sizes allowed for function arguments and function return
    values are i32 and i64.

Vector types
------------

The only vector types allowed are:

* 128-bit vectors of integers with element size i8, i16 or i32.
* 128-bit vectors of float elements.
* Vectors of i1 type with element counts corresponding to the allowed
  element counts listed previously (their width is therefore not
  128 bits).

Array and struct types
----------------------

Array and struct types are only allowed in
:ref:`global variable initializers <bitcode_globalvariables>`.

.. _bitcode_pointertypes:

Pointer types
-------------

Only the following pointer types are allowed:

* Pointers to valid PNaCl bitcode scalar types, as specified above.
* Pointers to functions.

In addition, the address space for all pointers must be 0.

A pointer is *inherent* when it represents the return value of an ``alloca``
instruction, or is an address of a global value.

A pointer is *normalized* if it is one of the following:

* An *inherent* pointer.
* The return value of a ``bitcast`` instruction.
* The return value of an ``inttoptr`` instruction.

Undefined Values
----------------

`LLVM LangRef: Undefined Values
<http://llvm.org/releases/3.3/docs/LangRef.html#undefvalues>`_

``undef`` is only allowed within functions, not in global variable initializers.
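
The rules in this section can be combined in a short sketch. The global
``@slot`` and the function ``@pick`` are hypothetical names that do not come
from this document. The global needs a concrete initializer, while ``undef``
may appear as an operand inside the function body. The ``bitcast`` of the
global's address also yields a *normalized* pointer, as described under
:ref:`pointer types <bitcode_pointertypes>`.

.. naclcode::
   :prettyprint: 0

   ; A global initializer must not be undef; zeroinitializer is used instead.
   @slot = internal global [4 x i8] zeroinitializer

   define internal i32 @pick(i32 %flag) {
     ; @slot is the address of a global, so it is an inherent pointer;
     ; the result of the bitcast is a normalized pointer.
     %ptr = bitcast [4 x i8]* @slot to i32*
     ; Integer loads use align 1.
     %val = load i32* %ptr, align 1
     %is_zero = icmp eq i32 %flag, 0
     ; undef is allowed here because it is used within a function.
     %r = select i1 %is_zero, i32 undef, i32 %val
     ret i32 %r
   }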

Constant Expressions
--------------------

`LLVM LangRef: Constant Expressions
<http://llvm.org/releases/3.3/docs/LangRef.html#constant-expressions>`_

Constant expressions are only allowed in
:ref:`global variable initializers <bitcode_globalvariables>`.

Other Values
============

Metadata Nodes and Metadata Strings
-----------------------------------

`LLVM LangRef: Metadata Nodes and Metadata Strings
<http://llvm.org/releases/3.3/docs/LangRef.html#metadata>`_

While PNaCl bitcode has provisions for debugging metadata, it is not considered
part of the stable ABI. It exists for tool support and should not appear in
distributed pexes.

Other kinds of LLVM metadata are not supported.

Intrinsic Global Variables
==========================

`LLVM LangRef: Intrinsic Global Variables
<http://llvm.org/releases/3.3/docs/LangRef.html#intrinsic-global-variables>`_

PNaCl bitcode does not support intrinsic global variables.

.. _ir_and_errno:

Errno and errors in arithmetic instructions
===========================================

Some arithmetic instructions and intrinsics have similar semantics to
libc math functions, but differ in the treatment of ``errno``. While the
libc functions may set ``errno`` for domain errors, the instructions and
intrinsics do not. This is because the variable ``errno`` is not special
and is not required to be part of the program.

Instruction Reference
=====================

List of allowed instructions
----------------------------

This is a list of LLVM instructions supported by PNaCl bitcode. Where
applicable, PNaCl-specific restrictions are provided.

.. TODO: explain instructions or link in the future

The following attributes are disallowed for all instructions:

* ``nsw`` and ``nuw``
* ``exact``

Only the LLVM instructions listed here are supported by PNaCl bitcode.

* ``ret``
* ``br``
* ``switch``

  i1 values are disallowed for ``switch``.

* ``add``, ``sub``, ``mul``, ``shl``, ``udiv``, ``sdiv``, ``urem``, ``srem``,
  ``lshr``, ``ashr``

  These arithmetic operations are disallowed on values of type ``i1``.

  Integer division (``udiv``, ``sdiv``, ``urem``, ``srem``) by zero is
  guaranteed to trap in PNaCl bitcode.

* ``and``
* ``or``
* ``xor``
* ``fadd``
* ``fsub``
* ``fmul``
* ``fdiv``
* ``frem``

  The ``frem`` instruction has the semantics of the libc ``fmod`` function for
  computing the floating point remainder. If the numerator is infinity, or the
  denominator is zero, or either is NaN, then the result is NaN.
  Unlike the libc ``fmod`` function, this does not set ``errno`` when the
  result is NaN (see the :ref:`instructions and errno <ir_and_errno>`
  section).

* ``alloca``

  See :ref:`alloca instructions <bitcode_allocainst>`.

* ``load``, ``store``

  The pointer argument of these instructions must be a *normalized* pointer (see
  :ref:`pointer types <bitcode_pointertypes>`). The ``volatile`` and ``atomic``
  attributes are not supported. Loads and stores of the type ``i1`` are not
  supported.

  These instructions must use ``align 1`` on integer memory accesses, ``align 4``
  for ``float`` accesses and ``align 8`` for ``double`` accesses.

* ``trunc``
* ``zext``
* ``sext``
* ``fptrunc``
* ``fpext``
* ``fptoui``
* ``fptosi``
* ``uitofp``
* ``sitofp``

* ``ptrtoint``

  The pointer argument of a ``ptrtoint`` instruction must be a *normalized*
  pointer (see :ref:`pointer types <bitcode_pointertypes>`) and the integer
  argument must be an i32.

* ``inttoptr``

  The integer argument of an ``inttoptr`` instruction must be an i32.

* ``bitcast``

  The pointer argument of a ``bitcast`` instruction must be an *inherent* pointer
  (see :ref:`pointer types <bitcode_pointertypes>`).

* ``icmp``
* ``fcmp``
* ``phi``
* ``select``
* ``call``
* ``unreachable``
* ``insertelement``
* ``extractelement``

.. _bitcode_allocainst:

``alloca``
----------

The only allowed type for ``alloca`` instructions in PNaCl bitcode is i8. The
size argument must be an i32. For example:

.. naclcode::
   :prettyprint: 0

   %buf = alloca i8, i32 8, align 4

Intrinsic Functions
===================

`LLVM LangRef: Intrinsic Functions
<http://llvm.org/releases/3.3/docs/LangRef.html#intrinsics>`_

List of allowed intrinsics
--------------------------

The only intrinsics supported by PNaCl bitcode are the following.

* ``llvm.memcpy``
* ``llvm.memmove``
* ``llvm.memset``

  These intrinsics are only supported with an i32 ``len`` argument.

* ``llvm.bswap``

  The overloaded ``llvm.bswap`` intrinsic is only supported with the following
  argument types: i16, i32, i64 (the types supported by C-style GCC builtins).

* ``llvm.ctlz``
* ``llvm.cttz``
* ``llvm.ctpop``

  The overloaded ``llvm.ctlz``, ``llvm.cttz`` and ``llvm.ctpop`` intrinsics are
  only supported with the i32 and i64 argument types (the types supported by
  C-style GCC builtins).

* ``llvm.sqrt``

  The overloaded ``llvm.sqrt`` intrinsic is only supported for float
  and double argument types. This has the same semantics as the libc
  ``sqrt`` function, returning NaN for values less than -0.0. However, this
  does not set ``errno`` when the result is NaN (see the
  :ref:`instructions and errno <ir_and_errno>` section).

* ``llvm.stacksave``
* ``llvm.stackrestore``

  These intrinsics are used to implement language features like scoped automatic
  variable-sized arrays in C99. ``llvm.stacksave`` returns a value that
  represents the current state of the stack. This value may only be used as the
  argument to ``llvm.stackrestore``, which restores the stack to the given
  state.

* ``llvm.trap``

  This intrinsic is lowered to a target-dependent trap instruction, which aborts
  execution.

* ``llvm.nacl.read.tp``

  See :ref:`thread pointer related intrinsics
  <bitcode_threadpointerintrinsics>`.

* ``llvm.nacl.longjmp``
* ``llvm.nacl.setjmp``

  See :ref:`Setjmp and Longjmp <bitcode_setjmplongjmp>`.

* ``llvm.nacl.atomic.store``
* ``llvm.nacl.atomic.load``
* ``llvm.nacl.atomic.rmw``
* ``llvm.nacl.atomic.cmpxchg``
* ``llvm.nacl.atomic.fence``
* ``llvm.nacl.atomic.fence.all``
* ``llvm.nacl.atomic.is.lock.free``

  See :ref:`atomic intrinsics <bitcode_atomicintrinsics>`.

.. _bitcode_threadpointerintrinsics:

Thread pointer related intrinsics
---------------------------------

.. naclcode::
   :prettyprint: 0

   declare i8* @llvm.nacl.read.tp()

Returns a read-only thread pointer. The value is controlled by the embedding
sandbox's runtime.

.. _bitcode_setjmplongjmp:

Setjmp and Longjmp
------------------

.. naclcode::
   :prettyprint: 0

   declare void @llvm.nacl.longjmp(i8* %jmpbuf, i32)
   declare i32 @llvm.nacl.setjmp(i8* %jmpbuf)

These intrinsics implement the semantics of C11 ``setjmp`` and ``longjmp``. The
``jmpbuf`` pointer must be 64-bit aligned and point to at least 1024 bytes of
allocated memory.

.. _bitcode_atomicintrinsics:

Atomic intrinsics
-----------------

.. naclcode::
   :prettyprint: 0

   declare iN @llvm.nacl.atomic.load.<size>(
           iN* <source>, i32 <memory_order>)
   declare void @llvm.nacl.atomic.store.<size>(
           iN <operand>, iN* <destination>, i32 <memory_order>)
   declare iN @llvm.nacl.atomic.rmw.<size>(
           i32 <computation>, iN* <object>, iN <operand>, i32 <memory_order>)
   declare iN @llvm.nacl.atomic.cmpxchg.<size>(
           iN* <object>, iN <expected>, iN <desired>,
           i32 <memory_order_success>, i32 <memory_order_failure>)
   declare void @llvm.nacl.atomic.fence(i32 <memory_order>)
   declare void @llvm.nacl.atomic.fence.all()

Each of these intrinsics is overloaded on the ``iN`` argument, which is
reflected through ``<size>`` in the overload's name. Integer types of
8-, 16-, 32- and 64-bit width are supported for these arguments.

The ``@llvm.nacl.atomic.rmw`` intrinsic implements the following
read-modify-write operations, from the general and arithmetic sections
of the C11/C++11 standards:

 - ``add``
 - ``sub``
 - ``or``
 - ``and``
 - ``xor``
 - ``exchange``

For all of these read-modify-write operations, the returned value is
that at ``object`` before the computation. The ``computation`` argument
must be a compile-time constant.

All atomic intrinsics also support C11/C++11 memory orderings, which
must be compile-time constants.

Integer values for these computations and memory orderings are defined
in ``"llvm/IR/NaClAtomicIntrinsics.h"``.
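
For illustration, a call to the ``rmw`` intrinsic might look like the sketch
below. The function name ``@fetch_add_one`` is hypothetical, and the constants
``1`` (taken here to mean ``add``) and ``6`` (taken here to mean sequentially
consistent ordering) are assumptions; the authoritative values are the ones in
the header named above.

.. naclcode::
   :prettyprint: 0

   declare i32 @llvm.nacl.atomic.rmw.i32(i32, i32*, i32, i32)

   ; Hypothetical helper: atomically adds 1 to the 32-bit value at %addr
   ; and returns the value that was stored there before the addition.
   define internal i32 @fetch_add_one(i32 %addr) {
     ; The address arrives as an i32; inttoptr yields a normalized pointer.
     %ptr = inttoptr i32 %addr to i32*
     ; Assumed encodings: computation 1 = add, memory order 6 = seq_cst.
     %old = call i32 @llvm.nacl.atomic.rmw.i32(i32 1, i32* %ptr, i32 1, i32 6)
     ret i32 %old
   }

Both the computation and the memory order are written as literal constants,
since they must be known at compile time.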

The ``@llvm.nacl.atomic.fence.all`` intrinsic is equivalent to the
``@llvm.nacl.atomic.fence`` intrinsic with sequentially consistent
ordering and compiler barriers preventing most non-atomic memory
accesses from reordering around it.

.. Note::
   :class: note

   These intrinsics allow PNaCl to support C11/C++11 style atomic
   operations as well as some legacy GCC-style ``__sync_*`` builtins
   while remaining stable as the LLVM codebase changes. The user isn't
   expected to use these intrinsics directly.

.. naclcode::
   :prettyprint: 0

   declare i1 @llvm.nacl.atomic.is.lock.free(i32 <byte_size>, i8* <address>)

The ``llvm.nacl.atomic.is.lock.free`` intrinsic is designed to
determine at translation time whether atomic operations of a certain
``byte_size`` (a compile-time constant), at a particular ``address``,
are lock-free or not. This reflects the C11 ``atomic_is_lock_free``
function from header ``<stdatomic.h>`` and the C++11 ``is_lock_free``
member function in header ``<atomic>``. It can be used through the
``__nacl_atomic_is_lock_free`` builtin.
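
A direct call to this intrinsic could look like the following sketch. The
function name ``@is_lock_free_4`` is hypothetical. The ``i1`` result is widened
with ``zext`` because ``i1`` is not an allowed function return type.

.. naclcode::
   :prettyprint: 0

   declare i1 @llvm.nacl.atomic.is.lock.free(i32, i8*)

   ; Hypothetical helper: returns 1 if 4-byte atomic operations at %addr
   ; are lock-free, and 0 otherwise.
   define internal i32 @is_lock_free_4(i32 %addr) {
     %ptr = inttoptr i32 %addr to i8*
     ; The byte size (4) is a compile-time constant, as required.
     %lf = call i1 @llvm.nacl.atomic.is.lock.free(i32 4, i8* %ptr)
     %res = zext i1 %lf to i32
     ret i32 %res
   }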