==============================
PNaCl Bitcode Reference Manual
==============================

.. contents::
   :local:
   :backlinks: none
   :depth: 3

Introduction
============

This document is a reference manual for the PNaCl bitcode format. It describes
the bitcode on a *semantic* level; the physical encoding level will be described
elsewhere. For the purpose of this document, the textual form of LLVM IR is
used to describe instructions and other bitcode constructs.

Since the PNaCl bitcode is based to a large extent on LLVM IR as of
version 3.3, many sections in this document point to a relevant section
of the LLVM language reference manual. Only the changes, restrictions
and variations specific to PNaCl are described; full semantic
descriptions are not duplicated from the LLVM reference manual.

High Level Structure
====================

A PNaCl portable executable (**pexe** in short) is a single LLVM IR module.

Data Model
----------

The data model for PNaCl bitcode is fixed at little-endian ILP32: pointers are
32 bits in size. 64-bit integer types are also supported natively via the i64
type (for example, a front-end can generate these from the C/C++ type
``long long``).

Floating point support is fixed at IEEE 754 32-bit and 64-bit values (f32 and
f64, respectively).

.. _bitcode_linkagetypes:

Linkage Types
-------------

`LLVM LangRef: Linkage Types
<http://llvm.org/releases/3.3/docs/LangRef.html#linkage>`_

The linkage types supported by PNaCl bitcode are ``internal`` and ``external``.
A single function in the pexe, named ``_start``, has the linkage type
``external``. All the other functions and globals have the linkage type
``internal``.
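
As a rough illustration, a module that follows these linkage rules might look
like the sketch below. The names ``@counter`` and ``@helper`` and the ``i32``
parameter of ``_start`` are assumptions made for this example only. In LLVM's
textual form, external linkage is what a definition gets when no linkage
keyword is written:

.. naclcode::
  :prettyprint: 0

    ; Hypothetical internal global; every global needs an initializer.
    @counter = internal global [4 x i8] zeroinitializer

    ; Hypothetical internal helper function.
    define internal i32 @helper(i32 %x) {
      %r = add i32 %x, 1
      ret i32 %r
    }

    ; The single externally visible function (no linkage keyword).
    define void @_start(i32 %info) {
      %unused = call i32 @helper(i32 %info)
      ret void
    }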

Calling Conventions
-------------------

`LLVM LangRef: Calling Conventions
<http://llvm.org/releases/3.3/docs/LangRef.html#callingconv>`_

The only calling convention supported by PNaCl bitcode is ``ccc``, the C
calling convention.

Visibility Styles
-----------------

`LLVM LangRef: Visibility Styles
<http://llvm.org/releases/3.3/docs/LangRef.html#visibility-styles>`_

PNaCl bitcode does not support visibility styles.

.. _bitcode_globalvariables:

Global Variables
----------------

`LLVM LangRef: Global Variables
<http://llvm.org/releases/3.3/docs/LangRef.html#globalvars>`_

Restrictions on global variables:

* PNaCl bitcode does not support LLVM IR TLS models. See
  :ref:`language_support_threading` for more details.
* The restrictions on :ref:`linkage types <bitcode_linkagetypes>` apply.
* The ``addrspace``, ``section``, ``unnamed_addr`` and
  ``externally_initialized`` attributes are not supported.

Every global variable must have an initializer. Each initializer must be
either a *SimpleElement* or a *CompoundElement*, defined as follows.

A *SimpleElement* is one of the following:

1) An i8 array literal or ``zeroinitializer``:

.. naclcode::
  :prettyprint: 0

     [SIZE x i8] c"DATA"
     [SIZE x i8] zeroinitializer

2) A reference to a *GlobalValue* (a function or global variable) with an
   optional 32-bit byte offset added to it (the addend, which may be
   negative):

.. naclcode::
  :prettyprint: 0

     ptrtoint (TYPE* @GLOBAL to i32)
     add (i32 ptrtoint (TYPE* @GLOBAL to i32), i32 ADDEND)

A *CompoundElement* is an unnamed, packed struct containing more than one
*SimpleElement*.
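
The initializer forms above could be instantiated as in the following sketch.
All global names and sizes here are hypothetical and chosen only to show one
instance of each form:

.. naclcode::
  :prettyprint: 0

    ; SimpleElement: i8 array literal (5 characters plus a NUL byte).
    @msg = internal global [6 x i8] c"hello\00"

    ; SimpleElement: zero-filled i8 array.
    @buf = internal global [16 x i8] zeroinitializer

    ; SimpleElement: address of a global, expressed as an i32.
    @msg_ptr = internal global i32 ptrtoint ([6 x i8]* @msg to i32)

    ; SimpleElement: address of a global plus an addend of 4 bytes.
    @buf_tail = internal global i32
        add (i32 ptrtoint ([16 x i8]* @buf to i32), i32 4)

    ; CompoundElement: unnamed packed struct holding two SimpleElements.
    @table = internal global <{ [4 x i8], i32 }>
        <{ [4 x i8] c"abcd", i32 ptrtoint ([6 x i8]* @msg to i32) }>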

Functions
---------

`LLVM LangRef: Functions
<http://llvm.org/releases/3.3/docs/LangRef.html#functionstructure>`_

The restrictions on :ref:`linkage types <bitcode_linkagetypes>`, calling
conventions and visibility styles apply to functions. In addition, the following
are not supported for functions:

* Function attributes (either for the function itself, its parameters or its
  return type).
* Garbage collector name (``gc``).
* Functions with a variable number of arguments (*vararg*).
* Alignment (``align``).

Aliases
-------

`LLVM LangRef: Aliases
<http://llvm.org/releases/3.3/docs/LangRef.html#aliases>`_

PNaCl bitcode does not support aliases.

Named Metadata
--------------

`LLVM LangRef: Named Metadata
<http://llvm.org/releases/3.3/docs/LangRef.html#namedmetadatastructure>`_

While PNaCl bitcode has provisions for debugging metadata, it is not considered
part of the stable ABI. It exists for tool support and should not appear in
distributed pexes.

Other kinds of LLVM metadata are not supported.

Module-Level Inline Assembly
----------------------------

`LLVM LangRef: Module-Level Inline Assembly
<http://llvm.org/releases/3.3/docs/LangRef.html#moduleasm>`_

PNaCl bitcode does not support inline assembly.

Volatile Memory Accesses
------------------------

`LLVM LangRef: Volatile Memory Accesses
<http://llvm.org/releases/3.3/docs/LangRef.html#volatile>`_

PNaCl bitcode does not support volatile memory accesses. The
``volatile`` attribute on loads and stores is not supported. See the
:doc:`pnacl-c-cpp-language-support` for more details.

Memory Model for Concurrent Operations
--------------------------------------

`LLVM LangRef: Memory Model for Concurrent Operations
<http://llvm.org/releases/3.3/docs/LangRef.html#memmodel>`_

See the `PNaCl Developer's Guide <PNaClDeveloperGuide.html>`_ for more
details.

Fast-Math Flags
---------------

`LLVM LangRef: Fast-Math Flags
<http://llvm.org/releases/3.3/docs/LangRef.html#fastmath>`_

Fast-math mode is not currently supported by the PNaCl bitcode.

Type System
===========

`LLVM LangRef: Type System
<http://llvm.org/releases/3.3/docs/LangRef.html#typesystem>`_

The LLVM types allowed in PNaCl bitcode are restricted, as follows:

Scalar types
------------

* The only scalar types allowed are integer, float (32-bit floating point),
  double (64-bit floating point) and void.

  * The only integer sizes allowed are i1, i8, i16, i32 and i64.
  * The only integer sizes allowed for function arguments and function return
    values are i32 and i64.
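
One way a front-end might honor the restriction on argument and return types
is to widen narrower integers at function boundaries and narrow them again
inside the body. A minimal sketch, with a hypothetical function name:

.. naclcode::
  :prettyprint: 0

    ; An 8-bit quantity is passed and returned as i32; the narrower
    ; type only appears inside the function body.
    define internal i32 @increment_byte(i32 %arg) {
      %b = trunc i32 %arg to i8
      %b1 = add i8 %b, 1
      %res = zext i8 %b1 to i32
      ret i32 %res
    }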

Vector types
------------

The only vector types allowed are:

* 128-bit vectors of integer elements of size i8, i16 or i32.
* 128-bit vectors of float elements.
* Vectors of i1 type with element counts corresponding to the allowed
  element counts listed previously (their width is therefore not
  128 bits).
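
Dividing 128 bits by each element size gives the concrete types listed in the
comments below. The function is a hypothetical sketch that only uses the
``insertelement`` and ``extractelement`` instructions on one of them:

.. naclcode::
  :prettyprint: 0

    ; 128-bit integer vectors: <16 x i8>, <8 x i16>, <4 x i32>
    ; 128-bit float vector:    <4 x float>
    ; Matching i1 vectors:     <16 x i1>, <8 x i1>, <4 x i1>

    define internal i32 @sum_two_lanes(i32 %a, i32 %b) {
      ; Fill lanes 0 and 1 of a <4 x i32>; the other lanes stay undef.
      %v0 = insertelement <4 x i32> undef, i32 %a, i32 0
      %v1 = insertelement <4 x i32> %v0, i32 %b, i32 1
      ; Read the two lanes back and add them as scalars.
      %x = extractelement <4 x i32> %v1, i32 0
      %y = extractelement <4 x i32> %v1, i32 1
      %sum = add i32 %x, %y
      ret i32 %sum
    }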

Array and struct types
----------------------

Array and struct types are only allowed in
:ref:`global variable initializers <bitcode_globalvariables>`.

.. _bitcode_pointertypes:

Pointer types
-------------

Only the following pointer types are allowed:

* Pointers to valid PNaCl bitcode scalar types, as specified above.
* Pointers to functions.

In addition, the address space for all pointers must be 0.

A pointer is *inherent* when it represents the return value of an ``alloca``
instruction, or is an address of a global value.

A pointer is *normalized* if it is one of the following:

* An *inherent* pointer.
* The return value of a ``bitcast`` instruction.
* The return value of an ``inttoptr`` instruction.
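
The three kinds of normalized pointer can be seen side by side in the sketch
below. The function and value names are hypothetical, and the ``align 1`` on
the integer accesses follows the ``load`` and ``store`` rules given in the
instruction reference:

.. naclcode::
  :prettyprint: 0

    define internal i32 @pointer_kinds(i32 %addr) {
      ; Inherent (and therefore normalized): the result of an alloca.
      %buf = alloca i8, i32 4, align 4

      ; Normalized: the result of a bitcast of an inherent pointer.
      %buf32 = bitcast i8* %buf to i32*
      store i32 7, i32* %buf32, align 1

      ; Normalized: the result of an inttoptr of an i32 address.
      %p = inttoptr i32 %addr to i32*
      %v = load i32* %p, align 1
      ret i32 %v
    }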

Undefined Values
----------------

`LLVM LangRef: Undefined Values
<http://llvm.org/releases/3.3/docs/LangRef.html#undefvalues>`_

``undef`` is only allowed within functions, not in global variable initializers.

Constant Expressions
--------------------

`LLVM LangRef: Constant Expressions
<http://llvm.org/releases/3.3/docs/LangRef.html#constant-expressions>`_

Constant expressions are only allowed in
:ref:`global variable initializers <bitcode_globalvariables>`.

Other Values
============

Metadata Nodes and Metadata Strings
-----------------------------------

`LLVM LangRef: Metadata Nodes and Metadata Strings
<http://llvm.org/releases/3.3/docs/LangRef.html#metadata>`_

While PNaCl bitcode has provisions for debugging metadata, it is not considered
part of the stable ABI. It exists for tool support and should not appear in
distributed pexes.

Other kinds of LLVM metadata are not supported.

Intrinsic Global Variables
==========================

`LLVM LangRef: Intrinsic Global Variables
<http://llvm.org/releases/3.3/docs/LangRef.html#intrinsic-global-variables>`_

PNaCl bitcode does not support intrinsic global variables.

.. _ir_and_errno:

Errno and errors in arithmetic instructions
===========================================

Some arithmetic instructions and intrinsics have semantics similar to
libc math functions, but differ in the treatment of ``errno``. While the
libc functions may set ``errno`` for domain errors, the instructions and
intrinsics do not. This is because the variable ``errno`` is not special
and is not required to be part of the program.

Instruction Reference
=====================

List of allowed instructions
----------------------------

This is a list of LLVM instructions supported by PNaCl bitcode. Where
applicable, PNaCl-specific restrictions are provided.

.. TODO: explain instructions or link in the future

The following attributes are disallowed for all instructions:

* ``nsw`` and ``nuw``
* ``exact``

Only the LLVM instructions listed here are supported by PNaCl bitcode.

* ``ret``
* ``br``
* ``switch``

  i1 values are disallowed for ``switch``.

* ``add``, ``sub``, ``mul``, ``shl``, ``udiv``, ``sdiv``, ``urem``, ``srem``,
  ``lshr``, ``ashr``

  These arithmetic operations are disallowed on values of type ``i1``.

  Integer division (``udiv``, ``sdiv``, ``urem``, ``srem``) by zero is
  guaranteed to trap in PNaCl bitcode.

* ``and``
* ``or``
* ``xor``
* ``fadd``
* ``fsub``
* ``fmul``
* ``fdiv``
* ``frem``

  The ``frem`` instruction has the semantics of the libc ``fmod`` function for
  computing the floating point remainder. If the numerator is infinity, the
  denominator is zero, or either is NaN, then the result is NaN.
  Unlike the libc ``fmod`` function, this does not set ``errno`` when the
  result is NaN (see the :ref:`instructions and errno <ir_and_errno>`
  section).

* ``alloca``

  See :ref:`alloca instructions <bitcode_allocainst>`.

* ``load``, ``store``

  The pointer argument of these instructions must be a *normalized* pointer (see
  :ref:`pointer types <bitcode_pointertypes>`). The ``volatile`` and ``atomic``
  attributes are not supported. Loads and stores of the type ``i1`` are not
  supported.

  These instructions must use ``align 1`` on integer memory accesses, ``align 4``
  for ``float`` accesses and ``align 8`` for ``double`` accesses.

* ``trunc``
* ``zext``
* ``sext``
* ``fptrunc``
* ``fpext``
* ``fptoui``
* ``fptosi``
* ``uitofp``
* ``sitofp``

* ``ptrtoint``

  The pointer argument of a ``ptrtoint`` instruction must be a *normalized*
  pointer (see :ref:`pointer types <bitcode_pointertypes>`) and the resulting
  integer type must be i32.

* ``inttoptr``

  The integer argument of an ``inttoptr`` instruction must be an i32.

* ``bitcast``

  The pointer argument of a ``bitcast`` instruction must be an *inherent* pointer
  (see :ref:`pointer types <bitcode_pointertypes>`).

* ``icmp``
* ``fcmp``
* ``phi``
* ``select``
* ``call``
* ``unreachable``
* ``insertelement``
* ``extractelement``
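
Several of the restrictions above can be combined in one short function. This
is only a sketch with hypothetical names; it shows an ``inttoptr`` from an i32
address, a ``float`` load with the required ``align 4``, and ``frem``:

.. naclcode::
  :prettyprint: 0

    define internal float @load_and_frem(i32 %addr, float %d) {
      ; The i32 address becomes a normalized pointer via inttoptr.
      %p = inttoptr i32 %addr to float*
      ; float accesses use align 4 (integers: align 1, double: align 8).
      %n = load float* %p, align 4
      ; fmod-like remainder; NaN if %d is zero, and errno is never set.
      %r = frem float %n, %d
      ret float %r
    }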

.. _bitcode_allocainst:

``alloca``
----------

The only allowed type for ``alloca`` instructions in PNaCl bitcode is i8. The
size argument must be an i32. For example:

.. naclcode::
  :prettyprint: 0

    %buf = alloca i8, i32 8, align 4

Intrinsic Functions
===================

`LLVM LangRef: Intrinsic Functions
<http://llvm.org/releases/3.3/docs/LangRef.html#intrinsics>`_

List of allowed intrinsics
--------------------------

The only intrinsics supported by PNaCl bitcode are the following.

* ``llvm.memcpy``
* ``llvm.memmove``
* ``llvm.memset``

  These intrinsics are only supported with an i32 ``len`` argument.

* ``llvm.bswap``

  The overloaded ``llvm.bswap`` intrinsic is only supported with the following
  argument types: i16, i32, i64 (the types supported by C-style GCC builtins).

* ``llvm.ctlz``
* ``llvm.cttz``
* ``llvm.ctpop``

  The overloaded ``llvm.ctlz``, ``llvm.cttz``, and ``llvm.ctpop`` intrinsics
  are only supported with the i32 and i64 argument types (the types supported
  by C-style GCC builtins).

* ``llvm.sqrt``

  The overloaded ``llvm.sqrt`` intrinsic is only supported for float
  and double argument types. This has the same semantics as the libc
  sqrt function, returning NaN for values less than -0.0. However, this
  does not set ``errno`` when the result is NaN (see the
  :ref:`instructions and errno <ir_and_errno>` section).

* ``llvm.stacksave``
* ``llvm.stackrestore``

  These intrinsics are used to implement language features like scoped automatic
  variable-sized arrays in C99. ``llvm.stacksave`` returns a value that
  represents the current state of the stack. This value may only be used as the
  argument to ``llvm.stackrestore``, which restores the stack to the given
  state.

* ``llvm.trap``

  This intrinsic is lowered to a target-dependent trap instruction, which aborts
  execution.

* ``llvm.nacl.read.tp``

  See :ref:`thread pointer related intrinsics
  <bitcode_threadpointerintrinsics>`.

* ``llvm.nacl.longjmp``
* ``llvm.nacl.setjmp``

  See :ref:`Setjmp and Longjmp <bitcode_setjmplongjmp>`.

* ``llvm.nacl.atomic.store``
* ``llvm.nacl.atomic.load``
* ``llvm.nacl.atomic.rmw``
* ``llvm.nacl.atomic.cmpxchg``
* ``llvm.nacl.atomic.fence``
* ``llvm.nacl.atomic.fence.all``
* ``llvm.nacl.atomic.is.lock.free``

  See :ref:`atomic intrinsics <bitcode_atomicintrinsics>`.
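
As an illustration of the stack and memory intrinsics, the sketch below shows
roughly how a scoped variable-sized buffer could be lowered. The function name
is hypothetical, and the intrinsic signatures are written as they appear in
the LLVM 3.3 language reference (``llvm.memset`` with an i32 ``len``, followed
by alignment and ``isvolatile`` arguments):

.. naclcode::
  :prettyprint: 0

    declare i8* @llvm.stacksave()
    declare void @llvm.stackrestore(i8*)
    declare void @llvm.memset.p0i8.i32(i8*, i8, i32, i32, i1)

    define internal void @scratch(i32 %n) {
      ; Remember the stack state before the dynamic allocation.
      %saved = call i8* @llvm.stacksave()
      %tmp = alloca i8, i32 %n, align 1
      ; Zero-fill %n bytes of the temporary buffer.
      call void @llvm.memset.p0i8.i32(i8* %tmp, i8 0, i32 %n, i32 1, i1 false)
      ; Release %tmp by restoring the saved stack state.
      call void @llvm.stackrestore(i8* %saved)
      ret void
    }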

.. _bitcode_threadpointerintrinsics:

Thread pointer related intrinsics
---------------------------------

.. naclcode::
  :prettyprint: 0

    declare i8* @llvm.nacl.read.tp()

Returns a read-only thread pointer. The value is controlled by the embedding
sandbox's runtime.

.. _bitcode_setjmplongjmp:

Setjmp and Longjmp
------------------

.. naclcode::
  :prettyprint: 0

    declare void @llvm.nacl.longjmp(i8* %jmpbuf, i32)
    declare i32 @llvm.nacl.setjmp(i8* %jmpbuf)

These intrinsics implement the semantics of C11 ``setjmp`` and ``longjmp``. The
``jmpbuf`` pointer must be 64-bit aligned and point to at least 1024 bytes of
allocated memory.
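
A buffer that meets these requirements can be obtained with ``alloca``. The
following sketch uses a hypothetical function that jumps back to its own
``setjmp`` call exactly once:

.. naclcode::
  :prettyprint: 0

    declare i32 @llvm.nacl.setjmp(i8*)
    declare void @llvm.nacl.longjmp(i8*, i32)

    define internal i32 @try_once() {
    entry:
      ; 1024 bytes, 8-byte (64-bit) aligned, as required for jmpbuf.
      %jmpbuf = alloca i8, i32 1024, align 8
      %rc = call i32 @llvm.nacl.setjmp(i8* %jmpbuf)
      %first = icmp eq i32 %rc, 0
      br i1 %first, label %jump, label %done

    jump:
      ; Control resumes at the setjmp call, which then returns 1.
      call void @llvm.nacl.longjmp(i8* %jmpbuf, i32 1)
      unreachable

    done:
      ret i32 %rc
    }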

.. _bitcode_atomicintrinsics:

Atomic intrinsics
-----------------

.. naclcode::
  :prettyprint: 0

    declare iN @llvm.nacl.atomic.load.<size>(
            iN* <source>, i32 <memory_order>)
    declare void @llvm.nacl.atomic.store.<size>(
            iN <operand>, iN* <destination>, i32 <memory_order>)
    declare iN @llvm.nacl.atomic.rmw.<size>(
            i32 <computation>, iN* <object>, iN <operand>, i32 <memory_order>)
    declare iN @llvm.nacl.atomic.cmpxchg.<size>(
            iN* <object>, iN <expected>, iN <desired>,
            i32 <memory_order_success>, i32 <memory_order_failure>)
    declare void @llvm.nacl.atomic.fence(i32 <memory_order>)
    declare void @llvm.nacl.atomic.fence.all()

Each of these intrinsics is overloaded on the ``iN`` argument, which is
reflected through ``<size>`` in the overload's name. Integral types of
8, 16, 32 and 64-bit width are supported for these arguments.

The ``@llvm.nacl.atomic.rmw`` intrinsic implements the following
read-modify-write operations, from the general and arithmetic sections
of the C11/C++11 standards:

 - ``add``
 - ``sub``
 - ``or``
 - ``and``
 - ``xor``
 - ``exchange``

For all of these read-modify-write operations, the returned value is
that at ``object`` before the computation. The ``computation`` argument
must be a compile-time constant.

All atomic intrinsics also support C11/C++11 memory orderings, which
must be compile-time constants.

Integer values for these computations and memory orderings are defined
in ``"llvm/IR/NaClAtomicIntrinsics.h"``.

The ``@llvm.nacl.atomic.fence.all`` intrinsic is equivalent to the
``@llvm.nacl.atomic.fence`` intrinsic with sequentially consistent
ordering and compiler barriers preventing most non-atomic memory
accesses from reordering around it.
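
A call to the 32-bit overloads might look like the sketch below. The function
name is hypothetical, and the constants ``1`` (for ``add``) and ``6`` (for
sequentially consistent ordering) are assumptions made for illustration; the
authoritative values are the ones in ``"llvm/IR/NaClAtomicIntrinsics.h"``:

.. naclcode::
  :prettyprint: 0

    declare i32 @llvm.nacl.atomic.rmw.i32(i32, i32*, i32, i32)
    declare i32 @llvm.nacl.atomic.load.i32(i32*, i32)

    define internal i32 @bump(i32 %addr) {
      %counter = inttoptr i32 %addr to i32*
      ; Assumed encoding: computation 1 = add, memory order 6 = seq_cst.
      %old = call i32 @llvm.nacl.atomic.rmw.i32(
                 i32 1, i32* %counter, i32 1, i32 6)
      ; %old holds the value at %counter before the add.
      %now = call i32 @llvm.nacl.atomic.load.i32(i32* %counter, i32 6)
      ret i32 %now
    }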

.. Note::
  :class: note

    These intrinsics allow PNaCl to support C11/C++11 style atomic
    operations as well as some legacy GCC-style ``__sync_*`` builtins
    while remaining stable as the LLVM codebase changes. The user isn't
    expected to use these intrinsics directly.

.. naclcode::
  :prettyprint: 0

    declare i1 @llvm.nacl.atomic.is.lock.free(i32 <byte_size>, i8* <address>)

The ``llvm.nacl.atomic.is.lock.free`` intrinsic is designed to
determine at translation time whether atomic operations of a certain
``byte_size`` (a compile-time constant), at a particular ``address``,
are lock-free or not. This reflects the C11 ``atomic_is_lock_free``
function from header ``<stdatomic.h>`` and the C++11 ``is_lock_free``
member function in header ``<atomic>``. It can be used through the
``__nacl_atomic_is_lock_free`` builtin.
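
For example, a query for 4-byte operations could be written as in this sketch.
The function name is hypothetical; the result is widened because functions
defined in the pexe may only return i32 or i64 integers:

.. naclcode::
  :prettyprint: 0

    declare i1 @llvm.nacl.atomic.is.lock.free(i32, i8*)

    define internal i32 @word_is_lock_free(i32 %addr) {
      %p = inttoptr i32 %addr to i8*
      ; byte_size is the constant 4; the address is a runtime value.
      %free = call i1 @llvm.nacl.atomic.is.lock.free(i32 4, i8* %p)
      ; Widen the i1 result to an allowed return type.
      %res = zext i1 %free to i32
      ret i32 %res
    }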