<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">

<html>

<head>
<title>Dalvik VM Instruction Formats</title>
<link rel=stylesheet href="instruction-formats.css">
</head>

<body>

<h1>Dalvik VM Instruction Formats</h1>
<p>Copyright &copy; 2007 The Android Open Source Project</p>

<h2>Introduction and Overview</h2>

<p>This document lists the instruction formats used by Dalvik bytecode
and is meant to be used in conjunction with the
<a href="dalvik-bytecode.html">bytecode reference document</a>.</p>

<h3>Bitwise descriptions</h3>

<p>The first column in the format table lists the bitwise layout of
the format. It consists of one or more space-separated "words", each of
which describes a 16-bit code unit. Each character in a word
represents four bits, read from high bits to low, with vertical bars
("<code>|</code>") interspersed to aid in reading. Uppercase letters
in sequence from "<code>A</code>" are used to indicate fields within
the format (which then get defined further by the syntax column). The term
"<code>op</code>" is used to indicate the position of an eight-bit
opcode within the format. A slashed zero
("<code>&Oslash;</code>") is used to indicate that all bits must be
zero in the indicated position.</p>

<p>For the most part, lettering proceeds from earlier code units to
later code units, and low-order to high-order within a code unit.
However, there are a few exceptions to this general rule, made so that
similar-meaning parts have the same name across different instruction
formats. These cases are noted explicitly in the format descriptions.</p>

<p>For example, the format "<code>B|A|<i>op</i> CCCC</code>" indicates
that the format consists of two 16-bit code units. The first word
consists of the opcode in the low eight bits and a pair of four-bit
values in the high eight bits; and the second word consists of a single
16-bit value.</p>
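
<p>As an illustration only, the following Python sketch splits the two
code units of a "<code>B|A|<i>op</i> CCCC</code>" instruction into its
fields. The function name and the sample values are hypothetical and are
not part of the Dalvik specification.</p>

```python
def decode_b_a_op_cccc(code_units):
    """Split two 16-bit code units laid out as 'B|A|op CCCC' into fields."""
    first, second = code_units
    return {
        "op": first & 0xFF,        # low eight bits of the first word: opcode
        "A": (first >> 8) & 0xF,   # bits 8..11 of the first word
        "B": (first >> 12) & 0xF,  # bits 12..15 of the first word
        "C": second & 0xFFFF,      # the whole second word
    }


# Arbitrary sample code units, chosen only to make each field visible.
fields = decode_b_a_op_cccc([0x2152, 0x0003])
```

<p>With these sample values, <code>fields</code> holds an opcode of
<code>0x52</code>, <code>A = 1</code>, <code>B = 2</code>, and
<code>C = 3</code>.</p>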

<h3>Format IDs</h3>

<p>The second column in the format table indicates the short identifier
for the format, which is used in other documents and in code to identify
the format.</p>

<p>Most format IDs consist of three characters, two digits followed by a
letter. The first digit indicates the number of 16-bit code units in the
format. The second digit indicates the maximum number of registers that the
format contains (maximum, since some formats can accommodate a variable
number of registers), with the special designation "<code>r</code>" indicating
that a range of registers is encoded. The final letter semi-mnemonically
indicates the type of any extra data encoded by the format. For example,
format "<code>21t</code>" is of length two, contains one register reference,
and additionally contains a branch target.</p>
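
<p>The naming scheme can be sketched in Python as follows. This is a
minimal illustration, not code from any Dalvik tool: the function name
<code>parse_format_id</code> and the returned dictionary are hypothetical,
and the handling of a fourth character (a trailing "<code>s</code>" or
"<code>i</code>" read as a linking suffix, anything else read as a second
typecode) is an assumption based on the suggested formats listed in this
document.</p>

```python
import re


def parse_format_id(format_id):
    """Break a format ID such as '21t' or '3rms' into its parts."""
    match = re.fullmatch(r"(\d)([\dr])([a-z])([a-z]?)", format_id)
    if match is None:
        raise ValueError(f"unrecognized format ID: {format_id!r}")
    units, registers, typecode, extra = match.groups()

    typecodes = [typecode]
    linking = None
    if extra == "s":        # assumed: suggested static linking suffix
        linking = "static"
    elif extra == "i":      # assumed: suggested inline linking suffix
        linking = "inline"
    elif extra:             # e.g. "20bc": a second piece of data
        typecodes.append(extra)

    return {
        "units": int(units),  # number of 16-bit code units
        "registers": "range" if registers == "r" else int(registers),
        "typecodes": typecodes,
        "linking": linking,
    }
```

<p>For instance, <code>parse_format_id("21t")</code> reports two code
units, one register, and the typecode "<code>t</code>", matching the
description above.</p>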

<p>Suggested static linking formats have an additional
"<code>s</code>" suffix, making them four characters total. Similarly,
suggested "inline" linking formats have an additional "<code>i</code>"
suffix. (In this context, inline linking is like static linking,
except with more direct ties into a virtual machine's implementation.)
Finally, a couple of oddball suggested formats (e.g.,
"<code>20bc</code>") include two pieces of data, both of which are
represented in the format ID.</p>

<p>The full list of typecode letters is as follows. Note that some
forms have different sizes, depending on the format:</p>

<table class="letters">
<thead>
<tr>
  <th>Mnemonic</th>
  <th>Bit Sizes</th>
  <th>Meaning</th>
</tr>
</thead>
<tbody>
<tr>
  <td>b</td>
  <td>8</td>
  <td>immediate signed <b>b</b>yte</td>
</tr>
<tr>
  <td>c</td>
  <td>16, 32</td>
  <td><b>c</b>onstant pool index</td>
</tr>
<tr>
  <td>f</td>
  <td>16</td>
  <td>inter<b>f</b>ace constants (only used in statically linked formats)
  </td>
</tr>
<tr>
  <td>h</td>
  <td>16</td>
  <td>immediate signed <b>h</b>at (high-order bits of a 32- or 64-bit
    value; low-order bits are all <code>0</code>)
  </td>
</tr>
<tr>
  <td>i</td>
  <td>32</td>
  <td>immediate signed <b>i</b>nt, or 32-bit float</td>
</tr>
<tr>
  <td>l</td>
  <td>64</td>
  <td>immediate signed <b>l</b>ong, or 64-bit double</td>
</tr>
<tr>
  <td>m</td>
  <td>16</td>
  <td><b>m</b>ethod constants (only used in statically linked formats)</td>
</tr>
<tr>
  <td>n</td>
  <td>4</td>
  <td>immediate signed <b>n</b>ibble</td>
</tr>
<tr>
  <td>s</td>
  <td>16</td>
  <td>immediate signed <b>s</b>hort</td>
</tr>
<tr>
  <td>t</td>
  <td>8, 16, 32</td>
  <td>branch <b>t</b>arget</td>
</tr>
<tr>
  <td>x</td>
  <td>0</td>
  <td>no additional data</td>
</tr>
</tbody>
</table>

<h3>Syntax</h3>

<p>The third column of the format table indicates the human-oriented
syntax for instructions which use the indicated format. Each instruction
starts with the named opcode and is optionally followed by one or
more arguments, themselves separated with commas.</p>

<p>Wherever an argument refers to a field from the first column, the
letter for that field is indicated in the syntax, repeated once for
each four bits of the field. For example, an eight-bit field labeled
"<code>BB</code>" in the first column would also be labeled
"<code>BB</code>" in the syntax column.</p>

<p>Arguments which name a register have the form "<code>v<i>X</i></code>".
The prefix "<code>v</code>" was chosen instead of the more common
"<code>r</code>" exactly to avoid conflicting with (non-virtual) architectures
on which a Dalvik virtual machine might be implemented which themselves
use the prefix "<code>r</code>" for their registers. (That is, this
decision makes it possible to talk about both virtual and real registers
together without the need for circumlocution.)</p>

<p>Arguments which indicate a literal value have the form
"<code>#+<i>X</i></code>". Some formats indicate literals that only
have non-zero bits in their high-order bits; for these, the zeroes
are represented explicitly in the syntax, even though they do not
appear in the bitwise representation.</p>
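
<p>For a "hat" literal (format <code>21h</code>), the expansion from the
16 encoded bits to the full 32- or 64-bit value can be sketched as below.
The helper name <code>expand_hat_literal</code> and its <code>wide</code>
parameter are hypothetical; the sketch only assumes what the typecode
table states, namely that the encoded bits are the signed high-order bits
and that all low-order bits are zero.</p>

```python
def expand_hat_literal(bbbb, wide=False):
    """Return the signed value whose high-order 16 bits are `bbbb`.

    wide=False gives the 32-bit form  (#+BBBB0000);
    wide=True  gives the 64-bit form  (#+BBBB000000000000).
    """
    width = 64 if wide else 32
    value = (bbbb & 0xFFFF) << (width - 16)  # low-order bits are all zero
    if value & (1 << (width - 1)):           # reinterpret as a signed value
        value -= 1 << width
    return value


narrow = expand_hat_literal(0x3F80)             # 0x3F800000
wide = expand_hat_literal(0x4000, wide=True)    # 0x4000000000000000
```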

<p>Arguments which indicate a relative instruction address offset have the
form "<code>+<i>X</i></code>".</p>
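
<p>The sketch below shows one way a relative offset might be turned into
an absolute address. It rests on two assumptions that this document does
not state and that should be checked against the bytecode reference: that
the offset field is a signed two's-complement value, and that it is added
to the address of the branch instruction itself, measured in 16-bit code
units. Both function names are hypothetical.</p>

```python
def sign_extend(value, bits):
    """Interpret the low `bits` bits of `value` as a signed integer."""
    value &= (1 << bits) - 1
    sign_bit = 1 << (bits - 1)
    return (value ^ sign_bit) - sign_bit


def branch_target(address, raw_offset, bits):
    """Add a signed offset field of width `bits` to `address`.

    `address` is assumed to be the position of the branch instruction,
    counted in 16-bit code units.
    """
    return address + sign_extend(raw_offset, bits)


# An 8-bit field (as in format 10t) holding 0xFE is read as -2.
target = branch_target(10, 0xFE, 8)
```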

<p>Arguments which indicate a literal constant pool index have the form
"<code><i>kind</i>@<i>X</i></code>", where "<code><i>kind</i></code>"
indicates which constant pool is being referred to. Each opcode that
uses such a format explicitly allows only one kind of constant; see
the opcode reference to figure out the correspondence. The four
kinds of constant pool are "<code>string</code>" (string pool index),
"<code>type</code>" (type pool index), "<code>field</code>" (field
pool index), and "<code>meth</code>" (method pool index).</p>

<p>Similar to the representation of constant pool indices, there are
also suggested (optional) forms that indicate prelinked offsets or
indices. There are two types of suggested prelinked value: vtable offsets
(indicated as "<code>vtaboff</code>") and field offsets (indicated as
"<code>fieldoff</code>").</p>

<p>In the cases where a format value isn't explicitly part of the syntax
but instead picks a variant, each variant is listed with the prefix
"<code>[<i>X</i>=<i>N</i>]</code>" (e.g., "<code>[A=2]</code>") to indicate
the correspondence.</p>

<h2>The Formats</h2>

<table class="format">
<thead>
<tr>
  <th>Format</th>
  <th>ID</th>
  <th>Syntax</th>
  <th>Notable Opcodes Covered</th>
</tr>
</thead>
<tbody>
<tr>
  <td><i>N/A</i></td>
  <td>00x</td>
  <td><i><code>N/A</code></i></td>
  <td><i>pseudo-format used for unused opcodes; suggested for use as the
    nominal format for a breakpoint opcode</i></td>
</tr>
<tr>
  <td>&Oslash;&Oslash;|<i>op</i></td>
  <td>10x</td>
  <td><i><code>op</code></i></td>
  <td>&nbsp;</td>
</tr>
<tr>
  <td rowspan="2">B|A|<i>op</i></td>
  <td>12x</td>
  <td><i><code>op</code></i> vA, vB</td>
  <td>&nbsp;</td>
</tr>
<tr>
  <td>11n</td>
  <td><i><code>op</code></i> vA, #+B</td>
  <td>&nbsp;</td>
</tr>
<tr>
  <td rowspan="2">AA|<i>op</i></td>
  <td>11x</td>
  <td><i><code>op</code></i> vAA</td>
  <td>&nbsp;</td>
</tr>
<tr>
  <td>10t</td>
  <td><i><code>op</code></i> +AA</td>
  <td>goto</td>
</tr>
<tr>
  <td>&Oslash;&Oslash;|<i>op</i> AAAA</td>
  <td>20t</td>
  <td><i><code>op</code></i> +AAAA</td>
  <td>goto/16</td>
</tr>
<tr>
  <td>AA|<i>op</i> BBBB</td>
  <td>20bc</td>
  <td><i><code>op</code></i> AA, kind@BBBB</td>
  <td><i>suggested format for statically determined verification errors;
    A is the type of error and B is an index into a type-appropriate
    table (e.g. method references for a no-such-method error)</i></td>
</tr>
<tr>
  <td rowspan="5">AA|<i>op</i> BBBB</td>
  <td>22x</td>
  <td><i><code>op</code></i> vAA, vBBBB</td>
  <td>&nbsp;</td>
</tr>
<tr>
  <td>21t</td>
  <td><i><code>op</code></i> vAA, +BBBB</td>
  <td>&nbsp;</td>
</tr>
<tr>
  <td>21s</td>
  <td><i><code>op</code></i> vAA, #+BBBB</td>
  <td>&nbsp;</td>
</tr>
<tr>
  <td>21h</td>
  <td><i><code>op</code></i> vAA, #+BBBB0000<br/>
    <i><code>op</code></i> vAA, #+BBBB000000000000
  </td>
  <td>&nbsp;</td>
</tr>
<tr>
  <td>21c</td>
  <td><i><code>op</code></i> vAA, type@BBBB<br/>
    <i><code>op</code></i> vAA, field@BBBB<br/>
    <i><code>op</code></i> vAA, string@BBBB
  </td>
  <td>check-cast<br/>
    const-class<br/>
    const-string
  </td>
</tr>
<tr>
  <td rowspan="2">AA|<i>op</i> CC|BB</td>
  <td>23x</td>
  <td><i><code>op</code></i> vAA, vBB, vCC</td>
  <td>&nbsp;</td>
</tr>
<tr>
  <td>22b</td>
  <td><i><code>op</code></i> vAA, vBB, #+CC</td>
  <td>&nbsp;</td>
</tr>
<tr>
  <td rowspan="4">B|A|<i>op</i> CCCC</td>
  <td>22t</td>
  <td><i><code>op</code></i> vA, vB, +CCCC</td>
  <td>&nbsp;</td>
</tr>
<tr>
  <td>22s</td>
  <td><i><code>op</code></i> vA, vB, #+CCCC</td>
  <td>&nbsp;</td>
</tr>
<tr>
  <td>22c</td>
  <td><i><code>op</code></i> vA, vB, type@CCCC<br/>
    <i><code>op</code></i> vA, vB, field@CCCC
  </td>
  <td>instance-of</td>
</tr>
<tr>
  <td>22cs</td>
  <td><i><code>op</code></i> vA, vB, fieldoff@CCCC</td>
  <td><i>suggested format for statically linked field access instructions of
    format 22c</i>
  </td>
</tr>
<tr>
  <td>&Oslash;&Oslash;|<i>op</i> AAAA<sub>lo</sub> AAAA<sub>hi</sub></td>
  <td>30t</td>
  <td><i><code>op</code></i> +AAAAAAAA</td>
  <td>goto/32</td>
</tr>
<tr>
  <td>&Oslash;&Oslash;|<i>op</i> AAAA BBBB</td>
  <td>32x</td>
  <td><i><code>op</code></i> vAAAA, vBBBB</td>
  <td>&nbsp;</td>
</tr>
<tr>
  <td rowspan="3">AA|<i>op</i> BBBB<sub>lo</sub> BBBB<sub>hi</sub></td>
  <td>31i</td>
  <td><i><code>op</code></i> vAA, #+BBBBBBBB</td>
  <td>&nbsp;</td>
</tr>
<tr>
  <td>31t</td>
  <td><i><code>op</code></i> vAA, +BBBBBBBB</td>
  <td>&nbsp;</td>
</tr>
<tr>
  <td>31c</td>
  <td><i><code>op</code></i> vAA, string@BBBBBBBB</td>
  <td>const-string/jumbo</td>
</tr>
<tr>
  <td rowspan="3">A|G|<i>op</i> BBBB F|E|D|C</td>
  <td>35c</td>
  <td><i>[<code>A=5</code>] <code>op</code></i> {vC, vD, vE, vF, vG},
    meth@BBBB<br/>
    <i>[<code>A=5</code>] <code>op</code></i> {vC, vD, vE, vF, vG},
    type@BBBB<br/>
    <i>[<code>A=4</code>] <code>op</code></i> {vC, vD, vE, vF},
    <i><code>kind</code></i>@BBBB<br/>
    <i>[<code>A=3</code>] <code>op</code></i> {vC, vD, vE},
    <i><code>kind</code></i>@BBBB<br/>
    <i>[<code>A=2</code>] <code>op</code></i> {vC, vD},
    <i><code>kind</code></i>@BBBB<br/>
    <i>[<code>A=1</code>] <code>op</code></i> {vC},
    <i><code>kind</code></i>@BBBB<br/>
    <i>[<code>A=0</code>] <code>op</code></i> {},
    <i><code>kind</code></i>@BBBB<br/>
    <p><i>The unusual choice in lettering here reflects a desire to make
    the count and the reference index have the same label as in format
    3rc.</i></p>
  </td>
  <td>&nbsp;</td>
</tr>
<tr>
  <td>35ms</td>
  <td><i>[<code>A=5</code>] <code>op</code></i> {vC, vD, vE, vF, vG},
    vtaboff@BBBB<br/>
    <i>[<code>A=4</code>] <code>op</code></i> {vC, vD, vE, vF},
    vtaboff@BBBB<br/>
    <i>[<code>A=3</code>] <code>op</code></i> {vC, vD, vE},
    vtaboff@BBBB<br/>
    <i>[<code>A=2</code>] <code>op</code></i> {vC, vD},
    vtaboff@BBBB<br/>
    <i>[<code>A=1</code>] <code>op</code></i> {vC},
    vtaboff@BBBB<br/>
    <p><i>The unusual choice in lettering here reflects a desire to make
    the count and the reference index have the same label as in format
    3rms.</i></p>
  </td>
  <td><i>suggested format for statically linked <code>invoke-virtual</code>
    and <code>invoke-super</code> instructions of format 35c</i>
  </td>
</tr>
<tr>
  <td>35mi</td>
  <td><i>[<code>A=5</code>] <code>op</code></i> {vC, vD, vE, vF, vG},
    inline@BBBB<br/>
    <i>[<code>A=4</code>] <code>op</code></i> {vC, vD, vE, vF},
    inline@BBBB<br/>
    <i>[<code>A=3</code>] <code>op</code></i> {vC, vD, vE},
    inline@BBBB<br/>
    <i>[<code>A=2</code>] <code>op</code></i> {vC, vD},
    inline@BBBB<br/>
    <i>[<code>A=1</code>] <code>op</code></i> {vC},
    inline@BBBB<br/>
    <p><i>The unusual choice in lettering here reflects a desire to make
    the count and the reference index have the same label as in format
    3rmi.</i></p>
  </td>
  <td><i>suggested format for inline linked <code>invoke-static</code>
    and <code>invoke-virtual</code> instructions of format 35c</i>
  </td>
</tr>
<tr>
  <td rowspan="3">AA|<i>op</i> BBBB CCCC</td>
  <td>3rc</td>
  <td><i><code>op</code></i> {vCCCC .. vNNNN}, meth@BBBB<br/>
    <i><code>op</code></i> {vCCCC .. vNNNN}, type@BBBB<br/>
    <p><i>where <code>NNNN = CCCC+AA-1</code>, that is <code>A</code>
    determines the count <code>0..255</code>, and <code>C</code>
    determines the first register</i></p>
  </td>
  <td>&nbsp;</td>
</tr>
<tr>
  <td>3rms</td>
  <td><i><code>op</code></i> {vCCCC .. vNNNN}, vtaboff@BBBB<br/>
    <p><i>where <code>NNNN = CCCC+AA-1</code>, that is <code>A</code>
    determines the count <code>0..255</code>, and <code>C</code>
    determines the first register</i></p>
  </td>
  <td><i>suggested format for statically linked <code>invoke-virtual</code>
    and <code>invoke-super</code> instructions of format <code>3rc</code></i>
  </td>
</tr>
<tr>
  <td>3rmi</td>
  <td><i><code>op</code></i> {vCCCC .. vNNNN}, inline@BBBB<br/>
    <p><i>where <code>NNNN = CCCC+AA-1</code>, that is <code>A</code>
    determines the count <code>0..255</code>, and <code>C</code>
    determines the first register</i></p>
  </td>
  <td><i>suggested format for inline linked <code>invoke-static</code>
    and <code>invoke-virtual</code> instructions of format 3rc</i>
  </td>
</tr>
<tr>
  <td>AA|<i>op</i> BBBB<sub>lo</sub> BBBB BBBB BBBB<sub>hi</sub></td>
  <td>51l</td>
  <td><i><code>op</code></i> vAA, #+BBBBBBBBBBBBBBBB</td>
  <td>const-wide</td>
</tr>
</tbody>
</table>
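
<p>To make the register-list rows of the table concrete, the following
Python sketch extracts the register lists from formats <code>35c</code>
and <code>3rc</code> as they are laid out above. It is a reading of the
table only, not code from a Dalvik implementation; the function names,
the returned dictionaries, and the sample code units are hypothetical.</p>

```python
def decode_35c(code_units):
    """Decode 'A|G|op BBBB F|E|D|C': up to five registers, count in A."""
    first, bbbb, third = code_units
    count = (first >> 12) & 0xF
    if count > 5:
        raise ValueError(f"format 35c lists at most five registers, got A={count}")
    # Registers in syntax order: vC, vD, vE, vF, vG.
    candidates = [
        third & 0xF,          # C
        (third >> 4) & 0xF,   # D
        (third >> 8) & 0xF,   # E
        (third >> 12) & 0xF,  # F
        (first >> 8) & 0xF,   # G
    ]
    return {"op": first & 0xFF, "index": bbbb, "registers": candidates[:count]}


def decode_3rc(code_units):
    """Decode 'AA|op BBBB CCCC': registers vCCCC .. vNNNN, NNNN = CCCC+AA-1."""
    first, bbbb, cccc = code_units
    count = (first >> 8) & 0xFF
    return {
        "op": first & 0xFF,
        "index": bbbb,
        "registers": list(range(cccc, cccc + count)),
    }


# Arbitrary sample code units: three registers in each case.
listed = decode_35c([0x306E, 0x0012, 0x0321])   # registers [1, 2, 3]
ranged = decode_3rc([0x0374, 0x0005, 0x0010])   # registers [16, 17, 18]
```

<p>In both helpers the count comes from <code>A</code> and the reference
index from <code>B</code>, which is the shared labeling that the notes in
the table describe.</p>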

</body>
</html>