<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">

<html>

<head>
<title>Dalvik VM Instruction Formats</title>
<link rel="stylesheet" href="instruction-formats.css" />
</head>

<body>

<h1>Dalvik VM Instruction Formats</h1>
<p>Copyright © 2007 The Android Open Source Project</p>

<h2>Introduction and Overview</h2>

<p>This document lists the instruction formats used by Dalvik bytecode
and is meant to be used in conjunction with the
<a href="dalvik-bytecode.html">bytecode reference document</a>.</p>

<h3>Bitwise descriptions</h3>

<p>The first column in the format table lists the bitwise layout of
the format. It consists of one or more space-separated "words" each of
which describes a 16-bit code unit. Each character in a word
represents four bits, read from high bits to low, with vertical bars
("<code>|</code>") interspersed to aid in reading. Uppercase letters
in sequence from "<code>A</code>" are used to indicate fields within
the format (which then get defined further by the syntax column). The term
"<code>op</code>" is used to indicate the position of the eight-bit
opcode within the format. A slashed zero ("<code>Ø</code>") is
used to indicate that all bits should be zero in the indicated
position.</p>

<p>For example, the format "<code>B|A|<i>op</i> CCCC</code>" indicates
that the format consists of two 16-bit code units. The first word
consists of the opcode in the low eight bits and a pair of four-bit
values in the high eight bits; and the second word consists of a single
16-bit value.</p>

<h3>Format IDs</h3>

<p>The second column in the format table indicates the short identifier
for the format, which is used in other documents and in code to identify
the format.</p>

<p>Format IDs consist of three characters, two digits followed by a
letter. The first digit indicates the number of 16-bit code units in the
format.
The second digit indicates the maximum number of registers that the
format contains (maximum, since some formats can accommodate a variable
number of registers), with the special designation "<code>r</code>" indicating
that a range of registers is encoded. The final letter semi-mnemonically
indicates the type of any extra data encoded by the format. For example,
format "<code>21t</code>" is of length two, contains one register reference,
and additionally contains a branch target.</p>

<p>Suggested static linking formats have an additional "<code>s</code>" suffix,
making them four characters total.</p>

<p>The full list of typecode letters is as follows. Note that some
forms have different sizes, depending on the format:</p>

<table class="letters">
<thead>
<tr>
  <th>Mnemonic</th>
  <th>Bit Sizes</th>
  <th>Meaning</th>
</tr>
</thead>
<tbody>
<tr>
  <td>b</td>
  <td>8</td>
  <td>immediate signed <b>b</b>yte</td>
</tr>
<tr>
  <td>c</td>
  <td>16, 32</td>
  <td><b>c</b>onstant pool index</td>
</tr>
<tr>
  <td>f</td>
  <td>16</td>
  <td>inter<b>f</b>ace constants (only used in statically linked formats)
  </td>
</tr>
<tr>
  <td>h</td>
  <td>16</td>
  <td>immediate signed <b>h</b>at (high-order bits of a 32- or 64-bit
    value; low-order bits are all <code>0</code>)
  </td>
</tr>
<tr>
  <td>i</td>
  <td>32</td>
  <td>immediate signed <b>i</b>nt, or 32-bit float</td>
</tr>
<tr>
  <td>l</td>
  <td>64</td>
  <td>immediate signed <b>l</b>ong, or 64-bit double</td>
</tr>
<tr>
  <td>m</td>
  <td>16</td>
  <td><b>m</b>ethod constants (only used in statically linked formats)</td>
</tr>
<tr>
  <td>n</td>
  <td>4</td>
  <td>immediate signed <b>n</b>ibble</td>
</tr>
<tr>
  <td>s</td>
  <td>16</td>
  <td>immediate signed <b>s</b>hort</td>
</tr>
<tr>
  <td>t</td>
  <td>8, 16, 32</td>
  <td>branch <b>t</b>arget</td>
</tr>
<tr>
  <td>x</td>
  <td>0</td>
  <td>no additional data</td>
</tr>
</tbody>
</table>

<h3>Syntax</h3>

<p>The third column of the format table indicates the human-oriented
syntax for instructions which use the indicated format. Each instruction
starts with the named opcode and is optionally followed by one or
more arguments, themselves separated with commas.</p>

<p>Wherever an argument refers to a field from the first column, the
letter for that field is indicated in the syntax, repeated once for
each four bits of the field. For example, an eight-bit field labeled
"<code>BB</code>" in the first column would also be labeled
"<code>BB</code>" in the syntax column.</p>

<p>Arguments which name a register have the form "<code>v<i>X</i></code>".
The prefix "<code>v</code>" was chosen instead of the more common
"<code>r</code>" specifically to avoid conflicting with the (non-virtual)
architectures on which a Dalvik virtual machine might be implemented,
which themselves use the prefix "<code>r</code>" for their registers.
(That is, this decision makes it possible to talk about both virtual and
real registers together without the need for circumlocution.)</p>

<p>Arguments which indicate a literal value have the form
"<code>#+<i>X</i></code>". Some formats indicate literals that only
have non-zero bits in their high-order bits; for these, the zeroes
are represented explicitly in the syntax, even though they do not
appear in the bitwise representation.</p>

<p>Arguments which indicate a relative instruction address offset have the
form "<code>+<i>X</i></code>".</p>

<p>Arguments which indicate a literal constant pool index have the form
"<code><i>kind</i>@<i>X</i></code>", where "<code><i>kind</i></code>"
indicates which constant pool is being referred to.
Each opcode that
uses such a format explicitly allows only one kind of constant; see
the opcode reference to figure out the correspondence. The four
kinds of constant pool are "<code>string</code>" (string pool index),
"<code>type</code>" (type pool index), "<code>field</code>" (field
pool index), and "<code>meth</code>" (method pool index).</p>

<p>Similar to the representation of constant pool indices, there are
also suggested (optional) forms that indicate prelinked offsets or
indices. These prelinked values include "<code>vtaboff</code>"
(vtable offset), "<code>fieldoff</code>" (field offset), and
"<code>iface</code>" (interface pool index).</p>

<p>In the cases where a format value isn't explicitly part of the syntax
but instead picks a variant, each variant is listed with the prefix
"<code>[<i>X</i>=<i>N</i>]</code>" (e.g., "<code>[B=2]</code>") to indicate
the correspondence.</p>

<h2>The Formats</h2>

<table class="format">
<thead>
<tr>
  <th>Format</th>
  <th>ID</th>
  <th>Syntax</th>
  <th>Notable Opcodes Covered</th>
</tr>
</thead>
<tbody>
<tr>
  <td>ØØ|<i>op</i></td>
  <td>10x</td>
  <td><i><code>op</code></i></td>
  <td>&nbsp;</td>
</tr>
<tr>
  <td rowspan="2">B|A|<i>op</i></td>
  <td>12x</td>
  <td><i><code>op</code></i> vA, vB</td>
  <td>&nbsp;</td>
</tr>
<tr>
  <td>11n</td>
  <td><i><code>op</code></i> vA, #+B</td>
  <td>&nbsp;</td>
</tr>
<tr>
  <td rowspan="2">AA|<i>op</i></td>
  <td>11x</td>
  <td><i><code>op</code></i> vAA</td>
  <td>&nbsp;</td>
</tr>
<tr>
  <td>10t</td>
  <td><i><code>op</code></i> +AA</td>
  <td>goto</td>
</tr>
<tr>
  <td>ØØ|<i>op</i> AAAA</td>
  <td>20t</td>
  <td><i><code>op</code></i> +AAAA</td>
  <td>goto/16</td>
</tr>
<tr>
  <td rowspan="5">AA|<i>op</i> BBBB</td>
  <td>22x</td>
  <td><i><code>op</code></i> vAA, vBBBB</td>
  <td>&nbsp;</td>
</tr>
<tr>
  <td>21t</td>
  <td><i><code>op</code></i> vAA, +BBBB</td>
  <td>&nbsp;</td>
</tr>
<tr>
  <td>21s</td>
  <td><i><code>op</code></i> vAA, #+BBBB</td>
  <td>&nbsp;</td>
</tr>
<tr>
  <td>21h</td>
  <td><i><code>op</code></i> vAA, #+BBBB0000<br/>
    <i><code>op</code></i> vAA, #+BBBB000000000000
  </td>
  <td>&nbsp;</td>
</tr>
<tr>
  <td>21c</td>
  <td><i><code>op</code></i> vAA, type@BBBB<br/>
    <i><code>op</code></i> vAA, field@BBBB<br/>
    <i><code>op</code></i> vAA, string@BBBB
  </td>
  <td>check-cast<br/>
    const-class<br/>
    const-string
  </td>
</tr>
<tr>
  <td rowspan="2">AA|<i>op</i> CC|BB</td>
  <td>23x</td>
  <td><i><code>op</code></i> vAA, vBB, vCC</td>
  <td>&nbsp;</td>
</tr>
<tr>
  <td>22b</td>
  <td><i><code>op</code></i> vAA, vBB, #+CC</td>
  <td>&nbsp;</td>
</tr>
<tr>
  <td rowspan="4">B|A|<i>op</i> CCCC</td>
  <td>22t</td>
  <td><i><code>op</code></i> vA, vB, +CCCC</td>
  <td>&nbsp;</td>
</tr>
<tr>
  <td>22s</td>
  <td><i><code>op</code></i> vA, vB, #+CCCC</td>
  <td>&nbsp;</td>
</tr>
<tr>
  <td>22c</td>
  <td><i><code>op</code></i> vA, vB, type@CCCC<br/>
    <i><code>op</code></i> vA, vB, field@CCCC
  </td>
  <td>instance-of</td>
</tr>
<tr>
  <td>22cs</td>
  <td><i><code>op</code></i> vA, vB, fieldoff@CCCC</td>
  <td><i>(suggested format for statically linked field access instructions of
    format 22c)</i>
  </td>
</tr>
<tr>
  <td>ØØ|<i>op</i> AAAA<sub>lo</sub> AAAA<sub>hi</sub></td>
  <td>30t</td>
  <td><i><code>op</code></i> +AAAAAAAA</td>
  <td>goto/32</td>
</tr>
<tr>
  <td>ØØ|<i>op</i> AAAA BBBB</td>
  <td>32x</td>
  <td><i><code>op</code></i> vAAAA, vBBBB</td>
  <td>&nbsp;</td>
</tr>
<tr>
  <td rowspan="3">AA|<i>op</i> BBBB<sub>lo</sub> BBBB<sub>hi</sub></td>
  <td>31i</td>
  <td><i><code>op</code></i> vAA, #+BBBBBBBB</td>
  <td>&nbsp;</td>
</tr>
<tr>
  <td>31t</td>
  <td><i><code>op</code></i> vAA, +BBBBBBBB</td>
  <td>&nbsp;</td>
</tr>
<tr>
  <td>31c</td>
  <td><i><code>op</code></i> vAA, string@BBBBBBBB</td>
  <td>const-string/jumbo</td>
</tr>
<tr>
  <td>B|A|<i>op</i> CCCC G|F|E|D</td>
  <td>35c</td>
  <td><i>[<code>B=5</code>] <code>op</code></i> {vD, vE, vF, vG, vA},
    meth@CCCC<br/>
    <i>[<code>B=5</code>] <code>op</code></i> {vD, vE, vF, vG, vA},
    type@CCCC<br/>
    <i>[<code>B=4</code>] <code>op</code></i> {vD, vE, vF, vG},
    <i><code>kind</code></i>@CCCC<br/>
    <i>[<code>B=3</code>] <code>op</code></i> {vD, vE, vF},
    <i><code>kind</code></i>@CCCC<br/>
    <i>[<code>B=2</code>] <code>op</code></i> {vD, vE},
    <i><code>kind</code></i>@CCCC<br/>
    <i>[<code>B=1</code>] <code>op</code></i> {vD},
    <i><code>kind</code></i>@CCCC<br/>
    <i>[<code>B=0</code>] <code>op</code></i> {},
    <i><code>kind</code></i>@CCCC
  </td>
  <td>&nbsp;</td>
</tr>
<tr>
  <td>B|A|<i>op</i> CCCC G|F|E|D</td>
  <td>35ms</td>

  <td><i>[<code>B=5</code>] <code>op</code></i> {vD, vE, vF, vG, vA},
    vtaboff@CCCC<br/>
    <i>[<code>B=4</code>] <code>op</code></i> {vD, vE, vF, vG},
    vtaboff@CCCC<br/>
    <i>[<code>B=3</code>] <code>op</code></i> {vD, vE, vF},
    vtaboff@CCCC<br/>
    <i>[<code>B=2</code>] <code>op</code></i> {vD, vE},
    vtaboff@CCCC<br/>
    <i>[<code>B=1</code>] <code>op</code></i> {vD},
    vtaboff@CCCC<br/>
  </td>
  <td><i>(suggested format for statically linked <code>invoke-virtual</code>
    and <code>invoke-super</code> instructions of format 35c)</i>
  </td>
</tr>
<tr>
  <td>B|A|<i>op</i> DDCC H|G|F|E</td>
  <td>35fs</td>
  <td><i>[<code>B=5</code>] <code>op</code></i> {vE, vF, vG, vH, vA},
    vtaboff@CC, iface@DD<br/>
    <i>[<code>B=4</code>] <code>op</code></i> {vE, vF, vG, vH},
    vtaboff@CC, iface@DD<br/>
    <i>[<code>B=3</code>] <code>op</code></i> {vE, vF, vG},
    vtaboff@CC, iface@DD<br/>
    <i>[<code>B=2</code>] <code>op</code></i> {vE, vF},
    vtaboff@CC, iface@DD<br/>
    <i>[<code>B=1</code>] <code>op</code></i> {vE},
    vtaboff@CC, iface@DD<br/>
  </td>
  <td><i>(suggested format for statically linked <code>invoke-interface</code>
    instructions of format 35c)</i>
  </td>
</tr>
<tr>
  <td>AA|<i>op</i> BBBB CCCC</td>
  <td>3rc</td>
  <td><i><code>op</code></i> {vCCCC .. vNNNN}, meth@BBBB<br/>
    <i><code>op</code></i> {vCCCC .. vNNNN}, type@BBBB<br/>
    <p><i>(where <code>NNNN = CCCC+AA-1</code>, that is <code>A</code>
    determines the count <code>0..255</code>, and <code>C</code>
    determines the first register)</i></p>
  </td>
  <td>&nbsp;</td>
</tr>
<tr>
  <td>AA|<i>op</i> BBBB CCCC</td>
  <td>3rms</td>
  <td><i><code>op</code></i> {vCCCC .. vNNNN}, vtaboff@BBBB<br/>
    <p><i>(where <code>NNNN = CCCC+AA-1</code>, that is <code>A</code>
    determines the count <code>0..255</code>, and <code>C</code>
    determines the first register)</i></p>
  </td>
  <td><i>(suggested format for statically linked <code>invoke-virtual</code>
    and <code>invoke-super</code> instructions of format <code>3rc</code>)</i>
  </td>
</tr>
<tr>
  <td>AA|<i>op</i> CCBB DDDD</td>
  <td>3rfs</td>
  <td><i><code>op</code></i> {vDDDD .. vNNNN}, vtaboff@BB,
    iface@CC<br/>
    <p><i>(where <code>NNNN = DDDD+AA-1</code>, that is <code>A</code>
    determines the count <code>0..255</code>, and <code>D</code>
    determines the first register)</i></p>
  </td>
  <td><i>(suggested format for statically linked <code>invoke-interface</code>
    instructions of format <code>3rc</code>)</i>
  </td>
</tr>
<tr>
  <td>AA|<i>op</i> BBBB<sub>lo</sub> BBBB BBBB BBBB<sub>hi</sub></td>
  <td>51l</td>
  <td><i><code>op</code></i> vAA, #+BBBBBBBBBBBBBBBB</td>
  <td>const-wide</td>
</tr>
</tbody>
</table>

</body>
</html>