<html>
<head>
<title>Dalvik Bytecode Verifier Notes</title>
</head>

<body>
<h1>Dalvik Bytecode Verifier Notes</h1>

<p>
The bytecode verifier in the Dalvik VM attempts to provide the same sorts
of checks and guarantees that other popular virtual machines do. We
perform generally the same set of checks as are described in <cite>The Java
Virtual Machine Specification, Second Edition</cite>, including the updates
planned for the Third Edition.

<p>
Verification can be enabled for all classes, disabled for all, or enabled
only for "remote" (non-bootstrap) classes. It should be performed for any
class that will be processed with the DEX optimizer, and in fact the
default VM behavior is to only optimize verified classes.


<h2>Why Verify?</h2>

<p>
The verification process adds time to the build and to
the installation of new applications. It's fairly quick for app-sized
DEX files, but rather slow for the big "core" and "framework" files.
Why do it at all, when our system relies on UNIX processes for security?
<p>
<ol>
    <li>Optimizations. The interpreter can ignore a lot of potential
    error cases because the verifier guarantees that they are impossible.
    Also, we can optimize the DEX file more aggressively if we start
    with a stronger set of assumptions about the bytecode.
    <li>"Exact" GC. The work performed during verification has significant
    overlap with the work required to compute register use maps for exact
    GC. Improper register use, caught by the verifier, could lead to
    subtle problems with an "exact" GC.
    <li>Intra-application security. If an app wants to download bits
    of interpreted code over the network and execute them, it can safely
    do so using well-established security mechanisms.
    <li>3rd party app failure analysis.
    We have no way to control the
    tools and post-processing utilities that external developers employ,
    so when we get bug reports with a weird exception or native crash
    it's very helpful to start with the assumption that the bytecode
    is valid.
</ol>


<h2>Verifier Differences</h2>

<p>
There are a few checks that the Dalvik bytecode verifier does not perform,
because they're not relevant. For example:
<ul>
    <li>Type restrictions on constant pool references are not enforced,
    because Dalvik does not have a pool of typed constants. (Dalvik
    uses a simple index into type-specific pools.)
    <li>Verification of the operand stack size is not performed, because
    Dalvik does not have an operand stack.
    <li>Limitations on <code>jsr</code> and <code>ret</code> do not apply,
    because Dalvik doesn't support subroutines.
</ul>

In some cases the checks are implemented differently, e.g.:
<ul>
    <li>In a conventional VM, backward branches and exceptions are
    forbidden when a local variable holds an uninitialized reference. In
    Dalvik, the restriction was changed to mark registers as invalid when
    they hold references to the uninitialized result of a previous
    invocation of the same <code>new-instance</code> instruction.
    This solves the same problem -- trickery potentially allowing
    uninitialized objects to slip past the verifier -- without unduly
    limiting branches.
</ul>

There are also some new checks, such as:
<ul>
    <li>The <code>move-exception</code> instruction can only appear as
    the first instruction in an exception handler.
    <li>The <code>move-result*</code> instructions can only appear
    immediately after an appropriate <code>invoke-*</code>
    or <code>filled-new-array</code> instruction.
</ul>

<p>
The Dalvik verifier is more restrictive than other VMs in one area:
type safety on sub-32-bit integer widths.
These additional restrictions
should make it impossible to, say, pass a value outside the range
[-128, 127] to a function that takes a <code>byte</code> as an argument.


<h2>Verification Failures</h2>

<p>
When the verifier rejects a class, it always throws a VerifyError.
This is different in some cases from other implementations. For example,
if a class attempts to perform an illegal access on a field, the expected
behavior is to receive an IllegalAccessError at runtime the first time
the field is actually accessed. The Dalvik verifier will reject the
entire class immediately.

<p>
It's difficult to throw the error on first use in Dalvik. Possible ways
to implement this behavior include:

<ol>
<li>We could replace the invalid field access instruction with a special
instruction that generates an illegal access error, and allow class
verification to complete successfully. This type of verification must
often be deferred to first class load, rather than be performed ahead of time
during DEX optimization, which means the bytecode instructions will be
mapped read-only during verification and cannot be rewritten in place.
So this won't work.
</li>

<li>We can perform the access checks when the field/method/class is
resolved. In a typical VM implementation we would do the check when the
entry is resolved in the context of the current classfile, but our DEX
files combine multiple classfiles together, merging the field/method/class
resolution results into a single large table. Once one class successfully
resolves the field, every other class in the same DEX file would be able
to access the field. This is bad.
</li>

<li>Perform the access checks on every field/method/class access.
This adds significant overhead. This is mitigated somewhat by the DEX
optimizer, which will convert many field/method/class accesses into a
simpler form after performing the access check.
However, not all accesses
can be optimized (e.g. accesses to classes unknown at dexopt time),
and we don't currently have an optimized form of certain instructions
(notably static field operations).
</li>
</ol>

<p>
Other implementations are possible, but they all involve allocating
some amount of additional memory or spending additional cycles
on non-DEX-optimized instructions. We don't want to throw an
IllegalAccessError at verification time, since that would indicate that
access to the class being verified was illegal.
<p>
One approach that might be worth pursuing: for situations like illegal
accesses, the verifier makes an in-RAM private copy of the method, and
alters the instructions there. The class object is altered to point at
the new copy of the instructions. This requires minimal memory overhead
and provides a better experience for developers.

<p>
The VerifyError is accompanied by detailed, if somewhat cryptic,
information in the log file. From this it's possible to determine the
exact instruction that failed, and the reason for the failure. We can
also construct the VerifyError with an IllegalAccessError passed in as
the cause.

<address>Copyright © 2008 The Android Open Source Project</address>

</body>
</html>