The bytecode verifier in the Dalvik VM attempts to provide the same sorts of checks and guarantees that other popular virtual machines do. We perform generally the same set of checks as are described in _The Java Virtual Machine Specification, Second Edition_, including the updates planned for the Third Edition.
Verification can be enabled for all classes, disabled for all, or enabled only for "remote" (non-bootstrap) classes. It should be performed for any class that will be processed with the DEX optimizer, and in fact the default VM behavior is to only optimize verified classes.
The verification process adds time to the build and to the installation of new applications. It's fairly quick for app-sized DEX files, but rather slow for the big "core" and "framework" files. Why do it at all, when our system relies on UNIX processes for security?
It's also a convenient framework to deal with certain situations, notably replacement of instructions that access volatile 64-bit fields with more rigorous versions that guarantee atomicity.
There are a few checks that the Dalvik bytecode verifier does not perform, because they're not relevant. For example:
Checks on jsr and ret do not apply, because Dalvik doesn't support subroutines.
new-instance instruction. This solves the same problem -- trickery potentially allowing uninitialized objects to slip past the verifier -- without unduly limiting branches.
The move-exception instruction can only appear as the first instruction in an exception handler.
The move-result* instructions can only appear immediately after an appropriate invoke-* or filled-new-array instruction.
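The move-result* placement rule can be sketched as a simple scan over a method's opcodes. This is only an illustration: the class name, the method name, and the representation of instructions as opcode strings are assumptions made for the example, and the sketch ignores whether the preceding instruction is "appropriate" for the particular move-result variant.

```java
import java.util.List;

// Hypothetical sketch: instructions are modeled as opcode name strings.
class MoveResultPlacement {
    // Returns the index of the first move-result* that does not directly
    // follow an invoke-* or filled-new-array instruction, or -1 if none.
    static int firstMisplaced(List<String> opcodes) {
        for (int i = 0; i < opcodes.size(); i++) {
            if (!opcodes.get(i).startsWith("move-result")) {
                continue;
            }
            // A move-result* at index 0 has no predecessor at all.
            String prev = (i == 0) ? "" : opcodes.get(i - 1);
            boolean ok = prev.startsWith("invoke-")
                    || prev.startsWith("filled-new-array");
            if (!ok) {
                return i;
            }
        }
        return -1;
    }
}
```

A sequence such as invoke-static, move-result passes the scan, while invoke-static, nop, move-result is flagged at the move-result.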
The VM is permitted but not required to enforce "structured locking" constraints, which are designed to ensure that, when a method returns, all monitors locked by the method have been unlocked an equal number of times. This is not currently implemented.
The Dalvik verifier is more restrictive than other VMs in one area: type safety on sub-32-bit integer widths. These additional restrictions should make it impossible to, say, pass a value outside the range [-128, 127] to a function that takes a byte as an argument.
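At the source level the Java compiler already insists on an explicit narrowing cast before an int can be passed as a byte, so compiler-generated code should satisfy this restriction. The small example below (class and method names are hypothetical) shows what the cast does to an out-of-range value.

```java
// Illustrative only: the names here are not part of any Dalvik API.
class ByteArgDemo {
    // Takes a byte and widens it back to an int (sign-extending).
    static int widen(byte b) {
        return b;
    }

    public static void main(String[] args) {
        int wide = 200;                   // outside [-128, 127]
        // widen(wide);                   // does not compile without a cast
        int result = widen((byte) wide);  // narrowing cast wraps 200 to -56
        System.out.println(result);
    }
}
```

The value that reaches widen() is always within the byte range, because the narrowing cast discards the upper bits before the call.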
If a method locks an object with a synchronized statement, the object must be unlocked before the method returns. At the bytecode level, this means the method must execute a matching monitor-exit for every monitor-enter instruction, whether the function completes normally or abnormally. The bytecode verifier optionally enforces this.
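For compiler-generated code the matching exit on the abnormal path comes for free, since a synchronized statement releases its monitor even when the body throws. A minimal demonstration, with hypothetical class and method names, using Thread.holdsLock to observe the monitor state:

```java
// Illustrative only: shows that a synchronized block releases its monitor
// when the body completes abnormally.
class MonitorReleaseDemo {
    static final Object LOCK = new Object();

    // Throws while holding the monitor on LOCK.
    static void failWhileLocked() {
        synchronized (LOCK) {
            throw new IllegalStateException("abnormal completion");
        }
    }

    // Returns true if the current thread no longer holds LOCK after the
    // exception has propagated out of the synchronized block.
    static boolean releasedAfterException() {
        try {
            failWhileLocked();
        } catch (IllegalStateException expected) {
            // The exception is expected; we only care about the monitor.
        }
        return !Thread.holdsLock(LOCK);
    }

    public static void main(String[] args) {
        System.out.println(releasedAfterException());
    }
}
```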
The verifier uses a fairly simple-minded model. If you enter a monitor held in register N, you can exit the monitor using register N or any subsequently-made copies of register N. The verifier does not attempt to identify previously-made copies, track loads and stores through fields, or recognize identical constant values (for example, the result values from two const-class instructions on the same class will be the same reference, but the verifier doesn't recognize this).
Further, you may only exit the monitor most recently entered. "Hand over hand" locking techniques, e.g. "lock A; lock B; unlock A; unlock B", are not allowed.
This means that there are a number of situations in which the verifier will throw an exception on code that would execute correctly at run time. This is not expected to be an issue for compiler-generated bytecode.
For implementation convenience, the maximum nesting depth of synchronized statements has been set to 32. This is not a limitation on the recursion count. The only way to trip this would be to have a single method with more than 32 nested synchronized statements, something that is unlikely to occur.
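The locking model described above can be approximated with a toy simulation. This is a sketch under stated assumptions, not the verifier's actual code: the class and method names are hypothetical, and the use of one int bit mask per register (one bit per nesting level) is an assumption of the sketch that happens to give a natural depth limit of 32.

```java
// Hypothetical simulation of the monitor-tracking rules for one method.
class MonitorModel {
    static final int MAX_DEPTH = 32;   // one bit per nesting level

    private final int[] regMask;       // per register: levels it may exit
    private int depth = 0;             // number of monitors currently held

    MonitorModel(int registerCount) {
        regMask = new int[registerCount];
    }

    // monitor-enter on 'reg': only this register is tagged, so copies made
    // before this point are not recognized later.
    boolean enter(int reg) {
        if (depth == MAX_DEPTH) {
            return false;              // nesting too deep
        }
        regMask[reg] |= 1 << depth;
        depth++;
        return true;
    }

    // move dst, src: the copy inherits whatever src is known to hold.
    void move(int dst, int src) {
        regMask[dst] = regMask[src];
    }

    // monitor-exit on 'reg': allowed only for the most recently entered
    // monitor, and only via a register tagged for that level.
    boolean exit(int reg) {
        if (depth == 0) {
            return false;
        }
        int bit = 1 << (depth - 1);
        if ((regMask[reg] & bit) == 0) {
            return false;
        }
        depth--;
        for (int i = 0; i < regMask.length; i++) {
            regMask[i] &= ~bit;        // the level is no longer held
        }
        return true;
    }
}
```

In this sketch, entering on register 0 and exiting through a copy made afterwards succeeds, while a copy made beforehand, or a "lock A; lock B; unlock A" sequence, is rejected.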
The verifier may reject a class immediately, or it may defer throwing an exception until the code is actually used. For example, if a class attempts to perform an illegal access on a field, the VM should throw an IllegalAccessError the first time the instruction is encountered. On the other hand, if a class contains an invalid bytecode, it should be rejected immediately with a VerifyError.
Immediate VerifyErrors are accompanied by detailed, if somewhat cryptic, information in the log file. From this it's possible to determine the exact instruction that failed, and the reason for the failure.
It's a bit tricky to implement deferred verification errors in Dalvik. A few approaches were considered:
In early versions of Dalvik (as found in Android 1.6 and earlier), the verifier simply regarded all problems as immediately fatal. This generally worked, but in some cases the VM was rejecting classes because of bits of code that were never used. The VerifyError itself was sometimes difficult to decipher, because it was thrown during verification rather than at the point where the problem was first noticed during execution.
The current version uses a variation of approach #1. The dexopt command works the way it did before, leaving the code untouched and flagging fully-correct classes as "pre-verified". When the VM loads a class that didn't pass pre-verification, the verifier is invoked. If a "deferrable" problem is detected, a modifiable copy of the instructions in the problematic method is made. In that copy, the troubled instruction is replaced with an "always throw" opcode, and verification continues.
In the example used earlier, an attempt to read from an inaccessible field would result in the "field get" instruction being replaced by "always throw IllegalAccessError on field X". Creating copies of method bodies requires additional heap space, but since this affects very few methods overall the memory impact should be minor.
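The copy-and-replace step can be pictured with a toy model in which a method body is an array of instruction strings. Everything here is hypothetical (the class, the method, and the "always-throw" text standing in for the real opcode); the point is only that the original instructions stay untouched while the copy carries the replacement.

```java
import java.util.Arrays;

// Hypothetical sketch: a method body is modeled as an array of strings.
class DeferredFailureSketch {
    // Returns a modified copy of 'insns' in which the instruction at
    // 'badIndex' is replaced by a stand-in "always throw" instruction.
    // The array passed in is not modified.
    static String[] replaceWithAlwaysThrow(String[] insns, int badIndex,
                                           String errorName, String target) {
        String[] copy = Arrays.copyOf(insns, insns.length);
        copy[badIndex] = "always-throw " + errorName + " " + target;
        return copy;
    }
}
```

Because only methods with a deferrable problem are copied, the extra memory in this model is proportional to the number of problematic methods rather than to the size of the DEX file.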
Copyright © 2008 The Android Open Source Project