page.title=Dalvik
pdk.version=1.0
doc.type=porting
@jd:body

<div id="qv-wrapper">
<div id="qv">
<h2>In this document</h2>
<a name="toc"/>
<ul>
<li><a href="#dalvikCoreLibraries">Core Libraries</a></li>
<li><a href="#dalvikJNICallBridge">JNI Call Bridge</a></li>
<li><a href="#dalvikInterpreter">Interpreter</a></li>
</ul>
</div>
</div>

<p>
The Dalvik virtual machine is intended to run on a variety of platforms.
The baseline system is expected to be a variant of UNIX (Linux, BSD, Mac
OS X) running the GNU C compiler.  Little-endian CPUs have been exercised
the most heavily, but big-endian systems are explicitly supported.
</p><p>
There are two general categories of work: porting to a Linux system
with a previously unseen CPU architecture, and porting to a different
operating system.  This document covers the former.
</p>


<a name="dalvikCoreLibraries"></a><h3>Core Libraries</h3>

<p>
The native code in the core libraries (chiefly <code>dalvik/libcore</code>,
but also <code>dalvik/vm/native</code>) is written in C/C++ and is expected
to work without modification in a Linux environment.  Much of the code
comes directly from the Apache Harmony project.
</p><p>
The core libraries pull in code from many other projects, including
OpenSSL, zlib, and ICU.  These will also need to be ported before the VM
can be used.
</p>


<a name="dalvikJNICallBridge"></a><h3>JNI Call Bridge</h3>

<p>
Most of the Dalvik VM runtime is written in portable C.  The one
non-portable component of the runtime is the JNI call bridge.  Simply put,
this converts an array of integers into function arguments of various
types, and calls a function.  This must be done according to the C calling
conventions for the platform.  The task could be as simple as pushing all
of the arguments onto the stack, or involve complex rules for register
assignment and stack alignment.
</p><p>
To ease porting to new platforms, the <a href="http://sourceware.org/libffi/">
open-source FFI library</a> (Foreign Function Interface) is used when a
custom bridge is unavailable.  FFI is not as fast as a native implementation,
and the optional performance improvements it does offer are not used, so
writing a replacement is a good first step.
</p><p>
The code lives in <code>dalvik/vm/arch/*</code>, with the FFI-based version
in the "generic" directory.  There are two source files for each architecture.
One defines the call bridge itself:
</p><p><blockquote>
<code>void dvmPlatformInvoke(void* pEnv, ClassObject* clazz, int argInfo,
int argc, const u4* argv, const char* signature, void* func,
JValue* pReturn)</code>
</blockquote></p><p>
This will invoke a C/C++ function declared:
</p><p><blockquote>
    <code>return_type func(JNIEnv* pEnv, Object* this [, <i>args</i>])<br></code>
</blockquote>or (for a "static" method):<blockquote>
    <code>return_type func(JNIEnv* pEnv, ClassObject* clazz [, <i>args</i>])</code>
</blockquote></p><p>
The role of <code>dvmPlatformInvoke</code> is to convert the values in
<code>argv</code> into C-style calling conventions, call the method, and
then place the return value into <code>pReturn</code> (a union that holds
all of the basic JNI types).  The code may use the method signature
(a DEX "shorty" signature, with one character for the return type and one
per argument) to determine how to handle the values.
</p><p>
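As a rough illustration of the idea, the sketch below handles only native
functions whose arguments are all 32-bit integers and whose return type is
<code>int</code>.  The names <code>toyInvoke</code>, <code>ToyValue</code>,
<code>ToyFunc</code>, and <code>toyAdd</code> are hypothetical and are not part
of Dalvik.  A real bridge must also handle 64-bit, floating-point, and
reference arguments according to the platform's calling conventions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for JValue: only a 32-bit integer result. */
typedef union { int32_t i; } ToyValue;

/* Generic function pointer; cast to the real type before calling. */
typedef void (*ToyFunc)(void);

/* Native function shapes with 0..3 int arguments after env and receiver. */
typedef int32_t (*Fn0)(void*, void*);
typedef int32_t (*Fn1)(void*, void*, int32_t);
typedef int32_t (*Fn2)(void*, void*, int32_t, int32_t);
typedef int32_t (*Fn3)(void*, void*, int32_t, int32_t, int32_t);

/*
 * Toy bridge: checks the "shorty" signature (return type first, then one
 * character per argument), then lets the C compiler place the values from
 * argv according to the platform calling convention.
 * Returns 0 on success, -1 if the signature is not supported by this sketch.
 */
int toyInvoke(void* env, void* thisOrClass, int argc, const uint32_t* argv,
              const char* shorty, ToyFunc func, ToyValue* pReturn)
{
    assert(shorty != NULL && func != NULL && pReturn != NULL);
    if (argc < 0 || argc > 3 || shorty[0] != 'I')
        return -1;
    for (int i = 0; i < argc; i++) {
        if (shorty[1 + i] != 'I')       /* also stops at an early '\0' */
            return -1;
    }
    if (shorty[1 + argc] != '\0')
        return -1;

    switch (argc) {
    case 0:
        pReturn->i = ((Fn0) func)(env, thisOrClass);
        break;
    case 1:
        pReturn->i = ((Fn1) func)(env, thisOrClass, (int32_t) argv[0]);
        break;
    case 2:
        pReturn->i = ((Fn2) func)(env, thisOrClass, (int32_t) argv[0],
                                  (int32_t) argv[1]);
        break;
    default:
        pReturn->i = ((Fn3) func)(env, thisOrClass, (int32_t) argv[0],
                                  (int32_t) argv[1], (int32_t) argv[2]);
        break;
    }
    return 0;
}

/* Example native method with shorty "III": int add(int, int). */
int32_t toyAdd(void* env, void* clazz, int32_t a, int32_t b)
{
    (void) env;
    (void) clazz;
    return a + b;
}
```

Enumerating function shapes like this does not scale, which is why real
bridges are written in assembly or built on FFI.
</p><p>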
The other source file involved here defines a 32-bit "hint".  The hint
is computed when the method's class is loaded, and passed in as the
"argInfo" argument.  The hint can be used to avoid scanning the ASCII
method signature for things like the return type, total argument size,
or inter-argument 64-bit alignment restrictions.
</p>
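<p>
The layout of the hint is up to each architecture.  As a purely hypothetical
example, the sketch below packs a return-type code into the top four bits and
the size of the declared arguments, in 32-bit words, into the remaining bits.
The bit layout and the names <code>toyComputeHint</code>,
<code>toyHintReturnCode</code>, and <code>toyHintArgWords</code> are
assumptions made for illustration, not the format used by any Dalvik port.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical return-type codes stored in the top four bits. */
enum { TOY_RET_VOID = 0, TOY_RET_32 = 1, TOY_RET_64 = 2 };

#define TOY_RETURN_SHIFT 28
#define TOY_SIZE_MASK    0x0fffffffu

/*
 * Scan a DEX "shorty" signature once and summarize it in 32 bits.
 * 'J' (long) and 'D' (double) occupy two 32-bit words; every other
 * argument type occupies one.
 */
uint32_t toyComputeHint(const char* shorty)
{
    uint32_t retCode;
    uint32_t words = 0;

    assert(shorty != NULL && shorty[0] != '\0');
    switch (shorty[0]) {
    case 'V':
        retCode = TOY_RET_VOID;
        break;
    case 'J':
    case 'D':
        retCode = TOY_RET_64;
        break;
    default:
        retCode = TOY_RET_32;
        break;
    }
    for (const char* p = shorty + 1; *p != '\0'; p++)
        words += (*p == 'J' || *p == 'D') ? 2 : 1;

    return (retCode << TOY_RETURN_SHIFT) | (words & TOY_SIZE_MASK);
}

/* Accessors a bridge could use instead of rescanning the signature. */
uint32_t toyHintReturnCode(uint32_t hint) { return hint >> TOY_RETURN_SHIFT; }
uint32_t toyHintArgWords(uint32_t hint)   { return hint & TOY_SIZE_MASK; }
```

The scan happens once, at class load time, and the bridge then needs only a
shift and a mask on each call.
</p>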

<a name="dalvikInterpreter"></a><h3>Interpreter</h3>

<p>
The Dalvik runtime includes two interpreters, labeled "portable" and "fast".
The portable interpreter is largely contained within a single C function,
and should compile on any system that supports gcc.  (If you don't have gcc,
you may need to disable the "threaded" execution model, which relies on
gcc's "goto table" implementation; look for the THREADED_INTERP define.)
</p><p>
The fast interpreter uses hand-coded assembly fragments.  If none are
available for the current architecture, the build system will create an
interpreter out of C "stubs".  The resulting "all stubs" interpreter is
quite a bit slower than the portable interpreter, making "fast" something
of a misnomer.
</p><p>
The fast interpreter is enabled by default.  On platforms without native
support, you may want to switch to the portable interpreter.  This can
be controlled with the <code>dalvik.vm.execution-mode</code> system
property.  For example, if you run:
</p><p><blockquote>
<code>adb shell "echo dalvik.vm.execution-mode = int:portable >> /data/local.prop"</code>
</blockquote></p><p>
and reboot, the Android app framework will start the VM with the portable
interpreter enabled.
</p>


<h3>Mterp Interpreter Structure</h3>

<p>
There may be significant performance advantages to rewriting the
interpreter core in assembly language, using architecture-specific
optimizations.  In Dalvik this can be done one instruction at a time.
</p><p>
The simplest way to implement an interpreter is to have a large "switch"
statement.  After each instruction is handled, the interpreter returns to
the top of the loop, fetches the next instruction, and jumps to the
appropriate label.
</p><p>
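A minimal sketch of a switch-based interpreter is shown below.  It uses a
made-up one-byte instruction set that operates on a single accumulator, which
is far simpler than Dalvik bytecode.  The opcode names and
<code>toyRunSwitch</code> are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical toy opcodes; these are not Dalvik opcodes. */
enum { TOY_OP_HALT = 0, TOY_OP_INC = 1, TOY_OP_DOUBLE = 2 };

/* Runs the program until TOY_OP_HALT; returns -1 on an unknown opcode. */
int32_t toyRunSwitch(const uint8_t* code)
{
    int32_t acc = 0;
    const uint8_t* pc = code;

    assert(code != NULL);
    for (;;) {
        /* Fetch the next opcode, then dispatch from the top of the loop. */
        switch (*pc++) {
        case TOY_OP_INC:
            acc += 1;
            break;              /* back to the top of the loop */
        case TOY_OP_DOUBLE:
            acc *= 2;
            break;
        case TOY_OP_HALT:
            return acc;
        default:
            return -1;
        }
    }
}
```

Every handler ends by branching back to the single fetch-and-dispatch point at
the top of the loop.
</p><p>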
An improvement on this is called "threaded" execution.  The instruction
fetch and dispatch are included at the end of every instruction handler.
This makes the interpreter a little larger overall, but you get to avoid
the (potentially expensive) branch back to the top of the switch statement.
</p><p>
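The same hypothetical accumulator machine can be written in threaded style
with a goto table.  This sketch relies on the "labels as values" extension
(<code>&&label</code> and <code>goto *</code>) provided by gcc and compatible
compilers, so it is not portable C.  The names are again made up for
illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical toy opcodes; these are not Dalvik opcodes. */
enum { TOY_OP_HALT = 0, TOY_OP_INC = 1, TOY_OP_DOUBLE = 2, TOY_NUM_OPS = 3 };

/* Runs the program until TOY_OP_HALT; returns -1 on an unknown opcode. */
int32_t toyRunThreaded(const uint8_t* code)
{
    /* The goto table: one handler address per opcode (gcc extension). */
    static void* handlers[TOY_NUM_OPS] = { &&op_halt, &&op_inc, &&op_double };
    int32_t acc = 0;
    const uint8_t* pc = code;

    assert(code != NULL);

/* Fetch and dispatch, repeated at the end of every handler. */
#define TOY_DISPATCH()                          \
    do {                                        \
        if (*pc >= TOY_NUM_OPS)                 \
            return -1;                          \
        goto *handlers[*pc++];                  \
    } while (0)

    TOY_DISPATCH();

op_inc:
    acc += 1;
    TOY_DISPATCH();

op_double:
    acc *= 2;
    TOY_DISPATCH();

op_halt:
    return acc;

#undef TOY_DISPATCH
}
```

There is no loop here.  Each handler jumps straight to the next one, at the
cost of one table lookup per instruction.
</p><p>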
Dalvik mterp goes one step further, using a computed goto instead of a goto
table.  Instead of looking up the address in a table, which requires an
extra memory fetch on every instruction, mterp multiplies the opcode number
by a fixed value.  By default, each handler is allowed 64 bytes of space.
</p><p>
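The address arithmetic can be sketched in C as follows, assuming the handlers
are laid out back to back starting at some base address.  The function and
macro names are hypothetical.  In mterp the equivalent computation is done in
assembly.

```c
#include <assert.h>
#include <stdint.h>

/* 1 << 6 == 64 bytes reserved for each handler. */
#define TOY_HANDLER_SIZE_BITS 6

/*
 * Compute where the handler for 'opcode' starts, given the address of
 * the handler for opcode 0.  No table lookup (memory fetch) is needed:
 * the multiply by 64 is a single shift.
 */
uintptr_t toyHandlerAddress(uintptr_t handlerBase, uint8_t opcode)
{
    return handlerBase + ((uintptr_t) opcode << TOY_HANDLER_SIZE_BITS);
}
```

For example, with a base address of <code>0x1000</code>, the handler for
opcode 1 starts at <code>0x1040</code> and the handler for opcode 255 starts
at <code>0x4fc0</code>.
</p><p>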
Not all handlers fit in 64 bytes.  Those that don't can have subroutines
or simply continue on to additional code outside the basic space.  Some of
this is handled automatically by Dalvik, but there's no portable way to detect
overflow of a 64-byte handler until the VM starts executing.
</p><p>
The choice of 64 bytes is somewhat arbitrary, but has worked out well for
ARM and x86.
</p><p>
In the course of development it's useful to have C and assembly
implementations of each handler, and be able to flip back and forth
between them when hunting problems down.  In mterp this is relatively
straightforward.  You can always see the files being fed to the compiler
and assembler for your platform by looking in the
<code>dalvik/vm/mterp/out</code> directory.
</p><p>
The interpreter sources live in <code>dalvik/vm/mterp</code>.  If you
haven't yet, you should read <code>dalvik/vm/mterp/README.txt</code> now.
</p>


<h3>Getting Started With Mterp</h3>

<p>
Getting started:
<ol>
<li>Decide on the name of your architecture.  For the sake of discussion,
let's call it <code>myarch</code>.
<li>Make a copy of <code>dalvik/vm/mterp/config-allstubs</code> to
<code>dalvik/vm/mterp/config-myarch</code>.
<li>Create a <code>dalvik/vm/mterp/myarch</code> directory to hold your
source files.
<li>Add <code>myarch</code> to the list in
<code>dalvik/vm/mterp/rebuild.sh</code>.
<li>Make sure <code>dalvik/vm/Android.mk</code> will find the files for
your architecture.  If <code>$(TARGET_ARCH)</code> is configured this
will happen automatically.
</ol>
</p><p>
You now have the basic framework in place.  Whenever you make a change, you
need to perform two steps: regenerate the mterp output, and build the
core VM library.  (It's two steps because we didn't want the build system
to require Python 2.5, which, incidentally, you do need to have installed.)
<ol>
<li>In the <code>dalvik/vm/mterp</code> directory, regenerate the contents
of the files in <code>dalvik/vm/mterp/out</code> by executing
<code>./rebuild.sh</code>.  Note there are two files, one in C and one
in assembly.
<li>In the <code>dalvik</code> directory, regenerate the
<code>libdvm.so</code> library with <code>mm</code>.  You can also use
<code>make libdvm</code> from the top of the tree.
</ol>
</p><p>
This will leave you with an updated <code>libdvm.so</code>, which can be pushed
out to a device with <code>adb sync</code> or <code>adb push</code>.  If you're
using the emulator, you need to add <code>make snod</code> (System image,
NO Dependency check) to rebuild the system image file.  You should not
need to do a top-level "make" and rebuild the dependent binaries.
</p><p>
At this point you have an "all stubs" interpreter.  You can see how it
works by examining <code>dalvik/vm/mterp/cstubs/entry.c</code>.  The
code runs in a loop, pulling out the next opcode, and invoking the
handler through a function pointer.  Each handler takes a "glue" argument
that contains all of the useful state.
</p><p>
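The shape of such a loop is sketched below.  This is not the code from
<code>entry.c</code>: the <code>ToyGlue</code> structure, the three toy
opcodes, and the handler names are hypothetical, and the real "glue" structure
holds far more state.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical "glue": all interpreter state a handler needs. */
typedef struct ToyGlue {
    const uint8_t* pc;      /* next instruction to execute */
    int32_t acc;            /* the toy machine's only register */
    bool done;              /* set by a handler to stop the loop */
} ToyGlue;

/* Every handler is an ordinary C function that takes the glue. */
typedef void (*ToyHandler)(ToyGlue* glue);

static void toyOpHalt(ToyGlue* glue)   { glue->done = true; }
static void toyOpInc(ToyGlue* glue)    { glue->acc += 1; glue->pc++; }
static void toyOpDouble(ToyGlue* glue) { glue->acc *= 2; glue->pc++; }

/* Handler table indexed by opcode: 0 = halt, 1 = inc, 2 = double. */
static const ToyHandler toyHandlers[] = { toyOpHalt, toyOpInc, toyOpDouble };

/* Runs the program until the halt opcode; returns -1 on an unknown opcode. */
int32_t toyRunStubs(const uint8_t* code)
{
    ToyGlue glue = { code, 0, false };

    assert(code != NULL);
    while (!glue.done) {
        uint8_t opcode = *glue.pc;          /* pull out the next opcode */
        if (opcode >= sizeof(toyHandlers) / sizeof(toyHandlers[0]))
            return -1;
        toyHandlers[opcode](&glue);         /* call through the pointer */
    }
    return glue.acc;
}
```

Each instruction costs a full function call and return, which is consistent
with the "all stubs" interpreter being slower than the portable one.
</p><p>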
Your goal is to replace the entry method, exit method, and each individual
instruction with custom implementations.  The first thing you need to do
is create an entry function that calls the handler for the first instruction.
After that, the instructions chain together, so you don't need a loop.
(Look at the ARM or x86 implementation to see how they work.)
</p><p>
Once you have that, you need something to jump to.  You can't branch
directly to the C stub because it's expecting to be called with a "glue"
argument and then return.  We need a C stub "wrapper" that does the
setup and jumps directly to the next handler.  We write this in assembly
and then add it to the config file definition.
</p><p>
To see how this works, create a file called
<code>dalvik/vm/mterp/myarch/stub.S</code> that contains one line:
<pre>
/* stub for ${opcode} */
</pre>
Then, in <code>dalvik/vm/mterp/config-myarch</code>, add this below the
<code>handler-size</code> directive:
<pre>
# source for the instruction table stub
asm-stub myarch/stub.S
</pre>
</p><p>
Regenerate the sources with <code>./rebuild.sh</code>, and take a look
inside <code>dalvik/vm/mterp/out/InterpAsm-myarch.S</code>.  You should
see 256 copies of the stub function in a single large block after the
<code>dvmAsmInstructionStart</code> label.  The <code>stub.S</code>
code will be used anywhere you don't provide an assembly implementation.
</p><p>
Note that each block begins with a <code>.balign 64</code> directive.
This is what pads each handler out to 64 bytes.  Note also that the
<code>${opcode}</code> text changed into an opcode name, which should
be used to call the C implementation (<code>dvmMterp_${opcode}</code>).
</p><p>
The actual contents of <code>stub.S</code> are up to you to define.
See <code>entry.S</code> and <code>stub.S</code> in the <code>armv5te</code>
or <code>x86</code> directories for working examples.
</p><p>
If you're working on a variation of an existing architecture, you may be
able to use most of the existing code and just provide replacements for
a few instructions.  Look at the <code>armv4t</code> implementation as
an example.
</p>


<h3>Replacing Stubs</h3>

<p>
There are roughly 230 Dalvik opcodes, including some that are inserted by
<a href="dexopt.html">dexopt</a> and aren't described in the
<a href="dalvik-bytecode.html">Dalvik bytecode</a> documentation.  Each
one must perform the appropriate actions, fetch the next opcode, and
branch to the next handler.  The actions performed by the assembly version
must exactly match those performed by the C version (in
<code>dalvik/vm/mterp/c/OP_*</code>).
</p><p>
You can customize the set of "optimized" instructions for your platform,
because optimized DEX files are not expected to work on multiple devices.
Adding, removing, or redefining instructions is beyond the scope of this
document, and for simplicity it's best to stick with the basic set defined
by the portable interpreter.
</p><p>
Once you have written a handler that looks like it should work, add
it to the config file.  For example, suppose we have a working version
of <code>OP_NOP</code>.  For demonstration purposes, fake it for now by
putting this into <code>dalvik/vm/mterp/myarch/OP_NOP.S</code>:
<pre>
/* This is my NOP handler */
</pre>
</p><p>
Then, in the <code>op-start</code> section of <code>config-myarch</code>, add:
<pre>
    op OP_NOP myarch
</pre>
</p><p>
This tells the generation script to use the assembly version from the
<code>myarch</code> directory instead of the C version from the <code>c</code>
directory.
</p><p>
Execute <code>./rebuild.sh</code>.  Look at <code>InterpAsm-myarch.S</code>
and <code>InterpC-myarch.c</code> in the <code>out</code> directory.  You
will see that the <code>OP_NOP</code> stub wrapper has been replaced with our
new code in the assembly file, and the C stub implementation is no longer
included.
</p><p>
As you implement instructions, the C version and corresponding stub wrapper
will disappear from the output files.  Eventually you will have a 100%
assembly interpreter.
</p>


<h3>Interpreter Switching</h3>

<p>
The Dalvik VM actually includes a third interpreter implementation: the debug
interpreter.  This is a variation of the portable interpreter that includes
support for debugging and profiling.
</p><p>
When a debugger attaches, or a profiling feature is enabled, the VM
will switch interpreters at a convenient point.  This is done at the
same time as the GC safe point check: on a backward branch, a method
return, or an exception throw.  Similarly, when the debugger detaches
or profiling is discontinued, execution transfers back to the "fast" or
"portable" interpreter.
</p><p>
Your entry function needs to test the "entryPoint" value in the "glue"
pointer to determine where execution should begin.  Your exit function
will need to return a boolean that indicates whether the interpreter is
exiting (because we reached the "bottom" of a thread stack) or wants to
switch to the other implementation.
</p><p>
See the <code>entry.S</code> file in <code>x86</code> or <code>armv5te</code>
for examples.
</p>


<h3>Testing</h3>

<p>
A number of VM tests can be found in <code>dalvik/tests</code>.  The most
useful during interpreter development is <code>003-omnibus-opcodes</code>,
which tests many different instructions.
</p><p>
The basic invocation is:
<pre>
$ cd dalvik/tests
$ ./run-test 003
</pre>
</p><p>
This will run test 003 on an attached device or emulator.  If you suspect
the test may be faulty, you can run it against your desktop VM by
specifying <code>--reference</code>.  You can also use
<code>--portable</code> and <code>--fast</code> to explicitly specify
one Dalvik interpreter or the other.
</p><p>
Some instructions are replaced by <code>dexopt</code>, notably when
"quickening" field accesses and method invocations.  To ensure
that you are testing the basic form of the instruction, add the
<code>--no-optimize</code> option.
</p><p>
There is no built-in instruction tracing mechanism.  If you want
to know for sure that your implementation of an opcode handler
is being used, the easiest approach is to insert a "printf"
call.  For an example, look at <code>common_squeak</code> in
<code>dalvik/vm/mterp/armv5te/footer.S</code>.
</p><p>
At some point you need to ensure that debuggers and profiling work with
your interpreter.  The easiest way to do this is to simply connect a
debugger or toggle profiling.  (A future test suite may include some
tests for this.)
</p>