ART "mterp" README

NOTE: Rebuilding instructions are in the "Rebuilding" section near the end
of this file.


==== Overview ====

Every configuration has a "config-*" file that controls how the sources
are generated. The sources are written into the "out" directory, where
they are picked up by the Android build system.

The best way to become familiar with the interpreter is to look at the
generated files in the "out" directory.


==== Config file format ====

The config files are parsed from top to bottom. Each line in the file
may be blank, hold a comment (line starts with '#'), or be a command.

The commands are:

  handler-style <computed-goto|jump-table>

    Specify which style of interpreter to generate. In computed-goto,
    each handler is allocated a fixed region, allowing transitions to
    be done via table-start-address + (opcode * handler-size). With
    jump-table style, handlers may be of any length, and the generated
    table is an array of pointers to the handlers. This command is required,
    and must be the first command in the config file.

  handler-size <bytes>

    Specify the size of the fixed region, in bytes. On most platforms
    this will need to be a power of 2. For jump-table implementations,
    this command is ignored.

  import <filename>

    The specified file is included immediately, in its entirety. No
    substitutions are performed. ".cpp" and ".h" files are copied to the
    C output, ".S" files are copied to the asm output.

  asm-alt-stub <filename>

    When present, this command will cause the generation of an alternate
    set of entry points (for computed-goto interpreters) or an alternate
    jump table (for jump-table interpreters).

  fallback-stub <filename>

    Specifies a file to be used for the special FALLBACK tag on the "op"
    command below. Intended to be used to transfer control to an alternate
    interpreter to single-step a not-yet-implemented opcode. Note: should
    not be used on RETURN-class instructions.

  op-start <directory>

    Indicates the start of the opcode list. Must precede any "op"
    commands. The specified directory is the default location to pull
    instruction files from.

  op <opcode> <directory>|FALLBACK

    Can only appear after "op-start" and before "op-end". Overrides the
    default source file location of the specified opcode. The opcode
    definition will come from the specified file, e.g. "op OP_NOP arm"
    will load from "arm/OP_NOP.S". A substitution dictionary will be
    applied (see below). If the special "FALLBACK" token is used instead of
    a directory name, the source file specified in fallback-stub will instead
    be used for this opcode.

  alt <opcode> <directory>

    Can only appear after "op-start" and before "op-end". Similar to the
    "op" command above, but denotes a source file to override the entry
    in the alternate handler table. The opcode definition will come from
    the specified file, e.g. "alt OP_NOP arm" will load from
    "arm/ALT_OP_NOP.S". A substitution dictionary will be applied
    (see below).

  op-end

    Indicates the end of the opcode list. All kNumPackedOpcodes
    opcodes are emitted when this is seen, followed by any code that
    didn't fit inside the fixed-size instruction handler space.

The order of "op" and "alt" directives is not significant; the generation
tool will extract ordering info from the VM sources.

Typically the directory named in the "op-start" directive is the one that
holds most of the current opcode implementations.
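
The parsing rules above can be sketched in a few lines of Python. This is
only an illustration, not the generation tool itself: the function name
parse_config, the sample file names, and the tolerance for leading
whitespace before a command are all assumptions made for the example.

```python
def parse_config(text):
    """Parse mterp-style config text into a list of (command, args) tuples."""
    commands = []
    in_op_list = False
    for lineno, raw in enumerate(text.splitlines(), start=1):
        line = raw.strip()  # assumption: leading whitespace is tolerated
        # Blank lines and comments (lines starting with '#') are skipped.
        if not line or line.startswith("#"):
            continue
        name, *args = line.split()
        # handler-style is required and must be the first command.
        if not commands and name != "handler-style":
            raise ValueError(f"line {lineno}: first command must be handler-style")
        if name == "op-start":
            in_op_list = True
        elif name in ("op", "alt") and not in_op_list:
            raise ValueError(f"line {lineno}: {name!r} outside op-start/op-end")
        elif name == "op-end":
            in_op_list = False
        commands.append((name, args))
    return commands


# Hypothetical config, for illustration only; the file names are made up.
SAMPLE = """\
# sample configuration
handler-style computed-goto
handler-size 128
fallback-stub arm/fallback.S
op-start arm
    op OP_NOP arm
    op OP_MOVE FALLBACK
op-end
"""

commands = parse_config(SAMPLE)
```

Here "op OP_NOP arm" restates the default directory, while
"op OP_MOVE FALLBACK" selects the file named by fallback-stub.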

==== Instruction file format ====

The assembly instruction files are simply fragments of assembly sources.
The starting label will be provided by the generation tool, as will
declarations for the segment type and alignment. The expected target
assembler is GNU "as", but others will work (this may require fiddling
with some of the pseudo-ops emitted by the generation tool).

A substitution dictionary is applied to all opcode fragments as they are
appended to the output. Substitutions can look like "$value" or "${value}".

The dictionary always includes:

  $opcode - opcode name, e.g. "OP_NOP"
  $opnum - opcode number, e.g. 0 for OP_NOP
  $handler_size_bytes - max size of an instruction handler, in bytes
  $handler_size_bits - max size of an instruction handler, log 2
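
As a rough illustration of how such a dictionary behaves, Python's standard
string.Template class happens to accept the same "$value" and "${value}"
forms. Whether the generation tool uses this class is not stated here, so
treat the sketch below as an assumption. The helper name expand_fragment
and the sample fragment are hypothetical.

```python
from string import Template


def expand_fragment(fragment, opcode, opnum, handler_size_bits):
    """Apply the always-present substitutions to one fragment of text."""
    subst = {
        "opcode": opcode,
        "opnum": opnum,
        # The byte size is 2 ** bits, since the bits value is its log 2.
        "handler_size_bytes": 1 << handler_size_bits,
        "handler_size_bits": handler_size_bits,
    }
    return Template(fragment).substitute(subst)


# Hypothetical fragment text, for illustration only.
fragment = "/* $opcode is opcode ${opnum}, limit $handler_size_bytes bytes */"
expanded = expand_fragment(fragment, "OP_NOP", 0, 7)
```

Template.substitute raises KeyError when a fragment uses a name that is
missing from the dictionary, which is a convenient way to catch typos.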

Both C and assembly sources will be passed through the C pre-processor,
so you can take advantage of C-style comments and preprocessor directives
like "#define".

The following generator operations are available:

  %include "filename" [subst-dict]

    Includes the file, whose name should look like "arm/OP_NOP.S". You can
    specify values for the substitution dictionary, using standard Python
    syntax. For example, this:
        %include "arm/unop.S" {"result":"r1"}
    would insert "arm/unop.S" at the current file position, replacing
    occurrences of "$result" with "r1".

  %default <subst-dict>

    Specify default substitution dictionary values, using standard Python
    syntax. Useful if you want to have a "base" version and variants.

  %break

    Identifies the split between the main portion of the instruction
    handler (which must fit in "handler-size" bytes) and the "sister"
    code, which is appended to the end of the instruction handler block.
    In jump-table implementations, %break is ignored.
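
Because the dictionaries use Python syntax, the %include and %default lines
can be read with ast.literal_eval, which evaluates literals without running
arbitrary code. The sketch below is an illustration only. The function
names and the "preinstr" key are hypothetical, and it assumes that values
passed on a %include line take precedence over the %default values of the
included file.

```python
import ast
import re

# Matches: %include "filename" [subst-dict]
INCLUDE_RE = re.compile(r'^%include\s+"([^"]+)"\s*(\{.*\})?\s*$')


def parse_include(line):
    """Return (filename, overrides) for a %include line."""
    match = INCLUDE_RE.match(line.strip())
    if match is None:
        raise ValueError(f"not a %include line: {line!r}")
    filename, dict_text = match.groups()
    # The dictionary is optional; literal_eval parses it as a Python literal.
    overrides = ast.literal_eval(dict_text) if dict_text else {}
    return filename, overrides


def parse_default(line):
    """Return the dictionary given on a %default line."""
    text = line.strip()
    if not text.startswith("%default"):
        raise ValueError(f"not a %default line: {line!r}")
    return ast.literal_eval(text[len("%default"):].strip())


filename, overrides = parse_include('%include "arm/unop.S" {"result":"r1"}')
defaults = parse_default('%default {"result":"r0", "preinstr":""}')
# Assumption: values from %include win over the included file's defaults.
effective = {**defaults, **overrides}
```

With these inputs, "$result" would expand to "r1" and "$preinstr" to the
empty string.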

The generation tool does *not* print a warning if your instructions
exceed "handler-size", but the VM will abort on startup if it detects an
oversized handler. On architectures with fixed-width instructions this
is easy to work with; on others you will need to count bytes.


==== Using C constants from assembly sources ====

The file "art/runtime/asm_support.h" has some definitions for constant
values, structure sizes, and struct member offsets. The format is fairly
restricted, as simple macros are used to massage it for use with both C
(where it is verified) and assembly (where the definitions are used).

If a constant in the file falls out of sync, the VM will log an error
message and abort during startup.


==== Development tips ====

If you need to debug the initial piece of an opcode handler, and your
debug code expands it beyond the handler size limit, you can insert a
generic header at the top:

    b       ${opcode}_start
%break
${opcode}_start:

If you already have a %break, it's okay to leave it in place -- the second
%break is ignored.
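
The effect of this trick can be modelled with a small Python sketch that
splits a fragment at the first %break and ignores any later one. It is a
simplified model under stated assumptions, not the generation tool's code:
the function name split_at_break and the placeholder instruction lines are
hypothetical.

```python
def split_at_break(fragment_lines, jump_table=False):
    """Split a fragment into (main, sister) at the first %break line."""
    main, sister = [], []
    seen_break = False
    for line in fragment_lines:
        if line.strip() == "%break":
            # Every %break line is dropped from the output. In jump-table
            # mode it has no effect; otherwise the first one starts the
            # sister code and later ones change nothing.
            if not jump_table:
                seen_break = True
            continue
        (sister if seen_break else main).append(line)
    return main, sister


# A handler that already had a %break, with the debug header added on top.
fragment = [
    "    b       ${opcode}_start",
    "%break",
    "${opcode}_start:",
    "    <debug code and original handler>",
    "%break",
    "    <original sister code>",
]
main, sister = split_at_break(fragment)
```

Only the branch stays in the fixed-size region. Everything after the first
%break, including the original sister code, lands in the appended block.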


==== Rebuilding ====

If you change any of the source file fragments, you need to rebuild the
combined source files in the "out" directory. Make sure the files in
"out" are editable, then:

    $ cd mterp
    $ ./rebuild.sh

The ultimate goal is to have the build system generate the necessary
output files without requiring this separate step, but we're not yet
ready to require Python in the build.

==== Interpreter Control ====

The mterp fast interpreter achieves much of its performance advantage
over the C++ interpreter through its efficient mechanism of
transitioning from one Dalvik bytecode to the next. Mterp for ARM targets
uses a computed-goto mechanism, in which the handler entrypoints are
located at the base of the handler table + (opcode * 128), that is, with
a handler size of 128 bytes.
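
The address arithmetic can be checked with a short Python sketch. The table
base address used below is a made-up value for illustration, and the
function name handler_address is hypothetical. The only facts taken from
this file are the 128-byte stride and the formula itself.

```python
HANDLER_SIZE_BYTES = 128
# log 2 of the handler size: 128 == 1 << 7
HANDLER_SIZE_BITS = HANDLER_SIZE_BYTES.bit_length() - 1


def handler_address(table_base, opcode):
    """Return table_base + (opcode * 128), computed with a shift."""
    return table_base + (opcode << HANDLER_SIZE_BITS)


TABLE_BASE = 0x1000  # hypothetical address of the handler table
first = handler_address(TABLE_BASE, 0x00)
later = handler_address(TABLE_BASE, 0x0E)
```

Since the handler size is a power of 2, the multiply reduces to a shift by
the log 2 of the size, which is why a power-of-2 size is convenient.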

In normal operation, the dedicated register rIBASE (r8 for ARM, edx for
x86) holds mainHandlerTable. If we need to switch to a mode that requires
inter-instruction checking, rIBASE is changed to altHandlerTable. Note
that this change is not immediate. What is actually changed is the value
of curHandlerTable, which is part of the interpBreak structure. Rather
than explicitly check for changes, each thread will blindly refresh
rIBASE at backward branches, exception throws and returns.
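
The delayed switch can be modelled with a toy Python class in which a
register is just an attribute. This is a simplified model under stated
assumptions, not VM code: the class and method names are hypothetical, and
the handler tables are represented by their names as strings.

```python
class ThreadModel:
    """Toy model of one thread's rIBASE and its curHandlerTable field."""

    def __init__(self):
        self.cur_handler_table = "mainHandlerTable"  # field in memory
        self.rIBASE = self.cur_handler_table         # cached register copy

    def request_alt_mode(self):
        # Only the in-memory field changes; the register is left alone.
        self.cur_handler_table = "altHandlerTable"

    def dispatch(self, at_refresh_point):
        # Backward branches, exception throws and returns are refresh points.
        if at_refresh_point:
            self.rIBASE = self.cur_handler_table
        return self.rIBASE


thread = ThreadModel()
thread.request_alt_mode()
before = thread.dispatch(at_refresh_point=False)
after = thread.dispatch(at_refresh_point=True)
```

In this model the thread keeps dispatching through mainHandlerTable until
it reaches the next refresh point, and only then picks up altHandlerTable.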
198