• Home
  • Line#
  • Scopes#
  • Navigate#
  • Raw
  • Download
1 /*
2  * Copyright (C) 2008 The Android Open Source Project
3  *
4  * Licensed under the Apache License, Version 2.0 (the "License");
5  * you may not use this file except in compliance with the License.
6  * You may obtain a copy of the License at
7  *
8  *      http://www.apache.org/licenses/LICENSE-2.0
9  *
10  * Unless required by applicable law or agreed to in writing, software
11  * distributed under the License is distributed on an "AS IS" BASIS,
12  * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
13  * See the License for the specific language governing permissions and
14  * limitations under the License.
15  */
16 
17 /*
18  * Compress the range of "constant pool" indexes in instructions and
19  * annotations to lower runtime RAM footprint.
20  *
21  * NOTE: this is an incomplete experimental feature.  Do not try to use it.
22  */
23 #include "Dalvik.h"
24 #include "libdex/InstrUtils.h"
25 #include "libdex/OptInvocation.h"
26 #include "libdex/DexClass.h"
27 
28 /*
29 Overview
30 
31 When a class, method, field, or string constant is referred to from
32 Dalvik bytecode, the reference takes the form of an integer index value.
33 This value indexes into an array of type_id_item, method_id_item,
34 field_id_item, or string_id_item in the DEX file.  The first three
35 themselves contain (directly or indirectly) indexes to strings that the
36 resolver uses to convert the instruction stream index into a pointer to
37 the appropriate object or struct.
38 
39 For example, an invoke-virtual instruction needs to specify which method
40 is to be invoked.  The method constant indexes into the method_id_item
41 array, each entry of which has indexes that specify the defining class
42 (type_id_item), method name (string_id_item), and method prototype
43 (proto_id_item).  The type_id_item just holds an index to a string_id_item,
44 which holds the file offset to the string with the class name.  The VM
45 finds the class by name, then searches through the class' table of virtual
46 methods to find one with a matching name and prototype.
47 
48 This process is fairly expensive, so after the first time it completes
49 successfully, the VM records that the method index resolved to a specific
50 Method struct.  On subsequent execution, the VM just pulls the Method ptr
51 out of the resolved-methods array.  A similar approach is used with
52 the indexes for classes, fields, and string constants.
53 
54 The problem with this approach is that we need to have a "resolved" entry
55 for every possible class, method, field, and string constant in every
56 DEX file, even if some of those aren't used from code.  The DEX string
57 constant table has entries for method prototypes and class names that are
58 never used by the code, and "public static final" fields often turn into
59 immediate constants.  The resolution table entries are only 4 bytes each,
60 but there are roughly 200,000 of them in the bootstrap classes alone.
61 
62 DEX optimization removes many index references by replacing virtual method
63 indexes with vtable offsets and instance field indexes with byte offsets.
64 In the earlier example, the method would be resolved at "dexopt" time, and
65 the instruction rewritten as invoke-virtual-quick with the vtable offset.
66 
67 (There are comparatively few classes compared to other constant pool
68 entries, and a much higher percentage (typically 60-70%) are used.  The
69 biggest gains come from the string pool.)
70 
71 Using the resolved-entity tables provides a substantial performance
72 improvement, but results in applications allocating 1MB+ of tables that
73 are 70% unused.  The used and unused entries are freely intermixed,
74 preventing effective sharing with the zygote process, and resulting in
75 large numbers of private/dirty pages on the native heap as the tables
76 populate on first use.
77 
78 The trick is to reduce the memory usage without decreasing performance.
79 Using smaller resolved-entity tables can actually give us a speed boost,
80 because we'll have a smaller "live" set of pages and make more effective
81 use of the data cache.
82 
83 
84 The approach we're going to use is to determine the set of indexes that
85 could potentially be resolved, generate a mapping from the minimal set to
86 the full set, and append the mapping to the DEX file.  This is done at
87 "dexopt" time, because we need to keep the changes in shared/read-only
88 pages or we'll lose the benefits of doing the work.
89 
90 There are two ways to create and use the new mapping:
91 
92  (1) Write the entire full->minimal mapping to the ".odex" file.  On every
93  instruction that uses an index, use the mapping to determine the
94  "compressed" constant value, and then use that to index into the
95  resolved-entity tables on the heap.  The instruction stream is unchanged,
96  and the resolver can easily tell if a given index is cacheable.
97 
98  (2) Write the inverse miminal->full mapping to the ".odex" file, and
99  rewrite the constants in the instruction stream.  The interpreter is
100  unchanged, and the resolver code uses the mapping to find the original
101  data in the DEX.
102 
103 Approach #1 is easier and safer to implement, but it requires a table
104 lookup every time we execute an instruction that includes a constant
105 pool reference.  This causes an unacceptable performance hit, chiefly
106 because we're hitting semi-random memory pages and hosing the data cache.
107 This is mitigated somewhat by DEX optimizations that replace the constant
108 with a vtable index or field byte offset.  Approach #1 also requires
109 a larger map table, increasing the size of the DEX on disk.  One nice
110 property of approach #1 is that most of the DEX file is unmodified,
111 so use of the mapping is a runtime decision.
112 
113 Approach #2 is preferred for performance reasons.
114 
115 
116 The class/method/field/string resolver code has to handle indices from
117 three sources: interpreted instructions, annotations, and exception
118 "catch" lists.  Sometimes these occur indirectly, e.g. we need to resolve
119 the declaring class associated with fields and methods when the latter
120 two are themselves resolved.  Parsing and rewriting instructions is fairly
121 straightforward, but annotations use a complex format with variable-width
122 index values.
123 
124 We can safely rewrite index values in annotations if we guarantee that the
125 new value is smaller than the original.  This implies a two-pass approach:
126 the first determines the set of indexes actually used, the second does the
127 rewrite.  Doing the rewrite in a single pass would be much harder.
128 
129 Instances of the "original" indices will still be found in the file; if
130 we try to be all-inclusive we will include some stuff that doesn't need
131 to be there (e.g. we don't generally need to cache the class name string
132 index result, since once we have the class resolved we don't need to look
133 it up by name through the resolver again).  There is some potential for
134 performance improvement by caching more than we strictly need, but we can
135 afford to give up a little performance during class loading if it allows
136 us to regain some memory.
137 
138 For safety and debugging, it's useful to distinguish the "compressed"
139 constants in some way, e.g. setting the high bit when we rewrite them.
140 In practice we don't have any free bits: indexes are usually 16-bit
141 values, and we have more than 32,767 string constants in at least one of
142 our core DEX files.  Also, this does not work with constants embedded in
143 annotations, because of the variable-width encoding.
144 
145 We should be safe if we can establish a clear distinction between sources
146 of "original" and "compressed" indices.  If the values get crossed up we
147 can end up with elusive bugs.  The easiest approach is to declare that
148 only indices pulled from certain locations (the instruction stream and/or
149 annotations) are compressed.  This prevents us from adding indices in
150 arbitrary locations to the compressed set, but should allow a reasonably
151 robust implementation.
152 
153 
154 Further implementation thoughts:
155 
156  - We don't have to do annotations in the first pass.  At heart the
157    resolved entity cache is a performance optimization, not necessary for
158    correctness, and we're not making annotation performance a priority
159    at this stage.
160  - The most important "fast path" is instruction processing.  Everything
161    else can do additional work without having a measurable impact.
162    However...
163  - We need to keep an eye on uncached resolves to ensure that we haven't
164    introduced noticeable performance losses.  In particular, the use of
165    runtime annotations with string constants may suffer if we don't include
166    annotation rewriting in the solution.
167  - We can have separate resolver functions for "original" and "compressed"
168    indices.  This way we don't have to add a flag argument to the resolver
169    functions (which would require passing an additional parameter in from
170    the interpreter).
171  - The VM spec has some specific things to say about string constant
172    equality and interning.  Index compression should have no effect on
173    that; we just change how long it takes to find the interned string in
174    certain circumstances.  The impact can be mitigated somewhat by
175    improving the performance of the interned string table code.
176  - This can make e.g. method resolution slower.  The method_id_item has
177    an index to a method name string, and we will no longer cache the
178    result of resolving that string.  This impacts resolution of any method
179    with the same name as a previously-resolved method.
180  - We may need to tweak the tools, particularly "dexdump", to show the
181    translated values.
182  - We can use 16-bit values in the mapping table, since we should have
183    fewer than 2^16 remapped entries.  If we overflow we can skip the remap
184    for that table or for the entire DEX file.  The resolver will need to
185    check for the existence of the table to determine whether or not entries
186    must be remapped.  The cost of the extra check is acceptable for
187    approach #2, since it's only at resolve time, but may be undesirable
188    for approach #1.
189 */
190 /*
191 Output Formats
192 
193 There are two possible output formats, from which we choose based on how
194 we plan to take advantage of the remapped constants.  At most one of these
195 will appear in the DEX.
196 
197 NOTE: if EIXM appears in the DEX, the VM *must* be configured with
198 DVM_RESOLVER_CACHE=DVM_RC_EXPANDING (2).  Otherwise the constants we
199 pull from the instruction stream will be wrong and we will fail quickly.
200 
201 For approach #1: map from original indices to the reduced set.
202 
203   This includes the four "mapToNew" tables.
204 
205   Format (RIXM):
206    u4 classCount            // #of entries in classMap[]; == typeIdsSize
207    u4 reducedClassCount     // #of entries in remapped table (for alloc)
208    u2 classMap[]
209    u4 methodCount
210    u4 reducedMethodCount
211    u2 methodMap[]
212    u4 fieldCount
213    u4 reducedFieldCount
214    u2 fieldMap[]
215    u4 stringCount
216    u4 reducedStringCount
217    u2 stringMap[]
218 
219 For approach #2: map from the reduced set back to the originals.
220 
221   This includes the four "mapToOld" tables.
222 
223   Format (EIXM):
224    u4 classCount            // #of entries in classMap[]; post-reduction
225    u2 classMap[]
226    u4 methodCount
227    u2 methodMap[]
228    u4 fieldCount
229    u2 fieldMap[]
230    u4 stringCount
231    u2 stringMap[]
232 
233 The arrays are padded so that the "count" values are always aligned on
234 32-bit boundaries.  All multi-byte values are in native host order.
235 */
236 
237 
238 /*
239  * Gather results from the post-optimization instruction scan.
240  */
241 typedef struct ScanResults {
242     /* output */
243     BitVector*  usedClasses;
244     BitVector*  usedMethods;
245     BitVector*  usedFields;
246     BitVector*  usedStrings;
247 } ScanResults;
248 
249 /* prototype for the for-all-methods function */
250 typedef void (AllMethodsFunc)(DexFile* pDexFile, const char* classDescriptor,
251     DexMethod* pDexMethod, void* arg);
252 
253 
254 /*
255  * Free scan results.
256  */
freeScanResults(ScanResults * pResults)257 static void freeScanResults(ScanResults* pResults)
258 {
259     if (pResults == NULL)
260         return;
261 
262     dvmFreeBitVector(pResults->usedClasses);
263     dvmFreeBitVector(pResults->usedMethods);
264     dvmFreeBitVector(pResults->usedFields);
265     dvmFreeBitVector(pResults->usedStrings);
266     free(pResults);
267 }
268 
269 /*
270  * Allocate storage for the results of the instruction scan.
271  */
allocScanResults(const DexFile * pDexFile)272 static ScanResults* allocScanResults(const DexFile* pDexFile)
273 {
274     ScanResults* pResults;
275     const DexHeader* pHeader = pDexFile->pHeader;
276 
277     pResults = (ScanResults*) calloc(1, sizeof(ScanResults));
278     if (pResults == NULL)
279         return NULL;
280 
281     pResults->usedClasses = dvmAllocBitVector(pHeader->typeIdsSize, false);
282     pResults->usedMethods = dvmAllocBitVector(pHeader->methodIdsSize, false);
283     pResults->usedFields = dvmAllocBitVector(pHeader->fieldIdsSize, false);
284     pResults->usedStrings = dvmAllocBitVector(pHeader->stringIdsSize, false);
285 
286     if (pResults->usedClasses == NULL ||
287         pResults->usedMethods == NULL ||
288         pResults->usedFields == NULL ||
289         pResults->usedStrings == NULL)
290     {
291         freeScanResults(pResults);
292         return NULL;
293     }
294 
295     return pResults;
296 }
297 
298 /*
299  * Call "func(method, arg)" on all methods in the specified class.
300  *
301  * Pass in a pointer to the class_data_item, positioned at the start of
302  * the field data (i.e. just past the class data header).
303  *
304  * "classDescriptor" is for debug messages.
305  */
forAllMethodsInClass(DexFile * pDexFile,const u1 ** ppEncodedData,const DexClassDataHeader * pHeader,const char * classDescriptor,AllMethodsFunc func,void * arg)306 static void forAllMethodsInClass(DexFile* pDexFile, const u1** ppEncodedData,
307     const DexClassDataHeader* pHeader, const char* classDescriptor,
308     AllMethodsFunc func, void* arg)
309 {
310     int i;
311 
312     /*
313      * Consume field data.
314      */
315     if (pHeader->staticFieldsSize != 0) {
316         int count = (int) pHeader->staticFieldsSize;
317         u4 lastIndex = 0;
318         DexField field;
319         for (i = 0; i < count; i++) {
320             dexReadClassDataField(ppEncodedData, &field, &lastIndex);
321         }
322     }
323     if (pHeader->instanceFieldsSize != 0) {
324         int count = (int) pHeader->instanceFieldsSize;
325         u4 lastIndex = 0;
326         DexField field;
327         for (i = 0; i < count; i++) {
328             dexReadClassDataField(ppEncodedData, &field, &lastIndex);
329         }
330     }
331 
332     /*
333      * Run through all methods.
334      */
335     if (pHeader->directMethodsSize != 0) {
336         int count = (int) pHeader->directMethodsSize;
337         u4 lastIndex = 0;
338         DexMethod method;
339 
340         for (i = 0; i < count; i++) {
341             dexReadClassDataMethod(ppEncodedData, &method, &lastIndex);
342             (func)(pDexFile, classDescriptor, &method, arg);
343         }
344     }
345     if (pHeader->virtualMethodsSize != 0) {
346         int count = (int) pHeader->virtualMethodsSize;
347         u4 lastIndex = 0;
348         DexMethod method;
349 
350         for (i = 0; i < count; i++) {
351             dexReadClassDataMethod(ppEncodedData, &method, &lastIndex);
352             (func)(pDexFile, classDescriptor, &method, arg);
353         }
354     }
355 }
356 
357 /*
358  * Call "func(method, arg)" on all methods in all classes in DexFile.
359  */
forAllMethods(DexFile * pDexFile,AllMethodsFunc func,void * arg)360 static void forAllMethods(DexFile* pDexFile, AllMethodsFunc func, void* arg)
361 {
362     u4 count = pDexFile->pHeader->classDefsSize;
363     u4 idx;
364 
365     for (idx = 0; idx < count; idx++) {
366         const DexClassDef* pClassDef;
367         DexClassDataHeader header;
368         const u1* pEncodedData;
369 
370         pClassDef = dexGetClassDef(pDexFile, idx);
371         pEncodedData = dexGetClassData(pDexFile, pClassDef);
372 
373         const char* classDescriptor;
374         classDescriptor = dexStringByTypeIdx(pDexFile, pClassDef->classIdx);
375 
376         if (pEncodedData != NULL) {
377             dexReadClassDataHeader(&pEncodedData, &header);
378 
379             forAllMethodsInClass(pDexFile, &pEncodedData, &header,
380                 classDescriptor, func, arg);
381         } else {
382             //printf("%s: no class data\n", classDescriptor);
383             /* no class data, e.g. "marker interface" */
384         }
385     }
386 }
387 
388 /*
389  * Mark a class ID as referenced.
390  */
markClass(const u2 * ptr,ScanResults * pResults)391 static void markClass(const u2* ptr, ScanResults* pResults)
392 {
393     u2 classIdx = *ptr;
394     if (!dvmSetBit(pResults->usedClasses, classIdx)) {
395         LOGE("Unable to mark class %d as in-use\n", classIdx);
396     }
397 }
398 
399 /*
400  * Mark a method ID as referenced.
401  */
markMethod(const u2 * ptr,ScanResults * pResults)402 static void markMethod(const u2* ptr, ScanResults* pResults)
403 {
404     u2 methodIdx = *ptr;
405     if (!dvmSetBit(pResults->usedMethods, methodIdx)) {
406         LOGE("Unable to mark method %d as in-use\n", methodIdx);
407     }
408 }
409 
410 /*
411  * Mark a field ID as referenced.
412  */
markField(const u2 * ptr,ScanResults * pResults)413 static void markField(const u2* ptr, ScanResults* pResults)
414 {
415     u2 fieldIdx = *ptr;
416     if (!dvmSetBit(pResults->usedFields, fieldIdx)) {
417         LOGE("Unable to mark field %d as in-use\n", fieldIdx);
418     }
419 }
420 
421 /*
422  * Mark a string constant as referenced.
423  */
markString(const u2 * ptr,ScanResults * pResults)424 static void markString(const u2* ptr, ScanResults* pResults)
425 {
426     u2 stringIdx = *ptr;
427     if (!dvmSetBit(pResults->usedStrings, stringIdx)) {
428         LOGE("Unable to mark string %d as in-use\n", stringIdx);
429     }
430 }
431 
432 /*
433  * Mark a "jumbo" string constant as referenced.
434  */
markJumboString(u2 * ptr,ScanResults * pResults)435 static void markJumboString(u2* ptr, ScanResults* pResults)
436 {
437     u4 stringIdx;
438 
439     /* it's in native byte order, but might not be 32-bit aligned */
440     memcpy(&stringIdx, ptr, sizeof(u4));
441     if (!dvmSetBit(pResults->usedStrings, stringIdx)) {
442         LOGE("Unable to mark string %d as in-use\n", stringIdx);
443     }
444 }
445 
446 /*
447  * Remap a value in the instruction stream.
448  */
updateValue(u2 * ptr,const IndexMapSet * pIndexMapSet,int whichMap)449 static inline void updateValue(u2* ptr, const IndexMapSet* pIndexMapSet,
450     int whichMap)
451 {
452     const IndexMap* pMap = &pIndexMapSet->map[whichMap];
453     if (pMap != NULL) {
454         u2 newIdx = pMap->mapToNew[*ptr];
455         assert(newIdx != kNoIndexMapping);
456         *ptr = newIdx;
457     }
458 }
updateClass(u2 * ptr,const IndexMapSet * pIndexMapSet)459 static void updateClass(u2* ptr, const IndexMapSet* pIndexMapSet)
460 {
461     updateValue(ptr, pIndexMapSet, kMapClasses);
462 }
updateMethod(u2 * ptr,const IndexMapSet * pIndexMapSet)463 static void updateMethod(u2* ptr, const IndexMapSet* pIndexMapSet)
464 {
465     updateValue(ptr, pIndexMapSet, kMapMethods);
466 }
updateField(u2 * ptr,const IndexMapSet * pIndexMapSet)467 static void updateField(u2* ptr, const IndexMapSet* pIndexMapSet)
468 {
469     updateValue(ptr, pIndexMapSet, kMapFields);
470 }
updateString(u2 * ptr,const IndexMapSet * pIndexMapSet)471 static void updateString(u2* ptr, const IndexMapSet* pIndexMapSet)
472 {
473     updateValue(ptr, pIndexMapSet, kMapStrings);
474 }
updateJumboString(u2 * ptr,const IndexMapSet * pIndexMapSet)475 static void updateJumboString(u2* ptr, const IndexMapSet* pIndexMapSet)
476 {
477     u4 stringIdx;
478     u4 newIdx;
479 
480     /* it's in native byte order, but might not be 32-bit aligned */
481     memcpy(&stringIdx, ptr, sizeof(stringIdx));
482 
483     /* get new value */
484     newIdx = pIndexMapSet->map[kMapStrings].mapToNew[*ptr];
485     assert(newIdx != kNoIndexMapping);
486 
487     /* copy it out */
488     memcpy(ptr, &newIdx, sizeof(newIdx));
489 }
490 
491 /*
492  * Run through an instructions stream, marking constants as we see them.
493  *
494  * If "pResults" is non-NULL, we populate "pResults" with what we find,
495  * making no changes to the instruction stream.
496  *
497  * If "pIndexMapSet" is non-NULL, we rewrite the constants in the
498  * instruction stream.
499  */
markUsedConstantsFromInsns(u2 * insns,u4 insnsSize,ScanResults * pResults,const IndexMapSet * pIndexMapSet)500 static void markUsedConstantsFromInsns(u2* insns, u4 insnsSize,
501     ScanResults* pResults, const IndexMapSet* pIndexMapSet)
502 {
503     //printf(" %p %u units\n", insns, insnsSize);
504 
505     while (insnsSize > 0) {
506         int width;
507         u2* pConst = insns + 1;
508 
509         switch (*insns & 0xff) {
510         case OP_IGET:
511         case OP_IGET_WIDE:
512         case OP_IGET_OBJECT:
513         case OP_IGET_BOOLEAN:
514         case OP_IGET_BYTE:
515         case OP_IGET_CHAR:
516         case OP_IGET_SHORT:
517         case OP_IPUT:
518         case OP_IPUT_WIDE:
519         case OP_IPUT_OBJECT:
520         case OP_IPUT_BOOLEAN:
521         case OP_IPUT_BYTE:
522         case OP_IPUT_CHAR:
523         case OP_IPUT_SHORT:
524         case OP_SGET:
525         case OP_SGET_WIDE:
526         case OP_SGET_OBJECT:
527         case OP_SGET_BOOLEAN:
528         case OP_SGET_BYTE:
529         case OP_SGET_CHAR:
530         case OP_SGET_SHORT:
531         case OP_SPUT:
532         case OP_SPUT_WIDE:
533         case OP_SPUT_OBJECT:
534         case OP_SPUT_BOOLEAN:
535         case OP_SPUT_BYTE:
536         case OP_SPUT_CHAR:
537         case OP_SPUT_SHORT:
538             /* instanceop vA, vB, field@CCCC */
539             /* staticop vAA, field@BBBB */
540             if (pResults != NULL)
541                 markField(pConst, pResults);
542             else
543                 updateField(pConst, pIndexMapSet);
544             break;
545 
546         case OP_CONST_STRING:
547             /* const-string vAA, string@BBBB */
548             if (pResults != NULL)
549                 markString(pConst, pResults);
550             else
551                 updateString(pConst, pIndexMapSet);
552             break;
553 
554         case OP_CONST_STRING_JUMBO:
555             /* const-string/jumbo vAA, string@BBBBBBBB */
556             if (pResults != NULL)
557                 markJumboString(pConst, pResults);
558             else
559                 updateJumboString(pConst, pIndexMapSet);
560             break;
561 
562         case OP_CONST_CLASS:
563         case OP_CHECK_CAST:
564         case OP_NEW_INSTANCE:
565         case OP_FILLED_NEW_ARRAY_RANGE:
566         case OP_INSTANCE_OF:
567         case OP_NEW_ARRAY:
568         case OP_FILLED_NEW_ARRAY:
569             /* const-class vAA, type@BBBB */
570             /* check-cast vAA, type@BBBB */
571             /* new-instance vAA, type@BBBB */
572             /* filled-new-array/range {vCCCC .. vNNNN}, type@BBBB */
573             /* instance-of vA, vB, type@CCCC */
574             /* new-array vA, vB, type@CCCC */
575             /* filled-new-array {vD, vE, vF, vG, vA}, type@CCCC */
576             if (pResults != NULL)
577                 markClass(pConst, pResults);
578             else
579                 updateClass(pConst, pIndexMapSet);
580             break;
581 
582         case OP_INVOKE_VIRTUAL:
583         case OP_INVOKE_SUPER:
584         case OP_INVOKE_DIRECT:
585         case OP_INVOKE_STATIC:
586         case OP_INVOKE_INTERFACE:
587         case OP_INVOKE_VIRTUAL_RANGE:
588         case OP_INVOKE_SUPER_RANGE:
589         case OP_INVOKE_DIRECT_RANGE:
590         case OP_INVOKE_STATIC_RANGE:
591         case OP_INVOKE_INTERFACE_RANGE:
592             /* invoke-kind {vD, vE, vF, vG, vA}, meth@CCCC */
593             /* invoke-kind/range {vCCCC .. vNNNN}, meth@BBBB */
594             if (pResults != NULL)
595                 markMethod(pConst, pResults);
596             else
597                 updateMethod(pConst, pIndexMapSet);
598             break;
599 
600         default:
601             // ignore this instruction
602             ;
603         }
604 
605         width = dexGetInstrOrTableWidthAbs(gDvm.instrWidth, insns);
606         assert(width > 0 && width <= (int)insnsSize);
607 
608         insns += width;
609         insnsSize -= width;
610     }
611 }
612 
613 /*
614  * This is an AllMethodsFunc implementation.
615  *
616  * Run through the instructions in this method, setting bits in the "pResults"
617  * struct as we locate constants.
618  */
markUsedConstants(DexFile * pDexFile,const char * classDescriptor,DexMethod * pDexMethod,void * arg)619 static void markUsedConstants(DexFile* pDexFile, const char* classDescriptor,
620     DexMethod* pDexMethod, void* arg)
621 {
622     ScanResults* pResults = (ScanResults*) arg;
623     const DexCode* pDexCode = dexGetCode(pDexFile, pDexMethod);
624 
625     if (false) {
626         const DexMethodId* pMethodId;
627         const char* methodName;
628         pMethodId = dexGetMethodId(pDexFile, pDexMethod->methodIdx);
629         methodName = dexStringById(pDexFile, pMethodId->nameIdx);
630         printf(" %s.%s\n", classDescriptor, methodName);
631     }
632 
633     if (pDexCode != NULL) {
634         u2* insns = (u2*) pDexCode->insns;
635         u4 insnsSize = pDexCode->insnsSize;
636         markUsedConstantsFromInsns(insns, insnsSize, pResults, NULL);
637     } else {
638         //printf(" (no code)\n");
639     }
640 }
641 
642 /*
643  * This is an AllMethodsFunc implementation.
644  *
645  * Run through the instructions in this method, altering the constants used.
646  */
updateUsedConstants(DexFile * pDexFile,const char * classDescriptor,DexMethod * pDexMethod,void * arg)647 static void updateUsedConstants(DexFile* pDexFile, const char* classDescriptor,
648     DexMethod* pDexMethod, void* arg)
649 {
650     const IndexMapSet* pIndexMapSet = (const IndexMapSet*) arg;
651     const DexCode* pDexCode = dexGetCode(pDexFile, pDexMethod);
652 
653     if (false) {
654         const DexMethodId* pMethodId;
655         const char* methodName;
656         pMethodId = dexGetMethodId(pDexFile, pDexMethod->methodIdx);
657         methodName = dexStringById(pDexFile, pMethodId->nameIdx);
658         printf(" %s.%s\n", classDescriptor, methodName);
659     }
660 
661     if (pDexCode != NULL) {
662         u2* insns = (u2*) pDexCode->insns;
663         u4 insnsSize = pDexCode->insnsSize;
664         markUsedConstantsFromInsns(insns, insnsSize, NULL, pIndexMapSet);
665     } else {
666         //printf(" (no code)\n");
667     }
668 }
669 
670 /*
671  * Count up the bits and show a count.
672  */
showBitCount(const char * label,int setCount,int maxCount)673 static void showBitCount(const char* label, int setCount, int maxCount)
674 {
675     printf("%s: %d of %d (%.1f%% unused)\n", label, setCount, maxCount,
676         ((maxCount - setCount) * 100.0f) / maxCount);
677 }
678 
679 /*
680  * Print some debug info.
681  */
summarizeResults(DvmDex * pDvmDex,ScanResults * pResults)682 static void summarizeResults(DvmDex* pDvmDex, ScanResults* pResults)
683 {
684     DexFile* pDexFile = pDvmDex->pDexFile;
685     int i;
686 
687 #if 0
688     for (i = 0; i < (int) pDvmDex->pDexFile->pHeader->typeIdsSize; i++) {
689         const DexTypeId* pDexTypeId;
690         const char* classDescr;
691 
692         pDexTypeId = dexGetTypeId(pDexFile, i);
693         classDescr = dexStringById(pDexFile, pDexTypeId->descriptorIdx);
694 
695         if (dvmIsBitSet(pResults->usedStrings, i))
696             printf("used  : %04x '%s'\n", i, classDescr);
697         else
698             printf("unused: %04x '%s'\n", i, classDescr);
699     }
700 #endif
701 #if 0
702     for (i = 0; i < (int) pDvmDex->pDexFile->pHeader->methodIdsSize; i++) {
703         const DexMethodId* pDexMethodId;
704         const DexTypeId* pDexTypeId;
705         const char* classDescr;
706         const char* methodName;
707 
708         pDexMethodId = dexGetMethodId(pDexFile, i);
709         methodName = dexStringById(pDexFile, pDexMethodId->nameIdx);
710 
711         pDexTypeId = dexGetTypeId(pDexFile, pDexMethodId->classIdx);
712         classDescr = dexStringById(pDexFile, pDexTypeId->descriptorIdx);
713         if (dvmIsBitSet(pResults->usedMethods, i))
714             printf("used  : %s.%s\n", classDescr, methodName);
715         else
716             printf("unused: %s.%s\n", classDescr, methodName);
717     }
718 #endif
719 #if 0
720     for (i = 0; i < (int) pDvmDex->pDexFile->pHeader->fieldIdsSize; i++) {
721         const DexFieldId* pDexFieldId;
722         const DexTypeId* pDexTypeId;
723         const char* classDescr;
724         const char* fieldName;
725 
726         pDexFieldId = dexGetFieldId(pDexFile, i);
727         fieldName = dexStringById(pDexFile, pDexFieldId->nameIdx);
728 
729         pDexTypeId = dexGetTypeId(pDexFile, pDexFieldId->classIdx);
730         classDescr = dexStringById(pDexFile, pDexTypeId->descriptorIdx);
731         if (dvmIsBitSet(pResults->usedFields, i))
732             printf("used  : %s.%s\n", classDescr, fieldName);
733         else
734             printf("unused: %s.%s\n", classDescr, fieldName);
735     }
736 #endif
737 #if 0
738     for (i = 0; i < (int) pDvmDex->pDexFile->pHeader->stringIdsSize; i++) {
739         const char* str;
740 
741         str = dexStringById(pDexFile, i);
742 
743         if (dvmIsBitSet(pResults->usedStrings, i))
744             printf("used  : %04x '%s'\n", i, str);
745         else
746             printf("unused: %04x '%s'\n", i, str);
747     }
748 #endif
749 
750     int totalMax, totalSet;
751     int setCount;
752 
753     totalMax = totalSet = 0;
754 
755     setCount = dvmCountSetBits(pResults->usedClasses);
756     showBitCount("classes", setCount, pDexFile->pHeader->typeIdsSize);
757     totalSet += setCount;
758     totalMax += pDexFile->pHeader->typeIdsSize;
759 
760     setCount = dvmCountSetBits(pResults->usedMethods);
761     showBitCount("methods", setCount, pDexFile->pHeader->methodIdsSize);
762     totalSet += setCount;
763     totalMax += pDexFile->pHeader->methodIdsSize;
764 
765     setCount = dvmCountSetBits(pResults->usedFields);
766     showBitCount("fields",  setCount, pDexFile->pHeader->fieldIdsSize);
767     totalSet += setCount;
768     totalMax += pDexFile->pHeader->fieldIdsSize;
769 
770     setCount = dvmCountSetBits(pResults->usedStrings);
771     showBitCount("strings", setCount, pDexFile->pHeader->stringIdsSize);
772     totalSet += setCount;
773     totalMax += pDexFile->pHeader->stringIdsSize;
774 
775     printf("TOTAL %d of %d (%.1f%% unused -- %.1fK)\n", totalSet, totalMax,
776         ((totalMax - totalSet) * 100.0f) / totalMax,
777         (totalMax - totalSet) / 256.0f);
778 }
779 
780 /*
781  * Fill out an index map set entry.
782  *
783  * If we can't fit the map into our base type, we don't create the map.
784  *
785  * Returns "false" if allocation fails.
786  */
constructIndexMap(int totalCount,const BitVector * pBits,IndexMap * pMap)787 static bool constructIndexMap(int totalCount, const BitVector* pBits,
788     IndexMap* pMap)
789 {
790     const int kMaxIndex = 65534;        // 65535, a/k/a -1, is special
791     int setCount;
792 
793     setCount = dvmCountSetBits(pBits);
794     if (setCount < 0 || setCount > kMaxIndex)
795         return true;
796 
797     u2* mapToOld = (u2*) malloc(setCount * sizeof(u2));
798     u2* mapToNew = (u2*) malloc(totalCount * sizeof(u2));
799     if (mapToOld == NULL || mapToNew == NULL) {
800         free(mapToOld);
801         free(mapToNew);
802         return false;
803     }
804 
805     /* fill in both arrays */
806     int entry, idx = 0;
807     for (entry = 0; entry < totalCount; entry++) {
808         if (dvmIsBitSet(pBits, entry)) {
809             mapToNew[entry] = idx;
810             mapToOld[idx] = entry;
811             idx++;
812         } else {
813             mapToNew[entry] = kNoIndexMapping;
814         }
815     }
816 
817     if (idx != setCount) {
818         LOGE("GLITCH: idx=%d setCount=%d\n", idx, setCount);
819         dvmAbort();
820     }
821 
822     /* success */
823     pMap->mapToOld = mapToOld;
824     pMap->mapToNew = mapToNew;
825     pMap->origCount = totalCount;
826     pMap->newCount = setCount;
827 
828     return true;
829 }
830 
831 /*
832  * Construct a "reducing" chunk, with maps that convert the constants in
833  * instructions to their reduced value for the cache lookup.
834  */
constructReducingDataChunk(IndexMapSet * pIndexMapSet)835 static bool constructReducingDataChunk(IndexMapSet* pIndexMapSet)
836 {
837     int chunkLen = 0;
838     int i;
839 
840     pIndexMapSet->chunkType = kDexChunkReducingIndexMap;
841 
842     /*
843      * Compute space requirements and allocate storage.
844      */
845     for (i = 0; i < kNumIndexMaps; i++) {
846         /* space for the "original" count */
847         chunkLen += sizeof(u4);
848 
849         /* space for the "reduced" count */
850         chunkLen += sizeof(u4);
851 
852         /* add data length, round up to 32-bit boundary */
853         chunkLen += pIndexMapSet->map[i].origCount * sizeof(u2);
854         chunkLen = (chunkLen + 3) & ~3;
855     }
856 
857     pIndexMapSet->chunkDataLen = chunkLen;
858     pIndexMapSet->chunkData = (u1*) calloc(1, chunkLen);
859     if (pIndexMapSet->chunkData == NULL)
860         return false;
861 
862     /*
863      * Copy the data in.
864      */
865     u1* ptr = pIndexMapSet->chunkData;
866     for (i = 0; i < kNumIndexMaps; i++) {
867         u4* wordPtr = (u4*) ptr;
868         int dataLen = pIndexMapSet->map[i].origCount * sizeof(u2);
869 
870         *wordPtr++ = pIndexMapSet->map[i].origCount;
871         *wordPtr++ = pIndexMapSet->map[i].newCount;
872         if (dataLen != 0)
873             memcpy(wordPtr, pIndexMapSet->map[i].mapToNew, dataLen);
874 
875         /* advance pointer, maintaining 32-bit alignment */
876         ptr = ((u1*) wordPtr) + dataLen;
877         ptr = (u1*) (((int) ptr + 3) & ~3);
878     }
879 
880     if (ptr - (u1*) pIndexMapSet->chunkData != chunkLen) {
881         LOGE("GLITCH: expected len=%d, actual=%d\n",
882             chunkLen, ptr - (u1*) pIndexMapSet->chunkData);
883         dvmAbort();
884     }
885 
886     return true;
887 }
888 
889 /*
890  * Construct an "expanding" chunk, with maps that convert instructions
891  * with reduced constants back to their full original values.
892  */
constructExpandingDataChunk(IndexMapSet * pIndexMapSet)893 static bool constructExpandingDataChunk(IndexMapSet* pIndexMapSet)
894 {
895     int chunkLen = 0;
896     int i;
897 
898     pIndexMapSet->chunkType = kDexChunkExpandingIndexMap;
899 
900     /*
901      * Compute space requirements and allocate storage.
902      */
903     for (i = 0; i < kNumIndexMaps; i++) {
904         /* space for the length word */
905         chunkLen += sizeof(u4);
906 
907         /* add data length, round up to 32-bit boundary */
908         chunkLen += pIndexMapSet->map[i].newCount * sizeof(u2);
909         chunkLen = (chunkLen + 3) & ~3;
910     }
911 
912     pIndexMapSet->chunkDataLen = chunkLen;
913     pIndexMapSet->chunkData = (u1*) calloc(1, chunkLen);
914     if (pIndexMapSet->chunkData == NULL)
915         return false;
916 
917     /*
918      * Copy the data in.
919      */
920     u1* ptr = pIndexMapSet->chunkData;
921     for (i = 0; i < kNumIndexMaps; i++) {
922         u4* wordPtr = (u4*) ptr;
923         int dataLen = pIndexMapSet->map[i].newCount * sizeof(u2);
924 
925         *wordPtr++ = pIndexMapSet->map[i].newCount;
926         if (dataLen != 0)
927             memcpy(wordPtr, pIndexMapSet->map[i].mapToOld, dataLen);
928 
929         /* advance pointer, maintaining 32-bit alignment */
930         ptr = ((u1*) wordPtr) + dataLen;
931         ptr = (u1*) (((int) ptr + 3) & ~3);
932     }
933 
934     if (ptr - (u1*) pIndexMapSet->chunkData != chunkLen) {
935         LOGE("GLITCH: expected len=%d, actual=%d\n",
936             chunkLen, ptr - (u1*) pIndexMapSet->chunkData);
937         dvmAbort();
938     }
939 
940     return true;
941 }
942 
943 /*
944  * Construct the "chunk" of data that will be appended to the optimized DEX
945  * file.
946  */
constructDataChunk(IndexMapSet * pIndexMapSet)947 static bool constructDataChunk(IndexMapSet* pIndexMapSet)
948 {
949     assert(sizeof(pIndexMapSet->map[0].mapToOld[0]) == sizeof(u2));
950     assert(sizeof(pIndexMapSet->map[0].mapToNew[0]) == sizeof(u2));
951 
952 #if DVM_RESOLVER_CACHE == DVM_RC_EXPANDING
953     return constructExpandingDataChunk(pIndexMapSet);
954 #else
955     return constructReducingDataChunk(pIndexMapSet);
956 #endif
957 }
958 
959 /*
960  * Allocate storage to hold the maps.
961  */
createIndexMapSet(const DexFile * pDexFile,ScanResults * pResults)962 static IndexMapSet* createIndexMapSet(const DexFile* pDexFile,
963     ScanResults* pResults)
964 {
965     IndexMapSet* pIndexMapSet;
966     int setCount;
967     bool okay = true;
968 
969     pIndexMapSet = calloc(1, sizeof(*pIndexMapSet));
970     if (pIndexMapSet == NULL)
971         return NULL;
972 
973     okay = okay && constructIndexMap(pDexFile->pHeader->typeIdsSize,
974             pResults->usedClasses, &pIndexMapSet->map[kMapClasses]);
975     okay = okay && constructIndexMap(pDexFile->pHeader->methodIdsSize,
976             pResults->usedMethods, &pIndexMapSet->map[kMapMethods]);
977     okay = okay && constructIndexMap(pDexFile->pHeader->fieldIdsSize,
978             pResults->usedFields, &pIndexMapSet->map[kMapFields]);
979     okay = okay && constructIndexMap(pDexFile->pHeader->stringIdsSize,
980             pResults->usedStrings, &pIndexMapSet->map[kMapStrings]);
981 
982     LOGVV("Constr: %d %d %d %d\n",
983         pIndexMapSet->map[kMapClasses].mapToOld[0],
984         pIndexMapSet->map[kMapMethods].mapToOld[0],
985         pIndexMapSet->map[kMapFields].mapToOld[0],
986         pIndexMapSet->map[kMapStrings].mapToOld[0]);
987 
988     okay = okay && constructDataChunk(pIndexMapSet);
989 
990     if (!okay) {
991         dvmFreeIndexMapSet(pIndexMapSet);
992         return NULL;
993     }
994 
995     return pIndexMapSet;
996 }
997 
998 /*
999  * Free map storage.
1000  *
1001  * "pIndexMapSet" may be incomplete.
1002  */
dvmFreeIndexMapSet(IndexMapSet * pIndexMapSet)1003 void dvmFreeIndexMapSet(IndexMapSet* pIndexMapSet)
1004 {
1005     int i;
1006 
1007     if (pIndexMapSet == NULL)
1008         return;
1009 
1010     for (i = 0; i < kNumIndexMaps; i++) {
1011         free(pIndexMapSet->map[i].mapToOld);
1012         free(pIndexMapSet->map[i].mapToNew);
1013     }
1014     free(pIndexMapSet->chunkData);
1015     free(pIndexMapSet);
1016 }
1017 
1018 /*
1019  * Rewrite constant indexes to reduce heap requirements.
1020  */
dvmRewriteConstants(DvmDex * pDvmDex)1021 IndexMapSet* dvmRewriteConstants(DvmDex* pDvmDex)
1022 {
1023 #if (DVM_RESOLVER_CACHE != DVM_RC_REDUCING) && \
1024     (DVM_RESOLVER_CACHE != DVM_RC_EXPANDING)
1025     /* nothing to do */
1026     return NULL;
1027 #endif
1028 
1029     /*
1030      * We're looking for instructions that use "constant pool" entries for
1031      * classes, methods, fields, and strings.  Many field and method entries
1032      * are optimized away, and many string constants are never accessed from
1033      * code or annotations.
1034      */
1035     ScanResults* pResults = allocScanResults(pDvmDex->pDexFile);
1036     forAllMethods(pDvmDex->pDexFile, markUsedConstants, pResults);
1037 
1038     summarizeResults(pDvmDex, pResults);
1039 
1040     /*
1041      * Allocate and populate the index maps.
1042      */
1043     IndexMapSet* pIndexMapSet = createIndexMapSet(pDvmDex->pDexFile, pResults);
1044 #if DVM_RESOLVER_CACHE == DVM_RC_EXPANDING
1045     if (pIndexMapSet != NULL) {
1046         /*
1047          * Rewrite the constants to use the reduced set.
1048          */
1049         forAllMethods(pDvmDex->pDexFile, updateUsedConstants, pIndexMapSet);
1050     }
1051 #endif
1052 
1053     freeScanResults(pResults);
1054 
1055     return pIndexMapSet;
1056 }
1057 
1058