page.title=Performance Tips
page.article=true
@jd:body

<div id="tb-wrapper">
<div id="tb">

<h2>In this document</h2>
<ol>
  <li><a href="#ObjectCreation">Avoid Creating Unnecessary Objects</a></li>
  <li><a href="#PreferStatic">Prefer Static Over Virtual</a></li>
  <li><a href="#UseFinal">Use Static Final For Constants</a></li>
  <li><a href="#GettersSetters">Avoid Internal Getters/Setters</a></li>
  <li><a href="#Loops">Use Enhanced For Loop Syntax</a></li>
  <li><a href="#PackageInner">Consider Package Instead of Private Access with Private Inner Classes</a></li>
  <li><a href="#AvoidFloat">Avoid Using Floating-Point</a></li>
  <li><a href="#UseLibraries">Know and Use the Libraries</a></li>
  <li><a href="#NativeMethods">Use Native Methods Carefully</a></li>
  <li><a href="#Myths">Performance Myths</a></li>
  <li><a href="#Measure">Always Measure</a></li>
</ol>

</div>
</div>

<p>This document primarily covers micro-optimizations that can improve overall app performance
when combined, but these changes are unlikely to produce dramatic
performance effects on their own. Choosing the right algorithms and data structures should always be your
priority, but is outside the scope of this document. Treat the tips in this document
as general coding practices that you can build into your habits for writing efficient
code.</p>

<p>There are two basic rules for writing efficient code:</p>
<ul>
    <li>Don't do work that you don't need to do.</li>
    <li>Don't allocate memory if you can avoid it.</li>
</ul>

<p>One of the trickiest problems you'll face when micro-optimizing an Android
app is that your app is certain to be running on multiple types of
hardware, with different versions of the VM running on different
processors at different speeds. It's not even generally the case
that you can simply say "device X is a factor F faster/slower than device Y",
and scale your results from one device to others. In particular, measurement
on the emulator tells you very little about performance on any device. There
are also huge differences between devices with and without a
<acronym title="Just In Time compiler">JIT</acronym>: the best
code for a device with a JIT is not always the best code for a device
without.</p>

<p>To make sure your app performs well across a wide variety of devices, ensure
your code is efficient at all levels and aggressively optimize your performance.</p>


<h2 id="ObjectCreation">Avoid Creating Unnecessary Objects</h2>

<p>Object creation is never free. A generational garbage collector with per-thread allocation
pools for temporary objects can make allocation cheaper, but allocating memory
is always more expensive than not allocating memory.</p>

<p>As you allocate more objects in your app, you will force a periodic
garbage collection, creating little "hiccups" in the user experience. The
concurrent garbage collector introduced in Android 2.3 helps, but unnecessary work
should always be avoided.</p>

<p>Thus, you should avoid creating object instances you don't need. Some
examples of things that can help:</p>

<ul>
    <li>If you have a method returning a string, and you know that its result
    will always be appended to a {@link java.lang.StringBuffer} anyway, change your signature
    and implementation so that the function does the append directly,
    instead of creating a short-lived temporary object.</li>
    <li>When extracting strings from a set of input data, try
    to return a substring of the original data, instead of creating a copy.
    You will create a new {@link java.lang.String} object, but it will share the {@code char[]}
    with the data. (The trade-off being that if you're only using a small
    part of the original input, you'll be keeping it all around in memory
    anyway if you go this route.)</li>
</ul>
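
<p>As a minimal sketch of the first suggestion, the hypothetical <code>NameFormatter</code>
class below contrasts a method that returns a temporary string with one that appends into a
buffer supplied by the caller. The class and method names are illustrative only, and
{@link java.lang.StringBuilder} is used here as an assumption; the same pattern applies to
{@link java.lang.StringBuffer}.</p>

```java
// Hypothetical helper, for illustration only.
final class NameFormatter {
    private NameFormatter() {}

    // Returns a temporary String that the caller would append and then discard.
    static String formatName(String first, String last) {
        return last + ", " + first;
    }

    // Appends straight into the caller's buffer, so no intermediate
    // result String is created for the caller to throw away.
    static void appendName(StringBuilder out, String first, String last) {
        out.append(last).append(", ").append(first);
    }
}
```

<p>A caller that already owns a <code>StringBuilder</code> would call
<code>NameFormatter.appendName(sb, first, last)</code> instead of
<code>sb.append(NameFormatter.formatName(first, last))</code>.</p>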

<p>A somewhat more radical idea is to slice up multidimensional arrays into
parallel one-dimensional arrays:</p>

<ul>
    <li>An array of {@code int}s is much better than an array of {@link java.lang.Integer}
    objects,
    but this also generalizes to the fact that two parallel arrays of ints
    are also a <strong>lot</strong> more efficient than an array of {@code (int,int)}
    objects.  The same goes for any combination of primitive types.</li>

    <li>If you need to implement a container that stores tuples of {@code (Foo,Bar)}
    objects, try to remember that two parallel {@code Foo[]} and {@code Bar[]} arrays are
    generally much better than a single array of custom {@code (Foo,Bar)} objects.
    (The exception to this, of course, is when you're designing an API for
    other code to access. In those cases, it's usually better to make a small
    compromise to the speed in order to achieve a good API design. But in your own internal
    code, you should try to be as efficient as possible.)</li>
</ul>
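
<p>The parallel-array idea could look roughly like the following. <code>PointList</code> is a
hypothetical container, not part of any Android API; it stores <code>(int,int)</code> pairs in
two <code>int[]</code> arrays that share an index instead of allocating one object per pair.</p>

```java
import java.util.Arrays;

// Hypothetical container of (x, y) pairs backed by two parallel int arrays.
final class PointList {
    private int[] xs = new int[8];
    private int[] ys = new int[8];
    private int size;

    void add(int x, int y) {
        if (size == xs.length) {
            // Grow both arrays together so index i always refers to the same pair.
            xs = Arrays.copyOf(xs, size * 2);
            ys = Arrays.copyOf(ys, size * 2);
        }
        xs[size] = x;
        ys[size] = y;
        size++;
    }

    int size() { return size; }

    // Callers are expected to pass an index in the range 0..size()-1.
    int x(int i) { return xs[i]; }
    int y(int i) { return ys[i]; }
}
```

<p>Adding a pair allocates nothing unless the backing arrays need to grow, whereas an array
of pair objects would allocate one object for every element.</p>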

<p>Generally speaking, avoid creating short-term temporary objects if you
can.  Creating fewer objects means less frequent garbage collection, which has
a direct impact on user experience.</p>


<h2 id="PreferStatic">Prefer Static Over Virtual</h2>

<p>If you don't need to access an object's fields, make your method static.
Invocations will be about 15%-20% faster.
It's also good practice, because you can tell from the method
signature that calling the method can't alter the object's state.</p>
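
<p>For instance, in the hypothetical <code>Counter</code> class sketched below,
<code>clamp()</code> uses only its arguments, so it is declared <code>static</code>, while
<code>add()</code> has to stay an instance method because it updates a field. The class is an
illustration and makes no claim about the measured speedup.</p>

```java
// Hypothetical class, for illustration only.
final class Counter {
    private int count;

    // Updates the count field, so it must be an instance method.
    void add(int delta) {
        count = clamp(count + delta, 0, 100);
    }

    int count() { return count; }

    // Uses only its arguments; static makes it clear from the signature
    // that the call can't alter the object's state.
    static int clamp(int value, int min, int max) {
        return value < min ? min : (value > max ? max : value);
    }
}
```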


<h2 id="UseFinal">Use Static Final For Constants</h2>

<p>Consider the following declaration at the top of a class:</p>

<pre>
static int intVal = 42;
static String strVal = "Hello, world!";
</pre>

<p>The compiler generates a class initializer method, called
<code>&lt;clinit&gt;</code>, that is executed when the class is first used.
The method stores the value 42 into <code>intVal</code>, and extracts a
reference from the classfile string constant table for <code>strVal</code>.
When these values are referenced later on, they are accessed with field
lookups.</p>

<p>We can improve matters with the "final" keyword:</p>

<pre>
static final int intVal = 42;
static final String strVal = "Hello, world!";
</pre>

<p>The class no longer requires a <code>&lt;clinit&gt;</code> method,
because the constants go into static field initializers in the dex file.
Code that refers to <code>intVal</code> will use
the integer value 42 directly, and accesses to <code>strVal</code> will
use a relatively inexpensive "string constant" instruction instead of a
field lookup.</p>

<p class="note"><strong>Note:</strong> This optimization applies only to primitive types and
{@link java.lang.String} constants, not arbitrary reference types. Still, it's good
practice to declare constants <code>static final</code> whenever possible.</p>


<h2 id="GettersSetters">Avoid Internal Getters/Setters</h2>

<p>In natively compiled languages like C++ it's common practice to use getters
(<code>i = getCount()</code>) instead of accessing the field directly (<code>i
= mCount</code>). This is an excellent habit for C++ and is often practiced in other
object-oriented languages like C# and Java, because the compiler can
usually inline the access, and if you need to restrict or debug field access
you can add the code at any time.</p>

<p>However, this is a bad idea on Android.  Virtual method calls are expensive,
much more so than instance field lookups.  It's reasonable to follow
common object-oriented programming practices and have getters and setters
in the public interface, but within a class you should always access
fields directly.</p>

<p>Without a <acronym title="Just In Time compiler">JIT</acronym>,
direct field access is about 3x faster than invoking a
trivial getter. With the JIT (where direct field access is as cheap as
accessing a local), direct field access is about 7x faster than invoking a
trivial getter.</p>

<p>Note that if you're using <a href="{@docRoot}tools/help/proguard.html">ProGuard</a>,
you can have the best of both worlds because ProGuard can inline accessors for you.</p>
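
<p>A small sketch of this advice, using a hypothetical <code>Playlist</code> class: the
getter stays in the public interface, but code inside the class reads <code>mCount</code>
directly rather than going through <code>getCount()</code>.</p>

```java
// Hypothetical class, for illustration only.
final class Playlist {
    private int mCount;

    // Getter kept for callers outside the class.
    public int getCount() {
        return mCount;
    }

    void addTrack() {
        mCount++;
    }

    // Inside the class, read the field directly instead of calling getCount().
    boolean isEmpty() {
        return mCount == 0;
    }
}
```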


<h2 id="Loops">Use Enhanced For Loop Syntax</h2>

<p>The enhanced <code>for</code> loop (sometimes called the "for-each" loop) can be used
for collections that implement the {@link java.lang.Iterable} interface and for arrays.
With collections, an iterator is allocated to make interface calls
to {@code hasNext()} and {@code next()}. With an {@link java.util.ArrayList},
a hand-written counted loop is
about 3x faster (with or without JIT), but for other collections the enhanced
for loop syntax will be exactly equivalent to explicit iterator usage.</p>

<p>There are several alternatives for iterating through an array:</p>

<pre>
static class Foo {
    int mSplat;
}

Foo[] mArray = ...

public void zero() {
    int sum = 0;
    for (int i = 0; i &lt; mArray.length; ++i) {
        sum += mArray[i].mSplat;
    }
}

public void one() {
    int sum = 0;
    Foo[] localArray = mArray;
    int len = localArray.length;

    for (int i = 0; i &lt; len; ++i) {
        sum += localArray[i].mSplat;
    }
}

public void two() {
    int sum = 0;
    for (Foo a : mArray) {
        sum += a.mSplat;
    }
}
</pre>

<p><code>zero()</code> is slowest, because the JIT can't yet optimize away
the cost of getting the array length once for every iteration through the
loop.</p>

<p><code>one()</code> is faster. It pulls everything out into local
variables, avoiding the lookups. Only the array length offers a performance
benefit.</p>

<p><code>two()</code> is fastest for devices without a JIT, and
indistinguishable from <code>one()</code> for devices with a JIT.
It uses the enhanced for loop syntax introduced in version 1.5 of the Java
programming language.</p>

<p>So, you should use the enhanced <code>for</code> loop by default, but consider a
hand-written counted loop for performance-critical {@link java.util.ArrayList} iteration.</p>
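
<p>For {@link java.util.ArrayList} specifically, the two styles might be written as follows.
The <code>Lengths</code> class and its method names are hypothetical. Both methods compute the
same total, so the choice between them is purely a performance decision.</p>

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, for illustration only.
final class Lengths {
    private Lengths() {}

    // Enhanced for loop: uses an Iterator and calls hasNext()/next().
    static int totalLengthForEach(List<String> items) {
        int total = 0;
        for (String s : items) {
            total += s.length();
        }
        return total;
    }

    // Hand-written counted loop: indexed access, no Iterator.
    static int totalLengthCounted(ArrayList<String> items) {
        int total = 0;
        int size = items.size(); // read the size once, outside the loop
        for (int i = 0; i < size; ++i) {
            total += items.get(i).length();
        }
        return total;
    }
}
```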

<p class="note"><strong>Tip:</strong>
Also see Josh Bloch's <em>Effective Java</em>, item 46.</p>


<h2 id="PackageInner">Consider Package Instead of Private Access with Private Inner Classes</h2>

<p>Consider the following class definition:</p>

<pre>
public class Foo {
    private class Inner {
        void stuff() {
            Foo.this.doStuff(Foo.this.mValue);
        }
    }

    private int mValue;

    public void run() {
        Inner in = new Inner();
        mValue = 27;
        in.stuff();
    }

    private void doStuff(int value) {
        System.out.println("Value is " + value);
    }
}</pre>

<p>What's important here is that we define a private inner class
(<code>Foo$Inner</code>) that directly accesses a private method and a private
instance field in the outer class. This is legal, and the code prints "Value is
27" as expected.</p>

<p>The problem is that the VM considers direct access to <code>Foo</code>'s
private members from <code>Foo$Inner</code> to be illegal because
<code>Foo</code> and <code>Foo$Inner</code> are different classes, even though
the Java language allows an inner class to access an outer class's private
members. To bridge the gap, the compiler generates a couple of synthetic
methods:</p>

<pre>
/*package*/ static int Foo.access$100(Foo foo) {
    return foo.mValue;
}
/*package*/ static void Foo.access$200(Foo foo, int value) {
    foo.doStuff(value);
}</pre>

<p>The inner class code calls these static methods whenever it needs to
access the <code>mValue</code> field or invoke the <code>doStuff()</code> method
in the outer class. What this means is that the code above really boils down to
a case where you're accessing member fields through accessor methods.
Earlier we talked about how accessors are slower than direct field
accesses, so this is an example of a certain language idiom resulting in an
"invisible" performance hit.</p>

<p>If you're using code like this in a performance hotspot, you can avoid the
overhead by declaring fields and methods accessed by inner classes to have
package access, rather than private access. Unfortunately this means the fields
can be accessed directly by other classes in the same package, so you shouldn't
use this in public API.</p>
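
<p>Applied to the <code>Foo</code> example, the change might look like the sketch below. The
only differences from the original listing are that <code>mValue</code> and
<code>doStuff()</code> have package access, and that the outer class is declared without
<code>public</code> so the snippet compiles as a standalone file under any name.</p>

```java
// Variant of the Foo example: members used by the inner class have
// package access instead of private access.
class Foo {
    private class Inner {
        void stuff() {
            // Direct access; the inner class no longer depends on private members.
            Foo.this.doStuff(Foo.this.mValue);
        }
    }

    /*package*/ int mValue;

    public void run() {
        Inner in = new Inner();
        mValue = 27;
        in.stuff();
    }

    /*package*/ void doStuff(int value) {
        System.out.println("Value is " + value);
    }
}
```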


<h2 id="AvoidFloat">Avoid Using Floating-Point</h2>

<p>As a rule of thumb, floating-point is about 2x slower than integer on
Android-powered devices.</p>

<p>In speed terms, there's no difference between <code>float</code> and
<code>double</code> on more modern hardware. Space-wise, <code>double</code>
is 2x larger. As with desktop machines, assuming space isn't an issue, you
should prefer <code>double</code> to <code>float</code>.</p>

<p>Also, even for integers, some processors have hardware multiply but lack
hardware divide. In such cases, integer division and modulus operations are
performed in software&mdash;something to think about if you're designing a
hash table or doing lots of math.</p>
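
<p>One common way to avoid a modulus when mapping a hash code to a bucket is to keep the
table size a power of two and use a bit mask. This technique is not described in this
document and is shown only as an illustration; the <code>Buckets</code> class is hypothetical
and assumes <code>capacity</code> is a power of two.</p>

```java
// Hypothetical helper, for illustration only.
final class Buckets {
    private Buckets() {}

    // Assumes capacity is a power of two (for example 16 or 1024).
    // The mask keeps the low bits of the hash, so no division is needed.
    static int indexFor(int hash, int capacity) {
        return hash & (capacity - 1);
    }

    // Same result computed with a floor modulus, which relies on integer division.
    static int indexForMod(int hash, int capacity) {
        return Math.floorMod(hash, capacity);
    }
}
```

<p>For a power-of-two capacity both methods return the same bucket index, including for
negative hash codes.</p>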


<h2 id="UseLibraries">Know and Use the Libraries</h2>

<p>In addition to all the usual reasons to prefer library code over rolling
your own, bear in mind that the system is at liberty to replace calls
to library methods with hand-coded assembler, which may be better than the
best code the JIT can produce for the equivalent Java. The typical example
here is {@link java.lang.String#indexOf String.indexOf()} and
related APIs, which Dalvik replaces with
an inlined intrinsic. Similarly, the {@link java.lang.System#arraycopy
System.arraycopy()} method
is about 9x faster than a hand-coded loop on a Nexus One with the JIT.</p>
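
<p>The two approaches being compared could be written as below. The <code>Copies</code>
class is a hypothetical illustration; it shows only that both methods produce the same
result, and does not reproduce the benchmark.</p>

```java
// Hypothetical helper, for illustration only.
final class Copies {
    private Copies() {}

    // Hand-coded element-by-element copy.
    static int[] copyWithLoop(int[] src) {
        int[] dst = new int[src.length];
        for (int i = 0; i < src.length; ++i) {
            dst[i] = src[i];
        }
        return dst;
    }

    // Library call: System.arraycopy(src, srcPos, dest, destPos, length).
    static int[] copyWithArraycopy(int[] src) {
        int[] dst = new int[src.length];
        System.arraycopy(src, 0, dst, 0, src.length);
        return dst;
    }
}
```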


<p class="note"><strong>Tip:</strong>
Also see Josh Bloch's <em>Effective Java</em>, item 47.</p>


<h2 id="NativeMethods">Use Native Methods Carefully</h2>

<p>Developing your app with native code using the
<a href="{@docRoot}tools/sdk/ndk/index.html">Android NDK</a>
isn't necessarily more efficient than programming with the
Java language. For one thing,
there's a cost associated with the Java-native transition, and the JIT can't
optimize across these boundaries. If you're allocating native resources (memory
on the native heap, file descriptors, or whatever), it can be significantly
more difficult to arrange timely collection of these resources. You also
need to compile your code for each architecture you wish to run on (rather
than rely on it having a JIT). You may even have to compile multiple versions
for what you consider the same architecture: native code compiled for the ARM
processor in the G1 can't take full advantage of the ARM in the Nexus One, and
code compiled for the ARM in the Nexus One won't run on the ARM in the G1.</p>

<p>Native code is primarily useful when you have an existing native codebase
that you want to port to Android, not for "speeding up" parts of your Android app
written with the Java language.</p>

<p>If you do need to use native code, you should read our
<a href="{@docRoot}guide/practices/jni.html">JNI Tips</a>.</p>

<p class="note"><strong>Tip:</strong>
Also see Josh Bloch's <em>Effective Java</em>, item 54.</p>


<h2 id="Myths">Performance Myths</h2>


<p>On devices without a JIT, it is true that invoking methods via a
variable with an exact type rather than an interface is slightly more
efficient. (So, for example, it is cheaper to invoke methods on a
<code>HashMap map</code> than a <code>Map map</code>, even though in both
cases the map is a <code>HashMap</code>.) The interface call is not
2x slower, however; the actual difference is more like 6%. Furthermore,
the JIT makes the two effectively indistinguishable.</p>

<p>On devices without a JIT, caching field accesses is about 20% faster than
repeatedly accessing the field. With a JIT, field access costs about the same
as local access, so this isn't a worthwhile optimization unless you feel it
makes your code easier to read. (This is true of final, static, and static
final fields too.)</p>
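
<p>"Caching field accesses" here means copying a field into a local variable before a loop.
The hypothetical <code>Histogram</code> class below sketches the pattern; whether it is worth
doing depends on the device, as described above.</p>

```java
// Hypothetical class, for illustration only.
final class Histogram {
    private final int[] mBins = new int[4];
    private int mScale = 2;

    void set(int index, int value) {
        mBins[index] = value;
    }

    int scaledTotal() {
        // Copy the fields into locals once instead of re-reading them
        // on every iteration of the loop.
        int[] bins = mBins;
        int scale = mScale;
        int len = bins.length;

        int total = 0;
        for (int i = 0; i < len; ++i) {
            total += bins[i] * scale;
        }
        return total;
    }
}
```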


<h2 id="Measure">Always Measure</h2>

<p>Before you start optimizing, make sure you have a problem that you
need to solve. Make sure you can accurately measure your existing performance,
or you won't be able to measure the benefit of the alternatives you try.</p>

<p>Every claim made in this document is backed up by a benchmark. The source
to these benchmarks can be found in the <a
href="http://code.google.com/p/dalvik/source/browse/#svn/trunk/benchmarks">code.google.com
"dalvik" project</a>.</p>

<p>The benchmarks are built with the
<a href="http://code.google.com/p/caliper/">Caliper</a> microbenchmarking
framework for Java. Microbenchmarks are hard to get right, so Caliper goes out
of its way to do the hard work for you, and even detect some cases where you're
not measuring what you think you're measuring (because, say, the VM has
managed to optimize all your code away). We highly recommend you use Caliper
to run your own microbenchmarks.</p>

<p>You may also find
<a href="{@docRoot}tools/debugging/debugging-tracing.html">Traceview</a> useful
for profiling, but it's important to realize that it currently disables the JIT,
which may cause it to misattribute time to code that the JIT may be able to win
back. It's especially important after making changes suggested by Traceview
data to ensure that the resulting code actually runs faster when run without
Traceview.</p>

<p>For more help profiling and debugging your apps, see the following documents:</p>

<ul>
  <li><a href="{@docRoot}tools/debugging/debugging-tracing.html">Profiling with
    Traceview and dmtracedump</a></li>
  <li><a href="{@docRoot}tools/debugging/systrace.html">Analyzing Display and Performance
    with Systrace</a></li>
</ul>