# Metalava

(Also known as "doclava2", but deliberately not named doclava2 since crucially
it does not generate docs; it's intended only for **meta**data extraction and
generation.)

Metalava is a metadata generator intended for the Android source tree, used for
a number of purposes:

* Extracting the API (into signature text files and into stub API files, which
  in turn get compiled into android.jar, the Android SDK library) and, more
  importantly, hiding code intended to be implementation only, driven by
  javadoc comments like @hide, @doconly, @removed, etc., as well as various
  annotations.

* Extracting source level annotations into an external annotations file (such
  as the typedef annotations, which cannot be stored in the SDK as .class level
  annotations).

* Diffing versions of the API and determining whether a newer version is
  compatible with the older version.

## Building and running

To build:

    $ ./gradlew

This builds a binary distribution in `../../out/host/common/install/metalava/bin/metalava`.

To run metalava:

    $ ../../out/host/common/install/metalava/bin/metalava
                    _        _
     _ __ ___   ___| |_ __ _| | __ ___   ____ _
    | '_ ` _ \ / _ \ __/ _` | |/ _` \ \ / / _` |
    | | | | | |  __/ || (_| | | (_| |\ V / (_| |
    |_| |_| |_|\___|\__\__,_|_|\__,_| \_/ \__,_|

    metalava extracts metadata from source code to generate artifacts such as the
    signature files, the SDK stub files, external annotations etc.

    Usage: metalava <flags>

    Flags:

    --help                                This message.
    --quiet                               Only include vital output
    --verbose                             Include extra diagnostic output

    ...

(*output truncated*)

Metalava has a new command line syntax, but it also understands the doclava1
flags and translates them on the fly. Flags that are ignored are listed in the
command line output. If metalava is dropped into an Android framework build,
for example, you'll see something like this (unless running with --quiet):

    metalava: Ignoring javadoc-related doclava1 flag -J-Xmx1600m
    metalava: Ignoring javadoc-related doclava1 flag -J-XX:-OmitStackTraceInFastThrow
    metalava: Ignoring javadoc-related doclava1 flag -XDignore.symbol.file
    metalava: Ignoring javadoc-related doclava1 flag -doclet
    metalava: Ignoring javadoc-related doclava1 flag -docletpath
    metalava: Ignoring javadoc-related doclava1 flag -templatedir
    metalava: Ignoring javadoc-related doclava1 flag -htmldir
    ...

## Features

* Compatibility with doclava1: in compat mode, metalava spits out the same
  signature files for the framework as doclava1.

* Ability to read in an existing android.jar file instead of from source, which
  means we can regenerate signature files etc for older versions according to
  new formats (e.g. to fix past errors in doclava, such as annotation instance
  methods which were accidentally not included).

* Ability to merge in data (annotations etc) from external sources, such as
  IntelliJ external annotations data as well as signature files containing
  annotations. This isn't just merged at export time; it's merged at codebase
  load time so that it can be part of the API analysis.

* Support for an updated signature file format (which is described in FORMAT.md)

  * Address errors in the doclava1 format, which for example was missing
    annotation class instance methods.

  * Improve the signature format such that it for example labels enums "enum"
    instead of "abstract class extends java.lang.Enum", labels annotations
    "@interface" instead of "abstract class extends java.lang.Annotation",
    sorts modifiers in the canonical modifier order, uses "extends" instead of
    "implements" for the superclass of an interface, and applies many other
    similar tweaks outlined in the `Compatibility` class. (Metalava also allows
    (and ignores) block comments in the signature files.)

  * Add support for writing (and reading) annotations into the signature
    files. This is vital now that some of these annotations become part of the
    API contract (in particular nullness contracts, as well as parameter names
    and default values).

  * Support for a "compact" nullness format -- one based on Kotlin's
    syntax. Since the goal is to have **all** API elements explicitly state
    their nullness contract, the signature files would very quickly become
    bloated with @NonNull and @Nullable annotations everywhere. So instead, the
    signature format now uses a suffix of `?` for nullable, `!` for not yet
    annotated, and nothing for non-null.

    Instead of

        method public java.lang.Double convert0(java.lang.Float);
        method @Nullable public java.lang.Double convert1(@NonNull java.lang.Float);

    we have

        method public java.lang.Double! convert0(java.lang.Float!);
        method public java.lang.Double? convert1(java.lang.Float);

  * Other compactness improvements: Skip packages in some cases on export and
    reinsert them on import. Specifically, drop "java.lang." from package
    names such that you have

        method public void onUpdate(int, String);

    instead of

        method public void onUpdate(int, java.lang.String);

    Similarly, annotations (the ones considered part of the API; unknown
    annotations are not included in signature files) use just the simple name
    instead of the full package name, e.g. `@UiThread` instead of
    `@android.annotation.UiThread`.

  * Misc documentation handling; for example, it attempts to fix sentences that
    javadoc will mistreat, such as sentences that "end" with "e.g. ". It also
    looks for various common typos and fixes those; here's a sample error
    message from running metalava on master: Enhancing docs:

        frameworks/base/core/java/android/content/res/AssetManager.java:166: error: Replaced Kitkat with KitKat in documentation for Method android.content.res.AssetManager.getLocales() [Typo]
        frameworks/base/core/java/android/print/PrinterCapabilitiesInfo.java:122: error: Replaced Kitkat with KitKat in documentation for Method android.print.PrinterCapabilitiesInfo.Builder.setColorModes(int, int) [Typo]

* Built-in support for injecting new annotations for use by the Kotlin compiler:
  not just nullness annotations found in the source code and annotations merged
  in from external sources, but also inferring whether nullness annotations have
  recently changed and if so marking them as @Migrate (which lets the Kotlin
  compiler treat errors in the user code as warnings instead of errors).

* Support for generating documentation into the stubs files (so we can run
  javadoc or [Dokka](https://github.com/Kotlin/dokka) on the stubs files instead
  of the source code). This means that the documentation tool itself does not
  need to be able to figure out which parts of the source code are included in
  the API and which are implementation; it is simply handed the filtered API
  stub sources that include documentation.

* Support for parsing Kotlin files. API files can now be implemented in Kotlin
  as well and metalava will parse and extract API information from them just as
  is done for Java files.

* Like doclava1, metalava can diff two APIs and warn about API compatibility
  problems such as removing API elements. Metalava adds new warnings around
  nullness, such as attempting to change a nullness contract incompatibly
  (e.g. you can change a parameter from non null to nullable for final classes,
  but not vice versa). It also lets you diff directly on a source tree; it does
  not require you to create two signature files to diff.

* Consistent stubs: In doclava1, the code which iterated over the API and
  generated the signature files and generated the stubs had diverged, so there
  was some inconsistency. In metalava the stub files contain **exactly** the
  same signatures as in the signature files.

  (This turned out to be incredibly important; this revealed for example that
  StringBuilder.setLength(int) was missing from the API signatures since it is a
  public method inherited from a package private superclass, which the API
  extraction code in doclava1 missed, but accidentally included in the SDK
  anyway since it packages package private classes. Metalava strictly applies
  the exact same API as is listed in the signature files, and once this was
  hooked up to the build it immediately became apparent that it was missing
  important methods that should really be part of the API.)

* API Lint: Metalava can optionally (with --api-lint) run a series of additional
  checks on the public API in the codebase and flag issues that are discouraged
  or forbidden by the Android API Council; there are currently around 80 checks.
  Some of these take advantage of looking at the source code, which wasn't
  possible with the signature-file based Python version; for example, it looks
  inside method bodies to see if you're synchronizing on this or the current
  class, which is forbidden.

* Baselines: Metalava can report all of its issues into a "baseline" file, which
  records the current set of issues. From that point forward, when metalava
  finds a problem, it will only be reported if it is not already in the
  baseline. This lets you enforce new checks going forward without having to
  fix all existing violations. Periodically, as older issues are fixed, you can
  regenerate the baseline. For issues with some false positives, such as API
  Lint, being able to check in the set of accepted or verified false positives
  is quite important.

* Metalava can generate reports about nullness annotation coverage (which helps
  target efforts since we plan to annotate the entire API). First, it can
  generate a raw count:

      Nullness Annotation Coverage Statistics:
      1279 out of 46900 methods were annotated (2%)
      2 out of 21683 fields were annotated (0%)
      2770 out of 47492 parameters were annotated (5%)

  More importantly, you can also point it to some existing compiled applications
  (.class or .jar files) and it will then measure the annotation coverage of the
  APIs used by those applications. This lets us focus our annotation efforts on
  the most important APIs that are currently used by a corpus of apps. For
  example, run the analysis on the current version of the framework and point
  it to the [Plaid](https://github.com/nickbutcher/plaid) app's compiled output
  with

      ... --annotation-coverage-of ~/plaid/app/build/intermediates/classes/debug

  This produces the following output:

      324 methods and fields were missing nullness annotations out of 650 total
      API references. API nullness coverage is 50%

  ```
  | Qualified Class Name                                         |      Usage Count |
  |--------------------------------------------------------------|-----------------:|
  | android.os.Parcel                                            |              146 |
  | android.view.View                                            |              119 |
  | android.view.ViewPropertyAnimator                            |              114 |
  | android.content.Intent                                       |              104 |
  | android.graphics.Rect                                        |               79 |
  | android.content.Context                                      |               61 |
  | android.widget.TextView                                      |               53 |
  | android.transition.TransitionValues                          |               49 |
  | android.animation.Animator                                   |               34 |
  | android.app.ActivityOptions                                  |               34 |
  | android.view.LayoutInflater                                  |               31 |
  | android.app.Activity                                         |               28 |
  | android.content.SharedPreferences                            |               26 |
  | android.content.SharedPreferences.Editor                     |               26 |
  | android.text.SpannableStringBuilder                          |               23 |
  | android.view.ViewGroup.MarginLayoutParams                    |               21 |
  | ... (99 more items                                           |                  |
  ```

  Top referenced un-annotated members:

  ```
  | Member                                                       |      Usage Count |
  |--------------------------------------------------------------|-----------------:|
  | Parcel.readString()                                          |               62 |
  | Parcel.writeString(String)                                   |               62 |
  | TextView.setText(CharSequence)                               |               34 |
  | TransitionValues.values                                      |               28 |
  | View.getContext()                                            |               28 |
  | ViewPropertyAnimator.setDuration(long)                       |               26 |
  | ViewPropertyAnimator.setInterpolator(android.animation.Ti... |               26 |
  | LayoutInflater.inflate(int, android.view.ViewGroup, boole... |               23 |
  | Rect.left                                                    |               22 |
  | Rect.top                                                     |               22 |
  | Intent.Intent(android.content.Context, Class<?>)             |               21 |
  | Rect.bottom                                                  |               21 |
  | TransitionValues.view                                        |               21 |
  | VERSION.SDK_INT                                              |               18 |
  | Context.getResources()                                       |               18 |
  | EditText.getText()                                           |               18 |
  | ... (309 more items                                          |                  |
  ```

  From this it's clear that it would be useful to start annotating
  android.os.Parcel and android.view.View, for example, where there are
  unannotated APIs that are frequently used, at least by this app.

* Built on top of a full, type-resolved AST. Doclava1 was integrated with
  javadoc, which meant that most of the source tree was opaque. Therefore, as
  just one example, the code which generated documentation for typedef constants
  had to require the constants to all share a single prefix it could look
  for. However, in metalava, annotation references are available at the AST
  level, so it can resolve references and map them back to the original field
  references and include those directly.

* Support for extracting annotations. Metalava can also generate the external
  annotation files needed by Studio and lint in Gradle, which capture the
  typedefs (@IntDef and @StringDef classes) in the source code. Previously
  this was generated manually via the development/tools/extract code. This also
  merges in manually curated data; some of this is in the manual/ folder in this
  project.

* Support for extracting API levels (api-versions.xml). This was generated by
  separate code (tools/base/misc/api-generator), invoked during the build. This
  functionality is now rolled into metalava, which has one very important
  attribute: metalava will use this information when recording API levels for
  API usage. (Prior to this, this was based on signature file parsing in
  doclava, which sometimes generated incorrect results. Metalava uses the
  android.jar files themselves to ensure that it computes the exact available
  SDK data for each API level.)

* Misc other features. For example, if you use the @VisibleForTesting annotation
  from the support library, where you can express the intended visibility if the
  method had not required visibility for testing, then metalava will treat that
  method using the intended visibility instead when generating signature files
  and stubs.

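To make the compact signature conventions from the feature list above more
concrete (the `?` and `!` nullness suffixes and the dropped `java.lang.`
prefix), here is a minimal sketch. It is not metalava's implementation:
`Nullness` and `renderType` are hypothetical names, and the rule of only
shortening classes directly in `java.lang` is an assumption made for this
example.

```kotlin
// Hypothetical sketch; these names are not part of metalava's API.
enum class Nullness { NON_NULL, NULLABLE, UNKNOWN }

/** Renders a fully qualified type name in the compact signature style. */
fun renderType(qualifiedName: String, nullness: Nullness): String {
    // Assumption: only classes directly in java.lang lose their package,
    // not classes in subpackages such as java.lang.reflect.
    val prefix = "java.lang."
    val rest = qualifiedName.removePrefix(prefix)
    val shortName =
        if (qualifiedName.startsWith(prefix) && !rest.contains('.')) rest else qualifiedName
    val suffix = when (nullness) {
        Nullness.NON_NULL -> ""   // non-null is the default, so no suffix
        Nullness.NULLABLE -> "?"  // explicitly nullable
        Nullness.UNKNOWN -> "!"   // not yet annotated
    }
    return shortName + suffix
}
```

With these assumptions, `renderType("java.lang.Double", Nullness.NULLABLE)`
yields `Double?`, and an unannotated `java.lang.Float` becomes `Float!`.
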
## Architecture & Implementation

Metalava is implemented on top of IntelliJ parsing APIs (PSI and UAST). However,
these are hidden behind a "model": an abstraction layer which only exposes high
level concepts like packages, classes and inner classes, methods, fields, and
modifier lists (including annotations).

This is done for multiple reasons:

(1) It allows us to have multiple "back-ends": for example, metalava can read in
    a model not just from parsing source code, but from reading older SDK
    android.jar files (e.g. backed by bytecode) or reading previous signature
    files. Reading in multiple versions of an API lets metalava perform
    "diffing", such as warning if an API is changing in an incompatible way. It
    can also generate signature files in the new format (including data that was
    missing in older signature files, such as annotation methods) without having
    to parse older source code which may no longer be easy to parse.

(2) There's a lot of logic for deciding whether code found in the source tree
    should be included in the API. With the model approach we can build up an
    API and for example mark a subset of its methods as included. By having a
    separate hierarchy we can easily perform this work once and pass around our
    filtered model instead of passing around PsiClass and PsiMethod instances
    and having to keep the filtered data separately and remembering to always
    consult the filter, not the PSI elements directly.

The basic API element class is "Item". (In doclava1 this was called a
"DocInfo".) There are several sub interfaces of Item: PackageItem, ClassItem,
MemberItem, MethodItem, FieldItem, ParameterItem, etc. And then there are
several implementation hierarchies: One is PSI based, where you point metalava
to a source tree or a .jar file, and it constructs Items built on top of PSI:
PsiPackageItem, PsiClassItem, PsiMethodItem, etc. Another is textual, based on
signature files: TextPackageItem, TextClassItem, and so on.

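The value of hiding the back-ends behind one model can be illustrated with a
heavily simplified sketch. Everything here is hypothetical: `SketchClassItem`,
`ParsedClassItem`, `TextBackedClassItem` and the one-line text format are
stand-ins invented for this example, not metalava's real interfaces or its
signature format.

```kotlin
// Hypothetical, heavily simplified model; metalava's real interfaces are richer.
interface SketchClassItem {
    val qualifiedName: String
    val methodNames: List<String>
}

// Stand-in for an item backed by parsed source or bytecode.
class ParsedClassItem(
    override val qualifiedName: String,
    override val methodNames: List<String>
) : SketchClassItem

// Stand-in for an item read from text, using a made-up one-line format
// such as "test.pkg.Foo: bar, baz" (not the real signature format).
class TextBackedClassItem(line: String) : SketchClassItem {
    override val qualifiedName: String = line.substringBefore(':').trim()
    override val methodNames: List<String> =
        line.substringAfter(':', "").split(',').map { it.trim() }.filter { it.isNotEmpty() }
}

// Written against the model only, so it works no matter which back-end
// produced the two items.
fun removedMethods(old: SketchClassItem, new: SketchClassItem): List<String> =
    old.methodNames.filter { it !in new.methodNames }
```

Because `removedMethods` only sees the interface, the old side can come from
text while the new side comes from a parsed class, which mirrors the mixed
sources described above.
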
The "Codebase" class captures a complete API snapshot (including classes that
are hidden, which is why it's called a "Codebase" rather than an "API").

There are methods to load codebases: from source folders, from a .jar file, or
from a signature file. That's how API diffing is performed: you load two
codebases (from whatever source you want, typically a previous API signature
file and the current set of source folders), and then you "diff" the two.

There are several key helpers that help with the implementation, detailed next.

### Visiting Items

First, metalava provides an ItemVisitor. This lets you visit the API easily.
For example, here's how you can visit every class:

    codebase.accept(object : ItemVisitor() {
        override fun visitClass(cls: ClassItem) {
            // code operating on the class here
        }
    })

Similarly you can visit all items (regardless of type) by overriding
`visitItem`, or visit only specific kinds of items by overriding
`visitPackage`, `visitClass`, `visitMethod`, etc.

There is also an `ApiVisitor`. This is a subclass of `ItemVisitor` that limits
itself to visiting code elements that are part of the API.

This is how for example the SignatureWriter and the StubWriter are both
implemented: they simply extend `ApiVisitor`, which means they'll only export
the API items in the codebase, and then in each relevant method they emit the
signature or stub data:

    class SignatureWriter(
            private val writer: PrintWriter,
            private val generateDefaultConstructors: Boolean,
            private val filter: (Item) -> Boolean) : ApiVisitor(
            visitConstructorsAsMethods = false) {

        ....

        override fun visitConstructor(constructor: ConstructorItem) {
            writer.print(" ctor ")
            writeModifiers(constructor)
            writer.print(constructor.containingClass().fullName())
            writeParameterList(constructor)
            writeThrowsList(constructor)
            writer.print(";\n")
        }

        ....

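The relationship between a visitor that sees everything and one restricted to
the API can be sketched in a self-contained way. This is only an illustration:
the `Sketch*` classes are hypothetical stand-ins, and "part of the API" is
approximated here as "not hidden", which is far simpler than the real rules.

```kotlin
// Hypothetical stand-ins for illustration; not metalava's real classes.
class SketchMethod(val name: String, val hidden: Boolean = false)
class SketchClass(
    val name: String,
    val hidden: Boolean = false,
    val methods: List<SketchMethod> = emptyList()
)

// Visits every class and method, in the spirit of the ItemVisitor above.
open class SketchItemVisitor {
    open fun skip(cls: SketchClass): Boolean = false
    open fun skip(method: SketchMethod): Boolean = false
    open fun visitClass(cls: SketchClass) {}
    open fun visitMethod(method: SketchMethod) {}

    fun visit(classes: List<SketchClass>) {
        for (cls in classes) {
            if (skip(cls)) continue // a skipped class also skips its members
            visitClass(cls)
            for (method in cls.methods) {
                if (!skip(method)) visitMethod(method)
            }
        }
    }
}

// Restricts the traversal to API elements (assumption: API == not hidden).
open class SketchApiVisitor : SketchItemVisitor() {
    override fun skip(cls: SketchClass): Boolean = cls.hidden
    override fun skip(method: SketchMethod): Boolean = method.hidden
}

// A writer-like client: it only overrides the callback it cares about and
// inherits the API filtering from the base class.
fun apiMethodNames(classes: List<SketchClass>): List<String> {
    val names = mutableListOf<String>()
    val visitor = object : SketchApiVisitor() {
        override fun visitMethod(method: SketchMethod) {
            names += method.name
        }
    }
    visitor.visit(classes)
    return names
}
```

The design point is that the filtering lives in one place: a client that
extends the API-restricted visitor cannot accidentally emit hidden items.
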
### Visiting Types

There is a `TypeVisitor` similar to `ItemVisitor` which you can use to visit all
types in the codebase.

When computing the API, all types that are included in the API should be
included (e.g. if `List<Foo>` is part of the API then `Foo` must be too). This
is easy to do with the `TypeVisitor`.

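The `List<Foo>` rule above amounts to a closure check over referenced types. A
rough sketch follows, assuming types are available as plain strings; the
function names are hypothetical, and the naive splitting does not handle
wildcards or fully qualified names the way a real type visitor would.

```kotlin
// Hypothetical sketch: types are plain strings such as "List<Foo>".
// Splits a type string into the simple type names it mentions.
fun referencedTypeNames(type: String): Set<String> =
    type.split('<', '>', ',', ' ')
        .filter { it.isNotEmpty() }
        .toSet()

/** Returns the referenced type names that are not in the set of API classes. */
fun missingFromApi(signatureTypes: List<String>, apiClasses: Set<String>): Set<String> =
    signatureTypes.flatMap { referencedTypeNames(it) }
        .filter { it !in apiClasses }
        .toSet()
```

For example, if `List<Foo>` appears in a signature while only `List` is in the
API set, `missingFromApi` reports `Foo` as a type that still needs to be
included.
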
### Diffing Codebases

Another visitor which helps with implementation is the ComparisonVisitor:

    open class ComparisonVisitor {
        open fun compare(old: Item, new: Item) {}
        open fun added(item: Item) {}
        open fun removed(item: Item) {}

        open fun compare(old: PackageItem, new: PackageItem) { }
        open fun compare(old: ClassItem, new: ClassItem) { }
        open fun compare(old: MethodItem, new: MethodItem) { }
        open fun compare(old: FieldItem, new: FieldItem) { }
        open fun compare(old: ParameterItem, new: ParameterItem) { }

        open fun added(item: PackageItem) { }
        open fun added(item: ClassItem) { }
        open fun added(item: MethodItem) { }
        open fun added(item: FieldItem) { }
        open fun added(item: ParameterItem) { }

        open fun removed(item: PackageItem) { }
        open fun removed(item: ClassItem) { }
        open fun removed(item: MethodItem) { }
        open fun removed(item: FieldItem) { }
        open fun removed(item: ParameterItem) { }
    }

This makes it easy to perform API comparison operations.

For example, metalava has a feature to mark "newly annotated" nullness
annotations as migrated. To do this, it just extends `ComparisonVisitor`,
overrides the `compare(old: Item, new: Item)` method, and checks whether the old
item has no nullness annotations and the new one does, and if so, also marks the
new annotations as @Migrate.

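The "newly annotated" check just described might look roughly like the sketch
below. It is self-contained and therefore uses hypothetical stand-ins
(`SketchItem`, `SketchComparisonVisitor`, `MigrateMarker`, `diff`) rather than
the real `Item` and `ComparisonVisitor`; annotations are modeled as plain
strings, and the migration marker is simply added to the item's annotation set.

```kotlin
// Hypothetical stand-ins; metalava's real Item and ComparisonVisitor are richer.
class SketchItem(val name: String, val annotations: MutableSet<String> = mutableSetOf()) {
    fun hasNullness(): Boolean = "@NonNull" in annotations || "@Nullable" in annotations
}

open class SketchComparisonVisitor {
    open fun compare(old: SketchItem, new: SketchItem) {}
}

// Marks items whose nullness annotation is new relative to the old codebase.
class MigrateMarker : SketchComparisonVisitor() {
    override fun compare(old: SketchItem, new: SketchItem) {
        if (!old.hasNullness() && new.hasNullness()) {
            new.annotations += "@Migrate"
        }
    }
}

// Pairs items by name and hands each matched pair to the visitor.
// (A stand-in for a real codebase diff, which also reports added/removed.)
fun diff(oldItems: List<SketchItem>, newItems: List<SketchItem>, visitor: SketchComparisonVisitor) {
    val oldByName = oldItems.associateBy { it.name }
    for (item in newItems) {
        oldByName[item.name]?.let { visitor.compare(it, item) }
    }
}
```

Items that were already annotated in the old codebase are left untouched, so
only contracts that are new to callers get the softer migration treatment.
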
Similarly, the API Check can simply override

    override fun removed(item: Item) {
        reporter.report(error, item, "Removing ${Item.describe(item)} is not allowed")
    }

to flag all API elements that have been removed as invalid (since you cannot
remove API). The real check is slightly more complicated: it looks into the
hierarchy to see if there still is an inherited method with the same signature,
in which case the deletion is allowed.

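The inherited-method exception can be sketched as a walk up the superclass
chain. This is an assumption-laden illustration, not the real check:
`SketchApiClass` and `checkRemovals` are hypothetical, methods are modeled as
signature strings, and interfaces are ignored.

```kotlin
// Hypothetical sketch of the "removed but still inherited" rule.
class SketchApiClass(
    val name: String,
    val methodSignatures: Set<String>,
    val superClass: SketchApiClass? = null
) {
    // True if this class or any superclass still declares the signature.
    fun findsMethod(signature: String): Boolean =
        generateSequence(this) { it.superClass }.any { signature in it.methodSignatures }
}

/**
 * Returns an error message for each method declared in [old] that [new] no
 * longer offers, either directly or through inheritance.
 */
fun checkRemovals(old: SketchApiClass, new: SketchApiClass): List<String> =
    old.methodSignatures
        .filter { !new.findsMethod(it) }
        .map { "Removing method ${old.name}.$it is not allowed" }
```

In this sketch a method that moved into a superclass is not reported, because
callers can still resolve it, while a method that disappeared from the whole
chain is.
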
### Documentation Generation

As mentioned above, metalava generates documentation directly into the stubs
files, which can then be processed by Dokka and Javadoc to generate the same
docs as before.

Doclava1 was integrated with javadoc directly, so the way it generated metadata
docs (such as documenting permissions, ranges and typedefs from annotations) was
to insert auxiliary tags (`@range`, `@permission`, etc) and then this would get
converted into English docs later via `macros_override.cs`.

This is not how metalava does it; it generates the English documentation
directly. This was not just convenient for the implementation (since metalava
does not use javadoc data structures to pass maps like the arguments for the
typedef macro), but should also help Dokka -- and arguably the Kotlin code which
generates the documentation is easier to reason about and to update when it's
handling loop conditionals. (As a result I for example improved some of the
grammar, e.g. when it's listing a number of possible constants the conjunction
is usually "or", but if it's a flag, the sentence begins with "a combination
of" and then the conjunction at the end should be "and".)

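The "or" versus "and" rule is the kind of logic that is straightforward to
express in Kotlin. The sketch below is illustrative only: `describeConstants`
is a hypothetical function, and the surrounding sentence wording is made up
for this example rather than taken from metalava's generated docs.

```kotlin
// Hypothetical sketch of generating an English sentence for typedef constants.
fun describeConstants(constants: List<String>, isFlag: Boolean): String {
    require(constants.isNotEmpty()) { "expected at least one constant" }
    // Flags can be combined, so they are joined with "and"; otherwise
    // exactly one constant applies, so they are joined with "or".
    val conjunction = if (isFlag) "and" else "or"
    val list = when (constants.size) {
        1 -> constants[0]
        2 -> "${constants[0]} $conjunction ${constants[1]}"
        else -> constants.dropLast(1).joinToString(", ") + ", $conjunction ${constants.last()}"
    }
    return if (isFlag) "Value is a combination of $list" else "Value is $list"
}
```

For three plain constants `A`, `B` and `C` this produces `Value is A, B, or C`,
while the same constants treated as flags produce
`Value is a combination of A, B, and C`.
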