# Metalava

Metalava is a metadata generator intended for JVM-type projects. Its main
users are the Android Platform and AndroidX libraries, but the tool also works
on non-Android libraries.

Metalava has many features related to API management. Some of the most commonly
used ones are:

* Extracting the API (into signature text files and into stub API files, which
  in turn get compiled into android.jar, the Android SDK library) and, more
  importantly, hiding code intended to be implementation only, driven by
  javadoc comments like @hide, @doconly, @removed, etc., as well as various
  annotations.

* Extracting source-level annotations into external annotation files (such as
  the typedef annotations, which cannot be stored in the SDK as .class-level
  annotations) to ship alongside the Android SDK for use by Android Lint.

* Diffing versions of the API and determining whether a newer version is
  compatible with the older version. (See [COMPATIBILITY.md](COMPATIBILITY.md).)

## Building and running

To download the code and any dependencies required for building, see
[DOWNLOADING.md](DOWNLOADING.md).

To build:

    $ cd tools/metalava
    $ ./gradlew

This puts build artifacts in `../../out/metalava/`.

There are two ways to run the metalava executable:

### Through Gradle

To list all the options:

    $ ./gradlew run

To run it with specific arguments:

    $ ./gradlew run --args="--api path/to/api.txt"

### Through distribution artifact

First build it with:

    $ ./gradlew installDist

Then run it with:

    $ ../../out/metalava/install/metalava/bin/metalava
                    _        _
     _ __ ___   ___| |_ __ _| | __ ___   ____ _
    | '_ ` _ \ / _ \ __/ _` | |/ _` \ \ / / _` |
    | | | | | |  __/ || (_| | | (_| |\ V / (_| |
    |_| |_| |_|\___|\__\__,_|_|\__,_| \_/ \__,_|

    metalava extracts metadata from source code to generate artifacts such as the
    signature files, the SDK stub files, external annotations etc.

    Usage: metalava <flags>

    Flags:

    --help                                This message.
    --quiet                               Only include vital output
    --verbose                             Include extra diagnostic output

    ...

(*output truncated*)

## Features

* Ability to read in an existing android.jar file instead of reading from
  source, which means we can regenerate signature files etc. for older versions
  according to new formats (e.g. to fix past errors in doclava, such as
  annotation instance methods which were accidentally not included).

* Ability to merge in data (annotations etc.) from external sources, such as
  IntelliJ external annotations data as well as signature files containing
  annotations. This isn't just merged at export time; it's merged at codebase
  load time so that it can be part of the API analysis.

* Support for an updated signature file format (which is described in
  [FORMAT.md](FORMAT.md)):

  * Address errors in the doclava1 format, which for example was missing
    annotation class instance methods.

  * Improve the signature format such that it for example labels enums "enum"
    instead of "abstract class extends java.lang.Enum", labels annotations
    "@interface" instead of "abstract class extends java.lang.Annotation", sorts
    modifiers in the canonical modifier order, uses "extends" instead of
    "implements" for the superclass of an interface, and makes many other
    similar tweaks outlined in the `Compatibility` class. (Metalava also allows
    (and ignores) block comments in the signature files.)

  * Add support for writing (and reading) annotations into the signature
    files. This is vital now that some of these annotations become part of the
    API contract (in particular nullness contracts, as well as parameter names
    and default values).

  * Support for a "compact" nullness format, one based on Kotlin's
    syntax. Since the goal is to have **all** API elements explicitly state
    their nullness contract, the signature files would very quickly become
    bloated with @NonNull and @Nullable annotations everywhere. So instead, the
    signature format now uses a suffix of `?` for nullable, `!` for not yet
    annotated, and nothing for non-null.

    Instead of

        method public java.lang.Double convert0(java.lang.Float);
        method @Nullable public java.lang.Double convert1(@NonNull java.lang.Float);

    we have

        method public java.lang.Double! convert0(java.lang.Float!);
        method public java.lang.Double? convert1(java.lang.Float);

  * Other compactness improvements: skip packages in some cases on export and
    reinsert them on import. Specifically, drop the "java.lang." prefix from
    type names so that you have

        method public void onUpdate(int, String);

    instead of

        method public void onUpdate(int, java.lang.String);

    Similarly, annotations (the ones considered part of the API; unknown
    annotations are not included in signature files) use just the simple name
    instead of the full package name, e.g. `@UiThread` instead of
    `@android.annotation.UiThread`.

  * Misc documentation handling; for example, it attempts to fix sentences that
    javadoc will mistreat, such as sentences that "end" with "e.g. ". It also
    looks for various common typos and fixes those; here are sample error
    messages from running metalava on master: Enhancing docs:

        frameworks/base/core/java/android/content/res/AssetManager.java:166: error: Replaced Kitkat with KitKat in documentation for Method android.content.res.AssetManager.getLocales() [Typo]
        frameworks/base/core/java/android/print/PrinterCapabilitiesInfo.java:122: error: Replaced Kitkat with KitKat in documentation for Method android.print.PrinterCapabilitiesInfo.Builder.setColorModes(int, int) [Typo]

* Built-in support for injecting new annotations for use by the Kotlin compiler:
  not just nullness annotations found in the source code and annotations merged
  in from external sources, but also inferring whether nullness annotations have
  recently changed and if so marking them as @Migrate (which lets the Kotlin
  compiler treat errors in the user code as warnings instead of errors).

* Support for generating documentation into the stubs files (so we can run
  javadoc or [Dokka](https://github.com/Kotlin/dokka) on the stubs files instead
  of the source code). This means that the documentation tool itself does not
  need to be able to figure out which parts of the source code are included in
  the API and which are implementation; it is simply handed the filtered API
  stub sources that include documentation.

* Support for parsing Kotlin files. API files can now be implemented in Kotlin
  as well, and metalava will parse and extract API information from them just as
  is done for Java files.

* Like doclava1, metalava can diff two APIs and warn about API compatibility
  problems such as removing API elements. Metalava adds new warnings around
  nullness, such as attempting to change a nullness contract incompatibly
  (e.g. you can change a parameter from non-null to nullable for final classes,
  but not vice versa). It also lets you diff directly on a source tree; it does
  not require you to create two signature files to diff.

* Consistent stubs: In doclava1, the code which iterated over the API and
  generated the signature files had diverged from the code which generated the
  stubs, so there was some inconsistency. In metalava the stub files contain
  **exactly** the same signatures as the signature files.

  (This turned out to be incredibly important; it revealed for example that
  StringBuilder.setLength(int) was missing from the API signatures since it is a
  public method inherited from a package-private super class, which the API
  extraction code in doclava1 missed, but which was accidentally included in
  the SDK anyway since package-private classes are packaged into it. Metalava
  strictly applies the exact same API as is listed in the signature files, and
  once this was hooked up to the build it immediately became apparent that it
  was missing important methods that should really be part of the API.)

* API Lint: Metalava can optionally (with --api-lint) run a series of additional
  checks on the public API in the codebase and flag issues that are discouraged
  or forbidden by the Android API Council; there are currently around 80 checks.
  Some of these take advantage of looking at the source code, which wasn't
  possible with the signature-file-based Python version; for example, it looks
  inside method bodies to see if you're synchronizing on `this` or the current
  class, which is forbidden.

* Baselines: Metalava can report all of its issues into a "baseline" file, which
  records the current set of issues. From that point forward, when metalava
  finds a problem, it will only be reported if it is not already in the
  baseline. This lets you enforce new issues going forward without having to
  fix all existing violations. Periodically, as older issues are fixed, you can
  regenerate the baseline. For issues with some false positives, such as API
  Lint, being able to check in the set of accepted or verified false positives
  is quite important.

* Metalava can generate reports about nullness annotation coverage (which helps
  target efforts since we plan to annotate the entire API). First, it can
  generate a raw count:

      Nullness Annotation Coverage Statistics:
      1279 out of 46900 methods were annotated (2%)
      2 out of 21683 fields were annotated (0%)
      2770 out of 47492 parameters were annotated (5%)

  More importantly, you can also point it to some existing compiled applications
  (.class or .jar files) and it will then measure the annotation coverage of the
  APIs used by those applications. This lets us focus our annotation efforts on
  the most important APIs, the ones currently used by a corpus of apps. For
  example, you can run the analysis on the current version of the framework and
  point it to the [Plaid](https://github.com/nickbutcher/plaid) app's compiled
  output with

      ... --annotation-coverage-of ~/plaid/app/build/intermediates/classes/debug

  This produces the following output:

      324 methods and fields were missing nullness annotations out of 650 total
      API references. API nullness coverage is 50%

  ```
  | Qualified Class Name                                         |      Usage Count |
  |--------------------------------------------------------------|-----------------:|
  | android.os.Parcel                                            |              146 |
  | android.view.View                                            |              119 |
  | android.view.ViewPropertyAnimator                            |              114 |
  | android.content.Intent                                       |              104 |
  | android.graphics.Rect                                        |               79 |
  | android.content.Context                                      |               61 |
  | android.widget.TextView                                      |               53 |
  | android.transition.TransitionValues                          |               49 |
  | android.animation.Animator                                   |               34 |
  | android.app.ActivityOptions                                  |               34 |
  | android.view.LayoutInflater                                  |               31 |
  | android.app.Activity                                         |               28 |
  | android.content.SharedPreferences                            |               26 |
  | android.content.SharedPreferences.Editor                     |               26 |
  | android.text.SpannableStringBuilder                          |               23 |
  | android.view.ViewGroup.MarginLayoutParams                    |               21 |
  | ... (99 more items                                           |                  |
  ```

  Top referenced un-annotated members:

  ```
  | Member                                                       |      Usage Count |
  |--------------------------------------------------------------|-----------------:|
  | Parcel.readString()                                          |               62 |
  | Parcel.writeString(String)                                   |               62 |
  | TextView.setText(CharSequence)                               |               34 |
  | TransitionValues.values                                      |               28 |
  | View.getContext()                                            |               28 |
  | ViewPropertyAnimator.setDuration(long)                       |               26 |
  | ViewPropertyAnimator.setInterpolator(android.animation.Ti... |               26 |
  | LayoutInflater.inflate(int, android.view.ViewGroup, boole... |               23 |
  | Rect.left                                                    |               22 |
  | Rect.top                                                     |               22 |
  | Intent.Intent(android.content.Context, Class<?>)             |               21 |
  | Rect.bottom                                                  |               21 |
  | TransitionValues.view                                        |               21 |
  | VERSION.SDK_INT                                              |               18 |
  | Context.getResources()                                       |               18 |
  | EditText.getText()                                           |               18 |
  | ... (309 more items                                          |                  |
  ```

  From this it's clear that it would be useful to start annotating
  android.os.Parcel and android.view.View, for example, where there are
  unannotated APIs that are frequently used, at least by this app.

* Built on top of a full, type-resolved AST. Doclava1 was integrated with
  javadoc, which meant that most of the source tree was opaque. Therefore, as
  just one example, the code which generated documentation for typedef constants
  had to require the constants to all share a single prefix it could look
  for. However, in metalava, annotation references are available at the AST
  level, so it can resolve references and map them back to the original field
  references and include those directly.

* Support for extracting annotations. Metalava can also generate the external
  annotation files needed by Studio and lint in Gradle, which capture the
  typedefs (@IntDef and @StringDef classes) in the source code. Previously,
  these were generated manually via the development/tools/extract code. This
  also merges in manually curated data; some of this is in the manual/ folder in
  this project.

* Support for extracting API levels (api-versions.xml). This was generated by
  separate code (tools/base/misc/api-generator), invoked during the build. This
  functionality is now rolled into metalava, which has one very important
  attribute: metalava will use this information when recording API levels for
  API usage. (Prior to this, this was based on signature file parsing in
  doclava, which sometimes generated incorrect results. Metalava uses the
  android.jar files themselves to ensure that it computes the exact available
  SDK data for each API level.)

* Misc other features. For example, if you use the @VisibleForTesting annotation
  from the support library, where you can express the visibility the method
  would have had if it did not need to be visible for testing, then metalava
  will treat that method as having the intended visibility when generating
  signature files and stubs.
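
The compact nullness suffixes and the "java.lang." shortening described in the signature format bullets above can be sketched as follows. This is a minimal illustration, not metalava's implementation: the `Nullness` enum and both function names are hypothetical, and the sketch applies both shortenings at once even though the examples above show them separately.

```kotlin
// Hypothetical stand-in for a type's nullness state; not metalava's real model.
enum class Nullness { NON_NULL, NULLABLE, UNKNOWN }

// Drops a leading "java.lang." only when a simple class name follows, so
// "java.lang.String" becomes "String" but "java.lang.reflect.Method" is kept.
fun stripJavaLang(type: String): String {
    val prefix = "java.lang."
    val rest = type.removePrefix(prefix)
    return if (type.startsWith(prefix) && '.' !in rest) rest else type
}

// Appends the suffix described above: "?" for nullable, "!" for not yet
// annotated, and nothing for non-null.
fun compactType(type: String, nullness: Nullness): String {
    val suffix = when (nullness) {
        Nullness.NULLABLE -> "?"
        Nullness.UNKNOWN -> "!"
        Nullness.NON_NULL -> ""
    }
    return stripJavaLang(type) + suffix
}

fun main() {
    println(compactType("java.lang.Double", Nullness.NULLABLE)) // Double?
    println(compactType("java.lang.Float", Nullness.UNKNOWN))   // Float!
    println(compactType("java.lang.String", Nullness.NON_NULL)) // String
}
```

The sketch ignores generic type arguments and primitives; the real format rules are described in FORMAT.md.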

## Architecture & Implementation

Metalava is implemented on top of IntelliJ parsing APIs (PSI and UAST). However,
these are hidden behind a "model": an abstraction layer which only exposes
high-level concepts like packages, classes and inner classes, methods, fields,
and modifier lists (including annotations).

This is done for multiple reasons:

(1) It allows us to have multiple "back-ends": for example, metalava can read in
    a model not just from parsing source code, but from reading older SDK
    android.jar files (e.g. backed by bytecode) or reading previous signature
    files. Reading in multiple versions of an API lets metalava perform
    "diffing", such as warning if an API is changing in an incompatible way. It
    can also generate signature files in the new format (including data that was
    missing in older signature files, such as annotation methods) without having
    to parse older source code which may no longer be easy to parse.

(2) There's a lot of logic for deciding whether code found in the source tree
    should be included in the API. With the model approach we can build up an
    API and for example mark a subset of its methods as included. By having a
    separate hierarchy we can easily perform this work once and pass around our
    filtered model instead of passing around PsiClass and PsiMethod instances,
    keeping the filtered data separately, and remembering to always consult the
    filter rather than the PSI elements directly.

The basic API element class is "Item". (In doclava1 this was called a
"DocInfo".) There are several sub-interfaces of Item: PackageItem, ClassItem,
MemberItem, MethodItem, FieldItem, ParameterItem, etc. And then there are
several implementation hierarchies: One is PSI-based, where you point metalava
to a source tree or a .jar file, and it constructs Items built on top of PSI:
PsiPackageItem, PsiClassItem, PsiMethodItem, etc. Another is textual, based on
signature files: TextPackageItem, TextClassItem, and so on.

The "Codebase" class captures a complete API snapshot (including classes that
are hidden, which is why it's called a "Codebase" rather than an "API").

There are methods to load codebases from source folders, from a .jar file, or
from a signature file. That's how API diffing is performed: you load two
codebases (from whatever source you want, typically a previous API signature
file and the current set of source folders), and then you "diff" the two.

There are several key helpers that help with the implementation, detailed next.
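
The idea of one set of model interfaces with several back-ends, as described above, can be illustrated with a small sketch. The interfaces here are heavily simplified, and `InMemoryClassItem`, `InMemoryMethodItem`, `describe` and the one-line text format are all hypothetical; only the names `Item`, `ClassItem`, `MethodItem` and the `Text` prefix are borrowed from the description above.

```kotlin
// Hypothetical, heavily simplified model interfaces.
interface Item { val name: String }
interface MethodItem : Item
interface ClassItem : Item { val methods: List<MethodItem> }

// Back-end 1: items parsed from text. The "ClassName: method1, method2"
// line shape is invented for this sketch; it is not the signature format.
class TextMethodItem(override val name: String) : MethodItem
class TextClassItem(line: String) : ClassItem {
    override val name: String = line.substringBefore(":").trim()
    override val methods: List<MethodItem> =
        line.substringAfter(":").split(",").map { TextMethodItem(it.trim()) }
}

// Back-end 2: items built from in-memory data, standing in for PSI-backed items.
class InMemoryMethodItem(override val name: String) : MethodItem
class InMemoryClassItem(
    override val name: String,
    methodNames: List<String>
) : ClassItem {
    override val methods: List<MethodItem> = methodNames.map { InMemoryMethodItem(it) }
}

// Code written against the interfaces does not care which back-end built the items.
fun describe(cls: ClassItem): String =
    cls.name + cls.methods.joinToString(prefix = "(", postfix = ")") { it.name }

fun main() {
    println(describe(TextClassItem("Parcel: readString, writeString")))
    println(describe(InMemoryClassItem("Parcel", listOf("readString", "writeString"))))
}
```

Both calls print the same description, which is the property that lets two codebases loaded from different sources be compared with the same code.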

### Visiting Items

First, metalava provides an ItemVisitor. This lets you visit the API easily.
For example, here's how you can visit every class:

    codebase.accept(object : ItemVisitor() {
        override fun visitClass(cls: ClassItem) {
            // code operating on the class here
        }
    })

Similarly, you can visit all items (regardless of type) by overriding
`visitItem`, or visit specific kinds of items by overriding `visitPackage`,
`visitClass`, `visitMethod`, etc.

There is also an `ApiVisitor`. This is a subclass of `ItemVisitor` which limits
itself to visiting code elements that are part of the API.

This is how for example the SignatureWriter and the StubWriter are both
implemented: they simply extend `ApiVisitor`, which means they'll only export
the API items in the codebase, and then in each relevant method they emit the
signature or stub data:

    class SignatureWriter(
        private val writer: PrintWriter,
        private val generateDefaultConstructors: Boolean,
        private val filter: (Item) -> Boolean
    ) : ApiVisitor(visitConstructorsAsMethods = false) {

        ....

        // Emits one "ctor" line per API constructor
        override fun visitConstructor(constructor: ConstructorItem) {
            writer.print(" ctor ")
            writeModifiers(constructor)
            writer.print(constructor.containingClass().fullName())
            writeParameterList(constructor)
            writeThrowsList(constructor)
            writer.print(";\n")
        }

        ....
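
The difference between the two visitors can be sketched with a toy model. Everything below is a simplified assumption made for illustration: the `hidden` flag, the `skip` hook and `collectClassNames` are hypothetical, and the real `Codebase`, `ItemVisitor` and `ApiVisitor` classes are considerably richer.

```kotlin
// Hypothetical miniature model: an item is either part of the API or hidden.
open class Item(val name: String, val hidden: Boolean = false)
class ClassItem(name: String, hidden: Boolean = false) : Item(name, hidden)

open class ItemVisitor {
    // Hook that lets subclasses filter out items before they are visited.
    open fun skip(item: Item): Boolean = false
    open fun visitItem(item: Item) {}
    open fun visitClass(cls: ClassItem) {}
}

// Visits only items that are part of the API.
open class ApiVisitor : ItemVisitor() {
    override fun skip(item: Item): Boolean = item.hidden
}

class Codebase(private val classes: List<ClassItem>) {
    fun accept(visitor: ItemVisitor) {
        for (cls in classes) {
            if (visitor.skip(cls)) continue
            visitor.visitItem(cls)
            visitor.visitClass(cls)
        }
    }
}

// Collects class names using either the unfiltered or the API-only visitor.
fun collectClassNames(codebase: Codebase, apiOnly: Boolean): List<String> {
    val names = mutableListOf<String>()
    val visitor = if (apiOnly) {
        object : ApiVisitor() {
            override fun visitClass(cls: ClassItem) { names += cls.name }
        }
    } else {
        object : ItemVisitor() {
            override fun visitClass(cls: ClassItem) { names += cls.name }
        }
    }
    codebase.accept(visitor)
    return names
}

fun main() {
    val codebase = Codebase(listOf(ClassItem("View"), ClassItem("ViewRootImpl", hidden = true)))
    println(collectClassNames(codebase, apiOnly = false)) // [View, ViewRootImpl]
    println(collectClassNames(codebase, apiOnly = true))  // [View]
}
```

Because the filtering lives in the visitor base class, a writer that extends the API-only visitor never has to check visibility itself.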

### Visiting Types

There is a `TypeVisitor` similar to `ItemVisitor` which you can use to visit all
types in the codebase.

When computing the API, all types that are included in the API should be
included (e.g. if `List<Foo>` is part of the API then `Foo` must be too). This
is easy to do with the `TypeVisitor`.
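
The rule that every type referenced from the API must itself be included amounts to a transitive closure. The sketch below computes that closure over a plain map instead of using `TypeVisitor`, whose exact API is not shown here; `requiredTypes` and the shape of the `references` map are assumptions made for this example.

```kotlin
// Hypothetical input: for each class name, the class names that appear in its
// API signatures (for example, "Foo" for a method returning List<Foo>).
fun requiredTypes(
    apiRoots: Set<String>,
    references: Map<String, Set<String>>
): Set<String> {
    val included = linkedSetOf<String>()
    val pending = ArrayDeque(apiRoots)
    while (pending.isNotEmpty()) {
        val type = pending.removeFirst()
        // add() returns false when the type was already processed.
        if (!included.add(type)) continue
        pending.addAll(references[type].orEmpty())
    }
    return included
}

fun main() {
    val references = mapOf(
        "Api" to setOf("List", "Foo"),
        "Foo" to setOf("Bar"),
        "Hidden" to setOf("Baz")
    )
    // "Hidden" and "Baz" are not reachable from the API root, so they are left out.
    println(requiredTypes(setOf("Api"), references)) // [Api, List, Foo, Bar]
}
```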

### Diffing Codebases

Another visitor which helps with implementation is the ComparisonVisitor:

    open class ComparisonVisitor {
        open fun compare(old: Item, new: Item) {}
        open fun added(item: Item) {}
        open fun removed(item: Item) {}

        open fun compare(old: PackageItem, new: PackageItem) { }
        open fun compare(old: ClassItem, new: ClassItem) { }
        open fun compare(old: MethodItem, new: MethodItem) { }
        open fun compare(old: FieldItem, new: FieldItem) { }
        open fun compare(old: ParameterItem, new: ParameterItem) { }

        open fun added(item: PackageItem) { }
        open fun added(item: ClassItem) { }
        open fun added(item: MethodItem) { }
        open fun added(item: FieldItem) { }
        open fun added(item: ParameterItem) { }

        open fun removed(item: PackageItem) { }
        open fun removed(item: ClassItem) { }
        open fun removed(item: MethodItem) { }
        open fun removed(item: FieldItem) { }
        open fun removed(item: ParameterItem) { }
    }

This makes it easy to perform API comparison operations.

For example, metalava has a feature to mark "newly annotated" nullness
annotations as migrated. To do this, it just extends `ComparisonVisitor`,
overrides the `compare(old: Item, new: Item)` method, and checks whether the old
item has no nullness annotations and the new one does, and if so, also marks the
new annotations as @Migrate.

Similarly, the API Check can simply override

    override fun removed(item: Item) {
        reporter.report(error, item, "Removing ${Item.describe(item)} is not allowed")
    }

to flag all removed API elements as invalid, since you cannot remove API. (The
real check is slightly more complicated; it looks into the hierarchy to see if
there still is an inherited method with the same signature, in which case the
deletion is allowed.)
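
Both uses of `ComparisonVisitor` described above (marking newly added nullness annotations as migrated, and rejecting removals unless an inherited method remains) can be sketched together. This is a simplified assumption of how such a visitor might look: the `MethodItem` class, the string-keyed matching in `diff`, the `superClassOf` map and the `MigrationAndRemovalCheck` name are all hypothetical and much cruder than the real model.

```kotlin
// Hypothetical miniature method model, keyed by owner class and signature.
class MethodItem(val owner: String, val signature: String, vararg annotations: String) {
    val annotations: MutableSet<String> = annotations.toMutableSet()
    val key get() = "$owner#$signature"
    fun hasNullness() = "@NonNull" in annotations || "@Nullable" in annotations
}

open class ComparisonVisitor {
    open fun compare(old: MethodItem, new: MethodItem) {}
    open fun removed(item: MethodItem) {}
}

// Pairs up old and new methods by key and dispatches to the visitor.
fun diff(old: List<MethodItem>, new: List<MethodItem>, visitor: ComparisonVisitor) {
    val newByKey = new.associateBy { it.key }
    for (item in old) {
        val match = newByKey[item.key]
        if (match != null) visitor.compare(item, match) else visitor.removed(item)
    }
}

class MigrationAndRemovalCheck(
    private val newMethods: List<MethodItem>,
    private val superClassOf: Map<String, String> // assumed to contain no cycles
) : ComparisonVisitor() {
    val errors = mutableListOf<String>()

    // Newly annotated nullness is additionally marked as @Migrate.
    override fun compare(old: MethodItem, new: MethodItem) {
        if (!old.hasNullness() && new.hasNullness()) new.annotations += "@Migrate"
    }

    // A removal is allowed only if a superclass still declares the same signature.
    override fun removed(item: MethodItem) {
        var ancestor = superClassOf[item.owner]
        while (ancestor != null) {
            val owner = ancestor
            if (newMethods.any { it.owner == owner && it.signature == item.signature }) return
            ancestor = superClassOf[owner]
        }
        errors += "Removing ${item.owner}.${item.signature} is not allowed"
    }
}

fun main() {
    val oldApi = listOf(
        MethodItem("Child", "close()"), MethodItem("Child", "stop()"), MethodItem("Child", "name()")
    )
    val newApi = listOf(MethodItem("Base", "close()"), MethodItem("Child", "name()", "@Nullable"))
    val visitor = MigrationAndRemovalCheck(newApi, mapOf("Child" to "Base"))
    diff(oldApi, newApi, visitor)
    println(visitor.errors)        // [Removing Child.stop() is not allowed]
    println(newApi[1].annotations) // [@Nullable, @Migrate]
}
```

Here `Child.close()` may disappear because `Base.close()` is still inherited, while `Child.stop()` has no such fallback and is reported.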

### Documentation Generation

As mentioned above, metalava generates documentation directly into the stubs
files, which can then be processed by Dokka and Javadoc to generate the same
docs as before.

Doclava1 was integrated with javadoc directly, so the way it generated metadata
docs (such as documenting permissions, ranges and typedefs from annotations) was
to insert auxiliary tags (`@range`, `@permission`, etc.), which would then get
converted into English docs later via `macros_override.cs`.

This is not how metalava does it; it generates the English documentation
directly. This was not just convenient for the implementation (since metalava
does not use javadoc data structures to pass maps like the arguments for the
typedef macro), but should also help Dokka. Arguably the Kotlin code which
generates the documentation is also easier to reason about and to update when
it's handling loop conditionals. (As a result, some of the grammar was
improved: for example, when the documentation lists a number of possible
constants the conjunction is usually "or", but if the constant is a flag, the
sentence begins with "a combination of" and the conjunction at the end should
be "and".)
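
The conjunction rule from the last paragraph can be sketched as a small function. The function name `describeConstants` and the exact sentence wording are illustrative assumptions, not the text metalava actually emits.

```kotlin
// Builds an English sentence listing typedef constants. Flags can be combined,
// so they are joined with "and"; plain constants are alternatives, joined with "or".
fun describeConstants(names: List<String>, isFlag: Boolean): String {
    require(names.isNotEmpty()) { "at least one constant is required" }
    val conjunction = if (isFlag) "and" else "or"
    val list = when (names.size) {
        1 -> names[0]
        2 -> "${names[0]} $conjunction ${names[1]}"
        else -> names.dropLast(1).joinToString(", ") + ", $conjunction ${names.last()}"
    }
    return if (isFlag) "Value is a combination of $list" else "Value is $list"
}

fun main() {
    println(describeConstants(listOf("A", "B", "C"), isFlag = false))
    // Value is A, B, or C
    println(describeConstants(listOf("A", "B", "C"), isFlag = true))
    // Value is a combination of A, B, and C
}
```

Handling this in ordinary Kotlin keeps the loop and the special cases for one or two constants in one place, which is the kind of logic the paragraph above says was awkward to express through macro tags.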
467