1Kati internals 2============== 3 4This is an informal document about internals of kati. This document is not meant 5to be a comprehensive document of kati or GNU make. This explains some random 6topics which other programmers may be interested in. 7 8Motivation 9---------- 10 11The motivation of kati was to speed up Android platform build. Especially, its 12incremental build time was the main focus. Android platform's build system is a 13very unique system. It provides a DSL, (ab)using Turing-completeness of GNU 14make. The DSL allows developers to write build rules in a descriptive way, but 15the downside is it's complicated and slow. 16 17When we say a build system is slow, we consider "null build" and "full 18build". Null build is a build which does nothing, because all output files are 19already up-to-date. Full build is a build which builds everything, because there 20were nothing which have been already built. Actual builds in daily development 21are somewhere between null build and full build. Most benchmarks below were done 22for null build. 23 24For Android with my fairly beefy workstation, null build took ~100 secs with GNU 25make. This means you needed to wait ~100 secs to see if there's a compile error 26when you changed a single C file. To be fair, things were not that bad. There 27are tools called mm/mmm. They allow developers to build an individual module. As 28they ignore dependencies between modules, they are fast. However, you need to be 29somewhat experienced to use them properly. You should know which modules will be 30affected by your change. It would be nicer if you can just type "make" whenever 31you change something. 32 33This is why we started this project. We decided to create a GNU make clone from 34scratch, but there were some other options. One option was to replace all 35Android.mk by files with a better format. There is actually a longer-term 36project for this. Kati was planned to be a short-term project. Another option 37was to hack GNU make instead of developing a clone. We didn't take this option 38because we thought the source code of GNU make is somewhat complicated due to 39historical reason. It's written in old-style C, has a lot of ifdefs for some 40unknown architectures, etc. 41 42Currently, kati's main mode is --ninja mode. Instead of executing build commands 43by itself, kati generates build.ninja file and 44[ninja](https://github.com/martine/ninja) actually runs commands. There were 45some back-and-forths before kati became the current form. Some experiments 46succeeded and some others failed. We even changed the language for kati. At 47first, we wrote kati in Go. We naively expected we can get enough performance 48with Go. I guessed at least one of the following statements are true: 1. GNU 49make is not very optimized for computation heavy Makefiles, 2. Go is fast for 50our purpose, or 3. we can come up with some optimization tricks for Android's 51build system. As for 3, some of such optimization succeeded but it's performance 52gain didn't cancel the slowness of Go. 53 54Go's performance would be somewhat interesting topic. I didn't study the 55performance difference in detail, but it seemed both our use of Go and Go 56language itself were making the Go version of kati slower. As for our fault, I 57think Go version has more unnecessary string allocations than C++ version 58has. As for Go itself, it seemed GC was the main show-stopper. For example, 59Android's build system defines about one million make variables, and buffers for 60them will be never freed. IIRC, this kind of allocation pattern isn't good for 61non-generational GC. 62 63Go version and test cases were written by ukai and me, and C++ rewrite was done 64mostly by me. The rest of this document is mostly about the C++ version. 65 66Overall architecture 67-------------------- 68 69Kati consists of the following components: 70 71* Parser 72* Evaluator 73* Dependency builder 74* Executor 75* Ninja generator 76 77A Makefile has some statements which consist of zero or more expressions. There 78are two parsers and two evaluators - one for statements and the other for 79expressions. 80 81Most of users of GNU make may not care about the evaluator much. However, GNU 82make's evaluator is very powerful and is Turing-complete. For Android's null 83build, most time is spent in this phase. Other tasks, such as building 84dependency graphs and calling stat function for build targets, are not the 85bottleneck. This would be a very Android specific characteristics. Android's 86build system uses a lot of GNU make black magics. 87 88The evaluator outputs a list of build rules and a variable table. The dependency 89builder creates a dependency graph from the list of build rules. Note this step 90doesn't use the variable table. 91 92Then either executor or ninja generator will be used. Either way, kati runs its 93evaluator again for command lines. The variable table is used again for this 94step. 95 96We'll look at each components closely. GNU make is a somewhat different language 97from modern languages. Let's see. 98 99Parser for statements 100--------------------- 101 102I'm not 100% sure, but I think GNU make parses and evaluates Makefiles 103simultaneously, but kati has two phases for parsing and evaluation. The reason 104of this design is for performance. For Android build, kati (or GNU make) needs 105to read ~3k files ~50k times. The file which is read most often is read ~5k 106times. It's waste of time to parse such files again and again. Kati can re-use 107parsed results when it needs to evaluate a Makefile second time. If we stop 108caching the parsed results, kati will be two times slower for Android's 109build. Caching parsed statements is done in *file_cache.cc*. 110 111The statement parser is defined in *parser.cc*. In kati, there are four kinds of 112statements: 113 114* Rules 115* Assignments 116* Commands 117* Make directives 118 119Data structures for them are defined in *stmt.h*. Here are examples of these 120statements: 121 122 VAR := yay! # An assignment 123 all: # A rule 124 echo $(VAR) # A command 125 include xxx.mk # A make directive (include) 126 127In addition to include directive, there are ifeq/ifneq/ifdef/ifndef directives 128and export/unexport directives. Also, kati internally uses "parse error 129statement". As GNU make doesn't show parse errors in branches which are not 130taken, we need to delay parse errors to evaluation time. 131 132### Context dependent parser 133 134A tricky point of parsing make statements is that the parsing depends on the 135context of the evaluation. See the following Makefile chunk for example: 136 137 $(VAR) 138 X=hoge echo $${X} 139 140You cannot tell whether the second line is a command or an assignment until 141*$(VAR)* is evaluated. If *$(VAR)* is a rule statement, the second line is a 142command and otherwise it's an assignment. If the previous line is 143 144 VAR := target: 145 146the second line will turn out to be a command. 147 148For some reason, GNU make expands expressions before it decides the type of 149a statement only for rules. Storing assignments or directives in a variable 150won't work as assignments or directives. For example 151 152 ASSIGN := A=B 153 $(ASSIGN): 154 155doesn't assign "*B:*" to *A*, but defines a build rule whose target is *A=B*. 156 157Anyway, as a line starts with a tab character can be either a command statement 158or other statements depending on the evaluation result of the previous line, 159sometimes kati's parser cannot tell the statement type of a line. In this case, 160kati's parser speculatively creates a command statement object, keeping the 161original line. If it turns out the line is actually not a command statement, 162the evaluator re-runs the parser. 163 164### Line concatenations and comments 165 166In most programming languages, line concatenations by a backslash character and 167comments are handled at a very early stage of a language 168implementation. However, GNU make changes the behavior for them depending on 169parse/eval context. For example, the following Makefile outputs "has space" and 170"hasnospace": 171 172 VAR := has\ 173 space 174 all: 175 echo $(VAR) 176 echo has\ 177 nospace 178 179GNU make usually inserts a whitespace between lines, but for command lines it 180doesn't. As we've seen in the previous subsection, sometimes kati cannot tell 181a line is a command statement or not. This means we should handle them after 182evaluating statements. Similar discussion applies for comments. GNU make usually 183trims characters after '#', but it does nothing for '#' in command lines. 184 185We have a bunch of comment/backslash related testcases in the testcase directory 186of kati's repository. 187 188Parser for expressions 189---------------------- 190 191A statement may have one or more expressions. The number of expressions in a 192statement depends on the statement's type. For example, 193 194 A := $(X) 195 196This is an assignment statement, which has two expressions - *A* and 197*$(X)*. Types of expressions and their parser are defined in *expr.cc*. Like 198other programming languages, an expression is a tree of expressions. The type of 199a leaf expression is either literal, variable reference, 200[substitution references](http://www.gnu.org/software/make/manual/make.html#Substitution-Refs), 201or make functions. 202 203As written, backslashes and comments change their behavior depending on the 204context. Kati handles them in this phase. *ParseExprOpt* is the enum for the 205contexts. 206 207As a nature of old systems, GNU make is very permissive. For some reason, it 208allows some kind of unmatched pairs of parentheses. For example, GNU make 209doesn't think *$($(foo)* is an error - this is a reference to variable 210*$(foo*. If you have some experiences with parsers, you may wonder how one can 211implement a parser which allows such expressions. It seems GNU make 212intentionally allows this: 213 214http://git.savannah.gnu.org/cgit/make.git/tree/expand.c#n285 215 216No one won't use this feature intentionally. However, as GNU make allows this, 217some Makefiles have unmatched parentheses, so kati shouldn't raise an error for 218them, unfortunately. 219 220GNU make has a bunch of functions. Most users would use only simple ones such as 221*$(wildcard ...)* and *$(subst ...)*. There are also more complex functions such 222as *$(if ...)* and *$(call ...)*, which make GNU make Turing-complete. Make 223functions are defined in *func.cc*. Though *func.cc* is not short, the 224implementation is fairly simple. There is only one weirdness I remember around 225functions. GNU make slightly changes its parsing for *$(if ...)*, *$(and ...)*, 226and *$(or ...)*. See *trim_space* and *trim_right_space_1st* in *func.h* and how 227they are used in *expr.cc*. 228 229Evaluator for statements 230------------------------ 231 232Evaluator for statements are defined in *eval.cc*. As written, there are four 233kinds of statements: 234 235* Rules 236* Assignments 237* Commands 238* Make directives 239 240There is nothing tricky around commands and make directives. A rule statement 241have some forms and should be parsed after evaluating expression by the third 242parser. This will be discussed in the next section. 243 244Assignments in GNU make is tricky a bit. There are two kinds of variables in GNU 245make - simple variables and recursive variables. See the following code snippet: 246 247 A = $(info world!) # recursive 248 B := $(info Hello,) # simple 249 $(A) 250 $(B) 251 252This code outputs "Hello," and "world!", in this order. The evaluation of 253a recursive variable is delayed until the variable is referenced. So the first 254line, which is an assignment of a recursive variable, outputs nothing. The 255content of the variable *$(A)* will be *$(info world!)* after the first 256line. The assignment in the second line uses *:=* which means this is a simple 257variable assignment. For simple variables, the right hand side is evaluated 258immediately. So "Hello," will be output and the value of *$(B)* will be an empty 259string ($(info ...) returns an empty string). Then, "world!" will be shown when 260the third line is evaluated as *$(A)* is evaluated, and lastly the forth line 261does nothing, as *$(B)* is an empty string. 262 263There are two more kinds of assignments (i.e., *+=* and *?=*). These assignments 264keep the type of the original variable. Evaluation of them will be done 265immediately only when the left hand side of the assignment is already defined 266and is a simple variable. 267 268Parser for rules 269---------------- 270 271After evaluating a rule statement, kati needs to parse the evaluated result. A 272rule statement can actually be the following four things: 273 274* A rule 275* A [target specific variable](http://www.gnu.org/software/make/manual/make.html#Target_002dspecific) 276* An empty line 277* An error (there're non-whitespace characters without a colon) 278 279Parsing them is mostly done in *rule.cc*. 280 281### Rules 282 283A rule is something like *all: hello.exe*. You should be familiar with it. There 284are several kinds of rules such as pattern rules, double colon rules, and order 285only dependencies, but they don't complicate the rule parser. 286 287A feature which complicates the parser is semicolon. You can write the first 288build command on the same line as the rule. For example, 289 290 target: 291 echo hi! 292 293and 294 295 target: ; echo hi! 296 297have the same meaning. This is tricky because kati shouldn't evaluate expressions 298in a command until the command is actually invoked. As a semicolon can appear as 299the result of expression evaluation, there are some corner cases. A tricky 300example: 301 302 all: $(info foo) ; $(info bar) 303 $(info baz) 304 305should output *foo*, *baz*, and then *bar*, in this order, but 306 307 VAR := all: $(info foo) ; $(info bar) 308 $(VAR) 309 $(info baz) 310 311outputs *foo*, *bar*, and then *baz*. 312 313Again, for the command line after a semicolon, kati should also change how 314backslashes and comments are handled. 315 316 target: has\ 317 space ; echo no\ 318 space 319 320The above example says *target* depends on two targets, *has* and *space*, and 321to build *target*, *echo nospace* should be executed. 322 323### Target specific variables 324 325You may not familiar with target specific variables. This feature allows you to 326define variable which can be referenced only from commands in a specified 327target. See the following code: 328 329 VAR := X 330 target1: VAR := Y 331 target1: 332 echo $(VAR) 333 target2: 334 echo $(VAR) 335 336In this example, *target1* shows *Y* and *target2* shows *X*. I think this 337feature is somewhat similar to namespaces in other programming languages. If a 338target specific variable is specified for a non-leaf target, the variable will 339be used even in build commands of prerequisite targets. 340 341In general, I like GNU make, but this is the only GNU make's feature I don't 342like. See the following Makefile: 343 344 hello: CFLAGS := -g 345 hello: hello.o 346 gcc $(CFLAGS) $< -o $@ 347 hello.o: hello.c 348 gcc $(CFLAGS) -c $< -o $@ 349 350If you run make for the target *hello*, *CFLAGS* is applied for both commands: 351 352 $ make hello 353 gcc -g -c hello.c -o hello.o 354 gcc -g hello.o -o hello 355 356However, *CFLAGS* for *hello* won't be used when you build only *hello.o*: 357 358 $ make hello.o 359 gcc -c hello.c -o hello.o 360 361Things could be even worse when two targets with different target specific 362variables depend on a same target. The build result will be inconsistent. I 363think there is no valid usage of this feature for non-leaf targets. 364 365Let's go back to the parsing. Like for semicolons, we need to delay the 366evaluation of the right hand side of the assignment for recursive variables. Its 367implementation is very similar to the one for semicolons, but the combination of 368the assignment and the semicolon makes parsing a bit trickier. An example: 369 370 target1: ;X=Y echo $(X) # A rule with a command 371 target2: X=;Y echo $(X) # A target specific variable 372 373Evaluator for expressions 374------------------------- 375 376Evaluation of expressions is done in *expr.cc*, *func.cc*, and 377*command.cc*. The amount of code for this step is fairly large especially 378because of the number of GNU make functions. However, their implementations are 379fairly straightforward. 380 381One tricky function is $(wildcard ...). It seems GNU make is doing some kind of 382optimization only for this function and $(wildcard ...) in commands seem to be 383evaluated before the evaluation phase for commands. Both C++ kati and Go kati 384are different from GNU make's behavior in different ways, but it seems this 385incompatibility is OK for Android build. 386 387There is an important optimization done for Android. Android's build system has 388a lot of $(shell find ...) calls to create a list of all .java/.mk files under a 389directory, and they are slow. For this, kati has a builtin emulator of GNU 390find. The find emulator traverses the directory tree and creates an in-memory 391directory tree. Then the find emulator returns results of find commands using 392the cached tree. For my environment, the find command emulator makes kati ~1.6x 393faster for AOSP. 394 395The implementations of some IO-related functions in commands are tricky in the 396ninja generation mode. This will be described later. 397 398Dependency builder 399------------------ 400 401Now we get a list of rules and a variable table. *dep.cc* builds a dependency 402graph using the list of rules. I think this step is what GNU make is supposed to 403do for normal users. 404 405This step is fairly complex like other components but there's nothing 406strange. There are three types of rules in GNU make: 407 408* explicit rule 409* implicit rule 410* suffix rule 411 412The following code shows the three types: 413 414 all: foo.o 415 foo.o: 416 echo explicit 417 %.o: 418 echo implicit 419 .c.o: 420 echo suffix 421 422In the above example, all of these three rules match the target *foo.o*. GNU 423make prioritizes explicit rules first. When there's no explicit rule for a 424target, it uses an implicit rule with longer pattern string. Suffix rules are 425used only when there are no explicit/implicit rules. 426 427Android has more than one thousand implicit rules and there are ten thousands of 428targets. It's too slow to do matching for them with a naive O(NM) 429algorithm. Kati uses a trie to speed up this step. 430 431Multiple rules without commands should be merged into the rule with a 432command. For example: 433 434 foo.o: foo.h 435 %.o: %.c 436 $(CC) -c $< -o $@ 437 438*foo.o* depends not only on *foo.c*, but also on *foo.h*. 439 440Executor 441-------- 442 443C++ kati's executor is fairly simple. This is defined in *exec.cc*. This is 444useful only for testing because this lacks some important features for a build 445system (e.g., parallel build). 446 447Expressions in commands are evaluated at this stage. When they are evaluated, 448target specific variables and some special variables (e.g., $< and $@) should be 449considered. *command.cc* is handling them. This file is used by both the 450executor and the ninja generator. 451 452Evaluation at this stage is tricky when both *+=* and target specific variables 453are involved. Here is an example code: 454 455 all: test1 test2 test3 test4 456 457 A:=X 458 B=X 459 X:=foo 460 461 test1: A+=$(X) 462 test1: 463 @echo $(A) # X bar 464 465 test2: B+=$(X) 466 test2: 467 @echo $(B) # X bar 468 469 test3: A:= 470 test3: A+=$(X) 471 test3: 472 @echo $(A) # foo 473 474 test4: B= 475 test4: B+=$(X) 476 test4: 477 @echo $(B) # bar 478 479 X:=bar 480 481*$(A)* in *test3* is a simple variable. Though *$(A)* in the global scope is 482simple, *$(A)* in *test1* is a recursive variable. This means types of global 483variables don't affect types of target specific variables. However, The result 484of *test1* ("X bar") shows the value of a target specific variable is 485concatenated to the value of a global variable. 486 487Ninja generator 488--------------- 489 490*ninja.cc* generates a ninja file using the results of other components. This 491step is actually fairly complicated because kati needs to map GNU make's 492features to ninja's. 493 494A build rule in GNU make may have multiple commands, while ninja's has always a 495single command. To mitigate this, the ninja generator translates multiple 496commands into something like *(cmd1) && (cmd2) && ...*. Kati should also escape 497some special characters for ninja and shell. 498 499The tougher thing is $(shell ...) in commands. Current kati's implementation 500translates it into shell's $(...). This works for many cases. But this approach 501won't work when the result of $(shell ...) is passed to another make 502function. For example 503 504 all: 505 echo $(if $(shell echo),FAIL,PASS) 506 507should output PASS, because the result of $(shell echo) is an empty string. GNU 508make and kati's executor mode output PASS correctly. However, kati's ninja 509generator emits a ninja file which shows FAIL. 510 511I wrote a few experimental patches for this issue, but they didn't 512work well. The current kati's implementation has an Android specific workaround 513for this. See *HasNoIoInShellScript* in *func.cc* for detail. 514 515Ninja regeneration 516------------------ 517 518C++ kati has --regen flag. If this flag is specified, kati checks if anything 519in your environment was changed after the previous run. If kati thinks it doesn't 520need to regenerate the ninja file, it finishes quickly. For Android, running 521kati takes ~30 secs at the first run but the second run takes only ~1 sec. 522 523Kati thinks it needs to regenerate the ninja file when one of the followings is 524changed: 525 526* The command line flags passed to kati 527* A timestamp of a Makefile used to generate the previous ninja file 528* An environment variable used while evaluating Makefiles 529* A result of $(wildcard ...) 530* A result of $(shell ...) 531 532Quickly doing the last check is not trivial. It takes ~18 secs to run all 533$(shell ...) in Android's build system due to the slowness of $(shell find 534...). So, for find commands executed by kati's find emulator, kati stores the 535timestamps of traversed directories with the find command itself. For each find 536commands, kati checks the timestamps of them. If they are not changed, kati 537skips re-running the find command. 538 539Kati doesn't run $(shell date ...) and $(shell echo ...) during this check. The 540former always changes so there's no sense to re-run them. Android uses the 541latter to create a file and the result of them are empty strings. We don't want 542to update these files to get empty strings. 543 544TODO 545---- 546 547A big TODO is sub-makes invoked by $(MAKE). I wrote some experimental patches 548but nothing is ready to be used as of writing. 549