[[bbv2.arch]]
B2 v2 architecture
------------------

This document is a work in progress. Do not expect much from it yet.

[[bbv2.arch.overview]]
Overview
--------

The B2 implementation is structured in four different components:
"kernel", "util", "build" and "tools". The first two are relatively
uninteresting, so we will focus on the remaining pair. The "build"
component provides the classes necessary to declare targets, determine
which properties should be used to build them, and create the
dependency graph. The "tools" component provides user-visible
functionality. It mostly allows declaring specific kinds of main
targets, as well as registering available tools, which are then used
when creating the dependency graph.

[[bbv2.arch.build]]
The build layer
---------------

The build layer has just four main parts -- metatargets (abstract
targets), virtual targets, generators and properties.

* Metatargets (see the "targets.jam" module) represent all the
user-defined entities that can be built. The "meta" prefix signifies
that they do not need to correspond to exact files or even files at all
-- they can produce a different set of files depending on the build
request. Metatargets are created when Jamfiles are loaded. Each has a
`generate` method which is given a property set and produces virtual
targets for the passed properties.
* Virtual targets (see the "virtual-target.jam" module) correspond to
actual atomic updatable entities -- most typically files.
* Properties are just (name, value) pairs, specified by the user and
describing how targets should be built. Properties are stored using the
`property-set` class.
* Generators are objects that encapsulate specific tools -- they can
take a list of source virtual targets and produce new virtual targets
from them.

The build process includes the following steps:

1. Top-level code calls the `generate` method of a metatarget with some
properties.
2. The metatarget combines the requested properties with its
requirements and passes the result, together with the list of sources,
to the `generators.construct` function.
3. A generator appropriate for the build properties is selected and its
`run` method is called. The method returns a list of virtual targets.
4. The virtual targets are returned to the top-level code, and for each
instance, the `actualize` method is called to set up nodes and updating
actions in the dependency graph kept inside the B2 engine. This
dependency graph is then updated, which runs the necessary commands.

[[bbv2.arch.build.metatargets]]
Metatargets
~~~~~~~~~~~

There are several classes derived from "abstract-target". The
"main-target" class represents a top-level main target, the
"project-target" class acts like a container holding multiple main
targets, and the "basic-target" class is a base class for all further
target types.

Since each main target can have several alternatives, all top-level
target objects are actually containers, referring to "real" main target
classes. The type of that container is "main-target". For example,
given:

....
alias a ;
lib a : a.cpp : <toolset>gcc ;
....

we would have one top-level "main-target" instance, containing one
"alias-target" and one "lib-target" instance. The "generate" method of
"main-target" decides which of the alternatives should be used, and
calls "generate" on the corresponding instance.

Each alternative is an instance of a class derived from "basic-target".
"basic-target.generate" does several things that should always be done:

* Determines what properties should be used for building the target.
This includes looking at requested properties, requirements, and usage
requirements of all sources.
* Builds all sources.
* Computes usage requirements that should be passed back to targets
depending on this one.

For the real work of constructing a virtual target, a separate method,
"construct", is called.

The "construct" method can be implemented in any way by classes derived
from "basic-target", but one specific derived class plays the central
role -- "typed-target". That class holds the desired type of file to be
produced, and its "construct" method uses the generators module to do
the actual work.

This means that a specific metatarget subclass may avoid using
generators altogether. However, this is deprecated and we are trying
to eliminate all such subclasses at the moment.

Note that the `build/targets.jam` file contains a UML diagram which
might help.

[[bbv2.arch.build.virtual]]
Virtual targets
~~~~~~~~~~~~~~~

Virtual targets are atomic updatable entities. Each virtual target can
be assigned an updating action -- an instance of the `action` class. The
action class, in turn, contains a list of source targets, properties,
and the name of the action which should be executed.

We try hard to never create equal instances of the `virtual-target`
class. Code creating virtual targets passes them through the
`virtual-target.register` function, which detects if a target with the
same name, sources, and properties has already been created. In that
case, the preexisting target is returned.

When all virtual targets are produced, they are "actualized". This means
that the real file names are computed, and the commands that should be
run are generated. This is done by the `virtual-target.actualize` and
`action.actualize` methods. The first is conceptually simple, while the
second needs additional explanation. Commands in the B2 engine are
generated in a two-stage process. First, a rule with an appropriate name
(for example "gcc.compile") is called and is given a list of target
names.
The rule sets some variables, like "OPTIONS". After that, the
command string is taken, and variables are substituted, so a use of
OPTIONS inside the command string gets transformed into actual compile
options.

B2 adds a third stage on top of that to simplify things. It is now
possible to automatically convert properties to appropriate variable
assignments. For example, <debug-symbols>on would add "-g" to the
OPTIONS variable, without requiring this logic to be added manually to
gcc.compile. This functionality is part of the "toolset" module.

Note that the `build/virtual-target.jam` file contains a UML diagram
which might help.

[[bbv2.arch.build.properties]]
Properties
~~~~~~~~~~

Above, we noted that metatargets are built with a set of properties.
That set is represented by the `property-set` class. An important point
is that handling of property sets can get very expensive. For that
reason, we make sure that for each set of (name, value) pairs only one
`property-set` instance is created. The `property-set` class uses
extensive caching for all operations, so most work is avoided. The
`property-set.create` function is the factory used to create instances
of the `property-set` class.

[[bbv2.arch.tools]]
The tools layer
---------------

Write me!

[[bbv2.arch.targets]]
Targets
-------

NOTE: THIS SECTION IS NOT EXPECTED TO BE READ!

There are two user-visible kinds of targets in B2. First are "abstract"
targets -- they correspond to things declared by the user, e.g.
projects and executable files. The primary thing about abstract targets
is that it is possible to request them to be built with a particular
set of properties. Each property combination may possibly yield
different built files, so abstract targets do not have a direct
correspondence to built files.

File targets, on the other hand, are associated with concrete files.
Dependency graphs for abstract targets with specific properties are
constructed from file targets. The user has no way to create file
targets but can specify rules for detecting source file types, as well
as rules for transforming between file targets of different types. That
information is used in constructing the final dependency graph, as
described in the link:#bbv2.arch.depends[next section].

*Note:* File targets are not the same entities as Jam targets; the
latter are created from file targets at the latest possible moment.

*Note:* "File target" is the originally proposed name for what we now
call virtual targets. It is more understandable by users, but has one
problem: virtual targets can potentially be "phony", and not correspond
to any file.

[[bbv2.arch.depends]]
Dependency scanning
-------------------

Dependency scanning is the process of finding implicit dependencies,
like "#include" statements in {CPP}. The requirements for a correct
dependency scanning mechanism are:

* link:#bbv2.arch.depends.different-scanning-algorithms[Support for
different scanning algorithms]. {CPP} and XML have quite different syntax
for includes and rules for looking up the included files.
* link:#bbv2.arch.depends.same-file-different-scanners[Ability to scan
the same file several times]. For example, a single {CPP} file may be
compiled using different include paths.
* link:#bbv2.arch.depends.dependencies-on-generated-files[Proper
detection of dependencies on generated files.]
* link:#bbv2.arch.depends.dependencies-from-generated-files[Proper
detection of dependencies from a generated file.]

[[bbv2.arch.depends.different-scanning-algorithms]]
Support for different scanning algorithms
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Different scanning algorithms are encapsulated by objects called
"scanners". Please see the "scanner" module documentation for more
details.
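
As a rough illustration of what encapsulating a scanning algorithm in an
object can mean, the sketch below models two scanners in Python. It is
not the API of the "scanner" module: the class names, the `pattern`
attribute and the `scan` method are assumptions made up for this
example, and real scanners also carry lookup rules such as include
paths.

```python
import re


class Scanner:
    """Hypothetical base class: one regular expression per scanning algorithm."""

    pattern = None  # subclasses supply a compiled regex with one capture group

    def scan(self, text):
        # Return every included name found in the given file contents.
        return self.pattern.findall(text)


class CppScanner(Scanner):
    # Matches `#include "x.h"` and `#include <x.h>` at the start of a line.
    pattern = re.compile(r'^[ \t]*#[ \t]*include[ \t]*[<"]([^">]+)[">]', re.MULTILINE)


class XIncludeScanner(Scanner):
    # Matches `<xi:include href="x.xml"/>`. XInclude is used here only as an
    # assumed example of an XML include convention.
    pattern = re.compile(r'<xi:include[^>]*\bhref="([^"]+)"')


cpp_source = '#include "a.h"\n#include <vector>\nint main() {}\n'
xml_source = '<doc><xi:include href="chapter1.xml"/></doc>'

print(CppScanner().scan(cpp_source))
print(XIncludeScanner().scan(xml_source))
```

The point of the design is that code walking the dependency graph only
needs the common `scan` entry point, while each file type supplies its
own syntax.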

[[bbv2.arch.depends.same-file-different-scanners]]
Ability to scan the same file several times
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

As stated above, it is possible to compile a {CPP} file multiple times,
using different include paths. Therefore, include dependencies for those
compilations can be different. The problem is that the B2 engine
does not allow multiple scans of the same target. To solve that, we pass
the scanner object when calling `virtual-target.actualize` and it
creates different engine targets for different scanners.

For each engine target created with a specified scanner, a corresponding
one is created without it. The updating action is associated with the
scanner-less target, and the target with the scanner is made to depend
on it. That way, if sources for that action are touched, all targets,
with and without the scanner, are considered outdated.

Consider the following example: "a.cpp" prepared from "a.verbatim",
compiled by two compilers using different include paths and copied into
some install location. The dependency graph would look like:

....
a.o (<toolset>gcc)  <--(compile)-- a.cpp (scanner1) ----+
a.o (<toolset>msvc) <--(compile)-- a.cpp (scanner2) ----|
a.cpp (installed copy)    <--(copy) ----------------------- a.cpp (no scanner)
                                                                 ^
                                                                 |
                       a.verbatim -------------------------------+
....

[[bbv2.arch.depends.dependencies-on-generated-files]]
Proper detection of dependencies on generated files.
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

This requirement breaks down into the following ones.

1. If, when compiling "a.cpp", there is an include of "a.h", the "dir"
directory is on the include path, and a target called "a.h" will be
generated in "dir", then B2 should discover the include, and
create "a.h" before compiling "a.cpp".
2. Since B2 almost always generates targets under the "bin"
directory, this should be supported as well. I.e. in the scenario above,
the Jamfile in "dir" might create a main target, which generates "a.h".
The file will be generated to the "dir/bin" directory, but we still have
to recognize the dependency.

The first requirement means that when determining what "a.h" means when
found in "a.cpp", we have to iterate over all directories in include
paths, checking for each one:

1. If there is a file named "a.h" in that directory, or
2. If there is a target called "a.h", which will be generated in that
directory.

Classic Jam has built-in facilities for point (1) above, but that is not
enough. It is hard to implement the right semantics without built-in
support. For example, we could try to check if there exists a target
called "a.h" somewhere in the dependency graph, and add a dependency to
it. The problem is that without a file search in the include path, the
semantics may be incorrect. For example, one can have an action that
generates some "dummy" header, for systems which do not have a native
one. Naturally, we do not want to depend on that generated header on
platforms where a native one is included.

There are two design choices for built-in support. Suppose we have files
a.cpp and b.cpp, and each one includes header.h, generated by some
action. The dependency graph created by classic Jam would look like:

....
a.cpp -----> <scanner1>header.h  [search path: d1, d2, d3]

                  <d2>header.h  --------> header.y
                  [generated in d2]

b.cpp -----> <scanner2>header.h  [search path: d1, d2, d4]
....

In this case, Jam thinks all header.h targets are unrelated. The
correct dependency graph might be:

....
a.cpp ----
          \
           >---->  <d2>header.h  --------> header.y
          /       [generated in d2]
b.cpp ----
....

or

....
a.cpp -----> <scanner1>header.h  [search path: d1, d2, d3]
                          |
                      (includes)
                          V
                  <d2>header.h  --------> header.y
                  [generated in d2]
                          ^
                      (includes)
                          |
b.cpp -----> <scanner2>header.h  [search path: d1, d2, d4]
....

The first alternative was used for some time. The problem, however, is:
what include paths should be used when scanning header.h? The second
alternative was suggested by Matt Armstrong. It has a similar effect:
any target depending on <scanner1>header.h will also depend on
<d2>header.h. This time, though, we have two different targets with two
different scanners, so those targets can be scanned independently. The
first alternative's problem is avoided, so the second alternative is
implemented now.

The second sub-requirement is that targets generated under the "bin"
directory are handled as well. B2 implements a semi-automatic
approach. When compiling {CPP} files the process is:

1. The main target to which the compiled file belongs is found.
2. All other main targets that the found one depends on are found.
These include main targets used as sources as well as those specified
as "dependency" properties.
3. All directories where files belonging to those main targets will be
generated are added to the include path.

After this is done, dependencies are found by the approach explained
previously.

Note that if a target uses generated headers from another main target,
that main target should be explicitly specified using the dependency
property. It would be better to lift this requirement, but it does not
seem to be causing any problems in practice.

For target types other than {CPP}, adding of include paths must be
implemented anew.
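
The directory-by-directory lookup described in this section can be
sketched as follows. This is a simplified Python model, not B2 or Jam
code: `resolve_include`, `existing_files` and `generated_targets` are
hypothetical names, directories are plain strings, and the file system
and the dependency graph are replaced by two sets.

```python
def resolve_include(name, include_path, existing_files, generated_targets):
    """Return (kind, directory) for the first directory that provides `name`.

    existing_files: set of (directory, filename) pairs already on disk.
    generated_targets: set of (directory, filename) pairs that some action
    will generate. Both are stand-ins invented for this sketch.
    """
    for directory in include_path:
        if (directory, name) in existing_files:
            return ("file", directory)
        if (directory, name) in generated_targets:
            return ("generated", directory)
    return None  # not found on the include path: no dependency is recorded


# A "dummy" header.h is generated in d3, but a native one exists in d1.
existing = {("d1", "header.h")}
generated = {("d3", "header.h"), ("d2", "a.h")}
path = ["d1", "d2", "d3"]

print(resolve_include("header.h", path, existing, generated))
print(resolve_include("a.h", path, existing, generated))
print(resolve_include("missing.h", path, existing, generated))
```

Because the search stops at the first directory that provides the
header, the generated dummy header in `d3` is ignored when a native
header is found earlier on the path, while `a.h` is still recognized as
a generated dependency.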

[[bbv2.arch.depends.dependencies-from-generated-files]]
Proper detection of dependencies from generated files
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Suppose file "a.cpp" includes "a.h" and both are generated by some
action. Note that classic Jam has two stages. In the first stage the
dependency graph is built and actions to be run are determined. In the
second stage the actions are executed. Initially, neither file exists,
so the include is not found. As a result, Jam might attempt to compile
a.cpp before creating a.h, causing the compilation to fail.

The solution in Boost.Jam is to perform additional dependency scans
after targets are updated. This breaks separation between build stages
in Jam -- which some people consider a good thing -- but I am not aware
of any better solution.

In order to understand the rest of this section, you should first read
some details about Jam's dependency scanning, available at
http://public.perforce.com:8080/@md=d&cd=//public/jam/src/&ra=s&c=kVu@//2614?ac=10[this
link].

Whenever a target is updated, Boost.Jam rescans it for includes.
Consider this graph, created before any actions are run.

....
A -------> C ----> C.pro
     /
B --/         C-includes   ---> D
....

Both A and B have a dependency on C and C-includes (the latter
dependency is not shown). Say during building we have tried to create A,
then tried to create C and successfully created C.

In that case, the set of includes in C might well have changed. We do
not bother to detect precisely which includes were added or removed.
Instead we create another internal node C-includes-2. Then we determine
what actions should be run to update the target. In fact this means that
we perform the first stage logic when already in the execution stage.

After actions for C-includes-2 are determined, we add C-includes-2 to
the list of A's dependencies, and stage 2 proceeds as usual.
Unfortunately, we cannot do the same with target B, since B has not
been visited yet and the C target does not know that B depends on it.
So, we add a flag to C marking it as rescanned. When visiting the B
target, the flag is noticed and C-includes-2 is added to the list of
B's dependencies as well.

Note also that internal nodes are sometimes updated too. Consider this
dependency graph:

....
a.o ---> a.cpp
           a.cpp-includes -->  a.h (scanned)
                                  a.h-includes ------> a.h (generated)
                                                                |
                                                                |
           a.pro <----------------------------------------------+
....

Here, our handling of generated headers comes into play. Say that a.h
exists but is out of date with respect to "a.pro", then "a.h
(generated)" and "a.h-includes" will be marked for updating, but "a.h
(scanned)" will not. We have to rescan "a.h" after it has been created,
but since "a.h (generated)" has no associated scanner, it is only
possible to rescan "a.h" after the "a.h-includes" target has been
updated.

The above considerations led to the decision to rescan a target
whenever it is updated, no matter if it is internal or not.

________________________________________________________________________________________________________
*Warning*

The remainder of this document is not intended to be read at all. This
will be rearranged in the future.
________________________________________________________________________________________________________

File targets
------------

As described above, file targets correspond to files that B2
manages. Users may be concerned about file targets in three ways: when
declaring file target types, when declaring transformations between
types and when determining where a file target is to be placed.
File targets can also be connected to actions that determine how the
target is to be created. Both file targets and actions are implemented
in the `virtual-target` module.

Types
~~~~~

A file target can be given a type, which determines what transformations
can be applied to the file. The `type.register` rule declares new types.
A file type can also be assigned a scanner, which is then used to find
implicit dependencies. See "link:#bbv2.arch.depends[dependency
scanning]".

Target paths
~~~~~~~~~~~~

To distinguish targets built with different properties, they are put in
different directories. Rules for determining target paths are given
below:

1. All targets are placed under a directory corresponding to the
project where they are defined.
2. Each non-free, non-incidental property causes an additional element
to be added to the target path. That element has the form
`<feature-name>-<feature-value>` for ordinary features and
`<feature-value>` for implicit ones. [TODO: Add note about composite
features].
3. If the set of free, non-incidental properties is different from the
set of free, non-incidental properties for the project in which the main
target that uses the target is defined, a part of the form
`main-target-<name>` is added to the target path. *Note:* It would be
nice to completely track free features also, but this appears to be
complex and not extremely needed.

For example, we might have these paths:

....
debug/optimization-off
debug/main-target-a
....
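
The rules above can be approximated by the following Python sketch. It
is an illustration under stated assumptions, not the actual B2
implementation: the `Feature` record, the `target_path` function and
the small feature table are invented for the example, composite
features are ignored, and rule 1 (the project directory) is left to the
caller. The `main-target-` spelling follows the example paths.

```python
from collections import namedtuple

# Hypothetical feature description; the attribute names are assumptions.
Feature = namedtuple("Feature", "name implicit free incidental")

FEATURES = {
    "variant": Feature("variant", implicit=True, free=False, incidental=False),
    "optimization": Feature("optimization", implicit=False, free=False, incidental=False),
    "define": Feature("define", implicit=False, free=True, incidental=False),
}


def free_set(properties):
    # The free, non-incidental (name, value) pairs of a property list.
    return {(n, v) for n, v in properties
            if FEATURES[n].free and not FEATURES[n].incidental}


def target_path(properties, project_properties, main_target_name):
    """Build the property-dependent part of a target path (rules 2 and 3)."""
    elements = []
    for name, value in properties:
        feature = FEATURES[name]
        if feature.free or feature.incidental:
            continue  # rule 2 only covers non-free, non-incidental properties
        elements.append(value if feature.implicit else name + "-" + value)
    # Rule 3: free properties that differ from the project's add one element.
    if free_set(properties) != free_set(project_properties):
        elements.append("main-target-" + main_target_name)
    return "/".join(elements)


print(target_path([("variant", "debug"), ("optimization", "off")], [], "a"))
print(target_path([("variant", "debug"), ("define", "FOO")], [], "a"))
```

With this toy feature table, the two calls reproduce the two example
paths shown above.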