[[bbv2.arch]]
B2 v2 architecture
------------------
4
This document is a work in progress. Do not expect much from it yet.
6
7[[bbv2.arch.overview]]
8Overview
9--------
10
The B2 implementation is structured in four components:
"kernel", "util", "build" and "tools". The first two are relatively
uninteresting, so we will focus on the remaining pair. The "build"
component provides the classes necessary to declare targets, determine
which properties should be used to build them, and create the
dependency graph. The "tools" component provides user-visible
functionality. It mostly allows declaring specific kinds of main
targets, as well as registering available tools, which are then used
when creating the dependency graph.
20
21[[bbv2.arch.build]]
22The build layer
23---------------
24
25The build layer has just four main parts -- metatargets (abstract
26targets), virtual targets, generators and properties.
27
28* Metatargets (see the "targets.jam" module) represent all the
29user-defined entities that can be built. The "meta" prefix signifies
30that they do not need to correspond to exact files or even files at all
31-- they can produce a different set of files depending on the build
32request. Metatargets are created when Jamfiles are loaded. Each has a
33`generate` method which is given a property set and produces virtual
34targets for the passed properties.
35* Virtual targets (see the "virtual-targets.jam" module) correspond to
36actual atomic updatable entities -- most typically files.
37* Properties are just (name, value) pairs, specified by the user and
38describing how targets should be built. Properties are stored using the
39`property-set` class.
40* Generators are objects that encapsulate specific tools -- they can
41take a list of source virtual targets and produce new virtual targets
42from them.
43
44The build process includes the following steps:
45
461.  Top-level code calls the `generate` method of a metatarget with some
47properties.
482.  The metatarget combines the requested properties with its
49requirements and passes the result, together with the list of sources,
50to the `generators.construct` function.
513.  A generator appropriate for the build properties is selected and its
52`run` method is called. The method returns a list of virtual targets.
4.  The virtual targets are returned to the top-level code, and for each
instance, the `actualize` method is called to set up nodes and updating
actions in the dependency graph kept inside the B2 engine. This
dependency graph is then updated, which runs the necessary commands.
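
The four steps above can be sketched as a toy model in Python. This is
not B2's implementation, which is written in Jam. Every class, field and
action name below is a hypothetical stand-in, properties are reduced to a
plain dictionary, and the engine's dependency graph is reduced to a
mapping from target name to its action and sources.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VirtualTarget:
    name: str
    action: str
    sources: tuple

    def actualize(self, graph):
        # Step 4: set up a node and its updating action in the engine graph.
        graph[self.name] = (self.action, self.sources)


class Generator:
    """Toy generator wrapping one tool; it applies when the toolset matches."""

    def __init__(self, toolset, action, suffix):
        self.toolset, self.action, self.suffix = toolset, action, suffix

    def run(self, name, sources):
        # Step 3: turn the sources into new virtual targets.
        return [VirtualTarget(name + self.suffix, self.action, tuple(sources))]


def construct(generators, name, properties, sources):
    # Step 3: select a generator appropriate for the build properties.
    for generator in generators:
        if properties.get("toolset") == generator.toolset:
            return generator.run(name, sources)
    raise LookupError(f"no generator for {properties}")


class Metatarget:
    def __init__(self, name, sources, requirements, generators):
        self.name, self.sources = name, sources
        self.requirements, self.generators = requirements, generators

    def generate(self, requested):
        # Step 2: combine requested properties with the requirements.
        properties = {**requested, **self.requirements}
        return construct(self.generators, self.name, properties, self.sources)


generators = [Generator("gcc", "gcc.compile", ".o"),
              Generator("msvc", "msvc.compile", ".obj")]
meta = Metatarget("a", ["a.cpp"], {"warnings": "on"}, generators)

graph = {}
for target in meta.generate({"toolset": "gcc"}):  # step 1: top-level call
    target.actualize(graph)
```

After the loop, `graph` holds one node for "a.o" together with the name
of the action that updates it. Running the commands themselves is left to
the engine and is not modelled here.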
57
58[[bbv2.arch.build.metatargets]]
59Metatargets
60~~~~~~~~~~~
61
There are several classes derived from "abstract-target". The
"main-target" class represents a top-level main target, the
"project-target" class acts as a container holding multiple main
targets, and the "basic-target" class is a base class for all further
target types.
67
68Since each main target can have several alternatives, all top-level
69target objects are actually containers, referring to "real" main target
70classes. The type of that container is "main-target". For example,
71given:
72
73....
74alias a ;
75lib a : a.cpp : <toolset>gcc ;
76....
77
we would have one top-level "main-target" instance, containing one
"alias-target" and one "lib-target" instance. The "generate" method of
"main-target" decides which of the alternatives should be used, and calls
"generate" on the corresponding instance.
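
A minimal Python sketch of this container arrangement follows. The
selection rule used here (an alternative is viable when all of its
condition properties were requested, and the viable alternative with the
largest condition wins) is a simplifying assumption made for the
example, and the class and method names are hypothetical.

```python
class Alternative:
    def __init__(self, kind, condition):
        self.kind = kind
        # Properties that must be present in the build request.
        self.condition = frozenset(condition)

    def generate(self, properties):
        return f"{self.kind} built with {sorted(properties)}"


class MainTarget:
    """Container for all alternatives declared under one name."""

    def __init__(self, name):
        self.name = name
        self.alternatives = []

    def select(self, properties):
        viable = [a for a in self.alternatives if a.condition <= properties]
        if not viable:
            raise LookupError(f"no viable alternative for {self.name}")
        # Assumption: prefer the alternative with the largest condition.
        return max(viable, key=lambda a: len(a.condition))

    def generate(self, properties):
        properties = frozenset(properties)
        return self.select(properties).generate(properties)


# Mirrors: alias a ;   lib a : a.cpp : <toolset>gcc ;
a = MainTarget("a")
a.alternatives.append(Alternative("alias-target", []))
a.alternatives.append(Alternative("lib-target", ["<toolset>gcc"]))

with_gcc = a.select(frozenset({"<toolset>gcc"}))
with_msvc = a.select(frozenset({"<toolset>msvc"}))
```

In this model a request containing `<toolset>gcc` selects the
"lib-target" alternative, while any other request falls back to the
unconditional "alias-target".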
82
83Each alternative is an instance of a class derived from "basic-target".
84"basic-target.generate" does several things that should always be done:
85
86* Determines what properties should be used for building the target.
87This includes looking at requested properties, requirements, and usage
88requirements of all sources.
89* Builds all sources.
90* Computes usage requirements that should be passed back to targets
91depending on this one.
92
93For the real work of constructing a virtual target, a new method
94"construct" is called.
95
96The "construct" method can be implemented in any way by classes derived
97from "basic-target", but one specific derived class plays the central
98role -- "typed-target". That class holds the desired type of file to be
99produced, and its "construct" method uses the generators module to do
100the actual work.
101
This means that a specific metatarget subclass may avoid using
generators altogether. However, this is deprecated and we are trying
to eliminate all such subclasses at the moment.
105
Note that the `build/targets.jam` file contains a UML diagram which
might help.
108
109[[bbv2.arch.build.virtual]]
110Virtual targets
111~~~~~~~~~~~~~~~
112
Virtual targets are atomic updatable entities. Each virtual target can
be assigned an updating action, which is an instance of the `action`
class. The action class, in turn, contains a list of source targets,
properties, and the name of the action which should be executed.
117
We try hard to never create equal instances of the `virtual-target`
class. Code creating virtual targets passes them through the
`virtual-target.register` function, which detects if a target with the
same name, sources, and properties has already been created. In that
case, the preexisting target is returned.
123
When all virtual targets are produced, they are "actualized". This means
that the real file names are computed, and the commands that should be
run are generated. This is done by the `virtual-target.actualize` and
`action.actualize` methods. The first is conceptually simple, while the
second needs additional explanation. Commands in the B2 engine are
generated in a two-stage process. First, a rule with an appropriate name
(for example "gcc.compile") is called and is given a list of target
names. The rule sets some variables, like "OPTIONS". After that, the
command string is taken and variables are substituted, so a use of
OPTIONS inside the command string gets transformed into actual compile
options.
134
B2 added a third stage to simplify things. It is now possible
to automatically convert properties to appropriate variable assignments.
For example, <debug-symbols>on would add "-g" to the OPTIONS variable,
without requiring this logic to be added manually to gcc.compile. This
functionality is part of the "toolset" module.
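
The three stages can be mimicked in Python as follows. This is only a
sketch under stated assumptions: the `FLAGS` table stands in for what the
"toolset" module automates, the variable names `TARGET` and `SOURCES` and
the command template are invented for the example, and only the `$(NAME)`
form of variable reference is handled.

```python
import re

# Stage 3 (hypothetical table): properties mapped to variable assignments.
FLAGS = {"<debug-symbols>on": ("OPTIONS", "-g")}


def variables_from_properties(properties):
    variables = {}
    for prop in properties:
        if prop in FLAGS:
            name, value = FLAGS[prop]
            variables.setdefault(name, []).append(value)
    return variables


def compile_rule(targets, sources, variables):
    # Stage 1: the rule only sets variables; it does not build the command.
    variables["TARGET"] = list(targets)
    variables["SOURCES"] = list(sources)


def expand(command, variables):
    # Stage 2: substitute $(NAME) references in the command string.
    return re.sub(r"\$\((\w+)\)",
                  lambda m: " ".join(variables.get(m.group(1), [])),
                  command)


variables = variables_from_properties(["<debug-symbols>on"])
compile_rule(["a.o"], ["a.cpp"], variables)
command = expand("g++ $(OPTIONS) -c -o $(TARGET) $(SOURCES)", variables)
```

With the inputs above, `command` becomes `g++ -g -c -o a.o a.cpp`. The
rule itself never mentions "-g"; it arrives through the property table.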
140
Note that the `build/virtual-targets.jam` file contains a UML diagram
which might help.
143
144[[bbv2.arch.build.properties]]
145Properties
146~~~~~~~~~~
147
Above, we noted that metatargets are built with a set of properties.
That set is represented by the `property-set` class. An important point
is that handling of property sets can get very expensive. For that
reason, we make sure that for each set of (name, value) pairs only one
`property-set` instance is created. The `property-set` class caches the
results of its operations extensively, so most repeated work is avoided.
The `property-set.create` function is the factory used to create
instances of the `property-set` class.
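
The "one instance per set of pairs" idea is commonly called interning. A
minimal Python sketch is shown below. Apart from the name `create`, which
mirrors the factory mentioned above, the class layout, the `add`
operation and the use of `functools.lru_cache` are assumptions made for
illustration.

```python
from functools import lru_cache


class PropertySet:
    _instances = {}

    def __init__(self, properties):
        self.properties = properties  # sorted tuple of (name, value) pairs

    @classmethod
    def create(cls, pairs):
        # Normalise first, so that order and duplicates do not matter.
        key = tuple(sorted(set(pairs)))
        if key not in cls._instances:
            cls._instances[key] = cls(key)
        return cls._instances[key]

    @lru_cache(maxsize=None)
    def add(self, other):
        # Cached per (self, other); safe because instances are interned.
        return PropertySet.create(self.properties + other.properties)


a = PropertySet.create([("toolset", "gcc"), ("variant", "debug")])
b = PropertySet.create([("variant", "debug"), ("toolset", "gcc")])
c = PropertySet.create([("link", "static")])
combined = a.add(c)
```

Because equal sets share one object, comparing two property sets
reduces to an identity check, and cached results can be keyed on the
objects themselves.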
156
157[[bbv2.arch.tools]]
158The tools layer
159---------------
160
161Write me!
162
163[[bbv2.arch.targets]]
164Targets
165-------
166
NOTE: THIS SECTION IS NOT EXPECTED TO BE READ! There are two
user-visible kinds of targets in B2. The first kind is "abstract"
targets, which correspond to things declared by the user, e.g. projects
and executable files. The primary thing about abstract targets is that it
is possible to request them to be built with a particular set of
properties. Each property combination may yield different built
files, so abstract targets do not have a direct correspondence to built
files.
175
File targets, on the other hand, are associated with concrete files.
Dependency graphs for abstract targets with specific properties are
constructed from file targets. The user has no way to create file targets
but can specify rules for detecting source file types, as well as rules
for transforming between file targets of different types. That
information is used in constructing the final dependency graph, as
described in the link:#bbv2.arch.depends[next section]. *Note:* File
targets are not the same entities as Jam targets; the latter are created
from file targets at the latest possible moment. *Note:* "File target" is
the originally proposed name for what we now call virtual targets. It is
more understandable by users, but has one problem: virtual targets can
potentially be "phony", and not correspond to any file.
188
189[[bbv2.arch.depends]]
190Dependency scanning
191-------------------
192
Dependency scanning is the process of finding implicit dependencies,
like "#include" statements in {CPP}. The requirements for a correct
dependency scanning mechanism are:
196
197* link:#bbv2.arch.depends.different-scanning-algorithms[Support for
198different scanning algorithms]. {CPP} and XML have quite different syntax
199for includes and rules for looking up the included files.
200* link:#bbv2.arch.depends.same-file-different-scanners[Ability to scan
201the same file several times]. For example, a single {CPP} file may be
202compiled using different include paths.
203* link:#bbv2.arch.depends.dependencies-on-generated-files[Proper
204detection of dependencies on generated files.]
205* link:#bbv2.arch.depends.dependencies-from-generated-files[Proper
206detection of dependencies from a generated file.]
207
208[[bbv2.arch.depends.different-scanning-algorithms]]
209Support for different scanning algorithms
210~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
211
Different scanning algorithms are encapsulated by objects called
"scanners". Please see the "scanner" module documentation for more
details.
215
216[[bbv2.arch.depends.same-file-different-scanners]]
217Ability to scan the same file several times
218~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
219
As stated above, it is possible to compile a {CPP} file multiple times,
using different include paths. Therefore, include dependencies for those
compilations can be different. The problem is that the B2 engine
does not allow multiple scans of the same target. To solve that, we pass
the scanner object when calling `virtual-target.actualize`, and it
creates different engine targets for different scanners.
226
For each engine target created with a specified scanner, a corresponding
one is created without it. The updating action is associated with the
scanner-less target, and the target with the scanner is made to depend
on it. That way, if sources for that action are touched, all targets,
with and without the scanner, are considered outdated.
232
233Consider the following example: "a.cpp" prepared from "a.verbatim",
234compiled by two compilers using different include paths and copied into
235some install location. The dependency graph would look like:
236
237....
a.o (<toolset>gcc)        <--(compile)-- a.cpp (scanner1) ----+
a.o (<toolset>msvc)       <--(compile)-- a.cpp (scanner2) ----|
a.cpp (installed copy)    <--(copy) ----------------------- a.cpp (no scanner)
                                                                 ^
                                                                 |
                       a.verbatim -------------------------------+
244....
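
The relationship between scanner-specific and scanner-less engine targets
can be modelled in a few lines of Python. The sketch assumes that engine
targets are identified by name, that a scanner-specific target is named
by prefixing the scanner in angle brackets, and that the dependency graph
is a dictionary of sets. None of this is the actual engine interface.

```python
def actualize(name, scanner, graph):
    """Return the engine target for `name` as seen through `scanner`.

    `graph` maps an engine target to the set of targets it depends on.
    """
    # The scanner-less target always exists; the updating action and the
    # real sources are attached to it.
    graph.setdefault(name, set())
    if scanner is None:
        return name
    scanned = f"<{scanner}>{name}"
    # The scanner-specific target only depends on the scanner-less one.
    graph.setdefault(scanned, set()).add(name)
    return scanned


graph = {"a.cpp": {"a.verbatim"}}  # a.cpp is prepared from a.verbatim
gcc_source = actualize("a.cpp", "scanner1", graph)
msvc_source = actualize("a.cpp", "scanner2", graph)
copy_source = actualize("a.cpp", None, graph)
```

Touching "a.verbatim" outdates "a.cpp", and through it both
scanner-specific targets, which is the behaviour described above.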
245
246[[bbv2.arch.depends.dependencies-on-generated-files]]
247Proper detection of dependencies on generated files.
248~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
249
This requirement breaks down into the following ones.
251
1.  If, when compiling "a.cpp", there is an include of "a.h", the "dir"
directory is on the include path, and a target called "a.h" will be
generated in "dir", then B2 should discover the include, and
create "a.h" before compiling "a.cpp".
2.  Since B2 almost always generates targets under the "bin"
directory, this should be supported as well. That is, in the scenario
above, a Jamfile in "dir" might create a main target which generates
"a.h". The file will be generated in the "dir/bin" directory, but we
still have to recognize the dependency.
261
262The first requirement means that when determining what "a.h" means when
263found in "a.cpp", we have to iterate over all directories in include
264paths, checking for each one:
265
1.  If there is a file named "a.h" in that directory, or
2.  If there is a target called "a.h", which will be generated in that
directory.
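
The lookup described by these two checks can be sketched in Python as
below. The function name and the representation of existing files and
generated targets as sets of (directory, filename) pairs are assumptions
made for the example. The real lookup happens inside the engine.

```python
def resolve_include(name, include_path, existing_files, generated_targets):
    """Return (kind, directory) for the first directory providing `name`.

    Directories are tried in include-path order. In each one, an existing
    file is checked first, then a target that will be generated there.
    """
    for directory in include_path:
        if (directory, name) in existing_files:
            return ("file", directory)
        if (directory, name) in generated_targets:
            return ("generated", directory)
    return None


include_path = ["native", "dir"]
generated = {("dir", "a.h")}

only_generated = resolve_include("a.h", include_path, set(), generated)
native_wins = resolve_include("a.h", include_path,
                              {("native", "a.h")}, generated)
```

When no file exists yet, the generated target in "dir" is found. When an
earlier directory already holds a real "a.h", that file is used and no
dependency on the generated header is introduced.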
269
Classic Jam has built-in facilities for point (1) above, but that is not
enough. It is hard to implement the right semantics without built-in
support. For example, we could try to check if there exists a target
called "a.h" somewhere in the dependency graph, and add a dependency on
it. The problem is that without a file search in the include path, the
semantics may be incorrect. For example, one can have an action that
generates some "dummy" header, for systems which do not have a native
one. Naturally, we do not want to depend on that generated header on
platforms where a native one is included.
279
There are two design choices for built-in support. Suppose we have files
a.cpp and b.cpp, and each one includes header.h, generated by some
action. The dependency graph created by classic Jam would look like:
283
284....
285a.cpp -----> <scanner1>header.h  [search path: d1, d2, d3]
286
287                  <d2>header.h  --------> header.y
288                  [generated in d2]
289
290b.cpp -----> <scanner2>header.h  [search path: d1, d2, d4]
291....
292
In this case, Jam thinks all header.h targets are unrelated. The
correct dependency graph might be:
295
296....
297a.cpp ----
298          \
299           >---->  <d2>header.h  --------> header.y
300          /       [generated in d2]
301b.cpp ----
302....
303
304or
305
306....
307a.cpp -----> <scanner1>header.h  [search path: d1, d2, d3]
308                          |
309                       (includes)
310                          V
311                  <d2>header.h  --------> header.y
312                  [generated in d2]
313                          ^
314                      (includes)
315                          |
316b.cpp -----> <scanner2>header.h [ search path: d1, d2, d4]
317....
318
The first alternative was used for some time. The problem, however, is:
what include paths should be used when scanning header.h? The second
alternative was suggested by Matt Armstrong. It has a similar effect:
any target depending on <scanner1>header.h will also depend on
<d2>header.h. This way, though, we have two different targets with two
different scanners, so those targets can be scanned independently. The
first alternative's problem is avoided, so the second alternative is
implemented now.
327
The second sub-requirement is that targets generated under the "bin"
directory are handled as well. B2 implements a semi-automatic
approach. When compiling {CPP} files the process is:
331
1.  The main target to which the compiled file belongs is found.
3332.  All other main targets that the found one depends on are found.
334These include: main targets used as sources as well as those specified
335as "dependency" properties.
3363.  All directories where files belonging to those main targets will be
337generated are added to the include path.
338
339After this is done, dependencies are found by the approach explained
340previously.
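
As a rough illustration of the three steps, the following Python sketch
collects the extra include directories for one source file. All three
lookup tables, and the target and directory names in them, are
hypothetical inputs. In B2 this information comes from the metatargets
themselves.

```python
def extra_include_dirs(source, owner, depends_on, output_dirs):
    main_target = owner[source]                   # step 1
    dirs = []
    for dep in depends_on.get(main_target, []):   # step 2
        for directory in output_dirs.get(dep, []):
            if directory not in dirs:             # step 3, no duplicates
                dirs.append(directory)
    return dirs


owner = {"a.cpp": "app"}
# Main targets used as sources or named in "dependency" properties.
depends_on = {"app": ["headers", "util"]}
output_dirs = {"headers": ["dir/bin"], "util": ["util/bin", "dir/bin"]}

dirs = extra_include_dirs("a.cpp", owner, depends_on, output_dirs)
```

The resulting list, `["dir/bin", "util/bin"]` here, would be added to the
include path before the include lookup described earlier is run.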
341
342Note that if a target uses generated headers from another main target,
343that main target should be explicitly specified using the dependency
344property. It would be better to lift this requirement, but it does not
345seem to be causing any problems in practice.
346
347For target types other than {CPP}, adding of include paths must be
348implemented anew.
349
350[[bbv2.arch.depends.dependencies-from-generated-files]]
351Proper detection of dependencies from generated files
352~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
353
Suppose file "a.cpp" includes "a.h" and both are generated by some
action. Note that classic Jam has two stages. In the first stage the
dependency graph is built and actions to be run are determined. In the
second stage the actions are executed. Initially, neither file exists,
so the include is not found. As a result, Jam might attempt to compile
a.cpp before creating a.h, causing the compilation to fail.
360
361The solution in Boost.Jam is to perform additional dependency scans
362after targets are updated. This breaks separation between build stages
363in Jam — which some people consider a good thing — but I am not aware of
364any better solution.
365
In order to understand the rest of this section, you should first read
some details about Jam's dependency scanning, available at
http://public.perforce.com:8080/@md=d&cd=//public/jam/src/&ra=s&c=kVu@//2614?ac=10[this
link].
370
371Whenever a target is updated, Boost.Jam rescans it for includes.
372Consider this graph, created before any actions are run.
373
374....
375A -------> C ----> C.pro
376     /
377B --/         C-includes   ---> D
378....
379
Both A and B have a dependency on C and C-includes (the latter dependency
is not shown). Say that during building we have tried to create A, then
tried to create C, and successfully created C.
383
384In that case, the set of includes in C might well have changed. We do
385not bother to detect precisely which includes were added or removed.
386Instead we create another internal node C-includes-2. Then we determine
387what actions should be run to update the target. In fact this means that
388we perform the first stage logic when already in the execution stage.
389
After actions for C-includes-2 are determined, we add C-includes-2 to
the list of A's dependencies, and stage 2 proceeds as usual.
Unfortunately, we cannot do the same with target B: since B has not been
visited yet, target C does not know that B depends on it. So, we add a
flag to C marking it as rescanned. When visiting the B target, the flag
is noticed and C-includes-2 is added to the list of B's dependencies as
well.
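
The bookkeeping can be sketched with a toy graph in Python. This only
models the flag mechanism described above. The `Node` class, the `update`
and `visit` functions and the `scan` callback are invented for the
example and do not correspond to engine functions.

```python
class Node:
    def __init__(self, name, dependencies=()):
        self.name = name
        self.dependencies = list(dependencies)
        self.rescanned = None  # fresh includes node, set after an update


def update(node, parent, scan):
    """Run after `node` was rebuilt while `parent` was being built."""
    fresh = Node(node.name + "-includes-2",
                 [Node(header) for header in scan(node.name)])
    parent.dependencies.append(fresh)  # the current parent learns at once
    node.rescanned = fresh             # flag for parents visited later
    return fresh


def visit(parent):
    """Pick up fresh includes nodes of dependencies already rescanned."""
    for dep in list(parent.dependencies):
        fresh = dep.rescanned
        if fresh is not None and fresh not in parent.dependencies:
            parent.dependencies.append(fresh)


c = Node("C")
a = Node("A", [c])
b = Node("B", [c])

update(c, a, lambda name: ["D"])   # C was rebuilt while building A
before = [d.name for d in b.dependencies]
visit(b)                           # B is visited later and sees the flag
after = [d.name for d in b.dependencies]
```

Before the visit, B depends only on C. Afterwards it also depends on the
same C-includes-2 node that A received directly.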
397
398Note also that internal nodes are sometimes updated too. Consider this
399dependency graph:
400
401....
402a.o ---> a.cpp
403            a.cpp-includes -->  a.h (scanned)
404                                   a.h-includes ------> a.h (generated)
405                                                                 |
406                                                                 |
407            a.pro <-------------------------------------------+
408....
409
Here, our handling of generated headers comes into play. Say that a.h
exists but is out of date with respect to "a.pro"; then "a.h
(generated)" and "a.h-includes" will be marked for updating, but "a.h
(scanned)" will not. We have to rescan "a.h" after it has been created,
but since "a.h (generated)" has no associated scanner, it is only
possible to rescan "a.h" after the "a.h-includes" target has been updated.
416
The above considerations led to the decision to rescan a target whenever
it is updated, no matter if it is internal or not.
419
420________________________________________________________________________________________________________
421*Warning*
422
423The remainder of this document is not intended to be read at all. This
424will be rearranged in the future.
425________________________________________________________________________________________________________
426
427File targets
428------------
429
430As described above, file targets correspond to files that B2
431manages. Users may be concerned about file targets in three ways: when
432declaring file target types, when declaring transformations between
433types and when determining where a file target is to be placed. File
434targets can also be connected to actions that determine how the target
435is to be created. Both file targets and actions are implemented in the
436`virtual-target` module.
437
438Types
439~~~~~
440
A file target can be given a type, which determines what transformations
can be applied to the file. The `type.register` rule declares new types.
A file type can also be assigned a scanner, which is then used to find
implicit dependencies. See "link:#bbv2.arch.depends[dependency
scanning]".
446
447Target paths
448~~~~~~~~~~~~
449
To distinguish targets built with different properties, they are put in
different directories. Rules for determining target paths are given
below:
453
1.  All targets are placed under a directory corresponding to the
project where they are defined.
2.  Each non-free, non-incidental property causes an additional element
to be added to the target path. That element has the form
`<feature-name>-<feature-value>` for ordinary features and
`<feature-value>` for implicit ones. [TODO: Add note about composite
features].
3.  If the set of free, non-incidental properties is different from the
set of free, non-incidental properties for the project in which the main
target that uses the target is defined, a part of the form
`main_target-<name>` is added to the target path. **Note:** It would be
nice to completely track free features also, but this appears to be
complex and not extremely needed.
467
468For example, we might have these paths:
469
470....
471debug/optimization-off
472debug/main-target-a
473....
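
Rules 1 and 2 can be expressed as a small Python function. The feature
attribute sets below are assumptions chosen for the example, not a
complete or authoritative list, and rule 3 (the main-target element) is
deliberately left out.

```python
# Assumed feature attributes, for illustration only.
IMPLICIT = {"variant", "toolset"}
FREE = {"define", "include"}
INCIDENTAL = {"warnings"}


def target_path(project_dir, properties):
    elements = [project_dir]                       # rule 1
    for name, value in properties:
        if name in FREE or name in INCIDENTAL:
            continue                               # adds no path element
        if name in IMPLICIT:
            elements.append(value)                 # rule 2, implicit
        else:
            elements.append(f"{name}-{value}")     # rule 2, ordinary
    return "/".join(elements)


path = target_path("project", [("variant", "debug"),
                               ("optimization", "off"),
                               ("define", "NDEBUG")])
```

For these inputs `path` is `project/debug/optimization-off`: the implicit
`variant` feature contributes only its value, the ordinary `optimization`
feature contributes name and value, and the free `define` feature
contributes nothing.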
474