1[article Boost.AutoIndex
2    [quickbook 1.5]
3    [copyright 2008, 2011 John Maddock]
4    [license
5        Distributed under the Boost Software License, Version 1.0.
6        (See accompanying file LICENSE_1_0.txt or copy at
7        [@http://www.boost.org/LICENSE_1_0.txt])
8    ]
9    [authors [Maddock, John]]
10    [/last-revision $Date: 2008-11-04 17:11:53 +0000 (Tue, 04 Nov 2008) $]
11]
12
13[def __quickbook  [@http://www.boost.org/doc/tools/quickbook/index.html Quickbook]]
14[def __boostbook [@http://www.boost.org/doc/html/boostbook.html BoostBook]]
15[def __boostbook_docs [@http://www.boost.org/doc/libs/1_41_0/doc/html/boostbook.html BoostBook documentation]]
16[def __quickbook_syntax [@http://www.boost.org/doc/libs/1_41_0/doc/html/quickbook/ref.html Quickbook Syntax Compendium]]
17[def __docbook [@http://www.docbook.org/ DocBook]]
18[def __docbook_params [@http://docbook.sourceforge.net/release/xsl/current/doc/ Docbook xsl:param format options]]
19[def __DocObjMod [@http://en.wikipedia.org/wiki/Document_Object_Model Document Object Model (DOM)]]
20
21[def __doxygen [@http://www.doxygen.org/ Doxygen]]
22[def __pdf [@http://www.adobe.com/products/acrobat/adobepdf.html PDF]]
23
24[template deg[]'''°'''] [/ degree sign ]
25
26
27[section:overview Overview]
28
29AutoIndex is a tool for taking the grunt work out of indexing a
30Boostbook\/Docbook document
31(perhaps generated by your Quickbook file mylibrary.qbk,
and perhaps also using Doxygen autodoc)
33that describes C\/C++ code.
34
35Traditionally, in order to index a Docbook document you would
36have to manually add a large amount of `<indexterm>` markup:
37in fact one `<indexterm>` for each occurrence of each term to be
38indexed.
39
40Instead AutoIndex will automatically scan one or more C\/C++ header files
41and extract all the ['function], ['class], ['macro] and ['typedef]
42names that are defined by those headers, and then insert the
43`<indexterm>`s into the Docbook XML document for you.
44
AutoIndex can also scan using a list of index terms
specified in a script file, for example index.idx.
These manually provided terms can optionally be regular expressions,
and allow the user to find references to terms
that may not occur in the C++ header files.  Of course providing a manual
list of search terms to index is a tedious task
(especially handling plurals and variants),
and requires enough knowledge of the library
to guess what users may be seeking to know,
but at least the real 'grunt work' of
finding the term and listing the page number is automated.
56
For each occurrence of each search term, AutoIndex creates two index entries:
60
61# The search term as the ['primary index key] and
62 the ['title of the section it appears in] as a subterm.
63
64# The section title as the main index entry and the search term as the subentry.
65
66Thus the user has two chances to find what they're
67looking for, based upon either the section name
68or the ['function], ['class], ['macro] or ['typedef] name.
69
70[note This behaviour can be changed so that only one index entry is created
71 (using the search term as the key and
72 not using the section name except as a sub-entry of the search term).]
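
The two-entry scheme described above can be sketched in a few lines of Python.
This is only an illustration of the idea, not AutoIndex's actual implementation:
the helper `make_entries`, its `use_section_names` argument and the sample
section titles are all hypothetical.

```python
from collections import defaultdict


def make_entries(occurrences, use_section_names=True):
    """Build {primary key: sorted sub-entries} from (term, section title) pairs.

    Hypothetical helper: it mirrors the two-entry scheme described in the
    text, not the tool's real data structures.
    """
    index = defaultdict(set)
    for term, section_title in occurrences:
        # Entry 1: the search term is the primary key, the section a sub-entry.
        index[term].add(section_title)
        if use_section_names:
            # Entry 2: the section title is the primary key, the term a sub-entry.
            index[section_title].add(term)
    return {key: sorted(subs) for key, subs in index.items()}


# Made-up occurrences: (search term, title of the section it was found in).
occurrences = [
    ("students_t_distribution", "Students t Distribution"),
    ("students_t_distribution", "Distribution Algorithms"),
    ("quantile", "Students t Distribution"),
]

index = make_entries(occurrences)
terms_only = make_entries(occurrences, use_section_names=False)
```

Here `index` holds both kinds of entry, while `terms_only` corresponds to the
alternative behaviour mentioned in the note, where section titles never become
primary keys.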
73
74So for example in Boost.Math the class name `students_t_distribution` has a primary
75entry that lists all sections the class name appears in:
76
77[$../students_t_eg_1.png]
78
79Then those sections also have primary entries, which list all the search terms those
80sections contain:
81
82[$../students_t_eg_2.png]
83
84Of course these automated index entries may not be quite
85what you're looking for: often you'll get a few spurious entries, a few missing entries,
86and a few entries where the section name used as an index entry is less than ideal.
87So AutoIndex provides some powerful regular expression based rules that allow you
88to add, remove, constrain, or rewrite entries.  Normally just a few lines in
89AutoIndex's script file are enough to tailor the output to match the author's
90expectations (and thus hopefully the index user's expectations too!).
91
AutoIndex also supports multiple indexes (as does Docbook), and since it knows
which search terms are ['function], ['class], ['macro] or ['typedef] names, it
can add the necessary attributes to the XML so that you can have separate
indexes for each of these different types.  These specialised indexes only contain
entries for the ['function], ['class], ['macro] or ['typedef] names: unlike the
main "include everything" index, ['section names] are never used as primary
index terms here.
99
While the Docbook XSL stylesheets create nice indexes complete with page
numbers for PDF output, the HTML indexes look poorer by comparison, as these use
section titles in place of page numbers.  Since AutoIndex also uses section titles
as index entries, this leads to a lot of repetition, so as an alternative AutoIndex
can be instructed to construct the index itself.  This is faster than using
the XSL stylesheets, and each index entry is then a hyperlink to the
appropriate section:
107
108[$../students_t_eg_3.png]
109
110With internal index generation there is also a helpful navigation bar
111at the start of each Index:
112
113[$../students_t_eg_4.png]
114
115Finally, you can choose what kind of XML container wraps an internally generated index -
116this defaults to `<section>...</section>` but you can use either command line options
117or Boost.Build Jamfile features, to select an alternative wrapper - for example ['appendix]
118or ['chapter] would be good choices, whatever fits best into the flow of the
119document.  You can even set the container wrapper to type ['index] provided you turn
120off index generation by the XSL stylesheets, for example by setting the following
121build requirements in the Jamfile:
122
123[pre
124<format>html:<auto-index-internal>on       # Use internally generated indexes.
125<auto-index-type>index                     # Use <index>...</index> as the XML wrapper.
126<format>html:<xsl:param>generate.index=0   # Don't let the XSL stylesheets generate indexes.
127]
128
129[endsect] [/section:overview Overview]
130
131[section:tut Getting Started and Tutorial]
132
133[section:build Step 1: Build the AutoIndex tool]
134
135[note This step is strictly optional, but very desirable to speed up build times.]
136
137cd into `tools/auto_index/build` and invoke bjam as:
138
139   bjam release
140
141Optionally pass the name of the compiler toolset you want to use to bjam as well:
142
143   bjam release gcc
144
This will build the tool and place a copy in the current directory (that is, `tools/auto_index/build`).
146
147Now open up your `user-config.jam` file and at the end of the file add the line:
148
149[pre
150using auto-index : ['full-path-to-boost-tree]/tools/auto_index/build/auto-index.exe ;
151]
152
153[note
154This declaration must go towards the end of `user-config.jam`, or in any case after the Boostbook initialisation.
155
156Also note that Windows users must use forward slashes in the paths in `user-config.jam`]
157
158[endsect] [/section:build Step 1: Build the AutoIndex tool]
159
160[section:configure Step 2: Configure Boost.Build jamfile to use AutoIndex]
161
162Assuming you have a Jamfile for building your documentation that looks
163something like:
164
165[pre
166boostbook standalone
167    :
168        mylibrary
169    :
170        # build requirements go here:
171    ;
172]
173
174Then add the line:
175
176[pre using auto-index ; ]
177
178to the start of the Jamfile, and then add whatever auto-index options
179you want to the ['build requirements section], for example:
180
181[pre
182   boostbook standalone
183    :
184        mylibrary
185    :
186        # Build requirements go here:
187
        # <auto-index>on (or off) turns on (or off) indexing:
189        <auto-index>on
190
191        # Turns on (or off) auto-index-verbose for diagnostic info.
192        # This is highly recommended until you have got all the many details correct!
193        <auto-index-verbose>on
194
195        # Choose the indexing method (separately for html and PDF) - see manual.
196        # Choose indexing method for PDFs:
197        <format>pdf:<auto-index-internal>off
198
199        # Choose indexing method for html:
200        <format>html:<auto-index-internal>on
201
202        # Set the name of the script file to use (index.idx is popular):
203        <auto-index-script>index.idx
204        # Commands in the script file should all use RELATIVE PATHS
205        # otherwise the script will not be portable to other machines.
206        # Relative paths are normally taken as relative to the location
207        # of the script file, but we can add a prefix to all
208        # those relative paths using the <auto-index-prefix> feature.
209        # The path specified by <auto-index-prefix> may be either relative or
210        # absolute, for example the following will get us up to the boost root
211        # directory for most Boost libraries:
212        <auto-index-prefix>..\/..\/..
213
214        # Tell Quickbook that it should enable indexing.
        <quickbook-define>enable_index
216
217    ;
218] [/pre]
219
220[section:options Available Indexing Options]
221
222The available options are:
223
224[variablelist
225[[<auto-index>off/on][Turns indexing of the document on, defaults to
226"off", so be sure to set this if you want AutoIndex invoked!]]
227[[<auto-index-internal>off/on][Chooses whether AutoIndex creates the index
228itself (feature on), or whether it simply inserts the necessary DocBook
229markup so that the DocBook XSL stylesheets can create the index.  Defaults to "off".]]
230[[<auto-index-script>filename][Specifies the name of the script to load.]]
231[[<auto-index-no-duplicates>off/on][When ['on] AutoIndex will only index a term
232once in any given section, otherwise (the default) multiple index entries per
233term may be created if the term occurs more than once in the section.]]
[[<auto-index-section-names>off/on][When ['on] AutoIndex will create two
index entries for each term found - one uses the term itself as the primary
index key, the other uses the enclosing section name.  When ['off] the index
entry that uses the section title is not created.  Defaults to "on".]]
238[[<auto-index-verbose>off/on][Defaults to "off".  When turned on AutoIndex
239prints progress information - useful for debugging purposes during setup.]]
240[[<auto-index-prefix>filename][Optionally specifies a directory to apply
241as a prefix to all relative file paths in the script file.
242
243You may wish to do this to reduce typing of pathnames, and\/or where the
244paths can't be located relative to the script file location,
245typically if the headers are in the Boost trunk,
246but the script file is in Boost sandbox.
247
248For Boost standard library layout,
249[^<auto-index-prefix>..\/..\/..] will get you back up to the 'root' of the Boost tree,
250so [^!scan-path boost\/mylibrary\/] is where your headers will be, and [^libs\/mylibrary] for other files.
251Without a prefix all relative paths are relative to the location of the script file.
252]]
253
[[<auto-index-type>element-name][Specifies the name of the XML element in which to enclose an internally generated index:
  defaults to ['section], but could equally be ['appendix] or ['chapter] or some other block level element that has a formal title.
   The actual list of available options depends upon the Quickbook document type.  The following table gives the available options,
   assuming that the index is placed at the top level, and not in some sub-section or other container:]]
258]
259
260[table
261[[Document Type][Permitted Index Types]]
262[[book][appendix index article chapter reference part]]
263[[article][section appendix index sect1]]
264[[chapter][section index sect1]]
265[[library][The same as Chapter (section index sect1)]]
266[[part][appendix index article chapter reference]]
267[[appendix][section index sect1]]
268[[preface][section index sect1]]
269[[qandadiv][N/A: an index would have to be placed within a subsection of the document.]]
270[[qandaset][N/A: an index would have to be placed within a subsection of the document.]]
271[[reference][N/A: an index would have to be placed within a subsection of the document.]]
272[[set][N/A: an index would have to be placed within a subsection of the document.]]
273]
274
275In large part then the choice of `<auto-index-type>element-name` depends on the
276formatting you want to be applied to the index:
277
278[table
279[[XML Container Used for the Index][Formatting Applied by the XSL Stylesheets]]
280[[appendix][Starts a new page.]]
281[[article][Starts a new page.]]
282[[chapter][Starts a new page.]]
283[[index][Starts a new page only if it's contained within an article or book.]]
284[[part][Starts a new page.]]
285[[reference][Starts a new page.]]
286[[sect1][Starts a new page as long as it's not the first section (but is controlled by the XSL parameters chunk.section.depth and/or chunk.first.sections).]]
287[[section][Starts a new page as long as it's not the first section or nested within another section (but is controlled by the XSL parameters chunk.section.depth and/or chunk.first.sections).]]
288]
289
290In almost all cases the default (section) is the correct choice - the exception is when the index is to be placed
291directly inside a /book/ or /part/, in which case you should probably use the same XML container for the index as
292you use for whatever subdivisions are in the /book/ or /part/.  In any event placing a /section/ within a /book/ or
293/part/ will result in invalid XML.
294
295Finally, if you are using Quickbook to generate the documentation, then you may wish to add:
296
297[pre <include>$boost-root/tools/auto_index/include]
298
to your project's requirements (replacing $boost-root with the path to the root of the Boost tree), so that
the file auto_index_helpers.qbk can be included in your Quickbook source with simply:
301
302[pre \[include auto_index_helpers.qbk\]]
303
304[endsect] [/section:options Available Indexing Options]
305
306[section:optional Making AutoIndex optional]
307
308It is considerate to make the [*use of auto-index optional] in Boost.Build,
309to allow users who do not have AutoIndex installed to still be able to build your documentation.
310
This is also very convenient while you are refining your documentation,
as it lets you decide whether or not to build the indexes:
building indexes can take a long time, and if you are just correcting typos
you won't want to wait while the index is rebuilt each time!

One method of setting up optional AutoIndex support is to place all
AutoIndex configuration in the body of a bjam if statement:
318
319[pre
320  if --enable-index in  \[ modules.peek : ARGV \]
321  {
     ECHO "Building the my_library docs with automatic index generation enabled." ;

     using auto-index ;
     project : requirements
          <auto-index>on
          <auto-index-script>index.idx

          # ... other AutoIndex options here ...
330
331        # And tell Quickbook that it should enable indexing.
332        <quickbook-define>enable_index
333    ;
334  }
335  else
336  {
337     ECHO "Building the my_library docs with automatic index generation disabled. To get an Index, try building with --enable-index." ;
338  }
339] [/pre]
340
You will also need to add a conditional statement at the end of your Quickbook file,
so that the indexes are added after the last section only if indexing is enabled.
343
344[pre
345\[\? '''enable_index'''
346\'\'\'
347  <index/>
348\'\'\'
349\]
350] [/pre]
351
352
353To use this jamfile, you need to cd to your docs folder, for example:
354
355 cd \boost-sandbox\guild\mylibrary\libs\mylibrary\doc
356
357and then run `bjam` to build the docs without index, for example:
358
359  bjam -a html > mylibrary_html.log
360
361or with index(es)
362
363  bjam -a html --enable-index > mylibrary_html_index.log
364
365[endsect] [/section:optional Making AutoIndex optional]
366
[tip Always send the output to a log file.
It will contain a lot of output, but it is invaluable for checking that all has gone right,
or else diagnosing what has gone wrong.
370]  [/tip]
371
372[tip A return code of 0 is not a reliable indication
373that you have got what you really want -
374inspecting the log file is the only certain way.
375] [/tip]
376
[tip If you upgrade compiler version, for example MSVC from 9 to 10,
then you may need to rebuild AutoIndex
to avoid what Microsoft call a 'side-by-side' error.
Also make sure that the auto-index.exe you are using is the new one.
381] [/tip]
382
383[endsect] [/section:configure Step 2: Configure Boost.Build to use AutoIndex]
384
385[section:add_indexes Step 3: Add indexes to your documentation]
386
To add a single "include everything" index to a BoostBook\/Docbook document
(perhaps generated using Quickbook, and perhaps also using a Doxygen reference section),
add `<index/>` at the location where you want the index to appear.
The index will be rendered as a separate section called "Index"
when the documentation is built.
392
To add multiple indexes, give each one a title and set its
`type` attribute to specify which terms will be included.  For example,
to place the ['function], ['class], ['macro] and ['typedef] names
indexed by ['AutoIndex] in separate indexes, along with a main
"include everything" index, one could add:
398
399[pre
400<index type\="class_name">
401<title>Class Index<\/title>
402<\/index>
403
404<index type\="typedef_name">
405<title>Typedef Index<\/title>
406<\/index>
407
408<index type\="function_name">
409<title>Function Index<\/title>
410<\/index>
411
412<index type\="macro_name">
413<title>Macro Index<\/title>
414<\/index>
415
416<index\/>
417]
418
[note Multiple indexes like this only work correctly if you tell the XSL stylesheets
to honor the "type" attribute on each index, as by default [*they do not do this].
You can turn the feature on by adding `<xsl:param>index.on.type=1` to your project's
requirements in the Jamfile.]
423
In Quickbook, you add the same markup but enclose it between two triple-tick \'\'\' escapes,
thus:
426
427[pre   \'\'\'<index\/>\'\'\' ]
428
429Or more easily via the helper file auto_index_helpers.qbk, so that given:
430
431[pre \[include auto_index_helpers.qbk\]]
432
433one can simply write:
434
435[pre
436\[named_index class_name Class Index\]
437\[named_index function_name Function Index\]
438\[named_index typedef_name Typedef Index\]
439\[named_index macro_name Macro Index\]
440\[index\]
441]
442
443[note AutoIndex knows nothing of the XML `xinclude` element, so if
444you're writing raw Docbook XML then you may want to run this through an
445XSL processor to flatten everything to one XML file before passing to
446AutoIndex.  If you're using Boostbook or quickbook though, this all
447happens for you anyway, and AutoIndex will index the whole document
448including any sections included with `xinclude`.]
449
If you have turned on AutoIndex's internal index generation with
451
452[pre
453<auto-index-internal>on
454]
455(usually recommended for HTML output, but ['not] the default)
456then you can also decide what kind of XML wrapper the generated index is placed in.
457By default this is a `<section>...</section>` XML block (this replaces the original
458`<index>...</index>` block).  However, depending upon the structure of the document
459and whether or not you want the index on a separate page - or else on the front page after
460the TOC - you may want to place the index inside a different type of XML block.  For example
461if your document uses `<chapter>` top level content rather than `<section>`s then
462it may be preferable to place the index in a `<chapter>` or `<appendix>` block.
You can also place the index inside an `<index>` block if you prefer, in which case the index
does not appear on a page of its own, but after the TOC in the HTML output.
465
466You control the type of XML block used by setting the =<auto-index-type>element-name=
467attribute in the Jamfile, or via the `index-type=element-name` command line option to
468AutoIndex itself.  For example, to place the index in an appendix, your Jamfile might
469look like:
470
471[pre
472using quickbook ;
473using auto-index ;
474
xml mylibrary : mylibrary.qbk ;
476boostbook standalone
477    :
478        mylibrary
479    :
480        # auto-indexing is on:
481        <auto-index>on
482
483        # PDFs rely on the XSL stylesheets to generate the index:
484        <format>pdf:<auto-index-internal>off
485
486        # HTML output uses auto-index to generate the index:
487        <format>html:<auto-index-internal>on
488
489        # Name of script file to use:
490        <auto-index-script>index.idx
491
        # Set the XML wrapper for HTML Indexes to "appendix":
        <format>html:<auto-index-type>appendix

        # Turn on multiple index support:
        <xsl:param>index.on.type=1
    ;
497]
498
499
500[endsect] [/section:add_indexes Step 3: Add indexes to your documentation]
501
[section:script Step 4: Create the .idx script file - to control what terms to index]
503
504AutoIndex works by reading a script file that tells it what terms to index.
505
If your document contains largely text, and only a small amount of simple C++,
and/or if you are using Doxygen to provide a C++ Reference section
(that lists the C++ elements),
and/or if you are relying on the indexing provided by a standalone Doxygen index,
you may decide that an index of the C++ names is not needed
and that you only want the text part indexed.

But if you want C++ classes, functions, typedefs and/or macros AutoIndexed,
the script file can also tell AutoIndex which C++ files to scan.
515
At its simplest, it will scan one or more headers for terms that
should be indexed in the documentation.  So for example to scan
"myheader.hpp" and "mydetailsheader.hpp" the script file would just contain:
519
520   !scan myheader.hpp
521   !scan mydetailsheader.hpp
522
Or, more likely in practice,
we can recursively scan through directories looking for all
the files to scan whose [*name matches a particular regular expression]:
526
527[pre !scan-path "boost\/mylibrary" ".*\.hpp" true ]
528
529Each argument is whitespace separated and can be optionally
530enclosed in "double quotes" (recommended).
531
The final ['true] argument indicates
that subdirectories of `boost/mylibrary` should be searched
recursively in addition to that directory.
535
536[caution The second ['file-name-regex] argument is a regular expression and not a filename GLOB!]
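
The difference between a regular expression and a glob can be checked with a
short Python sketch.  It uses Python's `re` module as a stand-in for the regular
expression engine used by AutoIndex, and it assumes that the expression has to
match the whole file name; the helper `select_headers` and the file names are
hypothetical.

```python
import re


def select_headers(file_names, file_name_regex):
    """Return the names that the regular expression matches in full."""
    pattern = re.compile(file_name_regex)
    return [name for name in file_names if pattern.fullmatch(name)]


names = ["bessel.hpp", "gamma.hpp", "notes.txt", "hpp_readme.md"]

# A regular expression: ".*" means "any characters", "\." is a literal dot.
headers = select_headers(names, r".*\.hpp")

# The glob "*.hpp" is rejected by Python's re module: "*" has nothing to repeat.
try:
    re.compile("*.hpp")
    glob_is_valid_regex = True
except re.error:
    glob_is_valid_regex = False
```

So a pattern written as a shell glob either fails outright or selects the wrong
files; write `.*\.hpp` rather than `*.hpp`.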
537
538[caution The scan-path is modified by any setting of <auto-index-prefix>.
539The examples here assume that this is [^<auto-index-prefix>..\/..\/..]
540so that `boost/mylibrary` will be your header files,
541`libs/mylibrary/doc` will contain your documentation files and
542`libs/mylibrary/example` will contain your examples.
543]
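
How the prefix combines with the paths in the script can be pictured with the
following Python sketch.  It assumes that a relative prefix is itself taken
relative to the directory holding the script file, which the text above does not
state explicitly; the helper `resolve` and the `/opt/boost` path are
hypothetical.

```python
import posixpath


def resolve(script_dir, prefix, relative_path):
    """Join a script-relative path with an optional prefix and normalise it."""
    if prefix and posixpath.isabs(prefix):
        # An absolute prefix replaces the script location entirely.
        base = prefix
    else:
        # Assumption: a relative (or empty) prefix is applied from the script directory.
        base = posixpath.join(script_dir, prefix or "")
    return posixpath.normpath(posixpath.join(base, relative_path))


script_dir = "libs/mylibrary/doc"  # where index.idx lives in the assumed layout

with_prefix = resolve(script_dir, "../../..", "boost/mylibrary")
examples = resolve(script_dir, "../../..", "libs/mylibrary/example")
without_prefix = resolve(script_dir, "", "boost/mylibrary")
absolute = resolve(script_dir, "/opt/boost", "boost/mylibrary")
```

With the prefix, `boost/mylibrary` resolves from the root of the Boost tree;
without it, the same path would be looked for underneath the doc directory.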
544
You could also scan any example (.cpp) files,
typically in the folder `libs/mylibrary/example`.
547
548[pre
549# All example source files, assuming no sub-folders.
550!scan-path "libs\/mylibrary\/example" ".*\.cpp"
551] [/pre]
552
553Often the ['scan] or ['scan-path] rules will bring in too many terms
554to search for, so we need to be able to exclude terms as well:
555
556   !exclude type
557
This excludes the term "type" from being indexed.
559
560We can also add terms manually:
561
562   foobar
563
564will index occurrences of "foobar" and:
565
566   foobar \<\w*(foo|bar)\w*\>
567
will index any whole word containing either "foo" or "bar" within it.
This is useful when you want to index a lot of similar or related
words under one entry, for example:
571
572   reflex
573
will only index occurrences of "reflex" as a whole word, but:
575
576   reflex \<reflex\w*\>
577
578will index occurrences of "reflex", "reflexing" and
579"reflexed" all under the same entry ['reflex].
580You will very often need to use this to deal with plurals and other variants.
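
The effect of the optional regular expression can be tried out in Python.
Note the assumptions: Python's `re` module does not support the `\<` and `\>`
word-boundary markers used in the script file, so this sketch substitutes `\b`;
the helper `matches` and the sample sentence are hypothetical, and this is not
how AutoIndex itself is implemented.

```python
import re


def matches(term, text, expression=""):
    """Return the words in `text` that would be indexed under `term`.

    With no expression, only the term itself as a whole word matches.
    """
    pattern = expression or r"\b" + re.escape(term) + r"\b"
    return [m.group(0) for m in re.finditer(pattern, text)]


text = "A reflex is fast; reflexes, reflexing and reflexed are variants."

# Term only: whole-word occurrences of "reflex".
plain = matches("reflex", text)

# Term plus expression: every word that starts with "reflex".
variants = matches("reflex", text, r"\breflex\w*\b")
```

All four words found by the second call would be listed under the single index
entry "reflex".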
581
582This inclusion rule can also restrict the term to
583certain sections, and add an index category that
584the term should belong to (so it only appears in certain
585indexes).
586
Finally the script can add rewrite rules that rename the section names
that are automatically used as index entries.  For example we might
want to remove leading "A" or "The" prefixes from section titles
when AutoIndex uses them as an index entry:
591
592   !rewrite-name "(?i)(?:A|The)\s+(.*)" "\1"
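
What this rule does to a few titles can be sketched in Python, whose `re` module
accepts the same expression.  The sketch assumes that a rule is applied only when
the expression matches the whole section title; the helper `rewrite_name` and the
sample titles are hypothetical.

```python
import re


def rewrite_name(title, expression, replacement):
    """Rewrite a section title if the expression matches the whole title."""
    match = re.fullmatch(expression, title)
    return match.expand(replacement) if match else title


# The expression and replacement from the !rewrite-name rule above.
rule = (r"(?i)(?:A|The)\s+(.*)", r"\1")

titles = ["The Students t Distribution", "a tutorial", "Data Types", "Overview"]
rewritten = [rewrite_name(title, *rule) for title in titles]
```

Titles that do not begin with the word "A" or "The" (in any letter case) are
left untouched.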
593
594[endsect] [/section:script Step 4: Create the script file -  to control what to terms to index]
595
596[section:entries Step 5: Add Manual Index Entries to Docbook XML - Optional]
597
If you add manual `<indexterm>` markup to your Docbook XML then it will be
passed through unchanged.  Please note however, that if you are using
AutoIndex's internal index generation then it only recognises
`<primary>`, `<secondary>` and `<tertiary>` elements within the `<indexterm>`.
`<see>` and `<seealso>` elements are not currently recognised
and AutoIndex will emit a warning if these are used.

Likewise none of the attributes which can be applied to these elements are used when
AutoIndex generates the index itself, with the exception of the `type` attribute.
607
608For Quickbook users, there are some templates in auto_index_helpers.qbk that assist
609in adding manual entries without having to escape to Docbook.
610
611[endsect]  [/section:entries Step 5: Add Manual Index Entries to Docbook XML - Optional]
612
613[section:pis Step 6: Using XML processing instructions to control what gets indexed.]
614
Sometimes you need to exclude certain sections of text from indexing.
You can achieve this with the following XML processing instructions:
617
618[table
619[[Instruction][Effect]]
620[[`<?BoostAutoIndex IgnoreSection?>`]
621   [Causes the whole of the current section to be excluded from indexing.
622    By "section" we mean either a true "section" or any sibling XML element:
623    "dedication", "toc", "lot", "glossary", "bibliography", "preface", "chapter",
624      "reference", "part", "article", "appendix", "index", "setindex", "colophon",
625      "sect1", "refentry", "simplesect", "section" or "partintro".]]
626[[`<?BoostAutoIndex IgnoreBlock?>`]
627   [Causes the whole of the current text block to be excluded from indexing.
628    A text block may be any of the section/chapter elements listed above, or a
629    paragraph, code listing, table etc.  The complete list is:
630    "calloutlist", "glosslist", "bibliolist", "itemizedlist", "orderedlist",
631      "segmentedlist", "simplelist", "variablelist", "caution", "important", "note",
632      "tip", "warning", "literallayout", "programlisting", "programlistingco",
633      "screen", "screenco", "screenshot", "synopsis", "cmdsynopsis", "funcsynopsis",
634      "classsynopsis", "fieldsynopsis", "constructorsynopsis",
635      "destructorsynopsis", "methodsynopsis", "formalpara", "para", "simpara",
636      "address", "blockquote", "graphic", "graphicco", "mediaobject",
637      "mediaobjectco", "informalequation", "informalexample", "informalfigure",
638      "informaltable", "equation", "example", "figure", "table", "msgset", "procedure",
639      "sidebar", "qandaset", "task", "productionset", "constraintdef", "anchor",
640      "bridgehead", "remark", "highlights", "abstract", "authorblurb" or "epigraph".]]
641]
642
643For Quickbook users the file auto_index_helpers.qbk contains a helper template
644that assists in inserting these processing instructions, for example:
645
646[pre \[AutoIndex IgnoreSection\]]
647
will cause that section not to be indexed.
649
650[endsect] [/section:pis Step 6: Using XML processing instructions to control what gets indexed.]
651
652[section:build_docs Step 7: Build the Docs]
653
654Using Boost.Build you build the docs with either:
655
656   bjam release > mylibrary_html.log
657
to build the HTML docs, or:
659
660   bjam pdf release > mylibrary_pdf.log
661
to build the PDF.
663
664During the build process you should see AutoIndex emit a message in the log file
665such as:
666
667[pre Indexing 990 terms... ]
668
669If you don't see that, or if it's indexing 0 terms then something is wrong!
670
671Likewise when index generation is complete, AutoIndex will emit another message:
672
673[pre 38 Index entries were created.]
674
675Again, if you see that 0 entries were created then something is wrong!
676
677Examine the log file, and if the cause is not obvious,
678make sure that you have [^<auto-index-verbose>on] and that
679any needed
680[^!debug regular-expression] directives are in your script file.
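
Checking the log for these two messages is easy to automate.  The following
Python sketch assumes that the messages appear in the log exactly as shown above;
the helpers `check_log` and `looks_healthy` and the sample log text are
hypothetical.

```python
import re


def check_log(log_text):
    """Return (terms, entries) reported in the log; None where a message is missing."""
    terms = re.search(r"Indexing (\d+) terms", log_text)
    entries = re.search(r"(\d+) Index entries were created", log_text)
    return (int(terms.group(1)) if terms else None,
            int(entries.group(1)) if entries else None)


def looks_healthy(log_text):
    """True only if both messages are present and both counts are non-zero."""
    terms, entries = check_log(log_text)
    return bool(terms) and bool(entries)


good_log = "...\nIndexing 990 terms...\n...\n38 Index entries were created.\n"
bad_log = "...\nIndexing 0 terms...\n0 Index entries were created.\n"
```

In practice you would read the text from the log file written by bjam, for
example mylibrary_html.log, and treat a missing message or a zero count as a
reason to look at the log more closely.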
681
682[endsect] [/section:build_docs Step 7: Build the Docs]
683
684[section:refine Step 8: Iterate - to refine your index]
685
Creating a good index is an iterative process.  Often the first step is
just to add a header scanning rule to the script file, then generate
the documentation and see:
689
690* What's missing.
691* What's been included that shouldn't be.
692* What's been included under a poor name.
693
694Further rules can then be added to the script to handle these cases
695and the next iteration examined, and so on.
696
697[tip If you don't understand why a particular term is (or is not) present in the index,
698try adding a ['!debug regular-expression]
699directive to the [link boost_autoindex.script_ref script file].
700] [/tip]
701
702[heading Restricting which Sections are indexed for a particular term]
703
Assuming that the Docbook document has the usual hierarchical names for section IDs
(as Quickbook generates, for example),
you can easily place a constraint on which sections are examined for a particular term.
708
709For example, if you want to index occurrences of Lord Kelvin's name,
710but only in the introduction section, you might then add:
711
712  Kelvin "" ".*introduction.*"
713
714to the script file,
715assuming that the section ID of the intro is "some_library_or_chapter_name.introduction".
716
717This would avoid an index entry every time 'Kelvin' is found,
718something the user is unlikely to find helpful.
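
The way such a constraint selects sections can be sketched in Python.  As the
script reference states, the expression has to match the section ID exactly, so
the sketch uses a whole-string match; the helper `section_is_searched` and the
second and third section IDs are hypothetical.

```python
import re


def section_is_searched(section_id, section_regex=""):
    """An empty constraint selects every section; otherwise match the whole ID."""
    return not section_regex or re.fullmatch(section_regex, section_id) is not None


ids = [
    "some_library_or_chapter_name.introduction",
    "some_library_or_chapter_name.introduction.history",
    "some_library_or_chapter_name.reference",
]

# The constraint used for "Kelvin" above.
searched = [i for i in ids if section_is_searched(i, r".*introduction.*")]

# Without the trailing ".*" only IDs that end in "introduction" are selected.
intro_only = [i for i in ids if section_is_searched(i, r".*introduction")]
```

The trailing `.*` is what lets the constraint reach sub-sections of the
introduction as well as the introduction itself.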
719
720[endsect] [/section:refine Step 8: Iterate - to refine your index]
721
722[endsect] [/section:tut Getting Started and Tutorial]
723
724
725[section:script_ref Script File (.idx) Reference]
726
727The following elements can occur in a script:
728
729[h4 Comments and blank lines]
730
Lines consisting only of whitespace are ignored, as are lines that [*start with a #].
732
733[note You can't append \# comments onto the end of a line\!]
734
735[h4 Inclusion of Index terms]
736
737   term [regular-expression1 [regular-expression2 [category]]]
738
739[variablelist
740[[term][
741['Term to index.]
742
743The index term will form a primary entry in the Index
744with the section title(s) containing the term as secondary entries, and
745also will be used as a secondary entry beneath each of the section
746titles that the index term occurs in.]
747] [/term]
748
749[[regular-expression1][
750['Index term Searcher.]
751
752An optional regular expression: each occurrence
753of the regular expression in the text of the document will result
754in one index term being emitted.
755
If the regular expression is omitted (default) or is "", then the ['index term] itself
will be used as the search text - and only occurrences of whole words matching
['index term] will be indexed.
759
760For example:
761
762``foobar``
763
764will index occurrences of "foobar" in any section, but
765
766``foobar \<\w*(foo|bar)\w*\>``
767
768will index any whole word containing either "foo" or "bar" within it.
769This is useful when you want to index a lot of similar or related words under one entry.
770
771``reflex``
772
773will only index occurrences of "reflex" as a whole word, but:
774
775``reflex \<reflex\w*\>``
776
777will index occurrences of "reflex", "reflexes", "reflexing" and "reflexed" ...
778all under the same entry reflex.
779
780You will very often need to use this to deal with plurals and other variants.]
781] [/regular-expression1]
782
783[[regular-expression2]
[['Section(s) Selector.]

A constraint that specifies which sections are
indexed for ['term]: only if the ID of the section matches
['regular-expression2] exactly will that section be indexed
for occurrences of ['term].

For example, to limit indexing to just [*one specific section] (but not sub-sections below):

``myclass "" "mylib\.examples"``

Or, to limit indexing to specific sections, [*and sub-sections below]:

``myclass "" "mylib\.examples.*"``

will index occurrences of "myclass" as a whole word,
but only in sections whose section ID [*begins] "mylib.examples", while

``myclass "\<myclass\w*\>" "mylib\.examples.*"``

will also index plurals and other variants, such as myclasses ...

and:

``myclass "" "(?!mylib\.introduction).*"``

will index occurrences of "myclass" in any section,
except those whose section IDs begin "mylib.introduction".

Finally, two (or more) sections can be excluded by OR'ing them together:

``myclass "" "(?!mylib\.introduction|mylib\.reference).*"``

which excludes searching for this term in sections whose IDs start with either "mylib.introduction" or "mylib.reference".

If this third section selection field is omitted (the default)
or is "", then [*all sections] are indexed for this term.
]
] [/regular-expression2]

[[category][
['Index Category Constraint.]

Optionally a category to place occurrences of ['index term] in.
If you have multiple indexes then this is the name
assigned to the index's "type" attribute.

For example:

  myclass "" "" class_name

will index occurrences of ['myclass] and place them in the class-index if there is one.

]] [/category]

]  [/variablelist]
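
The search and section-selector fields described above can be explored with a small Python sketch. It is a model under stated assumptions rather than AutoIndex's implementation: the function names are hypothetical, Python's `re` module stands in for the regex engine, and because `re` has no `\<` and `\>` anchors they are translated to `\b` by a plain text substitution.

```python
import re

def to_python_regex(expr: str) -> str:
    r"""Translate the \< and \> word-boundary anchors to Python's \b.

    Simplification: a plain text substitution, adequate for the examples
    here but not a general regex translator.
    """
    return expr.replace(r"\<", r"\b").replace(r"\>", r"\b")

def find_occurrences(term: str, search: str, text: str) -> list:
    """Return every piece of text that would be indexed under term."""
    if search == "":
        # Default: the term itself, matched as a whole word.
        pattern = r"\b" + re.escape(term) + r"\b"
    else:
        pattern = to_python_regex(search)
    return [m.group(0) for m in re.finditer(pattern, text)]

def section_selected(constraint: str, section_id: str) -> bool:
    """An empty constraint selects all sections; otherwise match the whole ID."""
    return constraint == "" or re.fullmatch(constraint, section_id) is not None

text = "A reflex is simple; reflexes and reflexed limbs are not."
print(find_occurrences("reflex", "", text))                # ['reflex']
print(find_occurrences("reflex", r"\<reflex\w*\>", text))  # ['reflex', 'reflexes', 'reflexed']

exclude = r"(?!mylib\.introduction|mylib\.reference).*"
print(section_selected(exclude, "mylib.examples.first"))         # True
print(section_selected(exclude, "mylib.introduction.overview"))  # False
```

Trying a candidate expression against a few sample sentences and section IDs in this way can be quicker than rebuilding the documentation, although the final check should always be made with AutoIndex itself.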

You can have an index term appear more than once in the script file:

* If they have different /category/ names then they are treated quite separately.
* Otherwise they are combined, so that the logical OR of the regular expressions provided is taken.

Thus:

   myterm search_expression1 constraint_expression2 foo
   myterm search_expression1 constraint_expression2 bar

will be treated as different terms each with their own entries, while:

   myterm search_expression1 constraint_expression2 mycategory
   myterm search_expression1 constraint_expression2 mycategory

will be combined into a single term equivalent to:

   myterm (?:search_expression1|search_expression1) (?:constraint_expression2|constraint_expression2) mycategory
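
The combining rule can be modelled in a few lines of Python. This is a sketch of the behaviour described above, not the tool's code: `combine_terms` and `alternation` are hypothetical names, and the expression names in the example are placeholders chosen to differ from each other so that the merge is visible.

```python
def alternation(expressions):
    """OR a list of regular expressions using a non-capturing group."""
    if len(expressions) == 1:
        return expressions[0]
    return "(?:" + "|".join(expressions) + ")"

def combine_terms(entries):
    """Merge script entries that share both term and category.

    Each entry is a (term, search, constraint, category) tuple.
    """
    grouped = {}
    for term, search, constraint, category in entries:
        searches, constraints = grouped.setdefault((term, category), ([], []))
        searches.append(search)
        constraints.append(constraint)
    return [
        (term, alternation(searches), alternation(constraints), category)
        for (term, category), (searches, constraints) in grouped.items()
    ]

entries = [
    ("myterm", "search_a", "constraint_a", "mycategory"),
    ("myterm", "search_b", "constraint_b", "mycategory"),
    ("myterm", "search_a", "constraint_a", "other"),
]
for combined in combine_terms(entries):
    print(combined)
```

The first two entries share a category and collapse into one term with alternated expressions, while the third keeps its own entry. How an empty expression combines with a non-empty one is not modelled here.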

[h4 Source File Scanning]

   !scan source-file-name

Scans the C\/C++ source file ['source-file-name] for definitions of
['function]s, ['class]es, ['macro]s or ['typedef]s and makes each of
these a term to be indexed.  Terms found are assigned to the index category
"function_name", "class_name", "macro_name" or "typedef_name" depending
on how they were seen in the source file.  These may then be included
in a specialised index whose "type" attribute has the same category name.

[important
When actually indexing a document, the scanner will not index just any old occurrence of the
terms found in the source files.  Instead it searches for class definitions or function or
typedef declarations.  This reduces the number of spurious matches placed in the index, but
may also miss some legitimate terms:
refer to the /define-scanner/ command for information on how to change this.
]

[h4 Directory and Source File Scanning]

   !scan-path directory-name file-name-regex [recurse]

[variablelist
[[directory-name][The directory to scan: this should be a path relative
to the script file (or to the path specified with the --prefix=pathname option on the command line)
and should use all forward slashes in its file name.]]

[[file-name-regex][A regular expression: any file in the directory whose name
matches the regular expression will be scanned for terms to index.]]

[[recurse][An optional boolean value - either "true" or "false" - that
indicates whether to recurse into subdirectories.  This defaults to "false".]]
]
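
To preview which files a given `!scan-path` line might pick up, the selection can be approximated in Python. This is a rough sketch only: `select_files` is a hypothetical helper, and it assumes the regular expression is matched against the whole file name rather than the full path, which may differ from what AutoIndex actually does.

```python
import re
import tempfile
from pathlib import Path

def select_files(directory, file_name_regex, recurse=False):
    """Return the files under directory whose names match the regex.

    Assumptions: the regex is matched against the whole file name (not
    the full path), and results are reported relative to directory
    using forward slashes.
    """
    root = Path(directory)
    candidates = root.rglob("*") if recurse else root.glob("*")
    return sorted(
        p.relative_to(root).as_posix()
        for p in candidates
        if p.is_file() and re.fullmatch(file_name_regex, p.name)
    )

# Build a small throw-away tree to try it on.
with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "detail").mkdir()
    for name in ("a.hpp", "b.cpp", "detail/c.hpp"):
        (Path(tmp) / name).write_text("// empty\n")
    print(select_files(tmp, r".*\.hpp"))                # ['a.hpp']
    print(select_files(tmp, r".*\.hpp", recurse=True))  # ['a.hpp', 'detail/c.hpp']
```

With the recurse flag left at its default, the header in the `detail` subdirectory is not selected.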

[h4 Excluding Terms]

   !exclude term-list

Excludes all the terms in the whitespace-separated ['term-list] from being indexed.
This should be placed /after/ any ['!scan] or ['!scan-path] rules which may
result in the terms becoming included.  In other words this removes terms from
the scanner's internal list of things to index.
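
Why the ordering matters can be seen in a toy Python model of the term list. Everything here is an assumption made for illustration: the `run_directives` helper, the tuple encoding of directives, and the `found` mapping that stands in for real source scanning. The behaviour of an early `!exclude` is inferred from the description above, not taken from the tool.

```python
def run_directives(directives, scan_results):
    """Apply scan and exclude directives in order.

    directives: list of ("scan", file_name) or ("exclude", [terms]).
    scan_results: hypothetical mapping of file name to terms found in it.
    """
    terms = []
    for kind, argument in directives:
        if kind == "scan":
            terms.extend(t for t in scan_results[argument] if t not in terms)
        elif kind == "exclude":
            # Removes only terms that are already in the list.
            terms = [t for t in terms if t not in argument]
    return terms

found = {"mylib.hpp": ["my_class", "detail_helper", "MYLIB_VERSION"]}

after = run_directives([("scan", "mylib.hpp"), ("exclude", ["detail_helper"])], found)
before = run_directives([("exclude", ["detail_helper"]), ("scan", "mylib.hpp")], found)
print(after)   # ['my_class', 'MYLIB_VERSION']
print(before)  # ['my_class', 'detail_helper', 'MYLIB_VERSION']
```

In this model an exclusion that comes first finds nothing to remove, so the unwanted term is added by the later scan.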

[h4 Rewriting Section Names]

[pre !rewrite-id regular-expression new-name]

[variablelist
[[regular-expression][A regular expression: all section IDs that match
the expression exactly will have index entries ['new-name] instead of
their title(s).]]

[[new-name][The name that the section will appear under in the index.]]
]

   !rewrite-name regular-expression format-text

[variablelist
[[regular-expression][A regular expression: all sections whose titles
match the regular expression exactly will have index entries composed
of the regular expression match combined with the regex format string
['format-text].]]
[[format-text][The Perl-style format string used to reformat the title.]]
]

For example:

[pre
!rewrite-name "(?:A|An|The)\s+(.*)" "\1"
]

will remove any leading "A", "An" or "The" from all index entries - thus preventing lots of
entries under "The" etc!
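
The article-stripping rule above can be tried out in Python, whose `re` module accepts the same expression and the same `\1` group reference. The helper name `rewrite_name` is hypothetical, and the sketch assumes that a title which does not match in full is left unchanged.

```python
import re

def rewrite_name(title, expression, format_text):
    r"""Return the rewritten title, or the title unchanged if there is no match.

    Assumption: the expression must match the whole title, and
    format_text uses backslash-digit group references such as "\1".
    """
    match = re.fullmatch(expression, title)
    return match.expand(format_text) if match else title

rule = (r"(?:A|An|The)\s+(.*)", r"\1")
for title in ("The Overview", "An Example", "Theory of Operation"):
    print(rewrite_name(title, *rule))
```

Because the expression requires whitespace after the article, a title such as "Theory of Operation" is not affected.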

[h4 Defining or Changing the File Scanners]

   !define-scanner type file-search-expression xml-regex-formatter term-formatter id-filter filename-filter

When a source file is scanned using the =!scan= or =!scan-path= rules, the file is searched using
a series of regular expressions to look for classes, functions, macros or typedefs that should be indexed.
A set of default regular expressions is provided for this (see below), but sometimes you may want to replace
the defaults, or add new scanners.  The arguments to this rule are:

[variablelist
[[type][The ['type] to which items found using this rule will be assigned. Index terms created from the
source file and then found in the XML will have the type attribute set to this value, and may then appear in a
specialized index with the same type attribute.]]
[[file-search-expression][A regular expression that is used to scan the source file for index terms. The result of
a match against this expression will be transformed by the next two arguments.]]
[[xml-regex-formatter][A regular expression format string that extracts the salient information from whatever
matched the ['file-search-expression] in the source file, and creates ['a new regular expression] that will
be used to search the document being indexed for occurrences of this index term.]]
[[term-formatter][A regular expression format string that extracts the salient information from whatever
matched the ['file-search-expression] in the source file, and creates the index term that will appear in
the index.]]
[[id-filter][Optional.  A regular expression that restricts the section IDs that are searched in the document being indexed:
only sections whose ID attribute matches this expression exactly will be considered for indexing terms found by this scanner.]]
[[filename-filter][Optional.  A regular expression that restricts which files are scanned by this scanner: only files whose file name
matches this expression exactly will be scanned for index terms to use.  Note that the filename matched against this may
well be an absolute path, and contain either forward or backward slash path separators.]]
]

If, when the first file is scanned, there are no scanners whose ['type] is "class_name", "typedef_name", "macro_name" or
"function_name", then the defaults are installed.  These are equivalent to:

   !define-scanner class_name "^[[:space:]]*(template[[:space:]]*<[^;:{]+>[[:space:]]*)?(class|struct)[[:space:]]*(\<\w+\>([[:blank:]]*\([^)]*\))?[[:space:]]*)*(\<\w*\>)[[:space:]]*(<[^;:{]+>)?[[:space:]]*(\{|:[^;\{()]*\{)" "(?:class|struct)[^;{]+\<\5\>[^;{]+\{" \5
   !define-scanner typedef_name "typedef[^;{}#]+?(\w+)\s*;"  "typedef[^;]+\<\1\>\s*;" "\1"
   !define-scanner "macro_name" "^\s*#\s*define\s+(\w+)" "\<\1\>" "\1"
   !define-scanner "function_name" "\w++(?:\s*+<[^>]++>)?[\s&*]+?(\w+)\s*(?:BOOST_[[:upper:]_]+\s*)?\([^;{}]*\)\s*[;{]" "\\<\\w+\\>(?:\\s+<[^>]*>)*[\\s&*]+\\<\1\\>\\s*\\([^;{]*\\)" "\1"

Note that these defaults are not installed if you have provided your own versions with these ['type] names. In this case, if
you want the default scanners to be in effect as well as your own, you should include the above in your script file.
It is also perfectly allowable to have multiple scanners with the same ['type], but with the other fields differing.

Finally, you should note that the default scanners are quite strict
in what they will find. For example, the class
scanner will only create index entries for classes that have class definitions of the form:

   class my_class : public base_classes
   {
      // etc

in the documentation, so simple mentions of the class name will ['not] get indexed,
only the class synopsis if there is one.
If this isn't how you want things, then include the ['class_name] scanner definition
above in your script file, and change
the ['xml-regex-formatter] field to something more permissive, for example:

   !define-scanner class_name "^[[:space:]]*(template[[:space:]]*<[^;:{]+>[[:space:]]*)?(class|struct)[[:space:]]*(\<\w+\>([[:blank:]]*\([^)]*\))?[[:space:]]*)*(\<\w*\>)[[:space:]]*(<[^;:{]+>)?[[:space:]]*(\{|:[^;\{()]*\{)" "\<\5\>" \5

This will look for ['any] occurrence of whatever class names the scanner may find in the documentation.
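
How the three main scanner fields cooperate can be illustrated with the default "macro_name" scanner, whose expressions are simple enough to run under Python's `re` module. This is a sketch under assumptions, not the tool's implementation: the function names are hypothetical, `re.MULTILINE` is used so that `^` matches at the start of each line, and format strings are expanded by a simplified backslash-digit substitution.

```python
import re

# The three main fields of the default macro_name scanner shown above.
FILE_SEARCH = r"^\s*#\s*define\s+(\w+)"   # file-search-expression
XML_REGEX_FORMAT = r"\<\1\>"               # xml-regex-formatter
TERM_FORMAT = r"\1"                        # term-formatter

def apply_format(format_text, match):
    """Replace each backslash-digit reference with that capture group."""
    return re.sub(r"\\(\d)", lambda ref: match.group(int(ref.group(1))), format_text)

def scan_source(source_text):
    """Return (index term, document search regex) pairs for each macro."""
    return [
        (apply_format(TERM_FORMAT, m), apply_format(XML_REGEX_FORMAT, m))
        for m in re.finditer(FILE_SEARCH, source_text, flags=re.MULTILINE)
    ]

header = "#define MYLIB_VERSION 3\n  # define MYLIB_DEBUG\nint x;\n"
for term, doc_regex in scan_source(header):
    print(term, doc_regex)
```

Each macro found in the header yields both the text that appears in the index and a new whole-word expression that is later used to search the document. Making the second of these more or less permissive is what changes how often a term is indexed.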

[h4 Debugging scanning]

If you see a term in the index, and you don't understand why it's there, add a ['debug] directive:

[pre
!debug regular-expression
]

Now, whenever ['regular-expression] matches either the found index term,
or the section title it appears in, or the ['type] field of a scanner,
some diagnostic information will be printed that will look something like:

[pre
Debug term found, in block with ID: spirit.qi.reference.parser_concepts.parser
Current section title is: Notation
The main index entry will be : Notation
The indexed term is: parser
The search regex is: \[P\|p\]arser
The section constraint is: .*qi.reference.parser_concepts.*
The index type for this entry is: qi_index
]

This can produce a lot of output in your log file,
but until you are satisfied with your file selection and scanning process,
it is worth switching it on.

[endsect] [/section:script_ref Script File Reference]

[section:workflow  Understanding The AutoIndex Workflow]

# Load the script file (usually index.idx)
  and process it one line at a time,
  producing one or more index terms per (non-comment) line.

# Reading all lines builds a list of ['terms to index].
  Some of those may be terms defined (by you) directly in the script file,
  others may be terms found by scanning C++ header and source files
  that were specified by the ['!scan] or ['!scan-path] directives.

# Once the list of ['terms to index] is complete,
  AutoIndex loads the Docbook XML file.
  (If this comes from Quickbook\/Doxygen\/Boostbook\/Docbook then this is
  the complete documentation after conversion to Docbook format).

# AutoIndex builds an internal __DocObjMod of the Docbook XML.
  This internal representation then gets scanned for occurrences of the ['terms to index].
  This scanning works at the XML paragraph level
  (or equivalent sibling such as a table or code block)
  - so all the XML encoding within a paragraph gets flattened to plain text.[br]
  This flattening means the regular expressions used to search for ['terms to index]
  can find anything that is completely contained within a paragraph
  (or code block etc).

# For each term found, an ['indexterm] Docbook element is inserted
  into the __DocObjMod (provided internal index generation is off).

# AutoIndex's internal index representation is also updated.

# Once the whole XML document has been indexed,
  then, if AutoIndex has been instructed to generate the index itself,
  it creates the necessary XML and inserts this into the __DocObjMod.

# Finally the whole __DocObjMod is written out as a new Docbook XML file,
  and normal processing of this continues via the XSL stylesheets (with xsltproc)
  to actually build the final human-readable docs.
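
The paragraph-level flattening in the scanning step can be illustrated with Python's standard `xml.etree.ElementTree` module. This is a deliberately small sketch, not AutoIndex's code: `find_terms` is a hypothetical function, only `para` elements directly inside a `section` are examined, and no `indexterm` elements are inserted.

```python
import re
import xml.etree.ElementTree as ET

def find_terms(docbook_xml, terms):
    """Return (term, section id) pairs for every paragraph-level hit.

    terms maps each index term to the regular expression used to find it.
    """
    root = ET.fromstring(docbook_xml)
    hits = []
    for section in root.iter("section"):
        for para in section.findall("para"):
            # Flatten nested markup (<code>, <emphasis>, ...) to plain text.
            text = "".join(para.itertext())
            for term, expression in terms.items():
                if re.search(expression, text):
                    hits.append((term, section.get("id")))
    return hits

doc = """<section id="mylib.intro">
  <title>Introduction</title>
  <para>Pass a <code>my_class</code> <emphasis>object</emphasis> here.</para>
  <para>Nothing of interest.</para>
</section>"""

print(find_terms(doc, {"my_class object": r"\bmy_class object\b"}))
```

The expression matches even though "my_class" and "object" sit in different child elements, because the match is made against the flattened paragraph text. An expression that needs text from two separate paragraphs would find nothing.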

[endsect] [/section:workflow  AutoIndex Workflow]


[section:xml XML Handling]

AutoIndex is rather simplistic in its handling of XML:

* When indexing a document, all block content at the paragraph level gets collapsed into a single
string for matching against the regular expressions representing each index term.  In other words,
for the most part, you can assume that you're indexing plain text when writing regular expressions.
* Named XML entities for &, ", ', < or > are converted to their corresponding characters before indexing
a section of text.  However, decimal or hex escape sequences are not currently converted.
* Index terms are assumed to be plain text (whether they originate from the script file
or from scanning source files) and the characters &, ", < and > will be escaped to
&amp; &quot; &lt; and &gt; respectively.
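
The two entity conversions listed above can be written out as a short Python sketch. The function names are hypothetical and the code only mirrors the behaviour described here: named entities are converted before matching, numeric escapes are left untouched, and index terms are escaped on output.

```python
import re

NAMED_ENTITIES = {"amp": "&", "quot": '"', "apos": "'", "lt": "<", "gt": ">"}

def unescape_named_entities(text):
    """Convert the five named XML entities; leave numeric escapes alone."""
    return re.sub(
        r"&(amp|quot|apos|lt|gt);",
        lambda m: NAMED_ENTITIES[m.group(1)],
        text,
    )

def escape_index_term(term):
    """Escape &, ", < and > so a plain-text term is safe to emit as XML."""
    return (
        term.replace("&", "&amp;")
        .replace('"', "&quot;")
        .replace("<", "&lt;")
        .replace(">", "&gt;")
    )

print(unescape_named_entities("a &lt; b &amp;&amp; c &#60; d"))  # a < b && c &#60; d
print(escape_index_term("operator<<"))                            # operator&lt;&lt;
```

One practical consequence is that a search expression should be written against the literal characters, for example `<`, rather than against `&lt;`.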

[endsect] [/section:xml XML Handling]

[section:qbk Quickbook Support]

The file auto_index_helpers.qbk in ['boost-path]/tools/auto_index/include contains various Quickbook
templates to assist with AutoIndex support.  You would normally add the above path to your include
search path via an `<include>path` statement in your Jamfile, and then make the templates available
to your Quickbook source via a:

[pre \[include auto_index_helpers.qbk\]]

statement at the start of your Quickbook file.

The available templates are then:

[table
[[Template][Description]]
[[`[index]`][Creates a main index, with no "type" category set, which will be titled simply "Index".]]
[[`[named_index type title]`][Creates an index with the type attribute set to "type" and the title set to "title".[br]
         For example, to create an index containing only class names you would typically add `[named_index class_name Class Index]`
         to your Quickbook source.]]
[[`[AutoIndex Arg]`][Creates a Docbook processing instruction that will be handled by AutoIndex. Valid values for "Arg"
                     are either "IgnoreSection" or "IgnoreBlock".]]
[[`[indexterm1 primary-key]`][Creates a manual index entry that will link to the current section, and have a single primary key "primary-key".
         Note that this index key will not have a "type" attribute set, and so will only appear in the main index.]]
[[`[indexterm2 primary-key secondary-key]`][Creates a manual index entry that will link to the current section, and have
         "primary-key" and "secondary-key" as the primary and secondary keys respectively.
         Note that this index key will not have a "type" attribute set, and so will only appear in the main index.]]
[[`[indexterm3 primary-key secondary-key tertiary-key]`][Creates a manual index entry that will link to the current section,
         and have primary, secondary and tertiary keys: "primary-key", "secondary-key" and "tertiary-key".
         Note that this index key will not have a "type" attribute set, and so will only appear in the main index.]]

[[`[typed_indexterm1 type primary-key]`][Creates a manual index entry that will link to the current section, and have a single primary key "primary-key".
         Note that this index key will have the "type" attribute set to the "type" argument, and so may appear in named sub-indexes
         that also have their type attribute set.]]
[[`[typed_indexterm2 type primary-key secondary-key]`][Creates a manual index entry that will link to the current section, and have
         "primary-key" and "secondary-key" as the primary and secondary keys respectively.
         Note that this index key will have the "type" attribute set to the "type" argument, and so may appear in named sub-indexes
         that also have their type attribute set.]]
[[`[typed_indexterm3 type primary-key secondary-key tertiary-key]`][Creates a manual index entry that will link to the current section,
         and have primary, secondary and tertiary keys: "primary-key", "secondary-key" and "tertiary-key".
         Note that this index key will have the "type" attribute set to the "type" argument, and so may appear in named sub-indexes
         that also have their type attribute set.]]
]

[endsect]

[section:comm_ref Command Line Reference]

The following command line options are supported by AutoIndex:

[variablelist
[[--in=infilename][Specifies the name of the XML input file to be indexed.]]
[[--out=outfilename][Specifies the name of the new XML file to create.]]
[[--scan=source-filename][Specifies that ['source-filename] should be scanned
for terms to index.]]
[[--script=script-filename][Specifies the name of the script file to process.]]
[[--no-duplicates][If a term occurs more than once in the same section, then
include only one index entry.]]
[[--internal-index][Specifies that AutoIndex should generate the actual
indexes rather than inserting `<indexterm>`s and leaving index generation
to the XSL stylesheets.]]
[[--no-section-names][Prevents AutoIndex from using section names as index entries.]]
[[--prefix=pathname][Specifies a directory to apply as a prefix to all relative file paths in the script file.]]
[[--index-type=element-name][Specifies the name of the XML element to enclose internally generated indexes in:
  defaults to ['section], but could equally be ['appendix] or ['chapter]
  or some other block level element that has a formal title.]]
]

[endsect]  [/section:comm_ref Command Line Reference]

[include ../include/auto_index_helpers.qbk]

[index]
