[article Boost.AutoIndex
    [quickbook 1.5]
    [copyright 2008, 2011 John Maddock]
    [license
        Distributed under the Boost Software License, Version 1.0.
        (See accompanying file LICENSE_1_0.txt or copy at
        [@http://www.boost.org/LICENSE_1_0.txt])
    ]
    [authors [Maddock, John]]
    [/last-revision $Date: 2008-11-04 17:11:53 +0000 (Tue, 04 Nov 2008) $]
]

[def __quickbook [@http://www.boost.org/doc/tools/quickbook/index.html Quickbook]]
[def __boostbook [@http://www.boost.org/doc/html/boostbook.html BoostBook]]
[def __boostbook_docs [@http://www.boost.org/doc/libs/1_41_0/doc/html/boostbook.html BoostBook documentation]]
[def __quickbook_syntax [@http://www.boost.org/doc/libs/1_41_0/doc/html/quickbook/ref.html Quickbook Syntax Compendium]]
[def __docbook [@http://www.docbook.org/ DocBook]]
[def __docbook_params [@http://docbook.sourceforge.net/release/xsl/current/doc/ Docbook xsl:param format options]]
[def __DocObjMod [@http://en.wikipedia.org/wiki/Document_Object_Model Document Object Model (DOM)]]

[def __doxygen [@http://www.doxygen.org/ Doxygen]]
[def __pdf [@http://www.adobe.com/products/acrobat/adobepdf.html PDF]]

[template deg[]'''°'''] [/ degree sign ]


[section:overview Overview]

AutoIndex is a tool for taking the grunt work out of indexing a
Boostbook\/Docbook document
(perhaps generated by your Quickbook file mylibrary.qbk,
and perhaps also using Doxygen autodoc)
that describes C\/C++ code.

Traditionally, in order to index a Docbook document you would
have to manually add a large amount of `<indexterm>` markup:
in fact one `<indexterm>` for each occurrence of each term to be
indexed.

Instead AutoIndex will automatically scan one or more C\/C++ header files
and extract all the ['function], ['class], ['macro] and ['typedef]
names that are defined by those headers, and then insert the
`<indexterm>`s into the Docbook XML document for you.
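
As a rough illustration of the four kinds of name mentioned above, consider the following header.
It is entirely hypothetical (`mylibrary.hpp`, `gadget`, `default_gadget`, `measure` and
`MYLIBRARY_VERSION` are invented for this sketch), and exactly which names are picked up
depends on AutoIndex's scanner; the comments only mark the kind of name each declaration
is meant to represent.

```cpp
// mylibrary.hpp: a hypothetical header, invented purely for illustration.
#ifndef MYLIBRARY_HPP
#define MYLIBRARY_HPP

#define MYLIBRARY_VERSION 100            // macro name:    MYLIBRARY_VERSION

namespace mylibrary {

class gadget                             // class name:    gadget
{
public:
    explicit gadget(int size) : size_(size) {}
    int size() const { return size_; }
private:
    int size_;
};

typedef gadget default_gadget;           // typedef name:  default_gadget

inline int measure(const gadget& g)      // function name: measure
{
    return g.size();
}

} // namespace mylibrary

#endif // MYLIBRARY_HPP
```

Each marked name would then be a candidate search term, looked for in the text of the document
in place of hand-written `<indexterm>` markup.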

AutoIndex can also scan using a list of index terms
specified in a script file, for example index.idx.
These manually provided terms can optionally be regular expressions,
and may allow the user to find references to terms
that may not occur in the C++ header files. Of course providing a manual
list of search terms to index is a tedious task
(especially handling plurals and variants),
and requires enough knowledge of the library
to guess what users may be seeking to know,
but at least the real 'grunt work' of
finding the term and listing the page number is automated.

For each occurrence of each search term, AutoIndex creates two index entries:

# The search term as the ['primary index key] and
  the ['title of the section it appears in] as a subterm.

# The section title as the main index entry and the search term as the subentry.

Thus the user has two chances to find what they're
looking for, based upon either the section name
or the ['function], ['class], ['macro] or ['typedef] name.

[note This behaviour can be changed so that only one index entry is created
 (using the search term as the key and
 not using the section name except as a sub-entry of the search term).]

So for example in Boost.Math the class name `students_t_distribution` has a primary
entry that lists all sections the class name appears in:

[$../students_t_eg_1.png]

Then those sections also have primary entries, which list all the search terms those
sections contain:

[$../students_t_eg_2.png]

Of course these automated index entries may not be quite
what you're looking for: often you'll get a few spurious entries, a few missing entries,
and a few entries where the section name used as an index entry is less than ideal.
So AutoIndex provides some powerful regular expression based rules that allow you
to add, remove, constrain, or rewrite entries.
Normally just a few lines in
AutoIndex's script file are enough to tailor the output to match the author's
expectations (and thus hopefully the index user's expectations too!).

AutoIndex also supports multiple indexes (as does Docbook), and since it knows
which search terms are ['function], ['class], ['macro] or ['typedef] names, it
can add the necessary attributes to the XML so that you can have separate
indexes for each of these different types. These specialised indexes only contain
entries for the ['function], ['class], ['macro] or ['typedef] names; ['section
names] are never used as primary index terms here, unlike the main "include everything"
index.

Finally, while the Docbook XSL stylesheets create nice indexes complete with page
numbers for PDF output, the HTML indexes look poorer by comparison, as these use
section titles in place of page numbers. Because AutoIndex also uses section titles
as index entries, this leads to a lot of repetition, so as an alternative AutoIndex
can be instructed to construct the index itself. This is faster than using
the XSL stylesheets, and each index entry is then a hyperlink to the
appropriate section:

[$../students_t_eg_3.png]

With internal index generation there is also a helpful navigation bar
at the start of each Index:

[$../students_t_eg_4.png]

You can also choose what kind of XML container wraps an internally generated index:
this defaults to `<section>...</section>`, but you can use either command line options
or Boost.Build Jamfile features to select an alternative wrapper. For example ['appendix]
or ['chapter] would be good choices, whatever fits best into the flow of the
document.
You can even set the container wrapper to type ['index] provided you turn
off index generation by the XSL stylesheets, for example by setting the following
build requirements in the Jamfile:

[pre
<format>html:<auto-index-internal>on      # Use internally generated indexes.
<auto-index-type>index                    # Use <index>...</index> as the XML wrapper.
<format>html:<xsl:param>generate.index=0  # Don't let the XSL stylesheets generate indexes.
]

[endsect] [/section:overview Overview]

[section:tut Getting Started and Tutorial]

[section:build Step 1: Build the AutoIndex tool]

[note This step is strictly optional, but very desirable to speed up build times.]

cd into `tools/auto_index/build` and invoke bjam as:

   bjam release

Optionally pass the name of the compiler toolset you want to use to bjam as well:

   bjam release gcc

This will build the tool and place a copy in the current directory (which is to say `tools/auto_index/build`).

Now open up your `user-config.jam` file and at the end of the file add the line:

[pre
using auto-index : ['full-path-to-boost-tree]/tools/auto_index/build/auto-index.exe ;
]

[note
This declaration must go towards the end of `user-config.jam`, or in any case after the Boostbook initialisation.

Also note that Windows users must use forward slashes in the paths in `user-config.jam`.]

[endsect] [/section:build Step 1: Build the AutoIndex tool]

[section:configure Step 2: Configure Boost.Build jamfile to use AutoIndex]

Assuming you have a Jamfile for building your documentation that looks
something like:

[pre
boostbook standalone
   :
      mylibrary
   :
      # build requirements go here:
   ;
]

Then add the line:

[pre using auto-index ; ]

to the start of the Jamfile, and then add whatever auto-index options
you want to the ['build requirements section], for example:

[pre
  boostbook standalone
     :
        mylibrary
     :
        # Build requirements go here:

        # <auto-index>on (or off) turns indexing on (or off):
        <auto-index>on

        # Turns on (or off) auto-index-verbose for diagnostic info.
        # This is highly recommended until you have got all the many details correct!
        <auto-index-verbose>on

        # Choose the indexing method (separately for html and PDF) - see manual.
        # Choose indexing method for PDFs:
        <format>pdf:<auto-index-internal>off

        # Choose indexing method for html:
        <format>html:<auto-index-internal>on

        # Set the name of the script file to use (index.idx is popular):
        <auto-index-script>index.idx
        # Commands in the script file should all use RELATIVE PATHS
        # otherwise the script will not be portable to other machines.
        # Relative paths are normally taken as relative to the location
        # of the script file, but we can add a prefix to all
        # those relative paths using the <auto-index-prefix> feature.
        # The path specified by <auto-index-prefix> may be either relative or
        # absolute, for example the following will get us up to the boost root
        # directory for most Boost libraries:
        <auto-index-prefix>..\/..\/..

        # Tell Quickbook that it should enable indexing.
        <quickbook-define>enable_index

     ;
]  [/pre]

[section:options Available Indexing Options]

The available options are:

[variablelist
[[<auto-index>off/on][Turns indexing of the document on or off; defaults to
"off", so be sure to set this if you want AutoIndex invoked!]]
[[<auto-index-internal>off/on][Chooses whether AutoIndex creates the index
itself (feature on), or whether it simply inserts the necessary DocBook
markup so that the DocBook XSL stylesheets can create the index. Defaults to "off".]]
[[<auto-index-script>filename][Specifies the name of the script to load.]]
[[<auto-index-no-duplicates>off/on][When ['on] AutoIndex will only index a term
once in any given section, otherwise (the default) multiple index entries per
term may be created if the term occurs more than once in the section.]]
[[<auto-index-section-names>off/on][When ['on] AutoIndex will create two
index entries for each term found - one uses the term itself as the primary
index key, the other uses the enclosing section name. When off the index
entry that uses the section title is not created. Defaults to "on".]]
[[<auto-index-verbose>off/on][Defaults to "off". When turned on AutoIndex
prints progress information - useful for debugging purposes during setup.]]
[[<auto-index-prefix>filename][Optionally specifies a directory to apply
as a prefix to all relative file paths in the script file.

You may wish to do this to reduce typing of pathnames, and\/or where the
paths can't be located relative to the script file location,
typically if the headers are in the Boost trunk,
but the script file is in Boost sandbox.

For Boost standard library layout,
[^<auto-index-prefix>..\/..\/..] will get you back up to the 'root' of the Boost tree,
so [^!scan-path boost\/mylibrary\/] is where your headers will be, and [^libs\/mylibrary] for other files.
Without a prefix all relative paths are relative to the location of the script file.
]]

[[<auto-index-type>element-name][Specifies the name of the XML element in which to enclose an internally generated index:
   defaults to ['section], but could equally be ['appendix] or ['chapter] or some other block level element that has a formal title.
   The actual list of available options depends upon the Quickbook document type; the following table gives the available options,
   assuming that the index is placed at the top level, and not in some sub-section or other container:]]
]

[table
[[Document Type][Permitted Index Types]]
[[book][appendix index article chapter reference part]]
[[article][section appendix index sect1]]
[[chapter][section index sect1]]
[[library][The same as Chapter (section index sect1)]]
[[part][appendix index article chapter reference]]
[[appendix][section index sect1]]
[[preface][section index sect1]]
[[qandadiv][N/A: an index would have to be placed within a subsection of the document.]]
[[qandaset][N/A: an index would have to be placed within a subsection of the document.]]
[[reference][N/A: an index would have to be placed within a subsection of the document.]]
[[set][N/A: an index would have to be placed within a subsection of the document.]]
]

In large part then the choice of `<auto-index-type>element-name` depends on the
formatting you want to be applied to the index:

[table
[[XML Container Used for the Index][Formatting Applied by the XSL Stylesheets]]
[[appendix][Starts a new page.]]
[[article][Starts a new page.]]
[[chapter][Starts a new page.]]
[[index][Starts a new page only if it's contained within an article or book.]]
[[part][Starts a new page.]]
[[reference][Starts a new page.]]
[[sect1][Starts a new page as long as it's not the first section (but is controlled by the XSL parameters chunk.section.depth and/or
chunk.first.sections).]]
[[section][Starts a new page as long as it's not the first section or nested within another section (but is controlled by the XSL parameters chunk.section.depth and/or chunk.first.sections).]]
]

In almost all cases the default (section) is the correct choice - the exception is when the index is to be placed
directly inside a /book/ or /part/, in which case you should probably use the same XML container for the index as
you use for whatever subdivisions are in the /book/ or /part/. In any event placing a /section/ within a /book/ or
/part/ will result in invalid XML.

Finally, if you are using Quickbook to generate the documentation, then you may wish to add:

[pre <include>$boost-root/tools/auto_index/include]

to your project's requirements (replacing $boost-root with the path to the root of the Boost tree), so that
the file auto_index_helpers.qbk can be included in your quickbook source with simply a:

[pre \[include auto_index_helpers.qbk\]]

[endsect] [/section:options Available Indexing Options]

[section:optional Making AutoIndex optional]

It is considerate to make the [*use of auto-index optional] in Boost.Build,
to allow users who do not have AutoIndex installed to still be able to build your documentation.

This is also very convenient while you are refining your documentation,
to allow you to decide to build indexes, or not:
building indexes can take a long time, and if you are just correcting typos,
you won't want to wait while you keep rebuilding the index!

One method of setting up optional AutoIndex support is to place all
AutoIndex configuration in the body of a bjam if statement:

[pre
   if --enable-index in  \[ modules.peek : ARGV \]
   {
      ECHO "Building the docs with automatic index generation enabled." ;

      using auto-index ;
      project : requirements
           <auto-index>on
           <auto-index-script>index.idx

           ...
           other AutoIndex options here...

           # And tell Quickbook that it should enable indexing.
           <quickbook-define>enable_index
      ;
   }
   else
   {
      ECHO "Building the my_library docs with automatic index generation disabled. To get an Index, try building with --enable-index." ;
   }
] [/pre]

You will also need to add a conditional statement at the end of your Quickbook file,
so that the index(es) is/are only added after the last section if indexing is enabled.

[pre
\[\? '''enable_index'''
\'\'\'
  <index/>
\'\'\'
\]
] [/pre]


To use this jamfile, you need to cd to your docs folder, for example:

  cd \boost-sandbox\guild\mylibrary\libs\mylibrary\doc

and then run `bjam` to build the docs without an index, for example:

  bjam -a html > mylibrary_html.log

or with index(es):

  bjam -a html --enable-index > mylibrary_html_index.log

[endsect] [/section:optional Making AutoIndex optional]

[tip Always send the output to a log file.
It will contain a lot of stuff, but is invaluable to check if all has gone right,
or else diagnose what has gone wrong.
] [/tip]

[tip A return code of 0 is not a reliable indication
that you have got what you really want -
inspecting the log file is the only certain way.
] [/tip]

[tip If you upgrade compiler version, for example MSVC from 9 to 10,
then you may need to rebuild AutoIndex
to avoid what Microsoft call a 'side-by-side' error.
And make sure that the autoindex.exe version you are using is the new one.
] [/tip]

[endsect] [/section:configure Step 2: Configure Boost.Build to use AutoIndex]

[section:add_indexes Step 3: Add indexes to your documentation]

To add a single "include everything" index to a BoostBook\/Docbook document
(perhaps generated using Quickbook, and perhaps also using a Doxygen reference section),
add `<index/>` at the location where you want the index to appear.
The index will be rendered as a separate section called "Index"
when the documentation is built.

To add multiple indexes, give each one a title and set its
`type` attribute to specify which terms will be included. For example,
to place the ['function], ['class], ['macro] or ['typedef] names
indexed by ['AutoIndex] in separate indexes along with a main
"include everything" index as well, one could add:

[pre
<index type\="class_name">
<title>Class Index<\/title>
<\/index>

<index type\="typedef_name">
<title>Typedef Index<\/title>
<\/index>

<index type\="function_name">
<title>Function Index<\/title>
<\/index>

<index type\="macro_name">
<title>Macro Index<\/title>
<\/index>

<index\/>
]

[note Multiple indexes like this only work correctly if you tell the XSL stylesheets
to honor the "type" attribute on each index, as by default [*they do not do this].
You can turn the feature on by adding `<xsl:param>index.on.type=1` to your project's
requirements in the Jamfile.]

In Quickbook, you add the same markup but enclose it between two triple-tick \'\'\' escapes,
thus:

[pre \'\'\'<index\/>\'\'\' ]

Or more easily via the helper file auto_index_helpers.qbk, so that given:

[pre \[include auto_index_helpers.qbk\]]

one can simply write:

[pre
\[named_index class_name Class Index\]
\[named_index function_name Function Index\]
\[named_index typedef_name Typedef Index\]
\[named_index macro_name Macro Index\]
\[index\]
]

[note AutoIndex knows nothing of the XML `xinclude` element, so if
you're writing raw Docbook XML then you may want to run this through an
XSL processor to flatten everything to one XML file before passing to
AutoIndex. If you're using Boostbook or quickbook though, this all
happens for you anyway, and AutoIndex will index the whole document
including any sections included with `xinclude`.]

If you are using AutoIndex's internal index generation

[pre
<auto-index-internal>on
]
(usually recommended for HTML output, but ['not] the default)
then you can also decide what kind of XML wrapper the generated index is placed in.
By default this is a `<section>...</section>` XML block (this replaces the original
`<index>...</index>` block). However, depending upon the structure of the document
and whether or not you want the index on a separate page - or else on the front page after
the TOC - you may want to place the index inside a different type of XML block. For example
if your document uses `<chapter>` top level content rather than `<section>`s then
it may be preferable to place the index in a `<chapter>` or `<appendix>` block.
You can also place the index inside an `<index>` block if you prefer, in which case the index
does not appear on a page of its own, but after the TOC in the HTML output.

You control the type of XML block used by setting the =<auto-index-type>element-name=
feature in the Jamfile, or via the `index-type=element-name` command line option to
AutoIndex itself. For example, to place the index in an appendix, your Jamfile might
look like:

[pre
using quickbook ;
using auto-index ;

xml mylibrary : mylibrary.qbk ;
boostbook standalone
   :
      mylibrary
   :
      # auto-indexing is on:
      <auto-index>on

      # PDFs rely on the XSL stylesheets to generate the index:
      <format>pdf:<auto-index-internal>off

      # HTML output uses auto-index to generate the index:
      <format>html:<auto-index-internal>on

      # Name of script file to use:
      <auto-index-script>index.idx

      # Set the XML wrapper for HTML indexes to "appendix":
      <format>html:<auto-index-type>appendix

      # Turn on multiple index support:
      <xsl:param>index.on.type=1
   ;
]


[endsect] [/section:add_indexes Step 3: Add indexes to your documentation]

[section:script Step 4: Create the .idx script file - to control what terms to index]

AutoIndex works by reading a script file that tells it what terms to index.

If your document contains largely text, and only a small amount of simple C++,
and/or if you are using Doxygen to provide a C++ Reference section
(that lists the C++ elements),
and/or if you are relying on the indexing provided from a Standalone Doxygen Index,
you may decide that an index of the C++ code is not needed
and that you only want the text part indexed.

But if you want C++ classes, functions, typedefs and/or macros AutoIndexed,
the script file also tells AutoIndex which C++ files to scan.

At its simplest, it will scan one or more headers for terms that
should be indexed in the documentation.
So for example to scan
"myheader.hpp" and "mydetailsheader.hpp" the script file would just contain:

   !scan myheader.hpp
   !scan mydetailsheader.hpp

Or, more likely in practice,
we can recursively scan through directories looking for all
the files to scan whose [*name matches a particular regular expression]:

[pre !scan-path "boost\/mylibrary" ".*\.hpp" true ]

Each argument is whitespace separated and can be optionally
enclosed in "double quotes" (recommended).

The final ['true] argument indicates
that subdirectories in `boost/mylibrary` should be searched
recursively in addition to that directory.

[caution The second ['file-name-regex] argument is a regular expression and not a filename GLOB!]

[caution The scan-path is modified by any setting of <auto-index-prefix>.
The examples here assume that this is [^<auto-index-prefix>..\/..\/..]
so that `boost/mylibrary` will be your header files,
`libs/mylibrary/doc` will contain your documentation files and
`libs/mylibrary/example` will contain your examples.
]

You could also scan any example (.cpp) files,
typically in folder `libs/mylibrary/example`.

[pre
# All example source files, assuming no sub-folders.
!scan-path "libs\/mylibrary\/example" ".*\.cpp"
] [/pre]

Often the ['scan] or ['scan-path] rules will bring in too many terms
to search for, so we need to be able to exclude terms as well:

   !exclude type

Which excludes the term "type" from being indexed.
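
Returning to the ['file-name-regex] argument of ['scan-path]: because it is a regular expression
and not a GLOB, what a GLOB would spell `*.hpp` has to be written `.*\.hpp`.
The sketch below shows the difference in C++. It is only an illustration: `file_selected` is a
hypothetical helper, `std::regex` stands in for whatever regular expression engine AutoIndex uses,
and it assumes the expression has to match the whole file name.

```cpp
#include <regex>
#include <string>

// Hypothetical helper, not part of AutoIndex: returns true when the WHOLE
// file name matches the regular expression (assumption: full match, not search).
bool file_selected(const std::string& file_name, const std::string& file_name_regex)
{
    return std::regex_match(file_name, std::regex(file_name_regex));
}
```

With this helper, `file_selected("myheader.hpp", ".*\\.hpp")` is true, while
`file_selected("example.cpp", ".*\\.hpp")` is false; note that the backslash is doubled
only because it sits inside a C++ string literal.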

We can also add terms manually:

   foobar

will index occurrences of "foobar" and:

   foobar \<\w*(foo|bar)\w*\>

will index any whole word containing either "foo" or "bar" within it.
This is useful when you want to index a lot of similar or related
words under one entry, for example:

   reflex

will only index occurrences of "reflex" as a whole word, but:

   reflex \<reflex\w*\>

will index occurrences of "reflex", "reflexing" and
"reflexed" all under the same entry ['reflex].
You will very often need to use this to deal with plurals and other variants.

This inclusion rule can also restrict the term to
certain sections, and add an index category that
the term should belong to (so it only appears in certain
indexes).

Finally the script can add rewrite rules that rename section names
that are automatically used as index entries. For example we might
want to remove leading "A" or "The" prefixes from section titles
when AutoIndex uses them as an index entry:

   !rewrite-name "(?i)(?:A|The)\s+(.*)" "\1"

[endsect] [/section:script Step 4: Create the script file - to control what terms to index]

[section:entries Step 5: Add Manual Index Entries to Docbook XML - Optional]

If you add manual `<indexterm>` markup to your Docbook XML then these will be
passed through unchanged. Please note however, that if you are using
AutoIndex's internal index generation then it only recognises
`<primary>`, `<secondary>` and `<tertiary>` elements within the `<indexterm>`.
`<see>` and `<seealso>` elements are not currently recognised
and AutoIndex will emit a warning if these are used.

Likewise none of the attributes which can be applied to these elements are used when
AutoIndex generates the index itself, with the exception of the `type` attribute.

For Quickbook users, there are some templates in auto_index_helpers.qbk that assist
in adding manual entries without having to escape to Docbook.

[endsect] [/section:entries Step 5: Add Manual Index Entries to Docbook XML - Optional]

[section:pis Step 6: Using XML processing instructions to control what gets indexed.]

Sometimes you need to exclude certain sections of text from indexing;
you can achieve this with the following XML processing instructions:

[table
[[Instruction][Effect]]
[[`<?BoostAutoIndex IgnoreSection?>`]
   [Causes the whole of the current section to be excluded from indexing.
   By "section" we mean either a true "section" or any sibling XML element:
   "dedication", "toc", "lot", "glossary", "bibliography", "preface", "chapter",
   "reference", "part", "article", "appendix", "index", "setindex", "colophon",
   "sect1", "refentry", "simplesect", "section" or "partintro".]]
[[`<?BoostAutoIndex IgnoreBlock?>`]
   [Causes the whole of the current text block to be excluded from indexing.
   A text block may be any of the section/chapter elements listed above, or a
   paragraph, code listing, table etc.
   The complete list is:
   "calloutlist", "glosslist", "bibliolist", "itemizedlist", "orderedlist",
   "segmentedlist", "simplelist", "variablelist", "caution", "important", "note",
   "tip", "warning", "literallayout", "programlisting", "programlistingco",
   "screen", "screenco", "screenshot", "synopsis", "cmdsynopsis", "funcsynopsis",
   "classsynopsis", "fieldsynopsis", "constructorsynopsis",
   "destructorsynopsis", "methodsynopsis", "formalpara", "para", "simpara",
   "address", "blockquote", "graphic", "graphicco", "mediaobject",
   "mediaobjectco", "informalequation", "informalexample", "informalfigure",
   "informaltable", "equation", "example", "figure", "table", "msgset", "procedure",
   "sidebar", "qandaset", "task", "productionset", "constraintdef", "anchor",
   "bridgehead", "remark", "highlights", "abstract", "authorblurb" or "epigraph".]]
]

For Quickbook users the file auto_index_helpers.qbk contains a helper template
that assists in inserting these processing instructions, for example:

[pre \[AutoIndex IgnoreSection\]]

will cause that section to not be indexed.

[endsect] [/section:pis Step 6: Using XML processing instructions to control what gets indexed.]

[section:build_docs Step 7: Build the Docs]

Using Boost.Build you build the docs with either:

   bjam release > mylibrary_html.log

to build the HTML docs, or:

   bjam pdf release > mylibrary_pdf.log

to build the PDF.

During the build process you should see AutoIndex emit a message in the log file
such as:

[pre Indexing 990 terms... ]

If you don't see that, or if it's indexing 0 terms then something is wrong!

Likewise when index generation is complete, AutoIndex will emit another message:

[pre 38 Index entries were created.]

Again, if you see that 0 entries were created then something is wrong!
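
Checking the log for these two messages is easy to automate.
The sketch below is a hypothetical helper, not part of AutoIndex: the name `index_looks_healthy`
is invented here, and it assumes the messages have exactly the shape quoted above
("Indexing N terms" and "N Index entries were created").

```cpp
#include <istream>
#include <regex>
#include <string>

// Hypothetical helper, not part of AutoIndex: returns true only when the log
// contains both messages quoted above, each with a non-zero count.
// Assumption: the message wording matches the two examples in this section.
bool index_looks_healthy(std::istream& log)
{
    static const std::regex terms_re("Indexing (\\d+) terms");
    static const std::regex entries_re("(\\d+) Index entries were created");

    bool terms_ok = false;
    bool entries_ok = false;
    std::string line;
    std::smatch m;
    while (std::getline(log, line))
    {
        if (std::regex_search(line, m, terms_re))
            terms_ok = std::stoul(m[1].str()) > 0;      // "Indexing 0 terms" fails
        else if (std::regex_search(line, m, entries_re))
            entries_ok = std::stoul(m[1].str()) > 0;    // "0 Index entries" fails
    }
    return terms_ok && entries_ok;
}
```

Opening `mylibrary_html.log` with a `std::ifstream` and passing it to this function gives a quick
yes\/no answer, but it is no substitute for reading the log when something looks wrong.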

Examine the log file, and if the cause is not obvious,
make sure that you have [^<auto-index-verbose>on] and that
any needed
[^!debug regular-expression] directives are in your script file.

[endsect] [/section:build_docs Step 7: Build the Docs]

[section:refine Step 8: Iterate - to refine your index]

Creating a good index is an iterative process. Often the first step is
just to add a header scanning rule to the script file and then generate
the documentation and see:

* What's missing.
* What's been included that shouldn't be.
* What's been included under a poor name.

Further rules can then be added to the script to handle these cases
and the next iteration examined, and so on.

[tip If you don't understand why a particular term is (or is not) present in the index,
try adding a ['!debug regular-expression]
directive to the [link boost_autoindex.script_ref script file].
] [/tip]

[heading Restricting which Sections are indexed for a particular term]

You can restrict which sections are indexed for a particular term.
So assuming that the docbook document has the usual hierarchical names for section IDs
(as Quickbook generates, for example),
you can easily place a constraint on which sections are examined for a particular term.

For example, if you want to index occurrences of Lord Kelvin's name,
but only in the introduction section, you might then add:

  Kelvin "" ".*introduction.*"

to the script file,
assuming that the section ID of the intro is "some_library_or_chapter_name.introduction".

This would avoid an index entry every time 'Kelvin' is found,
something the user is unlikely to find helpful.
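
To see why the constraint is written with `.*` on both sides, it helps to try it against a few
section IDs. The following is a minimal sketch only: `section_selected` is a hypothetical helper,
`std::regex` stands in for AutoIndex's own regular expression engine, and it assumes that the
constraint has to match the whole section ID and that an empty constraint selects every section.

```cpp
#include <regex>
#include <string>

// Hypothetical helper, not part of AutoIndex: decides whether a section is
// examined for a term, given the third field of an index-term line.
// Assumptions: the constraint must match the WHOLE section ID, and an
// empty constraint ("") means "all sections".
bool section_selected(const std::string& section_id, const std::string& constraint)
{
    if (constraint.empty())
        return true;
    return std::regex_match(section_id, std::regex(constraint));
}
```

Under those assumptions `".*introduction.*"` selects "some_library_or_chapter_name.introduction"
but not "some_library_or_chapter_name.reference", and a bare `"introduction"` would select neither,
because it does not account for the library prefix in the ID.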

[endsect] [/section:refine Step 8: Iterate - to refine your index]

[endsect] [/section:tut Getting Started and Tutorial]


[section:script_ref Script File (.idx) Reference]

The following elements can occur in a script:

[h4 Comments and blank lines]

Lines that are blank or consist only of whitespace are ignored, as are lines that [*start with a #].

[note You can't append \# comments onto the end of a line\!]

[h4 Inclusion of Index terms]

   term [regular-expression1 [regular-expression2 [category]]]

[variablelist
[[term][
['Term to index.]

The index term will form a primary entry in the Index
with the section title(s) containing the term as secondary entries, and
also will be used as a secondary entry beneath each of the section
titles that the index term occurs in.]
]  [/term]

[[regular-expression1][
['Index term Searcher.]

An optional regular expression: each occurrence
of the regular expression in the text of the document will result
in one index term being emitted.

If the regular expression is omitted (default) or is "", then the ['index term] itself
will be used as the search text - and only occurrences of whole words matching
['index term] will be indexed.

For example:

``foobar``

will index occurrences of "foobar" in any section, but

``foobar \<\w*(foo|bar)\w*\>``

will index any whole word containing either "foo" or "bar" within it.
This is useful when you want to index a lot of similar or related words under one entry.

``reflex``

will only index occurrences of "reflex" as a whole word, but:

``reflex \<reflex\w*\>``

will index occurrences of "reflex", "reflexes", "reflexing" and "reflexed" ...
all under the same entry reflex.

You will very often need to use this to deal with plurals and other variants.]
] [/regular-expression1]

[[regular-expression2]
[['Section(s) Selector.]

A constraint that specifies which sections are
indexed for ['term]: only if the ID of the section matches
['regular-expression2] exactly will that section be indexed
for occurrences of ['term].

For example, to limit indexing to just [*one specific section] (but not sub-sections below):

``myclass "" "mylib\.examples"``


Or, to limit indexing to specific sections, [*and sub-sections below]:

``myclass "" "mylib\.examples.*"``

will index occurrences of "myclass" as a whole word,
but only in sections whose section ID [*begins] "mylib.examples", while

``myclass "\<myclass\w*\>" "mylib\.examples.*"``

will also index plurals myclass, myclasses, myclasss ...

and:

``myclass "" "(?!mylib\.introduction).*"``

will index occurrences of "myclass" in any section,
except those whose section IDs begin "mylib.introduction".

Finally, two (or more) sections can be excluded by OR'ing them together:

``myclass "" "(?!mylib\.introduction|mylib\.reference).*"``

which excludes searching for this term in sections whose IDs start with either "mylib.introduction" or "mylib.reference".

If this third section selection field is omitted (the default)
or is "", then [*all sections] are indexed for this term.
]
] [/regular-expression2]

[[category][
['Index Category Constraint.]

Optionally a category to place occurrences of ['index term] in.
If you have multiple indexes then this is the name
assigned to the index's "type" attribute.

For example:

   myclass "" "" class_name

Will index occurrences of ['myclass] and place them in the class-index if there is one.

]] [/category]

] [/variablelist]

You can have an index term appear more than once in the script file:

* If they have different /category/ names then they are treated quite separately.
* Otherwise they are combined, so that the logical OR of the regular expressions provided is taken.

Thus:

   myterm search_expression1 constraint_expression2 foo
   myterm search_expression1 constraint_expression2 bar

Will be treated as different terms each with their own entries, while:

   myterm search_expression1 constraint_expression2 mycategory
   myterm search_expression1 constraint_expression2 mycategory

Will be combined into a single term equivalent to:

   myterm (?:search_expression1|search_expression1) (?:constraint_expression2|constraint_expression2) mycategory

[h4 Source File Scanning]

   !scan source-file-name

Scans the C\/C++ source file ['source-file-name] for definitions of
['function]s, ['class]s, ['macro]s or ['typedef]s and makes each of
these a term to be indexed.  Terms found are assigned to the index category
"function_name", "class_name", "macro_name" or "typedef_name" depending
on how they were seen in the source file.  These may then be included
in a specialised index whose "type" attribute has the same category name.

[important
When actually indexing a document, the scanner will not index just any old occurrence of the
terms found in the source files.  Instead it searches for class definitions or function or
typedef declarations.  This reduces the number of spurious matches placed in the index, but
may also miss some legitimate terms:
refer to the /define-scanner/ command for information on how to change this.
]

[h4 Directory and Source File Scanning]

   !scan-path directory-name file-name-regex [recurse]

[variablelist
[[directory-name][The directory to scan: this should be a path relative
to the script file (or to the path specified with the --prefix=pathname option on the command line)
and should use all forward slashes in its file name.]]

[[file-name-regex][A regular expression: any file in the directory whose name
matches the regular expression will be scanned for terms to index.]]

[[recurse][An optional boolean value - either "true" or "false" - that
indicates whether to recurse into subdirectories.  This defaults to "false".]]
]

[h4 Excluding Terms]

   !exclude term-list

Excludes all the terms in the whitespace-separated ['term-list] from being indexed.
This should be placed /after/ any ['!scan] or ['!scan-path] rules which may
result in the terms becoming included.  In other words this removes terms from
the scanner's internal list of things to index.
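
Why the position of ['!exclude] matters can be illustrated with a toy model.
The sketch below is written in Python and is not how AutoIndex is implemented:
it simply assumes that directives are applied in script order to a set of terms.
The function `build_term_list`, the file name and the term names are all hypothetical.

```python
def build_term_list(directives, scan_results):
    """Toy model: apply script directives, in order, to a set of terms.

    directives   -- list of (name, argument) pairs in script-file order
    scan_results -- hypothetical mapping: file name -> terms a scan would find
    """
    terms = set()
    for name, argument in directives:
        if name == "!scan":
            # Scanning adds whatever terms were found in the file.
            terms |= set(scan_results[argument])
        elif name == "!exclude":
            # The argument is a whitespace-separated term list;
            # only terms already in the set can be removed.
            terms -= set(argument.split())
    return terms

scan_results = {"mylib.hpp": ["my_class", "detail_helper", "MYLIB_CONFIG"]}

exclude_last = build_term_list(
    [("!scan", "mylib.hpp"), ("!exclude", "detail_helper MYLIB_CONFIG")],
    scan_results)
exclude_first = build_term_list(
    [("!exclude", "detail_helper MYLIB_CONFIG"), ("!scan", "mylib.hpp")],
    scan_results)

print(sorted(exclude_last))
print(sorted(exclude_first))
```

In this model an exclusion placed before the scan has nothing to remove, so the unwanted terms are added afterwards and survive.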

[h4 Rewriting Section Names]

[pre !rewrite-id regular-expression new-name]

[variablelist
[[regular-expression][A regular expression: all section IDs that match
the expression exactly will have index entries ['new-name] instead of
their title(s).]]

[[new-name][The name that the section will appear under in the index.]]
]

   !rewrite-name regular-expression format-text

[variablelist
[[regular-expression][A regular expression: all sections whose titles
match the regular expression exactly, will have index entries composed
of the regular expression match combined with the regex format string
['format-text].]]
[[format-text][The Perl-style format string used to reformat the title.]]
]

For example:

[pre
!rewrite-name "(?:A|An|The)\s+(.*)" "\1"
]

Will remove any leading "A", "An" or "The" from all index entries - thus preventing lots of
entries under "The" etc!

[h4 Defining or Changing the File Scanners]

   !define-scanner type file-search-expression xml-regex-formatter term-formatter id-filter filename-filter

When a source file is scanned using the =!scan= or =!scan-path= rules, then the file is searched using
a series of regular expressions to look for classes, functions, macros or typedefs that should be indexed.
A set of default regular expressions is provided for this (see below), but sometimes you may want to replace
the defaults, or add new scanners.
The arguments to this rule are:

[variablelist
[[type][The ['type] to which items found using this rule will be assigned. Index terms created from the
source file and then found in the XML will have the type attribute set to this value, and may then appear in a
specialized index with the same type attribute.]]
[[file-search-expression][A regular expression that is used to scan the source file for index terms. The result of
a match against this expression will be transformed by the next two arguments.]]
[[xml-regex-formatter][A regular expression format string that extracts the salient information from whatever
matched the ['file-search-expression] in the source file, and creates ['a new regular expression] that will
be used to search the document being indexed for occurrences of this index term.]]
[[term-formatter][A regular expression format string that extracts the salient information from whatever
matched the ['file-search-expression] in the source file, and creates the index term that will appear in
the index.]]
[[id-filter][Optional.  A regular expression that restricts the section IDs that are searched in the document being indexed:
only sections whose ID attribute matches this expression exactly will be considered for indexing terms found by this scanner.]]
[[filename-filter][Optional.  A regular expression that restricts which files are scanned by this scanner: only files whose file name
matches this expression exactly will be scanned for index terms to use.  Note that the filename matched against this may
well be an absolute path, and contain either forward or backward slash path separators.]]
]

If, when the first file is scanned, there are no scanners whose ['type] is "class_name", "typedef_name", "macro_name" or
"function_name", then the defaults are installed.
These are equivalent to:

   !define-scanner class_name "^[[:space:]]*(template[[:space:]]*<[^;:{]+>[[:space:]]*)?(class|struct)[[:space:]]*(\<\w+\>([[:blank:]]*\([^)]*\))?[[:space:]]*)*(\<\w*\>)[[:space:]]*(<[^;:{]+>)?[[:space:]]*(\{|:[^;\{()]*\{)" "(?:class|struct)[^;{]+\<\5\>[^;{]+\{" \5
   !define-scanner typedef_name "typedef[^;{}#]+?(\w+)\s*;" "typedef[^;]+\<\1\>\s*;" "\1"
   !define-scanner "macro_name" "^\s*#\s*define\s+(\w+)" "\<\1\>" "\1"
   !define-scanner "function_name" "\w++(?:\s*+<[^>]++>)?[\s&*]+?(\w+)\s*(?:BOOST_[[:upper:]_]+\s*)?\([^;{}]*\)\s*[;{]" "\\<\\w+\\>(?:\\s+<[^>]*>)*[\\s&*]+\\<\1\\>\\s*\\([^;{]*\\)" "\1"

Note that these defaults are not installed if you have provided your own versions with these ['type] names.  In this case if
you want the default scanners to be in effect as well as your own, you should include the above in your script file.
It is also perfectly allowable to have multiple scanners with the same ['type], but with the other fields differing.

Finally you should note that the default scanners are quite strict
in what they will find, for example the class
scanner will only create index entries for classes that have class definitions of the form:

   class my_class : public base_classes
   {
      // etc

in the documentation, so that simple mentions of the class name will ['not] get indexed,
only the class synopsis if there is one.
If this isn't how you want things, then include the ['class_name] scanner definition
above in your script file, and change
the ['xml-regex-formatter] field to something more permissive, for example:

   !define-scanner class_name "^[[:space:]]*(template[[:space:]]*<[^;:{]+>[[:space:]]*)?(class|struct)[[:space:]]*(\<\w+\>([[:blank:]]*\([^)]*\))?[[:space:]]*)*(\<\w*\>)[[:space:]]*(<[^;:{]+>)?[[:space:]]*(\{|:[^;\{()]*\{)" "\<\5\>" \5

Will look for ['any] occurrence of whatever class names the scanner may find in the documentation.
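
How the fields of a scanner fit together can be sketched with the default "macro_name" scanner, which is the simplest of the four.
The following Python is an approximation under stated assumptions, not AutoIndex code:
Python's `re` module replaces the regular expression engine used by AutoIndex,
`re.MULTILINE` is assumed to reproduce how `^` matches at the start of each line,
`\b` is used in place of the `\<` and `\>` word boundaries, which Python does not support,
and the header text and paragraph are hypothetical.

```python
import re

# Hypothetical header file contents.
header = """\
#define MYLIB_VERSION 3
  # define MYLIB_ASSERT(x) ((void)0)
int not_a_macro;
"""

# file-search-expression of the default macro_name scanner.
file_search = re.compile(r"^\s*#\s*define\s+(\w+)", re.MULTILINE)

scanned = []
for m in file_search.finditer(header):
    term = m.group(1)                             # term-formatter "\1"
    doc_regex = r"\b" + re.escape(term) + r"\b"   # stands in for "\<\1\>"
    scanned.append((term, doc_regex))

# Hypothetical flattened paragraph from the document being indexed.
paragraph = "Define MYLIB_ASSERT before including the header."
found = [term for term, doc_regex in scanned if re.search(doc_regex, paragraph)]
print(found)
```

The source file match supplies the captured name, the term formatter turns it into the index term, and the XML formatter turns it into the whole-word search that is run against the document.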

[h4 Debugging scanning]

If you see a term in the index, and you don't understand why it's there, add a ['!debug] directive:

[pre
!debug regular-expression
]

Now, whenever ['regular-expression] matches either the found index term,
or the section title it appears in, or the ['type] field of a scanner, then
some diagnostic information will be printed that will look something like:

[pre
Debug term found, in block with ID: spirit.qi.reference.parser_concepts.parser
Current section title is: Notation
The main index entry will be : Notation
The indexed term is: parser
The search regex is: \[P\|p\]arser
The section constraint is: .*qi.reference.parser_concepts.*
The index type for this entry is: qi_index
]

This can produce a lot of output in your log file,
but until you are satisfied with your file selection and scanning process,
it is worth switching it on.

[endsect] [/section:script_ref Script File Reference]

[section:workflow Understanding The AutoIndex Workflow]

# Load the script file (usually index.idx)
  and process it one line at a time,
  producing one or more index terms per (non-comment) line.

# Reading all lines builds a list of ['terms to index].
  Some of those may be terms defined (by you) directly in the script file,
  others may be terms found by scanning C++ header and source files
  that were specified by the ['!scan-path] directive.

# Once the list of ['terms to index] is complete,
  AutoIndex loads the Docbook XML file.
  (If this comes from Quickbook\/Doxygen\/Boostbook\/Docbook then this is
  the complete documentation after conversion to Docbook format).

# AutoIndex builds an internal __DocObjMod of the Docbook XML.
  This internal representation then gets scanned for occurrences of the ['terms to index].
  This scanning works at the XML paragraph level
  (or equivalent sibling such as a table or code block)
  - so all the XML encoding within a paragraph gets flattened to plain text.[br]
  This flattening means the regular expressions used to search for ['terms to index]
  can find anything that is completely contained within a paragraph
  (or code block etc).

# For each term found, an ['indexterm] Docbook element is inserted
  into the __DocObjMod (provided internal index generation is off).

# AutoIndex's internal index representation is also updated.

# Once the whole XML document has been indexed,
  then, if AutoIndex has been instructed to generate the index itself,
  it creates the necessary XML and inserts this into the __DocObjMod.

# Finally the whole __DocObjMod is written out as a new Docbook XML file,
  and normal processing of this continues via the XSL stylesheets (with xsltproc)
  to actually build the final human-readable docs.

[endsect] [/section:workflow AutoIndex Workflow]


[section:xml XML Handling]

AutoIndex is rather simplistic in its handling of XML:

* When indexing a document, all block content at the paragraph level gets collapsed into a single
string for matching against the regular expressions representing each index term.  In other words,
for the most part, you can assume that you're indexing plain text when writing regular expressions.
* Named XML entities for &, ", ', < or > are converted to their corresponding characters before indexing
a section of text.  However, decimal or hex escape sequences are not currently converted.
* Index terms are assumed to be plain text (whether they originate from the script file
or from scanning source files) and the characters &, ", < and > will be escaped to
&amp; &quot; &lt; and &gt; respectively.
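
The last two rules can be made concrete with a short sketch.
This is Python written for illustration only, not the AutoIndex implementation:
the function names `unescape_named` and `escape_term` and the sample strings are hypothetical,
and the code assumes nothing beyond the rules listed above.

```python
import re

# The five named entities that are converted before a block of text is indexed.
NAMED = {"&amp;": "&", "&quot;": '"', "&apos;": "'", "&lt;": "<", "&gt;": ">"}

def unescape_named(text):
    # Decimal or hex escapes such as &#38; do not match and are left untouched.
    return re.sub(r"&(?:amp|quot|apos|lt|gt);", lambda m: NAMED[m.group(0)], text)

def escape_term(term):
    # Replace & first, so the ampersands introduced by the later
    # replacements are not escaped a second time.
    return (term.replace("&", "&amp;")
                .replace('"', "&quot;")
                .replace("<", "&lt;")
                .replace(">", "&gt;"))

searchable = unescape_named("operator&lt;&lt; &#38; friends")
emitted = escape_term('vector<T> & "x"')
print(searchable)
print(emitted)
```

One practical consequence is that a search expression should be written against the converted characters, for example a literal < rather than its entity.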

[endsect] [/section:xml XML Handling]

[section:qbk Quickbook Support]

The file auto_index_helpers.qbk in ['boost-path]/tools/auto_index/include contains various Quickbook
templates to assist with AutoIndex support.  You would normally add the above path to your include
search path via an `<include>path` statement in your Jamfile, and then make the templates available
to your Quickbook source via a:

[pre \[include auto_index_helpers.qbk\]]

statement at the start of your Quickbook file.

The available templates are then:

[table
[[Template][Description]]
[[`[index]`][Creates a main index, with no "type" category set, which will be titled simply "Index".]]
[[`[named_index type title]`][Creates an index with the type attribute set to "type" and the title will be "title".[br]
   For example to create an index containing only class names one would typically add `[named_index class_name Class Index]`
   to your Quickbook source.]]
[[`[AutoIndex Arg]`][Creates a Docbook processing instruction that will be handled by AutoIndex, valid values for "Arg"
   are either "IgnoreSection" or "IgnoreBlock".]]
[[`[indexterm1 primary-key]`][Creates a manual index entry that will link to the current section, and have a single primary key "primary-key".
   Note that this index key will not have a "type" attribute set, and so will only appear in the main index.]]
[[`[indexterm2 primary-key secondary-key]`][Creates a manual index entry that will link to the current section, and has
   "primary-key" and "secondary-key" as the primary and secondary keys respectively.
   Note that this index key will not have a "type" attribute set, and so will only appear in the main index.]]
[[`[indexterm3 primary-key secondary-key tertiary-key]`][Creates a manual index entry that will link to the current section,
   and have primary, secondary and tertiary keys: "primary-key", "secondary-key" and "tertiary-key".
   Note that this index key will not have a "type" attribute set, and so will only appear in the main index.]]

[[`[typed_indexterm1 type primary-key]`][Creates a manual index entry that will link to the current section, and have a single primary key "primary-key".
   Note that this index key will have the "type" attribute set to the "type" argument, and so may appear in named sub-indexes
   that also have their type attribute set.]]
[[`[typed_indexterm2 type primary-key secondary-key]`][Creates a manual index entry that will link to the current section, and has
   "primary-key" and "secondary-key" as the primary and secondary keys respectively.
   Note that this index key will have the "type" attribute set to the "type" argument, and so may appear in named sub-indexes
   that also have their type attribute set.]]
[[`[typed_indexterm3 type primary-key secondary-key tertiary-key]`][Creates a manual index entry that will link to the current section,
   and have primary, secondary and tertiary keys: "primary-key", "secondary-key" and "tertiary-key".
   Note that this index key will have the "type" attribute set to the "type" argument, and so may appear in named sub-indexes
   that also have their type attribute set.]]
]

[endsect]

[section:comm_ref Command Line Reference]

The following command line options are supported by AutoIndex:

[variablelist
[[--in=infilename][Specifies the name of the XML input file to be indexed.]]
[[--out=outfilename][Specifies the name of the new XML file to create.]]
[[--scan=source-filename][Specifies that ['source-filename] should be scanned
for terms to index.]]
[[--script=script-filename][Specifies the name of the script file to process.]]
[[--no-duplicates][If a term occurs more than once in the same section, then
include only one index entry.]]
[[--internal-index][Specifies that AutoIndex should generate the actual
indexes rather than inserting `<indexterm>`s and leaving index generation
to the XSL stylesheets.]]
[[--no-section-names][Prevents AutoIndex from using section names as index entries.]]
[[--prefix=pathname][Specifies a directory to apply as a prefix to all relative file paths in the script file.]]
[[--index-type=element-name][Specifies the name of the XML element to enclose internally generated indexes in:
   defaults to ['section], but could equally be ['appendix] or ['chapter]
   or some other block level element that has a formal title.]]
]

[endsect] [/section:comm_ref Command Line Reference]

[include ../include/auto_index_helpers.qbk]

[index]
