[/==============================================================================
    Copyright (C) 2001-2011 Joel de Guzman
    Copyright (C) 2001-2011 Hartmut Kaiser

    Distributed under the Boost Software License, Version 1.0. (See accompanying
    file LICENSE_1_0.txt or copy at http://www.boost.org/LICENSE_1_0.txt)
===============================================================================/]

[section Preface]

[:['["Examples of designs that meet most of the criteria for
"goodness" (easy to understand, flexible, efficient) are a
recursive-descent parser, which is traditional procedural
code. Another example is the STL, which is a generic library of
containers and algorithms depending crucially on both traditional
procedural code and on parametric polymorphism.]] [*--Bjarne
Stroustrup]]

[heading History]

[heading /80s/]

In the mid-80s, Joel wrote his first calculator in Pascal. It was an
unforgettable coding experience: he was amazed at how a mutually
recursive set of functions can model a grammar specification. In time,
the skills he acquired from that academic experience became very
practical as he was tasked to do some parsing. For instance, whenever he
needed to perform any form of binary or text I/O, he tried to approach
each task somewhat formally by writing a grammar using Pascal-like
syntax diagrams and then a corresponding recursive-descent parser. This
process worked very well.

[heading /90s/]

The arrival of the Internet and the World Wide Web magnified the need
for parsing a thousandfold. At one point Joel had to write an HTML
parser for a Web browser project. Using the W3C formal specifications,
he easily wrote a recursive-descent HTML parser. With the influence of
the Internet, RFC specifications were abundant. SGML, HTML, XML, email
addresses and even those seemingly trivial URLs were all formally
specified using small EBNF-style grammar specifications.
Joel had more
parsing to do, and he wished for a tool similar to larger parser
generators such as YACC and ANTLR, where a parser is built automatically
from a grammar specification.

This ideal tool would be able to parse anything from email addresses and
command lines to XML and scripting languages. Scalability was a primary
goal. The tool would be able to do this without incurring a heavy
development load, which was not possible with the above-mentioned parser
generators. The result was Spirit.

Spirit was a personal project that was conceived when Joel was involved
in R&D in Japan. Inspired by the GoF's composite and interpreter
patterns, he realized that he could model a recursive-descent parser with
hierarchical-object composition of primitives (terminals) and composites
(productions). The original version was implemented with run-time
polymorphic classes. A parser was generated at run time by feeding in
production rule strings such as:

    "prod ::= {'A' | 'B'} 'C';"

A compile function compiled the parser, dynamically creating a hierarchy
of objects and linking semantic actions on the fly. A very early text
can be found here: __early_spirit__.

[heading /2001 to 2006/]

Versions 1.0 to 1.8 were a complete rewrite of the original Spirit parser
using expression templates and static polymorphism, inspired by the
works of Todd Veldhuizen (__exprtemplates__, C++ Report, June
1995). Initially, the static-Spirit version was meant only to replace
the core of the original dynamic-Spirit. Dynamic-Spirit needed a parser
to implement itself anyway. The original employed a hand-coded
recursive-descent parser to parse the input grammar specification
strings. It was at this time that Hartmut Kaiser joined the Spirit
development.

After its initial "open-source" debut in May 2001, static-Spirit became
a success.
Around November 2001, the Spirit website had an activity
percentile of 98%, making it the number one parser tool at SourceForge
at the time. Not bad for a niche project like a parser library. The
"static" part of the name was forgotten and static-Spirit simply became
Spirit. The library soon evolved to acquire more dynamic features.

Spirit was formally accepted into __boost__ in October 2002. Boost is a
peer-reviewed, open collaborative development effort around a collection
of free Open Source C++ libraries covering a wide range of domains. The
Boost Libraries have become widely known as an industry standard for
design and implementation quality, robustness, and reusability.

[heading /2007/]

Over the years, especially after Spirit was accepted into Boost, Spirit
has served its purpose quite admirably. [*/Classic-Spirit/] (versions
prior to 2.0) focused on transduction parsing, where the input string is
merely translated to an output string. Many parsers fall into the
transduction type. When the time came to add attributes to the parser
library, it was done in a rather ad-hoc manner, with the goal of being 100%
backward compatible with Classic-Spirit. As a result, some parsers have
attributes, some don't.

Spirit V2 is another major rewrite. Spirit V2 grammars are fully
attributed (see __attr_grammar__), which means that all parser components
have attributes. To do this efficiently and elegantly, we had to use a
couple of infrastructure libraries. Some did not exist, some were quite
new when Spirit debuted, and some needed work. __mpl__ is an important
infrastructure library, yet it is not sufficient to implement Spirit V2.
Another library had to be written: __fusion__. Fusion sits between MPL
and STL -- between compile time and runtime -- mapping types to values.
Fusion is a direct descendant of both MPL and __boost_tuples__. Fusion
is now a full-fledged __boost__ library.
__phoenix__ also had to be
beefed up to support Spirit V2. The result is __phoenix__. Last
but not least, Spirit V2 uses an __exprtemplates__ library called
__boost_proto__.

Even though it has evolved and matured to become a multi-module library,
Spirit is still used for micro-parsing tasks as well as scripting
languages. As with C++, you pay only for the features that you need. The
power of Spirit comes from its modularity and extensibility. Instead of
giving you a sledgehammer, it gives you the right ingredients to easily
create a sledgehammer.

[heading New Ideas: Spirit V2]

Just before the development of Spirit V2 began, Hartmut came across the
__string_template__ library that is a part of the ANTLR parser
framework. [footnote Quote from http://www.stringtemplate.org/: It is a
Java template engine (with ports for C# and Python) for generating
source code, web pages, emails, or any other formatted text output.]
The concepts presented in that library led Hartmut to
the next step in the evolution of Spirit. Parsing and generation are
tightly connected to a formal notation, or a grammar. The grammar
describes both input and output, and therefore, a parser library should
have a grammar-driven output. This duality is expressed in Spirit by the
parser library __qi__ and the generator library __karma__ using the same
component infrastructure.

The idea of creating a lexer library well integrated with the Spirit
parsers is not new. This has been discussed almost since Classic-Spirit
(pre-V2) initially debuted. Several attempts to integrate existing lexer
libraries and frameworks with Spirit have been made and served as a
proof of concept and usability (for example, see __wave__: The Boost
C/C++ Preprocessor Library, and __slex__: a fully dynamic C++ lexer
implemented with Spirit).
Based on these experiences we added __lex__: a
fully integrated lexer library to the mix. It allows the user to take
advantage of the power of regular expressions for token matching, which
removes pressure from the parser components and simplifies parser
grammars. Again, Spirit's modular structure allowed us to reuse the same
underlying component library as for the parser and generator libraries.

[heading How to use this manual]

Each major section (there are 3: __sec_qi__, __sec_karma__, and
__sec_lex__) is roughly divided into 3 parts:

# Tutorials: A step-by-step guide with heavily annotated code. These
  are meant to get the user acquainted with the library as quickly as
  possible. The objective is to build the confidence of the user in
  using the library through abundant examples and detailed
  instructions. Examples speak volumes and we have volumes of
  examples!

# Abstracts: A high-level summary of key topics. The objective is to
  give the user a high-level view of the library, the key concepts,
  background and theories.

# Reference: Detailed formal technical reference. We start with a quick
  reference -- an easy-to-use table that maps into the reference proper.
  The reference proper starts with C++ concepts followed by
  models of the concepts.

Some icons are used to mark certain topics indicative of their relevance.
These icons precede some text to indicate:

[table Icons

    [[Icon]             [Name]          [Meaning]]

    [[__note__]         [Note]          [Generally useful information (an aside that
                                        doesn't fit in the flow of the text)]]

    [[__tip__]          [Tip]           [Suggestion on how to do something
                                        (especially something that is not obvious)]]

    [[__important__]    [Important]     [Important note on something to take
                                        particular notice of]]

    [[__caution__]      [Caution]       [Take special care with this - it may
                                        not be what you expect and may cause bad
                                        results]]

    [[__danger__]       [Danger]        [This is likely to cause serious
                                        trouble if ignored]]
]

This documentation is automatically generated by the Boost QuickBook
documentation tool. QuickBook can be found in the __boost_tools__.

[heading Support]

Please direct all questions to Spirit's mailing list. You can subscribe
to the __spirit_list__. The mailing list has a searchable archive. A
search link to this archive is provided on __spirit__'s home page. You
may also read and post messages to the mailing list through
__spirit_general__ (thanks to __gmane__). The newsgroup mirrors the
mailing list. Here is a link to the archives: __mlist_archive__.

[endsect] [/ Preface]