[library Boost.MPI
    [quickbook 1.6]
    [authors [Gregor, Douglas], [Troyer, Matthias] ]
    [copyright 2005 2006 2007 Douglas Gregor, Matthias Troyer, Trustees of Indiana University]
    [id mpi]
    [license
        Distributed under the Boost Software License, Version 1.0.
        (See accompanying file LICENSE_1_0.txt or copy at
        <ulink url="http://www.boost.org/LICENSE_1_0.txt">
            http://www.boost.org/LICENSE_1_0.txt
        </ulink>)
    ]
]

[/ Links ]
[def _MPI_         [@http://www-unix.mcs.anl.gov/mpi/ MPI]]
[def _MPI_implementations_
   [@http://www-unix.mcs.anl.gov/mpi/implementations.html
    MPI implementations]]
[def _Serialization_ [@boost:/libs/serialization/doc
                      Boost.Serialization]]
[def _BoostPython_ [@http://www.boost.org/libs/python/doc
                      Boost.Python]]
[def _Python_      [@http://www.python.org Python]]
[def _MPICH_        [@http://www-unix.mcs.anl.gov/mpi/mpich/ MPICH2]]
[def _OpenMPI_      [@http://www.open-mpi.org Open MPI]]
[def _IntelMPI_     [@https://software.intel.com/en-us/intel-mpi-library Intel MPI]]
[def _accumulate_   [@http://www.sgi.com/tech/stl/accumulate.html
                     `accumulate`]]

[include introduction.qbk]
[include getting_started.qbk]
[include tutorial.qbk]
[include c_mapping.qbk]

[xinclude mpi_autodoc.xml]

[include python.qbk]

[section:design Design Philosophy]

The design philosophy of Boost.MPI is very simple: be
both convenient and efficient. MPI is a library built for
high-performance applications, but its Fortran-centric,
performance-minded design makes it rather inflexible from the C++
point of view: passing a string from one process to another is
inconvenient, requiring several messages and explicit buffering;
passing a container of strings from one process to another requires
an extra level of manual bookkeeping; and passing a map from strings
to containers of strings is positively infuriating. Boost.MPI
allows all of these data types to be passed using the same
simple `send()` and `recv()` primitives. Likewise, collective
operations such as [funcref boost::mpi::reduce `reduce()`]
accept arbitrary data types and function objects, much like the C++
Standard Library would.

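For instance, the map-of-containers case reduces to a single
`send()`/`recv()` pair. The following is a minimal sketch (the
ranks, message tag, and map contents are illustrative):

    #include <boost/mpi.hpp>
    #include <boost/serialization/map.hpp>
    #include <boost/serialization/string.hpp>
    #include <boost/serialization/vector.hpp>
    #include <map>
    #include <string>
    #include <vector>
    namespace mpi = boost::mpi;

    int main(int argc, char* argv[])
    {
      mpi::environment env(argc, argv);
      mpi::communicator world;

      std::map<std::string, std::vector<std::string> > dict;
      if (world.rank() == 0) {
        dict["colors"].push_back("red");
        dict["colors"].push_back("green");
        world.send(1, 0, dict);   // one call for the whole structure
      } else if (world.rank() == 1) {
        world.recv(0, 0, dict);   // arrives fully reconstructed
      }
      return 0;
    }
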
The higher-level abstractions provided for convenience must not have
an impact on the performance of the application. For instance, sending
an integer via `send` must be as efficient as a call to `MPI_Send`,
which means that it must be implemented by a simple call to
`MPI_Send`; likewise, an integer [funcref boost::mpi::reduce
`reduce()`] using `std::plus<int>` must be implemented with a call to
`MPI_Reduce` on integers using the `MPI_SUM` operation: anything less
will impact performance. In essence, this is the "don't pay for what
you don't use" principle: if the user is not transmitting strings,
they should not pay the overhead associated with strings.

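A sum-reduction over `int` illustrates this fast path. In the sketch
below, the `reduce()` call maps directly onto `MPI_Reduce` with
`MPI_INT` and `MPI_SUM`:

    #include <boost/mpi.hpp>
    #include <functional>
    #include <iostream>
    namespace mpi = boost::mpi;

    int main(int argc, char* argv[])
    {
      mpi::environment env(argc, argv);
      mpi::communicator world;

      int sum = 0;
      // int combined with std::plus<int> lowers to
      // MPI_Reduce(..., MPI_INT, MPI_SUM, ...) with no
      // serialization overhead.
      mpi::reduce(world, world.rank(), sum, std::plus<int>(), 0);

      if (world.rank() == 0)
        std::cout << "sum of ranks: " << sum << std::endl;
      return 0;
    }
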
Sometimes, achieving maximal performance means forgoing convenient
abstractions and implementing certain functionality using lower-level
primitives. For this reason, it is always possible to extract enough
information from the abstractions in Boost.MPI to minimize
the amount of effort required to interface between Boost.MPI
and the C MPI library.
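
For example, a [classref boost::mpi::communicator communicator] is
implicitly convertible to an `MPI_Comm`, and `get_mpi_datatype()`
exposes the `MPI_Datatype` that Boost.MPI associates with a C++ type,
so both can be handed straight to the C API. A small sketch:

    #include <boost/mpi.hpp>
    namespace mpi = boost::mpi;

    int main(int argc, char* argv[])
    {
      mpi::environment env(argc, argv);
      mpi::communicator world;

      // The communicator converts implicitly to MPI_Comm.
      int rank;
      MPI_Comm_rank(world, &rank);

      // get_mpi_datatype() yields the MPI_Datatype (here, MPI_INT)
      // that Boost.MPI uses for the given value's type.
      int value = rank;
      MPI_Bcast(&value, 1, mpi::get_mpi_datatype(value), 0, world);
      return 0;
    }
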
[endsect]

[section:performance Performance Evaluation]

Message-passing performance is crucial in high-performance distributed
computing. To evaluate the performance of Boost.MPI, we modified the
standard [@http://www.scl.ameslab.gov/netpipe/ NetPIPE] benchmark
(version 3.6.2) to use Boost.MPI and compared its performance against
raw MPI. We ran five different variants of the NetPIPE benchmark:

# MPI: The unmodified NetPIPE benchmark.

# Boost.MPI: NetPIPE modified to use Boost.MPI calls for
  communication.

# MPI (Datatypes): NetPIPE modified to use a derived datatype (which
  itself contains a single `MPI_BYTE`) rather than a fundamental
  datatype.

# Boost.MPI (Datatypes): NetPIPE modified to use a user-defined type
  `Char` in place of the fundamental `char` type. The `Char` type
  contains a single `char`, a `serialize()` method to make it
  serializable, and specializes [classref
  boost::mpi::is_mpi_datatype is_mpi_datatype] to force
  Boost.MPI to build a derived MPI data type for it (this type is
  sketched below, after the list).

# Boost.MPI (Serialized): NetPIPE modified to use a user-defined type
  `Char` in place of the fundamental `char` type. This `Char` type
  contains a single `char` and is serializable. Unlike the Datatypes
  case, [classref boost::mpi::is_mpi_datatype
  is_mpi_datatype] is *not* specialized, forcing Boost.MPI to perform
  many, many serialization calls.

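To make the last two variants concrete, here is a sketch of what such
a `Char` type looks like (the actual benchmark source may differ in
detail):

    #include <boost/mpi/datatype.hpp>
    #include <boost/mpl/bool.hpp>

    // A single char wrapped in a user-defined type, as used in the
    // last two benchmark variants.
    struct Char
    {
      char value;

      template<typename Archive>
      void serialize(Archive& ar, const unsigned int /*version*/)
      { ar & value; }
    };

    namespace boost { namespace mpi {
      // Present only in the Datatypes variant: it tells Boost.MPI to
      // build a derived MPI datatype for Char once, instead of
      // running the serialization code for every value sent.
      template<>
      struct is_mpi_datatype<Char> : mpl::true_ { };
    } }
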
The actual tests were performed on the Odin cluster in the
[@http://www.cs.indiana.edu/ Department of Computer Science] at
[@http://www.iub.edu Indiana University], which contains 128 nodes
connected via InfiniBand. Each node contains 4GB of memory and two AMD
Opteron processors. The NetPIPE benchmarks were compiled with Intel's
C++ Compiler, version 9.0, Boost 1.35.0 (prerelease), and
[@http://www.open-mpi.org/ Open MPI] version 1.1. The NetPIPE results
follow:

[$../../libs/mpi/doc/netpipe.png]

There are some observations we can make about these NetPIPE
results. First, the top two plots show that Boost.MPI performs
on par with MPI for fundamental types. The next two plots show that
Boost.MPI performs on par with MPI for derived data types, even though
Boost.MPI provides a much more abstract, completely transparent
approach to building derived data types than raw MPI. Overall
performance for derived data types is significantly worse than for
fundamental data types, but the bottleneck is in the underlying MPI
implementation itself. Finally, when forcing Boost.MPI to serialize
characters individually, performance suffers greatly. This particular
instance is the worst possible case for Boost.MPI, because we are
serializing millions of individual characters. Overall, the
additional abstraction provided by Boost.MPI does not impair its
performance.

[endsect]

[section:history Revision History]

* *Boost 1.36.0*:
  * Support for non-blocking operations in Python, from Andreas Klöckner

* *Boost 1.35.0*: Initial release, containing the following post-review changes:
  * Support for arrays in all collective operations
  * Support for default-construction of [classref boost::mpi::environment environment]

* *2006-09-21*: Boost.MPI accepted into Boost.

[endsect:history]

[section:acknowledge Acknowledgments]
Boost.MPI was developed with support from Zürcher Kantonalbank. Daniel
Egloff and Michael Gauckler contributed many ideas to Boost.MPI's
design, particularly the abstractions for MPI data types and the
novel skeleton/content mechanism for large data structures. Prabhanjan
(Anju) Kambadur developed the predecessor to Boost.MPI that proved
the usefulness of the Serialization library in an MPI setting and the
performance benefits of specialization in a C++ abstraction layer for
MPI. Jeremy Siek managed the formal review of Boost.MPI.

[endsect:acknowledge]
