.. Copyright (C) 2004-2008 The Trustees of Indiana University.
   Use, modification and distribution is subject to the Boost Software
   License, Version 1.0. (See accompanying file LICENSE_1_0.txt or copy at
   http://www.boost.org/LICENSE_1_0.txt)

===============================================
An Overview of the Parallel Boost Graph Library
===============================================

.. image:: ../graph.png
   :width: 206
   :height: 184
   :alt: An example graph
   :align: right

The Parallel Boost Graph Library (Parallel BGL) is a C++ library for
parallel, distributed computation on graphs. The Parallel BGL contains
distributed graph data structures, distributed graph algorithms,
abstractions over the communication medium (such as MPI), and
supporting data structures. A graph (also called a *network*) consists
of a set of *vertices* and a set of relationships between vertices,
called *edges*. The edges may be *undirected*, meaning that the
relationship between vertices is mutual, e.g., "X is related to Y", or
they can be *directed*, meaning that the relationship goes only one
way, e.g., "X is the child of Y". The following figure illustrates a
typical directed graph, where *a-i* are the vertices and the arrows
represent edges.

.. image:: ../distributed-graph.png
   :width: 229
   :height: 199
   :alt: A distributed graph
   :align: right

The Parallel BGL is primarily concerned with *distributed*
graphs. Distributed graphs are conceptually graphs, but their storage
is spread across multiple processors. The following figure
demonstrates a distributed version of the graph above, where the graph
has been divided among three processors (represented by the grey
rectangles). Edges in the graph may be either local (with both
endpoints stored on the same processor) or remote (the target of the
edge is stored on a different processor).

The Parallel BGL is a generic library. At its core are *generic*
distributed graph algorithms, which can operate on any distributed
graph data structure provided that the data structure meets certain
requirements. For instance, an algorithm may need to enumerate the
set of vertices stored on the current processor, enumerate the set of
outgoing edges from a particular vertex, and determine on which
processor the target of each edge resides. The graph algorithms in the
Parallel BGL are also generic with respect to the *properties*
attached to edges and vertices in a graph; for instance, the weight of
each edge can be stored as part of the graph or allocated in a
completely separate data structure.

The genericity available in the algorithms of the Parallel BGL allows
them to be applied to existing graph data structures. However, most
users will instead write new code that takes advantage of the
Parallel BGL. The Parallel BGL provides distributed graph data
structures that meet the requirements of the Parallel BGL
algorithms. The primary data structure is the `distributed adjacency
list`_, which allows storage and manipulation of a (distributed)
graph. The vertices in the graph are divided among the various
processors, and the edges outgoing from a vertex are stored on
the processor that "owns" (stores) that vertex. The following figure
illustrates the distributed adjacency list representation.

.. image:: ../dist-adjlist.png
   :width: 446
   :height: 154
   :alt: A distributed adjacency list
   :align: center

.. image:: ../dist-pmap.png
   :width: 271
   :height: 175
   :alt: A distributed property map
   :align: right

The `distributed adjacency list`_ distributes the structure of a graph
over multiple processors. While graph structure is an important part
of many graph problems, there are typically other properties attached
to the vertices and edges, such as edge weights or the position of
vertices within a grid. These properties are stored in *property
maps*, which associate a single piece of data with each edge or vertex
in a graph. Distributed property maps extend this notion to
distributed computing, where properties are stored on the same
processor as the vertex or edge they describe. The following figure
illustrates the distribution of a property map storing colors (white,
gray, black) for each vertex. In addition to the storage for each
vertex, the processors store some "ghost cells" that cache values
actually stored on other processors, represented by the dashed boxes.

Tying together all of the distributed data structures of the Parallel
BGL are its process groups and distributed graph algorithms. Process
groups coordinate the interactions between multiple processes and
distributed data structures by abstracting the communication
mechanism. The algorithms are typically written using the SPMD model
(Single Program, Multiple Data) and interact with both the distributed
data structures and the process group itself. At various points in the
algorithm's execution, all processes reach a synchronization point,
which allows all of the distributed data structures to ensure an
appropriate degree of consistency across processes. The following
diagram illustrates the communication patterns within the execution of
a distributed algorithm in the Parallel BGL. In particular, the
diagram illustrates the distributed data structures used in a
distributed breadth-first search, from the top-left and proceeding
clockwise:

- a user-defined property map that tracks the distance from the
  source vertex to all other vertices,

- an automatically-generated property map that tracks the "color"
  of vertices in the (distributed) graph, to determine which
  vertices have been seen before,

- a distributed queue, which coordinates the breadth-first search
  and distributes new vertices to search, and

- a distributed graph, on which the breadth-first search is
  operating.

.. image:: ../architecture.png
   :width: 485
   :height: 410
   :alt: Parallel Boost Graph Library architecture
   :align: center
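To make the preceding pieces concrete, the sketches below show how
they look in code. First, a minimal sketch of defining a distributed
graph, assuming an MPI build of the library; ``mpi_process_group`` and
``distributedS`` are the MPI process group and distribution selector
shipped with the Parallel BGL, while the vertex count and the single
edge are illustrative choices::

  // A directed graph whose vertices are distributed over MPI processes.
  #include <boost/graph/use_mpi.hpp>   // enable the distributed BGL pieces
  #include <boost/mpi/environment.hpp>
  #include <boost/graph/distributed/mpi_process_group.hpp>
  #include <boost/graph/distributed/adjacency_list.hpp>

  using boost::graph::distributed::mpi_process_group;

  // distributedS<mpi_process_group, vecS> divides the vertex set among
  // the processes; each process stores its vertices and their out-edges.
  typedef boost::adjacency_list<
      boost::vecS,
      boost::distributedS<mpi_process_group, boost::vecS>,
      boost::directedS>
    Graph;

  int main(int argc, char* argv[])
  {
    boost::mpi::environment env(argc, argv);  // initialize MPI
    mpi_process_group pg;
    Graph g(9, pg);  // nine vertices (a-i), divided among the processes

    // SPMD: every process runs this code, so only one of them requests
    // the edge; if the source vertex is remote, the request is routed
    // to its owner and delivered at the next synchronization.
    if (process_id(pg) == 0)
      add_edge(vertex(0, g), vertex(1, g), g);
    synchronize(pg);

    return 0;
  }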
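Properties follow the same pattern. A sketch, assuming the weights are
wanted as an *interior* property of the graph: attaching
``property<edge_weight_t, int>`` to the graph type makes
``get(edge_weight, g)`` return a distributed property map whose values
live with each edge's owning process (the example edges and weights
are arbitrary)::

  #include <boost/graph/use_mpi.hpp>
  #include <boost/mpi/environment.hpp>
  #include <boost/graph/distributed/mpi_process_group.hpp>
  #include <boost/graph/distributed/adjacency_list.hpp>
  #include <boost/graph/iteration_macros.hpp>

  using boost::graph::distributed::mpi_process_group;

  // Edge weights stored inside the graph, so the property is
  // distributed exactly as the edges themselves are.
  typedef boost::adjacency_list<
      boost::vecS,
      boost::distributedS<mpi_process_group, boost::vecS>,
      boost::directedS,
      boost::no_property,                           // no vertex properties
      boost::property<boost::edge_weight_t, int> >  // interior edge weights
    WeightedGraph;

  int main(int argc, char* argv[])
  {
    boost::mpi::environment env(argc, argv);
    mpi_process_group pg;
    WeightedGraph g(4, pg);

    if (process_id(pg) == 0) {
      add_edge(vertex(0, g), vertex(1, g), g);
      add_edge(vertex(1, g), vertex(2, g), g);
    }
    synchronize(pg);  // deliver remotely added edges

    // The interior property map is itself distributed.
    boost::property_map<WeightedGraph, boost::edge_weight_t>::type
      weight = get(boost::edge_weight, g);

    // Each process writes the weights of the edges it stores locally;
    // reads of remote values would be cached in the ghost cells.
    BGL_FORALL_EDGES(e, g, WeightedGraph)
      put(weight, e, 1);

    return 0;
  }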
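Finally, a sketch of the distributed breadth-first search pictured in
the diagram. With a distributed graph, the ordinary
``breadth_first_search`` call dispatches to the distributed
implementation, and the color map and distributed queue from the
diagram are created automatically; the distance map here is an
interior ``vertex_distance`` property filled in by a standard BGL
distance recorder, and the particular edges are again illustrative::

  #include <cstddef>
  #include <boost/graph/use_mpi.hpp>
  #include <boost/mpi/environment.hpp>
  #include <boost/graph/distributed/mpi_process_group.hpp>
  #include <boost/graph/distributed/adjacency_list.hpp>
  #include <boost/graph/breadth_first_search.hpp>
  #include <boost/graph/visitors.hpp>

  using boost::graph::distributed::mpi_process_group;

  // The interior vertex_distance property plays the role of the
  // user-defined distance map from the diagram.
  typedef boost::adjacency_list<
      boost::vecS,
      boost::distributedS<mpi_process_group, boost::vecS>,
      boost::undirectedS,
      boost::property<boost::vertex_distance_t, std::size_t> >
    Graph;

  int main(int argc, char* argv[])
  {
    boost::mpi::environment env(argc, argv);
    mpi_process_group pg;
    Graph g(9, pg);  // vertices a-i from the earlier figures

    if (process_id(pg) == 0) {  // one process requests the edges
      add_edge(vertex(0, g), vertex(1, g), g);
      add_edge(vertex(0, g), vertex(2, g), g);
      add_edge(vertex(2, g), vertex(8, g), g);
    }
    synchronize(pg);

    // Every process makes the same (SPMD) call; distances are recorded
    // on each vertex's owning process as tree edges are discovered.
    breadth_first_search(
        g, vertex(0, g),
        boost::visitor(boost::make_bfs_visitor(boost::record_distances(
            get(boost::vertex_distance, g), boost::on_tree_edge()))));

    return 0;
  }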
----------------------------------------------------------------------------

Copyright (C) 2005 The Trustees of Indiana University.

Authors: Douglas Gregor and Andrew Lumsdaine

.. _Distributed adjacency list: distributed_adjacency_list.html
.. _Process groups: