1.. Copyright (C) 2004-2008 The Trustees of Indiana University. 2 Use, modification and distribution is subject to the Boost Software 3 License, Version 1.0. (See accompanying file LICENSE_1_0.txt or copy at 4 http://www.boost.org/LICENSE_1_0.txt) 5 6============================ 7|Logo| Minimum Spanning Tree 8============================ 9 10The Parallel BGL contains four `minimum spanning tree`_ (MST) 11algorithms [DG98]_ for use on undirected, weighted, distributed 12graphs. The graphs need not be connected: each algorithm will compute 13a minimum spanning forest (MSF) when provided with a disconnected 14graph. 15 16The interface to each of the four algorithms is similar to the 17implementation of 'Kruskal's algorithm'_ in the sequential BGL. Each 18accepts, at a minimum, a graph, a weight map, and an output 19iterator. The edges of the MST (or MSF) will be output via the output 20iterator on process 0: other processes may receive edges on their 21output iterators, but the set may not be complete, depending on the 22algorithm. The algorithm parameters are documented together, because 23they do not vary greatly. See the section `Selecting an MST 24algorithm`_ for advice on algorithm selection. 25 26The graph itself must model the `Vertex List Graph`_ concept and the 27Distributed Edge List Graph concept. Since the most common 28distributed graph structure, the `distributed adjacency list`_, only 29models the Distributed Vertex List Graph concept, it may only be used 30with these algorithms when wrapped in a suitable adaptor, such as the 31`vertex_list_adaptor`_. 32 33.. contents:: 34 35Where Defined 36------------- 37<``boost/graph/distributed/dehne_gotz_min_spanning_tree.hpp``> 38 39Parameters 40---------- 41 42IN: ``Graph& g`` 43 The graph type must be a model of `Vertex List Graph`_ and 44 `Distributed Edge List Graph`_. 45 46 47 48IN/OUT: ``WeightMap weight`` 49 The weight map must be a `Distributed Property Map`_ and a `Readable 50 Property Map`_ whose key type is the edge descriptor of the graph 51 and whose value type is numerical. 52 53 54 55IN/OUT: ``OutputIterator out`` 56 The output iterator through which the edges of the MSF will be 57 written. Must be capable of accepting edge descriptors for output. 58 59 60 61 62IN: ``VertexIndexMap index`` 63 A mapping from vertex descriptors to indices in the range *[0, 64 num_vertices(g))*. This must be a `Readable Property Map`_ whose 65 key type is a vertex descriptor and whose value type is an integral 66 type, typically the ``vertices_size_type`` of the graph. 67 68 **Default:** ``get(vertex_index, g)`` 69 70 71IN/UTIL: ``RankMap rank_map`` 72 Stores the rank of each vertex, which is used for maintaining 73 union-find data structures. This must be a `Read/Write Property Map`_ 74 whose key type is a vertex descriptor and whose value type is an 75 integral type. 76 77 **Default:** An ``iterator_property_map`` built from an STL vector 78 of the ``vertices_size_type`` of the graph and the vertex index map. 79 80 81IN/UTIL: ``ParentMap parent_map`` 82 Stores the parent (representative) of each vertex, which is used for 83 maintaining union-find data structures. This must be a `Read/Write 84 Property Map`_ whose key type is a vertex descriptor and whose value 85 type is also a vertex descriptor. 86 87 **Default:** An ``iterator_property_map`` built from an STL vector 88 of the ``vertex_descriptor`` of the graph and the vertex index map. 89 90 91IN/UTIL: ``SupervertexMap supervertex_map`` 92 Stores the supervertex index of each vertex, which is used for 93 maintaining the supervertex list data structures. This must be a 94 `Read/Write Property Map`_ whose key type is a vertex descriptor and 95 whose value type is an integral type. 96 97 **Default:** An ``iterator_property_map`` built from an STL vector 98 of the ``vertices_size_type`` of the graph and the vertex index map. 99 100 101 102``dense_boruvka_minimum_spanning_tree`` 103--------------------------------------- 104 105:: 106 107 namespace graph { 108 template<typename Graph, typename WeightMap, typename OutputIterator, 109 typename VertexIndexMap, typename RankMap, typename ParentMap, 110 typename SupervertexMap> 111 OutputIterator 112 dense_boruvka_minimum_spanning_tree(const Graph& g, WeightMap weight_map, 113 OutputIterator out, 114 VertexIndexMap index, 115 RankMap rank_map, ParentMap parent_map, 116 SupervertexMap supervertex_map); 117 118 template<typename Graph, typename WeightMap, typename OutputIterator, 119 typename VertexIndex> 120 OutputIterator 121 dense_boruvka_minimum_spanning_tree(const Graph& g, WeightMap weight_map, 122 OutputIterator out, VertexIndex index); 123 124 template<typename Graph, typename WeightMap, typename OutputIterator> 125 OutputIterator 126 dense_boruvka_minimum_spanning_tree(const Graph& g, WeightMap weight_map, 127 OutputIterator out); 128 } 129 130Description 131~~~~~~~~~~~ 132 133The dense Boruvka distributed minimum spanning tree algorithm is a 134direct parallelization of the sequential MST algorithm by 135Boruvka. The algorithm first creates a *supervertex* out of each 136vertex. Then, in each iteration, it finds the smallest-weight edge 137incident to each vertex, collapses supervertices along these edges, 138and removals all self loops. The only difference between the 139sequential and parallel algorithms is that the parallel algorithm 140performs an all-reduce operation so that all processes have the 141global minimum set of edges. 142 143Unlike the other three algorithms, this algorithm emits the complete 144list of edges in the minimum spanning forest via the output iterator 145on all processes. It may therefore be more useful than the others 146when parallelizing sequential BGL programs. 147 148Complexity 149~~~~~~~~~~ 150 151The distributed algorithm requires *O(log n)* BSP supersteps, each of 152which requires *O(m/p + n)* time and *O(n)* communication per 153process. 154 155Performance 156~~~~~~~~~~~ 157 158The following charts illustrate the performance of this algorithm on 159various random graphs. We see that the algorithm scales well up to 64 160or 128 processors, depending on the type of graph, for dense 161graphs. However, for sparse graphs performance tapers off as the 162number of processors surpases *m/n*, i.e., the average degree (which 163is 30 for this graph). This behavior is expected from the algorithm. 164 165.. image:: http://www.osl.iu.edu/research/pbgl/performance/chart.php?generator=ER,SF,SW&dataset=TimeSparse&columns=5 166 :align: left 167.. image:: http://www.osl.iu.edu/research/pbgl/performance/chart.php?generator=ER,SF,SW&dataset=TimeSparse&columns=5&speedup=1 168 169.. image:: http://www.osl.iu.edu/research/pbgl/performance/chart.php?generator=ER,SF,SW&dataset=TimeDense&columns=5 170 :align: left 171.. image:: http://www.osl.iu.edu/research/pbgl/performance/chart.php?generator=ER,SF,SW&dataset=TimeDense&columns=5&speedup=1 172 173``merge_local_minimum_spanning_trees`` 174-------------------------------------- 175 176:: 177 178 namespace graph { 179 template<typename Graph, typename WeightMap, typename OutputIterator, 180 typename VertexIndexMap> 181 OutputIterator 182 merge_local_minimum_spanning_trees(const Graph& g, WeightMap weight, 183 OutputIterator out, 184 VertexIndexMap index); 185 186 template<typename Graph, typename WeightMap, typename OutputIterator> 187 inline OutputIterator 188 merge_local_minimum_spanning_trees(const Graph& g, WeightMap weight, 189 OutputIterator out); 190 } 191 192Description 193~~~~~~~~~~~ 194 195The merging local MSTs algorithm operates by computing minimum 196spanning forests from the edges stored on each process. Then the 197processes merge their edge lists along a tree. The child nodes cease 198participating in the computation, but the parent nodes recompute MSFs 199from the newly acquired edges. In the final round, the root of the 200tree computes the global MSFs, having received candidate edges from 201every other process via the tree. 202 203Complexity 204~~~~~~~~~~ 205 206This algorithm requires *O(log_D p)* BSP supersteps (where *D* is the 207number of children in the tree, and is currently fixed at 3). Each 208superstep requires *O((m/p) log (m/p) + n)* time and *O(m/p)* 209communication per process. 210 211Performance 212~~~~~~~~~~~ 213 214The following charts illustrate the performance of this algorithm on 215various random graphs. The algorithm only scales well for very dense 216graphs, where most of the work is performed in the initial stage and 217there is very little work in the later stages. 218 219.. image:: http://www.osl.iu.edu/research/pbgl/performance/chart.php?generator=ER,SF,SW&dataset=TimeSparse&columns=6 220 :align: left 221.. image:: http://www.osl.iu.edu/research/pbgl/performance/chart.php?generator=ER,SF,SW&dataset=TimeSparse&columns=6&speedup=1 222 223.. image:: http://www.osl.iu.edu/research/pbgl/performance/chart.php?generator=ER,SF,SW&dataset=TimeDense&columns=6 224 :align: left 225.. image:: http://www.osl.iu.edu/research/pbgl/performance/chart.php?generator=ER,SF,SW&dataset=TimeDense&columns=6&speedup=1 226 227 228``boruvka_then_merge`` 229---------------------- 230 231:: 232 233 namespace graph { 234 template<typename Graph, typename WeightMap, typename OutputIterator, 235 typename VertexIndexMap, typename RankMap, typename ParentMap, 236 typename SupervertexMap> 237 OutputIterator 238 boruvka_then_merge(const Graph& g, WeightMap weight, OutputIterator out, 239 VertexIndexMap index, RankMap rank_map, 240 ParentMap parent_map, SupervertexMap 241 supervertex_map); 242 243 template<typename Graph, typename WeightMap, typename OutputIterator, 244 typename VertexIndexMap> 245 inline OutputIterator 246 boruvka_then_merge(const Graph& g, WeightMap weight, OutputIterator out, 247 VertexIndexMap index); 248 249 template<typename Graph, typename WeightMap, typename OutputIterator> 250 inline OutputIterator 251 boruvka_then_merge(const Graph& g, WeightMap weight, OutputIterator out); 252 } 253 254Description 255~~~~~~~~~~~ 256 257This algorithm applies both Boruvka steps and local MSF merging steps 258together to achieve better asymptotic performance than either 259algorithm alone. It first executes Boruvka steps until only *n/(log_d 260p)^2* supervertices remain, then completes the MSF computation by 261performing local MSF merging on the remaining edges and 262supervertices. 263 264Complexity 265~~~~~~~~~~ 266 267This algorithm requires *log_D p* + *log log_D p* BSP supersteps. The 268time required by each superstep depends on the type of superstep 269being performed; see the distributed Boruvka or merging local MSFS 270algorithms for details. 271 272Performance 273~~~~~~~~~~~ 274 275The following charts illustrate the performance of this algorithm on 276various random graphs. We see that the algorithm scales well up to 64 277or 128 processors, depending on the type of graph, for dense 278graphs. However, for sparse graphs performance tapers off as the 279number of processors surpases *m/n*, i.e., the average degree (which 280is 30 for this graph). This behavior is expected from the algorithm. 281 282.. image:: http://www.osl.iu.edu/research/pbgl/performance/chart.php?generator=ER,SF,SW&dataset=TimeSparse&columns=7 283 :align: left 284.. image:: http://www.osl.iu.edu/research/pbgl/performance/chart.php?generator=ER,SF,SW&dataset=TimeSparse&columns=7&speedup=1 285 286.. image:: http://www.osl.iu.edu/research/pbgl/performance/chart.php?generator=ER,SF,SW&dataset=TimeDense&columns=7 287 :align: left 288.. image:: http://www.osl.iu.edu/research/pbgl/performance/chart.php?generator=ER,SF,SW&dataset=TimeDense&columns=7&speedup=1 289 290``boruvka_mixed_merge`` 291----------------------- 292 293:: 294 295 namespace { 296 template<typename Graph, typename WeightMap, typename OutputIterator, 297 typename VertexIndexMap, typename RankMap, typename ParentMap, 298 typename SupervertexMap> 299 OutputIterator 300 boruvka_mixed_merge(const Graph& g, WeightMap weight, OutputIterator out, 301 VertexIndexMap index, RankMap rank_map, 302 ParentMap parent_map, SupervertexMap 303 supervertex_map); 304 305 template<typename Graph, typename WeightMap, typename OutputIterator, 306 typename VertexIndexMap> 307 inline OutputIterator 308 boruvka_mixed_merge(const Graph& g, WeightMap weight, OutputIterator out, 309 VertexIndexMap index); 310 311 template<typename Graph, typename WeightMap, typename OutputIterator> 312 inline OutputIterator 313 boruvka_mixed_merge(const Graph& g, WeightMap weight, OutputIterator out); 314 } 315 316Description 317~~~~~~~~~~~ 318 319This algorithm applies both Boruvka steps and local MSF merging steps 320together to achieve better asymptotic performance than either method 321alone. In each iteration, the algorithm first performs a Boruvka step 322and then merges the local MSFs computed based on the supervertex 323graph. 324 325Complexity 326~~~~~~~~~~ 327 328This algorithm requires *log_D p* BSP supersteps. The 329time required by each superstep depends on the type of superstep 330being performed; see the distributed Boruvka or merging local MSFS 331algorithms for details. However, the algorithm is 332communication-optional (requiring *O(n)* communication overall) when 333the graph is sufficiently dense, i.e., *m/n >= p*. 334 335Performance 336~~~~~~~~~~~ 337 338The following charts illustrate the performance of this algorithm on 339various random graphs. We see that the algorithm scales well up to 64 340or 128 processors, depending on the type of graph, for dense 341graphs. However, for sparse graphs performance tapers off as the 342number of processors surpases *m/n*, i.e., the average degree (which 343is 30 for this graph). This behavior is expected from the algorithm. 344 345.. image:: http://www.osl.iu.edu/research/pbgl/performance/chart.php?generator=ER,SF,SW&dataset=TimeSparse&columns=8 346 :align: left 347.. image:: http://www.osl.iu.edu/research/pbgl/performance/chart.php?generator=ER,SF,SW&dataset=TimeSparse&columns=8&speedup=1 348 349.. image:: http://www.osl.iu.edu/research/pbgl/performance/chart.php?generator=ER,SF,SW&dataset=TimeDense&columns=8 350 :align: left 351.. image:: http://www.osl.iu.edu/research/pbgl/performance/chart.php?generator=ER,SF,SW&dataset=TimeDense&columns=8&speedup=1 352 353 354Selecting an MST algorithm 355-------------------------- 356 357Dehne and Gotz reported [DG98]_ mixed results when evaluating these 358four algorithms. No particular algorithm was clearly better than the 359others in all cases. However, the asymptotically best algorithm 360(``boruvka_mixed_merge``) seemed to perform more poorly in their tests 361than the other merging-based algorithms. The following performance 362charts illustrate the performance of these four minimum spanning tree 363implementations. 364 365Overall, ``dense_boruvka_minimum_spanning_tree`` gives the most 366consistent performance and scalability for the graphs we 367tested. Additionally, it may be more suitable for sequential programs 368that are being parallelized, because it emits complete MSF edge lists 369via the output iterators in every process. 370 371Performance on Sparse Graphs 372~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 373.. image:: http://www.osl.iu.edu/research/pbgl/performance/chart.php?generator=ER&dataset=TimeSparse&columns=5,6,7,8 374 :align: left 375.. image:: http://www.osl.iu.edu/research/pbgl/performance/chart.php?generator=ER&dataset=TimeSparse&columns=5,6,7,8&speedup=1 376 377.. image:: http://www.osl.iu.edu/research/pbgl/performance/chart.php?generator=SF&dataset=TimeSparse&columns=5,6,7,8 378 :align: left 379.. image:: http://www.osl.iu.edu/research/pbgl/performance/chart.php?generator=SF&dataset=TimeSparse&columns=5,6,7,8&speedup=1 380 381.. image:: http://www.osl.iu.edu/research/pbgl/performance/chart.php?generator=SW&dataset=TimeSparse&columns=5,6,7,8 382 :align: left 383.. image:: http://www.osl.iu.edu/research/pbgl/performance/chart.php?generator=SW&dataset=TimeSparse&columns=5,6,7,8&speedup=1 384 385Performance on Dense Graphs 386~~~~~~~~~~~~~~~~~~~~~~~~~~~ 387.. image:: http://www.osl.iu.edu/research/pbgl/performance/chart.php?generator=ER&dataset=TimeDense&columns=5,6,7,8 388 :align: left 389.. image:: http://www.osl.iu.edu/research/pbgl/performance/chart.php?generator=ER&dataset=TimeDense&columns=5,6,7,8&speedup=1 390 391.. image:: http://www.osl.iu.edu/research/pbgl/performance/chart.php?generator=SF&dataset=TimeDense&columns=5,6,7,8 392 :align: left 393.. image:: http://www.osl.iu.edu/research/pbgl/performance/chart.php?generator=SF&dataset=TimeDense&columns=5,6,7,8&speedup=1 394 395.. image:: http://www.osl.iu.edu/research/pbgl/performance/chart.php?generator=SW&dataset=TimeDense&columns=5,6,7,8 396 :align: left 397.. image:: http://www.osl.iu.edu/research/pbgl/performance/chart.php?generator=SW&dataset=TimeDense&columns=5,6,7,8&speedup=1 398 399----------------------------------------------------------------------------- 400 401Copyright (C) 2004 The Trustees of Indiana University. 402 403Authors: Douglas Gregor and Andrew Lumsdaine 404 405.. |Logo| image:: pbgl-logo.png 406 :align: middle 407 :alt: Parallel BGL 408 :target: http://www.osl.iu.edu/research/pbgl 409 410.. _minimum spanning tree: http://www.boost.org/libs/graph/doc/graph_theory_review.html#sec:minimum-spanning-tree 411.. _Kruskal's algorithm: http://www.boost.org/libs/graph/doc/kruskal_min_spanning_tree.html 412.. _Vertex list graph: http://www.boost.org/libs/graph/doc/VertexListGraph.html 413.. _distributed adjacency list: distributed_adjacency_list.html 414.. _vertex_list_adaptor: vertex_list_adaptor.html 415.. _Distributed Edge List Graph: DistributedEdgeListGraph.html 416.. _Distributed property map: distributed_property_map.html 417.. _Readable Property Map: http://www.boost.org/libs/property_map/ReadablePropertyMap.html 418.. _Read/Write Property Map: http://www.boost.org/libs/property_map/ReadWritePropertyMap.html 419 420.. [DG98] Frank Dehne and Silvia Gotz. *Practical Parallel Algorithms 421 for Minimum Spanning Trees*. In Symposium on Reliable Distributed Systems, 422 pages 366--371, 1998. 423 424