1.. Copyright (C) 2004-2008 The Trustees of Indiana University. 2 Use, modification and distribution is subject to the Boost Software 3 License, Version 1.0. (See accompanying file LICENSE_1_0.txt or copy at 4 http://www.boost.org/LICENSE_1_0.txt) 5 6=========================== 7|Logo| Connected Components 8=========================== 9 10:: 11 12 namespace graph { 13 // Default constructed ParentMap 14 template<typename Graph, typename ComponentMap, typename ParentMap> 15 typename property_traits<ComponentMap>::value_type 16 connected_components( const Graph& g, ComponentMap c); 17 18 // User supplied ParentMap 19 template<typename Graph, typename ComponentMap, typename ParentMap> 20 typename property_traits<ComponentMap>::value_type 21 connected_components( const Graph& g, ComponentMap c, ParentMap p); 22 } 23 24The ``connected_components()`` function computes the connected 25components of an undirected graph. The distributed connected 26components algorithm uses the sequential version of the connected 27components algorithm to compute the connected components of the local 28subgraph, then executes the parallel phase of the algorithm. The 29parallel portion of the connected components algorithm is loosely 30based on the work of Goddard, Kumar, and Prins. The interface is a 31superset of the interface to the BGL `sequential connected 32components`_ algorithm. 33 34Prior to executing the sequential phase of the algorithm, each process 35identifies the roots of its local components. An adjacency list of 36all vertices adjacent to members of the component is then constructed 37at the root vertex of each component. 38 39The parallel phase of the distributed connected components algorithm 40consists of a series of supersteps. In each superstep, each root 41attempts to hook to a member of it's adjacency list by assigning it's 42parent pointer to that vertex. Hooking is restricted to vertices 43which are logically less than the current vertex to prevent looping. 44Vertices which hook successfully are removed from the list of roots 45and placed on another list of completed vertices. All completed 46vertices now execute a pointer jumping step until every completed 47vertex has as its parent the root of a component. This pointer 48jumping step may be further optimized by the addition of Cycle 49Reduction (CR) rules developed by Johnson and Metaxas, however current 50performance evaluations indicate that this would have a negligible 51impact on the overall performance of the algorithm. These CR rules 52reduce the number of pointer jumping steps from *O(n)* to *O(log n)*. 53Following this pointer jumping step, roots which have hooked in this 54phase transmit their adjacency list to their new parent. The 55remaining roots receive these edges and execute a pruning step on 56their adjacency lists to remove vertices that are now members of their 57component. The parallel phase of the algorithm is complete when no 58root successfully hooks. Once the parallel phase is complete a final 59pointer jumping step is performed on all vertices to assign the parent 60pointers of the leaves of the initial local subgraph components to 61their final parent which has now been determined. 62 63The single largest performance bottleneck in the distributed connected 64components algorithm is the effect of poor vertex distribution on the 65algorithm. For sparse graphs with a single large component, many 66roots may hook to the same component, resulting in severe load 67imbalance at the process owning this component. Several methods of 68modifying the hooking strategy to avoid this behavior have been 69implemented but none has been successful as of yet. 70 71.. contents:: 72 73Where Defined 74------------- 75<``boost/graph/connected_components.hpp``> 76 77Parameters 78---------- 79 80IN: ``Graph& g`` 81 The graph typed must be a model of `Distributed Graph`_. 82 83OUT: ``ComponentMap c`` 84 The algorithm computes how many connected components are in the 85 graph, and assigns each component an integer label. The algorithm 86 then records to which component each vertex in the graph belongs by 87 recording the component number in the component property map. The 88 ``ComponentMap`` type must be a `Distributed Property Map`_. The 89 value type must be the ``vertices_size_type`` of the graph. The key 90 type must be the graph's vertex descriptor type. If you do not wish 91 to compute component numbers, pass ``dummy_property_map`` as the 92 component map and parent information will be provided in the parent 93 map. 94 95UTIL: ``ParentMap p`` 96 A parent map may be supplied to the algorithm, if not supplied the 97 parent map will be constructed automatically. The ``ParentMap`` type 98 must be a `Distributed Property Map`_. The value type and key type 99 must be the graph's vertex descriptor type. 100 101OUT: ``property_traits<ComponentMap>::value_type`` 102 The number of components found will be returned as the value type of 103 the component map. 104 105Complexity 106---------- 107 108The local phase of the algorithm is *O(V + E)*. The parallel phase of 109the algorithm requires at most *O(d)* supersteps where *d* is the 110number of initial roots. *d* is at most *O(V)* but becomes 111significantly smaller as *E* increases. The pointer jumping phase 112within each superstep requires at most *O(c)* steps on each of the 113completed roots where *c* is the length of the longest cycle. 114Application of CR rules can reduce this to *O(log c)*. 115 116Performance 117----------- 118The following charts illustrate the performance of the Parallel BGL 119connected components algorithm. It performs well on very sparse and 120very dense graphs. However, for cases where the graph has a medium 121density with a giant connected component, the algorithm will perform 122poorly. This is a known problem with the algorithm and as far as we 123know all implemented algorithms have this degenerate behavior. 124 125.. image:: http://www.osl.iu.edu/research/pbgl/performance/chart.php?generator=ER,SF,SW&dataset=TimeSparse&columns=9 126 :align: left 127.. image:: http://www.osl.iu.edu/research/pbgl/performance/chart.php?generator=ER,SF,SW&dataset=TimeSparse&columns=9&speedup=1 128 129.. image:: http://www.osl.iu.edu/research/pbgl/performance/chart.php?generator=ER,SF,SW&dataset=TimeDense&columns=9 130 :align: left 131.. image:: http://www.osl.iu.edu/research/pbgl/performance/chart.php?generator=ER,SF,SW&dataset=TimeDense&columns=9&speedup=1 132 133 134----------------------------------------------------------------------------- 135 136Copyright (C) 2004 The Trustees of Indiana University. 137 138Authors: Nick Edmonds, Douglas Gregor, and Andrew Lumsdaine 139 140.. |Logo| image:: pbgl-logo.png 141 :align: middle 142 :alt: Parallel BGL 143 :target: http://www.osl.iu.edu/research/pbgl 144 145.. _Sequential connected components: http://www.boost.org/libs/graph/doc/connected_components.html 146.. _Distributed Graph: DistributedGraph.html 147.. _Distributed Property Map: distributed_property_map.html 148