• Home
  • Line#
  • Scopes#
  • Navigate#
  • Raw
  • Download
1.. Copyright (C) 2004-2008 The Trustees of Indiana University.
2   Use, modification and distribution is subject to the Boost Software
3   License, Version 1.0. (See accompanying file LICENSE_1_0.txt or copy at
4   http://www.boost.org/LICENSE_1_0.txt)
5
6===========================
7|Logo| Connected Components
8===========================
9
10::
11
12  namespace graph {
13    // Default constructed ParentMap
14    template<typename Graph, typename ComponentMap, typename ParentMap>
15    typename property_traits<ComponentMap>::value_type
16    connected_components( const Graph& g, ComponentMap c);
17
18    // User supplied ParentMap
19    template<typename Graph, typename ComponentMap, typename ParentMap>
20    typename property_traits<ComponentMap>::value_type
21    connected_components( const Graph& g, ComponentMap c, ParentMap p);
22  }
23
24The ``connected_components()`` function computes the connected
25components of an undirected graph.  The distributed connected
26components algorithm uses the sequential version of the connected
27components algorithm to compute the connected components of the local
28subgraph, then executes the parallel phase of the algorithm.  The
29parallel portion of the connected components algorithm is loosely
30based on the work of Goddard, Kumar, and Prins. The interface is a
31superset of the interface to the BGL `sequential connected
32components`_ algorithm.
33
34Prior to executing the sequential phase of the algorithm, each process
35identifies the roots of its local components.  An adjacency list of
36all vertices adjacent to members of the component is then constructed
37at the root vertex of each component.
38
39The parallel phase of the distributed connected components algorithm
40consists of a series of supersteps.  In each superstep, each root
41attempts to hook to a member of it's adjacency list by assigning it's
42parent pointer to that vertex.  Hooking is restricted to vertices
43which are logically less than the current vertex to prevent looping.
44Vertices which hook successfully are removed from the list of roots
45and placed on another list of completed vertices.  All completed
46vertices now execute a pointer jumping step until every completed
47vertex has as its parent the root of a component.  This pointer
48jumping step may be further optimized by the addition of Cycle
49Reduction (CR) rules developed by Johnson and Metaxas, however current
50performance evaluations indicate that this would have a negligible
51impact on the overall performance of the algorithm.  These CR rules
52reduce the number of pointer jumping steps from *O(n)* to *O(log n)*.
53Following this pointer jumping step, roots which have hooked in this
54phase transmit their adjacency list to their new parent.  The
55remaining roots receive these edges and execute a pruning step on
56their adjacency lists to remove vertices that are now members of their
57component.  The parallel phase of the algorithm is complete when no
58root successfully hooks.  Once the parallel phase is complete a final
59pointer jumping step is performed on all vertices to assign the parent
60pointers of the leaves of the initial local subgraph components to
61their final parent which has now been determined.
62
63The single largest performance bottleneck in the distributed connected
64components algorithm is the effect of poor vertex distribution on the
65algorithm.  For sparse graphs with a single large component, many
66roots may hook to the same component, resulting in severe load
67imbalance at the process owning this component.  Several methods of
68modifying the hooking strategy to avoid this behavior have been
69implemented but none has been successful as of yet.
70
71.. contents::
72
73Where Defined
74-------------
75<``boost/graph/connected_components.hpp``>
76
77Parameters
78----------
79
80IN:  ``Graph& g``
81  The graph typed must be a model of `Distributed Graph`_.
82
83OUT:  ``ComponentMap c``
84  The algorithm computes how many connected components are in the
85  graph, and assigns each component an integer label.  The algorithm
86  then records to which component each vertex in the graph belongs by
87  recording the component number in the component property map.  The
88  ``ComponentMap`` type must be a `Distributed Property Map`_.  The
89  value type must be the ``vertices_size_type`` of the graph.  The key
90  type must be the graph's vertex descriptor type. If you do not wish
91  to compute component numbers, pass ``dummy_property_map`` as the
92  component map and parent information will be provided in the parent
93  map.
94
95UTIL:  ``ParentMap p``
96  A parent map may be supplied to the algorithm, if not supplied the
97  parent map will be constructed automatically.  The ``ParentMap`` type
98  must be a `Distributed Property Map`_.  The value type and key type
99  must be the graph's vertex descriptor type.
100
101OUT:  ``property_traits<ComponentMap>::value_type``
102  The number of components found will be returned as the value type of
103  the component map.
104
105Complexity
106----------
107
108The local phase of the algorithm is *O(V + E)*.  The parallel phase of
109the algorithm requires at most *O(d)* supersteps where *d* is the
110number of initial roots.  *d* is at most *O(V)* but becomes
111significantly smaller as *E* increases.  The pointer jumping phase
112within each superstep requires at most *O(c)* steps on each of the
113completed roots where *c* is the length of the longest cycle.
114Application of CR rules can reduce this to *O(log c)*.
115
116Performance
117-----------
118The following charts illustrate the performance of the Parallel BGL
119connected components algorithm. It performs well on very sparse and
120very dense graphs. However, for cases where the graph has a medium
121density with a giant connected component, the algorithm will perform
122poorly. This is a known problem with the algorithm and as far as we
123know all implemented algorithms have this degenerate behavior.
124
125.. image:: http://www.osl.iu.edu/research/pbgl/performance/chart.php?generator=ER,SF,SW&dataset=TimeSparse&columns=9
126  :align: left
127.. image:: http://www.osl.iu.edu/research/pbgl/performance/chart.php?generator=ER,SF,SW&dataset=TimeSparse&columns=9&speedup=1
128
129.. image:: http://www.osl.iu.edu/research/pbgl/performance/chart.php?generator=ER,SF,SW&dataset=TimeDense&columns=9
130  :align: left
131.. image:: http://www.osl.iu.edu/research/pbgl/performance/chart.php?generator=ER,SF,SW&dataset=TimeDense&columns=9&speedup=1
132
133
134-----------------------------------------------------------------------------
135
136Copyright (C) 2004 The Trustees of Indiana University.
137
138Authors: Nick Edmonds, Douglas Gregor, and Andrew Lumsdaine
139
140.. |Logo| image:: pbgl-logo.png
141            :align: middle
142            :alt: Parallel BGL
143            :target: http://www.osl.iu.edu/research/pbgl
144
145.. _Sequential connected components: http://www.boost.org/libs/graph/doc/connected_components.html
146.. _Distributed Graph: DistributedGraph.html
147.. _Distributed Property Map: distributed_property_map.html
148