1<HTML> 2<!-- 3 Copyright (c) Jeremy Siek 2000 4 5 Distributed under the Boost Software License, Version 1.0. 6 (See accompanying file LICENSE_1_0.txt or copy at 7 http://www.boost.org/LICENSE_1_0.txt) 8 --> 9<Head> 10<Title>Kevin Bacon Example</Title> 11<BODY BGCOLOR="#ffffff" LINK="#0000ee" TEXT="#000000" VLINK="#551a8b" 12 ALINK="#ff0000"> 13<IMG SRC="../../../boost.png" 14 ALT="C++ Boost" width="277" height="86"> 15 16<BR Clear> 17 18<H1>Six Degrees of Kevin Bacon</H1> 19 20<P> 21A fun application of graph theory comes up in the popular game ``Six 22Degrees of Kevin Bacon''. The idea of the game is to connect an actor 23to Kevin Bacon through a trail of actors who appeared together in 24movies, and do so in less than six steps. For example, Theodore 25Hesburgh (President Emeritus of the University of Notre Dame) was in 26the movie ``Rudy'' with the actor Gerry Becker, who was in the movie 27``Sleepers'' with Kevin Bacon. Why Kevin Bacon? For some reason the 28three students who invented the game, Mike Ginelli, Craig Fass, and 29Brian Turtle decided that Kevin Bacon is the center of the 30entertainment world. The Kevin Bacon game is quite similar to the <a 31href="http://www.oakland.edu/~grossman/erdoshp.html">Erdös 32Number</a> which has ``been a part of the folklore of 33mathematicians throughout the world for many years''. 34 35<P> 36The ``Six Degrees of Kevin Bacon'' game is really a graph problem. If 37you assign each actor to a vertex, and add an edge between two actors 38if they have appeared together in a movie, then you have a graph that 39represents the data for this game. Then the problem of finding a trail 40of actors to Kevin Bacon becomes a traditional graph problem: that of 41finding a <I>path</I> between two vertices. Since we wish to find a 42path that is shorter than six steps, ideally we would find the 43<I>shortest path</I> between the vertices. One might think to apply 44Dijkstra's shortest path algorithm to this problem, but that would be 45overkill since Dijkstra's algorithm is meant for situations when each 46edge has an associated length (or weight) and the goal is to find the 47path with the shortest cumulative length. Since we are only concerned 48with finding the shortest paths in terms of the number of edges the 49breadth-first search algorithm will solve the problem (and use less 50time than Dijkstra's). 51 52<P> 53 54<H2>Input File and Graph Setup</H2> 55 56<P> 57For this example we will use a much smaller graph than all movies and 58all actors. The full source code for this example is in <a 59href="../example/kevin-bacon.cpp"><TT>examples/kevin-bacon.cpp</TT></a>. 60We have supplied a file <TT>kevin-bacon.dat</TT> which contains a list 61of actor pairs who appeared in the same movie. Each line of the file 62contains an actor's name, a movie, and another actor that appeared in 63the movie. The ``;'' character is used as a separator. Here is an 64excerpt. 65 66<P> 67<PRE> 68Patrick Stewart;Prince of Egypt, The (1998);Steve Martin 69</PRE> 70 71<P> 72Our first task will be to read in the file and create a graph from 73it. We use a <TT>std::ifstream</TT> to input the file. 74 75<P> 76<PRE> 77 std::ifstream datafile("./kevin-bacon.dat"); 78 if (!datafile) { 79 std::cerr << "No ./kevin-bacon.dat file" << std::endl; 80 return EXIT_FAILURE; 81 } 82</PRE> 83 84<P> 85We will want to attach the actor's names to the vertices in the graph, 86and the movie names to the edges. We use the <TT>property</TT> class to 87specify the addition of these vertex and edge properties to the graph. 88We choose to use an undirected graph, because the relationship of 89actors appearing together in a movie is symmetric. 90 91<P> 92<PRE> 93 using namespace boost; 94 typedef adjacency_list<vecS, vecS, undirectedS, 95 property<vertex_name_t, string>, 96 property<edge_name_t, string > > 97 > Graph; 98</PRE> 99 100<P> 101To access the properties, we will need to obtain property map 102objects from the graph. The following code shows how this is done. 103 104<P> 105<PRE> 106 property_map<Graph, vertex_name_t>::type actor_name = get(vertex_name, g); 107 property_map<Graph, edge_name_t>::type connecting_movie = get(edge_name, g); 108</PRE> 109 110<P> 111Next we can start to loop through the file, parsing each line into a 112sequence of tokens using the <a href="../../tokenizer/index.html">Boost 113Tokenizer Library</a>. 114 115<P> 116<PRE> 117 for (std::string line; std::getline(datafile, line);) { 118 char_delimiters_separator<char> sep(false, "", ";"); 119 tokenizer<> line_toks(line, sep); 120 ... 121 } 122</PRE> 123 124<P> 125Each line of the input corresponds to an edge in the graph, with the 126two actors as the vertices incident to the edge, and the movie name 127will be a property attached to the edge. One challenge in creating the 128graph from this format of input is that the edges are specified by a 129pair of actor names. As we add vertices to the graph, we'll need to 130create a map from actor names to their vertices so that later 131appearances of the same actor (on a different edge) can be linked with 132the correct vertex in the graph. The <a 133href="http://www.boost.org/sgi/stl/Map.html"><TT>std::map</TT></a> 134can be used to implement this mapping. 135 136<P> 137<PRE> 138 typedef graph_traits<Graph>::vertex_descriptor Vertex; 139 typedef std::map<string, Vertex> NameVertexMap; 140 NameVertexMap actors; 141</PRE> 142 143<P> 144The first token will be an actor's name. If the actor is not already 145in our actor's map we will add a vertex to the graph, set the name 146property of the vertex, and record the vertex descriptor in the map. 147If the actor is already in the graph, we will retrieve the vertex 148descriptor from the map's <TT>pos</TT> iterator. 149 150<P> 151<PRE> 152 NameVertexMap::iterator pos; 153 bool inserted; 154 Vertex u, v; 155 156 tokenizer<>::iterator i = line_toks.begin(); 157 std::string actors_name = *i++; 158 159 boost::tie(pos, inserted) = actors.insert(std::make_pair(actors_name, Vertex())); 160 if (inserted) { 161 u = add_vertex(g); 162 actor_name[u] = actors_name; 163 pos->second = u; 164 } else 165 u = pos->second; 166</PRE> 167 168<P> 169The second token is the name of the movie, and the third token is the 170other actor. We use the same technique as above to insert the second 171actor into the graph. 172 173<P> 174<PRE> 175 std::string movie_name = *i++; 176 177 boost::tie(pos, inserted) = actors.insert(std::make_pair(*i, Vertex())); 178 if (inserted) { 179 v = add_vertex(g); 180 actor_name[v] = *i; 181 pos->second = v; 182 } else 183 v = pos->second; 184</PRE> 185 186<P> 187The final step is to add an edge connecting the two actors, and record 188the name of the connecting movie. 189 190<P> 191<PRE> 192 graph_traits<Graph>::edge_descriptor e; 193 boost::tie(e, inserted) = add_edge(u, v, g); 194 if (inserted) 195 connecting_movie[e] = movie_name; 196</PRE> 197 198<P> 199 200<H2>A Custom Visitor Class</H2> 201 202<P> 203We create a custom visitor class to record the Kevin Bacon 204numbers. The Kevin Bacon number will be the shortest distances from 205each actor to Kevin Bacon. This distance has also been referred to as 206the ``Bacon Number'' in memory of the ``Erdos Number'' used to rank 207mathematicians according to how many publications separate them from 208Paul Erdos. The shortest distance to from Kevin Bacon to each actor 209will follow the breadth-first tree. The BFS algorithm identifies edges 210that belong to the breadth-first tree and calls our visitor's 211<tt>tree_edge</tt> function. We record the distance to the target 212vertex as one plus the distance to the source vertex. 213 214 215<PRE> 216 template <typename DistanceMap> 217 class bacon_number_recorder : public default_bfs_visitor 218 { 219 public: 220 bacon_number_recorder(DistanceMap dist) : d(dist) { } 221 222 template <typename Edge, typename Graph> 223 void tree_edge(Edge e, const Graph& g) const 224 { 225 typename graph_traits<Graph>::vertex_descriptor 226 u = source(e, g), v = target(e, g); 227 d[v] = d[u] + 1; 228 } 229 private: 230 DistanceMap d; 231 }; 232 // Convenience function 233 template <typename DistanceMap> 234 bacon_number_recorder<DistanceMap> 235 record_bacon_number(DistanceMap d) 236 { 237 return bacon_number_recorder<DistanceMap>(d); 238 } 239</PRE> 240 241<P> 242We will use a vector to store the bacon numbers. 243 244<P> 245<PRE> 246 std::vector<int> bacon_number(num_vertices(g)); 247</PRE> 248 249<H2>Apply the Breadth-First Search</H2> 250 251<P> 252The BFS algorithm requires a source vertex as input; of course this 253will be Kevin Bacon. We now call the BGL BFS algorithm, passing in the 254graph, source vertex, and the visitor. 255 256<P> 257<PRE> 258 Vertex src = actors["Kevin Bacon"]; 259 bacon_number[src] = 0; 260 261 breadth_first_search(g, src, visitor(record_bacon_number(&bacon_number[0]))); 262</PRE> 263 264<P> 265We can output the Bacon number for each actor simply by looping 266through all the vertices in the graph and use them to index into the 267<TT>bacon_number</TT> vector. 268 269<P> 270<PRE> 271 graph_traits<Graph>::vertex_iterator i, end; 272 for (boost::tie(i, end) = vertices(g); i != end; ++i) 273 std::cout << actor_name[*i] << "'s bacon number is " 274 << bacon_number[*i] << std::endl; 275</PRE> 276 277<P> 278Note that vertex descriptor objects can not always be used as indices 279into vectors or arrays such as <TT>bacon_number</TT>. This is valid 280with the <TT>adjacency_list</TT> class with <TT>VertexList=vecS</TT>, 281but not with other variations of <TT>adjacency_list</TT>. A more 282generic way to index based on vertices is to use the ID property 283map (<TT>vertex_index_t</TT>) in coordination with the <A 284HREF="../../property_map/doc/iterator_property_map.html"><TT>iterator_property_map</TT></a>. 285 286<P> 287Here are some excepts from the output of the program. 288<PRE> 289William Shatner's bacon number is 2 290Denise Richards's bacon number is 1 291Kevin Bacon's bacon number is 0 292Patrick Stewart's bacon number is 2 293Steve Martin's bacon number is 1 294... 295</PRE> 296 297 298 299 300<br> 301<HR> 302<TABLE> 303<TR valign=top> 304<TD nowrap>Copyright © 2000-2001</TD><TD> 305<A HREF="http://www.boost.org/people/jeremy_siek.htm">Jeremy Siek</A>, Indiana University (<A HREF="mailto:jsiek@osl.iu.edu">jsiek@osl.iu.edu</A>) 306</TD></TR></TABLE> 307 308</BODY> 309</HTML> 310<!-- LocalWords: gif Hesburgh Ginelli Fass Erd vertices cerr namespace vecS 311 --> 312<!-- LocalWords: undirectedS Tokenizer sep tokenizer toks NameVertexMap pos 313 --> 314<!-- LocalWords: bool Erdos BFS typename DistanceMap bfs const num BGL src 315 --> 316<!-- LocalWords: indices Shatner's Richards's siek htm 317 --> 318