Thuật toán Algorithms (Phần 41)

THÔNG TIN TÀI LIỆU

Thông tin cơ bản

Định dạng
Số trang	10
Dung lượng	77,91 KB

Nội dung

CONNECTMTY Graph Traversal Algorithms Depth-first search is a member of a family of graph traversal algorithms that are quite natural when viewed nonrecursively. Any one of these methods can be used to solve the simple connectivity problems posed in the last chapter. In this section, we’ll see how one program can be used to implement graph traversal methods with quite different characteristics, merely by changing the value of one variable. This method will be used to solve several graph problems in the chapters which follow. Consider the analogy of traveling through a maze. Any time we face a choice of several vertices to visit, we can only go along a path one of them, so we must “save” the others to visit later. One way to implement a program based on this idea is to remove the recursion from the recursive depth-first algorithm given in the previous chapter. This would result in a program that saves, on a stack, the point in the adjacency list of the vertex being visited at which the search should resume after all vertices connected to previous vertices on the adjacency list have been visited. Instead of examining this implementation in more detail, we’ll look at a more general framework which encompasses several algorithms. We begin by thinking of the vertices as being divided into three classes: tree (or visited) vertices, those connected together by paths that we’ve trav- ersed; fringe vertices, those adjacent to tree vertices but not yet visited; and unseen vertices, those that haven’t been encountered at all yet. To search a connected component of a graph systematically (implement the visit procedure of the previous chapter), we begin with one vertex on the fringe, all others unseen, and perform the following step until all vertices have been visited: “move one vertex (call it from the fringe to the tree, and put any unseen vertices adjacent to x on the fringe.” Graph traversal methods differ in how it is decided which vertex should be moved from the fringe to the tree. For depth-first search, we always want to choose the vertex from the fringe that was most recently encountered. This can be implemented by always moving the first vertex on the fringe to the tree, then putting the unseen vertices adjacent to that vertex (x) at the front of the fringe and moving vertices adjacent to which happen to be already on the fringe to the front. (Note carefully that a completely different traversal method results if we leave untouched the vertices adjacent to x which are already on the fringe.) For example, consider the undirected graph given at the beginning of this chapter. The following table shows the contents of the fringe each time a vertex is moved to the tree; the corresponding search tree is shown at the right: CHAPTER 30 394 A G E D F C H I J M L K B A G B C F ECHJLBF DFCHJLB FCHJLB CHJLB H J L B I JLB J L B MLKB LKB K B B In this algorithm, the fringe essentially operates as a pushdown stack: we remove a vertex (call it from the beginning of the fringe, then go through x’s edge list, adding unseen vertices to the beginning of the fringe, and moving fringe vertices to the beginning. Note that this is not strictly a stack, since we use the additional operation of moving a vertex to the beginning. The algorithm can be efficiently implemented by maintaining the fringe as a linked list and keeping pointers into this list in an array indexed by vertex number: we’ll omit this implementation in favor of a program that can implement other traversal strategies as well. This gives a different depth-first search tree than the one drawn in the biconnectivity section above because x’s edges are put on the stack one at a time, so their order is reversed when the stack is popped. The same tree as for recursive depth-first search would result if we were to append all of z’s edge list on the front of the stack, then go through and delete from the stack any other occurrences of vertices from x’s edge list. The reader might find it interesting to compare this process with the result of directly removing the recursion from the visit procedure of the previous chapter. By itself, the fringe table does not give enough information to reconstruct the search tree. In order to actually build the tree, it is necessary to store, with each node in the fringe table, the name of its father in the tree. This is available when the node is entered into the table (it’s the node that caused the entry), it is changed whenever the node moves to the front of the fringe, and it is needed when the node is removed from the table (to determine where in the tree to attach it). A second classic traversal method derives from maintaining the fringe as a queue: always pick the least recently encountered vertex. This can be maintained by putting the unseen vertices adjacent to x at the end of the fringe 395 in the general strategy above. This method is called breadth-first search: first we visit a node, then all the nodes adjacent to it, then all the nodes adjacent to those nodes, etc. This leads to the following fringe table and search tree for our example graph: A F C B G E D L J H M K I A F C B G C B G E B G E D G E D E D L J D L J H L J H JHM HMK MKI K I I D H We remove a vertex (call it from the beginning of the fringe, then go through edge list, putting unseen vertices at the end of the fringe. Again, an efficient implementation is available using a linked list representation for the fringe, but we’ll omit this in favor of a more general method. A fundamental feature of this general graph traversal strategy is that the fringe is actually operating as a priority queue: the vertices can be assigned priorities with the property that the “highest priority” vertex is the one moved from the fringe to the tree. That is, we can directly use the priority queue routines from Chapter 11 to implement a general graph searching program. For the applications in the next chapter, it is convenient to assign the highest priority to the lowest value, so we assume that the inequalities are switched in the programs from Chapter 11. The following program is a “priority first search” routine for a graph represented with adjacency lists (so it is most appropriate for sparse graphs). 396 CHAPTER 30 procedure sparsepfs; var now, k: integer; link; begin for to V do begin end; pqconstruct; repeat if then begin end t:=adj[k]; while do begin if then if and then begin priority); :=k end; end until pqempty; end; (The functions onpq and pqempty are priority queue utility routines which are easily implemented additions to the set of programs given in Chapter 11: pqempty returns true if the priority queue is empty; onpq returns true if the given vertex is currently on the priority queue.) Below and in Chapter 31, we’ll see how the substitution of various expressions for priority in this program yields several classical graph traversal algorithms. Specifically, the program operates as follows: first, we give all the vertices the sentinel value unseen (which could be and initialize the dad array, which is used to store the search tree. Next we construct an indirect priority queue containing all the vertices (this construction is trivial because the values are initially all the same). In the terminology above, tree vertices are those which are not on the priority queue, unseen vertices are those on the priority queue with value unseen, and fringe vertices are the others on the priority queue. With these conventions understood, the operation of the program is straightforward: it repeatedly removes the highest priority vertex from the queue and puts it on the tree, then updates the priorities of all fringe or unseen vertices connected to that vertex. If all vertices on the priority queue are unseen, then no vertex previously 397 encountered is connected to any vertex on the queue: that is, we’re entering a new connected component. This is automatically handled by the priority queue mechanism, so there is no need for a separate visit procedure inside a main procedure. But note that maintaining the proper value of now is more complicated than for the recursive depth-first search program of the previous chapter. The convention of this program is to leave the val entry unseen and zero for the root of the depth-first search tree for each connected component: it might be more convenient to set it to zero or some other value (for example, now) for various applications. Now, recall that now increases from 1 to V during the execution of the algorithm so it can be used to assign unique priorities to the vertices. If we change the two occurrences of priority in sparsepfs to V-now, we get first search, because newly encountered nodes have the highest priority. If we use now for priority we get breadth-first search, because old nodes have the highest priority. These priority assignments make the priority queues operate like stacks and queues as described above. (Of course, if we were only interested in using depth-first or breadth-first search, we would use a direct implementation for stacks or queues, not priority queues as in sparsepfs.) In the next chapter, we’ll see that other priority assignments lead to other classical graph algorithms. The running time for graph traversal when implemented in this way depends on the method of implementing the priority queue. In general, we have to do a priority queue operation for each edge and for each vertex, so the worst case running time should be proportional to (E + V) log V if the priority queue is implemented as a heap as indicated. However, we’ve already noted that for both depth-first and breadth-first search we can take advantage of the fact that each new priority is the highest or the lowest so far encountered to get a running time proportional to E + V. Also, other priority queue implementations might sometimes be appropriate: for example if the graph is dense then we might as well simply keep the priority queue as an unordered array. This gives a worst case running time proportional to E + (or just since each edge simply requires setting or resetting a priority, but each vertex now requires searching through the whole queue to find the highest priority vertex. An implementation which works in this way is given in the next chapter. The difference between depth-first and breadth-first search is quite evi- dent when a large graph is considered. The diagram at left below shows the edges and nodes visited when depth-first search is halfway through the maze graph of the previous chapter starting at the upper left corner; the diagram at right is the corresponding picture for breadth-first search: 398 CHAPTER 30 Tree nodes are blackened in these diagrams, fringe nodes are crossed, and unseen nodes are blank. Depth-first search “explores” the graph by looking for new vertices far away from the start point, taking closer vertices only when dead ends are encountered; breadth-first search completely covers the area close to the starting point, moving farther away only when everything close has been looked at. Depth-first search is appropriate for one person looking for something in a maze because the “next place to look” is always close by; breadth-first search is more like a group of people looking for something by fanning out in all directions. Beyond these operational differences, it is interesting to reflect on the fundamental differences in the implementations of these methods. Depth-first search is very simply expressed recursively (because its underlying data structure is a stack), and breadth-first search admits to a very simple nonrecursive implementation (because its underlying data structure is a queue). But we’ve seen that the true underlying data structure for graph algorithms is a priority queue, and this admits a wealth of interesting properties to consider. Again, we’ll see more examples in the next chapter. Union-Find Algorithms In some applications we wish to know simply whether a vertex x is connected to a vertex y in a graph; the actual path connecting them may not be relevant. This problem has been carefully studied in recent years; some efficient algorithms have been developed which are of independent interest because they can also be used for processing sets (collections of objects). Graphs correspond to sets of objects in a natural way, with vertices corresponding to objects and edges have the meaning “is in the same set as.” Thus, the sample graph in the previous chapter corresponds to the sets {A B C D E F G}, {H I} and {J K L M}. Eac connected component corresponds h CONNECTIVITY 399 to a different set. For sets, we’re interested in the fundamental question x in the same set as This clearly corresponds to the fundamental graph question “is vertex x connected to vertex y?” Given a set of edges, we can build an adjacency list representation of the corresponding graph and use depth-first search to assign to each vertex the index of its connected component, so the questions of the form “is x connected to y?” can be answered with just two array accesses and a comparison. The extra twist in the methods that we consider here is that they are dynamic: they can accept new edges arbitrarily intermixed with questions and answer the questions correctly using the information received. From the correspondence with the set problem, the addition of a new edge is called a union operation, and the queries are called jind operations. Our objective is to write a function which can check if two vertices x and y are in the same set (or, in the graph representation, the same connected component) and, if not, can put them in the same set (put an edge between them in the graph). Instead of building a direct adjacency list or other representation of the graph, we’ll gain efficiency by using an internal structure specifically oriented towards supporting the union and find operations. The internal structure that we will use is a forest of trees, one for each connected component. We need to be able to find out if two vertices belong to the same tree and to be able to combine two trees to make one. It turns out that both of these operations can be implemented efficiently. To illustrate the way this algorithm works, we’ll look at the forest con- structed when the edges from the sample graph that we’ve been using in this chapter are processed in some arbitrary order. Initially, all nodes are in separate trees. Then the edge AG causes a two node tree to be formed, with A at the root. (This choice is arbitrary -we could equally well have put G at the root.) The edges AB and AC add B and C to this tree in the same way, leaving The edges LM, JM, JL, and JK build a tree containing J, K, L, and M that has a slightly different structure (note that JL doesn’t contribute anything, since LM and JM put L and J in the same component), and the edges ED, FD, and HI build two more trees, leaving the forest: 400 CHAPTER 30 This forest indicates that the edges processed to this point describe a graph with four connected components, or, equivalently, that the set union operations processed to this point have led to four sets {A B C G}, {J K L M}, {D E F} and {H I}. Now the edge FE doesn’t contribute anything to the structure, since F and E are in the same component, but the edge combines the first two trees; then GC doesn’t contribute anything, but GH and JG result in everything being combined into one tree: It must be emphasized that, unlike depth-first search trees, the only relationship between these union-find trees and the underlying graph with the given edges is that they divide the vertices into sets in the same way. For example, there is no correspondence between the paths that connect nodes in the trees and the paths that connect nodes in the graph. We know that we can represent these trees in exactly the same way that we represented graph search trees: we keep an array of integers dad which contains, for each vertex, the index of its father (with a 0 entry for nodes which are at the root of a tree). To find the father of a vertex j, we simply set j : =dad , and to find the root of the tree to which j belongs, we repeat this operation until reaching 0. The union and find operations are then 401 very easily implemented: function find(x, y: integer; union: boolean): boolean; var i, j: integer; begin while do i:=dad[i]; while do if union and (i<>j) then end This function returns true if the two given vertices are in the same component. In addition, if they are not in the same component and the union flag is set, they are put into the same component. The method used is simple: Use the dad array to get to the root of the tree containing each vertex, then check to see if they are the same. To merge the tree rooted at j with the tree rooted at i, we simply set .I as shown in the following table: ABCDEFGH IJKLM AG: A AB: A A AC: AA A LM: A A A L JM: AA A J L JL: AA A JL * JK: AA A J J L ED: AAE A J J L FD: AAEF A J J L HI: AAEF A H J J L FE: AAEF A H JJL * AF: AAEFAA H J J L GE: AAEFAA H JJL* GC: AAEFAA H JJL * GH: AAEFAAAH J J L JG: JAAEFAAAH J J L LG: JAAEFAAAH JJL * An asterisk at the right indicates that the vertices are already in the same component at the time the edge is processed. As usual, we are assuming 402 CHAPTER 30 that we have available functions index and name to translate between vertex names and integers between 1 and each table entry is the name of the corresponding dad array entry. Also, for example, the function call index(y), f a se would be used to test whether a vertex named ) x is in the same component as a vertex named y (without introducing an edge between them). The algorithm described above has bad worst-case performance because the trees formed could be degenerate. For example, taking the edges BC CD DE EF FG GH HI IJ . . . YZ in that order will produce a long chain with Z pointing to Y, Y pointing to X, etc. This kind of structure takes time proportional to to build, and has time proportional to V for an average equivalence test. Several methods have been suggested to deal with this problem. One natural method, which may have already occurred to the reader, is to try to do the “right” thing when merging two trees rather than arbitrarily setting When a tree rooted at i is to be merged with a tree rooted at j, one of the nodes must remain a root and the other (and all its descendants) must go one level down in the tree. To minimize the distance to the root for the most nodes, it makes sense to take as the root the node with more This idea, called weight balancing, is easily implemented by maintaining the size of each tree (number of descendants of the root) in the dad array entry for each root node, encoded as a nonpositive number so that the root node can be detected when traveling up the tree in find. Ideally, we would like every node to point directly to the root of its tree. No matter what strategy we use, achieving this ideal would require examining at least all the nodes in the smaller of the two trees to be merged, and this could be quite a lot compared to the relatively few nodes on the path to the root that find usually examines. But we can approach the ideal by making all the nodes that we do examine point to the root! This seems like a drastic step at first blush, but it is relatively easy to do, and it must be remembered that there is nothing sacrosanct about the structure of these trees: if they can be modified to make the algorithm more efficient, we should do so. This method, called path compression, is easily implemented by making another pass through each tree after the root has been found, and setting the dad entry of each vertex encountered along the way to point to the root. The combination of weight balancing and path compression ensures that the algorithms will run very quickly. The following implementation shows that the extra code involved is a small price to pay to guard against degenerate cases. . CONNECTMTY Graph Traversal Algorithms Depth-first search is a member of a family of graph traversal algorithms that are quite natural when. more detail, we’ll look at a more general framework which encompasses several algorithms. We begin by thinking of the vertices as being divided into three

Ngày đăng: 28/10/2013, 21:15

Xem thêm