Data Structures and Program Design in C++, part 7

422 Chapter 9 • Tables and Information Retrieval

1. Updating the Configuration

The crucial Life method is update, whose task is to start with one Life configuration and determine what the configuration will become at the next generation. In Section 1.4.4, we did this by examining every possible cell in the grid configuration, calculating its neighbor count to determine whether or not it should live in the coming generation. This information was stored in a local variable new_grid that was eventually copied to grid.

Let us continue to follow this outline, except that, with an unbounded grid, we must not try to examine every possible cell in the configuration. Instead, we must limit our attention to the cells that may possibly be alive in the coming generation. Which cells are these? Certainly, we should examine all the living cells to determine which of them remain alive. We must also examine some of the dead cells. For a dead cell to become alive, it must have exactly three living neighbors (according to the rules in Section 1.2.1). Therefore, we will include all these cells (and likely others besides) if we examine all the cells that are neighbors of living cells. All such neighbors are shown as the shaded fringe in the configuration of Figure 9.18.

Figure 9.18. A Life configuration with fringe of dead cells

In the method update, a local variable Life new_configuration is thereby gradually built up to represent the upcoming configuration: We loop over all the (living) cells from the current configuration, and we also loop over all the (dead) cells that are neighbors of these (living) cells. For each cell, we must first determine whether it has already been added to new_configuration, since we must be careful not to add duplicate copies of any cell.
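The plan just described (visit each living cell and all of its neighbors, and never decide any cell twice) can be modeled compactly outside the book's classes. The sketch below is our own stand-in, using std::set in place of the List and Hash_table pair; the name step is ours, not a method of the class Life:

```cpp
#include <cassert>
#include <set>
#include <utility>

using CellSet = std::set<std::pair<int, int>>;

// One generation of Life on an unbounded grid: examine only the living
// cells and their eight neighbors, and never decide any cell twice.
CellSet step(const CellSet &living)
{
    CellSet next;
    for (const auto &cell : living)
        for (int dr = -1; dr < 2; dr++)
            for (int dc = -1; dc < 2; dc++) {
                std::pair<int, int> c{cell.first + dr, cell.second + dc};
                if (next.count(c)) continue;    // already added; skip duplicates
                int count = 0;                  // living neighbors of c
                for (int r = -1; r < 2; r++)
                    for (int s = -1; s < 2; s++)
                        if ((r != 0 || s != 0) &&
                            living.count({c.first + r, c.second + s}))
                            count++;
                // A cell lives in the next generation with exactly 3 living
                // neighbors, or with 2 if it is alive now.
                if (count == 3 || (count == 2 && living.count(c)))
                    next.insert(c);
            }
    return next;
}
```

A blinker (three cells in a row) oscillates with period 2, which makes a convenient correctness check for the rules.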
If the cell does not already belong to new_configuration, we use the function neighbor_count to decide whether it should be added, and if appropriate we insert it into new_configuration.

At the end of the method, we must swap the List and Hash_table members between the current configuration and new_configuration. This exchange ensures that the Life object now represents an updated configuration. Moreover, it ensures that the destructor that will automatically be applied to the local variable Life new_configuration will dispose of the cells, the List, and the Hash_table that represent the former configuration.

We thus obtain the following implementation:

```cpp
void Life::update()
/* Post: The Life object contains the next generation of configuration.
   Uses: The class Hash_table and the class Life and its auxiliary functions. */
{
    Life new_configuration;
    Cell *old_cell;
    for (int i = 0; i < living->size(); i++) {
        living->retrieve(i, old_cell);           //  Obtain a living cell.
        for (int row_add = -1; row_add < 2; row_add++)
            for (int col_add = -1; col_add < 2; col_add++) {
                int new_row = old_cell->row + row_add,
                    new_col = old_cell->col + col_add;
                //  new_row, new_col is now a living cell or a neighbor of a living cell.
                if (!new_configuration.retrieve(new_row, new_col))
                    switch (neighbor_count(new_row, new_col)) {
                    case 3:      //  With neighbor count 3, the cell becomes alive.
                        new_configuration.insert(new_row, new_col);
                        break;
                    case 2:      //  With count 2, the cell keeps the same status.
                        if (retrieve(new_row, new_col))
                            new_configuration.insert(new_row, new_col);
                        break;
                    default:     //  Otherwise, the cell is dead.
                        break;
                    }
            }
    }
    //  Exchange data of current configuration with data of new_configuration.
    List<Cell *> *temp_list = living;
    living = new_configuration.living;
    new_configuration.living = temp_list;
    Hash_table *temp_hash = is_living;
    is_living = new_configuration.is_living;
    new_configuration.is_living = temp_hash;
}
```

2. Printing

We recognize that it is impossible to display more than a small piece of the now unbounded Life configuration on a user's screen. Therefore, we shall merely print a rectangular window, showing the status of a 20 × 80 central portion of a Life configuration. For each cell in the window, we retrieve its status from the hash table and print either a blank or nonblank character accordingly.

```cpp
void Life::print()
/* Post: A central window onto the Life object is displayed.
   Uses: The auxiliary function Life::retrieve. */
{
    int row, col;
    cout << endl << "The current Life configuration is:" << endl;
    for (row = 0; row < 20; row++) {
        for (col = 0; col < 80; col++)
            if (retrieve(row, col)) cout << '*';
            else cout << ' ';
        cout << endl;
    }
    cout << endl;
}
```

3. Creation and Insertion of New Cells

We now turn to the function insert, which creates a Cell object and explicitly references the hash table. The task of the function is to create a new cell with the given coordinates and put it in both the hash table and the List living. This outline translates into the following C++ function.

```cpp
Error_code Life::insert(int row, int col)
/* Pre:  The cell with coordinates row and col does not belong to the Life
         configuration.
   Post: The cell has been added to the configuration. If insertion into either
         the List or the Hash_table fails, an error code is returned.
   Uses: The class List, the class Hash_table, and the struct Cell. */
{
    Error_code outcome;
    Cell *new_cell = new Cell(row, col);
    int index = living->size();
    outcome = living->insert(index, new_cell);
    if (outcome == success)
        outcome = is_living->insert(new_cell);
    if (outcome != success)
        cout << "Warning: new Cell insertion failed" << endl;
    return outcome;
}
```

4. Construction and Destruction of Life Objects

We must provide a constructor and destructor for our class Life to allocate and dispose of its dynamically allocated members.
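The pointer exchange at the end of update, which leaves the old configuration's data in the local object so that its destructor reclaims it, can be seen in miniature. The class Holder below is purely our own illustration, not part of the Life program:

```cpp
#include <cassert>

// Our own miniature illustration (not part of the Life program) of the
// exchange used in update: swap the owning pointers, and the local
// object's destructor disposes of the *old* data when it goes out of scope.
struct Holder {
    int *data;
    Holder() : data(new int(0)) {}
    ~Holder() { delete data; }
    Holder(const Holder &) = delete;            // no copying: one owner only
    Holder &operator=(const Holder &) = delete;
};

void replace_data(Holder &current, int new_value)
{
    Holder fresh;                // plays the role of new_configuration
    *fresh.data = new_value;
    int *temp = current.data;    // exchange the owning pointers
    current.data = fresh.data;
    fresh.data = temp;
}                                // fresh's destructor deletes the old data
```

After the call, current owns the new data, and the old allocation has already been freed by the local object's destructor, with no leak and no double deletion.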
The constructor need only apply the new operator.

```cpp
Life::Life()
/* Post: The members of a Life object are dynamically allocated and initialized.
   Uses: The class Hash_table and the class List. */
{
    living = new List<Cell *>;
    is_living = new Hash_table;
}
```

The destructor must dispose of any object that might ever be dynamically defined by a method of the class Life. In addition to the List *living and the Hash_table *is_living that are dynamically created by the constructor, the Cell objects that they reference are dynamically created by the method insert. The following implementation begins by disposing of these Cell objects:

```cpp
Life::~Life()
/* Post: The dynamically allocated members of a Life object and all Cell objects
         that they reference are deleted.
   Uses: The class Hash_table and the class List. */
{
    Cell *old_cell;
    for (int i = 0; i < living->size(); i++) {
        living->retrieve(i, old_cell);
        delete old_cell;
    }
    delete is_living;    //  Calls the Hash_table destructor.
    delete living;       //  Calls the List destructor.
}
```

5. The Hash Function

Our hash function will differ slightly from those earlier in this chapter, in that its argument already comes in two parts (row and column), so that some kind of folding can be done easily. Before deciding how, let us for a moment consider the special case of a small array, where the function is one-to-one and is exactly the index function. When there are exactly maxrow entries in each row, the index (i, j) maps to i + maxrow * j to place the rectangular array into contiguous, linear storage, one row after the next. It should prove effective to use a similar mapping for our hash function, where we replace maxrow by some convenient number (like a prime) that will maximize the spread and reduce collisions.
Hence we obtain:

```cpp
const int factor = 101;

int hash(int row, int col)
/* Post: The function returns the hashed value between 0 and hash_size - 1 that
         corresponds to the given Cell parameter. */
{
    int value;
    value = row + factor * col;
    value %= hash_size;
    if (value < 0) return value + hash_size;
    else return value;
}
```

Note that in C++ the result of % takes the sign of the dividend, so with the negative coordinates that an unbounded grid allows, value may be negative after the reduction; adding hash_size back in restores a result in the proper range.

6. Other Subprograms

The remaining Life member functions initialize, retrieve, and neighbor_count all bear considerable resemblance either to one of the preceding functions or to the corresponding function in our earlier Life program. These functions can therefore safely be left as projects.

Programming Projects 9.9

P1. Write the Life methods (a) neighbor_count, (b) retrieve, and (c) initialize.

P2. Write the Hash_table methods (a) insert and (b) retrieve for the chained implementation that stores pointers to cells of a Life configuration.

P3. Modify update so that it uses a second local Life object to store cells that have been considered for insertion, but rejected. Use this object to make sure that no cell is considered twice.

POINTERS AND PITFALLS

1. Use top-down design for your data structures, just as you do for your algorithms. First determine the logical structure of the data, then slowly specify more detail, and delay implementation decisions as long as possible.

2. Before considering detailed structures, decide what operations on the data will be required, and use this information to decide whether the data belong in a list or a table. Traversal of the data structure or access to all the data in a prespecified order generally implies choosing a list. Access to any entry in time O(1) generally implies choosing a table.

3. For the design and programming of lists, see Chapter 6.

4. Use the logical structure of the data to decide what kind of table to use: an ordinary array, a table of some special shape, a system of inverted tables, or a hash table.
Choose the simplest structure that allows the required operations and that meets the space requirements of the problem. Don't write complicated functions to save space that will then remain unused.

5. Let the structure of the data help you decide whether an index function or an access array is better for accessing a table of data. Use the features built into your programming language whenever possible.

6. In using a hash table, let the nature of the data and the required operations help you decide between chaining and open addressing. Chaining is generally preferable if deletions are required, if the records are relatively large, or if overflow might be a problem. Open addressing is usually preferable when the individual records are small and there is no danger of overflowing the hash table.

7. Hash functions usually need to be custom designed for the kind of keys used for accessing the hash table. In designing a hash function, keep the computations as simple and as few as possible while maintaining a relatively even spread of the keys over the hash table. There is no obligation to use every part of the key in the calculation. For important applications, experiment by computer with several variations of your hash function, and look for rapid calculation and even distribution of the keys.

8. Recall from the analysis of hashing that some collisions will almost inevitably occur, so don't worry about the existence of collisions if the keys are spread nearly uniformly through the table.

9. For open addressing, clustering is unlikely to be a problem until the hash table is more than half full. If the table can be made several times larger than the space required for the records, then linear probing should be adequate; otherwise more sophisticated collision resolution may be required. On the other hand, if the table is many times larger than needed, then initialization of all the unused space may require excessive time.

REVIEW QUESTIONS
1. In terms of the Θ and Ω notations, compare the difference in time required for table lookup and for list searching.

2. What are row-major and column-major ordering?

3. Why do jagged tables require access arrays instead of index functions?

4. For what purpose are inverted tables used?

5. What is the difference in purpose, if any, between an index function and an access array?

6. What operations are available for an abstract table?

7. What operations are usually easier for a list than for a table?

8. In 20 words or less, describe how radix sort works.

9. In radix sort, why are the keys usually partitioned first by the least significant position, not the most significant?

10. What is the difference in purpose, if any, between an index function and a hash function?

11. What objectives should be sought in the design of a hash function?

12. Name three techniques often built into hash functions.

13. What is clustering in a hash table?

14. Describe two methods for minimizing clustering.

15. Name four advantages of a chained hash table over open addressing.

16. Name one advantage of open addressing over chaining.

17. If a hash function assigns 30 keys to random positions in a hash table of size 300, about how likely is it that there will be no collisions?

REFERENCES FOR FURTHER STUDY

The primary reference for this chapter is KNUTH, Volume 3. (See page 77 for bibliographic details.) Hashing is the subject of Volume 3, pp. 506–549. KNUTH studies every method we have touched, and many others besides. He does algorithm analysis in considerably more detail than we have, writing his algorithms in a pseudo-assembly language, and counting operations in detail there.

The following book (pp. 156–185) considers arrays of various kinds, index functions, and access arrays in considerable detail:

C. C. GOTLIEB and L. R. GOTLIEB, Data Types and Structures, Prentice Hall, Englewood Cliffs, N.J., 1978.
An interesting study of hash functions and the choice of constants used is:

B. J. MCKENZIE, R. HARRIES, and T. C. BELL, "Selecting a hashing algorithm," Software Practice and Experience 20 (1990), 209–224.

Extensions of the birthday surprise are considered in:

M. S. KLAMKIN and D. J. NEWMAN, Journal of Combinatorial Theory 3 (1967), 279–282.

A thorough and informative analysis of hashing appears in Chapter 8 of:

ROBERT SEDGEWICK and PHILIPPE FLAJOLET, An Introduction to the Analysis of Algorithms, Addison-Wesley, Reading, Mass., 1996.

10 Binary Trees

LINKED LISTS have great advantages of flexibility over the contiguous representation of data structures, but they have one weak feature: They are sequential lists; that is, they are arranged so that it is necessary to move through them only one position at a time. In this chapter we overcome these disadvantages by studying trees as data structures, using the methods of pointers and linked lists for their implementation. Data structures organized as trees will prove valuable for a range of applications, especially for problems of information retrieval.
10.1 Binary Trees 430
  10.1.1 Definitions 430
  10.1.2 Traversal of Binary Trees 432
  10.1.3 Linked Implementation of Binary Trees 437
10.2 Binary Search Trees 444
  10.2.1 Ordered Lists and Implementations 446
  10.2.2 Tree Search 447
  10.2.3 Insertion into a Binary Search Tree 451
  10.2.4 Treesort 453
  10.2.5 Removal from a Binary Search Tree 455
10.3 Building a Binary Search Tree 463
  10.3.1 Getting Started 464
  10.3.2 Declarations and the Main Function 465
  10.3.3 Inserting a Node 466
  10.3.4 Finishing the Task 467
  10.3.5 Evaluation 469
  10.3.6 Random Search Trees and Optimality 470
10.4 Height Balance: AVL Trees 473
  10.4.1 Definition 473
  10.4.2 Insertion of a Node 477
  10.4.3 Removal of a Node 484
  10.4.4 The Height of an AVL Tree 485
10.5 Splay Trees: A Self-Adjusting Data Structure 490
  10.5.1 Introduction 490
  10.5.2 Splaying Steps 491
  10.5.3 Algorithm Development 495
  10.5.4 Amortized Algorithm Analysis: Introduction 505
  10.5.5 Amortized Analysis of Splaying 509
Pointers and Pitfalls 515
Review Questions 516
References for Further Study 518

10.1 BINARY TREES

For some time, we have been drawing trees to illustrate the behavior of algorithms. We have drawn comparison trees showing the comparisons of keys in searching and sorting algorithms; we have drawn trees of subprogram calls; and we have drawn recursion trees. If, for example, we consider applying binary search to the following list of names, then the order in which comparisons will be made is shown in the comparison tree of Figure 10.1.

Amy Ann Dot Eva Guy Jan Jim Jon Kay Kim Ron Roy Tim Tom

Figure 10.1. Comparison tree for binary search (its nodes, read level by level, are: Jim; Dot, Ron; Amy, Guy, Kay, Tim; Ann, Eva, Jan, Jon, Kim, Roy, Tom)

10.1.1 Definitions

In binary search, when we make a comparison with a key, we then move either left or right depending on the outcome of the comparison. It is thus important to keep the relation of left and right in the structure we build.
It is also possible that the part of the tree on one side or both below a given node is empty. In the example of Figure 10.1, the name Amy has an empty left subtree. For all the leaves, both subtrees are empty. We can now give the formal definition of a new data structure.

Definition: A binary tree is either empty, or it consists of a node called the root together with two binary trees called the left subtree and the right subtree of the root.

Note that this definition is that of a mathematical structure. To specify binary trees as an abstract data type, we must state what operations can be performed on binary trees. Rather than doing so at once, we shall develop the operations as the chapter progresses.

Note also that this definition makes no reference to the way in which binary trees will be implemented in memory. As we shall presently see, a linked representation is natural and easy to use, but other implementations are possible as well. Note, finally, that this definition makes no reference to keys or the way in which they are ordered. Binary trees are used for many purposes other than searching; hence we have kept the definition general.

Before we consider general properties of binary trees further, let us return to the general definition and see how its recursive nature works out in the construction of small binary trees. The first case, the base case that involves no recursion, is that of an empty binary tree. For other kinds of trees, we might never think of allowing an empty one, but for binary trees it is convenient, not only in the definition, but in algorithms, to allow for an empty tree. The empty tree will usually be the base case for recursive algorithms and will determine when the algorithm stops. The only way to construct a binary tree with one node is to make that node into the root and to make both the left and right subtrees empty.
Thus a single node with no branches is the one and only binary tree with one node.

With two nodes in the tree, one of them will be the root and the other will be in a subtree. Thus either the left or right subtree must be empty, and the other will contain exactly one node. Hence there are two different binary trees with two nodes.

At this point, you should note that the concept of a binary tree differs from some of the examples of trees that we have previously seen, in that left and right are important for binary trees. The two binary trees with two nodes can be drawn as [figure: one tree whose root has only a left child, and one whose root has only a right child], which are different from each other. We shall never draw any part of a binary tree to look like [figure: a node with a single, vertically drawn branch], since there is no way to tell if the lower node is the left or the right child of its parent.

We should, furthermore, note that binary trees are not the same class as the 2-trees we studied in the analysis of algorithms in Chapter 7 and Chapter 8. Each node in a 2-tree has either 0 or 2 children, never 1, as can happen with a binary tree. Left and right are not fundamentally important for studying the properties of comparison trees, but they are crucial in working with binary trees. (In Section 10.3.6 we shall, however, see that binary trees can be converted into 2-trees and vice versa.)

[...] a and b, and so on. The insertions will produce a chain for the binary search tree, as shown in the final part of Figure 10.8. Such a chain, as we have already seen, is very inefficient for searching. Hence we conclude:

If the keys to be inserted into an empty binary search tree are in their natural order, then the method insert will produce a tree that degenerates into an inefficient chain.

The method insert [...]
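The boxed conclusion is easy to confirm with a minimal binary search tree of our own (a sketch; the struct and function names below are not from this book):

```cpp
#include <cassert>

// A minimal binary search tree, for checking the degenerate-chain claim.
struct BNode {
    int key;
    BNode *left = nullptr, *right = nullptr;
    BNode(int k) : key(k) {}
};

BNode *bst_insert(BNode *root, int key)   // ordinary BST insertion
{
    if (root == nullptr) return new BNode(key);
    if (key < root->key) root->left = bst_insert(root->left, key);
    else if (key > root->key) root->right = bst_insert(root->right, key);
    return root;
}

int height(const BNode *root)   // empty tree: height 0; single node: height 1
{
    if (root == nullptr) return 0;
    int hl = height(root->left), hr = height(root->right);
    return 1 + (hl > hr ? hl : hr);
}
```

Inserting the keys 1 through 7 in their natural order produces a tree of height 7, a chain; the same keys inserted in a better order produce a tree of height 3.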
[...] height 0, and a tree with only one node has height 1.

E7. Write a method and the corresponding recursive function to insert an Entry, passed as a parameter, into a linked binary tree. If the root is empty, the new entry should be inserted into the root; otherwise it should be inserted into the shorter of the two subtrees of the root (or into the [...]

[...] different points of view that we might take:

- We can regard binary search trees as a new abstract data type with its own definition and its own methods;
- Since binary search trees are special kinds of binary trees, we may consider their methods as special kinds of binary tree methods;
- Since the entries in binary search trees contain keys, and since they are applied for information [...]

Figure 10.9. Insertions into a binary search tree

When the first entry, e, is inserted, it becomes the root, as shown in part (a). Since b comes before e, its insertion goes into the left subtree of e, as shown in part (b). Next we insert d, first comparing it to e and going left, then comparing it to b and going right. The next insertion, f, goes to [...]
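A point implicit in the narrative above, and flagged in the book's margin as "different orders, same tree," is that two different insertion orders can build identical binary search trees. A quick check, using our own minimal node type rather than the book's classes:

```cpp
#include <cassert>

// Minimal node type (ours, not the book's) for comparing tree shapes.
struct SNode {
    char key;
    SNode *left = nullptr, *right = nullptr;
    SNode(char k) : key(k) {}
};

SNode *insert_key(SNode *root, char key)   // ordinary BST insertion
{
    if (root == nullptr) return new SNode(key);
    if (key < root->key) root->left = insert_key(root->left, key);
    else if (key > root->key) root->right = insert_key(root->right, key);
    return root;
}

// Structural equality: same shape with the same keys in the same places.
bool same_tree(const SNode *a, const SNode *b)
{
    if (a == nullptr || b == nullptr) return a == b;
    return a->key == b->key && same_tree(a->left, b->left)
                            && same_tree(a->right, b->right);
}
```

Inserting e, b, d, f and inserting e, f, b, d yield the same tree (e at the root, b to its left with right child d, f to its right), whereas inserting b, d, e, f in natural order yields a different tree, a chain.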
[...] solution to this problem. By making the entries of an ordered list into the nodes of a binary tree, we shall find that we can search for a target key in O(log n) steps, just as with binary search, and we shall obtain algorithms for inserting and deleting entries also in time O(log n). When we studied binary search, we drew comparison trees showing the progress of binary search by moving either left (if the [...]

[...]

```cpp
            (*visit)(sub_root->data);
    }
}
```

We leave the coding of standard Binary_tree methods such as height, size, and clear as exercises. These other methods are also most easily implemented by calling recursive auxiliary functions. In the exercises, we shall develop a method to insert entries into a Binary_tree. This insertion method is useful for testing our basic Binary_tree class. Later in [...]

[...] slow in comparison with binary search. Hence, assuming we can keep the keys in order, searching becomes much faster if we use a contiguous list and binary search. Suppose we also frequently need to make changes in the list, inserting new entries or deleting old entries. Then it is much slower to use a contiguous list than a linked list, because insertion or removal in a contiguous list requires moving [...]

[...] stored in the nodes of the binary trees are all distinct, but it is not assumed that the trees are binary search trees. That is, there is no necessary connection between any ordering of the data and their location in the trees. If a tree is traversed in a particular order, and each key is printed when its node is visited, the resulting sequence is called the sequence corresponding to that traversal.

E17. Suppose [...]
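The notion of the sequence corresponding to a traversal can be made concrete. The sketch below uses a minimal node type of our own, with the visit operation inlined as appending a character to a string instead of the book's function-pointer parameter; the tree built in the test is the expression tree of (a - b) - c from the exercises:

```cpp
#include <cassert>
#include <string>

struct TNode {                 // minimal node type for the demonstration
    char data;
    TNode *left = nullptr, *right = nullptr;
    TNode(char d) : data(d) {}
};

// Each traversal appends the visited key to out; only the position of
// the visit step relative to the two recursive calls differs.
void preorder(const TNode *sub_root, std::string &out)
{
    if (sub_root != nullptr) {
        out += sub_root->data;           // visit, then both subtrees
        preorder(sub_root->left, out);
        preorder(sub_root->right, out);
    }
}

void inorder(const TNode *sub_root, std::string &out)
{
    if (sub_root != nullptr) {
        inorder(sub_root->left, out);
        out += sub_root->data;           // visit between the subtrees
        inorder(sub_root->right, out);
    }
}

void postorder(const TNode *sub_root, std::string &out)
{
    if (sub_root != nullptr) {
        postorder(sub_root->left, out);
        postorder(sub_root->right, out);
        out += sub_root->data;           // visit after both subtrees
    }
}
```

For the expression tree of (a - b) - c, inorder reads off the infix expression without parentheses, preorder gives its prefix form, and postorder gives its postfix form.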
[...] of the following binary trees will be visited under (1) preorder, (2) inorder, and (3) postorder traversal.

[Figure: four binary trees, labeled (a) through (d), on which the traversals are to be carried out.]

E3. Draw expression trees for each of the following expressions, and show the order of visiting the vertices in (1) preorder, (2) inorder, and (3) postorder: (a) log n! (b) (a - b) - c
