PRIORITY QUEUES 133 again by exchanging C with the larger of its two sons (in this case S). The process continues until the heap condition is no longer violated at the node occupied by C. In the example, C makes it all the way to the bottom of the heap, leaving: The “remove the largest” operation involves almost the same process. Since the heap will be one element smaller after the operation, it is necessary to decrement leaving no place for the element that was stored in the last position. But the largest element is to be removed, so the remove operation amounts to a replace, using the element that was in For example, the following heap results from removing the T from the heap above: The implementation of these procedures is centered around the operation of fixing up a heap which satisfies the heap condition everywhere except possibly at the root. The same operation can be used to fix up the heap after the value in any position is lowered. It may be implemented as follows: 134 CHAPTER 11 procedure downheap(k: integer) label i, j, v: integer; begin v:=a[k]; while N div 2 do begin j:=k+k; if j<N then if then if then k:=j; end; end This procedure moves down the heap (starting from position k), exchanging the node at position j with the larger of its two sons if necessary, stopping when j is larger than both sons or the bottom is reached. As above, a full exchange is not needed because v is always involved in the exchanges. The inner loop in this program is an example of a loop which really has two distinct exits: one for the case that the bottom of the heap is hit (as in the first example above), and another for the case that the heap condition is satisfied somewhere in the interior of the heap. Now the implementation of the remove operation is simple: function remove: integer; begin N:=N-1; end ; The return value is set from then the element from a [N] is put into and the size of the heap decremented, leaving only a call to to fix up the heap condition everywhere. The implementation of the replace operation is only slightly more com- plicated: PRIORITY QUEUES 135 function begin downheap( ; end This code uses a[O] in an artificial way: its sons are 0 (itself) and 1, so if v is larger than the largest element in the heap, the heap is not touched; otherwise v is put into the heap and returned The delete operation for an arbitrary element from the heap and the change operation can also be implemented by using a simple combination of the methods above. For example, if the priority of the element at position k is raised, then can be called, and if it is lowered then does the job. On the other hand, the join operation is far more difficult and seems to require a much more sophisticated data structure. All of the basic operations insert, remove, replace, and upheup), delete, and change involve moving along a path between the root and the tom of the heap, which includes no more than about log N elements for a heap of size N. Thus the running times of the above programs are logarithmic. An elegant and efficient sorting method can be defined from the basic opera- tions on heaps outlined above. This method, called Heapsort, uses no extra memory and is guaranteed to sort M elements in about Mlog M steps no matter what the input. Unfortunately, its inner loop is quite a bit longer than the inner loop of Quicksort, and it is about twice as slow as Quicksort on the average. The idea is simply to build a heap containing the elements to be sorted and then to remove them all in order. In this section, N will continue to be the size of the heap, so we will use M for the number of elements to be sorted. One way to sort is to implement the construct operation by doing M insert operations, as in the first two lines of the following code, then do M remove operations, putting the element removed into the place just vacated by the shrinking heap: for to M do for k:=M do a[k]:=remove; 136 CHAPTER 11 This code breaks all the rules of abstract data structures by assuming a par- ticular representation for the priority queue (during each loop, the priority queue resides in . . . , a[k-1]), but it is reasonable to do this here because we are implementing a sort, not a priority queue. The priority queue proce- dures are being used only for descriptive purposes: in an actual implementa- tion of the sort, we would simply use the code from the procedures to avoid doing so many unnecessary procedure calls. It is actually a little better to build the heap by going backwards through it, making little heaps from the bottom up. Note that every position in the array is the root of a small heap, and will work equally well for such small heaps as for the big heap. Also, we’ve noted that remove can be implemented by exchanging the first and last elements, decrementing and calling This leads to the following implementation of Heapsort: procedure heapsort; var k, integer; begin N:=M; for k:=M div 2 do downheap( repeat until = end The first two lines of this code constitute an implementation of integer) to build a heap of M elements. (The keys in a[ (M div each form heaps of one element, so they trivially satisfy the heap condition and don’t need to be checked.) It is interesting to note that, though the loops in this program seem to do very different things, they can be built around the same fundamental procedure. The following table shows the contents of each heap operated on by for our sorting example, just after has made the heap condition hold everywhere. PRIORITY QUEUES 137 12345678 9 10 11 12 13 14 15 ASORTINGEXAMPLE N L E P M I X T A R G E P 0 N M I L E X R T G E S A XTPRSONGEAAMILE TSPREONGEAAMIL SRPLEONGEAAMI R L P IEONGEAAM P L 0 IEMNGEAA OLNIEMAGEA NLMIEAAGE M L E IEAAG L IEGEAA IGEAEA G E E A A E A E A E A A A A A AAEEGILMNOPRSTX As mentioned above, the primary reason that is of practical interest is that the number of steps required to sort M elements is guaranteed to be proportional to M log M, no matter what the input. Unlike the other methods that we’ve seen, there is no “worst-case” input that will make sort run slower. The proof of this is simple: we make about calls to (about to construct the heap and M for the sort), each of which examines less than log M heap elements, since the heap never has more than M elements. Actually, the above proof uses an overestimate. In fact, it can be proven that the construction process takes linear time since so many small heaps are processed. This is not of particular importance to Heapsort, since this time is still dominated by the M log M time for sorting, but it is important for other priority queue applications, where a linear time construct can lead to a linear time algorithm. Note that constructing a heap with M successive inserts requires M log M steps in the worst case (though it turns out to be linear on the average). 138 11 Indirect Heaps For many applications of priority queues, we don’t want the records moved around at all. Instead, we want the priority queue routine to tell us which of the records is the largest, etc., instead of returning values. This is akin to the “indirect sort” or the “pointer sort” concept described at the begin- ning of Chapter 8. Modification of the above programs to work in this way is straightforward, though sometimes confusing. It will be worthwhile to ex- amine this in more detail here because it is so convenient to use heaps in this way. Specifically, instead of rearranging the keys in the array a the priority queue routines will work with an array heap of indices into the array a, such that a[heap[k]] is the key of the kth element of the heap, for k between 1 and N. Moreover, we want to maintain another array inv which keeps the heap position of the kth array element. Thus the inv entry for the largest element in the array is etc. For example, if we wished to change the value of a[k] we could find its heap position in for use by or downheap. The following table gives the values in these arrays for our sample heap: k: 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 a[k]:A S 0 R T I N G E X A M P L E 5 13 4 2 3 7 8 9 1 11 12 6 14 15 a[heap[k]]: X T P R S 0 N G E A A M I L E a[k]: A S 0 R T INGEXAMPLE 10 5 6 4 2 13 7 8 9 1 11 12 3 14 15 Note that for all k from 1 to N. We start with for k from 1 to which indicates that no rearrangement has been done. The code for heap construction looks much the same as before: procedure pqconstruct; integer; begin N:=M; for to N do begin heap[k] inv[k] :=k end; for div 2 do pqdownheap(k) ; end PRIORITY QUEUES 139 We’ll prefix implementations of priority queue routines based on indirect heaps with for when they are used in later chapters. Now, to modify to work indirectly, we need only examine the places where it references a. Where it did a comparison before, it must now access a indirectly through heap. Where it did a move before, it must now make the move in heap, not a, and it must modify inv accordingly. This leads to the following implementation: procedure pqdownheap(k: integer); label j, v: integer; v:=heap[k]; while N div 2 do begin j:=k+k; if j<N then if then if then heap [k] :=heap ; k:=j; end; inv[v] :=k end The other procedures given above can be modified in a similar fashion to implement “pqchange,” etc. A similar indirect implementation can be developed based on maintaining heap as an array of pointers to separately allocated records. In this case, a little more work is required to implement the function of inv (find the heap position, given the record). Advanced Implementations If the join operation must be done efficiently, then the implementations that we have done so far are insufficient and more advanced techniques are needed. Although we don’t have space here to go into the details of such methods, we can discuss some of the considerations that go into their design. By “efficiently,” we mean that a join should be done in about the same time as the other operations. This immediately rules out the repre- sentation for heaps that we have been using, since two large heaps can be joined only by moving all the elements in at least one of them to a large array. It is easy to translate the algorithms we have been examining to use linked representations; in fact, sometimes there are other reasons for doing 140 CHAPTER 11 so (for example, it might be inconvenient to have a large contiguous array). In a direct linked representation, links would have to be kept in each node pointing to the father and both sons. It turns out that the heap condition itself seems to be too strong to allow efficient implementation of the join operation. The advanced data structures designed to solve this problem all weaken either the heap or the balance condition in order to gain the flexibility needed for the join. These structures allow all the operations be completed in logarithmic time. PRIORITY QUEUES 141 Exercises 1. 2. 3. 4. 5. 6. 7. 8. 9. 10. Draw the heap that results when the following operations are performed on an intitially empty heap: remove, insert(3). Is a file in reverse sorted order a heap? Give the heap constructed by successive application of insert on the keys EASYQUESTION. Which positions could be occupied by the 3rd largest key in a heap of size Which positions could not be occupied by the 3rd smallest key in a heap of size Why not use a sentinel to avoid the j<N test in downheap? Show how to obtain the functions of stacks and normal queues as special cases of priority queues. What is the minimum number of keys that must be moved during a remove the largest operation in a heap? Draw a heap of size 15 for which the minimum is achieved. Write a program to delete the element at d in a heap. Empirically compare the two methods of heap construction described in the text, by building heaps with 1000 random keys. Give the contents of inv after pqconstruct is used on the keys E A S Y Q UESTION. . elements in at least one of them to a large array. It is easy to translate the algorithms we have been examining to use linked representations; in fact, sometimes