Rule 2 (the root is always black). Don't worry about this yet. Instead, rotate the other way. Position the red arrow on 25, which is now the root (the arrow should already point to 25 after the previous rotation). Click the RoL button to rotate left. The nodes will return to the position of Figure 9.4.

Experiment 3

Start with the position of Figure 9.4, with nodes 25 and 75 inserted in addition to 50 in the root position. Note that the parent (the root) is black and both its children are red. Now try to insert another node. No matter what value you use, you'll see the message Can't Insert: Needs color flip.

As we mentioned, a color flip is necessary whenever, during the insertion process, a black node with two red children is encountered.

The red arrow should already be positioned on the black parent (the root node), so click the Flip button. The root's two children change from red to black. Ordinarily the parent would change from black to red, but this is a special case because it's the root: it remains black to avoid violating Rule 2. Now all three nodes are black. The tree is still red-black correct.

Now click the Ins button again to insert the new node. Figure 9.6 shows the result if the newly inserted node has the key value 12.

The tree is still red-black correct. The root is black, there's no situation in which a parent and child are both red, and all the paths have the same number of black nodes (2). Adding the new red node didn't change the red-black correctness.

Experiment 4

Now let's see what happens when you try to do something that leads to an unbalanced tree. In Figure 9.6 one path has one more node than the other. This isn't very unbalanced, and no red-black rules are violated, so neither we nor the red-black algorithms need to worry about it. However, suppose that one path differs from another by two or more levels (where level is the same as the number of nodes along the path).
In the tree of Figure 9.6, any insertion that does this violates the red-black rules, and we'll need to rebalance the tree.

Figure 9.6: Colors flipped, new node inserted

Insert a 6 into the tree of Figure 9.6. You'll see the message Error: parent and child are both red. Rule 3 has been violated, as shown in Figure 9.7.

Figure 9.7: Parent and child are both red

How can we fix things so Rule 3 isn't violated? An obvious approach is to change one of the offending nodes to black. Let's try changing the child node, 6. Position the red arrow on it and press the R/B button. The node becomes black.

The good news is we fixed the problem of both parent and child being red. The bad news is that now the message says Error: Black heights differ. The path from the root to node 6 has three black nodes in it, while the path from the root to node 75 has only two. Thus Rule 4 is violated. It seems we can't win.

This problem can be fixed with a rotation and some color changes. How to do this will be the topic of later sections.

More Experiments

Experiment with the RBTree Workshop applet on your own. Insert more nodes and see what happens. See if you can use rotations and color changes to achieve a balanced tree. Does keeping the tree red-black correct seem to guarantee an (almost) balanced tree?

Try inserting ascending keys (50, 60, 70, 80, 90) and then restart with the Start button and try descending keys (50, 40, 30, 20, 10). Ignore the messages; we'll see what they mean later. These are the situations that get the ordinary binary search tree into trouble. Can you still balance the tree?

The Red-Black Rules and Balanced Trees

Try to create a tree in which one path is more than twice as long as another but which is still red-black correct. As it turns out, this is impossible. That's why the red-black rules keep the tree (almost) balanced. Rule 4 gives every path from the root to a leaf or null child the same number of black nodes, and Rule 3 forbids two adjacent red nodes, so a path can contain at most one red node for each of its black nodes. If one path is more than twice as long as another, then it must either have more black nodes, violating Rule 4, or it must have two adjacent red nodes, violating Rule 3.
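The checks the applet reports in Experiment 4 can also be written down as a small program. What follows is only a sketch under stated assumptions: the class RedBlackCheck, its Node type, and every method name are hypothetical and are not taken from the Workshop applet. It tests Rule 2, Rule 3, and Rule 4 (counting black nodes on every path down to a null child) and replays the trees of Figures 9.6 and 9.7.

```java
// Sketch only: RedBlackCheck, Node, and all method names are hypothetical.
final class RedBlackCheck {
    static final class Node {
        final int key;
        boolean red;
        Node left, right;
        Node(int key, boolean red) { this.key = key; this.red = red; }
    }

    static boolean isRed(Node n) { return n != null && n.red; }

    // Returns the number of black nodes on every path from n down to a
    // null child, or -1 if Rule 3 or Rule 4 is violated at or below n.
    static int blackCount(Node n) {
        if (n == null) return 0;                        // a null child adds no black nodes
        if (n.red && (isRed(n.left) || isRed(n.right)))
            return -1;                                  // Rule 3: red parent with a red child
        int left = blackCount(n.left);
        int right = blackCount(n.right);
        if (left < 0 || right < 0 || left != right)
            return -1;                                  // Rule 4: black counts differ
        return left + (n.red ? 0 : 1);
    }

    static boolean isRedBlackCorrect(Node root) {
        if (root == null) return true;                  // an empty tree breaks no rule
        if (root.red) return false;                     // Rule 2: the root must be black
        return blackCount(root) >= 0;
    }

    // Builds the tree of Figure 9.6: 50, 25, and 75 are black; 12 is red.
    static Node figure96() {
        Node root = new Node(50, false);
        root.left = new Node(25, false);
        root.right = new Node(75, false);
        root.left.left = new Node(12, true);
        return root;
    }

    public static void main(String[] args) {
        Node tree = figure96();
        System.out.println(isRedBlackCorrect(tree));    // true
        tree.left.left.left = new Node(6, true);        // Figure 9.7: red 6 under red 12
        System.out.println(isRedBlackCorrect(tree));    // false (Rule 3)
        tree.left.left.left.red = false;                // recolor 6 black
        System.out.println(isRedBlackCorrect(tree));    // false (Rule 4)
    }
}
```

Recoloring node 6 removes the red-red conflict but leaves the black counts unequal, which matches the Error: Black heights differ message described above.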
Convince yourself that this is true by experimenting with the applet.

Null Children

Remember that Rule 4 specifies all paths that go from the root to any leaf or to any null children must have the same number of black nodes. A null child is a child that a non-leaf node might have, but doesn't. Thus in Figure 9.8 the path from 50 to 25 to the right child of 25 (its null child) has only one black node, which is not the same as the paths to 6 and 75, which have 2. This arrangement violates Rule 4, although both paths to leaf nodes have the same number of black nodes.

Figure 9.8: Path to a null child

The term black height is used to describe the number of black nodes between a given node and the root. In Figure 9.8 the black height of 50 is 1, of 25 is still 1, of 12 is 2, and so on.

Rotations

To balance a tree, it's necessary to physically rearrange the nodes. If all the nodes are on the left of the root, for example, you need to move some of them over to the right side. This is done using rotations. In this section we'll learn what rotations are and how to execute them.

Rotations are ways to rearrange nodes. They were designed to do the following two things:

• Raise some nodes and lower others to help balance the tree.
• Ensure that the characteristics of a binary search tree are not violated.

Recall that in a binary search tree the left children of any node have key values less than the node, while its right children have key values greater than or equal to the node. If the rotation didn't maintain a valid binary search tree it wouldn't be of much use, because the search algorithm, as we saw in the last chapter, relies on the search-tree arrangement.

Note that color rules and node color changes are used only to help decide when to perform a rotation; fiddling with the colors doesn't accomplish anything by itself; it's the rotation that's the heavy hitter.
Color rules are like rules of thumb for building a house (such as "exterior doors open inward"), while rotations are like the hammering and sawing needed to actually build it.

Simple Rotations

In Experiment 2 we tried rotations to the left and right. These rotations were easy to visualize because they involved only three nodes. Let's clarify some aspects of this process.

What's Rotating?

The term rotation can be a little misleading. The nodes themselves aren't rotated; the relationship between them changes. One node is chosen as the "top" of the rotation. If we're doing a right rotation, this "top" node will move down and to the right, into the position of its right child. Its left child will move up to take its place.

Remember that the top node isn't the "center" of the rotation. If we talk about a car tire, the top node doesn't correspond to the axle or the hubcap; it's more like the topmost part of the tire tread.

The rotation we described in Experiment 2 was performed with the root as the top node, but of course any node can be the top node in a rotation, provided it has the appropriate child.

Mind the Children

You must be sure that, if you're doing a right rotation, the top node has a left child. Otherwise there's nothing to rotate into the top spot. Similarly, if you're doing a left rotation, the top node must have a right child.

The Weird Crossover Node

Rotations can be more complicated than the three-node example we've discussed so far. Click Start, and then, with 50 already at the root, insert nodes with the following values, in this order: 25, 75, 12, 37.

When you try to insert the 12, you'll see the Can't insert: needs color flip message. Just click the Flip button. The parent and children change color. Then press Ins again to complete the insertion of the 12. Finally insert the 37. The resulting arrangement is shown in Figure 9.9a.

Figure 9.9: Rotation with crossover node

Now we'll try a rotation. Place the arrow on the root (don't forget this!)
and press the RoR button. All the nodes move. The 12 follows the 25 up, and the 50 follows the 75 down.

But what's this? The 37 has detached itself from the 25, whose right child it was, and become instead the left child of 50. Some nodes go up, some nodes go down, but the 37 moves across. The result is shown in Figure 9.9b. The rotation has caused a violation of Rule 4; we'll see how to fix this later.

In the original position of Figure 9.9a, the 37 is called an inside grandchild of the top node, 50. (The 12 is an outside grandchild.) The inside grandchild, if it's the child of the node that's going up (which is the left child of the top node in a right rotation), is always disconnected from its parent and reconnected to its former grandparent. It's like becoming your own uncle (although it's best not to dwell too long on this analogy).

Subtrees on the Move

We've shown individual nodes changing position during a rotation, but entire subtrees can move as well. To see this, click Start to put 50 at the root, and then insert the following sequence of nodes in order: 25, 75, 12, 37, 62, 87, 6, 18, 31, 43. Click Flip whenever you can't complete an insertion because of the Can't insert: needs color flip message. The resulting arrangement is shown in Figure 9.10a.

Figure 9.10: Subtree motion during rotation

Position the arrow on the root, 50. Now press RoR. Wow! (Or is it WoW?) A lot of nodes have changed position. The result is shown in Figure 9.10b. Here's what happens:

• The top node (50) goes to the position of its right child.
• The top node's left child (25) goes to the top.
• The entire subtree of which 12 is the root moves up.
• The entire subtree of which 37 is the root moves across to become the left child of 50.
• The entire subtree of which 75 is the root moves down.

You'll see the Error: root must be black message but you can ignore it for the time being. You can flip back and forth by alternately pressing RoR and RoL with the arrow on the top node.
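The movements in the list above come down to two reference changes. Here is a minimal sketch, assuming a bare node type with no color or parent field; RotationSketch and its methods are hypothetical names, not the applet's code. The rotation returns the node that rises, and the caller is expected to re-attach it where the old top node was.

```java
// Sketch only: RotationSketch and its members are hypothetical names.
final class RotationSketch {
    static final class Node {
        final int key;
        Node left, right;
        Node(int key) { this.key = key; }
    }

    // Ordinary binary search tree insertion (colors ignored in this sketch).
    static Node insert(Node root, int key) {
        if (root == null) return new Node(key);
        if (key < root.key) root.left = insert(root.left, key);
        else root.right = insert(root.right, key);
        return root;
    }

    // Right rotation with `top` as the top node; returns the node that
    // takes its place.
    static Node rotateRight(Node top) {
        if (top == null || top.left == null)
            throw new IllegalArgumentException("right rotation needs a left child");
        Node riser = top.left;     // the left child moves up
        top.left = riser.right;    // the inside grandchild's subtree moves across
        riser.right = top;         // the old top moves down to the right
        return riser;
    }

    // Left rotation: the mirror image of rotateRight.
    static Node rotateLeft(Node top) {
        if (top == null || top.right == null)
            throw new IllegalArgumentException("left rotation needs a right child");
        Node riser = top.right;
        top.right = riser.left;
        riser.left = top;
        return riser;
    }

    public static void main(String[] args) {
        Node root = null;
        for (int key : new int[] {50, 25, 75, 12, 37}) root = insert(root, key);
        root = rotateRight(root);
        // New root, its right child, and the crossover node: 25 50 37
        System.out.println(root.key + " " + root.right.key + " " + root.right.left.key);
    }
}
```

Because only two links change, everything below 12, 37, and 75 travels with those nodes untouched. Calling rotateRight and then rotateLeft on the returned node restores the original arrangement, which corresponds to pressing RoR and then RoL in the applet.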
Do this and watch what happens to the subtrees, especially the one with 37 as its root.

The figures show the subtrees encircled by dotted triangles. Note that the relations of the nodes within each subtree are unaffected by the rotation. The entire subtree moves as a unit. The subtrees can be larger (have more descendants) than the three nodes we show in this example. No matter how many nodes there are in a subtree, they will all move together during a rotation.

Human Beings Versus Computers

This is pretty much all you need to know about what a rotation does. To cause a rotation, you position the arrow on the top node, then press RoR or RoL. Of course, in a real red-black tree insertion algorithm, rotations happen under program control, without human intervention.

Notice however that, in your capacity as a human being, you could probably balance any tree just by looking at it and performing appropriate rotations. Whenever a node has a lot of left descendants and not too many right ones, you rotate it right, and vice versa.

Unfortunately, computers aren't very good at "just looking" at a pattern. They work better if they can follow a few simple rules. That's what the red-black scheme provides, in the form of color coding and the four color rules.

Inserting a New Node

Now you have enough background to see how a red-black tree's insertion routine uses rotations and the color rules to maintain the tree's balance.

Preview

We're going to briefly preview our approach to describing the insertion process. Don't worry if things aren't completely clear in the preview; we'll discuss things in more detail in a moment.

In the discussion that follows we'll use X, P, and G to designate a pattern of related nodes. X is a node that has caused a rule violation. (Sometimes X refers to a newly inserted node, and sometimes to the child node when a parent and child have a red-red conflict.)

• X is a particular node.
• P is the parent of X.
• G is the grandparent of X (the parent of P).

On the way down the tree to find the insertion point, you perform a color flip whenever you find a black node with two red children. Sometimes the flip causes a red-red conflict (a violation of Rule 3). Call the red child X and the red parent P. The conflict can be fixed with a single rotation or a double rotation, depending on whether X is an outside or inside grandchild of G. Following color flips and rotations, you continue down to the insertion point and insert the new node.

After you've inserted the new node X, if P is black you simply attach the new red node. If P is red, there are two possibilities: X can be an outside or inside grandchild of G. You perform two color changes (we'll see what they are in a moment). If X is an outside grandchild, you perform one rotation, and if it's an inside grandchild you perform two. This restores the tree to a balanced state.

Now we'll recapitulate this preview in more detail. We'll divide the discussion into three parts, arranged in order of complexity:

1. Color flips on the way down
2. Rotations once the node is inserted
3. Rotations on the way down

If we were discussing these three parts in strict chronological order, we'd examine part 3 before part 2. However, it's easier to talk about rotations at the bottom of the tree than in the middle, and operations 1 and 2 are encountered more frequently than operation 3, so we'll discuss 2 before 3.

Color Flips on the Way Down

The insertion routine in a red-black tree starts off doing essentially the same thing it does in an ordinary binary search tree: It follows a path from the root to the place where the node should be inserted, going left or right at each node depending on the relative size of the node's key and the search key.

However, in a red-black tree, getting to the insertion point is complicated by color flips and rotations.
We introduced color flips in Experiment 3; now we'll look at them in more detail.

Imagine the insertion routine proceeding down the tree, going left or right at each node, searching for the place to insert a new node. To make sure the color rules aren't broken, it needs to perform color flips when necessary. Here's the rule: Every time the insertion routine encounters a black node that has two red children, it must change the children to black and the parent to red (unless the parent is the root, which always remains black).

Figure 9.11: Color flip

How does a color flip affect the red-black rules? For convenience, let's call the node at the top of the triangle, the one that's black before the flip, P for parent. We'll call P's left and right children X1 and X2. This is shown in Figure 9.11a.

Black Heights Unchanged

Figure 9.11b shows the nodes after the color flip. The flip leaves unchanged the number of black nodes on the path from the root on down through P to the leaf or null nodes. All such paths go through P, and then through either X1 or X2. Before the flip, only P is black, so the triangle (consisting of P, X1, and X2) adds one black node to each of these paths.

After the flip, P is no longer black, but both X1 and X2 are, so again the triangle contributes one black node to every path that passes through it. So a color flip can't cause Rule 4 to be violated.

Color flips are helpful because they make red leaf nodes into black leaf nodes. This makes it easier to attach new red nodes without violating Rule 3.

Could Be Two Reds

Although Rule 4 is not violated by a color flip, Rule 3 (a node and its parent can't both be red) may be. If the parent of P is black, there's no problem when P is changed from black to red. However, if the parent of P is red, then, after the color change, we'll have two reds in a row. This needs to be fixed before we continue down the path to insert the new node. We can correct the situation with a rotation, as we'll soon see.
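The color-flip rule can be sketched in a few lines of code. This is an illustration under assumptions, not the book's listing: ColorFlipSketch, its Node type with a parent reference, and the helper link() are all hypothetical. The flip method reports whether it left P and P's parent both red, which is the situation that has to be repaired before continuing down the tree.

```java
// Sketch only: ColorFlipSketch and its members are hypothetical names.
final class ColorFlipSketch {
    static final class Node {
        final int key;
        boolean red;
        Node left, right, parent;
        Node(int key, boolean red) { this.key = key; this.red = red; }
    }

    static boolean isRed(Node n) { return n != null && n.red; }

    // Attaches child under parent on the side given by its key.
    static Node link(Node parent, Node child) {
        child.parent = parent;
        if (child.key < parent.key) parent.left = child; else parent.right = child;
        return child;
    }

    // True when p is a black node with two red children.
    static boolean needsFlip(Node p) {
        return p != null && !p.red && isRed(p.left) && isRed(p.right);
    }

    // Performs the flip on p. Returns true if p and its parent are now
    // both red (a Rule 3 conflict that still has to be fixed).
    static boolean colorFlip(Node p) {
        if (!needsFlip(p))
            throw new IllegalArgumentException("not a black node with two red children");
        p.left.red = false;
        p.right.red = false;
        if (p.parent != null) p.red = true;   // a node with no parent is the root: it stays black
        return p.red && isRed(p.parent);
    }

    public static void main(String[] args) {
        Node root = new Node(50, false);
        link(root, new Node(25, true));
        link(root, new Node(75, true));
        System.out.println(colorFlip(root));  // false: the root stays black, no conflict
        System.out.println(root.red + " " + root.left.red + " " + root.right.red); // false false false
    }
}
```

The black count through the triangle is the same before and after the call, except at the root, where it grows by one on every path at once.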
The Root Situation

What about the root? Remember that a color flip of the root and its two children leaves the root, as well as its children, black. This avoids violating Rule 2. Does this affect the other red-black rules? Clearly there are no red-to-red conflicts, because we've made more nodes black and none red. Thus, Rule 3 isn't violated. Also, because the root and one or the other of its two children are in every path, the black height of every path is increased the same amount; that is, by 1. Thus, Rule 4 isn't violated either.

Finally, Just Insert It

Once you've worked your way down to the appropriate place in the tree, performing color flips (and rotations) if necessary on the way down, you can then insert the new node as described in the last chapter for an ordinary binary search tree. However, that's not the end of the story.

Rotations Once the Node is Inserted

The insertion of the new node may cause the red-black rules to be violated. Therefore, following the insertion, we must check for rule violations and take appropriate steps.

Remember that, as described earlier, the newly inserted node, which we'll call X, is always red. X may be located in various positions relative to P and G, as shown in Figure 9.12.

Figure 9.12: Handed variations of node being inserted

Remember that a node X is an outside grandchild if it's on the same side of its parent P that P is of its parent G. That is, X is an outside grandchild if either it's a left child of P and P is a left child of G, or it's a right child of P and P is a right child of G. Conversely, X is an inside grandchild if it's on the opposite side of its parent P that P is of its parent G.

If X is an outside grandchild, it may be either the left or right child of P, depending on whether P is the left or right child of G. Two similar possibilities exist if X is an inside grandchild. It's these four situations that are shown in Figure 9.12.
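The outside/inside test reduces to comparing two "which side am I on" answers. A minimal sketch follows, assuming each node carries a parent reference; GrandchildSketch, Position, classify(), and attach() are hypothetical names introduced only for this illustration.

```java
// Sketch only: GrandchildSketch and its members are hypothetical names.
final class GrandchildSketch {
    enum Position { OUTSIDE, INSIDE }

    static final class Node {
        final int key;
        Node left, right, parent;
        Node(int key) { this.key = key; }
    }

    // Creates a node with the given key and hangs it under parent on the
    // side a binary search tree would choose.
    static Node attach(Node parent, int key) {
        Node child = new Node(key);
        child.parent = parent;
        if (key < parent.key) parent.left = child; else parent.right = child;
        return child;
    }

    // Classifies X relative to its parent P and grandparent G.
    static Position classify(Node x) {
        Node p = x.parent;
        Node g = (p == null) ? null : p.parent;
        if (g == null) throw new IllegalArgumentException("X has no grandparent");
        boolean xIsLeftChild = (p.left == x);
        boolean pIsLeftChild = (g.left == p);
        // Same side twice means outside; opposite sides mean inside.
        return (xIsLeftChild == pIsLeftChild) ? Position.OUTSIDE : Position.INSIDE;
    }

    public static void main(String[] args) {
        Node g = new Node(25);
        Node p = attach(g, 12);
        System.out.println(classify(attach(p, 6)));   // OUTSIDE (left child of a left child)
        System.out.println(classify(attach(p, 18)));  // INSIDE  (right child of a left child)
    }
}
```

A single boolean comparison covers all four handed variations, which is one way to keep the left and right cases from being written out twice.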
This multiplicity of what we might call "handed" (left or right) variations is one reason the red-black insertion routine is challenging to program.

The action we take to restore the red-black rules is determined by the colors and configuration of X and its relatives. Perhaps surprisingly, there are only three major ways in which nodes can be arranged (not counting the handed variations already mentioned). Each possibility must be dealt with in a different way to preserve red-black correctness and thereby lead to a balanced tree. We'll list the three possibilities briefly, then discuss each one in detail in its own section. Figure 9.13 shows what they look like. Remember that X is always red.

Figure 9.13: Three post-insertion possibilities

1. P is black.
2. P is red and X is an outside grandchild of G.
3. P is red and X is an inside grandchild of G.

It might seem that this list doesn't cover all the possibilities. We'll return to this question after we've explored these three.

Possibility 1: P Is Black

If P is black, we get a free ride. The node we've just inserted is always red. If its parent is black, there's no red-to-red conflict (Rule 3), and no addition to the number of black nodes (Rule 4). Thus no color rules are violated. We don't need to do anything else. The insertion is complete.

Possibility 2: P Is Red, X Is Outside

If P is red and X is an outside grandchild, we need a single rotation and some color changes. Let's set this up with the Workshop applet so we can see what we're talking about. Start with the usual 50 at the root, and insert 25, 75, and 12. You'll need to do a color flip before you insert the 12.

Now insert 6, which is X, the new node. Figure 9.14a shows how this looks. The message on the Workshop applet says Error: parent and child both red, so we know we need to take some action.
Figure 9.14: P is red, X is an outside grandchild

In this situation, we can take three steps to restore red-black correctness and thereby balance the tree. Here are the steps:

1. Switch the color of X's grandparent G (25 in this example).
2. Switch the color of X's parent P (12).
3. Rotate with X's grandparent G (25) at the top, in the direction that raises X (6). This is a right rotation in the example.

As you've learned, to switch colors, put the arrow on the node and press the R/B button. To rotate right, put the arrow on the top node and press RoR. When you've completed the three steps, the Workshop applet will inform you that the Tree is red/black correct. It's also more balanced than it was, as shown in Figure 9.14b.

In this example, X was an outside grandchild and a left child. There's a symmetrical situation when X is an outside grandchild but a right child. Try this by creating the tree 50, 25, 75, 87, 93 (with color flips when necessary). Fix it by changing the colors of 75 and 87, and rotating left with 75 at the top. Again the tree is balanced.

Possibility 3: P Is Red and X Is Inside

If P is red and X is an inside grandchild, we need two rotations and some color changes. To see this one in action, use the Workshop applet to create the tree 50, 25, 75, 12, 18. (Again you'll need a color flip before you insert the 12.) The result is shown in Figure 9.15a.

[...]

... contain only one or two data items instead of three. Also, notice that the tree is balanced. It retains its balance even if you insert a sequence of data in ascending (or descending) order. The 2-3-4 tree's self-balancing capability results from the way new data items are inserted, as we'll see in a moment.

Searching

Finding a data item with a particular key is similar to the search routine in a binary...
particular key; insert a new item into the node, moving existing items if necessary; and remove an item, again moving existing items if necessary. Don't confuse these methods with the find() and insert() routines in the Tree234 class, which we'll look at next.

A display routine displays a node with slashes separating the data items, like /27/56/89/, /14/66/, or /45/.

Don't forget that in Java, references...

to insert a new data item, and f to find an existing item. Here's some sample interaction:

Enter first letter of show, insert, or find: s
level=0 child=0 /50/
level=1 child=0 /30/40/
level=1 child=1 /60/70/
Enter first letter of show, insert, or find: f
Enter value to find: 40
Found 40
Enter first letter of show, insert, or find: i
Enter value to insert: 20
Enter first letter of show, insert, or find:...

the new data item is simply inserted into it. Figure 10.4 shows a data item with key 18 being inserted into a 2-3-4 tree.

Figure 10.4: Insertion with no splits

Insertion may involve moving one or two other items in a node so the keys will be in the correct order after the new item is inserted. In this example the 23 had to be shifted right to make room for the 18.

Node Splits

Insertion becomes more complicated...

increased by one.

Figure 10.6: Splitting the root

Another way to describe splitting the root is to say that a 4-node is split into three 2-nodes.

Following a node split, the search for the insertion point continues down the tree. In Figure 10.6, the data item with a key of 41 is inserted into the appropriate leaf.

Figure 10.7: Insertions into a 2-3-4 tree

Splitting on the Way Down

Notice that, because...
operations involved in inserting a node: making rotations on the way down to the insertion point. As we noted, although we're discussing this last, it actually takes place before the node is inserted. We've waited until now to discuss it only because it was easier to explain rotations for a just-installed node than for nodes in the middle of the tree.

During the discussion of color flips during the insertion...

starts at 0 on the left.) You don't find the data item in this node either, so you must go to the next child. Here, because 64 is greater than 60 but less than 70, you go again to child 1. This time you find the specified item in the 62/64/66 link.

Insertion

New data items are always inserted in leaves, which are on the bottom row of the tree. If items were inserted in nodes with children, then the...

should go to next.

Inserting

The insert() method starts with code similar to find(), except that if it finds a full node it splits it. Also, it assumes it can't fail; it keeps looking, going to deeper and deeper levels, until it finds a leaf node. At this point it inserts the new data item into the leaf. (There is always room in the leaf, otherwise the leaf would have been split.)

Splitting

The split() method...

items being inserted, 20 and 10. The second of these caused a node (the root's child 0) to split. Figure 10.12 depicts the tree that results from these insertions, following the final press of the s key.

Listing for tree234.java

Listing 10.1 shows the complete tree234.java program, including all the classes just discussed. As with most object-oriented programs, it's probably easiest to start by reexamining...

being split is not the root; we'll examine splitting the root later.)
• A new, empty node is created. It's a sibling of the node being split, and is placed to its right.
• Data item C is moved into the new node.
• Data item B is moved into the parent of the node being split.
• Data item A remains where it is.
• The rightmost two children are disconnected from the node being split and connected to the new node.
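The split steps listed above can be sketched for the non-root case as follows. This is not the book's tree234.java listing, which is not reproduced here; Split234Sketch, its Node layout (three item slots, four child slots), and every method name are assumptions made for illustration. The sketch also assumes the parent has room for item B, which holds when full nodes are split on the way down.

```java
// Sketch only: Split234Sketch and its members are hypothetical names.
final class Split234Sketch {
    static final class Node {
        final int[] items = new int[3];      // up to three data items: A, B, C
        final Node[] children = new Node[4]; // up to four children
        int numItems;
        Node parent;

        boolean isFull() { return numItems == 3; }

        // Inserts key in sorted position, shifting larger items right.
        // Returns the index where the key was placed.
        int insertItem(int key) {
            int i = numItems;
            while (i > 0 && items[i - 1] > key) {
                items[i] = items[i - 1];
                i--;
            }
            items[i] = key;
            numItems++;
            return i;
        }

        void connectChild(int index, Node child) {
            children[index] = child;
            if (child != null) child.parent = this;
        }

        // Shows the items separated by slashes, for example /30/50/.
        String display() {
            StringBuilder sb = new StringBuilder("/");
            for (int i = 0; i < numItems; i++) sb.append(items[i]).append('/');
            return sb.toString();
        }
    }

    // Splits a full node that is not the root; returns the new right sibling.
    static Node split(Node node) {
        Node parent = node.parent;
        if (parent == null || !node.isFull() || parent.isFull())
            throw new IllegalArgumentException("needs a full non-root node with a non-full parent");
        int b = node.items[1];
        int c = node.items[2];
        Node child2 = node.children[2];
        Node child3 = node.children[3];
        node.children[2] = null;             // disconnect the rightmost two children
        node.children[3] = null;
        node.numItems = 1;                   // item A remains where it is

        Node right = new Node();             // new, empty sibling
        right.insertItem(c);                 // item C moves into it
        right.connectChild(0, child2);
        right.connectChild(1, child3);

        int pos = parent.insertItem(b);      // item B moves up into the parent
        for (int j = parent.numItems; j > pos + 1; j--)
            parent.children[j] = parent.children[j - 1];   // make room for the sibling
        parent.connectChild(pos + 1, right); // place it just right of the split node
        return right;
    }

    public static void main(String[] args) {
        Node parent = new Node();
        parent.insertItem(50);
        Node full = new Node();
        for (int key : new int[] {20, 30, 40}) full.insertItem(key);
        Node other = new Node();
        other.insertItem(70);
        parent.connectChild(0, full);
        parent.connectChild(1, other);
        Node right = split(full);
        System.out.println(parent.display() + " " + full.display() + " " + right.display()); // /30/50/ /20/ /40/
    }
}
```

Splitting the root needs one extra step (creating a new root above the node), which this sketch leaves out because the steps above exclude that case.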