Algorithms (Part 36)

CHAPTER 26: RANGE SEARCHING

... adapting to the point set at hand.

2D Trees

Two-dimensional trees are dynamic, adaptable data structures which are very similar to binary trees but divide up a geometric space in a manner convenient for use in range searching and other problems. The idea is to build binary search trees with points in the nodes, using the y and x coordinates of the points as keys in a strictly alternating sequence.

The same algorithm is used for inserting points into 2D trees as for normal binary search trees, except that at the root we use the y coordinate (if the point to be inserted has a smaller y coordinate than the point at the root, go left; otherwise go right), then at the next level we use the x coordinate, then at the next level the y coordinate, etc., alternating until an external node is encountered. For example, the following 2D tree is built for our sample set of points:

[Figure: 2D tree for the sample set of points]

The particular coordinate used is given at each node along with the point name: nodes for which the y coordinate is used are drawn vertically, and those for which the x coordinate is used are drawn horizontally.

This technique corresponds to dividing up the plane in a simple way: all the points below the point at the root go in the left subtree, all those above in the right subtree, then all the points above the point at the root and to the left of the point in the right subtree go in the left subtree of the right subtree of the root, etc. Every external node of the tree corresponds to some rectangle in the plane. The diagram below shows the division of the plane corresponding to the above tree. Each numbered region corresponds to an external node in the tree; each point lies on a horizontal or vertical line segment which defines the division made in the tree at that point.

[Figure: division of the plane into numbered regions corresponding to the 2D tree]
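The alternating-coordinate insertion described above can be sketched in Python as follows. This is only a minimal illustration under stated assumptions, not the book's Pascal routine: the names Node and insert_2d are hypothetical, points are plain (x, y) tuples, and None stands in for external nodes in place of header and sentinel nodes.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

Point = Tuple[float, float]  # (x, y); an assumed representation


@dataclass
class Node:
    point: Point
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def insert_2d(root: Optional[Node], p: Point) -> Node:
    """Insert p into the 2D tree and return the (possibly new) root."""
    new_node = Node(p)
    if root is None:
        return new_node
    use_y = True  # the root discriminates on y, as in the text
    node = root
    while True:
        axis = 1 if use_y else 0
        # Smaller coordinate goes left; equal or larger goes right.
        go_left = p[axis] < node.point[axis]
        child = node.left if go_left else node.right
        if child is None:  # reached an external node: attach here
            if go_left:
                node.left = new_node
            else:
                node.right = new_node
            return root
        node = child
        use_y = not use_y  # alternate y, x, y, ... on the way down
```

A tree is built by folding the points through the function, for example `root = None` followed by `root = insert_2d(root, p)` for each point p. As with ordinary binary search trees, the shape of the result depends on the insertion order.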
For example, if a new point were to be inserted into the tree from region 9 in the diagram above, we would move left at the root, since all such points are below A, then right at B, since all such points are to the right of B, then right at J, since all such points are above J. Insertion of a point in region 9 would correspond to drawing a vertical line through it in the diagram.

The code for the construction of 2D trees is a straightforward modification of standard binary tree search to switch between x and y coordinates at each level:

    function twoDinsert(p: point; t: link): link;
      var f: link; d, td: boolean;
      begin
      d:=true;  { true at the header, so the root itself is split on y }
      repeat
        { td: does p fall on the "less" side of the current node? }
        if d then td:=p.x<t^.p.x
             else td:=p.y<t^.p.y;
        f:=t;   { remember the parent }
        if td then t:=t^.l else t:=t^.r;
        d:=not d;
      until t=z;
      { attach the new node where the external node was }
      new(t); t^.p:=p; t^.l:=z; t^.r:=z;
      if td then f^.l:=t else f^.r:=t;
      twoDinsert:=t
      end;

As usual, we use a header node head with an artificial point which is "less" than all the other points so that the tree hangs off the right link of head, and an artificial node z is used to represent all the external nodes. The call twoDinsert(p, head) will insert a new node containing p into the tree. A boolean variable d is toggled on the way down the tree to effect the alternating tests on x and y coordinates. Otherwise the procedure is identical to the standard procedure from Chapter 14. In fact, it turns out that for randomly distributed points, 2D trees have all the same performance characteristics as binary search trees. For example, the average time to build such a tree is proportional to N log N, but there is an N^2 worst case.

To do range searching using 2D trees, we test the point at each node against the range along the dimension that is used to divide the plane at that node. For our example, we begin by going right at the root and right at node E, since our search rectangle is entirely above A and to the right of E. Then, at node F, we must go down both subtrees, since F falls in the x range defined by the rectangle (note carefully that this is not the same as F falling within the rectangle).
Then the left subtrees of P and K are checked, corresponding to checking areas 12 and 14 of the plane, which overlap the search rectangle.

This process is easily implemented with a straightforward generalization of the range procedure that we examined at the beginning of this chapter:

    procedure twoDrange(t: link; rect: rectangle; d: boolean);
      var t1, t2, tx1, tx2, ty1, ty2: boolean;
      begin
      if t<>z then
        begin
        { compare the node's point with each side of the rectangle }
        tx1:=rect.x1<=t^.p.x; tx2:=t^.p.x<=rect.x2;
        ty1:=rect.y1<=t^.p.y; ty2:=t^.p.y<=rect.y2;
        { pick the tests for the coordinate this node divides on }
        if d then begin t1:=tx1; t2:=tx2 end
             else begin t1:=ty1; t2:=ty2 end;
        if t1 then twoDrange(t^.l, rect, not d);
        if tx1 and tx2 and ty1 and ty2 then write(name(t^.p.info), ' ');
        if t2 then twoDrange(t^.r, rect, not d);
        end
      end;

This procedure goes down both subtrees only when the dividing line cuts the rectangle, which should happen infrequently for relatively small rectangles. Although the method hasn't been fully analyzed, its running time seems sure to be proportional to R + log N to retrieve R points from reasonable ranges in a region containing N points, which makes it very competitive with the grid method.

Multidimensional Range Searching

Both the grid method and 2D trees generalize directly to more than two dimensions: simple, straightforward extensions to the above algorithms immediately yield range-searching methods which work for more than two dimensions. However, the nature of multidimensional space dictates that some caution is called for and that the performance characteristics of the algorithms might be difficult to predict for a particular application.

To implement the grid method for k-dimensional searching, we simply make grid a k-dimensional array and use one index per dimension. The main problem is to pick a reasonable value for size. This problem becomes quite obvious when large k is considered: what type of grid should we use for a search over many dimensions? The problem is that even if we use only three divisions per dimension, we need 3^k grid squares in k dimensions, most of which will be empty, for reasonable values of N.

The generalization from 2D to kD trees is also straightforward: simply cycle through the dimensions (as we did for two dimensions by alternating between x and y) while going down the tree.
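The idea of cycling through the dimensions can be sketched in Python for insertion and range searching together. This is a hedged illustration rather than the book's code: KDNode, kd_insert, and kd_range are hypothetical names, the query range is assumed to be an axis-aligned box given by its low and high corners, None plays the role of the external nodes, and the cycle is assumed to start at dimension 0 (the choice of starting dimension is arbitrary).

```python
from typing import List, Optional, Sequence, Tuple


class KDNode:
    def __init__(self, point: Sequence[float]):
        self.point: Tuple[float, ...] = tuple(point)
        self.left: Optional["KDNode"] = None
        self.right: Optional["KDNode"] = None


def kd_insert(root: Optional[KDNode], point: Sequence[float], k: int) -> KDNode:
    """Insert point, discriminating on dimension (depth mod k) at each level."""
    new_node = KDNode(point)
    if root is None:
        return new_node
    node, depth = root, 0
    while True:
        axis = depth % k
        go_left = new_node.point[axis] < node.point[axis]
        child = node.left if go_left else node.right
        if child is None:
            if go_left:
                node.left = new_node
            else:
                node.right = new_node
            return root
        node, depth = child, depth + 1


def kd_range(node: Optional[KDNode], lo: Sequence[float], hi: Sequence[float],
             k: int, depth: int = 0) -> List[Tuple[float, ...]]:
    """Return all points p with lo[i] <= p[i] <= hi[i] in every dimension i."""
    if node is None:
        return []
    axis = depth % k
    found: List[Tuple[float, ...]] = []
    # The left subtree can hold matches only if the range reaches below this node.
    if lo[axis] <= node.point[axis]:
        found += kd_range(node.left, lo, hi, k, depth + 1)
    # The node itself matches only if it is inside the box in all k dimensions.
    if all(lo[i] <= node.point[i] <= hi[i] for i in range(k)):
        found.append(node.point)
    # The right subtree can hold matches only if the range reaches this node or above.
    if node.point[axis] <= hi[axis]:
        found += kd_range(node.right, lo, hi, k, depth + 1)
    return found
```

With k = 2 this reduces to the two-dimensional search. Both subtrees are visited only when the dividing hyperplane at a node cuts the query box.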
As before, in a random situation, the resulting trees have the same characteristics as binary search trees. Also as before, there is a natural correspondence between the trees and a simple geometric process. In three dimensions, branching at each node corresponds to cutting the three-dimensional region of interest with a plane; in general we cut the k-dimensional region of interest with a (k-1)-dimensional hyperplane.

If k is very large, there is likely to be a significant amount of imbalance in the kD trees, again because practical point sets can't be large enough to take notice of randomness over a large number of dimensions. Typically, all points in a subtree will have the same value across several dimensions, which leads to several one-way branches in the trees. One way to help alleviate this problem is, rather than simply cycling through the dimensions, always to use the dimension that will divide up the point set in the best way. This technique can also be applied to 2D trees. It requires that extra information (which dimension should be discriminated upon) be stored in each node, but it does relieve imbalance, especially in high-dimensional trees.

In summary, though it is easy to see how to generalize the programs for range searching that we have developed to handle multidimensional problems, such a step should not be taken lightly for a large application. Large databases with many attributes per record can be quite complicated objects indeed, and it is often necessary to have a good understanding of the characteristics of the database in order to develop an efficient range-searching method for a particular application. This is a quite important problem which is still being studied.

Exercises

1. Write a nonrecursive version of the range program given in the text.

2. Write a program to print out all points from a binary tree which do not fall in a specified interval.

3. Give the maximum and minimum number of grid squares that will be searched in the grid method as functions of the dimensions of the grid squares and the search rectangle.

4. Discuss the idea of avoiding the search of empty grid squares by using linked lists: each grid square could be linked to the next grid square in the same row and the next grid square in the same column. How would the use of such a scheme affect the grid square size to be used?

5. Draw the tree and the subdivision of the plane that results if we build a 2D tree for our sample points starting with a vertical dividing line. (That is, call range with a third argument of false rather than true.)

6. Give a set of points which leads to a worst-case 2D tree having no nodes with two sons; give the subdivision of the plane that results.

7. Describe how you would modify each of the methods to return all points that fall within a given circle.

8. Of all search rectangles with the same area, what shape is likely to make each of the methods perform the worst?

9. Which method should be preferred for range searching in the case that the points cluster together in large groups spaced far apart?

10. Draw the 3D tree that results when the points are inserted into an initially empty tree.

27. Geometric Intersection

A natural problem arising frequently in applications involving geometric data is: "Given a set of N objects, do any two intersect?" The "objects" involved may be lines, rectangles, circles, polygons, or other types of geometric objects. For example, in a system for designing and processing integrated circuits or printed circuit boards, it is important to know that no two wires intersect to make a short circuit. In an industrial application for designing layouts to be executed by a numerically controlled cutting tool, it is important to know that no two parts of the layout intersect.
In computer graphics, the problem of determining which of a set of objects is obscured from a particular viewpoint can be formulated as a geometric intersection problem on the projections of the objects onto the viewing plane. And in operations research, the mathematical formulation of many important problems leads naturally to a geometric intersection problem.

The obvious solution to the intersection problem is to check each pair of objects to see if they intersect. Since there are about N^2/2 pairs of objects, the running time of this algorithm is proportional to N^2. For many applications, this is not a problem because other factors limit the number of objects which can be processed. However, geometric applications systems have become much more ambitious, and it is not uncommon to have to process hundreds of thousands or even millions of objects. The brute-force algorithm is obviously inadequate for such applications. In this section, we'll study a general method for determining whether any two out of a set of N objects intersect in time proportional to N log N, based on algorithms presented by M. Shamos and D. Hoey in a seminal paper in 1976.

First, we'll consider an algorithm for returning all intersecting pairs among a set of lines that are constrained to be horizontal or vertical. This makes the problem easier in one sense (horizontal and vertical lines are relatively simple geometric objects), more difficult in another sense (returning all intersecting pairs is more difficult than simply determining whether one such pair exists). The implementation that we'll develop applies binary search trees and the interval range-searching program of the previous chapter in a doubly recursive program.

Next, we'll examine the problem of determining whether any two of a set of lines intersect, with no constraints on the lines. The same general strategy as used for the horizontal-vertical case can be applied.
In fact, the same basic idea works for detecting intersections among many other types of geometric objects. However, for lines and other objects, the extension to return all intersecting pairs is somewhat more complicated than for the horizontal-vertical case.

Horizontal and Vertical Lines

To begin, we'll assume that all lines are either horizontal or vertical: the two points defining each line either have equal x coordinates or equal y coordinates, as in the following sample set of lines:

[Figure: sample set of horizontal and vertical lines]

(This is sometimes called Manhattan geometry because, the Chrysler building notwithstanding, the Manhattan skyline is often sketched using only horizontal and vertical lines.) Constraining lines to be horizontal or vertical is certainly a severe restriction, but this is far from a "toy" problem. It is often the case that this restriction is imposed for some other reason for a particular application. For example, very large-scale integrated circuits are typically designed under this constraint.

The general plan of the algorithm to find an intersection in such a set of lines is to imagine a horizontal scan line sweeping from bottom to top in the diagram. Projected onto this scan line, vertical lines are points, and horizontal lines are intervals: as the scan line proceeds from bottom to top, points (representing vertical lines) appear and disappear, and horizontal lines are periodically encountered. An intersection is found when a horizontal line is encountered which represents an interval on the scan line that contains a point representing a vertical line. The point means that the vertical line intersects the scan line, and the horizontal line lies on the scan line, so the horizontal and vertical lines must intersect. In this way, the two-dimensional problem of finding an intersecting pair of lines is reduced to the one-dimensional searching problem of the previous chapter.
Of course, it is not necessary actually to "sweep" a horizontal line all the way up through the set of lines: since we only need to take action when endpoints of the lines are encountered, we can begin by sorting the lines according to their y coordinate, then processing the lines in that order. If the bottom endpoint of a vertical line is encountered, we add the x coordinate of that line to the tree; if the top endpoint of a vertical line is encountered, we delete that line from the tree; and if a horizontal line is encountered, we do an interval range search using its two x coordinates. As we'll see, some care is required to handle equal coordinates among line endpoints (the reader should now be accustomed to encountering such difficulties in geometric algorithms).

To trace through the operation of our algorithm on our set of sample points, we first must sort the line endpoints by their y coordinate:

    B B D E F H J C G D I C A G J F E I

Each vertical line appears twice in this list; each horizontal line appears once. For the purposes of the line intersection algorithm, this sorted list can be thought of as a sequence of insert (vertical lines when the bottom endpoint is encountered), delete (vertical lines when the top endpoint is encountered), and range (for the endpoints of horizontal lines) commands. All of these "commands" are simply calls on the standard binary tree routines from Chapters 14 and 26, using x coordinates as keys.

For our example, we begin with the following sequence of binary search trees:

[Figure: sequence of binary search trees as B, D, E, and F are processed]

First B is inserted into an empty tree, then deleted. Then D, E, and F are inserted. At this point, H is encountered, and a range search for the interval defined by H is performed on the rightmost tree in the above diagram. This search discovers the intersection between H and F.
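The insert, delete, and range commands described above can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the book's implementation: hv_intersections and the tuple layouts are hypothetical, a sorted Python list searched with the standard bisect module stands in for the binary search tree, and ties at equal y coordinates are resolved by processing inserts first, then range searches, then deletes, so that lines which merely touch are reported as intersecting.

```python
from bisect import bisect_left, insort
from typing import List, Tuple

Vertical = Tuple[str, float, float, float]    # (name, x, y_bottom, y_top)
Horizontal = Tuple[str, float, float, float]  # (name, y, x_left, x_right)

# Order of commands at equal y: an assumption made for this sketch.
INSERT, QUERY, DELETE = 0, 1, 2


def hv_intersections(verticals: List[Vertical],
                     horizontals: List[Horizontal]) -> List[Tuple[str, str]]:
    """Return (horizontal name, vertical name) for every intersecting pair."""
    events = []
    for name, x, y_bottom, y_top in verticals:
        events.append((y_bottom, INSERT, x, x, name))
        events.append((y_top, DELETE, x, x, name))
    for name, y, x_left, x_right in horizontals:
        events.append((y, QUERY, x_left, x_right, name))
    events.sort()  # by y coordinate, then by command kind

    active: List[Tuple[float, str]] = []  # (x, name) of verticals cut by the scan line
    found: List[Tuple[str, str]] = []
    for _, kind, x_lo, x_hi, name in events:
        if kind == INSERT:
            insort(active, (x_lo, name))
        elif kind == DELETE:
            active.remove((x_lo, name))  # linear time; a tree would do better
        else:
            # Interval range search: every active x with x_lo <= x <= x_hi.
            i = bisect_left(active, (x_lo,))
            while i < len(active) and active[i][0] <= x_hi:
                found.append((name, active[i][1]))
                i += 1
    return found
```

Insertion and deletion in a Python list take linear time, so this sketch does not achieve the N log N behavior of the tree-based method; it is meant only to make the sequence of commands concrete.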
Proceeding down the list above in order, we add J, C, then G to get the following sequence of trees:

[Figure: sequence of trees as J, C, and G are added]

Next, the upper endpoint of D is encountered, so it is deleted; then I is added and C deleted, which gives the following sequence of trees:

[Figure: sequence of trees as D is deleted, I is added, and C is deleted]

At this point A is encountered, and a range search for the interval defined ...

Posted: 20/10/2013, 16:15
