CHAPTER 9 DESIGNING SYSTEMS FOR APPLICATION CONCURRENCY

Tip: If you are interested in finding further statistics about the waits enforced by Resource Governor, try looking for rows in the sys.dm_os_wait_stats DMV where wait_type is RESMGR_THROTTLED.

Summary

Concurrency is a complex topic with many possible solutions. In this chapter, I introduced the various concurrency models that should be considered from a business process and data collision point of view, and explained how they differ from the similarly named concurrency models supported by the SQL Server database engine. Pessimistic concurrency is probably the most commonly used form, but it can be complex to set up and maintain. Optimistic concurrency, while more lightweight, might not be applicable to many business scenarios. Multivalue concurrency control, while a novel technique, might be difficult to implement in such a way that allowing collisions delivers value beyond a performance enhancement.

Finally, I gave an overview of how Resource Governor can balance the way in which limited resources are allocated between competing requests in a concurrent environment. The discussion here only scratched the surface of the potential of this technique, and I recommend that readers interested in the subject dedicate some time to further research on this powerful feature.

CHAPTER 10

Working with Spatial Data

The addition of spatial capabilities was one of the most exciting new features introduced in SQL Server 2008. Although generally a novel concept for many SQL developers, the principles of working with spatial data have been well established for many years. Dedicated geographic information systems (GISs), such as ARC/INFO from ESRI, have existed since the 1970s.
However, until recently, spatial data analysis has been regarded as a distinct, niche subject area, and knowledge and usage of spatial data have remained largely confined within its own realm rather than being integrated with mainstream development.

The truth is that there is hardly any corporate database that does not store spatial information of some sort. Customers’ addresses, sales regions, the area targeted by a local marketing campaign, and the routes taken by delivery and logistics vehicles all represent spatial data that can be found in many common applications.

In this chapter, I’ll first describe some of the fundamental principles involved in working with spatial data, and then discuss some of the important features of the geometry and geography datatypes, which are the specific datatypes used to represent and perform operations on spatial data in SQL Server. After demonstrating how to use these methods to answer some common spatial questions, I’ll concentrate on the elements that need to be considered to create high-performance spatial applications.

Note: Working with spatial data presents a unique set of challenges, and in many cases requires techniques and understanding that differ from those used with traditional datatypes. If you’re interested in a more thorough introduction to spatial data in SQL Server, I recommend reading Beginning Spatial with SQL Server 2008, one of my previous books (Apress, 2008).

Modeling Spatial Data

Spatial data describes the position, shape, and orientation of objects in space. These objects might be tangible, physical things, like an office building, railroad, or mountain, or they might be abstract features such as the imaginary line marking the political boundary between countries or the area served by a particular store.
SQL Server adopts a vector model of spatial data, in which every object is represented using one or more geometries: primitive shapes that approximate the shape of the real-world object they represent. There are three basic types of geometry that may be used with the geometry and geography datatypes: Point, LineString, and Polygon.

• A Point is the most fundamental type of geometry, representing a single location in space. A Point geometry is zero-dimensional, meaning that it has no associated area or length.

• A LineString consists of a series of two or more distinct points, together with the line segments that connect those points. LineStrings have a length, but no associated area. A simple LineString is one in which the path drawn between the points does not cross itself. A closed LineString is one that starts and ends at the same point. A LineString that is both simple and closed is known as a ring.

• A Polygon consists of an exterior ring, which defines the perimeter of the area of space contained within the Polygon. A Polygon may also specify one or more interior rings, which define areas of space contained within the exterior ring but excluded from the Polygon. Interior rings can be thought of as “holes” cut out of the Polygon. Polygons are two-dimensional: they have a length, measured as the total length of all defined rings, and an area, measured as the space contained within the exterior ring (and not excluded by any interior rings).

Note: The word geometry has two distinct meanings when dealing with spatial data in SQL Server. To make the distinction clear, I will use the word geometry (regular font) as the generic name to describe Points, LineStrings, and Polygons, and geometry (code font) to refer to the geometry datatype.
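The definitions of closed, simple, and ring LineStrings, and of a Polygon's area with holes, can be made concrete with a short sketch. The Python below is only an illustration of those definitions and has nothing to do with how SQL Server implements them. The function names are hypothetical, coordinates are assumed to be planar (x, y) pairs, and the simplicity test is deliberately simplified: it detects only segments that properly cross, and ignores degenerate cases such as overlapping collinear segments.

```python
from itertools import combinations


def is_closed(line):
    """A closed LineString starts and ends at the same point."""
    return len(line) >= 2 and line[0] == line[-1]


def _orient(a, b, c):
    """Sign of the cross product (b - a) x (c - a): 1 left, -1 right, 0 collinear."""
    v = (b[0] - a[0]) * (c[1] - a[1]) - (b[1] - a[1]) * (c[0] - a[0])
    return (v > 0) - (v < 0)


def _properly_cross(p1, p2, q1, q2):
    """True if segments p1-p2 and q1-q2 cross at a point interior to both."""
    return (_orient(p1, p2, q1) * _orient(p1, p2, q2) < 0
            and _orient(q1, q2, p1) * _orient(q1, q2, p2) < 0)


def is_simple(line):
    """Simplified test: no two segments of the path properly cross."""
    segments = list(zip(line, line[1:]))
    return not any(_properly_cross(*s, *t) for s, t in combinations(segments, 2))


def is_ring(line):
    """A ring is a LineString that is both simple and closed."""
    return is_closed(line) and is_simple(line)


def ring_area(ring):
    """Unsigned planar area enclosed by a closed ring (shoelace formula)."""
    twice_area = sum(x1 * y2 - x2 * y1
                     for (x1, y1), (x2, y2) in zip(ring, ring[1:]))
    return abs(twice_area) / 2


def polygon_area(exterior, interiors=()):
    """Area inside the exterior ring, minus the holes cut by interior rings."""
    return ring_area(exterior) - sum(ring_area(hole) for hole in interiors)


square = [(0, 0), (4, 0), (4, 4), (0, 4), (0, 0)]
hole = [(1, 1), (2, 1), (2, 2), (1, 2), (1, 1)]
bowtie = [(0, 0), (2, 2), (2, 0), (0, 2), (0, 0)]  # closed, but crosses itself

print(is_ring(square), is_ring(bowtie))
print(polygon_area(square, [hole]))
```

Here the square qualifies as a ring while the bow-tie path does not, because it is closed but not simple. The Polygon's area is the exterior ring's area less the area of the hole.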
Sometimes, a single feature may be represented by more than one geometry, in which case it is known as a GeometryCollection. GeometryCollections may be homogeneous or heterogeneous. For example, the Great Wall of China is not a single contiguous wall; rather, it is made up of several distinct sections of wall. As such, it could be represented as a MultiLineString, a homogeneous collection of LineString geometries. Similarly, many countries, such as Japan, may be represented as a MultiPolygon, a GeometryCollection consisting of several Polygons, each one representing a distinct island. It is also possible to have a heterogeneous GeometryCollection, such as a collection containing a Point, three LineStrings, and two Polygons.

Figure 10-1 illustrates the three basic types of geometries used in SQL Server 2008 and some examples of situations in which they are commonly used.

Having chosen an appropriate type of geometry to represent a given feature, we need some way of relating each point in the geometry definition to the real-world position it represents. For example, to use a Polygon geometry to represent the US Department of Defense Pentagon building, we need to specify that the five points that define the boundary of the Polygon geometry relate to the locations of the five corners of the building. So how do we do this? You are probably familiar with the terms longitude and latitude, in which case you may be thinking that it is simply a matter of listing the relevant latitude and longitude coordinates for each point in the geometry. Unfortunately, it’s not quite that simple.

Figure 10-1. Different types of geometries and their common uses

What many people don’t realize is that any particular point on the earth’s surface does not have only one unique latitude or longitude associated with it.
There are, in fact, many different systems of latitude and longitude, and the coordinates of a given point on the earth will vary depending on which system is used. Furthermore, latitude and longitude coordinates are not the only way of expressing positions on the earth: there are other types of coordinates that define the location of an object without using latitude and longitude at all. In order to understand how to specify the coordinates of a geometry, we first need to examine how different spatial reference systems work.

Spatial Reference Systems

A spatial reference system is a system designed to unambiguously identify and describe the location of any point in space. This ability is essential to enable spatial data to store the coordinates of geometries used to represent features on the earth.

To describe the positions of points in space, every spatial reference system is based on an underlying coordinate system. There are many different types of coordinate systems used in various fields of mathematics, but when defining geospatial data in SQL Server 2008, you are most likely to use a spatial reference system based on either a geographic coordinate system or a projected coordinate system.

Geographic Coordinate Systems

In a geographic coordinate system, any position on the earth’s surface can be defined using two angular coordinates:

• The latitude coordinate of a point measures the angle between the plane of the equator and a line drawn perpendicular to the surface of the earth at that point.

• The longitude coordinate measures the angle in the equatorial plane between a line drawn from the center of the earth to the point and a line drawn from the center of the earth to the prime meridian.

Typically, geographic coordinates are measured in degrees. As such, latitude can vary between –90° (at the South Pole) and +90° (at the North Pole).
Longitude values extend from –180° to +180°. Figure 10-2 illustrates how a geographic coordinate system can be used to identify a point on the earth’s surface.

Projected Coordinate Systems

In contrast to the geographic coordinate system, which defines positions on a three-dimensional, round model of the earth, a projected coordinate system describes positions on the earth’s surface on a flat, two-dimensional plane (i.e., a projection of the earth’s surface). In simple terms, a projected coordinate system describes positions on a map rather than positions on a globe.

If we consider all of the points on the earth’s surface to lie on a flat plane, we can define positions on that plane using familiar Cartesian coordinates of x and y (sometimes referred to as Easting and Northing), which represent the distance of a point from an origin along the x axis and y axis, respectively. Figure 10-3 illustrates how the same point illustrated in Figure 10-2 could be defined using a projected coordinate system.

Figure 10-2. Describing a position on the earth using a geographic coordinate system

Figure 10-3. Describing a position on the earth using a projected coordinate system

Applying Coordinate Systems to the Earth

A set of coordinates from either a geographic or projected coordinate system does not, on its own, uniquely identify a position on the earth. We need to know additional information, such as where to measure those coordinates from and in what units, and what shape to use to model the earth. Therefore, in addition to specifying the coordinate system used, every spatial reference system must also contain a datum, a prime meridian, and a unit of measurement.

Datum

A datum contains information about the size and shape of the earth.
Specifically, it contains the details of a reference ellipsoid and a reference frame, which are used to create a geodetic model of the earth onto which a coordinate system can be applied.

The reference ellipsoid is a three-dimensional shape that is used as an approximation of the shape of the earth. Although described as reference ellipsoids, most models of the earth are actually oblate spheroids: squashed spheres that can be described exactly by two parameters, the length of the semimajor axis (which represents the radius of the earth at the equator) and the length of the semiminor axis (the radius of the earth at the poles), as shown in Figure 10-4. The degree by which the spheroid is squashed may be stated as the ratio of the semimajor axis to the difference between the two axes, which is known as the inverse-flattening ratio.

Different reference ellipsoids provide different approximations of the shape of the earth, and there is no single reference ellipsoid that provides a best fit across the whole surface of the globe. For this reason, spatial applications that operate at a regional level tend to use a spatial reference system based on whatever reference ellipsoid provides the best approximation of the earth’s surface for the area in question. In Britain, for example, this is the Airy 1830 ellipsoid, which has a semimajor axis of 6,377,563m and a semiminor axis of 6,356,257m. In North America, the GRS 80 ellipsoid of the NAD83 datum is most commonly used, which has a semimajor axis of 6,378,137m and a semiminor axis of 6,356,752m.

The reference frame defines a set of locations in the real world that are assigned known coordinates relative to the reference ellipsoid. By establishing a set of points with known coordinates, these points can then be used to correctly line up the coordinate system with the reference ellipsoid so that the coordinates of other, unknown points can be determined.
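As a quick check on the inverse-flattening ratio described above, the following Python sketch computes it from the semimajor and semiminor axes quoted for the British and North American ellipsoids. The function name and dictionary are hypothetical. Because the axis lengths in the text are rounded to the nearest meter, the results only approximate the officially published ratios.

```python
def inverse_flattening(semimajor_m, semiminor_m):
    """Ratio of the semimajor axis to the difference between the two axes."""
    if semimajor_m <= semiminor_m:
        raise ValueError("an oblate spheroid needs semimajor > semiminor")
    return semimajor_m / (semimajor_m - semiminor_m)


# Axis lengths in meters, as quoted in the text (rounded values).
ELLIPSOIDS = {
    "Airy 1830 (Britain)": (6_377_563.0, 6_356_257.0),
    "NAD83 (North America)": (6_378_137.0, 6_356_752.0),
}

for name, (a, b) in ELLIPSOIDS.items():
    print(f"{name}: inverse flattening is roughly {inverse_flattening(a, b):.1f}")
```

With these rounded inputs, both ratios come out a little under 300, meaning the polar radius is shorter than the equatorial radius by about one part in 300.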
Reference points are normally places on the earth’s surface itself, but they can also be assigned to the positions of satellites in orbit around the earth, which is how the WGS84 datum used by global positioning system (GPS) units is realized.

Prime Meridian

As defined earlier, the geographic coordinate of longitude is the angle in the equatorial plane between the line drawn from the center of the earth to a point and the line drawn from the center of the earth to the prime meridian. Therefore, any spatial reference system must state its prime meridian, the axis from which the angle of longitude is measured.

It is a common misconception that there is a single prime meridian based on some inherent fundamental property of the earth. In fact, the prime meridian of any spatial reference system is arbitrarily chosen simply to provide a line of zero longitude from which all other coordinates of longitude can be measured. One commonly used prime meridian passes through Greenwich, London, but there are many others. If you were to choose a different prime meridian, the value of every longitude coordinate in a given spatial reference system would change.

Figure 10-4. Properties of a reference ellipsoid

Projection

A projected coordinate reference system allows you to describe positions on the earth on a flat, two-dimensional image of the world, created as a result of projection. There are many ways of creating such map projections, and each one results in a different image of the world. Some common map projections include Mercator, Bonne, and equirectangular projections, but there are many more.

It is very important to realize that, in order to represent a three-dimensional model of the earth on a flat plane, every map projection distorts the features of the earth in some way.
Some projections attempt to preserve the relative area of features, but in doing so distort their shape. Other projections preserve the properties of features that are close to the equator, but grossly distort features toward the poles. Some compromise projections attempt to balance distortion in order to create a map in which no one aspect is distorted too significantly. The magnitude of distortion of features portrayed on the map is normally related to the extent of the area projected. For this reason, projected spatial reference systems tend to work best when applied only to a single country or smaller area, rather than a full world view.

Since the method of projection affects the features on the resulting map image, coordinates from a projected coordinate system are only valid for a given projection.

Spatial Reference Identifiers

The most common spatial reference system in global usage uses geographic coordinates based on the WGS84 datum, which has a reference ellipsoid with a semimajor axis of 6,378,137m and an inverse-flattening ratio of 298.257223563. Coordinates are measured in degrees, based on a prime meridian of Greenwich. This system is used by handheld GPS devices, as well as many consumer mapping products, including the Google Earth and Bing Maps APIs.
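The two WGS84 parameters quoted above are enough to recover the polar (semiminor) radius of the ellipsoid. The Python sketch below does this by rearranging the inverse-flattening definition; the function and constant names are hypothetical and used only for illustration.

```python
def semiminor_axis(semimajor_m, inv_flattening):
    """Polar radius implied by a semimajor axis and an inverse-flattening ratio.

    From 1/f = a / (a - b) it follows that b = a * (1 - 1 / (1/f)).
    """
    return semimajor_m * (1.0 - 1.0 / inv_flattening)


WGS84_SEMIMAJOR_M = 6_378_137.0
WGS84_INVERSE_FLATTENING = 298.257223563

b = semiminor_axis(WGS84_SEMIMAJOR_M, WGS84_INVERSE_FLATTENING)
print(f"WGS84 semiminor axis: about {b:,.0f} m")
print(f"Equatorial minus polar radius: about {WGS84_SEMIMAJOR_M - b:,.0f} m")
```

The polar radius comes out roughly 21 km shorter than the equatorial radius, which is the "squashing" that the inverse-flattening ratio encodes.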
Using the Well-Known Text (WKT) format, which is the industry standard for such information (and the system SQL Server uses in the well_known_text column of the sys.spatial_references table), the properties of this spatial reference system can be expressed as follows:

GEOGCS[
  "WGS 84",
  DATUM[
    "World Geodetic System 1984",
    ELLIPSOID["WGS 84", 6378137, 298.257223563]
  ],
  PRIMEM["Greenwich", 0],
  UNIT["Degree", 0.0174532925199433]
]

Returning to the example at the beginning of this chapter, using this spatial reference system, we can describe the approximate location of each corner of the US Pentagon building as a pair of latitude and longitude coordinates as follows:

38.870, -77.058
38.869, -77.055
38.871, -77.053
38.873, -77.055
38.872, -77.058

Note that, since we are describing points that lie to the west of the prime meridian, the longitude coordinate in each case is negative.

Now let’s consider another spatial reference system: the Universal Transverse Mercator (UTM) Zone 18N system, which is a projected coordinate system used in parts of North America. This spatial reference system is based on the 1983 North American datum, which has a reference ellipsoid with a semimajor axis of 6,378,137m and an inverse-flattening ratio of 298.257222101. This geodetic model is projected using a transverse Mercator projection, centered on the meridian of longitude 75°W, and coordinates based on the projected image are measured in meters. The full properties of this system are expressed in WKT format as follows:

[...]

... the query optimizer chooses a plan using the idxallCountries spatial index. The cost-based estimates for spatial queries are not always accurate, which means that SQL Server 2008 does not always pick the optimal plan for a query. SQL Server 2008 SP1 improves the situation, but there are still occasions when an explicit hint is required to ensure that a spatial index is used. As in the last example, this...
... challenge presented to many users new to the spatial features in SQL Server 2008 is how to get spatial data into the database. Unfortunately, the most commonly used spatial format, the ESRI shapefile format (SHP), is not directly supported by any of the geography or geometry methods, nor by any of the file data sources available in SQL Server Integration Services (SSIS). What’s more, internally, geography...

... and Filter(), which performs an approximate test of intersection based on a spatial index. In this section, I won’t discuss every available method; you can look these up in SQL Server Books Online (or in Beginning Spatial with SQL Server 2008). Instead, I’ll examine a couple of common scenarios and illustrate how you can combine one or more methods to solve them. Before we begin, let’s create a clustered...

... represented by SRID 4326 and SRID 26918, respectively. Every time you state an item of spatial data using the geography or geometry types in SQL Server 2008, you must state the corresponding SRID from which the coordinate values were obtained. What’s more, since SQL Server does not provide any mechanism for converting between spatial reference systems, if you want to perform any calculations involving...

... such generic spatial data, from a variety of commercial and free sources. SQL Server doesn’t provide any specific tools for importing predefined spatial data, but there are a number of third-party tools that can be used for this purpose. It is also possible to use programmatic techniques based on the functionality provided by the SqlServer.Types.dll library, which contains the methods used by the geography...
All geography data, in contrast, is assumed to be valid at all times. Although this means that once geography data is in SQL Server, you can work with it comfortable in the knowledge that it is always valid, it can provide an obstacle to importing that data in the first place. Since SQL Server cannot import invalid geography data, you may have to rely on external tools to validate and fix any erroneous...

Figure 10-6. Polygon ring orientation is significant for the geography datatype

The solution used by SQL Server (and in common with some other spatial systems) is to consider the ring orientation of the Polygon, i.e., the order in which the points of the ring are specified. When defining a geography Polygon, SQL Server treats the area on the “left” of the path drawn between the points as contained within the...

... conciseness, the WKT format is a popular way of storing and sharing spatial data, and is the format used in most of the examples in this chapter. It is also the format used in the spatial documentation in SQL Server 2008 Books Online, at http://msdn.microsoft.com/enus/library/ms130214.aspx. The following code listing demonstrates the WKT string used to represent a Point geometry located at an x coordinate of...

... find out the SRID associated with any given spatial reference system, you can use the search facility provided at www.epsg-registry.org.

Geography vs Geometry

Early Microsoft promotional material for SQL Server 2008 introduced the geography datatype as suitable for “round-earth” data, whereas the geometry datatype was for “flat-earth” data. These terms have since been repeated verbatim by a number of commentators,...
... wizard successfully completes the import operation, every column from the Geonames dataset will be imported into the allCountries table, but we are not yet making use of the spatial capabilities of SQL Server 2008. Every record in the table has an associated latitude and longitude coordinate value, but these are currently held in separate floating-point columns. Since the Geonames coordinates are measured...