CHAPTER 1 INTRODUCTION

COLLADA

COLLADA, short for COLLAborative Design Activity (www.khronos.org/collada), started as an open-source project led by Sony, but is nowadays developed and promoted by the Khronos Group. COLLADA is an interchange format for 3D content; it is the glue that binds together digital content creation (DCC) tools and various intermediate processing tools to form a production pipeline. In other words, COLLADA is a tool for content development, not for content delivery; the final applications are better served by more compact formats designed for their particular tasks.

COLLADA can represent pretty much everything in a 3D scene that the content authoring tools can, including geometry, material and shading properties, physics, and animation, just to name a few. It also has a mobile profile that corresponds to OpenGL ES 1.x and M3G 1.x, enabling an easy mapping to the M3G binary file format. One of the latest additions is COLLADA FX, which allows interchange of complex, multi-pass shader effects. COLLADA FX allows encapsulation of multiple descriptions of an effect, such as different levels of detail, or different shading for daytime and nighttime versions.

Exporters for COLLADA are currently available for all major 3D content creation tools, such as Lightwave, Blender, Maya, Softimage, and 3ds Max. A stand-alone viewer is also available from Feeling Software. Adobe uses COLLADA as an import format for editing 3D textures, and it has been adopted as a data format for Google Earth and Unreal Engine. For in-depth coverage of COLLADA, see the book by Arnaud and Barnes [AB06].

PART I ANATOMY OF A GRAPHICS ENGINE

CHAPTER 2 LINEAR ALGEBRA FOR 3D GRAPHICS

This chapter is about the coordinate systems and transformations that 3D objects undergo during their travel through the graphics pipeline, as illustrated in Figure 2.1.
Understanding this subset of linear algebra is crucial for figuring out what goes on inside a 3D graphics engine, as well as for making effective use of such an engine. If you want to rush ahead into the graphics primitives instead, study Figure 2.1, skip to Chapter 3, and return here later.

2.1 COORDINATE SYSTEMS

To be able to define shapes and locations, we need to have a frame of reference: a coordinate system, also known as a space. A coordinate system has an origin and a set of axes. The origin is a point (or equivalently, a location), while the axes are directions. As a mathematical construct, a coordinate system may have an arbitrary set of axes with arbitrary directions, but here we are only concerned with coordinate systems that are three-dimensional, orthonormal, and right-handed. Such coordinate systems have three axes, usually called x, y, and z. Each axis is normalized (unit length) and orthogonal (perpendicular) to the other two. Now, if we first place the x and y axes so that they meet at the origin at right angles (90°), we have two possibilities to orient the z axis so that it is perpendicular to both x and y. These choices make the coordinate system either right-handed or left-handed; Figure 2.2 shows two formulations of the right-handed choice.

[Figure 2.1 diagram. Stages shown: object coordinates, modelview matrix, eye coordinates, projection matrix, clip coordinates, normalized device coordinates, viewport and depth range, window coordinates.]

Figure 2.1: Summary of the coordinate system transformations from vertex definition all the way to the frame buffer.

Figure 2.2: Two different ways to visualize a right-handed, orthogonal 3D coordinate system. Left: the thumb, index finger, and middle finger of the right hand are assigned the axes x, y, and z, in that order. The positive direction of each axis is pointed to by the corresponding finger.
Right: we grab the z axis with the right hand so that the thumb extends toward the positive direction; the other fingers then indicate the direction of positive rotation angles on the xy-plane.

A coordinate system is always defined with respect to some other coordinate system, except for the global world coordinate system. For example, the coordinate system of a room might have its origin at the southwest corner, with the x axis pointing east, y pointing north, and z upward. A chair in the room might have its own coordinate system, its origin at its center of mass, and its axes aligned with the chair's axes of symmetry. When the chair is moved in the room, its coordinate system moves and may reorient with respect to the parent coordinate system (that of the room).

2.1.1 VECTORS AND POINTS

A 3D point is a location in space, in a 3D coordinate system. We can find a point $\mathbf{p}$ with coordinates $[\,p_x \;\; p_y \;\; p_z\,]$ by starting from the origin (at $[\,0 \;\; 0 \;\; 0\,]$) and moving the distance $p_x$ along the x axis, from there the distance $p_y$ along y, and finally the distance $p_z$ along z.

Two points define a line segment between them, three points define a triangle with corners at those points, and several interconnected triangles can be used to define the surface of an object. By placing many such objects into a world coordinate system, we define a virtual world. Then we only need to position and orient an imaginary camera to define a viewpoint into the world, and finally let the graphics engine create an image. If we wish to animate the world, we have to move either the camera or some of the points, or both, before rendering the next frame.

When we use points to define geometric entities such as triangles, we often call those points vertices. We may also expand the definition of a vertex to include any other data that are associated with that surface point, such as a color.
Besides points, we also need vectors to represent surface normals, viewing directions, light directions, and so on. A vector $\mathbf{v}$ is a displacement, a difference of two points; it has no position, but does have a direction and a length. Similar to points, vectors can be represented by three coordinates. The vector $\mathbf{v}_{ab}$, which is a displacement from point $\mathbf{a}$ to point $\mathbf{b}$, has coordinates $[\,b_x - a_x \;\; b_y - a_y \;\; b_z - a_z\,]$. It is also possible to treat a point as if it were a vector from the origin to the point itself.

The sum of two vectors is another vector: $\mathbf{a} + \mathbf{b} = [\,a_x + b_x \;\; a_y + b_y \;\; a_z + b_z\,]$. If you add a vector to a point, the result is a new point that has been displaced by the vector. Vectors can also be multiplied by a scalar: $s\mathbf{a} = [\,s a_x \;\; s a_y \;\; s a_z\,]$. Subtraction is simply an addition where one of the vectors has been multiplied by $-1$.

2.1.2 VECTOR PRODUCTS

There are two ways to multiply two 3D vectors. The dot product or scalar product of vectors $\mathbf{a}$ and $\mathbf{b}$ can be defined in two different but equivalent ways:

$$\mathbf{a} \cdot \mathbf{b} = a_x b_x + a_y b_y + a_z b_z \tag{2.1}$$

$$\mathbf{a} \cdot \mathbf{b} = \cos(\theta)\,\|\mathbf{a}\|\,\|\mathbf{b}\| \tag{2.2}$$

The first definition is algebraic, using the vector coordinates. The latter definition is geometric, and is based on the lengths of the two vectors ($\|\mathbf{a}\|$ and $\|\mathbf{b}\|$) and the smallest angle between them ($\theta$). An important property related to the angle is that when the vectors are orthogonal, the cosine term and therefore the whole expression goes to zero. This is illustrated in Figure 2.3.

The dot product allows us to compute the length, or norm, of a vector. We first compute the dot product of the vector with itself using the algebraic formula: $\mathbf{a} \cdot \mathbf{a}$. We then note that $\theta = 0$ and therefore $\cos(\theta) = 1$. Now, taking the square root of Equation (2.2) yields the norm:

$$\|\mathbf{a}\| = \sqrt{\mathbf{a} \cdot \mathbf{a}}. \tag{2.3}$$

We can then normalize the vector so that it becomes unit length:

$$\hat{\mathbf{a}} = \mathbf{a} / \|\mathbf{a}\|. \tag{2.4}$$

The other way to multiply two vectors in 3D is called the cross product.
While the dot product can be computed in any coordinate system, the cross product only exists in 3D. The cross product creates a new vector,

$$\mathbf{a} \times \mathbf{b} = [\,a_y b_z - a_z b_y \;\;\; a_z b_x - a_x b_z \;\;\; a_x b_y - a_y b_x\,], \tag{2.5}$$

which is perpendicular to both $\mathbf{a}$ and $\mathbf{b}$; see Figure 2.3. The new vector is also right-handed with respect to $\mathbf{a}$ and $\mathbf{b}$ in the same way as shown in Figure 2.2. The length of the new vector is $\sin(\theta)\,\|\mathbf{a}\|\,\|\mathbf{b}\|$. If $\mathbf{a}$ and $\mathbf{b}$ are parallel ($\theta = 0°$ or $\theta = 180°$), the result is zero. Finally, reversing the order of multiplication flips the sign of the result:

$$\mathbf{a} \times \mathbf{b} = -\mathbf{b} \times \mathbf{a}. \tag{2.6}$$

[Figure 2.3 diagram. Panels: $\mathbf{a} \cdot \mathbf{b} > 0$, $\mathbf{a} \cdot \mathbf{b} = 0$, $\mathbf{a} \cdot \mathbf{b} < 0$, and $\mathbf{a} \times \mathbf{b}$.]

Figure 2.3: The dot product produces a positive number when the vectors form an acute angle (less than 90°), zero when they are perpendicular (exactly 90°), and negative when the angle is obtuse (greater than 90°). The cross product defines a third vector that is in a right-hand orientation and perpendicular to both vectors.

2.1.3 HOMOGENEOUS COORDINATES

Representing both points and direction vectors with three coordinates can be confusing. Homogeneous coordinates are a useful tool to make the distinction explicit. We simply add a fourth coordinate ($w$): if $w = 0$, we have a direction, otherwise a location.

If we have a homogeneous point $[\,h_x \;\; h_y \;\; h_z \;\; h_w\,]$, we get the corresponding 3D point by dividing the components by $h_w$. If $h_w = 0$ we would get a point infinitely far away, which we interpret as a direction toward the point $[\,h_x \;\; h_y \;\; h_z\,]$. Conversely, we can homogenize the point $[\,p_x \;\; p_y \;\; p_z\,]$ by adding a fourth component: $[\,p_x \;\; p_y \;\; p_z \;\; 1\,]$. In fact, we can use any non-zero $w$, and all such $[\,w p_x \;\; w p_y \;\; w p_z \;\; w\,]$ correspond to the same 3D point.

We can also see that with normalized homogeneous coordinates, for which $w$ is either 1 or 0, taking a difference of two points creates a direction vector ($w$ becomes $1 - 1 = 0$), and adding a direction vector to a point displaces the point by the vector and yields a new point ($w$ becomes $1 + 0 = 1$).
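The vector products and the $w$ bookkeeping described above can be tried out in a few lines. The following is only a minimal sketch in plain Python, not code from any particular engine; all function names (`dot`, `cross`, `homogenize`, and so on) are illustrative assumptions.

```python
import math

def dot(a, b):
    """Algebraic dot product, Equation (2.1); works for any equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))

def norm(a):
    """Vector length, Equation (2.3)."""
    return math.sqrt(dot(a, a))

def normalize(a):
    """Unit-length copy of a, Equation (2.4); a must be non-zero."""
    length = norm(a)
    return tuple(x / length for x in a)

def cross(a, b):
    """Cross product of two 3D vectors, Equation (2.5)."""
    ax, ay, az = a
    bx, by, bz = b
    return (ay * bz - az * by, az * bx - ax * bz, ax * by - ay * bx)

def homogenize(p, w=1.0):
    """Homogeneous form of a 3D point; any non-zero w encodes the same point."""
    if w == 0:
        raise ValueError("w = 0 encodes a direction, not a point")
    return tuple(w * c for c in p) + (w,)

def dehomogenize(h):
    """Recover the 3D point by dividing by the fourth component."""
    hx, hy, hz, hw = h
    if hw == 0:
        raise ValueError("w = 0 encodes a direction; there is no finite point")
    return (hx / hw, hy / hw, hz / hw)

def add(u, v):
    return tuple(x + y for x, y in zip(u, v))

def sub(u, v):
    return tuple(x - y for x, y in zip(u, v))

if __name__ == "__main__":
    x_axis, y_axis = (1.0, 0.0, 0.0), (0.0, 1.0, 0.0)
    print(dot(x_axis, y_axis))    # orthogonal vectors: 0.0
    print(cross(x_axis, y_axis))  # right-handed z axis: (0.0, 0.0, 1.0)

    p = homogenize((1.0, 2.0, 3.0))
    q = homogenize((4.0, 6.0, 8.0))
    d = sub(q, p)                 # point - point: w becomes 0 (a direction)
    print(d, add(p, d))           # point + direction: w is 1 again
```

Keeping $w$ as an ordinary fourth component means that the point and direction rules fall out of plain componentwise arithmetic, with no special cases needed.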
There is another, even more important, reason for adopting homogeneous 4D coordinates instead of the more familiar 3D coordinates. They allow us to express all linear 3D transformations using a 4 × 4 matrix that operates on 4 × 1 homogeneous vectors. This representation is powerful enough to express translations, rotations, scalings, shearings, and even perspective and parallel projections.

2.2 MATRICES

A 4 × 4 matrix $\mathbf{M}$ has components $m_{ij}$, where $i$ stands for the row and $j$ stands for the column:

$$\mathbf{M} = \begin{bmatrix} m_{00} & m_{01} & m_{02} & m_{03} \\ m_{10} & m_{11} & m_{12} & m_{13} \\ m_{20} & m_{21} & m_{22} & m_{23} \\ m_{30} & m_{31} & m_{32} & m_{33} \end{bmatrix}, \tag{2.7}$$

while a column vector $\mathbf{v}$ has components $v_i$:

$$\mathbf{v} = \begin{bmatrix} v_0 \\ v_1 \\ v_2 \\ v_3 \end{bmatrix} = [\,v_0 \;\; v_1 \;\; v_2 \;\; v_3\,]^T. \tag{2.8}$$

The transpose operation above converts a row vector to a column vector, and vice versa. We will generally use column vectors in the rest of this book, but will write them in transposed form: $\mathbf{v} = [\,v_0 \;\; v_1 \;\; v_2 \;\; v_3\,]^T$. On a matrix $\mathbf{M} = [m_{ij}]$, transposition produces a matrix that is mirrored with respect to the diagonal: $\mathbf{M}^T = [m_{ji}]$, that is, columns are switched with rows.

2.2.1 MATRIX PRODUCTS

A matrix times a vector produces a new vector. Directions and positions are both transformed by multiplying the corresponding homogeneous vector $\mathbf{v}$ with a transformation matrix $\mathbf{M}$ as $\mathbf{v}' = \mathbf{M}\mathbf{v}$. Each component of this column vector $\mathbf{v}'$ is obtained by taking a dot product of a row of $\mathbf{M}$ with $\mathbf{v}$; the first row ($\mathbf{M}_{0\bullet}$) producing the first component, the second row ($\mathbf{M}_{1\bullet}$) producing the second component, and so on:

$$\mathbf{v}' = \mathbf{M}\mathbf{v} = \begin{bmatrix} [\,m_{00} \;\; m_{01} \;\; m_{02} \;\; m_{03}\,] \cdot \mathbf{v} \\ [\,m_{10} \;\; m_{11} \;\; m_{12} \;\; m_{13}\,] \cdot \mathbf{v} \\ [\,m_{20} \;\; m_{21} \;\; m_{22} \;\; m_{23}\,] \cdot \mathbf{v} \\ [\,m_{30} \;\; m_{31} \;\; m_{32} \;\; m_{33}\,] \cdot \mathbf{v} \end{bmatrix}. \tag{2.9}$$

Note that for this to work, $\mathbf{M}$ needs to have as many columns as $\mathbf{v}$ has rows.
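The row-by-row view of Equation (2.9) translates almost directly into code. The sketch below is a plain-Python illustration; `mat_vec` is a hypothetical helper name, and the translation matrix `T` is an assumed example that this section has not defined (it simply stores an offset in the last column).

```python
def mat_vec(M, v):
    """Compute v' = M v: component i is the dot product of row i of M with v."""
    if any(len(row) != len(v) for row in M):
        raise ValueError("M needs as many columns as v has rows")
    return [sum(m * x for m, x in zip(row, v)) for row in M]

# Assumed example matrix: translates by (2, 3, 4) in homogeneous coordinates.
T = [
    [1, 0, 0, 2],
    [0, 1, 0, 3],
    [0, 0, 1, 4],
    [0, 0, 0, 1],
]

point = [1, 1, 1, 1]      # w = 1: a location
direction = [1, 1, 1, 0]  # w = 0: a direction

print(mat_vec(T, point))      # [3, 4, 5, 1]
print(mat_vec(T, direction))  # [1, 1, 1, 0]
```

With this example matrix the point is displaced while the direction comes back unchanged, because its zero $w$ multiplies the last column away.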
An alternative, and often more useful, way to understand the geometric meaning of the matrix product is to think of $\mathbf{M}$ as being composed of four column vectors $\mathbf{M}_{\bullet 0}, \ldots, \mathbf{M}_{\bullet 3}$, each being multiplied by the corresponding component of $\mathbf{v}$, and finally being added up:

$$\mathbf{v}' = \mathbf{M}\mathbf{v} = v_0 \begin{bmatrix} m_{00} \\ m_{10} \\ m_{20} \\ m_{30} \end{bmatrix} + v_1 \begin{bmatrix} m_{01} \\ m_{11} \\ m_{21} \\ m_{31} \end{bmatrix} + v_2 \begin{bmatrix} m_{02} \\ m_{12} \\ m_{22} \\ m_{32} \end{bmatrix} + v_3 \begin{bmatrix} m_{03} \\ m_{13} \\ m_{23} \\ m_{33} \end{bmatrix}. \tag{2.10}$$

The product of two matrices, on the other hand, produces another matrix, which can be obtained from several products of a matrix and a vector. Simply break the columns of the rightmost matrix apart into several column vectors, multiply each of them by the matrix on the left, and join the results into columns of the resulting matrix:

$$\mathbf{A}\mathbf{B} = [\,\mathbf{A}(\mathbf{B}_{\bullet 0}) \;\; \mathbf{A}(\mathbf{B}_{\bullet 1}) \;\; \mathbf{A}(\mathbf{B}_{\bullet 2}) \;\; \mathbf{A}(\mathbf{B}_{\bullet 3})\,]. \tag{2.11}$$

Note that in general matrix multiplication does not commute, that is, the order of multiplication is important ($\mathbf{A}\mathbf{B} \neq \mathbf{B}\mathbf{A}$). The transpose of a product is the product of transposes, but in the reverse order:

$$(\mathbf{A}\mathbf{B})^T = \mathbf{B}^T \mathbf{A}^T. \tag{2.12}$$

Now we are ready to express the dot product as a matrix multiplication:

$$\mathbf{a} \cdot \mathbf{b} = \mathbf{a}^T \mathbf{b} = [\,a_0 \;\; a_1 \;\; a_2\,] \begin{bmatrix} b_0 \\ b_1 \\ b_2 \end{bmatrix}, \tag{2.13}$$

that is, transpose $\mathbf{a}$ into a row vector and multiply it with a column vector $\mathbf{b}$.

2.2.2 IDENTITY AND INVERSE

The number one is special in the sense that when any number is multiplied with it, that number remains unchanged ($1 \cdot a = a$), and for any number other than zero there is an inverse that produces one ($a \cdot \frac{1}{a} = a a^{-1} = 1$). For matrices, we have an identity matrix:

$$\mathbf{I} = \begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix}. \tag{2.14}$$

A matrix multiplied by the identity matrix remains unchanged ($\mathbf{M} = \mathbf{I}\mathbf{M} = \mathbf{M}\mathbf{I}$). If a matrix $\mathbf{M}$ has an inverse, we denote it by $\mathbf{M}^{-1}$, and the matrix multiplied with its inverse yields identity: $\mathbf{M}\mathbf{M}^{-1} = \mathbf{M}^{-1}\mathbf{M} = \mathbf{I}$.
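The column-wise construction of the matrix product, the transpose rule, and the identity property can all be checked numerically. The following plain-Python sketch uses 2 × 2 matrices only to keep the numbers small; the helper names (`mat_mul`, `transpose`, `identity`) and the sample matrices are illustrative assumptions.

```python
def transpose(M):
    """Switch rows with columns: the result has components m_ji."""
    return [list(col) for col in zip(*M)]

def mat_vec(M, v):
    """v' = M v as a list; each entry is a row of M dotted with v."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]

def mat_mul(A, B):
    """AB built as in Equation (2.11): transform each column of B by A,
    then join the results as the columns of the product."""
    columns = [mat_vec(A, col) for col in transpose(B)]
    return transpose(columns)

def identity(n):
    return [[1 if i == j else 0 for j in range(n)] for i in range(n)]

A = [[1, 2],
     [3, 4]]
B = [[0, 1],
     [1, 0]]

AB = mat_mul(A, B)  # [[2, 1], [4, 3]]
BA = mat_mul(B, A)  # [[3, 4], [1, 2]]

print(AB != BA)                                                # True: order matters
print(transpose(AB) == mat_mul(transpose(B), transpose(A)))    # True: Equation (2.12)
print(mat_mul(A, identity(2)) == A == mat_mul(identity(2), A)) # True: M = IM = MI
```

Building `mat_mul` on top of `mat_vec` mirrors the text's description and keeps the product readable; a production engine would more likely use a flat array and unrolled loops.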
Only square matrices, for which the number of rows equals the number of columns, can have an inverse, and only the matrices where all columns are linearly independent have inverses. The inverse of a product of matrices is the product of inverses, in reverse order:

$$(\mathbf{A}\mathbf{B})^{-1} = \mathbf{B}^{-1}\mathbf{A}^{-1}. \tag{2.15}$$

Let us check: $\mathbf{A}\mathbf{B}(\mathbf{A}\mathbf{B})^{-1} = \mathbf{A}\mathbf{B}\mathbf{B}^{-1}\mathbf{A}^{-1} = \mathbf{A}\mathbf{I}\mathbf{A}^{-1} = \mathbf{A}\mathbf{A}^{-1} = \mathbf{I}$. We will give the inverses of most transformations that we introduce, but in a general case you may need to use a numerical method such as Gauss-Jordan elimination to calculate the inverse [Str03].

As discussed earlier, we can use 4 × 4 matrices to represent various transformations. In particular, you can interpret every matrix as transforming a vertex to a new coordinate system. If $\mathbf{M}_{ow}$ transforms a vertex from its local coordinate system, the object coordinates, to world coordinates ($\mathbf{v}' = \mathbf{M}_{ow}\mathbf{v}$), its inverse performs the transformation from world coordinates to object coordinates ($\mathbf{v} = \mathbf{M}_{ow}^{-1}\mathbf{v}' = \mathbf{M}_{wo}\mathbf{v}'$), that is, $\mathbf{M}_{ow}^{-1} = \mathbf{M}_{wo}$.

2.2.3 COMPOUND TRANSFORMATIONS

Transformation matrices can be compounded. If $\mathbf{M}_{ow}$ transforms a vertex from object coordinates to world coordinates, and $\mathbf{M}_{we}$ transforms from world coordinates to eye