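5.2. MATRICES AND DETERMINANTS

Example 3. Consider the real symmetric matrix
$$A = \begin{pmatrix} 11 & -6 & 2 \\ -6 & 10 & -4 \\ 2 & -4 & 6 \end{pmatrix}.$$
Its eigenvalues are $\lambda_1 = 3$, $\lambda_2 = 6$, $\lambda_3 = 18$, and the respective eigenvectors are
$$X_1 = \begin{pmatrix} 1 \\ 2 \\ 2 \end{pmatrix}, \quad X_2 = \begin{pmatrix} 2 \\ 1 \\ -2 \end{pmatrix}, \quad X_3 = \begin{pmatrix} 2 \\ -2 \\ 1 \end{pmatrix}.$$
Consider the matrix $S$ with the columns $X_1$, $X_2$, and $X_3$:
$$S = \begin{pmatrix} 1 & 2 & 2 \\ 2 & 1 & -2 \\ 2 & -2 & 1 \end{pmatrix}.$$
Taking $A_1 = S^T A S$, we obtain a diagonal matrix:
$$A_1 = S^T A S = \begin{pmatrix} 1 & 2 & 2 \\ 2 & 1 & -2 \\ 2 & -2 & 1 \end{pmatrix} \begin{pmatrix} 11 & -6 & 2 \\ -6 & 10 & -4 \\ 2 & -4 & 6 \end{pmatrix} \begin{pmatrix} 1 & 2 & 2 \\ 2 & 1 & -2 \\ 2 & -2 & 1 \end{pmatrix} = \begin{pmatrix} 27 & 0 & 0 \\ 0 & 54 & 0 \\ 0 & 0 & 162 \end{pmatrix}.$$
Taking $A_2 = S^{-1} A S$, we obtain a diagonal matrix with the eigenvalues on the main diagonal:
$$A_2 = S^{-1} A S = -\frac{1}{27} \begin{pmatrix} -3 & -6 & -6 \\ -6 & -3 & 6 \\ -6 & 6 & -3 \end{pmatrix} \begin{pmatrix} 11 & -6 & 2 \\ -6 & 10 & -4 \\ 2 & -4 & 6 \end{pmatrix} \begin{pmatrix} 1 & 2 & 2 \\ 2 & 1 & -2 \\ 2 & -2 & 1 \end{pmatrix} = \begin{pmatrix} 3 & 0 & 0 \\ 0 & 6 & 0 \\ 0 & 0 & 18 \end{pmatrix}.$$
We note that $A_1 = 9A_2$. This is because the columns of $S$ are mutually orthogonal and each has squared length 9, so that $S^T = 9S^{-1}$.

5.2.3-8. Characteristic equation of a matrix.

The algebraic equation of degree $n$
$$f_A(\lambda) \equiv \det(A - \lambda I) \equiv \det[a_{ij} - \lambda\delta_{ij}] \equiv \begin{vmatrix} a_{11} - \lambda & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} - \lambda & \cdots & a_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ a_{n1} & a_{n2} & \cdots & a_{nn} - \lambda \end{vmatrix} = 0$$
is called the characteristic equation of the matrix $A$ of size $n \times n$, and $f_A(\lambda)$ is called its characteristic polynomial. The spectrum of the matrix $A$ (i.e., the set of all its eigenvalues) coincides with the set of all roots of its characteristic equation. The multiplicity of every root $\lambda_i$ of the characteristic equation is equal to the multiplicity $m_i$ of the eigenvalue $\lambda_i$.

Example 4. The characteristic equation of the matrix
$$A = \begin{pmatrix} 4 & -8 & 1 \\ 5 & -9 & 1 \\ 4 & -6 & -1 \end{pmatrix}$$
has the form
$$f_A(\lambda) \equiv \det \begin{pmatrix} 4 - \lambda & -8 & 1 \\ 5 & -9 - \lambda & 1 \\ 4 & -6 & -1 - \lambda \end{pmatrix} = -\lambda^3 - 6\lambda^2 - 11\lambda - 6 = -(\lambda + 1)(\lambda + 2)(\lambda + 3) = 0.$$

Similar matrices have the same characteristic equation.

Let $\lambda_j$ be an eigenvalue of a square matrix $A$. Then
1) $\alpha\lambda_j$ is an eigenvalue of the matrix $\alpha A$ for any scalar $\alpha$;
2) $\lambda_j^p$ is an eigenvalue of the matrix $A^p$ ($p = 0, \pm 1, \ldots, \pm N$ for a nondegenerate $A$; otherwise, $p = 0, 1, \ldots, N$), where $N$ is a natural number;
3) a polynomial $f(A)$ of the matrix $A$ has the eigenvalue $f(\lambda_j)$.

Suppose that the spectra of matrices $A$ and $B$ consist of eigenvalues $\lambda_j$ and $\mu_k$, respectively.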
Then the spectrum of the Kronecker product $A \otimes B$ is the set of all products $\lambda_j\mu_k$. The spectrum of the direct sum of matrices $A = A_1 \oplus \cdots \oplus A_n$ is the union of the spectra of the matrices $A_1, \ldots, A_n$. The algebraic multiplicities of the same eigenvalues of matrices $A_1, \ldots, A_n$ are summed.

Regarding bounds for eigenvalues see Paragraph 5.6.3-4.

5.2.3-9. Cayley–Hamilton theorem. Sylvester theorem.

CAYLEY–HAMILTON THEOREM. Each square matrix $A$ satisfies its own characteristic equation; i.e., $f_A(A) = 0$.

Example 5. Let us illustrate the Cayley–Hamilton theorem by the matrix in Example 4:
$$f_A(A) = -A^3 - 6A^2 - 11A - 6I = -\begin{pmatrix} 70 & -116 & 19 \\ 71 & -117 & 19 \\ 64 & -102 & 11 \end{pmatrix} - 6\begin{pmatrix} -20 & 34 & -5 \\ -21 & 35 & -5 \\ -18 & 28 & -1 \end{pmatrix} - 11\begin{pmatrix} 4 & -8 & 1 \\ 5 & -9 & 1 \\ 4 & -6 & -1 \end{pmatrix} - 6\begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix} = 0.$$

A scalar polynomial $p(\lambda)$ is called an annihilating polynomial of a square matrix $A$ if $p(A) = 0$. For example, the characteristic polynomial $f_A(\lambda)$ is an annihilating polynomial of $A$. The unique monic annihilating polynomial of least degree is called the minimal polynomial of $A$ and is denoted by $\psi(\lambda)$. The minimal polynomial is a divisor of every annihilating polynomial.

By dividing an arbitrary polynomial $f(\lambda)$ of degree $n$ by an annihilating polynomial $p(\lambda)$ of degree $m$ ($p(\lambda) \not\equiv 0$), one obtains the representation
$$f(\lambda) = p(\lambda)q(\lambda) + r(\lambda),$$
where $q(\lambda)$ is a polynomial of degree $n - m$ (if $m \le n$) or $q(\lambda) = 0$ (if $m > n$), and $r(\lambda)$ is a polynomial of degree $l < m$ or $r(\lambda) = 0$. Hence
$$f(A) = p(A)q(A) + r(A),$$
and since $p(A) = 0$, it follows that $f(A) = r(A)$. The polynomial $r(\lambda)$ in this representation is called the interpolation polynomial of $A$.

Example 6. Let
$$f(A) = A^4 + 4A^3 + 2A^2 - 12A - 10I,$$
where the matrix $A$ is defined in Example 4. Dividing $f(\lambda)$ by the characteristic polynomial $f_A(\lambda) = -\lambda^3 - 6\lambda^2 - 11\lambda - 6$, we obtain the remainder $r(\lambda) = 3\lambda^2 + 4\lambda + 2$. Consequently,
$$f(A) = r(A) = 3A^2 + 4A + 2I.$$

THEOREM.
Every analytic function of a square $n \times n$ matrix $A$ can be represented as a polynomial of the same matrix,
$$f(A) = \frac{1}{\Delta(\lambda_1, \lambda_2, \ldots, \lambda_n)} \sum_{k=1}^{n} \Delta_{n-k} A^{n-k},$$
where $\Delta(\lambda_1, \lambda_2, \ldots, \lambda_n)$ is the Vandermonde determinant and $\Delta_i$ is obtained from $\Delta$ by replacing the $(i + 1)$st row by $(f(\lambda_1), f(\lambda_2), \ldots, f(\lambda_n))$.

Example 7. Let us find $r(A)$ by this formula for the polynomial in Example 6. We find the eigenvalues of $A$ from the characteristic equation $f_A(\lambda) = 0$: $\lambda_1 = -1$, $\lambda_2 = -2$, and $\lambda_3 = -3$. Then the Vandermonde determinant is equal to $\Delta(\lambda_1, \lambda_2, \lambda_3) = -2$, and the other determinants are $\Delta_0 = -4$, $\Delta_1 = -8$, and $\Delta_2 = -6$. It follows that
$$f(A) = \frac{1}{-2}\left[(-6)A^2 + (-8)A + (-4)I\right] = 3A^2 + 4A + 2I.$$

The Cayley–Hamilton theorem can also be used to find the powers and the inverse of a matrix $A$ (since if $f_A(A) = 0$, then $A^k f_A(A) = 0$ for any positive integer $k$).

Example 8. For the matrix in Examples 4–7, one has
$$f_A(A) = -A^3 - 6A^2 - 11A - 6I = 0.$$
Hence we obtain
$$A^3 = -6A^2 - 11A - 6I.$$
By multiplying this expression by $A$, we obtain
$$A^4 = -6A^3 - 11A^2 - 6A.$$
Now we use the representation of the cube of $A$ via lower powers of $A$ and eventually arrive at the formula
$$A^4 = 25A^2 + 60A + 36I.$$
For the inverse matrix, by analogy with the preceding, we obtain
$$A^{-1} f_A(A) = A^{-1}(-A^3 - 6A^2 - 11A - 6I) = -A^2 - 6A - 11I - 6A^{-1} = 0.$$
The definitive result is
$$A^{-1} = -\frac{1}{6}(A^2 + 6A + 11I).$$

In some cases, an analytic function of a matrix $A$ can be computed by a formula in the following theorem.

SYLVESTER'S THEOREM. If all eigenvalues of a matrix $A$ are distinct, then
$$f(A) = \sum_{k=1}^{n} f(\lambda_k) Z_k, \qquad Z_k = \frac{\prod_{i \ne k} (A - \lambda_i I)}{\prod_{i \ne k} (\lambda_k - \lambda_i)},$$
and, moreover, $Z_k = Z_k^m$ ($m = 1, 2, 3, \ldots$).

5.3. Linear Spaces

5.3.1. Concept of a Linear Space. Its Basis and Dimension

5.3.1-1. Definition of a linear space.
A linear space or a vector space over a field of scalars (usually, the field of real numbers or the field of complex numbers) is a set $V$ of elements $x, y, z, \ldots$ (also called vectors) of any nature for which the following conditions hold:

I. There is a rule that establishes correspondence between any pair of elements $x, y \in V$ and a third element $z \in V$, called the sum of the elements $x$, $y$ and denoted by $z = x + y$.

II. There is a rule that establishes correspondence between any pair $x$, $\lambda$, where $x$ is an element of $V$ and $\lambda$ is a scalar, and an element $u \in V$, called the product of a scalar $\lambda$ and a vector $x$ and denoted by $u = \lambda x$.

III. The following eight axioms are assumed for the above two operations:
1. Commutativity of the sum: $x + y = y + x$.
2. Associativity of the sum: $(x + y) + z = x + (y + z)$.
3. There is a zero element $0$ such that $x + 0 = x$ for any $x$.
4. For any element $x$ there is an opposite element $x'$ such that $x + x' = 0$.
5. A special role of the unit scalar 1: $1 \cdot x = x$ for any element $x$.
6. Associativity of the multiplication by scalars: $\lambda(\mu x) = (\lambda\mu)x$.
7. Distributivity with respect to the addition of scalars: $(\lambda + \mu)x = \lambda x + \mu x$.
8. Distributivity with respect to a sum of vectors: $\lambda(x + y) = \lambda x + \lambda y$.

This is the definition of an abstract linear space. We obtain a specific linear space if the nature of the elements and the operations of addition and multiplication by scalars are concretized.

Example 1. Consider the set of all free vectors in three-dimensional space. If addition of these vectors and their multiplication by scalars are defined as in analytic geometry (see Paragraph 4.5.1-1), this set becomes a linear space denoted by $B_3$.

Example 2. Consider the set $\{x\}$ whose elements are all positive real numbers. Let us define the sum of two elements $x$ and $y$ as the product of $x$ and $y$, and define the product of a real scalar $\lambda$ and an element $x$ as the $\lambda$th power of the positive real $x$.
The number 1 is taken as the zero element of the space $\{x\}$, and the opposite of $x$ is taken equal to $1/x$. It is easy to see that the set $\{x\}$ with these operations of addition and multiplication by scalars is a linear space.

Example 3. Consider the $n$-dimensional coordinate space $\mathbb{R}^n$, whose elements are ordered sets of $n$ arbitrary real numbers $(x_1, \ldots, x_n)$. The generic element of this space is denoted by $x$, i.e., $x = (x_1, \ldots, x_n)$, and the reals $x_1, \ldots, x_n$ are called the coordinates of the element $x$. From the algebraic standpoint, the set $\mathbb{R}^n$ may be regarded as the set of all row vectors with $n$ real components. The operations of addition of elements of $\mathbb{R}^n$ and their multiplication by scalars are defined by the following rules:
$$(x_1, \ldots, x_n) + (y_1, \ldots, y_n) = (x_1 + y_1, \ldots, x_n + y_n),$$
$$\lambda(x_1, \ldots, x_n) = (\lambda x_1, \ldots, \lambda x_n).$$

Remark. If the field of scalars $\lambda, \mu, \ldots$ in the above definition is the field of all real numbers, the corresponding linear spaces are called real linear spaces. If the field of scalars is that of all complex numbers, the corresponding space is called a complex linear space. In many situations, it is clear from the context which field of scalars is meant.

The above axioms imply the following properties of an arbitrary linear space:
1. The zero vector is unique, and for any element $x$ the opposite element is unique.
2. The zero vector $0$ is equal to the product of any element $x$ by the scalar 0.
3. For any element $x$, the opposite element is equal to the product of $x$ by the scalar $-1$.
4. The difference of two elements $x$ and $y$, i.e., the element $z$ such that $z + y = x$, is unique.

5.3.1-2. Basis and dimension of a linear space. Isomorphisms of linear spaces.

An element $y$ is called a linear combination of elements $x_1, \ldots, x_k$ of a linear space $V$ if there exist scalars $\alpha_1, \ldots, \alpha_k$ such that
$$y = \alpha_1 x_1 + \cdots + \alpha_k x_k.$$
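As a small illustration of the componentwise operations of Example 3 and of a linear combination in $\mathbb{R}^n$, here is a minimal Python sketch. The function names `add`, `scale`, and `linear_combination` are hypothetical helpers introduced only for this illustration, and elements of $\mathbb{R}^n$ are modeled as tuples of numbers.

```python
from functools import reduce

def add(x, y):
    """Componentwise sum of two elements of R^n (rule of Example 3)."""
    if len(x) != len(y):
        raise ValueError("vectors must have the same number of coordinates")
    return tuple(a + b for a, b in zip(x, y))

def scale(lam, x):
    """Product of the scalar lam and the element x of R^n (rule of Example 3)."""
    return tuple(lam * a for a in x)

def linear_combination(alphas, vectors):
    """Return alpha_1*x_1 + ... + alpha_k*x_k for vectors x_i in R^n."""
    if len(alphas) != len(vectors) or not vectors:
        raise ValueError("need one scalar per vector and at least one vector")
    terms = [scale(a, x) for a, x in zip(alphas, vectors)]
    return reduce(add, terms)

# Illustrative data (assumed, not from the text): y = 2*x1 - x2 + 3*x3
y = linear_combination([2, -1, 3], [(1, 0, 2), (0, 1, 1), (1, 1, 0)])
print(y)  # (5, 2, 3)
```

The same pattern would apply to any other concrete linear space once its own `add` and `scale` rules are substituted.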
Elements $x_1, \ldots, x_k$ of the space $V$ are said to be linearly dependent if there exist scalars $\alpha_1, \ldots, \alpha_k$ such that $|\alpha_1|^2 + \cdots + |\alpha_k|^2 \ne 0$ and
$$\alpha_1 x_1 + \cdots + \alpha_k x_k = 0,$$
where $0$ is the zero element of $V$.

Elements $x_1, \ldots, x_k$ of the space $V$ are said to be linearly independent if for any scalars $\alpha_1, \ldots, \alpha_k$ such that $|\alpha_1|^2 + \cdots + |\alpha_k|^2 \ne 0$, we have
$$\alpha_1 x_1 + \cdots + \alpha_k x_k \ne 0.$$

THEOREM. Elements $x_1, \ldots, x_k$ of a linear space $V$ are linearly dependent if and only if one of them is a linear combination of the others.

Remark. If at least one of the elements $x_1, \ldots, x_k$ is equal to zero, then these elements are linearly dependent. If some of the elements $x_1, \ldots, x_k$ are linearly dependent, then all these elements are linearly dependent.

Example 4. The elements $i_1 = (1, 0, \ldots, 0)$, $i_2 = (0, 1, \ldots, 0)$, $\ldots$, $i_n = (0, 0, \ldots, 1)$ of the space $\mathbb{R}^n$ (see Example 3) are linearly independent. For any $x = (x_1, \ldots, x_n) \in \mathbb{R}^n$, the vectors $x, i_1, \ldots, i_n$ are linearly dependent.

A basis of a linear space $V$ is defined as any system of linearly independent vectors $e_1, \ldots, e_n$ such that for any element $x$ of the space $V$ there exist scalars $x_1, \ldots, x_n$ such that
$$x = x_1 e_1 + \cdots + x_n e_n.$$
This relation is called the representation of an element $x$ in terms of the basis $e_1, \ldots, e_n$, and the scalars $x_1, \ldots, x_n$ are called the coordinates of the element $x$ in that basis.

UNIQUENESS THEOREM. The representation of any element $x \in V$ in terms of a given basis $e_1, \ldots, e_n$ is unique.

Let $e_1, \ldots, e_n$ be any basis in $V$ and let vectors $x$ and $y$ have the coordinates $x_1, \ldots, x_n$ and $y_1, \ldots, y_n$ in that basis. Then the coordinates of the vector $x + y$ in that basis are $x_1 + y_1, \ldots, x_n + y_n$, and the coordinates of the vector $\lambda x$ are $\lambda x_1, \ldots, \lambda x_n$ for any scalar $\lambda$.

Example 5. Any three noncoplanar vectors form a basis in the linear space $B_3$ of all free vectors.
The $n$ elements $i_1 = (1, 0, \ldots, 0)$, $i_2 = (0, 1, \ldots, 0)$, $\ldots$, $i_n = (0, 0, \ldots, 1)$ form a basis in the linear space $\mathbb{R}^n$. Any basis of the linear space $\{x\}$ from Example 2 consists of a single element. Any nonzero element of this space can be chosen for this purpose (that is, any positive number other than 1, since the zero element of $\{x\}$ is the number 1).

A linear space $V$ is said to be $n$-dimensional if it contains $n$ linearly independent elements and any $n + 1$ elements are linearly dependent. The number $n$ is called the dimension of that space, $n = \dim V$.

A linear space $V$ is said to be infinite-dimensional ($\dim V = \infty$) if for any positive integer $N$ it contains $N$ linearly independent elements.

THEOREM 1. If $V$ is a linear space of dimension $n$, then any $n$ linearly independent elements of that space form its basis.

THEOREM 2. If a linear space $V$ has a basis consisting of $n$ elements, then $\dim V = n$.

Example 6. The dimension of the space $B_3$ of all free vectors is equal to 3. The dimension of the space $\mathbb{R}^n$ is equal to $n$. The dimension of the space $\{x\}$ is equal to 1.

Two linear spaces $V$ and $\widetilde{V}$ over the same field of scalars are said to be isomorphic if there is a one-to-one correspondence between the elements of these spaces such that if elements $x$ and $y$ from $V$ correspond to elements $\tilde{x}$ and $\tilde{y}$ from $\widetilde{V}$, then the element $x + y$ corresponds to $\tilde{x} + \tilde{y}$ and the element $\lambda x$ corresponds to $\lambda\tilde{x}$ for any scalar $\lambda$.

Remark. If linear spaces $V$ and $\widetilde{V}$ are isomorphic, then the zero element of one space corresponds to the zero element of the other.

THEOREM. Any two $n$-dimensional real (or complex) spaces $V$ and $\widetilde{V}$ are isomorphic.

5.3.1-3. Affine space.

An affine space is a nonempty set $\mathcal{A}$ that consists of elements of any nature, called points, for which the following conditions hold:

I. There is a given linear (vector) space $V$, called the associated linear space.

II. There is a rule by which any ordered pair of points $A, B \in \mathcal{A}$ is associated with an element (vector) from $V$; this vector is denoted by $\overrightarrow{AB}$ and is called the vector issuing from the point $A$ with endpoint at $B$.

III.
The following conditions (called axioms of affine space) hold:
1. For any point $A \in \mathcal{A}$ and any vector $a \in V$, there is a unique point $B \in \mathcal{A}$ such that $\overrightarrow{AB} = a$.
2. $\overrightarrow{AB} + \overrightarrow{BC} = \overrightarrow{AC}$ for any three points $A, B, C \in \mathcal{A}$.

By definition, the dimension of an affine space $\mathcal{A}$ is the dimension of the associated linear space $V$, $\dim \mathcal{A} = \dim V$.

Any linear space may be regarded as an affine space. In particular, the space $\mathbb{R}^n$ can be naturally considered as an affine space. Thus if $A = (a_1, \ldots, a_n)$ and $B = (b_1, \ldots, b_n)$ are points of the affine space $\mathbb{R}^n$, then the corresponding vector $\overrightarrow{AB}$ from the linear space $\mathbb{R}^n$ is defined by
$$\overrightarrow{AB} = (b_1 - a_1, \ldots, b_n - a_n).$$

Let $\mathcal{A}$ be an $n$-dimensional affine space with the associated linear space $V$. A coordinate system in the affine space $\mathcal{A}$ is a fixed point $O \in \mathcal{A}$, together with a fixed basis $e_1, \ldots, e_n \in V$. The point $O$ is called the origin of this coordinate system.

Let $M$ be a point of an affine space $\mathcal{A}$ with a coordinate system $Oe_1 \ldots e_n$. One says that the point $M$ has affine coordinates (or simply coordinates) $x_1, \ldots, x_n$ in this coordinate system, and one writes $M = (x_1, \ldots, x_n)$ if $x_1, \ldots, x_n$ are the coordinates of the radius-vector $\overrightarrow{OM}$ in the basis $e_1, \ldots, e_n$, i.e.,
$$\overrightarrow{OM} = x_1 e_1 + \cdots + x_n e_n.$$

5.3.2. Subspaces of Linear Spaces

5.3.2-1. Concept of a linear subspace and a linear span.

A subset $L$ of a linear space $V$ is called a linear subspace of $V$ if the following conditions hold:
1. If $x$ and $y$ belong to $L$, then the sum $x + y$ belongs to $L$.
2. If $x$ belongs to $L$ and $\lambda$ is an arbitrary scalar, then the element $\lambda x$ belongs to $L$.

The null subspace in a linear space $V$ is its subset consisting of the single element zero. The space $V$ itself can be regarded as its own subspace. These two subspaces are called improper subspaces. All other subspaces are called proper subspaces.

Example 1. A subset $B_2$ consisting of all free vectors parallel to a given plane is a subspace in the linear space $B_3$ of all free vectors.
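The two conditions defining a linear subspace can be spot-checked numerically for Example 1. The sketch below assumes that the plane is specified by a normal vector `normal` (a hypothetical choice, (1, 2, -1) here), so that a free vector is parallel to the plane exactly when its dot product with the normal is zero. It tests a few sample vectors only and is not a proof of closure.

```python
def dot(u, v):
    """Dot product of two vectors given as tuples of equal length."""
    return sum(a * b for a, b in zip(u, v))

def in_plane(v, normal):
    """True if v is parallel to the plane with the given normal vector."""
    return dot(v, normal) == 0

normal = (1, 2, -1)   # assumed normal of the given plane
x = (1, 0, 1)         # 1*1 + 0*2 + 1*(-1) = 0, so x is parallel to the plane
y = (0, 1, 2)         # 0*1 + 1*2 + 2*(-1) = 0, so y is parallel to the plane

x_plus_y = tuple(a + b for a, b in zip(x, y))   # condition 1: the sum
scaled_x = tuple(-3 * a for a in x)             # condition 2: a scalar multiple

print(in_plane(x_plus_y, normal))    # True
print(in_plane(scaled_x, normal))    # True
print(in_plane((1, 1, 1), normal))   # False: this vector is not in the subset
```

Integer coordinates are used so that the comparison with zero is exact; with floating-point data a tolerance would be needed.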
The linear span $L(x_1, \ldots, x_m)$ of vectors $x_1, \ldots, x_m$ in a linear space $V$ is, by definition, the set of all linear combinations of these vectors, i.e., the set of all vectors of the form
$$\alpha_1 x_1 + \cdots + \alpha_m x_m,$$
where $\alpha_1, \ldots, \alpha_m$ are arbitrary scalars. The linear span $L(x_1, \ldots, x_m)$ is the smallest subspace of $V$ containing the elements $x_1, \ldots, x_m$.

If a subspace $L$ of an $n$-dimensional space $V$ does not coincide with $V$, then $\dim L < n = \dim V$.

Let elements $e_1, \ldots, e_k$ form a basis in a $k$-dimensional subspace of an $n$-dimensional linear space $V$. Then this basis can be supplemented by elements $e_{k+1}, \ldots, e_n$ of the space $V$, so that the system $e_1, \ldots, e_k, e_{k+1}, \ldots, e_n$ forms a basis in the space $V$.

THEOREM ON THE DIMENSION OF A LINEAR SPAN. The dimension of a linear span $L(x_1, \ldots, x_m)$ of elements $x_1, \ldots, x_m$ is equal to the maximal number of linearly independent vectors in the system $x_1, \ldots, x_m$.

5.3.2-2. Sum and intersection of subspaces.

The intersection of subspaces $L_1$ and $L_2$ of one and the same linear space $V$ is, by definition, the set of all elements $x$ of $V$ that belong simultaneously to both spaces $L_1$ and $L_2$. Such elements form a subspace of $V$.

The sum of subspaces $L_1$ and $L_2$ of one and the same linear space $V$ is, by definition, the set of all elements of $V$ that can be represented in the form $y + z$, where $y$ is an element of $L_1$ and $z$ is an element of $L_2$. The sum of subspaces is also a subspace of $V$.

THEOREM. The sum of dimensions of arbitrary subspaces $L_1$ and $L_2$ of a finite-dimensional space $V$ is equal to the sum of the dimension of their intersection and the dimension of their sum, i.e., $\dim L_1 + \dim L_2 = \dim(L_1 \cap L_2) + \dim(L_1 + L_2)$.

Example 2. Let $V$ be the linear space of all free vectors (in three-dimensional space). Denote by $L_1$ the subspace of all free vectors parallel to the plane $OXY$, and by $L_2$ the subspace of all free vectors parallel to the plane $OXZ$.
Then the sum of the subspaces $L_1$ and $L_2$ coincides with $V$, and their intersection consists of all free vectors parallel to the axis $OX$. The dimension of each space $L_1$ and $L_2$ is equal to two, the dimension of their sum is equal to three, and the dimension of their intersection is equal to unity.

5.3.2-3. Representation of a linear space as a direct sum of its subspaces.

A linear space $V$ can be represented as a direct sum of its subspaces $V_1$ and $V_2$ if each element $x \in V$ admits the unique representation $x = x_1 + x_2$, where $x_1 \in V_1$ and $x_2 \in V_2$. In this case, one writes $V = V_1 \oplus V_2$.

Example 3. The space $V$ of all free vectors (in three-dimensional space) can be represented as the direct sum of the subspace $V_1$ formed by all free vectors parallel to the plane $OXY$ and the subspace $V_2$ formed by all free vectors parallel to the axis $OZ$.

THEOREM. An $n$-dimensional space $V$ is a direct sum of its subspaces $V_1$ and $V_2$ if and only if the intersection of $V_1$ and $V_2$ is the null subspace and $\dim V = \dim V_1 + \dim V_2$.

Remark. If $V$ is the sum of its subspaces $V_1$ and $V_2$, but not the direct sum, then the representation $x = x_1 + x_2$ is nonunique, in general.

5.3.3. Coordinate Transformations Corresponding to Basis Transformations in a Linear Space

5.3.3-1. Basis transformation and its inverse.

Let $e_1, \ldots, e_n$ and $\tilde{e}_1, \ldots, \tilde{e}_n$ be two arbitrary bases of an $n$-dimensional linear space $V$. Suppose that the elements $\tilde{e}_1, \ldots, \tilde{e}_n$ are expressed via $e_1, \ldots, e_n$ by the formulas
$$\begin{aligned} \tilde{e}_1 &= a_{11} e_1 + a_{12} e_2 + \cdots + a_{1n} e_n, \\ \tilde{e}_2 &= a_{21} e_1 + a_{22} e_2 + \cdots + a_{2n} e_n, \\ &\;\;\vdots \\ \tilde{e}_n &= a_{n1} e_1 + a_{n2} e_2 + \cdots + a_{nn} e_n. \end{aligned}$$
Thus, the transition from the basis $e_1, \ldots, e_n$ to the basis $\tilde{e}_1, \ldots, \tilde{e}_n$ is determined by the matrix
$$A \equiv \begin{pmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ a_{n1} & a_{n2} & \cdots & a_{nn} \end{pmatrix}.$$
Note that $\det A \ne 0$, i.e., the matrix $A$ is nondegenerate. The transition from the basis $\tilde{e}_1, \ldots, \tilde{e}_n$ to the basis $e_1, \ldots, e_n$ is determined by the matrix $B \equiv [b_{ij}] = A^{-1}$.
Thus, we can write
$$\tilde{e}_i = \sum_{j=1}^{n} a_{ij} e_j, \qquad e_k = \sum_{j=1}^{n} b_{kj} \tilde{e}_j \qquad (i, k = 1, 2, \ldots, n). \tag{5.3.3.1}$$
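The relation between the two transition matrices can be verified on a small case. The following sketch assumes the space $\mathbb{R}^2$ with the standard basis playing the role of the first basis and a hypothetical transition matrix `A`. The helper names `inverse_2x2` and `combine` are illustrative only, and exact rational arithmetic from the standard `fractions` module is used to avoid rounding.

```python
from fractions import Fraction

def inverse_2x2(m):
    """Inverse of a 2x2 matrix with a nonzero determinant (exact arithmetic)."""
    (a, b), (c, d) = m
    det = a * d - b * c
    if det == 0:
        raise ValueError("matrix is degenerate")
    return [[Fraction(d, det), Fraction(-b, det)],
            [Fraction(-c, det), Fraction(a, det)]]

def combine(coeffs, vectors):
    """Return sum_j coeffs[j] * vectors[j] for vectors given as tuples."""
    n = len(vectors[0])
    return tuple(sum(c * v[i] for c, v in zip(coeffs, vectors)) for i in range(n))

e = [(1, 0), (0, 1)]                     # first basis (assumed: standard basis of R^2)
A = [[2, 1], [1, 1]]                     # assumed transition matrix, det A = 1
e_new = [combine(row, e) for row in A]   # second basis: i-th vector = sum_j a_ij e_j

B = inverse_2x2(A)                       # B = A^{-1}
e_back = [combine(row, e_new) for row in B]   # k-th vector = sum_j b_kj * (second basis)_j

print(e_new)         # [(2, 1), (1, 1)]
print(e_back == e)   # True: the rows of B recover the first basis
```

For larger $n$ the same check would need a general matrix inverse in place of the 2x2 formula.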