Remark. If a matrix is real (i.e., all its entries are real), then its transpose and its adjoint coincide.

A square matrix $A$ is said to be normal if $A^*A = AA^*$. A normal matrix $A$ is said to be unitary if $A^*A = AA^* = I$, i.e., $A^* = A^{-1}$ (see Paragraph 5.2.1-6).

5.2.1-4. Trace of a matrix.

The trace of a square matrix $A \equiv [a_{ij}]$ of size $n \times n$ is the sum $S$ of its diagonal entries,
$$ S = \operatorname{Tr}(A) = \sum_{i=1}^{n} a_{ii}. $$
If $\lambda$ is a scalar and the square matrices $A$ and $B$ have the same size, then
$$ \operatorname{Tr}(A+B) = \operatorname{Tr}(A) + \operatorname{Tr}(B), \qquad \operatorname{Tr}(\lambda A) = \lambda \operatorname{Tr}(A), \qquad \operatorname{Tr}(AB) = \operatorname{Tr}(BA). $$

5.2.1-5. Linear dependence of row vectors (column vectors).

A row vector (column vector) $B$ is a linear combination of row vectors (column vectors) $A_1, \ldots, A_k$ if there exist scalars $\alpha_1, \ldots, \alpha_k$ such that $B = \alpha_1 A_1 + \cdots + \alpha_k A_k$.

Row vectors (column vectors) $A_1, \ldots, A_k$ are said to be linearly dependent if there exist scalars $\alpha_1, \ldots, \alpha_k$, not all zero, such that
$$ \alpha_1 A_1 + \cdots + \alpha_k A_k = O, $$
where $O$ is the zero row vector (column vector). Row vectors (column vectors) $A_1, \ldots, A_k$ are said to be linearly independent if for any scalars $\alpha_1, \ldots, \alpha_k$, not all zero, we have $\alpha_1 A_1 + \cdots + \alpha_k A_k \ne O$.

THEOREM. Row vectors (column vectors) $A_1, \ldots, A_k$ are linearly dependent if and only if one of them is a linear combination of the others.

5.2.1-6. Inverse matrices.

Let $A$ be a square matrix of size $n \times n$, and let $I$ be the unit matrix of the same size. A square matrix $B$ of size $n \times n$ is called a right inverse of $A$ if $AB = I$. A square matrix $C$ of size $n \times n$ is called a left inverse of $A$ if $CA = I$. If one of the matrices $B$ or $C$ exists, then the other exists too, and the two matrices coincide. In this case, the matrix $A$ is said to be nondegenerate (nonsingular).

THEOREM. A square matrix is nondegenerate if and only if its rows (columns) are linearly independent.

Remark. Generally, instead of the terms "left inverse matrix" and "right inverse matrix," the term "inverse matrix" is used for the matrix $B = A^{-1}$ of a nondegenerate matrix $A$, since $AB = BA = I$.

UNIQUENESS THEOREM. For a given nondegenerate matrix $A$, the matrix $A^{-1}$ is the unique matrix satisfying the condition $AA^{-1} = A^{-1}A = I$.

Remark. For the existence theorem, see Paragraph 5.2.2-7.

Properties of inverse matrices:
$$ (AB)^{-1} = B^{-1}A^{-1}, \qquad (\lambda A)^{-1} = \lambda^{-1}A^{-1}, $$
$$ (A^{-1})^{-1} = A, \qquad (A^{-1})^T = (A^T)^{-1}, \qquad (A^{-1})^* = (A^*)^{-1}, $$
where the square matrices $A$ and $B$ are assumed to be nondegenerate and the scalar $\lambda \ne 0$. The problem of finding the inverse matrix is considered in Paragraphs 5.2.2-7, 5.2.4-5, and 5.5.2-3.

5.2.1-7. Powers of matrices.

A product of several matrices all equal to the same matrix $A$ can be written as a positive integer power of $A$: $AA = A^2$, $AAA = A^2A = A^3$, etc. For a positive integer $k$, one defines $A^k = A^{k-1}A$ as the $k$th power of $A$. For a nondegenerate matrix $A$, one also defines
$$ A^0 = AA^{-1} = I, \qquad A^{-k} = (A^{-1})^k. $$
Powers of a matrix have the following properties:
$$ A^pA^q = A^{p+q}, \qquad (A^p)^q = A^{pq}, $$
where $p$ and $q$ are arbitrary positive integers and $A$ is an arbitrary square matrix; or $p$ and $q$ are arbitrary integers and $A$ is an arbitrary nondegenerate matrix.

There exist matrices some positive integer power of which is equal to the zero matrix, even though $A \ne O$. If $A^k = O$ for some integer $k > 1$, then $A$ is called a nilpotent matrix. A matrix $A$ is said to be involutive if it coincides with its inverse: $A = A^{-1}$, or $A^2 = I$.
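The inverse and power properties above are easy to check numerically. Below is a minimal NumPy sketch (the matrices are arbitrary illustrative values, not taken from the text); it verifies $(AB)^{-1} = B^{-1}A^{-1}$, $(A^{-1})^T = (A^T)^{-1}$, the power law $A^pA^q = A^{p+q}$ for integer exponents, and exhibits a nilpotent matrix.

```python
import numpy as np

# Two arbitrary nondegenerate 3x3 matrices (illustrative values).
A = np.array([[2.0, 1.0, 0.0],
              [1.0, 3.0, 1.0],
              [0.0, 1.0, 2.0]])
B = np.array([[1.0, 0.0, 2.0],
              [0.0, 1.0, 1.0],
              [1.0, 1.0, 0.0]])

# (AB)^{-1} = B^{-1} A^{-1}
assert np.allclose(np.linalg.inv(A @ B),
                   np.linalg.inv(B) @ np.linalg.inv(A))

# (A^{-1})^T = (A^T)^{-1}
assert np.allclose(np.linalg.inv(A).T, np.linalg.inv(A.T))

# A^p A^q = A^{p+q} for integer (possibly negative) exponents;
# matrix_power inverts A first when the exponent is negative.
p, q = 3, -2
lhs = np.linalg.matrix_power(A, p) @ np.linalg.matrix_power(A, q)
assert np.allclose(lhs, np.linalg.matrix_power(A, p + q))

# A nilpotent matrix: strictly upper triangular, so N^3 = O although N != O.
N = np.array([[0.0, 1.0, 0.0],
              [0.0, 0.0, 1.0],
              [0.0, 0.0, 0.0]])
assert np.allclose(np.linalg.matrix_power(N, 3), np.zeros((3, 3)))
```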
5.2.1-8. Polynomials and matrices. Basic functions of matrices.

A polynomial with matrix argument is an expression obtained from a scalar polynomial $f(x)$ by replacing the scalar argument $x$ with a square matrix $X$:
$$ f(X) = a_0 I + a_1 X + a_2 X^2 + \cdots, $$
where the $a_i$ ($i = 0, 1, 2, \ldots$) are real or complex coefficients. The polynomial $f(X)$ is a square matrix of the same size as $X$.

A polynomial with matrix coefficients is an expression obtained from a polynomial $f(x)$ by replacing its coefficients $a_i$ ($i = 0, 1, 2, \ldots$) with matrices $A_i$ ($i = 0, 1, 2, \ldots$) of the same size:
$$ F(x) = A_0 + A_1 x + A_2 x^2 + \cdots. $$

Example 3. For the matrix
$$ A = \begin{pmatrix} 4 & -8 & 1 \\ 5 & -9 & 1 \\ 4 & -6 & -1 \end{pmatrix}, $$
the characteristic matrix (see Paragraph 5.2.3-2) is a polynomial with matrix coefficients and argument $\lambda$:
$$ F(\lambda) \equiv A - \lambda I = A_0 + A_1\lambda = \begin{pmatrix} 4-\lambda & -8 & 1 \\ 5 & -9-\lambda & 1 \\ 4 & -6 & -1-\lambda \end{pmatrix}, $$
where
$$ A_0 = A = \begin{pmatrix} 4 & -8 & 1 \\ 5 & -9 & 1 \\ 4 & -6 & -1 \end{pmatrix}, \qquad A_1 = -I = \begin{pmatrix} -1 & 0 & 0 \\ 0 & -1 & 0 \\ 0 & 0 & -1 \end{pmatrix}. $$
The corresponding adjugate matrix (see Paragraph 5.2.2-7) can also be represented as a polynomial with matrix coefficients:
$$ G(\lambda) = \begin{pmatrix} \lambda^2+10\lambda+15 & -8\lambda-14 & \lambda+1 \\ 5\lambda+9 & \lambda^2-3\lambda-8 & \lambda+1 \\ 4\lambda+6 & -6\lambda-8 & \lambda^2+5\lambda+4 \end{pmatrix} = A_0 + A_1\lambda + A_2\lambda^2, $$
where
$$ A_0 = \begin{pmatrix} 15 & -14 & 1 \\ 9 & -8 & 1 \\ 6 & -8 & 4 \end{pmatrix}, \qquad A_1 = \begin{pmatrix} 10 & -8 & 1 \\ 5 & -3 & 1 \\ 4 & -6 & 5 \end{pmatrix}, \qquad A_2 = I = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix}. $$

The variable $x$ in a polynomial with matrix coefficients can be replaced by a matrix $X$, which yields a polynomial of matrix argument with matrix coefficients. In this situation, one distinguishes between the "left" and the "right" values:
$$ F(X) = A_0 + A_1X + A_2X^2 + \cdots, \qquad F(X) = A_0 + XA_1 + X^2A_2 + \cdots. $$

The exponential function of a square matrix $X$ can be represented by the following convergent series:
$$ e^X = I + X + \frac{X^2}{2!} + \frac{X^3}{3!} + \cdots = \sum_{k=0}^{\infty} \frac{X^k}{k!}. $$
Its inverse matrix has the form
$$ (e^X)^{-1} = e^{-X} = I - X + \frac{X^2}{2!} - \frac{X^3}{3!} + \cdots = \sum_{k=0}^{\infty} \frac{(-1)^kX^k}{k!}. $$

Remark. Note that $e^Xe^Y \ne e^Ye^X$ in general. The relation $e^Xe^Y = e^{X+Y}$ holds only for commuting matrices $X$ and $Y$.

Some other functions of matrices can be expressed in terms of the exponential function:
$$ \sin X = \frac{1}{2i}(e^{iX} - e^{-iX}), \qquad \cos X = \frac{1}{2}(e^{iX} + e^{-iX}), $$
$$ \sinh X = \frac{1}{2}(e^X - e^{-X}), \qquad \cosh X = \frac{1}{2}(e^X + e^{-X}). $$

5.2.1-9. Decomposition of matrices.

THEOREM 1. For any square matrix $A$, the matrix $S_1 = \frac{1}{2}(A + A^T)$ is symmetric and the matrix $S_2 = \frac{1}{2}(A - A^T)$ is skew-symmetric. The representation of $A$ as the sum of a symmetric and a skew-symmetric matrix is unique: $A = S_1 + S_2$.

THEOREM 2. For any square matrix $A$, the matrices $H_1 = \frac{1}{2}(A + A^*)$ and $H_2 = \frac{1}{2i}(A - A^*)$ are Hermitian, and the matrix $iH_2$ is skew-Hermitian. The representation of $A$ as the sum of a Hermitian and a skew-Hermitian matrix is unique: $A = H_1 + iH_2$.

THEOREM 3. For any square matrix $A$, the matrices $AA^*$ and $A^*A$ are nonnegative Hermitian matrices (see Paragraph 5.7.3-1).

THEOREM 4. Any square matrix $A$ admits polar decompositions $A = QU$ and $A = U_1Q_1$, where $Q$ and $Q_1$ are nonnegative Hermitian matrices, $Q^2 = AA^*$ and $Q_1^2 = A^*A$, and $U$ and $U_1$ are unitary matrices. The matrices $Q$ and $Q_1$ are always unique, while the matrices $U$ and $U_1$ are unique only if $A$ is nondegenerate.
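The series definition of $e^X$ and the commutation caveat in the remark can be illustrated numerically. The sketch below uses illustrative matrices and assumes SciPy is available (`scipy.linalg.expm` computes the matrix exponential); it checks that $e^Xe^{-X} = I$, that $e^Xe^Y \ne e^{X+Y}$ for a non-commuting pair, and the formula for $\cosh X$.

```python
import numpy as np
from math import factorial
from scipy.linalg import expm  # matrix exponential

# Two illustrative matrices that do NOT commute: XY != YX.
X = np.array([[0.0, 1.0],
              [0.0, 0.0]])
Y = np.array([[0.0, 0.0],
              [1.0, 0.0]])
assert not np.allclose(X @ Y, Y @ X)

# e^X e^{-X} = I (the inverse of the matrix exponential).
assert np.allclose(expm(X) @ expm(-X), np.eye(2))

# For non-commuting X and Y the relation e^X e^Y = e^{X+Y} fails ...
assert not np.allclose(expm(X) @ expm(Y), expm(X + Y))

# ... but it holds for commuting matrices, e.g., X and 2X.
assert np.allclose(expm(X) @ expm(2 * X), expm(3 * X))

# cosh X = (e^X + e^{-X}) / 2, checked against the truncated even series.
cosh_X = 0.5 * (expm(X) + expm(-X))
series = sum(np.linalg.matrix_power(X, k) / factorial(k)
             for k in range(0, 8, 2))
assert np.allclose(cosh_X, series)
```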
5.2.1-10. Block matrices.

Let us split a given matrix $A \equiv [a_{ij}]$ ($i = 1, 2, \ldots, m$; $j = 1, 2, \ldots, n$) of size $m \times n$ into rectangular cells by $(M-1)$ horizontal and $(N-1)$ vertical lines. Each cell is a matrix $A_{\alpha\beta} \equiv [a_{ij}]$ ($i = i_\alpha, i_\alpha+1, \ldots, i_\alpha+m_\alpha-1$; $j = j_\beta, j_\beta+1, \ldots, j_\beta+n_\beta-1$) of size $m_\alpha \times n_\beta$ and is called a block of the matrix $A$. Here $i_\alpha = m_{\alpha-1} + i_{\alpha-1}$ and $j_\beta = n_{\beta-1} + j_{\beta-1}$.

Then the given matrix $A$ can be regarded as a new matrix whose entries are the blocks:
$$ A \equiv [A_{\alpha\beta}] \qquad (\alpha = 1, 2, \ldots, M; \ \beta = 1, 2, \ldots, N). $$
This matrix is called a block matrix.

Example 4. The matrix
$$ A \equiv \begin{pmatrix} a_{11} & a_{12} & a_{13} & a_{14} & a_{15} \\ a_{21} & a_{22} & a_{23} & a_{24} & a_{25} \\ a_{31} & a_{32} & a_{33} & a_{34} & a_{35} \\ a_{41} & a_{42} & a_{43} & a_{44} & a_{45} \\ a_{51} & a_{52} & a_{53} & a_{54} & a_{55} \end{pmatrix} $$
can be regarded as the block matrix
$$ A \equiv \begin{pmatrix} A_{11} & A_{12} \\ A_{21} & A_{22} \end{pmatrix} $$
of size $2 \times 2$ with the entries being the blocks
$$ A_{11} \equiv \begin{pmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \end{pmatrix}, \quad A_{12} \equiv \begin{pmatrix} a_{14} & a_{15} \\ a_{24} & a_{25} \end{pmatrix}, \quad A_{21} \equiv \begin{pmatrix} a_{31} & a_{32} & a_{33} \\ a_{41} & a_{42} & a_{43} \\ a_{51} & a_{52} & a_{53} \end{pmatrix}, \quad A_{22} \equiv \begin{pmatrix} a_{34} & a_{35} \\ a_{44} & a_{45} \\ a_{54} & a_{55} \end{pmatrix} $$
of size $2 \times 3$, $2 \times 2$, $3 \times 3$, $3 \times 2$, respectively.

Basic operations with block matrices are practically the same as those with ordinary matrices, with the blocks playing the role of the entries:
1. For matrices $A \equiv [a_{ij}] \equiv [A_{\alpha\beta}]$ and $B \equiv [b_{ij}] \equiv [B_{\alpha\beta}]$ of the same size and the same block structure, the sum $C \equiv [C_{\alpha\beta}] = [A_{\alpha\beta} + B_{\alpha\beta}]$ is a matrix of the same size and the same block structure.
2. For a matrix $A \equiv [a_{ij}]$ of size $m \times n$ regarded as a block matrix $A \equiv [A_{\alpha\beta}]$ of size $M \times N$, multiplication by a scalar is defined by $\lambda A = [\lambda A_{\alpha\beta}] = [\lambda a_{ij}]$.
3. Let $A \equiv [a_{ik}] \equiv [A_{\alpha\gamma}]$ and $B \equiv [b_{kj}] \equiv [B_{\gamma\beta}]$ be two block matrices such that the number of columns of each block $A_{\alpha\gamma}$ is equal to the number of rows of the block $B_{\gamma\beta}$. Then the product of the matrices $A$ and $B$ can be regarded as the block matrix $C \equiv [C_{\alpha\beta}] = \big[\sum_\gamma A_{\alpha\gamma}B_{\gamma\beta}\big]$.
4. For a matrix $A \equiv [a_{ij}]$ of size $m \times n$ regarded as a block matrix $A \equiv [A_{\alpha\beta}]$ of size $M \times N$, the transpose has the form $A^T = [A^T_{\beta\alpha}]$.
5. For a matrix $A \equiv [a_{ij}]$ of size $m \times n$ regarded as a block matrix $A \equiv [A_{\alpha\beta}]$ of size $M \times N$, the adjoint matrix has the form $A^* = [A^*_{\beta\alpha}]$.

Let $A$ be a nondegenerate matrix of size $n \times n$ represented as the block matrix
$$ A \equiv \begin{pmatrix} A_{11} & A_{12} \\ A_{21} & A_{22} \end{pmatrix}, $$
where $A_{11}$ and $A_{22}$ are square matrices of size $p \times p$ and $q \times q$, respectively ($p + q = n$). Then the following relations, called the Frobenius formulas, hold (a numerical check is sketched below, after the direct sum properties):
$$ A^{-1} = \begin{pmatrix} A_{11}^{-1} + A_{11}^{-1}A_{12}NA_{21}A_{11}^{-1} & -A_{11}^{-1}A_{12}N \\ -NA_{21}A_{11}^{-1} & N \end{pmatrix}, \qquad A^{-1} = \begin{pmatrix} K & -KA_{12}A_{22}^{-1} \\ -A_{22}^{-1}A_{21}K & A_{22}^{-1} + A_{22}^{-1}A_{21}KA_{12}A_{22}^{-1} \end{pmatrix}. $$
Here $N = (A_{22} - A_{21}A_{11}^{-1}A_{12})^{-1}$ and $K = (A_{11} - A_{12}A_{22}^{-1}A_{21})^{-1}$; in the first formula, the matrix $A_{11}$ is assumed nondegenerate, and in the second formula, $A_{22}$ is assumed nondegenerate.

The direct sum of two square matrices $A$ and $B$ of size $m \times m$ and $n \times n$, respectively, is the block matrix
$$ C = A \oplus B = \begin{pmatrix} A & 0 \\ 0 & B \end{pmatrix} $$
of size $(m+n) \times (m+n)$.

Properties of the direct sum of matrices:
1. For any square matrices $A$, $B$, and $C$, the following relations hold:
$$ (A \oplus B) \oplus C = A \oplus (B \oplus C) \quad \text{(associativity)}, \qquad \operatorname{Tr}(A \oplus B) = \operatorname{Tr}(A) + \operatorname{Tr}(B) \quad \text{(trace property)}. $$
2. For nondegenerate square matrices $A$ and $B$: $(A \oplus B)^{-1} = A^{-1} \oplus B^{-1}$.
3. For square matrices $A_m$, $B_m$ of size $m \times m$ and square matrices $A_n$, $B_n$ of size $n \times n$:
$$ (A_m \oplus A_n) + (B_m \oplus B_n) = (A_m + B_m) \oplus (A_n + B_n), \qquad (A_m \oplus A_n)(B_m \oplus B_n) = A_mB_m \oplus A_nB_n. $$
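As the promised numerical check of the first Frobenius formula, the sketch below (arbitrary illustrative blocks; the diagonal shift merely keeps the blocks well conditioned) assembles the block inverse with NumPy and compares it against direct inversion.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative sizes: p = 2, q = 3, so A is 5x5.
p, q = 2, 3
A11 = rng.standard_normal((p, p)) + 3 * np.eye(p)  # keep A11 nondegenerate
A12 = rng.standard_normal((p, q))
A21 = rng.standard_normal((q, p))
A22 = rng.standard_normal((q, q)) + 3 * np.eye(q)

A = np.block([[A11, A12],
              [A21, A22]])

inv11 = np.linalg.inv(A11)
# N = (A22 - A21 A11^{-1} A12)^{-1}, the inverted Schur complement of A11.
N = np.linalg.inv(A22 - A21 @ inv11 @ A12)

# First Frobenius formula, assembled block by block.
A_inv_blocks = np.block([
    [inv11 + inv11 @ A12 @ N @ A21 @ inv11, -inv11 @ A12 @ N],
    [-N @ A21 @ inv11,                      N               ],
])

assert np.allclose(A_inv_blocks, np.linalg.inv(A))
```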
5.2.1-11. Kronecker product of matrices.

The Kronecker product of two matrices $A \equiv [a_{i_aj_a}]$ and $B \equiv [b_{i_bj_b}]$ of size $m_a \times n_a$ and $m_b \times n_b$, respectively, is the matrix $C \equiv [c_{kh}]$ of size $m_am_b \times n_an_b$ with entries
$$ c_{kh} = a_{i_aj_a}b_{i_bj_b} \qquad (k = 1, 2, \ldots, m_am_b; \ h = 1, 2, \ldots, n_an_b), $$
where the index $k$ is the serial number of the pair $(i_a, i_b)$ in the sequence $(1,1), (1,2), \ldots, (1,m_b), (2,1), (2,2), \ldots, (m_a,m_b)$, and the index $h$ is the serial number of the pair $(j_a, j_b)$ in a similar sequence.

The Kronecker product $C = A \otimes B$ can be represented as the block matrix $C \equiv [a_{i_aj_a}B]$. Note that if the products $AC$ and $BD$ exist (i.e., the number of columns of $A$ equals the number of rows of $C$, and the number of columns of $B$ equals the number of rows of $D$), then
$$ (A \otimes B)(C \otimes D) = AC \otimes BD. $$
The following relations also hold:
$$ (A \otimes B)^T = A^T \otimes B^T, \qquad \operatorname{Tr}(A \otimes B) = \operatorname{Tr}(A)\operatorname{Tr}(B) \quad \text{(for square $A$ and $B$)}. $$
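The mixed-product rule and the transpose and trace relations are easy to verify with NumPy's `np.kron`, whose row ordering matches the pairing $(i_a, i_b)$ described above. A minimal sketch with arbitrary illustrative matrices:

```python
import numpy as np

rng = np.random.default_rng(1)

# Shapes chosen so that the products AC and BD exist:
# A: 2x3, C: 3x2  ->  AC: 2x2;   B: 3x2, D: 2x4  ->  BD: 3x4.
A = rng.standard_normal((2, 3))
C = rng.standard_normal((3, 2))
B = rng.standard_normal((3, 2))
D = rng.standard_normal((2, 4))

# Mixed-product property: (A (x) B)(C (x) D) = AC (x) BD.
assert np.allclose(np.kron(A, B) @ np.kron(C, D),
                   np.kron(A @ C, B @ D))

# Transpose relation, and the trace relation for square matrices.
P = rng.standard_normal((3, 3))
Q = rng.standard_normal((4, 4))
assert np.allclose(np.kron(A, B).T, np.kron(A.T, B.T))
assert np.isclose(np.trace(np.kron(P, Q)), np.trace(P) * np.trace(Q))
```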
5.2.2. Determinants

5.2.2-1. Notion of determinant.

With each square matrix $A \equiv [a_{ij}]$ of size $n \times n$ one can associate a numerical characteristic, called its determinant. The determinant of such a matrix can be defined by induction with respect to the size $n$.

For a matrix of size $1 \times 1$ ($n = 1$), the first-order determinant is equal to its only entry, $\Delta \equiv \det A = a_{11}$.

For a matrix of size $2 \times 2$ ($n = 2$), the second-order determinant is equal to the difference between the product of its entries on the main diagonal and the product of its entries on the secondary diagonal:
$$ \Delta \equiv \det A \equiv \begin{vmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{vmatrix} = a_{11}a_{22} - a_{12}a_{21}. $$

For a matrix of size $3 \times 3$ ($n = 3$), the third-order determinant is
$$ \Delta \equiv \det A \equiv \begin{vmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{vmatrix} = a_{11}a_{22}a_{33} + a_{12}a_{23}a_{31} + a_{21}a_{32}a_{13} - a_{13}a_{22}a_{31} - a_{12}a_{21}a_{33} - a_{23}a_{32}a_{11}. $$
This expression is obtained by the triangle rule (Sarrus scheme), illustrated in the original by diagrams in which the entries occurring in the same product with a given sign are joined by segments.

For a matrix of size $n \times n$ ($n > 2$), the $n$th-order determinant is defined as follows, under the assumption that the $(n-1)$st-order determinant has already been defined for matrices of size $(n-1) \times (n-1)$. Consider a matrix $A = [a_{ij}]$ of size $n \times n$. The minor $M_{ij}$ corresponding to an entry $a_{ij}$ is defined as the $(n-1)$st-order determinant of the matrix of size $(n-1) \times (n-1)$ obtained from the original matrix $A$ by removing the $i$th row and the $j$th column (i.e., the row and the column whose intersection contains the entry $a_{ij}$). The cofactor $A_{ij}$ of the entry $a_{ij}$ is defined by $A_{ij} = (-1)^{i+j}M_{ij}$ (i.e., it coincides with the corresponding minor if $i + j$ is even, and is the opposite of the minor if $i + j$ is odd). The $n$th-order determinant of the matrix $A$ is defined by
$$ \Delta \equiv \det A \equiv \begin{vmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ a_{n1} & a_{n2} & \cdots & a_{nn} \end{vmatrix} = \sum_{k=1}^{n} a_{ik}A_{ik} = \sum_{k=1}^{n} a_{kj}A_{kj}. $$
The first sum is called the $i$th row expansion of the determinant of $A$, and the second the $j$th column expansion.

Example 1. Let us find the third-order determinant of the matrix
$$ A = \begin{pmatrix} 1 & -1 & 2 \\ 6 & 1 & 5 \\ 2 & -1 & -4 \end{pmatrix}. $$
Using the second-column expansion, we obtain
$$ \det A = \sum_{i=1}^{3} (-1)^{i+2}a_{i2}M_{i2} = (-1)^{1+2}(-1)\begin{vmatrix} 6 & 5 \\ 2 & -4 \end{vmatrix} + (-1)^{2+2}\cdot 1\cdot\begin{vmatrix} 1 & 2 \\ 2 & -4 \end{vmatrix} + (-1)^{3+2}(-1)\begin{vmatrix} 1 & 2 \\ 6 & 5 \end{vmatrix} $$
$$ = 1\times[6\times(-4) - 5\times 2] + 1\times[1\times(-4) - 2\times 2] + 1\times[1\times 5 - 2\times 6] = -49. $$

5.2.2-2. Properties of determinants.

Basic properties:
1. Invariance under transposition: $\det A = \det A^T$.
2. Antisymmetry under the permutation of two rows (or columns): if two rows (columns) of a matrix are interchanged, its determinant preserves its absolute value but changes its sign.
3. Linearity with respect to a row (or column) of the corresponding matrix: suppose that the $i$th row of a matrix $A \equiv [a_{ij}]$ is a linear combination of two row vectors, $(a_{i1}, \ldots, a_{in}) = \lambda(b_1, \ldots, b_n) + \mu(c_1, \ldots, c_n)$; then $\det A = \lambda\det A_b + \mu\det A_c$, where $A_b$ and $A_c$ are the matrices obtained from $A$ by replacing its $i$th row with $(b_1, \ldots, b_n)$ and $(c_1, \ldots, c_n)$, respectively. This fact, together with the first property, implies that a similar linearity relation holds if a column of $A$ is a linear combination of two column vectors.

Some useful corollaries of the basic properties:
1. The determinant of a matrix with two equal rows (columns) is equal to zero.
2. If all entries of a row (column) are multiplied by $\lambda$, the determinant of the resulting matrix is multiplied by $\lambda$.
3. If a matrix contains a row (column) consisting of zeros, then its determinant is equal to zero.
4. If a matrix has two proportional rows (columns), its determinant is equal to zero.
5. If a matrix has a row (column) that is a linear combination of its other rows (columns), its determinant is equal to zero.
6. The determinant of a matrix does not change if a linear combination of some of its rows (columns) is added to another row (column).

THEOREM (NECESSARY AND SUFFICIENT CONDITION FOR A MATRIX TO BE DEGENERATE). The determinant of a square matrix is equal to zero if and only if its rows (columns) are linearly dependent.

5.2.2-3. Minors. Basic minors. Rank and defect of a matrix.

Let $A \equiv [a_{ij}]$ be a matrix of size $n \times n$. Its $m$th-order ($m \le n$) minor of the first kind, denoted by $M^{i_1i_2\ldots i_m}_{j_1j_2\ldots j_m}$, is the $m$th-order determinant of a submatrix obtained from $A$ by removing some $n - m$ of its rows and $n - m$ of its columns. Here $i_1, i_2, \ldots, i_m$ are the indices of the rows and $j_1, j_2, \ldots, j_m$ are the indices of the columns involved in that submatrix. The $(n-m)$th-order minor of the second kind, denoted by $\bar{M}^{i_1i_2\ldots i_m}_{j_1j_2\ldots j_m}$, is the $(n-m)$th-order determinant of the submatrix obtained from $A$ by removing the rows and the columns involved in $M^{i_1i_2\ldots i_m}_{j_1j_2\ldots j_m}$. The cofactor of the minor $M^{i_1i_2\ldots i_m}_{j_1j_2\ldots j_m}$ is defined by
$$ A^{i_1i_2\ldots i_m}_{j_1j_2\ldots j_m} = (-1)^{i_1+i_2+\cdots+i_m+j_1+j_2+\cdots+j_m}\,\bar{M}^{i_1i_2\ldots i_m}_{j_1j_2\ldots j_m}. $$

Remark. Minors of the first kind can be introduced for any rectangular matrix $A \equiv [a_{ij}]$ of size $m \times n$. Its $k$th-order ($k \le \min\{m,n\}$) minor $M^{i_1i_2\ldots i_k}_{j_1j_2\ldots j_k}$ is the determinant of the submatrix obtained from $A$ by removing some $m - k$ of its rows and $n - k$ of its columns.

LAPLACE THEOREM. Given $m$ rows with indices $i_1, \ldots, i_m$ (or $m$ columns with indices $j_1, \ldots, j_m$) of a square matrix $A$, its determinant $\Delta$ is equal to the sum of the products of all $m$th-order minors $M^{i_1i_2\ldots i_m}_{j_1j_2\ldots j_m}$ contained in those rows (resp., columns) and their cofactors $A^{i_1i_2\ldots i_m}_{j_1j_2\ldots j_m}$, i.e.,
$$ \Delta \equiv \det A = \sum_{j_1,j_2,\ldots,j_m} M^{i_1i_2\ldots i_m}_{j_1j_2\ldots j_m} A^{i_1i_2\ldots i_m}_{j_1j_2\ldots j_m} = \sum_{i_1,i_2,\ldots,i_m} M^{i_1i_2\ldots i_m}_{j_1j_2\ldots j_m} A^{i_1i_2\ldots i_m}_{j_1j_2\ldots j_m}. $$
Here, in the first sum $i_1, \ldots, i_m$ are fixed, and in the second sum $j_1, \ldots, j_m$ are fixed. (A numerical illustration of this theorem is given at the end of this section.)

Let $A \equiv [a_{ij}]$ be a matrix of size $m \times n$ with at least one nonzero entry. Then there is a positive integer $r \le \min\{m,n\}$ for which the following conditions hold: (i) the matrix $A$ has a nonzero minor of order $r$; (ii) every minor of $A$ of order $r + 1$ and higher (if it exists) is equal to zero. This number $r$ is called the rank of the matrix $A$, and the number $d = n - r$ is called its defect.
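To illustrate conditions (i) and (ii), the sketch below (an illustrative matrix; the brute-force enumeration of minors is for demonstration only, not an efficient rank algorithm) finds the largest order of a nonzero minor and compares it with NumPy's rank computation.

```python
import numpy as np
from itertools import combinations

# Illustrative 3x4 matrix whose third row is the sum of the first two,
# so its rank should be 2 and its defect 4 - 2 = 2.
A = np.array([[1.0, 2.0, 0.0, 1.0],
              [0.0, 1.0, 1.0, 2.0],
              [1.0, 3.0, 1.0, 3.0]])
m, n = A.shape

def has_nonzero_minor(A, k, tol=1e-10):
    """True if some k-th order minor of A is nonzero (up to tolerance)."""
    m, n = A.shape
    return any(abs(np.linalg.det(A[np.ix_(rows, cols)])) > tol
               for rows in combinations(range(m), k)
               for cols in combinations(range(n), k))

# The rank r is the largest k such that a nonzero k-th order minor exists;
# all higher-order minors then vanish automatically.
r = max(k for k in range(1, min(m, n) + 1) if has_nonzero_minor(A, k))

assert r == np.linalg.matrix_rank(A) == 2
print("rank =", r, " defect =", n - r)
```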
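Finally, the Laplace theorem itself can be checked numerically. The following sketch (an illustrative helper written for this purpose, not part of the handbook) expands the determinant of the matrix from Example 1 along its first two rows; with a single fixed row it reduces to the ordinary row expansion of Paragraph 5.2.2-1.

```python
import numpy as np
from itertools import combinations

def laplace_det(A, rows):
    """Determinant of a square matrix A via the Laplace expansion
    along the fixed rows (0-based indices)."""
    n = A.shape[0]
    rows = tuple(rows)
    comp_rows = [i for i in range(n) if i not in rows]
    total = 0.0
    for cols in combinations(range(n), len(rows)):
        comp_cols = [j for j in range(n) if j not in cols]
        minor = np.linalg.det(A[np.ix_(rows, cols)])            # M
        comp  = np.linalg.det(A[np.ix_(comp_rows, comp_cols)])  # M-bar
        # With 0-based indices the parity of (i1+...+im)+(j1+...+jm)
        # is the same as with the 1-based indices used in the text.
        sign = (-1) ** (sum(rows) + sum(cols))
        total += sign * minor * comp
    return total

# The matrix of Example 1; its determinant is -49.
A = np.array([[1.0, -1.0,  2.0],
              [6.0,  1.0,  5.0],
              [2.0, -1.0, -4.0]])

assert np.isclose(laplace_det(A, rows=(0, 1)), -49.0)
assert np.isclose(laplace_det(A, rows=(1,)), np.linalg.det(A))
```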