2°. A constant K is called an upper bound for the real roots of equation (5.1.5.1), or of the polynomial P_n(x), if equation (5.1.5.1) has no real roots greater than or equal to K; in a similar way, one defines a lower and an upper bound for the positive and negative roots of an equation or of the corresponding polynomial.

Let K_1 be an upper bound for the positive roots of the polynomial P_n(x), K_2 be an upper bound for the positive roots of the polynomial P_n(-x), K_3 > 0 be an upper bound for the positive roots of the polynomial x^n P_n(1/x), and K_4 > 0 be an upper bound for the positive roots of the polynomial x^n P_n(-1/x). Then all nonzero real roots of the polynomial P_n(x) (if they exist) belong to the intervals (-K_2, -1/K_4) and (1/K_3, K_1).

Next, we describe three methods for finding upper bounds for the positive roots of a polynomial.

Maclaurin method. Suppose that the first m leading coefficients of the polynomial (5.1.5.2) are nonnegative, i.e., a_n > 0, a_{n-1} ≥ 0, ..., a_{n-m+1} ≥ 0, and the next coefficient is negative, a_{n-m} < 0. Then

    K = 1 + (B/a_n)^{1/m}    (5.1.5.5)

is an upper bound for the positive roots of this polynomial, where B is the largest of the absolute values of the negative coefficients of P_n(x).

Example 3. Consider the fourth-degree equation from Example 2. In this case, m = 2, B = 36, and formula (5.1.5.5) yields K = K_1 = 1 + (36/9)^{1/2} = 3. Now consider the polynomial P_4(-x) = 9x^4 - 9x^2 + 36x + 1. Its positive roots have the upper bound K_2 = 1 + (9/9)^{1/2} = 2. For the polynomial x^4 P_4(1/x) = x^4 - 36x^3 - 9x^2 + 9, we have m = 1 and K_3 = 1 + 36 = 37. Finally, for the polynomial x^4 P_4(-1/x) = x^4 + 36x^3 - 9x^2 + 9, we have m = 2 and K_4 = 1 + 9^{1/2} = 4. Thus, if P_4(x) has real roots, they must belong to the intervals (-2, -1/4) and (1/37, 3).

Newton method. Suppose that for x = c the polynomial P_n(x) and all its derivatives P'_n(x), ..., P_n^{(n)}(x) take positive values. Then c is an upper bound for the positive roots of P_n(x).

Example 4. Consider the polynomial from Example 2 and calculate its derivatives:

    P_4(x) = 9x^4 - 9x^2 - 36x + 1,
    P'_4(x) = 36x^3 - 18x - 36,
    P''_4(x) = 108x^2 - 18,
    P'''_4(x) = 216x,
    P_4^{(4)}(x) = 216.

It is easy to check that for x = 2 this polynomial and all its derivatives take positive values, and therefore c = 2 is an upper bound for its positive roots.

A method based on the representation of a polynomial as a sum of polynomials. Assuming a_n > 0, let us represent the polynomial (5.1.5.4) (without rearranging its terms) as the sum P_n(x) = f_1(x) + ... + f_m(x), where each polynomial f_k(x) (k = 1, 2, ..., m) has a positive leading coefficient and the sequence of its coefficients does not change sign more than once. Suppose that for some c > 0 all these polynomials are positive, f_1(c) > 0, ..., f_m(c) > 0. Then c is an upper bound for the positive roots of P_n(x).

Example 5. The polynomial P_7(x) = x^7 + 2x^6 - 4x^5 - 7x^4 + 2x^3 - 3x^2 + ax + b (a > 0, b > 0) can be represented as the sum of the three polynomials

    f_1(x) = x^7 + 2x^6 - 4x^5 - 7x^4 = x^4 (x^3 + 2x^2 - 4x - 7),
    f_2(x) = 2x^3 - 3x^2 = x^2 (2x - 3),
    f_3(x) = ax + b

(in the first two polynomials the sign of the sequence of coefficients changes once, and in the last polynomial the coefficients do not change sign). It is easy to see that all these polynomials are positive for x = 2. Therefore, c = 2 is an upper bound for the positive roots of the given polynomial.
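The Maclaurin bound (5.1.5.5) is easy to evaluate mechanically. The following is a minimal sketch (not from the handbook; the function name and the highest-degree-first coefficient convention are our own) that reproduces the bounds K_1 and K_2 of Example 3.

```python
# Minimal sketch of the Maclaurin bound (5.1.5.5). Coefficients are listed
# from the highest degree down: [a_n, a_{n-1}, ..., a_0], with a_n > 0.
def maclaurin_bound(coeffs):
    if coeffs[0] <= 0:
        raise ValueError("the leading coefficient a_n must be positive")
    # m = position of the first negative coefficient (all earlier ones are >= 0)
    m = next((i for i, c in enumerate(coeffs) if c < 0), None)
    if m is None:
        return None  # no negative coefficients, hence no positive roots at all
    B = max(abs(c) for c in coeffs if c < 0)  # largest |negative coefficient|
    return 1 + (B / coeffs[0]) ** (1.0 / m)

# Example 3: P_4(x) = 9x^4 - 9x^2 - 36x + 1 gives K_1 = 1 + (36/9)^(1/2) = 3
print(maclaurin_bound([9, 0, -9, -36, 1]))  # 3.0
# P_4(-x) = 9x^4 - 9x^2 + 36x + 1 gives K_2 = 1 + (9/9)^(1/2) = 2
print(maclaurin_bound([9, 0, -9, 36, 1]))   # 2.0
```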
5.1.5-5. Theorems on the number of real roots of polynomials.

The number of all negative roots of a polynomial P_n(x) is equal to the number of all positive roots of the polynomial P_n(-x).

1°. The exact number of positive roots of a polynomial whose coefficients form a sequence that does not change sign, or changes sign only once, can be found with the help of the Descartes theorem.

DESCARTES THEOREM. The number of positive roots (counted according to their multiplicity) of a polynomial P_n(x) with real coefficients is either equal to the number of sign alterations in the sequence of its coefficients or is less than that number by an even number.

Applying the Descartes theorem to P_n(-x), we obtain a similar theorem for the negative roots of the polynomial P_n(x).

Example 6. Consider the cubic polynomial P_3(x) = x^3 - 3x + 4. Its coefficients have the signs + - +, and therefore we have two alterations of sign. Therefore, the number of positive roots of P_3(x) is equal either to 2 or to 0. Now consider the polynomial P_3(-x) = -x^3 + 3x + 4. The sequence of its coefficients changes sign only once. Therefore, the original equation has one negative root.

2°. A stronger version of the Descartes theorem. Suppose that all roots of a polynomial P_n(x) are real.* Then the number of positive roots of P_n(x) is equal to the number of sign alterations in the sequence of its coefficients, and the number of its negative roots is equal to the number of sign alterations in the sequence of coefficients of the polynomial P_n(-x).

Example 7. Consider the characteristic polynomial of a symmetric matrix

    P_3(x) = \begin{vmatrix} -2-x & 1 & 1 \\ 1 & 1-x & 3 \\ 1 & 3 & 1-x \end{vmatrix} = -x^3 + 14x + 20,

which has only real roots. The sequence of its coefficients changes sign only once, and therefore it has a single positive root. The number of its negative roots is equal to two, since this polynomial has three nonzero real roots and only one of them can be positive.

3°. If two neighboring coefficients of a polynomial P_n(x) are equal to zero, then the roots of the polynomial cannot all be real (in this case, the stronger version of the Descartes theorem cannot be used).

4°. The number of real roots of a polynomial P_n(x) greater than a fixed c is either equal to the number of sign alterations in the sequence P_n(c), P'_n(c), ..., P_n^{(n)}(c) or is less than that number by an even number. If all roots of P_n(x) are real, then the number of its roots greater than c coincides with the number of sign alterations in this sequence.

Example 8. Consider the polynomial P_4(x) = x^4 - 3x^3 + 2x^2 - 2a^2 x + a^2. For x = 1, we have

    P_4(1) = -a^2,  P'_4(1) = -1 - 2a^2,  P''_4(1) = -2,  P'''_4(1) = 6,  P_4^{(4)}(1) = 24.

Thus, there is a single sign alteration, and therefore the polynomial has a single real root greater than unity.

* This is the case, for instance, if we are dealing with the characteristic polynomial of a symmetric matrix.

5°. Budan-Fourier method. Let N(x) be the number of sign alterations in the sequence P_n(x), P'_n(x), ..., P_n^{(n)}(x) consisting of the values of the polynomial (5.1.5.2) and its derivatives. Then the number of real roots of equation (5.1.5.1) on the interval [a, b], with P_n(a) ≠ 0 and P_n(b) ≠ 0, is either equal to N(a) - N(b) or is less than that by an even number. When calculating N(a), zero terms of the sequence are dropped. When calculating N(b), it may happen that P_n^{(i)}(b) = 0 for k ≤ i ≤ m while P_n^{(k-1)}(b) ≠ 0 and P_n^{(m+1)}(b) ≠ 0; then each such P_n^{(i)}(b) should be replaced by (-1)^{m+1-i} sign P_n^{(m+1)}(b).
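The sign-alteration counts used in items 1°-5° can be computed mechanically. The following is a minimal sketch (not from the handbook; the function names and the highest-degree-first coefficient convention are our own) that reproduces the counts of Example 7.

```python
# Minimal sketch: sign alterations in a coefficient sequence (zero terms skipped),
# as used by the Descartes theorem. Coefficients: [a_n, a_{n-1}, ..., a_0].
def sign_alterations(seq):
    signs = [1 if x > 0 else -1 for x in seq if x != 0]
    return sum(1 for s, t in zip(signs, signs[1:]) if s != t)

def substitute_minus_x(coeffs):
    # coefficients of P_n(-x): flip the sign of every odd-degree term
    n = len(coeffs) - 1
    return [c if (n - i) % 2 == 0 else -c for i, c in enumerate(coeffs)]

# Example 7: P_3(x) = -x^3 + 14x + 20, all of whose roots are real
p = [-1, 0, 14, 20]
print(sign_alterations(p))                      # 1 -> exactly one positive root
print(sign_alterations(substitute_minus_x(p)))  # 2 -> two negative roots
```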
6°. Sturm method for finding the number of real roots. Consider a polynomial P_n(x) with no multiple roots and denote by N(x) the number of sign alterations in the sequence of values of the polynomials f_0(x), f_1(x), ..., f_n(x) (zero terms of the sequence are not taken into account). Here f_0(x) = P_n(x) and f_1(x) = P'_n(x), and the subsequent polynomials are defined by the relations

    f_0(x) = g_0(x) f_1(x) - f_2(x),  f_1(x) = g_1(x) f_2(x) - f_3(x),  ...,

i.e., for k > 1 the polynomial -f_k(x) is the remainder after dividing f_{k-2}(x) by f_{k-1}(x); the last polynomial f_n(x) is a nonzero constant. Then the number of all real roots of equation (5.1.5.1) on the segment [a, b], with P_n(a) ≠ 0 and P_n(b) ≠ 0, is equal to N(a) - N(b).

Remark 1. Taking a = -L and b = L and passing to the limit as L → ∞, we obtain the overall number of real roots of the algebraic equation.

Example 9. Consider the following cubic equation with the parameter a: P_3(x) = x^3 + 3x^2 - a = 0. The Sturm system for this equation has the form

    f_0(x) = P_3(x) = x^3 + 3x^2 - a,
    f_1(x) = P'_3(x) = 3x^2 + 6x,
    f_2(x) = 2x + a,
    f_3(x) = (3/4) a (4 - a).

Case 0 < a < 4. Let us find the number of sign alterations in the Sturm system for x = -∞ and x = ∞:

    x      f_0(x)  f_1(x)  f_2(x)  f_3(x)   number of sign alterations
    -∞       -       +       -       +                  3
    +∞       +       +       +       +                  0

It follows that N(-∞) - N(∞) = 3. Therefore, for 0 < a < 4, the given polynomial has three real roots.

Case a < 0 or a > 4. Let us find the number of sign alterations in the Sturm system:

    x      f_0(x)  f_1(x)  f_2(x)  f_3(x)   number of sign alterations
    -∞       -       +       -       -                  2
    +∞       +       +       +       -                  1

It follows that N(-∞) - N(∞) = 1, and therefore for a < 0 or a > 4, the given polynomial has one real root.

Remark 2. If the equation P_n(x) = 0 has multiple roots, then P_n(x) and P'_n(x) have a common divisor, and the multiple roots are found by equating this divisor to zero. In this case, f_n(x) is nonconstant, and N(a) - N(b) is the number of roots between a and b, each multiple root counted only once.

5.1.5-6. Bounds for complex roots of polynomials with real coefficients.

1°. Routh-Hurwitz criterion. For an algebraic equation (5.1.5.1) with real coefficients, the number of roots with positive real parts is equal to the number of sign alterations in either of the two sequences

    T_0, T_1, T_2/T_1, ..., T_n/T_{n-1};
    T_0, T_1, T_1 T_2, ..., T_{n-2} T_{n-1}, a_0;

where the T_m (it is assumed that T_m ≠ 0 for all m) are defined by T_0 = a_n > 0, T_1 = a_{n-1}, and

    T_2 = \begin{vmatrix} a_{n-1} & a_n \\ a_{n-3} & a_{n-2} \end{vmatrix},
    T_3 = \begin{vmatrix} a_{n-1} & a_n & 0 \\ a_{n-3} & a_{n-2} & a_{n-1} \\ a_{n-5} & a_{n-4} & a_{n-3} \end{vmatrix},

    T_4 = \begin{vmatrix} a_{n-1} & a_n & 0 & 0 \\ a_{n-3} & a_{n-2} & a_{n-1} & a_n \\ a_{n-5} & a_{n-4} & a_{n-3} & a_{n-2} \\ a_{n-7} & a_{n-6} & a_{n-5} & a_{n-4} \end{vmatrix},

    T_5 = \begin{vmatrix} a_{n-1} & a_n & 0 & 0 & 0 \\ a_{n-3} & a_{n-2} & a_{n-1} & a_n & 0 \\ a_{n-5} & a_{n-4} & a_{n-3} & a_{n-2} & a_{n-1} \\ a_{n-7} & a_{n-6} & a_{n-5} & a_{n-4} & a_{n-3} \\ a_{n-9} & a_{n-8} & a_{n-7} & a_{n-6} & a_{n-5} \end{vmatrix},  ...

(coefficients with negative indices are set equal to zero).

2°. All roots of equation (5.1.5.1) have negative real parts if and only if all T_0, T_1, ..., T_n are positive.

3°. All roots of the nth-degree equation (5.1.5.1) have negative real parts if and only if the same is true for the following (n-1)st-degree equation:

    a_{n-1} x^{n-1} + (a_{n-2} - (a_n/a_{n-1}) a_{n-3}) x^{n-2} + a_{n-3} x^{n-3} + (a_{n-4} - (a_n/a_{n-1}) a_{n-5}) x^{n-4} + ··· = 0.
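The determinants T_m can be generated directly from the coefficient list, with missing coefficients treated as zeros. The following is a minimal sketch (not from the handbook; the function names, the highest-degree-first coefficient convention, and the use of NumPy are our own) that builds T_1, ..., T_n and applies the test of item 2°.

```python
# Minimal sketch of the Routh-Hurwitz determinants T_1, ..., T_n and the
# stability test of item 2°. Coefficients: [a_n, a_{n-1}, ..., a_0], a_n > 0.
import numpy as np

def hurwitz_determinants(coeffs):
    n = len(coeffs) - 1                                  # degree of the equation
    a = {n - i: float(c) for i, c in enumerate(coeffs)}  # a[k] = coefficient of x^k
    def entry(i, j):                                     # (i, j) entry of the m-th determinant
        k = n - 2 * i + j
        return a[k] if 0 <= k <= n else 0.0              # missing coefficients are zero
    return [np.linalg.det([[entry(i, j) for j in range(1, m + 1)]
                           for i in range(1, m + 1)]) for m in range(1, n + 1)]

def all_roots_in_left_half_plane(coeffs):
    # Item 2°: all roots have negative real parts iff T_0 = a_n > 0 and T_1, ..., T_n > 0.
    return coeffs[0] > 0 and all(t > 0 for t in hurwitz_determinants(coeffs))

# x^3 + 2x^2 + 2x + 1 = (x + 1)(x^2 + x + 1): all roots lie in the left half-plane
print(all_roots_in_left_half_plane([1, 2, 2, 1]))   # True
# x^3 + x^2 + x - 1 has a positive real root, so the test fails
print(all_roots_in_left_half_plane([1, 1, 1, -1]))  # False
```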
5.2. Matrices and Determinants

5.2.1. Matrices

5.2.1-1. Definition of a matrix. Types of matrices.

A matrix of size (or dimension) m × n is a rectangular table with entries a_{ij} (i = 1, 2, ..., m; j = 1, 2, ..., n) arranged in m rows and n columns:

    A ≡ \begin{pmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ a_{m1} & a_{m2} & \cdots & a_{mn} \end{pmatrix}.

Note that, for each entry a_{ij}, the index i refers to the ith row and the index j to the jth column.

Matrices are briefly denoted by uppercase letters (for instance, A, as here) or by the symbol [a_{ij}], sometimes in more detail: A ≡ [a_{ij}] (i = 1, 2, ..., m; j = 1, 2, ..., n). The numbers m and n are called the dimensions of the matrix. A matrix is said to be finite if it has finitely many rows and columns; otherwise, the matrix is said to be infinite. In what follows, only finite matrices are considered.

The null or zero matrix is a matrix whose entries are all equal to zero: a_{ij} = 0 (i = 1, 2, ..., m; j = 1, 2, ..., n).

A column vector, or column, is a matrix of size m × 1. A row vector, or row, is a matrix of size 1 × n. Both column and row vectors are often simply called vectors.

A square matrix is a matrix of size n × n, and n is called the dimension of this square matrix. The main diagonal of a square matrix is its diagonal from the top left corner to the bottom right corner, with the entries a_{11}, a_{22}, ..., a_{nn}. The secondary diagonal of a square matrix is the diagonal from the bottom left corner to the top right corner, with the entries a_{n1}, a_{(n-1)2}, ..., a_{1n}. Table 5.2 lists the main types of square matrices (see also Paragraph 5.2.1-3).

TABLE 5.2
Types of square matrices (ā_{ij} is the complex conjugate of the number a_{ij})

    Type of square matrix [a_{ij}]        Entries
    Unit (identity), I = [δ_{ij}]         a_{ij} = δ_{ij} = 1 if i = j, 0 if i ≠ j (δ_{ij} is the Kronecker delta)
    Diagonal                              a_{ij} = any if i = j, 0 if i ≠ j
    Upper triangular (superdiagonal)      a_{ij} = any if i ≤ j, 0 if i > j
    Strictly upper triangular             a_{ij} = any if i < j, 0 if i ≥ j
    Lower triangular (subdiagonal)        a_{ij} = any if i ≥ j, 0 if i < j
    Strictly lower triangular             a_{ij} = any if i > j, 0 if i ≤ j
    Symmetric                             a_{ij} = a_{ji} (see also Paragraph 5.2.1-3)
    Skew-symmetric (antisymmetric)        a_{ij} = -a_{ji} (see also Paragraph 5.2.1-3)
    Hermitian (self-adjoint)              a_{ij} = ā_{ji} (see also Paragraph 5.2.1-3)
    Skew-Hermitian (antihermitian)        a_{ij} = -ā_{ji} (see also Paragraph 5.2.1-3)
    Monomial (generalized permutation)    each column and each row contain exactly one nonzero entry

5.2.1-2. Basic operations with matrices.

Two matrices are equal if they are of the same size and their respective entries are equal.

The sum of two matrices A ≡ [a_{ij}] and B ≡ [b_{ij}] of the same size m × n is the matrix C ≡ [c_{ij}] of size m × n with entries c_{ij} = a_{ij} + b_{ij} (i = 1, 2, ..., m; j = 1, 2, ..., n). The sum of two matrices is denoted by C = A + B, and the operation is called addition of matrices.

Properties of addition of matrices:

    A + O = A                   (property of zero),
    A + B = B + A               (commutativity),
    (A + B) + C = A + (B + C)   (associativity),

where the matrices A, B, C and the zero matrix O have the same size.

The difference of two matrices A ≡ [a_{ij}] and B ≡ [b_{ij}] of the same size m × n is the matrix C ≡ [c_{ij}] of size m × n with entries c_{ij} = a_{ij} - b_{ij} (i = 1, 2, ..., m; j = 1, 2, ..., n). The difference of two matrices is denoted by C = A - B, and the operation is called subtraction of matrices.

The product of a matrix A ≡ [a_{ij}] of size m × n by a scalar λ is the matrix C ≡ [c_{ij}] of size m × n with entries c_{ij} = λ a_{ij} (i = 1, 2, ..., m; j = 1, 2, ..., n). The product of a matrix by a scalar is denoted by C = λA, and the operation is called multiplication of a matrix by a scalar.
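Since these operations are defined entrywise, they map directly onto array arithmetic. The following is a minimal sketch (our own illustration with NumPy, not part of the handbook) that checks the addition properties and forms a difference and a scalar multiple.

```python
# Minimal sketch: entrywise matrix operations and the addition properties,
# illustrated with NumPy arrays.
import numpy as np

A = np.array([[1.0, 2.0], [6.0, -3.0]])
B = np.array([[0.5, -1.0], [4.0, 2.0]])
C = np.array([[2.0, 0.0], [-1.0, 3.0]])
O = np.zeros((2, 2))                             # zero matrix of the same size

print(np.array_equal(A + O, A))                  # True: A + O = A
print(np.array_equal(A + B, B + A))              # True: commutativity
print(np.array_equal((A + B) + C, A + (B + C)))  # True: associativity
print(A - B)                                     # difference: entrywise a_ij - b_ij
print(3.0 * A)                                   # product by a scalar: entrywise 3*a_ij
```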
Properties of multiplication of a matrix by a scalar:

    0A = O               (property of zero),
    (λμ)A = λ(μA)        (associativity with respect to a scalar factor),
    λ(A + B) = λA + λB   (distributivity with respect to addition of matrices),
    (λ + μ)A = λA + μA   (distributivity with respect to addition of scalars),

where λ and μ are scalars, and the matrices A, B and the zero matrix O have the same size.

The additively inverse (opposite) matrix of a matrix A ≡ [a_{ij}] of size m × n is the matrix C ≡ [c_{ij}] of size m × n with entries c_{ij} = -a_{ij} (i = 1, 2, ..., m; j = 1, 2, ..., n), or, in matrix form, C = (-1)A.

Remark. The difference C of two matrices A and B can be expressed as C = A + (-1)B.

The product of a matrix A ≡ [a_{ij}] of size m × p and a matrix B ≡ [b_{ij}] of size p × n is the matrix C ≡ [c_{ij}] of size m × n with entries

    c_{ij} = \sum_{k=1}^{p} a_{ik} b_{kj}    (i = 1, 2, ..., m; j = 1, 2, ..., n);

i.e., the entry c_{ij} in the ith row and jth column of the matrix C is equal to the sum of the products of the respective entries in the ith row of A and the jth column of B. Note that the product is defined only for matrices of compatible sizes: the number of columns of the first matrix must be equal to the number of rows of the second matrix. The product of two matrices A and B is denoted by C = AB, and the operation is called multiplication of matrices.

Example 1. Consider the two matrices

    A = \begin{pmatrix} 1 & 2 \\ 6 & -3 \end{pmatrix},   B = \begin{pmatrix} 0 & 10 & 1 \\ -6 & -0.5 & 20 \end{pmatrix}.

The product of the matrix A and the matrix B is the matrix

    C = AB = \begin{pmatrix} 1×0 + 2×(-6) & 1×10 + 2×(-0.5) & 1×1 + 2×20 \\ 6×0 + (-3)×(-6) & 6×10 + (-3)×(-0.5) & 6×1 + (-3)×20 \end{pmatrix} = \begin{pmatrix} -12 & 9 & 41 \\ 18 & 61.5 & -54 \end{pmatrix}.

Two square matrices A and B are said to commute if AB = BA, i.e., if their multiplication is subject to the commutative law.

Properties of multiplication of matrices:

    AO = O_1,  A + O = A    (properties of the zero matrix),
    (AB)C = A(BC)           (associativity of the product of three matrices),
    AI = A                  (multiplication by the unit matrix),
    A(B + C) = AB + AC      (distributivity with respect to a sum of two matrices),
    λ(AB) = (λA)B = A(λB)   (associativity of the product of a scalar and two matrices),
    SD = DS                 (commutativity for any square and any diagonal matrices),

where λ is a scalar, and the matrices A, B, C, the square matrix S, the diagonal matrix D, the zero matrices O and O_1, and the unit matrix I have compatible sizes.

5.2.1-3. Transpose, complex conjugate matrix, adjoint matrix.

The transpose of a matrix A ≡ [a_{ij}] of size m × n is the matrix C ≡ [c_{ij}] of size n × m with entries c_{ij} = a_{ji} (i = 1, 2, ..., n; j = 1, 2, ..., m). The transpose is denoted by C = A^T.

Example 2. If A = (a_1, a_2), then A^T = \begin{pmatrix} a_1 \\ a_2 \end{pmatrix}.

Properties of transposes:

    (A + B)^T = A^T + B^T,  (λA)^T = λA^T,  (A^T)^T = A,
    (AC)^T = C^T A^T,  O^T = O_1,  I^T = I,

where λ is a scalar; the matrices A, B and the zero matrix O have size m × n; the matrix C has size n × l; the zero matrix O_1 has size n × m.

A square matrix A is said to be orthogonal if A^T A = A A^T = I, i.e., A^T = A^{-1} (see Paragraph 5.2.1-6).

Properties of orthogonal matrices:
1. If A is an orthogonal matrix, then A^T is also orthogonal.
2. The product of two orthogonal matrices is an orthogonal matrix.
3. Any symmetric orthogonal matrix is involutive (see Paragraph 5.2.1-7).

The complex conjugate of a matrix A ≡ [a_{ij}] of size m × n is the matrix C ≡ [c_{ij}] of size m × n with entries c_{ij} = ā_{ij} (i = 1, 2, ..., m; j = 1, 2, ..., n), where ā_{ij} is the complex conjugate of a_{ij}. The complex conjugate matrix is denoted by C = Ā.

The adjoint matrix of a matrix A ≡ [a_{ij}] of size m × n is the matrix C ≡ [c_{ij}] of size n × m with entries c_{ij} = ā_{ji} (i = 1, 2, ..., n; j = 1, 2, ..., m). The adjoint matrix is denoted by C = A*.
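For numerical experiments, the transpose of a NumPy array A is A.T and its complex conjugate is numpy.conj(A). The following is a minimal sketch (our own illustration, not part of the handbook; the matrices are arbitrary test data) that verifies the rule (AC)^T = C^T A^T and tests a rotation matrix for orthogonality.

```python
# Minimal sketch: transpose and orthogonality checks with NumPy.
import numpy as np

A = np.array([[1.0, 2.0, 0.0], [6.0, -3.0, 4.0]])    # size 2 x 3
C = np.array([[1.0, -1.0], [0.0, 2.0], [5.0, 3.0]])  # size 3 x 2

# (AC)^T = C^T A^T
print(np.allclose((A @ C).T, C.T @ A.T))             # True

# A plane rotation matrix is orthogonal: Q^T Q = Q Q^T = I, i.e., Q^T = Q^{-1}
t = 0.3
Q = np.array([[np.cos(t), -np.sin(t)], [np.sin(t), np.cos(t)]])
print(np.allclose(Q.T @ Q, np.eye(2)))               # True
print(np.allclose(Q.T, np.linalg.inv(Q)))            # True
```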
Properties of adjoint matrices:

    (A + B)* = A* + B*,  (λA)* = λ̄A*,  (A*)* = A,
    (AC)* = C* A*,  O* = O_1,  I* = I,

where λ is a scalar; the matrices A, B and the zero matrix O have size m × n; the matrix C has size n × l; the zero matrix O_1 has size n × m.
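These identities are also easy to verify numerically if the adjoint A* is computed as the conjugate transpose. The following is a minimal sketch (our own illustration with NumPy, not part of the handbook; the matrices and the scalar λ are arbitrary test data).

```python
# Minimal sketch: numerical check of the adjoint (conjugate transpose) identities.
import numpy as np

def adjoint(M):
    return M.conj().T                             # entries c_ij = conjugate(a_ji)

A = np.array([[1 + 2j, 3.0], [0.0, -1j]])
B = np.array([[2.0, 1j], [1 - 1j, 4.0]])
C = np.array([[1j, 0.0], [2.0, 3 - 1j]])
lam = 0.5 - 2j

print(np.allclose(adjoint(A + B), adjoint(A) + adjoint(B)))      # (A + B)* = A* + B*
print(np.allclose(adjoint(lam * A), np.conj(lam) * adjoint(A)))  # (λA)* = λ̄A*
print(np.allclose(adjoint(adjoint(A)), A))                       # (A*)* = A
print(np.allclose(adjoint(A @ C), adjoint(C) @ adjoint(A)))      # (AC)* = C*A*
```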