
Part of the document Advanced Engineering Mathematics, 10th Edition (pages 884-890)

20.3 Linear Systems: Solution by Iteration

The Gauss elimination and its variants in the last two sections belong to the direct methods for solving linear systems of equations; these are methods that give solutions after an amount of computation that can be specified in advance. In contrast, in an indirect or iterative method we start from an approximation to the true solution and, if successful, obtain better and better approximations from a computational cycle repeated as often as may be necessary for achieving a required accuracy, so that the amount of arithmetic depends upon the accuracy required and varies from case to case.

We apply iterative methods if the convergence is rapid (if matrices have large main diagonal entries, as we shall see), so that we save operations compared to a direct method.

We also use iterative methods if a large system is sparse, that is, has very many zero coefficients, so that one would waste space in storing zeros, for instance, 9995 zeros per equation in a potential problem of 10^4 equations in 10^4 unknowns with typically only 5 nonzero terms per equation (more on this in Sec. 21.4).

Gauss–Seidel Iteration Method⁴

This is an iterative method of great practical importance, which we can simply explain in terms of an example.

E X A M P L E 1  Gauss–Seidel Iteration

We consider the linear system

(1)
x1 - 0.25x2 - 0.25x3 = 50
-0.25x1 + x2 - 0.25x4 = 50
-0.25x1 + x3 - 0.25x4 = 25
-0.25x2 - 0.25x3 + x4 = 25.


⁴PHILIPP LUDWIG VON SEIDEL (1821-1896), German mathematician. For Gauss see footnote 5 in Sec. 5.4.

(Equations of this form arise in the numeric solution of PDEs and in spline interpolation.) We write the system in the form

(2)
x1 = 0.25x2 + 0.25x3 + 50
x2 = 0.25x1 + 0.25x4 + 50
x3 = 0.25x1 + 0.25x4 + 25
x4 = 0.25x2 + 0.25x3 + 25.

These equations are now used for iteration; that is, we start from a (possibly poor) approximation to the solution, say x1(0) = 100, x2(0) = 100, x3(0) = 100, x4(0) = 100, and compute from (2) a perhaps better approximation

(3)
x1(1) = 0.25x2(0) + 0.25x3(0) + 50.00 = 100.00   Use "old" values ("new" values not yet available)
x2(1) = 0.25x1(1) + 0.25x4(0) + 50.00 = 100.00   Use "new" value x1(1)
x3(1) = 0.25x1(1) + 0.25x4(0) + 25.00 = 75.00
x4(1) = 0.25x2(1) + 0.25x3(1) + 25.00 = 68.75    Use "new" values

These equations (3) are obtained from (2) by substituting on the right the most recent approximation for each unknown. In fact, corresponding values replace previous ones as soon as they have been computed, so that in the second and third equations we use x1(1) (not x1(0)), and in the last equation of (3) we use x2(1) and x3(1) (not x2(0) and x3(0)). Using the same principle, we obtain in the next step

x1(2) = 0.25x2(1) + 0.25x3(1) + 50.00 = 93.750
x2(2) = 0.25x1(2) + 0.25x4(1) + 50.00 = 90.625
x3(2) = 0.25x1(2) + 0.25x4(1) + 25.00 = 65.625
x4(2) = 0.25x2(2) + 0.25x3(2) + 25.00 = 64.062

Further steps give the values

x1       x2       x3       x4
89.062   88.281   63.281   62.891
87.891   87.695   62.695   62.598
87.598   87.549   62.549   62.524
87.524   87.512   62.512   62.506
87.506   87.503   62.503   62.502

Hence convergence to the exact solution x1 = x2 = 87.5, x3 = x4 = 62.5 (verify!) seems rather fast.
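The cycle of Example 1 can be sketched in a few lines of code. This is a minimal illustration, not from the book; the function name gauss_seidel_step is my own choice.

```python
# A sketch of the Gauss-Seidel sweep for system (2) of Example 1.

def gauss_seidel_step(x):
    """One Gauss-Seidel sweep: each new value is used immediately."""
    x1, x2, x3, x4 = x
    x1 = 0.25 * x2 + 0.25 * x3 + 50   # uses "old" x2, x3
    x2 = 0.25 * x1 + 0.25 * x4 + 50   # uses NEW x1, "old" x4
    x3 = 0.25 * x1 + 0.25 * x4 + 25   # uses NEW x1, "old" x4
    x4 = 0.25 * x2 + 0.25 * x3 + 25   # uses NEW x2, x3
    return [x1, x2, x3, x4]

x = [100.0, 100.0, 100.0, 100.0]      # starting approximation x(0)
for step in range(10):
    x = gauss_seidel_step(x)

# approaches the exact solution 87.5, 87.5, 62.5, 62.5
print([round(v, 3) for v in x])
```

After the first two sweeps the values match the hand computation above (100, 100, 75, 68.75, then 93.75, 90.625, 65.625, 64.062).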

An algorithm for the Gauss–Seidel iteration is shown in Table 20.2. To obtain the algorithm, let us derive the general formulas for this iteration.

We assume that ajj = 1 for j = 1, ..., n. (Note that this can be achieved if we can rearrange the equations so that no diagonal coefficient is zero; then we may divide each equation by the corresponding diagonal coefficient.) We now write

(4) A = I + L + U   (ajj = 1)

where I is the n x n unit matrix and L and U are, respectively, lower and upper triangular matrices with zero main diagonals. If we substitute (4) into Ax = b, we have

Ax = (I + L + U)x = b.

Taking Lx and Ux to the right, we obtain, since Ix = x,

(5) x = b - Lx - Ux.

Remembering from (3) in Example 1 that below the main diagonal we took "new" approximations and above the main diagonal "old" ones, we obtain from (5) the desired iteration formulas

(6) x(m+1) = b - Lx(m+1) - Ux(m)   (ajj = 1)
                  "New"     "Old"

where x(m) = [xj(m)] is the mth approximation and x(m+1) = [xj(m+1)] is the (m+1)st approximation. In components this gives the formula in line 1 in Table 20.2. The matrix A must satisfy ajj ≠ 0 for all j. In Table 20.2 our assumption ajj = 1 is no longer required, but is automatically taken care of by the factor 1/ajj in line 1.

Table 20.2 Gauss–Seidel Iteration

ALGORITHM GAUSS–SEIDEL (A, b, x(0), ε, N)

This algorithm computes a solution x of the system Ax = b given an initial approximation x(0), where A = [ajk] is an n x n matrix with ajj ≠ 0, j = 1, ..., n.

INPUT: A, b, initial approximation x(0), tolerance ε > 0, maximum number of iterations N

OUTPUT: Approximate solution x(m) = [xj(m)] or failure message that x(N) does not satisfy the tolerance condition

For m = 0, ..., N - 1, do:
    For j = 1, ..., n, do:
1       xj(m+1) = (1/ajj) (bj - Σ(k=1 to j-1) ajk xk(m+1) - Σ(k=j+1 to n) ajk xk(m))
    End
2   If max over j of |xj(m+1) - xj(m)| < ε|xj(m+1)| then OUTPUT x(m+1). Stop
        [Procedure completed successfully]
End
OUTPUT: "No solution satisfying the tolerance condition obtained after N iteration steps." Stop
    [Procedure completed unsuccessfully]
End GAUSS–SEIDEL
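Table 20.2 translates readily into code. The sketch below is one possible reading of the algorithm for a dense list-of-lists matrix; the function name and interface are my own, and the stopping test implements line 2 componentwise.

```python
# A sketch of ALGORITHM GAUSS-SEIDEL from Table 20.2 (interface is my own).

def gauss_seidel(A, b, x0, eps, N):
    """Return an approximate solution of Ax = b, or None if the relative
    tolerance eps is not met within N sweeps (failure case of Table 20.2)."""
    n = len(A)
    x = list(x0)
    for _ in range(N):
        x_old = list(x)
        for j in range(n):
            # Line 1: entries x[k] with k < j already hold "new" values,
            # entries with k > j still hold the previous sweep's "old" values.
            s = sum(A[j][k] * x[k] for k in range(n) if k != j)
            x[j] = (b[j] - s) / A[j][j]
        # Line 2: relative componentwise stopping test
        if all(abs(x[j] - x_old[j]) < eps * abs(x[j]) for j in range(n)):
            return x
    return None  # tolerance condition not satisfied after N steps

# The system (1) of Example 1:
A = [[1, -0.25, -0.25, 0],
     [-0.25, 1, 0, -0.25],
     [-0.25, 0, 1, -0.25],
     [0, -0.25, -0.25, 1]]
b = [50, 50, 25, 25]
x = gauss_seidel(A, b, [100.0] * 4, 1e-6, 50)
```

Because the factor 1/ajj appears in line 1, the routine does not require ajj = 1, only ajj ≠ 0.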

Convergence and Matrix Norms

An iteration method for solving Ax = b is said to converge for an initial x(0) if the corresponding iterative sequence x(0), x(1), x(2), ... converges to a solution of the given system. Convergence depends on the relation between x(m) and x(m+1). To get this relation for the Gauss–Seidel method, we use (6). We first have

(I + L)x(m+1) = b - Ux(m)

and by multiplying by (I + L)^(-1) from the left,

(7) x(m+1) = Cx(m) + (I + L)^(-1) b   where   C = -(I + L)^(-1) U.

The Gauss–Seidel iteration converges for every x(0) if and only if all the eigenvalues (Sec. 8.1) of the "iteration matrix" C = [cjk] have absolute value less than 1. (Proof in Ref. [E5], p. 191, listed in App. 1.)

CAUTION! If you want to get C, first divide the rows of A by ajj to have main diagonal 1, ..., 1. If the spectral radius of C (= maximum of those absolute values) is small, then the convergence is rapid.

Sufficient Convergence Condition. A sufficient condition for convergence is

(8) ||C|| < 1.

Here ||C|| is some matrix norm, such as

(9) ||C|| = (Σ(j=1 to n) Σ(k=1 to n) cjk²)^(1/2)   (Frobenius norm)

or the greatest of the sums of the |cjk| in a column of C

(10) ||C|| = max over k of Σ(j=1 to n) |cjk|   (Column "sum" norm)

or the greatest of the sums of the |cjk| in a row of C

(11) ||C|| = max over j of Σ(k=1 to n) |cjk|   (Row "sum" norm).

These are the most frequently used matrix norms in numerics.

In most cases the choice of one of these norms is a matter of computational convenience. However, the following example shows that sometimes one of these norms is preferable to the others.
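The three norms (9)-(11) are straightforward to compute directly. The sketch below (helper names are my own) evaluates them for the iteration matrix C that arises in Example 2; note that only the Frobenius norm comes out below 1 here, which is the point the example makes.

```python
# A minimal sketch of the matrix norms (9)-(11) for an iteration matrix C.

def frobenius_norm(C):
    """(9): square root of the sum of all squared entries."""
    return sum(c * c for row in C for c in row) ** 0.5

def column_sum_norm(C):
    """(10): greatest column sum of absolute values."""
    return max(sum(abs(row[k]) for row in C) for k in range(len(C[0])))

def row_sum_norm(C):
    """(11): greatest row sum of absolute values."""
    return max(sum(abs(c) for c in row) for row in C)

# Iteration matrix C from Example 2:
C = [[0.0, -0.5, -0.5],
     [0.0, 0.25, -0.25],
     [0.0, 0.125, 0.375]]

print(round(frobenius_norm(C), 3))  # 0.884 < 1, so (8) gives convergence
print(column_sum_norm(C))           # 1.125, permits no conclusion
print(row_sum_norm(C))              # 1.0, permits no conclusion
```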

E X A M P L E 2  Test of Convergence of the Gauss–Seidel Iteration

Test whether the Gauss–Seidel iteration converges for the system

2x + y + z = 4
x + 2y + z = 4
x + y + 2z = 4

written

x = 2 - (1/2)y - (1/2)z
y = 2 - (1/2)x - (1/2)z
z = 2 - (1/2)x - (1/2)y.

Solution. The decomposition (multiply the matrix by 1/2; why?) is

[ 1    1/2  1/2 ]
[ 1/2  1    1/2 ]  =  I + L + U,
[ 1/2  1/2  1   ]

     [ 0    0    0 ]        [ 0  1/2  1/2 ]
L =  [ 1/2  0    0 ],  U =  [ 0  0    1/2 ].
     [ 1/2  1/2  0 ]        [ 0  0    0   ]

It shows that C = -(I + L)^(-1) U, where

(I + L)^(-1) = [ 1     0     0 ]
               [ -1/2  1     0 ]
               [ -1/4  -1/2  1 ]

so that

C = [ 0  -1/2  -1/2 ]
    [ 0   1/4  -1/4 ]
    [ 0   1/8   3/8 ].

We compute the Frobenius norm of C

||C|| = (1/4 + 1/4 + 1/16 + 1/16 + 1/64 + 9/64)^(1/2) = (50/64)^(1/2) = 0.884 < 1

and conclude from (8) that this Gauss–Seidel iteration converges. It is interesting that the other two norms would permit no conclusion, as you should verify. Of course, this points to the fact that (8) is sufficient for convergence rather than necessary.

Residual. Given a system Ax = b, the residual r of x with respect to this system is defined by

(12) r = b - Ax.

Clearly, r = 0 if and only if x is a solution. Hence r ≠ 0 for an approximate solution. In the Gauss–Seidel iteration, at each stage we modify or relax a component of an approximate solution in order to reduce a component of r to zero. Hence the Gauss–Seidel iteration belongs to a class of methods often called relaxation methods. More about the residual follows in the next section.

Jacobi Iteration

The Gauss–Seidel iteration is a method of successive corrections because for each component we successively replace an approximation of a component by a corresponding new approximation as soon as the latter has been computed. An iteration method is called a method of simultaneous corrections if no component of an approximation x(m) is used until all the components of x(m) have been computed. A method of this type is the Jacobi iteration, which is similar to the Gauss–Seidel iteration but involves not using improved values until a step has been completed and then replacing x(m) by x(m+1) at once, directly before the beginning of the next step. Hence if we write Ax = b (with ajj = 1 as before!) in the form x = b + (I - A)x, the Jacobi iteration in matrix notation is

(13) x(m+1) = b + (I - A)x(m)   (ajj = 1).

This method converges for every choice of x(0) if and only if the spectral radius of I - A is less than 1. It has recently gained greater practical interest since on parallel processors all n equations can be solved simultaneously at each iteration step.

For Jacobi, see Sec. 10.3. For exercises, see the problem set.
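Formula (13) can be sketched as follows, applied to the system of Example 1, whose diagonal entries are already 1. The function name jacobi_step is my own; note that, unlike Gauss–Seidel, every new component is computed from the old vector only, so the whole step could run in parallel.

```python
# A sketch of the Jacobi step (13): x(m+1) = b + (I - A) x(m), assuming ajj = 1.

def jacobi_step(A, b, x):
    """One simultaneous-corrections step; only "old" values are used."""
    n = len(A)
    return [b[j] + x[j] - sum(A[j][k] * x[k] for k in range(n))
            for j in range(n)]

# System (1) of Example 1 (diagonal entries equal to 1):
A = [[1, -0.25, -0.25, 0],
     [-0.25, 1, 0, -0.25],
     [-0.25, 0, 1, -0.25],
     [0, -0.25, -0.25, 1]]
b = [50, 50, 25, 25]

x = [100.0] * 4
for _ in range(40):
    x = jacobi_step(A, b, x)
# converges (though more slowly than Gauss-Seidel) toward 87.5, 87.5, 62.5, 62.5
```

For this matrix the spectral radius of I - A is 1/2, so the error shrinks by about that factor per step; for the system of Example 2, by contrast, the Jacobi iteration fails to converge (Prob. 2).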

P R O B L E M  S E T  2 0 . 3

1. Verify the solution in Example 1 of the text.

2. Show that for the system in Example 2 the Jacobi iteration diverges. Hint. Use eigenvalues.

3. Verify the claim at the end of Example 2.

4-10 GAUSS–SEIDEL ITERATION

Do 5 steps, starting from x0 = [1 1 1]T and using 6S in the computation. Hint. Make sure that you solve each equation for the variable that has the largest coefficient (why?). Show the details.

4. 4x1 - x2 = 21
   x1 + 4x2 - x3 = -45
   x2 + 4x3 = 33

5. 10x1 + x2 + x3 = 6
   x1 + 10x2 + x3 = 6
   x1 + x2 + 10x3 = 6

6. 5x1 + x2 = 0
   x1 + 6x2 + x3 = -10.5
   x2 + 7x3 = 25.5

7. 5x1 - 2x2 = 18
   -2x1 + 10x2 - 2x3 = -60
   -2x2 + 15x3 = 128

8. 3x1 + 2x2 + x3 = 7
   x1 + 3x2 + 2x3 = 4
   2x1 + x2 + 3x3 = 7

9. 5x1 + x2 + 2x3 = 19
   x1 + 4x2 - 2x3 = -2
   2x1 + 3x2 + 8x3 = 39

10. 4x1 + 5x3 = 12.5
    x1 + 6x2 + 2x3 = 18.5
    8x1 + 2x2 + x3 = -11.5

11. Apply the Gauss–Seidel iteration (3 steps) to the system in Prob. 5, starting from (a) 0, 0, 0 (b) 10, 10, 10. Compare and comment.

12. In Prob. 5, compute C (a) if you solve the first equation for x1, the second for x2, the third for x3, proving convergence; (b) if you nonsensically solve the third equation for x1, the first for x2, the second for x3, proving divergence.

13. CAS Experiment. Gauss–Seidel Iteration. (a) Write a program for Gauss–Seidel iteration.

(b) Apply the program to A(t)x = b, starting from [0 0 0]T, where

A(t) = [ 1  t  t ]        [ 2 ]
       [ t  1  t ],  b =  [ 2 ].
       [ t  t  1 ]        [ 2 ]

For t = 0.2, 0.5, 0.8, 0.9 determine the number of steps to obtain the exact solution to 6S and the corresponding spectral radius of C. Graph the number of steps and the spectral radius as functions of t and comment.

(c) Successive overrelaxation (SOR). Show that by adding and subtracting x(m) on the right, formula (6) can be written

x(m+1) = x(m) + b - Lx(m+1) - (U + I)x(m)   (ajj = 1).

Anticipation of further corrections motivates the introduction of an overrelaxation factor ω > 1 to get the SOR formula for Gauss–Seidel

(14) x(m+1) = x(m) + ω(b - Lx(m+1) - (U + I)x(m))   (ajj = 1),

intended to give more rapid convergence. A recommended value is ω = 2/(1 + (1 - ρ)^(1/2)), where ρ is the spectral radius of C in (7). Apply SOR to the matrix in (b) for t = 0.5 and 0.8 and notice the improvement of convergence. (Spectacular gains are made with larger systems.)
