Rigorous Numerics for ODE's
Joseph Galante
September 2008

1 Abstract

We will give an overview of how computers solve ODE's numerically in a nonrigorous fashion and examine the sources of error. We will introduce the tools of interval arithmetic and of Taylor Models to outline a method by which a rigorous ODE solver can be implemented on a computer system.

2 Introduction

Many real-world problems are described by ODE's that are too complex to estimate by hand and understand in detail. However, rigorously verified information must be known about solutions in order to complete proofs; for example, tracking a solution to see if it connects two regions, or examining the shape of a ball of initial conditions as the flow acts on it. Computers can be used to quickly give numerical simulations of a particular solution. However, most ODE solvers (for example, Runge-Kutta methods) only guarantee that the computer-generated solution is accurate to O(h^n), where h and n depend on the method, and the big-O hides unknown derivatives of the ODE in question. This is unacceptable for a serious proof, since we have no assurance that the computer solution and the actual solution agree after some long time, except for the heuristic "pick h small and n large".

2.1 Euler's Method

It is helpful to have a model method for understanding the sources of error a computer can make. For this we use Euler's Method. The idea is simple. Suppose we have the initial value problem

    ẋ = f(x),   x(0) = x_0    (1)

Throughout this paper, we will assume that solutions exist, are unique, and are defined for all time, and that f is sufficiently smooth (either C^∞ or real analytic). We specify a fixed step size h for the Euler Method.
If x(t) is a solution to the IVP, then from Taylor's Theorem we have

    x(t + h) = x(t) + h ẋ(t) + O(h^2) ≈ x(t) + h f(x(t))    (2)

Euler's Method drops the remainder and makes a linear approximation at each time step:

    t_i = t_{i-1} + h,   x_i = x_{i-1} + h f(x_{i-1})    (3)

Each step of the Euler Method makes an error of O(h^2), which for small h isn't too bad. The small errors made by disregarding the O(h^2) term cause the method to track a slightly different solution at each timestep. Nearby solutions usually behave similarly; however, after many steps these small errors can accumulate and destroy the method's usefulness by jumping to a solution whose behavior differs from the one desired.

Figure 1: Truncation errors can lead to tracking the wrong solution

3 Interval Arithmetic

When working on a computer there is another source of error which must be accounted for: floating point error. Differential equations usually have solutions which require real numbers to represent, but a computer is incapable of representing a general real number. The set of "representable numbers" is the set of numbers with which our computer can perform computations. This obviously depends on the computer's architecture and software; however, most computers adopt IEEE standards which specify such representable numbers, and we assume that we have adopted such a standard. For IEEE double precision, the relative gap between representable numbers is approximately 10^-16, usually called the machine-ε. All calculations are subject to errors introduced by allowing this discrepancy: to the computer, 3 = 3 + ε/2 = 3 − ε/2.

A well known trick to get around these difficulties is to use so-called interval arithmetic. If x is a real number, then on a computer we can represent x as an interval [a,b], where a and b are machine representable numbers and a ≤ x ≤ b. Rules may be developed to handle basic operations.
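A minimal non-rigorous sketch of equations (2)-(3) in Python (the ODE ẋ = x and the step size below are illustrative choices, not from the paper):

```python
def euler(f, x0, h, steps):
    """Non-rigorous Euler's Method: each step makes an O(h^2) truncation error."""
    t, x = 0.0, x0
    for _ in range(steps):
        x = x + h * f(x)  # the linear approximation of equation (3)
        t = t + h
    return t, x

# ODE x' = x with x(0) = 1; the true solution is e^t.
t, x = euler(lambda x: x, 1.0, 0.001, 1000)
# x lands close to e = 2.71828..., but nothing here certifies how close.
```

Nothing in this loop tracks how far x has drifted from the true solution; that is exactly the gap the rest of the paper fills.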
For example, if x is represented by [a,b] and y is represented by [c,d], then to compute x+y we compute [a+c, b+d]. We then perform a 'round' which ensures that a+c and b+d are still machine representable; this corresponds roughly to representing x+y as [a + c − ε, b + d + ε]. The other operations of subtraction, multiplication, and division, as well as the concepts of 'round' and representable number, can be made completely rigorous and are done so in [KM]. As an added benefit, we gain the operations of union and intersection. For example, if we are computing a specific quantity that we know is positive, then we may take the interval arithmetic calculation and intersect the interval with [0, ∞). As a downside, we lose the concept of equality: x=y becomes x−y=0, which says to compute x−y and then check whether zero is in the interval [−ε, ε]. Equality on a computer is only good up to the size of the machine-ε. Another downside, from a practical standpoint, is that interval operations take at least twice as long as conventional computer arithmetic; however, we gain mathematical rigor.

Returning to our problem of rigorous numerics for ODE's, interval arithmetic can help to produce a rigorous solver. We must first reformulate the problem as an 'Interval Value Problem':

    ẋ = f(x),   x(0) ∈ I    (4)

where now I is some small interval of initial conditions, x is made up of intervals instead of reals, and operations are performed via interval arithmetic. From a dynamical systems perspective, we are seeking to transport a ball of initial conditions under the flow of the ODE.

We can solve the 'IVP' using an intervalized Euler Method which uses interval arithmetic. However, we can do slightly more. The other source of error in Euler's Method is the error introduced by truncating the O(h^2) term. Suppose we have information which allows us to bound the error so that O(h^2) ∈ E, where E is some interval.
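Intervalized addition with outward rounding can be sketched with Python's `math.nextafter`, which steps an endpoint to the adjacent representable number (an illustrative stand-in for switching the hardware rounding mode, not the paper's implementation):

```python
import math

def iadd(x, y):
    """Interval addition [a,b]+[c,d] = [a+c, b+d], with each endpoint nudged
    outward so the true real sum is enclosed despite floating point error."""
    lo = math.nextafter(x[0] + y[0], -math.inf)
    hi = math.nextafter(x[1] + y[1], math.inf)
    return (lo, hi)

def contains(ivl, r):
    return ivl[0] <= r <= ivl[1]

x = (0.1, 0.1)            # 0.1 is not exactly representable in binary
s = iadd(iadd(x, x), x)   # enclosure of 0.1 + 0.1 + 0.1
print(contains(s, 0.3))   # True, even though 0.1 + 0.1 + 0.1 != 0.3 in floats
```

The enclosure widens by a few ulps per operation, which is the price paid for a mathematically trustworthy answer.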
Then we can make a rigorous solver by iterating

    t_i = t_{i-1} + h,   x_i = x_{i-1} + h f(x_{i-1}) + E    (5)

where operations are carried out via interval arithmetic. As we have included both the floating point errors and the truncation errors, we have produced a method which rigorously solves the IVP. However, in practice this method is useless. It turns out that (with the exception of a few special cases) the intervals which contain the true solution grow very quickly due to the repeated addition of the interval term E. For example, if E = [−1,1] (a seemingly reasonable bound on an error term), then after two steps we have accumulated E+E = [−1,1]+[−1,1] = [−2,2]. In the literature this is known as the wrapping effect. Iterating, by the nth step we will have a bound for the solution at least as bad as [−n,n]. This being completely unacceptable, we pursue an idea which greatly refines interval methods.

4 Taylor Models

Definition: Suppose f(x) is C^{n+1} in an open domain D. We define an nth order Taylor Model of f about x_0 ∈ D as a pair (P,I), where P is the nth order Taylor polynomial of f about x_0 and I is some interval such that for all x ∈ D we have f(x) ∈ P(x − x_0) + I.

Notice that since f is C^{n+1}, Taylor's theorem gives that the size of I shrinks as n grows. Hence the definition is nothing more than a clever statement that smooth functions behave like their Taylor polynomial approximations (up to some small error) inside a sufficiently small neighborhood.

Rules for 'Taylor Model arithmetic' have been developed in [BM1]. For example, suppose T_1 = (P_1, I_1) and T_2 = (P_2, I_2) are nth order Taylor Models about x_0 ∈ D. Then we have

    T_1 + T_2 = (P_1 + P_2, I_1 + I_2)    (6)
    T_1 · T_2 = (P_1 · P_2 − P_h, B(P_h) + B(P_1) · I_2 + B(P_2) · I_1 + I_1 · I_2)    (7)

where P_h is the polynomial made up of all terms of order (n+1) or larger in P_1 · P_2, and B(P) is a bound on the polynomial P.
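The linear blow-up described above is easy to see numerically; a small sketch (E = [−1,1] is the text's deliberately crude example of a per-step error bound):

```python
def iadd(x, y):
    # interval addition; rounding is ignored since the point here is the growth
    return (x[0] + y[0], x[1] + y[1])

E = (-1.0, 1.0)           # interval bound on the per-step truncation error
acc = (0.0, 0.0)
for n in range(1, 6):
    acc = iadd(acc, E)
    print(n, acc)         # widths grow as [-1,1], [-2,2], ..., [-5,5]
```

The enclosure widens by the full width of E every step no matter what the flow actually does, which is why this naive scheme is abandoned for Taylor Models.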
Notice that we can obtain a bound on any polynomial by simply performing interval evaluation on the domain on which it is defined. Other arithmetic operations, truncation, and the notion of an antiderivative can also be defined.

One advantage of working with Taylor Models is that bounds for most common functions are known and can be computed automatically by a computer. Bounds for polynomials, trigonometric, exponential, and logarithmic functions, as well as operations like 1/x and sqrt(x) (all referred to as intrinsic functions), have been computed in [BM1]. All of these quantities are known explicitly, and on the domain [−h,h] they have remainders that scale like O(h^{n+1}). Since most complicated expressions (i.e. the ones we care about) are made by composing these simple known quantities together, we can get nice Taylor Models with remainder intervals that scale like O(h^{n+1}). This is known as:

The Fundamental Theorem of Taylor Model Arithmetic (FTTMA). Suppose that the function f is described by an nth order Taylor Model (P_f, I_f) on its domain D. Let g be a function which is composed of finitely many elementary operations and intrinsic functions, and suppose g is defined on the range of f. Let (P,I) be the Taylor Model which arises by plugging (P_f, I_f) into g and evaluating using Taylor Model arithmetic. Then (P,I) is a Taylor Model for g ∘ f. Furthermore, if the remainder interval I_f scales like O(h^{n+1}), then so does the remainder interval I.

Proof of this theorem, as well as a detailed list of intrinsic functions and their Taylor Models, is found in [BM1]. It basically amounts to induction on each operation performed by g. This theorem is important since it allows us to think of Taylor Models as data objects which we can move around and manipulate in a computer without risking losing control over the size of the remainder bound. Notice that zeroth-order Taylor Model arithmetic is simply interval arithmetic.
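Equations (6) and (7) can be sketched for one-dimensional Taylor Models stored as coefficient lists. The domain [−1, 1] and the crude coefficient-sum bound used for B(P) are simplifying assumptions for illustration, not the sharper bounding used by [BM1]:

```python
from itertools import product

N = 3  # truncation order of the Taylor Models

def imul(a, b):
    ps = [x * y for x in a for y in b]
    return (min(ps), max(ps))

def iadd(*ivls):
    return (sum(v[0] for v in ivls), sum(v[1] for v in ivls))

def bound(p):
    # crude bound B(P) on the domain [-1,1]: |P(x)| <= sum |c_i|
    b = sum(abs(c) for c in p)
    return (-b, b)

def tm_add(t1, t2):
    (p1, i1), (p2, i2) = t1, t2
    return ([a + b for a, b in zip(p1, p2)], iadd(i1, i2))  # equation (6)

def tm_mul(t1, t2):
    (p1, i1), (p2, i2) = t1, t2
    p, hb = [0.0] * (N + 1), 0.0
    for j, k in product(range(N + 1), repeat=2):
        c = p1[j] * p2[k]
        if j + k <= N:
            p[j + k] += c
        else:
            hb += abs(c)  # order > N terms are swept into B(P_h)
    # remainder per equation (7): B(P_h) + B(P1)*I2 + B(P2)*I1 + I1*I2
    rem = iadd((-hb, hb), imul(bound(p1), i2), imul(bound(p2), i1), imul(i1, i2))
    return (p, rem)

# (1 + x) * (1 - x) with tiny remainders: the polynomial part is 1 - x^2
tm1 = ([1.0, 1.0, 0.0, 0.0], (-1e-16, 1e-16))
tm2 = ([1.0, -1.0, 0.0, 0.0], (-1e-16, 1e-16))
p, i = tm_mul(tm1, tm2)
```

Note how the product's remainder stays on the order of the input remainders: the polynomial part carries the bulk of the information, which is the whole point of the construction.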
Higher order Taylor Models, however, give us much more. Consider the function g(x) = x − x. If we feed g the function f(x) = x on [−1,1] and use interval arithmetic to bound the answer, then we get g(f(x)) ∈ [−2,2], since [−1,1] − [−1,1] = [−2,2], which is hardly a tight bound. Now suppose instead we use Taylor Models to represent f(x) as x + I, where x ∈ [−1,1] and I = [−ε, ε] (machine precision). Then we get (x+I) − (x+I) = (x−x) + (I−I) = [−2ε, 2ε]. This is quite a dramatic improvement over intervals. Essentially, by using a Taylor Model we are storing higher order information about the shape of the range of f, which can be manipulated and cancelled to gain better bounds.

Due to this extra storage of information, one of the primary disadvantages of Taylor Models is that they are very slow and require a lot of storage space in comparison to interval methods. Somewhat efficient methods have been developed in [BM1]. (A nice trick is that polynomial coefficients below the machine-ε don't need to be stored; they only need to be thrown into the remainder bound.) In practice, however, n = 5 is usually good enough in terms of both speed and accuracy for your average problem.

There is a scripting language called COSY, developed by Martin Berz and Kyoko Makino (currently at MSU), which has Taylor Models as built-in objects and can automatically perform all of the operations described above. Additionally, COSY has some more advanced theoretical tricks to obtain the bounds B(P). It also has a full interval arithmetic package built in and can convert between Taylor Models, intervals, and reals to the extent that it makes sense to do so. All intrinsic functions on Taylor Models are built in, and bounds can be instantly obtained by working with them. COSY is currently available free of charge for academic use (under some restrictions) at www.cosyinfinity.org.
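The x − x comparison can be replayed numerically; a small sketch, where EPS is an assumed machine-precision-sized remainder:

```python
EPS = 2.220446049250313e-16   # double precision machine epsilon

# Interval arithmetic: f(x) = x on [-1, 1]; the two copies of x are
# treated as unrelated quantities, so the dependency is lost.
lo, hi = -1.0, 1.0
interval_bound = (lo - hi, hi - lo)   # [-2, 2]

# Taylor Model: f is the polynomial 'x' plus remainder [-EPS, EPS].
# The polynomial parts cancel exactly; only the remainders subtract.
tm_poly = 1.0 - 1.0                   # coefficient of x in (x - x)
tm_remainder = (-2 * EPS, 2 * EPS)    # [-2eps, 2eps]

print(interval_bound)                 # (-2.0, 2.0)
print(tm_remainder)                   # roughly (-4.4e-16, 4.4e-16)
```

Sixteen orders of magnitude separate the two enclosures, entirely because the symbolic part of the Taylor Model remembers that both operands are the same function.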
As an added bonus, COSY has been 'verified' by other rigorous computer arithmetic packages [BM1][BM2][BM3][MB1][M1][RMB1].

5 Schauder's Theorem and Verified Integration

Taylor Models have another operation which can be defined in a natural way: antidifferentiation. If (P,I) is a Taylor Model, then we define the antidifferentiation operator

    ∂_i^{-1}(P, I) = ( ∫_0^{x_i} P_{n-1}(x) dx, (B(P − P_{n-1}) + I) · B(x_i) )    (8)

where P_{n-1} is the (n−1)th degree truncation of P, and x_i ∈ [a_i, b_i]. Notice this operation is easy to compute, since integration of a polynomial is just manipulation of its coefficients. The bounds in the remainder term are easily computed by interval evaluation of the polynomial piece and interval operations.

This is the key to implementing a rigorous ODE solver. Recall that every ODE of the form (1) can be written as an integral equation x(t) = x(0) + ∫_0^t f(x(s)) ds for t ≤ h. We define the Picard operator

    A_f(x)(t) = x(0) + ∫_0^t f(x(s)) ds    (9)

A_f is a map from C^0([0, h]) to itself, and fixed points of A_f correspond to solutions of our IVP. (Note A_f is continuous because we are assuming that f is continuous.) We have a theorem which gives us the existence of such a fixed point.

Schauder's Theorem. Let A be a continuous operator on the Banach space X. Let M ⊂ X be compact and convex, and let A(M) ⊂ M. Then A has a fixed point in M.

We are going to apply this theorem with A = A_f to a subset of X = C^0([0, h]) which contains all Taylor Models. This approach was originally carried out in [BM4] and we follow it here. We start by finding a large Y ⊂ X which is agreeable to our analysis of Taylor Models and to which we can apply the Schauder Theorem. Let (P+I) be a Taylor Model depending on both time and the initial condition x_0.
Then we define the set M(P,I) ⊂ X = C^0([0, h]) so that for x ∈ M(P,I) we have

    x(0) = x_0    (10)
    x(t) ∈ P + I  for all t ∈ [0, h] and all x_0    (11)
    |x(t') − x(t'')| ≤ k|t' − t''|  for all t', t'' ∈ [0, h] and all x_0    (12)

The last condition is a Lipschitz condition, as in existence/uniqueness theory for solutions of the ODE. We take k to be some Lipschitz bound on the function f. Define Y as

    Y = ∪_{(P,I)} M(P,I)    (13)

So Y will contain all Taylor Models. Now if M ⊂ Y and x_1, x_2 ∈ M, then a·x_1 + (1−a)·x_2 ∈ M for all a ∈ [0,1], since (a·x_1 + (1−a)·x_2)(0) = x_0, a·x_1 + (1−a)·x_2 is also Lipschitz with constant k, and a·x_1 + (1−a)·x_2 will be in the same Taylor Model as x_1 and x_2 (due to the FTTMA). Hence M is convex. Some point-set topology and an application of the Arzela-Ascoli Theorem show that M is compact. Finally, note that A_f maps Y into itself: (A_f(x))(0) = x_0; A_f(x) is continuous due to the integral and Lipschitz continuous with constant k because f is bounded by k; and, since A_f is made up of intrinsic functions, the FTTMA gives that A_f maps Taylor Models to Taylor Models, i.e. Y into Y.

To apply Schauder's Theorem, we must then find a Taylor Model (P,I) such that A(P + I) ⊂ P + I. The fixed point, i.e. the solution of the ODE, will then be contained in the Taylor Model. Notice that if I is small, then we will have succeeded in closely modeling the solution with the polynomial part. Finding such a Taylor Model is relatively easy computationally: start with the zero polynomial and repeatedly iterate it through A, each time disregarding terms of order (n+1) or higher.

Claim. After (n+1) steps, this produces an nth degree polynomial invariant under A.

Proof. We will show that after k applications of A to the zero polynomial, all terms of degree (k−1) are fixed. Since the Taylor Model is of degree n, applying A (n+1) times will produce the result. Let P = A^k(0). Then it suffices to show that A(P + O(t^k)) = P + O(t^k). We proceed by induction on k.

Basis: k = 1.
P = A(0) = x_0, and

    A(x_0 + O(t)) = x_0 + ∫_0^t f(x_0 + O(τ)) dτ = x_0 + O(t)    (14)

since every term in the integral picks up at least a factor of t after the integration.

Now assume the result holds for k; we show it holds for (k+1). Let Q = A(P) = A^{k+1}(0). By the inductive hypothesis, Q = R + S + O(t^{k+1}), where R is the degree (k−1) polynomial such that P = R + O(t^k) (hence R is fixed under iterates of A), and S is the polynomial composed of the degree k terms in Q. Then

    A(Q) = A(R + S + O(t^{k+1})) = x_0 + ∫_0^t f(R + S + O(τ^{k+1})) dτ    (15)
         = x_0 + ∫_0^t [ f(R+S) + f'(R+S)·O(τ^{k+1}) + (f''(R+S)/2)·O(τ^{k+1})^2 + … ] dτ    (16)
         = A(R + O(t^k)) + ∫_0^t O(τ^{k+1})·(stuff) dτ    (17)
         = R + O(t^k) + O(t^{k+2})    (18)
         = R + (kth order terms) + O(t^{k+1})    (19)

We have used the Taylor series expansion of f in terms of its argument x, and the inductive hypothesis. Our claim will be complete if the (kth order terms) equal S. Notice that since deg(S) = k we have

    ∫_0^t f(R + S) dτ = ∫_0^t [ f(R) + f'(R)·S + (f''(R)/2)·S^2 + … ] dτ    (20)
                      = ∫_0^t f(R) dτ + ∫_0^t S·(stuff) dτ    (21)
                      = ∫_0^t f(R) dτ + O(t^{k+1})    (22)

i.e. the (kth order terms) are actually independent of S, since all the terms involving S get integrated and land in the O(t^{k+1}). Now

    A(P) = A(R + (P − R))    (23)
         = x_0 + ∫_0^t f(R + (P − R)) dτ    (24)
         = x_0 + ∫_0^t [ f(R) + f'(R)·(P − R) + … ] dτ    (25)
         = A(R) + ∫_0^t (P − R)·(stuff) dτ    (26)
         = A(R) + O(t^{k+1})    (27)

since deg(P − R) = k. So we have A(Q) = A(R) + O(t^{k+1}) and A(R) = A(P) + O(t^{k+1}) = A(A^k(0)) + O(t^{k+1}) = Q + O(t^{k+1}), so we must have A(Q) = Q + O(t^{k+1}) = R + S + O(t^{k+1}), which implies that the (kth order terms) equal S.

Using this algorithm, we can generate a polynomial invariant under A. To complete the application of Schauder's Theorem, we must find a Taylor Model which is invariant under A. Let P be the A-invariant nth degree polynomial. We desire an interval I so that A(P + I) ⊂ P + I, i.e. A(P + I) − P ⊂ I. We have A(P + I) = x_0 + ∫_0^t f(P + I) dτ. By the FTTMA, f(P + I) will be a Taylor Model.
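The claim's construction can be sketched concretely. Here, as an illustrative choice not fixed by the paper, we take the scalar model problem ẋ = x with x_0 = 1, for which the invariant polynomial must come out to the truncated Taylor series of e^t:

```python
N = 5  # Taylor Model order

def integrate(p):
    """Antidifferentiate a polynomial in t (coefficient list), truncating at degree N."""
    out = [0.0] * (N + 1)
    for k, c in enumerate(p[:N]):
        out[k + 1] = c / (k + 1)
    return out

def picard(f_of_p, x0, steps):
    """Iterate the Picard operator A(P) = x0 + integral of f(P), starting from P = 0."""
    p = [0.0] * (N + 1)
    for _ in range(steps):
        q = integrate(f_of_p(p))       # drop terms of order N+1 or higher
        p = [x0 if k == 0 else q[k] for k in range(N + 1)]
    return p

# For x' = x, f(P) = P; the claim says N+1 iterations suffice to reach
# the invariant polynomial 1 + t + t^2/2 + ... + t^N/N!.
p = picard(lambda q: q, 1.0, N + 1)
print(p)   # [1.0, 1.0, 0.5, 0.1666..., 0.04166..., 0.008333...]
```

One further application of the operator reproduces the same coefficient list, confirming invariance, exactly as the claim predicts.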
We can decompose f(P + I) = Q + R + Î, where Q is all terms of degree (n−1) or less, R is all degree n terms, and Î is the remainder. Since A(P) = P + O(t^{n+1}), and since deg(R) = n means that R integrates to an (n+1) order term, we must have P = x_0 + ∫_0^t Q dτ. Thus the other terms contribute only to the remainder, i.e.

    A(P + I) − P ⊂ ∫_0^t (R + Î) dτ    (28)

We want to better understand this relation, since Î depends upon I. [...]

[...] is some vector we pick.

7 Conclusion

We have constructed a rigorous integrator which accounts for both the error in truncation of the Taylor series for the solution to an ODE and floating point error. The method allows us to rigorously push boxes of solutions around under the flow. We have numerical methods available which allow us to do this for reasonably long intervals of time. Taylor Methods are still [...]

[...] have explicit formulas for the error now. We also can carry them with us through the integration. Our algorithm is now:

Input: f, initial conditions (as intervals), h.
1) Make the invariant polynomial to model the flow and compute a remainder interval valid for time up to t = h.
2) Evaluate the polynomial at t = h and add the remainder interval. Use this as a new initial condition and go to 1.

As a result, we will get a rigorous integration [...]

[...] which gives

    d = x_0 λ^3 h^4 / (24(1 − λh))    (45)

Notice that for λ > 0 we have to worry about the denominator going to zero, which would force us to take h small to avoid having d > 1. This phenomenon is known as stiffness and is present in other non-rigorous ODE solvers as well. Dynamically this just says that it is hard to model the flow exactly for long periods when it is growing exponentially. Notice as well [...]
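A toy version of the two-step loop above can be sketched for the linear model problem ẋ = λx, where a remainder bound for the flow is available in closed form (the standard Taylor remainder of exp). Floating-point outward rounding is omitted for clarity, so this illustrates the structure of the stepping loop rather than a fully verified integrator:

```python
import math

def step(x, lam, h, n=5):
    """One step of the loop: degree-n flow polynomial plus remainder interval.
    For x' = lam*x the flow over time h is multiplication by e^{lam*h}; for
    |z| <= 1, |e^z - T_n(z)| <= 3*|z|^(n+1)/(n+1)! since e^{|z|} <= e <= 3."""
    z = lam * h
    assert abs(z) <= 1.0
    t_n = sum(z**k / math.factorial(k) for k in range(n + 1))
    r = 3.0 * abs(z)**(n + 1) / math.factorial(n + 1)
    growth = (t_n - r, t_n + r)          # enclosure of e^{lam*h}
    prods = [a * b for a in x for b in growth]
    return (min(prods), max(prods))      # new interval initial condition

x = (0.99, 1.01)                         # box of initial conditions
for _ in range(10):                      # integrate x' = x out to t = 1
    x = step(x, 1.0, 0.1)
print(x)                                 # encloses e * [0.99, 1.01]
```

Each pass evaluates the step's polynomial enclosure at t = h, adds the remainder, and feeds the result back in as the next initial condition, mirroring the "goto 1" in the algorithm above.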
[...] methods available to deal with exactly this problem.

6 Shrink Wrapping

We have succeeded in creating a rigorous integrator. Moreover (depending on the ODE) it works for reasonable timescales. However, as above with our interval integrator, we have the same trouble that interval remainders start to accumulate and ruin the entire result. By using Taylor Models for the initial conditions, we have partially [...] since it is easier to model the flow for a short period of time than for a longer period. What is hidden is exactly how well it scales. The term R, which is the nth degree piece of f(P+I), behaves like O(h^{n+1}), so decreasing h (the time we are modeling the ODE for) or increasing n results in a dramatic increase in accuracy. This is precisely what was said of the non-rigorous integrators in the introduction. [...]

[...] bounds for Taylor coefficients and for Taylor remainder series. J. Comput. Appl. Math. 152 (2003), 393-404.

(RMB1) N. Revol, K. Makino, M. Berz. Taylor Models and Floating-Point Arithmetic: Proof that Arithmetic Operations are Validated in COSY. Journal of Logic and Algebraic Programming 64 (2005) 135-154. University of Lyon LIP Report RR 2003-11, MSU HEP report 30212.

(Z1) Piotr Zgliczynski. Lecture notes for Conference [...]

[...] numerical analysis; however, new methods to improve integrators are being developed, for example in [BM6], [BM7], [N2]. As computer speed and storage capacity increase, runtimes for code decrease, and it becomes easier to run these verified methods. In one hundred years from now, might mathematicians look back on the dark ages of numerics, when things were done without verification, and wonder why?

8 References [...]
[...] Preconditioning. International Journal of Differential Equations and Applications 10(4) (2005) 353-384.

(BM8) M. Berz, K. Makino. Performance of Taylor Model Methods for Validated Integration of ODEs. Lecture Notes in Computer Science 3732 (2005) 65-74.

(BM9) M. Berz, K. Makino. Lecture notes for Conference on Computer Assisted Proofs in Dynamical Systems, 2008. http://www.bt.pa.msu.edu/cap08/Talks/

(BMH1) M. Berz, [...]

[...] easily manipulated. Doing so allows the rigorous integrator to work for a much longer period, since after each step we shrink wrap the new initial conditions, so that going into the next step there is no interval remainder. However, Shrink Wrapping has its limits. Generally we hope that the shrink wrap factor q is very close to one. However, if it is too large (say q ≈ 1.1) for too many steps of integration, [...]

[...] accumulate over a long integration, or for a complicated ODE. We use a technique called 'Shrink Wrapping', outlined in [BM5], which works by attempting to absorb a large remainder bound into the polynomial part of a Taylor Model. In doing so, the error can then be manipulated (and hopefully cancelled) along with the polynomial parts. We outline only the ideas and refer to [BM5] for the proof. (The proofs simply rely [...]
