Stochastic Mechanics
Random Media
Signal Processing and Image Synthesis
Mathematical Economics and Finance
Stochastic Optimization
Stochastic Control

Applications of Mathematics
Stochastic Modelling and Applied Probability 34

Edited by I. Karatzas and M. Yor
Advisory Board: P. Bremaud, E. Carlen, W. Fleming, D. Geman, G. Grimmett, G. Papanicolaou, Scheinkman

Springer-Verlag Berlin Heidelberg GmbH

Applications of Mathematics
1. Fleming/Rishel, Deterministic and Stochastic Optimal Control (1975)
2. Marchuk, Methods of Numerical Mathematics, Second Edition (1982)
3. Balakrishnan, Applied Functional Analysis, Second Edition (1981)
4. Borovkov, Stochastic Processes in Queueing Theory (1976)
5. Liptser/Shiryayev, Statistics of Random Processes I: General Theory (1977)
6. Liptser/Shiryayev, Statistics of Random Processes II: Applications (1978)
7. Vorob'ev, Game Theory: Lectures for Economists and Systems Scientists (1977)
8. Shiryayev, Optimal Stopping Rules (1978)
9. Ibragimov/Rozanov, Gaussian Random Processes (1978)
10. Wonham, Linear Multivariable Control: A Geometric Approach, Third Edition (1985)
11. Hida, Brownian Motion (1980)
12. Hestenes, Conjugate Direction Methods in Optimization (1980)
13. Kallianpur, Stochastic Filtering Theory (1980)
14. Krylov, Controlled Diffusion Processes (1980)
15. Prabhu, Stochastic Storage Processes: Queues, Insurance Risk, and Dams (1980)
16. Ibragimov/Has'minskii, Statistical Estimation: Asymptotic Theory (1981)
17. Cesari, Optimization: Theory and Applications (1982)
18. Elliott, Stochastic Calculus and Applications (1982)
19. Marchuk/Shaidourov, Difference Methods and Their Extrapolations (1983)
20. Hijab, Stabilization of Control Systems (1986)
21. Protter, Stochastic Integration and Differential Equations (1990)
22. Benveniste/Metivier/Priouret, Adaptive Algorithms and Stochastic Approximations (1990)
23. Kloeden/Platen, Numerical Solution of Stochastic Differential Equations (1992)
24. Kushner/Dupuis, Numerical Methods for Stochastic Control Problems in Continuous Time (1992)
25. Fleming/Soner, Controlled Markov Processes and Viscosity Solutions (1993)
26. Baccelli/Bremaud, Elements of Queueing Theory (1994)
27. Winkler, Image Analysis, Random Fields and Dynamic Monte Carlo Methods (1995)
28. Kalpazidou, Cycle Representations of Markov Processes (1995)
29. Elliott/Aggoun/Moore, Hidden Markov Models: Estimation and Control (1995)
30. Hernandez-Lerma/Lasserre, Discrete-Time Markov Control Processes (1995)
31. Devroye/Györfi/Lugosi, A Probabilistic Theory of Pattern Recognition (1996)
32. Maitra/Sudderth, Discrete Gambling and Stochastic Games (1996)
33. Embrechts/Klüppelberg/Mikosch, Modelling Extremal Events (1997)
34. Duflo, Random Iterative Models (1997)

Marie Duflo
Random Iterative Models
Translated by Stephen S. Wilson

Springer

Marie Duflo, Université de Marne-la-Vallée, Equipe d'Analyse et de Mathématiques Appliquées, 2, rue de la Butte Verte, 93166 Noisy-Le-Grand Cedex, France

Managing Editors
I. Karatzas, Departments of Mathematics and Statistics, Columbia University, New York, NY 10027, USA
M. Yor, Laboratoire de Probabilités, Université Pierre et Marie Curie, Place Jussieu, Tour 56, F-75230 Paris Cedex, France

Title of the French original edition: Méthodes récursives aléatoires. Published by Masson, Paris 1990

Cover picture: From a report on Prediction of Electricity Consumption drawn up in 1993 for E.D.F. by Misiti M., Misiti Y., Oppenheim G. and Poggi J.M.

Library of Congress Cataloging-in-Publication Data
Duflo, Marie. [Méthodes récursives aléatoires. English] Random iterative models / Marie Duflo; translated by Stephen S. Wilson.
p. cm. (Applications of mathematics; 34). Includes bibliographical references (p. ) and index.
ISBN 978-3-642-08175-0  ISBN 978-3-662-12880-0 (eBook)  DOI 10.1007/978-3-662-12880-0
1. Iterative methods (Mathematics). 2. Stochastic processes. 3. Adaptive control systems--Mathematical models. I. Title. II. Series. QA297.8.D8413 1997 003'.76'015114 dc21 96-45470 CIP

Mathematics Subject Classification (1991): 60F05/15, 60G42, 60J10/20, 62J02/05, 62L20, 62M05/20, 93E12/15/20/25

ISSN 0172-4568
ISBN 978-3-642-08175-0

This work is subject to copyright. All rights are reserved, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation, broadcasting, reproduction on microfilm or in any other way, and storage in data banks. Duplication of this publication or parts thereof is permitted only under the provisions of the German Copyright Law of September 9, 1965, in its current version, and permission for use must always be obtained from Springer-Verlag Berlin Heidelberg GmbH. Violations are liable for prosecution under the German Copyright Law.

© Springer-Verlag Berlin Heidelberg 1997. Originally published by Springer-Verlag Berlin Heidelberg New York in 1997. Softcover reprint of the hardcover 1st edition 1997

Typeset from the translator's LaTeX files with Springer-TeX style files
SPIN: 10070196 41/3143 - 5 4 3 2 1 - Printed on acid-free paper

Preface

Be they random or non-random, iterative methods have progressively gained sway with the development of computer science and automatic control theory. Thus, being easy to conceive and simulate, stochastic processes defined by an iterative formula (linear or functional) have been the subject of many studies. The iterative structure often leads to simpler and more explicit arguments than certain classical theories of processes. On the other hand, when it comes to choosing step-by-step decision algorithms (sequential statistics, control, learning, ...), recursive decision methods are indispensable. They lend themselves naturally to problems of the identification and control of iterative stochastic processes. In recent years, know-how in this area has advanced greatly; this is reflected in the corresponding theoretical problems, many of which remain open.

At Whom Is This Book Aimed?
I thought it useful to present the basic ideas and tools relating to random iterative models in a form accessible to a mathematician familiar with the classical concepts of probability and statistics but lacking experience in automatic control theory. Thus, the first aim of this book is to show young research workers that work in this area is varied and interesting and to facilitate their initiation period. The second aim is to present more seasoned probabilists with a number of recent original techniques and arguments relating to iterative methods in a fairly classical environment. Very diverse problems (prediction of electricity consumption, production control, satellite communication networks, industrial chemistry, neurons, ...) lead engineers to become interested in stochastic algorithms which can be used to stabilize, identify or control increasingly complex models. Their experience and the diversity of their techniques go far beyond our aims here. But the third aim of the book is to provide them with a toolbox containing a quite varied range of basic tools. Lastly, it seems to me that many lectures on stochastic processes could be centred around a particular chapter. The division into self-contained parts described below is intended to make it easy for undergraduate or postgraduate students and their teachers to access selected and relevant material.

Contents

The overall general foundations are laid in Part I. The other three parts can be read independently of each other (apart from a number of easily locatable references and optional examples). This facilitates partial use of this text as research material or as teaching material on stochastic models or the statistics of processes.

Part I. Sources of Recursive Methods

Chapter 1 presents the first mathematical ideas about sequential statistics and about stochastic algorithms (Robbins-Monro). An outline sketch of the theory of martingales is given together with certain complementary information about recursive methods. Chapter 2 summarizes the theory of convergence in distribution and that of the central limit theorem for martingales, which is then applied to the Robbins-Monro algorithm. The vectorial AR(1) autoregressive model of order 1 is studied in detail; this model will provide the essential link between the following three parts. Despite its abstract style, the development of this book has been heavily influenced by dialogues with other research workers interested in highly specific industrial problems. Chapter 3 gives an all-too-brief glimpse of such examples.

Part II. Linear Models

The mathematical foundations of automatic control theory, which were primed in Chapter 2 based on the AR(1) model, are developed here. Chapter 4 discusses the concepts of causality and excitation for ARMAX models. The importance of transferring the excitation of the noise onto that of the system is emphasized and algebraic criteria guaranteeing such a transfer are established. Identification and tracking problems are considered in Chapter 5, using classical (gradient and least squares) or more recent (weighted least squares) estimators.

Part III. Nonlinear Models

The first part of Chapter 6 describes the concept of 'stability' of an iterative Markov Fellerian model. Simple criteria ensuring the almost sure weak convergence of empirical distributions to a unique stationary distribution are obtained. This concept of stability seems to me, pedagogically and practically, much more manageable than the classical notion of recurrence; moreover, many models (fractals, automatic control theory) can
be stable without being recurrent. A number of properties of rates of convergence in distribution and almost sure convergence complete this chapter.

The identification and tracking problems resolved in Chapter 5 for the linear case are much more difficult for functional regression models. Some partial solutions are given in Chapter 7, largely using the recursive kernel estimator.

Part IV. Markov Models

Paradoxically, Part IV of this book is the most classical. It involves a brief presentation of probabilistic topics described in greater detail elsewhere, placing them in the context of the preceding chapters. The general theory of the recurrence of Markov chains is finally given in Chapter 8. Readers will note that, in many cases, it provides a useful complement to the stability theory of Chapter 6, but at the cost of much heavier techniques (and stronger assumptions about the noise). On the subject of learning, Chapter 9 outlines the theory of controlled Markov chains and on-average optimal controls. The chapter ends with a number of results from the theory of stochastic approximation introduced in Chapter 1: the ordinary differential equation method, Markovian perturbation, traps, applications to visual neurons and principal components analysis.

What You Will Not Find

Since the main aim was to present recursive methods which are useful in adaptive control theory, it was natural to emphasize the almost sure properties (laws of large numbers, laws of the iterated logarithm, optimality of a strategy for the average asymptotic cost, ...). Convergence in distribution is thus only discussed in outline and the principles of large deviations are not touched upon. Iterative Markov models on finite spaces, the simulation of a particular model with a given stationary distribution and simulated annealing are currently in vogue, particularly in image processing circles. Although they come under the umbrella of 'random iterative models', they are not dealt with here. These gaps have been partially filled in my recent book 'Algorithmes Stochastiques', 1996, Springer-Verlag.

History

The history of this book dates back to the end of the 1980s. It was developed at that time within the statistical research team of the Université Paris-Sud, in particular, by the automatic control team. Its contents have been enriched by numerous exchanges with the research workers of this team and its composition has been smoothed by several years of post-graduate courses. The first French edition of this book was published by Masson in 1990. When Springer-Verlag decided to commission an English translation in 1992, I felt it was appropriate to present a reworked text, taking into account the rapid evolution of some of the subjects treated. This book is a translation of that adaptation, which was carried out at the beginning of 1993 (with a number of additions and alterations to the Bibliography).

Acknowledgments

It is impossible to thank all those research workers and students at the Université Paris-Sud and at the Université de Marne-la-Vallée where I have worked since 1993, who have contributed to this book through their dialogue. Their contributions will be acknowledged in the Bibliography. Three research workers who have read and critically reviewed previous drafts deserve special mention: Bernard Bercu, Rachid Senoussi and Abderhamen Touati. Lastly, Dr Stephen Wilson has been responsible for the English translation. He deserves hearty thanks for the intelligent and most useful critical way in which he achieved it.

Notation

Numbering System
• Within a chapter, a continuous numbering system is used for the Exercises on the one hand and for the Theorems, Propositions, Corollaries, Lemmas and Definitions on the other hand. The references indicate the chapter, section and number: Theorem 1.3.10 (or Exercise 1.3.10) occurs in Section 3 of Chapter 1 and is the tenth of that chapter.
• marks the end of a Proof; marks the end of the statement of an Exercise or a Remark.

Standard Mathematical Symbols

• Abbreviations. Constant is abbreviated to const and ln(ln x) to LL.
• Sets. N = integers ≥ 0; Z = relative integers; Q = rational numbers; R = real numbers; C = complex numbers. 1_A is the characteristic function of A: 1_A(x) = 1 if x ∈ A, 0 if x ∉ A.
• Sequences. If (u_n) is a real monotonic sequence, u_∞ is its limit, either finite or infinite. If (u_n) and (v_n) are two positive sequences, u_n = O(v_n) (resp. o(v_n)) means that (u_n/v_n) is bounded (resp. tends to 0).
• Vectors. u, tu, *u, (u, v), ||u|| - see Section 4.2.1.
• Matrices d × d. A = (A_ij); I or I_d identity; tA, *A, Tr A, ||A||, det A - see Section 4.2.1; ρ(A) - see Section 2.3.1.
• Positive Hermitian Matrices. λ_min(C), λ_max(C), √C, C^{-1}, C_1 ≤ C_2 - see Section 4.2.1; C_1 C_2 - see Section 6.3.2. Norm of a rectangular matrix B, ||B|| - see Section 4.2.1.
• Excitation of a Sequence of Vectors Y = (Y_n). C_n(Y) = Σ_{k=0}^{n} Y_k *Y_k. We also set (see Section 4.2) s_n(Y) = Σ_{k=0}^{n} ||Y_k||², f_n(Y) = *Y_n (C_n(Y))^{-1} Y_n and g_n(Y) = *Y_n (C_{n-1}(Y))^{-1} Y_n.
• Functions. If φ is differentiable from R^p to R^q, we denote its Jacobian matrix by Dφ. When q = 1, ∇φ = tDφ is its gradient.
• Lipschitz function Li(r, s) - Section 6.3.2.
• ODE - Section 9.2.

Standard Probabilistic Symbols

• Measure. (Ω, A, P) probability space; F = (F_n) filtration - see Section 1.1.5; (E^n, E^{⊗n}) = (E, E)^n; B_E Borel σ-field for E. For f measurable from (Ω, A) to (E, E) and Γ ∈ E, we denote {f ∈ Γ} = {ω; f(ω) ∈ Γ}. For two sequences of positive random variables (α_n) and (β_n), we denote {α_n = O(β_n)} = {ω; α_n(ω) = O(β_n(ω))} and {α_n = o(β_n)} = {ω; α_n(ω) = o(β_n(ω))}. a.s. = almost surely. ⟨M⟩ = increasing process, hook of a martingale - see Sections 1.3.1, 2.1.3 and 4.3.2.
• Convergence. →^{a.s.} = converges almost surely; →^{P} = converges in probability; →^{D} = converges in distribution.

Symbols for Linear Models

• Models. ARMAX, ARMA, ARX, AR, MA - see Section 4.1.1; RMA - see Section 5.4.1.
• Estimators. LS, RLS - Section 5.2.1; SG - Section 5.3.1; WLS - Section 5.3.2; ELS, AML - Section 5.4.1.
• R for the delay operator - Section 4.1.1.

Symbols for Nonlinear Models

ARF - Section 6.2.3; ARXF - Section 6.2.4; ARCH - Section 6.3.3.
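The short numerical sketch below is an editorial illustration, not part of the original text: it computes the excitation quantities defined above for a small sequence of real vectors, so that *Y is read as the ordinary transpose. The helper name `excitation`, the use of NumPy and the sample vectors are assumptions made only for this example.

```python
import numpy as np

def excitation(Y):
    """Illustrative computation (under the assumptions stated above) of
    C_n(Y) = sum_{k=0}^n Y_k *Y_k,  s_n(Y) = sum_{k=0}^n ||Y_k||^2,
    and f_n(Y) = *Y_n (C_n(Y))^{-1} Y_n for real column vectors Y_k."""
    Y = [np.asarray(y, dtype=float).reshape(-1, 1) for y in Y]
    C = sum(y @ y.T for y in Y)                      # C_n(Y): sum of outer products
    s = sum((y.T @ y).item() for y in Y)             # s_n(Y): sum of squared norms
    f = (Y[-1].T @ np.linalg.inv(C) @ Y[-1]).item()  # f_n(Y): scalar, assumes C invertible
    return C, s, f

# Example with three vectors of R^2
C, s, f = excitation([[1.0, 0.0], [0.0, 2.0], [1.0, 1.0]])
print(C)  # [[2. 1.] [1. 5.]]
print(s)  # 7.0
print(f)  # about 0.556
```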
NL Nonlinear Models

Masry, E (1987): Almost sure convergence of recursive density estimators for stationary mixing processes. Stat Probab Lett., 5, 249-54
Masry, E and Györfi, L (1987): Strong consistency and rates for recursive probability density estimators of a stationary process. J Multivariate Anal., 22, 79-93
Menon, V.V., Prasad, B and Singh, R.S (1984): Nonparametric recursive estimates of a probability density function and its derivatives. Stat Plann Inference, 9, 73-82
Meyer, Y (1992): Les Ondelettes: Algorithmes et Applications. A. Colin
Mokkadem, A (1987): Sur un modele autoregressif non lineaire: ergodicite et ergodicite geometrique. J Time Ser. Anal., 8, 195-204
Mokkadem, A (1988): Ergodicite des systemes stochastiques polynomiaux en temps discret. In 'New Trends on Non Linear Control Theory'. Lect Notes Control Inf Sci., 122, 404-13
Mokkadem, A (1989): Orbites de semi-groupes de morphismes reguliers et systemes non-lineaires en temps discrets. Forum Mathematicum, 1, 359-76
Mokkadem, A (1990): Proprietes de melange des processus autoregressifs polynomiaux. Ann Inst Henri Poincare, Probab Stat., 26, 219-60
Mokkadem, A (1995): Orbit theorems for semigroups of regular morphisms and non linear discrete time systems. Bull Soc Math Fr., 123, 477-91
Nadaraya, E.A (1964): On estimating regression. Theory Probab Appl., 10, 186-90
Narendra, K.S and Parthasarathy, K (1990): Identification and control of dynamical systems using neural networks. IEEE Trans Neural Networks, 1, 4-27
Nazin, A.V., Polyak, B.T and Tsybakov, A.B (1989): Passive stochastic approximation. Autom Remote Control, 50, 1563-9
Oppenheim, G and Portier, B (1993): Adaptive control of nonlinear dynamic systems: study of the non parametric estimator. Syst Control Eng., 1, 40-50
Oulidi, A (1993a): Loi du logarithme itere uniforme dilatee; application a l'estimation fonctionnelle. Thesis, Universite Paris-Nord
Oulidi, A (1993b): Loi du logarithme itere uniforme dilatee. C R Acad Sci., Paris, I, 316, 745-8
Parzen, E (1962): On the estimation of a probability density and mode. Ann Math Stat., 33, 1065-76
Pham, D.T (1985): Bilinear Markovian representation and bilinear models. Stochastic Processes Appl., 20, 295-306
Pham, D.T (1986): The mixing property of bilinear and generalized random coefficient autoregressive models. Stochastic Anal Appl., 23, 291-300
Pham, D.T and Tran, L.T (1985): Some mixing properties of time series models. Stochastic Processes Appl., 19, 297-303
Pierre-Loti Viaud, D and Portal, F (1993): Large deviations for random perturbations of discrete time dynamical systems. Bull Sci Math II Ser., 117, 333-55
Poggi, J.M and Portier, B (1995): Un test de linearite pour des modeles autoregressifs fonctionnels. Preprint 95-82, Universite Paris-Sud
Polyak, B.T and Tsybakov, A.B (1990): Asymptotic optimality of the C_p test for the orthogonal series estimation of a regression. Theory Probab Appl., 35, 293-306
Polyak, B.T and Tsybakov, A.B (1992): Optimal methods for projective regression estimate order. Theory Probab Appl., 37, 471-81
Portal, F (1988): Loi asymptotique du supremum des estimateurs a noyaux pour la densite dans le cas de variables aleatoires melangeantes. Thesis, Universite Paris-Sud
Portier, B (1992): Thesis, Universite Paris-Sud
Prakasa Rao, B.L.S (1978): Density estimation for Markov processes using delta sequences. Ann Inst Stat Math., 30, 321-8
Prakasa Rao, B.L.S (1983): Non Parametric Functional Estimation. Academic Press
Revesz, P (1978): A strong law of the empirical density function. Period Math Hung., 9, 317-24
Robinson, P.M (1984): Robust non parametric autoregression. In 'Robust and Nonlinear Time Series Analysis', J.R Franke, ed. Lect Notes Stat., 26, 247-55. Springer
Rosenblatt, M (1956): Remarks on some non parametric estimates of a density function. Ann Math Stat., 27, 832-5
Rosenblatt, M (1970): Density estimates and Markov sequences. In 'Non Parametric Techniques in Statistical Inference', M.L. Puri, ed. Cambridge University Press, 119-210
Rosenblatt, M (1979): Global measure of deviation of kernel and nearest neighbor estimates. Lect Notes Math., 757, 181-90
Rosenblatt, M (1991): Stochastic curve estimation. NSF-CBMS Regional Conf Ser Probab Stat.
Ross, G.J.S (1990): Non Linear Estimation. Springer
Roussas, G.G (1969a): Non parametric estimation of the transition distribution function of a Markov process. Ann Math Stat., 40, 1386-400
Roussas, G.G (1969b): Non parametric estimation in Markov processes. Ann Inst Math., 21, 73-87
Roussas, G.G (1988): Non parametric estimation in mixing sequences of random variables. Stat Plann Inference, 18, 135-49
Roussas, G.G (1990): Non parametric regression estimation under mixing conditions. Stochastic Processes Appl., 36, 107-16
Roussas, G.G., ed (1991): Non Parametric Functional Estimation and Related Topics. Kluwer Academic Press
Roussas, G.G and Tran, L.T (1992): Asymptotic normality of the recursive kernel regression estimate under dependence conditions. Ann Stat., 20, 98-120
Ruimgaart, F.H (1989): Strong uniform convergence of density estimators on spheres. J Stat Plann Inference, 23, 45-52
Sastry, S and Isidori, A (1989): Adaptive control of linearizable systems. IEEE Trans Autom Control, 34, 1123-31
Schucany, W.R (1989): On non parametric regression with higher-order kernels. J Stat Plann Inference, 23, 145-51
Schuster, E (1969): Estimation of a probability density function and its derivatives. Ann Math Stat., 40, 1187-95
Schuster, E and Yakowitz, S (1979): Contribution to the theory of nonparametric regression with application to system identification. Ann Stat., 7, 139-49
Scott, D (1992): Multivariate Density Estimation. Wiley
Seber, G.A and Wild, C.J (1989): Non Linear Regression. Wiley
Seminaire de Statistique d'Orsay (1990): Estimation fonctionnelle. Universite Paris-Sud
Senoussi, R (1990a): Statistique asymptotique presque sure de modeles statistiques convexes. Ann Inst Henri Poincare, Probab Stat., 26, 19-44
Senoussi, R (1990b): Problemes d'identification dans le modele de Cox. Ann Inst Henri Poincare, Probab Stat., 26, 45-64
Senoussi, R (1991): Lois du logarithme itere et identification. Thesis, Universite Paris-Sud
Silverman, B.W (1978): Weak and strong uniform consistency of kernel estimate of a density and its derivatives. Ann Stat., 6, 177-84
Silverman, B.W (1986): Density Estimation for Statistics and Data Analysis. Chapman and Hall
Stute, W (1982): A law of the iterated logarithm for kernel density estimators. Ann Probab., 10, 414-22
Subba Rao, T (1981): On the theory of bilinear time series models. J R Stat Soc., Ser. B, 43, 244-55
Subba Rao, T and Gabr, M.M (1984): An Introduction to Bispectral Analysis and Bilinear Time Series Models. Lect Notes Stat., 24. Springer
Tjostheim, D (1990): Nonlinear time series and Markov chains. Adv Appl Probab., 22, 587-611
Tjostheim, D (1994): Nonlinear time series: a selective review. Scand J Stat., 21, 97-130
Tong, H (1983): Threshold Models in Non-Linear Time Series Analysis. Lect Notes Stat., 21. Springer
Tong, H (1990): Non Linear Time Series: A Dynamical System Approach. Clarendon Press, Oxford
Tran, L.T (1990): Recursive kernel density estimators under weak dependence condition using delta sequences. Ann Inst Stat Math., 42, 305-29
Tuan, P.D (1986): The mixing properties of bilinear and random coefficient autoregressive models. Z Wahrscheinlichkeitstheorie, 64, 291-300
Walter, G and Blum, J (1979): Probability density estimation using delta sequences. Ann Stat., 7, 328-40
Watson, G.S (1964): Smooth regression analysis. Sankhya, Ser. A, 26, 359-72
Wolverton, C.T and Wagner, T.J (1969): Recursive estimates of probability densities. IEEE Trans Syst Man Cybern., 5, 246-7
Yakowitz, S (1985): Nonparametric density estimation, prediction, and regression for Markov sequences. J Am Stat Assoc., 80, 215-21

MC Markov Chains and Controlled Markov Chains

de Acosta, A (1985): Upper bounds for large deviations of dependent random vectors. Z Wahrscheinlichkeitstheorie, 69, 551-65
de Acosta, A (1988): Large deviations for vector valued functionals of a Markov chain: lower bounds. Ann Probab., 16, 925-60
Asmussen, S (1987): Applied Probability and Queues. Wiley
Asmussen, S and Keiding, N (1978): Martingale central limit theorems and asymptotic theory for multitype branching processes. Adv Appl Probab., 10, 109-29
Azema, J., Duflo, M and Revuz, D (1967): Mesure invariante sur les classes recurrentes des processus de Markov. Z Wahrscheinlichkeitstheorie, 8, 157-81
Azema, J., Duflo, M and Revuz, D (1969): Proprietes relatives des processus de Markov recurrents. Z Wahrscheinlichkeitstheorie, 13, 286-314
Bellman, R (1957): Dynamic Programming. Princeton University Press
Bellman, R (1961): Adaptive Control Processes: A Guided Tour. Princeton University Press
Bertsekas, D.P (1987): Dynamic Programming: Deterministic and Stochastic Models. Prentice-Hall
Bertsekas, D.P and Shreve, S.E (1978): Stochastic Optimal Control: The Discrete Time Case. Academic Press
Bhattacharya, R.N (1982): On the functional central limit theorem and the law of iterated logarithm for Markov processes. Z Wahrscheinlichkeitstheorie, 60, 185-201
Billingsley, P (1961): Statistical Inference for Markov Processes. University of Chicago Press
Bolthausen, E (1982): The Berry-Esseen theorem for strongly mixing Harris recurrent Markov chains. Z Wahrscheinlichkeitstheorie, 60, 283-9
Borkar, V.S (1988): A convex analytic approach to Markov decision processes. Probab Theory Relat Fields, 78, 583-603
Borkar, V.S (1989): Control of Markov chains with long run average cost criterion: the dynamic programming equations. SIAM J Control Optimization, 27, 642-57
Borkar, V.S and Varaiya, P (1979): Adaptive control of Markov chains. IEEE Trans Autom Control, 24, 953-8
Borkar, V.S and Varaiya, P (1982): Identification and adaptive control of Markov chains. SIAM J Control Optimization, 20, 470-89
Borovkov, A.A (1984): Asymptotic Methods in Queuing Theory. Wiley
Borovkov, A.A (1989): On the ergodicity and stability of the sequence w_{n+1} = f(w_n, ξ_n); application to communication networks. Theory Probab Appl., 33, 595-611
Borovkov, A.A (1990): Ergodicity and stability of multidimensional Markov chains. Theory Probab Appl., 35, 542-6
Borovkov, A.A (1991): Lyapunov functions and multidimensional Markov chains. Theory Probab Appl., 36, 1-18
Carvalho, M.L (1986): Thesis, University of Lisbon
Carvalho, M.L (1990): Un estimateur des valeurs propres de la matrice des moyennes pour un processus de Galton-Watson multitype. Technical Report, University of Lisbon
Cellier, D (1980): Methode de fission pour l'etude de la recurrence des chaines de Markov. Thesis, Universite de Rouen
Charlot, F., Chouaf, B and Guellil, A (1990): Sur les methodes de processus ponctuels et de renouvellement dans les systemes de files d'attente. Ann Sci Univ Clermont-Ferrand II, Probab Appl., 93, 13-62
Charlot, F., Guidouche, M and Hamami, M (1978): Irreductibilite et recurrence au sens de Harris des temps d'attente des G/G/q. Z Wahrscheinlichkeitstheorie, 43, 187-203
Davydov, Yu (1973): Mixing conditions for Markov chains. Theory Probab Appl., 28, 313-28
De Groot, M.H (1970): Optimal Statistical Decisions. McGraw-Hill
Dembo, A and Zeitouni, O (1993): Large Deviations Techniques and Applications. Jones and Bartlett
Derman, C (1970): Finite State Markovian Decision Processes. Academic Press
Deuschel, J.D and Stroock, D.W (1989): Large Deviations. Academic Press
Donsker, M.D and Varadhan, S.R.L (1975a): Asymptotic evaluation of certain Markov process expectations for large time, part I. Commun Pure Appl Math., 28, 1-47
Donsker, M.D and Varadhan, S.R.L (1975b): Asymptotic evaluation of certain Markov process expectations for large time, part II. Commun Pure Appl Math., 28, 279-301
Donsker, M.D and Varadhan, S.R.L (1976): Asymptotic evaluation of certain Markov process expectations for large time, part III. Commun Pure Appl Math., 29, 389-461
Donsker, M.D and Varadhan, S.R.L (1983): Asymptotic evaluation of certain Markov process expectations for large time, part IV. Commun Pure Appl Math., 36, 183-212
Doshi, B.T and Shreve, S.E (1980): Strong consistency of a modified maximum likelihood estimator for controlled Markov chains. J Appl Probab., 17, 726-34
Doukhan, P (1994): Mixing: Properties and Examples. Lect Notes Stat., 72. Springer
Doukhan, P., Leon, J and Portal, F (1987a): Principes d'invariance faible pour la mesure empirique d'une suite de variables aleatoires melangeantes. Probab Theory Relat Fields, 76, 51-70
Doukhan, P., Leon, J and Portal, F (1987b): Une mesure de la deviation quadratique d'estimateurs non parametriques. Ann Inst Henri Poincare, Probab Stat., 22, 37-66
Doukhan, P., Massart, P and Rio, E (1994): The functional central limit theorem for strongly mixing processes. Ann Inst Henri Poincare, Probab Stat., 30, 63-82
Doukhan, P., Massart, P and Rio, E (1995): Invariance principle for absolutely regular empirical processes. Ann Inst Henri Poincare, Probab Stat., 31, 393-427
Duflo, M and Revuz, D (1969): Proprietes asymptotiques des probabilites de transition des processus de Markov recurrents. Ann Inst Henri Poincare, Probab Stat., 5, 233-44
Dynkin, E.B (1965): Markov Processes. Academic Press
Dynkin, E.B and Yushkevitch, A (1979): Controlled Markov Processes. Springer (Russian: 1975)
El Fattah, Y.M and Foulard, C (1978): Learning Systems: Decision, Simulation, Control. Springer
Foguel, S.R (1969): The Ergodic Theory of Markov Processes. Van Nostrand
Freedman, D (1985): Markov Chains. Springer
Georgin, J.P (1977): Estimation et controle des chaines de Markov dependant d'un parametre. In 'Statistique des Processus Stochastiques'. Lect Notes Math., 636, 71-114. Springer
Georgin, J.P (1978): Controle de chaines de Markov sur des espaces arbitraires. Ann Inst Henri Poincare, Probab Stat., 14, 255-77
Gikhman, I.I and Skorokhod, A.V (1979): Controlled Stochastic Processes. Springer
Gordienko, E.I (1985): Adaptive strategies for certain classes of controlled Markov processes. Theory Probab Appl., 29, 504-18
Hajek, B (1982): Hitting-time and occupation-time bounds implied by drift analysis with applications. Adv Appl Probab., 14, 502-25
Hernandez-Lerma, O (1989): Adaptive Markov Control Processes. Springer
Hernandez-Lerma, O and Lasserre, J.B (1996): Discrete-Time Markov Control Processes, Basic Optimality Criteria. Springer
Hinderer, K (1970): Foundations of Nonstationary Dynamic Programming with Discrete Time Parameter. Lect Notes Oper Res., 33. Springer
Höpfner, R., Jacod, J and Ladelli, L (1990): Local asymptotic normality and mixed normality for Markov statistical models. Probab Theory Relat Fields, 86, 105-29
Howard, R.A (1960): Dynamic Programming and Markov Processes. MIT Press
Ibragimov, I.A (1975): A note on the central limit theorem for dependent random variables. Theory Probab Appl., 20, 135-41
Ibragimov, I.A and Linnik, Y.V (1971): Independent and Stationary Sequences of Random Variables. Wolters-Noordhoff Publ., Groningen
Iosifescu, M and Teodorescu, R (1969): Random Processes and Learning. Springer
Jain, N and Jamison, B (1967): Contribution to Doeblin's theory of Markov processes. Z Wahrscheinlichkeitstheorie, 8, 19-40
Jensen, J.L (1989): Asymptotic expansions for strongly mixing Harris recurrent Markov chains. Scand J Stat., Theory Appl., 16, 47-63
Kemeny, J.G., Snell, J.L and Knapp, A.W (1976): Denumerable Markov Chains. Springer
Kumar, P.R and Becker, A (1982): A new family of optimal adaptive controllers for Markov chains. IEEE Trans Autom Control, 27, 137-46
Kumar, P.R and Lin, W (1982): Optimal adaptive controllers for unknown Markov chains. IEEE Trans Autom Control, 27, 765-74
Kurano, M (1985): Average optimal adaptive policies in semi-Markov decision processes including an unknown parameter. J Oper Res Soc Japan, 28, 252-66
Kurano, M (1986): Markov decision processes with Borel measurable cost function: the average reward case. Math Oper Res., 11, 309-20
Kurano, M (1987): Learning algorithms for Markov decision processes. J Appl Probab., 24, 270-6
Lindvall, T (1992): Lectures on Coupling Method. Wiley
Maigret, N (1978): Theoreme de limite centrale pour une chaine de Markov recurrente Harris positive. Ann Inst Henri Poincare, Probab Stat., 14, 425-40
Maigret, N (1979a): Statistique des chaines controlees felleriennes. Asterisque, 68, 143-69
Maigret, N (1979b): Majoration de Chernoff et statistique sequentielle de chaines de Markov recurrentes au sens de Doeblin. Asterisque, 68, 125-42
Malinovski, V.K (1974): On limit theorems for Harris recurrent Markov chains (I). Theory Probab Appl., 31, 269-85
Malinovski, V.K (1985): On limit theorems for the number of Markov renewals. Lect Notes Math., 1155, 190-222
Malinovski, V.K (1989): On limit theorems for Harris recurrent Markov chains (II). Theory Probab Appl., 34, 252-65
Mandl, P (1973): On the adaptive control of finite state Markov processes. Z Wahrscheinlichkeitstheorie, 27, 263-76
Mandl, P (1974): Estimation and control in Markov chains. Adv Appl Probab., 6, 40-60
Mandl, P (1987): Limit theorems of probability theory and optimality in linear controlled systems with quadratic cost. Proc 5th IFIP Conf. Lect Notes Control Inf Sci., 316-29
Mandl, P and Hübner, G (1985): Transient phenomena and self-optimizing control of Markov chains. Acta Univ Carol., Math Phys., 26, 35-51
Meyn, S.P (1989): Ergodic theorems for discrete time stochastic systems using a stochastic Lyapunov function. SIAM J Control Optimization, 27, 1409-39
Meyn, S.P and Caines, P.E (1991): Asymptotic behaviour of stochastic systems possessing Markovian realizations. SIAM J Control Optimization, 29, 535-61
Meyn, S.P and Tweedie, R.L (1992): Criteria for stability of Markovian processes I: discrete time chains. Adv Appl Probab., 24, 542-74
Meyn, S.P and Tweedie, R.L (1993a): Markov Chains and Stochastic Stability. Springer
Meyn, S.P and Tweedie, R.L (1993b): Stability of Markovian processes II: continuous time processes and sampled chains. Adv Appl Probab., 25, 487-517
Meyn, S.P and Tweedie, R.L (1993c): Stability of Markovian processes III: Foster-Lyapunov criteria for continuous time processes with examples. Adv Appl Probab., 25, 518-48
Milito, R.A and Cruz, J.B (1987): An optimization oriented approach to adaptive control of Markov chains. IEEE Trans Autom Control, 32, 754-62
Mine, H and Osaki, S (1970): Markovian Decision Processes. Elsevier
Neveu, J (1972): Potentiel markovien des chaines de Markov Harris recurrentes. Ann Inst Fourier, 22, 85-130
Ney, P and Nummelin, E (1987): Markov additive processes II: large deviations. Ann Probab., 15, 593-609
Nummelin, E (1978): A splitting technique for Harris recurrent chains. Z Wahrscheinlichkeitstheorie, 43, 309-18
Nummelin, E (1984): General Irreducible Markov Chains and Nonnegative Operators. Cambridge University Press
Nummelin, E and Tweedie, R.L (1978): Geometric ergodicity and R-positivity for general Markov chains. Ann Probab., 6, 404-20
Nummelin, E and Tuominen, P (1982): Geometric ergodicity of Harris recurrent Markov chains with applications to renewal theory. Stochastic Processes Appl., 12, 187-202
Orey, S (1971): Lecture Notes on Limit Theorems for Markov Chain Transition Probabilities. Van Nostrand
Pakes, A.G (1969): Some conditions for ergodicity and recurrence of Markov chains. Oper Res., 17, 1048-61
Pitman, J.W (1974): Uniform rates of convergence for Markov transition probabilities. Z Wahrscheinlichkeitstheorie, 29, 193-227
Pitman, J.W (1976): On coupling of Markov chains. Z Wahrscheinlichkeitstheorie, 35, 315-22
Portal, F and Touati, A (1984): Theoremes de grandes deviations pour des mesures aleatoires. Z Wahrscheinlichkeitstheorie, 64, 41-60
Revuz, D (1984): Markov Chains, 2nd edn. North-Holland
Rio, E (1995): The functional law of the iterated logarithm for stationary strongly mixing sequences. Ann Probab., 23, 1188-203
Rosenblatt, M (1956): A central limit theorem and a strong mixing condition. Proc Natl Acad Sci USA, 42, 43-7
Roussas, G.G and Ioannides, D (1987): Moment inequalities for mixing sequences of random variables. Stochastic Anal Appl., 5, 61-120
Ross, S.M (1970): Applied Probability Models with Optimization Applications. Holden-Day
Ross, S.M (1983): Introduction to Stochastic Dynamic Programming. Academic Press
Schäl, M (1987): Estimation and control in discounted dynamic programming. Stochastics, 20, 51-71
Shiryayev, A.N (1978): Optimal Stopping Rules. Springer (Nauka, 1969)
Sigman, K (1990): The stability of open queuing networks. Stochastic Processes Appl., 35, 11-25
Skorokhod, A.V (1987): Topologically recurrent Markov chains: ergodic properties. Theory Probab Appl., 31, 563-71
Spitzer, F (1976): Principles of Random Walks. Van Nostrand
Striebel, C (1975): Optimal control of discrete time stochastic systems. Lect Notes Econ Math Syst., 110. Springer
Touati, A (1983): Theoremes de limite centrale fonctionnels pour les processus de Markov. Ann Inst Henri Poincare, Probab Stat., 19, 43-55
Touati, A (1988): Loi fonctionnelle du logarithme itere pour les processus de Markov recurrents. C R Acad Sci., Paris, Ser. I, 306, 339-42
Touati, A (1989): Principes d'invariance avec limites non browniennes. Thesis, Universite Paris-Nord
Tuominen, P and Tweedie, R.L (1979a): Markov chains with continuous components. Proc Lond Math Soc., III Ser., 38, 89-114
Tuominen, P and Tweedie, R.L (1979b): The recurrence structure of general Markov processes. Proc Lond Math Soc., III Ser., 39, 554-76
Tweedie, R.L (1974): R-theory for Markov chains on a general space I: solidarity properties and R-recurrent chains. Ann Probab., 2, 840-64
Tweedie, R.L (1975): Sufficient conditions for ergodicity and recurrence of Markov chains on a general state space. Stochastic Processes Appl., 3, 385-403
Tweedie, R.L (1976): Criteria for classifying Markov chains. Adv Appl Probab., 8, 737-71
Tweedie, R.L (1981): Criteria for ergodicity, exponential ergodicity and strong ergodicity of Markov processes. J Appl Probab., 18, 122-30
Tweedie, R.L (1988): Invariant measures for Markov chains with no irreducibility assumptions. J Appl Probab., 25, 275-85
Ueno, T (1957): Some limit theorems for temporally discrete Markov processes. J Fac Sci Univ Tokyo (I), 7, 449-62
Whittle, P (1982-3): Optimization over Time: Dynamic Programming and Stochastic Control, vols 1-2. Wiley

Index

absorbing set 272,281,334 adapted adaptive control 10 admissible - decision 253 - strategy 253,305 algorithm, stochastic 10,315-47 ALOHA model 83 aperiodic, aperiodicity 183,282,291 ARCH model 207 ARF model (see autoregressive model - functional) ARMA model 77,89-95,129,166-8 ARMAX - artificial neural network 86 - equation 119 - model 90,121,150-66,168-76 ARX model 78,90,148,176,263,329-30 Ascoli's Theorem 216,219,221 asymptotically stable, for an ODE 316 atom, of a Markov chain 279,296 atomic Markov chain 278-81 attractive point, strongly 327,332 autoregressive model 34,59-74,77,90,111,140-4 - functional 183,193-6,244,274,287,294 - Gaussian 138 - general 127 - general regular 127 - moving average 90 - of order 1 62 - of order p 77,90 - singular 128 average cost 135,261 averaging method 59 Bayesian regression model 139 Bezout's Theorem 120 Bochner's Lemma 228 Borel-Cantelli Theorem 20 - generalized 21 Box and Jenkins 76,77 Brownian motion 218 Burkholder-Davis-Gundy inequality 17 canonical version, of a Markov chain 185,256,270,305 Cantor measure 45,68 Cantor set 45 Cauchy distribution 9,37,58 causal, causality - for the control 169,170 - matricial polynomial 92,155 cemetery point 280
central limit theorem 7,45-8 - almost sure 116 - for Markov chains 203,295-7 centred 14 Chow's Theorem 22 closed loop control 168 communicate 273 companion matrix 91 compensator 15 consistency - pointwise 231,241 - strong 5,7 - uniform 235,236,242 - weak continuous process 216 control 10,30,251,305 (see also tracking, controlled Markov chain) controllable, controllability - ARMA model 129 - autoregressive model 62,68,72,124 controlled - iterative model 252 - Markov chain 252-65,305-15 - model 251 convergence - almost sure 40 - in distribution 39,40,216 - in probability 40 - weak 39 convolution 154 cost 260,310 coupling 274,275,301 covariance 24,64 - matrix 24 degree of excitation 100 delay - of a control 168 - operator 76,91,154 dominated model 255 Doob's Theorem 15 dosage 3,30,252 efficient algorithm 59 Egoroff's Theorem 66,74,341 empirical - δ-estimator 238 - covariance 7,134,248 - estimator 7,32,33,227 - - of the autoregressive part 168,215 - - of a regression function 238 - mean - probability distribution 181 - - regularized 228 Epanechnikov's kernel 228 estimator, estimation (see empirical, gradient, identification, kernel, least-squares estimator, likelihood, recursive) excitation - A- 100,174 - of an AR(1) model 124 - of an ARMA model 129 - inversion of 101 - of order T 100 - persistent 100,116,124 - of a sequence of vectors 100 - transfer of 100,116-32 explicative variable 75 - stable 241 - stabilized 245 explosion coefficient 101,102,125 explosive - AR(p) model 141 - ARMA model 94,167 - autoregressive model 62,68,111 - component 71 Fellerian 186 - strongly 186 filtration finite-dimensional distribution 216,217 Fisher information matrix Fourier series 152-7 G/G/1 queue 274,281 Gaussian - distribution 7,9,42 - distribution, complex 51 - regression model 139 - white noise 26 geometric - recurrence 293,299 - series 64 Gershgorin's Theorem 334 gradient - algorithm 13,337 - estimator 36,314 - - for regression models 144,145,152,176,327 - predictor 149 Hajek's criterion 273,307 Hartman-Wintner Theorem 209 Hebb 85,334 hook 45,106 identification 100,134 - of an AR(p) model 140-3 - of an ARMAX model 166-9 - nonlinear 227-50 - of a regression model 136-9 increasing process 16 initial - distribution 184,289 - state 91 invariant - measure 186,188,278 - for an ODE 316 irreducible, irreducibility - ARMAX model 119-20 - - for the control 170 - Markov chain 188,290 iterated logarithm, law of the 209-13 - uniform 220 - - expanded uniform 220 iterative model 183 - Lipschitz 199 - of order p 205 Jordan - block 70 - decomposition 70,125 kernel 228 - estimator 228-50,260,298 Kiefer-Wolfowitz - algorithm 4,31 - procedure Kohonen 85,334 - algorithm 336,338 Kolmogorov's inequalities 17 Kronecker's Lemma 19 - for matrices 99 Kushner-Clark Theorem 318 ladder index 23,103,272 law of large numbers 7,19,21 - expanded uniform 220 - for a Markov chain 279,287 - for regressive series 26,109 - - weighted 111 - for square-integrable martingales 19 - for a stable model 212 - uniform 218 - for vector martingales 104,106 leads to 272 learning 85 (see control, controlled Markov chain) least-squares estimator 34,76,136-44,151,162,214 - weighted 145,151,164 likelihood - of a controlled Markov chain 254-5 Lindeberg's condition 46,47 linear model 75,87,133 Lipschitz mixing 200-8,262,325 log-likelihood 8,255 logistic distribution 9,37,58 Lyapunov - condition 47 - function 41,189,191,227,273,307,316 maintenance 313 Markov - chain 183,184,269-304
- - controlled 251-65,305-15 - representation 196 Markovian perturbation 321 martingale 14,15 - square-integrable 15,16,45,106 - vector 106 matricial polynomial 91,154 mean 24 - empirical distribution 182,184 moment - finite, of order b 24 - of order > a 25 moving average 50,150 - process 43 multitype branching process 72 neuron 85,334 Newton's algorithm 14 Newton's estimator 36,57 noise 24,25,90 - adapted to F 25 - with a moment of order > a 25 non-atomic 44,45,68 occupation time, average 279,282 open loop control 168 optimal 135 - on average 5,32,33,135,261,310 ordinary differential equation 315 - method 317 Orey's Theorem 293,298 oscillation 216 Pakes's criterion 273,307 passive 156 passivity 152,156 Pavlov 85,334 persistent excitation 100,116,124 perturbed recursive equation 315 Poisson equation 201,203,212,262,271,293,295,311 Polish 40 positive - Hermitian matrices 95 - - sequences of 99 - - series of 97 - recurrence (see recurrence, recurrent (Markov chain)) predictable 6,135 - square variation 45,106 prediction 15,80,134 - error 134,138,247 - - for the noise 152 - - for the state of the system 152 - for a time-varying system 197 predictor 78,133,134 - for an ARMAX model 150 principal components analysis 343 quadratic cost 33,204,263 quality control 10 queue 274 r-adic random numbers 44 random walk 272,281 rate of pointwise consistency 233 recurrence, recurrent (Markov chain) 81,182,276-91 - aperiodic 183 - class 281 - geometric 293,299 - null 276 - on the open sets 189 - positive 182,276,293 - rapid 322 - Riemannian 293,299 - in the sense of Doeblin 294 - topological 189 recursive - algorithm 10 - estimation 7,35 region of attraction 316 regression - function, estimation of 238 - linear 75 - model 90,136 - multiple 75 - nonlinear 79 - with a stable explicative variable 241 - with a stabilized explicative variable 245 regressive series 24 - multidimensional 108 - negligible sets for 340 regular model - AR(p) 140 - ARMA 94 - statistical renewal theorem 282 resolvent 270,271,284,290 return time 16,272,306 Riccati equation 96 Riemannian recurrence 293,299 Robbins-Monro - algorithm 4,29,52,315 - procedure 3,30 - Theorem 29 Robbins-Siegmund Theorem 18 robustness 163 sample satellite communication 82 search - for extrema 4,13,31,337 - for a zero 11 self-convergence 144 sequential statistics 10 shift operator 271 singular - AR(p) model 141 - ARMA model 94 small set 286,310 spectral radius 59,93 split Markov chain 288 splitting 287,299,303 stable - autoregressive model 62,72 - AR(p) model 140,213,328 - ARF(1) model 194 - ARF(p) model 193 - - identification of 244 - ARMA model 93,94,166 - distribution 139 - iterative model of order p 205 - Markov chain 186 - model 182 stability 61,181-208 stabilization 61,93,189 - of a controlled Markov chain 256-60 - criteria 189-91 stabilizing control 149 state - space - variable 196 stationary - m-dependent sequence 42 - distribution 182,276 - - autoregressive model 62 - - estimation of 227-37 stock management 252,253,256,258 stopping - theorem 16 - time 16 strategy 251 - admissible 253,305 - optimal on average 261,310 - random-valued 254 - stationary 253 - translated 306 strong Markov property 271,306 submartingale 15 superharmonic 271 supermartingale 15 - exponential 209 threshold 85 - autoregressive model with 195 tight 40,217 tightness 40 - of continuous processes 216,217 time series 76 time-varying system 149 Toeplitz's Lemma 54 total variation norm 292 tracking 5,33,49,80,133,135,140-50,168-76 - by an ARMAX model 168-76
- control 78,135,147,170,197 - by regression models 140-50 - by a time-varying system 197 trajectory 216 transfer of excitation 100,116-32 - to an ARMAX model 120 transient - atom 279 - set 272 - small set 286 transition - matrix 270 - probability 184 translation - model 8,37,58 - parameter 35,57 trap 320,340-7 trend 76-7 tuning 77 two-armed bandit 5,10,32,49,213,238,252,320 - autoregressive 35 - Gaussian 35 Ueno's Theorem 294 unstable - AR(p) model 140 - ARMA model 94,166 - autoregressive model 62,72 - component 71 visual neuron 334-6 Wald's Theorem 17 wavelet 86 weighted - law of large numbers 111 - least-squares estimator (see least-squares estimator) white noise 25 window 229

Springer and the environment

At Springer we firmly believe that an international science publisher has a special obligation to the environment, and our corporate policies consistently reflect this conviction. We also expect our business partners - paper mills, printers, packaging manufacturers, etc. - to commit themselves to using materials and production processes that do not harm the environment. The paper in this book is made from low- or no-chlorine pulp and is acid free, in conformance with international standards for paper permanency.

Springer