Transition probabilities and generators
A Markov process is a stochastic process which satisfies the condition that the future depends only on the present and not on the past, i.e., for any $s_1 \le \cdots \le s_k \le t$ and any measurable sets $F_1, \ldots, F_k$, and $F$,
$$
P\{x_t \in F \mid x_{s_1} \in F_1, \ldots, x_{s_k} \in F_k\} \,=\, P\{x_t \in F \mid x_{s_k} \in F_k\}. \qquad (9)
$$
More formally, let $\mathcal{F}_s^t$ be the sub-$\sigma$-algebra of $\mathcal{F}$ generated by all events of the form $\{x_u(\omega) \in F\}$, where $F$ is a Borel set and $s \le u \le t$. A stochastic process $x_t$ is a Markov process if for all Borel sets $F$ and all $0 \le s \le t$ we have, almost surely,
$$
P\{x_t \in F \mid \mathcal{F}_0^s\} \,=\, P\{x_t \in F \mid x_s\}. \qquad (10)
$$
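As a concrete illustration of (9)-(10) (my own, not part of the original text), consider a finite-state chain with a hypothetical one-step transition matrix $P$ and initial law $\mu_0$; the sketch below checks that conditioning the law of $x_2$ on the whole past $(x_0, x_1)$ agrees with conditioning on the present $x_1$ alone.

```python
# A minimal sketch (my own illustration): a hypothetical 3-state chain with
# one-step transition matrix P and initial law mu0. Conditioning the law of
# x_2 on the whole past (x_0, x_1) gives the same result as conditioning on
# the present x_1 alone, as in (9)-(10).
import numpy as np

P = np.array([[0.5, 0.3, 0.2],
              [0.1, 0.6, 0.3],
              [0.4, 0.4, 0.2]])
mu0 = np.array([0.2, 0.5, 0.3])

a, b = 0, 2                                   # observed past x_0 = a, present x_1 = b

joint_012 = mu0[a] * P[a, b] * P[b, :]        # P(x_0=a, x_1=b, x_2=.)
cond_on_past = joint_012 / joint_012.sum()    # P(x_2=. | x_0=a, x_1=b)

joint_12 = (mu0 @ P)[b] * P[b, :]             # P(x_1=b, x_2=.)
cond_on_present = joint_12 / joint_12.sum()   # P(x_2=. | x_1=b)

assert np.allclose(cond_on_past, cond_on_present)
assert np.allclose(cond_on_present, P[b, :])
```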
We will use later an equivalent way of describing the Markov property. Let us consider three successive times $t_1 < t_2 < t_3$. The Markov property means that for any bounded measurable $g$,
$$
E[g(x_{t_3}) \mid \mathcal{F}_{t_2}^{t_2} \vee \mathcal{F}_{t_1}^{t_1}] \,=\, E[g(x_{t_3}) \mid \mathcal{F}_{t_2}^{t_2}]. \qquad (11)
$$
The time-reversed Markov property asserts that for any bounded measurable function $f$,
$$
E[f(x_{t_1}) \mid \mathcal{F}_{t_3}^{t_3} \vee \mathcal{F}_{t_2}^{t_2}] \,=\, E[f(x_{t_1}) \mid \mathcal{F}_{t_2}^{t_2}], \qquad (12)
$$
i.e., the past depends only on the present and not on the future. Both properties are equivalent, and we will show that they are equivalent to the symmetric condition
$$
E[g(x_{t_3}) f(x_{t_1}) \mid \mathcal{F}_{t_2}^{t_2}] \,=\, E[g(x_{t_3}) \mid \mathcal{F}_{t_2}^{t_2}]\, E[f(x_{t_1}) \mid \mathcal{F}_{t_2}^{t_2}], \qquad (13)
$$
which asserts that, given the present, past and future are conditionally independent.
By symmetry it is enough to prove the following lemma.
Lemma 3.1 The relations (11) and (13) are equivalent.
Proof. Let us fix $f$ and $g$, and let us set $x_{t_i} = x_i$ and $\mathcal{F}_{t_i}^{t_i} \equiv \mathcal{F}_i$ for $i = 1, 2, 3$. Let us assume that Eq. (11) holds and denote by $\hat g(x_2)$ the common value of (11). Then we have
$$
E[g(x_3) f(x_1) \mid \mathcal{F}_2]
= E\big[f(x_1)\, E[g(x_3) \mid \mathcal{F}_2 \vee \mathcal{F}_1] \,\big|\, \mathcal{F}_2\big]
= E[f(x_1)\, \hat g(x_2) \mid \mathcal{F}_2]
= E[f(x_1) \mid \mathcal{F}_2]\, \hat g(x_2)
= E[f(x_1) \mid \mathcal{F}_2]\, E[g(x_3) \mid \mathcal{F}_2], \qquad (14)
$$
which is (13). Conversely, assume that (13) holds and denote by $g(x_1, x_2)$ and $\hat g(x_2)$ the left and right sides of (11). Let $h(x_2)$ be any bounded measurable function. Then
$$
E[f(x_1) h(x_2)\, g(x_1, x_2)]
= E[f(x_1) h(x_2)\, g(x_3)]
= E\big[h(x_2)\, E[g(x_3) f(x_1) \mid \mathcal{F}_2]\big]
= E\big[h(x_2)\, E[g(x_3) \mid \mathcal{F}_2]\, E[f(x_1) \mid \mathcal{F}_2]\big]
= E\big[h(x_2)\, \hat g(x_2)\, E[f(x_1) \mid \mathcal{F}_2]\big]
= E[f(x_1) h(x_2)\, \hat g(x_2)]. \qquad (15)
$$
Since $f$ and $h$ are arbitrary, this implies that $g(x_1, x_2) = \hat g(x_2)$ a.s.
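The symmetric condition (13) can be checked by brute force on a small example (my own illustration; the 3-state matrix and the test functions $f$, $g$ are arbitrary choices): the conditional expectation of $g(x_3) f(x_1)$ given $x_2$ factorizes.

```python
# A minimal sketch (hypothetical 3-state chain) checking the
# conditional-independence identity (13):
# E[g(x_3) f(x_1) | x_2] = E[g(x_3) | x_2] * E[f(x_1) | x_2].
import numpy as np

P = np.array([[0.5, 0.3, 0.2],
              [0.1, 0.6, 0.3],
              [0.4, 0.4, 0.2]])        # one-step transition matrix
mu = np.array([0.2, 0.5, 0.3])          # law of x_1
f = np.array([1.0, -2.0, 0.5])          # f evaluated on the state space
g = np.array([0.3, 1.0, -1.0])          # g evaluated on the state space

# joint law of (x_1, x_2, x_3): joint[a, b, c] = mu[a] P[a, b] P[b, c]
joint = mu[:, None, None] * P[:, :, None] * P[None, :, :]

for b in range(3):                       # condition on the present x_2 = b
    slab = joint[:, b, :]
    slab = slab / slab.sum()             # P(x_1 = a, x_3 = c | x_2 = b)
    E_fg = np.einsum('a,ac,c->', f, slab, g)
    E_f = slab.sum(axis=1) @ f           # E[f(x_1) | x_2 = b]
    E_g = slab.sum(axis=0) @ g           # E[g(x_3) | x_2 = b]
    assert np.isclose(E_fg, E_f * E_g)
```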
A natural way to construct a Markov process is via a transition probability function
$$
P_t(x, F), \qquad t \in T, \; x \in \mathbb{R}^n, \; F \text{ a Borel set}, \qquad (16)
$$
where $(t, x) \mapsto P_t(x, F)$ is a measurable function for any Borel set $F$, and $F \mapsto P_t(x, F)$ is a probability measure on $\mathbb{R}^n$ for all $(t, x)$. One defines
$$
P\{x_t \in F \mid \mathcal{F}_0^s\} \,=\, P_{t-s}(x_s, F). \qquad (17)
$$
The finite-dimensional distributions for a Markov process starting at $x$ at time $0$ are then given by
$$
P\{x_{t_1} \in F_1, \ldots, x_{t_k} \in F_k\}
= \int_{F_1} \cdots \int_{F_{k-1}} P_{t_1}(x, dx_1)\, P_{t_2 - t_1}(x_1, dx_2) \cdots P_{t_k - t_{k-1}}(x_{k-1}, F_k). \qquad (18)
$$
By the Kolmogorov Consistency Theorem this defines a stochastic process $x_t$ for which $P\{x_0 = x\} = 1$. We denote by $P_x$ and $E_x$ the corresponding probability distribution and expectation.
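The construction (17)-(18) is also a simulation recipe: draw $x_{t_1}$ from $P_{t_1}(x, \cdot)$, then $x_{t_2}$ from $P_{t_2 - t_1}(x_{t_1}, \cdot)$, and so on. A minimal sketch, assuming for concreteness a Gaussian kernel (my choice, not specified in the text):

```python
# A sketch of the sampling recipe behind (17)-(18), assuming (my choice) the
# Gaussian kernel P_t(x, dy) = (2 pi t)^{-1/2} exp(-(y - x)^2 / (2 t)) dy.
import numpy as np

rng = np.random.default_rng(0)

def sample_path(x0, times):
    """Sample (x_{t_1}, ..., x_{t_k}) for the process started at x0, per (18)."""
    x, t_prev, path = x0, 0.0, []
    for t in times:
        # draw x_t from P_{t - t_prev}(x, dy), here a N(x, t - t_prev) law
        x = rng.normal(loc=x, scale=np.sqrt(t - t_prev))
        path.append(x)
        t_prev = t
    return np.array(path)

print(sample_path(x0=1.0, times=[0.5, 1.0, 2.0, 4.0]))
```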
One can also give an initial distribution $\pi$, where $\pi$ is a probability measure on $\mathbb{R}^n$ which describes the initial state of the system at $t = 0$. In this case the finite-dimensional probability distributions have the form
$$
P\{x_{t_1} \in F_1, \ldots, x_{t_k} \in F_k\}
= \int_{\mathbb{R}^n} \int_{F_1} \cdots \int_{F_{k-1}} \pi(dx)\, P_{t_1}(x, dx_1)\, P_{t_2 - t_1}(x_1, dx_2) \cdots P_{t_k - t_{k-1}}(x_{k-1}, F_k), \qquad (19)
$$
and we denote by $P_\pi$ and $E_\pi$ the corresponding probability distribution and expectation.
Remark 3.2. We have considered here only time-homogeneous processes, i.e., processes for which $P_x\{x_t(\omega) \in F \mid x_s(\omega)\}$ depends only on $t - s$. One can generalize this by considering transition functions $P(t, s, x, A)$.
The following property is an immediate consequence of the fact that the future depends only on the present and not on the past.
Lemma 3.3 (Chapman-Kolmogorov equation). For $0 \le s \le t$ we have
$$
P_t(x, F) \,=\, \int_{\mathbb{R}^n} P_s(x, dy)\, P_{t-s}(y, F). \qquad (20)
$$
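On a finite state space the Chapman-Kolmogorov equation (20) is just the matrix identity $P_{s+t} = P_s P_t$; here is a quick numerical check with a hypothetical generator matrix $Q$ and $P_t = e^{tQ}$ (my example):

```python
# A sketch (hypothetical generator matrix Q, my choice) of the
# Chapman-Kolmogorov relation (20) on a finite state space, where
# P_t = exp(t Q) and (20) reads P_{s+t} = P_s P_t as a matrix identity.
import numpy as np
from scipy.linalg import expm

Q = np.array([[-1.0,  0.7,  0.3],
              [ 0.2, -0.5,  0.3],
              [ 0.5,  0.5, -1.0]])   # rows sum to zero

def P(t):
    return expm(t * Q)               # transition matrix P_t

s, t = 0.4, 1.3
assert np.allclose(P(s + t), P(s) @ P(t))
```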
For a measurable function $f(x)$, $x \in \mathbb{R}^n$, we have
$$
E_x[f(x_t)] \,=\, \int_{\mathbb{R}^n} P_t(x, dy)\, f(y), \qquad (22)
$$
and we can associate to a transition probability a linear operator acting on measurable functions by
$$
T_t f(x) \,=\, E_x[f(x_t)] \,=\, \int_{\mathbb{R}^n} P_t(x, dy)\, f(y). \qquad (23)
$$
From the Chapman-Kolmogorov equation it follows immediately that $T_t$ is a semigroup: for all $s, t \ge 0$ we have
$$
T_{t+s} \,=\, T_t\, T_s. \qquad (24)
$$
We have also a dual semigroup acting on $\sigma$-finite measures on $\mathbb{R}^n$:
$$
S_t \mu(F) \,=\, \int_{\mathbb{R}^n} P_t(x, F)\, \mu(dx). \qquad (25)
$$
The semigroup $T_t$ has the following properties, which are easy to verify (see the sketch following the list).

1. $T_t$ preserves the constants: if $1(x)$ denotes the constant function, then
$$
T_t 1 \,=\, 1. \qquad (26)
$$

2. $T_t$ is positive in the sense that
$$
f \ge 0 \;\Longrightarrow\; T_t f \ge 0. \qquad (27)
$$

3. $T_t$ is a contraction semigroup on $L^\infty(dx)$, the set of bounded measurable functions equipped with the sup-norm $\|\cdot\|_\infty$:
$$
\|T_t f\|_\infty \,\le\, \|f\|_\infty. \qquad (28)
$$
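The sketch below (my illustration, same hypothetical chain as above) exercises the semigroup (23), the dual semigroup (25), and the properties (26)-(28): $T_t$ acts on functions as $e^{tQ} f$ and $S_t$ acts on measures as $\mu e^{tQ}$.

```python
# A sketch of the semigroup (23), the dual semigroup (25) and the
# properties (26)-(28) on the same hypothetical finite-state chain.
import numpy as np
from scipy.linalg import expm

Q = np.array([[-1.0,  0.7,  0.3],
              [ 0.2, -0.5,  0.3],
              [ 0.5,  0.5, -1.0]])

def T(t, f):                  # (T_t f)(x) = sum_y P_t(x, y) f(y)
    return expm(t * Q) @ f

def S(t, mu):                 # (S_t mu)(F) = sum_x mu(x) P_t(x, F)
    return mu @ expm(t * Q)

f   = np.array([2.0, -1.0, 0.5])
mu  = np.array([0.2, 0.5, 0.3])
one = np.ones(3)
t   = 0.8

assert np.allclose(T(t, one), one)                            # (26): T_t 1 = 1
assert np.all(T(t, np.abs(f)) >= 0)                           # (27): positivity
assert np.max(np.abs(T(t, f))) <= np.max(np.abs(f)) + 1e-12   # (28): contraction
assert np.isclose(S(t, mu).sum(), 1.0)                        # S_t maps probabilities to probabilities
```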
The spectral properties of the semigroup $T_t$ play an important role in the study of the long-time (ergodic) behavior of the Markov process $x_t$. In order to use tools from functional analysis it is convenient to define these semigroups on function spaces better suited to analysis than the space of all measurable functions.

The semigroup $T_t$ is called weak-Feller if it maps the set of bounded continuous functions $C_b(\mathbb{R}^n)$ into itself. If the transition probabilities $P_t(x, A)$ are stochastically continuous, i.e., $\lim_{t \to 0} P_t(x, B_\epsilon(x)) = 1$ for any $\epsilon > 0$ (where $B_\epsilon(x)$ denotes the $\epsilon$-neighborhood of $x$), then one can show that $\lim_{t \to 0} T_t f(x) = f(x)$ for any $f \in C_b(\mathbb{R}^n)$, and $T_t$ is a contraction semigroup on $C_b(\mathbb{R}^n)$.

The semigroup $T_t$ is called strong-Feller if it maps bounded measurable functions into continuous functions; this expresses a smoothing effect of the semigroup. To prove the strong-Feller property one usually shows that the transition probabilities $P_t(x, A)$ have a density
$$
P_t(x, dy) \,=\, p_t(x, y)\, dy, \qquad (29)
$$
where $p_t(x, y)$ is a sufficiently regular (e.g., continuous or differentiable) function of $x$, $y$, and maybe also of $t$. We will discuss some tools to prove such properties later.
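As an illustration of the smoothing expressed by the strong-Feller property (my example, taking for the density (29) the one-dimensional Gaussian heat kernel): applied to the discontinuous indicator $f = 1_{[0, \infty)}$, the semigroup returns the smooth function $T_t f(x) = \Phi(x/\sqrt{t})$.

```python
# A sketch of the smoothing ("strong Feller") effect, taking for the density
# (29) the one-dimensional Gaussian heat kernel (my choice): applied to the
# discontinuous indicator f = 1_{[0, inf)}, the semigroup gives the smooth
# function T_t f(x) = Phi(x / sqrt(t)).
import numpy as np
from math import erf, sqrt

def T_indicator(t, x):
    # int p_t(x, y) 1_{y >= 0} dy with p_t(x, y) = N(y; x, t)
    return 0.5 * (1.0 + erf(x / sqrt(2.0 * t)))

print([round(T_indicator(0.5, x), 4) for x in (-1.0, 0.0, 1.0)])
```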
If $T_t$ is weak-Feller we define the generator $L$ of $T_t$ by
$$
L f(x) \,=\, \lim_{t \to 0} \frac{T_t f(x) - f(x)}{t}. \qquad (30)
$$
The domain of definition of $L$ is the set of all $f$ for which the limit (30) exists for all $x$.
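For the finite-state example above, $T_t = e^{tQ}$ and the limit (30) recovers the matrix $Q$ itself; the sketch below (my illustration) shows the difference quotient $(T_t f - f)/t$ approaching $Qf$ as $t \to 0$.

```python
# A sketch of the generator (30) for the finite-state example: T_t = exp(t Q),
# so the difference quotient (T_t f - f) / t converges to Q f as t -> 0.
import numpy as np
from scipy.linalg import expm

Q = np.array([[-1.0,  0.7,  0.3],
              [ 0.2, -0.5,  0.3],
              [ 0.5,  0.5, -1.0]])
f = np.array([2.0, -1.0, 0.5])

for t in [1e-1, 1e-2, 1e-3]:
    quotient = (expm(t * Q) @ f - f) / t
    print(t, np.max(np.abs(quotient - Q @ f)))   # error shrinks like O(t)
```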
Stationary Markov processes and Ergodic Theory
We say that a stochastic process is stationary if the finite-dimensional distributions
$$
P\{x_{t_1 + h} \in F_1, \ldots, x_{t_k + h} \in F_k\} \qquad (31)
$$
are independent of $h$, for all $t_1 \le \cdots \le t_k$ and all $h \ge 0$.
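A small numerical illustration of (31) (my own, using the hypothetical chain from the earlier sketches): started in its stationary distribution $\pi$, the left null vector of $Q$, the chain has two-time laws that do not depend on the shift $h$.

```python
# A sketch of (31) for the same hypothetical chain: started in its stationary
# distribution pi (left null vector of Q), the two-time law of
# (x_{t1 + h}, x_{t2 + h}) does not depend on the shift h.
import numpy as np
from scipy.linalg import expm

Q = np.array([[-1.0,  0.7,  0.3],
              [ 0.2, -0.5,  0.3],
              [ 0.5,  0.5, -1.0]])
w, V = np.linalg.eig(Q.T)
pi = np.real(V[:, np.argmin(np.abs(w))])
pi = pi / pi.sum()

def two_time_law(h, t1=0.3, t2=1.1):
    # joint[a, b] = P_pi{x_{t1+h} = a, x_{t2+h} = b}
    law_t1 = pi @ expm((t1 + h) * Q)      # equals pi, since pi is stationary
    return law_t1[:, None] * expm((t2 - t1) * Q)

assert np.allclose(two_time_law(0.0), two_time_law(2.5))
```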
$$
E \,=\, \{\omega \,;\; x_t(\omega) \in A \text{ for all } t \in \mathbb{R}\} \qquad (50)
$$
is a nontrivial set in the invariant $\sigma$-field. What we have proved is just the converse of this statement.
The extremal points of the set of stationary distributions (i.e., of the probability measures $\pi$ with $S_t \pi = \pi$) give rise to ergodic stationary Markov processes; we call them ergodic stationary distributions. By the ergodic theorem, if $\pi$ is ergodic then
$$
\lim_{t \to \infty} \frac{1}{t} \int_0^t F(\theta^s(x_\cdot(\omega)))\, ds \,=\, E_\pi[F(x_\cdot(\omega))] \qquad (51)
$$
for $P_\pi$-almost all $\omega$. If $F(x_\cdot) = f(x_0)$ depends only on the state at time $0$ and is bounded and measurable, then we have
$$
\lim_{t \to \infty} \frac{1}{t} \int_0^t f(x_s(\omega))\, ds \,=\, \int f(x)\, d\pi(x) \qquad (52)
$$
for $\pi$-almost all $x$ and almost all $\omega$. Integrating over $\omega$ gives that
$$
\lim_{t \to \infty} \frac{1}{t} \int_0^t T_s f(x)\, ds \,=\, \int f(x)\, d\pi(x) \qquad (53)
$$
for $\pi$-almost all $x$.
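A numerical illustration of (52)-(53) (my own, same hypothetical chain): the time average of $T_s f$ converges to the stationary average $\int f \, d\pi$ as $t$ grows.

```python
# A sketch of (52)-(53) for the same hypothetical chain: the time average of
# T_s f converges to the stationary average sum_x pi(x) f(x) as t grows.
import numpy as np
from scipy.linalg import expm

Q = np.array([[-1.0,  0.7,  0.3],
              [ 0.2, -0.5,  0.3],
              [ 0.5,  0.5, -1.0]])
w, V = np.linalg.eig(Q.T)
pi = np.real(V[:, np.argmin(np.abs(w))]); pi = pi / pi.sum()
f = np.array([2.0, -1.0, 0.5])

def time_average(t, n=500):
    s = np.linspace(0.0, t, n)
    vals = np.array([expm(si * Q) @ f for si in s])   # (T_s f)(x) for each s
    return np.trapz(vals, s, axis=0) / t              # (1/t) * int_0^t T_s f(x) ds

for t in [1.0, 10.0, 100.0]:
    print(t, np.max(np.abs(time_average(t) - pi @ f)))  # -> 0 as t -> infinity
```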
The mixing property is implied by the convergence of the probability measures $P_t(x, \cdot)$ to the stationary distribution $\pi$ as $t \to \infty$. The convergence can be taken in various topologies, depending on the problem at hand. Here we consider the total variation norm: for a signed measure $\mu$ on $\mathbb{R}^n$ it is defined by $\|\mu\|_{\mathrm{t.v.}} = |\mu|(\mathbb{R}^n)$, the total mass of the total variation $|\mu|$ of $\mu$.
Clearly convergence in total variation norm implies weak convergence.
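For the same hypothetical chain one can watch the total variation distance $\|P_t(x, \cdot) - \pi\|$ decay in $t$; this is the kind of convergence assumed below to deduce mixing (my illustration):

```python
# A sketch: total variation distance between P_t(x, .) and pi for the same
# hypothetical chain, decaying as t grows; this is the convergence used
# below to deduce mixing.
import numpy as np
from scipy.linalg import expm

Q = np.array([[-1.0,  0.7,  0.3],
              [ 0.2, -0.5,  0.3],
              [ 0.5,  0.5, -1.0]])
w, V = np.linalg.eig(Q.T)
pi = np.real(V[:, np.argmin(np.abs(w))]); pi = pi / pi.sum()

def tv(mu, nu):
    return np.abs(mu - nu).sum()        # |mu - nu| of the whole state space

for t in [0.5, 2.0, 8.0]:
    Pt = expm(t * Q)
    print(t, max(tv(Pt[x], pi) for x in range(3)))   # sup_x ||P_t(x, .) - pi||
```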
Let us assume that the Markov process with transition probabilities $P_t(x, dy)$ has a stationary distribution $\pi$ and that $P_t(x, \cdot) \to \pi$ in total variation as $t \to \infty$; then the stationary process is mixing. To prove mixing it is enough to consider events $E \in \mathcal{F}_{-\infty}^{s_2}$ and $F \in \mathcal{F}_{t_1}^{\infty}$. Since $\theta^{-t}(\mathcal{F}_{-\infty}^{s}) = \mathcal{F}_{-\infty}^{s-t}$, we need to show that $P_\pi(E \cap F) \to P_\pi(E)\, P_\pi(F)$ as $k = t_1 - s_2 \to \infty$. Conditioning first on $\mathcal{F}_{-\infty}^{s_2}$ and then on $x_{t_1}$, the Markov property gives
$$
P_\pi(E \cap F) - P_\pi(E)\, P_\pi(F)
\,=\, \int_E \int_{\mathbb{R}^n} P_x\big(\theta^{-t_1} F\big)\, \big(P_k(x_{s_2}(\omega), dx) - \pi(dx)\big)\, dP_\pi(\omega), \qquad (57)
$$
from which we conclude mixing: the inner integral is bounded by $\|P_k(x_{s_2}(\omega), \cdot) - \pi\|_{\mathrm{t.v.}}$, which tends to $0$ as $k \to \infty$ for every $\omega$, so the right-hand side vanishes by dominated convergence.
A basic example of a Markov process is Brownian motion $B_t$ starting at $x$, i.e., with initial distribution given by the delta mass at $x$. Its transition probability function has the density
$$
p_t(x, y) \,=\, \frac{1}{(2\pi t)^{n/2}}\, e^{-|x - y|^2 / 2t}.
$$
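As a sanity check (my own, in one dimension), this Gaussian density satisfies the Chapman-Kolmogorov equation $\int p_s(x, z)\, p_t(z, y)\, dz = p_{s+t}(x, y)$, which can be verified by quadrature:

```python
# A sketch (one dimension, my own check): the Gaussian density above
# satisfies Chapman-Kolmogorov, int p_s(x, z) p_t(z, y) dz = p_{s+t}(x, y),
# verified here by quadrature over the intermediate point z.
import numpy as np

def p(t, x, y):
    return np.exp(-(x - y) ** 2 / (2.0 * t)) / np.sqrt(2.0 * np.pi * t)

s, t, x, y = 0.7, 1.4, 0.3, -1.1
z = np.linspace(-30.0, 30.0, 20001)
lhs = np.trapz(p(s, x, z) * p(t, z, y), z)   # int p_s(x, z) p_t(z, y) dz
rhs = p(s + t, x, y)
assert np.isclose(lhs, rhs)
```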
Then for $0 \le t_1 < t_2$