[...]

STATISTICAL PHYSICS AND PROBABILITY THEORY

... of probability theory (Secs. 2.1 to 2.3). In Section 2.4 we focus on large systems, and stress that the statistical physics approach becomes particularly meaningful in this regime. Theoretical statistical physics often deals with highly idealized mathematical models of real materials. The most interesting (and challenging) task is in fact to understand the qualitative...

... log2 r2(x) (1.13) x∈X1

The proof follows by substituting the expanded definitions of the laws $r_1$ and $r_2$. This property is interpreted as the fact that the average information associated with the choice of an event $x$ is additive, being the sum of the relative information $H(q)$ associated with a choice of subset, and the information $H(r)$ associated with the choice of the event inside the subsets (weighted...

... variables and mutual entropy

Given two random variables $X$ and $Y$, taking values in $\mathcal{X}$ and $\mathcal{Y}$, we denote their joint probability distribution as $p_{X,Y}(x, y)$, which is abbreviated as $p(x, y)$, and the conditional probability distribution for the variable $y$ given $x$ as $p_{Y|X}(y|x)$, abbreviated as $p(y|x)$. The reader should be familiar with Bayes' classical theorem:

$p(y|x) = p(x, y)/p(x)$ (1.22)

When the random...

INTRODUCTION TO INFORMATION THEORY

... From a computational point of view, the encoding procedure described above is impractical. One can build the code once and for all and store it somewhere, but this requires $O(|\mathcal{X}|^N)$ memory. On the other hand, one could reconstruct the code each time a string needs to be encoded, but this takes $O(|\mathcal{X}|^N)$ time. One can use the same code and be a bit smarter in the encoding...
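Returning to the conditional distribution of (1.22), the formula can be applied directly to a joint table. The following is a minimal sketch and is not taken from the book: the function names and the small joint distribution are hypothetical, chosen only for illustration.

```python
def marginal_x(pxy):
    """Marginal p(x) = sum_y p(x, y), for a joint stored as {(x, y): probability}."""
    px = {}
    for (x, _y), prob in pxy.items():
        px[x] = px.get(x, 0.0) + prob
    return px


def conditional_y_given_x(pxy):
    """Conditional p(y|x) = p(x, y) / p(x), as in (1.22).

    Pairs whose marginal p(x) is zero are skipped, since the ratio
    is undefined there.
    """
    px = marginal_x(pxy)
    return {(x, y): prob / px[x] for (x, y), prob in pxy.items() if px[x] > 0}


# Hypothetical joint distribution over (weather, ground); illustrative only.
pxy = {
    ("rain", "wet"): 0.3,
    ("rain", "dry"): 0.1,
    ("sun", "wet"): 0.1,
    ("sun", "dry"): 0.5,
}
cond = conditional_y_given_x(pxy)
```

For each fixed $x$, the values $p(y|x)$ returned by this sketch sum to one, which is a quick sanity check on any joint table.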
Notes

There are many textbooks introducing probability and information theory. A standard probability textbook is that of Feller (Feller, 1968). The original Shannon paper (Shannon, 1948) is universally recognized as the foundation of information theory. A very nice modern introduction to the subject is the book by Cover and Thomas (Cover and Thomas, 1991). The reader may find there a description...

... ambitious purpose of statistical physics (and, more generally, of a large branch of condensed matter physics) is to understand this variety. It aims at explaining how complex behaviors can emerge when large numbers of identical elementary components are allowed to interact. We have, for instance, experience of water in three different states (solid, liquid and gaseous). Water molecules and their interactions do...

... TO INFORMATION THEORY

This chapter introduces some of the basic concepts of information theory, as well as the definitions and notations of probabilities that will be used throughout the book. The notion of entropy, which is fundamental to the whole topic of this book, is introduced here. We also present the main questions of information theory, data compression and error correction, and ...

... $H(p) = -\sum_{x\in\mathcal{X}} p(x) \log p(x)$ (1.19)

"Info Phys Comp" Draft: November 9, 2007

Example 1.10 Let $\{X_t\}_{t\in\mathbb{N}}$ be a Markov chain with initial state $\{p_1(x)\}_{x\in\mathcal{X}}$ and transition probabilities $\{w(x \to y)\}_{x,y\in\mathcal{X}}$. Call $\{p_t(x)\}_{x\in\mathcal{X}}$ the marginal distribution of $X_t$ and assume that the following limit exists independently of the initial condition:

$p^*(x) = \lim_{t\to\infty} p_t(x)$ (1.20)

...
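The limit in (1.20) can be explored numerically by iterating the marginal, using the standard update $p_{t+1}(y) = \sum_x p_t(x)\, w(x \to y)$. The following is a minimal sketch under stated assumptions: the two-state chain, its transition probabilities, the tolerance and the function names are all hypothetical and not taken from the book.

```python
def step(p, w):
    """One time step of the marginal: p_{t+1}(y) = sum_x p_t(x) * w(x -> y)."""
    states = list(p)
    return {y: sum(p[x] * w[x][y] for x in states) for y in states}


def iterate_marginal(p1, w, tol=1e-12, max_steps=10_000):
    """Iterate from the initial distribution p1 until successive marginals
    differ by less than tol in every state, or until max_steps is reached."""
    p = dict(p1)
    for _ in range(max_steps):
        q = step(p, w)
        if max(abs(q[x] - p[x]) for x in p) < tol:
            return q
        p = q
    return p


# Hypothetical two-state chain; w[x][y] plays the role of w(x -> y).
w = {
    "a": {"a": 0.9, "b": 0.1},
    "b": {"a": 0.5, "b": 0.5},
}

# Two different initial conditions, to check that the limit does not depend on them.
p_star_from_a = iterate_marginal({"a": 1.0, "b": 0.0}, w)
p_star_from_b = iterate_marginal({"a": 0.0, "b": 1.0}, w)
```

For this particular chain the balance condition $p^*(a) \cdot 0.1 = p^*(b) \cdot 0.5$ gives $p^* = (5/6, 1/6)$, and both starting points approach it. A chain for which the limit does not exist (for example a strictly periodic one) would not satisfy the assumption of Example 1.10, and this loop would simply stop at `max_steps`.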
... E[p(x)/q(x)] = 0

The KL divergence $D(q\|p)$ thus looks like a distance between the probability distributions $q$ and $p$, although it is not symmetric. The importance of the entropy, and its use as a measure of information, derives from the following properties:

1. $H_X \ge 0$.
2. $H_X = 0$ if and only if the random variable $X$ is certain, which means that $X$ takes one value with probability one.
3. Among all probability...

... uncertainty of $x$ due to the knowledge of $y$, and is symmetric in $x$, $y$.

Proposition 1.11 $I_{X,Y} \ge 0$. Moreover $I_{X,Y} = 0$ if and only if $X$ and $Y$ are independent variables.

Proof: Write $-I_{X,Y} = E_{x,y} \log_2 \frac{p(x)p(y)}{p(x,y)}$. Consider the random variable $u = (x, y)$ with probability distribution $p(x, y)$. As the logarithm is a concave function (i.e. $-\log$ is a convex function), one applies Jensen's inequality (1.6). This ...

... of information theory, data compression and error correction, and state Shannon's theorems.

1.1 Random variables

The main object of this book will be the behavior of large sets of discrete random ...

... and monotonicity can be used to define axiomatically the entropy.

1.3 Sequences of random variables and entropy rate

In many situations of interest one deals with a random ...
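The entropy properties, the asymmetry of the KL divergence and Proposition 1.11 can be checked numerically on small examples. The following is a minimal sketch and not the book's code. It assumes logarithms in base 2 and the standard definitions $D(q\|p) = \sum_x q(x) \log_2 [q(x)/p(x)]$ and $I_{X,Y} = \sum_{x,y} p(x,y) \log_2 [p(x,y)/(p(x)p(y))]$, which are not spelled out in the excerpt above; all function names and example distributions are hypothetical.

```python
from math import log2


def entropy(p):
    """H(p) = -sum_x p(x) log2 p(x), with the convention 0 * log 0 = 0."""
    return -sum(px * log2(px) for px in p.values() if px > 0)


def kl_divergence(q, p):
    """D(q||p) = sum_x q(x) log2(q(x) / p(x)).

    Assumes p(x) > 0 wherever q(x) > 0; otherwise the divergence is infinite
    and this sketch would raise an error.
    """
    return sum(qx * log2(qx / p[x]) for x, qx in q.items() if qx > 0)


def mutual_information(pxy):
    """I_{X,Y} = sum_{x,y} p(x,y) log2( p(x,y) / (p(x) p(y)) )."""
    px, py = {}, {}
    for (x, y), prob in pxy.items():
        px[x] = px.get(x, 0.0) + prob
        py[y] = py.get(y, 0.0) + prob
    return sum(
        prob * log2(prob / (px[x] * py[y]))
        for (x, y), prob in pxy.items()
        if prob > 0
    )


# Entropy: zero for a certain variable, one bit for a fair coin.
h_certain = entropy({"heads": 1.0, "tails": 0.0})
h_fair = entropy({"heads": 0.5, "tails": 0.5})

# KL divergence: non-negative, but D(q||p) and D(p||q) generally differ.
q = {"a": 0.5, "b": 0.5}
p = {"a": 0.9, "b": 0.1}
d_qp = kl_divergence(q, p)
d_pq = kl_divergence(p, q)

# Mutual information: zero for independent variables, positive otherwise.
independent = {(0, 0): 0.25, (0, 1): 0.25, (1, 0): 0.25, (1, 1): 0.25}
correlated = {(0, 0): 0.5, (1, 1): 0.5}
i_independent = mutual_information(independent)
i_correlated = mutual_information(correlated)
```

A few examples do not prove Proposition 1.11, of course; the sketch only illustrates the statement that Jensen's inequality establishes in general.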