PART III
CHAPTER 11
The nature of statistical inference
11.1 Introduction
In the discussion of descriptive statistics in Part I it was argued that, in order to be able to go beyond the mere summarisation and description of the observed data under consideration, it was important to develop a mathematical model purporting to provide a generalised description of the data generating process (DGP). Motivated by the various results on frequency curves, a probability model in the form of the parametric family of density functions Φ = {f(x; θ), θ ∈ Θ}, and its various ramifications, was formulated in Part II, providing such a mathematical model. Along with the formulation of the probability model Φ various concepts and results were discussed in order to enable us to extend and analyse the model, preparing the way for statistical inference to be considered in the sequel. Before we go on to consider that, however, it is important to understand the difference between the descriptive study of data and statistical inference. As suggested above, the concept of a density function, in terms of which the probability model is defined, was motivated by the concept of a frequency curve. It is obvious that any density function f(x; θ) can be used as a frequency curve by reinterpreting it as a non-stochastic function of the observed data. This precludes any suggestion that the main difference between the descriptive study of data and statistical inference proper lies with the use of density functions in describing the observed data. 'What is the main difference then?'
In descriptive statistics the aim is to summarise and describe the data under consideration, and frequency curves provide us with a convenient way to do that. The choice of a frequency curve is based entirely on the data in hand. On the other hand, in statistical inference a probability model Φ is
postulated a priori as a generalised description of the underlying DGP giving rise to the observed data (not the observed data themselves). Indeed, there is nothing stochastic about the set of numbers making up the data. The stochastic element is introduced into the framework in the form of uncertainty relating to the underlying DGP, and the observed data are
viewed as one of the many possible realisations. In descriptive statistics we start with the observed data and seek a frequency curve which describes these data as closely as possible. In statistical inference we postulate a probability model Φ a priori, which purports to describe either the DGP giving rise to the data or the population from which the observed data came. These constitute fundamental departures from descriptive statistics, allowing us to make generalisations beyond the numbers in hand. This being the case, the analysis of observed data in statistical inference proper will take a very different form as compared with descriptive statistics as briefly
considered in Part I. In order to see this let us return to the income data discussed in Chapter 2. There we considered the summarisation and description of personal income data on 23 000 households using descriptors like the mean, median, mode, variance, the histogram and the frequency curve. These enabled us to get some idea about the distribution of incomes among these households. The discussion ended with us speculating about the possibility of finding an appropriate frequency curve which depends on a few parameters, enabling us to describe the data and analyse them in a much more convenient way. In Section 4.3 we suggested that the parametric family of density functions of the Pareto distribution
Φ = { f(x; θ) = (θ/4500)(4500/x)^(θ+1), x ≥ 4500, θ ∈ Θ }   (11.1)
could provide a reasonable probability model for incomes over £4500. As can be seen, there is only one unknown parameter θ which, once specified, determines f(x; θ) completely. In the context of statistical inference we postulate Φ a priori as a stochastic model, not of the data in hand, but of the distribution of income of the population from which the observed data constitute one realisation, i.e. the UK households. Clearly, there is nothing wrong with using f(x; θ) as a frequency curve in the context of descriptive statistics by returning to the histogram of these data and, after plotting f(x; θ) for various values of θ, say θ = 1, 1.5, 2, choosing the one which comes closest to the frequency polygon. For the sake of the argument let us assume that the curve chosen is the one for θ = 1.5, i.e.

f*(x) = (1.5/4500)(4500/x)^2.5,  x ≥ 4500.   (11.2)
This provides us with a very convenient descriptor of these data, as can easily be seen when compared with the cumbersome histogram function

f̂(x) = φᵢ/(xᵢ₊₁ − xᵢ),  x ∈ [xᵢ, xᵢ₊₁)   (11.3)
(see Chapter 2). But it is no more than a convenient descriptor of the data in hand. For example, we cannot make any statements about the distribution of personal income in the UK on the basis of the frequency curve f*(x). In order to do that we need to consider the problem in the context of statistical inference proper. By postulating Φ above as a probability model for the distribution of income in the UK, and interpreting the observed data as a sample from the population under study, we could go on to consider questions about the unknown parameter θ as well as further observations from the probability model; see Section 11.4 below.
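To make this descriptive procedure concrete, the following Python sketch (our illustration, not from the text; numpy and matplotlib are assumed available, and the data are simulated stand-ins for the household incomes) overlays the Pareto density (11.1) on a histogram for θ = 1, 1.5, 2, mimicking the visual choice of a frequency curve described above.

    import numpy as np
    import matplotlib.pyplot as plt

    def pareto_density(x, theta, x0=4500.0):
        # f(x; theta) = (theta/x0) * (x0/x)**(theta + 1) for x >= x0, as in (11.1)
        return (theta / x0) * (x0 / x) ** (theta + 1)

    rng = np.random.default_rng(0)
    # numpy's pareto draws z such that (1 + z) is Pareto-distributed with scale 1,
    # so 4500 * (1 + z) has support [4500, infinity) and shape parameter 1.5.
    incomes = 4500.0 * (1.0 + rng.pareto(1.5, size=1000))

    grid = np.linspace(4500.0, 30000.0, 400)
    plt.hist(incomes, bins=60, range=(4500.0, 30000.0), density=True,
             alpha=0.4, label="histogram")
    for theta in (1.0, 1.5, 2.0):
        plt.plot(grid, pareto_density(grid, theta), label="theta = %.1f" % theta)
    plt.legend()
    plt.show()

Whichever θ brings the plotted curve closest to the histogram would be chosen as the frequency curve; the point of the discussion above is that this choice, however made, remains a description of the data in hand.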
In Section 11.2 the important concept of a sampling model is introduced
as a way to link the probability model postulated, say Φ = {f(x; θ), θ ∈ Θ}, to the observed data x = (x₁, ..., xₙ)′ available. The sampling model provides the second important ingredient needed to define a statistical model, the starting point of any 'parametric' statistical inference.
In Section 11.3, armed with the concept of a statistical model, we go on to discuss a particular approach to statistical inference, known as the frequency approach. The frequency approach is briefly contrasted with another important approach to statistical inference, the Bayesian.
A brief overview of statistical inference is considered in Section 11.4 as a prelude to the discussion of the next three chapters. The most important concept in statistical inference is that of a statistic, which is discussed in Section 11.5. This concept and its distribution provide the cornerstone for estimation, testing and prediction.
11.2 The sampling model
As argued above, the probability model Φ = {f(x; θ), θ ∈ Θ} constitutes a very important component of statistical inference. Another important element in the same context is what we call a sampling model, which provides the link between the probability model and the observed data. It is designed to model the relationship between them and refers to the way the observed data can be viewed in relation to Φ. In order to be able to formulate sampling models we need to define formally the concept of a sample.
Definition 1
A sample is defined to be a set of random variables (r.v.'s) (X₁, X₂, ..., Xₙ) whose density functions coincide with the 'true' density function f(x; θ) as postulated by the probability model.
Note that the term sample has a very precise meaning in this context, and it is not the meaning attributed to it in everyday language. In particular, the term does not refer to any observed data, as the everyday use of the term might suggest.
The significance of the concept becomes apparent when we learn that the observed data in this context are considered to be one of the many possible realisations of the sample. In this interpretation lies the inductive argument of statistical inference which enables us to extend the results based on the observed data in hand to the underlying mechanism giving rise to them. Hence the observed data in this context are no longer just a set of numbers we want to make some sense of; they represent a particular outcome of an experiment, the experiment as defined by the sampling model postulated to complement the probability model Φ = {f(x; θ), θ ∈ Θ}.
Given that a sample is a set of r.v.'s related to Φ, it must have a distribution, which we call the distribution of the sample.
Definition 2
The distribution of the sample X = (X₁, ..., Xₙ)′ is defined to be the joint distribution of the r.v.'s X₁, ..., Xₙ, denoted by

f(x₁, ..., xₙ; θ) ≡ f(x; θ).
The distribution of the sample incorporates both forms of relevant information, the probability as well as the sample information. It must come as no surprise to learn that f(x; θ) plays a very important role in statistical inference. The form of f(x; θ) depends crucially on the nature of the sampling model as well as Φ. The simplest but most widely used form of a sampling model is the one based on the idea of a random experiment ℰ (see Chapter 3) and is called a random sample.
Definition 3
A set of r.v.'s (X₁, X₂, ..., Xₙ) is said to be a random sample from f(x; θ) if the r.v.'s are independent and identically distributed (IID). In this case the distribution of the sample takes the form
f(x₁, x₂, ..., xₙ; θ) = ∏ᵢ₌₁ⁿ f(xᵢ; θ) = [f(x; θ)]ⁿ,
the first equality due to independence and the second to the fact that the r.v.'s are identically distributed.
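To see the factorisation at work, here is a minimal Python sketch (our construction, assuming the Pareto model (11.1)) which evaluates the distribution of a random sample as the product of the marginal densities, computed in logs for numerical stability.

    import math

    def log_pareto_density(x, theta, x0=4500.0):
        # log f(x; theta) for the Pareto model (11.1), defined for x >= x0
        return math.log(theta / x0) + (theta + 1.0) * math.log(x0 / x)

    def log_density_of_random_sample(xs, theta):
        # IID case: f(x1, ..., xn; theta) = prod_i f(x_i; theta),
        # so the log of the distribution of the sample is a sum of identical terms.
        return sum(log_pareto_density(x, theta) for x in xs)

    sample = [5200.0, 4800.0, 7600.0, 11300.0]   # a hypothetical realisation x
    print(log_density_of_random_sample(sample, theta=1.5))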
One of the important ingredients of a random experiment ℰ is that the experiment can be repeated under identical conditions. This enables us to construct a random sample by repeating the experiment n times. Such a procedure of constructing a random sample might suggest that this is feasible only when experimentation is possible. Although there is some truth in this presupposition, the concept of a random sample is also used in cases where the experiment can be repeated under identical conditions, if only conceptually. In order to see this let us consider the personal income example where Φ represents a Pareto family of density functions. 'What is a random sample in this case?' If we can ensure that every household in the UK has the same chance of being selected in one performance of a conceptual experiment, then we can interpret the n households selected as representing a random sample (X₁, X₂, ..., Xₙ) and their incomes (the observed data) as being a realisation of the sample. In general we denote the sample by X = (X₁, ..., Xₙ)′ and its realisation by x = (x₁, ..., xₙ)′, where x is assumed to take values in the observation space 𝒳, i.e. x ∈ 𝒳; usually 𝒳 = ℝⁿ.
A less restrictive form of a sampling model is what we call an independent sample, where the identically distributed condition in the random sample is relaxed.
Definition 4
A set of r.v.'s (X₁, ..., Xₙ) is said to be an independent sample from f(xᵢ; θᵢ), i = 1, 2, ..., n, respectively, if the r.v.'s X₁, ..., Xₙ are independent. In this case the distribution of the sample takes the form

f(x₁, x₂, ..., xₙ; θ) = ∏ᵢ₌₁ⁿ f(xᵢ; θᵢ).   (11.4)
Usually the density functions f(xᵢ; θᵢ), i = 1, 2, ..., n, belong to the same family, but their numerical characteristics (moments, etc.) may differ.
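For an independent but non-identically-distributed sample the product in (11.4) runs over different densities; a minimal sketch (our example) for normal r.v.'s sharing a mean but with observation-specific variances:

    import math

    def log_normal_density(x, mean, var):
        return -0.5 * (math.log(2.0 * math.pi * var) + (x - mean) ** 2 / var)

    def log_density_independent_sample(xs, mu, variances):
        # Independent, non-ID case (11.4): f(x1, ..., xn; theta) = prod_i f(x_i; theta_i),
        # here theta_i = (mu, sigma_i^2) differs across i through the variance only.
        return sum(log_normal_density(x, mu, v) for x, v in zip(xs, variances))

    print(log_density_independent_sample([0.5, 1.2, -0.3], mu=0.0,
                                         variances=[1.0, 2.0, 0.5]))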
If we relax the independence assumption as well, we have what we can call a non-random sample.
Definition 5
A set of r.v.'s (X₁, ..., Xₙ) is said to be a non-random sample from f(x₁, ..., xₙ; θ) if the r.v.'s X₁, ..., Xₙ are non-IID. In this case the only decomposition of the distribution of the sample possible is

f(x₁, x₂, ..., xₙ; θ) = ∏ᵢ₌₁ⁿ f(xᵢ | x₁, ..., xᵢ₋₁; θᵢ),   (11.5)

given X₀, where f(xᵢ | x₁, ..., xᵢ₋₁; θᵢ), i = 1, 2, ..., n, represent the conditional distribution of Xᵢ given X₁, X₂, ..., Xᵢ₋₁.
A non-random sample is clearly the most general of the sampling models considered above and includes the independent and random samples as special cases, given that

f(xᵢ | x₁, ..., xᵢ₋₁; θᵢ) = f(xᵢ; θᵢ),  i = 1, 2, ..., n,   (11.6)
when X₁, ..., Xₙ are independent r.v.'s. Its generality, however, renders the concept non-operational unless certain restrictions are imposed on the heterogeneity and dependence among the Xᵢ's. Such restrictions have been extensively discussed in Sections 8.2–3. In Part IV the restrictions often used are stationarity and asymptotic independence.
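A minimal sketch of the decomposition (11.5) under such restrictions (our example, not the book's): for a stationary normal AR(1) process each conditional density in the product depends on the past only through the previous observation, so the joint density builds up sequentially.

    import math

    def log_normal_density(x, mean, var):
        return -0.5 * (math.log(2.0 * math.pi * var) + (x - mean) ** 2 / var)

    def log_density_ar1(xs, rho, sigma2):
        # Stationary AR(1): X_i = rho * X_{i-1} + e_i, e_i ~ N(0, sigma2), |rho| < 1.
        # f(x1, ..., xn) = f(x1) * prod_{i>1} f(x_i | x_{i-1}), a special case of (11.5).
        logf = log_normal_density(xs[0], 0.0, sigma2 / (1.0 - rho ** 2))  # marginal of X_1
        for prev, curr in zip(xs[:-1], xs[1:]):
            logf += log_normal_density(curr, rho * prev, sigma2)          # conditionals
        return logf

    print(log_density_ar1([0.3, 0.1, -0.2, 0.4], rho=0.5, sigma2=1.0))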
In the context of statistical inference we need to postulate both a probability and a sampling model, and thus we define a statistical model as comprising both.
Definition 6
A statistical model is defined as comprising
(i) a probability model Φ = {f(x; θ), θ ∈ Θ}; and
(ii) a sampling model X = (X₁, X₂, ..., Xₙ)′.
The concept of a statistical model provides the starting point of all forms of statistical inference to be considered in the sequel. To be more precise, the concept of a statistical model forms the basis of what is known as parametric inference. There is also a branch of statistical inference known as non-parametric inference, where no Φ is assumed a priori (see Gibbons (1971)). Non-parametric statistical inference is beyond the scope of this book.
It must be emphasised at the outset that the two important components of a statistical model are closely interrelated; in particular, the probability model cannot be postulated as Φ = {f(x; θ), θ ∈ Θ} if the sample X is non-random. This is because if the r.v.'s X₁, ..., Xₙ are not independent, the probability model must be defined in terms of their joint distribution, i.e. Φ = {f(x₁, ..., xₙ; θ), θ ∈ Θ}. Moreover, in the case of an independent but not identically distributed sample we need to specify the individual density functions for each r.v. in the sample, i.e. Φ = {f(xₖ; θₖ), θₖ ∈ Θ, k = 1, 2, ..., n}. The most important implication of this relationship is that when the sampling model postulated is found to be inappropriate, the probability model has to be respecified as well. Several examples of this are encountered in Chapters 21 to 23.
11.3 The frequency approach
In developing the concept of a probability model in Part II it was argued that no interpretation of probability was needed. The whole structure was built upon the axiomatic approach, which defined probability as a set function P(·): ℱ → [0, 1] satisfying various axioms and devoid of any interpretation (see Section 3.2). In statistical inference, however, the interpretation of the notion of probability is indispensable. The discerning reader will have noted that in the above introductory discussion we have already adopted a particular attitude towards the meaning of probability. In interpreting the observed data as one of many possible realisations of the DGP as represented by the probability model, we have committed ourselves to the frequency interpretation of probability. This is because we implicitly assumed that if we were to repeat the experiment under identical conditions indefinitely (i.e. with the number of observations going to infinity) we would be able to reconstruct the probability model Φ. In the case of the income example discussed above, this amounts to assuming that if we were to observe everybody's income and plot the relative frequency curve for incomes over £4500 we would get a Pareto density function. This suggests that the frequency approach to statistical inference can be viewed as a natural extension of the descriptive study of data with the introduction of the concept of a probability model. In practice we never have an infinity of observations with which to recover the probability model completely, and hence caution should be exercised in interpreting the results of the frequency-approach-based statistical methods which we consider in the sequel. These results depend crucially on the probability model, which we interpret as referring to a situation where we keep on repeating the experiment to infinity. This suggests that the results should be interpreted in terms of criteria related to this 'long-run' interpretation.
[Fig 11.1 here: the probability model Φ = {f(x; θ), θ ∈ Θ} and the sampling model X = (X₁, X₂, ..., Xₙ)′ combine into the distribution of the sample f(x₁, x₂, ..., xₙ; θ), which links Φ to the observed data x = (x₁, x₂, ..., xₙ)′.]
Fig 11.1 The frequentist approach to statistical inference
Hence, it is important to keep this in mind when reading the following chapters on criteria for optimal estimators, tests and predictors.
The various approaches to statistical inference based on alternative interpretations of the notion of probability differ mainly in relation to what constitutes relevant information for statistical inference and how it should be processed. In the case of the frequency approach (sometimes called the classical approach) the relevant information comes in the form of a probability model Φ = {f(x; θ), θ ∈ Θ} and a sampling model X = (X₁, X₂, ..., Xₙ)′, providing the link between Φ and the observed data x = (x₁, x₂, ..., xₙ)′. The observed data are in effect interpreted as a realisation of the sampling model, i.e. X = x. This relevant information is then processed via the distribution of the sample f(x₁, x₂, ..., xₙ; θ) (see Fig 11.1).
The 'subjective' interpretation of probability, on the other hand, leads to a different approach to statistical inference. This is commonly known as the Bayesian approach because the discussion is based on revising prior beliefs about the unknown parameters θ in the light of the observed data using Bayes' formula. The prior information about θ comes in the form of a probability distribution f(θ); that is, θ is assumed to be a random variable. The revision of the prior f(θ) comes in the form of the posterior distribution f(θ|x) via Bayes' formula:
f(θ|x) = f(x|θ)f(θ)/f(x) ∝ f(x|θ)f(θ),   (11.7)
given the observed data X = x. For more details and an excellent discussion of the frequency and Bayesian approaches to statistical inference see Barnett (1973). In what follows we concentrate exclusively on the frequency approach.
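To fix ideas about (11.7), here is a small Python sketch (our illustration) for the conjugate Beta-Bernoulli case, where the revision of the prior can be written down in closed form.

    # Beta(a, b) prior for theta and a Bernoulli sample with k successes in n trials:
    # f(theta|x) is proportional to f(x|theta) f(theta)
    #           = theta**(k + a - 1) * (1 - theta)**(n - k + b - 1),
    # i.e. the posterior is Beta(a + k, b + n - k).
    def posterior_beta_bernoulli(a, b, data):
        k, n = sum(data), len(data)
        return a + k, b + n - k          # parameters of the posterior Beta

    data = [1, 0, 1, 1, 0, 1]            # the observed realisation x
    a_post, b_post = posterior_beta_bernoulli(1.0, 1.0, data)   # uniform prior
    print(a_post, b_post)                # Beta(5, 3)
    print(a_post / (a_post + b_post))    # posterior mean of theta: 0.625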
11.4 An overview of statistical inference
As defined above, the simplest form of a statistical model comprises:
(i) a probability model Φ = {f(x; θ), θ ∈ Θ}; and
(ii) a sampling model X = (X₁, X₂, ..., Xₙ)′ — a random sample.
Using this simple statistical model, let us attempt a brief overview of statistical inference before we consider the various topics individually, in order to keep the discussion which follows in perspective. The statistical model in conjunction with the observed data enables us to consider the following questions:
(1) Are the observed data consistent with the postulated statistical model? (misspecification)
(2) Assuming that the statistical model postulated is consistent with the observed data, what can we infer about the unknown parameters θ ∈ Θ?
(a) Can we decrease the uncertainty about θ by reducing the parameter space from Θ to Θ₁, where Θ₁ is a subset of Θ? (confidence estimation)
(b) Can we decrease the uncertainty about θ by choosing a particular value in Θ, say θ̂, as providing the most representative value of θ? (point estimation)
(c) Can we consider the question of whether θ belongs to some subset Θ₀ of Θ? (hypothesis testing)
(3) Assuming that a particular representative value θ̂ of θ has been chosen, what can we infer about further observations from the DGP as described by the postulated statistical model? (prediction)
The above questions describe the main areas of statistical inference. Comparing these questions with the ones we could ask in the context of descriptive statistics, we can easily appreciate the role of probability theory in statistical inference.
The second question posed above (the first question is considered in the appendix below) assumes that the statistical model postulated is 'valid' and considers various forms of inference relating to the unknown parameters θ.
Point estimation (or just estimation) refers to our attempt to give a numerical value to θ. This entails constructing a mapping h(·): 𝒳 → Θ (see Fig 11.2).
Fig 11.2 Point estimation
Fig 11.3 Interval estimation
The mapping h(·) defines an estimator h(X) of θ, and its value h(x) at the observed data is called an estimate of θ. Chapters 12 and 13 on point estimation deal with the issues of defining and constructing 'optimal' estimators, respectively.
Confidence estimation refers to the construction of a numerical region for θ in the form of a subset Θ₁ of Θ (see Fig 11.3). Again, confidence estimation comes in the form of a multivalued (one-to-many) function g(·): 𝒳 → Θ.
Hypothesis testing, on the other hand, relates to some a priori statement
about θ of the form H₀: θ ∈ Θ₀, Θ₀ ⊂ Θ, against some opposite statement H₁: θ ∉ Θ₀ or, equivalently, θ ∈ Θ₁, with Θ₀ ∩ Θ₁ = ∅ and Θ₀ ∪ Θ₁ = Θ. In a situation like this we need to devise a rule which tells us when to accept H₀ as 'valid' or reject H₀ as 'invalid' in view of the observed data. Using the postulated partition of Θ into Θ₀ and Θ₁ we need, in some sense, to construct a mapping q(·): 𝒳 → Θ whose inverse image induces the partition

q⁻¹(Θ₀) = C₀ — acceptance region,
q⁻¹(Θ₁) = C₁ — rejection region,
Fig 11.4 Hypothesis testing
The decision to accept H₀ as a valid hypothesis about θ, or to reject H₀ as an invalid hypothesis about θ, in view of the observed data, will be based on whether the observed data x belong to the acceptance or rejection region respectively, i.e. x ∈ C₀ or x ∈ C₁ (see Chapter 14). Hypothesis testing can also be used to consider the question of the appropriateness of the probability model postulated. Apart from the direct test based on the empirical cumulative distribution function (see Appendix 11.1), we can use indirect tests based on characterisation theorems. For example, if a particular parametric family is characterised by the form of its first three moments, then we can construct a test based on these. For several characterisation results related to the normal distribution see Mathai and Pederzoli (1977). Similarly, hypothesis testing can be used to assess the appropriateness of the sampling model as well (see Chapter 22).
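As a minimal sketch of such a mapping (our construction, anticipating the normal model of Example 1 below with σ assumed known), the statistic √n(x̄ − μ₀)/σ partitions the observation space into acceptance and rejection regions; the 1.96 cut-off corresponds to a 0.05 significance level, a choice justified in Chapter 14.

    import math

    def test_mean(xs, mu0, sigma, cutoff=1.96):
        # q(x) = sqrt(n) * (xbar - mu0) / sigma; x lies in C0 iff |q(x)| <= cutoff.
        n = len(xs)
        xbar = sum(xs) / n
        q = math.sqrt(n) * (xbar - mu0) / sigma
        return ("accept H0" if abs(q) <= cutoff else "reject H0"), q

    decision, q = test_mean([1.2, 0.8, 1.5, 0.9, 1.1], mu0=1.0, sigma=0.5)
    print(decision, q)

Note that collecting, for each μ₀, the samples for which H₀ is accepted inverts the mapping into a confidence region for μ, which is the link between hypothesis testing and confidence estimation.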
As far as question (3) is concerned, we need to construct a mapping ℓ(·): Θ → 𝒳 which will provide us with further values of X not belonging to the sample X, for a given value of θ.
11.5 Statistics and their distributions
As can be seen from the bird's-eye view of statistical inference considered in the previous section, the problem is essentially one of constructing some mapping of the form

q(·): 𝒳 → Θ,   (11.8)
or its inverse, which satisfies certain criteria (restrictions) depending on the nature of the problem. Because of their importance in what follows, such mappings are given a name of their own.
Definition 7
A statistic is said to be any Borel function (see Chapter 6) q(·): 𝒳 → ℝ.
Note that q(·) does not depend on any unknown parameters.
Estimators, confidence intervals, rejection regions and predictors are all statistics which are directly related to the distribution of the sample. 'Statistics' are themselves random variables (r.v.'s), being Borel functions of r.v.'s or random vectors, and they have their own distributions. The discussion of criteria for optimum 'statistics' is largely in terms of their distributions.
Two important examples of statistics which we will encounter on numerous occasions in what follows are

X̄ₙ = (1/n) ∑ᵢ₌₁ⁿ Xᵢ,  called the sample mean,   (11.9)

and

s² = (1/(n − 1)) ∑ᵢ₌₁ⁿ (Xᵢ − X̄ₙ)²,  called the sample variance.   (11.10)

On the other hand, the functions

h(X) = (1/n) ∑ᵢ₌₁ⁿ (Xᵢ/σ)   (11.11)

and

l(X) = (1/(n − 1)) ∑ᵢ₌₁ⁿ (Xᵢ − μ)²   (11.12)

are not statistics unless σ² and μ are known, respectively.
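In programming terms (a sketch of ours; the exact form of (11.11) as rendered above is a reconstruction), the distinction is that (11.9) and (11.10) are functions of the sample alone, whereas (11.11) and (11.12) need the unknown σ or μ as an extra argument, which is precisely why they fail to be statistics.

    def sample_mean(xs):                       # (11.9): a statistic
        return sum(xs) / len(xs)

    def sample_variance(xs):                   # (11.10): a statistic
        xbar = sample_mean(xs)
        return sum((x - xbar) ** 2 for x in xs) / (len(xs) - 1)

    def h(xs, sigma):                          # (11.11): needs the unknown sigma
        return sum(x / sigma for x in xs) / len(xs)

    def l(xs, mu):                             # (11.12): needs the unknown mu
        return sum((x - mu) ** 2 for x in xs) / (len(xs) - 1)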
The concept of a statistic can be generalised to a vector-valued function of the form

q(·): 𝒳 → Θ ⊆ ℝᵐ,  m ≥ 1.   (11.13)
As with any random variable, any discussion relating to the nature of q(X) must be in terms of its distribution. Hence, it must come as no surprise to learn that statistical inference to a considerable extent depends critically on our ability to determine the distribution of a statistic q(X) from that of X = (X₁, X₂, ..., Xₙ)′, and determining such distributions is one of the most difficult problems in probability theory, as Chapter 6 clearly exemplified. In that chapter we discussed various ways to derive the distribution function of Y = q(X) when the distribution of X is known, and several results were derived. The reader can now appreciate the reason he or she had to put up with some rather involved examples. All the results derived in that chapter will form the backbone of the discussion that follows. The discerning reader will have noted that most of these results are related to simple functions q(X) of normally distributed r.v.'s X₁, X₂, ..., Xₙ. It turns out that most of the results in this area are related to this simple case. Because of the importance of the normal distribution, however, these results can take us a long way down 'statistical inference avenue'. Let us restate some of these results in terms of the statistics X̄ₙ and s² for reference purposes.
Example 1
Consider the following statistical model:
(i) Φ = { f(x; θ) = (1/(σ√(2π))) exp[−½((x − μ)/σ)²], θ ≡ (μ, σ²) ∈ ℝ × ℝ₊, x ∈ ℝ };
(ii) X ≡ (X₁, X₂, ..., Xₙ)′, a random sample from f(x; θ).

For the statistics X̄ₙ and s² defined in (11.9) and (11.10), the following results hold:

(i) X̄ₙ ~ N(μ, σ²/n);   (11.15)
(ii) √n(X̄ₙ − μ)/σ ~ N(0, 1);   (11.16)
(iii) ((n − 1)s²)/σ² ~ χ²(n − 1);   (11.17)
(iv) X̄ₙ and s² are independent   (11.18)
and Cov(X̄ₙ, s²) = 0;   (11.19)
(v) √n(X̄ₙ − μ)/s ~ t(n − 1);   (11.20)
(vi) (s²/σ²)/(t²ₘ/τ²) ~ F(n − 1, m − 1),   (11.21)

where t²ₘ is the corresponding sample variance of a random sample (Z₁, Z₂, ..., Zₘ) from N(μ₂, τ²) and s², t²ₘ are independent.
All these results follow from Lemmas 6.1-6.4 of Section 6.3 where the normal, chi-square, Student’s t and F distributions are related
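These results are easy to corroborate by simulation; the following sketch (our illustration, numpy assumed) draws many random samples from N(μ, σ²) and checks the moments implied by (11.15), (11.17) and (11.19).

    import numpy as np

    rng = np.random.default_rng(0)
    mu, sigma, n, reps = 1.0, 2.0, 10, 100_000

    samples = rng.normal(mu, sigma, size=(reps, n))
    xbar = samples.mean(axis=1)                 # realisations of the sample mean
    s2 = samples.var(axis=1, ddof=1)            # realisations of the sample variance

    print(xbar.mean(), xbar.var())              # approx mu and sigma**2 / n, as in (11.15)
    print(((n - 1) * s2 / sigma ** 2).mean())   # approx n - 1, the chi-square mean, (11.17)
    print(np.cov(xbar, s2)[0, 1])               # approx 0, as in (11.19)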
Using the distribution of q(X), when known, as in the above cases, we can consider questions relating to the nature of this statistic, such as whether it provides a 'good' (to be defined in the sequel) or a 'bad' estimator or test statistic. Once this is decided we can go on to make probabilistic statements about θ, the 'true' parameter of Φ, which is what statistical inference is largely about. The question which naturally arises at this point is: 'What happens if we cannot determine the distribution of the statistic q(X)?' Obviously, without a distribution for q(X) no statistical inference is possible, and thus it is imperative to 'solve' the problem of the distribution somehow. In such cases the asymptotic theory developed in Chapters 9 and 10 comes to our rescue by offering us 'second best' solutions in the form of approximations to the distribution of q(X). In Chapter 9 we discussed
various results related to the asymptotic behaviour of the statistic X̄ₙ, such as:

(i) X̄ₙ → μ almost surely;  (ii) X̄ₙ → μ in probability;  and  (iii) √n((X̄ₙ − μ)/σ) → Z ~ N(0, 1) in distribution,   (11.22)

irrespective of the original distribution of the Xᵢ's, given only that E(Xᵢ) = μ and Var(Xᵢ) = σ² < ∞; note that E(X̄ₙ) = μ. In Chapter 10 these results were extended to more general functions h(X), in particular to continuous functions of the sample raw moments
mᵣ = (1/n) ∑ᵢ₌₁ⁿ Xᵢʳ,  r ≥ 1.   (11.23)

In relation to mᵣ it was shown that in the case of a random sample:

(i) mᵣ → μᵣ′ almost surely;  (ii) mᵣ → μᵣ′ in probability;  and  (iii) √n((mᵣ − μᵣ′)/σᵣ) → Zᵣ ~ N(0, 1) in distribution,   (11.24)

for

μᵣ′ = ∫ xʳ f(x) dx,  with E(mᵣ) = μᵣ′,  σᵣ² = μ₂ᵣ′ − (μᵣ′)²,  r ≥ 1,   (11.25)

assuming that μ₂ᵣ′ < ∞.
It turns out that in practice the statistics q(X) of interest are often functions of these sample moments. Examples of such continuous functions of the sample raw moments are the sample central moments, defined by
m̂ᵣ = (1/n) ∑ᵢ₌₁ⁿ (Xᵢ − X̄ₙ)ʳ,  r > 1.   (11.26)
These provide us with a direct extension of the sample variance, and they represent the sample equivalents of the central moments

μᵣ = ∫ (x − μ)ʳ f(x) dx.   (11.27)
With the help of asymptotic theory we could generalise the above asymptotic results related to mᵣ, r ≥ 1, to those of Yₙ = q(X), where q(·) is a continuous function of the sample moments.
Asymptotic results related to Yₙ = q(X) can be used when the distribution of Yₙ is not available (or is very difficult to use). Although there are many ways to obtain asymptotic results in particular cases, it is often natural to proceed by following the pattern suggested by the limit theorems in Chapter 9:
Step 1
Under certain conditions Yₙ = q(X) can be shown to converge, in probability or almost surely, to some function h(θ) of θ, i.e.

Yₙ → h(θ) in probability, or Yₙ → h(θ) almost surely.   (11.30)

Step 2
Construct two sequences {hₙ(θ), cₙ(θ), n ≥ 1} such that

Yₙ* = (Yₙ − hₙ(θ))/cₙ(θ) → Z ~ N(0, 1) in distribution.   (11.31)

Let F∞(y*) denote the asymptotic distribution of Yₙ*; then for large n

Fₙ(y*) ≈ F∞(y*),   (11.32)
and F∞(y*) can be used as the basis of any inference relating to Yₙ = q(X). A question which naturally comes to mind is how large n should be to justify the use of these results. Commonly no answer is available, because the answer would involve the derivation of Fₙ(·), whose unavailability was the very reason we had to resort to asymptotic theory. In certain cases higher-order approximations based on asymptotic expansions can throw some light on this question (see Chapter 10). In general, caution should be exercised when asymptotic results are used for relatively small values of n, say n < 100.
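The two-step recipe can be illustrated by simulation (our sketch, numpy assumed): for exponential data with unit mean the second raw moment m₂ has μ₂′ = 2 and asymptotic variance μ₄′ − (μ₂′)² = 24 − 4 = 20, so standardising as in (11.24)(iii) should give approximately standard normal behaviour.

    import numpy as np
    from math import sqrt

    rng = np.random.default_rng(1)
    n, reps = 200, 50_000

    x = rng.exponential(1.0, size=(reps, n))
    m2 = (x ** 2).mean(axis=1)               # sample second raw moment, as in (11.23)

    mu2_prime = 2.0                          # E(X**2) for Exp(1)
    sigma_2 = sqrt(24.0 - 4.0)               # sqrt(mu4' - (mu2')**2) for Exp(1)
    z = sqrt(n) * (m2 - mu2_prime) / sigma_2 # standardised as in (11.24)(iii)

    print((z > 1.645).mean())                # close to 0.05 under the N(0, 1) limit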
Appendix 11.1 — The empirical distribution function
Let X ≡ (X₁, X₂, ..., Xₙ)′ be a random sample, and define the r.v.'s
Zᵢ = 1 if Xᵢ ∈ (−∞, x], and Zᵢ = 0 otherwise,  x ∈ ℝ;

then the empirical distribution function is F*(x) = (1/n) ∑ᵢ₌₁ⁿ Zᵢ. If the original distribution postulated in Φ is F(x), a reasonable thing to do is to compare it with F*(x). For example, consider the distance
Dₙ = max over x ∈ ℝ of |F*(x) − F(x)|.
Dₙ as defined is a mapping of the form Dₙ(·): 𝒳 → [0, 1], where 𝒳 is the observation space. Given that each Zᵢ has a Bernoulli distribution, F*(x), being (1/n times) the sum of Z₁, Z₂, ..., Zₙ, is binomially distributed, i.e.

Pr(F*(x) = k/n) = [n!/(k!(n − k)!)] [F(x)]^k [1 − F(x)]^(n−k),  k = 0, 1, ..., n,

where E(F*(x)) = F(x) and Var(F*(x)) = (1/n)F(x)[1 − F(x)]. Using the central limit theorem (see Section 9.3) we can show that
√n (F*(x) − F(x)) / √(F(x)[1 − F(x)]) → Z ~ N(0, 1) in distribution.
Using this asymptotic result it can be shown that √n Dₙ → Y in distribution, where Y has distribution function

F_Y(y) = 1 − 2 ∑ₖ₌₁^∞ (−1)^(k−1) exp(−2k²y²),  y > 0.
This asymptotic distribution of √n Dₙ can be used to test the validity of Φ; see Section 21.2.
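A minimal sketch of Dₙ in Python (our construction): since F*(x) is a step function, the supremum is attained at the order statistics, where both the left and right limits of F*(x) must be compared with the postulated F(x).

    import math

    def empirical_distance(xs, F):
        # D_n = sup over x of |F*(x) - F(x)|, evaluated at the order statistics
        xs = sorted(xs)
        n = len(xs)
        return max(max(abs((i + 1) / n - F(x)), abs(i / n - F(x)))
                   for i, x in enumerate(xs))

    F_exp = lambda x: 1.0 - math.exp(-x)     # a postulated exponential(1) c.d.f.
    data = [0.3, 1.2, 0.7, 2.5, 0.1, 0.9]
    Dn = empirical_distance(data, F_exp)
    print(Dn, math.sqrt(len(data)) * Dn)     # sqrt(n) * D_n for use with the limit above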
Important concepts
Sample, the distribution of the sample, sampling model, random sample, independent sample, non-random sample, observation space, statistical model, empirical distribution function, point estimation, confidence estimation, hypothesis testing, a statistic, sample mean, sample variance, sample raw moments, sample central moments, the distribution of a statistic, the asymptotic distribution of a statistic
Questions
Contrast f(x; θ) as a descriptor of observed data with f(x; θ) as a member of a parametric family of density functions.
Explain the concept of a sampling model and discuss its relationship to the probability model and the observed data.
Compare the sampling models:
(i) random sample;
(ii) independent sample;
(iii) non-random sample;
and explain the form of the distribution of the sample in each case.
Explain the concept of the empirical distribution function.
'Estimation and hypothesis testing is largely a matter of constructing mappings of the form q(·): 𝒳 → Θ.' Discuss.
Explain why a statistic is a random variable.
Ensure that you understand the results (11.15)–(11.21) (see Appendix 6.1).
'Being able to derive the distribution of statistics of interest is largely what statistical inference is all about.' Discuss.
Discuss the concept of a statistical model.
Exercises
Using the results (11.22)–(11.29), show that for a random sample X from a distribution whose first four moments exist,

s² → σ² in probability, and √n((s² − σ²)/√(μ₄ − μ₂²)) → Z ~ N(0, 1) in distribution,

where μ₂ and μ₄ denote the second and fourth central moments.
Additional references