The lecture notes consist of 9 chapters, covering the instrumental variable method, non-spherical errors, vector autoregression (VAR), monetary policy in VAR systems, microfoundations of monetary policy models, solving linear expectational difference equations, a menu of different policy rules, and estimation of New Keynesian models.
Lecture Notes in Empirical Macroeconomics (MiQEF, MSc course at UNISG)
Paul Söderlind
January 2005 (with some corrections done later)
University of St Gallen. Address: s/bf-HSG, Rosenbergstrasse 52, CH-9000 St Gallen, Switzerland. E-mail: Paul.Soderlind@unisg.ch. Document name: EmpMacroAll.TeX

Contents

1 Instrumental Variable Method
  1.1 Consistency of Least Squares or Not?
  1.2 Reason for IV: Measurement Errors
  1.3 Reason for IV: Lagged Dependent Variable + Autocorrelated Shocks
  1.4 Reason for IV: Simultaneous Equations Bias (and Inconsistency)
  1.5 Definition of the IV Estimator - Consistency of IV
  1.6 Hausman’s Specification Test
  1.7 Tests of Overidentifying Restrictions in 2SLS∗
2 Non-Spherical Errors
  2.1 Summary of Least Squares
  2.2 Heteroskedasticity
  2.3 Autocorrelation
  2.4 Variance of a Sample Average (more details)
  2.5 The Newey-West Estimator
  2.6 Summary
3 Vector Autoregression (VAR)
  3.1 Estimation
  3.2 Canonical Form
  3.3 Moving Average Form and Stability
  3.4 Granger Causality
  3.5 Forecasts Forecast Error Variance
  3.6 Forecast Error Variance Decompositions∗
  3.7 Structural VARs
  3.8 Cointegration, Common Trends, and Identification via Long-Run Restrictions∗
4 Monetary Policy in VAR Systems
  4.1 VAR System, Structural Form, and Impulse Response Function
  4.2 Fully Recursive Structural Form
  4.3 Some Controversies
  4.4 Summary of Some Important Results from VAR Studies of Monetary Policy∗
5 Microfoundations of Monetary Policy Models
  5.1 Dynamic Models of Sticky Prices
  5.2 Aggregate Demand
  5.3 Recent Models for Studying Monetary Policy

1 Instrumental Variable Method

Reference: Greene (2003) 5.4–6 and 15.1–2
Additional references: Hayashi (2000) 3.1–4; Verbeek (2004) 5.1–4; Hamilton (1994) 8.2; and Pindyck and Rubinfeld (1998)

1.1 Consistency of Least Squares or Not?
Contents (continued)

A Summary of Solution Methods for Linear RE Models
7 Solving Linear Expectational Difference Equations
  7.1 The Model
  7.2 Matrix Decompositions
  7.3 Solving
  7.4 Time Series Representation∗
8 A Menu of Different Policy Rules
  8.1 A “Simple” Policy Rule
  8.2 Optimal Policy under Commitment
  8.3 Discretionary Solution
9 Estimation of New Keynesian Models
  9.1 “New Keynesian Economics and the Phillips Curve” by Roberts
  9.2 “Solution and Estimation of RE Macromodels with Optimal Monetary Policy” by Söderlind
  9.3 “Estimating The Euler Equation for Output” by Fuhrer and Rudebusch
  9.4 “New-Keynesian Models and Monetary Policy: A Reexamination of the Stylized Facts” by Söderström et al

Consider the linear model
\[ y_t = x_t'\beta_0 + u_t, \tag{1.1} \]
where $y_t$ and $u_t$ are scalars, $x_t$ a $k \times 1$ vector, and $\beta_0$ is a $k \times 1$ vector of the true coefficients. The least squares estimator is
\begin{align}
\hat{\beta}_{LS} &= \left( \frac{1}{T}\sum_{t=1}^{T} x_t x_t' \right)^{-1} \frac{1}{T}\sum_{t=1}^{T} x_t y_t \tag{1.2} \\
&= \beta_0 + \left( \frac{1}{T}\sum_{t=1}^{T} x_t x_t' \right)^{-1} \frac{1}{T}\sum_{t=1}^{T} x_t u_t, \tag{1.3}
\end{align}
where we have used (1.1) to substitute for $y_t$. The probability limit is
\[ \operatorname{plim} \hat{\beta}_{LS} - \beta_0 = \left( \operatorname{plim} \frac{1}{T}\sum_{t=1}^{T} x_t x_t' \right)^{-1} \operatorname{plim} \frac{1}{T}\sum_{t=1}^{T} x_t u_t. \tag{1.4} \]
In many cases the law of large numbers applies to both terms on the right hand side. The first term is typically a matrix with finite elements and the second term is the covariance of the regressors and the true residuals. This covariance must be zero for LS to be consistent.

1.2 Reason for IV: Measurement Errors

Suppose the true model is
\[ y_t^* = x_t^{*\prime}\beta_0 + u_t^*. \tag{1.5} \]
Data on $y_t^*$ and $x_t^*$ is not directly observable, so we instead run the regression
\[ y_t = x_t'\beta + u_t, \tag{1.6} \]
where $y_t$ and $x_t$ are proxies for the correct variables (the ones that the model is true for). We can think of the difference as measurement errors
\begin{align}
y_t &= y_t^* + v_t^y \text{ and} \tag{1.7} \\
x_t &= x_t^* + v_t^x, \tag{1.8}
\end{align}
where the errors are uncorrelated with the true values and the “true” residual $u_t^*$.

Use (1.7) and (1.8) in (1.5)
\begin{align}
y_t - v_t^y &= \left( x_t - v_t^x \right)'\beta_0 + u_t^* \text{ or} \notag \\
y_t &= x_t'\beta_0 + \varepsilon_t \text{ where } \varepsilon_t = -v_t^{x\prime}\beta_0 + v_t^y + u_t^*. \tag{1.9}
\end{align}
Suppose that $x_t^*$ is measured with error. From (1.8) we see that $v_t^x$ and $x_t$ are correlated, so LS on (1.9) is inconsistent in this case. To make things even worse, measurement errors in only one of the variables typically affect all the coefficient estimates.

To illustrate the effect of the error, consider the case when $x_t$ is a scalar. Then, the probability limit of the LS estimator of $\beta$ in (1.9) is
\begin{align}
\operatorname{plim} \hat{\beta}_{LS} &= \operatorname{Cov}(y_t, x_t) / \operatorname{Var}(x_t) \notag \\
&= \operatorname{Cov}\left( x_t^*\beta_0 + u_t^*, x_t \right) / \operatorname{Var}(x_t) \notag \\
&= \operatorname{Cov}\left( x_t\beta_0 - v_t^x\beta_0 + u_t^*, x_t \right) / \operatorname{Var}(x_t) \notag \\
&= \frac{\operatorname{Cov}(x_t\beta_0, x_t) + \operatorname{Cov}\left( -v_t^x\beta_0, x_t \right) + \operatorname{Cov}\left( u_t^*, x_t \right)}{\operatorname{Var}(x_t)} \notag \\
&= \frac{\operatorname{Var}(x_t)}{\operatorname{Var}(x_t)}\beta_0 + \frac{\operatorname{Cov}\left( -v_t^x\beta_0, x_t^* + v_t^x \right)}{\operatorname{Var}(x_t)} \notag \\
&= \beta_0 - \beta_0 \operatorname{Var}\left( v_t^x \right) / \operatorname{Var}(x_t) \notag \\
&= \beta_0 \left[ 1 - \frac{\operatorname{Var}\left( v_t^x \right)}{\operatorname{Var}\left( x_t^* \right) + \operatorname{Var}\left( v_t^x \right)} \right], \tag{1.10}
\end{align}
since $x_t^*$ and $v_t^x$ are uncorrelated with $u_t^*$ and with each other. This shows that $\hat{\beta}_{LS}$ goes to zero as the measurement error becomes relatively more volatile compared with the true value. This makes a lot of sense, since when the measurement error is very large then the regressor $x_t$ is dominated by noise that has nothing to do with the dependent variable.

Suppose instead that only $y_t^*$ is measured with error. This is not a big problem since this measurement error is uncorrelated with the regressor, so the consistency of least squares is not affected. In fact, a measurement error in the dependent variable is like increasing the variance in the residual.

1.3 Reason for IV: Lagged Dependent Variable + Autocorrelated Shocks

Anything that causes correlation between the residuals and the regressor will make LS inconsistent. One instance is a model with a lagged dependent variable as regressor and autocorrelated shocks. To illustrate this, consider the simple ARMA(1,1)
\[ y_t = \rho y_{t-1} + u_t, \text{ where } u_t = \varepsilon_t + \theta\varepsilon_{t-1}, \tag{1.11} \]
where $|\rho| < 1$ and $\varepsilon_t$ is iid white noise. It is clear that $\operatorname{Cov}(y_{t-1}, u_t) \neq 0$ if $\theta \neq 0$. To be precise, we have
\begin{align}
\operatorname{Cov}(y_{t-1}, u_t) &= \operatorname{Cov}\left( \rho y_{t-2} + \varepsilon_{t-1} + \theta\varepsilon_{t-2}, \varepsilon_t + \theta\varepsilon_{t-1} \right) \notag \\
&= \theta \operatorname{Var}(\varepsilon_{t-1}). \tag{1.12}
\end{align}
Results from a Monte Carlo experiment are shown in Figure 1.1.

Figure 1.1: Distribution of LS estimator of the autoregressive parameter. Panels: Distribution of LS estimator, T=200 (mean and std: 0.88, 0.03); Distribution of LS estimator, T=900 (mean and std: 0.89, 0.01). True model: $y_t = \rho y_{t-1} + u_t$, where $u_t = \varepsilon_t + \theta\varepsilon_{t-1}$, with $\rho = 0.8$, $\theta = 0.5$ and $\varepsilon_t$ iid N(0,2). Estimated model: $y_t = \rho y_{t-1} + u_t$.

1.4 Reason for IV: Simultaneous Equations Bias (and Inconsistency)

Suppose economic theory tells you that the structural form of the $m$ endogenous variables, $y_t$, and the $k$ predetermined (exogenous) variables, $z_t$, is
\[ F y_t + G z_t = u_t, \text{ where } u_t \text{ is iid with } \operatorname{E} u_t = 0 \text{ and } \operatorname{Cov}(u_t) = \Sigma, \tag{1.13} \]
where $F$ is $m \times m$, and $G$ is $m \times k$. The disturbances are assumed to be uncorrelated with the predetermined variables, $\operatorname{E}(z_t u_t') = 0$.

Suppose $F$ is invertible. Solve for $y_t$ to get the reduced form
\begin{align}
y_t &= -F^{-1} G z_t + F^{-1} u_t \tag{1.14} \\
&= \Pi z_t + \varepsilon_t, \text{ with } \operatorname{Cov}(\varepsilon_t) = \Omega. \tag{1.15}
\end{align}
The reduced form coefficients, $\Pi$, can be consistently estimated by LS on each equation since the exogenous variables $z_t$ are uncorrelated with the reduced form residuals (which are linear combinations of the structural residuals). The fitted residuals can then be used to get an estimate of the reduced form covariance matrix.

The $j$th line of the structural form (1.13) can be written
\[ F_j y_t + G_j z_t = u_{jt}, \tag{1.16} \]
where $F_j$ and $G_j$ are the $j$th rows of $F$ and $G$, respectively. Suppose the model is normalized so that the coefficient on $y_{jt}$ is one (otherwise, divide (1.16) by this coefficient). Then, rewrite (1.16) as
\begin{align}
y_{jt} &= -G_j z_t - \tilde{F}_j \tilde{y}_t + u_{jt} \notag \\
&= x_t'\beta + u_{jt}, \text{ where } x_t' = \left[ z_t', \tilde{y}_t' \right], \tag{1.17}
\end{align}
where $\tilde{y}_t$ are the endogenous variables except $y_{jt}$ (and $\tilde{F}_j$ the corresponding coefficients). We collect $z_t$ and $\tilde{y}_t$ in the $x_t$ vector to highlight that (1.17) looks like any other linear regression equation. The problem with (1.17), however, is that the residual is likely to be correlated with the regressors, so the LS estimator is inconsistent. The reason is that a shock to $u_{jt}$ influences $y_{jt}$, which in turn will affect some other endogenous variables in the system (1.13). If any of these endogenous variables are in $x_t$ in (1.17), then there is a correlation between the residual and (some of) the regressors.

Note that the concept of endogeneity discussed here only refers to contemporaneous endogeneity as captured by off-diagonal elements in $F$ in (1.13). The vector of predetermined variables, $z_t$, could very well include lags of $y_t$ without affecting the econometric endogeneity problem.

Example 1 (Supply and Demand. Reference: Hamilton 9.1.) Consider the simplest simultaneous equations model for supply and demand on a market. Supply is
\[ q_t = \gamma p_t + u_t^s, \quad \gamma > 0, \]
and demand is
\[ q_t = \beta p_t + \alpha A_t + u_t^d, \quad \beta < 0, \]
where $A_t$ is an observable demand shock (perhaps income). The structural form is therefore
\[ \begin{bmatrix} 1 & -\gamma \\ 1 & -\beta \end{bmatrix} \begin{bmatrix} q_t \\ p_t \end{bmatrix} + \begin{bmatrix} 0 \\ -\alpha \end{bmatrix} A_t = \begin{bmatrix} u_t^s \\ u_t^d \end{bmatrix}. \]
The reduced form is
\[ \begin{bmatrix} q_t \\ p_t \end{bmatrix} = \begin{bmatrix} \pi_{11} \\ \pi_{21} \end{bmatrix} A_t + \begin{bmatrix} \varepsilon_{1t} \\ \varepsilon_{2t} \end{bmatrix}. \]
If we knew the structural form, then we could solve for $q_t$ and $p_t$ to get the reduced form in terms of the structural parameters
\[ \begin{bmatrix} q_t \\ p_t \end{bmatrix} = \begin{bmatrix} -\frac{\gamma}{\beta-\gamma}\alpha \\ -\frac{1}{\beta-\gamma}\alpha \end{bmatrix} A_t + \begin{bmatrix} \frac{\beta}{\beta-\gamma} & -\frac{\gamma}{\beta-\gamma} \\ \frac{1}{\beta-\gamma} & -\frac{1}{\beta-\gamma} \end{bmatrix} \begin{bmatrix} u_t^s \\ u_t^d \end{bmatrix}. \]

Example 2 (Supply equation with LS.)
Suppose we try to estimate the supply equation in Example 1 by LS, that is, we run the regression
\[ q_t = \theta p_t + \varepsilon_t. \]
If data is generated by the model in Example 1, then the reduced form shows that $p_t$ is correlated with $u_t^s$, so we cannot hope that LS will be consistent. In fact, when both $q_t$ and $p_t$ have zero means, then the probability limit of the LS estimator is
\begin{align*}
\operatorname{plim} \hat{\theta} &= \frac{\operatorname{Cov}(q_t, p_t)}{\operatorname{Var}(p_t)} \\
&= \frac{\operatorname{Cov}\left( \frac{\gamma\alpha}{\gamma-\beta} A_t + \frac{\gamma}{\gamma-\beta} u_t^d - \frac{\beta}{\gamma-\beta} u_t^s, \; \frac{\alpha}{\gamma-\beta} A_t + \frac{1}{\gamma-\beta} u_t^d - \frac{1}{\gamma-\beta} u_t^s \right)}{\operatorname{Var}\left( \frac{\alpha}{\gamma-\beta} A_t + \frac{1}{\gamma-\beta} u_t^d - \frac{1}{\gamma-\beta} u_t^s \right)},
\end{align*}
where the second line follows from the reduced form. Suppose the supply and demand shocks are uncorrelated. In that case we get
\begin{align*}
\operatorname{plim} \hat{\theta} &= \frac{\frac{\gamma\alpha^2}{(\gamma-\beta)^2} \operatorname{Var}(A_t) + \frac{\gamma}{(\gamma-\beta)^2} \operatorname{Var}\left( u_t^d \right) + \frac{\beta}{(\gamma-\beta)^2} \operatorname{Var}\left( u_t^s \right)}{\frac{\alpha^2}{(\gamma-\beta)^2} \operatorname{Var}(A_t) + \frac{1}{(\gamma-\beta)^2} \operatorname{Var}\left( u_t^d \right) + \frac{1}{(\gamma-\beta)^2} \operatorname{Var}\left( u_t^s \right)} \\
&= \frac{\gamma\alpha^2 \operatorname{Var}(A_t) + \gamma \operatorname{Var}\left( u_t^d \right) + \beta \operatorname{Var}\left( u_t^s \right)}{\alpha^2 \operatorname{Var}(A_t) + \operatorname{Var}\left( u_t^d \right) + \operatorname{Var}\left( u_t^s \right)}.
\end{align*}
First, suppose the supply shocks are zero, $\operatorname{Var}(u_t^s) = 0$. Then $\operatorname{plim} \hat{\theta} = \gamma$, so we indeed estimate the supply elasticity, as we wanted. Think of a fixed supply curve, and a demand curve which moves around. These points of $p_t$ and $q_t$ should trace out the supply curve. It is clearly $u_t^s$ that causes a simultaneous equations problem in estimating the supply curve: $u_t^s$ affects both $q_t$ and $p_t$ and the latter is the regressor in the supply equation. With no movements in $u_t^s$ there is no correlation between the shock and the regressor. Second, suppose instead that both demand shocks are zero (both $A_t = 0$ and $\operatorname{Var}(u_t^d) = 0$). Then $\operatorname{plim} \hat{\theta} = \beta$, so the estimated value is not the supply, but the demand elasticity. Not good. This time, think of a fixed demand curve, and a supply curve which moves around.

Example 3 (A flat demand curve.) Suppose we change the demand curve in Example 1 to be infinitely elastic, but to still have demand shocks. For instance, the inverse demand curve could be $p_t = \psi A_t + u_t^D$. In this case, supply and demand is no longer a simultaneous system of equations and both equations could be estimated consistently with LS. In fact, the system is recursive, which is easily seen by writing the system in vector form
\[ \begin{bmatrix} 1 & 0 \\ -\gamma & 1 \end{bmatrix} \begin{bmatrix} p_t \\ q_t \end{bmatrix} + \begin{bmatrix} -\psi \\ 0 \end{bmatrix} A_t = \begin{bmatrix} u_t^D \\ u_t^s \end{bmatrix}. \]
A supply shock, $u_t^s$, affects the quantity, but this has no effect on the price (the regressor in the supply equation), so there is no correlation between the residual and regressor in the supply equation. A demand shock, $u_t^D$, affects the price and the quantity, but since quantity is not a regressor in the inverse demand function (only the exogenous $A_t$ is) there is no correlation between the residual and the regressor in the inverse demand equation either.

1.5 Definition of the IV Estimator - Consistency of IV

Consider the linear model
\[ y_t = x_t'\beta_0 + u_t, \tag{1.18} \]
where $y_t$ is a scalar, $x_t$ a $k \times 1$ vector, and $\beta_0$ is a vector of the true coefficients. If we suspect that $x_t$ and $u_t$ in (1.18) are correlated, then we may use the instrumental variables (IV) method. To do that, let $z_t$ be a $k \times 1$ vector of instruments (as many instruments as regressors; we will later deal with the case when we have more instruments than regressors).
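The probability limit derived for the LS estimate of the supply equation in the supply-and-demand example can be checked numerically. The following is a minimal sketch, not part of the original notes: the function name `simulate_supply_ls`, the default parameter values, and the choice of normally distributed shocks are all assumptions made for illustration. It draws data from the reduced form, runs LS of $q_t$ on $p_t$ without a constant (all variables have zero means), and compares the estimate with the closed-form limit.

```python
import numpy as np

def simulate_supply_ls(gamma=1.0, beta=-1.0, alpha=1.0,
                       var_a=1.0, var_d=1.0, var_s=1.0,
                       T=200_000, seed=0):
    """Return (LS estimate of the supply slope, its theoretical plim).

    Hypothetical helper: parameter values and normal shocks are
    illustrative assumptions, not values from the lecture notes.
    """
    rng = np.random.default_rng(seed)
    A = rng.normal(0.0, np.sqrt(var_a), T)    # observable demand shifter
    u_d = rng.normal(0.0, np.sqrt(var_d), T)  # unobservable demand shock
    u_s = rng.normal(0.0, np.sqrt(var_s), T)  # supply shock

    # Reduced form: equate supply and demand and solve for the price,
    # then get the quantity from the supply equation.
    p = (alpha * A + u_d - u_s) / (gamma - beta)
    q = gamma * p + u_s

    # LS of q on p without a constant (zero-mean variables).
    theta_hat = (p @ q) / (p @ p)

    # Closed-form probability limit for uncorrelated shocks.
    plim_theta = ((gamma * alpha**2 * var_a + gamma * var_d + beta * var_s)
                  / (alpha**2 * var_a + var_d + var_s))
    return theta_hat, plim_theta
```

Calling `simulate_supply_ls()` with the default (assumed) values gives a limit of 1/3, well below the supply slope of 1, while `simulate_supply_ls(var_s=0.0)` recovers the supply slope, in line with the discussion of the case without supply shocks.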
If $x_t$ and $u_t$ are not correlated, then setting $x_t = z_t$ gives the least squares (LS) method.

Recall that LS minimizes the variance of the fitted residuals, $\hat{u}_t = y_t - x_t'\hat{\beta}_{LS}$. The first order conditions for that optimization problem are
\[ 0_{k \times 1} = \frac{1}{T}\sum_{t=1}^{T} x_t \left( y_t - x_t'\hat{\beta}_{LS} \right). \tag{1.19} \]
If $x_t$ and $u_t$ are correlated, then $\operatorname{plim} \hat{\beta}_{LS} \neq \beta_0$. The reason is that the probability limit of the right hand side of (1.19) is $\operatorname{Cov}(x_t, y_t - x_t'\hat{\beta}_{LS})$, which at $\hat{\beta}_{LS} = \beta_0$ is non-zero, so the first order conditions (in the limit) cannot be satisfied at the true parameter values. Note that since the LS estimator by construction forces the fitted residuals to be uncorrelated with the regressors, the properties of the LS residuals are of little help in deciding whether to use LS or IV.

The idea of the IV method is to replace the first $x_t$ in (1.19) with a vector (of similar size) of some instruments, $z_t$. The identifying assumption of the IV method is that the instruments are uncorrelated with the residuals (and, as we will see, correlated with the regressors)
\begin{align}
0_{k \times 1} &= \operatorname{E} z_t u_t \tag{1.20} \\
&= \operatorname{E} z_t \left( y_t - x_t'\beta_0 \right). \tag{1.21}
\end{align}
The intuition is that the linear model (1.18) is assumed to be correctly specified: the residuals, $u_t$, represent factors which we cannot explain, so $z_t$ should not contain any information about $u_t$.

The sample analogue to (1.21) defines the IV estimator of $\beta$ as
\begin{align}
0_{k \times 1} &= \frac{1}{T}\sum_{t=1}^{T} z_t \left( y_t - x_t'\hat{\beta}_{IV} \right), \text{ or} \tag{1.22} \\
\hat{\beta}_{IV} &= \left( \frac{1}{T}\sum_{t=1}^{T} z_t x_t' \right)^{-1} \frac{1}{T}\sum_{t=1}^{T} z_t y_t. \tag{1.23}
\end{align}
(In matrix notation, where $z_t'$ is the $t$th row of $Z$, we have $\hat{\beta}_{IV} = (Z'X/T)^{-1}(Z'Y/T)$.) It is clearly necessary for $\sum z_t x_t'/T$ to have full rank to calculate the IV estimator.

Remark 4 (Probability limit of product) For any random variables $y_T$ and $x_T$ where $\operatorname{plim} y_T = a$ and $\operatorname{plim} x_T = b$ ($a$ and $b$ are constants), we have $\operatorname{plim} y_T x_T = ab$.

To see if the IV estimator is consistent, use (1.18) to substitute for $y_t$ in (1.22) and take the probability limit
\[ \operatorname{plim} \frac{1}{T}\sum_{t=1}^{T} z_t x_t'\beta_0 + \operatorname{plim} \frac{1}{T}\sum_{t=1}^{T} z_t u_t = \operatorname{plim} \frac{1}{T}\sum_{t=1}^{T} z_t x_t'\hat{\beta}_{IV}. \tag{1.24} \]
Two things are required for consistency of the IV estimator, $\operatorname{plim} \hat{\beta}_{IV} = \beta_0$. First, that $\operatorname{plim} \sum z_t u_t/T = 0$. Provided a law of large numbers applies, this is condition (1.20). Second, that $\operatorname{plim} \sum z_t x_t'/T$ has full rank. To see this, suppose $\operatorname{plim} \sum z_t u_t/T = 0$ is satisfied. Then, (1.24) can be written
\[ \operatorname{plim} \left( \frac{1}{T}\sum_{t=1}^{T} z_t x_t' \right) \left( \beta_0 - \operatorname{plim} \hat{\beta}_{IV} \right) = 0. \tag{1.25} \]
If $\operatorname{plim} \sum z_t x_t'/T$ has reduced rank, then $\operatorname{plim} \hat{\beta}_{IV}$ does not need to equal $\beta_0$ for (1.25) to be satisfied. In practical terms, the first order conditions (1.22) then do not define a unique value of the vector of estimates. If a law of large numbers applies, then $\operatorname{plim} \sum z_t x_t'/T = \operatorname{E} z_t x_t'$. If both $z_t$ and $x_t$ contain constants (or at least one of them has zero means), then a reduced rank of $\operatorname{E} z_t x_t'$ would be a consequence of a reduced rank of the covariance matrix of the stochastic elements in $z_t$ and $x_t$, for instance, that some of the instruments are uncorrelated with all the regressors. This shows that the instruments must indeed be correlated with the regressors for IV to be consistent (and to make sense).

For an example, see Figure 1.2 (details are given in Figure 1.1).

Figure 1.2: Distribution of different estimators of the autoregressive parameter. Panels (mean and std): LS, T=200 (0.88, 0.03); IV, T=200 (0.78, 0.05); ML, T=200 (0.78, 0.05); LS, T=900 (0.89, 0.01); IV, T=900 (0.80, 0.02); ML, T=900 (0.80, 0.02).

Remark 5 (Second moment matrix) Note that $\operatorname{E} zx' = \operatorname{E} z \operatorname{E} x' + \operatorname{Cov}(z, x)$. If $\operatorname{E} z = 0$ and/or $\operatorname{E} x = 0$, then the second moment matrix is a covariance matrix. Alternatively, suppose both $z$ and $x$ contain constants normalized to unity: $z = [1, \tilde{z}']'$ and $x = [1, \tilde{x}']'$ where $\tilde{z}$ and $\tilde{x}$ are random vectors. We can then write
\begin{align*}
\operatorname{E} zx' &= \begin{bmatrix} 1 \\ \operatorname{E} \tilde{z} \end{bmatrix} \begin{bmatrix} 1 & \operatorname{E} \tilde{x}' \end{bmatrix} + \begin{bmatrix} 0 & 0 \\ 0 & \operatorname{Cov}(\tilde{z}, \tilde{x}) \end{bmatrix} \\
&= \begin{bmatrix} 1 & \operatorname{E} \tilde{x}' \\ \operatorname{E} \tilde{z} & \operatorname{E} \tilde{z} \operatorname{E} \tilde{x}' + \operatorname{Cov}(\tilde{z}, \tilde{x}) \end{bmatrix}.
\end{align*}
For simplicity, suppose $\tilde{z}$ and $\tilde{x}$ are scalars. Then $\operatorname{E} zx'$ has reduced rank if $\operatorname{Cov}(\tilde{z}, \tilde{x}) = 0$, since $\operatorname{Cov}(\tilde{z}, \tilde{x})$ is then the determinant of $\operatorname{E} zx'$. This is true also when $\tilde{z}$ and $\tilde{x}$ are vectors.

Example 6 (Supply equation with IV.) Suppose we try to estimate the supply equation in Example 1 by IV. The only available instrument is $A_t$, so (1.23) becomes
\[ \hat{\gamma}_{IV} = \left( \frac{1}{T}\sum_{t=1}^{T} A_t p_t \right)^{-1} \frac{1}{T}\sum_{t=1}^{T} A_t q_t, \]
so the probability limit is
\[ \operatorname{plim} \hat{\gamma}_{IV} = \operatorname{Cov}(A_t, p_t)^{-1} \operatorname{Cov}(A_t, q_t), \]
since all variables have zero means. From the reduced form in Example 1 we see that
\[ \operatorname{Cov}(A_t, p_t) = -\frac{1}{\beta-\gamma}\alpha \operatorname{Var}(A_t) \text{ and } \operatorname{Cov}(A_t, q_t) = -\frac{\gamma}{\beta-\gamma}\alpha \operatorname{Var}(A_t), \]
so
\[ \operatorname{plim} \hat{\gamma}_{IV} = \left[ -\frac{1}{\beta-\gamma}\alpha \operatorname{Var}(A_t) \right]^{-1} \left[ -\frac{\gamma}{\beta-\gamma}\alpha \operatorname{Var}(A_t) \right] = \gamma. \]
This shows that $\hat{\gamma}_{IV}$ is consistent.

1.5.1 Asymptotic Normality of IV

Little is known about the finite sample distribution of the IV estimator, so we focus on the asymptotic distribution, assuming the IV estimator is consistent.

Remark 7 If $x_T \overset{d}{\to} x$ (a random variable) and $\operatorname{plim} Q_T = Q$ (a constant matrix), then $Q_T x_T \overset{d}{\to} Qx$.

Use (1.18) to substitute for $y_t$ in (1.23)
\[ \hat{\beta}_{IV} = \beta_0 + \left( \frac{1}{T}\sum_{t=1}^{T} z_t x_t' \right)^{-1} \frac{1}{T}\sum_{t=1}^{T} z_t u_t. \tag{1.26} \]
Premultiply by $\sqrt{T}$ and rearrange as
\[ \sqrt{T}(\hat{\beta}_{IV} - \beta_0) = \left( \frac{1}{T}\sum_{t=1}^{T} z_t x_t' \right)^{-1} \frac{\sqrt{T}}{T}\sum_{t=1}^{T} z_t u_t. \tag{1.27} \]
If the first term on the right hand side converges in probability to a finite matrix (as assumed in proving consistency), and the vector of random variables $z_t u_t$ satisfies a central limit theorem, then
\begin{align}
\sqrt{T}(\hat{\beta}_{IV} - \beta_0) &\overset{d}{\to} N\left( 0, \Sigma_{zx}^{-1} S_0 \Sigma_{xz}^{-1} \right), \text{ where} \tag{1.28} \\
\Sigma_{zx} &= \operatorname{E} \frac{1}{T}\sum_{t=1}^{T} z_t x_t' \text{ and } S_0 = \operatorname{Cov}\left( \frac{\sqrt{T}}{T}\sum_{t=1}^{T} z_t u_t \right). \notag
\end{align}
The last matrix in the covariance matrix follows from $(\Sigma_{zx}^{-1})' = (\Sigma_{zx}')^{-1} = \Sigma_{xz}^{-1}$. This general expression is valid for both autocorrelated and heteroskedastic residuals; all such features are loaded into the $S_0$ matrix. Note that $S_0$ is the variance-covariance matrix of $\sqrt{T}$ times a sample average (of the vector of random variables $z_t u_t$).

Example 8 (Choice of instrument in IV, simplest case) Consider the simple regression
\[ y_t = \beta_1 x_t + u_t. \]
The asymptotic variance of the IV estimator is
\[ \operatorname{AVar}\left( \sqrt{T}(\hat{\beta}_{IV} - \beta_0) \right) = \operatorname{Var}\left( \frac{\sqrt{T}}{T}\sum_{t=1}^{T} z_t u_t \right) / \operatorname{Cov}(z_t, x_t)^2. \]
If $z_t$ and $u_t$ are serially uncorrelated and independent of each other, then $\operatorname{Var}(\sum_{t=1}^{T} z_t u_t/\sqrt{T}) = \operatorname{Var}(z_t)\operatorname{Var}(u_t)$. We can then write
\[ \operatorname{AVar}\left( \sqrt{T}(\hat{\beta}_{IV} - \beta_0) \right) = \operatorname{Var}(u_t) \frac{\operatorname{Var}(z_t)}{\operatorname{Cov}(z_t, x_t)^2} = \frac{\operatorname{Var}(u_t)}{\operatorname{Var}(x_t)\operatorname{Corr}(z_t, x_t)^2}. \]
An instrument with a weak correlation with the regressor gives an imprecise estimator. With a perfect correlation, we get the precision of the LS estimator (which has a low variance, but is perhaps not consistent).

1.5.2 2SLS

Suppose now that we have more instruments, $z_t$, than regressors, $x_t$. The IV method does not work since there are then more equations than unknowns in (1.22). Instead, we can use the 2SLS estimator. It has two steps. First, regress all elements in $x_t$ on all elements in $z_t$ with LS. Second, use the fitted values of $x_t$, denoted $\hat{x}_t$, as instruments in the IV method (use $\hat{x}_t$ in place of $z_t$ in the equations above). It can be shown that this is the most efficient use of the information in $z_t$. The IV is clearly a special case of 2SLS (when $z_t$ has the same number of elements as $x_t$).

It is immediate from (1.24) that 2SLS is consistent under the same conditions as IV since $\hat{x}_t$ is a linear function of the instruments, so $\operatorname{plim} \sum_{t=1}^{T} \hat{x}_t u_t/T = 0$, if all the instruments are uncorrelated with $u_t$.

The name, 2SLS, comes from the fact that we get exactly the same result if we replace the second step with the following: regress $y_t$ on $\hat{x}_t$ with LS.

Example 9 (Supply equation with 2SLS.) With only one instrument, $A_t$, this is the same as Example 6, but presented in another way. First, regress $p_t$ on $A_t$
\[ p_t = \delta A_t + u_t \Rightarrow \operatorname{plim} \hat{\delta}_{LS} = \frac{\operatorname{Cov}(p_t, A_t)}{\operatorname{Var}(A_t)} = -\frac{1}{\beta-\gamma}\alpha. \]
Construct the predicted values as $\hat{p}_t = \hat{\delta}_{LS} A_t$. Second, regress $q_t$ on $\hat{p}_t$
\[ q_t = \gamma \hat{p}_t + e_t, \text{ with } \operatorname{plim} \hat{\gamma}_{2SLS} = \operatorname{plim} \frac{\widehat{\operatorname{Cov}}\left( q_t, \hat{p}_t \right)}{\widehat{\operatorname{Var}}\left( \hat{p}_t \right)}. \]
Use $\hat{p}_t = \hat{\delta}_{LS} A_t$ and Slutsky’s theorem
\begin{align*}
\operatorname{plim} \hat{\gamma}_{2SLS} &= \frac{\operatorname{plim} \widehat{\operatorname{Cov}}\left( q_t, \hat{\delta}_{LS} A_t \right)}{\operatorname{plim} \widehat{\operatorname{Var}}\left( \hat{\delta}_{LS} A_t \right)} \\
&= \frac{\operatorname{Cov}(q_t, A_t) \operatorname{plim} \hat{\delta}_{LS}}{\operatorname{Var}(A_t) \operatorname{plim} \hat{\delta}_{LS}^2} \\
&= \frac{\left[ -\frac{\gamma}{\beta-\gamma}\alpha \operatorname{Var}(A_t) \right] \left[ -\frac{1}{\beta-\gamma}\alpha \right]}{\operatorname{Var}(A_t) \left[ -\frac{1}{\beta-\gamma}\alpha \right]^2} \\
&= \gamma.
\end{align*}
Note that the trick here is to suppress some of the movements in $p_t$. Only those movements that depend on $A_t$ (the observable shifts of the demand curve) are used. Movements in $p_t$ which are due to the unobservable demand and supply shocks are disregarded in $\hat{p}_t$. We know from Example 2 that it is the supply shocks that make the LS estimate of the supply curve inconsistent. The IV method suppresses both them and the unobservable demand shock.

1.6 Hausman’s Specification Test

This test is constructed to test if an efficient estimator (like LS) gives (approximately) the same estimate as a consistent estimator (like IV). If not, the efficient estimator is most likely inconsistent. It is therefore a way to test for the presence of endogeneity and/or measurement errors.

Let $\hat{\beta}_e$ be an estimator that is consistent and asymptotically efficient when the null hypothesis, $H_0$, is true, but inconsistent when $H_0$ is false. Let $\hat{\beta}_c$ be an estimator that is consistent under both $H_0$ and the alternative hypothesis. When $H_0$ is true, the asymptotic distribution is such that
\[ \operatorname{Cov}\left( \hat{\beta}_e, \hat{\beta}_c \right) = \operatorname{Var}\left( \hat{\beta}_e \right). \tag{1.29} \]
Proof. Consider the estimator $\lambda\hat{\beta}_c + (1-\lambda)\hat{\beta}_e$, which is clearly consistent under $H_0$ since both $\hat{\beta}_c$ and $\hat{\beta}_e$ are. The asymptotic variance of this estimator is
\[ \lambda^2 \operatorname{Var}\left( \hat{\beta}_c \right) + (1-\lambda)^2 \operatorname{Var}\left( \hat{\beta}_e \right) + 2\lambda(1-\lambda) \operatorname{Cov}\left( \hat{\beta}_c, \hat{\beta}_e \right), \]
which is minimized at $\lambda = 0$ (since $\hat{\beta}_e$ is asymptotically efficient). The derivative with respect to $\lambda$
\[ 2\lambda \operatorname{Var}\left( \hat{\beta}_c \right) - 2(1-\lambda) \operatorname{Var}\left( \hat{\beta}_e \right) + 2(1-2\lambda) \operatorname{Cov}\left( \hat{\beta}_c, \hat{\beta}_e \right) \]
should therefore be zero at $\lambda = 0$ so
\[ \operatorname{Var}\left( \hat{\beta}_e \right) = \operatorname{Cov}\left( \hat{\beta}_c, \hat{\beta}_e \right). \]
(See Davidson (2000) 8.1)

This means that we can write
\begin{align}
\operatorname{Var}\left( \hat{\beta}_e - \hat{\beta}_c \right) &= \operatorname{Var}\left( \hat{\beta}_e \right) + \operatorname{Var}\left( \hat{\beta}_c \right) - 2\operatorname{Cov}\left( \hat{\beta}_e, \hat{\beta}_c \right) \notag \\
&= \operatorname{Var}\left( \hat{\beta}_c \right) - \operatorname{Var}\left( \hat{\beta}_e \right). \tag{1.30}
\end{align}
We can use this to test, for instance, if the estimates from least squares ($\hat{\beta}_e$, since LS is efficient if errors are iid normally distributed) and the instrumental variable method ($\hat{\beta}_c$, since it is consistent even if the true residuals are correlated with the regressors) are the same. In this case, $H_0$ is that the true residuals are uncorrelated with the regressors.

All we need for this test are the point estimates and consistent estimates of the variance matrices. Testing one of the coefficients can be done by a t test, and testing all the parameters by a $\chi^2$ test
\[ \left( \hat{\beta}_e - \hat{\beta}_c \right)' \operatorname{Var}\left( \hat{\beta}_e - \hat{\beta}_c \right)^{-1} \left( \hat{\beta}_e - \hat{\beta}_c \right) \sim \chi^2(j), \tag{1.31} \]
where $j$ equals the number of regressors that are potentially endogenous or measured with error. Note that the covariance matrix in (1.30) and (1.31) is likely to have a reduced rank, so the inverse needs to be calculated as a generalized inverse.

1.7 Tests of Overidentifying Restrictions in 2SLS∗

When we use 2SLS, we can test if instruments affect the dependent variable only via their correlation with the regressors. If not, something is wrong with the model since some relevant variables are excluded from the regression.

Bibliography

Davidson, J., 2000, Econometric Theory, Blackwell Publishers, Oxford.
Greene, W. H., 2003, Econometric Analysis, Prentice-Hall, Upper Saddle River, New Jersey, 5th edn.
Hamilton, J. D., 1994, Time Series Analysis, Princeton University Press, Princeton.
Hayashi, F., 2000, Econometrics, Princeton University Press.
Pindyck, R. S., and D. L. Rubinfeld, 1998, Econometric Models and Economic Forecasts, Irwin McGraw-Hill, Boston, Massachusetts, 4th edn.
Verbeek, M., 2004, A Guide to Modern Econometrics, Wiley, Chichester, 2nd edn.

2 Non-Spherical Errors

Reference: Greene (2003) 10.3
Additional references: Hayashi (2000) 6.5; Hamilton (1994) 14; Verbeek (2004) 4.10; Harris and Matyas (1999); and Pindyck and Rubinfeld (1998) Appendix 10.1; Cochrane (2001) 11.7

2.1 Summary of Least Squares

Consider the regression equation
\[ y_t = x_t'\beta_0 + \varepsilon_t. \tag{2.1} \]
Recall that the LS estimator can be written
\begin{align}
\hat{\beta}_{LS} &= \left( \frac{1}{T}\sum_{t=1}^{T} x_t x_t' \right)^{-1} \frac{1}{T}\sum_{t=1}^{T} x_t y_t \tag{2.2} \\
&= \beta_0 + \left( \frac{1}{T}\sum_{t=1}^{T} x_t x_t' \right)^{-1} \frac{1}{T}\sum_{t=1}^{T} x_t \varepsilon_t. \tag{2.3}
\end{align}
We know that the general expression for the asymptotic distribution of $\hat{\beta}_{LS}$ is
\begin{align}
\sqrt{T}(\hat{\beta}_{LS} - \beta_0) &\overset{d}{\to} N\left( 0, \Sigma_{xx}^{-1} S_0 \Sigma_{xx}^{-1} \right), \text{ where} \tag{2.4} \\
\Sigma_{xx} &= \operatorname{E} \frac{1}{T}\sum_{t=1}^{T} x_t x_t' \text{ and } S_0 = \operatorname{Cov}\left( \frac{\sqrt{T}}{T}\sum_{t=1}^{T} x_t \varepsilon_t \right). \tag{2.5}
\end{align}
In practice, $\Sigma_{xx}$ and $S_0$ are replaced by their sample analogues.

When $x_t$ is independent of all $\varepsilon_{t-s}$ and $\varepsilon_t$ is iid, then $S_0 = \operatorname{Var}(\varepsilon_t)\Sigma_{xx}$, so the covariance matrix in (2.4) simplifies to $\operatorname{Var}(\varepsilon_t)\Sigma_{xx}^{-1}$, which is the classical LS case.

2.2 Heteroskedasticity

Definition: $\varepsilon_t$ is not iid, since $V(\varepsilon_t)$ is different for different observations ($t$).
Effect: LS is still consistent, but the typical expression for $V(b)$ is wrong. LS is no longer the best estimator (GLS is).

2.2.1 White’s Test of Heteroskedasticity

$H_0$: homoskedasticity.
$H_A$: the kind of heteroskedasticity which can be explained by the levels, squares, and cross products of the regressors.

Let $w_t$ be the unique elements in $x_t \otimes x_t$. Run a regression of squared fitted residuals on $w_t$
\[ e_t^2 = w_t'\gamma + v_t. \]
Test if all elements (except the constant) in $\gamma$ are zero ($NR^2 \sim \chi^2_P$, $P = \dim(w_t) - 1$). The reason for this specification is that if $\varepsilon_t^2$ is uncorrelated with $x_t \otimes x_t$, then the usual LS covariance matrix applies.

2.2.2 Correct $\operatorname{Var}(\hat{\beta})$ for LS

The matrix $S_0$ in (2.4)–(2.5) cannot be simplified to $\sigma^2\Sigma_{xx}$. Instead, estimate it with White’s estimator
\[ \hat{S}_0 = \sum_{t=1}^{T} \hat{\varepsilon}_t^2 x_t x_t'/T, \tag{2.6} \]
where $\hat{\varepsilon}_t$ are the fitted residuals.

Discussion: let $z_t = x_t\varepsilon_t$ and think of $\operatorname{Var}(z_1 + z_2 + \ldots)/T$ when $z_t$ is uncorrelated with $z_{t-1}$. Ideally we would like to estimate this variance as $\sum_{i=1}^{T} \operatorname{Var}(z_i)/T$, but that is not possible since there is not enough data to estimate each $\operatorname{Var}(z_i)$. However, White has shown that (2.6) is a consistent way of estimating the variance of the sum.

2.3 Autocorrelation
Definition: $\varepsilon_t$ is not iid, since $\varepsilon_t$ is correlated with some $\varepsilon_{t-s}$.
Effect: LS is still consistent, but the standard expression for $\operatorname{Var}(\hat{\beta})$ is wrong. LS is no longer the best estimator (GLS is).

This is a sub-optimal commitment policy. It is a commitment rule since the policy setter will stick to this rule, even if it would be optimal to deviate from it in certain states. The optimal commitment rule, however, would not restrict the decision rule to be a function of $y_t$ and $\pi_t$ only.

Note that there is no money demand function in this model. The reason is that monetary policy is specified in terms of the interest rate, so the money stock becomes demand determined (the money supply curve is flat at the chosen nominal interest rate). Of course, in order for the central bank to control anything of importance, there must be a demand for money. The money demand function could be added to the model, but its only role is to determine the money stock.

Suppose the shocks in (5.19) and (5.20) follow
\begin{align}
\varepsilon_{\pi t+1} &= \tau_\pi \varepsilon_{\pi t} + \zeta_{\pi t+1} \notag \\
\varepsilon_{y t+1} &= \tau_y \varepsilon_{y t} + \zeta_{y t+1}. \tag{5.22}
\end{align}
We can write (5.19)–(5.22) as
\begin{align}
\begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & \beta & 0 \\ 0 & 0 & \frac{1}{\gamma} & 1 \end{bmatrix} \begin{bmatrix} \varepsilon_{\pi t+1} \\ \varepsilon_{y t+1} \\ \operatorname{E}_t \pi_{t+1} \\ \operatorname{E}_t y_{t+1} \end{bmatrix} &= \begin{bmatrix} \tau_\pi & 0 & 0 & 0 \\ 0 & \tau_y & 0 & 0 \\ -\delta & 0 & 1 & -\delta\phi \\ 0 & -1 & 0 & 1 \end{bmatrix} \begin{bmatrix} \varepsilon_{\pi t} \\ \varepsilon_{y t} \\ \pi_t \\ y_t \end{bmatrix} + \begin{bmatrix} 0 \\ 0 \\ 0 \\ \frac{1}{\gamma} \end{bmatrix} i_t + \begin{bmatrix} \zeta_{\pi t+1} \\ \zeta_{y t+1} \\ 0 \\ 0 \end{bmatrix}, \tag{5.23} \\
\text{with } i_t &= \begin{bmatrix} 0 & 0 & \chi & \upsilon \end{bmatrix} \begin{bmatrix} \varepsilon_{\pi t} \\ \varepsilon_{y t} \\ \pi_t \\ y_t \end{bmatrix}. \tag{5.24}
\end{align}
This system is in state space form and could be summarized as
\begin{align}
\tilde{A}_0 \begin{bmatrix} x_{1t+1} \\ \operatorname{E}_t x_{2t+1} \end{bmatrix} &= \tilde{A} \begin{bmatrix} x_{1t} \\ x_{2t} \end{bmatrix} + \tilde{B} i_t + \tilde{\xi}_{t+1}, \text{ and} \tag{5.25} \\
i_t &= -F \begin{bmatrix} x_{1t} \\ x_{2t} \end{bmatrix}, \tag{5.26}
\end{align}
where $x_{1t}$ is a vector of predetermined variables (here $\varepsilon_{\pi t}$ and $\varepsilon_{y t}$, which happen to be exogenous, but also endogenous variables can be predetermined) and $x_{2t}$ a vector of forward looking variables (here $\pi_t$ and $y_t$). Premultiply (5.25) with $\tilde{A}_0^{-1}$ to get
\begin{align}
\begin{bmatrix} x_{1t+1} \\ \operatorname{E}_t x_{2t+1} \end{bmatrix} &= A \begin{bmatrix} x_{1t} \\ x_{2t} \end{bmatrix} + B i_t + \xi_{t+1}, \text{ where} \tag{5.27} \\
A &= \tilde{A}_0^{-1}\tilde{A}, \; B = \tilde{A}_0^{-1}\tilde{B}, \text{ and } \operatorname{Cov}(\xi_t) = \tilde{A}_0^{-1} \operatorname{Cov}(\tilde{\xi}_t) \tilde{A}_0^{-1\prime}. \tag{5.28}
\end{align}
By using the policy rule (5.26) in (5.27) we get
\[ \begin{bmatrix} x_{1t+1} \\ \operatorname{E}_t x_{2t+1} \end{bmatrix} = (A - BF) \begin{bmatrix} x_{1t} \\ x_{2t} \end{bmatrix} + \xi_{t+1}. \tag{5.29} \]
This system of expectational difference equations (with stable and unstable roots) can be solved in several different ways. For instance, a decomposition of $A - BF$ in terms of eigenvalues and eigenvectors will work if the latter are linearly independent. Otherwise, other techniques must be used (see Appendix A and Söderlind (1999)). A necessary condition for a unique saddle path equilibrium is that $A - BF$ has as many stable roots (inside the unit circle) as there are predetermined variables (that is, elements in $x_{1t}$).

To solve the model numerically, parameter values are needed. The following values have been used in most of Figures 5.2–5.4 (exceptions are indicated)

$\beta$     $\delta$    $\phi$   $\gamma$   $\tau_\pi$   $\tau_y$   $\upsilon$   $\chi$   $\lambda_y$   $\lambda_i$
0.99   2.25   2/7   2     0.5    0.5    0.5    1.5   0.5    0

The choice of $\delta$ implies relatively little price stickiness. The choice of $\phi$ means that a 1% increase in aggregate demand leads to a desired increase of the relative price of 2/7%. The choice of the relative risk aversion $\gamma$ implies an elasticity of intertemporal substitution of 1/2. The $\upsilon$ and $\chi$ are those advocated by Taylor. The loss function parameters (see next section) mean that inflation is twice as important as output, and that the policy maker does not care about fluctuations in the nominal interest rate.

Figure 5.2: Impulse responses to price shock; simple policy rule. Panels: a. Baseline model; b. Large inflation coefficient; c. Large output coefficient. Series: $\pi$, $y$, $i$; horizontal axis: period. Persistent price shock: simple policy rule.

The first subfigure in Figure 5.2 illustrates how the model with the policy rule (5.21) works. An inflation shock in period $t = 0$ increases inflation. The policy maker reacts by raising the nominal interest rate even more in order to increase the real interest rate. This, in turn, has a negative effect on output and therefore on inflation via the “Phillips curve.” The central bank creates a recession to bring down inflation. The other subfigures illustrate what happens if the coefficients in the reaction function (5.21) are changed.

5.3.2 Optimal Monetary Policy∗

Suppose the central bank’s loss function is
\begin{align}
&\operatorname{E}_t \sum_{s=0}^{\infty} \beta^s L_{t+s}, \text{ where} \tag{5.30} \\
L_{t+s} &= \left( \pi_{t+s} - \pi^* \right)^2 + \lambda_y \left( y_{t+s} - y^* \right)^2 + \lambda_i \left( i_{t+s} - i^* \right)^2. \tag{5.31}
\end{align}
A particularly straightforward way to proceed is to optimize (5.30) by restricting the policy rule to be of the simple form discussed above, (5.21). Optimization then proceeds as follows: guess the coefficients $\upsilon$ and $\chi$, solve the model, and use the time series representation of the model to calculate the loss function value. Then try other coefficients $\upsilon$ and $\chi$, and see if they give a lower loss function value. Continue until the best coefficients have been found.

The unrestricted optimal commitment policy and the optimal discretionary policy rule are a bit harder to find. Methods for doing that are discussed in, among other places, Söderlind (1999).

Figure 5.3: Impulse responses to price shock: simple rule, optimal commitment policy, and discretionary policy. Panels: a. Simple policy rule; b. Commitment policy; c. Discretionary policy. Series: $\pi$, $y$, $i$; horizontal axis: period. Persistent price shocks.

Figure 5.3 compares the equilibria under the simple policy rule, unrestricted optimal commitment rule, and optimal discretionary rule, when it is assumed that $\pi^* = y^* = 0$. It is clear that the optimal commitment rule achieves a much more stable inflation and output, in spite of a less vigorous increase in the nominal interest rate. This is achieved by credibly promising to keep interest rates high in the future (and even raise further), which gives expectations of lower future output and therefore future inflation. This, in turn, gives lower inflation and output today. The discretionary equilibrium is fairly similar to the simple rule in this model.

Note that there is no constant “inflation bias” when target levels are at their natural levels (zero) as they are in these figures. The discretionary rule is still different from the commitment rule (they are, after all, outcomes of different games). The intuition is that there is a time-varying “bias” since the conditional expectations of output and inflation in the next periods (their “conditional natural rates”) typically differ from the target rates (here zero).

Figure 5.4: Impulse responses to positive demand shock: simple rule, optimal commitment policy, and discretionary policy. Panels: a. Simple policy rule; b. Commitment policy; c. Discretionary policy. Series: $\pi$, $y$, $i$; horizontal axis: period. Persistent demand shocks.

Figure 5.4 makes the same type of comparison, but for a positive demand shock, $-\varepsilon_{yt}$. In this case, both optimal rules “kill” the demand shock, which is seen almost directly from (5.20): any shock $\varepsilon_{yt}$ could be met by increasing $i_t$ by $\gamma\varepsilon_{yt}$. In this way output is unaffected by the shock, and there will then be no effect on inflation either, since the only way the demand shock can affect inflation is via output (see (5.19)). This is very similar to the static model discussed above: the demand shock drives both prices and output in the same direction and should, if possible, be neutralized. Of course, the result hinges on the assumption that the policy maker is not averse to movements in the nominal interest rate, that is, $\lambda_i = 0$ in (5.31). (It can be shown that this case can be approximated in the simple policy rule (5.21) by setting the coefficients very high.)
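The loss function in (5.30)–(5.31) can be evaluated for given paths of inflation, output, and the interest rate, for instance the impulse responses implied by a candidate policy rule. The following is a minimal sketch, not code from the notes: the function name `discounted_loss` is hypothetical, the period loss is assumed to be quadratic in deviations from target, the default weights ($\beta = 0.99$, $\lambda_y = 0.5$, $\lambda_i = 0$) are assumed readings of the baseline calibration, and the infinite sum is truncated at the length of the supplied paths. It computes the realized discounted sum along one path rather than the conditional expectation.

```python
import numpy as np

def discounted_loss(pi, y, i, beta=0.99, lam_y=0.5, lam_i=0.0,
                    pi_star=0.0, y_star=0.0, i_star=0.0):
    """Discounted sum of period losses along given paths.

    Hypothetical helper: assumes a quadratic period loss and truncates
    the infinite horizon at the length of the supplied paths.
    """
    pi, y, i = (np.asarray(v, dtype=float) for v in (pi, y, i))
    if not (pi.shape == y.shape == i.shape):
        raise ValueError("pi, y and i must have the same length")

    # Period loss: squared deviations from target, weighted by lam_y, lam_i.
    period_loss = ((pi - pi_star) ** 2
                   + lam_y * (y - y_star) ** 2
                   + lam_i * (i - i_star) ** 2)

    # Discount factors beta^0, beta^1, ..., one per period in the path.
    discount = beta ** np.arange(period_loss.size)
    return float(discount @ period_loss)
```

In the search over simple rules described above, a function like this would be called once per candidate pair of coefficients, with the paths coming from whatever solution method is used for the model; with `lam_i=0.0` the interest rate path does not affect the result.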
Many studies indicate that central banks are unwilling to let the nominal interest rate vary much. This is sometimes interpreted as a concern for the banking sector, and sometimes as due to uncertainty about the state of the economy and/or the effect of policy changes on output/inflation. In any case, $\lambda_i > 0$ is often necessary in order to make this type of model fit the observed variability in nominal interest rates.

A Summary of Solution Methods for Linear RE Models

The model is

$$\begin{bmatrix} x_{1t+1} \\ \mathrm{E}_t x_{2t+1} \end{bmatrix} = A \begin{bmatrix} x_{1t} \\ x_{2t} \end{bmatrix} + \begin{bmatrix} \varepsilon_{t+1} \\ 0_{n_2 \times 1} \end{bmatrix}, \tag{A.1}$$

where $x_{1t}$ is an $n_1 \times 1$ vector of predetermined variables, $x_{2t}$ is an $n_2 \times 1$ vector of “forward looking” variables, and $\varepsilon_t$ is a white noise process. All dynamics of the exogenous processes have been placed in $x_{1t}$. A necessary condition for a saddle path equilibrium is that $A$ has as many stable roots (inside the unit circle) as there are elements in $x_{1t}$.

Decompose $A$ as

$$A = Z T Z^{-1}, \tag{A.2}$$

where $T$ is (at least) block upper triangular. Note that we require $Z$ to be invertible. In some cases we could let $T$ be a diagonal matrix with eigenvalues along the principal diagonal and with the corresponding eigenvectors in the columns of $Z$ (if the eigenvectors are linearly independent). This decomposition should be reordered so that the blocks corresponding to the stable eigenvalues (in or on the unit circle) come first. Partition conformably with the stable and unstable roots

$$T = \begin{bmatrix} T_{\theta\theta} & T_{\theta\delta} \\ 0 & T_{\delta\delta} \end{bmatrix} \quad\text{and}\quad Z = \begin{bmatrix} Z_{k\theta} & Z_{k\delta} \\ Z_{\lambda\theta} & Z_{\lambda\delta} \end{bmatrix}. \tag{A.3}$$

The solution can then be shown to be

$$x_{1t+1} = Z_{k\theta} T_{\theta\theta} Z_{k\theta}^{-1} x_{1t} + \varepsilon_{t+1}, \tag{A.4}$$

$$x_{2t} = Z_{\lambda\theta} Z_{k\theta}^{-1} x_{1t}. \tag{A.5}$$

Bibliography

Bénassy, J.-P., 1995, “Money and Wage Contracts in an Optimizing Model of the Business Cycle,” Journal of Monetary Economics, 35, 303–315.

Blanchard, O. J., and S. Fischer, 1989, Lectures on Macroeconomics, MIT Press.

Obstfeld, M., and K. Rogoff, 1996, Foundations of International Macroeconomics, MIT Press.

Rotemberg, J. J., 1982a, “Monopolistic Price Adjustment and Aggregate Output,” Review of Economic Studies, 49, 517–531.

Rotemberg, J. J., 1982b, “Sticky Prices in the United States,” Journal of Political Economy, 60, 1187–1211.

Söderlind, P., 1999, “Solution and Estimation of RE Macromodels with Optimal Policy,” European Economic Review, 43, 813–823.

Walsh, C. E., 2003, Monetary Theory and Policy, MIT Press, Cambridge, Massachusetts, 2nd edn.

Solving Linear Expectational Difference Equations

References: Blanchard and Kahn (1980), King and Watson (1998), and Klein (2000)

7.1 The Model

The model in state-space form is

$$\begin{bmatrix} x_{1t+1} \\ \mathrm{E}_t x_{2t+1} \end{bmatrix} = A \begin{bmatrix} x_{1t} \\ x_{2t} \end{bmatrix} + \begin{bmatrix} \varepsilon_{t+1} \\ 0_{n_2 \times 1} \end{bmatrix}, \tag{7.1}$$

where $x_{1t}$ is an $n_1 \times 1$ vector of predetermined variables with the initial value $x_{10}$ given, $x_{2t}$ is an $n_2 \times 1$ vector of “forward looking” variables, and $\varepsilon_t$ is a white noise process with covariance matrix $\Omega$. All dynamics of the exogenous processes have been placed in $x_{1t}$.

Example 52 (Cagan model.) In the Cagan model (see, for instance, Blanchard and Kahn (1980) 4), money demand is

$$\ln M_t - \ln P_t = -\omega i_t, \text{ with } \omega > 0,$$

where $M_t$, $P_t$, and $i_t$ are the nominal money balances, the price level and the nominal interest rate, respectively. A money demand equation would typically include a term capturing real activity (output), but that is suppressed in the Cagan model. This is reasonable if we focus on cases where the price level fluctuates much more than output. Similarly, we assume that the real interest rate is constant, so the Fisher equation is $i_t = \mathrm{E}_t(\ln P_{t+1} - \ln P_t) + \text{constant}$. Combining gives an equation where the price level, $P_t$, behaves like an asset price

$$\ln P_t = (1-\alpha)\ln M_t + \alpha \mathrm{E}_t \ln P_{t+1}, \text{ with } 0 < \alpha < 1 \text{ since } \alpha = \omega/(1+\omega).$$

It will be important that $\alpha < 1$, so the future is “discounted.”

Example 53 Consider the Cagan model in Example 52 and suppose the log money supply, $\ln M_t$, is an exogenous AR(1)

$$\mathrm{E}_t \ln P_{t+1} = \frac{\alpha-1}{\alpha}\ln M_t + \frac{1}{\alpha}\ln P_t, \text{ and}$$

$$\ln M_{t+1} = \rho \ln M_t + \varepsilon_{t+1}.$$

This can be rewritten on the state-space form (7.1)

$$\begin{bmatrix} \ln M_{t+1} \\ \mathrm{E}_t \ln P_{t+1} \end{bmatrix} = \begin{bmatrix} \rho & 0 \\ \frac{\alpha-1}{\alpha} & \frac{1}{\alpha} \end{bmatrix} \begin{bmatrix} \ln M_t \\ \ln P_t \end{bmatrix} + \begin{bmatrix} \varepsilon_{t+1} \\ 0 \end{bmatrix}.$$

Example 54 In the simple case of Example 52, we can solve the model analytically by recursion forward (and using the law of iterated expectations)

$$\begin{aligned} \ln P_t &= (1-\alpha)\ln M_t + \alpha \mathrm{E}_t \ln P_{t+1} \\ &= (1-\alpha)\ln M_t + \alpha \mathrm{E}_t\left[(1-\alpha)\ln M_{t+1} + \alpha \mathrm{E}_{t+1}\ln P_{t+2}\right] \\ &= (1-\alpha)\mathrm{E}_t\left[\ln M_t + \alpha \ln M_{t+1} + \alpha^2 \ln M_{t+2} + \ldots\right], \end{aligned}$$

provided that $\lim_{K\to\infty} \alpha^K \mathrm{E}_t \ln P_{t+K} = 0$. When $\ln M_t$ is an AR(1), then $\mathrm{E}_t \ln M_{t+s} = \rho^s \ln M_t$, so we have

$$\ln P_t = (1-\alpha)\left(1 + \alpha\rho + \alpha^2\rho^2 + \ldots\right)\ln M_t = \frac{1-\alpha}{1-\alpha\rho}\ln M_t.$$

For instance, with $\alpha = 0.5$ and $\rho = 0.9$ we get

$$\ln P_t = \frac{1-0.5}{1-0.5\times 0.9}\ln M_t \approx 0.909 \ln M_t.$$

Taking expectations of both sides of (7.1), based on the information in $t$, gives

$$\begin{bmatrix} \mathrm{E}_t x_{1t+1} \\ \mathrm{E}_t x_{2t+1} \end{bmatrix} = A \begin{bmatrix} x_{1t} \\ x_{2t} \end{bmatrix}. \tag{7.2}$$

We will first try to find the solution to (7.2), then reintroduce the shocks $\varepsilon_t$.

7.2 Matrix Decompositions

Remark 55 (Complex matrices.) Let $A^H$ denote the transpose of the complex conjugate of $A$, so that if

$$A = \begin{bmatrix} 1 & 2+3i \end{bmatrix} \text{ then } A^H = \begin{bmatrix} 1 \\ 2-3i \end{bmatrix}.$$

A square matrix $A$ is unitary (similar to orthogonal) if $A^H = A^{-1}$, for instance,

$$A = \begin{bmatrix} \frac{1+i}{2} & \frac{1+i}{2} \\ \frac{1-i}{2} & \frac{-1+i}{2} \end{bmatrix} \text{ gives } A^H = A^{-1} = \begin{bmatrix} \frac{1-i}{2} & \frac{1+i}{2} \\ \frac{1-i}{2} & \frac{-1-i}{2} \end{bmatrix}.$$

Remark 56 (Schur decomposition.) The decomposition of the $n \times n$ matrix $A$ gives the matrices $T$ and $Z$ such that

$$A = Z T Z^H, \tag{7.3}$$

where $Z$ is a unitary $n \times n$ matrix and $T$ is an $n \times n$ upper triangular Schur form with the eigenvalues along the diagonal. Note that premultiplying (7.3) with $Z^{-1} = Z^H$ and postmultiplying with $Z$ gives

$$T = Z^H A Z, \tag{7.4}$$

which is upper triangular. The ordering of the eigenvalues in $T$ can be reshuffled, although this requires that $Z$ is reshuffled conformably so that (7.3) continues to hold (this involves a bit of tricky “book keeping”).

Remark 57 (Upper triangular matrices.) If $T$ is upper triangular, then $TT$ is as well.

Example 58 (Cagan again.) If $\alpha = 0.5$ and $\rho = 0.9$ in the Cagan model in Example 53, then

$$A = \begin{bmatrix} \rho & 0 \\ \frac{\alpha-1}{\alpha} & \frac{1}{\alpha} \end{bmatrix} = \begin{bmatrix} 0.9 & 0 \\ -1 & 2 \end{bmatrix},$$

so the Schur decomposition is

$$Z \approx \begin{bmatrix} -0.74 & 0.673 \\ -0.673 & -0.74 \end{bmatrix},\quad T = \begin{bmatrix} 0.9 & 1 \\ 0 & 2 \end{bmatrix}, \text{ and } Z^H \approx \begin{bmatrix} -0.74 & -0.673 \\ 0.673 & -0.74 \end{bmatrix}.$$

Note that $T$ is upper triangular, with the eigenvalues (0.9 and 2) along the diagonal, and $Z$ is unitary ($Z Z^H = I$). Note also that $A = Z T Z^H$ holds. In this example, $T$ and $Z$ are real since all eigenvalues are real (unique).

7.2.1 Why not a Spectral Decomposition?

Remark 59 (Spectral decomposition.) The $n$ eigenvalues ($\lambda_i$) and associated eigenvectors ($z_i$) of the $n \times n$ matrix $A$ satisfy

$$(A - \lambda_i I_n) z_i = 0_{n\times 1}. \tag{7.5}$$

If the eigenvectors are linearly independent, then we can decompose $A$ as

$$A = Z \Lambda Z^{-1}, \text{ where } \Lambda = \mathrm{diag}(\lambda_1, \ldots, \lambda_n) \text{ and } Z = \begin{bmatrix} z_1 & z_2 & \cdots & z_n \end{bmatrix},$$

so $\Lambda$ is a matrix with the eigenvalues along the diagonal and zeros elsewhere. To see why the spectral decomposition works, note that by (7.5) $AZ = Z\Lambda$, which can be postmultiplied by $Z^{-1}$. (Note that this decomposition can be quite convenient since the fact that $\Lambda$ is diagonal implies $A^2 = AA = Z\Lambda Z^{-1} Z\Lambda Z^{-1} = Z\Lambda\Lambda Z^{-1} = Z\Lambda^2 Z^{-1}$.)

Why should we not decompose $A$ with the help of eigenvalues and eigenvectors instead? We could if the eigenvectors were linearly independent (distinct eigenvalues is a sufficient, not necessary, condition for this). In this case, the approach in Section 7.3 still applies, but where we let $T = \Lambda$. Often the eigenvectors are linearly dependent. This would create a fundamental problem when we try to “decouple” the system of difference equations (see below). We then have to use some other decomposition. The Jordan decomposition used by Blanchard and Kahn (1980) is perhaps the neatest, but also very difficult to calculate accurately (see Golub and van Loan (1989)). The calculation of the Schur decomposition is fairly robust, and is therefore widely implemented in software libraries.

Example 60 Consider the process $x_t - x_{t-1} = x_{t-1} - x_{t-2} + \varepsilon_t$. It can be written as a VAR(1) as

$$\begin{bmatrix} x_t \\ x_{t-1} \end{bmatrix} = \begin{bmatrix} 2 & -1 \\ 1 & 0 \end{bmatrix} \begin{bmatrix} x_{t-1} \\ x_{t-2} \end{bmatrix} + \begin{bmatrix} \varepsilon_t \\ 0 \end{bmatrix}.$$

The VAR matrix has a repeated eigenvalue (1) and eigenvectors ($[1, 1]$).

Example 61 (Cagan again.) The $A$ matrix in Example 53 has the following spectral decomposition

$$\begin{bmatrix} \rho & 0 \\ \frac{\alpha-1}{\alpha} & \frac{1}{\alpha} \end{bmatrix} = \begin{bmatrix} \frac{\alpha\rho-1}{\alpha-1} & 0 \\ 1 & 1 \end{bmatrix} \begin{bmatrix} \rho & 0 \\ 0 & \frac{1}{\alpha} \end{bmatrix} \begin{bmatrix} \frac{\alpha\rho-1}{\alpha-1} & 0 \\ 1 & 1 \end{bmatrix}^{-1},$$

which would work fine since the eigenvectors are linearly independent.

7.3 Solving

7.3.1 “Decoupling”

Calculate the Schur decomposition (7.3) of $A$ in (7.2) and reorder (both $T$ and $Z$, a bit tricky) so the $n_\theta$ eigenvalues with modulus smaller than one come first. (Note that $T$ and $Z$ may include complex elements.) Partition $T$ accordingly

$$T = \begin{bmatrix} T_{\theta\theta} & T_{\theta\delta} \\ 0 & T_{\delta\delta} \end{bmatrix}. \tag{7.6}$$

If there are $n_\theta$ stable and $n_\delta$ unstable eigenvalues, then $T_{\theta\theta}$ is $n_\theta \times n_\theta$, $T_{\theta\delta}$ is $n_\theta \times n_\delta$, and $T_{\delta\delta}$ is $n_\delta \times n_\delta$. Introduce the auxiliary variables

$$\begin{bmatrix} \theta_t \\ \delta_t \end{bmatrix} = Z^H \begin{bmatrix} x_{1t} \\ x_{2t} \end{bmatrix}. \tag{7.7}$$

Use $A = Z T Z^H$ in (7.2). Then, premultiply with the non-singular matrix $Z^H$ (“no information is lost,” that is, we get an equivalent system), and use (7.7) and (7.3)

$$\begin{aligned} Z^H \mathrm{E}_t \begin{bmatrix} x_{1t+1} \\ x_{2t+1} \end{bmatrix} &= Z^H Z T Z^H \begin{bmatrix} x_{1t} \\ x_{2t} \end{bmatrix} && \text{(use } A = Z T Z^H \text{ in (7.2))} \\ \mathrm{E}_t \begin{bmatrix} \theta_{t+1} \\ \delta_{t+1} \end{bmatrix} &= Z^H Z T \begin{bmatrix} \theta_t \\ \delta_t \end{bmatrix} && \text{(from (7.7))} \\ &= \begin{bmatrix} T_{\theta\theta} & T_{\theta\delta} \\ 0 & T_{\delta\delta} \end{bmatrix} \begin{bmatrix} \theta_t \\ \delta_t \end{bmatrix} && \text{(since } Z^H Z = I\text{).} \end{aligned} \tag{7.8}$$

7.3.2 Solving the System (7.8) of $\mathrm{E}_t\theta_{t+1}$ and $\mathrm{E}_t\delta_{t+1}$

Since $T_{\delta\delta}$ contains the roots outside the unit circle, $\delta_t$ will diverge as $t$ increases unless $\delta_0 = 0$. Any stable solution will therefore require that $\delta_t = 0$ for all $t$. The system (7.8) can therefore be written as

$$\delta_t = 0, \text{ and } \mathrm{E}_t\theta_{t+1} = T_{\theta\theta}\theta_t. \tag{7.9}$$

7.3.3 Initial Values of $\theta_0$

Invert (7.7) and partition as

$$\begin{bmatrix} x_{1t} \\ x_{2t} \end{bmatrix} = \begin{bmatrix} Z_{k\theta} & Z_{k\delta} \\ Z_{\lambda\theta} & Z_{\lambda\delta} \end{bmatrix} \begin{bmatrix} \theta_t \\ \delta_t \end{bmatrix} = \begin{bmatrix} Z_{k\theta} \\ Z_{\lambda\theta} \end{bmatrix}\theta_t, \tag{7.10}$$

since $\delta_t = 0$. The initial conditions are that $x_{10}$ is given. From (7.10) we have

$$x_{10} = Z_{k\theta}\theta_0, \tag{7.11}$$

which can be solved for $\theta_0$ if $Z_{k\theta}$ is invertible. It has $n_1$ rows (the number of predetermined/backward looking variables) and $n_\theta$ columns (as many as stable roots), so a necessary condition is that the number of stable roots equals the number of backward looking variables (Blanchard and Kahn (1980), proposition 1). If that is the case and $Z_{k\theta}$ is invertible, then

$$\theta_0 = Z_{k\theta}^{-1} x_{10}, \tag{7.12}$$

so we can calculate a unique stable solution by using (7.12) in (7.8) (and then transforming back to $x_{1t}$). This is developed below.

If the number of stable roots is less than the number of predetermined variables, $n_1$, then there is no stable solution. In contrast, if the number of stable roots is larger than the number of predetermined variables, $n_1$, then there is an infinite number of stable solutions. See Blanchard and Kahn (1980) for details.

Example 62 (Cagan model with a unique stable solution.) There is one predetermined variable (the exogenous $\ln M_t$) and the roots are $\rho$ and $1/\alpha$ (see Example 58 or 61), so if both $|\rho|$ and $|\alpha|$ are less than unity, then there is a unique stable solution, where $\theta_0 = Z_{k\theta}^{-1}\ln M_0$.

Example 63 (Cagan model with too many stable roots.) Consider the Cagan model in Example 53 again, but change the price equation to

$$\ln P_t = \ln M_t + a \mathrm{E}_t \ln P_{t+1}, \text{ with } a > 1. \tag{7.13}$$

The state-space form is

$$\begin{bmatrix} \ln M_{t+1} \\ \mathrm{E}_t \ln P_{t+1} \end{bmatrix} = \begin{bmatrix} \rho & 0 \\ -\frac{1}{a} & \frac{1}{a} \end{bmatrix} \begin{bmatrix} \ln M_t \\ \ln P_t \end{bmatrix} + \begin{bmatrix} \varepsilon_{t+1} \\ 0 \end{bmatrix}, \tag{7.14}$$

with the eigenvalues $\rho$ (still assuming $|\rho| < 1$) and $1/a$. To illustrate, suppose $|a\rho| < 1$. Then, iterating on the price equation gives the stable fundamental solution

$$\ln P_t^* = \sum_{s=0}^{\infty} a^s \mathrm{E}_t \ln M_{t+s} = \frac{1}{1-a\rho}\ln M_t.$$

However, the full set of solutions is $\ln P_t = \ln P_t^* + b_t$, where $b_t$ is a “bubble.” Try this in (7.13) to get

$$\begin{aligned} \frac{1}{1-a\rho}\ln M_t + b_t &= \ln M_t + a \mathrm{E}_t\left[\frac{1}{1-a\rho}\ln M_{t+1} + b_{t+1}\right] \\ &= \ln M_t + \frac{a\rho}{1-a\rho}\ln M_t + a \mathrm{E}_t b_{t+1}, \end{aligned}$$

which requires $b_t = a \mathrm{E}_t b_{t+1}$. That means that $\mathrm{E}_t b_{t+1} = b_t/a$. When $|a| < 1$ this means that the bubble is unstable, and we choose $b_t = 0$ to get an economically meaningful (stable) solution of the price level $\ln P_t = \ln P_t^* + b_t$. However, with $|a| > 1$, there is an infinity of stable bubbles (which all give a stable price level) and we have no good reason to choose one over another.

Example 64 (Cagan model with too many stable roots, continued.) The matrix in (7.14) can be decomposed in terms of the eigenvectors and eigenvalues

$$\begin{bmatrix} \rho & 0 \\ -\frac{1}{a} & \frac{1}{a} \end{bmatrix} = \begin{bmatrix} 1-a\rho & 0 \\ 1 & 1 \end{bmatrix} \begin{bmatrix} \rho & 0 \\ 0 & \frac{1}{a} \end{bmatrix} \begin{bmatrix} 1-a\rho & 0 \\ 1 & 1 \end{bmatrix}^{-1}.$$

When $|a| > 1$, then both eigenvalues are stable so (7.10) can be written

$$\begin{bmatrix} \ln M_t \\ \ln P_t \end{bmatrix} = \begin{bmatrix} 1-a\rho & 0 \\ 1 & 1 \end{bmatrix} \begin{bmatrix} \theta_{1t} \\ \theta_{2t} \end{bmatrix},$$

where $\theta_{1t}$ and $\theta_{2t}$ are the elements of the vector $\theta_t$. We can identify $\theta_{1t} = \ln M_t/(1-a\rho)$ from the first equation. The second equation then says that $\ln P_t = \ln M_t/(1-a\rho) + \theta_{2t}$, where by (7.9) $\mathrm{E}_t\theta_{2t+1} = \theta_{2t}/a$, so $\theta_{2t}$ is indeed a stable variable. However, beyond that we cannot say much about what it represents: this is the “bubble” discussed above.

Example 65 (Square, but singular, $Z_{k\theta}$.) It is possible that we have the right number of stable roots, but that $Z_{k\theta}$ is singular. This is a fairly odd case, but we cannot rule it out. For instance, consider a slight variation on the example by King and Watson (1998)

$$\begin{bmatrix} x_{1t+1} \\ x_{2t+1} \end{bmatrix} = \begin{bmatrix} 2 & 0 \\ \alpha & \frac{1}{2} \end{bmatrix} \begin{bmatrix} x_{1t} \\ x_{2t} \end{bmatrix}, \text{ with } x_{10} \text{ given.}$$

The model therefore has one stable root and one initial condition (it will turn out to be in the wrong place, however). The spectral decomposition is

$$\begin{bmatrix} 2 & 0 \\ \alpha & \frac{1}{2} \end{bmatrix} = \begin{bmatrix} 0 & 1 \\ 1 & \frac{2}{3}\alpha \end{bmatrix} \begin{bmatrix} \frac{1}{2} & 0 \\ 0 & 2 \end{bmatrix} \begin{bmatrix} 0 & 1 \\ 1 & \frac{2}{3}\alpha \end{bmatrix}^{-1}.$$

In this case (7.9) and (7.10) become

$$\delta_t = 0 \text{ and } \theta_{t+1} = \tfrac{1}{2}\theta_t, \quad \begin{bmatrix} x_{1t} \\ x_{2t} \end{bmatrix} = \begin{bmatrix} 0 \\ 1 \end{bmatrix}\theta_t.$$

The stable auxiliary variable, $\theta_t$, is not related to the variable with an initial condition, $x_{1t}$, so $Z_{k\theta}$ is indeed singular. It is clear this model cannot have a stable solution since the solution for the first variable must be $x_{1t} = 2^t x_{10}$.

Example 66 (Fixing Example 65.) Change the model to

$$\begin{bmatrix} x_{1t+1} \\ x_{2t+1} \end{bmatrix} = \begin{bmatrix} 2 & \beta \\ 0 & \frac{1}{2} \end{bmatrix} \begin{bmatrix} x_{1t} \\ x_{2t} \end{bmatrix}, \text{ with } x_{10} \text{ given.}$$

The spectral decomposition is

$$\begin{bmatrix} 2 & \beta \\ 0 & \frac{1}{2} \end{bmatrix} = \begin{bmatrix} -\frac{2}{3}\beta & 1 \\ 1 & 0 \end{bmatrix} \begin{bmatrix} \frac{1}{2} & 0 \\ 0 & 2 \end{bmatrix} \begin{bmatrix} -\frac{2}{3}\beta & 1 \\ 1 & 0 \end{bmatrix}^{-1}.$$

In this case (7.9) and (7.10) become

$$\delta_t = 0 \text{ and } \theta_{t+1} = \tfrac{1}{2}\theta_t, \quad \begin{bmatrix} x_{1t} \\ x_{2t} \end{bmatrix} = \begin{bmatrix} -\frac{2}{3}\beta \\ 1 \end{bmatrix}\theta_t.$$

If $\beta \neq 0$, then $Z_{k\theta}$ is non-singular and we have a unique stable solution. It is (since $x_{2t} = \theta_t$)

$$x_{1t+1} = \tfrac{1}{2}x_{1t} \text{ with } x_{10} \text{ given, and } x_{2t} = -\frac{3}{2\beta}x_{1t}.$$

This model has a unique stable solution when $\beta \neq 0$ since $x_{20}$ adjusts so that $x_{1t}$ will not explode. When $\beta = 0$, then $x_{1t}$ does not depend on $x_{20}$, so there is no possibility to put the system on a stable path (recall that $x_{1t}$ has an inherent tendency of being unstable).

7.3.4 Putting the Innovations Back

The evolution of the deterministic system is (7.9) with (7.12) as starting values; (7.10) shows how to transform to expected values of $x_{1t}$ and $x_{2t}$.

From the model in state-space form (7.1) we know that $x_{1t+1} - \mathrm{E}_t x_{1t+1} = \varepsilon_{t+1}$. Using (7.10) to rewrite this gives

$$Z_{k\theta}(\theta_{t+1} - \mathrm{E}_t\theta_{t+1}) = \varepsilon_{t+1}, \tag{7.15}$$

which under the same conditions as above can be inverted

$$\theta_{t+1} = \mathrm{E}_t\theta_{t+1} + Z_{k\theta}^{-1}\varepsilon_{t+1}. \tag{7.16}$$

Combined with (7.9) we have

$$\theta_{t+1} = T_{\theta\theta}\theta_t + Z_{k\theta}^{-1}\varepsilon_{t+1}, \tag{7.17}$$

which with (7.12) and (7.10) is a complete solution of the stochastic model.

7.3.5 Dynamics in Terms of $x_{1t}$ and $x_{2t}$

Using $\theta_t = Z_{k\theta}^{-1}x_{1t}$ from (7.10) in (7.17) gives

$$x_{1t+1} = Z_{k\theta}T_{\theta\theta}Z_{k\theta}^{-1}x_{1t} + \varepsilon_{t+1}. \tag{7.18}$$

Similarly, combining $x_{2t} = Z_{\lambda\theta}\theta_t$ with $\theta_t = Z_{k\theta}^{-1}x_{1t}$ (both from (7.10)) gives

$$x_{2t} = Z_{\lambda\theta}Z_{k\theta}^{-1}x_{1t}. \tag{7.19}$$

We can summarize these as

$$x_{1t+1} = M x_{1t} + \varepsilon_{t+1}, \text{ and} \tag{7.20}$$

$$x_{2t} = C x_{1t}, \tag{7.21}$$

where the definitions of the $M$ and $C$ matrices are clear from comparing with (7.18) and (7.19).

Example 67 (Solving the Cagan model.) From the Cagan model in Example 58, and (7.19) we have

$$\ln P_t = Z_{\lambda\theta}Z_{k\theta}^{-1}\ln M_t \approx -0.673\,(-0.74)^{-1}\ln M_t \approx 0.909\ln M_t \quad \text{(the exact answer is } \tfrac{10}{11}\ln M_t\text{),}$$

and (7.18) recovers the AR(1) of money supply

$$\ln M_{t+1} = Z_{k\theta}T_{\theta\theta}Z_{k\theta}^{-1}\ln M_t + \varepsilon_{t+1} = -0.74 \times 0.9 \times (-0.74)^{-1}\ln M_t + \varepsilon_{t+1} = 0.9\ln M_t + \varepsilon_{t+1}.$$

7.4 Time Series Representation∗

This section summarizes some useful tools for calculating unconditional variances from a VAR(1) model. Consider the VAR(1)

$$y_{t+1} = M y_t + \varepsilon_{t+1}, \tag{7.22}$$

where $\varepsilon_{t+1}$ has the covariance matrix $\Omega$. We know that the impulse response function is

$$y_t = \varepsilon_t + M\varepsilon_{t-1} + M^2\varepsilon_{t-2} + \ldots, \tag{7.23}$$

which immediately gives the unconditional covariance matrix of $y_t$ as (since $\mathrm{E}\,\varepsilon_t\varepsilon_{t-s}' = 0$ if $s \neq 0$)

$$\mathrm{Cov}(y_t) = \Omega + M\Omega M' + M^2\Omega (M^2)' + \ldots \tag{7.24}$$

This is easily calculated by iterating until convergence on $\mathrm{Cov}(y_t)$ in

$$\mathrm{Cov}(y_t) = \Omega + M\,\mathrm{Cov}(y_t)\,M'. \tag{7.25}$$

The iteration could start by setting $\mathrm{Cov}(y_t)$ on the right hand side to a matrix of zeros. (An exact formula exists, but it usually gives longer computation time and less accuracy.)

Example 68 (Why (7.25) works) Consider iterating on $A_{s+1} = B + C A_s C'$ by starting from $A_0 = 0$. The first iteration gives $A_1 = B$, the second $A_2 = B + C A_1 C' = B + CBC'$, the third $A_3 = B + C A_2 C' = B + C(B + CBC')C' = B + CBC' + CCBC'C'$, and so forth. Continue until $A_{s+1} \approx A_s$.

Now, consider a linear combination of $y_t$

$$z_t = C y_t. \tag{7.26}$$

It is straightforward to calculate the impulse responses of $z_t$ by combining (7.23) and (7.26). It is also straightforward to calculate the unconditional covariance matrix of $z_t$ as

$$\mathrm{Cov}(z_t) = C\,\mathrm{Cov}(y_t)\,C'. \tag{7.27}$$

A Menu of Different Policy Rules

8.1 A “Simple” Policy Rule

Reference: Currie and Levine (1993) and Söderlind (1999)

We now change (7.1) by adding an effect of a vector of policy instruments, $u_t$,

$$\begin{bmatrix} x_{1t+1} \\ \mathrm{E}_t x_{2t+1} \end{bmatrix} = A \begin{bmatrix} x_{1t} \\ x_{2t} \end{bmatrix} + B u_t + \begin{bmatrix} \varepsilon_{t+1} \\ 0_{n_2\times 1} \end{bmatrix}, \tag{8.1}$$

with $x_{10}$ given. The policy instrument is assumed to be a linear function of $x_{1t}$ and $x_{2t}$ (this might force you to change the definition of $x_{1t}$; you can always add variables)

$$u_t = -F \begin{bmatrix} x_{1t} \\ x_{2t} \end{bmatrix}. \tag{8.2}$$

Substituting for $u_t$ in (8.1) gives

$$\begin{bmatrix} x_{1t+1} \\ \mathrm{E}_t x_{2t+1} \end{bmatrix} = (A - BF) \begin{bmatrix} x_{1t} \\ x_{2t} \end{bmatrix} + \begin{bmatrix} \varepsilon_{t+1} \\ 0_{n_2\times 1} \end{bmatrix}, \tag{8.3}$$

which is on the same form as (7.1), but where $A - BF$ replaces $A$. We can therefore apply the solution algorithm above.

8.2 Optimal Policy under Commitment

Reference: Currie and Levine (1993), Backus and Driffil (1986), Svensson (1994), and Söderlind (1999)

Let

$$x_t = \begin{bmatrix} x_{1t} \\ x_{2t} \end{bmatrix}, \tag{8.4}$$

where $x_{1t}$ is an $n_1 \times 1$ vector of “backward looking” variables and $x_{2t}$ an $n_2 \times 1$ vector of “forward looking” variables. Let $n = n_1 + n_2$, so $x_t$ is an $n \times 1$ vector. The problem is to minimize the loss function

$$J_0 = \mathrm{E}_0 \sum_{t=0}^{\infty} \beta^t \begin{bmatrix} x_t' & u_t' \end{bmatrix} \begin{bmatrix} Q & U \\ U' & R \end{bmatrix} \begin{bmatrix} x_t \\ u_t \end{bmatrix}, \tag{8.5}$$

by choosing an optimal sequence of the $k \times 1$ vector of policy instruments, $\{u_t\}_{t=0}^{\infty}$. The constraints are

$$\begin{bmatrix} x_{1t+1} \\ \mathrm{E}_t x_{2t+1} \end{bmatrix} = A \begin{bmatrix} x_{1t} \\ x_{2t} \end{bmatrix} + B u_t + \begin{bmatrix} \varepsilon_{t+1} \\ 0_{n_2\times 1} \end{bmatrix}, \text{ or} \tag{8.6}$$

$$x_{t+1} = A x_t + B u_t + \xi_{t+1}, \text{ and } x_{10} \text{ given,} \tag{8.7}$$

where $\xi_{t+1} = (\varepsilon_{t+1}, x_{2t+1} - \mathrm{E}_t x_{2t+1})$. The second part of $\xi_{t+1}$ are the innovations in $x_{2t+1}$, which have to be functions of $\varepsilon_{t+1}$, but that is something we will return to later. Form the Lagrangian

$$L = \mathrm{E}_0 \sum_{t=0}^{\infty} \beta^t \left[ x_t'Qx_t + 2x_t'Uu_t + u_t'Ru_t + 2\rho_{t+1}'(Ax_t + Bu_t + \xi_{t+1} - x_{t+1}) \right]. \tag{8.8}$$

The $k$ first order conditions for $u_t$ are

$$-B'\mathrm{E}_t\rho_{t+1} = U'x_t + Ru_t. \tag{8.9}$$

The $n$ first order conditions for $x_t$ are

$$\beta A'\mathrm{E}_t\rho_{t+1} = \rho_t - \beta Qx_t - \beta Uu_t. \tag{8.10}$$

We can write (8.7), (8.10), and (8.9) as

$$\begin{bmatrix} I_n & 0_{n\times k} & 0_{n\times n} \\ 0_{n\times n} & 0_{n\times k} & \beta A' \\ 0_{k\times n} & 0_{k\times k} & -B' \end{bmatrix} \begin{bmatrix} x_{t+1} \\ u_{t+1} \\ \mathrm{E}_t\rho_{t+1} \end{bmatrix} = \begin{bmatrix} A & B & 0_{n\times n} \\ -\beta Q & -\beta U & I_n \\ U' & R & 0_{k\times n} \end{bmatrix} \begin{bmatrix} x_t \\ u_t \\ \rho_t \end{bmatrix} + \begin{bmatrix} \xi_{t+1} \\ 0_{n\times 1} \\ 0_{k\times 1} \end{bmatrix}. \tag{8.11}$$

8.2.1 Initial Conditions

(See Currie and Levine (1993) p 171.) We have $n_1$ initial conditions from the predetermined $x_{10}$, and $n_2$ from $\rho_{20} = 0_{n_2\times 1}$. The control variables in $u_t$ (a $k \times 1$ vector) should belong to the forward looking variables, since we have no initial value for them. Note that $\rho_{2t}$ will typically be non-zero, except in the initial period. This can be interpreted as if the policy maker in $t = 0$ exploits the fact that private sector expectations were formed in $t < 0$ (which still influence today’s economy, for instance, through capital stocks and prices determined in previous periods). In fact, there is always a temptation to exploit this, that is, to set policy in such a way that $\rho_{2t} = 0$, but the commitment rules this out, except for the initial period. The “timeless perspective,” advocated by Woodford (1999), is essentially to use the policy rule that comes out from solving (8.11), but only from some period $t > 0$ where $\rho_{2t}$ is set to some non-zero value such that the policy is stationary.

8.2.2 Solution

It can be shown that the solution is on the form

$$\begin{bmatrix} x_{1t+1} \\ \rho_{2t+1} \end{bmatrix} = M \begin{bmatrix} x_{1t} \\ \rho_{2t} \end{bmatrix} + \begin{bmatrix} \varepsilon_{t+1} \\ 0_{n_2\times 1} \end{bmatrix}, \text{ and} \tag{8.12}$$

$$\begin{bmatrix} x_{2t} \\ u_t \\ \rho_{1t} \end{bmatrix} = C \begin{bmatrix} x_{1t} \\ \rho_{2t} \end{bmatrix}. \tag{8.13}$$

Equations (8.12) and (8.13), together with the initial values of $x_{10}$ and $\rho_{20} = 0_{n_2\times 1}$, give a complete description of the evolution of the system.

8.3 Discretionary Solution

References: Currie and Levine (1993), Backus and Driffil (1986), Oudiz and Sachs (1985), Svensson (1994), and Söderlind (1999)

The model is the same as in the commitment case, but the policy maker cannot commit to a policy rule. Instead, the policy maker reoptimizes every period.

We can find a stationary policy rule if we let the time horizon go to infinity. The state of the economy is given by the predetermined variables, $x_{1t}$. As a consequence, the decision rule and the non-predetermined variables, $x_{2t}$, must be linear functions of $x_{1t}$ ($u_t = -Fx_{1t}$, and $x_{2t} = Cx_{1t}$, respectively, in the stationary equilibrium).

The policy maker takes the expectations of private agents as given (“Nash equilibrium”, unlike the commitment case where the policy maker is a “Stackelberg leader”). From above it is $\mathrm{E}_t x_{2t+1} = C\,\mathrm{E}_t x_{1t+1}$.

No closed form solution exists, not even a proof (except in the scalar case) of convergence of the solution algorithm. The solution algorithm is backwards recursive, starting from a distant period in time, $T$. First, we find the optimal policy for $T$, then for $T-1$ (incorporating the knowledge of how policy in $T$ will be set), etc. The recursion continues until the policy rule has converged.

The solution is then on the form

$$x_{1t+1} = M x_{1t} + \varepsilon_{t+1}, \tag{8.14}$$

$$u_t = -F x_{1t}, \tag{8.15}$$

$$x_{2t} = C x_{1t}. \tag{8.16}$$

Bibliography

Backus, D., and J. Driffil, 1986, “The Consistency of Optimal Policy in Stochastic Rational Expectations Models,” CEPR Discussion Paper 124.

Blanchard, O. J., and C. M. Kahn, 1980, “The Solution to Linear Difference Models under Rational Expectations,” Econometrica, 5, 1305–1311.

Currie, D., and P. Levine, 1993, Rules, Reputation and Macroeconomic Policy Coordination, Cambridge University Press.

Golub, G. H., and C. F. van Loan, 1989, Matrix Computations, The Johns Hopkins University Press, Baltimore, MD, 2nd edn.

King, R. G., and M. W. Watson, 1998, “The Solution of Singular Linear Difference Systems under Rational Expectations,” International Economic Review, 39, 1015–1026.

Klein, P., 2000, “Using the Generalized Schur Form to Solve a Multivariate Linear Rational Expectations Model,” Journal of Economic Dynamics and Control, 24, 1405–1423.

Oudiz, G., and J. Sachs, 1985, “International Policy Coordination in Dynamic Macroeconomic Models,” in Willem H. Buiter, and Richard C. Marston (ed.), International Economic Policy Coordination, Cambridge University Press, Cambridge.

Söderlind, P., 1999, “Solution and Estimation of RE Macromodels with Optimal Policy,” European Economic Review, 43, 813–823.

Svensson, L. E. O., 1994, “Why Exchange Rate Bands?
Monetary Independence in Spite of Fixed Exchange Rates,” Journal of Monetary Economics, 33, 157–199.

Woodford, M., 1999, “Commentary: How Should Monetary Policy Be Conducted in an Era of Price Stability?,” in New Challenges for Monetary Policy, Federal Reserve Bank of Kansas City.

Estimation of New Keynesian Models

9.1 “New Keynesian Economics and the Phillips Curve” by Roberts (JMCB 1995)

Rotemberg or Calvo pricing gives

$$\Delta p_t - \mathrm{E}_t\Delta p_{t+1} = c_0 + \gamma y_t + \epsilon_t. \tag{9}$$

From theory: $\gamma > 0$; $\epsilon_t$ could be serially correlated and correlated with $y_t$; $y_t$ is the output gap.

Part of $\epsilon_t$ could be real oil prices

$$\Delta p_t - \mathrm{E}_t\Delta p_{t+1} = c_0 + \gamma y_t + c_1\,rpoil_t + c_2\,rpoil_{t-1} + \epsilon_t.$$

Tried unemployment rate ($-\gamma RU_t$) instead of output gap ($+\gamma y_t$).

9.1.1 How to Measure $\mathrm{E}_t\Delta p_{t+1}$?

Survey data (Michigan, Livingston).

RE/Ex post data (McCallum): use $\Delta p_{t+1} = \mathrm{E}_t\Delta p_{t+1} + u_{t+1}$ to substitute $\mathrm{E}_t\Delta p_{t+1} = \Delta p_{t+1} - u_{t+1}$, which gives

$$\Delta p_t - \Delta p_{t+1} = c_0 + \gamma y_t + \epsilon_t - u_{t+1}.$$

$u_{t+1}$ is (under RE) uncorrelated with the regressor (dated $t$), but it adds extra noise.

Notice that having $-\mathrm{E}_t\Delta p_{t+1}$ on the LHS in (9) is crucial for the RE approach (at least if we were to use LS). Consider the alternative

$$\Delta p_t = \beta\mathrm{E}_t\Delta p_{t+1} + c_0 + \gamma y_t + \epsilon_t = \beta\Delta p_{t+1} + c_0 + \gamma y_t + \epsilon_t - \beta u_{t+1}.$$

Here the regressor $\Delta p_{t+1}$ contains $u_{t+1}$, so there is correlation of residual and regressor.

9.1.2 Data

Sample: annual data 1949–1990.
$\Delta p_t$: CPI, Dec to Dec.
$rpoil$: oil price/GNP deflator.
$y_t$: GNP deviations from deterministic trend (linear-quadratic).

9.1.3 Estimation Method

$\epsilon_t$ could be correlated with $y_t$ ⇒ IV (or rather 2SLS).
Instruments: $rpoil_t$, $rpoil_{t-1}$, $G_t$, $G_{t-1}$, dummy (democrat).

9.1.4 Results

$\gamma$ is reasonable (though not significant when using the RE approach): 0.2–0.4 (cf ≈ 0.6 in PS lecture notes).
Residuals autocorrelated (adjusted std, Newey-West?).
Subsample (/1973/) stability when using survey data, perhaps not with RE approach.
[See Table of the paper]

9.2 “Solution and Estimation of RE Macromodels with Optimal Monetary Policy” by Söderlind (EER 1999)

Simplified version of the model by Fuhrer-Moore

$$y_t = \alpha_1 y_{t-1} + \alpha_2 y_{t-2} + \alpha_r r_{t-1} + \varepsilon_{yt},$$

$$r_t = \frac{1}{41}\sum_{s=0}^{\infty}\left(\frac{40}{41}\right)^s \mathrm{E}_t(i_{t+s} - \pi_{t+1+s}).$$

The log price level is the average of the wage contracts still in effect

$$p_t = \theta_0 w_t + \theta_1 w_{t-1} + (1-\theta_0-\theta_1)w_{t-2}.$$

Nominal wage contracts are set for three periods: increasing in the price level and $y_t$ (coeff $\gamma$).

Loss function

$$L_t = \mathrm{E}_t\sum_{t=0}^{\infty}\beta^t\left[q_y y_t^2 + (1-q_y)\pi_t^2 + q_i i_t^2\right].$$

Söderlind (Journal of Policy Modeling, 2001) estimates the model on quarterly US data from the mid 1960s to the mid 1990s. $y_t$: log real GNP per capita detrended with a linear trend; $\pi_t$: CPI; $i_t$: T-bill rate ($i_{1,t}$).

Monte Carlo experiment [See Table of the paper]

The means of the estimates are generally close to the true values, even if the short sample exaggerates the effectiveness of monetary policy by making (i) output too responsive to real interest rates ($\alpha_r$ too low), and (ii) wage setting too responsive to output ($\gamma$ too high), and underestimates the willingness to use monetary policy ($q_i$ too high). These parameters, and also the weight on output in the loss function, $q_y$, have considerably higher standard deviations than predicted by asymptotic theory.

The correlations of the parameters “stabilize” the properties of the model: (a) $\alpha_1$ and $\alpha_2$ are strongly negatively correlated (-0.85), which keeps the autocorrelation of output relatively stable across simulations; (b) $\alpha_r$ and $q_i$ are negatively correlated (-0.65), which keeps the effect of
monetary policy on output and inflation relatively constant; (c) average contract length ($\theta_0 + 2\theta_1 + 3(1-\theta_0-\theta_1)$) and $\gamma$ are strongly positively correlated (0.83), which keeps the effect of a price shock on inflation and output relatively unchanged across simulations.

9.3 “Estimating The Euler Equation for Output” by Fuhrer and Rudebusch (JME 2004)

Euler equation for the output gap

$$y_t = \mathrm{E}_t y_{t+1} - \sigma(i_t - \mathrm{E}_t\pi_{t+1}) + \eta_t. \tag{1}$$

Correct only when $C = Y$. Empirical problems ⇒ hybrid model

$$y_t = \alpha_0 + \alpha_1 y_{t-1} + \alpha_2 y_{t-2} + \mu\mathrm{E}_{t-\tau}y_{t+1} - \beta\mathrm{E}_{t-\tau}\left[\frac{1}{\kappa}\sum_{j=0}^{\kappa-1}\left(i_{t+j+m} - \pi_{t+j+m+1}\right)\right]. \tag{2}$$

Lags of output, timing of expectations ($\tau$), long ($\kappa$) real interest rate (lagged?, $m$).

9.3.1 Data

1966Q1–2000Q4.
$i$: averages of FFR.
$\pi$: GDP deflator inflation.
$y$: different measures of the output gap.

9.3.2 Estimation Methods

ML: (a) Estimate VAR(4) of $\pi$ and $i$. Keep fixed. (b) Guess parameters in (2), solve for equilibrium, find residuals and use in likelihood function (SSE?); repeat with new parameters.

GMM (or rather IV/2SLS): Instruments: four lags of $y_t$, $\pi_t$, $i_t$ or four lags of defense spending, oil prices and dummy (democrat).

9.3.3 Results

ML: lags of output ($\alpha_1$, $\alpha_2$) large and significant; $\mu$ varies across specifications (perhaps ≈ 0), $\beta$ small whenever $\mu$ is large.

GMM: like ML, but $\mu$ tends to be larger.

9.3.4 Monte Carlo Simulations

Purpose: study the small sample properties of ML and GMM estimators of $\mu$ and $\beta$. Simulations made for different true DGPs.

ML: no small sample bias in either $\mu$ or $\beta$.

GMM: $\mu$ biased towards 0.5, $\beta$ downward biased when true $\mu$ is small. GMM is much less precise than ML. Weak instruments.

9.4 “New-Keynesian Models and Monetary Policy: A Reexamination of the Stylized Facts” by Söderström et al. (Scandinavian Journal of Economics, forthcoming)

Model from Rudebusch [See Tables 1,3 and and Figures and of the paper]

$$\pi_t = \mu_\pi\mathrm{E}_{t-1}\bar{\pi}_{t+3} + (1-\mu_\pi)\sum_{j=1}^{4}\alpha_{\pi j}\pi_{t-j} + \alpha_y y_{t-1} + \varepsilon_t, \tag{9.1}$$

$$y_t = \mu_y\mathrm{E}_{t-1}y_{t+1} + (1-\mu_y)\sum_{j=1}^{2}\beta_{yj}y_{t-j} - \beta_r\left(i_{t-1} - \mathrm{E}_{t-1}\bar{\pi}_{t+3}\right) + \eta_t, \tag{9.2}$$

$$\min_{\{i_t\}}\ \mathrm{Var}[\bar{\pi}_t] + \lambda\mathrm{Var}[y_t] + \nu\mathrm{Var}[\Delta i_t]. \tag{9.3}$$

$\{\lambda, \nu, \mu_\pi, \mu_y, \sigma_\pi, \sigma_y\}$ is found by minimizing the function

$$(\xi - \hat{\xi})'\hat{V}^{-1}(\xi - \hat{\xi}), \tag{9.4}$$

where $\xi$ are the moments in Table (std and autocorr), and $V$ their covariance matrix. Other parameters are taken from Rudebusch (Table 3).

Key findings:
a small concern for output stability ($\lambda$ low) but a large preference for interest rate smoothing ($\nu$ high);
a small degree of forward-looking behavior in price-setting ($\mu_\pi$ low) but a large degree of forward-looking behavior in the determination of output ($\mu_y$ high), which differs from Fuhrer and Rudebusch.

Analysis of why we get these results:
larger values of $\lambda$ (output stability) than in our estimated configuration imply too high volatility and persistence of inflation and too low volatility and persistence of output compared with U.S. data;
while small decreases in $\nu$ (interest rate stability) have little effect, a positive value for $\nu$ is crucial to avoid excessive volatility in the interest rate relative to the data;
larger values of $\mu_\pi$ than in our estimation imply too low volatility of output and the interest rate and too low persistence of inflation and output;
smaller values of $\mu_y$ lead to excessive volatility in the interest rate and inflation and excessive persistence in inflation.
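As a numerical companion to the Schur-based solution method of Section 7.3 and the covariance iteration of Section 7.4, the following sketch solves the Cagan model with $\alpha = 0.5$ and $\rho = 0.9$ and recovers the coefficient $10/11 \approx 0.909$ from Examples 54 and 67. It is a minimal illustration, not the author's code: the function names `solve_re_model` and `unconditional_cov` are made up for this example, a unit shock variance is assumed, and SciPy's `schur` with `sort="iuc"` is used as one convenient way to obtain a Schur form with the eigenvalues inside the unit circle ordered first.

```python
import numpy as np
from scipy.linalg import schur


def solve_re_model(A, n1):
    """Solve E_t-system (7.2) with n1 predetermined variables.

    Returns (M, C) with x1[t+1] = M x1[t] + eps[t+1] and x2[t] = C x1[t],
    following (7.18)-(7.21).
    """
    # Ordered complex Schur form: eigenvalues inside the unit circle come first.
    T, Z, n_stable = schur(A, output="complex", sort="iuc")
    if n_stable != n1:
        raise ValueError("need as many stable roots as predetermined variables")
    Z_k = Z[:n1, :n_stable]           # block Z_k_theta in (7.10)
    Z_l = Z[n1:, :n_stable]           # block Z_lambda_theta in (7.10)
    T_tt = T[:n_stable, :n_stable]    # block T_theta_theta in (7.6)
    Z_k_inv = np.linalg.inv(Z_k)      # raises LinAlgError if Z_k is exactly singular
    M = Z_k @ T_tt @ Z_k_inv
    C = Z_l @ Z_k_inv
    # For a real A the imaginary parts are rounding noise only.
    return M.real, C.real


def unconditional_cov(M, Omega, tol=1e-12, max_iter=100_000):
    """Iterate Cov = Omega + M Cov M' from a zero matrix, as in (7.25)."""
    cov = np.zeros_like(Omega, dtype=float)
    for _ in range(max_iter):
        new = Omega + M @ cov @ M.T
        if np.max(np.abs(new - cov)) < tol:
            return new
        cov = new
    raise RuntimeError("covariance iteration did not converge")


# Cagan model of Example 53 with alpha = 0.5 and rho = 0.9
alpha, rho = 0.5, 0.9
A = np.array([[rho, 0.0],
              [(alpha - 1.0) / alpha, 1.0 / alpha]])
M, C = solve_re_model(A, n1=1)

Omega = np.array([[1.0]])             # assumed variance of the money supply shock
cov_m = unconditional_cov(M, Omega)   # Var(ln M_t)
cov_p = C @ cov_m @ C.T               # Var(ln P_t), as in (7.27)
print(M, C, cov_m, cov_p)
```

The ratio $Z_{\lambda\theta}Z_{k\theta}^{-1}$ does not depend on the sign or phase that the Schur routine picks for the columns of $Z$, which is why the result matches the hand calculation in Example 67.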