Thomas Andren
Econometrics

© 2007 Thomas Andren & Ventus Publishing ApS
ISBN 978-87-7681-235-5
Download free books at BookBoon.com
In cooperation with www.beam-eBooks.de

Contents

1 Basics of probability and statistics
1.1 Random variables and probability distributions
1.1.1 Properties of probabilities
1.1.2 The probability function – the discrete case
1.1.3 The cumulative probability function – the discrete case
1.1.4 The probability function – the continuous case
1.1.5 The cumulative probability function – the continuous case
1.2 The multivariate probability distribution function
1.3 Characteristics of probability distributions
1.3.1 Measures of central tendency
1.3.2 Measures of dispersion
1.3.3 Measures of linear relationship
1.3.4 Skewness and kurtosis

2 Basic probability distributions in econometrics
2.1 The normal distribution
2.2 The t-distribution
2.3 The Chi-square distribution
2.4 The F-distribution

3 The simple regression model
3.1 The population regression model
3.1.1 The economic model
3.1.2 The econometric model
3.1.3 The assumptions of the simple regression model
3.2 Estimation of population parameters
3.2.1 The method of ordinary least squares
3.2.2 Properties of the least squares estimator

4 Statistical inference
4.1 Hypothesis testing
4.2 Confidence interval
4.2.1 P-value in hypothesis testing
4.3 Type I and type II errors
4.4 The best linear predictor

5 Model measures
5.1 The coefficient of determination (R2)
5.2 The adjusted coefficient of determination (Adjusted R2)
5.3 The analysis of variance table (ANOVA)

6 The multiple regression model
6.1 Partial marginal effects
6.2 Estimation of partial regression coefficients
6.3 The joint hypothesis test
6.3.1 Testing a subset of coefficients
6.3.2 Testing the regression equation

7 Specification
7.1 Choosing the functional form
7.1.1 The linear specification
7.1.2 The log-linear specification
7.1.3 The linear-log specification
7.1.4 The log-log specification
7.2 Omission of a relevant variable
7.3 Inclusion of an irrelevant variable
7.4 Measurement errors

8 Dummy variables
8.1 Intercept dummy variables
8.2 Slope dummy variables
8.3 Qualitative variables with several categories
8.4 Piecewise linear regression
8.5 Test for structural differences

9 Heteroskedasticity and diagnostics
9.1 Consequences of using OLS
9.2 Detecting heteroskedasticity
9.2.1 Graphical methods
9.2.2 Statistical tests
9.3 Remedial measures
9.3.1 Heteroskedasticity-robust standard errors

10 Autocorrelation and diagnostics
10.1 Definition and the nature of autocorrelation
10.2 Consequences
10.3 Detection of autocorrelation
10.3.1 The Durbin Watson test
10.3.2 The Durbin's h test statistic
10.3.3 The LM-test
10.4 Remedial measures
10.4.1 GLS with AR(1)
10.4.2 GLS with AR(2)

11 Multicollinearity and diagnostics
11.1 Consequences
11.2 Measuring the degree of multicollinearity
11.3 Remedial measures

12 Simultaneous equation models
12.1 Introduction
12.2 The structural and reduced form equation
12.3 Identification
12.3.1 The order condition of identification
12.3.2 The rank condition of identification
12.4 Estimation methods
12.4.1 Indirect Least Squares (ILS)
12.4.2 Two Stage Least Squares (2SLS)

A Statistical tables
A1 Area below the standard normal distribution
A2 Right tail critical values for the t-distribution
A3 Right tail critical value of the Chi-Square distribution
A4 Right tail critical for the F-distribution: percent level
1 Basics of probability and statistics

The purpose of this and the following chapter is to briefly go through the most basic concepts in probability theory and statistics that are important for you to understand. If these concepts are new to you, you should make sure that you have an intuitive feeling for their meaning before you move on to the following chapters in this book.

1.1 Random variables and probability distributions

The first important concept of statistics is that of a random experiment. It refers to any process of measurement that has more than one possible outcome and for which there is uncertainty about the result. That is, the outcome of the experiment cannot be predicted with certainty. Picking a card from a deck of cards, tossing a coin, or throwing a die are all examples of basic experiments.

The set of all possible outcomes of an experiment is called the sample space of the experiment. In the case of tossing a coin, the sample space consists of a head and a tail. If the experiment is to pick a card from a deck of cards, the sample space is all the different cards in that particular deck. Each outcome in the sample space is called a sample point.

An event is a collection of outcomes, that is, a subset of the sample space of an experiment performed under the same conditions. Two events are mutually exclusive if the occurrence of one event precludes the occurrence of the other at the same time. Equivalently, two events that have no outcomes in common are mutually exclusive. For example, if you roll a pair of dice, the event of rolling a total of 6 and the event of rolling a double have the outcome (3,3) in common. These two events are therefore not mutually exclusive.

Events are said to be collectively exhaustive if they exhaust all possible outcomes of an experiment. For example, when rolling a die, the outcomes 1, 2, 3, 4, 5, and 6 are collectively exhaustive, because they encompass the entire range of possible outcomes. Hence, the set of all possible die rolls is both mutually exclusive and collectively exhaustive. Two single outcomes, such as 1 and 6, are mutually exclusive but not collectively exhaustive, and the events "even" and "not-6" are collectively exhaustive but not mutually exclusive.

Even though the outcomes of any experiment can be described verbally, as above, it is much easier if the results of all experiments can be described numerically. For that purpose we introduce the concept of a random variable. A random variable is a function that assigns a unique numerical value to every possible outcome of a random experiment.

By convention, random variables are denoted by capital letters, such as X, Y, Z, etc., and the values taken by the random variables are denoted by the corresponding small letters x, y, z, etc. A random variable from an experiment can be either discrete or continuous. A random variable is discrete if it can assume only a finite (or countably infinite) number of numerical values. For example, the result of a test with 10 questions can be 0, 1, 2, …, 10. In this case the discrete random variable represents the test result. Other examples are the number of household members, or the number of copy machines sold on a given day. Whenever we talk about random variables expressed in countable units we have a discrete random variable. However, when the number of units can be very large, the distinction between a discrete and a continuous variable becomes vague, and it can be unclear which of the two it is.

A random variable is said to be continuous when it can assume any value in an interval. In theory that implies an infinite number of values. In practice, however, every measurement is recorded with finite precision. Time is a variable that can be measured in very small units and go on for a very long time, and is therefore a continuous variable. Variables related to time, such as age, are therefore also considered to be continuous variables. Economic variables such as GDP, money supply or government spending are measured in units of the local currency, so in some sense one could see them as discrete random variables. However, the values are usually very large, so counting each euro or dollar would serve no purpose. It is therefore more convenient to assume that these measures can take any real number, which makes them continuous.

Since the value of a random variable is unknown until the experiment has taken place, a probability of its occurrence can be attached to it. In order to measure the probability of a given event, the following formula may be used:

$$P(A) = \frac{\text{The number of ways event } A \text{ can occur}}{\text{The total number of possible outcomes}} \qquad (1.1)$$

This formula is valid if an experiment can result in n mutually exclusive and equally likely outcomes, and if m of these outcomes are favorable to event A. Hence, the corresponding probability is calculated as the ratio of the two measures, m/n, as stated in the formula. This formula follows the classical definition of a probability.

Example 1.1
You would like to know the probability of receiving a particular face, say a 6, when you toss a die. The sample space for a die is {1, 2, 3, 4, 5, 6}, so the total number of possible outcomes is 6. You are interested in one of them. Hence the corresponding probability equals 1/6.

Example 1.2
You would like to know the probability of receiving a sum of 7 when rolling two dice. First we have to find the total number of unique outcomes using two dice. By forming all possible combinations of pairs we have (1,1), (1,2), …, (5,6), (6,6), which gives 36 unique outcomes. How many of them sum to 7?
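The counting behind the classical definition (1.1) can also be checked by brute-force enumeration. The following is a minimal sketch, assuming fair six-sided dice; the helper name `classical_probability` and the choice of the face 6 for the single die are illustrative assumptions, not part of the text.

```python
from fractions import Fraction
from itertools import product


def classical_probability(sample_space, is_favorable):
    """Classical definition (1.1): favorable outcomes m over all outcomes n.

    Hypothetical helper; it assumes every outcome in sample_space is
    equally likely, which is exactly what the classical definition requires.
    """
    outcomes = list(sample_space)
    favorable = sum(1 for outcome in outcomes if is_favorable(outcome))
    return Fraction(favorable, len(outcomes))


die = range(1, 7)  # sample space of one die: {1, ..., 6}

# One die: probability of a single chosen face (6 is an arbitrary choice).
p_single = classical_probability(die, lambda face: face == 6)

# Two dice: all ordered pairs (first die, second die).
two_dice = list(product(die, repeat=2))
p_sum7 = classical_probability(two_dice, lambda pair: sum(pair) == 7)

print(len(two_dice), p_single, p_sum7)
```

Exact fractions are used instead of floating point so that the ratio m/n is reported without rounding.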
We have (1,6), (2,5), (3,4), (4,3), (5,2), (6,1), which gives six combinations. Hence, the corresponding probability is 6/36 = 1/6.

The classical definition requires that the sample space is finite and that each outcome in the sample space is equally likely to appear. Those requirements are sometimes difficult to satisfy. We therefore need a more flexible definition that handles those cases. Such a definition is the so-called relative frequency definition of probability, or the empirical definition. Formally, if in n trials m of them are favorable to the event A, then P(A) is the ratio m/n as n goes to infinity, or, in practice, as n becomes sufficiently large.

Example 1.3
Let us say that we would like to know the probability of receiving a sum of 7 when rolling two dice, but we do not know if our two dice are fair. That is, we do not know if the outcome for each die is equally likely. We could then perform an experiment where we toss two dice repeatedly and calculate the relative frequency. In Table 1.1 we report the results for the sums from 2 to 7 for different numbers of trials.

Table 1.1 Relative frequencies for different number of trials

| Sum | 100 trials | 1000 trials | 10000 trials | 100000 trials | 1000000 trials | Fair dice |
|-----|------------|-------------|--------------|---------------|----------------|-----------|
| 2   | 0.02       | 0.021       | 0.0274       | 0.0283        | 0.0278         | 0.02778   |
| 3   | 0.02       | 0.046       | 0.0475       | 0.0565        | 0.0555         | 0.05556   |
| 4   | 0.07       | 0.09        | 0.0779       | 0.0831        | 0.0838         | 0.08333   |
| 5   | 0.12       | 0.114       | 0.1154       | 0.1105        | 0.1114         | 0.11111   |
| 6   | 0.17       | 0.15        | 0.1389       | 0.1359        | 0.1381         | 0.13889   |
| 7   | 0.17       | 0.15        | 0.1411       | 0.1658        | 0.1669         | 0.16667   |

For 10 trials, five of the six relative frequencies are available: 0.1, 0.1, 0.2, 0.1, 0.2.

Table 1.1 gives a picture of how many trials we need before we can say that the number of trials is sufficiently large. For this particular experiment one million trials would be sufficient to obtain a correct measure to about the third decimal place. It seems that our two dice are fair, since the relative frequencies converge to the probabilities implied by fair dice.

1.1.1 Properties of probabilities

When working with probabilities it is important to understand some of their most basic properties. Below we briefly discuss the most basic properties.

$0 \le P(A) \le 1$. A probability can never be larger than 1 or smaller than 0 by definition.

If the events A, B, … are mutually exclusive we have that $P(A \cup B \cup \dots) = P(A) + P(B) + \dots$

Model measures

$$Y_i - \bar{Y} = \underbrace{(\hat{Y}_i - \bar{Y})}_{\text{Explained}} + \underbrace{(Y_i - \hat{Y}_i)}_{\text{Unexplained}} \qquad (5.1)$$

We have to remember that we try to explain the deviation from the mean value of Y using the regression model. Hence, the difference between the predicted value $\hat{Y}_i$ and the mean value $\bar{Y}$ is denoted the explained part of the mean difference. The remaining part is denoted the unexplained part. With this simple trick we have decomposed the mean difference for a single observation. We must now transform (5.1) into an expression that is valid for the whole sample, that is, for all observations. We do that by squaring and summing over all n observations:

$$\sum_{i=1}^{n} (Y_i - \bar{Y})^2 = \sum_{i=1}^{n} \left[ (\hat{Y}_i - \bar{Y}) + (Y_i - \hat{Y}_i) \right]^2 = \sum_{i=1}^{n} \left[ (\hat{Y}_i - \bar{Y})^2 + (Y_i - \hat{Y}_i)^2 + 2 (\hat{Y}_i - \bar{Y})(Y_i - \hat{Y}_i) \right]$$

It is possible to show that the sum of the last term on the right hand side equals zero. With that knowledge we may write:

$$\sum_{i=1}^{n} (Y_i - \bar{Y})^2 = \sum_{i=1}^{n} (\hat{Y}_i - \bar{Y})^2 + \sum_{i=1}^{n} (Y_i - \hat{Y}_i)^2$$

...

2 Basic probability distributions in econometrics

In the previous chapter we...
Unfortunately this integral has no closed form solution.
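An integral without a closed form can still be evaluated numerically. The sketch below assumes that the integral in question is the probability integral of the normal density; that reading is an assumption based on the chapter topic, not something stated in the sentence above. It compares Python's built-in error function with a composite Simpson's rule applied to the density. The function names are illustrative.

```python
import math


def normal_pdf(x, mu=0.0, sigma=1.0):
    """Density of a normal variable with mean mu and standard deviation sigma."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


def normal_cdf_erf(x, mu=0.0, sigma=1.0):
    """P(X <= x) expressed through the error function math.erf."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))


def normal_prob_simpson(a, b, mu=0.0, sigma=1.0, n=1000):
    """P(a <= X <= b) by composite Simpson's rule with an even number n of subintervals."""
    if n % 2:
        n += 1  # Simpson's rule needs an even number of subintervals
    h = (b - a) / n
    total = normal_pdf(a, mu, sigma) + normal_pdf(b, mu, sigma)
    for k in range(1, n):
        weight = 4 if k % 2 else 2  # interior weights alternate 4, 2, 4, ...
        total += weight * normal_pdf(a + k * h, mu, sigma)
    return total * h / 3.0


# Assumed illustration: probability mass within 1.96 standard deviations of the mean.
via_erf = normal_cdf_erf(1.96) - normal_cdf_erf(-1.96)
via_simpson = normal_prob_simpson(-1.96, 1.96)
print(round(via_erf, 4), round(via_simpson, 4))
```

Both routes approximate the same area under the density; printed statistical tables serve the same purpose for the standardized case.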