Linear Regression Basics



CHAPTER 1: Basic Concepts of Regression Analysis
Prof. Alan Wan

Table of contents

1. Introduction
2. Approaches to Line Fitting
3. The Least Squares Approach
4. Linear Regression as a Statistical Model
5. Multiple Linear Regression and Matrix Formulation

Introduction

Regression analysis is a statistical technique used to describe relationships among variables. The simplest case to examine is one in which a variable Y, referred to as the dependent or target variable, may be related to one variable X, called an independent or explanatory variable, or simply a regressor.

If the relationship between Y and X is believed to be linear, then the equation for a line may be appropriate: Y = β1 + β2 X, where β1 is an intercept term and β2 is a slope coefficient. In simplest terms, the purpose of regression is to try to find the best-fit line or equation that expresses the relationship between Y and X.

Consider the following data points. [Fig 1.1: scatter plot of the (x, y) pairs, which lie exactly on a line.]

Regression analysis is not needed to obtain the equation that describes Y and X, because it is readily seen that Y = … + 2X. This is an exact or deterministic relationship.

Deterministic relationships are sometimes (although very rarely) encountered in business environments. For example, in accounting:

    assets = liabilities + owner equity
    total costs = fixed costs + variable costs

In business and other social science disciplines, deterministic relationships are the exception rather than the norm.
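The idea of a best-fit line can be previewed with a short numeric sketch. This is illustrative only: the data below are hypothetical points scattered around a line with slope 2 (they are not the figures from the slides), and the slope/intercept formulas used are the standard least-squares ones developed later in the chapter.

```python
import numpy as np

# Hypothetical (x, y) pairs that are "broadly linear" but not exact
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
y = np.array([3.1, 4.9, 7.2, 8.8, 11.1, 12.9])

# Least-squares slope and intercept for the fitted line Y = b1 + b2*X
b2 = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
b1 = y.mean() - b2 * x.mean()

print(b1, b2)  # intercept near 1, slope near 2 for this data
```

Because the points do not lie exactly on a line, the fit recovers the underlying slope and intercept only approximately, which is precisely the situation regression analysis is designed for.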
Data encountered in a business environment are more likely to appear like the data points in Fig 1.2, where Y and X largely obey an approximately linear relationship, but it is not an exact relationship. [Fig 1.2: scatter plot of broadly linear data.]

Still, it may be useful to describe the relationship in equation form, expressing Y as a function of X alone: the equation can then be used for forecasting and policy analysis, allowing for the existence of errors (since the relationship is not exact). So how do we fit a line to describe the "broadly linear" relationship between Y and X when the (x, y) pairs do not all lie on a straight line?

Multiple Linear Regression

In practice, more often than not, Y is determined by more than one factor. In Example 1.2, for instance, size is rarely the only factor of importance in determining housing prices; obviously, other factors need to be considered. A regression that contains more than one explanatory variable is called a multiple regression model.

Example 1.3. Observations are available for twenty-five households on their annual total expenditure on non-durable goods and services (Y), annual disposable income (X2), and the stocks of liquid assets they hold (X3). The figures are in thousands of dollars. The regression model is therefore:

    yi = β1 + β2 xi2 + β3 xi3 + εi,    i = 1, ..., 25

Let b1, b2 and b3 be the estimators of β1, β2 and β3 respectively. Differentiating Σ_{i=1}^n ei² with respect to b1, b2 and b3 yields the following normal equations:

    ∂(Σ ei²)/∂b1 = −2 Σ_{i=1}^n (yi − b1 − b2 xi2 − b3 xi3) = 0    (3)
    ∂(Σ ei²)/∂b2 = −2 Σ_{i=1}^n xi2 (yi − b1 − b2 xi2 − b3 xi3) = 0    (4)
    ∂(Σ ei²)/∂b3 = −2 Σ_{i=1}^n xi3 (yi − b1 − b2 xi2 − b3 xi3) = 0    (5)

Equations (3), (4) and (5) can be solved for b1, b2 and b3, but the solutions in terms of ordinary algebra are messy, and their algebraic complexity increases as k increases. To work with the general linear model, it simplifies matters to use linear algebra notation.

The linear model in matrix form

When there are k coefficients and k − 1 explanatory variables, the complete set of n observations can be written in full as

    y1 = β1 + β2 x12 + β3 x13 + · · · + βk x1k + ε1
    y2 = β1 + β2 x22 + β3 x23 + · · · + βk x2k + ε2
        ⋮
    yn = β1 + β2 xn2 + β3 xn3 + · · · + βk xnk + εn

where xij denotes the i-th observation of the j-th explanatory variable.

In matrix notation, these equations can be written as

    [y1]   [1  x12  x13  ...  x1k] [β1]   [ε1]
    [y2] = [1  x22  x23  ...  x2k] [β2] + [ε2]
    [⋮ ]   [⋮   ⋮    ⋮         ⋮ ] [⋮ ]   [⋮ ]
    [yn]   [1  xn2  xn3  ...  xnk] [βk]   [εn]

Thus, Y = (y1, ..., yn)′ is an n × 1 vector containing all of the observations on the dependent variable, X is an n × k matrix containing all the observations on the explanatory variables (including the constant term), β = (β1, ..., βk)′ is a k × 1 vector of unknown coefficients we wish to estimate, and ε = (ε1, ..., εn)′ is an n × 1 vector of disturbances. In compact notation, our model therefore becomes

    Y = Xβ + ε

Let b = (b1, b2, ..., bk)′ be the O.L.S. estimator of β, and define the residual vector

    e = (e1, e2, ..., en)′ = Y − Xb

Note that Σ_{i=1}^n ei² = e′e. Thus,

    e′e = (Y − Xb)′(Y − Xb)
        = Y′Y − b′X′Y − Y′Xb + b′X′Xb
        = Y′Y − 2b′X′Y + b′X′Xb

Note that if a and b are k × 1 vectors, and A is a k × k symmetric matrix, then

    ∂(b′a)/∂b = ∂(a′b)/∂b = a    and    ∂(b′Ab)/∂b = 2Ab

Applying these results,

    ∂(e′e)/∂b = −2X′Y + 2X′Xb = 0,

leading to

    X′Xb = X′Y,  or  b = (X′X)⁻¹ X′Y,    (6)

provided that X′X is non-singular.

The O.L.S. residual vector is e = Y − Xb. Note that

    X′e = X′(Y − Xb) = X′Y − (X′X)(X′X)⁻¹ X′Y = 0,

implying that the residual vector is uncorrelated with each explanatory variable.

For the special case of a simple linear regression model,
    X′X = [ n    Σxi ]    and    X′Y = [ Σyi   ]
          [ Σxi  Σxi²]             [ Σxiyi ]

giving

    [ n    Σxi ] [b1]   [ Σyi   ]
    [ Σxi  Σxi²] [b2] = [ Σxiyi ]

or

    n b1 + b2 Σxi = Σyi
    b1 Σxi + b2 Σxi² = Σxiyi

as in equations (1) and (2).

Referring to Example 1.3,

    (X′X)⁻¹ X′y = [ n     Σxi2      Σxi3    ]⁻¹ [ Σyi    ]
                  [ Σxi2  Σxi2²     Σxi2xi3 ]   [ Σxi2yi ]
                  [ Σxi3  Σxi2xi3   Σxi3²   ]   [ Σxi3yi ]

giving

    [b1]   [ 25       4080.1    14379.1  ]⁻¹ [ 4082.34  ]
    [b2] = [ 4080.1   832146.8  2981925  ]   [ 801322.7 ]
    [b3]   [ 14379.1  2981925   11737267 ]   [ 2994883  ]

or

    [b1]   [  0.202454971  −0.001159287   0.000046500 ] [ 4082.34  ]   [ 36.789 ]
    [b2] = [ −0.001159287   0.000020048  −0.000003673 ] [ 801322.7 ] = [  0.332 ]
    [b3]   [  0.000046500  −0.000003673   0.000000961 ] [ 2994883  ]   [  0.125 ]

These estimates concur with the results produced by EXCEL shown below.

Output 1.3: SUMMARY OUTPUT

Regression Statistics
    Multiple R           0.891730934
    R Square             0.795184059
    Adjusted R Square    0.776564428
    Standard Error       38.43646324
    Observations         25

ANOVA
                  df    SS             MS         F          Significance F
    Regression    2     126186.6552    63093.33   42.70676   2.66073E-08
    Residual      22    32501.95754    1477.362
    Total         24    158688.6128

                    Coefficients   Standard Error   t Stat     P-value    Lower 95%      Upper 95%
    Intercept       36.79008153    17.29448828      2.127272   0.044851    0.923508064   72.65665499
    X Variable 1    0.33183046     0.172101064      1.928114   0.066847   −0.025085301   0.688746221
    X Variable 2    0.125785793    0.037687964      3.337559   0.002984    0.04762574    0.203945847

Hence the estimated regression equation is:

    ŷi = 36.79 + 0.3318 xi2 + 0.1258 xi3

A family with annual disposable income of $50,000 and liquid assets worth $100,000 is predicted to spend

    ŷ = 36.79 + 0.3318(50) + 0.1258(100) = 65.96

thousand dollars on non-durable goods and services in a year.
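The matrix arithmetic of Example 1.3 can be reproduced in NumPy. This is a sketch that takes the X′X and X′Y totals quoted on the slides as given; a linear solve is used in place of the explicit inverse, which is the numerically preferred way to apply equation (6).

```python
import numpy as np

# X'X and X'Y totals as quoted in Example 1.3 (figures in thousands of dollars)
XtX = np.array([[25.0,     4080.1,    14379.1],
                [4080.1,   832146.8,  2981925.0],
                [14379.1,  2981925.0, 11737267.0]])
XtY = np.array([4082.34, 801322.7, 2994883.0])

# Equation (6): b = (X'X)^{-1} X'Y, computed via a linear solve
b = np.linalg.solve(XtX, XtY)
print(b)  # close to the slides' values (36.789, 0.332, 0.126)

# Predicted spending for disposable income $50,000 and liquid assets $100,000
y_hat = b[0] + b[1] * 50 + b[2] * 100
print(round(y_hat, 2))  # close to the slides' 65.96 (thousand dollars)
```

Because the quoted totals are rounded, the solved coefficients agree with the Excel output only to the displayed precision.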

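Equation (6) and the orthogonality result X′e = 0 can also be verified on simulated data. The data-generating values below are invented purely for illustration; any dataset with a full-rank design matrix would do.

```python
import numpy as np

# A small simulated dataset (hypothetical numbers, for illustration only)
rng = np.random.default_rng(0)
n = 25
x2 = rng.uniform(50, 250, size=n)    # e.g. disposable income
x3 = rng.uniform(100, 900, size=n)   # e.g. liquid assets
eps = rng.normal(0, 5, size=n)
y = 30 + 0.3 * x2 + 0.1 * x3 + eps

# Design matrix X with a leading column of ones for the intercept
X = np.column_stack([np.ones(n), x2, x3])

# Equation (6): b = (X'X)^{-1} X'Y
b = np.linalg.solve(X.T @ X, X.T @ y)

# Residual vector e = Y - Xb satisfies X'e = 0 up to floating-point error
e = y - X @ b
print(np.allclose(X.T @ e, 0.0, atol=1e-6))  # True
```

The final check mirrors the derivation on the slides: since X′Xb = X′Y at the solution, X′e = X′Y − X′Xb vanishes, so the residuals are uncorrelated with every column of X.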
Posted: 09/09/2022, 19:48
