
Linear Regression Basics


Contents

1. Introduction
2. Approaches to Line Fitting
3. The Least Squares Approach
4. Linear Regression as a Statistical Model
5. Multiple Linear Regression and Matrix Formulation

CHAPTER 1: Basic Concepts of Regression Analysis
Prof. Alan Wan

Introduction

Regression analysis is a statistical technique used to describe relationships among variables. The simplest case to examine is one in which a variable Y, referred to as the dependent or target variable, may be related to one variable X, called an independent or explanatory variable, or simply a regressor.

If the relationship between Y and X is believed to be linear, then the equation for a line may be appropriate: Y = β1 + β2 X, where β1 is an intercept term and β2 is a slope coefficient. In simplest terms, the purpose of regression is to find the best-fit line or equation that expresses the relationship between Y and X.

Consider a small set of (x, y) data points whose graph appears as Fig. 1.1. [Fig. 1.1: scatter plot of the (x, y) pairs, all lying exactly on a straight line.]

Regression analysis is not needed to obtain the equation that describes Y and X, because it is readily seen that Y = … + 2X. This is an exact, or deterministic, relationship.

Deterministic relationships are sometimes (although very rarely) encountered in business environments. For example, in accounting:

assets = liabilities + owners' equity
total costs = fixed costs + variable costs

In business and other social science disciplines, deterministic relationships are the exception rather than the norm.
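The idea of recovering an exact linear relationship by least squares can be sketched numerically. This is a minimal sketch with hypothetical data: the points are assumed to lie exactly on the line Y = 2 + 2X (the intercept value 2 is an assumption for illustration; only the slope of 2 appears in the slides).

```python
# Hypothetical data assumed to satisfy the exact relationship Y = 2 + 2X.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0 + 2.0 * xi for xi in x]

n = len(x)
mean_x = sum(x) / n
mean_y = sum(y) / n

# Closed-form OLS estimates for the simple regression Y = b1 + b2*X:
# slope = sum of cross-deviations / sum of squared x-deviations.
b2 = (sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
      / sum((xi - mean_x) ** 2 for xi in x))
b1 = mean_y - b2 * mean_x

# For a deterministic relationship, the fit reproduces it exactly
# and every residual is zero (up to floating point).
residuals = [yi - (b1 + b2 * xi) for xi, yi in zip(x, y)]
print(b1, b2)                           # recovers intercept 2 and slope 2
print(max(abs(e) for e in residuals))   # essentially zero
```

When the points do not all lie on one line, the same formulas still apply; the residuals are then nonzero, which is the situation regression analysis is designed for.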
Data encountered in a business environment are more likely to appear like the data points in Fig. 1.2, where Y and X largely obey an approximately linear relationship, but not an exact one. [Fig. 1.2: scatter plot of (x, y) pairs falling close to, but not exactly on, a straight line.]

Still, it may be useful to describe the relationship in equation form, expressing Y as a function of X alone. The equation can then be used for forecasting and policy analysis, allowing for the existence of errors (since the relationship is not exact). So how do we fit a line to describe the "broadly linear" relationship between Y and X when the (x, y) pairs do not all lie on a straight line?

[...]

Multiple Linear Regression

In practice, more often than not, Y is determined by more than one factor. In Example 1.2, for instance, size is rarely the only factor of importance in determining housing prices; other factors obviously need to be considered. A regression that contains more than one explanatory variable is called a multiple regression model.

Example 1.3. Observations are available for twenty-five households on their annual total expenditure on non-durable goods and services (Y), annual disposable income (X2), and the stocks of liquid assets they hold (X3). The figures are in thousands of dollars. The regression model is therefore:

yi = β1 + β2 xi2 + β3 xi3 + εi,   i = 1, ..., 25

Let b1, b2 and b3 be the estimators of β1, β2 and β3 respectively. Differentiating the sum of squared residuals Σ ei² with respect to b1, b2 and b3 yields the following normal equations (all sums run over i = 1, ..., n):

∂(Σ ei²)/∂b1 = −2 Σ (yi − b1 − b2 xi2 − b3 xi3) = 0    (3)
∂(Σ ei²)/∂b2 = −2 Σ xi2 (yi − b1 − b2 xi2 − b3 xi3) = 0    (4)
∂(Σ ei²)/∂b3 = −2 Σ xi3 (yi − b1 − b2 xi2 − b3 xi3) = 0    (5)

Equations (3), (4) and (5) can be solved for b1, b2 and b3, but the solutions in terms of ordinary algebra are messy, and their algebraic complexity increases as k increases. To work with the general linear model, it simplifies matters to use linear algebra notation.

The linear model in matrix form

When there are k coefficients and k − 1 explanatory variables, the complete set of n observations can be written in full as

y1 = β1 + β2 x12 + β3 x13 + ... + βk x1k + ε1
y2 = β1 + β2 x22 + β3 x23 + ... + βk x2k + ε2
...
yn = β1 + β2 xn2 + β3 xn3 + ... + βk xnk + εn

where xij denotes the i-th observation of the j-th explanatory variable.

In matrix notation, these equations can be written compactly as Y = Xβ + ε, where

Y = (y1, y2, ..., yn)′ is the n × 1 vector of all observations on the dependent variable;
X is the n × k matrix of observations on the explanatory variables, whose first column is a column of ones for the constant term and whose (i, j) entry is xij;
β = (β1, β2, ..., βk)′ is the k × 1 vector of unknown coefficients we wish to estimate;
ε = (ε1, ε2, ..., εn)′ is the n × 1 vector of disturbances.

Let b = (b1, b2, ..., bk)′ be the O.L.S. estimator of β, and define the residual vector

e = (e1, e2, ..., en)′ = Y − Xb.

Note that Σ ei² = e′e. Thus,

e′e = (Y − Xb)′(Y − Xb)
    = Y′Y − b′X′Y − Y′Xb + b′X′Xb
    = Y′Y − 2b′X′Y + b′X′Xb.

Note that if a and b are k × 1 vectors and A is a k × k symmetric matrix, then

∂(b′a)/∂b = ∂(a′b)/∂b = a   and   ∂(b′Ab)/∂b = 2Ab.

Applying these results,

∂(e′e)/∂b = −2X′Y + 2X′Xb = 0,

leading to X′Xb = X′Y, or

b = (X′X)⁻¹ X′Y,    (6)

provided that X′X is non-singular.

The O.L.S. residual vector is e = Y − Xb. Note that

X′e = X′(Y − Xb) = X′Y − (X′X)(X′X)⁻¹ X′Y = 0,

implying that the residual vector is uncorrelated with each explanatory variable.

For the special case of a simple linear regression model,

X′X = [ n      Σ xi  ]        X′Y = [ Σ yi    ]
      [ Σ xi   Σ xi² ]   and        [ Σ xi yi ],

giving

[ n      Σ xi  ] [ b1 ]   [ Σ yi    ]
[ Σ xi   Σ xi² ] [ b2 ] = [ Σ xi yi ],

or

n b1 + b2 Σ xi = Σ yi
b1 Σ xi + b2 Σ xi² = Σ xi yi,

as in equations (1) and (2).

Referring to Example 1.3,

(X′X)⁻¹ X′y = [ n       Σ xi2       Σ xi3     ]⁻¹ [ Σ yi     ]
              [ Σ xi2   Σ xi2²      Σ xi2 xi3 ]   [ Σ xi2 yi ]
              [ Σ xi3   Σ xi2 xi3   Σ xi3²    ]   [ Σ xi3 yi ]

giving

[ b1 ]   [ 25        4080.1     14379.1  ]⁻¹ [ 4082.34  ]
[ b2 ] = [ 4080.1    832146.8   2981925  ]   [ 801322.7 ]
[ b3 ]   [ 14379.1   2981925    11737267 ]   [ 2994883  ]

         [  0.202454971   −0.001159287    0.000046500 ] [ 4082.34  ]
       = [ −0.001159287    0.000020048   −0.000003673 ] [ 801322.7 ]
         [  0.000046500   −0.000003673    0.000000961 ] [ 2994883  ]

         [ 36.789 ]
       = [ 0.332  ]
         [ 0.125  ]

These estimates concur with the results produced by EXCEL, shown in Output 1.3.

Output 1.3: SUMMARY OUTPUT

Regression Statistics
Multiple R           0.891730934
R Square             0.795184059
Adjusted R Square    0.776564428
Standard Error       38.43646324
Observations         25

ANOVA
             df    SS            MS          F          Significance F
Regression    2    126186.6552   63093.33    42.70676   2.66073E-08
Residual     22    32501.95754   1477.362
Total        24    158688.6128

               Coefficients   Standard Error   t Stat     P-value    Lower 95%      Upper 95%
Intercept      36.79008153    17.29448828      2.127272   0.044851    0.923508064   72.65665499
X Variable 1   0.33183046     0.172101064      1.928114   0.066847   −0.025085301   0.688746221
X Variable 2   0.125785793    0.037687964      3.337559   0.002984    0.04762574    0.203945847

Hence the estimated regression equation is

ŷi = 36.79 + 0.3318 xi2 + 0.1258 xi3.

A family with annual disposable income of $50,000 and liquid assets worth $100,000 is predicted to spend

ŷ = 36.79 + 0.3318(50) + 0.1258(100) = 65.96

thousand dollars on non-durable goods and services in a year.
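The matrix formula b = (X′X)⁻¹ X′Y and the orthogonality property X′e = 0 can be checked numerically. This is a minimal sketch with simulated data in the spirit of Example 1.3: the coefficient values (30, 0.3, 0.1), the income and asset ranges, and the error scale are assumptions for illustration, not the example's actual data.

```python
import numpy as np

# Simulate n = 25 households (hypothetical values, loosely mimicking
# Example 1.3: income x2 and liquid assets x3, both in thousands).
rng = np.random.default_rng(0)
n = 25
x2 = rng.uniform(50, 300, n)
x3 = rng.uniform(100, 1200, n)
y = 30.0 + 0.3 * x2 + 0.1 * x3 + rng.normal(0, 5, n)  # assumed true model

# Design matrix X: first column of ones for the constant term,
# then one column per explanatory variable.
X = np.column_stack([np.ones(n), x2, x3])

# OLS via the normal equations X'X b = X'Y; solving the linear system
# is numerically preferable to forming the explicit inverse (X'X)^{-1}.
b = np.linalg.solve(X.T @ X, X.T @ y)

e = y - X @ b        # residual vector e = Y - Xb
print(b)             # estimates close to the assumed (30, 0.3, 0.1)
print(X.T @ e)       # X'e ≈ 0: residuals orthogonal to every column of X
```

The printed X′e is zero up to floating-point round-off, which is the matrix statement that the fitted residuals carry no remaining linear information about any regressor, the constant term included.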

Posted: 09/09/2022, 19:48
