Lesson 1: Functional forms of regression models

(1)

FUNCTIONAL FORMS

Truong Dang Thuy truong@dangthuy.net

(2)

Linear model

• Consider a linear regression function:

  $Y = \beta_0 + \beta_1 X + u$

• $\beta_1$: the change in Y when X increases by one unit
• Sometimes the relationship is not linear
• Common functional forms:
  - Log-linear
  - Log-lin
  - Lin-log
  - Reciprocal
  - Polynomial

(3)

Functional forms

• Linear model: $Y = \beta_0 + \beta_1 X + u$
• Log-linear: $\ln Y = \beta_0 + \beta_1 \ln X + u$
• Lin-log: $Y = \beta_0 + \beta_1 \ln X + u$
• Log-lin: $\ln Y = \beta_0 + \beta_1 X + u$

(4)

Functional forms

[Figure: two panels plotting Y against X for the reciprocal model $Y = \beta_0 + \beta_1 (1/X)$]
• Reciprocal (negative beta)
• Reciprocal (positive beta)

(5)

Example dataset

Viet Nam provincial data (file ‘gdpprov.xlsx’)

• gdp: provincial GDP (mil VND)
• labfo: number of laborers in each province (1000 persons)

(6)

[Screenshot: the statistical software window, with labeled areas: Record of commands, Record of results, Variables (data), Commands, Taskbar]

(7)

Import data

• Copy from Excel

(8)

Data description

(9)

Linear function

(10)

LOG-LINEAR MODEL

• The Cobb-Douglas production function:

  $Q_i = B_1 L_i^{B_2} K_i^{B_3}$

  can be transformed into a linear model by taking natural logs of both sides:

  $\ln Q_i = \ln B_1 + B_2 \ln L_i + B_3 \ln K_i$

• The slope coefficients can be interpreted as elasticities
• If (B2 + B3) = 1, we have constant returns to scale
• If (B2 + B3) > 1, we have increasing returns to scale
• If (B2 + B3) < 1, we have decreasing returns to scale

(11)

Log-linear model

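gen lgdp = ln(rgdp)
(10 missing values generated)

gen llabor = ln(labfo)

gen linvest = ln(rinvest)
(17 missing values generated)

reg lgdp llabor linvest

      Source |       SS       df       MS              Number of obs =     271
-------------+------------------------------           F(  2,   268) =  477.42
       Model |  175.619057     2  87.8095284           Prob > F      =  0.0000
    Residual |  49.2915017   268  .183923514           R-squared     =  0.7808
-------------+------------------------------           Adj R-squared =  0.7792
       Total |  224.910559   270  .833002069           Root MSE      =  .42886

------------------------------------------------------------------------------
        lgdp |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
      llabor |    .508612   .0643267     7.91   0.000      .381962     .635262
     linvest |    .644785   .0405325    15.91   0.000     .5649824    .7245876
       _cons |    3.06333   .4515804     6.78   0.000     2.174233    3.952426
------------------------------------------------------------------------------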
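As a quick illustration of the returns-to-scale rule, the two slope estimates from the regression of lgdp on llabor and linvest can be summed. This is a minimal sketch in Python (not the software that produced the output), and it assumes that investment stands in for the capital input. It compares point estimates only and ignores sampling error.

```python
# Slope estimates copied from the regression of lgdp on llabor and linvest.
b_labor = 0.508612    # elasticity of GDP with respect to labor
b_invest = 0.644785   # elasticity of GDP with respect to investment (assumed capital proxy)

returns_to_scale = b_labor + b_invest

if returns_to_scale > 1:
    verdict = "increasing"
elif returns_to_scale < 1:
    verdict = "decreasing"
else:
    verdict = "constant"

print(f"B2 + B3 = {returns_to_scale:.4f} -> {verdict} returns to scale")
```

A formal conclusion would require testing the hypothesis B2 + B3 = 1 rather than only comparing the point estimates.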

(12)

LOG-LIN OR GROWTH MODELS

• The rate of growth of real GDP:

  $RGDP_t = RGDP_0 (1 + r)^t$

  can be transformed into a linear model by taking natural logs of both sides:

  $\ln RGDP_t = \ln RGDP_0 + t \ln(1 + r)$

• Letting B1 = ln RGDP0 and B2 = ln(1 + r), this can be rewritten as:

  $\ln RGDP_t = B_1 + B_2 t$

• B2 is considered a semi-elasticity or an instantaneous growth rate
• The compound growth rate (r) is equal to $e^{B_2} - 1$
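The link between B2 and the compound growth rate can be checked numerically. The sketch below is in Python and uses a hypothetical series (RGDP starting at 100 and growing 5% per period), not the course data. The helper `ols_line` is an illustrative name introduced here. The sketch fits ln RGDP on t by least squares and then converts the slope back into r.

```python
import math

def ols_line(x, y):
    """Return (intercept, slope) of the least-squares line y = a + b*x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope

# Hypothetical series: RGDP starts at 100 and grows 5% per period.
rgdp0, r_true = 100.0, 0.05
t = list(range(10))
rgdp = [rgdp0 * (1 + r_true) ** ti for ti in t]

# Regress ln(RGDP) on t: ln RGDP_t = B1 + B2 * t
b1, b2 = ols_line(t, [math.log(v) for v in rgdp])

instantaneous_rate = b2            # B2 = ln(1 + r)
compound_rate = math.exp(b2) - 1   # r = e^{B2} - 1

print(f"B2 = {instantaneous_rate:.4f}, compound r = {compound_rate:.4f}")
```

Because the series is generated without noise, the fit recovers r exactly (up to floating-point error). The instantaneous rate B2 is slightly smaller than the compound rate r.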

(13)

LOG-LIN MODEL

sum t

    Variable |       Obs        Mean    Std. Dev.       Min        Max
-------------+--------------------------------------------------------
           t |       290           3    1.416658          1          5

(14)

LOG-LIN MODEL

(15)

LIN-LOG MODELS

• Lin-log models follow this general form:

  $Y_i = B_1 + B_2 \ln X_i + u_i$

• Note that B2 is the absolute change in Y responding to a percentage (or relative) change in X
• If X increases by 1%, predicted Y changes by approximately B2/100 units (B2 itself is the change in Y for a one-unit increase in ln X)

(16)

Exercise – lin-log model

• Data: from VHLSS 2010
  - income: individual annual income (1000 VND)
  - healthcost: individual annual cost for health care (1000 VND)
• Use the data in ‘healthcost.dta’ to run the regression

  $hcshare = \beta_0 + \beta_1 \ln(income) + u$

  where hcshare is the share of health cost in income

(17)

Health cost with Lin-log model

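gen lincome = ln(income)

reg hcshare lincome

      Source |       SS       df       MS              Number of obs =    3475
-------------+------------------------------           F(  1,  3473) =  135.35
       Model |  2.84335206     1  2.84335206           Prob > F      =  0.0000
    Residual |  72.9563097  3473  .021006712           R-squared     =  0.0375
-------------+------------------------------           Adj R-squared =  0.0372
       Total |  75.7996618  3474  .021819131           Root MSE      =  .14494

------------------------------------------------------------------------------
     hcshare |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
     lincome |  -.0341629   .0029364   -11.63   0.000    -.0399202   -.0284056
       _cons |    .421608   .0322026    13.09   0.000       .35847     .484746
------------------------------------------------------------------------------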
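To read the lin-log slope, it may help to compute the implied change in hcshare for specific relative changes in income. The sketch below is in Python and uses the slope on lincome from the regression of hcshare on lincome. The function name `change_in_share` is introduced here for illustration only.

```python
import math

b_lincome = -0.0341629   # slope on lincome, copied from the regression output

def change_in_share(factor):
    """Change in predicted hcshare when income is multiplied by `factor`.

    In a lin-log model the exact change is B2 * ln(factor).
    """
    return b_lincome * math.log(factor)

one_percent = change_in_share(1.01)   # close to B2 / 100
doubling = change_in_share(2.0)       # B2 * ln(2), smaller in size than B2

print(f"+1% income: {one_percent:.6f}")
print(f"x2 income:  {doubling:.6f}")
```

The B2/100 rule is an approximation that works well for small percentage changes. For large changes such as a doubling of income, the exact expression B2 ln(factor) should be used.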

(18)

RECIPROCAL MODELS

• Reciprocal models follow this general form:

  $Y_i = B_1 + B_2 \left(\frac{1}{X_i}\right) + u_i$

• Note that:
  - As X increases indefinitely, the term $B_2 \left(\frac{1}{X_i}\right)$ approaches zero and Y approaches the limiting or asymptotic value B1
  - The slope is:

    $\frac{dY}{dX} = -B_2 \left(\frac{1}{X^2}\right)$

  - Therefore, if B2 is positive, the slope is negative throughout, and if B2 is negative, the slope is positive throughout

(19)

Exercise – Reciprocal model

• Use the data in ‘healthcost.dta’ to run the regression

  $hcshare = \beta_0 + \beta_1 \left(\frac{1}{income}\right) + u$

(20)

Exercise – Reciprocal model

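reg hcshare invincome

      Source |       SS       df       MS              Number of obs =    3475
-------------+------------------------------           F(  1,  3473) =  133.21
       Model |  2.79994649     1  2.79994649           Prob > F      =  0.0000
    Residual |  72.9997153  3473   .02101921           R-squared     =  0.0369
-------------+------------------------------           Adj R-squared =  0.0367
       Total |  75.7996618  3474  .021819131           Root MSE      =  .14498

------------------------------------------------------------------------------
     hcshare |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
   invincome |   942.4843   81.65964    11.54   0.000     782.3786     1102.59
       _cons |    .023971   .0032251     7.43   0.000     .0176478    .0302943
------------------------------------------------------------------------------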
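The estimates can be used to trace out the fitted reciprocal curve. The sketch below is in Python and assumes that invincome was generated as 1/income, which the output does not show. The income levels are hypothetical values in 1000 VND chosen only for illustration.

```python
b0 = 0.023971    # _cons: limiting value of hcshare as income grows
b1 = 942.4843    # coefficient on invincome (assumed to equal 1 / income)

def predicted_share(income):
    """Fitted hcshare from the reciprocal model."""
    return b0 + b1 / income

def slope(income):
    """dY/dX = -B2 / X^2 for the reciprocal model."""
    return -b1 / income ** 2

for income in (5_000, 20_000, 100_000):   # hypothetical incomes, 1000 VND
    print(income, round(predicted_share(income), 4), slope(income))
```

With a positive coefficient on 1/income, the slope is negative at every income level, and the predicted share falls toward the asymptote given by the constant.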

(21)

POLYNOMIAL REGRESSION MODELS

• The following regression predicting GDP is an example of a quadratic function, or more generally, a second-degree polynomial in the variable time:

  $RGDP_t = A_1 + A_2\, time + A_3\, time^2 + u_t$

• The slope is not constant; it depends on time:

  $\frac{dRGDP}{dtime} = A_2 + 2 A_3\, time$

• Exercise: run the above model with ‘gdpprov.dta’
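The slope formula for the quadratic model can be verified against a numerical derivative. The sketch below is in Python, and the coefficients are hypothetical values chosen for illustration, not estimates from ‘gdpprov.dta’.

```python
# Hypothetical coefficients, chosen only for illustration.
a1, a2, a3 = 50.0, 4.0, 0.5

def rgdp_hat(time):
    """Fitted value of the quadratic model at a given time."""
    return a1 + a2 * time + a3 * time ** 2

def slope(time):
    """Analytic slope: dRGDP/dtime = A2 + 2 * A3 * time."""
    return a2 + 2 * a3 * time

h = 1e-6
for time in (1, 3, 5):
    # Central finite difference as an independent check of the formula.
    numeric = (rgdp_hat(time + h) - rgdp_hat(time - h)) / (2 * h)
    print(time, slope(time), round(numeric, 4))
```

The slope changes with time, which is what distinguishes the polynomial model from the linear one.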

(22)

SUMMARY OF FUNCTIONAL FORMS

MODEL        FORM                   SLOPE (dY/dX)    ELASTICITY (dY/dX)(X/Y)
Linear       Y = B1 + B2 X          B2               B2 (X/Y)
Log-linear   ln Y = B1 + B2 ln X    B2 (Y/X)         B2
Log-lin      ln Y = B1 + B2 X       B2 (Y)           B2 (X)
Lin-log      Y = B1 + B2 ln X       B2 (1/X)         B2 (1/Y)
Reciprocal   Y = B1 + B2 (1/X)      -B2 (1/X^2)      -B2 (1/(XY))

(23)

COMPARING ON BASIS OF R2

• We cannot directly compare two models that have different dependent variables
• We can transform the models as follows and compare RSS:
  - Step 1: Compute the geometric mean (GM) of the dependent variable, call it Y*
  - Step 2: Divide Yi by Y* to obtain:

    $\tilde{Y}_i = \frac{Y_i}{Y^*}$

  - Step 3: Estimate the equation with lnYi as the dependent variable using $\tilde{Y}_i$ in lieu of Yi as the dependent variable (i.e., use $\ln \tilde{Y}_i$ as the dependent variable)
  - Step 4: Estimate the equation with Yi as the dependent variable using $\tilde{Y}_i$ as the dependent variable instead of Yi
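The four steps can be sketched as follows. This Python example uses a small hypothetical data set and a single regressor, so a hand-written least-squares helper is enough. The helper name `ols_rss` and the data values are assumptions made for illustration.

```python
import math

def ols_rss(x, y):
    """Fit y = a + b*x by least squares and return the residual sum of squares."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y)) / sxx
    intercept = mean_y - slope * mean_x
    return sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))

# Hypothetical data with roughly exponential growth in Y.
x = [1, 2, 3, 4, 5, 6, 7, 8]
y = [2.9, 4.2, 6.1, 8.8, 13.1, 19.0, 27.5, 40.2]

# Step 1: geometric mean of Y, computed as exp(mean of ln Y).
y_star = math.exp(sum(math.log(v) for v in y) / len(y))
# Step 2: rescale Y by its geometric mean.
y_tilde = [v / y_star for v in y]
# Step 3: log-lin model with ln(Y~) as the dependent variable.
rss_loglin = ols_rss(x, [math.log(v) for v in y_tilde])
# Step 4: linear model with Y~ as the dependent variable.
rss_linear = ols_rss(x, y_tilde)

print(f"RSS log-lin: {rss_loglin:.4f}   RSS linear: {rss_linear:.4f}")
```

After rescaling, the two RSS values are on a comparable footing, and the model with the smaller RSS fits these data better.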

(24)

MEASURES OF GOODNESS OF FIT

• R2: measures the proportion of the variation in the regressand explained by the regressors
• Adjusted R2: denoted $\bar{R}^2$, it takes degrees of freedom into account:

  $\bar{R}^2 = 1 - (1 - R^2)\,\frac{n - 1}{n - k}$

  where n is the number of observations and k is the number of estimated parameters, including the intercept
• Akaike’s Information Criterion (AIC): imposes a harsher penalty than adjusted R2 for adding more variables to the model, defined as:

  $\ln AIC = \frac{2k}{n} + \ln\left(\frac{RSS}{n}\right)$

  - The model with the lowest AIC is usually chosen
• Schwarz’s Information Criterion (SIC): an alternative to the AIC, expressed as:

  $\ln SIC = \frac{k}{n} \ln n + \ln\left(\frac{RSS}{n}\right)$

  - The penalty factor here is harsher than that of AIC
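The three measures can be computed directly from a regression's sums of squares. The sketch below is in Python and plugs in the values reported for the log-linear regression of lgdp on llabor and linvest (RSS, total SS, n = 271, k = 3 including the constant). The function names are illustrative.

```python
import math

def adjusted_r2(r2, n, k):
    """Adjusted R2 = 1 - (1 - R2) * (n - 1) / (n - k)."""
    return 1 - (1 - r2) * (n - 1) / (n - k)

def ln_aic(rss, n, k):
    """ln AIC = 2k/n + ln(RSS/n)."""
    return 2 * k / n + math.log(rss / n)

def ln_sic(rss, n, k):
    """ln SIC = (k/n) ln n + ln(RSS/n)."""
    return (k / n) * math.log(n) + math.log(rss / n)

# Values copied from the regression of lgdp on llabor and linvest.
rss, tss, n, k = 49.2915017, 224.910559, 271, 3
r2 = 1 - rss / tss

print(f"R2 = {r2:.4f}, adjusted R2 = {adjusted_r2(r2, n, k):.4f}")
print(f"ln AIC = {ln_aic(rss, n, k):.4f}, ln SIC = {ln_sic(rss, n, k):.4f}")
```

The R2 and adjusted R2 computed this way match the reported 0.7808 and 0.7792. The SIC penalty term exceeds the AIC penalty whenever ln n > 2, which holds for this sample size.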
