
A Textbook of Computer Based Numerical and Statistical Techniques, part 43


7. Fit a parabola $y = a + bx + cx^2$ to the following data:

x y

[Ans. $y = 0.34 - 0.78x + 0.99x^2$]

8. Determine the constants a and b by the method of least squares such that $y = ae^{bx}$ fits the following data:

x y

[Ans. $y = 1.49989\,e^{0.50001x}$]

9. Fit a least squares geometric curve $y = ax^b$ to the following data:

x y

[Ans. $y = 0.5012\,x^{1.9977}$]

10. A person runs the same race track for five consecutive days and is timed as follows:

Day (x)   Time (y)

Make a least squares fit to the above data using a function $a + \dfrac{b}{x} + \dfrac{c}{x^2}$.

[Ans. $y = 13.0065 + \dfrac{6.7512}{x} - \dfrac{4.4738}{x^2}$]

11. Use the method of least squares to fit the curve $y = \dfrac{c_0}{x} + c_1\sqrt{x}$ to the following table of values:

x y

[Ans. $y = \dfrac{1.97327}{x} + 3.28182\sqrt{x}$]

12. Use the method of least squares to fit a parabola $y = a + bx + cx^2$ to the following data:

$(x, y)$: $(-1, 2), (0, 0), (0, 1), (1, 2)$

[Ans. $y = \dfrac{1}{2} + \dfrac{3}{2}x^2$]


13. The pressure p of a gas corresponding to various volumes V is measured, and is given by the following data:

V (cm$^3$)   p (kg/cm$^2$)

Fit the data to the equation $pV^{\gamma} = c$.

[Ans. $pV^{0.28997} = 167.78765$]
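Exercise 12 lists its four data points in full, so its stated answer can be checked numerically. The sketch below is one way to do that, assuming the usual three normal equations for a parabola (sums of powers of x up to the fourth); the function name `fit_parabola` is illustrative and not taken from the text. Exact fractions are used so the result can be compared with the answer 1/2 + (3/2)x² without rounding.

```python
from fractions import Fraction

def fit_parabola(points):
    """Least-squares fit of y = a + b*x + c*x**2 via the three normal equations."""
    pts = [(Fraction(x), Fraction(y)) for x, y in points]

    def s(p, q):
        # Sum of x**p * y**q over all data points.
        return sum(x**p * y**q for x, y in pts)

    # Augmented matrix of the normal equations in the unknowns (a, b, c).
    m = [
        [s(0, 0), s(1, 0), s(2, 0), s(0, 1)],
        [s(1, 0), s(2, 0), s(3, 0), s(1, 1)],
        [s(2, 0), s(3, 0), s(4, 0), s(2, 1)],
    ]
    # Gauss-Jordan elimination with exact arithmetic.
    for col in range(3):
        pivot = next(r for r in range(col, 3) if m[r][col] != 0)
        m[col], m[pivot] = m[pivot], m[col]
        m[col] = [v / m[col][col] for v in m[col]]
        for r in range(3):
            if r != col:
                factor = m[r][col]
                m[r] = [vr - factor * vc for vr, vc in zip(m[r], m[col])]
    return tuple(row[3] for row in m)

a, b, c = fit_parabola([(-1, 2), (0, 0), (0, 1), (1, 2)])
print(a, b, c)   # 1/2 0 3/2
```

The coefficient of x comes out as zero because the x values are symmetric about the origin, which is why the printed answer has no linear term.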

We know that in a functional relation between two variables, if we know the value of one variable, then the corresponding value of the other variable can be determined exactly. But in a statistical relationship between the two variables, when the value of one variable is known, we can only estimate the corresponding value of the other variable. Regression analysis is the method used for estimating the unknown values of one variable corresponding to the known values of another variable.

9.3.1 Dependent and Independent Variables

Suppose there is a relation between two variables. The variable whose values are known is called the independent variable, while the other one is called the dependent variable.

9.3.2 Line of Regression

Let $\{(x_i, y_i) : 1 \le i \le n\}$ be a bivariate distribution. If we plot the corresponding values of x and y, taking the values of x along the x-axis and the values of y along the y-axis, we obtain a collection of dots, called the scatter diagram.

If the scatter diagram indicates some relationship between x and y, then the dots of the scatter diagram will be concentrated around a line, called the line of regression or the line of best fit.

9.3.3 Regression Line of y on x

If we have to predict the values of y from given values of x, then the line of regression has an equation of the form $y = a + bx$. This is called the regression line of y on x.

9.3.4 Regression Line of x on y

If we have to predict the values of x from given values of y, then the line of regression has an equation of the form $x = a + by$. This is called the regression line of x on y.

9.3.5 To obtain the Equation of Line of Regression of y on x

Suppose that the line approximating the set of points $(x_1, y_1), (x_2, y_2), (x_3, y_3), \ldots, (x_n, y_n)$ has the equation:

$$y = a + bx \tag{1}$$

Then $y_i = a + bx_i$ and $x_i y_i = ax_i + bx_i^2$ for each $i = 1, 2, \ldots, n$. Summing over i, we therefore get

$$\sum y_i = na + b\sum x_i \tag{2}$$

$$\sum x_i y_i = a\sum x_i + b\sum x_i^2 \tag{3}$$

Equations (2) and (3) are the normal equations for this line.

Solving (2) and (3) for a and b and putting these values in (1), we obtain the required equation of the line of regression of y on x.
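The procedure above can be sketched in a few lines of Python. This is a minimal illustration, assuming the two normal equations are solved in closed form by elimination; the function name `fit_y_on_x` and the sample data are hypothetical and not part of the text.

```python
def fit_y_on_x(xs, ys):
    """Return (a, b) for the least-squares line y = a + b*x."""
    n = len(xs)
    sx, sy = sum(xs), sum(ys)                     # sum of x_i, sum of y_i
    sxx = sum(x * x for x in xs)                  # sum of x_i^2
    sxy = sum(x * y for x, y in zip(xs, ys))      # sum of x_i * y_i
    det = n * sxx - sx * sx                       # zero only if all x_i coincide
    if det == 0:
        raise ValueError("all x values are equal; the line is not determined")
    b = (n * sxy - sx * sy) / det                 # eliminate a from (2) and (3)
    a = (sy - b * sx) / n                         # back-substitute into (2)
    return a, b

# Points lying exactly on y = 1 + 2x recover a = 1 and b = 2.
print(fit_y_on_x([0, 1, 2, 3], [1, 3, 5, 7]))   # (1.0, 2.0)
```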

9.3.6 To obtain the Equation of Line of Regression of x on y

Suppose that the line approximating the set of points $(x_1, y_1), (x_2, y_2), (x_3, y_3), \ldots, (x_n, y_n)$ has the equation:

$$x = a + by \tag{1}$$

Then $x_i = a + by_i$ and $x_i y_i = ay_i + by_i^2$ for each $i = 1, 2, \ldots, n$. Summing over i, we therefore get

$$\sum x_i = na + b\sum y_i \tag{2}$$

$$\sum x_i y_i = a\sum y_i + b\sum y_i^2 \tag{3}$$

Equations (2) and (3) are the normal equations for this line.

Solving (2) and (3) for a and b and putting these values in (1), we obtain the required equation of the line of regression of x on y.
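Because the two sets of normal equations differ only in which variable plays the role of predictor, a single routine can produce both lines by swapping its arguments. The sketch below assumes that reading; `fit_line` and the five data pairs are illustrative, not taken from the text.

```python
def fit_line(u, v):
    """Return (a, b) for the least-squares line v = a + b*u."""
    n = len(u)
    su, sv = sum(u), sum(v)
    suu = sum(p * p for p in u)
    suv = sum(p * q for p, q in zip(u, v))
    b = (n * suv - su * sv) / (n * suu - su * su)
    a = (sv - b * su) / n
    return a, b

xs = [1, 2, 3, 4, 5]        # hypothetical data
ys = [2, 4, 5, 4, 5]

a1, b1 = fit_line(xs, ys)   # y on x:  y = a1 + b1*x
a2, b2 = fit_line(ys, xs)   # x on y:  x = a2 + b2*y
print(round(a1, 6), round(b1, 6))   # 2.2 0.6
print(round(a2, 6), round(b2, 6))   # -1.0 1.0
```

For this data the line of y on x is y = 2.2 + 0.6x, while the line of x on y is x = -1 + y, that is y = 1 + x. The two regression lines are different lines in general.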

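Example 1. Find the line of regression of y on x for the following data:

Sol. Here $n = 7$. Now form the table given below; its column totals are:

$$\sum x_i = 47, \quad \sum y_i = 60, \quad \sum x_i^2 = 355, \quad \sum x_i y_i = 416$$

Let the required line be

$$y = a + bx \tag{1}$$

Then $y_i = a + bx_i$ and $x_i y_i = ax_i + bx_i^2$ for each i.

Therefore the normal equations are:

$$\sum y_i = na + b\sum x_i \tag{2}$$

$$\sum x_i y_i = a\sum x_i + b\sum x_i^2 \tag{3}$$

Putting the values from the table in (2) and (3), we get

$$60 = 7a + 47b$$

$$416 = 47a + 355b$$

Solving these equations, we get $a = 19/3 \approx 6.333$ and $b = 1/3 \approx 0.333$.

Putting these values in (1), the required equation is $y = 6.333 + 0.333x$. Ans.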
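The pair of normal equations in this example, 60 = 7a + 47b and 416 = 47a + 355b, is a 2×2 linear system, so it can be checked with Cramer's rule. The sketch below takes the sums exactly as they appear in those two equations (in particular the value 355 for the sum of squares); the helper name `solve_normal_equations` is hypothetical. Solving exactly gives a = 19/3 and b = 1/3.

```python
from fractions import Fraction

def solve_normal_equations(n, sx, sy, sxx, sxy):
    """Solve  sy = n*a + sx*b  and  sxy = sx*a + sxx*b  for (a, b) by Cramer's rule."""
    det = Fraction(n * sxx - sx * sx)
    a = (sy * sxx - sx * sxy) / det
    b = (n * sxy - sx * sy) / det
    return a, b

a, b = solve_normal_equations(n=7, sx=47, sy=60, sxx=355, sxy=416)
print(a, b)                                     # 19/3 1/3
print(round(float(a), 3), round(float(b), 3))   # 6.333 0.333
```

A quick sanity check on any such solution is that the fitted line passes through the point of means: here 19/3 + (1/3)(47/7) = 60/7.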

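Example 2. Find the line of regression of x on y for the following data:

Sol. Here $n = 5$. Now form the table given below; its column totals are:

$$\sum x_i = 30, \quad \sum y_i = 40, \quad \sum y_i^2 = 340, \quad \sum x_i y_i = 214$$

Let the required line be

$$x = a + by \tag{1}$$

Then $x_i = a + by_i$ and $x_i y_i = ay_i + by_i^2$ for each i. Therefore the normal equations are:

$$\sum x_i = na + b\sum y_i \tag{2}$$

$$\sum x_i y_i = a\sum y_i + b\sum y_i^2 \tag{3}$$

Putting the values from the table in (2) and (3), we get

$$30 = 5a + 40b \;\Rightarrow\; a + 8b = 6$$

$$214 = 40a + 340b \;\Rightarrow\; 20a + 170b = 107$$

On solving these equations we get $a = 16.4$ and $b = -1.3$.

Therefore the required equation is $x = 16.4 - 1.3y$. Ans.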


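Example 3. Prove that the arithmetic mean of the coefficients of regression is greater than the coefficient of correlation.

Sol. The coefficients of regression are $r\dfrac{\sigma_y}{\sigma_x}$ and $r\dfrac{\sigma_x}{\sigma_y}$.

We have to prove that A.M. $> r$, i.e.

$$\frac{1}{2}\left(r\frac{\sigma_y}{\sigma_x} + r\frac{\sigma_x}{\sigma_y}\right) > r$$

or, dividing by r (taken positive),

$$\frac{1}{2}\left(\frac{\sigma_y}{\sigma_x} + \frac{\sigma_x}{\sigma_y}\right) > 1$$

or

$$\frac{\sigma_y}{\sigma_x} + \frac{\sigma_x}{\sigma_y} - 2 > 0$$

or

$$\frac{1}{\sigma_x \sigma_y}\left[\sigma_x^2 + \sigma_y^2 - 2\sigma_x\sigma_y\right] > 0$$

or

$$\frac{(\sigma_x - \sigma_y)^2}{\sigma_x \sigma_y} > 0,$$

which is true whenever $\sigma_x \ne \sigma_y$ (the two sides are equal when $\sigma_x = \sigma_y$). Proved.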

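Example 4. Find the regression line of y on x for the following data:

Estimate the value of y, when $x = 10$.

Sol. Let $y = a + bx$ be the line of regression of y on x. Therefore the normal equations are:

$$\sum y_i = na + b\sum x_i \;\Rightarrow\; 40 = 8a + 56b \tag{1}$$

$$\sum x_i y_i = a\sum x_i + b\sum x_i^2 \;\Rightarrow\; 364 = 56a + 524b \tag{2}$$

On solving (1) and (2) we get $a = \dfrac{6}{11}$ and $b = \dfrac{7}{11}$. The equation of the required line is

$$y = \frac{6}{11} + \frac{7}{11}x \quad \text{or} \quad 7x - 11y + 6 = 0$$

If $x = 10$, then $y = \dfrac{6}{11} + \dfrac{7}{11}(10) = \dfrac{76}{11} = 6\dfrac{10}{11}$.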


Example 5. In a study between the amount of rainfall and the quantity of air pollution removed, the following data were collected.

Daily rainfall x (in 0.01 cm):     4.3   4.5   5.9   5.6   6.1   5.2   3.8   2.1
Pollution removed y (mg/m$^3$):   12.6  12.1  11.6  11.8  11.4  11.8  13.2  14.1

Find the regression line of y on x.

Sol. Let $y = a + bx$ be the equation of the line of regression of y on x.

∴ Normal equations are:

$$\sum y_i = na + b\sum x_i \;\Rightarrow\; 98.6 = 8a + 37.5b$$

$$\sum x_i y_i = a\sum x_i + b\sum x_i^2 \;\Rightarrow\; 453.82 = 37.5a + 188.01b$$

After solving these normal equations we get $a = 15.53$ and $b = -0.684$.

The equation of the line of regression is $y = 15.53 - 0.684x$. Ans.
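Since the eight data pairs of this example are given in full, the sums in the normal equations and the fitted coefficients can be recomputed directly. The sketch below does so with plain Python; the variable names are illustrative. It reproduces the sums 37.5, 98.6, 188.01 and 453.82 used above, and solving the two equations at full precision gives a ≈ 15.53 and b ≈ -0.684.

```python
rain = [4.3, 4.5, 5.9, 5.6, 6.1, 5.2, 3.8, 2.1]               # x, in units of 0.01 cm
removed = [12.6, 12.1, 11.6, 11.8, 11.4, 11.8, 13.2, 14.1]    # y, in mg/m^3

n = len(rain)
sx, sy = sum(rain), sum(removed)
sxx = sum(x * x for x in rain)
sxy = sum(x * y for x, y in zip(rain, removed))

# Closed-form solution of the two normal equations.
b = (n * sxy - sx * sy) / (n * sxx - sx * sx)
a = (sy - b * sx) / n

print(round(sx, 2), round(sy, 2), round(sxx, 2), round(sxy, 2))   # 37.5 98.6 188.01 453.82
print(round(a, 3), round(b, 4))
```

The negative slope matches the pattern in the table: days with more rainfall show less pollution left to remove per the recorded measure.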

9.3.7 Another Form of Equations of Lines of Regression

Theorem 1: Show that the equation of the line of regression of y on x is given by

$$y - \bar{y} = r\,\frac{\sigma_y}{\sigma_x}\,(x - \bar{x}),$$

where $\bar{x}$ and $\bar{y}$ are the means of the x-series and y-series respectively; r is the coefficient of correlation between x and y; $\sigma_x$ and $\sigma_y$ are the standard deviations of the x-series and the y-series respectively.

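Proof: Suppose that the line approximating the set of points $(x_1, y_1), (x_2, y_2), \ldots, (x_n, y_n)$ has the equation

$$y = a + bx \tag{1}$$

Then $y_i = a + bx_i$ and $x_i y_i = ax_i + bx_i^2$ for each $i = 1, 2, \ldots, n$. Therefore

$$\sum y_i = na + b\sum x_i \tag{2}$$

$$\sum x_i y_i = a\sum x_i + b\sum x_i^2 \tag{3}$$

From (2), we have

$$\frac{\sum y_i}{n} = a + b\,\frac{\sum x_i}{n}, \quad \text{i.e.} \quad \bar{y} = a + b\bar{x} \tag{4}$$

Thus, it follows that $(\bar{x}, \bar{y})$ lies on the line (1).

Shifting the origin to $(\bar{x}, \bar{y})$, (2) becomes

$$\sum (y_i - \bar{y}) = na + b\sum (x_i - \bar{x}) \quad \text{or} \quad a = 0,$$

because $\sum (x_i - \bar{x}) = \sum (y_i - \bar{y}) = 0$. The line (1) therefore takes the form

$$y - \bar{y} = b\,(x - \bar{x}) \tag{5}$$

Shifting the origin to $(\bar{x}, \bar{y})$ and taking $a = 0$, (3) becomes

$$\sum (x_i - \bar{x})(y_i - \bar{y}) = b\sum (x_i - \bar{x})^2 \tag{6}$$

From (6), writing $dx_i = x_i - \bar{x}$ and $dy_i = y_i - \bar{y}$, we have

$$b = \frac{\sum (x_i - \bar{x})(y_i - \bar{y})}{\sum (x_i - \bar{x})^2} = \frac{\sum dx_i\, dy_i}{\sum dx_i^2} = \frac{\frac{1}{n}\sum dx_i\, dy_i}{\sigma_x^2} = r\,\frac{\sigma_y}{\sigma_x},$$

because

$$r = \frac{\sum dx_i\, dy_i}{n\,\sigma_x \sigma_y}.$$

Putting this value of b in (5), the required equation of the line of regression of y on x is

$$y - \bar{y} = r\,\frac{\sigma_y}{\sigma_x}\,(x - \bar{x}).$$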

Coefficient of Regression of y on x: The real number $b = r\dfrac{\sigma_y}{\sigma_x}$ is called the coefficient of regression of y on x and is denoted by $b_{yx}$. Thus $b_{yx} = r\dfrac{\sigma_y}{\sigma_x}$.

Theorem 2: The equation of the line of regression of x on y is given by

$$x - \bar{x} = r\,\frac{\sigma_x}{\sigma_y}\,(y - \bar{y}).$$

Proof: Proceed as in Theorem 1.

Coefficient of Regression of x on y: The real number $b = r\dfrac{\sigma_x}{\sigma_y}$ is called the coefficient of regression of x on y and is denoted by $b_{xy}$. Thus $b_{xy} = r\dfrac{\sigma_x}{\sigma_y}$.
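The two regression coefficients can be computed directly from their definitions in terms of r and the standard deviations. The sketch below assumes population standard deviations (division by n), consistent with the formula for r used in the proof of Theorem 1; the function name and the five data pairs are illustrative only.

```python
import math

def regression_via_correlation(xs, ys):
    """Return (r, b_yx, b_xy) using b_yx = r*sy/sx and b_xy = r*sx/sy."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    dx = [x - mx for x in xs]                        # deviations from the mean of x
    dy = [y - my for y in ys]                        # deviations from the mean of y
    sx = math.sqrt(sum(d * d for d in dx) / n)       # population s.d. of x
    sy = math.sqrt(sum(d * d for d in dy) / n)       # population s.d. of y
    r = sum(p * q for p, q in zip(dx, dy)) / (n * sx * sy)
    return r, r * sy / sx, r * sx / sy

# Hypothetical data with means (3, 4).
r, b_yx, b_xy = regression_via_correlation([1, 2, 3, 4, 5], [2, 4, 5, 4, 5])
print(round(r, 4), round(b_yx, 4), round(b_xy, 4))   # 0.7746 0.6 1.0
```

With these values the two lines are y - 4 = 0.6(x - 3) and x - 3 = 1.0(y - 4), both passing through the point of means.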


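Theorem 3: Prove that:

(i) $$b_{yx} = \frac{\sum x_i y_i - \dfrac{\left(\sum x_i\right)\left(\sum y_i\right)}{n}}{\sum x_i^2 - \dfrac{\left(\sum x_i\right)^2}{n}}$$

(ii) $$b_{xy} = \frac{\sum x_i y_i - \dfrac{\left(\sum x_i\right)\left(\sum y_i\right)}{n}}{\sum y_i^2 - \dfrac{\left(\sum y_i\right)^2}{n}}$$

Proof: (i) By definition, we have

$$b_{yx} = r\,\frac{\sigma_y}{\sigma_x} = \frac{r\,\sigma_x \sigma_y}{\sigma_x^2} = \frac{\operatorname{cov}(x, y)}{\sigma_x^2} = \frac{\sum x_i y_i - \dfrac{\left(\sum x_i\right)\left(\sum y_i\right)}{n}}{\sum x_i^2 - \dfrac{\left(\sum x_i\right)^2}{n}}$$

Similarly, (ii) can be proved.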

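Example 6. Find the regression coefficient $b_{yx}$ between x and y for the following data: $\sum x = 24$, $\sum y = 44$, $\sum xy = 306$, $\sum x^2 = 164$, $\sum y^2 = 574$ and $n = 4$.

Sol. The given data may be written as $\sum x_i = 24$, $\sum y_i = 44$, $\sum x_i y_i = 306$, $\sum x_i^2 = 164$, $\sum y_i^2 = 574$.

$$b_{yx} = \frac{\sum x_i y_i - \dfrac{\left(\sum x_i\right)\left(\sum y_i\right)}{n}}{\sum x_i^2 - \dfrac{\left(\sum x_i\right)^2}{n}} = \frac{306 - \dfrac{24 \times 44}{4}}{164 - \dfrac{(24)^2}{4}} = \frac{306 - 264}{164 - 144} = \frac{42}{20} = 2.1 \quad \text{Ans.}$$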

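Example 7. Find the regression coefficient $b_{xy}$ between x and y for the following data: $\sum x = 30$, $\sum y = 42$, $\sum xy = 199$, $\sum x^2 = 184$, $\sum y^2 = 318$ and $n = 6$.

Sol. The given data may be written as $\sum x_i = 30$, $\sum y_i = 42$, $\sum x_i y_i = 199$, $\sum x_i^2 = 184$, $\sum y_i^2 = 318$ and $n = 6$.

$$b_{xy} = \frac{\sum x_i y_i - \dfrac{\left(\sum x_i\right)\left(\sum y_i\right)}{n}}{\sum y_i^2 - \dfrac{\left(\sum y_i\right)^2}{n}} = \frac{199 - \dfrac{30 \times 42}{6}}{318 - \dfrac{(42)^2}{6}} = \frac{199 - 210}{318 - 294} = -\frac{11}{24} \approx -0.4583 \quad \text{Ans.}$$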
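Examples 6 and 7 apply the sum formulas of Theorem 3 directly, so both can be checked with one small routine. This is a sketch under the assumption that n = 4 in Example 6 (the value used in its worked computation) and n = 6 in Example 7; the function name `regression_coefficients` is hypothetical. Exact fractions avoid any rounding in the comparison.

```python
from fractions import Fraction

def regression_coefficients(n, sx, sy, sxy, sxx, syy):
    """Return (b_yx, b_xy) from the raw sums, following Theorem 3."""
    num = Fraction(sxy) - Fraction(sx * sy, n)     # shared numerator
    b_yx = num / (sxx - Fraction(sx * sx, n))
    b_xy = num / (syy - Fraction(sy * sy, n))
    return b_yx, b_xy

# Example 6: n = 4, asks for b_yx.
print(regression_coefficients(4, 24, 44, 306, 164, 574)[0])   # 21/10
# Example 7: n = 6, asks for b_xy.
print(regression_coefficients(6, 30, 42, 199, 184, 318)[1])   # -11/24
```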

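Example 8. For the following observations (x, y), find the regression coefficients $b_{yx}$ and $b_{xy}$ and hence find the correlation coefficient between x and y: (1, 2), (2, 4), (3, 8), (4, 7), (5, 10), (6, 5), (7, 14), (8, 16), (9, 2), (10, 20).

Sol. Here $n = 10$. We may prepare the table given below:

x_i     y_i     x_i^2     y_i^2     x_i y_i
1       2       1         4         2
2       4       4         16        8
3       8       9         64        24
4       7       16        49        28
5       10      25        100       50
6       5       36        25        30
7       14      49        196       98
8       16      64        256       128
9       2       81        4         18
10      20      100       400       200
Total:  55      88        385       1114      586

$$b_{yx} = \frac{\sum x_i y_i - \dfrac{\left(\sum x_i\right)\left(\sum y_i\right)}{n}}{\sum x_i^2 - \dfrac{\left(\sum x_i\right)^2}{n}} = \frac{586 - \dfrac{55 \times 88}{10}}{385 - \dfrac{(55)^2}{10}} = \frac{102}{82.5} = 1.24$$

and

$$b_{xy} = \frac{\sum x_i y_i - \dfrac{\left(\sum x_i\right)\left(\sum y_i\right)}{n}}{\sum y_i^2 - \dfrac{\left(\sum y_i\right)^2}{n}} = \frac{586 - \dfrac{55 \times 88}{10}}{1114 - \dfrac{(88)^2}{10}} = \frac{102}{339.6} = 0.30$$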


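Now, $b_{yx} \cdot b_{xy} = \left(r\dfrac{\sigma_y}{\sigma_x}\right)\left(r\dfrac{\sigma_x}{\sigma_y}\right) = r^2$, where r is the coefficient of correlation.

∴ $r = \sqrt{b_{yx} \cdot b_{xy}} = \sqrt{1.2364 \times 0.3004} \approx 0.609$

Thus, $b_{yx} = 1.24$, $b_{xy} = 0.30$ and $r = 0.609$. Ans.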
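The whole of Example 8 can be reproduced from its ten observations. The sketch below builds the column totals, applies the sum formulas for the two regression coefficients, and then recovers r as the square root of their product, taking the sign of r from the coefficients. The variable names are illustrative.

```python
import math

points = [(1, 2), (2, 4), (3, 8), (4, 7), (5, 10),
          (6, 5), (7, 14), (8, 16), (9, 2), (10, 20)]

n = len(points)
sx = sum(x for x, _ in points)
sy = sum(y for _, y in points)
sxx = sum(x * x for x, _ in points)
syy = sum(y * y for _, y in points)
sxy = sum(x * y for x, y in points)

num = sxy - sx * sy / n                  # shared numerator of both coefficients
b_yx = num / (sxx - sx * sx / n)
b_xy = num / (syy - sy * sy / n)
# r has the same sign as the regression coefficients.
r = math.copysign(math.sqrt(b_yx * b_xy), b_yx)

print(sx, sy, sxx, syy, sxy)                          # 55 88 385 1114 586
print(round(b_yx, 2), round(b_xy, 2), round(r, 3))    # 1.24 0.3 0.609
```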

9.3.8 Some Properties of Regression Coefficients

Let the regression coefficient of y on x be $b_{yx}$, the regression coefficient of x on y be $b_{xy}$, and the correlation coefficient between x and y be r. Then we have the following results.

Theorem 1: Prove that $r = \sqrt{b_{yx} \cdot b_{xy}}$.

Proof: We have $b_{yx} = r\dfrac{\sigma_y}{\sigma_x}$ and $b_{xy} = r\dfrac{\sigma_x}{\sigma_y}$. Therefore, $b_{yx} \cdot b_{xy} = r^2$ or $r = \sqrt{b_{yx} \cdot b_{xy}}$.

Remark: Clearly, the correlation coefficient is the geometric mean of the two regression coefficients.

Theorem 2: Prove that r, $b_{yx}$ and $b_{xy}$ are of the same sign.

Proof: We know that $b_{yx} = r\dfrac{\sigma_y}{\sigma_x}$ and $b_{xy} = r\dfrac{\sigma_x}{\sigma_y}$. Since $\sigma_x$ and $\sigma_y$ are both positive, it follows from the two equations given above that $b_{yx}$ and $b_{xy}$ have the same sign as r.

Hence r, $b_{yx}$ and $b_{xy}$ are always of the same sign.

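Theorem 3: Prove that the arithmetic mean of the regression coefficients is greater than the correlation coefficient.

Proof: Taking $r > 0$, the required result is true

if $\dfrac{1}{2}\left(b_{yx} + b_{xy}\right) > r$,

i.e., if $\dfrac{1}{2}\left(r\dfrac{\sigma_y}{\sigma_x} + r\dfrac{\sigma_x}{\sigma_y}\right) > r$,

i.e., if $\sigma_y^2 + \sigma_x^2 > 2\sigma_x\sigma_y$,

i.e., if $\sigma_y^2 + \sigma_x^2 - 2\sigma_x\sigma_y > 0$,

i.e., if $(\sigma_y - \sigma_x)^2 > 0$, which is true whenever $\sigma_x \ne \sigma_y$ (the two sides are equal when $\sigma_x = \sigma_y$).

Hence the required result is true. Proved.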

Theorem 4: Let θ be the angle between the regression line of y on x and the regression line of x on y. Then, prove that

$$\tan\theta = \frac{1 - r^2}{r} \cdot \frac{\sigma_x \sigma_y}{\sigma_x^2 + \sigma_y^2}$$

Proof: The equation of the line of regression of x on y is

y

Date posted: 04/07/2014, 15:20
