(Essay) Course Name and Code: Basic Econometrics – ECON 1313, Lecturer Name: Dr Greeni Maheshwari, Class Group No: 1


Course Name and Code: Basic Econometrics – ECON 1313
Lecturer Name: Dr Greeni Maheshwari
Class Group No: 1
Chocolate brand assigned: RIO
Tran Hoang An S3768191

Contents
Part 1: Descriptive statistics
Part 2: Model Analysis
a) Estimate regression model
b) Seasonal index
c) Impact of Disease on the sales of Rio chocolate
d) Impact of competitor's pricing
Part 3: Conclusion
Part 4: References
Part 5: Appendix

Part 1: Descriptive statistics

a) Line chart

Figure 1: Rio's sales volume (1998-2018)

b) Visual analysis

Based on Figure 1, the sales volume of Rio generally follows an upward trend. No cycle is apparent in the trend. However, there is a clear seasonal effect on the sales volume. Within a year, Rio's sales volume is highest towards the end of the year (around December) and lowest in the middle of the year (around August and September). This pattern repeats every year. In addition, there was an irregular fluctuation in 2008, when Whooping Cow disease took effect. The sales volume of Rio in that year was much higher than in other periods, so there was a spike in 2008 before sales fell back and resumed the increasing trend from 2009 onwards.

c) Descriptive statistics

Table 1: Descriptive statistics of Rio's sales volume

According to Table 1, the mean sales volume is 11.0495 tonnes, which is quite close to the median, while the maximum and minimum sales volumes are 13.23 tonnes and 9.47 tonnes respectively. This indicates a small spread in sales volume. In Figure 1, a sales volume of about 13 tonnes occurred in 2008, when there was a Cow disease. This is close to the maximum sales over the past 20 years. The high sales volume in 2008 arose because the Cow disease lowered demand for dairy products. Since Rio is not made from milk, demand for Rio chocolate increased, raising its sales. The standard deviation is small (0.736 tonnes), which indicates that there is not much variation in the sales volume.

Part 2: Model Analysis

a) Estimate regression model

• Linear trend

Table 2: SPSS output (linear trend)
Table 3: SPSS model summary (linear trend)

As shown in Table 2, the estimated linear trend regression is
Volume (Yt^) = 10.145 + 0.007*t

• Quadratic trend

To fit a quadratic trend, another variable, t^2, must be created. Using SPSS, the following result is obtained.

Table 4: SPSS output (quadratic trend)
Table 5: SPSS model summary (quadratic trend)

Based on Table 4, the estimated regression model is
Volume (Yt^) = 9.981 + 0.011*t - 0.00001531*t^2

• Exponential trend

To fit an exponential trend, the dependent variable should be log(Yt) instead of Yt. Therefore, a new variable, log(Yt), is created. Using SPSS, we obtain the following regression result.

Table 6: SPSS output (exponential trend)
Table 7: SPSS model summary (exponential trend)

Hence, the estimated regression is
log(Yt)^ = 2.318 + 0.001*t

Of the three models, the linear trend is the most suitable for describing the change in total sales volume over time. Based on the line graph (Figure 1) and the visual analysis in Part 1, there appears to be an upward linear trend in sales volume: the dependent variable (sales volume) seems to increase at a constant rate. This is supported by a hypothesis test on the coefficient of t in the linear regression model Yt^ = B0 + B1*t:

H0: B1=0 (there is no linear trend)
H1: B1≠0 (there is a linear trend)

As shown in Table 2, the p-value is 0.000, which is less than 0.05 (significance level). Hence, we reject H0 and conclude, at the 5% significance level, that there is a linear trend in the sales volume of Rio chocolate.

The quadratic and exponential trends are unlikely to fit this case. Under the estimated quadratic model, the trend is upward, yet its rate of increase gets smaller over time, and after a maximum point a downward trend follows. Looking at Figure 1, the rate of change seems to be constant rather than diminishing. Therefore, the quadratic model can be omitted. The same reasoning applies to the exponential model. Under an exponential trend, the rate of change increases over time. Such a characteristic is not observed in Figure 1. Therefore, the exponential model does not fit. As a result, the linear trend model is the most suitable description of the case. This is our final trend model:
Yt^ = 10.145 + 0.007*t

b) Seasonal index

Using Excel, we obtain the seasonal index (SI) values shown below.

Table 8: Excel calculation of SI values
Figure 2: SI value in 12 months

We now incorporate the SI into our regression model. The table below is the SPSS output after SI is included.

Table 9: SPSS output for model with SI

Based on Table 9, the estimated regression model is
Yt^ = -0.647 + 0.007*t + 0.108*SI

To determine whether SI has an impact on sales volume, a hypothesis test is done on B2 (the coefficient of SI):

H0: B2=0 (there is no seasonality effect on sales volume)
H1: B2≠0 (there is a seasonality effect on sales volume)

Based on Table 9, the p-value of B2 is 0.000, which is less than 0.05 (significance level). Hence, we reject H0 and conclude, at the 5% significance level, that seasonality has an effect on sales volume.

c) Impact of Disease on the sales of Rio chocolate

Table 10: SPSS output of model with Disease dummy variable

Based on Table 10, the estimated model is
Yt^ = -0.696 + 0.007*t + 0.108*SI + 1.037*D

Hypothesis test on B3 (the coefficient of Disease):

H0: B3=0 (disease has no impact on Rio sales volume)
H1: B3≠0 (disease has an impact on Rio sales volume)

The p-value of B3 is 0.000, which is less than 0.05 (significance level). Hence, we reject H0 and conclude, at the 5% significance level, that there is a relationship between disease and Rio sales volume. Therefore, the Disease dummy variable is significant. Based on the value of B3 (1.037), there is a positive relationship between Disease and Rio sales volume.

d) Impact of competitor's pricing

Table 11: SPSS output of model including competitors' prices

In Table 11, Pt represents the price of Rio chocolate, while Pt_c1, Pt_c2, ..., Pt_c8 represent the prices of Caramel Squared (CS), Gummi, Smartey, Heaven, Milkey, Treat, Lovely and Roca respectively. Based on Table 11, our regression model is
Yt^ = -0.864 + 0.007*t + 0.102*SI + 1.043*D - 0.242*Pt + 0.150*Pt_c1 + 0.262*Pt_c2 - 0.023*Pt_c3 + 0.039*Pt_c4 + 0.054*Pt_c5 + 0.035*Pt_c6 + 0.026*Pt_c7 - 0.029*Pt_c8

The definitions of the variables are summarised in the table below.

Type of variable        Symbol   Coefficient   Definition
Dependent variable      Yt       -             Sales volume of Rio chocolate (in tonnes)
Independent variables   t        B1            Time in months (t=1 represents Jan 98)
                        SI       B2            Seasonal index
                        D        B3            Presence of disease (1: disease present, 0: otherwise)
                        Pt       B4            Price per 100g ($) of Rio chocolate
                        Pt_c1    B5            Price per 100g ($) of Caramel Squared chocolate
                        Pt_c2    B6            Price per 100g ($) of Gummi chocolate
                        Pt_c3    B7            Price per 100g ($) of Smartey chocolate
                        Pt_c4    B8            Price per 100g ($) of Heaven chocolate
                        Pt_c5    B9            Price per 100g ($) of Milkey chocolate
                        Pt_c6    B10           Price per 100g ($) of Treat chocolate
                        Pt_c7    B11           Price per 100g ($) of Lovely chocolate
                        Pt_c8    B12           Price per 100g ($) of Roca chocolate

Table 12: Definition of variables

Looking at the coefficients in Table 11, there seems to be a negative relationship between Rio's price and its sales volume. Similarly, there is a negative relationship between the prices of Smartey and Roca and Rio's sales volume. For the remaining competitors, there is a positive relationship between their price and Rio's sales volume. We now conduct hypothesis tests to check the significance of the variables.

Hypothesis testing on B1, B2, B3, B4, B5 and B6

H0: B1=0 (no linear trend)
H1: B1≠0 (there is a linear trend)

The p-value is 0.00 (Table 11), which is less than 0.05 (significance level). Hence, we reject H0 and conclude, at the 5% significance level, that there is a relationship between t and Rio sales volume (there is a linear trend). Therefore, t is a significant variable.
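The significance test on a trend coefficient, as applied to B1 above, can be sketched in a few lines of Python. This is an illustration only, not the SPSS procedure used in the report. The series is synthetic: it is generated from the fitted linear trend 10.145 + 0.007*t with a made-up noise level of 0.5 tonnes. The function name `trend_t_test` is hypothetical, and the p-value uses a normal approximation, which is reasonable only because n = 252 is large.

```python
import math
import random


def trend_t_test(y):
    """Fit y = b0 + b1*t by OLS for t = 1..n and test H0: b1 = 0.

    Returns (b0, b1, t_stat, p_approx); p_approx is a two-sided p-value
    from the normal approximation to the t distribution.
    """
    n = len(y)
    t = list(range(1, n + 1))
    t_bar = sum(t) / n
    y_bar = sum(y) / n
    sxx = sum((ti - t_bar) ** 2 for ti in t)
    sxy = sum((ti - t_bar) * (yi - y_bar) for ti, yi in zip(t, y))
    b1 = sxy / sxx                      # slope estimate
    b0 = y_bar - b1 * t_bar             # intercept estimate
    ssr = sum((yi - b0 - b1 * ti) ** 2 for ti, yi in zip(t, y))
    se_b1 = math.sqrt(ssr / (n - 2) / sxx)  # standard error of the slope
    t_stat = b1 / se_b1
    p_approx = math.erfc(abs(t_stat) / math.sqrt(2))
    return b0, b1, t_stat, p_approx


# Synthetic monthly series (assumption): 252 months, noise sd of 0.5 is invented.
random.seed(0)
y = [10.145 + 0.007 * t + random.gauss(0, 0.5) for t in range(1, 253)]

b0, b1, t_stat, p = trend_t_test(y)
print(f"b0 = {b0:.3f}, b1 = {b1:.4f}, t = {t_stat:.2f}, p ~ {p:.3g}")
print("Reject H0 at the 5% level" if p < 0.05 else "Do not reject H0")
```

With a real dataset, the synthetic list `y` would be replaced by the observed monthly sales volumes, and an exact t-distribution p-value from a statistics package would be preferable to the approximation used here.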
H0: B2=0 (no seasonality effect on Rio sales volume)

...

Table 17: Output with 'Months' dummy variables

We now interpret the coefficients of the month dummy variables. Each coefficient gives the expected difference in Rio's sales volume (in tonnes) between that month and December, holding other variables constant:

January: -1.019 (sales are expected to be 1.019 tonnes lower than in December)
February: -0.929 (0.929 tonnes lower than in December)
March: -0.926 (0.926 tonnes lower than in December)
April: -0.809 (0.809 tonnes lower than in December)
May: -1.000 (1.000 tonnes lower than in December)
June: -1.176 (1.176 tonnes lower than in December)
July: -1.127 (1.127 tonnes lower than in December)
August: -1.284 (1.284 tonnes lower than in December)
September: -1.217 (1.217 tonnes lower than in December)
October: -1.151 (1.151 tonnes lower than in December)
November: -0.943 (0.943 tonnes lower than in December)

• Strength of the model

Hypothesis testing on B1, B2, B3, B4, B5 and B6

H0: Bj=0, j=1,2,3,4,5,6 (there is no relationship between t/SI/Disease/Pt/Pt_c1/Pt_c2 and Rio sales volume)
H1: Bj≠0, j=1,2,3,4,5,6 (there is a relationship between t/SI/Disease/Pt/Pt_c1/Pt_c2 and Rio sales volume)

Based on Table 15, the p-values of all coefficients are less than 0.05 (significance level). Hence, we reject H0 and conclude, at the 5% significance level, that there is a relationship between each of these variables and the sales volume of Rio. These variables are therefore significant.

Table 18: Model summary

Moreover, the adjusted R-squared is 0.819, which is a high value (Table 18). This indicates that 81.9% of the variation in the sales volume of Rio chocolate can be explained by variation in the independent variables t, SI, Disease, Pt, Pt_c1 and Pt_c2. Such a high R-squared makes this a strong model.

• Residual analysis

Figure 3: Residual plot
Figure 4: Histogram of residuals

Figure 3 shows that the residuals are distributed evenly rather than following any specific pattern. Therefore, visual analysis indicates no autocorrelation; that is, the errors are not correlated over time. To double-check for autocorrelation, we regress Ut on U(t-1) using SPSS and obtain the following result.

Table 19: SPSS output of residual model

Based on Table 19, the residual regression model is
Ut = -0.014*U(t-1) + et

Hypothesis test on ρ (the coefficient of U(t-1)):

H0: ρ=0 (there is no autocorrelation)
H1: ρ≠0 (there is autocorrelation)

Based on Table 19, the p-value of ρ is 0.823, which is higher than 0.05 (significance level). Hence, we do not reject H0. Therefore, at the 5% significance level, we find no evidence of autocorrelation.

If autocorrelation did exist, we would need to create an AR(1) model, that is, include another regressor, Y(t-1). Our initial model is
Yt^ = -0.388 + 0.007*t + 0.103*SI + 1.052*D - 0.245*Pt + 0.143*Pt_c1 + 0.251*Pt_c2

We would therefore regress Yt on t, SI, Disease, Pt, Pt_c1, Pt_c2 and Y(t-1), and then perform the same autocorrelation test as above. If the test again indicated autocorrelation, we would choose another variable and create its lag. For example, we could choose Pt and create Pt(t-1), then regress Yt on t, SI, Disease, Pt, Pt_c1, Pt_c2, Y(t-1) and Pt(t-1), and repeat the autocorrelation test. We would keep testing and adding lagged variables as necessary until no autocorrelation remained. This is how autocorrelation can be overcome.

To check the normality assumption, we look at Figure 4. In Figure 4, the residuals do not appear to follow a normal distribution. Hence, the assumption of normality is violated. The implication is that when the errors are not normally distributed, the dependent variable and the OLS estimators do not follow a normal distribution either. The usual F-test and t-test are then no longer exactly valid. In other words, significance testing may not correctly identify the significant variables: some variables that appear significant under the t-test may not actually be significant.

• Checking assumptions

- 1st assumption: linear in parameters
The model must be linear in its parameters, meaning the power of each parameter must be 1. In our model, we assume a linear relationship between the independent variables and the dependent variable. This is evident from our regression function, in which all the coefficients have the power of 1: we use Bi rather than Bi-squared (i=1,2,3,4,5,6). Hence, this assumption is not violated.

- 2nd assumption: no perfect collinearity
The relationship between regressors must not be perfectly linear. In other words, no explanatory variable can be expressed as an exact linear function of the other variables. To test for multicollinearity, Variance Inflation Factor (VIF) values are used. If the VIF values of the regressors are smaller than 10, there is no serious multicollinearity among them. Using SPSS, we obtained the result shown in Table 20.

Table 20: VIF values of the estimators

All VIF values are in the range 1-2. Therefore, the assumption of no perfect collinearity is not violated in our model.

- 3rd assumption: zero conditional mean
The mean value of the errors (unobserved factors) must be unrelated to the values of the independent variables in all periods. In our case, this assumption may not hold, because unobserved factors may be influenced by the values of the independent variables. Unobserved factors such as 'strength of brand name' and 'people's taste and preference' can affect Rio's sales volume, and these factors can be related to the regressors. For example, when there was Cow disease in the past, people developed a preference for Rio (a non-milk product). Even now, people may still prefer Rio because of what happened in the past: consumers may fear that the same disease will happen again, so they are willing to buy Rio. Moreover, the Disease strengthened the Rio brand name, since it is a safe product even in the event of Cow disease. Hence, the strong brand name today can be attributed to the presence of Disease in the past. It is therefore plausible that unobserved factors are related to past values of the regressors, so the 3rd assumption may not hold. This means that the OLS estimators may be biased and may not reflect the true population parameters.

- 4th assumption: homoskedasticity
The variance of the errors must be constant regardless of the values of the regressors. Based on Figure 3, the residuals are distributed evenly, which indicates no heteroskedasticity. To double-check the assumption, we use the Breusch-Pagan test: we compute the squared residuals and regress them on the independent variables. The result is shown below.

Table 21: Breusch-Pagan test result

Ui^2 = S0 + S1*t + S2*D + S3*SI + S4*Pt + S5*Pt_c1 + S6*Pt_c2 + error

Hypothesis testing
- H0: S1=S2=S3=S4=S5=S6=0 (there is no heteroskedasticity)
- H1: at least one of S1, ..., S6 is different from 0 (there is heteroskedasticity)

Based on Table 21, the p-value is 0.493, which is more than 0.05 (significance level). Hence, we do not reject H0. Therefore, at the 5% significance level, we find no evidence of heteroskedasticity. This indicates that the homoskedasticity assumption is not violated.

- 5th assumption: no autocorrelation
- 6th assumption: normality of errors
These assumptions were already tested in the Residual analysis part.

• Recommendation of other variables to be included

Consumer income can affect the sales volume of Rio chocolate. Hsu et al. (2002) found that with higher income, consumers have greater purchasing power, which spurs them to increase spending on food and snack products. Similarly, Majumdar (2004) argued that rising income can raise the sales volumes of the food retail industry. Therefore, there may be a positive relationship between 'household income' and 'sales volume of Rio'. We can include this variable in our model to test the relationship.

Another variable that can be included is the 'number of Rio's advertisements'. Bruce et al. (2012) found a positive relationship between advertising and growth in sales. Chakrabortty et al. (2013) attributed this relationship to how advertising affects consumer mindset and behaviour. Effective advertising helps increase people's awareness of the brand, and hence can attract potential customers. In another study, Buil et al. (2013) argued that advertising also helps firms retain their competitiveness by constantly reminding customers of their product quality and features. This builds the brand name and eventually increases sales volume. Therefore, we can incorporate the variable 'number of Rio's advertisements' in our model. There is probably a positive relationship between Rio's advertising and its sales volume.

Part 3: Conclusion

Overall, there is an increasing trend in the sales volume of Rio chocolate over the past 20 years. Based on Table 8, out of the 12 months, April, November and December are the months with higher-than-average sales of chocolate (SI>100), while the other months experience lower-than-average sales (SI<100) ...

Part 4: References

...
Johnson, I.L. 2002, 'Luxury chocolate sales boom at Christmas', 24 December, viewed September 2019, <https://www.swissinfo.ch/eng/luxury-chocolate-sales-boom-at-christmas/6333382>.
Majumdar, R. 2004, 'The Effect of Changes in Housing Wealth on Retail Sales', University of Pennsylvania, viewed September 2019.
Smillie, S. 2011, 'Is there a chocolate season?', The Guardian, 15 April, viewed September 2019.
Tannenbaum, K. 2012, '8 Facts About Chocolate', Delish, 14 February, viewed September 2019.

Part 5: Appendix

Appendix 1: Sales volume of Rio
Appendix 2: Sales volume of Caramel Squared
Appendix 3: Sales volume of Gummi
Appendix 4: Sales volume of Smartey
Appendix 5: Sales volume of Heaven
Appendix 6: Sales volume of Milkey
Appendix 7: Sales volume of Treat
Appendix 8: Sales volume of Lovely
Appendix 9: Sales volume of Roca

... testing on B7, B8, B9, B10, B11 and B12
Let i denote the chocolate brand number (Smartey: 7, Heaven: 8, Milkey: 9, Treat: 10, Lovely: 11, Roca: 12) and Bi denote the corresponding coefficient (i=7,8,9,10,11,12).
H0: Bi=0 (no relationship...

Hypothesis testing using F-test
- H0: B7=B8=B9=B10=B11=B12=0 (Pt_c3,4,5,6,7,8 are not jointly significant)
- H1: at least one Bi≠0 (i=7,8,9,10,11,12) (at least one out of Pt_c3,4,5,6,7,8 is significant...

... SSR values found in Table 13 and 14,
- F-stat = [(24.040 - 23.610)/6] / [23.610/(252 - 12 - 1)] = 0.725
- F-critical value (q, n-k-1) = 2.137
Since F-stat < F-critical, we do not reject H0. Hence, we conclude ...
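The joint F-test on the six remaining competitor prices can be reproduced with a short calculation. The sketch below is illustrative only. It uses the figures quoted in the text and assumes they are read correctly: 24.040 as the SSR of the restricted model, 23.610 as the SSR of the unrestricted model, q = 6 restrictions, n = 252 observations and k = 12 regressors. The critical value 2.137 is taken from the text rather than computed, and the function name `f_statistic` is hypothetical.

```python
def f_statistic(ssr_restricted, ssr_unrestricted, q, n, k):
    """F statistic for q joint exclusion restrictions.

    F = [(SSR_r - SSR_ur) / q] / [SSR_ur / (n - k - 1)]
    where k is the number of regressors in the unrestricted model.
    """
    numerator = (ssr_restricted - ssr_unrestricted) / q
    denominator = ssr_unrestricted / (n - k - 1)
    return numerator / denominator


# Values as quoted in the report (interpretation of the figures is an assumption).
f_stat = f_statistic(24.040, 23.610, q=6, n=252, k=12)
f_critical = 2.137  # quoted in the text for (q, n-k-1) = (6, 239) at the 5% level

reject_h0 = f_stat > f_critical
print(f"F-stat = {f_stat:.3f}, reject H0: {reject_h0}")
# prints: F-stat = 0.725, reject H0: False
```

Because the F statistic is well below the critical value, the six prices Pt_c3 to Pt_c8 are not jointly significant under these figures, which matches the decision not to reject H0.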

Date posted: 02/12/2022, 08:52
