Exercises in Statistics for Management Decision-Making


A sales manager at a bicycle company wants to review the effect of several factors on revenue. Data were collected from 30 stores, where:

UnitsSold: number of bikes sold
FloorSpace: display area (square meters)
CompetingAds: competitors' advertising costs (thousand USD)
Price: product price (USD)

| Obs | UnitsSold | FloorSpace | CompetingAds | Price |
|---|---|---|---|---|
| 1 | 1015 | 72.9 | 102.2 | 1146 |
| 2 | 903 | 56.6 | 99.6 | 1261 |
| 3 | 1293 | 80.4 | 107.1 | 1408 |
| 4 | 1479 | 93.4 | 110.1 | 729 |
| 5 | 1413 | 84.5 | 93.5 | 1227 |
| 6 | 1207 | 69.8 | 95.6 | 966 |
| 7 | 999 | 77.4 | 102.5 | 1400 |
| 8 | 1172 | 75.3 | 94.8 | 1277 |
| 9 | 1110 | 75.5 | 101.0 | 1137 |
| 10 | 1270 | 63.6 | 95.5 | 954 |
| 11 | 1448 | 84.5 | 100.0 | 856 |
| 12 | 1327 | 74.6 | 110.4 | 892 |
| 13 | 910 | 60.4 | 98.2 | 1024 |
| 14 | 455 | 8.2 | 86.1 | 1028 |
| 15 | 1052 | 48.6 | 86.3 | 1170 |
| 16 | 1125 | 49.6 | 94.7 | 984 |
| 17 | 915 | 57.4 | 96.5 | 1200 |
| 18 | 1079 | 48.5 | 90.6 | 1725 |
| 19 | 1493 | 102.1 | 89.6 | 1588 |
| 20 | 885 | 55.2 | 101.1 | 1298 |
| 21 | 1069 | 56.0 | 96.8 | 1359 |
| 22 | 1220 | 76.8 | 99.8 | 1469 |
| 23 | 1124 | 59.2 | 111.1 | 480 |
| 24 | 1043 | 49.0 | 94.0 | 1435 |
| 25 | 1369 | 78.5 | 99.7 | 823 |
| 26 | 1244 | 59.4 | 98.2 | 1274 |
| 27 | 1361 | 100.0 | 104.7 | 1165 |
| 28 | 1421 | 76.3 | 85.5 | 1054 |
| 29 | 782 | 41.8 | 92.5 | 860 |
| 30 | 1210 | 65.2 | 93.1 | 895 |

Use the data above to answer the following questions.

1. Use a suitable statistical model to comment on the variables:

All the variables to be analyzed are quantitative, so the relevant descriptive statistics are the mean, median, quartiles, interquartile range, minimum, maximum, variance, standard deviation, outliers and confidence intervals for each variable.

Using the MegaStat add-in for Excel (MegaStat > Descriptive Statistics) with all the data as input, we obtain the descriptive statistics and a boxplot for each variable:

Descriptive statistics

| | UnitsSold (unit) | FloorSpace (m2) | CompetingAds (1000 USD) | Price (USD) |
|---|---|---|---|---|
| count | 30 | 30 | 30 | 30 |
| mean | 1,146.4333 | 66.6900 | 97.6933 | 1,136.1333 |
| sample variance | 54,335.9092 | 362.3864 | 46.6669 | 73,186.1885 |
| sample standard deviation | 233.1006 | 19.0365 | 6.8313 | 270.5295 |
| minimum | 455 | 8.2 | 85.5 | 480 |
| maximum | 1493 | 102.1 | 111.1 | 1725 |
| range | 1038 | 93.9 | 25.6 | 1245 |
| confidence interval 95% lower | 1,059.3921 | 59.5817 | 95.1425 | 1,035.1160 |
| confidence interval 95% upper | 1,233.4745 | 73.7983 | 100.2442 | 1,237.1507 |
| half-width | 87.0412 | 7.1083 | 2.5509 | 101.0174 |
| 1st quartile | 1,022.0000 | 56.1500 | 93.6250 | 957.0000 |
| median | 1,148.5000 | 67.5000 | 97.5000 | 1,155.5000 |
| 3rd quartile | 1,318.5000 | 77.2500 | 101.0750 | 1,292.7500 |
| interquartile range | 296.5000 | 21.1000 | 7.4500 | 335.7500 |
| mode | #N/A | 84.5000 | 98.2000 | #N/A |

Low extremes, low outliers, high outliers, high extremes: 0 0 0 0 0 0 (the remaining entries are missing in the source).

Comment:

Based on the descriptive statistics, the distributions are relatively symmetric: for each variable the mean and the median are approximately equal.

UnitsSold and FloorSpace each have one outlier: store 14 has a far smaller floor space (8.2 m2) and far lower sales (455 units) than the other stores. Price and CompetingAds have no outliers.

FloorSpace and CompetingAds have the smallest ranges and standard deviations in absolute terms (the variables are measured in different units). CompetingAds is the most concentrated, with a range of 25.6 and a standard deviation of 6.83.

FloorSpace and CompetingAds each have a mode (84.5 and 98.2); the other two variables do not.

2. Use scatter plots to evaluate the linear relationship between sales and the remaining factors. Are the results from the plots consistent with what economic theory would lead you to expect? Use the correlation coefficient to check the results from the plots:

Using MegaStat in Excel (Correlation/Regression > Scatterplot), we draw scatter plots of bicycle sales (vertical axis) against each of the other variables in turn (horizontal axis): floor space in square meters, competitors' advertising costs in thousand USD, and price in USD.

Sales and floor space: the plot shows an increasing trend, that is, a positive correlation between sales and floor space: the larger the floor space, the greater the number of units sold. This is consistent with economic theory and with practice: the product is branded, so customers tend to visit larger stores that offer more choice in order to find the most suitable model, and higher sales therefore go with a larger display area. The fitted regression line is Y = 10.382x + 454.035, with a positive correlation coefficient. If the floor space increases by 1 m2, sales increase by 10.382 units (about 10 units). R² is 0.719, which means that 71.9% of the variation in sales is explained by floor space; the remaining 28.1% is due to other factors.

Sales and competitors' advertising: the plot shows a slightly increasing trend, that is, a positive correlation between sales and competitors' advertising. This appears inconsistent with economic theory (sales usually fall when competitors spend more on advertising). However, the trend is not clear, because the points are widely scattered around the trend line. One possible explanation is that competitors' advertising generally reduces sales, but for bicycles it has little effect on the stores' sales. The fitted regression line is Y = 8.625x + 303.833, with a positive correlation coefficient. If competitors' advertising increases by 1,000 USD, the store's sales increase by about 8.6 units. However, R² is only 0.064, so only 6.4% of the variation in sales is explained by competitors' advertising costs. This is too little to confirm a relationship between sales and competitors' advertising costs.

Sales and price: the plot shows a downward trend: sales decline as the price rises, but the decline is small (the trend line is nearly horizontal), which suggests that price has little influence on the decision to buy this product. The direction of the effect is consistent with the theory of demand (if the price increases, sales decrease). The fitted regression line is Y = -0.064x + 1,218.619, which agrees with the plot; the correlation coefficient is negative, reflecting the inverse relationship between the two variables. However, R² is 0.005, meaning that only 0.5% of the variation in sales is explained by price.

3. Using confidence intervals for the means of the above variables at the 95% confidence level, explain the meaning of the results obtained. Estimate the proportion of stores with sales greater than 1,200 units:

Use Confidence interval - mean in MegaStat:

Confidence interval - mean

| | UnitsSold (unit) | FloorSpace (m2) | CompetingAds (1000 USD) | Price (USD) |
|---|---|---|---|---|
| confidence level | 95% | 95% | 95% | 95% |
| mean | 1146.433333 | 66.69 | 97.69333333 | 1136.133333 |
| std dev | 233.1006418 | 19.03645052 | 6.831313971 | 270.5294596 |
| n | 30 | 30 | 30 | 30 |
| t (df = 29) | 2.045 | 2.045 | 2.045 | 2.045 |
| half-width | 87.0412 | 7.108 | 2.5509 | 101.0174 |
| upper confidence limit | 1,233.4745 | 73.798 | 100.2442 | 1,237.1507 |
| lower confidence limit | 1,059.3921 | 59.582 | 95.1425 | 1,035.1160 |

Comments:

With 95% confidence, the average number of units sold per store is 1,146 ± 87, that is, between 1,059 and 1,233 units.

With 95% confidence, the average floor space of the stores is 66.7 ± 7.1 m2, that is, between 59.6 and 73.8 m2.

With 95% confidence, the average advertising cost of competitors is 97.69 ± 2.55, that is, between 95.14 and 100.24 thousand USD.

With 95% confidence, the average price is 1,136 ± 101 USD, that is, between 1,035 and 1,237 USD.

Estimate the proportion of stores with more than 1,200 units sold. First use Frequency Distribution - Quantitative:

Frequency Distribution - Quantitative, UnitsSold (unit)

| lower | upper | midpoint | width | frequency | percent | cumulative frequency | cumulative percent |
|---|---|---|---|---|---|---|---|
| 400 | < 600 | 500 | 200 | 1 | 3.3 | 1 | 3.3 |
| 600 | < 800 | 700 | 200 | 1 | 3.3 | 2 | 6.7 |
| 800 | < 1,000 | 900 | 200 | 5 | 16.7 | 7 | 23.3 |
| 1,000 | < 1,200 | 1,100 | 200 | 9 | 30.0 | 16 | 53.3 |
| 1,200 | < 1,400 | 1,300 | 200 | 9 | 30.0 | 25 | 83.3 |
| 1,400 | < 1,600 | 1,500 | 200 | 5 | 16.7 | 30 | 100.0 |
| Total | | | | 30 | 100.0 | | |

From the distribution table, 14 of the 30 stores sampled have sales greater than 1,200 units, which is 46.7%.

Then use Confidence interval - proportion to estimate the proportion of all stores with sales greater than 1,200 units:

Confidence interval - proportion

| | |
|---|---|
| confidence level | 95% |
| proportion | 0.46666666 |
| n | 30 |
| z | 1.960 |
| half-width | 0.179 |
| upper confidence limit | 0.645 |
| lower confidence limit | 0.288 |

Comment: with 95% confidence, the proportion of stores that sell more than 1,200 units is 46.7% ± 17.9%, that is, between 28.8% and 64.5%.

4. Test the claims that the average advertising cost of competitors is less than 100 thousand USD and that the average sales per store is less than 1,200 units:

Use MegaStat's Hypothesis Test to test the claim "the average advertising cost of competitors is less than 100 thousand USD". The pair of hypotheses is H0: μ ≤ 100 and H1: μ > 100.

Hypothesis Test: Mean vs Hypothesized Value, CompetingAds (1000 USD)

| | |
|---|---|
| hypothesized value | 100.00000 |
| mean | 97.69333 |
| std dev | 6.83131 |
| std error | 1.24722 |
| n | 30 |
| df | 29 |
| t | -1.85 |
| p-value (one-tailed, upper) | .9627 |
| confidence interval 95% lower | 95.14248 |
| confidence interval 95% upper | 100.24419 |
| margin of error | 2.55085 |

Comment: from the table, p-value = 0.96 > α = 0.05, so we cannot reject H0. There is no evidence that the average advertising cost of competitors exceeds 100 thousand USD, so the claim is not rejected. (If the claim is instead placed in the alternative hypothesis, H0: μ ≥ 100 against H1: μ < 100, the lower-tail p-value is 1 - 0.9627 = 0.0373 < 0.05, which supports the claim at the 5% level.)

Use MegaStat's Hypothesis Test to test the claim "the average sales per store is less than 1,200 units". The pair of hypotheses is H0: μ ≥ 1200 and H1: μ < 1200.

Hypothesis Test: Mean vs Hypothesized Value, UnitsSold (unit)

| | |
|---|---|
| hypothesized value | 1,200.00000 |
| mean | 1,146.43333 |
| std dev | 233.10064 |
| std error | 42.55816 |
| n | 30 |
| df | 29 |
| t | -1.26 |
| p-value (one-tailed, lower) | .1091 |
| confidence interval 95% lower | 1,059.39212 |
| confidence interval 95% upper | 1,233.47454 |
| margin of error | 87.04121 |

Comment: from the table, p-value = 0.11 > α = 0.05, so we cannot reject H0, "the average sales per store is greater than or equal to 1,200 units". The data therefore do not support H1, "the average sales per store is less than 1,200 units", at the 5% level.

5. Estimate a linear regression model in which the dependent variable is sales and the independent variables are the remaining variables:

Using MegaStat in Excel (Correlation/Regression > Regression Analysis) on the given data, we obtain the regression analysis of sales (Y) on the independent variables: floor space X1 (m2), competitors' advertising costs X2 (thousand USD) and price X3 (USD):

Regression Analysis

| | |
|---|---|
| R² | 0.759 |
| Adjusted R² | 0.731 |
| R | 0.871 |
| Std. Error | 120.78 |
| n | 30 |
| k | 3 |
| Dep. Var. | UnitsSold (unit) |

ANOVA table

| Source | SS | df | MS | F | p-value |
|---|---|---|---|---|---|
| Regression | 1,196,409.6542 | 3 | 398,803.2181 | 27.33 | 3.36E-08 |
| Residual | 379,331.7125 | 26 | 14,589.681 | | |
| Total | 1,575,741.3667 | 29 | | | |

Regression output

| variables | coefficients | std. error | t (df = 26) | p-value | 95% lower | 95% upper |
|---|---|---|---|---|---|---|
| Intercept | 1,225.4435 | 397.2685 | 3.085 | .0048 | 408.8464 | 2,042.0406 |
| FloorSpace (m2) | 11.5222 | 1.3296 | 8.666 | 3.82E-09 | 8.7892 | 14.2553 |
| CompetingAds (1000 USD) | -6.9351 | 3.9048 | -1.776 | .0874 | -14.961 | 1.0913 |
| Price (USD) | -0.1496 | 0.0893 | -1.675 | .1059 | -0.3331 | 0.0339 |

a. Explain the meaning of the regression coefficients and of R². Which independent variable has the most impact on sales?

The regression model relating sales to floor space (m2), competitors' advertising costs (thousand USD) and price (USD) is:

Y = b0 + b1*X1 + b2*X2 + b3*X3
Y = 1225.4435 + 11.5222*X1 - 6.9351*X2 - 0.1496*X3

+ The regression coefficient of FloorSpace is 11.5222, so sales and floor space are positively related: if the floor space increases by 1 m2, the number of bikes sold increases by about 11, assuming that competitors' advertising costs and the price remain the same.

+ The regression coefficient of CompetingAds is -6.9351, so sales and competitors' advertising costs are inversely related: if competitors' advertising costs increase by 1,000 USD, the number of bikes sold falls by about 7 units, other factors remaining the same.

+ The regression coefficient of Price is -0.1496, so sales and price are inversely related: if the price rises by 1 USD, the number of bikes sold falls by 0.15 (equivalently, a price increase of about 7 USD reduces sales by 1 unit), provided that competitors' advertising costs and floor space remain the same.

+ R² = 0.759, which means that 75.9% of the variation in sales is explained by these three factors together: floor space, competitors' advertising costs and price.

+ Of these factors, floor space has the most impact on sales: it has by far the largest t statistic (8.666) and is the only coefficient significant at the 5% level. The raw coefficients cannot be compared directly, because the variables are measured in different units.

b. Use an appropriate test to determine which independent variables affect sales and which do not. Then assess whether the choice of independent variables is appropriate. Are there possible missing variables that could also affect sales? Give an example:

b1. Test the relationship between floor space and sales:
Hypotheses: H0: β1 = 0; H1: β1 ≠ 0
t = 8.666 and p-value = 3.82 × 10^-9 < α = 0.05; reject H0.
Conclusion: sales and floor space are related, and floor space influences sales.

b2. Test the relationship between sales and competitors' advertising costs:
Hypotheses: H0: β2 = 0; H1: β2 ≠ 0
t = -1.776 and p-value = 0.0874 > α = 0.05; fail to reject H0.
Thus there is not enough evidence to confirm a relationship between competitors' advertising costs and store sales.

b3. Test the relationship between sales and price:
Hypotheses: H0: β3 = 0; H1: β3 ≠ 0
t = -1.675 and p-value = 0.1059 > α = 0.05; fail to reject H0.
So a relationship between sales and price cannot be confirmed.

Conclusion:

+ There is strong evidence of a positive relationship between units sold and floor space.

+ We cannot conclude whether competitors' advertising costs and price influence sales. Therefore the choice of these two independent variables is not appropriate in this sample.

+ In addition, sales of a product depend not only on the above factors but also on others, such as the company's own advertising costs, the location of the store, and the economic, social and geographic characteristics of each region.

c. Use the F-test to check the significance of the model.

The pair of hypotheses is:
H0: β1 = β2 = β3 = 0
H1: at least one βi ≠ 0 (i = 1, 2, 3)

If β1 = β2 = β3 = 0, sales do not depend on the independent variables. If at least one βi ≠ 0, at least one variable affects sales, so the model is meaningful.

From the ANOVA table, F (the ratio of the regression mean square to the residual mean square) = 27.33; the larger F is, the stronger the evidence against H0. From the table, p-value = 3.36 × 10^-8 ≈
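
The interval estimates for UnitsSold in question 3 can be cross-checked directly from the raw data. The following is a minimal Python sketch, not the MegaStat procedure itself: the function names are illustrative, the critical values 2.045 (t, df = 29) and 1.960 (z) are copied from the MegaStat output rather than computed, and the proportion interval is assumed to be the usual normal-approximation interval.

```python
import math
import statistics

# UnitsSold for the 30 stores, in the order listed in the data set.
units_sold = [
    1015, 903, 1293, 1479, 1413, 1207, 999, 1172, 1110, 1270,
    1448, 1327, 910, 455, 1052, 1125, 915, 1079, 1493, 885,
    1069, 1220, 1124, 1043, 1369, 1244, 1361, 1421, 782, 1210,
]

# Critical values taken from the MegaStat tables (assumed, not recomputed here).
T_CRIT_DF29 = 2.045
Z_CRIT = 1.960


def mean_interval(values, t_crit):
    """Return (mean, sample std dev, half-width) of a t-based interval for the mean."""
    n = len(values)
    mean = statistics.mean(values)
    std_dev = statistics.stdev(values)  # sample standard deviation (n - 1)
    half_width = t_crit * std_dev / math.sqrt(n)
    return mean, std_dev, half_width


def proportion_interval(values, threshold, z_crit):
    """Return (count, proportion, half-width) for the share of values above threshold."""
    n = len(values)
    count = sum(1 for v in values if v > threshold)
    p_hat = count / n
    half_width = z_crit * math.sqrt(p_hat * (1 - p_hat) / n)
    return count, p_hat, half_width


mean, std_dev, half_width = mean_interval(units_sold, T_CRIT_DF29)
count, p_hat, p_half_width = proportion_interval(units_sold, 1200, Z_CRIT)

print(f"mean = {mean:.4f}, std dev = {std_dev:.4f}, half-width = {half_width:.2f}")
print(f"stores above 1200: {count}, proportion = {p_hat:.3f} +/- {p_half_width:.3f}")
```

Because the t value is rounded to three decimals, the half-width for the mean may differ from the MegaStat figure in the second decimal place.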

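The summary figures of the regression in question 5 follow from the ANOVA sums of squares and from the coefficients and their standard errors. The sketch below is a hedged arithmetic check only: it assumes the values as read from the MegaStat output (n = 30 observations, k = 3 predictors), uses illustrative variable names, and does not refit the model.

```python
import math

# Values as read from the regression output (assumed inputs, not recomputed from data).
n, k = 30, 3
ss_regression = 1_196_409.6542
ss_total = 1_575_741.3667
ss_residual = ss_total - ss_regression

# Goodness of fit.
r_squared = ss_regression / ss_total
adj_r_squared = 1 - (1 - r_squared) * (n - 1) / (n - k - 1)

# F statistic: regression mean square over residual mean square.
ms_regression = ss_regression / k
ms_residual = ss_residual / (n - k - 1)
f_stat = ms_regression / ms_residual
std_error = math.sqrt(ms_residual)  # standard error of the regression

# t statistic of each coefficient: estimate divided by its standard error.
coefficients = {
    "Intercept": (1225.4435, 397.2685),
    "FloorSpace": (11.5222, 1.3296),
    "CompetingAds": (-6.9351, 3.9048),
    "Price": (-0.1496, 0.0893),
}
t_stats = {name: coef / se for name, (coef, se) in coefficients.items()}

print(f"R^2 = {r_squared:.3f}, adjusted R^2 = {adj_r_squared:.3f}")
print(f"F = {f_stat:.2f}, std error = {std_error:.2f}")
for name, t in t_stats.items():
    print(f"t({name}) = {t:.3f}")
```

Comparing the absolute t statistics, rather than the raw coefficients, gives a unit-free way to see that floor space is the strongest predictor in this model.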
Posted: 09/02/2018, 14:10
