
Business and Economic Statistics Case Study: Business Performance


Nguyễn Thị Thu Trang 2004050055

Tiphaine Verbist 22FBA0001

Hanoi, November 4th, 2022


Table of contents

A. Scenario

Question 1: Produce descriptive statistics to summarize the data. You are expected to generate as many relevant descriptive statistics as possible using ALL the relevant tools introduced in the labs of this course. Remember to provide appropriate interpretations for the descriptive statistics. Try not to include unnecessary or irrelevant descriptive statistics.

Question 2: Use analysis of variance to test for any significant differences due to province. Use a .05 level of significance, and for now, ignore the effect of types of ownership. Check all the assumptions of the inference technique you use. Are the assumptions satisfied? Explain.

Question 3: Use analysis of variance to test for any significant differences due to types of ownership. Use a .05 level of significance, and for now, ignore the effect of province. Check all the assumptions of the inference technique you use. Are the assumptions satisfied? Explain.

Question 4: At the .05 level of significance, test for any significant differences due to province, types of ownership, and interaction. Check all the assumptions of the inference technique you use. Are the assumptions satisfied? Explain.

Question 5: Draw an interaction plot and interpret the plot. Is the plot consistent with the conclusions made in Question 4?

Question 6: Discuss the credibility of the interpretations and conclusions of these tests.


A. Scenario

The database of the annual Vietnamese Enterprise Surveys (VESs) is an important source of data for any scholar doing research on the Vietnamese economy and its micro dynamics. In 2004, the survey was carried out with a sample size of more than 2 million businesses in all provinces across the country. The household questionnaire contained many sections, each of which covered a separate aspect of business activities, and profitability was one important indicator. In the survey, businesses were asked to specify their site of operation (province), type of ownership (own) and profitability (roa). The objective of our study is to test for any significant interaction between provinces and types of ownership and to test for any significant differences in the profitability of businesses due to these two variables. A portion of the VES data is to be given to each group by your tutor. In the given dataset, 1 represents firms from Hanoi, 2 represents firms from Danang and 3 represents firms from Ho Chi Minh City.

B. Answering the questions

Question 1: Produce descriptive statistics to summarize the data. You are expected to generate as many relevant descriptive statistics as possible using ALL the relevant tools introduced in the labs of this course. Remember to provide appropriate interpretations for the descriptive statistics. Try not to include unnecessary or irrelevant descriptive statistics.

RStudio was used to produce the descriptive statistics in this report. First, we set the working directory and import our data file “Dataset1.csv” with the following code:

setwd("~/Desktop/Case study BES")

data1 <- read.table("Dataset1.csv", header = TRUE, sep = ";", quote = "\"", stringsAsFactors = FALSE)

Then, we change the variables “own” and “province” into factors:

data1$province <- factor(data1$province, levels = c("1", "2", "3"), labels = c("Ha Noi", "Da Nang", "Ho Chi Minh City"))

The following table shows the first 6 observations of Dataset 1:

We use “SO” for “State-owned” and “PO” for “Private-owned”.

We can access the internal structure of the data frame with the following code:

> str(data1)

$ province: Factor w/ 3 levels "Ha Noi","Da Nang",..: 1 1 1 1 1 1 1 1 1 1 ...

We have 180 observations and 3 variables: roa, own and province. We converted own and province into factors because they are categorical variables.

To summarise two variables in tabular form, we build a cross tabulation. This table lets us analyse the relationship between the two variables: the site of operation “province” and the type of ownership “own”. We obtained the table with the following code:

> table(data1$province, data1$own)

                   SO PO
  Ha Noi           30 30
  Da Nang          30 30
  Ho Chi Minh City 30 30

Furthermore, we computed the means, the standard deviations and a numerical summary of our data set with the function by(). We can observe that the mean of Ha Noi state-owned is the largest and the mean of Ho Chi Minh City state-owned is the smallest. In terms of standard deviation, the value for Ho Chi Minh City state-owned is the largest and the value for Da Nang state-owned is the smallest. The third by() call shows the minimum and maximum values, the quartiles and the means of our data.
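The descriptive steps above (labelled factors, a cross tabulation, then group means and standard deviations) can be sketched in base R as follows. This is only an illustration: the data frame `demo` and its simulated `roa` values are assumptions standing in for the survey extract, and `tapply()` is used as a compact alternative to the `by()` calls in the report.

```r
# Hypothetical stand-in for the dataset: 3 provinces x 2 ownership types,
# 30 firms per cell. The roa values are simulated, not the real survey data.
set.seed(1)
demo <- expand.grid(rep = 1:30, province = 1:3, own = c("SO", "PO"),
                    stringsAsFactors = FALSE)
demo$roa <- rnorm(nrow(demo), mean = 0.02, sd = 0.1)

# Convert the coded columns to labelled factors.
demo$province <- factor(demo$province, levels = c(1, 2, 3),
                        labels = c("Ha Noi", "Da Nang", "Ho Chi Minh City"))
demo$own <- factor(demo$own, levels = c("SO", "PO"))

# Cross tabulation of province by ownership.
counts <- table(demo$province, demo$own)
print(counts)

# Mean and standard deviation of roa in each of the six groups.
cell_mean <- tapply(demo$roa, list(demo$province, demo$own), mean)
cell_sd <- tapply(demo$roa, list(demo$province, demo$own), sd)
print(round(cell_mean, 4))
print(round(cell_sd, 4))

# Locate the group with the largest standard deviation.
most_variable <- which(cell_sd == max(cell_sd), arr.ind = TRUE)
cat("Most variable group:",
    rownames(cell_sd)[most_variable[1, "row"]],
    colnames(cell_sd)[most_variable[1, "col"]], "\n")
```

Reading the largest and smallest cells from these two matrices is one way to double-check statements such as "the standard deviation of group X is the largest".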


: Ha Noi
: SO
[1] -0.5864526

: Ha Noi
: PO
[1] -0.01194629

: Da Nang
: PO
[1] -0.01175367

[1] 0.1722996

: Ho Chi Minh City
: PO
[1] 0.5308589

> by(data1$roa, list(data1$province, data1$own), summary)
: Ha Noi
: SO
     Min.   1st Qu.    Median      Mean   3rd Qu.      Max.
-0.098277  0.004444  0.019747  0.032610  0.041715  0.242908

: Da Nang
-0.302521 -0.005953  0.002195 -0.011946  0.007814  0.030714

: Da Nang
: PO


To provide a graphical description of our data, we drew a boxplot with the following code:

> boxplot(roa ~ province + own, data = data1, xlab = "Type of Provinces and Ownership", ylab = "roa", ylim = c(-0.1, 0.2), col = c("red", "blue", "yellow", "pink", "green", "orange"), main = "Box Plots")

This boxplot shows the distribution of profitability for the six groups (state-owned from Ha Noi, state-owned from Da Nang, state-owned from Ho Chi Minh City, private-owned from Ha Noi, private-owned from Da Nang and private-owned from Ho Chi Minh City). For each group we can observe the minimum and maximum profitability as well as the lower quartile, the median and the upper quartile.

While analysing the plots, we notice that almost all of them are symmetric, with some outliers. Besides, Ho Chi Minh City private-owned reaches the maximum value and Ha Noi state-owned reaches the minimum value.

The box for Ha Noi private-owned is the shortest, which means that this group has the least variation. The group with the most variation is Ho Chi Minh City private-owned.

[Figure: box plots of roa for the six province and ownership groups; x-axis: Type of Provinces and Ownership]


Question 2: Use analysis of variance to test for any significant differences due to province. Use a .05 level of significance, and for now, ignore the effect of types of ownership. Check all the assumptions of the inference technique you use. Are the assumptions satisfied? Explain.

H0: The population mean profitability is the same in all provinces.

Ha: At least two provinces have different population mean profitability.

• Checking assumptions:

- Samples are independent, simple random samples

Ha Noi, Da Nang and Ho Chi Minh City are the three provinces, as has been mentioned. There are 60 observations for each province, so the sample sizes are all equal, and each firm belongs to only one province. We may therefore treat the samples as independent.

- All populations in question are normally distributed

In the Q-Q plot, a straight line links practically all of the points. Since there are no glaring outliers, we may conclude that all populations are normally distributed.

- All population standard deviations are equal

We divide the largest standard deviation by the smallest standard deviation (2.447415 / 0.06776013 = 36.11881).

This value is bigger than 2, so we conduct a Levene test. The p-value of the Levene test is equal to 0.2128. This value is larger than alpha (0.05), so we do not reject the hypothesis that all population standard deviations are equal.


data1$province: Ha Noi
[1] 0.06776013

data1$province: Da Nang
[1] 0.1240284

data1$province: Ho Chi Minh City
[1] 2.447415

> 2.447415/0.06776013
[1] 36.11881

> leveneTest(data1$roa, data1$province, center = median)
Levene's Test for Homogeneity of Variance (center = median)
       Df F value Pr(>F)
group   2   1.561 0.2128
      177
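The equal-variance check above can be reproduced without the car package. The sketch below assumes simulated data (the vectors `province` and `roa` are hypothetical) and computes the median-centred Levene statistic, also known as the Brown-Forsythe test, as a one-way ANOVA on absolute deviations from the group medians. To our understanding this matches what leveneTest(..., center = median) reports, but treat it as an illustration rather than a replacement.

```r
# Simulated stand-in data: 60 firms per province with unequal spread.
set.seed(3)
province <- factor(rep(c("Ha Noi", "Da Nang", "Ho Chi Minh City"), each = 60))
roa <- rnorm(180, mean = 0.02, sd = rep(c(0.07, 0.12, 0.30), each = 60))

# Rule of thumb: ratio of the largest to the smallest group standard deviation.
group_sd <- tapply(roa, province, sd)
sd_ratio <- max(group_sd) / min(group_sd)
cat("Largest / smallest sd:", round(sd_ratio, 2), "\n")

# Median-centred Levene (Brown-Forsythe) test: one-way ANOVA on the
# absolute deviation of each observation from its own group median.
abs_dev <- abs(roa - ave(roa, province, FUN = median))
levene_table <- anova(lm(abs_dev ~ province))
levene_p <- levene_table[["Pr(>F)"]][1]
print(levene_table)

# Equal variances are rejected only when the p-value falls below alpha.
cat("Reject equal variances at 0.05:", levene_p < 0.05, "\n")
```

With three groups of 60 observations the degrees of freedom are 2 and 177, the same as in the Levene output shown in the report.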

We use the one-way ANOVA test:

> province.aov <- aov(roa ~ province, data = data1)
> summary(province.aov)

             Df Sum Sq Mean Sq F value Pr(>F)
province      2    5.2   2.586   1.291  0.278
Residuals   177  354.6   2.003

• Significance level: alpha = 0.05

Reject H0 if p-value < alpha.

P-value = 0.278 > 0.05, so we do not reject H0.

• Conclusion:

There is not enough evidence to conclude that mean profitability differs between provinces.
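As a check on the decision above, the F statistic and its p-value can be recomputed from the mean squares and degrees of freedom printed in the ANOVA table. The variable names below are ours; the numbers are copied from the output.

```r
# Values taken from the one-way ANOVA table for province.
ms_province <- 2.586   # mean square between provinces
ms_resid <- 2.003      # residual mean square
df_province <- 2
df_resid <- 177
alpha <- 0.05

# F is the ratio of the two mean squares.
f_stat <- ms_province / ms_resid

# Upper-tail probability of the F distribution gives the p-value.
p_value <- pf(f_stat, df_province, df_resid, lower.tail = FALSE)

reject_h0 <- p_value < alpha
cat("F =", round(f_stat, 3), " p-value =", round(p_value, 3),
    " reject H0:", reject_h0, "\n")
```

Because the p-value is above alpha, the test fails to reject H0. That is weaker than showing that the province means are equal.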

Question 3: Use analysis of variance to test for any significant differences due to types of ownership. Use a .05 level of significance, and for now, ignore the effect of province. Check all the assumptions of the inference technique you use. Are the assumptions satisfied? Explain.

In this question we use one-way ANOVA to check for any significant differences based on ownership. We need to check the assumptions:

- Samples are independent, simple random samples

- Populations are normally distributed

- All population standard deviations are equal

Assumption 1: Samples are independent, simple random samples.


Private-owned and state-owned are the two categories of ownership, as has been mentioned. There are 90 observations for each type of ownership, so the sample sizes are equal, and each firm belongs to only one ownership type. We may therefore treat the samples as independent.

Assumption 2: Populations are normally distributed.

To determine whether the populations are normally distributed, we use the R function qqPlot from the car package:

> business <- read.table("Business.csv", header = TRUE, sep = ",", quote = "\"", stringsAsFactors = FALSE)

> install.packages("car")

> library(car)

> qqPlot(lm(roa ~ own, data = business), simulate = TRUE, main = "Q-Q Plot", labels = FALSE)
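For readers without the car package, the same idea can be sketched with base R. The example below uses simulated data (`own` and `roa` are hypothetical vectors) and obtains the Q-Q coordinates of the model residuals with qqnorm(). The correlation between theoretical and sample quantiles is our own rough numerical companion to the visual check, not something the report computes.

```r
# Simulated stand-in data: 90 state-owned and 90 private-owned firms.
set.seed(4)
own <- factor(rep(c("SO", "PO"), each = 90), levels = c("SO", "PO"))
roa <- rnorm(180, mean = 0.02, sd = 0.1)

# Residuals of the one-way model: each value minus its group mean.
res <- residuals(lm(roa ~ own))

# Q-Q coordinates without drawing; set plot.it = TRUE to see the plot.
qq <- qqnorm(res, plot.it = FALSE)

# Points close to a straight line give a correlation near 1.
qq_cor <- cor(qq$x, qq$y)
cat("Correlation of theoretical and sample quantiles:", round(qq_cor, 4), "\n")
```

Heavy tails or strong outliers pull the points away from the line and lower this correlation.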

Assumption 3: All population standard deviations are equal

We use the function “by” to check if this assumption is true or not.


business$own: State
[1] 0.3241811

business$own: private

178

We have p-value = 0.448 and alpha = 0.05. The p-value is greater than alpha, so we do not reject the hypothesis that all population standard deviations are equal.

Test procedure:

H0: The population mean profitability is the same for all types of ownership.

Ha: At least two types of ownership have different population mean profitability.

• Check assumptions:

- Samples are independent, simple random samples

- Populations are normally distributed

- All population standard deviations are equal

We use the one-way ANOVA test for this:

> ovown <- aov(roa ~ own, data = business)
> summary(ovown)

             Df Sum Sq Mean Sq F value Pr(>F)
own           1    0.8  0.7633   0.378  0.539
Residuals   178  359.0  2.0168

• Significance level: alpha = 0.05

Reject H0 if p-value < alpha. We have p-value = 0.539 > 0.05, so we do not reject H0.

There is not enough evidence to conclude that mean profitability differs between types of ownership.
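Since ownership has only two levels, the one-way ANOVA above is equivalent to a pooled-variance two-sample t-test: the F statistic equals the square of the t statistic and the p-values coincide. A small sketch on simulated data (the vectors `own` and `roa` are hypothetical) illustrates this.

```r
# Simulated stand-in data: 90 firms per ownership type.
set.seed(5)
own <- factor(rep(c("SO", "PO"), each = 90), levels = c("SO", "PO"))
roa <- rnorm(180, mean = 0.02, sd = 0.1)

# One-way ANOVA on ownership.
aov_table <- summary(aov(roa ~ own))[[1]]
f_stat <- aov_table[["F value"]][1]
f_p <- aov_table[["Pr(>F)"]][1]

# Two-sample t-test with pooled variance (var.equal = TRUE).
t_res <- t.test(roa ~ own, var.equal = TRUE)
t_stat <- unname(t_res$statistic)

cat("F =", round(f_stat, 4), " t^2 =", round(t_stat^2, 4), "\n")
cat("ANOVA p =", round(f_p, 4), " t-test p =", round(t_res$p.value, 4), "\n")
```

The equivalence holds only for the pooled-variance form. The default Welch t-test in R does not assume equal variances and will generally give a slightly different result.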

Question 4: At the .05 level of significance, test for any significant differences due to province, types of ownership, and interaction. Check all the assumptions of the inference technique you use. Are the assumptions satisfied? Explain.

There are three sets of hypotheses for two-way ANOVA in this case study:

1. H01: The means of “Province” are equal

Ha1: The mean of at least one level of “Province” is different

2. H02: The means of “Ownership” are equal

Ha2: The mean of at least one level of “Ownership” is different

3. H03: There is no interaction between “Province” and “Ownership”

Ha3: There is interaction between “Province” and “Ownership”

a, Check assumptions:

- Samples are independent, simple random samples from each of 6 populations (“Ownership” has 2 levels; “Province” has 3 levels)

- All populations are normally distributed (Q-Q plot, figure 2)

- All populations have the same standard deviation (because largest standard deviation / smallest standard deviation = 3.43 / 0.03 ≈ 108.46 > 2, we use other tests, figure 3)

b, Test statistic:

The test was conducted using RStudio:

> data1.result <- aov(roa ~ province * own, data = data1)
> summary(data1.result)

              Df Sum Sq Mean Sq F value Pr(>F)
province       2    5.2  2.5857   1.281  0.280
own            1    0.8  0.7633   0.378  0.539
province:own   2    2.6  1.2955   0.642  0.528
Residuals    174  351.2  2.0185

Figure 1: Two-way ANOVA output

- Test statistics and p-values

F (province) = 1.281, p-value (province) = 0.280

F (own) = 0.378, p-value (own) = 0.539

F (province:own) = 0.642, p-value (province:own) = 0.528

• Decision rule: We reject H0 if p-value < α = 0.05

- p-value (province) > α (0.280 > 0.05), so we do not reject H01

- p-value (own) > α (0.539 > 0.05), so we do not reject H02

- p-value (province:own) > α (0.528 > 0.05), so we do not reject H03
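The whole two-way procedure, from fitting the model to applying the decision rule to each of the three hypotheses, can be sketched as follows. The data frame `demo` is simulated with the same layout as the case study (3 provinces, 2 ownership types, 30 firms per cell), so the printed numbers will not match the report.

```r
# Simulated stand-in data with the same balanced 3 x 2 design.
set.seed(6)
demo <- expand.grid(rep = 1:30,
                    province = c("Ha Noi", "Da Nang", "Ho Chi Minh City"),
                    own = c("SO", "PO"))
demo$roa <- rnorm(nrow(demo), mean = 0.02, sd = 0.1)

# Two-way ANOVA with interaction: province, own and province:own.
fit <- aov(roa ~ province * own, data = demo)
tab <- summary(fit)[[1]]
print(tab)

# Apply the decision rule to the three effect rows (the 4th is Residuals).
alpha <- 0.05
p_values <- tab[["Pr(>F)"]][1:3]
decisions <- ifelse(p_values < alpha, "reject H0", "do not reject H0")
names(decisions) <- trimws(rownames(tab)[1:3])
print(decisions)
```

The degrees of freedom follow from the design: 3 - 1 = 2 for province, 2 - 1 = 1 for own, 2 x 1 = 2 for the interaction, and 180 - 6 = 174 for the residuals, which agrees with Figure 1.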


Figure 2: Q-Q plot

> by(data1$roa, list(data1$province, data1$own), sd)
: Ha Noi
: private-owned
[1] 0.05845379

: Da Nang
: private-owned
[1] 0.1722996

: Ho Chi Minh City
: private-owned
[1] 0.5308589

: Ha Noi
: state-owned
[1] 0.06998216

Figure 3: Standard deviation output

Question 5: Draw an interaction plot and interpret the plot. Is the plot consistent with the conclusions made in Question 4?

To visualize the interaction between province and type of ownership on the outcome variable roa, and to recheck the conclusion in Question 4, we ran the following code:

interaction.plot(data1$province, data1$own, data1$roa, type = "b", col = c("red", "blue"), pch = c(16, 18), main = "Interaction between Province and Ownerships")

Following that, the resulting interaction plot is examined.
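An interaction plot draws one line per ownership type through the six cell means, so the same information can be read numerically. The sketch below uses simulated data (`demo` is hypothetical) and computes the cell means and the gap between the two ownership types in each province. Roughly constant gaps correspond to roughly parallel lines, which is what an absence of interaction looks like.

```r
# Simulated stand-in data with the same balanced 3 x 2 design.
set.seed(7)
demo <- expand.grid(rep = 1:30,
                    province = c("Ha Noi", "Da Nang", "Ho Chi Minh City"),
                    own = c("SO", "PO"))
demo$roa <- rnorm(nrow(demo), mean = 0.02, sd = 0.1)

# Cell means: these are the points that interaction.plot() connects.
cell_means <- tapply(demo$roa, list(demo$province, demo$own), mean)
print(round(cell_means, 4))

# Gap between state-owned and private-owned firms within each province.
gap <- cell_means[, "SO"] - cell_means[, "PO"]
print(round(gap, 4))

# How much the gap changes across provinces; values near 0 mean
# nearly parallel lines.
gap_spread <- max(gap) - min(gap)
cat("Spread of the SO - PO gap across provinces:", round(gap_spread, 4), "\n")
```

Lines that cross or diverge in the plot are only suggestive. Whether the interaction is statistically significant is decided by the province:own row of the two-way ANOVA table.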

Date posted: 29/08/2024, 16:08
