
End-of-Term Group Presentation, Subject: Business Analytics


DOCUMENT INFORMATION

Basic information

Title: Analyze the data and write a report according to the following requirements for the Business Analytics course.
Author: Group 4
Supervisor: Ph.D. Nguyễn Văn Dũng
School: UEH UNIVERSITY
Major: Business Analytics
Type: Group Presentation
Pages: 16
File size: 1.19 MB


UEH UNIVERSITY

UEH COLLEGE OF BUSINESS

SCHOOL OF INTERNATIONAL BUSINESS- MARKETING


END OF TERM - GROUP PRESENTATION

Subject: Business Analytics

Lecturer: Ph.D. Nguyễn Văn Dũng

Group: 4

Class Code: 23C1BUS50320002

Course - Class: IBC05

Major: International Business


Contents

1 Collect and create an SPSS file with the following requirements

2 Make a frequency table about Educational level

3 Draw a pie chart showing the percentage of observations by Gender

4 Compare the mean of Income of the 2 groups of gender

  a Descriptive Statistics

  b Hypothesis Stating

  c Result

5 Compare the mean of Income among different educational levels

  A Descriptive Statistics

  B Hypothesis Stating

  C Homogeneity of Variances Test

  D Result

6 Check whether there is multicollinearity among the variables: Age, Gender, Education, Marital status, Doing exercises?

7 Use multiple linear regression to analyze the impact of the variables Age, Gender, Education, Marital status, Doing exercises on the variable Income?

  A Linear Regression Model

  B Hypothesis stating and testing

  C Linear regression result

  D Interpretation of result


GROUP PRESENTATION - END OF TERM (20%)

Analyze the data and write a report according to the following requirements for the Business Analytics course

Requirements:

I Format:

1 Font: Times New Roman, Size: 12, Line spacing: 1.5 lines, Spacing: Before: 0 pt, After: 0 pt

2 Length: 10-15 pages (core content)

3 Cover, content, main text, reference list

4 Reference list and citations in the text follow APA style

5 Submit 1 Word file, 1 PPT file and 1 SPSS file into LMS

II Content:

1 Collect and create an SPSS file with the following requirements:

i Variables:

a Name: string

b Income: million VND/month

c Age

d Gender: Male (1), Female (0)

e Education: high school (1), university (2), master's degree (3), doctorate (4)

f Marital status: single (1), married (2)

g Doing exercises: doing regular exercise (more than 20 minutes a day) (1), not doing regular exercise (less than 20 minutes a day) (0)

ii Observations: 100
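The coding scheme above can be sketched as a small codebook with a validity check. This is only an illustration in Python, not the SPSS file the report describes; the field names (`gender`, `education`, and so on), the `validate` helper and the example record are all hypothetical.

```python
# Hypothetical codebook mirroring the variable coding listed in the requirements.
CODEBOOK = {
    "gender": {0: "Female", 1: "Male"},
    "education": {1: "High school", 2: "University", 3: "Master's degree", 4: "Doctorate"},
    "marital_status": {1: "Single", 2: "Married"},
    "exercise": {0: "Less than 20 minutes a day", 1: "More than 20 minutes a day"},
}


def validate(record):
    """Return a list of problems found in one survey record (empty if none)."""
    problems = []
    # Coded variables must use one of the codes defined in the codebook.
    for field, codes in CODEBOOK.items():
        if record.get(field) not in codes:
            problems.append(f"{field}: unexpected code {record.get(field)!r}")
    # Name is a string; income (million VND/month) and age are non-negative numbers.
    if not isinstance(record.get("name"), str):
        problems.append("name: expected a string")
    for field in ("income", "age"):
        value = record.get(field)
        if not isinstance(value, (int, float)) or value < 0:
            problems.append(f"{field}: expected a non-negative number")
    return problems


# Made-up record, used only to show the check.
example = {"name": "Respondent A", "income": 12.5, "age": 24,
           "gender": 1, "education": 2, "marital_status": 1, "exercise": 0}
print(validate(example))
```

Running a check like this before analysis catches coding slips, such as an education code outside 1 to 4, before they reach the frequency tables.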

2 Make a frequency table about Educational level


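Frequencies

Statistics: Education level (Trình độ học vấn)

| | N |
|---|---|
| Valid | 105 |
| Missing | 0 |

Education level (Trình độ học vấn)

| | Frequency | Percent | Valid Percent | Cumulative Percent |
|---|---|---|---|---|
| Highschool Student | 5 | 4.8 | 4.8 | 4.8 |
| College Student | 85 | 81.0 | 81.0 | 85.7 |
| Master's Degree | 13 | 12.4 | 12.4 | 98.1 |
| Ph.D Degree | 2 | 1.9 | 1.9 | 100.0 |
| Total | 105 | 100.0 | 100.0 | |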

The table shows the number of observations at each level of education and each level's share of the total sample. "College Student" is the largest group with 85 observations, accounting for 81.0% of the sample. "Ph.D Degree" is the smallest group with only 2 observations, or 1.9% of the sample.
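The percentages in the frequency table follow directly from the counts. A minimal sketch in Python, assuming only the four frequencies reported above (the variable names are illustrative):

```python
# Frequencies copied from the Education level table above.
counts = {
    "Highschool Student": 5,
    "College Student": 85,
    "Master's Degree": 13,
    "Ph.D Degree": 2,
}

total = sum(counts.values())  # number of valid observations

rows = []
cumulative = 0
for level, n in counts.items():
    cumulative += n
    percent = round(100 * n / total, 1)
    cumulative_percent = round(100 * cumulative / total, 1)
    rows.append((level, n, percent, cumulative_percent))

for level, n, percent, cumulative_percent in rows:
    print(f"{level:<20}{n:>5}{percent:>8}{cumulative_percent:>8}")
```

Because there are no missing values, Percent and Valid Percent coincide, which is why the table shows the same figure in both columns.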


3 Draw a pie chart showing the percentage of observations by Gender

[Figure: pie chart of the percentage of observations by Gender (Giới tính); legend: Female, Male]

The pie chart shows a fairly balanced gender split in the surveyed sample: 50.48% males and 49.52% females. The two genders are therefore almost equally represented in the dataset.

4 Compare the mean of Income of the 2 groups of gender

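Group Statistics: Current income (Thu nhập hiện tại), million VND/month

| Gender (Giới tính) | N | Mean | Std. Deviation | Std. Error Mean |
|---|---|---|---|---|
| Female | 53 | 8.208 | 10.7308 | 1.4740 |
| Male | 52 | 6.531 | 9.6743 | 1.3416 |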

a Descriptive Statistics:

- Male Group:

o Mean Income: 6.531

o Standard Deviation: 9.6743

- Female Group:

o Mean Income: 8.208

o Standard Deviation: 10.7308

b Hypothesis Stating:

Null Hypothesis (H0): There is no difference in mean income between the male and female groups.

If P-value > α, the null hypothesis (H0) is not rejected, suggesting no statistically significant difference in income.

Alternative Hypothesis (H1): There is a difference in mean income between the male and female groups.

If P-value < α, the null hypothesis is rejected, indicating a statistically significant difference in income between the two groups.

The significance level (α) is 0.05.

c Result

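Independent Samples Test: Current income (Thu nhập hiện tại)

Columns F and Sig. belong to Levene's Test for Equality of Variances; the remaining columns belong to the t-test for Equality of Means.

| | F | Sig. | t | df | Sig. (2-tailed) | Mean Difference | Std. Error Difference | 95% CI of the Difference, Lower | 95% CI of the Difference, Upper |
|---|---|---|---|---|---|---|---|---|---|
| Equal variances assumed | 0.061 | 0.805 | 0.840 | 103 | 0.403 | 1.6768 | 1.9951 | -2.2800 | 5.6336 |
| Equal variances not assumed | | | 0.841 | 102.276 | 0.402 | 1.6768 | 1.9931 | -2.2764 | 5.6300 |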

To check the equal variances assumption, use Levene's Test for Equality of Variances. The test's F-value is 0.061, which yields a P-value of 0.805 > 0.05. Therefore, there is no evidence that the variances of the two populations differ, and we use the results in the row "Equal variances assumed".

In the Independent Samples Test above, the P-value (Sig. 2-tailed) is 0.403 > 0.05, so the null hypothesis is not rejected. In other words, there is no evidence of a difference in mean Income between males and females.
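The t statistics, standard errors and degrees of freedom reported by SPSS can be reproduced from the group summary statistics alone. The following is a sketch in standard-library Python, assuming the means, standard deviations and group sizes from the Group Statistics table. P-values are left out because they require the t distribution, which the standard library does not provide.

```python
import math

# Summary statistics copied from the Group Statistics table (million VND/month).
n_f, mean_f, sd_f = 53, 8.208, 10.7308  # Female
n_m, mean_m, sd_m = 52, 6.531, 9.6743   # Male

diff = mean_f - mean_m  # mean difference (Female minus Male)

# Equal variances assumed: pooled-variance t-test.
pooled_var = ((n_f - 1) * sd_f**2 + (n_m - 1) * sd_m**2) / (n_f + n_m - 2)
se_pooled = math.sqrt(pooled_var * (1 / n_f + 1 / n_m))
t_pooled = diff / se_pooled
df_pooled = n_f + n_m - 2

# Equal variances not assumed: Welch t-test with Welch-Satterthwaite df.
v_f, v_m = sd_f**2 / n_f, sd_m**2 / n_m
se_welch = math.sqrt(v_f + v_m)
t_welch = diff / se_welch
df_welch = (v_f + v_m) ** 2 / (v_f**2 / (n_f - 1) + v_m**2 / (n_m - 1))

print(f"pooled: t = {t_pooled:.3f}, df = {df_pooled}, SE = {se_pooled:.4f}")
print(f"Welch:  t = {t_welch:.3f}, df = {df_welch:.3f}, SE = {se_welch:.4f}")
```

Both versions give a t statistic of about 0.84, far below the values that would be needed for significance at the 5% level, which is consistent with the decision not to reject H0.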

5 Compare the mean of Income among different educational levels

Descriptives: Current income (Thu nhập hiện tại), million VND/month

| | N | Mean | Std. Deviation | Std. Error | 95% CI for Mean, Lower Bound | 95% CI for Mean, Upper Bound | Minimum | Maximum |
|---|---|---|---|---|---|---|---|---|
| Highschool Student | 5 | 4.000 | 8.9443 | 4.0000 | -7.106 | 15.106 | 0.0 | 20.0 |
| College Student | 85 | 4.419 | 3.0401 | 0.3298 | 3.763 | 5.075 | 0.0 | 15.0 |
| Master's Degree | 13 | 20.692 | 13.0600 | 3.6222 | 12.800 | 28.584 | 5.0 | 60.0 |
| Ph.D Degree | 2 | 55.000 | 7.0711 | 5.0000 | -8.531 | 118.531 | 50.0 | 60.0 |
| Total | 105 | 7.377 | 10.2069 | 0.9961 | 5.402 | 9.352 | 0.0 | 60.0 |

A Descriptive Statistics:

1 Highschool Student:

- Mean income: 4.000

- Standard Deviation: 8.9443

- Sample Size: 5

2 College Student:

- Mean income: 4.419

- Standard Deviation: 3.0401

- Sample Size: 85

3 Master's Degree:

- Mean income: 20.692

- Standard Deviation: 13.0600

- Sample Size: 13

4 Ph.D Degree:

- Mean income: 55.000

- Standard Deviation: 7.0711

- Sample Size: 2

B Hypothesis Stating:

1 Null Hypothesis (H0): There is no difference in mean income among the four educational levels.

If P > α, the null hypothesis (H0) is not rejected, suggesting no statistically significant difference in income.

2 Alternative Hypothesis (H1): Mean income differs for at least one of the four educational levels.

If P < α, the null hypothesis (H0) is rejected, indicating a statistically significant difference in income among educational levels.

The significance level (α) is 0.05.

Test of Homogeneity of Variances: Current income (Thu nhập hiện tại)

| | Levene Statistic | df1 | df2 | Sig. |
|---|---|---|---|---|
| Based on Mean | 7.736 | 3 | 101 | 0.000 |
| Based on Median | 4.927 | 3 | 101 | 0.003 |
| Based on Median and with adjusted df | 4.927 | 3 | 21.760 | 0.009 |
| Based on trimmed mean | 6.641 | 3 | 101 | 0.000 |

C Homogeneity of Variances Test:

- The Levene test indicates that the variances of the four educational levels are not equal (Sig. < 0.05).

⇒ This violates the assumption of homogeneity of variances required by the standard one-way ANOVA F-test, which may affect the validity of its results. Therefore, an alternative method is used: Welch's test, reported in the table Robust Tests of Equality of Means.

Robust Tests of Equality of Means: Current income (Thu nhập hiện tại)

| | Statistic (a) | df1 | df2 | Sig. |
|---|---|---|---|---|
| Welch | 29.934 | 3 | 3.787 | 0.004 |

a Asymptotically F distributed

D Result:

The Sig. of the Welch test is 0.004 < 0.05. We therefore reject H0: there is a statistically significant difference in mean Income among the educational levels, that is, at least two of the four groups differ significantly in mean Income.
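For readers who want to see where the Welch statistic comes from, it can be recomputed from the group sizes, means and standard deviations in the Descriptives table. This is a sketch of the standard Welch one-way ANOVA formula in plain Python, not SPSS's own implementation; the P-value is omitted because it requires the F distribution.

```python
# (n, mean, standard deviation) per education level, from the Descriptives table.
groups = [
    (5, 4.000, 8.9443),     # Highschool Student
    (85, 4.419, 3.0401),    # College Student
    (13, 20.692, 13.0600),  # Master's Degree
    (2, 55.000, 7.0711),    # Ph.D Degree
]

k = len(groups)

# Each group is weighted by n / s^2, so groups with noisy means count less.
weights = [n / sd**2 for n, _, sd in groups]
w_total = sum(weights)
weighted_mean = sum(w * m for w, (_, m, _) in zip(weights, groups)) / w_total

# Weighted between-group variation.
numerator = sum(
    w * (m - weighted_mean) ** 2 for w, (_, m, _) in zip(weights, groups)
) / (k - 1)

# Correction term that also determines the denominator degrees of freedom.
lam = sum((1 - w / w_total) ** 2 / (n - 1) for w, (n, _, _) in zip(weights, groups))

welch_f = numerator / (1 + 2 * (k - 2) / (k**2 - 1) * lam)
df1 = k - 1
df2 = (k**2 - 1) / (3 * lam)

print(f"Welch F = {welch_f:.3f}, df1 = {df1}, df2 = {df2:.3f}")
```

The small df2 (under 4) reflects how little information the two smallest groups carry, so the result should be read with their sample sizes (5 and 2) in mind.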

6 Check whether there is multicollinearity among the variables: Age,

Gender, Education, Marital status, Doing exercises?

Correlations

| | | Age (Độ tuổi hiện tại) | Gender (Giới tính) | Education (Trình độ học vấn) | Marital status (Tình trạng hôn nhân) | Doing exercises (Mức độ tập luyện thể dục) |
|---|---|---|---|---|---|---|
| Age | Pearson Correlation | 1 | -0.088 | 0.664** | 0.507** | -0.149 |
| | Sig. (2-tailed) | | 0.370 | 0.000 | 0.000 | 0.129 |
| | N | 105 | 105 | 105 | 105 | 105 |
| Gender | Pearson Correlation | -0.088 | 1 | -0.194* | 0.173 | -0.276** |
| | Sig. (2-tailed) | 0.370 | | 0.047 | 0.078 | 0.004 |
| | N | 105 | 105 | 105 | 105 | 105 |
| Education | Pearson Correlation | 0.664** | -0.194* | 1 | 0.349** | -0.199* |
| | Sig. (2-tailed) | 0.000 | 0.047 | | 0.000 | 0.042 |
| | N | 105 | 105 | 105 | 105 | 105 |
| Marital status | Pearson Correlation | 0.507** | 0.173 | 0.349** | 1 | -0.173 |
| | Sig. (2-tailed) | 0.000 | 0.078 | 0.000 | | 0.078 |
| | N | 105 | 105 | 105 | 105 | 105 |
| Doing exercises | Pearson Correlation | -0.149 | -0.276** | -0.199* | -0.173 | 1 |
| | Sig. (2-tailed) | 0.129 | 0.004 | 0.042 | 0.078 | |
| | N | 105 | 105 | 105 | 105 | 105 |

** Correlation is significant at the 0.01 level (2-tailed)

* Correlation is significant at the 0.05 level (2-tailed)

The correlations for "Age - Education", "Age - Marital status", "Gender - Education", "Gender - Doing exercises", "Education - Marital status" and "Education - Doing exercises" are statistically significant, but none of their Pearson coefficients exceeds the recommended threshold of ±0.7 (the largest is 0.664, for Age and Education).

The correlations for "Age - Gender", "Age - Doing exercises", "Gender - Marital status" and "Marital status - Doing exercises" are not statistically significant.

⇒ There is therefore no indication of a multicollinearity problem among the five variables.
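The screening rule used here (flag any pair whose absolute Pearson correlation reaches 0.7) can be written out explicitly. A minimal sketch in Python, assuming the ten pairwise coefficients from the Correlations table; the dictionary layout and names are illustrative.

```python
# Pairwise Pearson correlations copied from the Correlations table.
CORRELATIONS = {
    ("Age", "Gender"): -0.088,
    ("Age", "Education"): 0.664,
    ("Age", "Marital status"): 0.507,
    ("Age", "Doing exercises"): -0.149,
    ("Gender", "Education"): -0.194,
    ("Gender", "Marital status"): 0.173,
    ("Gender", "Doing exercises"): -0.276,
    ("Education", "Marital status"): 0.349,
    ("Education", "Doing exercises"): -0.199,
    ("Marital status", "Doing exercises"): -0.173,
}

THRESHOLD = 0.7  # threshold used in the report

# Pairs whose absolute correlation reaches the threshold would be flagged.
flagged = {pair: r for pair, r in CORRELATIONS.items() if abs(r) >= THRESHOLD}
strongest_pair = max(CORRELATIONS, key=lambda pair: abs(CORRELATIONS[pair]))

print("flagged pairs:", flagged)
print("strongest pair:", strongest_pair, CORRELATIONS[strongest_pair])
```

A pairwise check of this kind cannot detect a variable that is well explained by several others jointly; variance inflation factors are a common complementary check, although the report does not compute them.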

7 Use multiple linear regression to analyze the impact of the variables

Age, Gender, Education, Marital status, Doing exercises on the variable

Income?

A Linear Regression Model

By utilizing a multiple linear regression model, we can analyze the impact of the five independent variables Age, Gender, Education, Marital status, and Doing exercises on the dependent variable Income. The model equation is as follows:

$$Y = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \beta_3 X_3 + \beta_4 X_4 + \beta_5 X_5 + \varepsilon$$

Where:

• $Y$ is the dependent variable Income

• $X_1, X_2, X_3, X_4, X_5$ are the independent variables Age, Gender, Education, Marital status, Doing exercises respectively

• $\beta_0$ represents the intercept

• $\beta_1, \beta_2, \beta_3, \beta_4, \beta_5$ are the respective coefficients for the independent variables

• $\varepsilon$ represents the error term


A R square & Interpretation:

Model Summary

| Model | R | R Square | Adjusted R Square | Std. Error of the Estimate |
|---|---|---|---|---|
| 1 | 0.841 (a) | 0.707 | 0.692 | 5.6613 |

a Predictors: (Constant), Doing exercises, Age, Gender, Marital status, Education

The value of R² (0.707) indicates that about 70.7% of the variation in Income is explained by the five independent variables.

B Hypothesis stating and testing:

H0: $\beta_1 = \beta_2 = \beta_3 = \beta_4 = \beta_5 = 0$

If P > α, the null hypothesis (H0) is not rejected, suggesting no statistically significant relationship.

H1: At least one $\beta_j \neq 0$

If P < α, the null hypothesis (H0) is rejected, suggesting a statistically significant relationship exists.

ANOVA (a)

| Model 1 | Sum of Squares | df | Mean Square | F | Sig. |
|---|---|---|---|---|---|
| Regression | 7661.781 | 5 | 1532.356 | 47.811 | 0.000 (b) |
| Residual | 3172.964 | 99 | 32.050 | | |
| Total | 10834.745 | 104 | | | |

a Dependent Variable: Current income (Thu nhập hiện tại)

b Predictors: (Constant), Doing exercises, Age, Gender, Marital status, Education

Test statistic: F = 47.811, which yields a P-value of 0.000 (that is, below 0.001) < α. Therefore, the null hypothesis is rejected: at least one independent variable has a statistically significant linear relationship with the dependent variable.
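The F statistic, R², adjusted R² and the standard error of the estimate are all functions of the two sums of squares in the ANOVA table. The sketch below, in standard-library Python, assumes only those sums of squares, the sample size (105) and the number of predictors (5).

```python
import math

# Sums of squares copied from the ANOVA table.
ss_regression = 7661.781
ss_residual = 3172.964
n = 105  # observations
k = 5    # independent variables

ss_total = ss_regression + ss_residual
df_regression = k
df_residual = n - k - 1

ms_regression = ss_regression / df_regression
ms_residual = ss_residual / df_residual

f_stat = ms_regression / ms_residual               # overall F test
r_squared = ss_regression / ss_total               # share of variation explained
adj_r_squared = 1 - (1 - r_squared) * (n - 1) / df_residual
std_error_estimate = math.sqrt(ms_residual)        # residual standard error

print(f"F = {f_stat:.3f}")
print(f"R^2 = {r_squared:.3f}, adjusted R^2 = {adj_r_squared:.3f}")
print(f"Std. error of the estimate = {std_error_estimate:.4f}")
```

The results agree with the Model Summary (R² = 0.707, adjusted R² = 0.692, standard error 5.6613), which is a useful consistency check on the OCR'd tables.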

C Linear regression result

Coefficients (a)

| Model 1 | B (Unstandardized) | Std. Error (Unstandardized) | Beta (Standardized) | t | Sig. |
|---|---|---|---|---|---|
| (Constant) | -30.413 | 4.010 | | -7.585 | 0.000 |
| Age (Độ tuổi hiện tại) | 0.268 | 0.113 | 0.188 | 2.375 | 0.019 |
| Gender (Giới tính) | -1.101 | 1.230 | -0.054 | -0.895 | 0.373 |
| Education (Trình độ học vấn) | 9.758 | 1.587 | 0.465 | 6.147 | 0.000 |
| Marital status (Tình trạng hôn nhân) | 12.768 | 2.374 | 0.352 | 5.379 | 0.000 |
| Doing exercises (Mức độ tập luyện thể dục) | -1.314 | 1.194 | -0.065 | -1.101 | 0.274 |

a Dependent Variable: Current income (Thu nhập hiện tại)

Looking at the P-values (Sig.) in the last column of the table, two of the five independent variables (Gender, P = 0.373; Doing exercises, P = 0.274) have P-values that exceed the significance level of 0.05 and are therefore not statistically significant.
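To make the fitted equation concrete, the unstandardized coefficients of the five-predictor model can be plugged into the regression equation. This is an illustrative sketch in Python: the coefficients come from the Coefficients table above, while the function name and the example respondent are hypothetical and not taken from the dataset.

```python
# Unstandardized coefficients (B) copied from the five-predictor Coefficients table.
COEFFICIENTS = {
    "constant": -30.413,
    "age": 0.268,
    "gender": -1.101,
    "education": 9.758,
    "marital_status": 12.768,
    "exercise": -1.314,
}


def predict_income(age, gender, education, marital_status, exercise):
    """Predicted income (million VND/month) from the fitted linear model."""
    return (
        COEFFICIENTS["constant"]
        + COEFFICIENTS["age"] * age
        + COEFFICIENTS["gender"] * gender
        + COEFFICIENTS["education"] * education
        + COEFFICIENTS["marital_status"] * marital_status
        + COEFFICIENTS["exercise"] * exercise
    )


# Hypothetical respondent: 30 years old, male (1), university (2),
# married (2), exercises regularly (1).
predicted = predict_income(age=30, gender=1, education=2, marital_status=2, exercise=1)
print(f"Predicted income: {predicted:.3f} million VND/month")
```

Because Education and Marital status enter as numeric codes, each one-step increase in the code shifts the prediction by the full coefficient (about 9.8 and 12.8 million VND/month respectively), holding the other variables fixed.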

Removing the variable with the highest P-value (Gender) and re-estimating the model gives a revised regression model:

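Coefficients (a)

| Model 1 | B (Unstandardized) | Std. Error (Unstandardized) | Beta (Standardized) | t | Sig. |
|---|---|---|---|---|---|
| (Constant) | -31.738 | 3.723 | | -8.524 | 0.000 |
| Age (Độ tuổi hiện tại) | 0.273 | 0.112 | 0.192 | 2.424 | 0.017 |
| Education (Trình độ học vấn) | 10.102 | 1.539 | 0.482 | 6.565 | 0.000 |
| Marital status (Tình trạng hôn nhân) | 12.259 | 2.303 | 0.338 | 5.324 | 0.000 |
| Doing exercises (Mức độ tập luyện thể dục) | -0.983 | 1.134 | -0.048 | -0.867 | 0.388 |

a Dependent Variable: Current income (Thu nhập hiện tại)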

The P-value of Doing exercises (0.388) remains higher than the significance level, indicating that the variable is not statistically significant in the model and should be removed.

The resulting regression model follows from the coefficient table below:

Coefficients’

Date posted: 25/09/2024, 16:32