
business statistics group assignment


DOCUMENT INFORMATION

Basic information

Title: Employee Data Analysis
Authors: Nguyễn Thị Lan Anh, Phan Thị Phương Thảo, Tạ Khánh Vi, Lê Thị Mai Chi, Nguyễn Thị Việt An
School: National Economics University
Major: Business Statistics
Type: Group Assignment
Year of publication: 2023
City: Ha Noi
Pages: 27
File size: 2.32 MB


NATIONAL ECONOMICS UNIVERSITY

Business School

BUSINESS STATISTICS CLASS: EBDB4 Group Assignment

Group Members:

Nguyễn Thị Lan Anh - 11220457

Phan Thị Phương Thảo - 11225967

Tạ Khánh Vi - 11226889

Lê Thị Mai Chi - 11220982

Nguyễn Thị Việt An - 11220045

Ha Noi, November 8th, 2023.


TABLE OF CONTENTS

CASE STUDY 1: EMPLOYEE

Question 1

Question 2

CASE STUDY 2: Specialty Toys

Question 1

Question 2

Question 3

Question 4

Question 5

CASE STUDY 1: EMPLOYEE

Question 1

a. Do the necessary data cleaning.

b. Produce the appropriate graphs and tables to explore the distribution of variables. Summarize the quantitative variables using location and variability measures.

• Produce the appropriate graphs and tables to explore the distribution of variables

- There are 1000 observations. The majority of employees (752, or 75.2%) hold a bachelor's degree, followed by 211 employees (21.1%) with a master's degree. Only 37 employees (3.7%) hold a PhD.

- 2017 is the year in which the most employees joined the company: 243 people, accounting for 24.3%. In 2018 and 2012, only 81 and 99 employees joined, respectively. In the remaining years, the share of employees joining the company increased from 11.9% to 16.6%.

- Bangalore is the city where 47% of employees are based, followed by Pune with 279 employees and New Delhi with 251 employees.

- Most employees are in payment tier 3, accounting for 74.4%. Payment tiers 2 and 1 have 204 and 52 employees, respectively.

- The most common age among the company's employees is 26 (203 people). Ages 22 to 23 and ages above 29 each have fewer than 10 employees, except age 34 (11 employees).

- Male employees (61.6%) outnumber female employees (38.4%) by a factor of about 1.6.

- 898 of the 1,000 employees said they had never been temporarily without assigned work ("benched"); the remaining 102 had.

- From this table, we can see that many employees have 2 to 4 years of experience in their current field. Only 21 employees have no prior experience, and 69 employees have 1 year of experience.

- Out of 1000 employees, 35.7% (357 employees) responded that they would leave the company. The remaining 643 employees (64.3%) answered that they would not leave.

• Summarize the quantitative variables using location and variability measures

We have two quantitative variables: Age and Experience in Current Domain.

- For the Age variable, the youngest employee is 22 years old and the oldest is 40; the mode is 26. The mean is 26.49, the standard deviation is 2.575, and the variance is 6.629.

- For the Experience in Current Domain variable, the minimum is 0 years and the maximum is 4 years of experience; the mode is 3. The mean is 3.14, the standard deviation is 1.291, and the variance is 1.666.
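The location and variability measures above can be reproduced with a few lines of Python. This is only a minimal sketch: the `ages` list is a hypothetical sample made up for illustration, not the 1000 survey records, and `summarize` is a helper name introduced here. It assumes the sample (n - 1) formulas, so the variance equals the squared standard deviation, in line with the reported 2.575² ≈ 6.63.

```python
import statistics

# Hypothetical ages for illustration only; NOT the assignment's data set.
ages = [22, 24, 26, 26, 26, 27, 28, 30, 34, 40]


def summarize(values):
    """Return location and variability measures for a list of numbers."""
    return {
        "min": min(values),
        "max": max(values),
        "mode": statistics.mode(values),
        "mean": statistics.mean(values),
        "stdev": statistics.stdev(values),        # sample standard deviation (n - 1)
        "variance": statistics.variance(values),  # sample variance (n - 1)
    }


summary = summarize(ages)
for name, value in summary.items():
    print(f"{name}: {value}")
```

The same function can be applied to the Experience in Current Domain column once the real data are loaded.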

c. Analyze the relationship between two variables (for example, how do the years of working in the company vary by city? Is there any relationship between Payment Tier and Experience?)

• Experience in Current Domain and Age: how is employees' experience in their current domain affected by their age?

Scatter graph:

• There is a nonlinear relationship.

Correlation:

• A Pearson correlation analysis was performed to test whether there is a linear relationship between age and experience in current domain. The result showed no statistically significant relationship between the two variables: |r| = 0.25, p-value = 0.437 > 0.05 => A linear regression model is not suitable.
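For readers who want to see what the Pearson coefficient measures, the sketch below computes r directly from its definition. The two lists are hypothetical values chosen for illustration, not the survey data, and `pearson_r` is a helper name introduced here.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-products and sums of squares around the means
    cross = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    return cross / math.sqrt(ss_x * ss_y)


# Hypothetical ages and years of experience, for illustration only.
ages = [22, 24, 25, 26, 27, 28, 30, 34]
experience = [0, 2, 3, 4, 3, 2, 1, 2]

r = pearson_r(ages, experience)
print(f"r = {r:.3f}")
```

A value of r near 0 means there is little linear association; it does not rule out a nonlinear pattern such as the one seen in the scatter graph.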

c.1 Joining year and Leaving or not

Null hypothesis: There is no relationship between Joining year and Leaving or not.

Alternative hypothesis: There is a relationship between Joining year and Leaving or not.

The Chi-Square value of 260.293 with 6 degrees of freedom and a p-value of .000 (that is, p < .001) indicates a highly statistically significant association between the joining year and the decision to leave or not.

Since the p-value is less than 0.001, we reject the null hypothesis that there is no association between the variables.
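The chi-square test of independence used in this section can be sketched in plain Python. The 2x2 `table` below is hypothetical (the original crosstab cell counts are not reproduced here), and `chi_square_statistic` and `chi2_sf` are helper names introduced for illustration. The upper-tail probability uses the standard closed-form series for integer degrees of freedom, so no external library is assumed.

```python
import math


def chi_square_statistic(table):
    """Return (statistic, df) for a contingency table given as a list of rows."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(table[0]) - 1)
    return stat, df


def chi2_sf(x, df):
    """P(X > x) for a chi-square variable with a positive integer df."""
    if x <= 0:
        return 1.0
    half = x / 2.0
    if df % 2 == 0:
        # Even df: exp(-x/2) * sum_{k < df/2} (x/2)^k / k!
        term, total = 1.0, 1.0
        for k in range(1, df // 2):
            term *= half / k
            total += term
        return math.exp(-half) * total
    # Odd df: two-sided normal tail plus a finite correction series
    root = math.sqrt(x)
    tail = math.erfc(root / math.sqrt(2.0))
    term, total = root, 0.0
    for r in range(1, (df - 1) // 2 + 1):
        if r > 1:
            term *= x / (2 * r - 1)
        total += term
    density = math.exp(-half) / math.sqrt(2.0 * math.pi)
    return tail + 2.0 * density * total


# Hypothetical counts: rows = two groups, columns = stay / leave.
table = [[30, 20], [20, 30]]
stat, df = chi_square_statistic(table)
p_value = chi2_sf(stat, df)
print(f"chi-square = {stat:.3f}, df = {df}, p = {p_value:.4f}")
```

Calling `chi2_sf` with a reported statistic and its degrees of freedom, for example `chi2_sf(260.293, 6)`, gives the corresponding p-value.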

c.2 City and Leaving or not

The Chi-Square value of 33.532 with 2 degrees of freedom and a p-value of 0.000 (that is, p < 0.001) indicates a highly statistically significant association between the city where the employee is based and the decision to leave or not.

Since the p-value is less than 0.001, we reject the null hypothesis that there is no association between the variables.

c.3 Payment Tier and Leaving or not

[Table: PaymentTier * LeaveOrNot Crosstabulation; totals: 643, 357, 1000]

[Table: Chi-Square Tests]

The Chi-Square value of 77.243 with 2 degrees of freedom and a p-value of 0.000 (that is, p < 0.001) indicates a highly statistically significant association between payment tier and the decision to leave or not.

Since the p-value is less than 0.001, we reject the null hypothesis that there is no association between the variables.

c.4 Age and Leaving or not

[Table: Age * LeaveOrNot Crosstabulation]

The Chi-Square value of 30.324 with 17 degrees of freedom and a p-value of 0.024 (greater than 0.001) suggests only weak statistical evidence of an association between age and the decision to leave or not.

Since the p-value is greater than the 0.001 threshold used here, we fail to reject the null hypothesis that there is no association between the variables. The result would, however, be significant at the conventional 5% level.

c.5 Gender and Leaving or not

[Table: Gender * LeaveOrNot Crosstabulation]

[Table: Chi-Square Tests; columns: Value, df, Asymp. Sig. (2-sided), Exact Sig. (2-sided), Exact Sig. (1-sided)]

a. 0 cells (0.0%) have expected count less than 5. The minimum expected count is 137.09.

b. Computed only for a 2x2 table.

The Chi-Square value of 35.512 with 1 degree of freedom and a p-value of 0.000 (that is, p < 0.001) indicates a highly statistically significant association between gender and the decision to leave or not.

Since the p-value is less than 0.001, we reject the null hypothesis that there is no association between the variables.
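The minimum expected count of 137.09 in the note to the Chi-Square Tests table follows from the table margins alone. The sketch below assumes the margins implied by the percentages reported earlier (616 male and 384 female employees; 643 staying and 357 leaving); `expected_counts` is a helper name introduced here.

```python
def expected_counts(row_totals, col_totals):
    """Expected cell counts under independence: row total * column total / n."""
    n = sum(row_totals)
    return [[r * c / n for c in col_totals] for r in row_totals]


# Margins assumed from the text: gender (male, female) and LeaveOrNot (stay, leave).
expected = expected_counts([616, 384], [643, 357])
min_expected = min(min(row) for row in expected)
print(round(min_expected, 2))  # 137.09
```

Because every expected count is well above 5, the usual chi-square approximation is reasonable for this table.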

c.6 EverBenched and Leaving or not

[Table: EverBenched * LeaveOrNot Crosstabulation]

[Table: Chi-Square Tests]

b. Computed only for a 2x2 table.

The Chi-Square value of 8.779 with 1 degree of freedom and a p-value of 0.003 (greater than 0.001) suggests that the evidence of an association between EverBenched and the decision to leave or not is not strong.

Since the p-value is greater than the 0.001 threshold used here, we fail to reject the null hypothesis that there is no association between the variables. The result would, however, be significant at the conventional 5% level.

c.7 Experience in current domain and Leaving or not

[Table: ExperienceInCurrentDomain * LeaveOrNot Crosstabulation]

The Chi-Square value of 5.324 with 5 degrees of freedom and a p-value of 0.378 (greater than 0.05) provides no statistical evidence of an association between experience in current domain and the decision to leave or not.

Since the p-value is far above any conventional significance level, we fail to reject the null hypothesis that there is no association between the variables.

c.8 Education and Leaving or not

[Table: Education * LeaveOrNot Crosstabulation]

The Chi-Square value of 41.411 with 2 degrees of freedom and a p-value of 0.000 (that is, p < 0.001) indicates a highly statistically significant association between education and the decision to leave or not.

Since the p-value is less than 0.001, we reject the null hypothesis that there is no association between the variables.

d. Produce confidence intervals and comparative tests where possible

Gender and Age (coding: 1 = Male, 2 = Female)

We consider the difference in age between male and female employees.

Hypotheses:

H0: There is no difference in mean age between the two samples.

H1: There is a difference in mean age between the two samples.

- An independent samples t-test was conducted to compare age between male and female employees. There was no significant difference in age between males (mean = 26.57, SD = 2.7533) and females (mean = 26.365, SD = 2.2523); t = 1.226, p = 0.22 (two-tailed). The magnitude of the difference in the means (mean difference = 0.2052, 95% CI: (-0.1232, 0.5336)) was small, and the interval includes 0.

- In this t-test, the two-tailed p-value is 0.22, or 22%. Since the significance level was set at 5%, the p-value exceeds it. We therefore fail to reject H0: there is no evidence of a difference in mean age between the two samples, which is consistent with their coming from the same population.
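The reported t statistic and confidence interval can be approximately reproduced from the summary statistics alone. The sketch below makes two assumptions that are not stated in the report: the group sizes are taken as 616 males and 384 females (from the 61.6% / 38.4% split of 1000 employees), and a pooled-variance t-test is used. The critical value and p-value use a normal approximation to the t distribution, which is reasonable at 998 degrees of freedom but gives slightly narrower limits than the SPSS output. `pooled_t_test` is a helper name introduced here.

```python
import math
from statistics import NormalDist


def pooled_t_test(mean1, sd1, n1, mean2, sd2, n2, confidence=0.95):
    """Pooled-variance two-sample t-test computed from summary statistics."""
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / df
    se = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    diff = mean1 - mean2
    t_stat = diff / se
    # Normal approximation to the t distribution (large df assumed)
    std_normal = NormalDist()
    p_two_sided = 2 * (1 - std_normal.cdf(abs(t_stat)))
    z = std_normal.inv_cdf(1 - (1 - confidence) / 2)
    return t_stat, p_two_sided, (diff - z * se, diff + z * se)


# Means and standard deviations from the report; group sizes are assumed.
t_stat, p_value, ci = pooled_t_test(26.57, 2.7533, 616, 26.365, 2.2523, 384)
print(f"t = {t_stat:.3f}, p = {p_value:.3f}, 95% CI = ({ci[0]:.4f}, {ci[1]:.4f})")
```

The interval contains 0 and the p-value is well above 0.05, which matches the conclusion that H0 is not rejected.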

2.1 Education and Decision to leave

Educational level has an impact on the decision to leave the company, so HR should consider implementing strategies to address this factor and enhance employee retention, such as training and development opportunities, educational assistance programs, and recognition and rewards.

2.2 Joining year and Decision to leave

A notable disparity exists in the average joining year between individuals who opted to leave and those who chose to stay, with departing employees having joined more recently. HR efforts could be directed towards enhancing the onboarding process and providing better support in the early stages of a career to improve retention among newer employees.

2.3 City and Decision to leave

The chi-square test findings indicate a noteworthy association between the city and the decision to depart. It is advisable for HR to further investigate factors specific to each location that could impact employee retention and to proactively address any regional concerns.

2.4 Payment Tier and Decision to leave

Because an employee's decision to leave is associated with payment tier, HR can take measures to reduce departures: salary benchmarking, transparent compensation policies, and rewards and professional development opportunities.

2.5 Age and Decision to leave

Age shows no significant association with the decision to leave at the 0.001 threshold (p = 0.024), so HR can recruit across all age ranges.

2.6 Gender and Decision to leave

Gender is significantly associated with employees' decision to leave, so HR should adopt equal opportunity and inclusion policies, family-friendly policies, and similar measures.

2.7 Ever Benched and Decision to leave

The chi-square test indicates no significant association between being benched and the decision to leave at the 0.001 threshold. However, it is important to note that the p-value (0.003) is close to that threshold. HR should remain vigilant and monitor the situation closely, as benching may become a more influential factor in the future.

2.8 Experience in current domain and Decision to leave

The independent samples t-test comparing the decision to leave with the level of experience indicates that there is no significant difference in the mean experience between those who chose to leave and those who stayed. This suggests that decisions to leave may not be influenced by the amount of experience within the current domain. HR may want to explore alternative factors affecting retention, such as work-life balance or job satisfaction.

CASE STUDY 2: Specialty Toys

Question 1

According to the case study:

• Expected demand is 20000 units, with a 0.9 probability that demand will be between 10000 units and 30000 units. That is, the 90% confidence interval is (10000; 30000).

• For 90% confidence, α = 1 - 0.9 = 0.1

The 90% confidence interval is given by: $\mu \pm z_{\alpha/2} \times \sigma = \mu \pm z_{0.05} \times \sigma$

where $z_{0.05}$ is the critical value at 0.05 obtained from normal distribution tables. We know that $z_{0.05} = 1.645$. So, the 90% confidence interval is given by: $\mu \pm 1.645\sigma$.

• From the sales forecaster's prediction, we know that:

μ - 1.645σ = 10000 (1); μ + 1.645σ = 30000 (2)

Solving (1) and (2), we get: μ = 20000; σ = 6079.03
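Solving equations (1) and (2) amounts to taking the midpoint of the interval as μ and dividing its half-width by the critical value. A minimal sketch, with `normal_from_interval` as a helper name introduced here:

```python
from statistics import NormalDist


def normal_from_interval(lower, upper, z):
    """Recover (mu, sigma) from a symmetric interval mu +/- z * sigma."""
    mu = (lower + upper) / 2
    sigma = (upper - lower) / (2 * z)
    return mu, sigma


# Table value z = 1.645, as used in the text.
mu, sigma = normal_from_interval(10000, 30000, 1.645)
print(f"mu = {mu:.0f}, sigma = {sigma:.2f}")

# For comparison: the critical value computed rather than read from a table.
z_exact = NormalDist().inv_cdf(0.95)
print(f"z_0.05 = {z_exact:.4f}")
```

Using the computed critical value instead of the rounded table value changes σ only slightly.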

• The probability density function that defines the curve of the normal distribution is given by:

$f(x) = \frac{1}{6079.03\sqrt{2\pi}} \, e^{-\frac{1}{2}\left(\frac{x - 20000}{6079.03}\right)^{2}}$

Now we sketch the curve. The highest point on the normal curve is at the mean, 20000. The curve is symmetric. The standard deviation determines how flat and wide the curve is: larger values result in wider, flatter curves, showing more variability in the data. Thus, we have:

[Figure: sketch of the normal demand curve]

Question 2

A stock-out will occur if demand is greater than the order quantity. To compute the probability of a stock-out, we need the probability that demand exceeds the order quantity for each of the order quantities recommended by the different managers. This can be done either with the Excel NORMDIST function or manually using the standard normal distribution tables.

• Using the NORMDIST function in Excel: NORMDIST(x; μ; σ; TRUE)

Where:

x = the recommended order quantity

μ = the mean of the distribution, 20000 in this case

σ = the standard deviation of the distribution, 6079.03 in this case

TRUE indicates that the cumulative probability is returned

$P(X > x) = P\left(Z > \frac{x - 20000}{6079.03}\right) = 1 - P\left(Z < \frac{x - 20000}{6079.03}\right) = 1 - \mathrm{NORMDIST}(x; \mu; \sigma; \mathrm{TRUE})$

The resulting probabilities are shown in the table below:

[Table: probability of a stock-out for each order quantity]

• To calculate manually using the tables, use the standard normal conversion formula z = (x - μ)/σ to calculate z. Look up the calculated z value in the tables to get the probability that demand will be greater than the order quantity.

The suggested order quantities are: 15000; 18000; 24000; 28000 units.

o Let X be the demand for the product: X ~ N(20000; 6079.03²). For an order quantity of 15000, the probability of a stock-out is:

P(X > 15000)

=> If the order quantity is 15000, the probability of a stock-out is 79.39%

o For an order quantity of 18000, the probability of a stock-out is:

=> If the order quantity is 18000, the probability of a stock-out is 62.93%

o For an order quantity of 24000, the probability of a stock-out is:

=> If the order quantity is 24000, the probability of a stock-out is 25.46%

o For an order quantity of 28000, the probability of a stock-out is:
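The stock-out probabilities for all four suggested order quantities can be computed with Python's standard library, which plays the role of `1 - NORMDIST(x; μ; σ; TRUE)`. This is a sketch under the fitted parameters μ = 20000 and σ = 6079.03; `stockout_probability` is a helper name introduced here. Because it uses the unrounded z value, the results can differ from the table-based figures in the third or fourth decimal place.

```python
from statistics import NormalDist

# Demand model fitted in Question 1: X ~ N(20000, 6079.03^2)
demand = NormalDist(mu=20000, sigma=6079.03)


def stockout_probability(order_quantity):
    """P(demand > order quantity) under the normal demand model."""
    return 1 - demand.cdf(order_quantity)


for quantity in (15000, 18000, 24000, 28000):
    print(f"order {quantity}: P(stock-out) = {stockout_probability(quantity):.4f}")
```

The larger the order quantity, the smaller the stock-out probability, at the cost of more unsold inventory if demand turns out low.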

Case 1: Order quantity = 15000 units

Cost of $16 per unit => Total cost = 15000 * 16 = 240000 ($)

• Worst case, in which sales = 10000 units => 5000 units remain in inventory

=> Worst-case revenue = 10000 * 24 + 5000 * 5 = 265000 ($)

=> Profit = 265000 - 240000 = 25000 ($)

• Most likely case, in which demand = 20000 units, but sales are limited by the order size

=> Most likely case revenue = 15000 * 24 = 360000 ($)

=> Profit = 360000 - 240000 = 120000 ($)

• Best case, in which demand = 30000 units, but sales are limited by the order size

=> Best-case revenue = 15000 * 24 = 360000 ($)

=> Profit = 360000 - 240000 = 120000 ($)

Case 2: Order quantity = 18000 units

Cost of $16 per unit => Total cost = 18000 * 16 = 288000 ($)

• Worst case, in which sales = 10000 units => 8000 units remain in inventory

=> Worst-case revenue = 10000 * 24 + 8000 * 5 = 280000 ($)

=> Profit = 280000 - 288000 = -8000 ($)

• Most likely case, in which demand = 20000 units, but sales are limited by the order size

=> Most likely case revenue = 18000 * 24 = 432000 ($)

=> Profit = 432000 - 288000 = 144000 ($)

• Best case, in which demand = 30000 units, but sales are limited by the order size

=> Best-case revenue = 18000 * 24 = 432000 ($)

=> Profit = 432000 - 288000 = 144000 ($)

Case 3: Order quantity = 24000 units

Cost of $16 per unit => Total cost = 24000 * 16 = 384000 ($)

• Worst case, in which sales = 10000 units => 14000 units remain in inventory

=> Worst-case revenue = 10000 * 24 + 14000 * 5 = 310000 ($)

=> Profit = 310000 - 384000 = -74000 ($)

• Most likely case, in which sales = 20000 units => 4000 units remain in inventory

=> Most likely case revenue = 20000 * 24 + 4000 * 5 = 500000 ($)

=> Profit = 500000 - 384000 = 116000 ($)

• Best case, in which demand = 30000 units, but sales are limited by the order size

=> Best-case revenue = 24000 * 24 = 576000 ($)
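All of the scenario calculations above follow one rule: units sold are the smaller of demand and order quantity, and any leftover units are sold at the surplus price. The sketch below encodes that rule. The prices are inferred from the arithmetic above ($16 unit cost, $24 selling price, $5 surplus price); the constant and function names are introduced here for illustration.

```python
UNIT_COST = 16      # purchase cost per unit ($)
UNIT_PRICE = 24     # regular selling price per unit ($)
SURPLUS_PRICE = 5   # price per unit for unsold inventory ($)


def profit(order_quantity, demand):
    """Profit for one order quantity under one demand scenario."""
    units_sold = min(order_quantity, demand)   # sales are capped by the order size
    units_left = order_quantity - units_sold   # leftover inventory sold at surplus price
    revenue = units_sold * UNIT_PRICE + units_left * SURPLUS_PRICE
    return revenue - order_quantity * UNIT_COST


scenarios = {"worst": 10000, "most likely": 20000, "best": 30000}
for quantity in (15000, 18000, 24000):
    for name, demand in scenarios.items():
        print(f"order {quantity}, {name} case: profit = {profit(quantity, demand)}")
```

The same function can be called with any other order quantity to extend the comparison.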

Date posted: 12/08/2024, 14:34