Case study analysis: Academic Performance of University Students by Two-Way ANOVA Test


HANOI UNIVERSITY
FACULTY OF MANAGEMENT AND TOURISM
STATISTICS FOR ECONOMICS (FALL 2019)

Case study analysis: Academic Performance of University Students by Two-Way ANOVA Test

Instructor: Ms Lai Hoai Phuong
Tutorial - Group

Group members:
1. Nguyễn Thị Lan Anh 1604010005
2. Nguyễn Tuấn Phong 1704040093
3. Nguyễn Gia Phương Anh 1704040005
4. Lê Thị Bảo Ngọc 1704040084
5. Nguyễn Việt Hoa 1704040043
6. Nguyễn Thị Nhung 1704040093
7. Đặng Ngọc Quỳnh 1704040100

TABLE OF CONTENTS
I Introduction
II Answering questions
   Question 1
   Question 2
   Question 3
   Question 4
   Question 5
   Question 6

I INTRODUCTION

Analysis of variance (ANOVA) is a statistical technique that assesses potential differences in a scale-level dependent variable across a nominal-level variable with two or more categories. The method divides the total variability in a dataset into two parts: systematic factors and random factors. Two types of ANOVA are commonly used to determine whether population means differ: one-way and two-way. A one-way ANOVA has just one independent variable and estimates the effect of that single factor on a response variable. A two-way ANOVA uses two independent variables.

In this case study, we use the two-way ANOVA method to study the relationship, if any, between classroom seating position and academic performance (GPA) for both female and male students at a large university in the United States. The aim of our project is to show how the main features of the two-way ANOVA model apply to a real case study.

II ANSWERING THE QUESTIONS

What inference technique should be considered for this study?
Explain.

The objective of the survey in this case is to test for a significant interaction between classroom seating position and gender, and to test for significant differences in academic performance (GPA) due to seat preference and gender. The suitable inference technique for this study is therefore the two-way ANOVA model. Two-way ANOVA compares the mean differences among groups that have been split by two independent factors, each with several levels.

Respondents were asked to specify one of three levels of seat preference: "front", "middle" and "back". Seating position is therefore the first factor, with 3 levels. The second factor is gender, with 2 levels: male and female. Because it uses both factors, two-way ANOVA can expose the interaction between them. Each combination of factor levels is called a cell, so the combinations of seats and genders give 3 × 2 = 6 cells.

Produce descriptive statistics for the dataset. You are expected to generate as many relevant descriptive statistics as possible using ALL the relevant tools introduced in the labs of this course. Remember to provide appropriate interpretations for the descriptive statistics. Try not to include unnecessary or irrelevant descriptive statistics.

2.1 Sample size

The observations in the sample are taken to be normally distributed. The sample consists of 300 respondents, which is large enough, and the observations are independent because the respondents were randomly selected. There are three variables: GPA, Gender (male, female) and Seat (front, middle and back).

2.2 Mean and Standard deviation

We can compute the mean and standard deviation of GPA within the groups defined by the two other variables, but first we have to convert Gender and Seat into factors. We use the factor() function for the conversion, then the by() function to get the statistics for both grouping variables at the same time.

- Convert the variables Gender and Seat into factors, and build a crosstabulation table between the Gender and Seat
variables:
❖ StudentSurvey$Gender [...]

library(car)
leveneTest(StudentSurvey$GPA, interaction(StudentSurvey$Seat, StudentSurvey$Gender), center = mean)

The outcome:

Levene's Test for Homogeneity of Variance (center = mean)
       Df F value Pr(>F)
group   5  1.1739  0.322
      294

The p-value of the test is 0.322 while our α is 0.05. Therefore we do not reject the null hypothesis of equal variances, and we cannot conclude that the standard deviations are different. However, since the ratio of the largest to the smallest standard deviation is smaller than 2, conducting the Levene test is not strictly necessary in this case. If the ratio were larger than 3, we should choose other tests instead of the two-way ANOVA.

3.3 All populations are normally distributed

We can check normality with a Q-Q plot of the residuals (the Q-Q plot was made in RStudio) using this code and output:

R code:
install.packages("car")
library(car)
leveneTest(StudentSurvey$GPA, interaction(StudentSurvey$Seat, StudentSurvey$Gender), center = mean)
qqPlot(lm(GPA ~ Gender + Seat + Gender*Seat, data = StudentSurvey), simulate = TRUE, main = "Q-Q Plot", labels = FALSE)

The outcome: In the Q-Q plot, all points, including the most extreme ones, lie within the confidence envelope, which supports the assumption that all populations are normally distributed.

Perform the inference technique you suggest in Question 1. Remember to provide all the necessary steps. What are your interpretations and conclusions?
Explain.

Two-way ANOVA test:

Step 1: Identify the null and alternative hypotheses:
H0: There is no significant interaction between seat preference and gender in GPA.
Ha: There is a significant interaction between seat preference and gender in GPA.

Step 2: Test statistic and p-value: We used RStudio to calculate them and obtained the following output:

> StudentSurvey.result [...]
> summary(StudentSurvey.result)
             Df Sum Sq Mean Sq F value Pr(>F)
Gender        1   1.40  1.4008   7.108 0.0081 **
Seat          2   0.93  0.4673   2.371 0.0951
Gender:Seat   2   1.35  0.6745   3.423 0.0339 *
Residuals   294  57.94  0.1971
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Step 3: Level of significance: α = 0.05

Step 4: Decision rule: Reject H0 if p-value < α. From the R output, the interaction between seat preference and gender has a p-value of 0.0339. Since 0.0339 < 0.05, we reject H0 and conclude that there is a significant interaction between seat preference and gender in GPA.

interaction.plot(StudentSurvey$Gender, StudentSurvey$Seat, StudentSurvey$GPA, type = "b", col = c("red", "blue"), pch = c(16, 18), main = "Interaction between Gender and Seat")

Figure 7: Interaction Plot between Gender and Seat

As the interaction plot shows, the mean GPA of male and female students differs clearly across those who sit in the front, the middle and the back. In detail, the female group sitting in the front has the highest GPA, at over 3.3, while the male group sitting in the same position has 3.1. The female group sitting in the middle has approximately 3.1, and the male group there has a slightly higher GPA. The female group sitting in the back is similar to the female group in the middle, but the male group in the back has the lowest GPA (less than 3.0).

From this plot, we can conclude that students who sit from the middle to the front tend to have a higher GPA. Yet the female group sitting in the back also achieves a notable result. The seat lines cross in the interaction plot above, which indicates an interaction between gender and seat position. The female students sitting in the front and the back of the
class have better performance than the male students, and the opposite holds in the middle seat group.

Discuss the credibility of the interpretations and conclusions of Question 4. Is there anything we should be concerned about? Explain.

a. Credibility of the interpretations

First, the purpose is to compare population means when the population is classified by two categorical factors, and the two-way ANOVA test used in this case study is an appropriate tool for that purpose. Secondly, a significance level of 0.05 is used, which limits the probability of wrongly rejecting a true null hypothesis to 5%. At the same time, the p-value for the interaction is small (0.0339), below the significance level, so the null hypothesis is rejected. Besides, all the assumptions for the test are satisfied, with clear evidence and an explanation for each check in the third part of the report. It should be highlighted that although we use the by() function to check equal variances and obtain a ratio of largest to smallest standard deviation of 1.3 (< 2), we still apply leveneTest to confirm the result of this assumption check. Finally, the plot and interpretation of the interaction between the two factors is an important part of the case study.

b. Limitations of the case

First of all, one of the assumptions is that the sample has to be a simple random sample. However, nothing here ensures that the sample was chosen randomly from its population. Moreover, the ANOVA test assumes that the data are normally distributed, and a violation of this assumption can greatly affect the results. Since any violation in this case is moderate, the assumption can still be regarded as satisfied even if there are a few outliers in the Q-Q plot. Another limitation is the condition of equal variances: the greater the difference in variances between groups, the greater the chance that the conclusion of the test is inaccurate. Finally, when running ANOVA to test the difference in GPA due to Gender and Seat position, the
result only tells us whether there is a difference; it does not indicate which groups differ or by how much.

III Conclusion

The assumptions of the two-way ANOVA used to address this case are satisfied. The test leads us to conclude that there is a significant interaction between classroom seating position and gender in academic performance (GPA).

APPENDIX
- Read R code with file "StudentSurvey.csv"
- Mean and standard deviation
- Check assumption
- Interaction plot

STATISTICS FOR ECONOMICS - PEER EVALUATION FORM

Please fill out this form to evaluate your group members. Discuss with all members and agree on the final evaluations. Please evaluate each member on a scale of 100%. Allocation should be based upon group opinions regarding how satisfactorily the member fulfilled his/her assigned tasks within the group's case study. For example, a 100% rating should be given to members who satisfactorily fulfilled the tasks assigned by the group.

Group members should ask themselves the following questions before assigning the percentages to others:
- Did he/she do his/her fair share of the work on schedule and to the group's satisfaction?
- Did he/she cooperate with other group members?
- Did he/she participate in, contribute to and share ideas in all relevant discussions?
- Did he/she attend group meetings when required?
- Did he/she relate and communicate to other group members?
Team members             Contribution (100%)
Nguyễn Thị Lan Anh       100%
Nguyễn Tuấn Phong        100%
Nguyễn Gia Phương Anh    100%
Lê Thị Bảo Ngọc          100%
Nguyễn Việt Hoa          100%
Nguyễn Thị Nhung         100%
Đặng Ngọc Quỳnh          100%

Signature (all members)

Guidelines for peer evaluation:
- Disregard your general impression and concentrate on group members' performance in the case study within this course only.
- Make a fair, objective and impartial evaluation of group members.
- Sign the evaluation form to indicate group consensus.
- Attach the evaluation form at the end of the report.

Note: Your final mark for the case study will be equal to Your group result * Your peer rating
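
For readers who want to retrace the analysis, the steps the report describes (converting Gender and Seat to factors, cell counts, cell means and standard deviations, the largest-to-smallest standard deviation ratio, and the two-way ANOVA with interaction) can be sketched in base R roughly as follows. This is a minimal sketch under stated assumptions, not the group's own script: the data frame `survey` is synthetic and only mimics the 300-row, three-variable layout described in the report, the GPA mean and spread are arbitrary, and the Tukey HSD step is a possible follow-up to the limitation noted in the report (ANOVA does not say which groups differ), not something the report itself runs.

```r
# Synthetic stand-in for StudentSurvey.csv (the real file is not available here).
# ASSUMPTION: balanced design with 50 students per Gender x Seat cell; the
# GPA values are random draws with an arbitrary mean and spread.
set.seed(1)
survey <- expand.grid(
  id     = seq_len(50),
  Gender = c("Female", "Male"),
  Seat   = c("Front", "Middle", "Back"),
  stringsAsFactors = FALSE
)
survey$GPA <- round(rnorm(nrow(survey), mean = 3.1, sd = 0.45), 2)

# Convert the grouping variables to factors, as section 2.2 describes.
survey$Gender <- factor(survey$Gender)
survey$Seat   <- factor(survey$Seat, levels = c("Front", "Middle", "Back"))

# Crosstabulation of the two factors, plus the mean and standard deviation
# of GPA in each of the 2 x 3 = 6 cells.
cell_n    <- table(survey$Gender, survey$Seat)
cell_mean <- tapply(survey$GPA, list(survey$Gender, survey$Seat), mean)
cell_sd   <- tapply(survey$GPA, list(survey$Gender, survey$Seat), sd)

# Rule of thumb quoted in the report: the largest cell standard deviation
# divided by the smallest should stay below 2.
sd_ratio <- max(cell_sd) / min(cell_sd)

# Two-way ANOVA with interaction: Gender * Seat expands to
# Gender + Seat + Gender:Seat.
fit         <- aov(GPA ~ Gender * Seat, data = survey)
anova_table <- summary(fit)[[1]]

# Possible follow-up (not part of the report): Tukey HSD comparisons
# between the six Gender x Seat cells.
tukey <- TukeyHSD(fit, which = "Gender:Seat")

print(cell_n)
print(round(cell_mean, 2))
print(round(sd_ratio, 2))
print(anova_table)
print(round(tukey[["Gender:Seat"]], 3))
```

With the real data, the `expand.grid()` and `rnorm()` lines would be replaced by reading the survey file named in the appendix; the columns GPA, Gender and Seat are the variable names the report uses. Because the synthetic GPA values are drawn without any group effect, the printed statistics will not match the figures reported above.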

Date posted: 07/06/2022, 18:46