Measuring University Students' Approaches to Learning Statistics: An Invariance Study

Journal of Psychoeducational Assessment, 1–13. © The Author(s) 2015. Reprints and permissions: sagepub.com/journalsPermissions.nav. DOI: 10.1177/0734282915596125. jpa.sagepub.com

Francesca Chiesi1, Caterina Primi1, Ayse Aysin Bilgin2, Maria Virginia Lopez3, Maria del Carmen Fabrizio3, Sitki Gozlu4, and Nguyen Minh Tuan5

Abstract

The aim of the current study was to provide evidence that an abbreviated version of the Approaches and Study Skills Inventory for Students (ASSIST) was invariant across different languages and educational contexts in measuring university students' learning approaches to statistics. Data were collected on samples of university students attending undergraduate introductory statistics courses in five countries (Argentina, Italy, Australia, Turkey, and Vietnam). Using factor analysis, we confirmed that the three-factor (Deep, Surface, and Strategic approach) model holds across the five samples, and we provided evidence of configural and measurement invariance. The current version of the ASSIST for statistics learners is a suitable scale for researchers and teachers working in the field of statistics education and represents a promising tool for multinational studies on approaches to learning statistics.

Keywords

approaches to learning, statistics education, invariance, CFA

Being able to provide good evidence-based arguments and to critically evaluate data-based claims are important skills that all citizens should have. Thus, statistical reasoning, that is, being able to understand, evaluate, and make decisions about quantitative information, should be a necessary component of adults' numeracy and literacy (Gal, 2003; Garfield & Ben-Zvi, 2008). In line with this claim, statistics has been included in a wide range of university programs, and in many countries,
students progressing toward a degree other than statistics have to pass at least one compulsory statistics exam.

1University of Florence, Italy; 2Macquarie University, Australia; 3University of Buenos Aires, Argentina; 4Bahcesehir University, Turkey; 5International University, Ho Chi Minh City, Vietnam

Corresponding Author: Francesca Chiesi, Department of Neuroscience, Psychology, Drug Research, and Child's Health (NEUROFARBA), Section of Psychology, University of Florence, via di San Salvi 12 - Padiglione 26, 50135 Firenze, Italy. Email: francesca.chiesi@unifi.it

Downloaded from jpa.sagepub.com at CMU Libraries - library.cmich.edu on September 19, 2015

For this reason, statistics education is a research area of increasing interest across the world, and several researchers have investigated factors affecting the statistics learning of university students and have focused their efforts on how to improve the learning and teaching of statistics (see Zieffler et al., 2008, for a review). In reviewing this literature, we found that learning approaches (Biggs, 2003; Entwistle, 1991; Marton & Saljo, 1976a, 1976b; Rhem, 1995) have not been investigated with specific reference to statistics. Thus, the present article aimed at addressing this issue, starting from the following assumptions. Approaches to learning are not intrinsic characteristics of students (Lucas & Mladenovic, 2004; Ramsden, 2003); rather, they are sensitive to the environment in which the learning occurs and are affected by students' perceptions of the learning situation (Rhem, 1995). Thus, it is very interesting to investigate the learning approaches that students adopt inside the statistics learning environment. Indeed, all over the world and in many university programs, students who have to pass compulsory statistics exams are for the most part progressing toward degrees other than statistics (and sometimes quite different from it, such as psychology, health sciences,
educational science degrees). As such, they fail to understand the significance and benefit of statistics for their own academic and professional life. More broadly speaking, these students are more likely to have negative attitudes toward statistics; that is, they do not like statistics, are not interested in statistics, believe that statistics is difficult to learn, or are not willing to put in the effort needed to learn statistics (Schau, Miller, & Petocz, 2012). For all these reasons, gaining a better understanding of university students' learning approaches to statistics could provide new insights for investigating factors affecting statistics learning in higher education and for improving the learning and teaching of statistics. The present article aimed at addressing this issue by focusing on the measurement matter. Because the measurement instruments used in research studies are essential to the findings that are produced, evidence of their meaningfulness and appropriateness to the groups or participants being studied is an essential element. Thus, it is important that teachers and researchers working in the field of statistics education share a reliable and valid tool to describe and compare the approaches adopted by university students when learning statistics. Nonetheless, conclusions drawn from comparative analyses may be biased or invalid if the measures do not have the same meaning across groups (Vandenberg & Lance, 2000). Indeed, lack of measurement equivalence renders group comparisons ambiguous because we cannot ascertain whether the differences are a function of the measured trait or artifacts of the measurement process. To make meaningful interpretations of group differences, one must first establish measurement invariance. From the perspective of a key psychometric assumption, in most assessment instruments, the summed score of items serves as an approximation of an individual's trait score. Ideally, differences in summed scores should reflect true
differences in the latent variable that the scale intends to measure. In interpreting group differences with respect to summed scores, the instrument should measure the same underlying trait across groups, and a necessary condition for this is that the instrument is measurement invariant (e.g., Slof-Op 't Landt et al., 2009). From a factor analytic perspective, invariance assesses whether there is conceptual equivalence of the underlying latent variable(s) across groups (Vandenberg & Lance, 2000), which is reflected in the use of identical indicators to measure the same trait(s). Starting from these premises, our goal was to provide researchers and teachers working in the field of statistics education with a reliable and valid scale measuring students' learning approaches to statistics. In line with this claim, we aimed at providing evidence that the scale can be used in different language versions and educational contexts. Specifically, the aim of the present article was to provide evidence that an abbreviated version of the Approaches and Study Skills Inventory for Students (ASSIST; Tait, Entwistle, & McCune, 1998) was invariant across different languages and educational contexts in measuring university students' learning approaches to statistics. The ASSIST assesses three approaches to learning that can be described as follows. The first one is called the "Deep" approach to learning and is characterized by a personal commitment to learning. Students adopting this approach aim to comprehend what they are learning and approach the arguments critically; they evaluate whether concepts and contents are justified by evidence, and try to associate them with their prior knowledge. As such, a deep approach to learning is more likely to result in better retention and application of knowledge (Biggs, 2003; Ramsden, 2003). The second one is the "Surface" approach, which is characterized by a
lack of personal engagement in the learning process. As such, concepts and contents are learned in an unreflective and unrelated manner, and surface learning is more likely to result in memorizing without understanding and in misunderstanding of important concepts (Ramsden, 2003). Finally, the third one is the "Strategic" approach, in which learning is characterized by a strong achievement motivation and is tailored to the assessment demands. This approach describes well-organized and conscientious study methods (including time management) linked to the purpose of doing well (Struyven, Dochy, Janssens, & Gielen, 2006). Among the various self-report inventories that have been developed to evaluate different aspects of students' learning and study (for a review, Cassidy, 2004), the ASSIST has been largely applied in different educational contexts in different countries (e.g., Moneta, Spada, & Rost, 2007; Samarakoon, Fernando, Rodrigo, & Rajapakse, 2013; Speth, Namuth, & Lee, 2007; Valadas, Gonçalves, & Faísca, 2010; Walker et al., 2010; Zhu, Valcke, & Schellens, 2008) and used in studies focused on evaluating the impact of intervention programs on the approaches to learning (e.g., Ballantine, Duff, & McCourt Larres, 2008; Maguire, Evans, & Dyas, 2001; Reid, Duvall, & Evans, 2007). Whereas the validity and reliability of the ASSIST have been confirmed in several studies within different disciplines (Buckley, Pitt, Norton, & Owens, 2010; Byrne, Flood, & Willis, 2004; Diseth, 2001; Entwistle, Tait, & McCune, 2000; Kreber, 2003; Maguire et al., 2001; Reid et al., 2007), the scale has not yet been used to investigate learning approaches referring specifically to statistics. Thus, we aimed to ascertain whether the ASSIST was suitable for measuring the deep, surface, and strategic learning approaches of students enrolled in statistics courses. In doing so, we proposed an abbreviated version of the scale that does not include some subscales (called related subscales by the authors) assessing
related constructs such as interest, anxiety, effort, and confidence. Specifically, they refer to the interest in learning for learning's sake (e.g., "I sometimes get 'hooked' on academic topics and feel I would like to keep on studying them"), to pessimism, loss of confidence, and anxiety about academic outcomes (e.g., "Often I lie awake worrying about work I think I won't be able to do"), and to the confidence and intention to engage in the task (e.g., "I feel that I am getting on well, and this helps me put more effort into the work"). These aspects are very similar to, or overlap with, constructs like statistics anxiety and statistics attitudes investigated elsewhere in the statistics education field, where several reliable and valid instruments exist to measure both statistics anxiety (for a review, Onwuegbuzie & Wilson, 2003) and attitudes toward statistics (for a review, Emmioğlu & Capa-Aydin, 2012). Thus, we deemed these subscales of the ASSIST not really helpful in this domain, and we removed them to develop a tool that assesses specifically the deep, surface, and strategic core aspects of the learning approaches. In sum, collecting data on samples of university students attending introductory statistics courses in five different countries (Argentina, Italy, Australia, Turkey, and Vietnam), we investigated whether this abbreviated ASSIST was suitable for measuring students' learning approaches, and we tested whether students used the same conceptual framework to answer the items of the scale by testing the invariance of the ASSIST through factor analyses.

Method

Participants

Data were collected in Argentina, Australia, Italy, Turkey, and Vietnam. The five samples were composed of university students progressing in degrees such as agricultural engineering, business, environmental sciences, psychology, medical sciences, and social science. All students
attended introductory statistics courses (see the appendix for a detailed description). The Argentinean sample consisted of 430 students of the University of Buenos Aires (52% female; M age = 21.4 years, SD = 3.1 years). The Australian sample consisted of 292 university students (50% female; M age = 21.8 years, SD = 4.6 years) of Macquarie University in Sydney. The Italian sample was composed of 403 students (79% female; M age = 20.5 years, SD = 3.3 years) of the University of Florence. The Turkish sample consisted of 350 university students (61% female; M age = 22.1 years, SD = 0.9 years) from different Turkish universities (Afyon Kocatepe, Hacettepe, Karadeniz, Istanbul, Selcuk, and Yildiz). Finally, 260 university students of the Vietnam Open University in Ho Chi Minh City (76% female; M age = 19.9 years, SD = 1.3 years) participated in the study.

Measure and Procedure

The ASSIST (Tait et al., 1998) consists of three sections.1 Section B is the core part of the scale that measures the learning approaches, and all the psychometric investigations reported in the literature have focused on it (Buckley et al., 2010; Byrne et al., 2004; Diseth, 2001; Entwistle et al., 2000; Kreber, 2003; Maguire et al., 2001; Reid et al., 2007). Having excluded the three related subscales referring to interest, anxiety, effort, and confidence, we proposed a scale consisting of 40 items; respondents indicated the degree of their agreement with each statement using a 5-point Likert-type scale ranging from disagree to agree. The items were combined into 10 subscales of four items each and then further grouped into the three main scales: Deep (Seeking Meaning [SM], Relating Ideas [RI], Use of Evidence [UE]), Surface (Lack of Purpose [LP], Unrelated Memorizing [UM], Syllabus-Boundness [SB]), and Strategic (Organized Studying [OS], Time Management [TM], Alertness to Assessment Demands [AAD], Monitoring Effectiveness [ME]). In detail, the "Deep" subscales consist of items about
the intention to understand in learning (e.g., "I usually set out to understand for myself the meaning of what we have to learn"), linking ideas and relating them to other contents (e.g., "I try to relate ideas I come across to those in other topics or other courses whenever possible"), and reasoning and relating evidence to conclusions (e.g., "I look at the evidence carefully and try to reach my own conclusion about what I'm studying"). The "Surface" subscales consist of items about the lack of direction and conviction in studying (e.g., "Often I find myself wondering whether the work I am doing here is really worthwhile"), poor understanding and rote memorizing of the material (e.g., "I find I have to concentrate on just memorizing a good deal of what I have to learn"), and putting in the bare minimum of effort to pass the examinations (e.g., "I gear my studying closely to just what seems to be required for assignments and exams"). The "Strategic" subscales consist of items about the ability to plan and work regularly and effectively (e.g., "I manage to find conditions for studying which allow me to get on with my work easily"), organizing time and effort (e.g., "I organize my study time carefully to make the best use of it"), trying to impress teachers (e.g., "When working on an assignment, I'm keeping in mind how best to impress the marker"), and checking progress to ensure achievement of aims (e.g., "I think about what I want to get out of this course to keep my studying well focused"). The ASSIST was translated into Italian, Spanish, Turkish, and Vietnamese using a forward-translation method following the current guidelines for adapting tests (Hambleton, 2005). For each version, non-professional translators worked independently and then compared their translations with the purpose of assessing equivalence. Then, they worked together to verify the similarity and resolve any discrepancies. Discrepancies were corrected by agreement among the translators. For each
language version, a small group of native speakers (up to five per version) read the translated versions to check clarity, understandability, and readability. When necessary, further revisions were made, arriving through an iterative procedure at the final versions of the Spanish, Italian, Turkish, and Vietnamese forms. All students participated on a voluntary basis after they were given information about the general aim of the investigation (i.e., they were told we were collecting information to improve students' achievements in statistics). The scale was administered individually during the classes, and students were asked to answer referring solely to the statistics course. Answers were collected in a paper-and-pencil format, and data collection was completed in about 20 minutes. In all the countries, we surveyed students toward the end of the course, when they had completed some kind of midterm assignments and had received feedback on some aspects of their learning. Indeed, to investigate the learning approaches to the discipline, it was necessary that all the students were engaged in the process of learning statistics.

Results

Prior to conducting the analyses, we looked at missing values in the data. For each item, missing values remained at or below 0.3% of the total cases in the sample, and no case had more than five missing responses. Then, we tested whether missing data occurred completely at random (MCAR) using R. J. A. Little's (1988) test. Data were missing completely at random, as indicated by a nonsignificant MCAR test, χ2(849) = 32.27, p = .99, and we decided to use an expectation maximization (EM) algorithm to impute the missing data (Scheffer, 2002). Both R. J. A. Little's test and the EM algorithm are implemented in SPSS 20.0.

Confirmatory Factor Analysis (CFA)

A three-factor model corresponding to the Deep, Surface, and Strategic core dimensions of the ASSIST was tested separately
in each group by applying CFA. The aim was twofold: to confirm the factor structure of the ASSIST and to test a baseline model individually for each group as a prerequisite for assessing invariance across groups. Specifically, we tested a model in which SM, RI, and UE were the observed variables of the Deep approach; LP, UM, and SB were the observed variables of the Surface approach; and OS, TM, AAD, and ME were the observed variables of the Strategic approach. In line with the original version of the ASSIST (Tait et al., 1998), analyses were conducted on the subscale scores for all the samples. This kind of parceling procedure (Gribbons & Hocevar, 1998; T. D. Little, Cunningham, Shahar, & Widaman, 2002) was applied to help avoid the inherent non-normality associated with single-item distributions and to reduce the number of observed variables in the model, so as to have adequate sample sizes to test the factorial structure of the scale. CFA analyses were conducted with AMOS 5.0 (Arbuckle, 2003) using maximum likelihood estimation on the variance–covariance matrix, because the skewness and kurtosis indices of all the observed variables ranged between −1 and +1, revealing that the departures from normality were acceptable and could not be expected to lead to appreciable distortions (Marcoulides & Hershberger, 1997). Several fit indices were used to assess model fit, as suggested by Schumacker and Lomax (1996): Goodness-of-fit statistics reported are the χ2/degrees of freedom ratio, the comparative fit index (CFI), the Tucker–Lewis index (TLI), and the root mean square error of approximation (RMSEA). For the ratio of chi-square to its degrees of freedom (χ2/df), values less than 3 were considered to reflect a fair fit (Kline, 2005). We considered CFI and TLI values of .90 and above to reflect a fair fit (Bentler, 1990). For the RMSEA, values less than .08 were considered to reflect an adequate fit (Browne & Cudeck, 1993). Results from all countries, with the exception of Vietnam,
revealed that the fit indices failed to reach the cutoffs for good fit (χ2/df ranged from 4.7 to 5.5; CFIs ranged from .84 to .89; TLIs from .78 to .85; RMSEAs from .096 to .116). An exploration of the modification indices revealed that the bad fit of this initial three-factor model could be explained by two paths that do not appear in the model, that is, the path between the AAD observed variable and the Surface factor, and the path between the ME observed variable and the Deep factor. Thus, to obtain a clearer factor structure representing the three distinct approaches to learning, the two subscales were removed from the analysis. Therefore, we tested a model in which SM, RI, and UE were the observed variables of the Deep approach; LP, UM, and SB were the observed variables of the Surface approach; and OS and TM were the observed variables of the Strategic approach (Figure 1).

Figure 1. The three-factor model of the abbreviated ASSIST for statistics learners.

On data from the Argentinean sample, the model showed a good fit, χ2(17) = 49.37, p < .001, χ2/df = 2.9, CFI = .95, TLI = .92, RMSEA = .067. For the measurement model, each of the subscales loaded strongly and significantly on the hypothesized factor (factor loadings ranged from .46 to .75). For the structural model, a positive correlation was found between Deep and Strategic (.27), and Surface correlated negatively with Deep (−.42) and with Strategic (−.16). In the Australian sample, the model showed an excellent fit, χ2(17) = 30.95, p < .05, χ2/df = 1.8, CFI = .98, TLI = .97, RMSEA = .053. For the measurement model, each of the subscales loaded strongly and significantly on the hypothesized factor (factor loadings ranged from .64 to .89). For the structural model, a positive correlation was found between Deep and Strategic (.54), and Surface correlated negatively with Deep (−.13) and Strategic (−.18). The
three-factor model showed a good fit, χ2(17) = 42.82, p < .01, χ2/df = 2.5, CFI = .97, TLI = .95, RMSEA = .061, on data from the Italian sample. For the measurement model, each of the subscales loaded strongly and significantly on the hypothesized factor (factor loadings ranged from .57 to .89). For the structural model, a positive correlation was found between Deep and Strategic (.40), and Surface correlated negatively with Deep (−.31) and Strategic (−.57). The three-factor model showed a good fit, χ2(17) = 36.41, p < .01, χ2/df = 2.1, CFI = .98, TLI = .96, RMSEA = .057, also on data from the Turkish sample. For the measurement model, each of the subscales loaded strongly and significantly on the hypothesized factor (factor loadings ranged from .55 to .91). For the structural model, a positive correlation was found between Deep and Strategic (.54), whereas Surface was very weakly correlated with Deep (.10) and Strategic (.09). Finally, the model showed a good fit, χ2(17) = 35.06, p < .01, χ2/df = 2.10, CFI = .97, TLI = .94, RMSEA = .064, in the Vietnamese sample. For the measurement model, each of the subscales loaded strongly and significantly on the hypothesized factor (factor loadings ranged from .46 to .83). For the structural model, positive correlations were found between Deep and Strategic (.77) and between Surface and Deep (.20), whereas Surface and Strategic were very weakly correlated (.09). In summary, the three-factor model including the Deep, Surface, and Strategic approaches was confirmed in each country. However, different patterns of correlations were found among the factors. Indeed, whereas the Deep and Strategic approaches were positively correlated in all samples, with differences in effect size, similar negative correlations were found between Deep and Surface in Argentina, Australia, and Italy but not in Turkey (no correlation) and Vietnam (positive correlation). In the same way,
negative correlations were found between Strategic and Surface in Argentina, Australia, and Italy, but not in Turkey and Vietnam.

Factorial Invariance Across Countries

To assess the invariance of the scale, we tested the invariance of the factor model's parameters across the five samples by performing a multi-group CFA. To test the factorial equivalence among the Spanish, English, Italian, Turkish, and Vietnamese versions, we started from configural invariance and then tested increasingly stringent hypotheses of equivalence by imposing equality constraints on different sets of parameters. The test of configural invariance determines whether participants from different groups use the same conceptual framework to answer the items of the scale. If data fit a model in which the number of factors and the pattern of free and fixed loadings are the same across groups, configural invariance holds (Cheung & Rensvold, 2002). In the present study, the testing of invariance involved a series of hierarchically ordered steps beginning with a baseline model with unconstrained parameters (configural invariance), which was compared with two different models: Model A, with constraints on the factor loading parameters (measurement invariance), and Model B, with additional constraints on the variance and covariance parameters (structural invariance). Looking at the data obtained separately for each group, we hypothesized finding invariance between the baseline model and Model A, whereas invariance was not expected for Model B. The tenability of the hypotheses of equivalence was determined by comparing the difference in fit between nested models. Criteria for assessing the difference between competing models were based on two different approaches. First, we used traditional hypothesis testing by statistical methods with the scaled difference chi-square test (Satorra & Bentler, 2010). Second, we also used a more practical approach based on the difference in CFIs of nested models. Following Cheung and
Rensvold (2002), a difference in CFI values smaller than .01 was considered as support for the more constrained of the competing models. As suggested by Little, Card, Slegers, and Ledford (2007), the first approach was mainly used for testing the invariance of the structural parameters and the second approach for testing the invariance of the measurement parameters. The overall and comparative fit statistics of the invariance models are presented in Table 1. The goodness-of-fit indices provided evidence of configural invariance. In the first model comparison, that is, baseline model versus Model A, the difference in CFI values was smaller than .01, indicating the invariance of the factor loadings across samples. As expected, in the second model comparison, that is, Model A versus Model B, the comparative fit statistics indicated that there was no invariance of the structural covariances among the factors.

Table 1. Goodness-of-Fit Statistics for Tests of Invariance of the Abbreviated ASSIST for Learning Approaches to Statistics Across Countries, Assuming the Unconstrained Model (Baseline) to Be Correct

Model       χ2       df    χ2/df    Δχ2      Δdf    p(Δχ2)    CFI    ΔCFI    RMSEA
Baseline    194.62    85    2.3       −       −       −
Model A     235.43   105    2.4     40.81     20
Model B     452.39   129    3.5    252.77     44
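The fit indices and nested-model comparisons described above reduce to simple arithmetic on the reported statistics. The sketch below is a minimal illustration, not the authors' AMOS analysis: it computes the χ2/df ratio, the standard single-group RMSEA point estimate, the Δχ2 and Δdf of a nested comparison, and the Cheung–Rensvold ΔCFI rule. The sample values are taken from the Results section and Table 1, except the CFI pair in the last call, which is a hypothetical placeholder used only to demonstrate the decision rule.

```python
import math

def fit_summary(chi2, df, n):
    """Return the chi2/df ratio and the RMSEA point estimate for one group.

    RMSEA = sqrt(max(chi2 - df, 0) / (df * (N - 1)))  (Browne & Cudeck, 1993).
    """
    rmsea = math.sqrt(max(chi2 - df, 0.0) / (df * (n - 1)))
    return chi2 / df, rmsea

def nested_delta(chi2_base, df_base, chi2_constrained, df_constrained):
    """Chi-square difference (delta chi2, delta df) for a constrained model
    tested against a less constrained one."""
    return chi2_constrained - chi2_base, df_constrained - df_base

def delta_cfi_supports_constraints(cfi_base, cfi_constrained, cutoff=0.01):
    """Cheung and Rensvold (2002) rule: a drop in CFI smaller than .01
    supports the more constrained model."""
    return (cfi_base - cfi_constrained) < cutoff

# Argentinean sample (N = 430): chi2(17) = 49.37 reproduces the reported values
ratio, rmsea = fit_summary(49.37, 17, 430)
print(round(ratio, 1), round(rmsea, 3))   # -> 2.9 0.067

# Baseline vs. Model A (Table 1), assuming the baseline model to be correct
d_chi2, d_df = nested_delta(194.62, 85, 235.43, 105)
print(round(d_chi2, 2), d_df)             # -> 40.81 20

# Hypothetical CFI values (not taken from the article) to show the rule
print(delta_cfi_supports_constraints(0.95, 0.945))   # -> True
```

Applying `fit_summary` to the other samples' chi-square values (e.g., the Australian χ2(17) = 30.95 with N = 292) likewise recovers the RMSEAs reported in the Results, which confirms that the conventional point-estimate formula underlies those figures.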