
Measuring Annual Improvement in Student Achievement: Pilot Study Results

Table of Contents

Executive Summary
Background
Purpose of Document
Section I: Reaching the Standard Model
Section II: SAS® EVAAS® Multivariate, Longitudinal Models
Section III: Pilot Study
    Pilot Study Methods
    Summary of Pilot Data
    Sufficient Data for Growth Measure
    Practical and Psychometric Comparison
    Pilot Growth Results
    Growth and Value-Added Reporting Options
    Extending Growth Analyses to EOC Assessments

Tables and Figures

Table I: Students with Sufficient Data for Reporting Growth in the Pilot Study
Table II: Estimated Numbers and Percents of Students with Sufficient Data for 2009 Growth Reporting
Table III: Percents of Students Meeting Reading and Mathematics RTS and EVAAS® Growth Expectations in Pilot Study
Table IV: Percents of Students Meeting Science and Social Studies RTS and EVAAS® Growth Expectations in Pilot Study
Table 1: Summary of RTS Growth Expectations
Figure 1: RTS Growth Targets Compared with Actual Scores for Two Hypothetical Students
Table 2: Pilot Study Cohorts
Table 3: Summary of Cohort Growth Information for Each Method in Reading and Mathematics
Table 4: Summary of Cohort Growth Information for Each Method in Science and Social Studies
Table 5: Student Level Summary
Table 6: Student Level Summary
Table 7: Student Level Summary
Table 8: Student Level Summary
Table 9: Growth Expectations for the Reaching the Standard Model
Table 10: Standard Deviation Units for the Reaching the Standard Model
Table 11: Correlations Between Projected and Actual Scores
Table 12: Regression Equation Results for Reading/ELA
Table 14: Campuses Added to Passing 2007 AYP Proficiency Targets by EVAAS® and a Growth-to-Proficiency Model
Table 15: Campuses Estimated to be Added to Passing 2009 AYP Proficiency Targets by EVAAS® and a Growth-to-Proficiency Model
Table 16: Summary for Reading Data
Table 17: Summary for Mathematics Data
Table 18: Summary for Science Data
Table 19: Summary for Social Studies Data
Table 20: Campuses with At Least 90% of Students Meeting the 2007 TAKS Standard—Reading Results
Table 22: Campuses with 70% of Students or Fewer Meeting the 2007 TAKS Standard—Reading Results
Table 23: Campuses with 70% of Students or Fewer Meeting the 2007 TAKS Standard—Mathematics Results
Table 24: Campuses with 70% of Students or More Reporting Limited English Proficiency Status—Reading Results
Table 25: Campuses with 70% of Students or More Reporting Limited English Proficiency Status—Mathematics Results
Table 26: Campuses with 25% of Students or Fewer Reporting Limited English Proficiency Status—Reading Results
Table 27: Campuses with 25% of Students or Fewer Reporting Limited English Proficiency Status—Mathematics Results
Table 28: Campuses with 70% of Students or More Reporting Free/Reduced Lunch Status—Reading Results
Table 29: Campuses with 70% of Students or More Reporting Free/Reduced Lunch Status—Mathematics Results
Table 30: Campuses with 40% of Students or Fewer Reporting Free/Reduced Lunch Status—Reading Results
Table 31: Campuses with 40% of Students or Fewer Reporting Free/Reduced Lunch Status—Mathematics Results
Table 32: Campuses with Enrollment Counts in the Top 10% of the State—Reading Results
Table 33: Campuses with Enrollment Counts in the Top 10% of the State—Mathematics Results
Table 34: Campuses with Enrollment Counts in the Bottom 10% of the State—Reading Results
Table 35: Campuses with Enrollment Counts in the Bottom 10% of the State—Mathematics Results
Table 36: EVAAS® Selected Campuses—Reading Results
Table 37: EVAAS® Selected Campuses—Mathematics Results
Table A1: District/Campus Level Summary
Table A2: District/Campus Level Summary
Table A3: District/Campus Level Summary
Table A4: District/Campus Level Summary
Table B1: Summary for Reading Data
Table B2: Summary for Mathematics Data
Table B3: Summary for Science Data
Table B4: Summary for Social Studies Data
Table RTSA2_1: Pilot Study Cohorts
Table RTSA2_2: Classification Table of Students' Estimated True Scores vs. Observed Scores
Table RTSA2_3: 6th-grade Cohort: 4th-grade Reading Growth
Table RTSA2_4: 6th-grade Cohort: 4th-grade Math Growth
Table RTSA2_5: 6th-grade Cohort: 5th-grade Reading Growth
Table RTSA2_6: 6th-grade Cohort: 5th-grade Math Growth
Table RTSA2_7: 6th-grade Cohort: 6th-grade Reading Growth
Table RTSA2_8: 6th-grade Cohort: 6th-grade Math Growth
Table RTSA2_9: 8th-grade Cohort: 6th-grade Reading Growth
Table RTSA2_10: 8th-grade Cohort: 6th-grade Math Growth
Table RTSA2_11: 8th-grade Cohort: 7th-grade Reading Growth
Table RTSA2_12: 8th-grade Cohort: 7th-grade Math Growth
Table RTSA2_13: 8th-grade Cohort: 8th-grade Reading Growth
Table RTSA2_14: 8th-grade Cohort: 8th-grade Math Growth
Table RTSA2_15: 9th-grade Cohort: 9th-grade Reading Growth
Table RTSA2_16: 9th-grade Cohort: 9th-grade Math Growth
Table RTSA2_17: Summary of Classification Errors by Performance Levels for Reading
Table RTSA2_18: Summary of Classification Errors by Performance Levels for Math
Table RTSA2_19: Summary of Classification Accuracy
Table SASA1_1: Correlation between Projected Scores and Observed Scores
Table SASA1_2: Mean Prediction Error for Math by Predicted Achievement Categories
Table SASA1_3: Mean Prediction Error for Reading/Lang Arts by Predicted Achievement Categories

Measuring Annual Improvement in Student Achievement: Pilot Study Results

Executive Summary

Background: House Bill 1 (79th Texas Legislature, 3rd Called Session) and Senate Bill 1031 (80th Texas Legislature, Regular Session) require that an approach to measuring student growth on the Texas Assessment of Knowledge and Skills (TAKS) be implemented. The growth measure will evaluate students' annual improvement and will be used to report student growth each year. In addition, the implementation of a growth measure will provide an opportunity to include student growth in the state accountability system and in calculations of adequate yearly progress (AYP) for federal reporting purposes. As such, Texas conducted a pilot study in fall 2007 and spring 2008 to evaluate two approaches for measuring student growth: growth-to-proficiency models and regression-based models. These general types of models were selected for the study because they had been approved by the United States Department of Education (USDE) for use in AYP calculations in other states. Though the goal of the pilot study was to evaluate the two general approaches, the study focused on two specific models representing these approaches. The first model was a growth-to-proficiency model called the Reaching the Standard (RTS) model. The second model was a regression-based model, the SAS® EVAAS® mixed-model using longitudinal methods. A description of the two specific models evaluated in the study, an overview of the results of the pilot, and conclusions are presented in the executive summary below. Detailed information about the two models and complete results of the pilot study are presented in the body of the report. Additional data and information about the two specific models in the pilot study are contained in the appendices.
Growth Models Evaluated for the Pilot: The Reaching the Standard (RTS) model is a growth-to-proficiency model that looks at test performance on a baseline test and calculates how much a failing student must improve in that subject each year in order to pass the test by grade 8 or grade 11. For students who are in the Met Standard performance level, the model calculates what a student must score each year so that the student maintains or increases performance until grade 8 or grade 11. For students who are in the Commended Performance level, the model sets the growth target each year as at least Commended Performance. This model sets yearly growth targets for students in each subject based on how far they are from the Met Standard passing standard and their grade (the number of years remaining to grade 8 or grade 11). Students' actual performance is compared to their target performance to determine whether they progressed adequately over the school year. To meet the growth standard, failing students must improve enough each year to pass the test at grade 8 or grade 11, passing students must maintain their performance level above Met Standard, and Commended Performance students must continue to perform in the Commended Performance level.
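To make the target-setting arithmetic concrete, a minimal sketch in Python follows. It assumes the distance to the Met Standard cut is expressed in standard deviation units (the metric the report later says the RTS model uses; see Table 10) and that yearly targets close the gap in equal steps. The function name and the equal-step assumption are illustrative, not the report's exact specification of targets.

```python
def rts_annual_target(baseline_gap_sd: float, years_remaining: int) -> float:
    """Per-year improvement target for a student below Met Standard.

    baseline_gap_sd: how far the baseline score sits below the Met
        Standard cut, in standard deviation units (positive = below).
    years_remaining: years until the terminal grade (grade 8 or 11).

    Illustrative only: assumes targets split the gap evenly across the
    remaining years, one plausible reading of a growth-to-proficiency
    rule, not the report's actual table of RTS targets.
    """
    if baseline_gap_sd <= 0:
        # Already at or above Met Standard: the target is to maintain
        # the level (or stay in Commended Performance, if applicable).
        return 0.0
    if years_remaining <= 0:
        raise ValueError("student is already at the terminal grade")
    return baseline_gap_sd / years_remaining


# Example: a grade 5 student 0.9 SD below the cut has 3 years to grade 8,
# so the annual growth target under this sketch is 0.3 SD per year.
print(rts_annual_target(0.9, 3))  # 0.3
```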
The SAS® EVAAS® mixed-model, longitudinal methods are examples of regression-based models and look at test performance across subjects over all years a student tests in Texas. These methods estimate the likelihood that each student will pass specific subject tests in grade 8 or grade 11. The models provide two types of estimates: an estimate of individual students' likelihood of reaching the proficiency requirements in subsequent grades, and an estimate of the influence of campuses and districts on the academic progress of groups of students. To meet the growth standard, both passing and failing students must have at least a 50% likelihood of passing the test in grade 8 or grade 11 based on current and past performance.

These two models were chosen for the pilot study because they are well matched to the data conditions in Texas, are representative of two general approaches of growth models currently approved by the USDE for use in federal accountability in other states, offer the flexibility to potentially satisfy all requirements for growth measures, and can be adapted to accommodate upcoming changes to the Texas assessment program (e.g., the move to a vertical scale in TAKS and the introduction of end-of-course (EOC) assessments).
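The 50%-likelihood criterion can be illustrated with a small sketch. This is not the EVAAS® mixed-model code (which pools information across subjects and years and accommodates missing data); it only shows how a regression-based projection plus a normal prediction-error assumption turns into the pass-likelihood rule described above. The function name, the plain linear regression, and all parameters are assumptions for illustration.

```python
import numpy as np
from scipy.stats import norm

def pass_probability(prior_scores: np.ndarray,
                     coef: np.ndarray,
                     intercept: float,
                     prediction_se: float,
                     cut_score: float) -> float:
    """Likelihood that a student's future score will reach the cut.

    prior_scores: the student's past scores across subjects and years.
    coef, intercept: a regression of the future score on prior scores,
        fit on an earlier, completed cohort (a simple stand-in for the
        EVAAS® multivariate, longitudinal machinery).
    prediction_se: standard error of prediction for that regression.
    cut_score: Met Standard cut on the future test's scale.
    """
    projected = intercept + float(np.dot(coef, prior_scores))
    # P(actual score >= cut) if the prediction error is roughly normal
    return 1.0 - norm.cdf(cut_score, loc=projected, scale=prediction_se)

# Under symmetric prediction error, this probability is >= 0.5 exactly
# when the projected score itself is at or above the cut, which matches
# the "50% likelihood of passing" growth criterion described above.
```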
Overview of Results: The growth-to-proficiency and regression-based models were compared on practical, psychometric, and empirical features. A summary of how the models compared on these features is provided below; a more detailed description of the comparisons is presented in the full report.

• Growth Definitions—The growth-to-proficiency model, the RTS model in this study, defines growth using only scores in one subject. Growth targets are based on the proficiency level (i.e., Did Not Meet Standard, Met Standard, Commended Performance) a student is in at baseline, that is, the first time a student tests in grades 3-8 and again in grades 8-11. Students with scores below Met Standard at baseline are expected to grow enough to reach Met Standard by grade 8 (for grades 3-8) or grade 11 (for grades 8-11). Students with scores in the Met Standard level are expected to maintain scores at the same level above the Met Standard cut point until grade 8 or grade 11, respectively. Students with scores in the Commended Performance level are expected to continue to score in this level each year until grade 8 or grade 11, respectively (note that students in this level are only required to score in this level; no requirement is made about how high in the level a student needs to score). If a student scores in a higher performance level, that student's growth targets are reset to match the target requirements in that level. The regression-based model in this study, the EVAAS® projection model, uses score data from all subjects and years for a student to project that student's performance in the future (i.e., grade 8 for grades 3-8, grade 11 for grades 8-11, and a corresponding target grade for students testing in Spanish). A student is defined to have met growth expectations if that student's probability of scoring at least Met Standard in the projected year (e.g., grade 8 or grade 11) is 50% or greater. EVAAS® value-added models, applied to the same multivariate, longitudinal data structures as the EVAAS® projection models, provide measures of the influence of educational entities on the academic progress of groups of students. Results from the value-added models offer a metric to assess the effectiveness of districts, campuses, and potentially teachers.

• Reproducibility and Interpretation—Both models produce easy-to-understand information about whether students are on track for meeting the standard in grades 8 and 11. The RTS model uses straightforward calculations that can be reproduced by school districts. The EVAAS® projection and value-added models are calculated using publicly available formulas; however, the statistical programs used for the calculations are computationally intense, so the calculations cannot be reproduced by school districts, though results may be made available by the state agency. Though the specific EVAAS® regression-based models evaluated in the study are not reproducible, other regression-based models that could be used for making student projections are reproducible and could be replicated by school districts.

• Data Responsive to Instruction—Both growth models provide information about student performance across years and can be used to evaluate when students are or are not on track to pass by grade 8 or grade 11. Because the RTS model uses data in one content area only to measure student growth in that content area, instructional changes in that area should coincide with changes in the student growth measure in that content area. The EVAAS® projection model uses scores from all content areas to project a student's likelihood of meeting the standard in one content area. Though all scores are used in the projection, scores in the content area for which the projection is made contribute most to the projection. Using multiple measures increases the accuracy of the projections and decreases their error. When content-area instruction results in changes in student performance in that content area, the EVAAS® projections will reflect those changes. The EVAAS® value-added model provides information about the influence of campuses and districts on student progress. The campus and district estimates are made from data from multiple years; as the amount of data used to make the estimates increases, the stability and precision of these estimates increase. When instruction by these educational entities results in changes in students' scores, campus and district estimates from the EVAAS® value-added models will reflect the influence of the instruction.

• Testing of Models—The specific growth-to-proficiency model, the RTS model, was developed for use in Texas and has not been implemented operationally; experience using the model is limited to the pilot study. However, aspects of the model reflect characteristics of similar growth-to-proficiency models approved for use in AYP calculations by the USDE and used in states such as North Carolina, Florida, and Arkansas. Variations of the regression-based model used in the pilot study, the EVAAS® individual student projection model, have been approved for use in federal accountability reports for both Tennessee and Ohio. These two states have implemented EVAAS® value-added modeling for accountability purposes at district and campus levels. Additionally, EVAAS® teacher reports are provided in Tennessee, Ohio, North Carolina, and various districts across the nation (e.g., Houston (TX) Independent School District and Richardson (TX) Independent School District).

• Incorporation of Measures on Different Measurement Scales—Both types of models (growth-to-proficiency and regression-based) can be used to measure student growth with assessments that are on different measurement scales. Specifically, the RTS model uses standard deviation units, which can be used with measures on different scales. The EVAAS® methods use normal curve equivalent units, which can also be used when measures have different scales.

• Adding Assessments—Both types of models are flexible enough to be applied to assessments added to the Texas assessment system in the future, such as the TAKS-Modified and EOC assessments.

• End-of-Course Growth—Applying a growth-to-proficiency model to the EOC assessments will be different from applying one to the TAKS assessments, since the shared content across EOC assessments will not be as extensive as the shared content across TAKS assessments, and students will not necessarily take the EOC assessments in a standard sequence. Given the unique features of the EOC assessments, the growth-to-proficiency model would be most useful in providing information about a student's progress toward attaining his or her cumulative score on the EOC assessments required for graduation under SB 1031. Regression-based models, on the other hand, can be used for making projections and reporting the probability a student has of passing an EOC assessment. These types of models account for the amount of shared content across EOC assessments and do not require that EOC assessments be taken in a specified sequence, so the regression-based models are better suited to EOC assessments.

• Growth to Other Targets—Both models can be used to evaluate student progress toward other targets, such as college readiness. Additionally, a regression-based projection approach can provide estimates linked to various college admission tests, such as the SAT and ACT, which can be used as benchmarks showing readiness for students to be competitive in various college majors.

• Data Collection Changes Needed—Neither type of growth model would require Texas to make changes to the way assessment data are currently collected. Growth using the current assessment data could be reported at the campus, district, or state level using either model. However, if the state were to report growth at the teacher level, a change in the way data are collected would be required; specifically, data linking student scores to specific teacher information would need to be collected.

• Accuracy of Growth Decisions and Projections—The two types of growth models are different, so the concept of accuracy differs across methods. An important aspect of accuracy in a growth-to-proficiency model relates to the decision about whether a student met or did not meet the growth target in a year. A study was conducted to evaluate the accuracy of these decisions with the specific growth-to-proficiency model studied in the pilot, the RTS model. This study indicated that the probability of being misclassified due to measurement error for several cohorts was between 4% and 19%, meaning the classification accuracy of the RTS model was between 81% and 96%. For students below Met Standard, classification errors in the study were not greater than 2% and classification accuracy was at least 98%. (A simplified sketch of this measurement-error calculation follows this list.) Accuracy in the EVAAS® projection model relates to the decision about whether students will meet the standard in the future. To obtain a measure of the accuracy of the EVAAS® model projections, two studies were conducted: the projected TAKS reading and mathematics results for one cohort of students in 2006 were compared with their observed 2007 TAKS results, and the projected results for two cohorts in 2007, including grade 10 students, were compared with their observed 2008 results. Study results indicated that projection accuracy for projections to the next year was 93%-95% for reading and English language arts and 85%-89% for mathematics.

• Reporting Options—Both types of growth models could be used to support paper or online reports. In addition, data from either type of model could be used to generate an online reporting system that would allow aggregation and disaggregation of growth information. Two reporting features distinguish the two models evaluated in the study. First, the timing of the reporting of results differs. The basic RTS growth results for a current year would be reported at the same time as the student assessment Confidential Student Reports and district/campus summary reports; more detailed longitudinal information would be reported using an online system available each summer, after the regular assessment reports. EVAAS® projections to growth standards for students and value-added reporting for districts and campuses would likely be made available after the regular assessment reports. Second, the way in which the results would be reported differs. The basic RTS results for a current year would appear on the Confidential Student Reports and summary reports, with additional longitudinal reporting provided through an online reporting system. The EVAAS® results, on the other hand, would be reported using the SAS EVAAS® Web application, which is currently used to report growth information for other states and districts.
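As noted in the accuracy item above, here is a simplified sketch of how measurement error can flip a met/not-met growth decision. It treats the observed score as the true score plus normal error with the test's standard error of measurement (SEM). The report's study used estimated true scores and classification tables (Tables RTSA2_2 through RTSA2_19); this sketch illustrates the idea, not that exact method, and all names and example numbers are assumptions.

```python
from scipy.stats import norm

def flip_probability(true_score: float, target: float, sem: float) -> float:
    """P(the met/not-met decision is wrong because of measurement error).

    Model: observed = true_score + e, with e ~ Normal(0, sem).
    A student who truly meets the target (true_score >= target) is
    misclassified when the observed score falls below the target,
    and vice versa.
    """
    p_observed_below = norm.cdf(target, loc=true_score, scale=sem)
    if true_score >= target:
        return p_observed_below      # truly met, observed as not met
    return 1.0 - p_observed_below    # truly not met, observed as met

# A student whose true score sits half an SEM above the target is
# misclassified about 31% of the time; two SEMs above, about 2%.
print(round(flip_probability(105.0, 100.0, 10.0), 3))  # 0.309
print(round(flip_probability(120.0, 100.0, 10.0), 3))  # 0.023
```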
To obtain empirical results in the pilot study, data on approximately 2.4 million students in TAKS reading, mathematics, science, and social studies, in English and Spanish, from 2004-2007 were evaluated using the RTS and EVAAS® models. The evaluation indicated the percent of students for whom sufficient data were available in the pilot study for calculating growth under each method and the percent of students who met growth expectations under each method in 2007. Note that the empirical results in the pilot study are dependent on the specific models implemented; the results would be different had different models been applied or had different decisions been made in applying them.

The empirical comparison of the two models included three sets of analyses. First, for each subject, student level summary tables were created to display the total number of students who had a reportable scale score in 2007 in the TAKS subject and the percent of those with sufficient data to report growth under each specific method as part of the pilot. For the RTS model, students in 2007 had to have a valid scale score in the subject and a baseline score to have sufficient data to report growth. For the EVAAS® projection model, a student needed at least three historical scores to receive a projection. For the EVAAS® value-added models, all student data could be …
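The sufficient-data rules just described are simple enough to state as code. A minimal sketch follows, assuming a hypothetical per-student record layout; the field names are illustrative, and the rules encode only what the report states (RTS: a valid current-year score plus a baseline score; EVAAS® projection: at least three historical scores).

```python
from dataclasses import dataclass, field
from typing import Optional

@dataclass
class StudentRecord:                    # hypothetical record layout
    current_score: Optional[float]      # 2007 scale score in the subject
    baseline_score: Optional[float]     # first score in grades 3-8 or 8-11
    prior_scores: list = field(default_factory=list)  # historical scores

def sufficient_for_rts(s: StudentRecord) -> bool:
    # RTS: a valid scale score in the subject plus a baseline score
    return s.current_score is not None and s.baseline_score is not None

def sufficient_for_evaas_projection(s: StudentRecord) -> bool:
    # EVAAS projection: at least three historical scores
    return len(s.prior_scores) >= 3
```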
Table SASA1_3: Mean Prediction Error for Reading/Lang Arts by Predicted Achievement Categories
(results for all students and by projected performance level)

                         All    Below Proficient   Proficient   Advanced
Grade —
  Mean Score           54.24        25.98             45.41        70.12
  Mean Projected       54.24        22.56             46.58        69.19
  Residual              0.00         3.42             -1.17         0.94
  N                    62490         4168             32716        25606
Grade —
  Mean Score           56.96        24.28             48.53        73.71
  Mean Projected       56.96        23.51             48.86        73.34
  Residual             -0.00         0.77             -0.33         0.36
  N                    64630         3640             35833        25157
Grade —
  Mean Score           58.56        25.55             51.00        78.33
  Mean Projected       58.56        23.67             51.40        77.99
  Residual              0.00         1.88             -0.41         0.34
  N                    64983         4322             38657        22004
Grade —
  Mean Score           56.09        28.05             49.62        75.41
  Mean Projected       56.09        26.01             50.57        74.78
  Residual              0.00         2.04             -0.95         0.63
  N                    64847         8348             33236        23263
Grade —
  Mean Score           57.30        27.82             49.70        75.52
  Mean Projected       57.30        25.72             50.34        75.12
  Residual              0.00         2.10             -0.64         0.40
  N                    65600         6070             35085        24445

SAS Appendix 2: SAS Calculation of NCE Scores

1. Calculate the percentile rank corresponding to each scaled score in the reference distribution. The reference distribution is the distribution of scale scores in the baseline year (typically the first year the test was given statewide) for each grade and subject combination. The percentile rank corresponding to a particular score, X, is calculated as

   PR(X) = 100 × (N_below-X + 0.5 × N_equal-X) / N_total = CumulativePercent_X − 0.5 × Percent_X,

where PR(X) is the percentile rank of score X, N_total is the number of students in the reference population, N_below-X is the number of students in the reference population with scores below X, N_equal-X is the number of students in the reference population with scores equal to X, CumulativePercent_X is the cumulative percent of the population with scores below or equal to X, and Percent_X is the percent of the population with score X (e.g., from PROC FREQ).

2. Calculate the NCE corresponding to the percentile rank PR(X). Determine the corresponding ("equivalent") value, Z, from a standard normal distribution; for example, if PR(X) = .025 (the 2.5th percentile), then Z = −1.96. The corresponding NCE score is then calculated as

   NCE = 50 + 21.0630579 × Z.

In this example, the NCE corresponding to PR(X) = .025 is 50 + 21.0630579 × (−1.96) = 8.7.
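The two appendix steps translate directly into code. Here is a minimal Python sketch (not the SAS implementation), assuming the reference distribution is supplied as a plain array of baseline-year scale scores for one grade and subject; it reproduces the PR-to-Z-to-NCE chain above, including the worked example.

```python
import numpy as np
from scipy.stats import norm

NCE_SLOPE = 21.0630579  # maps percentile ranks 1 and 99 to NCEs 1 and 99

def nce(scores, reference):
    """Convert scale scores to NCEs against a baseline-year reference.

    scores: scale scores to convert.
    reference: baseline-year scale scores for the same grade/subject,
        defining the percentile ranks (Step 1 of the appendix).
    """
    scores = np.asarray(scores, dtype=float)
    ref = np.sort(np.asarray(reference, dtype=float))
    n = ref.size
    n_below = np.searchsorted(ref, scores, side="left")
    n_equal = np.searchsorted(ref, scores, side="right") - n_below
    pr = (n_below + 0.5 * n_equal) / n       # percentile rank, as a proportion
    pr = np.clip(pr, 0.5 / n, 1 - 0.5 / n)   # keep Z finite at the extremes
    z = norm.ppf(pr)                         # "equivalent" standard-normal value
    return 50 + NCE_SLOPE * z                # Step 2 of the appendix

# The appendix example: PR = .025 gives Z = -1.96, so
# NCE = 50 + 21.0630579 * (-1.96) = 8.7.
```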
SAS Appendix 3: Sample Reports

The following pages contain samples of district- and campus-level value-added reports and projection reports for campuses. Please note that the names on all reports are fictitious, although the reports themselves are the results of Texas TAKS analyses:

• Student Projection Report (example: "met standard" student)
• Student Projection Report (example: "commended performance" student)
• Value-Added Summary Report at the district level (example: Alpha ISD)
• Value-Added Report at the campus level (example: Ophelia MS in Rivers ISD)
• Diagnostic Summary Report at the district level (example: Bertha ISD)
• Diagnostic Report at the campus level (example: Pluto HS in Delta ISD)
• Performance Diagnostic Summary Report at the campus level (example: Hemlock ISD)
• Performance Diagnostic Report at the campus level (example: Alaska Elem in Liberty Bell ISD)

Additionally, three sample teacher reports are provided as well.

8th-Grade TAKS Mathematics Projection Report for Student
