TEACHING ISSUES

The TESOL Quarterly publishes brief commentaries on aspects of English language teaching.

Edited by DANA FERRIS
University of California, Davis

TESOL QUARTERLY Vol. 45, No. 4, December 2011

Computer-Generated Feedback on Student Writing

PAIGE WARE
Southern Methodist University
Dallas, Texas, United States

doi: 10.5054/tq.2011.272525

A distinction must be made between computer-generated scoring and computer-generated feedback. Computer-generated scoring refers to the provision of automated scores derived from mathematical models built on organizational, syntactic, and mechanical aspects of writing (for details, see Brock, 1990, 1993; Burston, 2001; Chung & Baker, 2003; Leacock, 2004; Xi, 2010). Automated scoring is outside the scope of this article because it is a complex, controversial topic in its own right, with few strong advocates in the writing community. A formal position statement by the Conference on College Composition and Communication (2004) voiced clear opposition to any use of machine scoring for assessment purposes. In contrast, computer-generated feedback, the focus of this article, refers to the use of computer tools for writing assistance rather than for writing assessment, and it has piqued the curiosity of many in the writing community (Ericsson & Haswell, 2006).

WHAT IS COMPUTER-GENERATED FEEDBACK, AND WHY IS IT PIQUING CURIOSITY?

Over the last decade, software developers have actively improved the systems for providing formative feedback rather than just summative scoring (Shermis & Burstein, 2003). Developers have repeatedly acknowledged that it is infeasible for computers to measure every aesthetic property of writing or to discern the amount of content knowledge that humans depict in their writing (Landauer, Laham, & Foltz, 2003). The resounding consensus about computer-generated feedback, among developers and writing specialists alike, is that the time is ripe for critically examining its potential use as a supplement to writing instruction, not as a replacement, and the assistance features of computer-generated feedback are central to this inquiry (cf. Chen & Cheng, 2008; Shermis & Burstein, 2003; Ware, 2005; Warschauer & Ware, 2006).

Most computer-generated feedback programs, also commonly known as automated writing evaluation (AWE) tools, are web based and offer a core set of support features, including a writing manual, model essays, and translators. Student privacy is secured through a password-protected account, and the features students access are regulated by their teachers. Students can submit multiple iterations of an essay and receive several different types of feedback, including holistic and analytic scores, graphic displays of feedback such as bar charts tabulating problematic areas, generic feedback on revising strategies, and individually tailored suggestions for improving particular aspects of their writing. They can also interact with their teachers through private messages that function much like chat boxes and are automatically stored.

Teachers have flexibility in how they implement the program. They can regulate the type of feedback provided as either analytic or holistic and either numeric or text based, and they can add their own feedback. Teachers can view spreadsheets that organize student scores, frequency of revisions,
minutes spent on task, questions asked, and error analysis reports. They can create longitudinal portraits of student work and develop individual histories of students to track progress across time. It seems that computer-generated feedback programs have been more purposefully developed over the last decade with the intention of allowing teachers to use their own pedagogical lens to shape many facets of how the programs are implemented.

DOES COMPUTER-GENERATED FEEDBACK HELP STUDENTS IMPROVE THEIR WRITING?

Answering this question depends largely on how writing is defined and how computer-generated feedback is implemented. Much of the research born out of studies conducted by the software developers defines writing as discrete skills and performance on a constrained set of essay types, and when measured this way, writing scores have been shown to improve after intensive use of computer-generated feedback and assessment programs (Attali & Burstein, 2006; Elliot, Darlington, & Mikulas, 2004; Lee, Gentile, & Kantor, 2010; Shermis & Burstein, 2003). Many classroom teachers justifiably worry, however, that the type of writing being measured in these studies is mechanistic and formulaic, divorced from real-world contexts.

A few recent studies have examined more closely the implementation of computer-generated feedback in context. First, a study by Warschauer and Grimes (2008) found no impact on secondary students’ scores on standardized writing tests when participating classroom teachers were free to implement the feedback programs as they saw fit in their classrooms. Although teachers held mostly positive views and saw the benefits of using the software for classroom management and for student motivation, they did not use it as a regular part of the drafting and revising process, likely due to pressures to prepare their students for other state exams. In contrast, in a randomized control study I recently carried out with secondary learners, significant
gains were made on writing scores when the programs were implemented for 90 minutes a week across a 6-week period. In addition to the gains in scores, the teacher and her students remained enthusiastic about the software for the duration of the project, with retrospective interviews indicating more favorable views across time. An important caveat to these gains, however, is that only one narrowly defined type of writing, open-ended response, was taught due to the state’s curricular mandate.

In short, asking whether students’ writing can improve through computer-generated feedback may well be the wrong question to pose; rather, teachers would be well advised to critically analyze the cost-benefit relationship of its use, depending on which features of writing are considered important within the particular institutional and instructional constraints in which they make pedagogical choices. Over the long term, effects on the more observable mechanistic and formulaic aspects of writing may be counterproductive if they lead teachers and students further away from writing purposefully for real audiences.

HOW SHOULD COMPUTER-GENERATED FEEDBACK BEST BE INTEGRATED INTO WRITING INSTRUCTION?

Pedagogical recommendations to be drawn from the small body of research on computer-generated feedback emphasize the need for a long-term commitment to the software, an emphasis on using the writing assistance tools over the scoring tools, and a balanced provision of teacher, peer, and computer-generated feedback (Chen & Cheng, 2008; Warschauer & Grimes, 2008; Warschauer & Ware, 2006). Other suggestions for implementation, although accompanied by a limited empirical base, are nonetheless gaining momentum for practical purposes. The three main suggestions are analyzing the context of the larger instructional framework, allocating sufficient time for professional development, and addressing individual differences among students.

First, choices about integration depend on the larger instructional framework into which computer-generated feedback is being implemented. A recent study by Chen and Cheng (2008) examined three different types of integration used in three different postsecondary EFL classrooms. In the class whose postsurvey responses from students were the most positive, the teacher made a long-term, 16-week commitment to use the program, and she used the computer-generated feedback only in the early stages of drafting. Students first submitted their drafts to the computer and relied on its feedback until they reached a minimum cut-off score, at which point the teacher held individual conferences with them and organized peer feedback sessions. At the other extreme, a second teacher used the computer-generated feedback for summative assessment, and she assigned students’ final grades based on the computer-generated scores. The students protested, and she agreed to grade the final essays with the more traditional human feedback. A third teacher reported having not invested much time in learning the software herself, so after a short 6-week period, she became frustrated with technical difficulties and stopped using it altogether. Not surprisingly,
her frustration was reflected in her students’ postsurvey comments as well. These three portraits underscore the importance of critically analyzing the larger instructional context into which computer-generated feedback is to be integrated.

A second practical suggestion is to allocate sufficient time and support for teachers in the early phases of implementation, as the learning curve seems to be short yet rather steep. The difference between teacher buy-in and burnout early on can likely be resolved by dispelling some of the assumptions often made about the software. For example, some teachers are much more enthusiastic once they realize they can disable all scoring features and rely solely on formative feedback; others prefer to use feedback programs primarily as prewriting tools and rely heavily on graphic organizers, model essays, and asynchronous communication with their students. Teachers with minimal training in the software might not be aware that such versatility exists and may be dismissive before critically engaging with both the pitfalls and the promises of the software.

A final consideration when integrating computer-generated feedback is the age and literacy level of the learners. Although research on learner types is scant in this emerging field (Lai, 2009), there is some empirical evidence that for English language learners, the information presented as formative feedback can be overwhelming (Dikli, 2006). Feedback is primarily text-based and densely worded, using a high proportion of metalinguistic terminology. Stronger writers in my study were concerned that they were missing important feedback about content that they perceived their teacher to be better at providing, whereas less confident writers appreciated the computerized instant feedback to help them find and correct surface-level errors. It is unclear from the research base for which learner profiles computer-generated feedback may be most useful.

FINAL THOUGHTS

Despite the
promise of computer-generated feedback programs, there is still much apprehension that writing in these programs is defined too narrowly as a collection of surface features and formulaic writing. Some of the concern possibly reflects a legacy inherited from the programs’ earlier, primary focus on assessment. More research is needed that examines the context in which the system is used, the content of what is written, and the impact on key stakeholders as part of its integration. And yet, to further our understanding, interested teachers are also needed who will provide a critical, grounded exploration of how computer-generated feedback might enhance their classrooms and, ultimately, how the programs themselves might be enhanced in that process.

THE AUTHOR

Paige Ware is an associate professor at Southern Methodist University, Dallas, Texas, USA. Her research examines technology-based literacy and language instruction in secondary and postsecondary contexts.

REFERENCES

Attali, Y., & Burstein, J. (2006). Automated essay scoring with e-rater v.2. Journal of Technology, Learning, and Assessment, 4(3), 1–30.

Brock, M. (1990). Customizing a computerized text analyzer for ESL writers: Cost versus gain. CALICO Journal, 8(2), 51–60.

Brock, M. (1993). Three disk-based text analyzers and the ESL writer. Journal of Second Language Writing, 2(1), 19–40. doi: 10.1016/1060-3743(93)90004-M

Burston, J. (2001). Computer-mediated feedback in composition correction. CALICO Journal, 19(1), 37–50.

Chen, C.-F., & Cheng, W.-Y. (2008). Beyond the design of automated writing evaluation: Pedagogical practices and perceived learning effectiveness in EFL writing classes. Language Learning & Technology, 12(2), 94–112.

Chung, K. W. K., & Baker, E. L. (2003). Issues in the reliability and validity of automated scoring of constructed responses. In M. D. Shermis & J. C. Burstein (Eds.), Automated essay scoring: A cross-disciplinary perspective (pp. 23–39). Hillsdale, NJ: Lawrence Erlbaum Associates.

Conference on College Composition and Communication. (2004). CCCC position statement on teaching, learning, and assessing writing in digital environments. Retrieved from http://www.ncte.org/cccc/resources/positions/digitalenvironments

Dikli, S. (2006). An overview of automated scoring of essays. Journal of Technology, Learning, and Assessment, 5(1), 1–35.

Elliot, S., Darlington, K., & Mikulas, C. (2004, April). But does it really work? A national study of MY Access! effectiveness. Paper presented at the National Council on Measurement in Education, San Diego, CA.

Ericsson, P. F., & Haswell, R. (Eds.). (2006). Machine scoring of student essays: Truth and consequences. Logan, UT: Utah State University Press.

Lai, Y.-H. (2009). Which students prefer to evaluate their essays: Peers or computer program. British Journal of Educational Technology, 41(3), 432–454. doi: 10.1111/j.1467-8535.2009.00959.x

Landauer, T. K., Laham, D., & Foltz, P. W. (2003). Automated scoring and annotation of essays with the Intelligent Essay Assessor. In M. D. Shermis & J. C. Burstein (Eds.), Automated essay scoring: A cross-disciplinary perspective (pp. 87–113). Hillsdale, NJ: Lawrence Erlbaum Associates.

Leacock, C. (2004). Scoring free-responses automatically: A case study of a large-scale assessment. Examens, 1(3).

Lee, Y.-W., Gentile, C., & Kantor, R. (2010). Toward automated multi-trait scoring of essays: Investigating links among holistic, analytic, and text feature scores. Applied Linguistics, 31(3), 391–417. doi: 10.1093/applin/amp040

Shermis, M. D., & Burstein, J. (2003). Introduction. Automated essay scoring: A cross-disciplinary perspective (pp. xiii–xvi). Hillsdale, NJ: Lawrence Erlbaum Associates.

Ware, P. (2005). Automated writing evaluation as a pedagogical tool for writing assessment. In A. Pandian, G. Chakravarthy, P. Kell, & S. Kaur (Eds.), Strategies and practices for improving learning and literacy (pp. 174–184). Selangor, Malaysia: Universiti Putra Malaysia Press.

Warschauer, M., & Grimes, D. (2008). Automated writing assessment in the classroom. Pedagogies, 3(1), 22–36.

Warschauer, M., & Ware, P. (2006). Automated writing evaluation: Defining the classroom research agenda. Language Teaching Research, 10(2), 1–24. doi: 10.1191/1362168806lr190oa

Xi, X. (2010). Automated scoring and feedback systems: Where are we and where are we heading? Language Testing, 27(3), 291–300. doi: 10.1177/0265532210364643

The Promise of Directed Self-Placement for Second Language Writers

DEBORAH CRUSAN
Wright State University
Dayton, Ohio, United States

doi: 10.5054/tq.2010.272524

Evaluation is far from being a neutral process. In recent years, tests have commanded increasing influence, which in turn, has broad