Regarding materials practices, the aspect of materials development that has received the most attention in the literature is evaluation. Much of what has been written on evaluation focuses on procedures for evaluating materials and on the development of principled criteria. Many review articles published in professional journals are impressionistic, but some do subject the materials to rigorous criterion-referenced evaluations (Tomlinson, 2011). Littlejohn (1998, p. 192) also refined ‘A general framework for analysing materials’, which recognizes three levels of analysis. Masuhara (1998) reviewed the literature on materials evaluation and identified a gap: many frameworks seem to neglect teacher needs and wants. She then argued that these needs and wants can serve as goals for any evaluation of ELT coursebooks. McGrath (2002) focused on the evaluation of language learning materials, though Cunningsworth (1995) had done so from a practical perspective much earlier. In subsequent chapters, McGrath (2002) went on to propose principled ways of adapting and enhancing the coursebook once selected. Tomlinson (2003) distinguished between materials analysis, a description of what the materials contain and do, and materials evaluation, a measurement of the effects of the materials on their users. In this book, Tomlinson considered the principles of materials evaluation and used them to propose a criterion-referenced framework for the evaluation of materials, one that distinguishes between universal criteria (those applicable to all language learning contexts) and local criteria (those specific to a particular learning context). Other books devoted to evaluating and adapting materials include Tomlinson and Masuhara (2018) and Tomlinson (2008), which contain evaluations of materials in current use for General English and for young learners (2008a). These considerations suggest how a coursebook’s suitability may be judged on the basis of principles of materials evaluation.
Also, McGrath (2002), while acknowledging the need for impressionistic evaluation (roughly comparable to Littlejohn’s suggestion), indicated that it is not adequate as the sole basis for the evaluation and selection of coursebooks. He argued for a checklist consisting of important sections (namely, components/support for the teacher, cost, target learners and target teaching context), and this latter kind of evaluation has proved more appealing to evaluators and reviewers of evaluation checklists. His main purpose in providing this kind of evaluation is to compare many books against common criteria while still examining the features of each coursebook closely, with time-saving as a benefit. McGrath’s checklist comprises four main criteria (i.e., practical considerations, support for teaching and learning, context-relevance, and attractiveness to learners), with accompanying items that require the evaluator to give ‘yes/no’ answers. In a checklist similar to Garinger’s (2001, cited in McGrath, 2002), McGrath highlighted practical considerations by placing them in the first section. This may be because practicality is the first consideration for administrators in certain situations, such as provinces where coursebooks are hard to obtain.
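The comparative, time-saving logic of such a yes/no checklist can be sketched in a few lines of code. This is a purely illustrative model, not taken from McGrath (2002): the four criterion headings follow the text above, but the individual items and the tally function are invented for demonstration.

```python
# Illustrative sketch of a McGrath-style yes/no checklist.
# The four criterion headings follow the text; the items themselves
# are invented examples, not McGrath's actual checklist items.
CHECKLIST = {
    "practical considerations": [
        "Is the coursebook affordable?",
        "Is it readily available locally?",
    ],
    "support for teaching and learning": ["Is a teacher's guide included?"],
    "context-relevance": ["Does it fit the local syllabus?"],
    "attractiveness to learners": ["Is the layout appealing?"],
}

def tally(answers):
    """Count 'yes' answers per criterion, so that several candidate
    coursebooks can be compared quickly against the same criteria."""
    return {
        criterion: sum(1 for item in items if answers.get(item, False))
        for criterion, items in CHECKLIST.items()
    }

# Evaluating one candidate coursebook:
book_a = tally({
    "Is the coursebook affordable?": True,
    "Is a teacher's guide included?": True,
    "Is the layout appealing?": False,
})
print(book_a)
```

Running the same `tally` over several books yields directly comparable per-criterion scores, which is the time-saving benefit the text attributes to this style of checklist.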
Tomlinson (2011) proposed a checklist framework whose features differentiate it from the other checklists of the 2000s. Five main criteria (rationale/learner needs, independence and autonomy, self-development, creativity, and cooperation) are identified in this checklist for psychological validity. These criteria stress the importance materials should give to a student-centred view of learning and to learners’ needs, wants and long-term goals. The important factors of cognitive and affective engagement of the learners, promoting their creative and critical thinking, and offering opportunities for cooperative learning, all among the emerging issues of recent years, are articulated in the questions of this category. Considering that coursebooks can act as a powerful driving force for ‘consciousness-raising’, especially for new teachers, and can empower teachers to explore possibilities for thriving, one may see the essential role of checklists in enquiring about the compatibility between the coursebook and the teachers’ skills, abilities, philosophy and perspectives on learning and teaching, and the extent to which the coursebook guides them and encourages their creativity and flexibility in adopting, adapting, and developing the tasks and exploiting the content. These issues are reflected in this checklist, which enquires about pedagogical validity through the three criteria of ‘guidance’, ‘choice’ and
‘reflection/exploration/innovation’. In the section on ‘process and content validity’, 14 basic criteria are listed (‘methodology’, ‘content’, ‘appropriacy’, ‘authenticity’, ‘cultural sensitivity’, ‘layout/graphics’, ‘accessibility’, ‘linkage’, ‘selection/grading’, ‘sufficiency’, ‘balance/integration/challenge’, ‘stimulus/practice/revision’, ‘flexibility’, and ‘educational validity’). A close review of the various sections of the other checklists reveals their comparability, most of all with the ‘process and content validity’ criterion of this checklist. However, some issues related to ‘language acquisition’ principles, such as permitting a silent period at the beginning stages or when a new feature is being learned, and clarity in presenting language, which appear among the items of ‘methodology’, are not explicitly surveyed in the other checklists.
Instruments applying the view of fit between a coursebook and the curriculum, students and teachers appear in some checklists (e.g., Byrd et al., 2001, cited in Tomlinson, 2011), and this was later extended through the incorporation of psychological, pedagogical, and process and content validity issues. Among the checklists that refer to these issues, some criteria are emphasized in one checklist while the same criteria are neglected in another. Tomlinson (2003) suggested a helpful procedure for developing criteria for materials evaluation. In this procedure, several issues are emphasized: distinguishing between evaluation and analysis questions, avoiding multiple questions within a single item, avoiding unclear questions that cannot be answered, avoiding dogmatic questions, and avoiding questions that may be perceived differently by different evaluators. These points are regarded as important factors for increasing the reliability of an evaluation. As can be seen, not all developers have considered these criteria in their checklists.
In this regard, among the checklists with rating scales, some have used analytical questions (e.g., Garinger, 2002, cited in Tomlinson, 2011; McGrath, 2002) and some evaluation questions (e.g., Litz, 2005, cited in Tomlinson, 2011). Other checklists have employed both evaluative and analytic criteria without ordinal scales, which may not lead to reliable evaluation. The use of multiple questions within a single item can also be seen in some of the checklists in this line of evaluation. Questions such as “Do the materials help individual learners discover their learning styles and preferences, study habits and learning strategies?”, “Does it allow the students to make use of their linguistic abilities and to put into practice their communicative competence?”, and “Does it include up-to-date and relevant grammatical structures and lexicon?” (Tomlinson, 2011) are better separated if an evaluation checklist is to maintain high reliability. A question such as “Does the coursebook encompass cultural values in a global context?” may also be interpreted differently by different evaluators. These flaws, if corrected when refining developed checklists, may lead to more systematic, rigorous, and reliable evaluations.
One feature that should be noticed about a useful checklist for coursebook evaluation is its key-words. The frequency analysis of key-words within checklists in Mukundan (2010, cited in Tomlinson, 2011) revealed the key-words commonly used by checklist developers over four decades. The high-frequency key-words of each decade are defined as those used in the majority of the reviewed checklists of that decade. On this basis, the key-words students, teachers, content, skills and practice were, among the other criteria, the most cited in all decades. Other key-words, such as clarity, culture, different kinds of activities and exercises, interest, layout and tests, are emphasized more in the checklists of the 1980s, 1990s and 2000s; vocabulary in the 1970s, 1980s and 1990s; authenticity in the 1980s and 2000s; communicative approach to tasks and the structure of units in the 1980s and 2000s; context in the 1990s and 2000s; availability of the material, teaching methodology, objectives, and assessment in the 1980s and 1990s; syllabus and themes in the 1990s; and supplementary materials and intensive exercises in the 1970s, 1980s and 2000s.
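The decade-by-decade frequency analysis described above can be mimicked with a short sketch. The key-word lists below are invented placeholders, not Mukundan's data; the sketch only demonstrates the "used in the majority of the reviewed checklists" rule.

```python
from collections import Counter

# Invented placeholder data: each inner list stands for the key-words
# extracted from one reviewed checklist of a given decade.
checklists_1980s = [
    ["students", "teachers", "content", "layout"],
    ["students", "content", "vocabulary", "culture"],
    ["students", "skills", "content", "practice"],
]

# A key-word counts as high-frequency for the decade when it appears
# in the majority of that decade's reviewed checklists (set() ensures
# each checklist contributes at most one count per key-word).
freq = Counter(word for checklist in checklists_1980s for word in set(checklist))
high_frequency = sorted(w for w, n in freq.items() if n > len(checklists_1980s) / 2)
print(high_frequency)  # ['content', 'students']
```

With this rule, a key-word that happens to recur many times inside a single checklist is not inflated, which matches the "majority of reviewed checklists" definition rather than raw token frequency.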
The frequency of the key-words was also consistent with the total number of running words of the checklists (in both subject matter and methodology).
In general, the study of checklists indicates no specific preference or pattern in the arrangement of criteria and their underlying items across all four decades. Some criteria are emphasized more in separate sections, while others are used as sub-categories under the main or general criteria. Some authors put forward the idea of a preliminary or “initial evaluation” (e.g., Grant, 1987; Littlejohn, 1998; McDonough and Shaw, 2003; McGrath, 2002; cited in Tomlinson, 2010) before going through an in-depth evaluation of the coursebooks. In addition, the count of running words identifies both short and long checklists in all decades. Skierso’s checklist (1991, cited in Tomlinson, 2010) is the most comprehensive one, but some may question its length, as it would probably not be practical. Teachers would not have time to use such a lengthy instrument for evaluation purposes. In this line of suggestion, Cunningsworth (1995) notes that “it is important to limit the number of criteria used, and the number of questions asked to manageable proportions; otherwise, we risk being swamped in a sea of details” (p. 5).
Teachers today have a lot of choices when it comes to evaluation instruments.
There is a danger in this, though. Most instruments are developed because institutions believe they must possess their own, almost as if having one were a matter of pride. Many of these instruments, some developed hastily, are tested for neither validity nor reliability. Teachers must be made aware that “the framework used must be determined by the reasons, objectives and circumstances of the evaluation” (Tomlinson, 2013, p. 35).
Clarity is a key factor in the criteria of a good checklist; if it is lacking, the number of unanswered questions increases, which reduces the reliability of the evaluation. In a similar vein, to increase the validity of the evaluation, checklists should be based on the target situation of use so that they can be matched to and evaluated against particular teaching contexts. This is in line with what Littlejohn (1998) proposed in his preparatory framework for materials analysis, evaluation and action. Checklists should also be developed as stand-alone evaluation tools, so that even an inexperienced evaluator can understand the criteria without requiring the developers’ elaboration; otherwise, they may be set aside at first glance. In this regard, one may propose some important factors to be considered in the process of checklist development.
The three main features that one may believe checklist developers should keep in mind when developing checklists are:
• Clarity
• Conciseness
• Flexibility
Checklists should also have items that help evaluators visualize whether the materials serve the following aims:
• To invoke learners’ affective and cognitive domains in the learning process
• To enhance learners’ confidence, stimulate different learning styles, cater for needs and wants, and bring about a positive attitude towards the language
• To maximize attractiveness via layout (pictures, illustrations, colour, “white space”) (Tomlinson, 2003, p. 21)
• To present different models of the target language in authentic use
• To include up-to-date topics of interest (i.e., culturally specific yet relevant across cultures) (Tomlinson, 2011)
• To encourage creativity and higher-order thinking through varied activities and to give students the freedom to discover the language
• To cover the four language skills
• To attend to the functional load of vocabulary and to its recycling
• To promote effective methodology (providing flexibility for teachers’ different teaching methods)
• To pitch competencies at a level appropriate to students’ readiness
• To contain essential supplementary materials
These points may help teachers reach a better understanding of their coursebooks and of how effectively they perform. They also provide theoretical grounds for constructing a checklist of criteria as an instrument for coursebook evaluation, which will be discussed in the next part.