TEST DESIGN
Ketnooi.com share this to you

CHAPTER 1: INTRODUCTION

1.1 RATIONALE

The importance of language testing is recognized by virtually all professionals in the field of language education. It is of special importance in an educational system that is highly competitive, as testing is not only an indirect stimulus to learning but also plays a crucial role in determining the success or failure of an individual's career, with direct implications for his future earning power. "Thus, testing is an important tool in educational research and for programme evaluation, and may even throw light on both the nature of language proficiency and language learning" (Lauwerys and Seanlon, 1969).

Likewise, testing plays a very important role in the process of teaching and learning a foreign language. It is one of the most important ways to evaluate how much students acquire when they learn a foreign language. Through tests, teachers learn not only whether learners succeed or fail but also how well they use what they have been taught. Moreover, the learners find out what they have gained, what they can apply, and what they cannot. Moore (1992, p.138) states: “Evaluation is an essential tool for teachers because it gives them feedback concerning what the students have learned and indicates what should be done next in the learning process. Evaluation helps you to better understand students, their abilities, interests, attitudes and needs in order to better teach and motivate them.” Nga (1997, p.1) reaches the same conclusion: “Tests are assumed to be powerful determiners of what happens in classroom and it is commonly claimed that they affect teaching and learning activities both directly and indirectly.” Therefore, testing is an important part of the teaching and learning process; but has it been given adequate attention and careful study yet?
Test researchers (Hughes, 1989; Brown, 1995; Read, 1982; Hai, 1999; Tuyet, 1999) generally claim that, unfortunately, tests have got a bad rap in recent years, and not without reason. More often than not, tests are seen by learners “as dark clouds hanging over their heads, upsetting them with thunderous anxiety as they anticipate the lightning bolts of questions they do not know and worst of all a flood of disappointment if they do not make the grade” (Brown, 1994a: p.373). Hughes (1989, p.1) makes another comment on recent language testing: “It cannot be denied that a great deal of language testing is of very poor quality. Too often language tests have a harmful effect on teaching and learning and too often they fail to measure accurately whatever it is they are intended to measure.” This is coupled with the fact that teachers frequently lack formal training in educational measurement techniques and tend to be alienated from the testing process. They regard it as a necessary evil, an intrusion on their regular instructional activities.

At present, English testing at Son La Teachers’ Training College (STTC) has the following characteristics:
- It has not been given appropriate attention and careful study.
- Its role in teaching and learning has not been fully recognized.
- Almost all language teachers think that teachers should be responsible for making tests, because testing is one part of the teaching and learning activities that students have to pass.
- There has been a tendency to use commercial (ready-made) tests rather than teacher-made tests, since commercial tests are very convenient and do not take much time to construct. Thus these selected tests may not be relevant to the objectives of the course.
- Test content is sometimes found to be unrelated to the objectives of the course, and very often many test items have not been dealt with in class.
- Students have complained that there is still a big gap between what is taught and what is
tested. An instance of this is when tests designed for the pre-intermediate level are given to students at the elementary level. They are so difficult that only a few students can complete them. Therefore, such tests are neither valid nor reliable.
- Tests are used exclusively for grading, and there is no feedback about them.
- Bad tests and bad items have not been discarded. Some items are found to be so difficult that few testees could do them, whereas other items are so easy that all testees can obtain the correct answers. Such items should be discarded or replaced.
- Moreover, because the writing and reading comprehension tests at the university are designed entirely with multiple-choice techniques, students can easily cheat by asking for and copying answers from their classmates.
- Apart from the carefully designed tests, some others are still of poor quality and do not accurately measure the students' real ability. Perhaps the test writer only pays attention to the fulfillment of his/her duty, which is to give tests, rather than to the effectiveness of the tests. Those tests often fail to measure accurately whatever they are intended to measure.
- Finally, some of the tests may lack reliability because, for the sake of confidentiality, they are not pre-tested anywhere else. Indeed, for the sake of "confidentiality", test designers are often asked to write tests at short notice, just some time before they are administered. In such circumstances, who can say for sure that the required standards and criteria will be met by the test writers?
Therefore, a well-designed test is necessary at every language level, especially at college level, since it is the elementary level, which aims at acquiring survival English and at diagnosing students’ aptitudes in the course and what they have to study to improve both their knowledge and skills. In this minor thesis, the author draws on her knowledge of testing and of the testing situation to propose a sample achievement test for first-year students who have been taught the student’s book New Headway English Course (elementary level) from unit to unit.

1.2 SCOPE OF THE STUDY

The study focuses on the existing situation at Son La Teachers’ Training College. The author designs a sample test covering only reading and writing skills, with a focus on grammar, vocabulary, reading and writing. The study presents and analyzes data from the achievement test for first-year non-English-major students. Moreover, the teachers’ and students’ comments on the test and their suggestions for its improvement are presented in this thesis.

1.3 AIMS OF THE STUDY

The aim of the study is to report on research examining the current testing situation and language tests for non-English majors at STTC, with great emphasis on analyzing the results of the sample test, the teachers’ and students’ comments on the test, and their suggestions for its improvement. The specific aims of the study are:
- To investigate the STTC teachers’ and students’ evaluation of the sample test concerning its content, time allowance and format.
- To investigate the teachers’ and students’ suggestions for improving the testing situation and language tests at STTC.
- To propose an achievement test construction for first-year students at STTC and to design a sample test based on the proposed construction.
- To offer some practical recommendations for improving the testing situation at STTC.

1.4 METHODS OF THE STUDY

In order to achieve the above aims, a
study has been carried out with the following approach. Based on the theory and principles of language testing and the major characteristics of a good test, especially of achievement tests, the author analyzes the results of the sample test and of the survey questionnaire completed by 10 English teachers of the English-major students at STTC. Other methods, such as interviews, informal discussions with students and teachers, and classroom testing observation, are also employed to gather further information.

1.5 RESEARCH QUESTIONS

The research questions of the study are as follows:
- What should be done to improve the English testing situation for the first-year students at STTC?
- Which test components are considered appropriate for the English achievement test construction at STTC?

1.6 DESIGN OF THE STUDY

The minor thesis is organized into four chapters. Chapter one is the introduction, consisting of the rationale, the aims, the methods, the research questions and the design of the study. Chapter two presents the literature review on the basic concepts of testing, types of tests, characteristics of good tests, test items, and test item types for language components and language skills. Chapter three, which is the main part of the study, presents the analysis of the findings from the test design and some brief comments from teachers and testees. Chapter four deals with some suggestions to improve the test and the summary of the research.

CHAPTER 2: LITERATURE REVIEW

2.1 BASIC CONCEPTS OF TESTING

According to Brown (1994: p.252), “A test, in plain or ordinary words, is a method of measuring a person’s ability or knowledge in a given area.” Moore (1992: p.138) proposes that evaluation is an essential tool for teachers because it gives them feedback concerning what the students have learned and indicates what should be done next in the learning process. Evaluation helps us to understand students better, their abilities, interests, attitudes, and needs, in order to better
teach and motivate them. However, Brown (1994, p.373) stresses that tests are seen by learners as dark clouds hanging over their heads, upsetting them with thunderous anxiety as they anticipate the lightning bolts of questions they do not know and, worst of all, a flood of disappointment if they do not make the grade. Read (1983, p.3) shares the idea, saying that a language test is a sample of linguistic performance or a demonstration of language proficiency. In other words, a test is not simply a set of items that can be objectively marked; it can also involve a subjective evaluation of spoken and written performance with the assistance of a checklist, a rating scale, or a set of performance criteria. Nga (1992, p.2) also confirms that tests commonly refer to a set of items or questions designed to be presented to one or more students under specified conditions. Harrison (1986, p.1) notes that a test can be seen either as a natural extension of classroom work, providing teachers and students with useful information that can serve as a basis for improvement, or as a necessary but unpleasant imposition from outside the classroom. That means a test is a useful tool for measuring learners’ ability in a certain situation, especially in the classroom.

2.2 TYPES OF TESTS

2.2.1 Proficiency Tests

According to Hughes (1990:9), “Proficiency tests are designed to measure people’s ability in a language regardless of any training they may have had in that language.” That is to say, the content of a proficiency test is not based on the content or objectives of any language course that test takers may have followed. Rather, it is based on a specification of what they have to be able to do in the language to meet the requirements of their future aims.

Other test specialists, such as Carroll and Hall (1985), Harrison (1986) and Henning (1987), share the view that proficiency tests help both teachers and learners know whether the learners are able to follow a particular course or whether they have to
take some pre-departure training. Some other popular tests, such as TOEFL and IELTS, are used to test students’ proficiency for study in English-speaking countries. In Vietnam, proficiency tests are offered at different levels, namely A, B and C, for workers, engineers, teachers, architects, etc.

2.2.2 Achievement Tests

As mentioned above, not many teachers are interested in proficiency tests, since these are not based on any particular course book. Hughes (1990:10) states: “In contrast to proficiency tests, achievement tests are directly related to language courses, their purpose being to establish how successful individual students, groups of students, or the courses themselves have been in achieving objectives”. Achievement tests are usually given at the end of a course to the group of learners who took it. Sharing Hughes's idea about achievement tests, Brown (1994:259) suggests: “An achievement test is related directly to classroom lessons, units or even total curriculum”. Achievement tests, in his opinion, “are limited to a particular material covered in a curriculum within a particular time frame.” Another useful comment on achievement tests, offered by Finocchiaro and Sako (1983:15), is that achievement or attainment tests are widely employed in language teaching institutions. They are used to measure the amount and degree of control of discrete language and cultural items and of integrated language skills acquired by the students within a specific period of instruction in a specific course. In his book, Harrison (1983:7) writes: “an achievement test looks back over a longer period of learning than the diagnostic test, for example, a year’s work, or even a variety of different courses.” He also points out that achievement tests are intended to show the standard which the students have reached in relation to other students at the same level.

There are two kinds of achievement tests: final achievement tests and progress achievement tests.
Final achievement tests are those administered at the end of a course of study. They may be written and administered by ministries of education, official examining boards, or members of teaching institutions. Clearly, the content of these tests must be related to the courses with which they are concerned, but the nature of this relationship is still a matter of disagreement amongst language testers.

According to some testing experts, the content of a final achievement test should be based directly on a detailed course syllabus or on the books and other materials used. This has been referred to as the syllabus-content approach. It has an obvious appeal, since the test contains only what it is thought that the students have actually encountered, and thus can be considered, in this respect at least, a fair test. The disadvantage of this approach is that if the syllabus is badly designed, or the books and other materials are badly chosen, then the results of a test can be very misleading. Successful performance on the test may not truly indicate successful achievement of course objectives.

The alternative approach is to base the test content directly on the objectives of the course, which has a number of advantages. Firstly, it forces course designers to be explicit about objectives. Secondly, test takers' performance shows how far they have achieved those objectives. This in turn puts pressure on those who are responsible for the syllabus and for the selection of books and materials to ensure that these are consistent with the course objectives. Tests based on course objectives work against the perpetuation of poor teaching practice, something which course-content-based tests, almost as if part of a conspiracy, fail to do. It is the author’s belief that basing test content on course objectives is much preferable: it provides more accurate information about individual and group achievement and is likely to promote a more beneficial backwash effect on teaching.

Progress achievement
tests, as the name suggests, are intended to measure the progress that learners are making. Since ‘progress’ means progress towards achieving course objectives, these tests should also be related to objectives. The short-term objectives on which they are based should make a clear progression towards the final achievement test based on course objectives. Then, if the syllabus and teaching methods are appropriate to these objectives, progress tests based on short-term objectives will fit well with what has been taught. If not, there will be pressure to create a better fit. If it is the syllabus that is at fault, it is the tester’s responsibility to make clear that it is there that change is needed, not in the tests.

In addition to more formal achievement tests, which require careful preparation, teachers can feel free to devise their own ways of making a rough check on students’ progress and keeping learners on their toes. Since such tests will not form part of formal assessment procedures, their construction and scoring need not be directed purely towards the intermediate objectives on which the more formal progress achievement tests are based. However, they can reflect the particular ‘route’ that an individual teacher is taking towards the achievement of objectives.

2.2.3 Diagnostic Tests

According to Hughes (1990:13), “Diagnostic tests are used to identify students’ strengths and weaknesses. They are intended primarily to ascertain what further teaching is necessary”. Brown (1994:259) proposes, “A diagnostic test is designed to diagnose a particular aspect of a particular language.” Harrison (1983) remarks that this kind of test is used at the end of a unit in the course book or after a lesson designed to teach one particular point. With this kind of test it is reasonably straightforward to find out which skills learners apply well or badly. Its disadvantage, however, is that it is not so easy to obtain a detailed analysis of a learner’s command of grammatical structures. In order to be sure of this, we would need a number of
examples of the choice the student made between the two structures in every different context which we thought was significantly different and important enough to warrant obtaining information on. Tests of this kind still need a tremendous amount of work to produce. Whether or not they become generally available will depend on the willingness of individuals to write them and of publishers to distribute them.

2.2.4 Placement Tests

According to Hughes (1990:14), “Placement tests are intended to provide information which will help to place students at the stage of the teaching programme most appropriate to their abilities. Typically, they are used to assign students to classes at different levels.” In other words, we use placement tests to place pupils into classes according to their ability so that they can start a course at approximately the same level as the other students in the group.

2.2.5 Progress Tests

A progress test is designed to measure the extent to which the students have mastered the material taught in the classroom. It is based on the language programme which the students have been following and is just as important as an assessment of the teacher's own work as of the students' own learning. Results obtained from progress tests enable the teacher to become more familiar with the work of each of the students and with the progress of the class in general. Progress tests also aim at stimulating learning and reinforcing what has been taught. Good performances may act as a means of encouraging the students, and even poor performances may act as an incentive to more work.

According to Baker (1989, p.103), the frequent use of the progress test as a goad to encourage application on the part of the learners can also, in theory, serve as a basis for decisions on course content, learner placement and future course design. He also concludes that the results of a progress test can be used as an indication of parts of the course content which have not been
mastered by a number of students and thus need remedial action. Moreover, a properly written progress test that samples correctly from the course content can show learners which parts of the course need more attention and show course designers which parts of the course have not been effective. Khoa's research (1999, p.13), meanwhile, establishes: “A progress test is an ‘on-the-way’ achievement test, which is linked to the specific content of a particular set of teaching materials or particular course of instruction.” Progress tests are prepared by a teacher and given at the end of a chapter, a course, or a term. They may also be regarded as similar in nature to achievement tests but narrower and much more specific in scope. These tests help the teacher to judge the degree of success of his or her teaching and to identify the weaknesses of the learners. The use of progress tests is gaining ground in many universities and colleges in Vietnam nowadays. They are part of what is generally known as "continuous assessment", a process of assessment which takes into consideration the results scored by students on their progress tests.

2.2.6 Direct versus Indirect Tests

Hughes (1990:15) points out that direct testing requires the candidate to perform precisely the skills that we wish to measure. If we want to know how well candidates can write compositions, we ask them to write compositions. If we want to know how well they pronounce words, we ask them to speak. The tasks, and the texts which are used, should be as authentic as possible. In fact, the tasks cannot be truly authentic; nevertheless, the effort is made to make them as realistic as possible. Direct testing is easier to design when it is intended to measure the productive skills of speaking and writing, since the very acts of speaking and writing provide us with information about the candidate’s ability. With listening and reading it is necessary
to get candidates not only to listen or read but also to demonstrate that they have done this successfully. He also indicates several attractions of direct testing. Firstly, if teachers want to assess pupils’ ability, it is relatively straightforward to create the conditions which will elicit the behavior on which judgments are based. Secondly, in his opinion, at least in the case of the productive skills, the assessment and interpretation of students’ performance is quite straightforward. Thirdly, there is likely to be a helpful backwash effect, since practice for the test involves practice of the skills that we want to encourage.

By contrast, indirect testing tries to measure the abilities that “underlie” the skills in which we are interested (Hughes, 1990:15). One section of the TOEFL is considered an indirect measure of writing ability: the candidate has to identify which of the underlined elements is erroneous or inappropriate in formal Standard English. Another example of indirect testing is Lado’s (1961) proposed method of testing pronunciation ability by a paper-and-pencil test in which the candidate has to identify pairs of words which rhyme with each other.

The main problem with indirect tests is that the relationship between performance on them and performance of the skills in which we are usually interested tends to be rather weak in strength and uncertain in nature. We do not know enough about the component parts of composition writing to predict composition writing ability accurately from scores on tests that measure the abilities which we believe underlie it. We may construct tests of grammar, vocabulary, discourse markers, handwriting, and punctuation. Still, we will not be able to predict scores on compositions accurately, even if we make sure of the representativeness of the composition scores by taking many samples.

2.2.7 Discrete Point versus Integrative Testing

According to Hughes (1990:16), “Discrete point testing refers to the testing of one element at a time, item by item”,
which means the test involves a series of items, each of which tests a particular grammatical structure. By contrast, integrative testing requires the candidate to combine many language elements in the completion of a task, such as writing a composition, taking notes while listening to a lecture, taking a dictation, or completing a cloze passage. Henning (1987) shares with Hughes the idea that discrete point tests will usually be indirect, while integrative tests will tend to be direct. However, some integrative

Unit: Hello everybody!
  Grammar: verb to be (am/is/are); possessive adjectives (my, your, his, her)
  Vocabulary: countries; using a bilingual dictionary
  Reading comprehension: gap-filling
  Writing: write a short text about yourselves

Unit: Meeting people
  Grammar: verb to be (questions and negatives; negative and short answers); possessive ’s
  Vocabulary: family; food and drink; opposite adjectives
  Reading comprehension: true-false sentences; write questions for given answers

Unit: The world of work
  Grammar: present simple; questions and negatives
  Vocabulary: verbs; jobs
  Reading comprehension: match sentences with photographs; answer questions
  Writing: writing a letter

Unit: Take it easy!
  Grammar: present simple
  Vocabulary: leisure activities
  Reading comprehension: fill in the gaps; short answers; correct mistakes
  Writing: writing a letter

Unit: Where do you live?
  Grammar: there is/are; some/any; How many?; this/that/these/those
  Vocabulary: rooms; household goods; parts of a plane; places
  Reading comprehension: short-answer items; true-false sentences
  Writing: describing where you live

Unit: Can you speak English?
  Grammar: can/can’t; could/couldn’t; was/were
  Vocabulary: countries and languages
  Reading comprehension: short-answer items
  Writing: writing a letter of application for a job

Unit: Then and now
  Grammar: past simple
  Vocabulary: verbs
  Reading comprehension: gap-filling; mark true-false; write questions
  Writing: describing a holiday

Unit: How long ago?
  Grammar: past simple; negative and ago
  Vocabulary: relationship
  Reading comprehension: reading for specific information
  Writing: describing an old friend

3.2 THE CURRENT TESTING SITUATION AT STTC

Based on her experience of teaching English at STTC for nearly years, the author has learned that testing is not the most important concern for teachers at STTC. Classroom teachers design most language tests themselves using a cut-and-paste method; that is, they put tests together from available commercial tests without following any principles of testing. It is assumed that teachers who are able to teach are able to design a good test for their learners. English has not always been an important subject; it is one of the general compulsory subjects, such as philosophy, political economy, psychology and computing, which are not included in final college examinations. Therefore, the teachers of the English section regard class progress tests as a means of estimating students’ learning results as well as reinforcing their knowledge and motivating their learning. After about ten lessons students have a written test, which means that over the fifteen weeks of study in each term they have three tests. At the examinations, only one test is given to all the students of the same level. Objective test types such as multiple-choice items, matching items and cloze tests have been used in order to achieve high test reliability and discrimination among the test takers. In general, the English tests look good and reasonable to students, and the test takers write their answers on the examination papers. Since this textbook has already been used for the first year, some progress tests have been designed and administered by teachers of the English group. Most of them are used to training English-major students, so the tests are considered valid, reliable and discriminating for English-major students but not for non-English-major ones. Up to now, there has been no set of standard English test items for students at STTC. Therefore, English
tests are mainly taken from grammar books designed for learners at the elementary level, such as practical grammar usage (exercises book), revisions and tests, grammar in use, sentence building, sentence transformation, filling the gap (published by Ho Chi Minh City Publishing House), etc. Thus, as mentioned earlier, the content of the tests is sometimes found to be rather unrelated to the course objectives, and the tests may lack the most important criteria for a good test: validity, format, practicality, and reliability. Moreover, the content of both the progress tests and the achievement test for the first year at STTC is likely to fail to measure the language skills and language components. As mentioned above, students do not have opportunities to practise speaking and listening skills, and teachers have no chance to test learners’ speaking ability; instead, students’ pronunciation is measured by the phonetics section of the test, which is not reasonable.

The Sample Achievement Test consists of five sections with eight item types:
Section I: Grammar and Vocabulary (fifteen multiple-choice questions)
Section II: Grammar and Vocabulary (ten matching items)
Section III: Writing (sentence building with given words)
Section IV: Grammar and Vocabulary (ten multiple-choice questions)
Section V: Gap-filling

3.3 THE PROPOSED CONSTRUCTION OF THE ACHIEVEMENT TEST FOR THE FIRST YEAR STUDENTS AT STTC

3.3.1 Test objectives

As mentioned above, achievement tests are directly related to the language course. To take this test, students have to study units of the course book for the first year. The time allowance is 90 minutes. It is essential to design a test that is suited to what students have been taught and that satisfies the objectives of the course. The problem is that students are required to master four macro-skills, but we are not able to test speaking and listening skills. The objectives of the achievement test are:
- To grade
students
- To elicit the abilities of students on grammar and usage
- To evaluate teachers’ methods

3.3.2 The Paper Specification Grids for the 2nd Term Achievement Test

The test consists of 50 items, divided into five sections.

Section I checks the use of grammatical structures and vocabulary that students have studied by asking them to choose the best option from four given options. This part accounts for 30% of the total mark.

Section II asks students to match questions with suitable answers. The objective is to check students’ grammatical structures and vocabulary. This part accounts for 20% of the total mark.

Section III checks students’ writing skill by asking them to build five correct sentences from given words. This part accounts for 10% of the total mark.

Section IV checks the use of grammatical structures and vocabulary that students have learnt by asking them to tick the one correct sentence of two given sentences. This part accounts for 20% of the total mark.

Section V assesses students’ reading comprehension skill through a text extract of about 200 words containing ten gaps to be filled with given words. This part accounts for 20% of the total mark.

The weighting of the different parts is based on their importance with regard to the syllabus design of the course. The specification grids of the test are as follows:

Table 4: The specification grids of the test

Part  Main skill & language components focus  Input                         Response/item type                  Marking
I     Grammar and Vocabulary                  Separate sentences with gaps  x 15, 4-option multiple choice      30
II    Reading                                 Stimulus and responses        x 10, sentence matching             20
III   Writing                                 Incomplete sentences          x 5, sentence building              10
IV    Grammar and Vocabulary                  Separate sentences            x 10, 2-option multiple choice      20
V     Reading                                 Narrative, approx. 200 words  x 10, gap filling with given words  20
Total                                                                                                           100

3.3.3 Data collection

The test was administered to 50 students of Course 10 at STTC who had finished their 2nd semester. There were two
teachers supervising the administration of the test closely to ensure that students did the test individually and that there was no cheating in the examination. The test papers were then collected for marking and later analysis. To analyse the data and evaluate the reliability and the validity of the final achievement test, the author combines the instruments shown below:

- Formula 1, used to compute the reliability coefficient:
  $$R_t = \frac{N}{N-1} \times \left(1 - \frac{\bar{x}(N-\bar{x})}{N\,sd^{2}}\right)$$
- Software: Item and Test Analysis Program (ITEMAN) for Windows, Version 3.50, used to analyze item difficulty and item discrimination and to evaluate construct validity.

3.3.4 Interpretation and test score analysis

3.3.4.1 The frequency distribution

Table 5: Frequency distribution in the final achievement test (columns: converted scores (x), frequency (f), fx; total fx = 342)

Figure 3.3.4.1 Histogram of score distribution

It can be seen from the histogram that the scores are distributed unevenly: more students got marks from 7.5 to 8.5 than any other marks.

3.3.4.2 The central tendency

- The mean = the average score:
  $$\bar{X} = \frac{\sum X}{N} = \frac{342}{50} \approx 6.8$$
  ($\bar{X}$ = the arithmetic mean, $\sum X$ = the sum of all scores, $N$ = the number of test takers)
- The mode = the score gained by the largest number of students. As the histogram above shows, the mode is 8.5 (11 students got a mark of 8.5).
- The median = the score gained by the middle testee in the order of merit. In this case it is 7.5.

3.3.4.3 The dispersion

- The range = the difference between the highest and the lowest scores. The range = 9 − 1 = 8.
- Standard deviation = the average amount by which each student’s score deviates from the mean. Sd is calculated as follows:
  $$Sd = \sqrt{\frac{\sum (X - \bar{X})^{2}}{N}}$$
  where $X$ is any observed score in the sample, $\bar{X}$ is the mean of all scores, and $N$ is the number of scores in the sample.
  $$Sd = \sqrt{\frac{(1-5.8)^{2} + (2-5.8)^{2} + \dots + (8.5-5.8)^{2} + (9-5.8)^{2}}{50}} = \sqrt{\frac{125.75}{50}} = \sqrt{2.515} \approx 1.59$$

The standard deviation of the test is 1.59, and the standard deviations of the scales in the test are:

Table 6: The standard deviations of the scales in the test

Scale  Sd
1      2.829
2      2.988
3      0.968
4      1.910
5      3.200

These numbers show that the test and all its scales have quite large standard deviations. That means the score distribution is wide: the scales have spread the students out, and there is a wide range of student ability.

3.3.5 Test items evaluation

3.3.5.1 The item difficulty

The item difficulty (p), or facility value (FV), is an index that shows how easy or difficult a particular item proved in the test.

Level of difficulty = proportion of students getting the item right = average score on the item

E.g.:
p = 0.65 means 65% of students got it right
p = 1.00 means 100% of students got it right
p = 0.00 means no students got it right

The scales for item difficulty are:
p = 0.80 → 1.00: Probably too easy
p = 0.25 → 0.80: Acceptable
p = 0.00 → 0.25: Probably too difficult

The interpretation of the item difficulty of the test is shown in the following table:

Table 7: The interpretation of the item difficulty of the test

Item  P     Acceptable  Too easy  Difficult
1     0.86              √
2     0.62  √
3     0.80              √
4     0.84              √
5     0.26  √
6     0.72  √
7     0.90              √
8     0.68  √
9     0.82              √
10    0.20                        √
11    0.80              √
12    0.84              √
13    0.78  √
14    0.82              √
15    0.48  √
16    0.86              √
17    0.88              √
18    0.78  √
19    0.89              √
20    0.90              √
21    0.82              √
22    0.86              √
23    0.74  √
24    0.88              √
25    0.88              √
26    0.02                        √
27    0.08                        √
28    0.58  √
29    0.26  √
30    0.00                        √
31    0.88              √
32    0.92              √
33    0.92              √
34    0.86              √
35    0.90              √
36    0.80              √
37    0.90              √
38    0.76  √
39    0.92              √
40    0.58  √
41    0.34  √
42    0.76  √
43    0.70  √
44    0.44  √
45    0.42  √
46    0.74  √
47    0.58  √
48    0.70  √
49    0.76  √
50    0.60  √

From Table 7, it can be seen that 22 of the 50 items used in the test are of a suitable level for students. There are only four items that are too difficult for students (10, 26, 27 and 30), with quite low levels of
difficulty (0.20, 0.02, 0.08 and 0.00). There are 24 items that are too easy for students. Both the too difficult and the too easy items should be changed to make the test more reliable. Attention should also be paid to the knowledge or skills that most students had problems with.

3.3.5.2 The item discrimination

Item discrimination (D) indicates the extent to which an item discriminates between the testees, separating the more able testees from the less able ones. The discrimination value tells us about the difference in ability between those who answered the item successfully and those who did not. D can range from -1 to +1. A D of +1 indicates perfect correlation with the testees’ results on the whole test, and a D of -1 means the item discriminates in entirely the wrong way.

The scales for item discrimination are:
D = 0.60 → 1.00: Good discrimination (g-d)
D = 0.30 → 0.59: Medium discrimination (m-d)
D = 0.00 → 0.29: Bad discrimination (b-d)
D < 0.00: Bad item (b-i)

The result of item discrimination of the test is shown in the table below:

Table 8: The result of item discrimination of the test

Item | D | Interpretation || Item | D | Interpretation
1 | 0.23 | b-d || 26 | 0.07 | b-d
2 | 0.47 | m-d || 27 | 0.27 | b-d
3 | 0.36 | m-d || 28 | 1.00 | g-d
4 | 0.47 | m-d || 29 | 0.87 | g-d
5 | 0.15 | b-d || 30 | 0.00 | b-d
6 | 0.27 | b-d || 31 | 0.38 | m-d
7 | 0.33 | m-d || 32 | 0.25 | b-d
8 | 0.23 | b-d || 33 | 0.25 | b-d
9 | 0.60 | g-d || 34 | 0.38 | m-d
10 | 0.11 | b-d || 35 | 0.31 | m-d
11 | 0.33 | m-d || 36 | 0.56 | m-d
12 | 0.36 | m-d || 37 | 0.31 | m-d
13 | 0.67 | g-d || 38 | 0.44 | m-d
14 | 0.53 | m-d || 39 | 0.25 | b-d
15 | 0.73 | g-d || 40 | 0.63 | g-d
16 | 0.50 | m-d || 41 | 0.71 | g-d
17 | 0.43 | m-d || 42 | 0.62 | g-d
18 | 0.79 | g-d || 43 | 0.77 | g-d
19 | 0.43 | m-d || 44 | 0.57 | m-d
20 | 0.36 | m-d || 45 | 1.00 | g-d
21 | 0.57 | m-d || 46 | 1.00 | g-d
22 | 0.50 | m-d || 47 | 0.77 | g-d
23 | 0.93 | g-d || 48 | 1.00 | g-d
24 | 0.43 | m-d || 49 | 0.85 | g-d
25 | 0.43 | m-d || 50 | 0.78 | g-d

The table shows that there is no bad item in the test. Eleven items have bad discrimination, but the discrimination of the remaining 39 items is acceptable (0.30 and above). Therefore, it can be concluded that the discrimination of the test is good.

3.3.6 Estimating the reliability of the test

The reliability of the
test can be judged through an index called coefficient alpha (α). A test with a coefficient of 1 is one which would give precisely the same results for a particular set of candidates regardless of when it is administered. A test with a coefficient of 0 would give sets of results quite unconnected with each other, and so would fail to be a reliable one. Here are the coefficient alphas of the scales in the test, as printed out by the computer:

Table 9: The coefficient alphas of the scales in the test

Scale | Alpha
1 | 0.734
2 | 0.954
3 | 0.544
4 | 0.742
5 | 0.875

From Table 9, we can see that, in general, the level of reliability of the test is quite high.

3.4 TEACHERS’ AND STUDENTS’ COMMENTS

The teachers’ and test-takers’ comments on the test are presented in this section.

According to the information gathered from the survey questionnaire given to the teachers of the English group (10 teachers) and the 50 testees, 8 teachers (80%) agree that the content of the test relates to what the students have been taught, and 45 testees (90%) think so too. Moreover, 6 out of 10 teachers note that the marking scale is appropriate.

Concerning the time allowance for the test, the same number of teachers (6 out of 10) report that the time is enough for students to complete the test, while only 25 students (50%) have the same opinion. Two teachers (20%) consider that the time is too long, and the same number of testees share this view. Twenty-four test-takers (48%) find that the time is rather short and suppose that 60 minutes would be appropriate for this type of test.

All of the teachers find that the test measures students’ true ability in grammar, reading and writing, and so do the students. Concerning the structures of the test, only 2 (20%) of the 10 teachers think the grammar and vocabulary items are easy. All of them consider that the structures are related to what they have taught, whereas 25 test-takers (50%) share this view, 14% of testees think the structures are not related to what they have been taught, and 36% find they are
partly related. According to the teachers, the reading texts used are reasonable, neither too long nor too difficult, whereas 15 testees (30%) report that the text is long and difficult, 20 test-takers (40%) think it is reasonable, 11 (22%) think it is not very easy, and the rest think the opposite.

The thing to note here is that 6 respondents out of 10 suggest the need to change the test item format, the content, the time allowance and the total points of each part so that the test can measure learners’ speaking and listening skills. Furthermore, the test-takers want the test format to be changed so as to measure students’ speaking and listening skills; in their view, the time allowance must be longer, and the structures and the test items for skills must be more complex.

CHAPTER 4: CONCLUSION

4.1 SUMMARY OF THE STUDY

The aim of the study is to report research on the current testing situation and the language test for first-year non-English majors at STTC.

Chapter one introduces the rationale, the aims, the method, the research questions and the design of the study.

In chapter two, by reviewing the relevant literature on language testing, the writer presents the different types of test, the characteristics of a good test, and the theory of test items, language components and skills. The test items used to evaluate reading, writing, grammar and vocabulary are also provided in that chapter.

The issues influencing the current testing situation at STTC are discussed in the first two parts of chapter three. In the following parts, the data collected from the sample test results are analysed in order to find out how good the students are. However, one thing that must be stressed here is that the interpretation of the test results is reasonable not only objectively (through the results of the item data analysis) but also subjectively (through the results of interviews and surveys), and that the two largely coincide. The information from the survey shows the differences and similarities between teachers’ comments
and students’ opinions of the sample test.

The findings of the study show that the teachers and students share the same view of the test. They expect the test to be more relevant and more closely related to current learning and teaching practice. Moreover, they wish to improve the test with listening and speaking items that can measure students’ overall ability effectively. The design of the test should be based on the basic principles of test design, and the target language should be evaluated in terms of both language skills and language components. Tests should be revised after their pilot application, and more workshops on test design strategies and test development should be held regularly for teachers and test writers.

4.2 LIMITATION

Because of the time limitation and the ability of the author, this study has some drawbacks. The sample test was carried out on a small population of first-year non-English majors at STTC. In addition, the author finds that the test should be piloted and modified in the future in order to standardize classroom tests at the college.

4.3 SUGGESTION FOR FURTHER STUDY

In order to generate a more comprehensive view of achievement tests at STTC, a number of issues could be taken into account in further study:
- Quantitative and qualitative methods were used, but there are other ways to collect and analyse data that should be employed to ensure the objectivity of the research results.
- More samples can be selected through a more reliable procedure.
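The classification rules described in sections 3.3.5.1 and 3.3.5.2 can be sketched in Python as follows. This is only a minimal illustration and not the ITEMAN procedure used in the study. The function names are hypothetical, the facility values are those listed in Table 7, and the treatment of the boundary values (0.80 counted as too easy, 0.25 counted as acceptable) is an assumption, since the stated scales overlap at those points.

```python
# Facility values (p) for items 1-50, as listed in Table 7.
P_VALUES = [
    0.86, 0.62, 0.80, 0.84, 0.26, 0.72, 0.90, 0.68, 0.82, 0.20,
    0.80, 0.84, 0.78, 0.82, 0.48, 0.86, 0.88, 0.78, 0.89, 0.90,
    0.82, 0.86, 0.74, 0.88, 0.88, 0.02, 0.08, 0.58, 0.26, 0.00,
    0.88, 0.92, 0.92, 0.86, 0.90, 0.80, 0.90, 0.76, 0.92, 0.58,
    0.34, 0.76, 0.70, 0.44, 0.42, 0.74, 0.58, 0.70, 0.76, 0.60,
]


def classify_difficulty(p):
    """Label a facility value using the scale in section 3.3.5.1.

    Assumption: 0.80 counts as "too easy" and 0.25 as "acceptable".
    """
    if p >= 0.80:
        return "too easy"
    if p >= 0.25:
        return "acceptable"
    return "too difficult"


def classify_discrimination(d):
    """Label a discrimination index using the scale in section 3.3.5.2."""
    if d < 0.0:
        return "bad item"
    if d < 0.30:
        return "bad discrimination"
    if d < 0.60:
        return "medium discrimination"
    return "good discrimination"


def tally(labels):
    """Count how many times each label occurs."""
    counts = {}
    for label in labels:
        counts[label] = counts.get(label, 0) + 1
    return counts


difficulty_counts = tally(classify_difficulty(p) for p in P_VALUES)
print(difficulty_counts)
```

Under these boundary assumptions the tally gives 24 items that are too easy, 22 that are acceptable and 4 that are too difficult, which matches the counts of too easy and too difficult items reported in section 3.3.5.1.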
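The descriptive statistics of section 3.3.4 (mean, mode, median, range and standard deviation) can be sketched in Python as follows. This is a rough illustration under stated assumptions: the function names and the small frequency table are hypothetical and are not the study's data, the standard deviation divides by N as in the Sd formula of section 3.3.4.3, and the `kr21` function assumes that Formula 1 is the Kuder-Richardson formula 21, with `n_items` the number of items and the mean and standard deviation taken on the raw-score scale.

```python
import statistics


def expand(freq_table):
    """Turn a {score: frequency} table into a sorted flat list of scores."""
    scores = []
    for score, freq in sorted(freq_table.items()):
        scores.extend([score] * freq)
    return scores


def describe(freq_table):
    """Return the central tendency and dispersion measures of section 3.3.4."""
    scores = expand(freq_table)
    return {
        "n": len(scores),
        "mean": statistics.mean(scores),
        "mode": statistics.mode(scores),
        "median": statistics.median(scores),
        "range": max(scores) - min(scores),
        # Population standard deviation: the sum of squares is divided by N.
        "sd": statistics.pstdev(scores),
    }


def kr21(n_items, mean, sd):
    """Kuder-Richardson formula 21 (assumed to be the study's Formula 1)."""
    return (n_items / (n_items - 1)) * (
        1 - mean * (n_items - mean) / (n_items * sd ** 2)
    )


# Hypothetical frequency table: one student scored 4, two scored 6, one scored 8.
EXAMPLE = {4: 1, 6: 2, 8: 1}
summary = describe(EXAMPLE)
print(summary)

# Hypothetical raw-score values for a 50-item test.
print(kr21(n_items=50, mean=30, sd=8))
```

Working from a frequency table rather than a raw list mirrors the layout of Table 5, and expanding it into a flat list keeps the example compatible with the standard `statistics` functions.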