VIETNAM NATIONAL UNIVERSITY, HANOI
COLLEGE OF FOREIGN LANGUAGES
DEPARTMENT OF POST GRADUATE STUDIES

HOÀNG VĂN SÁU

A STUDY ON THE VALIDITY OF END-TERM ACHIEVEMENT TESTS ON ENGLISH GRADE 12, HIGH SCHOOLS IN NORTHERN VIETNAM

NGHIÊN CỨU TÍNH HIỆU LỰC CỦA CÁC BÀI KIỂM TRA CUỐI KỲ MÔN TIẾNG ANH LỚP 12 TẠI MỘT SỐ TRƯỜNG THPT Ở MIỀN BẮC VIỆT NAM

M.A. THESIS
FIELD: METHODOLOGY
CODE: 60 14 10
SUPERVISOR: DR HA CAM TAM
HA NOI - 2009

TABLE OF CONTENTS

CANDIDATE'S STATEMENT
ACKNOWLEDGEMENTS
ABSTRACT
LIST OF TABLES
Chapter 1: INTRODUCTION
1.1 Rationale of the study
1.2 Scope of the study
1.3 Aims of the study
1.4 Research questions
1.5 Methods of the study
1.6 Organization of the study
Chapter 2: LITERATURE REVIEW
2.1 The relationships of language testing with teaching and learning
2.2 Objective testing
2.3 Achievement tests
2.3.1 Definitions
2.3.2 Final achievement
2.4 Test specification
2.5 Testing language components
2.5.1 Tests of grammar
2.5.2 Test of vocabulary
2.5.3 Test of phonology
2.6 Validity of a test
2.6.1 Definitions and ty
2.6.2 Content validity o
2.6.3 Construct validity
2.7 Objectives and Syllabus contents of English grade 12
2.7.1 Objectives of Eng
2.7.2 Syllabus contents
2.8 Recommended test specification of end-term achievement tests on English grade 12
2.9 Components' contents of end-term achievement tests
2.9.1 Components'
2.9.2 Components'
Chapter 3: THE STUDY
3.1 Research design
3.1.1 Research questions
3.1.2 Informants
3.1.3 Data description
3.2 Analytical framework
3.3 Findings and discussion
3.3.1 Content validity of test samples' components
3.3.1.1 Content validity of phonetic items
3.3.1.2 Content validity of grammar test items
3.3.1.3 Content validity of vocabulary items
3.3.2
Construct validity of the test samples
3.3.2.1 Construct validity of phonetic test items
3.3.2.2 Construct validity of grammar test items
3.3.2.3 Construct validity of vocabulary test items
Chapter 4: CONCLUSION
4.1 Conclusion
4.2 Implications
4.3 Limitations and suggestions for further research
REFERENCES
APPENDIX

LIST OF TABLES

Table 1: Syllabus contents of English grade 12
Table 2: The recommended specification of the end-term achievement tests
Table 3: Components' contents of 1st term achievement tests
Table 4: Components' contents of 2nd term achievement tests
Table 5: Content validity of test samples' components
Table 6: Construct validity of the test samples

CHAPTER 1: INTRODUCTION

1.1 Rationale of the study

In recent decades, English language testing and evaluation has received great interest from educators and researchers worldwide. In Vietnam, because of its important role in education, English testing and evaluation has been a focus in universities and educational institutions through research studies, Master of Arts theses and doctoral theses in methodology, most of which aim to evaluate reliability and validity, the essential and most important characteristics of a test. The rising interest in English testing can be explained by its importance to English teaching and learning. For teaching, testing and evaluation help teachers re-check the effect of the teaching procedure, from which they can reconsider the contents and techniques used in teaching. Through testing, in turn, students can adjust their own learning process in order to get better study results.

A number of previous studies at the College of Foreign Languages, Vietnam National University, have examined the validity of tests, for instance Vu, Ba Linh (2006); Nguyen, Thi Mai Phuong (2008); Tran, Thi Hieu Thuy (2008); Le Thuy Linh (2004); Nguyen, Thi Bich Hong (2008), etc. All of these studies deal with tests at college and university level. However, we found no study on the validity of tests at high schools. The research topics of interest are
often about language skills and techniques in English teaching and learning, for example Lam, Thi Thu Thuy (2008); Đậu, Duy Lịch (2007); Nguyễn Thị Nguyệt (2007), etc. This raises the question of whether high school tests have reliability and validity and, if so, how they could be evaluated.

An important issue in testing and evaluation is the subjective factor of the test-makers. Commonly, tests are written without careful consideration of the match between the contents and objectives of the course and the content and construct of the tests. As a result, many tasks students have to do in the tests do not exist in the course contents, or the test items are unfamiliar or far too difficult for students. Clearly, such tests lack reliability as well as validity, the most important and essential measurement qualities of a test. This can be seen clearly in end-term achievement tests, which examine students' achievements after a term or a course.

For the scope of this research, the end-term achievement tests on English grade 12 at high schools in Northern provinces of Vietnam have been collected and analyzed. Due to limitations of time and research conditions, the end-term achievement tests that had been taken by students and scored could not be collected. That is the reason why the reliability of those tests was not investigated in this study. Only validity, in terms of content validity and construct validity, was taken into consideration.

For the above reasons, the author was encouraged to conduct this study, entitled "A Study on the Validity of End-term Achievement Tests on English Grade 12, High Schools in Northern Vietnam", with the desire of finding out how valid these tests are. Furthermore, the writer hopes that the findings of the study can be applied to improve current testing in high schools. It is also intended to encourage both teachers and learners in the teaching and learning process and to be a
valuable source of reference for test designers.

1.2 Scope of the study

Due to limitations of time and research conditions, the author does not aim to cover all aspects of a good achievement test, such as reliability, validity, discrimination, backwash effects, etc. This study mainly focuses on the construct validity and content validity of the end-term achievement tests on English grade 12 at high schools of some provinces in Northern Vietnam in the 2008-2009 school year. The study presents findings about the construct validity and content validity of those achievement tests and gives suggestions for improving the tests as well as suggestions for further studies.

1.3 Aims of the study

The major aim of the study is to evaluate the validity of the end-term achievement tests on English grade 12 at high schools of some provinces in Northern Vietnam in the 2008-2009 school year, with a special focus on those tests' construct validity and content validity. The specific aims of the study are:
- To study and evaluate the construct validity and content validity of those end-term achievement tests; and
- To identify the strengths and weaknesses of the tests.

1.4 Research questions

In order to achieve these goals, the study is carried out to answer the following research questions:
1- Do the end-term achievement tests on English grade 12 at high schools in some Northern Vietnamese provinces possess content validity?
2- Do those tests possess construct validity?
1.5 Methods of the study

This study combines quantitative and qualitative approaches. First, a quantitative method was employed to collect data from 10 end-term achievement tests on English grade 12 of high schools in some northern provinces of Vietnam. The number of each test's language components that possessed content validity and construct validity was counted and converted into percentages. Then, based on the quantitative statistics, a qualitative method was employed to interpret the data in terms of the content validity and construct validity of the test samples and their components.

1.6 Organization of the study

The thesis is organized into four major chapters:

Chapter 1 is the introduction, which presents initial information such as the rationale, aims, methods, research questions and organization of the study.

Chapter 2 reviews the related literature that provides the theoretical basis for language testing and language evaluation. First, the relationships of language testing with teaching and learning and objective testing are presented. Then achievement tests, test specification, multiple choice questions and testing language components are discussed. Next, the most important theoretical part, validity in terms of content validity and construct validity, is considered in depth. The last parts deal with the objectives and syllabus design of English grade 12, the recommended test specification of end-term achievement tests on English grade 12, and the components' contents of end-term achievement tests.

Chapter 3 is the main part of the study. It presents the research design, containing research questions, data description, informants and analytical framework. Next, the data analysis of construct validity and content validity is discussed. Finally, the findings about content validity and construct validity of the test samples are laid out.

Chapter 4 offers the conclusions that answer the research questions. Some implications are
suggested to improve end-term achievement tests in terms of their construct validity and content validity. The limitations and directions for further research are also mentioned in this final chapter.

CHAPTER 2: LITERATURE REVIEW

This chapter provides an overview of the theoretical background of the research. Firstly, it discusses the relationships of language testing with the teaching and learning process. Then achievement tests, test specification and testing language components are discussed. Next, the most important theoretical part, validity in terms of content validity and construct validity, is considered in depth. The last parts deal with the objectives and syllabus design of English grade 12, the recommended test specification of end-term achievement tests on English grade 12, and the components' contents of end-term achievement tests.

2.1 The relationships of language testing with teaching and learning

Teaching, learning and testing are closely interrelated, so that the existence of and changes in one factor may have considerable effects on the others. Among these three factors, language testing perhaps has the strongest and clearest effects on the teaching and learning process. Heaton (1988:5) expressed the same idea: "Both testing and teaching are so closely interrelated that it is virtually impossible to work either field without being constantly concerned with the other". Heaton (1988:5) also pointed out the importance of testing to the learning process: "Tests may be constructed primarily as devices to reinforce learning and motivate the students or as a mean of assessing the students' performance in the language". Davies (1996:5) also described the importance of language testing: "Properly made English tests can help create positive attitudes toward instruction by giving students a sense of accomplishment and a feeling that the teacher's evaluation of them matches what he has taught them. Good English tests also help
students learn the language by requiring them to study hard, emphasizing course objectives, and showing them where they need to improve".

In terms of teaching, testing helps teachers evaluate how far learners have acquired the target language knowledge and language skills. Bachman (1990:55) shared this point of view when he stated that the fundamental use of testing in an educational program is to provide information for making decisions, that is, to evaluate. However, it is not a simple thing for teachers to receive exact, reliable and valid testing from different test-

CHAPTER 3: THE STUDY

This chapter is the main part of the study. The research design, which covers research questions, data description, informants and analytical framework, is briefly presented first. Secondly, the data analysis for the study is described. Lastly, the author presents the findings of the study.

3.1 Research design

3.1.1 Research questions

The study is carried out to answer the following research questions:
1- Do the end-term achievement tests on English grade 12 at high schools in some Northern Vietnamese provinces possess content validity?
2- Do those tests possess construct validity?
3.1.2 Informants

The study was carried out at high schools in several provinces of northern Vietnam that have similar social and economic conditions. In the places where this study was conducted, social and economic conditions are at a medium level and the living standard is not very high. The demand for English is not as high as in big cities like Hanoi or Ho Chi Minh City. Therefore most students in those places are not fully aware of the important role of English in their future life. This is the reason why they seem to lack both motivation to learn English and English proficiency. Their communicative skills in particular are very weak. Their grammatical competence, however, is better than their other language skills.

Moreover, the English teaching staff in high schools are of uneven quality. Some of them were originally teachers of Russian, and most of those teachers are deeply familiar with traditional teaching methods, such as grammar translation and the P.P.P. approach (Presentation - Practice - Production). Few teachers pursue higher studies. Besides, the inadequate textbook system and the rather low level of application of information technology in teaching have contributed considerably to the disadvantages in teaching English in the light of the communicative approach, the most popular teaching approach nowadays. All of these things have certain impacts on testing in high schools.

3.1.3 Data description

The end-term achievement tests on English grade 12 of high schools in several Northern provinces of Vietnam were chosen as the data sources of the research. With the help of the author's colleagues from different high schools, the 1st term and 2nd term test samples were sent by email or by post. Five 1st term test samples and five 2nd term test samples were chosen for investigation. To ensure confidentiality, objectivity and safety, the five 1st term test samples are numbered from Test 1 to Test 5 and the five 2nd term test samples are numbered
from Test 6 to Test 10. Most of the sample tests were made by teachers of English themselves, but some were designed by local departments of education and training, for instance Test and Test. The time allowance of the tests is usually from forty-five to sixty minutes, depending on the number of test items and the students' ability. Finally, the content of the 2nd term test samples is far more complex than that of the 1st term ones.

3.2 Analytical framework

In order to attain the research aims and answer the research questions, this study was conducted using both quantitative and qualitative methods. First, the author draws on the theoretical discussion of content validity in the previous chapter to analyze the content validity of the test samples. This part is mainly based on Brian K. Lynch's statements on content validity (2003:150). For him, content validity is examined through judgments about the degree of match between the test items and tasks and the ability to be tested. In other words, the match found between test specifications and the items produced from those specifications is evidence of content validity. The content validity of the data is examined by comparing the syllabus design of English grade 12 with the test content to see whether or not the test samples cover the syllabus components of phonetics, grammar and vocabulary.

To assess the construct validity of the tests, we examine the test items' components (phonetics, grammar and vocabulary) to see whether the testing techniques employed can check students' ability to understand and use those language components; if so, the items are said to have construct validity, and if not, they are not. In practice, there are usually four subtests in each test: Phonetics, Grammar and Vocabulary, Reading comprehension, and Writing. The reading comprehension questions are analyzed under vocabulary to show whether or not the techniques used are valid, and the writing questions will
be assessed in terms of their grammar structures to find out whether or not their testing techniques are valid. Then the quantitative method was employed to collect data and determine the percentage of test items that achieved content validity and construct validity, as well as the percentage that did not. Finally, the qualitative method is the major instrument for interpreting the meaning of these percentages as data results.

3.3 Findings and discussions

3.3.1 Content validity of test samples' components

From the comparison between the content of the language components of the 1st term and 2nd term test samples and the tests' contents, the percentage of the 10 test samples' components that achieved content validity is shown in the following table:

Table 5: Content validity of test samples' components

3.3.1.1 Content validity of phonetic items

As can be seen from the above table, the number of valid test contents in the 1st term test samples is higher than in the 2nd term test samples. While Test and Test 5 cover the main phonetic points (pronunciation of letters and stress patterns) in the contents of components of the 1st term, Test 1, and cover only one main phonetic point, the pronunciation of letters. The phonetic test items of the 2nd term test samples, from Test 6 to Test 10, have very low content validity. Test and Test cover old knowledge of the 1st term (pronunciation of letters and stress patterns), while Test 6, Test and Test 10 examine pronunciation only. None of them tests any of the 2nd term contents of components. This may be because it is not easy to design phonetic questions about full and contracted forms of auxiliaries, rhythm or elision, which are the 2nd term contents of components. Another, perhaps more persuasive, reason is that the test makers followed the format of the High School Graduation Examination and College/University
Entrance Exams in English, which mainly test the pronunciation of letters and stress patterns. To conclude, 20% of the tests' phonetic items achieved content validity.

3.3.1.2 Content validity of grammar test items

Generally, the percentage data for grammar items show that the content validity of the grammar items of the 1st term tests is much higher than that of the 2nd term tests. This is because most of the major grammar points of the components' contents of the 1st term tests appear in Test 1, 2, 3, 4 and 5. In particular, Test and Test cover all major grammar points of the components' contents of the 1st term tests. On the other hand, comparing the components' contents of the 2nd term tests with the grammar test items of 2nd term test samples 6, 7, 8, 9 and 10, the author concludes that their content validity is low. Test 6 has the highest share of all, with 45% of its grammar items achieving content validity. This suggests that most test-makers did not pay much attention to the test specification while designing the tests. As a result, main grammar points in the textbook did not appear in the tests, for instance modals in the passive voice, transitive and intransitive verbs, or phrasal verbs. There is too much old grammar knowledge in those tests: approximately 55% in Test 6, 70% in Test 7, 80% in Test 8, 85% in Test 9, and 60% in Test 10. Moreover, a number of typing errors appeared: question 14 in Test (His He), question 37 in Test (missing question mark), question 18 in Test 10 (missing 'a' after 'such'). To sum up, the grammar items of 5 of 10 (50%) test samples achieved content validity.

3.3.1.3 Content validity of vocabulary items

The statistics for vocabulary items show that 70% of the tests' vocabulary items possess content validity, because the topics of the vocabulary items in the test samples are similar to the components' contents of the 1st and 2nd term tests. However, we noticed that the topics of the reading passages of Test and Test are not suitable
with the test specifications' topics. While the reading passage of Test is about a famous writer (Jack London), the reading passage of Test is about robots. The reading passages of Test, the topics of which were the history of movies and reading methods, did not follow the topic list of the 2nd term test components' contents, and we can say that 50% of its vocabulary items in reading passages did not have content validity. In conclusion, the vocabulary items of 7 of 10 (70%) test samples achieved content validity. The rest should be corrected to possess content validity.

3.3.2 Construct validity of the test samples

The construct validity of the test samples was investigated by analyzing the components of the test samples (phonetics, grammar and vocabulary) to see how valid the techniques used to test those components are. The following table shows the shortcomings of some test samples' components with regard to their construct validity.

Table 6: Construct validity of the test samples

3.3.2.1 Construct validity of phonetic test items

In terms of construct validity, the phonetic test items of those tests appear to be valid, because the multiple choice questions can check students' ability to pronounce vowels and recognize stress patterns. The following example is from Test 1:

Choose the word whose underlined part is pronounced differently from the rest.
A. summer
A. increased
A. smile
A. movie
A. cracker

These techniques proved able to evaluate the students' ability to recognize the pronunciation of the ending "ed" and of the vowels learnt. Another example, from Test, checks the students' ability to distinguish the stress patterns of multi-syllable words:

Choose one word which has different stress pattern from the others. Identify your answer by circling the corresponding letter A, B, C or D.
Question 4: A. recently B. probably C. immediately D. usually
Question 5: A. musician B. politician C. violinist D. librarian

It can be seen from Test 6 to Test 10 that the phonetic test items of those test samples are valid in terms of construct validity. Like the 1st term test samples, the 2nd term ones are able to test the students' ability to recognize the pronunciation of vowel sounds and to distinguish the stress patterns of multi-syllable words (except for Test, which tests pronunciation only). This may be because the test makers followed the format of the High School Graduation Examination and College/University Entrance Exams in English, which test only these abilities. In conclusion, 100% of the phonetic items of all the test samples possess construct validity.

3.3.2.2 Construct validity of grammar test items

The author noticed that, in terms of construct validity, most of the grammar test items from Test 1 to Test 5 were generally valid, as most of them employed several techniques that could measure students' ability to understand and use tenses, structures, articles, etc. However, several grammar test items failed to evaluate the students' ability to remember tenses and structures, as only multiple choice and error recognition techniques are used. It would be feasible to add transformation or word formation techniques to strengthen the construct validity of the grammar items of the 1st term test samples.

In the following examples, the grammar questions failed to test students' ability to use direct and indirect speech correctly. Students may not know how to change direct speech into indirect speech, but they might still pick the right choice at random:

Test 5: Choose the correct sentence with the same meaning as the one in italics.
Question 37: "I've arranged to meet them after lunch tomorrow," Mathew said.
A. Mathew said that he arranged to meet them after lunch the next day.
B. Mathew said that he had arranged to meet them after lunch tomorrow.
C. Mathew said that he had arranged to meet them after lunch the next day.
D. Mathew said that he has arranged to meet them after lunch the next day.
Question 38: The teacher asked her class: "Do you remember what you have to do now?"
A. The teacher asked her class if we remembered what we had to do then.
B. The teacher asked her class if they remembered what they had to do then.
C. The teacher asked her class if they remember what they have to do then.
D. The teacher asked her class if they remember what they had to do now.

The problem in the following grammar item is that either option A or option C could be changed. If option A is changed, it becomes "had passed", and if option C is chosen, it becomes "be allowed".

Test 3: Part IV: Writing
Choose the underlined word(s) that must be changed in order to make each of the following sentences correct by circling the corresponding letter A, B, C or D.
37. If he passed the GCSE examination, he would have been allowed to take the entrance examination to the university.

In another example, the test designer should have used the symbol Ø to replace option D "no article", as it takes students time to work out the meaning of "no article".

Test 3: Part II: Grammar and Vocabulary (3.75 points)
Choose from the four options given one best answer to complete each sentence by circling the corresponding letter A, B, C or D.
11. If you don't send the application form on time, you will not be able to take … entrance examination.
A. the B. a C. an D. no article

In the following case, the author thinks that the sentence is too easy for students of grade 12, as it is taught in an earlier grade, so this item is not valid for checking students' language proficiency:

Test 4: Choose the best answer among A, B, C or D that best complete for each sentence:
13. Tom: "How are you today, Jane?" - Jane: "……."
A. I am 20 B. good, I like it C. fine, thanks D. No, thank you

The author noticed that the popular technique of multiple choice questions in Test 7, Test and Test 10 failed to evaluate the students' understanding of using indirect
speech questions, for example questions 19 and 46 in Test 10, question 45 in Test 7 and question 13 in Test. In addition, none of those test samples used sentence transformation to test students' usage of tenses and structures; they used multiple choice questions instead. Moreover, grammar items 46, 47, 48, 49 and 50 of Test cannot measure students' ability to write sentences from cue words. Grammar item 25 of Test 10 failed to check students' ability to understand and use modals in the passive voice because it also checks students' cultural knowledge. Grammar items 47, 48, 49 and 50 of Test 10 cannot examine students' ability in sentence transformation or their understanding and use of relative clauses; grammar item 47 of Test failed to examine students' understanding and use of the passive voice; and questions 6, 16, 22, 23, 26 and 28 of Test cannot evaluate the students' ability in sentence transformation and sentence combination.

All test samples used the error recognition technique to test grammar structures, and these items are quite valid, as they can test students' ability to use and recognize the grammar structures. In short, Test 2, Test and Test, which make up 30% of all the test samples, are valid in terms of construct validity, while the rest (70%) should be edited to possess construct validity.

3.3.2.3 Construct validity of vocabulary test items

The vocabulary test items from Test 1 to Test 5 can be said to possess construct validity. However, the contexts of some questions are not clear enough to decide which option is the best choice. For example:

Test 1: II. Choose the best answer A, B, C or D to complete the sentences.
14. She felt …… because she couldn't answer the last question.
A. sadness B. embarrassing C. disappointed D. shameful

In this case, B, C or D might be acceptable answers to this question. In our view, the item should be revised as follows:

14. She felt …… because she couldn't answer the last question.
A. sadness B. happy C. disappointed D. satisfied

However,
vocabulary items 26, 27, 28, 29 and 30 of Test are able to check the students' ability to understand and use word formation, and vocabulary items 31 to 40 help to check students' reading comprehension. They are all said to possess construct validity.

Vocabulary questions from Test 6 to Test 10 are also valid in terms of construct validity. Several techniques were used to test the students' understanding and use of new words or phrases, among them multiple choice questions, word formation and gap-fill. However, the analysis showed that those tests relied almost entirely on multiple choice questions and error recognition to test students' ability to use vocabulary. It would be better to test students' ability to understand and use vocabulary with the word-formation technique. The gap-fill technique for passages, which is used in Test and Test 10, seems good, but it would be better if its multiple choice questions were replaced by matching. To sum up, 100% of the tests' vocabulary items achieved construct validity.

CHAPTER 4: CONCLUSION

In this part, the author concludes the study. Some implications of the study are also given. Last but not least, limitations and suggestions for further studies are presented.

4.1 Conclusion

This study has clarified the basic notions of testing and evaluation in education, focusing on the construct validity and content validity of tests. Through the analysis of the test samples, answers to the two research questions were found.

The 1st research question is answered as follows: the statistics showed that the content validity of the test samples' components is very low. Only 20% of phonetic items (Test and Test 5) and 50% of grammar items (Test 1, 2, 3, 4, 5) attained content validity, while the share of vocabulary questions that achieved content validity is 80% (except for Test and Test 3). This once more supports the author's view that the contents of the tests did not go
with the syllabus design of the textbook. Therefore, the answer to research question 1 is: only Test and Test 5, which make up 20% of the test samples, possess content validity.

Secondly, the answer to the 2nd research question is as follows: from the analysis and comparison of the syllabus design and test specifications with the 10 collected test samples, it was found that 100% of the vocabulary test items of all the test samples have construct validity, and 100% of the phonetic items also possess construct validity. Only 30% of the grammar items did not achieve construct validity, as they failed to check students' ability to understand and use tenses and difficult structures, while the remaining 70% are valid. It can be concluded that all test samples (except for Test 7, Test and Test 10) possess construct validity.

4.2 Implications

From these findings, the following recommendations are made to improve the construct validity and content validity of the test samples:

- To improve the construct validity of the tests, at least two or three testing techniques should be employed to measure pupils' ability to use and produce sounds, grammar structures and vocabulary. For example, to test grammar structures, multiple choice should be used together with the word formation technique or other alternatives such as sentence transformation and sentence building. Both objective tests and essay tests have their own advantages and problems. However, a combination of these two approaches could give a more accurate and effective assessment of students' ability.

- To improve the content validity of the tests, there should be common test specifications (or test formats) that reflect the main points of the syllabus design and are issued by the educational authorities, because these are final achievement tests. Besides, advanced knowledge (about 10-30%) should be included to help students prepare for later examinations.

- As can be seen in the objectives of the English Grade
12, at the end of this grade students are expected to be able to use the English they have learnt to practise the four skills: listening, speaking, reading comprehension and writing. However, the present achievement tests seem to use the old test format, which tests mainly grammar and vocabulary components. Moreover, no test sample checks students' listening and speaking skills. Our recommendation here is that there should be test items for listening and speaking skills to assess what students have learnt.
- When designing a test in general, test-makers should draw up a detailed specification, based on the objectives and syllabus contents of the course, that sets out clear and detailed requirements, for example which contents of phonetics, grammar structures and vocabulary must be included. In accordance with those requirements, a marking scale should be built that is clearer and fairer for students, especially for tests of productive skills such as speaking or writing tests. The author thinks that these detailed specifications and marking scales are not only necessary for teachers in high schools but essential for every level of education.

4.3 Limitations and suggestions for further research

In spite of the author's efforts to carry out this study, there are unavoidable limitations due to the scope of a minor thesis and the writer's own knowledge and ability. Firstly, only construct validity and content validity were investigated, while many other important criteria of a test's quality, for instance reliability, face validity, backwash effect and practicality, were not evaluated. Secondly, the main instrument used to study the tests' construct validity was testing techniques alone, while other instruments such as levels of difficulty and the intercorrelation of the tests' components were not considered. Lastly, only test samples were collected for this study, which led to a lack of reliability and face validity.

From the above limitations, suggestions for further research are as follows:

- More
instruments for the study should be used to enhance the accuracy of the assessment of the tests' components, for example teachers' and students' questionnaires about the tests, test scores, etc.
- Other essential criteria of these tests, such as reliability, face validity and backwash, should be investigated.
- The capacity of test designers and the relationship between current teaching practice and testing should also be studied.

Last but not least, this research reflects only the writer's own ideas, based on his teaching experience and reading of the theoretical literature, so its results must be analysed and evaluated carefully by test designers and testing experts nationwide. Nevertheless, we hope that this study will be a valuable source of reference for those who are concerned with achievement test design for English grade 12. It is also hoped that this minor thesis will contribute to the improvement of language testing at high schools in the North of Vietnam.

APPENDICES

THE TEST SAMPLES