
An Investigation into the Cut-Score Validity of the VSTEP.3-5 Listening Test


DOCUMENT INFORMATION

Pages: 208
File size: 1.88 MB

Contents

VIETNAM NATIONAL UNIVERSITY, HANOI
UNIVERSITY OF LANGUAGES AND INTERNATIONAL STUDIES
******

NGUYỄN THỊ QUỲNH YẾN

DOCTORAL DISSERTATION

AN INVESTIGATION INTO THE CUT-SCORE VALIDITY OF THE VSTEP.3-5 LISTENING TEST
(A study validating the cut scores of the Listening test of the English proficiency assessment from Level 3 to Level 5 under the six-level Foreign Language Proficiency Framework for Vietnam)

MAJOR: ENGLISH LANGUAGE TEACHING METHODOLOGY
CODE: 9140231.01

SUPERVISORS: PROF. NGUYỄN HÒA
             PROF. FRED DAVIDSON

HANOI, 2018

This dissertation was completed at the University of Languages and International Studies, Vietnam National University, Hanoi.

This dissertation was defended on 10th May 2018.

This dissertation can be found at:
- National Library of Vietnam
- Library and Information Center, Vietnam National University, Hanoi

DECLARATION OF AUTHORSHIP

I hereby certify that the thesis I am submitting is entirely my own original work except where otherwise indicated. I am aware of the University's regulations concerning plagiarism, including those regulations concerning disciplinary actions that may result from plagiarism. Any use of the works of any other author, in any form, is properly acknowledged at their point of use.

Date of submission: _
Ph.D. Candidate's Signature: _

I certify that I have read this dissertation and that, in my opinion, it is fully adequate in scope and quality as a dissertation for the degree of Doctor of Philosophy.
_
Prof. Nguyễn Hòa (Supervisor)

I certify that I have read this dissertation and that, in my opinion, it is fully adequate in scope and quality as a dissertation for the degree of Doctor of Philosophy.
_
Prof. Fred Davidson (Co-supervisor)

TABLE OF CONTENTS

LIST OF FIGURES
LIST OF TABLES
LIST OF KEY TERMS
ABSTRACT
ACKNOWLEDGMENTS

CHAPTER I: INTRODUCTION
1 Statement of the problem
2 Objectives of the study
3 Significance of the study
4 Scope of the study
5 Statement of research questions
6 Organization of the study

CHAPTER II: LITERATURE REVIEW
1 Validation in language testing
  1.1 The evolution of the concept of validity
  1.2 Aspects of validity
  1.3 Argument-based approach to validation
2 Standard setting for an English proficiency test
  2.1 Definition of standard setting
  2.2 Overview of standard setting methods
  2.3 Common elements in standard setting
    2.3.1 Selecting a standard-setting method
    2.3.2 Choosing a standard setting panel
    2.3.3 Preparing descriptions of performance-level descriptors
    2.3.4 Training panelists
    2.3.5 Providing feedback to panelists
    2.3.6 Compiling ratings and obtaining cut scores
    2.3.7 Evaluating standard setting
  2.4 Evaluating standard setting
    2.4.1 Procedural evidence
    2.4.2 Internal evidence
    2.4.3 External evidence
      2.4.3.1 Comparisons to other standard-setting methods
      2.4.3.2 Comparisons to other sources of information
      2.4.3.3 Reasonableness of cut scores
3 Testing listening
  3.1 Communicative language testing
  3.2 Listening construct
4 Statistical analysis for a language test
  4.1 Statistical analysis of multiple choice (MC) items
  4.2 Investigating reliability of a language test
5 Review of validation studies
  5.1 Review of validation studies on standard setting
  5.2 Review of studies employing argument-based approach in validating language tests
Summary

CHAPTER III: METHODOLOGY
1 Context of the study
  1.1 About the VSTEP.3-5 test
    1.1.1 The development history of the VSTEP.3-5 test
    1.1.2 The administration of the VSTEP.3-5 test in Vietnam
    1.1.3 Test takers
    1.1.4 Test structure and scoring rubrics
    1.1.5 The establishment of the cut scores
  1.2 About the VSTEP.3-5 listening test
    1.2.1 Test purpose
    1.2.2 Test format
    1.2.3 Performance standards
    1.2.4 The establishment of the cut scores of the VSTEP.3-5 listening test
2 Building an interpretive argument for the VSTEP.3-5 listening test
3 Methodology
  3.1 Research questions
  3.2 Description of methods of the study
    3.2.1 Analysis of the test tasks and test items
      3.2.1.1 Analysis of test tasks
      3.2.1.2 Analysis of test items
    3.2.2 Analysis of test reliability
    3.2.3 Validation of cut scores
      3.2.3.1 Procedural
      3.2.3.2 Internal
      3.2.3.3 External
  3.3 Description of Bookmark standard setting procedures
  3.4 Selection of participants of the study
    3.4.1 Test takers of early 2017 administration
    3.4.2 Participants for Bookmark standard setting method
  3.5 Descriptions of tools for data analysis
    3.5.1 Text analyzing tools
      3.5.1.1 English Profile
      3.5.1.2 Readable.io
    3.5.2 Speech rate analyzing tool
    3.5.3 Statistical analyzing tools
      3.5.3.1 WINSTEPS (3.92.1)
      3.5.3.2 Iteman 4.3
Summary

CHAPTER IV: DATA ANALYSIS
1 Analysis of the test tasks and test items
  1.1 Analysis of the test tasks
    1.1.1 Characteristics of the test rubric
    1.1.2 Characteristics of the input
    1.1.3 Relationship between the input and response
  1.2 Analysis of the test items
    1.2.1 Overall statistics of item difficulty and item discrimination
    1.2.2 Item analysis
2 Analysis of the test reliability
3 Analysis of the cut scores
  3.1 Procedural evidence
  3.2 Internal evidence
  3.3 External evidence

CHAPTER V: FINDINGS AND DISCUSSIONS
1 The characteristics of the test tasks and test items
2 The reliability of the VSTEP.3-5 listening test
3 The accuracy of the cut scores of the VSTEP.3-5 listening test

CHAPTER VI: CONCLUSION
1 Overview of the thesis
2 Contributions of the study
3 Limitations of the study
4 Implications of the study
5 Suggestions for further research

LIST OF THESIS-RELATED PUBLICATIONS
REFERENCES
APPENDIX 1: Structure of the VSTEP.3-5 test
APPENDIX 2: Summary of the directness and interactiveness between the texts and the questions of the VSTEP.3-5 listening test
APPENDIX 3: Consent form (workshops)
APPENDIX 4: Agenda for Bookmark standard-setting procedure
APPENDIX 5: Panelist recording form
APPENDIX 6: Evaluation form for standard-setting participants
APPENDIX 7: Control file for WINSTEPS
APPENDIX 8: Timeline of the VSTEP.3-5 test administration
APPENDIX 9: List of the VSTEP.3-5 developers

LIST OF FIGURES

Figure 2.1: Model of Toulmin's argument structure (1958, 2003)
Figure 2.2: Sources of variance in test scores (Bachman, 1990)
Figure 2.3: Overview of interpretive argument for ESL writing course placements
Figure 4.1: Item map of the VSTEP.3-5 listening test
Figure 4.2: Graph for item 2
Figure 4.3: Graph for item 3
Figure 4.4: Graph for item 6
Figure 4.5: Graph for item 13
Figure 4.6: Graph for item 14
Figure 4.7: Graph for item 15
Figure 4.8: Graph for item 19
Figure 4.9: Graph for item 20
Figure 4.10: Graph for item 28
Figure 4.11: Graph for item 34
Figure 4.12: Total score for the scored items

APPENDIX 1: STRUCTURE OF THE VSTEP.3-5 TEST

Listening
  Time: About 40 minutes, including time to transfer answers to the answer sheet.
  Number of questions/tasks: 3 parts, 35 multiple-choice questions (MCQs).
  Question/task types: Test takers listen to short exchanges, instructions, announcements, conversations, talks and lectures, then answer the MCQs printed in the test paper.
  Purpose: To test different listening subskills, with difficulty ranging from Level 3 to Level 5: listening for detailed information, listening for main ideas, understanding the speaker's opinions and purpose, and inferring from the information given.

Reading
  Time: 60 minutes, including time to transfer answers to the answer sheet.
  Number of questions/tasks: 4 reading passages, 40 MCQs.
  Question/task types: Test takers read texts on different topics, with difficulty equivalent to Levels 3-5 and a total length of 1,900 to 2,050 words, then answer the MCQs that follow each text.
  Purpose: To test different reading subskills, with difficulty ranging from Level 3 to Level 5: reading for detailed information, reading for main ideas, understanding the author's opinions and attitudes, inferring from the information given, and guessing word meaning from context.

Writing
  Time: 60 minutes.
  Number of questions/tasks: 2 writing tasks.
  Question/task types:
    Task 1: Write a letter or email of about 120 words. Task 1 accounts for 1/3 of the total Writing score.
    Task 2: Write an essay of about 250 words on a given topic, drawing on the test taker's own knowledge and experience to support the argument. Task 2 accounts for 2/3 of the total Writing score.
  Purpose: To test interactive writing and productive writing skills.

Speaking
  Time: 12 minutes.
  Number of questions/tasks: 3 parts: Social interaction, Solution discussion, Topic development.
  Question/task types:
    Part 1: Social interaction. Test takers answer 3-6 questions on different topics.
    Part 2: Solution discussion. Test takers are given a situation and a set of proposed solutions. They must say which solution is best and argue against the remaining ones.
    Part 3: Topic development. Test takers speak on a given topic, using the ideas provided and/or developing their own. This part ends with some discussion questions on the topic.
  Purpose: To test different speaking skills: interaction, discussion, and presenting an issue.

APPENDIX 2: SUMMARY OF THE DIRECTNESS AND INTERACTIVENESS BETWEEN THE TEXTS AND THE QUESTIONS OF THE VSTEP.3-5 LISTENING TEST OF EARLY 2017

Relationship between input and response

Question  Subskill tested       Information in the text
1         Specific information  Number of the buildings
2         Specific information  The thing that the man has to pay for
3         Specific information  Different degrees of temperature
4         Specific information  The special feature of the music magazine
5         Specific information  What Facebook offers in the next month
6         Specific information  True information about this year's music festival
7         Specific information  Max's opinion about a product
8         Inferencing           Opinion of the man
9         Specific information  How Alex contacts the customers
10        Specific information  The things that Alex has to do with customers
11        Inferencing           The part of the job Alex finds most difficult at first
12        Main idea             The main idea of the conversation
13        Specific information  The true information about late night shoppers
14        Specific information  Feature of some discounted products
15        Inferencing           Mrs Green's opinion about late night shopping
16        Main idea             The best title for the interview
17        Specific information  The aspect of the job Dan enjoys the most
18        Specific information  Dan's opinion about his special Chilean student
19        Inferencing           Inference about the babies' Learn-to-Swim program
20        Main idea             Which information is not the topic for discussion
21        Specific information  Harry's main task during the trip
22        Specific information  The effects of his trip
23        Specific information  Things that Harry advises applicants to do
24        Inferencing           Harry's opinion about the cost of the gap year
25        Main idea             The best title for the talk
26        Specific information  Number of people who died during the Great Smog
27        Specific information  The information which is not mentioned about photochemical smog
28        Inferencing           Sharon's implication about the Beijing city's plan
29        Specific information  What Sharon advises the audience to do
30        Main idea             Main idea of the talk
31        Specific information  Meaning of Jen in Confucius's writing
32        Specific information  The false information about achieving nirvana
33        Inferencing           The implication about the Taoist tradition
34        Specific information  The thing that Aristotle's principle of moderation argues for
35        Main idea             Main topic of the lecture

APPENDIX 3: CONSENT FORM (Workshops)

PROJECT TITLE: "An Investigation into the Cut-score Validity of the Listening Section of Vietnamese Standardized Test of English Proficiency – VSTEP.3-5"

Name of PhD candidate: _

I understand that this project is for research purposes, and is undertaken by a PhD candidate in the Faculty of Post-graduate Studies, at the University of Languages and International Studies, Vietnam National University.

I understand the nature of the project and what is expected of me, and I consent to participate on that basis. I acknowledge that:
a) I have read the written information about the project and have received a copy of that information;
b) I have received an adequate explanation of all likely risks, effects, discomforts or inconvenience arising from participation in the project;
c) My participation is voluntary. I have the right to withdraw from participation at any time and I have the right to withdraw any data I have supplied (up to the point of analysis/publication);
d) My ideas contributed in the workshops can be recorded in both written and sound format;
e) The researcher will keep this signed copy of my consent form;
f) I am satisfied that the confidentiality of the information I have provided will be safeguarded, subject to any legal limitations;
g) I will not be identified in any publication arising from the research. Any information used in this research that might identify me will be changed and a pseudonym will be used to protect my identity.

Participant signature: _
Date: _

APPENDIX 4: AGENDA FOR BOOKMARK STANDARD-SETTING PROCEDURE

DAY 1

Section 1 (morning)
8:00   Registration
8:15   Introductions & completion of security forms
8:30   Background and overview
9:00   Test administration, test scoring and discussion
10:00  Break
10:15  Review of performance level descriptors
11:30  Adjourn

Section 2 (afternoon)
1:00   Introduction to Bookmark procedure
1:30   Practice round & evaluation of readiness
2:30   Questions and answers
2:45   Break
3:00   Introductions for Round 1
3:15   Round 1
4:45   Wrap-up
5:00   Adjourn

DAY 2

Section 1 (morning)
8:00   Review of Round 1 results
9:00   Round 2
10:30  Break
10:45  Discussion of Round 2 results
11:30  Adjourn

Section 2 (afternoon)
1:00   Round 3
2:30   Final recommendations
3:00   Closure and evaluation
4:00   Adjourn

APPENDIX 5: PANELIST RECORDING FORM

Panelist number:

Directions: Enter your Bookmark page number for each performance level in the spaces below.

                      Unrated/Level 3 (A2/B1)  Level 3/Level 4 (B1/B2)  Level 4/Level 5 (B2/C1)
ROUND 1  Page number
ROUND 2  Page number
ROUND 3  Page number
Cut scores
% at or above

Notes:

APPENDIX 6: EVALUATION FORM FOR STANDARD SETTING PARTICIPANTS

Directions: Please indicate your level of agreement with each of the following statements and add any additional comments you have on the process at the bottom of this page. Thank you.

Response scale for each statement: Strongly disagree / Disagree / Agree / Strongly agree

No.  Statement
1    The orientation provided me with a clear understanding of the purpose of the meeting.
2    The task was clearly explained.
3    The training and practice exercises helped me understand how to perform the task.
4    Taking the test helped me to understand the assessment.
5    The performance level descriptions were clear and useful.
6    The large and small group discussion aided my understanding of the process.
7    There was adequate time provided for discussions.
8    There was an equal opportunity for everyone in my group to contribute his/her ideas and opinions.
9    I was able to follow the instructions and complete the rating sheets accurately.
10   The discussions after the first round of ratings were helpful to me.
11   The discussions after the second round of ratings were helpful to me.
12   The information showing the distribution of examinee scores was helpful to me.
13   I am confident about the defensibility and appropriateness of the final recommended cut scores.
14   Comments:

APPENDIX 7: CONTROL FILE FOR WINSTEPS

&INST
Title= "Data Nghe de chay du lieu.xlsx"
; Excel file created or last modified: 7/31/2017 3:51:17 PM
; input
; Excel Cases processed = 1562
; Excel Variables processed = 36
DATA = "D:\Yen\DHNN-DHQGHN\NCS\Luan an TS\Winstep listening 25.3.17\data file new.txt"
ITEM1 = 1 ; Starting column of item responses
NI = 35 ; Number of items
NAME1 = 37 ; Starting column for person label in data record
NAMLEN = 6 ; Length of person label
XWIDE = 1 ; Matches the widest data value observed
CODES = 01 ; matches the data
TOTALSCORE = Yes ; Include extreme responses in reported scores
; Person Label variables: columns in label: columns in line
@Person = 1E6 ; $C37W6
&END ; Item labels follow: columns in label
i1-B1.1 ; Item 1 : 1-1
i2-B1.1 ; Item 2 : 2-2
i3-B1.1 ; Item 3 : 3-3
i4-B1.1 ; Item 4 : 4-4
i5-B1.2 ; Item 5 : 5-5
i6-B1.3 ; Item 6 : 6-6
i7-B2.1 ; Item 7 : 7-7
i8-B2.1 ; Item 8 : 8-8
i9-B1.1 ; Item 9 : 9-9
i10-B1.2 ; Item 10 : 10-10
i11-B1.2 ; Item 11 : 11-11
i12-B1.2 ; Item 12 : 12-12
i13-B1.3 ; Item 13 : 13-13
i14-B1.3 ; Item 14 : 14-14
i15-B2.2 ; Item 15 : 15-15
i16-B2.1 ; Item 16 : 16-16
i17-C1.1 ; Item 17 : 17-17
i18-C1.1 ; Item 18 : 18-18
i19-C1.2 ; Item 19 : 19-19
i20-B2.3 ; Item 20 : 20-20
i21-B1.3 ; Item 21 : 21-21
i22-B1.3 ; Item 22 : 22-22
i23-B2.1 ; Item 23 : 23-23
i24-B2.1 ; Item 24 : 24-24
i25-B2.2 ; Item 25 : 25-25
i26-B2.2 ; Item 26 : 26-26
i27-B2.3 ; Item 27 : 27-27
i28-C1.1 ; Item 28 : 28-28
i29-C1.2 ; Item 29 : 29-29
i30-B2.3 ; Item 30 : 30-30
i31.-C1.1; Item 31 : 31-31
i32-C1.2 ; Item 32 : 32-32
i33.C1.3 ; Item 33 : 33-33
i34-C1.3 ; Item 34 : 34-34
i35-B2.3 ; Item 35 : 35-35

APPENDIX 8: THE TIMELINE OF THE VSTEP.3-5 TEST ADMINISTRATION

Morning session
- 7:35 am: Test papers and Reading answer sheets were delivered
- 7:45 am: The Reading test began
- 8:45 am: The Reading test ended and the answer sheets were collected
- 8:50 am: The Listening answer sheets were delivered and the volume was tested
- 8:55 am: The Listening test began
- 9:40 am: The Listening test ended and the answer sheets were collected (test takers were given time to transfer their answers to the answer sheets)
- 9:40 am - 9:55 am: Break
- 9:55 am: The Writing papers were delivered
- 10:00 am: The Writing test began
- 11:00 am: The Writing test ended

Afternoon session: Speaking test

APPENDIX 9: LIST OF THE VSTEP.3-5 DEVELOPERS

Nguyễn Hòa (Prof.)
Đỗ Tuấn Minh (Ph.D.)
Huỳnh Anh Tuấn (Ph.D.)
Đỗ Thanh Hà (Ph.D.)
Trần Hoài Phương (Ph.D.)
Nguyễn Thị Ngọc Quỳnh (Ph.D.)
Đặng Thu Trang (M.A.)
Nguyễn Huyền Minh (M.A.)
Nguyễn Thúy Lan (M.A.)
Nguyễn Thị Mai Hữu (M.A.)
Nguyễn Thị Quỳnh Yến (M.A.)
Vũ Minh Huyền (M.A.)
Lại Thị Phương Thảo (M.A.)
Vũ Đoàn Phương Thảo (M.A.)
Phạm Thị Thanh Thủy (M.A.)
Nguyễn Huy Hoàng (B.A.)
Nguyễn Thanh Thủy (B.A.)
Nguyễn Thị Thu Hằng (M.A.)
Nguyễn Lê Hường (M.A.)

... establishment of the cut scores of the test or suggest the adjustment for them. Scope of the study: In the current context of English testing and assessment in Vietnam, the cut scores of the VSTEP.3-5 listening ...

... administer the VSTEP.3-5 test (hereinafter referred to as the VSTEP.3-5 listening test). Based on the argument that in order for the cut scores of the VSTEP.3-5 listening test to be valid, (1) the test ...
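
The WINSTEPS control file in Appendix 7 describes a fixed-width data file: 35 dichotomous item responses (CODES = 01) followed by a person label that starts in column 37. As a rough illustration of that layout, the sketch below parses such records and computes raw total scores and a classical item facility (proportion correct) for each item. It assumes that items start in column 1 with a width of one character and that the label is six characters long, as the `1-1` item columns and the `$C37W6` comment suggest. The function names and the three sample records are invented for illustration; this is not the dissertation's analysis, which was run in WINSTEPS and Iteman.

```python
from statistics import mean

N_ITEMS = 35       # NI = 35 in the control file
ITEM_START = 1     # assumed: first item response sits in column 1 (1-based)
LABEL_START = 37   # NAME1 = 37 in the control file
LABEL_LEN = 6      # assumed from the "$C37W6" comment


def parse_record(line):
    """Split one fixed-width record into (person_label, list of 0/1 scores)."""
    items = line[ITEM_START - 1:ITEM_START - 1 + N_ITEMS]
    if len(items) != N_ITEMS or set(items) - {"0", "1"}:
        raise ValueError(f"bad item string: {items!r}")
    label = line[LABEL_START - 1:LABEL_START - 1 + LABEL_LEN].strip()
    return label, [int(c) for c in items]


def item_facility(score_rows):
    """Proportion of test takers who answered each item correctly."""
    return [mean(column) for column in zip(*score_rows)]


# Invented records for illustration only (not data from the study):
# 35 item responses, one filler column, then a six-character label.
sample = [
    "1" * 35 + " " + "P00001",
    "1" * 20 + "0" * 15 + " " + "P00002",
    "0" * 35 + " " + "P00003",
]

parsed = [parse_record(record) for record in sample]
totals = {label: sum(scores) for label, scores in parsed}
facility = item_facility([scores for _, scores in parsed])

print(totals)
print(round(facility[0], 2), round(facility[34], 2))
```

Raw totals and facility values of this kind are only a sanity check on the data file; they do not replace the Rasch estimates that WINSTEPS produces from the same records.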

Posted: 2 November 2019, 06:23
