
An evaluation of the validity of English tests for grade 10 students at some upper secondary schools in central and northern Viet Nam, from Ha Tinh to Ha Nam


Vietnam National University, Ha Noi
College of Foreign Languages

Nguyễn Thị Hoàng Lân

An Evaluation on the Validity of English Tests Used for English 10 at Some Higher Secondary Schools in the Middle and North of Viet Nam, from Ha Tinh to Ha Nam

Đánh giá tính hiệu lực của các bài kiểm tra tiếng Anh dành cho học sinh lớp 10 ở một số trường THPT miền Trung và miền Bắc Việt Nam, từ Hà Tĩnh đến Hà Nam

MA thesis

Field: Methodology
Code:

Supervisor: Dr. Hà Cẩm Tâm

Ha Noi, 2009


Table of contents

Introduction

6 Organization of the study

Development

Chapter 1: Literature review

1.1 Basic concepts of testing

1.2.1.1 Kinds of achievement tests
1.2.1.2 Final achievement tests
1.2.1.3 Progress achievement tests

1.3 Characteristics of a good language test

1.3.4.4 Face validity
1.3.4.5 Backwash validity
1.3.4.6 Sources of invalidity

1.4 Test items for phonetics, structures and vocabulary

1.4.2 Language components
1.4.3 The test item types used to evaluate phonetics, structures and vocabulary

1.5 Syllabus objectives on language components

Chapter 2: The study

2.3 Analytical framework for data analysis
2.3.1 Content validity
2.3.2 Construct validity

2.3 Data analysis and discussion
2.3.1 Content validity of the tests used
2.3.1.1 Content validity of 45-minute written tests
2.3.1.2 Content validity of 15-minute written tests
2.3.1.3 Content validity of final written tests
2.3.2 Construct validity of the tests used
2.3.3 Face validity of the tests used

References


List of tables

Table 1: The test item types
Table 2: Syllabus objectives
Table 3: Format of 45-minute tests and final tests
Table 4: Specification for forty-five-minute tests
Table 5: The contents tested in the forty-five-minute tests
Table 6: The contents tested in the fifteen-minute tests
Table 7: Test specification for final written tests
Table 8: The contents of language components in the final tests
Table 9: Specification for language components
Table 10: Teachers' opinions on the investigated tests
Table 11: Testing techniques


1 Rationale of the study

English has played an integral role in the development of science, technology, culture and international relations. This has resulted in a growing demand for English language learning and teaching in many parts of the world. In addition, the worldwide globalization process has confirmed English as the most widely used means of international communication. The need to master English in order to access information and interact with others is growing in many parts of the world. English teaching is undoubtedly the ultimate capacity-building tool.

Fully recognizing the importance of this global language, the Vietnamese Ministry of Education has encouraged and required pupils of Secondary Schools to learn it as a compulsory subject for at least seven years. English is also a compulsory subject in the Higher Secondary graduation examination.

To evaluate and assess the English learning and teaching process, testing is employed as an important and powerful tool. Ever since languages began to be taught in formal settings, the development of tests to assess learners' performance has been an integral part of the language learning and teaching process. Language testing, then, is central to language teaching, and it is widely accepted that testing plays a significant role in the learning and teaching of foreign languages. The main purpose of language testing is to provide opportunities for learning, both for the students who are being tested and for the professionals who are administering the tests. Students can learn from the work they do with the teacher and by themselves in preparation for the tests, from the opportunities arising during the tests for developing what they know and what they can do, and especially from the feedback they receive after the tests, both from their own reflection and from the professionals who have monitored their performance. Teachers can pinpoint strengths and weaknesses in the learned abilities of the students and gain information about the progress the students are making, about what the students are likely to be able to do with the language in a target context, and about what the students know and do not know (both explicitly and implicitly) about the target language. In general, a language test is used to "sample language behaviour and infer general ability in the language learnt" (Brown, D.H., 1994: 252). From the results of the tests, and depending on the kind and purpose of each test, the teacher can infer a certain level of language competence of his students in such areas as grammar, vocabulary and pronunciation, or speaking, listening, writing and reading. Lauwerys and Scanlon (1969) contend in their book that "testing is an important tool in educational research and for programme evaluation, and may even throw light on both the nature of language proficiency and language learning".

"Language testing is a form of measurement. It is so closely related to teaching that we cannot work in testing without being constantly concerned with teaching" (Heaton, 1988:5). Therefore, it is undeniable that testing is the most effective and fastest way to check students' understanding. Besides, thanks to testing, teachers can evaluate the effectiveness of the syllabus in use, including its contents, objectives and methods, and can identify and locate the areas of difficulty that their pupils face in the learning process.

For the past ten years or so, there have been a number of changes in the practice of English teaching in Viet Nam tertiary education. Some concern methodology, with a move from the Grammar-Translation method to the Communicative approach. Some involve course books. Some concern technology, from traditional tape recorders to modern LCD projectors. Some are related to testing. For example, at Higher Secondary Schools in recent years there has been a shift from subjective tests to objective tests, which has had great effects on the teaching and learning process. In testing pupils' progress, teachers therefore tend to design more objective tests, and many mid-term or final tests consist of multiple-choice questions. This is considered good preparation for the university entrance tests, which take the form of multiple-choice questions. However, the problem is that English 10 is one of the three new course books of the Ministry of Education, which focus on improving the four skills of reading, writing, speaking and listening and help students to consolidate their grammar in the Language Focus part. Thus, multiple-choice questions seem to fail to test pupils' progress accurately. The question arising is whether the tests used at High Schools test what students are supposed to acquire according to the objectives of the textbook. This is also one of the major reasons why I carried out the research on validity.

In addition, test researchers and developers have acknowledged that validity is critical for tests and have referred to it as an integral measurement quality, because this quality provides the major justification for using test scores as a basis for making inferences or decisions (Bachman and Palmer, 1996:19). From an educational perspective, both teachers and students should have their voices heard about instructional content, mode of syllabus delivery, and assessment. As analyzed above, validity is an indispensable quality of all good tests. Opinions from test takers and test raters, therefore, are essential to the process of test construction. More importantly, test writers may try in vain to increase the validity of a reliable test because of the features of the test items that constitute it. From the outset of test construction, test validity should therefore be the most essential focus of all. Heaton (1988:60) argued that "face validity can provide not only a quick and reasonable guide but also a balance to too great a concern with statistical analysis." He stated that students' motivation is maintained if a test has good face validity, and that most students will try harder if the test looks sound. Thus, face validity plays a certain role in any test and is also of great concern in this thesis. Moreover, the emphasis on test validity is confirmed by Hughes (1989): "the greater a test's content validity is, the more likely it is to be an accurate measure of what it is to measure." To put it another way, if major areas in the test specification are not identified or not represented, the test is said to be inaccurate. Furthermore, such an inaccurate test is likely to have a harmful backwash effect, because areas that are not tested will probably be ignored in teaching and learning. Bachman (1990: 289) also insists that "the most important quality to consider in the development, interpretation and use of language tests is validity, which has been described as a unitary concept related to the adequacy and appropriateness of the way we interpret and use test scores." In general, the reasons discussed here are a strong impetus for this thesis to investigate the validity of the achievement tests at Higher Secondary Schools from Ha Tinh to Ha Nam.

Some studies have been carried out in particular schools to design an English achievement test for 10th form pupils as a case study, such as the study by Ta Thi Minh Hien (2005). However, there has not been any study investigating and evaluating the tests used for 10th form pupils at High Schools in the Middle and the North of Viet Nam, although it is undeniable that good evaluation of tests can help us measure pupils' skills and knowledge more accurately. For example, test analysis can help us remove weak items even before we record the results of the tests.

Another reason for the selection of this research topic lies in the fact that language testing at Higher Secondary Schools has not been paid enough attention. As a teacher, I have been involved in designing, administering and marking many kinds of English tests. Yet I have witnessed neither comprehensive and systematic evaluation of, nor research on, the effectiveness and appropriateness of these tests. No formal discussions or seminars on test construction or test methods have been held. There is no language test item bank and no professional testing committee to judge the quality of the tests and take responsibility for them.

For the above-mentioned reasons, as a learner, a teacher, and a beginning researcher of English, the author has been encouraged to conduct the study entitled "An evaluation on the validity of English tests used for English 10 at Higher Secondary Schools in the middle and north of Viet Nam, from Ha Tinh to Ha Nam" with a view to evaluating the validity of the tests used for pupils at Higher Secondary Schools. It is hoped that the study will benefit the author as well as teachers at Higher Secondary Schools and those who are concerned with language testing in general and English testing techniques at Higher Secondary Schools in particular.

2 Scope of the study

In this study the author intends to focus mainly on the content validity, construct validity and face validity of progress achievement tests, including 15-minute tests and mid-term tests, and of final achievement tests, consisting of final-term tests and final tests, in the school years 2007-2008 and 2008-2009 at 12 high schools in 6 provinces from Ha Tinh to Ha Nam. The results can be seen as the basis for providing some suggestions for test designers as well as raters.


3 Aims of the study

In line with the above reasons, the research has the following aims:

- To assess the validity of the tests used for English 10 at Higher Secondary Schools from the Middle to the North of Viet Nam, focusing on content validity, construct validity and face validity.

- To suggest some implications for designing written English tests in order to improve the teaching and learning of English at Higher Secondary Schools in Viet Nam.

4 Methods of the study

In order to achieve the above aims, the study was carried out with the following approaches.

On the basis of the theory and principles of language testing and the major characteristics of a good test, especially of achievement tests, random samples of tests were analyzed: progress tests (15-minute tests, 45-minute tests and mid-term tests) and final achievement tests (term tests and final tests) from the school years 2007-2008 and 2008-2009 at a number of Higher Secondary Schools from the Middle to the North of Viet Nam. Content validity is evaluated by comparing the test specification, which relies on the syllabus objectives of English 10, with the content tested in the collected tests. Construct validity is assessed against the test specification, which was constructed on the theoretical background of testing and the syllabus of English 10. A survey questionnaire was administered to teachers of the Upper Secondary Schools to investigate their evaluative comments on the face validity of the tests they designed.

Besides critical reading, analysis and questionnaires for data collection, the study made use of other supporting methods such as interviews, informal discussions and exchanges of opinion with teachers and students to gather necessary information about the learning, teaching and testing situations at High Schools.

The methods used in the study are both quantitative and qualitative.

5 Research questions

This study seeks to answer the following research question:

- Do the achievement tests for Higher Secondary School pupils of grade 10 meet the following criteria: content validity, construct validity and face validity?

6 Organization of the thesis

This thesis comprises three parts:

Part one introduces the rationale of the study, the scope, the aims, the methods and the research questions.

Part two is the development of the thesis, which is divided into three chapters.

Chapter one reviews the literature related to language testing (basic concepts, roles, types of testing, criteria of a good test, and test items for reading, writing, grammar and vocabulary).

Chapter two presents the methodology, including the curricula of English 10, the data, the participants and the analytical framework for data analysis (construct validity, content validity, face validity), followed by the results and discussions (construct validity, content validity and face validity of the tests used).

Part three presents the conclusion, comprising the main findings, implications and suggestions for further studies.


Development

Chapter 1: Literature review

This chapter reviews the theories and literature relevant to the topic under investigation in the present study. The chapter starts with the basic concepts of testing, and then the definition and types of achievement tests are reviewed. A brief review of the major characteristics of a good language test is presented, with a major focus on test validity, especially construct, content and face validity. Next, test items for phonetics, structures and vocabulary are discussed. Finally, the curricula of English 10 are presented, with the objectives and the content of English 10.

1.1 Basic concepts of testing

Testing is an essential part of every teaching and learning experience and has become one of the main aspects of methodology. Many researchers have offered definitions of testing from different points of view.

Allen (1974: 313) emphasizes testing as an instrument to ensure that students have a sense of competition, rather than as a way to know how good their performance is and under which conditions a test can take place. He contends that a "test is a measuring device which we use when we want to compare an individual with other individuals who belong to the same group."

Carroll (1968: 46) holds that a psychological or educational test is a procedure designed to elicit certain behavior from which one can make inferences about certain characteristics of an individual. In other words, a test is a measurement instrument designed to elicit a particular behavior from each individual.

Besides, Ibe (1981: 1) points out that "a sample of behavior under the control of specified conditions aims towards providing a basis for performing judgment". The term "a sample of behavior" used here is rather broad and means something other than the traditional paper-and-pencil types of test. Read (1983) shares the same idea, in the sense that "a sample of behavior" suggests that language testing certainly includes listening and speaking skills as well as reading and writing ones.

However, Heaton (1988:5) looks at testing in a different way. In his opinion, tests are a means of assessing students' performance and of motivating students. He looks at tests positively, as many students are eager to take tests at the end of the semester to know how much knowledge they have. Importantly, he points out the relationship between testing and teaching.

Harrison (1986:1) notes that a test can be a natural extension of classroom work, providing teachers and students with useful information that can serve each as a basis for improvement, although it may also be felt as a necessary but unpleasant imposition from outside the classroom. That means a test is a useful tool to measure learners' ability in a certain situation, especially in the classroom.

According to Bachman (1990:20), what distinguishes a test from other types of measurement is that it is designed to obtain a specific sample of behavior. This distinction is believed to be of great importance because it reflects the primary justification for the use of language tests and has implications for how we design, develop and use them to best effect. Thus, language tests can provide the means for more focus on the specific areas of interest.

Brown (1994:252) states that "A test, in plain or ordinary words, is a method of measuring a person's ability or knowledge in a given area". Moore (1992:138) proposes that evaluation is an essential tool for teachers because it gives them feedback concerning what the students have learned and indicates what should be done next in the learning process. Evaluation helps us to better understand students, their abilities, interests, attitudes and needs, so as to teach more effectively and motivate them. However, Brown (1994:373) stresses that tests are seen by learners as dark clouds hanging over their heads, upsetting them with thunderous anxiety as they anticipate the lightning bolts of questions they do not know and, worst of all, a flood of disappointment if they do not make the grade.

From the above descriptions, although different researchers hold different points of view on testing, in short, testing is an effective means of measuring and assessing students' language knowledge and skills. It is of great use to both language teaching and learning.

1.2 Achievement tests

Just as there are many purposes for which language tests are developed, so there are many types of language tests. Some types of tests serve a variety of purposes, while others are more restricted in their applicability. The tests collected were designed on the basis of the textbook English 10 and were intended to assess pupils' progress; therefore, in this part the definition and kinds of achievement tests are presented.

1.2.1 Definition

Achievement tests are defined differently depending on researchers' points of view. Hughes (1990:10) held that "achievement tests are directly related to language courses, their purpose being to establish how successful individual students, groups of students, or the courses themselves have been in achieving objectives." Achievement tests are usually carried out at the end of a course on the group of learners who took it. Brown (1994:259) also suggests that "an achievement test is related directly to classroom lessons, units or even total curriculum." Achievement tests, in his view, "are limited to a particular material covered in a curriculum within a particular time frame". Another comment, offered by Finocchiaro and Sako (1983:`5), is that achievement tests or attainment tests are widely employed in many language teaching institutions. They are used to measure the degree of control of discrete language and cultural items and of integrated language skills acquired by the students within a specific period of instruction in a specific course. Harrison (1983:7) states that "an achievement test looks back over a longer period of learning than the diagnostic test, for example, a year's work, or the whole course, or even a variety of different courses." He also states that achievement tests are intended to show the standard which the pupils have reached in relation to other pupils at the same level. Taken together, these definitions show that achievement tests are directly related to language courses: their purpose is to establish how successful students, courses or the teaching itself have been in achieving the objectives stated beforehand (in the program of the course, for example).

In short, achievement tests play a crucial role in school programs, especially in evaluating the language knowledge and skills students have acquired during the course, and they are widely used at different school levels.

1.2.2 Kinds of achievement tests

Achievement tests can be subdivided into final achievement tests and progress achievement tests, classified according to the time of administration and the intended objectives.

1.2.2.1 Final achievement tests

Final achievement tests are administered at the end of a course, and their purpose is to measure the achievement of the course as a whole. These tests may be written and administered by ministries of education, official examining boards, or members of teaching institutions. Obviously, the content of these tests must be related to the courses with which they are concerned, but the nature of this relationship is a matter of disagreement amongst language testers.

According to some testing experts, the content of a final achievement test should be based directly on a detailed course syllabus or on the books and other materials used. This is known as the syllabus-content approach. It has an obvious appeal, since the test contains only what the pupils are thought to have actually encountered and therefore can be considered, in this respect at least, a fair test. However, this approach has the disadvantage that if the syllabus is badly designed, or the books and other materials are badly chosen, then the results of the test can be very misleading. Successful performance on the test may not truly indicate successful achievement of course objectives.

The alternative approach is to base the test content directly on the objectives of the course, which has a variety of advantages. First, it forces course designers to be explicit about course objectives. This in turn puts pressure on those who are responsible for the syllabus and the selection of books and materials to ensure that these are consistent with the course objectives. Tests based on course objectives work against the perpetuation of poor teaching practice, something which course-content-based tests, almost as if part of a conspiracy, fail to do. I strongly believe that test content based on course objectives is much preferable: it provides more accurate information about individual and group achievement, and is likely to promote a more beneficial backwash effect on teaching.

1.2.2.2 Progress achievement tests

Progress achievement tests are intended to measure the progress students are making in order to plan future work (including remedial work). They are usually administered at the end of a specific unit or lesson. Obviously, these tests should be related to the course objectives. They should make a clear progression towards the final achievement test based on course objectives. Then, if the syllabus and teaching methods are appropriate to these objectives, progress tests based on short-term objectives will fit well with what has been taught. If not, there will be pressure to create a better fit. If it is the syllabus that is at fault, it is the tester's responsibility to make clear that it is there that change is needed, not in the tests.

Moreover, while more formal achievement tests require careful preparation, teachers could feel free to set their own informal tests to make a rough check on pupils' progress and to keep pupils on their toes. Since such tests will not form part of formal assessment procedures, their construction and scoring need not be geared purely towards the intermediate objectives on which the more formal progress achievement tests are based. However, they can reflect the particular "route" that an individual teacher is taking towards the achievement of objectives.

1.3 Characteristics of a good test

In order to make a good test, teachers have to take various factors into consideration, such as the purpose of the test, the content of the syllabus, the pupils' background and so on. In addition to these factors, test characteristics play a very important role in constructing a good test. According to a number of leading scholars in testing, such as Valette (1977), Harrison (1983), Weir (1990), Carroll and Hall (1985), Henning (1987) and Brown (1994), all good tests have four main characteristics, as follows.

1.3.1 Reliability

Reliability is a necessary characteristic of any good test. It is of primary importance in the use of public achievement and proficiency tests as well as classroom tests. An appreciation of the various factors affecting reliability is important for the teacher at the very outset, since many teachers tend to regard tests as infallible measuring instruments and fail to realize that even the best test is indeed a somewhat imprecise instrument with which to measure language skills.

A fundamental criterion against which any language test has to be judged is its reliability. The concern here is with how far we can depend on the results that a test produces. Three aspects of reliability are usually taken into account. The first concerns the consistency of scoring among different markers. The second is the tester's concern with how to enhance the agreement between markers by establishing, and maintaining adherence to, explicit guidelines for the conduct of the marking. The third aspect is parallel-forms reliability, the requirements of which have to be borne in mind when future alternative forms of a test have to be devised.
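As a rough illustration of the first aspect, consistency of scoring among different markers can be checked by counting how often two markers award the same mark to the same script. The sketch below is a minimal example under stated assumptions: the function name `exact_agreement` and the sample marks are hypothetical and are not taken from the tests examined in this thesis, and exact agreement is only one of several possible indices.

```python
def exact_agreement(marks_a, marks_b):
    """Return the proportion of scripts given identical marks by two markers.

    Both arguments are lists of marks, in the same script order.
    """
    if len(marks_a) != len(marks_b):
        raise ValueError("Both markers must mark the same scripts")
    if not marks_a:
        raise ValueError("At least one script is required")
    agreed = sum(1 for a, b in zip(marks_a, marks_b) if a == b)
    return agreed / len(marks_a)


# Hypothetical marks (out of 10) given by two markers to five scripts.
marker_1 = [7, 8, 5, 9, 6]
marker_2 = [7, 7, 5, 9, 5]

rate = exact_agreement(marker_1, marker_2)
print(f"Exact agreement: {rate:.0%}")
```

A low agreement rate would suggest that the marking guidelines need to be made more explicit before the scores are relied upon.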

The concept of reliability is particularly important when considering language tests within the communicative paradigm. Moreover, Davies (1968) stresses that reliability is the first essential for any test, but for certain kinds of language tests it may be very difficult to achieve.

1.3.2 Discrimination

Another important feature of a test is its capacity to discriminate among the different candidates and to reflect the differences in the performances of the individuals in the group. This is true for both teacher-made tests and standardized tests. The extent of the need to discriminate will vary depending on the purpose of the test. In many classroom tests, for example, the teacher will be much more concerned with finding out how well the pupils have mastered the syllabus and will hope for a cluster of marks around the 80 percent and percent brackets. Nevertheless, there may be occasions on which the teacher requires a test to discriminate to some degree in order to assess relative abilities and locate areas of difficulty. In such a test, the items should be spread over a wide range of difficulty levels:

- extremely easy items
- very easy items
- easy items
- fairly easy items
- items below average difficulty level
- items of average difficulty level
- items above average difficulty level
- fairly difficult items
- difficult items
- very difficult items
- extremely difficult items
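One simple way to see how items are spread across difficulty levels is to compute, for each item, the proportion of candidates who answered it correctly (often called the facility value). The sketch below is illustrative only: the function names, the sample scores and the band cut-offs are assumptions made for this example and do not come from the tests analysed in this study.

```python
def facility_values(responses):
    """Return the proportion of correct answers for each item.

    `responses` is a list of candidates, each a list of 0/1 item scores.
    """
    n_candidates = len(responses)
    n_items = len(responses[0])
    return [
        sum(candidate[i] for candidate in responses) / n_candidates
        for i in range(n_items)
    ]


def difficulty_band(facility):
    """Map a facility value to a coarse band (the cut-offs are assumed)."""
    if facility >= 0.8:
        return "easy"
    if facility >= 0.4:
        return "average"
    return "difficult"


# Hypothetical scores: 5 candidates x 4 items (1 = correct, 0 = wrong).
sample = [
    [1, 1, 0, 0],
    [1, 1, 1, 0],
    [1, 0, 1, 0],
    [1, 1, 0, 1],
    [1, 0, 0, 0],
]

values = facility_values(sample)
bands = [difficulty_band(v) for v in values]
for number, (value, band) in enumerate(zip(values, bands), start=1):
    print(f"Item {number}: facility {value:.2f} ({band})")
```

A test meant to discriminate would be expected to show items in several bands rather than a cluster of easy items only.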

1.3.3 Practicability

A test must be practicable; in other words, it must be fairly straightforward to administer. The most obvious practical considerations concerning tests are often overlooked. Firstly, the length of time available for the administration of the test is frequently misjudged even by experienced test writers, especially if the complete test consists of a number of sub-tests. Another practical consideration concerns the answer sheets and the stationery used. The use of answer sheets greatly facilitates marking and is strongly recommended when large numbers of pupils are being tested. The question of practicability is not confined solely to oral tests: such written tests as situational composition and controlled writing tests depend not only on the availability of qualified markers who can make valid judgments concerning the use of language, etc., but also on the length of time available for the scoring of the test. A final point concerns the presentation of the test paper itself. Where possible, it should be printed or typewritten and appear neat, tidy and aesthetically pleasing.

1.3.4 Validity

According to Hughes, A. (1989:22), "A test is said to be valid if it measures accurately what it is intended to measure". The test must aim to provide a true measure of the particular skill which it is supposed to measure. When closely examined, however, the concept of validity reveals a number of aspects, each of which deserves our attention.

1.3.4.1 Content validity

"A test is said to have content validity if its content constitutes a representative sample of the language skills, structures, etc. with which it is meant to be concerned" (Hughes, A., 1989:22). This kind of validity depends on careful analysis of the language being tested and of the particular course objectives. It is obvious that a grammar test, for instance, must be made up of items testing knowledge or control of grammar. But this in itself does not ensure content validity. The test would have content validity only if it included a proper sample of the relevant structures. Just what the relevant structures are will depend, of course, upon the purpose of the test. Therefore, in order to judge whether or not a test has content validity, we need a specification of the skills or structures etc. that it is meant to cover. Such a specification should be made at a very early stage in test construction. It is not to be expected that everything in the specification will always appear in the test; there may simply be too many things for all of them to appear in a single test. But it will provide the test constructor with the basis for making a principled selection of elements for inclusion in the test. A comparison of test specification and test content is the basis for judgments as to content validity.
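The comparison of test specification and test content described above can be sketched in a few lines of code. This is a minimal illustration under stated assumptions, not the procedure used in this thesis: the function name `content_coverage`, the area labels and the sample items are all hypothetical.

```python
from collections import Counter


def content_coverage(specification, test_items):
    """Compare the areas in a test specification with the areas tested.

    `specification` is a set of area names the test is meant to cover;
    `test_items` is a list giving the area tested by each item.
    Returns (covered, missing, unspecified, item_counts).
    """
    counts = Counter(test_items)
    tested = set(counts)
    covered = sorted(specification & tested)      # specified and tested
    missing = sorted(specification - tested)      # specified but not tested
    unspecified = sorted(tested - specification)  # tested but not specified
    return covered, missing, unspecified, counts


# Hypothetical specification and item labels, for illustration only.
spec = {"tenses", "passive voice", "relative clauses", "word stress"}
items = ["tenses", "tenses", "passive voice", "tenses", "articles"]

covered, missing, unspecified, counts = content_coverage(spec, items)
coverage_ratio = len(covered) / len(spec)

print("Covered:", covered)
print("Missing:", missing)
print("Not in specification:", unspecified)
print(f"Share of specified areas tested: {coverage_ratio:.0%}")
```

In this made-up case, half of the specified areas are not tested at all and one area receives three of the five items, which is the kind of imbalance a content validity judgment would look for.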

What is important about content validity? First, the greater a test's content validity, the more likely it is to be an accurate measure of what it is supposed to measure. A test in which major areas identified in the specification are under-represented, or not represented at all, is unlikely to be accurate. Secondly, such a test is likely to have a harmful backwash effect. Areas which are not tested are likely to become areas ignored in teaching and learning.

Anastasi (1982:131) defined content validity as "essentially the systematic examination of the test content to determine whether it covers a representative sample of the behavior domain to be measured". She offers a set of useful guidelines for establishing content validity:

- The behavior domain to be tested must be systematically analyzed to make certain that all major aspects are covered by the test items, and in the correct proportions.

- The domain under consideration should be fully described in advance, rather than being defined after the test has been prepared.

- Content validity depends on the relevance of the individual's test responses to the behavior area under consideration, rather than on the apparent relevance of item content.

The more a test simulates the dimensions of observable performance and accords with what is known about that performance, the more likely it is to have content and construct validity. According to Kelly (1978:8), content validity seems "an almost completely overlapping concept" with construct validity, and for Moller (1982:68), "the distinction between construct and content validity [...] language proficiency."

1.3.4.2 Construct validity

Construct validity is defined by Anastasi (1982:144) as "the extent to which the test may be said to measure a theoretical construct or trait. Each construct is developed to explain and organize observed response consistencies. It derives from established inter-relationships among behavioral measures. Focusing on a broader, more enduring and more abstract kind of behavioral description, construct validation requires the gradual accumulation of information from a variety of sources. Any data throwing light on the nature of the trait under consideration and the conditions affecting its development and manifestations are grist for this validity mill."

Construct validity is viewed from a purely statistical perspective in much of the recent American literature (Bachman and Palmer, 1981a). It is seen principally as a matter of the a posteriori statistical validation of whether a test has measured a construct that has a reality independent of other constructs.

According to Hughes (1989:26), a test, part of a test, or a testing technique is said to have construct validity if it can be demonstrated that it measures just the ability which it is supposed to measure. The word "construct" refers to any underlying ability (or trait) which is hypothesised in a theory of language ability. For example, it can be argued that a speed reading test based on a short comprehension passage is an inadequate measure of reading ability (and thus has low construct validity) unless it is believed that the speed reading of short passages relates closely to the ability to read a book quickly and efficiently and is a proven factor in reading ability.

1.3.4.3 Criterion-related validity

Another approach to test validity is to see how far results on the test agree with those provided by some independent and highly dependable assessment of the candidate's ability. This independent assessment is therefore the criterion measure against which the test is validated. Criterion-related validity consists of two types: concurrent validity and predictive validity.

Concurrent validity is the degree to which a test correlates with other tests testing the same thing. In other words, if a test is valid, it should give a similar result to other measures that are valid for the same purpose. When considering concurrent validity, there are several concerns.

First, the measure that is being used for comparison with the test in question must itself be valid. If the measure is not valid, there is no point in testing another test's validity against it. For instance, teachers' rankings might be used to test validity, but a teacher's ranking may be affected by a number of factors that are not related to the students' actual proficiency. One possible solution is to average the rankings of several teachers to make up for this.

Second, the measure must be valid for the same purpose as the test whose validity is being considered. A reading test cannot be used to test the concurrent validity of a grammar test. In addition, if teachers' rankings are being used, it is essential to make sure that the teachers understand on what basis they are expected to rank the students. If the test being considered is a grammar test, then the teachers should be asked to rank the students according to their grammar proficiency, not their overall English language ability.
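
The idea of checking a test against averaged teacher rankings can be sketched as a rank correlation. The example below is a hypothetical illustration in Python, not an analysis carried out in this thesis: the scores and rankings are invented, Spearman's rank correlation is assumed as the measure of agreement, and the helper names (`pearson`, `ranks`, `mean_ranking`, `concurrent_validity`) are introduced here for illustration only.

```python
def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sd_x = sum((x - mean_x) ** 2 for x in xs) ** 0.5
    sd_y = sum((y - mean_y) ** 2 for y in ys) ** 0.5
    return cov / (sd_x * sd_y)


def ranks(values):
    """Rank values so that the highest gets rank 1; ties share the mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i], reverse=True)
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = mean_rank
        i = j + 1
    return result


def mean_ranking(rankings):
    """Average several teachers' rankings, student by student."""
    return [sum(column) / len(column) for column in zip(*rankings)]


def concurrent_validity(test_scores, teacher_rankings):
    """Spearman correlation between test scores and the averaged rankings."""
    averaged = mean_ranking(teacher_rankings)
    # A low averaged ranking means a strong student, so negate the values
    # before ranking to give the strongest student rank 1.
    criterion_ranks = ranks([-value for value in averaged])
    return pearson(ranks(test_scores), criterion_ranks)


# Hypothetical grammar test scores for six pupils, and three teachers'
# rankings of the same pupils by grammar proficiency (1 = strongest).
scores = [78, 65, 90, 52, 70, 68]
teacher_rankings = [
    [2, 4, 1, 6, 3, 5],
    [2, 5, 1, 6, 3, 4],
    [3, 4, 1, 6, 2, 5],
]
print(round(concurrent_validity(scores, teacher_rankings), 3))
```

Averaging the three rankings before correlating reflects the suggestion above that several teachers' judgments offset one another's biases. A high coefficient is evidence of concurrent validity only if the rankings themselves are a valid criterion for the same ability.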

It is said that predictive validity is different from concurrent validity in that "instead of collecting the external measures at the same time as the administration of experimental test, the external measure will only be gathered some time after the test has been given" (Alderson et al., 1995). To put it simply, predictive validity is the extent to which the test in question can be used to make predictions about future performance. For example, does a test of English ability accurately predict how well students will get along in a university in an English-speaking country? There are numerous problems with attempting to answer such questions. Measures of how well a student does at a university are sometimes employed to establish predictive validity, but the problem is that there are many factors other than English proficiency involved in academic success. Furthermore, it is not possible to know how the students who scored low on the test, and therefore did not go to university, would have performed if they had been allowed to go. However, it is undeniable that prediction is an important and justifiable use of language tests, and evidence that indicates a relationship between test performance and the behaviour that is to be predicted provides support for the validity of this use of test results. Nevertheless, there is a wide range of situations in which we are not interested in prediction at all, but in determining the levels of ability of language learners.

In short, information about criterion relatedness, whether concurrent or predictive, is by itself insufficient evidence for validation (Bachman, 1990:253). That is one of the reasons why, in this thesis, the author does not evaluate the criterion-related validity of the tests.

1.3.4.4 Face validity

Anastasi (1982:136) points out that face validity is not validity in the technical sense; it refers not to what the test actually measures, but to what it appears superficially to measure. Face validity pertains to whether the test "looks valid" to the examinees who take it, the administrative personnel who decide on its use and other technically untrained observers. Fundamentally, the question of face validity concerns rapport and public relations. Lado (1961), Davies (1968), Ingram (1977) and Palmer (1981) have all discounted the value of face validity. If a test does not have face validity, though, it may not be acceptable to the students taking it or the teachers using it. If the students do not accept it as valid, their adverse reaction to it may mean that they do not perform in a way that truly reflects their ability. Anastasi (1982:136) takes a similar position: "Certainly if test content appears irrelevant, inappropriate, silly or childish, the result will be poor co-operation, regardless of the actual validity of the test. Especially in adult testing, it is not sufficient for a test to be objectively valid. It also needs face validity to function effectively in practical situations."

In short, a test is said to have face validity if it looks as if it measures what it is supposed to measure. For example, a test which was intended to measure pronunciation ability but which did not require the candidate to speak (and there have been some) might be thought to lack face validity. Face validity is hardly a scientific concept, yet it is very important. A test which does not have face validity may not be accepted by candidates, teachers, education authorities or employers. It may simply not be used; and if it is used, the candidates' reaction to it may mean that they do not perform on it in a way that truly reflects their ability. Face validity can be judged by teachers or pupils.

1.3.4.5 Backwash validity

Language teachers operating in a communicative framework normally attempt to equip students with skills that are judged relevant to present or future needs. To the extent that tests are designed to reflect these needs, the closer the relationship between the test and the teaching that precedes it, the more the test is likely to enhance construct validity. A suitable criterion for judging communicative tests in the future might well be the degree to which they satisfy pupils, teachers and future users of test results, as judged by some systematic attempt to gather data on the perceived validity of the test. If the first stage, with its emphasis on construct, content, face and backwash validity, is bypassed, the resulting test procedures may not suit the purpose for which they were intended.

On balance, special attention must be paid to the validity of a test when one constructs it. Although there are many kinds of validity, according to Harrison's conclusion, face validity and content validity are the most vital for the teacher setting his own tests. This view of validity provides a specific and useful framework for language test evaluation and is also adopted in this thesis.

1.3.4.6 Sources of invalidity

Henning (1987) believes that the following considerations bring about a reduction of validity:

Invalid Application of Tests: Invalidity arises from the misapplication of tests. A test may be valid for specific purposes, but it becomes invalid when it is used in a manner or for a purpose for which it was not intended.

Inappropriate Selection of Content: Invalidity occurs when items do not match the objectives or the content of instruction, or when the test items are not comprehensive in the sense of reflecting all of the major points of the instruction programs, etc. Elaborate specifications and expert opinion must be used to ensure that a test exhibits validity.

Imperfect Cooperation of the Examinee: This might occur if the examinees are insincere, misinformed, or hostile with regard to the test or the testing situation. For example, students may give wrong answers to a question simply because the test instructions are unclear.

Inappropriate Referent or Norming Population: The referent is the distinct population for which a test is developed and on which it is normed. In TOEFL tests, for example, the target population is applicants to American universities from diverse national and linguistic backgrounds. A test applied to a population unlike the one it was developed for may therefore lose validity.

Poor Criterion Selection: Validity exists only in terms of specified criteria. If the criteria selected are the wrong ones, then the fact that the test is valid is of little practical significance. This is particularly important in the case of criterion-related validity. If the criterion measure itself has low reliability or validity as a measure of the target competence, the validity coefficients obtained by this procedure will tend to underestimate true validity.

Sample Truncation: Sample truncation is the artificial restriction of the range of ability represented in the examinees, which will result in underestimation of both reliability and validity.
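
The effect of sample truncation on a validity coefficient can be shown with simulated data. The sketch below is a hypothetical Python illustration, not part of the thesis's analysis: the sample size, noise level, seed and cut-off are arbitrary assumptions, and the function names `pearson` and `truncation_demo` are introduced here for illustration only. It shows the effect on the correlation between a test and a criterion, and does not address reliability.

```python
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sd_x = sum((x - mean_x) ** 2 for x in xs) ** 0.5
    sd_y = sum((y - mean_y) ** 2 for y in ys) ** 0.5
    return cov / (sd_x * sd_y)


def truncation_demo(n=5000, keep_top=0.3, seed=0):
    """Return (full-sample r, truncated-sample r) for simulated examinees.

    Each simulated examinee has an underlying ability; the test score and
    the criterion score are that ability plus independent random error.
    The truncated sample keeps only the top `keep_top` share of test scores.
    """
    rng = random.Random(seed)
    ability = [rng.gauss(0.0, 1.0) for _ in range(n)]
    test = [a + rng.gauss(0.0, 0.6) for a in ability]
    criterion = [a + rng.gauss(0.0, 0.6) for a in ability]

    full_r = pearson(test, criterion)

    # Keep only examinees at or above the cut-off test score.
    cutoff = sorted(test)[int(n * (1.0 - keep_top))]
    kept = [(t, c) for t, c in zip(test, criterion) if t >= cutoff]
    truncated_r = pearson([t for t, _ in kept], [c for _, c in kept])
    return full_r, truncated_r


full_r, truncated_r = truncation_demo()
print(f"full sample r = {full_r:.2f}, truncated sample r = {truncated_r:.2f}")
```

Under these assumptions the coefficient computed on the restricted group is lower than the one computed on the full group, even though the test and the criterion are unchanged. This is why a validation study based only on, for example, the pupils admitted to a selective class may understate the validity of the test.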

Use of Invalid Construct: Tests are valid only in so far as they measure the constructs they are purported to measure. If we want to determine the validity of intelligence tests, we must know what kind of intelligence is being measured and whether or not the items accurately reflect that kind of intelligence.

1.4 Test items for phonetics, structures and vocabulary

1.4.1 Test items

Tests usually consist of a series of items. Cohen (1992:488) states that "an item is a specific task to perform, can test one or more points or objectives. For example, an item may test one point such as the meaning of a given word, or several points, such as an item that tests the ability to obtain facts from a passage and then makes inferences based on the facts." He also suggests that "sometimes an integrative item is really more a procedure than an item as in the case of a free composition which could have a number of objectives." Furthermore, he stresses that the objectivity of an item is determined by the way it is scored. A multiple-choice item, for example, is objective in that there is only one right answer. He also points out that a free composition may be more subjective in nature if the scorer does not look for any one right answer but rather for a series of factors, namely creative style, cohesion and coherence, grammar and mechanics.

Item types for testing phonetics are ordering tasks, note-taking, multiple-choice items, matching, etc. Item types for testing structures are multiple-choice items, completion items, matching items, transformation exercises, error-recognition items and rearrangement items. Item types for testing vocabulary are composition and essay, multiple-choice items, cloze items, matching items, completion items and word transformation.

1.4.2 Language components

Linguistics is the study of phonology, syntax and lexis. The first, phonology, is concerned with the sounds of a language and the way in which these are structured into segments such as syllables and words. The second, syntax, is concerned with the way we string words together in phrases, clauses and sentences to build well-formed sentences. The third, semantics, is concerned with the way we assign meaning to a certain unit of a language in order to communicate. Each of these has additional levels: phonology is supplemented by phonetics, the study of the physical characteristics of sounds; syntax by morphology, the study of the structure of words; and lexis is the study of vocabulary. The language components we focus on in this thesis are grammar (structures), vocabulary and phonology: grammar belongs to syntax, vocabulary belongs to lexis, and phonology belongs to phonetics.

1.4.3 The test item types used to evaluate language components

The following table describes the test item types which are often used to evaluate phonetics, structures and vocabulary:

Table 1: The test item types

Phonetics:
- Re-ordering
- Multiple-choice
- Pairing and matching

Structures:
- Multiple-choice items
- Broken sentence items
- Pairing and matching
- Reordering

Vocabulary:
- Definitions (explaining the meaning of each word)
- Sentence completion
- Gap filling

1.5 Syllabus Objectives on language components

English 10 is the first course book in the set used for Higher Secondary Schools in Viet Nam. It is the continuation of English 6, English 7, English 8 and English 9 at Secondary Schools. English 10 was designed on the basis of themes and topics which are familiar to pupils in daily life. The course book consists of 16 units, each of which is built around a topic or theme. School leavers are supposed to be able to use English as a means of communication in speaking, listening, writing and reading at the pre-intermediate level. Pupils can also gain basic knowledge about English-speaking people and countries.

According to the content of the course book, after studying 16 units covering 6 themes, pupils are expected to be able to grasp the following content on phonetics, structures and vocabulary:

Table 2: Syllabus Objectives

Structures:
- Use the following tenses correctly: simple present; simple past; past perfect; present perfect; present progressive; be going to (expressing prediction); simple future (will: making predictions; making offers)
- Use the following connectors appropriately: because/because of; despite/in spite of/although; which
- Remember some verbs followed by gerunds/infinitives and use the to-infinitive to talk about purpose appropriately
- Master the use of conditional types 1 and 2
- Turn active sentences into passive ones and direct statements into indirect ones
- Use the + adj as a noun
- Use relative pronouns appropriately and distinguish defining and non-defining clauses
- Use attitudinal adjectives formed from V-ing/V-ed correctly
- Use comparatives and superlatives correctly
- Master the use of articles (a/an/the); the structure It was not until ... that; Wh-questions; should

Vocabulary:
- Understand and use the words relating to the following topics appropriately: School talks; Daily activities; People's background; Special education; Technology; School outdoor activities; The media; Life in the community; Undersea world; Conservation of nature; National park; Music, theatre and film; Sports; Typical English cities; Historical places

Chapter 2: The study

In this chapter, the writer provides information about the research questions, data description, analytical framework, data analysis and discussion.

2.1 Research Questions

For the purpose of the thesis, and based on the theoretical background and the given context, four research questions are proposed with a focus on construct, content and face validity. These questions can serve as milestones for the analysis.

Q1: Do the achievement tests for Higher Secondary School pupils meet the following criteria: content validity, construct validity, face validity?

Question 1 is further specified by minor questions evaluating each type of validity (construct, content and face validity) respectively, as follows:

Q 1.1: Can the testing techniques used in the tests correctly measure pupils' ability to remember, understand and produce the learnt language components?

Q 1.2 Are the language components in the tests a good representation of the

or two lessons serve to test pupils' reading, listening and writing skills, phonetics, structures and vocabulary. The term and final term tests are usually structured like the forty-five-minute tests. These tests are administered to test grammar, vocabulary and reading, writing and listening skills relating to the themes that pupils learn during the term or the school year. However, as stated earlier, only the language components in the tests were involved in the study.

Table 3 describes the general format, in five parts, of the collected 45-minute tests and final tests used at Upper Secondary Schools:

Table 3: Format of 45-minute tests and final tests

Part | Language component focus | Input | Response item type
II | Phonetics | Groups of words with different underlined vowels and consonants | Multiple choice
III | Structures and Vocabulary | Incomplete single sentences; narrative, factual or story text (approx 100-

week. The test can test only structures and phonetics, or vocabulary, or all three language components.

However, the author focuses only on the language components in the collected tests, that is, structures, vocabulary and phonetics. The author collected the tests mainly from teachers; others were obtained from pupils. Both qualitative and quantitative methods have been applied in this study. All comments, remarks, assumptions and conclusions of the study are based on the analysis of 30 tests from those schools and on the survey questionnaire asking the teachers who designed the collected tests about face validity.

2.3 Analytical framework for data analysis

In order to find out whether the available English tests designed and used at the Higher Secondary Schools under investigation (from the Middle to the North of Viet Nam) are in accordance with the syllabus and objectives of the 10th form, both objective and subjective methods were used, based mainly on the analysis of random samples of progress achievement tests and final achievement tests from the given Higher Secondary Schools. The author concentrated mainly on analyzing the content validity, construct validity and face validity of the tests. Specifically, the author focused only on assessing the language components (phonetics, vocabulary and grammar) tested in the tests.

2.2.3.1 Content validity

To investigate the content validity, the author analyzed and compared the tests' content (language components: phonetics, vocabulary and grammar) with the test specification based on the objectives and syllabus of the textbook. As mentioned earlier, after every two or three lessons, pupils have to take a 45-minute written test. Therefore, pupils are supposed to take four 45-minute written tests in total and two final tests.

Firstly, the author investigated the content validity of the written tests collected randomly from Upper High Schools from Ha Tinh to Ha Nam, based on test specifications

Date posted: 19/03/2015, 10:32

References

1. Anastasi, A. (1982). Psychological testing. London: Macmillan.
2. Davies, A., Brown, A., Elder, C., Hill, K., Lumley, T., & McNamara, T. (1999). Dictionary of language testing. University of Melbourne.
3. Alderson, J.C., Clapham, C., & Wall, D. (1995). Language test construction and evaluation. Cambridge: Cambridge University Press.
4. Bachman, L.F. (1990). Fundamental considerations in language testing. Oxford University Press.
5. Bachman, L.F. (1991). What does language testing have to offer? TESOL Quarterly, No. 25.
6. Bachman, L.F., & Palmer, A.S. (1981). Basic concerns in test validation. In Read, J.A.S. (ed.) (1981).
7. Spolsky, B. (1989). Communicative competence, language proficiency, and beyond. Applied Linguistics, Vol. 10, No. 2. Oxford University Press.
8. Broughton, G., Brumfit, C., Flavell, R., Hill, P., & Pincas, A. (1990). Teaching English as a foreign language. London: Routledge Education Books.
9. Brown, F.G. (1971). Measurement and evaluation. Itasca, Ill.: F.E. Peacock.
10. Canale, M., & Swain, M. (1980). Theoretical bases of communicative approaches to second language teaching and testing. Applied Linguistics.
11. Carroll, B.J., & Hall, P.J. (1985). Make your own language tests. London: Pergamon Press.
12. Carroll, J.B. (1968). The psychology of language testing. In A. Davies (ed.), Language testing symposium: A psycholinguistic perspective. London: Oxford University Press.
13. Cohen, A.D. (1981). Second language testing. In Celce-Murcia, M. (ed.), Teaching English as a second or foreign language. Boston, Massachusetts: Heinle and Heinle Publishers.
14. Davies, A. (1968). Demonstration occupational English test. Council on Overseas Professional Qualifications.
15. Davies, A. (1978). Language testing: Survey article. Language Teaching and Linguistics: Abstracts, 11.
16. Finocchiaro, M., & Sako, S. (1983). Foreign language testing: A practical approach. New York: Regents Publishing Company, Inc.
17. Fulcher, G. (1999). Assessment in English for academic purposes: Putting content validity in its place. Applied Linguistics. Oxford University Press.
18. Harris, D.P. (1969). Testing English as a second language. New York: McGraw-Hill Book Company.
19. Harrison, A. (1983). Communicative testing: Jam tomorrow. London: Academic Press.
20. Madsen, H.S. (1983). Techniques in testing. Oxford University Press.
21. Heaton, J.B. (1988). Writing English language tests. London: Longman.
