Verbs in the Written English of Chinese Learners:
A Corpus-based Comparison
between Non-native Speakers and Native Speakers
by
Xiaotian Guo
A thesis submitted to the University of Birmingham
for the degree of DOCTOR of PHILOSOPHY
Supervisor: Professor Susan Hunston
The Department of English
The School of Humanities
The University of Birmingham
October 2006
University of Birmingham Research Archive
e-theses repository
This unpublished thesis/dissertation is copyright of the author and/or third
parties. The intellectual property rights of the author or third parties in respect
of this work are as defined by The Copyright Designs and Patents Act 1988 or
as modified by any successor legislation.
Any use made of information contained in this thesis/dissertation must be in
accordance with that legislation and must be properly acknowledged. Further
distribution or reproduction in any format is prohibited without the permission
of the copyright holder.
Abstract
This thesis consists of ten chapters, and its research methodology combines quantitative and
qualitative analysis. Chapter One introduces the theme of the thesis: a demonstration of a
corpus-based comparative approach to detecting the needs of learners by looking for the
similarities and disparities between learner English (the COLEC corpus) and NS English (the
LOCNESS corpus). Chapter Two reviews the literature in relevant learner language studies and
indicates the tasks of the research. The data and technology are introduced in Chapter Three.
Chapter Four shows how two verb lemma lists can be made using WordSmith Tools, supported by
other corpus and IT tools; how to make sense of the verb lemma lists is the focus of the
second part of that chapter. Chapter Five deals with the individual forms of verbs, and the
findings suggest that there is less homogeneity in the learner English than in the NS English.
Chapter Six extends the research to verb–noun relationships in the learner English and the NS
English, and the result shows that the learners prioritise verbs over nouns. Chapter Seven
studies the learners’ preferences in using the patterns of KEEP compared with those of the
NSs, and finds that the learners have various problems in using this simple verb. In this
chapter, too, my reservations about the traditional use of ‘overuse’ and ‘underuse’ are
expressed and a finer classification system is suggested. Chapter Eight compares another
frequently occurring verb, TAKE, in terms of its collocates and yields similar findings: the
learners have problems even with such simple vocabulary. In Chapter Nine, the research
findings from Chapter Four to Chapter Eight are revisited and discussed in relation to the
theme of the thesis. The concluding chapter, Chapter Ten, summarises the previous chapters
and envisages how learner language studies will develop in the coming few years.
Acknowledgements
First and foremost, I would like to thank my supervisor Professor Susan Hunston. She spent a
large amount of time on my thesis and guided me from the design of the research to the last
version of each chapter. As an experienced supervisor and teacher, she knows very well when
to leave me free to explore something useful and when to bring my attention back to things
of value. She hardly ever tells me what to do, but offers suggestions, comments, and clues for
further development, leaving me enough time to reflect and digest. Undoubtedly, the
knowledge I obtained from her supervision will be the most valuable asset for my academic
career.
Secondly, my thanks go to my beloved wife, Xiaorong (Wang). She sacrificed so much for my
PhD study that I can hardly find appropriate words to express my gratitude. Unlike many
students, who were funded by one means or another, I was self-sponsored, so finance became
the dominant difficulty of my PhD study. To overcome this obstacle, she worked extremely
hard and underwent great hardship and suffering. Even though she deserves a long break
after the submission of my thesis, the unfortunate damage caused to her health may take the
rest of her life to mend. In this sense, any words of thanks are weak and inadequate.
Thirdly, my sincere thanks go to my colleagues and friends who have supported me in many
different aspects. Without their help my thesis could not have been accomplished by now. The
names to follow are only some of them (with all the given names first and surnames last to be
consistent): Richard (Zhonghua) Xiao, Scott (Songlin) Piao, Wenzhong Li, Pernilla
Danielsson, Seo-In Shin, and Frank (Maocheng) Liang for their help in IT and corpus
technologies; Geoff Barnbrook, Antoinette Renouf, Wenzhong Li and Jinbang Du for their
valuable comments and suggestions; Sylviane Granger, John Milton, Angela Hasselgren,
Shichun Gui, Jianzhong Pu, and Michael Rundell for their articles, PhD theses or other
information sent to me when I was in desperate need of them; Wenjin Zhao, Zequan Liu,
Laiqi Zhang, Junhua Zhang and Yaodong Wang for their encouragement and support as
friends. There are others who helped me in one way or another, but I am afraid I cannot list
them all here.
Fourthly, I am grateful to my external examiner Mike Scott and internal examiner Martin
Hewings for their valuable comments and advice, and to the chair of my viva, Murray Knowles,
for his valuable time.
In addition, I am deeply indebted to my sister who looked after my parents together with my
brother while I could not fulfil my part of duty as a son. I also thank my wife’s family, Shulin
and his family for their encouragement and support. My special thanks go to my daughter
who accompanied me through the ups and downs of the years, especially when my wife had
to work in another place. She also helped me with the proofreading of the Chinese pin-yin
(the remaining errors still belong to me, of course).
Furthermore, thanks are overdue to the Great Britain-China Education Trust and Sino-British
Fellowship Trust for the £1000 fellowship which was sent to me on the very day of the
Chinese Spring Festival of 2003. It was the only funding I gained throughout my PhD study.
Even though such an amount was far from liberating me from the financial strains, the very
act of providing such a grant justified my study and greatly encouraged me to go through the
rest of the difficulties. It meant a lot to me.
Last but not least, I must thank the University of Birmingham, especially the staff members of
the Department of English, the School of Humanities, the Information Service, the Academic
Office and the International Office for their unfailing and patient support.
Table of Contents
INTRODUCTION
1.1 The theme and aim of the research
1.2 Introducing computer learner corpus research
1.3 The background to this research
1.4 The impetus of this research
1.5 The focus and research questions of the research
1.6 The methodology of the research
1.7 Two assumptions behind this research
1.8 The structure of the thesis
CHAPTER TWO
A LITERATURE REVIEW OF LEARNER LANGUAGE STUDIES
2.1 Earlier learner language studies
2.1.1 Error analysis recalled
2.1.2 Second language acquisition reviewed
2.1.3 Conclusion
2.2 Computer learner corpora: a new era
2.2.1 The International Corpus of Learner English
2.2.2 The Longman Learners’ Corpus
2.2.3 The Hong Kong University of Science and Technology Learner Corpus
2.2.4 The Chinese Learner English Corpus
2.2.5 Computer learner English studies as a ‘newborn baby’ of applied linguistics
2.3 Typology of CLC data
2.3.1 Synchronic vs. diachronic
2.3.2 Written vs. spoken
2.3.3 Un-annotated vs. annotated
2.4 Clean-text policy and annotation
2.5 Learner corpus annotation
2.6 Contrastive Interlanguage Analysis and its data processing approaches
2.6.1 The notion of Contrastive Interlanguage Analysis (CIA)
2.6.2 Quantitative plus qualitative: approaching CLC data
2.7 Learner English features
2.7.1 The informal and speechlike features of written learner English
2.7.2 Small vocabulary range, overuse of general vocabulary and the ‘teddy bear principle’
2.7.3 More open-choice-principled than idiom-principled
2.7.4 Proficiency level and fossilised errors
2.7.5 The essential role of L1 in L2 production
2.7.6 A narrower range of senses in the use of vocabulary
2.8 Applications of research results
2.8.1 TeleNex
2.8.2 CALL Tools
2.8.3 Dictionary compilation
2.8.4 Textbook enhancement
2.8.5 Data-driven learning
2.9 Some limitations of previous CLC researches
2.9.1 Lack of systematic study of lexis
2.9.2 Lack of POS segmentation for multiple-POS words
2.9.3 Lack of semantic segmentisation for multiple-sensed words
2.9.4 Lack of in-depth exploration in learner language feature identification
2.9.5 No linguistic standards to scale the level of learner English
2.9.6 Some reservations about the use of ‘overuse’ and ‘underuse’
2.9.7 Some reservations with error-tagging
2.10 Conclusion
CHAPTER THREE
THE DATA AND THE TOOLS
3.1 Introduction
3.2 The data
3.2.1 The Learner Corpus – COLEC
3.2.2 The Native Speaker Corpus – LOCNESS
3.2.3 The back-up resources
3.2.3.1 The Bank of English
3.2.3.2 The Google search engine
3.3 The WordSmith Tools
3.3.1 Concord
3.3.2 WordList
3.4 Conclusion
CHAPTER FOUR
MAKING AND MAKING SENSE OF TWO VERB LEMMA LISTS
4.1 Introduction
4.2 Some issues in making a verb lemma list
4.2.1 The significance of making a verb lemma list
4.2.2 Some notions
4.2.3 The difficulties in making a verb lemma list
4.2.4 Two approaches to making a verb list
4.3 Making two verb lemma lists
4.3.1 The lemma list archetype
4.3.2 Tagging the corpora
4.3.3 Editing the raw verb lemma lists
4.3.3.1 Dealing with small-frequency lemmas
4.3.3.2 Detecting wrongly used lemmas
4.4 Making sense of the two verb lemma lists
4.4.1 A rational study
4.4.1.1 Some explorations in semantic theory applications in vocabulary teaching
4.4.1.2 Some pioneering work concerning the presentation of vocabulary to learners
4.4.1.3 Some explorations in verb classification based on syntactic constructions
4.4.1.4 Some explorations of the links between the known and unknown and between L1 and L2
4.4.2 Working out a design for the grouping of the verb lemmas of COLEC and LOCNESS
4.4.3 General principles of grouping the verb lemmas in COLEC and LOCNESS
4.4.3.1 Neighbouring concept groups (1)
4.4.3.2 Neighbouring concept groups (2)
4.4.3.3 Near antonymous groups
4.4.3.4 Six large family groups
4.4.3.5 Special concept groups
4.4.3.6 The miscellaneous groups
4.5 Research questions revisited and answered
4.6 Conclusion
CHAPTER FIVE
VERBS IN DIFFERENT FORMS COMPARED
5.1 Introduction
5.2 A general view of the total frequency of the different forms of verbs
5.3 The top 20 verbs in their different forms in LOCNESS and COLEC
5.3.1 The top 20 verbs in their different forms in LOCNESS
5.3.2 The top 20 verbs in their different forms in COLEC
5.4 The different forms of the top 20 verbs compared
5.4.1 The V-e forms of the top 20 verbs in the two corpora compared
5.4.2 The V-s forms of the top 20 verbs in the two corpora compared
5.4.3 The V-ing forms of the top 20 verbs in the two corpora compared
5.4.4 The V-ed forms of the top 20 verbs in the two corpora compared
5.4.5 The V-n forms of the top 20 verbs in the two corpora compared
5.4.6 Some summary remarks
5.5 Examining the matched verb form lists
5.5.1 Matching the V-i form lists
5.5.2 Matching the V-e form lists
5.5.3 Matching the V-s form list
5.5.4 Matching the V-ing form lists
5.5.5 Matching the V-ed form lists
5.5.6 Matching the V-n form lists
5.5.7 Some remarks in summary
5.6 Some pedagogical implications
5.6.1 Significance for the writer of teaching materials
5.6.2 Significance for the teacher and the learner
5.6.3 Significance for learner English level evaluation
5.6.4 Implications for further corpus design, construction and comparison
5.6.5 Some problems revealed concerning CLC studies
5.7 Conclusion
CHAPTER SIX
BETWEEN VERBS AND NOUNS
6.1 Introduction
6.2 A general view of the disparity between the two corpora in terms of the selection between verbs and nouns
6.3 A detailed look at the disparity between the two corpora in terms of selection between verbs and nouns
6.3.1 Between the verb use and the noun use within the same word form
6.3.2 Between verbs and nouns with different word forms
6.3.3 Between verbs and nouns in prepositional phrases
6.3.3.1 Between verbs and nouns in simple prepositions
6.3.3.2 Between verbs and nouns in complex prepositions
6.4 Discussions
6.5 Conclusion
CHAPTER SEVEN
USING PATTERNS AND PHRASES TO INTERPRET LEARNER ENGLISH
7.1 Introduction
7.2 Introducing the ratio relationships between the two corpora
7.3 Defining ‘pattern’ and ‘phrase’
7.4 Looking at the patterns of KEEP in COLEC and LOCNESS
7.4.1 Interpreting the frequency relationships between COLEC and LOCNESS
7.4.1.1 A large frequency in COLEC vs. a large frequency in LOCNESS
7.4.1.2 A large frequency in COLEC vs. a small frequency in LOCNESS
7.4.1.3 A small frequency in COLEC vs. a large frequency in LOCNESS
7.4.1.4 A small frequency in COLEC vs. a small frequency in LOCNESS
7.4.1.5 No frequency in COLEC vs. a small frequency in LOCNESS
[...]

... what native speakers of the language in question typically write or say (either in general or in a situation / in a certain text type). For language teaching, however, it is not only essential to know what native speakers typically say, but also what the typical difficulties of the learners of a certain language, or rather of certain groups of learners of this language, are. As seen above, there ...

In an article by Schachter and Celce-Murcia (1977: 442), a vivid depiction of the prevalence of EA is presented thus: A cursory glance at the titles and abstracts in recent issues of journals such as this one [TESOL Quarterly] (and others such as Language Learning and IRAL) would indicate that the advocates of EA have prevailed and that EA currently appears to be the “darling” of the 70’s. However, EA ...

... on a study of verb-related features of Chinese learner English. The aim of the research is to demonstrate how a corpus linguistic approach to learner English studies can help us to find out the similarities and disparities between the written English of a group of non-native speakers (NNSs) and that of a group of native speakers (NSs). It is hoped that the identification of similarity and difference between ...

... contribution of [the researchers] has provided them with any significantly new information. It was a significant advance for EA researchers to have placed the learner language (rather than L1 and L2) under examination. A central consensus among EA researchers was that the learner’s errors, instead of being seen as negative, should be treated as positive. The learner’s language was treated as “interlanguage” ...
... KEEP, to investigate how the learners’ performance approximates that of the NS in terms of patterns (in line with Hunston and Francis 1999). Chapter Nine summarises the findings of the research chapters and discusses the advances this research has made in learner corpora studies. The pedagogical implications of this research will be addressed in this chapter and some possible studies in the area of learner ...

... which CLC has emerged. Earlier research in learner language may be traced to EA. It was generally maintained before the EA era, for instance in CA, that the learner’s errors are undesirable because they are a sign of non-acquisition. Since the CA researchers found a relationship between the learner’s errors and the difference between the learner’s mother tongue (L1) and their second language (L2), they tried ...

... similarities and disparities between the learner English and the NS English in the aspect of the width and depth of verbs? (By the width of verbs, I mean the size of vocabulary in verbs. By the depth of verbs, I mean the range of senses of verbs and the many words which, while being other POS, have a verbal function.) 2) What kinds of techniques could be used to answer the previous research question? 3) What are ...

... research into NS corpora contribute to the description of the native language alone and provide “no information as to the relative difficulty and learnability of particular features to be taught” and studies “based on the analysis of native-speakers behaviour fail to consider the productivity of particular features from the learner’s perspective”. In the words of Granger (1998b: 7), native corpora cannot ...
... depth of learners’ vocabulary knowledge, whereas actually both of the aspects “constitute equally important and vital components of the overall lexical ability”. Bearing this in mind, this thesis explores both the breadth and the depth of the learners’ lexicon in the aspect of verbs. In Chapters Four and Five, the research focuses on the breadth of the learners’ lexicon in verbs. Chapters Seven and Eight then ...

... is their most important aspect) they are indispensable to the learner himself, because we can regard the making of errors as a device the learner uses in order to learn. It is a way the learner has of testing his hypothesis about the nature of the language he is learning. In explaining the process of how EA scholars conduct error analysis, Ellis (1994: 68-69) has summarised it in four stages, i.e. the ...

... Louvain Corpus of Native English Essays
NL native language
NNS non-native speaker
NS native speaker
POS part of speech
SL second language
SLA Second Language Acquisition
TL target language