ielts rr volume07 report5

Context

The fields of language testing and second language acquisition (SLA) have highlighted the advantages of collaboration, as noted by several researchers (Hyltenstam & Pienemann, 1985; Bachman & Cohen, 1998; Shohamy, 1998; Ellis, 2001; Douglas, 2001; Laufer, 2001) Collaborative research can significantly enhance the development of performance and rating scales Specifically, it is essential to examine how the operationalization of competence levels in rating scales correlates with established L2 developmental stages, as well as to understand the linguistic proficiency profiles of students at various performance levels.

Research rationale

The IELTS Writing scales have been updated to adopt a more analytical approach, highlighting the need for detailed descriptions of writing proficiency at each band level According to Shaw (2004), essential features of an effective writing assessment scale include clarity and specificity in evaluating written language skills.

An effective scale for evaluating learner written performance should capture the essential qualities of writing, accurately reflect the progression of writing abilities as proficiency increases, and clearly differentiate between all band levels.

A comprehensive understanding of essential qualities in L2 writing, including their manifestation at various levels and sensitivity to performance factors like task effect, enhances our grasp of the writing construct (Weigle, 2002; Hawkey and Barker, 2004) By providing a sophisticated linguistic description of typical performance at each level, we can clearly delineate the characteristics that differentiate levels of performance This detailed approach would enable test developers to refine descriptors, which would be positively received by IELTS raters (Shaw).

Research objectives

This study documents the linguistic markers associated with various levels of English writing proficiency as defined by the academic version of the IELTS Writing module Administered in around 122 countries, the IELTS test assesses non-native English speakers aiming to study at English-medium tertiary institutions The Academic Writing module, which includes two tasks, is one of four components of the test (Listening, Reading, Writing, and Speaking) Test-takers receive separate grades for each task, with the final Writing module score being a weighted average, giving more importance to the second task.

Our original plan was to examine performances across all bands but performances at levels 1, 2 and 9 were not available so we examined scripts at band levels 3–8 only

The central questions that the study addresses are:

The defining characteristics of written language performance across IELTS bands can be analyzed through several key aspects Firstly, the frequency, type, and function of cohesive devices vary significantly, with higher bands showcasing a greater variety and more effective use of these devices to enhance coherence Secondly, vocabulary richness increases with each band, reflecting a broader range of lexical choices and more precise word usage Thirdly, syntactic complexity tends to be more pronounced in higher bands, where varied sentence structures and more complex grammatical forms are employed Lastly, grammatical accuracy improves as band levels increase, with lower bands exhibiting more frequent errors and higher bands demonstrating a command of grammar that minimizes mistakes.

2 How do these features of written language change from one IELTS level to the next across the 3–8 band range?

3 What are the effects of L1 and writing task type on the measures of proficiency under (1)?

We systematically organized the target linguistic features to encompass essential areas of language, enabling other researchers to connect with established frameworks like Cambridge ESOL’s Common Scale for Writing and the Common European Framework of Reference for Languages.

We also take into consideration how the learners’ first language and the type of task may affect their performances at different levels

This report details a completed study on L2 proficiency, reviewing existing literature on analytic measures and previous research into linguistic features across IELTS band levels, while considering factors like L1 influence and task effects It addresses design challenges encountered during the study and provides a comprehensive overview of the final sample and background data for each participant The analysis focuses on key language areas: cohesive devices, vocabulary richness, syntactic complexity, and grammatical accuracy The report concludes with a summary of findings and their implications for future research.

The characteristics of written language across various IELTS band levels can be examined through two primary methods One effective approach involves analyzing writing descriptors alongside rater behavior and perceptions, as illustrated in McNamara's 1996 study, particularly in chapter 5.

This study utilizes a second approach by examining written performances categorized at various band levels to identify common linguistic features among the scripts It builds upon existing research to enhance understanding of these characteristics across different performance levels.

Kennedy and Thorp (2002) and Mayor et al (2002).

Analytic measures of developing L2 proficiency

Larsen-Freeman (1978, pp 440) suggests that the ideal measure of linguistic ability should ‘increase uniformly and linearly as learners proceed towards full acquisition of a target language’

While proficiency scales are often perceived as linear, research indicates that the expectation of uniform progress in language development across individuals and areas is unfounded Studies show significant variability in second language (L2) acquisition rates among learners, and the interdependence of language property development seen in first language acquisition does not always apply to second language learning Therefore, rather than insisting on a uniform progression, it is more pragmatic to identify a set of measures that can accurately classify a learner's language profile at a specific level within a predetermined scale.

Wolfe-Quintero et al (1998) conducted a comprehensive meta-study on L2 writing proficiency, highlighting several key measures for further investigation These measures include the number of words per t-unit, words per clause, words per error-free t-unit, clauses per t-unit, and dependent clauses per clause Additionally, they suggest examining word type and sophisticated word type measures, as well as the ratio of error-free t-units to total t-units and errors per t-unit.

Linguistic features characteristic of each IELTS band level

This section reviews studies that have explored proficiency measures relevant to our context, enhancing the selection of metrics proposed by Wolf-Quintero et al (1998).

According to Mayor et al (2002), the key factors influencing band scores in Writing Task 2 include word count, error rate, complexity, and the effective use of the impersonal pronoun 'one'.

Kennedy and Thorp (2002) corroborated previous research, revealing that overt cohesive devices were utilized more often at IELTS levels 4 and 6, while their usage declined at levels 8 and 9 At higher proficiency levels, cohesion was typically conveyed through alternative methods, aligning more closely with native speaker norms These results echo the findings of Flowerdew (1998), as referenced in Kennedy and Thorp (2002, pp 102).

Research has shown that certain factors do not effectively predict band scores in writing assessments Specifically, the type of theme, punctuation errors, and the number of t-units with at least one dependent clause were found to be poor indicators (Mayor et al., 2002) These findings contrast with the conclusions drawn in the meta-study by Wolfe-Quintero et al (1998), highlighting the need for further investigation into these discrepancies.

While the studies identified key predictors of band level in written performances, the interplay between various investigated variables suggests that findings should be interpreted with caution In the following section (see 2.3), we will explore additional interacting variables that warrant attention.

Hawkey and Barker (2004) conducted a detailed analysis of written performances across various levels to identify distinctive features associated with each level Instead of using IELTS band levels, they applied the FCE marking scheme and the Cambridge ESOL Common Scale for Writing (CSW) to evaluate 288 scripts, comprising 53,000 words from FCE, CAE, and CPE assessments Following a rigorous rating process, they selected 98 scripts, totaling 18,000 words, that were consistently rated at levels 2 (8 scripts), 3 (43 scripts), 4 (18 scripts), and 5 (29 scripts) for further examination.

Hawkey and Barker used the categories developed using an intuitive approach to the remarking of the

The proposed draft scale for writing is based on an analysis of 98 scripts from the subcorpus, focusing on key linguistic features The criteria for identifying levels include language sophistication, accuracy, and the organization and cohesion of the text.

The features that we have investigated relate directly to these categories, as shown in Table 2.1

Hawkey and Barker (2004)/CSW features Features investigated in the present study

Sophistication of language Syntactic complexity

Organisation and cohesion Cohesive devices

Table 2.1: Comparison of Hawkey and Barker (2004)/CSW target features and those in the present study

These features are present in the IELTS Academic Writing scales as Vocabulary and Sentence

The features of Structure (VSS) and Coherence and Cohesion/Communicative Quality (CC/CQ) are fundamental to various proficiency and rating scales, notably reflected in the Common European Framework of Reference for Languages (CEFR) The CEFR manual highlights the significance of these features, particularly in the illustrative global scale and various scales addressing overall written production, general linguistic range, vocabulary range, and grammatical accuracy.

Potential intervening factors

Although the previously discussed features appear to be strong indicators of IELTS band scores, various other factors can influence the results in diverse ways This study specifically examined two of these intervening variables: the L1 effect and the task effect.

The impact of a learner's first language (L1) on second language (L2) development is well-established in second language acquisition (SLA) research, with evidence indicating that L1 influences specific L2 proficiency measures Notably, L1 transfer has demonstrated clear effects on L2 writing performance For instance, a study by Mayor et al (2002) revealed that the L1, specifically Chinese versus Greek, significantly influenced Writing Task 2 performances in several areas: complexity was assessed through the number of embedded clauses, with results showing that L1 significantly affected the types of clauses used, while band level had no notable impact; additionally, low-scoring Chinese L1 scripts exhibited more grammatical errors compared to Greek L1 scripts; finally, L1 Chinese writers tended to use more t-units, resulting in a greater number of themes in their writing.

The writer's first language (L1) appeared to have no significant impact on various types of errors, including spelling, punctuation, preposition usage, lexical choices, and the overall frequency of mistakes.

This study will make systematic analyses of possible L1 effects for each measure investigated

In a study by Mayor et al (2002), the performances of L1 Chinese and L1 Greek speakers on two versions of Writing Task 2 were compared, revealing overall similarities across performance levels However, notable differences emerged, particularly in error frequency across various categories, with the exception of preposition and lexis/idiom errors Additionally, the research highlighted variations in the number of t-units that incorporated dependent clauses.

Due to the prioritization of band level and L1 over test version, a balanced selection of test versions could not be collected for this study As a result, comparisons across different test versions will not be conducted Instead, we will analyze Task 1 and Task 2 scripts separately to examine potential task effects and establish relevant comparisons.

This study aimed to analyze the key characteristics of written language performance across IELTS band levels, focusing on cohesive devices, vocabulary richness, syntactic complexity, and grammatical accuracy We investigated how these elements evolve from one IELTS level to another within the 3–8 band range, as well as the influence of first language (L1) and writing tasks on the selected proficiency measures.

Table 3.1 presents a comprehensive comparison of key design features between the current study and earlier research This study enhances previous findings by analyzing a significantly larger dataset and focusing on both IELTS Academic Writing tasks, similar to the approach taken by Mayor et al.

(2002) study, it has controls for L1

Study No of scripts Corpus size

(words) IELTS band levels investigated Writing

132,618 3 to 8 1 and 2 26 Chinese and Spanish

Mayor et al (2002) 186 56,154 5 vs 7 and 8 2 2 Chinese and Greek

Thorp (2002) 130 35,464 4, 6, 8, 9 (8 and 9 conflated for analysis) 2 1 reported as unknown; presumably mixed

Barker (2004) 288 53,000 n/a; they were FCE, CAE and CPE 1 1 not reported; presumably mixed

† 275 of these were Task 1 scripts and 275 were Task 2 scripts, a pair per learner

Table 3.1: Comparison of coverage of the present study and some previous studies

Sampling

We aimed to collect a balanced number of IELTS scripts across all band levels (1–9), ensuring an equal representation of L1 Chinese and L1 Spanish speakers However, we encountered challenges in obtaining scripts for band levels 1, 2, and 9 due to their rarity among current IELTS test takers Ultimately, we gathered 159 scripts from various centers in China and 116 scripts from four Latin American countries: Colombia, Mexico, Peru, and Ecuador A summary of the script types included in our corpus is presented in Table 3.2.

Band L1 Chinese Centre L1 Spanish Centre Total

† The number of scripts in this table and in Figure 1 should be doubled if Task 1 and Task 2 are counted as separate scripts

Table 3.2: Scripts in our corpus

The distribution of scripts by L1 and band is uneven, which affects planned comparisons; however, the differences in mark ranges and frequencies between L1 Chinese and L1 Spanish centres are noteworthy Data indicates that test-takers in China generally take the test at lower levels of L2 proficiency compared to those in Latin America While we will not delve into the reasons or implications of these differences for Cambridge ESOL and other stakeholders, it is an important observation to consider.

Background data

To ensure the anonymity of test-takers and uphold the integrity of test performance, writing scripts and responses to the Candidate Information Sheet (CIS) are stored separately Due to the manual reconciliation process, it has not been feasible to complete the background information for every script in the dataset Table 3.3 displays the background data that has been successfully retrieved.

Years of L2 study: less than 5 12 50 62

Table 3.3: Background data that is available for the data set

The study revealed a nearly equal distribution of male and female test-takers, as well as a balanced representation between the two language groups, Chinese and Spanish The majority of participants were young individuals aged 16 to 25, all of whom had at least six years of experience in learning a second language (L2).

Definition of performance level

This study utilizes the band scores reported to students on their official IELTS test report forms, which have undergone standard quality control measures These quality control processes are detailed by Tony Green in a post on LTEST-L, a discussion forum for language testing professionals and researchers.

All IELTS examiners receive comprehensive training and accreditation to ensure consistent evaluation standards While not every exam script undergoes double-marking, a selection of scripts from each administration is reviewed to maintain rater quality and reliability.

The discussion surrounding LTEST-L highlights the need for improved practices in script evaluation, particularly advocating for double-rating of all scripts used in research and analysis Although it would have been ideal to re-rate the scripts in our sample to ensure score reliability, funding limitations and the unavailability of trained raters made this unfeasible Despite these challenges, we maintain that utilizing the 'live' scores for our analyses (sections 4.0 – 7.0) is justified, as these scores reflect the judgments of trained and monitored IELTS examiners, which are crucial for the decisions made regarding the candidates in our study.

Transcribing, coding and retrieval of information

Two transcribers collaborated with an investigator to establish transcription conventions before transcribing the same set of scripts They discussed and resolved any discrepancies in their transcriptions, ensuring consistency in their approach Additionally, a system was implemented to document transcription queries for future reference by researchers.

In our research, we utilized Wordsmith 3.0 for various standard quantitative text analyses and explored two coding and retrieval tools: CLAN X and Atlas.ti 5.0 We chose Atlas.ti 5.0 for its efficiency in transcription and coding, as well as its robust search capabilities Details of our analyses using Atlas will be discussed in subsequent sections Additionally, based on feedback from an independent reviewer, we plan to incorporate QDA Miner in future studies due to its integrated concordancer tool.

Coherence and cohesion are critical components of the IELTS rating scales, influencing the scores given for Task 1 and Task 2 scripts Raters assess coherence as part of the analytic scale, which evaluates the communicative quality and its impact on the reader (Hawkey and Barker, 2004) Analyzing text features that enhance cohesiveness and flow is essential The cohesive ties taxonomy by Halliday and Hasan (1976) has significantly shaped the analysis of coherence in written texts, serving as a foundational reference in our review.

We also discuss the analysis of anaphoric reference, specifically the frequency and patterns of use of the demonstratives: this, that, these and those.

Review of measures

Halliday and Hasan (1976) argue that cohesion within a text is established through five categories of cohesive ties: reference (also referred to as anaphora); ellipsis; substitution; conjunction; and lexis

Analyzing cohesion often involves examining the frequency, form, and context of connectors, ranging from simple conjunctions like "and" to contrastive devices such as "however." However, recent research indicates a minimal correlation between the use of linking words and test-taker performance (Ghazzoul, in progress) Additionally, Kennedy and Thorp (2002) suggest that lower IELTS band level test-takers tend to rely more on explicit linking devices compared to their higher band level counterparts.

Due to time constraints, we were unable to conduct a comprehensive analysis of lexical cohesion in the scripts However, we did explore ellipsis and substitution within a selected subset of the data.

This article analyzes 42 scripts from our corpus, representing a comprehensive range of proficiency levels from two language groups: Spanish-speaking learners with IELTS scores of 4–8 and Chinese L1 candidates with scores of 3–8 The selected texts exemplify 'perfect' band-level performances, as they received consistent scores across all analytic categories and the final band level However, we encountered challenges in determining the intentional and correct use of ellipsis and substitution, especially among lower IELTS band levels Two illustrative examples are provided for further examination.

The pie chart show that about world electricity production by energy source within

As we can see solid fuels are most And then are Nuclear 20%, Gas 18%, Oil 10%

Water 7% at least is Other renewables

The chart illustrates that the leading exporter countries are the United States and Canada, with an approximate value of $9.8 billion In contrast, Oceania follows with around $3.8 billion, highlighting a significant gap of approximately $6 billion between North America and Oceania.

The two highlighted items likely illustrate ellipsis and substitution In the first extract, "most" can be rephrased as "the most widely used energy sources," while in the second extract, "following" can be interpreted as "the next largest exporter." Although these elements aim to connect ideas, their improper usage creates several issues.

1 How might an instance of ellipsis or substitution be identified and should the analysis take into account possible intention on the part of the writer?

2 Might an overly generous identification of ellipsis and substitution at the lower IELTS band levels inflate the measurement of these features at these levels and therefore skew the results?

The Halliday and Hasan (1976) framework, originally based on the analysis of native speaker texts, lacks insights into the analysis of errors made by non-native speakers.

We decided to discontinue this analysis, believing that we could effectively address key elements of lexicogrammatical foundations through a more straightforward methodology in our evaluations of grammatical accuracy and syntactic complexity.

Anaphoric reference analysis encompasses the use of personal pronouns (e.g., he, she, it, they), demonstratives (e.g., this, these, that, those), and comparatives (e.g., same, similar, likewise, other) A preliminary examination of the data set indicated that demonstratives showed the most potential for analysis This study draws on the theoretical framework established by Botley (2000), who explored anaphora in texts written by native English speakers Botley meticulously identified occurrences of demonstratives—this, that, these, and those—and categorized them accordingly.

Botley, building on Halliday and Hasan’s (1976) framework and analyzing three corpora of English texts, identified five key features of demonstratives These features include the Recoverability of Antecedent, which assesses how readily a demonstrative's antecedent can be identified either directly from the text or indirectly through the reader's comprehension Additionally, he examined the Direction of Reference, determining whether the antecedent appears before (anaphorically) or after the demonstrative in the text.

The phoric type of demonstratives can be understood through their relationship with antecedents, which may be interpreted either semantically, as in referential phoric types, or syntactically Additionally, the syntactic function of each demonstrative is crucial, as it can serve as either the head of a noun phrase or a noun modifier Furthermore, the antecedent type plays a significant role, determining whether the demonstrative refers to a noun phrase or a clause.

These five main categories were further sub-divided to more precisely describe each occurrence, as in Table 4.1 below (reproduced from Botley, 2000)

Feature Value 1 Value 2 Value 3 Value 4 Value 5

(not-applicable, eg exophoric) none

(not applicable, eg exophoric or deictic)

Table 4.1: Values assigned to each occurrence of the demonstratives in

Currently, there is no automatic or semi-automatic system for identifying anaphora, necessitating manual analysis and annotation (Botley and McEnery, 2000, pp 3) We implemented Botley’s framework in two stages: first, we manually annotated occurrences of the four demonstratives across the entire dataset Next, we closely annotated a subset of 42 scripts using the five distinctive features identified by Botley, as outlined in Table 4.1, with specific provisos.

The recoverability of antecedents in writing largely hinges on the reader's capacity to make inferences and draw connections What may appear as a clear antecedent to some readers can be completely ambiguous to others, particularly if they lack familiarity with the specific genre or writing style.

Botley frequently employs 'anaphora' as a broad term encompassing various anaphoric phenomena, including cataphora, which refers to a reference made to an antecedent that has not yet been identified However, in the coding for Direction of Reference, anaphora and cataphora are distinctly differentiated.

In practice, however, no examples of cataphora were found in our sample

Determining whether the Phoric Type is referential or substitutional has proven to be challenging, with Botley (2000) noting its rarity in his data Consequently, it appears that this usage is uncommon and may not serve as a reliable indicator of language competence Due to its problematic nature, this category was deemed irrelevant to the study and was not utilized.

Determining the Antecedent Type may seem straightforward, but an examination of the data reveals significant overlaps between propositional and clausal antecedents Consequently, it was decided to focus solely on syntactic sub-categories, excluding the sub-category of 'propositional / factual antecedents.'

Frequency of use of demonstratives (this, that, these, those)

The qualitative data analysis software Atlas-ti was utilized to code scripts for the occurrences of demonstratives—this, that, these, and those Annotations were refined to indicate correct or incorrect usage based on singular or plural forms The final annotations were exported to an Excel file and analyzed quantitatively using SPSS The analyses included calculating the mean frequency of demonstrative usage by first language (L1), task type, and IELTS band level, along with standard deviations Additionally, the overall mean frequency of demonstratives was assessed across L1, task, and IELTS band levels.

Our analysis of demonstrative usage revealed a low incidence of incorrect applications among test-takers, with L1 Chinese participants showing errors of 8% in Task 1 and 5% in Task 2, while L1 Spanish participants had 11% in Task 1 and 10% in Task 2.

Appendix 1 presents the mean frequency of demonstrative use—this, that, these, and those—alongside standard deviations, categorized by L1 and IELTS band level for Tasks 1 and 2 Notably, the demonstrative 'those' shows minimal usage among test-takers across all levels and language groups Additionally, L1 Spanish test-takers tend to use demonstratives more frequently than their L1 Chinese peers The data in Appendix 1 reveals an unclear pattern in the use of individual demonstratives, highlighting significant variability within bands Consequently, we aimed to investigate the overall usage patterns of demonstratives.

They are presented in Table 4.2

L1 Chinese Means (SD) L1 Spanish Means (SD)

Table 4.2: The mean frequency of use of demonstratives as a whole by L1, task and IELTS band level

Table 4.2 highlights significant trends in demonstrative usage among L1 Spanish and L1 Chinese speakers, indicating that Spanish speakers utilize demonstratives approximately 50% more than their Chinese counterparts, likely due to the linguistic similarities between Spanish and English Additionally, the patterns of demonstrative usage differ between the two groups in Task 1, as illustrated in Figure 4.1, which shows that L1 Chinese test-takers exhibit a steady increase in demonstrative use corresponding to higher IELTS band levels, contrasting with the usage patterns of L1 Spanish speakers.

The decline in L1 Spanish test-takers corresponds with rising IELTS band levels, experiencing a notable increase at band level 8 Task 2 prompts more demonstrative usage than Task 1, supporting the necessity of including two writing tasks to encourage varied language application Additionally, L1 Chinese speakers show a lower tendency to use demonstratives in Task 1 compared to Task 2, while L1 Spanish writers exhibit a distinct pattern of demonstrative usage across both tasks.

In Task 1, Spanish writers with higher IELTS band levels tend to use fewer demonstratives, while in Task 2, the use of demonstratives increases with higher IELTS band levels.

Me a n T o ta l D e mo n st ra tiv e U se ( T a sk 1 )

Me a n T o ta l D e mo n st ra tiv e U se ( T a sk 2 )

Figure 4.1: The mean frequency of use of demonstratives as a whole by

L1, task and IELTS band level

We conducted a two-way ANOVA with a between-groups design, comparing L1 Spanish speakers and L1 Chinese speakers, to analyze the main effects and interactions of IELTS band levels on the average total demonstrative use for each task The analysis focused on bands 4 to 7 due to insufficient data from one L1 group for other bands Summary statistics for the two-way ANOVA, including the band and L1 interaction, are presented in Table 4.3, which details the mean total demonstrative use for Task 1.

Source Type III Sum of Squares Df Mean Square F Sig

Corrected Total 918.520 226 a R Squared = 151 (Adjusted R Squared = 124)

Table 4.3: Summary statistics for the two-way ANOVA band x L1, DV:

Mean total demonstrative use for Task 1

The study found that IELTS band level did not significantly influence the mean total use of demonstratives, while the first language (L1) had a notable impact (F = 21.698, p aspect (ing) > past tense > subject-verb agreement (third person singular ‘s’) > definite article > plural, indefinite article > possessive ‘s’.

The copula emerged as the most precise verb-related morpheme among L2 learners, revealing an implicational scale where accuracy diminished progressively Similarly, a comparable hierarchy was observed with noun-related morphemes, indicating a consistent pattern in language acquisition among learners.

In our investigation of morphemes, we focused on both early and late acquired types We anticipated that early morphemes, such as copula and plural marking, would effectively distinguish lower proficiency levels, while late morphemes, including the 3rd person singular 's' and passives, would serve as reliable indicators for higher proficiency levels.

When analyzing learner errors, it's important to acknowledge the challenges associated with error classification and quantification Research, such as that by Ellis and Barkhuizen (2005), highlights these pitfalls, while studies like Mayor et al (2002) and Hawkey and Barker (2004) provide further insights into the complexities involved.

Determining grammatical accuracy can be challenging; however, evidence indicates that it is a strong indicator of second language (L2) proficiency Research by Hawkey and Barker (2004) and the meta-study by Wolfe-Quintero et al (1998) demonstrate that error rates effectively predict proficiency levels Traditionally, grammatical accuracy has served as a key measure of development in both first and second language acquisition.

Research by Brown (1973), de Villiers and de Villiers (1973), Dulay and Burt (1973, 1974), Bailey et al (1974), Zobl and Liceras (1994), and Goldschneider and DeKeyser (2001) has significantly enhanced our understanding of language development complexities Investigating grammatical accuracy across IELTS band levels is expected to further illuminate the key research questions of this study.

Procedure for calculating grammatical accuracy

We adopted standard calculations of grammatical accuracy (see Ellis and Barkhuizen, 2005, for more details and critical discussion of this methodology)

TLU = number of correct suppliance in obligatory contexts number of obligatory contexts + number of suppliance in non-OCs

Results

Our findings are compatible with the predictions in the L2 development literature: accuracy on plural and copula was higher than accuracy on SV agreement and passives across levels and L1 groups

The use of subject-verb agreement and passive constructions is identified as a key indicator of enhanced proficiency across the entire range studied Further research is warranted, particularly focusing on the marking of third person singular 's', which remains unaffected by the learner's first language.

The tables and graphs presented summarize key global findings, followed by an in-depth discussion of additional insights that are crucial for interpreting the global accuracy scores This includes an analysis of various numerical data points, focusing on the distinctions between specific categories and the grammatical structures involved, such as copulas and subject-verb agreement.

Table 7.1: TLU: L1 Chinese – Tasks 1 and 2

Band level number on 'this' number on 'that' number on 'these' number on 'those' copula (am, is, are) copula (was, were) S-V agreement (3PS 's') passives

Figure 7.1 illustrates the TLU for L1 Chinese, focusing on Tasks 1 and 2, which analyze the use of demonstrative pronouns such as 'this,' 'that,' 'these,' and 'those.' The section also addresses the copula verbs 'am,' 'is,' and 'are,' along with their past forms 'was' and 'were.' Additionally, it highlights subject-verb agreement, particularly in the third-person singular, and the application of passive constructions.

Table 7.2: TLU: L1 Spanish – Tasks 1 and 2

Band level number on 'this' number on 'that' number on 'these' number on 'those' copula (am, is, are) copula (was, were) S-V agreement (3PS 's') passives

Figure 7.2: TLU: L1 Spanish – Tasks 1 and 2

7.3.1 Default use of the verbs ‘be’ and ‘have’

In the assessment of grammatical accuracy for the TLU calculation, the verbs 'be' and 'have' were deemed correct if their agreement and tense were appropriate, even when their meanings were somewhat unconventional However, their usage often appeared inappropriate, as they were frequently employed in a semantic default manner, where more precise verbs could have been utilized This tendency is primarily observed in lower-level scripts, although instances can also be found in higher-level submissions.

This finding suggests that focusing on the range of verb tokens in vocabulary assessments could be beneficial Specifically, applying Type-Token Ratio (TTR) exclusively to verbs may serve as a more effective indicator of second language (L2) proficiency compared to overall TTR or TTR for other word categories Further empirical investigation into this approach is warranted.

7.3.2 Some number agreement errors seem due to incorrect lexical learning

There are several examples of number agreement errors with certain nouns

Typical nouns involved: ̇ information ̇ news ̇ people ̇ women ̇ police

Examples: and these information are all belong to four countries: Jamaica, Ecuador, Singapore and Bolivia (088-9873-CN002-100104-000-1-6)

To such an extent, the police does not exclude the weapons but require their assistance when living in dangerous environments (110-3367-CN172-200304-000-2-5)

7.3.3 Prefabricated patterns do not guarantee TL production

Prefabricated patterns are known to be part of development, especially early on We found that even quite frequent constructions were open to grammatical errors

It is well know to everybody that the socity need competition

It can be clear seen that carbon dioxide produced from power stations takes the biggest amount all over the 2 decades (025-4749-CN911-121002-090-1-6)

7.3.4 Difficulties determining obligatory contexts with low-level scripts

Analyzing the clause structure in certain text sections can be challenging, particularly for lower proficiency levels When consensus on the correct analysis of an item could not be achieved, that item was excluded from the accuracy analysis A record was maintained of all discarded items to facilitate potential future analyses of these contexts.

7.3.5 Difficulty distinguishing formulaic vs productive use of language

We considered all language generated by learners as productive, recognizing that distinguishing between formulaic and productive use often relies on subjective judgment A more in-depth analysis of formulaic and repetitive language use presents an intriguing avenue for future research.

Given the limited time available, we determined that achieving our goals was unrealistic Nonetheless, we have identified and labeled cases that initially appear to be strong candidates for classification as formulaic use in the TLU analysis sheets, aiding in further analysis.

7.3.6 Inflation of scores by repetition of certain structures

Writers occasionally utilize the same lexical or grammatical structures multiple times, which can be seen as a factor of inflation in their work While it may be challenging to determine if these repetitions should be eliminated, we have chosen to retain them We encourage readers to be mindful of this aspect in the text.

Future research should focus on analyzing the identified error clusters in greater detail, which can be utilized to develop resources for educators, course directors, testers, and other stakeholders By extracting specific examples from the coded database, these materials can enhance the clarity and effectiveness of marking guidelines.

This study aims to identify the linguistic markers associated with varying levels of English writing proficiency, as defined by the academic version of the IELTS writing module.

In our study, we analyzed 275 IELTS scripts from test-takers with L1 Chinese and L1 Spanish backgrounds, specifically at levels 3–8 on the band scale Our findings revealed that the use of demonstratives such as 'this', 'that', and 'these' is influenced by both the language background of the test-takers and the specific tasks they were assigned.

1 L1 Spanish speakers use approximately 50% more demonstratives than L1 Chinese speakers

For L1 Chinese speakers, the task influences the frequency of demonstratives used, yet the correlation between demonstrative usage and IELTS band level remains consistent In contrast, L1 Spanish speakers exhibit a stable use of demonstratives; however, the relationship between their demonstrative usage and IELTS band level varies between Task 1 and Task 2.

Research indicates that the use of demonstratives decreases as language proficiency increases, implying a shift towards other cohesive elements like lexical ties Consequently, we anticipate that higher IELTS band scores will reflect increased lexical variation and sophistication in language use.

Our analysis of vocabulary richness confirms that scripts at higher IELTS band levels exhibit increased lexical variation and sophistication Additionally, we observed other significant findings related to language use.

1 The L1 of the test-taker affects lexical output, lexical variation and lexical density but it does not affect lexical sophistication

The mean frequency of use of the demonstratives ‘this’, ‘that’, ‘these’ and ‘those’ (including standard deviations) according to L1 and IELTS band level for Task 1

L1 Chinese Means (SD) L1 Spanish Means (SD) this that these those this that these those

The mean frequency of use of the demonstratives ‘this’, ‘that’, ‘these’ and ‘those’ (including standard deviations) according to L1 and IELTS band level for Task 2

L1 Chinese Means (SD) L1 Spanish Means (SD) this that these those this that these those

The 50 most frequent words in the L1 Chinese scripts

Tiêu đề	Documenting features of written language production typical at different IELTS band score levels
Tác giả	Jayanti Banerjee, Florencia Franceschina, Anne Margaret Smith
Trường học	Lancaster University
Chuyên ngành	Linguistics
Thể loại	Research Report

Định dạng
Số trang	69
Dung lượng	2,13 MB