
DOCUMENT INFORMATION

Title: Exploring the use of video-conferencing technology to deliver the IELTS Speaking Test: Phase 3 technical trial
Authors: Vivien Berry, Fumiyo Nakatsuhara, Chihiro Inoue, Evelina Galaczi
Contributors: Mina Patel (British Council), Val Harris (IELTS examiner trainer), Sonya Lobo-Webb (IELTS examiner)
Institution: University of Bedfordshire
Subject: English as a Second Language
Document type: Research paper
Year of publication: 2018
Pages: 58
File size: 755.54 KB

Contents

  • 1.1. Background
  • 1.2. Phases of the study
    • 1.2.1. Phase 1
    • 1.2.2. Phase 2
    • 1.2.3. Phase 3
  • 2.1. Video-conferencing in language education
  • 2.2. Video-conferencing in language assessment
  • 4.1. Location and technology
  • 4.2. Participants
    • 4.2.1. Participants' experience with the Internet and VC technology
  • 4.3. Materials
  • 4.4. Data collection
    • 4.4.1. Test scores
    • 4.4.2. Test-taker feedback questionnaires
    • 4.4.3. Examiner feedback questionnaires
    • 4.4.4. Examiner focus group discussions
  • 4.5. Data analysis
    • 4.5.1. Score data analysis
    • 4.5.2. Test-taker feedback questionnaires
    • 4.5.3. Examiner feedback questionnaires
    • 4.5.4. Examiner focus group discussions
  • 5. Results
    • 5.1. Score results
      • 5.1.1. Initial descriptive analysis of test scores
      • 5.1.2. Many-Facet Rasch Model analysis
      • 5.1.3. Summary of score results
    • 5.2. Sound quality analysis
      • 5.2.1. Perceptions of sound quality by examiners and test-takers
      • 5.2.2. Perceptions of sound quality re: test-taker proficiency
      • 5.2.3. Perceptions of sound quality and problems encountered during test administration
    • 5.3. Perceptions of video-conferencing test, training and guidelines
      • 5.3.1. Test-takers' perceptions of the VC test
      • 5.3.2. Examiners' perceptions of the VC test
      • 5.3.3. Administration of the VC test
      • 5.3.4. Rating the VC test
      • 5.3.5. Comparison between VC and face-to-face tests
      • 5.3.6. Suggestions for modifications for the VC mode
  • 6. Conclusions
    • 6.1. Summary of main findings
    • 6.2. Implications of the study
      • 6.2.1. Scoring validity of the video-conferencing mode of delivery
      • 6.2.2. Sound quality and perceptions of the effect on scores
      • 6.2.3. Perceptions of on-screen prompts by examiners and test-takers
  • Appendix 1: Double-marking matrix
  • Appendix 2: Test-taker feedback questionnaire
  • Appendix 3: Examiner training feedback questionnaire
  • Appendix 4: Examiner feedback questionnaire
  • Appendix 5: Sound and image problems encountered by examiners
  • Appendix 6: Location and technical specifications


Background

Recent advancements in online video-conferencing technology have transformed face-to-face interaction, allowing users in different locations to communicate effectively in real time through audio and video. Applications like Adobe Connect facilitate this seamless communication, enhancing collaboration regardless of physical proximity.

FaceTime, Google Hangouts, Skype and Zoom have become essential tools for professional communication across different locations, facilitating face-to-face interaction while significantly reducing travel expenses.

Video conferencing has gained acceptance as a delivery method in educational settings, particularly for second/foreign language (L2) learning. Despite its growing use, video conferencing remains underutilized in L2 speaking assessments, and research on this assessment mode is limited. This study aims to enhance the understanding of video conferencing in L2 speaking assessment by examining the comparability of speaking constructs evaluated through both face-to-face and video-conferencing methods.

Phases of the study

Phase 1

Phase 1 consisted of a small-scale initial investigation conducted in 2014 at a Further Education college in London, involving 32 students of diverse nationalities and four trained IELTS examiners. This convergent parallel mixed-methods study aimed to identify the similarities and differences in scores, linguistic output, test-taker feedback and examiner behaviour between the face-to-face (f2f) and video-conferencing (VC) delivery formats. A report on the findings, with recommendations for future research, was submitted to the IELTS Partners (the British Council, Cambridge Assessment English and IDP: IELTS Australia) in June 2014 and was subsequently published on the IELTS website (Nakatsuhara, Inoue, Berry and Galaczi, 2016).

See also Nakatsuhara, Inoue, Berry and Galaczi (2017a) for a theoretical, construct-focused discussion on delivering the IELTS Speaking test in face-to-face and video-conferencing modes.

Phase 2

Phase 2 was a larger-scale follow-up study designed to implement recommendations from Phase 1. It was conducted in 2015 at the Sydney Institute of Language and Commerce (SILC) Business School at Shanghai University, where 99 test-takers completed two IELTS Speaking tests under face-to-face and computer-delivered video-conferencing conditions. The performances were evaluated by 10 trained IELTS examiners, and a convergent parallel mixed-methods design facilitated the collection of findings from various sources: MFRM analysis of test-takers' scores, feedback questionnaires, focus-group discussions with examiners, and observational notes from test sessions. The results revealed no significant differences in scores between the two testing modes; however, qualitative differences were noted in the functional output of test-takers and the behaviour of examiners.

Most examiners believed that test-takers had equal opportunities to showcase their English proficiency and that rating their performance was equally straightforward. The findings were then analysed in relation to the EAP speaking construct, which now encompasses video-conferencing communication in distance-learning programs and oral examination contexts.

A report on the findings of the study was submitted to the IELTS Partners in March 2017 and has been published on the IELTS website (Nakatsuhara, Inoue, Berry and Galaczi, 2017b).

Phase 3

The design of the current study (Phase 3) has been informed by the recommendations of the Phase 2 study and previous research on video-conferencing technology in teaching and assessment contexts.

This report summarizes relevant literature and details the methodology and data collection of a study focused on examiner training and the creation of a tailored technological solution for test delivery. It presents the research findings and their implications, concluding with recommendations for additional investigations needed before fully implementing a video-conferencing format for the IELTS Speaking test.

The use of video-conferencing has grown in the last decade, with the widespread availability of free or inexpensive software applications, such as Skype or Google Hangouts.

Video conferencing has become a valuable tool in distance learning, effectively connecting teachers and experts with students, as well as facilitating peer interactions. Its benefits include enhancing content knowledge through direct access to expert insights. Video conferencing also fosters broader educational advantages, such as the development of intercultural competence, collaborative learning, and increased awareness and tolerance of diversity, and it promotes essential learner cognitive traits, including autonomy, motivation and self-regulated learning (… 2004; Lawson, Comber, Gage & Cullum-Hanshaw, 2010; Lee, 2007).

Video-conferencing in language education

Video-conferencing has significantly impacted language education by providing authentic input and enhancing speaking practice in remote classrooms, especially where teachers may have limited language proficiency. Such interactive opportunities are crucial for second language acquisition (Ellis, 2005). The technology has been effectively utilized in small-scale language exchanges (Kinginger, 1998) and has also been integrated into extensive national educational initiatives like the Plan Ceibal en Inglés program in Uruguay, which provides English lessons via video-conferencing to over 80,000 children in Uruguayan public schools (www.britishcouncil.uy).

Lawson et al. (2010) identify two main areas of research regarding video-conferencing in education: one examines the user experience of video-conferencing, while the other investigates its pedagogical aspects, emphasizing the elements that enhance learning outcomes.

Taken together, the main findings from this body of literature have produced several empirically supported insights of relevance for the present study.

Technical issues, particularly related to sound and video quality, significantly impact the video-conferencing learning experience. Lag and desynchronization between audio and video can hinder interaction and create ambiguity, ultimately affecting the overall effectiveness of the learning process.

Kern (2014) highlights that in a collaborative project involving French students from a U.S. university and two French universities, significant delays in communication created confusion regarding the timing of paralinguistic cues, such as smiles and nods, leading to challenges in understanding (p. 98). Similarly, Wang (2006) emphasizes the importance of sound and video quality in educational video-conferencing settings.

The effect of the video-conferencing medium on paralinguistic factors, such as body language and facial expressions, has also been reported as an important consideration.

Research highlights significant physical constraints in video-conferencing, including the need for participants to remain mostly immobile, as webcams can exaggerate physical movements, and gestures outside the camera's view are not captured. Additionally, achieving direct eye contact poses a challenge, as speakers must choose between looking at the webcam or the screen, which disrupts genuine interaction. Paralinguistic features, such as facial expressions, play a crucial role in facilitating conversation, as noted by Wang (2004, 2006), where cues like an "expectant look or raised eyebrows" are essential for engagement. These elements are vital in video-conferencing settings, as they help reduce misunderstandings and ambiguity in communication, and differences in their use compared to face-to-face interaction can affect the overall success of the experience.

Other empirical findings have pointed to the role of affective factors in video-conferencing.

In an investigation of negotiation of meaning in video-conferencing settings, Wang (2006) noted that breakdowns could have been triggered by participants' nervousness. Eales, Neale and Carroll (1999) indicated that 20% of the students in their study of K-12 classrooms did not report a positive reaction to video-conferencing. In a study by Jauregi and Bañados (2008) involving Chilean and Dutch students participating in a Spanish language video-web communication project, preferences for video-conferencing varied significantly, influenced by the individual characteristics and cultural backgrounds of the participants.

Chilean students preferred to interact face-to-face, rather than online, while no clear preference emerged for the Dutch students.

Language level considerations have emerged as playing a role as well. In a study on video-conferencing, Wang (2006) identified that participants' limited listening and speaking skills, along with a restricted vocabulary, significantly contributed to communication breakdowns; this lack of language resources hindered their ability to clarify meanings or verify understanding effectively.

Finally, the body of literature on video-conferencing in education has shown the importance of pedagogic considerations for the success of the learning experience.

Research suggests that video conferencing is not ideally suited for didactic lectures, as this format fails to leverage the interactive potential of the medium effectively.

Video-conferencing in language assessment

The use of video-conferencing systems in English language assessment goes back more than 25 years, with early research by Clark and Hooshmand (1992) comparing face-to-face and video-conferencing delivery of Arabic and Russian speaking tests. Their findings indicated no significant differences in scores, but test-takers expressed a preference for face-to-face interaction, while examiners showed no clear preference. Following this initial study, research focus shifted for two decades towards exploring the nuances of oral assessment in face-to-face and semi-direct, computer-based formats (cf. Bernstein, Van Moere & Cheng, 2010; Kenyon & Malabonga, 2001; Kiddle & Kormos, 2011; Shohamy, 1994; Stansfield, 1990; Stansfield & Kenyon, 1992; inter alia).

In a recent study which returned to an examination of live test performances and focused on a technology-based group discussion test, Davis, Timpe-Laughlin, Gu and Ockey (2017) conducted group discussion tests delivered via video-conferencing, with a moderator interacting with multiple participants. The sessions took place across four states in the United States and in three cities in mainland China. In the U.S. sessions, the participants and the moderator were located in different states; in the Chinese sessions, the participants were in one of three cities, with the moderator in the U.S. Participants generally had positive feedback regarding the tasks and technology used, despite experiencing some disruptions due to Internet instability in China. The researchers believe that video-mediated group discussion tests show significant potential for the future, although there are still unresolved technological challenges.

A study by Ockey, Gu and Keehner (2017) explored the use of web-based virtual environment (VE) technology to address challenges in bringing test-takers and examiners together physically. In this setup, groups of three test-takers participated remotely, engaging in discussions facilitated by a moderator. Each participant interacted through avatars that mimicked body language, enhancing turn-taking during conversations. While the technology functioned effectively with minimal audio issues, participants reported a sense of presence that fell short of in-person communication. The researchers suggested that this limitation might reduce assessment authenticity but could also alleviate anxiety, potentially allowing test-takers to demonstrate their true capabilities more effectively.

In other studies of mode of delivery of speaking tests and anxiety, Craig and Kim (2010) and Kim and Craig (2012) compared face-to-face interviews with video-conferencing interviews for speaking tests among 40 Korean English language learners. The research analysed scores in the two interview modes (Fluency, Functional Competence, Accuracy, Coherence, Interactiveness) and test-taker feedback on 'anxiety' in the two modes, operationalised as 'nervousness' before/after the test and 'comfort' with the interviewer, test environment and speaking test (Craig & Kim, 2010).

A study by Kim and Craig (2012) revealed no significant difference in global and analytic scores between the two test modes, with most participants expressing comfort and interest in both formats. However, notable differences in test-taker anxiety were observed, as anxiety levels were higher before the face-to-face mode, supporting Ockey et al.'s hypothesis.

The empirical findings highlight the importance of targeted training in video-conferencing for assessment purposes, emphasizing the need for participants to understand body language and facial expressions. Implementing warm-up sessions and tutorials can significantly enhance the effectiveness of video-conferencing, ensuring participants are well-prepared for their interactions.

To enhance familiarity with video-conferencing, it is crucial for both learners and instructors to engage actively (Lee, 2007). Additionally, establishing supportive conditions is vital for effective interaction, including offering scaffolding to address communication breakdowns.

When implementing video-conferencing in language learning and assessment, it is essential to consider students' English competence, establishing a minimal language threshold. Additionally, the impact of affective factors on the experience should not be overlooked, as they can significantly influence the overall success of video-conferencing initiatives.

It is therefore vital to collect evidence on test-takers' perceptions of taking a VC test.

3. The current study: research questions

After the Phase 2 study, an experienced IELTS examiner trainer was tasked with creating additional training materials for examiners using video-conferencing (VC) delivery. These materials were also adapted and translated into Spanish to prepare candidates for the VC-delivered Speaking test. In addition, technical requirements, such as the development of on-screen prompts and appropriate delivery mechanisms, were initiated.

The study presented here is a follow-up investigation based on findings from Phases 1 and 2, which indicated that the mode of test delivery does not significantly affect the scores achieved by test-takers. This research aims to fulfill four main objectives:

1. confirm how well the scoring validity of the VC tests is supported by the four facets modelled (i.e. test-taker, rater, test version and rating category) in a Many-Facet Rasch Model (MFRM) analysis

2. investigate the effect of perceptions of sound quality on scores

3. investigate perceptions of the newly developed on-screen prompts by examiners and test-takers

4. examine the effectiveness of the extended training for the VC test for examiners and test-takers.

To support the four purposes of this phase of the study, the research questions that we will address in Phase 3 are as follows:

1. How well is the scoring validity of the video-conferencing tests supported by the four-facet MFRM analysis (i.e. test-taker, rater, test version and rating category)?

2. To what extent did sound quality affect performance on the video-conferencing test (as perceived by examiners, as perceived by test-takers, as observed in test scores)?

3. How did test-takers perceive the video-conferencing (VC) test, the new platform and training for the VC test?

4. How did examiners perceive the video-conferencing (VC) test, the new platform and training for the VC test?

This study employed a convergent parallel mixed-methods design, as outlined by Creswell and Plano Clark (2011), to collect and analyze both quantitative and qualitative data in parallel strands. The findings were integrated to provide comprehensive insights into the video-conferencing delivery mode. An overview of the Phase 3 research design, including the data collection, analysis and triangulation processes, is illustrated in Figure 1, highlighting the various aspects explored from multiple perspectives.

Location and technology

This report does not cover the specific selection of locations or the custom technological solution for delivering the IELTS test in Phase 3 of the study For insights into why Latin America, particularly Argentina, Colombia, Mexico, and Venezuela, were chosen as study locations, as well as the necessary technical requirements and specifications, please refer to the internal reports submitted to the British Council by Patel (2016) and Ruiz.

Participants

Participants' experience with the Internet and VC technology

Phase 3 of the study exclusively examined the video-conferencing mode for administering the IELTS Speaking test, making it essential to assess participants' prior experience with the Internet and video-conferencing technology from the outset.

Examiners and test-takers reported their experience with the Internet and VC technology through questionnaires. According to the data presented in Table 2 and Figure 2, both groups utilize the Internet nearly every day for social interactions, with a mean score of 4.88 for this purpose.

The study reveals a difference in Internet usage for educational purposes between test-takers and examiners: test-takers utilize the Internet almost daily (mean 4.38), while examiners engage with it only once or twice a week (mean 2.88).

The comparison of video-conferencing (VC) technology usage revealed similarities between the examiner and test-taker groups. Both groups engage in social video-conferencing around once or twice a week, with average scores of 3.00 and 2.54, respectively. However, their use of VC for teaching and studying remains minimal, averaging between 'never' and 'once or twice a week', with a mean of 1.63 for both groups.

Table 2: Participants' experience with the Internet and VC technology
(Scale for all items: 1. Never – 3. Once or twice a week – 5. Every day)

  • How often do you use the Internet socially to get in touch with people?
  • How often do you use the Internet to teach (examiners) / for your studies (test-takers)?
  • How often do you use video-conferencing (e.g. Skype, FaceTime) socially to communicate with people?
  • How often do you use video-conferencing to teach (examiners) / for your studies (test-takers)?

Figure 2: Participants’ experience with the Internet and VC technology

From the frequency responses of both examiners and test-takers, it became clear that Internet and video-conferencing familiarity was unlikely to constitute a negative issue in this research.

Materials

Five versions of the IELTS Speaking test (i.e. Travelling, Success, Teacher, Film and Website) were used in this study [1]. These were retired versions of operational IELTS Speaking tests, sourced from Cambridge English Language Assessment and assigned in randomized order.

Data collection

Test scores

In the live tests [2], examiners assigned scores across four analytic rating categories: Fluency and Coherence, Lexical Resource, Grammatical Range and Accuracy, and Pronunciation. These scores were based on the assessment criteria and rating scales used in operational IELTS tests. In the interest of space, the rating categories are hereafter referred to as Fluency, Lexis, Grammar and Pronunciation.

All test sessions included a live examiner mark and were also double-marked by an additional examiner from video-recorded performances. A carefully designed double-marking matrix ensured sufficient overlap between examiners for Many-Facet Rasch Model analysis. Following the live examinations, the double-marking matrix was recreated because local administrators had been unable to track the exact assignment of test-takers to the eight examiners. In the updated matrix, each examiner double-marked 10 to 13 test-takers.
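To illustrate how such an allocation can work, here is a minimal sketch in Python. It is not the study's actual matrix (that is reproduced in Appendix 1); the examiner labels and session IDs merely follow the conventions used elsewhere in this report. The aim is simply that every session gets a second marker other than its live marker, each examiner double-marks a similar share of candidates, and all examiners are linked through shared performances, which is what gives the dataset the connectivity that the MFRM analysis requires.

```python
# A minimal, hypothetical double-marking allocation (not the study's actual
# matrix): rotate a second marker across sessions so that every examiner
# double-marks a similar share and all examiners are linked for MFRM.
from itertools import cycle
from collections import Counter

examiners = ["K", "L", "M", "N", "O", "P", "Q", "R"]
sessions = [f"C{i:03d}" for i in range(1, 90)]           # 89 test sessions
live_marker = {s: examiners[i % 8] for i, s in enumerate(sessions)}

rotation = cycle(examiners)
double_marker = {}
for s in sessions:
    dm = next(rotation)
    if dm == live_marker[s]:        # an examiner never double-marks their own live test
        dm = next(rotation)
    double_marker[s] = dm

# Each examiner ends up with roughly 89 / 8 ≈ 11 double-marked performances,
# in line with the 10-13 per examiner reported above.
print(Counter(double_marker.values()))
```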

[1] In Phase 2, another test version was also used, but it was dropped in this phase since both candidates and examiners in Phase 2 had experienced difficulty with a lexical item.

[2] In this report (as well as in our previous reports on Phases 1 and 2 of the project), 'live tests' refer to experimental IELTS Speaking tests performed by volunteer test-takers with trained and certified IELTS examiners.

Test-taker feedback questionnaires

Upon completing the VC Speaking test, all participants filled out a questionnaire, with assistance from an administrative assistant where necessary. The questionnaire comprised 13 questions, prompting test-takers to elaborate on their answers where relevant. The first four questions focused on their experience with technology and video-conferencing, while the next two evaluated the usefulness of the test-taker guidelines.

The final seven questions (Q7–Q13) focused on participants' feelings and experiences during the test, their perceptions of sound quality and how they believed sound quality impacted their performance, and their views on the clarity of the on-screen prompts. Completing the questionnaire required between five and ten minutes, depending on the number of open-ended responses provided.

Examiner feedback questionnaires

Examiners completed two questionnaires. The first was a training feedback questionnaire, administered immediately after the training session conducted before the test days. It featured seven questions assessing the training's effectiveness and included an open comments section for additional feedback.

The second questionnaire concerned the actual test administration and rating under the VC condition. After finishing all the speaking tests, examiners were asked to complete an examiner feedback questionnaire (see Appendix 4) consisting of four parts. Part 1 (Q1–Q4) asked about their experience with technology and video-conferencing.

In Part 2 (Q5–Q15), participants shared insights regarding their experiences in administering the test and their roles as interlocutors under the VC condition, along with their perceptions of how well the training session equipped them for this task Part 3 (Q16–Q27) focused on their experiences in rating the VC test and evaluated their preparedness based on the training they received.

Part 4 (Q28–Q32) asked them to reflect on their previous experience of delivering the standard face-to-face IELTS Speaking test and consider their perceptions towards the two test delivery modes The questions in Parts 2–4 were followed by free comments boxes requesting further elaboration The questionnaire took approximately 20 minutes for examiners to complete.

Examiner focus group discussions

On completion of administering the VC Speaking tests, all examiners took part in paired focus-group discussions facilitated by trained, local British Council staff. On Days 1 and 2, Examiners Q & R and O & P engaged in semi-structured discussions in Bogotá, while on Days 3 and 4, Examiners L & N and K & M participated in similar discussions in Buenos Aires. These discussions elaborated on feedback regarding technical issues, focusing on perceptions of sound quality; on examiner behaviour, including the use of gestures; and on the two modes of IELTS Speaking test delivery, particularly concerning stress and comfort levels.

This section has provided an overview of the data collection methods. The following section outlines the data analysis methods employed.

Data analysis

Score data analysis

To investigate the scoring validity of the video-conferencing tests, we first conducted a descriptive analysis of the scores awarded under the VC condition using SPSS 22. We then analysed the scores with a Many-Facet Rasch Model (MFRM) analysis, using the FACETS 3.71 software (Linacre, 2013) and modelling four facets: test-taker, rater, test version and rating category. The MFRM analysis provides precise insights into the scoring validity of the VC tests by evaluating examiner consistency and severity, and the uniformity and difficulty of the five test versions and four analytic rating scales.

As noted above, sufficient connectivity in the dataset to enable the MFRM analysis was achieved through the examiners’ double-marking system.
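FACETS carries out the estimation itself; the short sketch below is only intended to illustrate the form of the four-facet rating scale model it fits, with entirely hypothetical parameter values. The probability of each score category depends on the test-taker's ability minus the rater's severity, the version's difficulty and the category thresholds, all on a common logit scale.

```python
import numpy as np

def category_probs(b, c, d, thresholds):
    """P(score = k) under a four-facet rating scale model:
    log(P_k / P_{k-1}) = b - c - d - F_k, where b is test-taker ability,
    c rater severity, d version difficulty and F_k the k-th threshold."""
    steps = b - c - d - np.asarray(thresholds)            # one logit step per category boundary
    logits = np.concatenate(([0.0], np.cumsum(steps)))    # cumulative log-odds of each category
    probs = np.exp(logits - logits.max())
    return probs / probs.sum()

# Hypothetical values on the logit scale, for illustration only:
b = 0.5                         # test-taker ability
c = -0.2                        # rater severity (negative = lenient)
d = 0.1                         # test version difficulty
F = [-2.0, -0.5, 0.5, 2.0]      # thresholds between five adjacent score categories

p = category_probs(b, c, d, F)
expected_category = float((np.arange(len(p)) * p).sum())
print(np.round(p, 3), round(expected_category, 2))
```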

Test-taker feedback questionnaires

The analysis of the closed questions in the test-taker feedback questionnaire utilized both descriptive and inferential statistics to evaluate perceptions of sound quality and its potential impact on test scores. Specifically, the analysis addressed the extent to which sound quality influenced performance on the video-conferencing test (RQ2) and test-takers' perceptions of the VC test, the new platform and the training provided (RQ3). Open-ended comments from participants were used to aid interpretation of the statistical findings and to triangulate with other data sources.

Test-takers' feedback was analyzed and compared with responses from Phase 1 and Phase 2 studies to assess the effectiveness of the training provided in this phase and to identify any persistent issues.

Examiner feedback questionnaires

As with the test-taker feedback questionnaires, the examiner training feedback questionnaire and the examiner feedback questionnaire were analysed to inform RQ2 (to what extent did sound quality affect performance on the video-conferencing test, as perceived by examiners?) and RQ4 (how did examiners perceive the VC test, the new platform and the training for the VC test?). Statistical analysis was conducted on the closed questions from both questionnaires, while open-ended comments were used to aid interpretation of the statistical findings and to triangulate with other data sources. Comparisons were made with Phase 1 and Phase 2 results wherever applicable.

Examiner focus group discussions

All four focus group discussions were recorded; however, due to unforeseen technical issues, one recording was unintelligible. As a result, only three discussions were fully transcribed and analysed by the researchers to extract key topics and perceptions from the examiners. These insights were organized in a spreadsheet for coding and categorization under various themes, such as "extra time required to administer the VC test" and "suggested modifications for the VC test", to address RQ4 (how did examiners perceive the VC test, the new platform and the training for the VC test?).

Results

Score results

5.1.1 Initial descriptive analysis of test scores

Figures 3 and 4 illustrate the overall Speaking scores achieved by test-takers in the live tests and the average scores from the live and double marking. In accordance with operational IELTS testing standards, the overall Speaking scores depicted in Figure 3 are rounded down (e.g. 6.75 is rounded to 6.5, and 6.25 to 6.0). Figure 4 presents the mean Speaking scores derived from the live and double-marking assessments.
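As a worked example of this rounding rule (and assuming, for illustration only, that the overall score is the mean of the four analytic ratings), a short Python sketch:

```python
import math

def overall_speaking_band(fluency, lexis, grammar, pronunciation):
    """Average the four analytic ratings, then round DOWN to the nearest
    half band, as described above (6.75 -> 6.5, 6.25 -> 6.0)."""
    mean = (fluency + lexis + grammar + pronunciation) / 4
    return math.floor(mean * 2) / 2

print(overall_speaking_band(7, 7, 6, 7))   # mean 6.75 -> 6.5
print(overall_speaking_band(7, 6, 6, 6))   # mean 6.25 -> 6.0
```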

The mean of the overall Speaking scores during the live tests was 6.15 (SD=0.915), and the mean of the average overall Speaking scores under the live and double-marking conditions was 6.14 (SD=0.849) [3]. The South American cohort in this phase of the research therefore scored approximately one band higher than the Chinese cohort in Phase 2 (VC condition: Mean=5.04, SD=0.967; see Nakatsuhara et al., 2017b). However, the scores of the South American cohort were similar to those obtained by the first multi-national cohort in London in Phase 1 (VC condition: Mean=6.57, SD=0.982; see Nakatsuhara et al., 2016).

Figure 3: Live Speaking test scores: overall (rounded down)

Figure 4: Average of live and double-marking Speaking scores: overall

5.1.2 Many-Facet Rasch Model (MFRM) Analysis

The study conducted a detailed analysis of test scores, employing a four-facet MFRM approach to evaluate score variance among test-takers, test versions, examiners and rating scales. This methodology was designed to address the research question regarding the scoring validity of the video-conferencing tests, specifically assessing how well the four-facet MFRM analysis supports this validity.

Figure 5 presents an overview of the results from the 4-facet rating scale model analysis, illustrating the estimates of test-taker ability, test version difficulty, examiner harshness and rating scale difficulty. These facets are measured in uniform units (logits), as indicated on the left side of the map labelled "Measure", allowing for direct comparisons among them.

[3] Double-marking was conducted for 82 performances, as seven videos had some technical problems and could not be reliably rated.

Figure 5: All facet vertical rulers (4-facet analysis with a rating scale model)

[Facet map not reproduced here: it plots test-taker ability, test version difficulty (Film, Travelling, Website, Success, Teacher), examiner severity (Examiners K–R) and rating scale difficulty (Fluency, Lexis, Grammar, Pronunciation) in logits on a common "Measure" scale.]

The FACETS program generates a detailed measurement report for each facet of the model, as illustrated in Tables 3 to 5. These reports feature difficulty levels represented on the Rasch logit scale, along with Fair Averages that reflect the anticipated average raw score values derived from the Rasch measures. Additionally, the reports provide the Infit statistics.

The Infit Mean Square (Infit MnSq) index is a key measure of fit for assessing the assumptions of the Rasch model. While both Infit and Outfit measures are available, this discussion focuses on Infit due to its reduced sensitivity to outliers from random unexpected responses. Consequently, unacceptable Infit results are more indicative of underlying inconsistencies within an element.

Infit values in the range of 0.5 to 1.5 are 'productive for measurement' (Wright & Linacre, 1994), and the commonly acceptable range of Infit is from 0.7 to 1.3 (Bond & Fox, 2007).

All items within the four facets exhibit acceptable Infit values, with the exception of Examiner N, who showed slight overfitting (Infit MnSq=0.62), suggesting that this examiner's scores were overly predictable. While overfitting is not ideal for measurement, it does not compromise the integrity of the measurement system.

The absence of misfit in the analyses enhances our confidence in the Rasch measures obtained on the common scale. This indicates a lack of systematic inconsistency in test scores, reinforcing the scoring validity of the VC tests performed during this phase of the project.
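For readers unfamiliar with the statistic, here is a minimal sketch of how an information-weighted (infit) mean square can be computed for one element of a facet. The ratings, expectations and variances below are hypothetical, not taken from the study's data.

```python
import numpy as np

def infit_mnsq(observed, expected, variance):
    """Infit mean square for one element (e.g. one examiner): squared
    residuals summed over its responses, divided by the summed model
    variance. Values near 1 fit the Rasch model; below ~0.7 suggests
    overfit (overly predictable), above ~1.3 suggests misfit (erratic)."""
    observed, expected, variance = map(np.asarray, (observed, expected, variance))
    return float(((observed - expected) ** 2).sum() / variance.sum())

# Hypothetical ratings by one examiner vs. the model's expected ratings:
obs = np.array([6.0, 5.5, 7.0, 6.5, 6.0])
exp = np.array([6.1, 5.8, 6.6, 6.4, 6.2])
var = np.array([0.5, 0.5, 0.6, 0.5, 0.5])
print(round(infit_mnsq(obs, exp, var), 2))   # 0.12: overly predictable, i.e. overfitting
```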

Table 3: Test version measurement report (summary; full table not preserved in this copy)
Fixed (all same) chi-square: 6.9, d.f.: 4, significance: .14

Table 4: Examiner measurement report (summary; full table not preserved in this copy)
Fixed (all same) chi-square: 103.5, d.f.: 7, significance: .00
Inter-rater agreement opportunities: 328; exact agreements: 136 (41.5%); expected: 160.3 (48.9%)

Table 5: Rating scales measurement report (summary; full table not preserved in this copy)
Fixed (all same) chi-square: 15.8, d.f.: 3, significance: .00
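The agreement figures in Table 4 are simple proportions of FACETS' agreement opportunities, which is easy to verify:

```python
# Observed vs. model-expected exact inter-rater agreement (figures from Table 4)
opportunities = 328
exact = 136
expected = 160.3

print(f"exact agreement: {exact / opportunities:.1%}")        # 41.5%
print(f"expected agreement: {expected / opportunities:.1%}")  # 48.9%
```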

Three key observations emerge from Tables 3 to 5. First, the five test versions used in this phase of the research were of comparable difficulty. Second, the eight examiners differed considerably in their grading severity: Examiner Q was the most lenient, with a fair average score of 6.50, while Examiner K was the strictest, with a fair average of 5.74. This difference of 0.76 in fair average scores exceeds the examiner severity differences observed in the earlier phases (0.36 in Phase 1 and 0.52 in Phase 2). However, such severity differences among examiners are commonly found in speaking assessment and have been described by McNamara (2000) as 'a fact of life' of subjective rating. Third, the analysis indicated some variation in difficulty across the four rating categories: Fluency was the easiest, followed by Pronunciation, Lexis and Grammar, although the differences were minimal in fair average terms (Fluency 6.25 vs. Grammar 6.03).

The main findings of the score analyses are summarised below.

a) Dataset

The range of proficiency levels of the Phase 3 participants was higher than that of Phase 2 in China. The wide range of proficiency (Bands 4.0–8.5), with many of the test-takers scoring around Bands 5.5, 6.0 and 6.5, represents a range typical of international IELTS candidates [4].

b) MFRM analysis with FACETS

The MFRM analyses utilized a rating scale model with four facets (test-takers, test versions, examiners and rating scales), and no misfitting items were found in any facet. This absence of misfit is promising, indicating unidimensionality and a lack of systematic inconsistency, as noted by Bonk and Ockey (2003).

The results of the MFRM analysis provide further evidence of the scoring validity of the VC-delivered IELTS Speaking test. However, there was a severity difference of 0.76 in fair average scores between the most lenient and the harshest examiners. To address this, it is worth exploring solutions such as full or partial double marking, in line with the recommendations of the Phase 2 study (Nakatsuhara et al., 2017b).

The observed differences may be linked to the mode of delivery, but it is also possible that other factors are influencing the results. Further analysis of the data is necessary to investigate these potential issues more thoroughly.

Sound quality analysis

This section presents an analysis of sound quality and its impact on test performance, addressing RQ2: to what extent did sound quality affect performance on the video-conferencing test (a) as perceived by examiners, (b) as perceived by test-takers, and (c) as observed in test scores?

[4] See https://www.ielts.org/teaching-and-research/test-taker-performance for mean band scores for male and female test-takers and mean band scores by country.

5.2.1 Perceptions of sound quality by examiners and test-takers

As in the Phase 2 research, the examiner's rating sheet and the test-taker feedback questionnaire featured two key questions [5], inviting participants to elaborate on their responses if they chose to do so:

Q1. Do you think the quality of the sound in the VC test was…
[1. Not clear at all, 2. Not always clear, 3. OK, 4. Clear, 5. Very clear]

Q2. Do you think the quality of the sound in the VC test affected test-takers' (or 'your' in the test-taker questionnaire) performance?
[1. No, 2. Not much, 3. Somewhat, 4. Yes, 5. Very much]

Table 6 shows the perception of sound quality and its effect on performance by the examiners and test-takers.

Table 6: Sound quality perception by examiners and test-takers
Q1: Sound quality [1. Not clear at all – 5. Very clear]: examiners M=3.98, test-takers M=3.83
Q2: Affecting performance [1. No – 5. Very much]: examiners M=1.73, test-takers M=1.86
(SDs and paired-samples t-test values not preserved in this copy; neither difference was statistically significant)

The mean values for Q1 indicate that both the examiners (M=3.98) and the test-takers (M=3.83) rated the sound quality as approximately 'clear'; examiners perceived it slightly more positively, but the difference was not statistically significant. Both groups also agreed that the sound quality had little effect on test-takers' performance (examiners M=1.73, test-takers M=1.86). This contrasts with our Phase 2 results, which revealed significant discrepancies between examiners and test-takers: examiners rated the sound quality more favourably, while test-takers perceived a more substantial effect of sound quality on their performance.
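The comparison reported in Table 6 is a paired-samples t-test: the examiner's and the test-taker's ratings of the same session form a pair. A minimal sketch with hypothetical ratings, using SciPy:

```python
import numpy as np
from scipy import stats

# Hypothetical per-session Q1 ratings (sound quality, 1-5), paired because
# the examiner and the test-taker rate the same test session.
examiner_q1  = np.array([4, 4, 5, 3, 4, 4, 5, 4, 3, 4])
testtaker_q1 = np.array([4, 3, 5, 3, 4, 4, 4, 4, 3, 4])

t, p = stats.ttest_rel(examiner_q1, testtaker_q1)
print(f"t = {t:.2f}, p = {p:.3f}")   # p > .05 mirrors the non-significant difference above
```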

5.2.2 Perceptions of sound quality re: test-taker proficiency

Test-takers were then divided into three groups according to their overall VC test scores: Low (Band 5.5 and below), Medium (Band 6.0–6.5) and High (Band 7.0 and above). ANOVA revealed no significant differences in the perception of sound quality (Q1) across the three proficiency groups. For the perceived effect of sound quality on performance (Q2), low-level test-takers perceived a greater effect than the medium and high-level groups; however, post-hoc tests using Tukey HSD did not indicate statistically significant differences between the groups.

[5] For convenience, the two questions are numbered as Q1 and Q2 in this section, though these items had different question numbers in the test-taker and examiner questionnaires.

Table 7: ANOVA on test-takers' proficiency levels and sound quality perception by examiners and test-takers
(Columns: proficiency level, mean, SD, ANOVA, post-hoc test (Tukey HSD); cell values not preserved in this copy)
Q1: Sound quality [1. Not clear at all, 2. Not always clear, 3. OK, 4. Clear, 5. Very clear]
Q2: Affecting performance [1. No, 2. Not much, 3. Somewhat, 4. Yes, 5. Very much]
Post-hoc (Tukey HSD) for Q2: Low vs. Med (p=0.08), Low vs. High (p=0.08), Med vs. High (p=0.93)
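The analysis behind Table 7 (a one-way ANOVA across the three band-score groups, followed by Tukey HSD post-hoc comparisons) can be sketched as follows. The ratings and group sizes are hypothetical, for illustration only:

```python
import pandas as pd
from scipy import stats
from statsmodels.stats.multicomp import pairwise_tukeyhsd

# Hypothetical Q2 ratings ("did sound quality affect your performance?", 1-5)
# for test-takers grouped by overall VC band score, as in Table 7.
df = pd.DataFrame({
    "group": ["Low"] * 8 + ["Med"] * 8 + ["High"] * 8,
    "q2":    [3, 2, 3, 2, 2, 3, 2, 2,     # Low: Band 5.5 and below
              2, 1, 2, 2, 1, 2, 1, 2,     # Med: Band 6.0-6.5
              1, 2, 1, 1, 2, 1, 1, 2],    # High: Band 7.0 and above
})

groups = [g["q2"].values for _, g in df.groupby("group")]
f, p = stats.f_oneway(*groups)
print(f"ANOVA: F = {f:.2f}, p = {p:.3f}")

# Pairwise post-hoc comparisons with Tukey's HSD, as reported in Table 7
print(pairwise_tukeyhsd(df["q2"], df["group"]))
```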

Table 8 summarises correlations between test-takers' proficiency levels and examiners' and test-takers' perceptions of sound quality (Q1) and its effect on test-taker performance (Q2).

Table 8: Correlations between test-takers' proficiency levels and examiner/test-taker perceptions of sound quality
(Pearson correlations between proficiency level and each group's ratings; coefficient values not preserved in this copy, except the significant r=-0.28 reported below)
Q1: Sound quality [1. Not clear at all – 5. Very clear]
Q2: Affecting performance [1. No – 5. Very much]

The findings reinforce the ANOVA results, revealing a significant, albeit low, negative correlation (r=-0.28, p=0.01) between test-takers' perception of the impact of sound quality and their proficiency level. While lower proficiency test-takers reported a similar experience of sound quality in Q1, they tended to feel that their performance was more affected by it in Q2. This is likely because weaker test-takers struggle more with communication breakdowns arising from even slight clarity issues, whereas higher-level test-takers can better compensate using context or background knowledge. In other words, although lower proficiency test-takers did not rate the sound quality itself lower (Q1), they acknowledged a greater susceptibility to sound quality issues (Q2) than their higher proficiency counterparts.
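A minimal sketch of the correlation just reported, with hypothetical (band score, Q2 rating) pairs; a negative coefficient means that weaker test-takers felt more affected by sound quality:

```python
import numpy as np
from scipy import stats

band = np.array([5.0, 5.5, 5.5, 6.0, 6.0, 6.5, 6.5, 7.0, 7.5, 8.0])
q2   = np.array([3,   3,   2,   2,   2,   2,   1,   1,   1,   1])

r, p = stats.pearsonr(band, q2)
print(f"r = {r:.2f}, p = {p:.3f}")
```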

This study's sound quality analysis reaffirms our Phase 2 findings, indicating that VC technology effectively facilitated the speaking test Both examiners and test-takers rated the sound quality as 'clear,' with no significant differences in perceptions between the two groups, unlike in Phase 2 This improved alignment may be attributed to the bespoke platform introduced in this phase, which likely provided a more uniform experience Overall, the perceptions of sound quality remained consistent with those observed in Phase 2.

The impact of sound quality issues on test performance appears to vary by ability level, with weaker test-takers feeling more susceptible to these problems.

5.2.3 Perceptions of sound quality and problems encountered during test administration

Despite the VC technology performing adequately and receiving favorable sound quality ratings that did not appear to affect the scores, it is important to highlight that 70 out of 89 test sessions experienced technical issues, as noted by the examiners. While most of these problems were minor, the fact that nearly 80% of the sessions faced difficulties cannot be overlooked.

Examiners' comments on the technical and sound problems encountered are presented in full in Appendix 5, but selected comments are presented below, under two categories: (i) comments relating to sound delays, and (ii) comments relating to image freezing.

i) Comments relating to sound delays

• Consistent delays in audio/video. This seemed to affect the candidate (ExR, C016)

• Slight delay in sound – a little echo/delay when I spoke to candidate (ExO, C024)

• Delay continues to be the main problem. I find it more difficult to come up with questions (part 3) than usual (ExP, C035)

• Some problem with delay and synching of video/sound – this was more apparent at the beginning (ExN, C062)

• Still a slight delay of 3/4 seconds and so we interrupted each other a lot (ExM, C094)

ii) Comments relating to image freezing

• One small freeze, generally OK (ExQ, C017)

• There were a few times when the image froze but the audio was on, so it was not much of a problem (ExP, C036)

• Image froze. I can hear the candidate and she can hear me, so carried on until the end of Part 2 – then called the administrator and asked for help (ExL, C054)

• Video froze for a few seconds in the middle (ExN, C060)

• Image froze at some points but audio was OK (ExK, C076)

• At one point the image froze. We just carried on (ExM, C088)

Given the high-stakes nature of the IELTS Speaking test, the finding that nearly 80% of sessions experienced even minor technical issues raises important concerns. These problems must be addressed thoughtfully to ensure the successful implementation of the VC mode in the future.

Of course, in any test that relies on technology, some glitches will inevitably occur, and ways should be considered for minimising and/or handling them, such as providing […]

Perceptions of video-conferencing test, training and guidelines

Having examined test-takers' scores and both groups' perceptions of sound quality and its potential impact on performance and scoring, we now turn to RQ3 and RQ4: how did test-takers and examiners perceive the video-conferencing (VC) test, the new platform and the training provided for the VC test?

5.3.1 Test-takers’ perceptions of the VC test

Table 9 highlights the test-takers' perceptions of the revised training guidelines provided before the test. These guidelines were updated based on recommendations from our Phase 2 research to align with the new delivery platform. The feedback indicates that the revisions were effective, as test-takers rated the guidelines as 'useful' (Q5 M=4.07) and found the accompanying images 'helpful' (Q6 M=3.93), with both ratings showing an increase from Phase 2 (M=3.87 and 3.65, respectively).

Table 9: Test-takers’ perceptions of the test-taker guidelines for the VC test

Q5. Were the test-taker guidelines for the test… (1. Not useful – 3. OK – 5. Very useful): N=87, M=4.07, SD=0.96

Q6. Was the picture in the guidelines… (1. Not helpful – 3. OK – 5. Very helpful): N=84, M=3.93, SD=1.15

The test-takers also perceived the VC test itself positively (Table 10): on average they understood the examiner most of the time (Q7 M=4.46) and felt between 'OK' and 'comfortable' taking the test (Q8 M=3.46). Both ratings were more favourable than in Phase 2 (M=3.76 and 3.15, respectively).

The Phase 3 test-takers also reported that they found the VC test relatively 'easy' (Q9 M=3.89) and felt they had enough opportunity to demonstrate their speaking ability (Q10 M=4.05). The bespoke platform developed for this phase displayed the Part 2 prompts on screen, and test-takers found these on-screen prompts 'clear' (Q11 M=4.12).

Table 10: Test-takers’ perceptions of the VC test

Q7. How often did you understand the examiner in the VC test? (M=4.46)

Q8. Did taking the VC test make you feel… (1. Very nervous – 3. OK – 5. Very comfortable): N=87, M=3.46, SD=1.35

Q9. Did you feel taking the VC test was… (1. Very difficult – 3. OK – 5. Very easy): N=87, M=3.89, SD=0.88

Q10. Did you feel you had enough opportunity in the VC test to demonstrate your speaking ability? (1. Not at all – 3. OK – 5. Very much): (M=4.05)

Q11. In Part 2 (long turn), the prompt on the screen was… (1. Not clear at all – 3. OK – 5. Very clear): (M=4.12)

(Where N and SD are not shown, the values were not preserved in this copy; the means are those reported in the text.)

The test-taker perceptions of the VC guidelines, the VC test and the new platform were generally positive, reflecting an improvement compared to the Phase 2 research. This suggests that the revisions made to the guidelines and the development of the platform were successful.

Test-takers were invited to share open-ended comments alongside their feedback ratings. While all comments can be found in Appendix 2, selected remarks are categorized into three groups: (i) positive responses welcoming the VC test; (ii) constructive feedback for improvements; and (iii) comments relating to sound quality and technical concerns.

(i) Comments that welcome the VC test

• It was a good experience, modern and useful (C002)

• In comparison with regular exam, it is very similar and could be a good solution. Not too much difference to the personal interview (C016)

• I’d like to do this frequently (C049)

• I am proud to have participated in this excellent initiative and hope it becomes a regular evaluation method. Interacting with native English speakers is highly beneficial and welcome.

• I have been presented this exam with a real teacher and is almost the same

• This project is a valuable opportunity to connect with experienced English teachers from abroad, particularly given the challenging economic conditions in Venezuela that limit access to local English educators.

(ii) Comments that include constructive feedback for improvements

• The guidelines are comprehensive, but I sometimes struggled to read them thoroughly. Also, not being able to see the invigilators on the screen can feel embarrassing.

• In the second part it would be nice to have a timer to manage your speech (C009)

• In part 2, a bigger prompt on the screen would be better (C068)

• Maybe would be better with headphones (C086)

(iii) Comments that related to the sound quality and technical concerns

• The audio must be improve a little bit (C057)

• While I appreciate the concept, it is crucial to address the Internet connectivity issues in Venezuela, where disruptions can easily interrupt a conversation, conference or reading. High-quality sound from the speaker is essential for test-takers to have a seamless experience.

• There was 3 moments where the transmission freeze (C094)

5.3.2 Examiners’ perceptions of the VC test

After the training session and VC tests, feedback from examiners was gathered via questionnaires and focus group discussions to evaluate their perceptions of the VC test, the new platform, and the training provided for the VC test.

The results from the questionnaires are presented in conjunction with excerpts from the free comment boxes on the questionnaires and comments made in the focus group discussions

Regarding the content of the VC training they received, all eight examiners found it useful, as shown in Table 11.

Table 11: Results of Examiner Training Feedback Questionnaire
(Response counts across 1. Strongly disagree / 2. Disagree / 3. Neutral / 4. Agree / 5. Strongly agree)

Q1. I found the training session useful: 0 / 0 / 0 / 0 / 8 (100%)
Q2. The differences between the standard f2f test and the VC test were clearly explained. (counts not preserved in this copy)
Q3. What the VC room will look like was clearly explained: 0 / 0 / 0 / 3 (37.5%) / 5 (62.5%)
Q4. VC-specific techniques (e.g. use of preamble, back-channelling, gestures, how to interrupt) were thoroughly discussed. (counts not preserved in this copy)
Q5. The rating procedures in the VC test were thoroughly discussed: 0 / 0 / 0 / 0 / 8 (100%)
Q6. The training videos that we watched together were helpful: 0 / 0 / 0 / 0 / 8 (100%)
Q7. I had enough opportunities to discuss all my concern(s)/question(s) about the VC test. (counts not preserved in this copy)

Also, two examiners left positive comments in the free comment box of the questionnaire:

• Very clear review of procedures and relation to the VC project (Examiner N)

• The training was excellent. Very thorough, as all the procedures were explained. The training videos offered valuable, in-depth insights and clear visual guidance for the exam, and the role-plays involving mini invigilators, examiners and candidates helped alleviate my concerns and provided essential practice.

5.3.3 Administration of the VC test

After administering the VC tests, the examiners responded to another questionnaire: the Examiner Feedback Questionnaire (see Appendix 4), which evaluated their overall experiences and perceptions of the effectiveness of their training for test administration and rating, and compared the face-to-face and VC tests.

Table 12 summarises the results for the questions relating to test administration (based on a 5-point Likert scale; 1: Strongly disagree, 2: Disagree, 3: Neutral, 4: Agree, 5: Strongly agree). The means for all the questions are between 4 and 5, which suggests that the examiners generally felt comfortable and found it straightforward to administer the VC tests. It is also apparent that the training provided to the examiners was received positively, and that the contents of the training and the selection of materials were regarded as adequate.

Table 12: Results of Examiner Feedback Questionnaire on test administration (N; Mean; SD)

Q5. Overall I felt comfortable in administering the IELTS Speaking Test in the VC mode: 8; 4.38; 0.74
Q6. Overall the examiner training adequately prepared me for administering the VC test: 8; 4.63; 0.74
Q7. I found it straightforward to administer Part 1 (frames) of the IELTS Speaking Test in the VC mode: 8; 4.50; 0.76
Q8. The examiner training adequately prepared me for administering Part 1 of the VC test: 8; 4.63; 0.52
Q9. I found it straightforward to administer Part 2 (long turn) of the IELTS Speaking Test in the VC mode: 8; 4.25; 0.71
Q10. The examiner training adequately prepared me for administering Part 2 of the VC test: 8; 4.50; 0.53
Q11. I found it easy to handle task prompts on the screen in Part 2 of the VC test: (values not preserved in this copy)
Q12. I found it straightforward to administer Part 3 (2-way discussion) of the IELTS Speaking Test in the VC mode: 8; 4.50; 0.76
Q13. The examiner training adequately prepared me for administering Part 3 of the VC test: 8; 4.88; 0.35
Q14. The examiner's interlocutor frame was straightforward to handle and use in the VC mode: 8; 4.63; 0.74
Q15. The examiner training gave me confidence in handling the interlocutor frame in the VC test: 8; 4.50; 1.41

NB: The results of Q1 to Q4 are presented in Table 2 in Section 4.2.1.

Q9, on administering Part 2, shows a slightly lower mean score than the other questions. Examiners reported that displaying the Part 2 task prompt on the candidate's screen was easy (Q11); however, as Examiner N noted, there was a delay of about two seconds in presenting the task card, and the time taken for the invigilator to hand over the paper and pencil eats into the four minutes allocated for this part.

Conclusions

Summary of main findings

This follow-up study provides an in-depth analysis of test scores and of the perceptions and behaviour of both test-takers and examiners in the VC delivery mode of the IELTS Speaking test.

The findings for each of the research questions raised in Section 3 are summarised below.

RQ1: How well is the scoring validity of the video-conferencing tests supported by the four-facet MFRM analysis (i.e test-taker, rater, test version and rating category)?

All items within the four facets showed acceptable Infit values, indicating no systematic inconsistency in test scores. This lack of misfit reinforces the scoring validity of the VC tests conducted during this project phase.

RQ2: To what extent did sound quality affect performance on the video- conferencing test (as perceived by examiners, as perceived by test-takers, as observed in test scores)?

Both examiners and test-takers agreed on the clarity of sound quality during the assessments, and they reported no notable differences in their views regarding the influence of sound quality on test-takers' performance.

Lower proficiency-level test-takers reported that their performance was more affected by sound quality compared to higher proficiency-level test-takers. Although sound quality was mostly viewed positively, examiners noted that nearly 80% of test sessions experienced minor technical or sound issues.

RQ3: How did test-takers perceive the video-conferencing (VC) test, the new platform and training for the VC test?

Test-takers had a positive perception of the VC test, noting that the bespoke platform functioned satisfactorily with clear on-screen prompts. The revised guidelines were beneficial, and the accompanying pictures enhanced the overall experience.

RQ4: How did examiners perceive the video-conferencing (VC) test, the new platform and training for the VC test?

Examiners also perceived the VC test positively, reporting that they felt comfortable administering it and found it straightforward to do so. Six of the eight examiners noticed no significant differences in rating between the two delivery modes, and five believed that both modes gave candidates an equal opportunity to demonstrate their proficiency. Half expressed no preference between the modes, while three preferred the face-to-face mode. Some concerns were raised about the bespoke platform, particularly the additional time needed for Part 2. The training for the VC test was considered comprehensive and useful. Overall, the findings lend further support to the scoring validity of the VC-delivered IELTS Speaking test; however, consistent with the earlier phases, the study also identified issues inherent in video-conferencing delivery that need to be addressed before any operational implementation, and a number of recommendations are made to that end.

6.2 Implications of the study

The implications of the study are discussed in relation to four areas: (1) the scoring validity of the VC tests as indicated by the four-facet MFRM analysis (test-taker, rater, test version and rating category); (2) perceptions of sound quality and its effect on the scores awarded; (3) examiner and test-taker reactions to the newly developed on-screen prompts; and (4) perceptions of the extended training for examiners and the guidelines for test-takers. Other observations of potential value to test developers are also noted.

6.2.1 Scoring validity of the video-conferencing mode of delivery

The findings of the current Phase 3 study, like those of Phases 1 and 2, support the scoring validity of the VC-delivered mode of the IELTS Speaking test, as shown by the four facets analysed in the MFRM analysis. Although the proficiency range of the Phase 3 participants was higher than that of the Phase 2 participants in China, the wide range of proficiency found in this study (Bands 4.0–8.5), with many of the test-takers scoring around Bands 5.5, 6.0 and 6.5, was similar to that found in Phase 1 in London and represents a range typical of international IELTS candidates.

The study did, however, reveal significant differences in severity among the eight examiners, with a difference of 0.76 in fair average scores between the most lenient and the most severe examiner. This disparity is larger than the 0.36 difference observed in the Phase 2 study in China. One possible explanation is that examiners in China typically rate a more homogeneous population in terms of language proficiency, which may encourage more uniform severity in their evaluations of oral performance.

These differences might be attributable to the VC mode of delivery, but other factors could also be at play, and the greater variation in examiner severity observed here merits further investigation.
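For orientation, a rater's fair average can be written, in the notation of the model sketch above (again ours, not the report's), as the expected band score when all other facets are held at their mean levels:

\[ \mathrm{FairAvg}_i = \sum_k k \, P_k\big(\bar{B} - C_i - \bar{D} - \bar{E}\big), \]

where \(P_k(\cdot)\) is the model probability of band \(k\) given the combined measure. Because fair averages are expressed on the band scale, the 0.76 difference reported above implies that the same average performance could be expected to receive scores about three-quarters of a band apart from the most lenient and the most severe examiner.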

6.2.2 Sound quality and perceptions of the effect on scores

Stable Internet connections are required for clear sound quality, and meticulous preparation at the local site is essential for smooth administration of the VC test. The sound quality analysis in this study confirms the findings of Phases 1 and 2, indicating that the VC technology was effective in delivering the Speaking test. Notably, there were no significant differences between examiners' and test-takers' perceptions of sound quality, which may be attributable to the bespoke platform introduced in this phase, giving both parties a more uniform experience.

As in Phase 2, examiners and test-takers perceived sound quality similarly across proficiency levels, but lower-proficiency test-takers felt that their performance was more affected by sound quality issues. Although most problems were minor, 70 of the 89 test sessions (nearly 80%) experienced some technical or sound issue, which suggests that glitches are to some extent inevitable in video-conferencing tests even as the technology advances. Video-conferencing also prompts more explicit communication for negotiating meaning and managing turn-taking than face-to-face interaction, particularly when sound quality causes breakdowns in communication. Test providers should therefore acknowledge these interactional features as distinctive to video-conferencing communication and consider integrating them into the speaking construct of VC tests. Alongside efforts to minimise technical problems, an understanding of how video-conferencing reflects real-life online communication, as captured in the newly developed CEFR descriptors for online interaction, is essential for effective VC test administration and accurate score interpretation.

6.2.3 Perceptions of on-screen prompts by examiners and test-takers

The introduction of a clear on-screen prompt for Part 2 (long turn) in this project phase did not present any issues for the test-takers, although some suggested that increasing the prompt's size could enhance visibility.

Examiners found the on-screen prompts easy to manage, but noted that the process of displaying and removing these prompts, along with the time spent waiting for the invigilator to provide paper and pencil for notetaking, could significantly extend the overall exam duration.

Within the specified 4-minute timing for this part of the Speaking test, examiners often found it difficult to ask a rounding-off question, which they considered important. To allow time for this question, it may be worth extending Part 2 of the Speaking test by 30 seconds when it is delivered in VC mode, giving a total of 4.5 minutes for this part.

At this stage, it is not clear whether an invigilator would always be present in the VC test room. If test-takers are to be permitted to take notes, a standardised method of providing note-taking materials will need to be established. Making use of the computer or laptop itself for note-taking during the VC test might also be explored, although this suggestion requires discussion beyond the scope of the current report.

6.2.4 Perceptions of training for the VC Speaking test by examiners and test-takers

Analysis of the questionnaire data indicates that both examiners and test-takers found the training and guidelines for the VC Speaking test useful, with test-takers finding the pictures in the guidelines particularly helpful. Examiner feedback, however, suggests that the training programme should cover additional topics, in particular managing the technical equipment and dealing with potential technical problems. Examiners also emphasised the importance of continuous technical support of the kind provided during this phase of the project.

As in the Phase 2 study, a recurring theme in the Examiner Feedback Questionnaire responses and focus group discussions was initial unfamiliarity with the VC Speaking test. While the one-day training was considered useful, after the live test sessions some examiners said they would have welcomed additional training and practice sessions to become fully familiar with the modified Interlocutor Frame.

Examiners have typically memorised the language of the face-to-face Interlocutor Frame, and they found it challenging to attend to the revised Frame while simultaneously acting as both interlocutor and rater in the VC test. They recommended further modifications to the Interlocutor Frame to address elements specific to the VC format, including the use of on-screen prompts and the interlocutor's role in the note-taking process.

As noted in the Phase 2 report, the existing Interlocutor Frame was designed for the traditional face-to-face Speaking test. If the VC test is to be implemented, the Interlocutor Frame will need to be adjusted accordingly, and the degree of flexibility built into the Frame should be reconsidered so that it adequately accommodates the constructs assessed under VC conditions.

A total of 220 test-takers and 22 examiners participated in the three phases of the study, which were conducted in London, Shanghai (China), and four countries in Latin America. Supporting evidence for the scoring validity of the VC-delivered IELTS Speaking test was provided in each phase. However, as noted in Phases 1 and 2, the VC-delivered test appears to assess a somewhat different speaking construct from the face-to-face test, primarily because subtle cues such as gesture and voice inflection, which are integral to in-person communication, are attenuated or absent, and interactive communication in the VC test relies more on explicit negotiation of meaning and turn management. To align the test with this reality, the test specifications should be revisited to incorporate these elements, and the Interlocutor Frame revised to allow greater flexibility in handling clarification requests and managing the interaction. Such adaptation reflects the evolving nature of communication in digital contexts, which is increasingly prevalent in distance learning and in a wide range of social and business interactions.

Appendix 1: Double marking matrix

In each examiner column, test-taker IDs (e.g., C001) represent the individuals assessed by the examiner during live test sessions, while DMs denote those whose video-recorded performances received double marking by the examiner.

Appendix 2: Test-taker feedback questionnaire

Gender: (please circle) Male / Female Age:

Please complete this questionnaire together with the candidates, showing all the available options (1–5) to them. Tick the relevant boxes (1–5) according to the candidate's responses.

YOUR EXPERIENCE WITH TECHNOLOGY (please tick):

1 Never 2 3 Once or twice a week 4 5

Q1 How often do you use the Internet socially to get in touch with people?

Q2 How often do you use the Internet for your studies?

Q3 How often do you use video- conferencing (e.g Skype, Facetime) socially to communicate with people?

Q4 How often do you use video- conferencing for your studies?

Q5 Were the candidate guidelines for the test… 1 Not useful 2 3 OK 4 5 Very useful
Q6 Was the picture in the guidelines… 1 Not helpful 2 3 OK 4 5 Very helpful

Q7 How often did you understand the examiner in the VC test?
Q8 Did taking the VC test make you feel… 1 Very nervous 2 3 OK 4 5 Very comfortable
Q9 Did you feel taking the VC test was… 1 Very difficult 2 3 OK 4 5 Very easy

Q10 Did you feel you had enough opportunity in the VC test to demonstrate your speaking ability? 1 Not at all 2 3 OK 4 5 Very much

Q11 Do you think the quality of the sound in the VC test was… 1 Not clear at all 2 3 OK 4 5 Very clear

Q12 Do you think the quality of the sound in the VC test affected your performance?

Q13 In Part 2 (long turn), the prompt on the screen was… 1 Not clear at all 2 3 OK 4 5 Very clear

If you chose Option 1 or 2 for any of the questions from Q5 to Q13, please explain why.

C003 The window of the screen cut part of the document (word).

C004 After Part 2, screen was frozen and there was a delay of 2-3 seconds. So, we interrupted to each other twice.

C005 In part 2, when the prompt appears the face of the examiner is there and could not read all the text the examiner explained.

C008 Twice I could not understand because examiner spoke very soft.

The interview felt impersonal, lacking the connection necessary for optimal performance. I believe that sensing the energy and reactions of the examiner is crucial for a more engaging and effective interview experience.

C014 No, the sound was really good, I could hear the examiner pretty well.

C016 Three times I felt the sound was interrupted. I asked the examiner to repeat and she repeated again.

C017 Because through screen the interview is very impersonal, is very cold.

C019 Q11 I didn't hear very well because the connection was a bit bad.

C020 It's more because the pressure of the test. No VC itself. Feel nervous.

C024 I felt nervous because I don't like exams. The sound in the VC test was perfect.

C025 The quality of the sound didn't affect my performance because it was fine and clear.

C029 The quality of the sound was very good.

C030 Well, it didn't affect my performance. It was good and I felt great.

The sound quality during the VC test was satisfactory and did not hinder my performance; however, I believe a face-to-face interaction would have been more effective. The virtual format made me slightly nervous, which occasionally impacted the clarity of communication.

C033 I don't think my performance were affected because the sound of the video was good, she could understand me if I understand what she was asking me.

C037 The sound of the VC test was a bit behind, so one would view her moving before the sound was clear.

C039 Sound was ok. It did not affect my performance.

C040 I felt no time enough to answer questions –Quite short.

C043 As candidate I believe is very important to have a clock to measure our time. Sometimes the screen was not really clear so I don't know why but could be better to improve the video-conference program, technology or the quality of the Internet.

Incorporating a clock during the second part would help me deliver a more focused and effective response. Additionally, the delay in video-conferencing can hinder concentration on the speech.

Not many schools provide advanced tools for studying, which can hinder the learning experience. Although it didn't significantly impact the overall outcome, the slow connection occasionally caused the image to freeze and the sound to experience interference.

C048 I was not affected by the VC.

C049 Q5 The test proved highly beneficial, highlighting areas for improvement and testing my fluency. While the prompt was sometimes slow to appear, overall the experience was satisfactory.

C052 It was a convenient because the quality of the sound was perfect.

C053 I felt very well, it was amazing and I loved it.

C054 Q3 – I use more Whatsapp than Skype or video-conferencing socially to communicate with people Q12 - No, the evaluator heard me fine.

C055 Q4 Because usually I study by myself; if I need someone to explain something to me I'd rather be face-to-face. Q3 I'd rather to text. Q12 It didn't affect my performance.

C057 I felt nervous because is not common for me to speak English with a stranger that is testing my speaking ability.

C059 I felt nervous because it was my first time on VC in English. It's a good option for the student to improve on his learning path.

C062 The quality of the sound was very good so I think it did not affect my performance at all.

C063 In Q12 because the sound in the test was good and I was very nervous.

C064 I felt that the sound was too soft sometimes, not enough loud for me.

C065 Because I can notice his facial expression and make me lost eye contact. Because sometime the sound was interrupted (Q7 and Q13).

The sound quality during the examination was inconsistent, causing interruptions that made it challenging to follow the examiner's speech. I often relied on lip-reading to aid my understanding, but this proved difficult through video-conferencing due to delays between the audio and visual elements. Additionally, the absence of labels on the screen created complications until the issue was eventually resolved.

C077 I would prefer an interview face to face, definitely.

Many individuals experience anxiety when speaking English, particularly during evaluations. Additionally, it can be challenging to form a well-rounded opinion in a limited timeframe; extending the discussion by an extra 10 minutes could greatly enhance the quality of responses.

C080 I was very nervous but I could handle it. I always feel nervous when I have to talk in English. The quality of the sound was perfect for me, so it doesn't affect time.

C081 It was very good, I liked the experience, with talked with someone with Skype. It was very helpful. I want to do again this experience.

C082 Sound was good enough to show my performance.

C083 I do not chose that option until Q12 where I say no because it is not affect my performance the quality of the sound in any moment.

C086 The sound need to be improved, maybe putting and stereo speakers.

In Venezuela, I prefer using alternative channels such as YouTube videos for studying instead of video-conferencing due to frequent connection issues. This makes me slightly anxious, as I worry about the efficiency of the Internet connection. However, my overall performance was not impacted by the sound quality.

C092 Q3 – Not enough time Q11 – The sound was low.

C093 Q3 I prefer to use Whatsapp Q4 I don't need it.

C094 I could not understand some parts cause the audio quality.

C095 The quality of the sound was very good It didn't affect my performance.

C098 I think it is necessary to put other speaker and it is not necessary to focus on screen.

C100 I choose option 2 in Q6 because I didn't see any picture in the guidelines.

Are there any other positive or negative points that you'd like to highlight?

Feeling nervous upon entering the room is common, so taking a moment to test the sound and ensure everything is functioning properly can enhance the experience. This modern approach not only alleviates anxiety but also proves to be highly beneficial.

Examiner was kind and polite.

C003 It was interesting. Not very common and different. The experience was positive.

C004 I found it is a positive experience, the sound was good. Maybe the first minute was weird but after I felt comfortable with the examiner and the exam.

C005 It was very positive. Video and sound was my concern but they were good.

C006 All is positive. Help you to feel good and comfortable. Maybe is not good for shy people, but for me is ok.

The guidelines contain extensive information, and I sometimes struggle to read them thoroughly Additionally, the inability to see the invigilators on screen can lead to feelings of embarrassment during the examination process.

C008 It is a good way to take the exam. It is a good experience. But I prefer to have someone in front not video.

C009 In the second part it would be nice to have a timer to manage your speech.

C010 The methodology is positive, I think this is a good platform and technology.

C011 I felt very nervous, more than normal when you talk face-to-face with someone. Quite cold. It went so fast. No time to think.

C013 I felt nervous because of the person who was behind me.

C015 Most of things are positive. Sometimes I asked to repeat the question and examiner did it.

C016 In comparison with regular exam, it is very similar and could be a good solution. Not too much difference to the personal interview.

C020 Body expression is lost for the interviewer.

C021 It's very important to check sound (connection) quality.

C049 I'd like to do this frequently.

C050 It was ok. Sometime the sound wasn't ok. But in general is a good experience and the questions because I felt comfortable. I feel a little nervous because was my first time.

C051 Sometimes there was a delay with the VC and I couldn't understand properly what the examiner was saying, she had to repeat me the words.

C054 The experience was very interesting and useful to me and it helped me to understand better this kind of experience.

C055 At some point I didn't have the time to finish what I was saying.

C057 The audio must be improve a little bit.

C060 Say how much time it is available for every question.

C063 It was a good experience to practice.

C064 I am proud to have participated in this excellent initiative and I hope it becomes a regular part of the evaluation process. Engaging with native English speakers is a valuable experience that enhances learning.

C065 Negative, maybe a bigger screen should be better Try to improve the speed of

C066 Too long the waiting time, make more nervous the person. If you give us more video exams options, we should improve our performance.

C068 In part 2, there could be a bigger prompt on the screen.

In my experience, I prefer taking exams in person Having taken the IELTS five years ago, I find the face-to-face format to be easier, especially considering my intermediate level of proficiency.

C080 The examiner makes me feel comfortable, was really nice Great experience.

C081 The sound was a little low, but screen was good, I could see the teacher.

To enhance focus during Part II (long turn), candidates should be provided with a printed version of the question This approach eliminates the need for note-taking on the question itself, allowing candidates to concentrate on jotting down notes for their answers.

C083 Nothing, it was a great experience to know more or less my knowledge in this moment.

C086 Maybe would be better with headphones.

C089 It was a very nice experience.

C090 I have been presented this exam with a real teacher and is almost the same

This project presents a unique opportunity to connect with experienced English teachers from abroad, filling the gap created by the lack of resources for English education in Venezuela Given the current economic challenges, collaborating with international educators can enhance language skills and provide valuable insights.

While I appreciate the concept, it's essential to consider the Internet connectivity issues in Venezuela, where interruptions can disrupt conversations, conferences, or reading sessions Ensuring high-quality sound from the speaker is crucial for individuals taking the test.

C094 There was 3 moments where the transmission freeze.

C095 Positive – the examiner was very helpful and friendly She made me feel comfortable.

C100 It was a great experience and I recommend use it for IELTS test, thanks.

Appendix 3: Examiner training feedback questionnaire

Please circle your Examiner ID: K L M N O P Q R

Tick the relevant boxes according to how far you agree or disagree with the statements below.

1 Strongly disagree 2 Disagree 3 Neutral 4 Agree 5 Strongly agree

Q1 I found the training session useful 8 (100%)

Q2 The differences between the standard F2F test and the VC test were clearly explained.

Q3 What the VC room will look like was clearly explained 3 (37.5%) 5 (62.5%)

Q4 VC specific techniques (e.g use of preamble, back-channelling, gestures, how to interrupt) were thoroughly discussed.

Q5 The rating procedures in the

VC test were thoroughly discussed 8 (100%)

Q6 The training videos that we watched together were helpful 8 (100%)

Q7 I had enough opportunities to discuss all my concern(s)/ question(s) about the VC test.

Additional comments? Do you have any suggestions for improving the training session?

Examiner N: Very clear review of procedures and relation to the VC project.

Examiner O: The training was excellent. Very thorough, as all the procedures were explained, etc.

The training videos were particularly beneficial, offering in-depth insights and clear visual guidance about the exam Additionally, the role-playing exercises involving mini invigilators, examiners, and candidates helped alleviate my concerns and provided valuable practice.

Your feedback will be very useful for improving the training session.

Appendix 4: Examiner Feedback Questionnaire

Today you administered and rated a number of IELTS Speaking Tests using video- conferencing (VC) technology.

To help inform an evaluation of this mode of delivery and rating, we’d welcome comments on your experience of administering and rating the IELTS Speaking Tests.

Years of experience as an EFL/ESL teacher? years months

Years of experience as an IELTS examiner? years months

YOUR EXPERIENCE WITH TECHNOLOGY (please tick):

1 Never 2 3 Once or twice a week 4 5

Q1 How often do you use the Internet socially to get in touch with people?

Q2 How often do you use the Internet to teach?

Q3 How often do you use video- conferencing (e.g Skype, Facetime) socially to communicate with people?

Q4 How often do you use video- conferencing to teach?

Tick the relevant boxes according to how far you agree or disagree with the statements below.

1 Strongly disagree 2 Disagree 3 Neutral 4 Agree 5 Strongly agree

Results

Q5 Overall I felt comfortable in administering the IELTS Speaking Test in the VC mode.
Q6 Overall the examiner training adequately prepared me for administering the VC test.
Q7 I found it straightforward to administer Part 1 (frames) of the IELTS Speaking Test in the VC mode.
Q8 The examiner training adequately prepared me for administering Part 1 of the VC test.
Q9 I found it straightforward to administer Part 2 (long turn) of the IELTS Speaking Test in the VC mode.
Q10 The examiner training adequately prepared me for administering Part 2 of the VC test.
Q11 I found it easy to handle task prompts on the screen in Part 2 of the VC test.
Q12 I found it straightforward to administer Part 3 (2-way discussion) of the IELTS Speaking Test in the VC mode.
Q13 The examiner training adequately prepared me for administering Part 3 of the VC test.
Q14 The examiner's interlocutor frame was straightforward to handle and use in the VC mode.
Q15 The examiner training gave me confidence in handling the interlocutor frame in the VC test.

(The examiners' responses to Q5–Q15 are summarised in Table 12.)

If you chose Option 1 or 2 for any of the questions from Q5 to Q15, please explain why.

Examiner N: I found the delays affected the timings for the Parts, especially Part 1 A few seconds delay for each question adds up and I found it difficult to deliver 3 frames.

Examiner O: It did feel weird, very weird at first as an examiner I felt nervous as this is a new platform for me; initially I was really speaking loudly.

Are there any other positive or negative points that you'd like to highlight?

In Part 2 of the examination, if a candidate struggles and overlooks certain prompts, the examiner is unable to directly indicate which prompts were missed The only assistance allowed is a general inquiry like, "Can you tell me more?" To enhance the process, it might be beneficial to incorporate additional backup prompts into the script Additionally, delays can make it challenging to adhere to strict timing, and interrupting or stopping the candidate can be difficult to execute smoothly.

Examiner M: Generally felt very comfortable with the tests Perhaps with weaker students it was more challenging, as some didn't seem to know about all the different parts.

Examiner N: The delays do impact on the interactions, however, I do feel that a good sample can be elicited and the candidates' level can be assessed.

Examiner O: Positives: sound was very clear and great It's just that the sound although clear was delayed.

During the VC test, Examiner P experienced challenges with timing due to delays in image and voice This lag often resulted in a few seconds passing before the candidate recognized that they needed to stop speaking when interrupted.

To enhance eye contact during the examination, it is recommended to position the camera at a level that aligns better with the candidate's face Additionally, incorporating the question "What's your name?" into the pre-test script could be beneficial, although there won't be time for follow-up questions in Part II of the assessment.

Examiner R found the new pre-test conversation with candidates to be highly beneficial The red-highlighted instructions and commands in Part 2 were particularly useful While the platform was easy to navigate, the examiner expressed a desire for additional practice opportunities, such as mock interviews.

1 Strongly disagree 2 Disagree 3 Neutral 4 Agree 5 Strongly agree

Results

Q16 Overall I felt comfortable in rating candidate performance in the VC test.
Q17 Overall the examiner training adequately prepared me for rating candidate performance in the VC test.
Q18 I found it straightforward to apply the Fluency and Coherence scale in the VC test.
Q19 The examiner training adequately prepared me for applying the Fluency and Coherence scale in the VC test.
Q20 I found it straightforward to apply the Lexical Resource scale in the VC test.
Q21 The examiner training adequately prepared me for applying the Lexical Resource scale in the VC test.
Q22 I found it straightforward to apply the Grammatical Range and Accuracy scale in the VC test.
Q23 The examiner training adequately prepared me for applying the Grammatical Range and Accuracy scale in the VC test.
Q24 I found it straightforward to apply the Pronunciation scale in the VC test.
Q25 The examiner training adequately prepared me for applying the Pronunciation scale in the VC test.
Q26 I feel confident about the accuracy of my ratings in the VC test.
Q27 The examiner training helped me to feel confident with the accuracy of my ratings on the VC test.

If you chose Option 1 or 2 for any of the questions from Q16 to Q27, please explain why.

Examiner M: My issues relate to delays within the interactions only; rating was not a problem.

Examiner O: I feel I need to spend more time on rating for the VC test I must admit that I'm not as confident marking on the VC platform as during the live tests.

Are there any other positive or negative points that you'd like to highlight?

At the start of the day, I was preoccupied with concerns about connectivity, the script, timing and topic conditions, which made it harder to rate the candidate comfortably; later in the day, once I was more familiar with the process, I felt more relaxed.

Examiner O: This was an exciting experience The VC platform made for a more interesting dynamic at times More so than the F2F.

Examiner P: Pronunciation is perhaps the most difficult grade to give because sometimes the audio is not as clear as it is during a F2F interview.

Examiner Q: Especially at first, I was focusing on the technology and not so much the ratings. After a while I felt more comfortable.

Examiner R reported that overall connectivity was good throughout the day, facilitating the rating of all criteria without issues However, there was a concern regarding a 2-3 second delay in video and audio with the candidate, which hindered the ability to time and pose questions smoothly and naturally.

4 COMPARING THE EXPERIENCE OF THE STANDARD FACE-TO-FACE (F2F) AND THE VIDEO-CONFERENCING (VC) MODE FOR THE IELTS SPEAKING TEST

Q28 Which mode of speaking test do you feel more comfortable with? 4 (50.0%) 0 3 (37.5%) 1 (12.5%)

Q29 Which mode of speaking test do you feel is easier for you to administer? 3 (37.5%) 1 (12.5%) 3 (37.5%) 1 (12.5%)

Q30 Which mode of speaking test do you feel is easier for you to rate? 2 (25.0%) 0 6 (75.0%) 0

Q31 Which mode of speaking test do you think gives candidates a better chance to demonstrate their level of English proficiency?

Q32 Which speaking test do you prefer? 3 (37.5%) 0 4 (50.0%) 1 (12.5%)

Are you aware of doing anything differently in your examiner role across the two speaking test modes – face-to-face and video-conferencing? If yes, please give details…

In virtual speaking tests, Examiner K noted a shift towards more deliberate speech to ensure candidates comprehended the questions, especially since lower-level candidates frequently requested repetitions, a rarity in face-to-face interviews Initially anticipating awkwardness with posture and eye contact in video conferencing, K found that once accustomed to the delay and able to prevent overlapping dialogue, the interaction felt as natural as being in the same room with the candidates.

Examiner L expresses uncertainty about administering VC tests at this early stage, feeling more comfortable with face-to-face (F2F) interviews However, as the day went on, their comfort level increased, to the point where they nearly forgot the candidate was not physically present They believe that with more time, they could feel equally at ease with both testing modes.

At the start of the day, managing both computer and recording device notes posed some challenges for Examiner M, but by the time they reached the third or fourth candidate, the process became more streamlined.

I was fine. When there is a slight delay it often meant we were speaking at the same time on occasions. However, this was only true with a few candidates.

Examiner N expresses a preference for face-to-face (F2F) interactions, but notes that if the delays in question delivery to candidates were minimized, they would be open to either method They highlight that while there were occasional sync issues between video and sound, the delays had a more significant effect on the overall experience.

Examiner O expressed a need for more time to adapt to the platform, noting a preference for face-to-face interactions They suggested that reducing delays could bridge the gap between the two formats, potentially eliminating differences in experience.

Examiner P acknowledges that delays impact the delivery of exam frames and expresses a preference for face-to-face speaking tests However, they recognize the potential benefits of the virtual classroom (VC) test, particularly for remote locations Although they typically take time to adapt to new technologies, they are confident that they would eventually become comfortable with administering this type of exam.

Examiner Q: Not enough time to ask follow-up Qs for Part II.

Examiner R noted that the candidates' inexperience with the VC mode might make them appear nervous, but felt this could easily be mitigated through practice and training, adding: "Thank you for the opportunity to participate."

Thank you for answering these questions.

Appendix 5: Sound and image problems encountered by examiners

Cand ID Exmr ID Comments

C001 Q Slight 2 sec delay but didn't interfere with process

The audio quality seemed slightly muffled, a hollow or box-like sound in the recording. However, there were no issues with adapting to the new format or technology. The video image was clear, and the highlighted red instructions in Part 2 were particularly helpful.

C004 R Marked delay (2–3 seconds) in both video and sound

C006 R It seemed as if the quality of the VC presented the candidate with many sustained difficulties

C008 R During the 2nd half of Part 3 the screen size (from my view) got larger and then it went back to normal. Some background noise (that I could hear in candidate's venue) but did not seem to affect candidate.

C010 R Some noise (talking) coming from room in Medellin There was some delay 2–3 secs in audio Made it a little hard for me to time my next question.

C011 Q Delay had a slight negative effect (examiner and candidate spoke over each other)

C012 R Screen (from where I sat) kept changing size Some delay in audio and some 'freezing' in video

Delays in sound and video can complicate the timing of asking questions. There was a lot of noise in our Bogota room, which made communication difficult; when I asked the invigilator, she said they were unable to hear anything.

C016 R Consistent delays in audio/video; this seemed to affect the candidate

C017 Q One small freeze, generally OK Slight 2 sec delay

C019 Q Slight issue at beginning Candidate couldn't hear question.

C020 R A lot of delays. It's very hard to fathom when the candidate will have heard the entire question.

C021 Q One Q broken up: asked for repetition. Froze during Part 2.

C023 O Graded the candidate only up to Part 2. Major delays in sound – I felt like I was speaking really unnaturally.
C024 O Slight delay in sound – a little echo/delay when I spoke to candidate.

C025 P It takes me longer to switch from Part 2 to Part 3 There was less delay in this interview.

Delays can pose challenges in communication with candidates, particularly in signalling when they should start or stop speaking. The lack of notice before recording begins also complicates the process. Furthermore, displaying the prompt card on the screen during a candidate's long turn restricts visibility, leaving only a small view of the candidate.

C028 P Delay significantly hampers my transition from the instructions (Part 2) to the preparation time, and from the end of the individual long turn to Part 3. Slight beeps on the screen and a brief green flash occurred during the interview, but these were momentary and did not detract from an otherwise good audio-visual experience.

The video recording experienced a brief delay in the initial segment, with minor audio bleeping during the introduction. At the start, the screen momentarily turned green and froze for less than a second, but the candidate remained clearly visible and audible throughout the recording.

The intro frame sound experienced noticeable delays, particularly in Part 1, which occasionally affected the candidates' responses to questions. However, once I became aware of this issue, it became easier to manage.

Delays in an interview can lead to a slower pace, as it's essential to ensure that the interviewee has fully understood your questions. Longer pauses between sections allow for clearer communication and ensure that the person has completed their thoughts before moving on.

The interview's audio quality was generally good; however, there were notable sound delays, particularly in Part 3, where a lag of 1–2 seconds occurred. Initially, the candidate in Mexico struggled to hear me and the volume had to be increased on their end, but the issue was resolved during the session.

C035 P Delay continues to be the main problem I find it more difficult to come up with questions (Part 3) than usual.


C041 O Delay in sound made things seem a bit unnatural. I forgot to remove the instruction card in Part 2.

C043 O In Part 1 the sound cut out briefly, although it has generally been clear. However, there were noticeable delays, and the candidate seemed to take longer to respond to questions.

C045 O Delays in sound meant that we were… speed/flow/candidate interaction

During the conclusion of Part 2, I inadvertently switched off the recording while attempting to take down the topic card, mistakenly clicking the record button instead. As a result, the audio recording continued until the interview's conclusion.

Despite some initial delays and brief image freezes, I adapted to the timing, and the overall quality of the interview remained unaffected.

C048 O Screen went blank – delay in sound

C054 L Image froze. I could hear the candidate and she could hear me; carried on until the end of Part 2, then called the administrator and asked for help.

C057 L I couldn't take the topic card down at the end of Part 2 Had to call the administrator Went on with card on screen.

C059 L The candidate said once or twice that he couldn't hear me very well

C060 N Video froze for a few seconds in the middle

C061 N Candidate seemed to lose video feed of me. Problem corrected alone. Audio distortion at about 9 mins 40 secs, corrected after about 10 secs.
C062 N Some problem with delay and synching of video/sound – this was more apparent at the beginning.
C063 N Delay was about 3–5 seconds, which affected dynamics a little.

In Part 2, the time constraints make it challenging to ask the ROQ (rounding-off question) without risking delays. Although delays persist, I am adapting to the situation. The invigilator noted some sound disruptions on the candidate's side, but I did not observe any issues myself.

C066 N Delays – some impact. Interestingly, the candidate didn't hold eye contact for much of the test.

C067 N The candidate didn't understand – it isn't clear to me if this was due to the quality of the sound or her language level.

C070 N Delays have some impact but not seriously

C071 N A problem with the card, which was not visible to the candidate; it was difficult to log out and re-enter the video call, but the issue was eventually resolved.
C073 L Image froze briefly, for a few seconds, towards the end of the interview.

C075 K Some delay – card was not showing

C076 K Image froze at some points but audio was OK.

C077 K Delays/candidate often needed repetition (language? Or sound quality?)

C078 K Some delay which caused overlapping (very fluent candidate)

C079 K Some delay – slight pixilation – did not interfere

C081 K Candidate's slow delivery plus delay made communication awkward sometimes. Timing was affected.
C082 K Some overlapping due to delays and candidate style of delivery.

C083 K In Part 2 when candidate gets stuck you cannot point at items on the card. Shall we read them to the candidate?

C086 K Image froze in Part 3 but audio was OK

C088 M At one point the image froze We just carried on.

C089 M I forgot the recording again for the first 2 minutes. It froze for 20 seconds.

C091 M The sound was better in this exam. The delay, only occasionally.

C092 M Throughout the test there was a 3/4 second delay so there was some overlapping between me and the candidate.

C094 M Still a slight delay of 3/4 seconds and so we interrupted each other a lot.

C096 M Loud noise of plane(?) flying overhead at one moment on candidate's side.
