Quantitative and Qualitative Evaluation of Darpa Communicator Spoken Dialogue Systems

Scientific report: "Quantitative and Qualitative Evaluation of Darpa Communicator Spoken Dialogue Systems"

... user’s travel plans both at the beginning of the dialogue and also after Quantitative and Qualitative Evaluation of Darpa Communicator Spoken Dialogue Systems Marilyn A. Walker AT&T Labs – ... labels and summed the total effort expended on each type of dialogue act over the dialogue or the percentage of a dialogue given over to a particular type of dialogue behavior. These sums and ... achieve a better understanding of the role of qualitative aspects of each system’s dialogue behavior. We quantify the extent to which the dialogue act metrics improve our understanding by applying the...

Uploaded: 31/03/2014, 04:20

8 319 0
Quality of Telephone-Based Spoken Dialogue Systems

... number of system words uttered in a dialogue average number of time-out prompts in a dialogue average number of turns in a dialogue average number of user questions in a dialogue average number of ... be observed, and is thus a prerequisite for setting up better theories and systems. The development of spoken dialogue systems requires not only a change in the focus of speech and language research ... determine the quality of the developed systems, and the resulting satisfaction of their users. As a wide range of novice users is the target group of current state-of-the-art systems and services, the...

Uploaded: 27/06/2014, 11:20

385 207 0
Quality of Telephone-Based Spoken Dialogue Systems, Part 1

... Telephone-Based Spoken Dialogue Systems 20 internal correction, anticipation, and prediction. Examples of such systems are given in Section 2.1.3.7. Multimodal dialogue systems including speech: Systems of ... be observed, and is thus a prerequisite for setting up better theories and systems. The development of spoken dialogue systems requires not only a change in the focus of speech and language research ... determine the quality of the developed systems, and the resulting satisfaction of their users. As a wide range of novice users is the target group of current state-of-the-art systems and services, the...

Uploaded: 07/08/2014, 21:20

46 293 0
Quality of Telephone-Based Spoken Dialogue Systems, Part 2

... outcome of an assessment or evaluation experiment 5 . Spoken language systems are relatively complex systems which offer a number of different (and ill-defined) functions. The functions of the ... provision of help to the user, the correction of errors and misunderstandings, the interpretation of complex discourse phenomena like ellipses and anaphoric references, and the organization of information ... by the dialogue manager are the collection of all information from the user which is needed for the task, the distribution of dialogue initiative, the provision of feedback and verification of information...

Uploaded: 07/08/2014, 21:20

49 243 0
Scientific report: "You Can’t Beat Frequency (Unless You Use Linguistic Knowledge) – A Qualitative Evaluation of Association Measures for Collocation and Term Extraction"

... for CE (Wermter and Hahn, 2004) and for ATR (Wermter and Hahn, 2005), which have been shown to outperform several of the statistics-only metrics. 3 Methods and Experiments 3.1 Qualitative Criteria Because ... in CE and ATR) because it has been shown to be the best-performing statistics-only measure for CE (cf. Evert and Krenn (2001) and Krenn and Evert (2001)) and also for ATR (see Wermter and Hahn ... Press. Stefan Evert and Brigitte Krenn. 2001. Methods for the qualitative evaluation of lexical association measures. In ACL’01/EACL’01 – Proceedings of the 39th Annual Meeting of the Association...

Uploaded: 31/03/2014, 01:20

8 435 0
Scientific report: "Methods for the Qualitative Evaluation of Lexical Association Measures"

... curves (Figures 3 and 4), we find: (i) Examination of 50% of the data in the SLs leads to identification of between 75% (AdjN) and 80% (PNV) of the TPs. (ii) For the first 40% of the SLs, and lead to ... discussion of the excluded low-frequency candidates). 4 Experimental Setup After extraction of the base data and manual identification of TPs, the AMs are applied, resulting in an ordered candidate ... instance, 80% of the full set of PNV data and 58% of the AdjN data are hapaxes. Thus it is important to know how many (and which) true collocations there are among the excluded low-frequency candidates. 5.1...

Uploaded: 20/02/2014, 18:20

8 516 0
Scientific report: "Correlation between ROUGE and Human Evaluation of Extractive Meeting Summaries"

... ICASSP. X. Zhu and G. Penn. 2005. Evaluation of sentence selection for speech summarization. In ACL Workshop on Intrinsic and Extrinsic Evaluation Measures for MT and/or Summarization. X. Zhu and G. ... those of the authors and do not necessarily reflect the views of NSF. References J. Carbonell and J. Goldstein. 1998. The use of MMR, diversity-based reranking for reordering documents and producing summaries. ... Informative Coverage (IC): S2 and S9; Informative Relevance (IRV): S3 and S8; and Informative Redundancy (IRD): S4 and S7. 4 Results 4.1 Correlation between Human Evaluation and Original ROUGE Score Similar...

Uploaded: 17/03/2014, 02:20

4 293 0
Scientific report: "Correlating Human and Automatic Evaluation of a German Surface Realiser"

... are. Belz and Reiter (2006) and Reiter and Belz (2009) describe comparison experiments between the automatic evaluation of system output and human (expert and non-expert) evaluation of the same ... evaluation of a string realisation system usually involves string comparisons between the output of the system and some gold standard set of strings. Typically automatic metrics from the fields of ... corpus of 200 million words of newspaper and other text. Cahill and Forst (2009) describe a number of experiments where they collect judgements from native speakers about the three systems...

Uploaded: 23/03/2014, 17:20

4 285 0
Scientific report: "Comparing Automatic and Human Evaluation of NLG Systems"

... Background 2.1 Evaluation of NLG systems NLG systems have traditionally been evaluated using human subjects (Mellish and Dale, 1998). NLG evaluations have tended to be of the intrinsic type (Sparck Jones and ... communicative goal; and that corpus texts are often not of high enough quality to form a realistic test. 2.2 Automatic evaluation of generated texts in MT and Summarisation The MT and document summarisation ... algorithms, and data sets. BLEU and related metrics work by comparing the output of an MT system to a set of reference (‘gold standard’) translations, and in principle this kind of evaluation...

Uploaded: 24/03/2014, 03:20

8 376 0
DELIVERING HEALTH EDUCATION VIA THE WEB: DESIGN AND FORMATIVE EVALUATION OF A DISCOURSE-BASED LEARNING ENVIRONMENT

... structure and navigation, readability of text, appropriateness of graphics and icons, clarity and quality of information, suitability of external links, and clarity and perceived motivating and discussion ... environment. A variety of data collection protocols and tools were developed to collect quantitative and qualitative data. Pre and post tests related to HIV/AIDS and nutrition will allow for quantitative comparison ... formative evaluation of the Web site have a varied teaching background in terms of level and content and are focusing their postgraduate studies on the design, development and evaluation of technology-based...

Uploaded: 28/03/2014, 21:20

12 411 0