
Artificial Intelligence in Medicine (2006) 38, 25-46
http://www.intl.elsevierhealth.com/journals/aiim

An intelligent tutoring system that generates a natural language dialogue using dynamic multi-level planning

Chong Woo Woo a, Martha W. Evens b,*, Reva Freedman c, Michael Glass d, Leem Seop Shim e, Yuemei Zhang f, Yujian Zhou g, Joel Michael h

a School of Computer Science, Kookmin University, 861-1 Chongnung-Dong, Sungbuk-Ku, Seoul, Republic of Korea
b Computer Science Department, Illinois Institute of Technology, Room 236, 10 West 31st Street, Chicago, IL 60616, USA
c Northern Illinois University, De Kalb, IL 60115, USA
d Valparaiso University, Valparaiso, IN 46383, USA
e HS Tech, Inc., 26500 Agoura Road, Suite #108, Calabasas, CA 91302, USA
f Wells Fargo - N9301-01J, 255 Second Avenue South, Minneapolis, MN 55479, USA
g WebEx Communications, Inc., 3979 Freedom Circle, Santa Clara, CA 95054, USA
h Department of Molecular Biophysics and Physiology, Rush Medical College, 1750 West Harrison, Chicago, IL 60612, USA

Received 16 February 2005; received in revised form 14 October 2005; accepted 21 October 2005

KEYWORDS: Intelligent tutoring system; Natural language dialogue; Instructional planning; Dynamic planning; Hierarchical planning; Reactive planning; Language understanding; Dialogue generation

Summary

Objective: The objective of this research was to build an intelligent tutoring system capable of carrying on a natural language dialogue with a student who is solving a problem in physiology. Previous experiments have shown that students need practice in qualitative causal reasoning to internalize new knowledge and to apply it effectively, and that they learn by putting their ideas into words.

Methods: Analysis of a corpus of 75 hour-long tutoring sessions carried on in keyboard-to-keyboard style by two professors of physiology at Rush Medical College tutoring first-year medical students provided the rules used in tutoring strategies and tactics, parsing, and text
generation. The system presents the student with a perturbation to the blood pressure, asks for qualitative predictions of the changes produced in seven important cardiovascular variables, and then launches a dialogue to correct any errors and to probe for possible misconceptions. The natural language understanding component uses a cascade of finite-state machines. The generation is based on lexical functional grammar.

Results: Results of experiments with pretests and posttests have shown that using the system for an hour produces significant learning gains and also that even this brief use improves the student's ability to solve problems more than reading textual material on the topic. Student surveys tell us that students like the system and feel that they learn from it. The system is now in regular use in the first-year physiology course at Rush Medical College.

Conclusion: We conclude that the CIRCSIM-Tutor system demonstrates that intelligent tutoring systems can implement effective natural language dialogue with current language technology.

© 2005 Elsevier B.V. All rights reserved.

* Corresponding author. Tel.: +1 312 567 5153; fax: +1 312 567 5067. E-mail address: evens@iit.edu (M.W. Evens).
0933-3657/$, see front matter. doi:10.1016/j.artmed.2005.10.004

1. Introduction

1.1 Research goals

The goal of this research was to develop an intelligent tutoring system capable of carrying on a natural language dialogue with students. Our system was originally conceived by two professors at Rush Medical College, Joel Michael and Allen Rovick, inspired by the conviction that natural language interaction was the most effective way for students to learn. They observed that students learn best when required to give explanations of their thinking, and they instituted small group problem-solving sessions and individual tutoring sessions to supplement their own computer-aided instruction (CAI) systems [1,2].

CIRCSIM-Tutor is designed to
help first-year medical students learn to solve problems involving the baroreceptor reflex system, which stabilizes blood pressure in the human body. The system presents a perturbation to the cardiovascular system and asks the student to make qualitative predictions about changes in seven important cardiovascular parameters. It analyzes these predictions, identifies any errors, and assists the student in correcting them. The design of the system is based on the analysis of human tutoring sessions carried on by Michael and Rovick at Rush Medical College. This analysis convinced us that planning plays a central role in the generation of a tutorial dialogue.

When CIRCSIM-Tutor asks the student to make predictions about the behavior of various cardiovascular parameters, it asks only whether each parameter rose, fell, or stayed the same, because the focus is on qualitative causal reasoning. Practicing physicians do not generally need to know the numeric values of these parameters, but they need to use this kind of qualitative reasoning every day [3]. Michael and Rovick originally used a detailed mathematical model in their CAI system, but they found that students lost track of the underlying ideas when they tried to handle detailed models and tables of numbers [4]. So from the very beginning of this project one of the goals was to teach qualitative reasoning [5-7], an approach to problem solving that reasons about the causal relationships that structure our world. Anderson [8] argues that qualitative reasoning is the most demanding approach, one that is essential to a high-performance tutoring system. He claims that it can also maximize pedagogical effectiveness, because it is human-like reasoning, although the implementation effort is much larger than that required for the traditional models.

1.2 Evolution of computer-based instruction at Rush Medical College

Michael and Rovick had a great deal of experience with CAI in the cardiovascular domain at Rush Medical College. Their
systems evolved from HEARTSIM [1], to CIRCSIM [2,9], to the CIRCSIM-Tutor prototype [10,11], and finally to CIRCSIM-Tutor, which has itself evolved over almost 15 years [12]. HEARTSIM was a Plato program and CIRCSIM is a stand-alone Basic program. The CIRCSIM-Tutor prototype was a Prolog prototype of our intelligent tutoring system, designed and implemented without any natural language capabilities [10]. CIRCSIM-Tutor uses many features from the CIRCSIM-Tutor prototype, but it is written in Lisp and it includes much more complete student modeling, instructional planning, and natural language facilities.

1.3 Natural language dialogue and tutoring systems

We made natural language dialogue the core of our system because it is an especially powerful tool for learning. Chi et al. [13] have now produced scientific verification of our belief that putting ideas into your own words is a central part of learning. In an ingenious experiment, Fox Tree [14] demonstrated that people remember ideas that they have heard discussed in dialogue better than ideas heard in a monologue on the same topic. Perhaps this is why Plato's Dialogues have survived for 2400 years while so much other Greek learning has been lost. Fox [15] pointed out that, in tutoring dialogues, the tutors and the students typically construct the answer together, so the students remain active participants and also share ownership in the result.

The builders of the first intelligent tutoring systems, Carbonell [16], Carr and Goldstein [17], Burton and Brown [18], and Collins and Stevens [19], all assumed that tutoring should be carried out via natural language dialogue. Then the difficulty of natural language processing and the attractions of the new graphical user interfaces drew intelligent tutoring systems research in a different direction. When this project began, CIRCSIM-Tutor was alone in the field apart from Wilensky's [20,21] Unix Consultant, which is really a coach and not
a tutor. There was important research in dialogue-based intelligent tutoring systems carried on by Woolf [22,23] and Cawsey [24], but they used template-based generation and limited (partly menu) input rather than trying to handle whatever the user typed and generating responses from scratch.

Happily, CIRCSIM-Tutor is not so lonely anymore. Increases in machine capability have made natural language dialogue much more manageable, and knowledge of text generation has increased rapidly. One notable example is Atlas [25,26], a physics tutor at the University of Pittsburgh. VanLehn started from Andes, a successful cognitive tutor for physics, and assembled a top team to provide natural language interaction. Freedman's Atlas Planning Environment carried out the dialogue planning [27]. Rosé's [26,28] Carmel parser did the parsing and produced a logical representation of the student input. Jordan et al. [29,30] used Tacitus-Lite for reasoning about the analysis of the student's essay and also developed knowledge creation dialogues for dialogue generation.

Graesser and his group at the University of Memphis built AutoTutor [31], based on their studies of human tutoring; it uses latent semantic analysis (LSA) to handle natural language understanding and generation. For natural language understanding, they used LSA to match the student input to one or several ideal answers. They used LSA in generation as well, to pick out the most relevant answers to a question from a collection of texts generated by experts.

With the encouragement of a multi-university research grant from the Office of Naval Research, the Atlas and AutoTutor project teams joined forces to build tutors for qualitative physics using their two different approaches. The current generation of tutors resulting from this research, Why2-Atlas [32] and Why2-AutoTutor [33], has been shown to be more effective than reading text, the practical alternative for most university courses. Both tutors present the student with a
problem and ask the student to write a short essay giving an answer and an explanation of the reasoning behind it. Then the system critiques the essay and helps the student to improve its content. VanLehn's team has also built the Pyrenees tutor [34], another physics tutor much like Atlas, except that it discusses problem-solving algorithms with the student in explicit terms, which gives a significant improvement. Making use of these results, Lane and VanLehn [35] have recently developed another dialogue-based tutor for introductory programming students that emphasizes the understanding of the algorithms involved. These tutors must analyze input with longer and more complex content than CIRCSIM-Tutor sees, but their dialogue is not as interactive.

CATO, developed by Ashley and Aleven, is designed to help law students learn techniques of argumentation. Ashley is a well-known expert in legal artificial intelligence, while Aleven contributes the natural language expertise [36,37]. As a by-product of this research, the authors carried out an experiment demonstrating the efficacy of Socratic tutoring over more didactic tutoring.

A series of experiments by Di Eugenio [38] showed that improving the quality of the natural language generation of an existing system can make a significant difference in learning outcomes. Moore's BEETLE [39,40], designed to teach basic electricity and electronics to Navy recruits, uses even more sophisticated generation techniques. Her group is using these excellent natural language capabilities to encourage students to participate more fully, by responding to student affect and playing an effective part in the mixed-initiative dialogues that result. Forbus and Rosé have combined forces to give Forbus's well-known CyclePad tutor for engineering design [6] a new interface called CycleTalk [41], which has the ability to hold tutoring dialogues with considerable success.

The systems described so far, like CIRCSIM-Tutor, all make use of written interaction, but the age of
speech-enabled tutoring has begun. Litman has added a speech front-end to Atlas to produce ITSpoke [42]. In a series of well-designed experiments, she has shown that ITSpoke produces even better learning outcomes than Atlas. Stanley Peters and his team at the Center for the Study of Language and Information at Stanford University have added speech to Wilkins' Naval Damage Control simulation to produce SCoT [43]. Now that this tutor has been shown to be effective, it may suggest a way to provide training for various types of emergency response teams. In summary, there are now several groups making important contributions to our knowledge of dialogue-based tutoring.

1.4 Domain of CIRCSIM-Tutor: the baroreceptor reflex

The cardiovascular system consists of many mutually interacting components, and it is important for the student to understand the cause and effect relationships between the individual components of the system. Fig. 1 shows a causal model of CIRCSIM-Tutor, called the "concept map," designed by Michael and Rovick [44,4]. Each box in the map represents a physiological variable, such as SV (stroke volume) or MAP (mean arterial pressure). An arrow with a plus or a minus sign between two boxes tells the direction of the causal effect and whether the causal relationship between the connected variables is direct or inverse. For example, a qualitative change in one component of the system, a decrease in CVP (central venous pressure), directly causes a decrease in SV. This qualitative change propagates to other adjacent components of the system according to the propagation rule. It is important for the student to recognize that when the baroreceptors sense a change in MAP, the baroreceptor reflex kicks in and the central nervous system (CNS in the diagram) directly manipulates three neural variables, the heart rate (HR), the inotropic state (IS), and the total peripheral resistance (TPR), in order to regulate MAP.

There are three stages in the human body's response to a
perturbation in the system that controls blood pressure. The first stage is the direct response (DR), in which a perturbation in the system has an immediate physical, hemodynamic effect on the other parameters. The second stage is the reflex response (RR), in which other parameters are affected by the negative feedback mechanism that stabilizes the blood pressure. The final stage is the steady state (SS), which is achieved as a balance between the changes directly caused by the initial perturbation and the further changes induced by the negative feedback process.

1.5 Organization of this paper

In the next section, we describe what the system looks like from the user's point of view, display a sample fragment of dialogue, and give a brief report of the system trial in November 1999, which demonstrated that an hour with the system produced larger learning gains than reading a carefully chosen piece of text for the same amount of time. In Section 3, we describe some of the special features of the CIRCSIM-Tutor system. The rest of the paper describes how the system works to produce the kind of dialogue shown in the example. In Section 4, we describe the system architecture, and in the subsequent sections we describe each major module in the system and how it functions. Section 5 discusses the core issue of planning and describes the many kinds of planning that are needed for expert tutoring. Section 6 describes the domain knowledge base and the problem solver, and Section 7 describes the screen manager. Section 8 discusses some different approaches to understanding the student input. Section 9 describes the student modeler and the different types of assessment that the system makes of the student's performance. Section 10 describes our approach to generating output, and Section 11 presents our conclusions.

Figure 1. The causal concept map. (An arrow from box A to box B means that parameter A immediately determines parameter B. A plus sign indicates that this relationship is direct; a
minus sign indicates that it is inverse.) RV: venous resistance, PIT: intrathoracic pressure, CVP: central venous pressure, CBV: central blood volume, BV: blood volume, SV: stroke volume, CO: cardiac output, MAP: mean arterial pressure, BR: baroreceptor reflex, CNS: central nervous system, IS: inotropic state, HR: heart rate, TPR: total peripheral resistance.

Table 1. List of available procedures
  Decrease arterial resistance (Ra) to 50% of normal
  Denervate the baroreceptors
  Decrease Ra to 50% of normal in a denervated preparation
  Hemorrhage: remove 0.5 l of blood
  Hemorrhage: remove an additional 1.0 l of blood
  Decrease cardiac contractility to 50% of normal
  Increase venous resistance to 200% of normal
  Increase intrathoracic pressure to mm Hg

Table 2. The CIRCSIM-Tutor prediction table (DR: direct response; RR: reflex response; SS: steady state)
  Columns: Parameters, DR, RR, SS
  Rows: Inotropic state, Central venous pressure, Stroke volume, Heart rate, Cardiac output, Total peripheral resistance, Mean arterial pressure
  Entries as extracted, in reading order: − + − − − + − + + + + +

2. CIRCSIM-Tutor in action

2.1 How CIRCSIM-Tutor interacts with the student

CIRCSIM-Tutor begins with a brief introductory message and then displays a list of eight available procedures (shown in Table 1). These procedures were developed by Michael and Rovick for use in the CIRCSIM program and were inherited by CIRCSIM-Tutor. Each procedure (so called because these procedures replaced experimental procedures with animals) describes a perturbation of the cardiovascular system. As soon as the student has made a choice, the system brings up the screen in Fig. 2, with a description of the procedure in the window on the upper right and the prediction table underneath. Table 2 shows a larger diagram of the prediction table. The first column is used to enter qualitative predictions for the DR phase, before the baroreceptor reflex kicks in. A popup menu allows the student to enter a "+" sign to indicate an
increase, a "−" for a decrease, and a "0" to indicate no change. CIRCSIM-Tutor asks the student to figure out which variable will change first and enter the change for that variable in the corresponding square. If the student has difficulty in doing this, the system gives the student a hint. If that hint does not work, it produces a broader hint. If the student's third try is still wrong, the system tells the student the answer. Once the student has succeeded in predicting the first variable, the system asks for predictions for the rest of the first column without giving any feedback until the student has predicted all six remaining variables. The system then marks any errors with a diagonal bar across the box and starts a remedial dialogue with the student about these errors, as shown in the figure. After the student has corrected all the errors in the DR column, the system asks for predictions for the RR phase, then again marks any errors, and begins another tutorial dialogue. Once the RR errors have been corrected, the system asks for predictions about the behavior of these parameters in the SS phase (the third and last column) and then again launches a tutorial dialogue.

Figure 2. A CIRCSIM-Tutor screen from version 2.8 (November 1999).

2.2 The prediction table/multiple simultaneous inputs

CIRCSIM-Tutor begins with a prediction table, in which the student is asked to make qualitative predictions about the behavior of the system given a particular perturbation. CIRCSIM-Tutor inherited the prediction table from CIRCSIM [2,9]. This very successful, widely used system asks the student to fill in all three columns of predictions at once, recognizes certain patterns of errors, and then delivers one of over 240 targeted remedial paragraphs stored in the system. Michael and Rovick were convinced that the prediction table was an important factor in the effectiveness of this older system.

Although we believe that immediate feedback is valuable (which is why CIRCSIM-Tutor gathers only one
column of predictions at a time), we feel that the advantages of using the prediction table outweigh that value. First, the prediction table provides the student with a simple mental model of the task and a way of keeping track of current progress in the solution process. Second, CIRCSIM-Tutor can make a much more detailed and sophisticated student model. It records errors and error patterns. Some error patterns violate fundamental equations; others suggest the possible presence of important misconceptions. Based on a careful analysis of these errors, the tutor can generate a lesson plan, and interactive tutoring begins by using a mixed-initiative Socratic strategy in natural language.

Thus, the prediction table provides a qualitative simulation environment for the student by requiring multiple simultaneous inputs (multiple responses to different aspects of a problem provided by the student in a single uninterrupted turn) before interactive tutoring begins. There are several benefits of adopting this kind of design strategy. First, the system receives enough initial knowledge about the student that it can narrow the focus for tutoring. Second, it can also detect some common student misconceptions [45,46] and probe for them further. Third, the presence of a simple mental model of the entire domain prevents the students from getting too far off the track. Elsom-Cook [47] argues that using multiple pedagogic strategies can provide a very powerful learning environment. CIRCSIM-Tutor begins with a coach-like environment during the prediction table entry, and then moves to Socratic tutoring for the interactive tutoring session. This kind of flexibility provides a fourth benefit: the system can adapt rapidly to the needs of individual students.

2.3 A fragment of a CIRCSIM-Tutor dialogue from November 1999

Here is a fragment of dialogue between a student and CIRCSIM-Tutor in November 1999, in which a student (Subject #2) is tutored about a prediction error in the change
in SV in the RR phase. The procedure that the student is solving is "decrease inotropic state by 50%." This situation can result from several common diseases. As it happens, this student had made all the DR predictions correctly. The prediction table in Table 2 shows the DR and RR predictions made by this student.

T1: What are the determinants of Stroke Volume?
S2: CVP and IS
T3: Correct, Stroke Volume is determined by Central Venous Pressure and Inotropic State. Which determinant is dominant in this case?
S4: CVP
T5: Correct, the important determinant of Stroke Volume in this case is Central Venous Pressure. Is the relationship from Central Venous Pressure to Stroke Volume direct or is it inverse?
S6: direct
T7: Good, you got the correct answer. What is the correct value of Stroke Volume?
S8:
T9: Correct, the value of Stroke Volume is decreased. Please change the value of SV in the prediction table.

In this example, student #2 gives the expected answer at each stage as we carry out a tutoring strategy that was abstracted from expert tutoring sessions [48], as described later in the paper. Now let us look at a more typical example, where the student has some difficulty with some of these questions. This second example comes from student #35, who made exactly the same prediction error for SV in the RR phase. The system starts off with the same strategy in T11 as in T1, but the student comes up with only one of the determinants in S12, so the system asks for the other in T13. In S14, the student's answer is not too brief but too long. The system gives an explanation in T15 and follows up with a hint.

T11: What are the determinants of Stroke Volume?
S12: Inotropic state
T13: Right, Inotropic State is one determinant of Stroke Volume. The other determinant represents preload. What is the other determinant of Stroke Volume?
S14: Preload, afterlaod and inotropic state
T15: Right, Inotropic State is one. MAP, or afterload, does affect SV, but the effect is not quantitatively important in this case. Think about what parameter in the prediction table represents preload.
S16: Preload
T17: What parameter in the prediction table represents preload?
S18: Central Venous Pressure
T19: Right, Central Venous Pressure determines preload, therefore determines Stroke Volume.

In this second example the student's answer "preload" in S16 is categorized as a "near miss." The student is certainly not wrong, and we do not want the system to respond with "wrong" in this situation, as the first version of the system did. Instead, the system gives the student yet another nudge in the right direction. When the student comes up with the right parameter name, the system recapitulates what is happening and ties it all up. The student goes on to correct the prediction for SV. We discovered that hinting is an important strategy for human tutors, and we have analyzed human hints [45] in some detail and implemented them in CIRCSIM-Tutor [49].

2.4 Brief description of the results of the CIRCSIM-Tutor experiment in November 1999

We carried out an extensive experiment to validate CIRCSIM-Tutor in November 1998, with 50 first-year medical students at Rush Medical College, which is described in detail in Michael et al. [50]. In November 1999, we carried out another experiment with a control group, which showed that these students learn more about solving problems in an hour with CIRCSIM-Tutor than in an hour of reading carefully chosen text from a standard textbook. This experiment demonstrated that CIRCSIM-Tutor works and led to its routine use at Rush. It was carried out in a regularly scheduled 2-h laboratory. All of the students took a pretest. A control group containing 28 students read a specially edited chapter on the baroreceptor reflex, excerpted from Heller and Mohrman's Cardiovascular Physiology [51] by our experts. The
experimental group (with 22 students) used CIRCSIM-Tutor. A third group of 23 students used CIRCSIM. All of the students took a posttest. We had earlier developed two comparable tests, tests a and b. In each group, half of the students took test a as pretest and the other half took test b as pretest. The students who had taken test a as pretest took test b as posttest, while those who took test b as pretest took test a as posttest. Each test had three parts: relationship questions, problem-solving questions, and multiple-choice questions. A later analysis showed that the pairs of multiple-choice questions were not comparable, so we will not report those results. Finally, the students who had used CIRCSIM-Tutor filled out a brief survey form asking for their reactions to the system. More details about this experiment can be found in Evens and Michael [12].

The system performed well. It did not crash, and 60% of the students completed all eight procedures. The students made 96 spelling errors and the system corrected 91 of them. It came up with something appropriate to say in response to all but six of the 1692 dialogue inputs. In those six cases, in spite of the inappropriate responses by the system, the student was able to figure out how to keep going and continue the session.

A summary of the test results appears in Table 3. Using one-tailed t-tests and assuming equal var-

Table 3. Results of the CIRCSIM-Tutor experiment at Rush Medical College in November 1999

                                  Pretest mean (S.D.)   Posttest mean (S.D.)   Gain (pre to post)   p value   Effect size
Control (N = 28)
  Relationship points (max 24)    14.1 (4.8)            19.9 (4.5)             5.8
  Correct predictions (max 20)    12.2 (3.0)            13.8 (2.6)             1.6
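
The gain figures reported for the control group are the posttest mean minus the pretest mean, and the robustness figures quoted in the text (spelling corrections, appropriate responses) reduce to simple ratios. The short Python sketch below reproduces that arithmetic from the numbers given above. The variable and function names are illustrative choices, not part of CIRCSIM-Tutor.

```python
# Control-group means (N = 28) as reported in the results table.
control = {
    "relationship points": {"pre": 14.1, "post": 19.9},
    "correct predictions": {"pre": 12.2, "post": 13.8},
}


def gain(pre: float, post: float) -> float:
    """Mean gain: posttest mean minus pretest mean, rounded to one decimal."""
    return round(post - pre, 1)


def percent(part: int, whole: int) -> float:
    """Share of `whole` covered by `part`, as a percentage with one decimal."""
    return round(100 * part / whole, 1)


gains = {name: gain(m["pre"], m["post"]) for name, m in control.items()}

# Robustness figures quoted in the text.
spelling_corrected = percent(91, 96)           # 91 of 96 spelling errors corrected
appropriate_replies = percent(1692 - 6, 1692)  # all but six of 1692 inputs

print(gains)
print(spelling_corrected, appropriate_replies)
```

Rounding to one decimal matches the precision of the reported means and keeps floating-point noise out of the comparison.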

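
The qualitative propagation described in Section 1.4, where a change in one variable is passed along signed causal links, can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the system's Lisp implementation. The edge list is a small hypothetical subset of direct (+) relationships chosen for the example and is not a transcription of the paper's concept map. The sketch also ignores the reflex feedback loop and does not resolve competing influences on the same variable.

```python
from collections import deque

# Signed causal links: +1 means direct, -1 means inverse.
# ASSUMPTION: an illustrative subset, not the paper's full concept map.
EDGES = {
    "CVP": [("SV", +1)],
    "IS": [("SV", +1)],
    "SV": [("CO", +1)],
    "HR": [("CO", +1)],
    "CO": [("MAP", +1)],
    "TPR": [("MAP", +1)],
}


def propagate(start: str, change: int) -> dict:
    """Breadth-first propagation of a qualitative change (+1 or -1).

    Each variable keeps the first value that reaches it; conflicting
    influences are not resolved in this sketch.
    """
    values = {start: change}
    queue = deque([start])
    while queue:
        var = queue.popleft()
        for target, sign in EDGES.get(var, []):
            if target not in values:
                values[target] = values[var] * sign
                queue.append(target)
    return values


# A decrease in CVP lowers SV, which in turn lowers CO and MAP.
print(propagate("CVP", -1))
```

A breadth-first walk is used here so that each variable takes its value from the nearest upstream cause. A fuller model would need a rule for choosing the dominant determinant when two causes disagree, as the tutor does in the dialogue when it asks which determinant is dominant.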