INTRODUCTION TO COMPUTER SCIENCE - PART 7 pptx

INTRODUCTION TO COMPUTER SCIENCE
HANDOUT #7. AUTOMATA

K5 & K6, Computer Science Department, Văn Lang University
Second semester, Feb 2002
Instructor: Trần Đức Quang

Major themes:
1. Patterns and Pattern Matching
2. Finite State Machines and Automata
3. Deterministic and Nondeterministic Automata

Reading: Sections 10.2 and 10.3.

7.1 PATTERNS AND PATTERN MATCHING

A pattern is a set of objects with some recognizable property. One type of pattern is a set of character strings, such as the set of legal C identifiers, each of which is a string of letters, digits, and underscores, beginning with a letter or underscore.

Given a pattern and an input, the process of determining whether the input matches the pattern is called pattern matching, a problem also known as pattern recognition. In compiling, for example, an essential step is to recognize the patterns of language constructs in a program before translating the program into the desired code. Let's look at an illustration of the first phase of this process. Consider an if-statement in C,

    if (a==b) x = 1;

A C compiler reads the input characters from the left, one at a time, and collects them into small groups of characters (lexemes or tokens), each matching some lexical pattern. This phase is called lexical analysis. Our statement, for example, may be grouped into the following tokens, each with its own pattern:

1. The keyword if
2. The left parenthesis (
3. The identifier a
4. The comparison operator ==
5. The identifier b
6. The right parenthesis )
7. The identifier x
8. The assignment operator =
9. The integer 1
10. The statement-terminator ;

White space characters (blanks, tabs, and newlines) are also eliminated in this phase.

7.2 STATE MACHINES AND AUTOMATA

Programs that search for patterns often have a special structure. We can identify certain positions in the code at which we know something particular about the program's progress toward its goal of finding an instance of a pattern.
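As a small illustration of both ideas, the sketch below checks a string against the C-identifier pattern of Section 7.1. It is only one possible way to write such a check; the function name is_identifier is an assumption made for this example, and keywords such as if are not treated specially. The comments mark two positions in the code at which we know how far the match has progressed.

```c
#include <ctype.h>
#include <stddef.h>

/* Hypothetical helper: returns 1 if s is a nonempty string of letters,
   digits, and underscores that begins with a letter or an underscore,
   and 0 otherwise. Keywords are not excluded. */
int is_identifier(const char *s)
{
    size_t i = 0;

    /* Position 0: nothing has been read yet, so the next character
       must be a letter or an underscore. */
    if (!(isalpha((unsigned char)s[i]) || s[i] == '_'))
        return 0;
    i++;

    /* Position 1: a valid first character has been seen; every later
       character may be a letter, a digit, or an underscore. */
    while (s[i] != '\0') {
        if (!(isalnum((unsigned char)s[i]) || s[i] == '_'))
            return 0;
        i++;
    }
    return 1;
}
```

For instance, is_identifier("_tmp1") returns 1, while is_identifier("1abc") returns 0 because the first character is a digit.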
We call these positions states. The overall behavior of the program can be viewed as moving from state to state as it reads its input.

To see the behavior of such a program, we can draw a graph with a node for each state and an arc for each move from one state to another (called a transition). A graph for a program recognizing English words with five vowels in order is shown below:

[Figure: automaton with states 0 through 5; start arrow into state 0; arcs labeled a, e, i, o, u and Λ − a, Λ − e, Λ − i, Λ − o, Λ − u.]

There are two important states in this graph, one with an incoming arc labeled start (state 0), and the other drawn with a double circle (state 5). The former, the start state, is the state in which we begin to recognize the pattern; the latter, the accepting state, is the state we reach after having found our pattern, at which point we "accept". An automaton may have several accepting states but only one start state.

Such a graph is called a finite automaton or just automaton. We can design a pattern-matching program by first designing the automaton, then mechanically translating it into a program. I will give an example in the next section.

An automaton can be viewed as a state machine consisting of a finite control, an input tape, and a head that reads a sequence of symbols written on the tape. At any time during its operation, the machine reads a symbol on the tape, changes its state, and moves the head one symbol to the right. A picture of such a machine is shown in a later figure.

7.3 DETERMINISTIC AND NONDETERMINISTIC AUTOMATA

The automaton discussed in the previous section has an important property. For any state s and any input character x, there is at most one transition out of state s whose label includes x. Such an automaton is said to be deterministic.

It is straightforward to convert deterministic finite automata (DFA) into programs. We create a piece of code for each state. The code for state s examines its input and decides which of the transitions out of s, if any, should be followed.
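The five-vowel automaton of Section 7.2 is small enough to simulate in a few lines. The following is a minimal sketch, not the per-state scheme just described: it keeps the current state in an integer variable and advances it when the awaited vowel is read. The function name has_vowels_in_order is an assumption made for this example, and the input is assumed to be a word in lowercase letters.

```c
/* Hypothetical simulation of the five-vowel automaton.
   Returns 1 if word contains a, e, i, o, u in that order (not
   necessarily adjacent), and 0 otherwise.
   Assumes the word is written in lowercase letters. */
int has_vowels_in_order(const char *word)
{
    static const char awaited[] = "aeiou"; /* awaited[s] leads out of state s */
    int state = 0;                          /* state 0 is the start state */

    /* Stop at the end of the word or once accepting state 5 is reached. */
    while (*word != '\0' && state < 5) {
        if (*word == awaited[state])
            state++;   /* transition from state s to state s + 1 */
        /* any other letter: remain in the same state */
        word++;
    }
    return state == 5;
}
```

For instance, has_vowels_in_order("facetious") returns 1, while has_vowels_in_order("sequoia") returns 0 because its only a comes after the other vowels.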
If a transition from state s to state t is selected, then the code for state s must arrange for the code of state t to be executed next, perhaps by using a goto statement.

Suppose we have a DFA for a bounce filter. You need not understand its meaning. Just observe that the DFA has the start state a and the two accepting states c and d, and that it examines the input characters 1 and 0. From this DFA, we can mechanically produce a simple program by following the guidelines above. A resulting program is given below.

[Figure: a state machine; a finite control reads an input tape holding the symbols i f ( a = =.]

[Figure: bounce filter DFA with states a, b, c, d; start arrow into state a; arcs labeled 0 and 1.]

    #include <stdio.h>

    void bounce()
    {
        int x;  /* int rather than char, since getchar() returns an int */

        /* state a */
    a:  putchar('0');
        x = getchar();
        if (x == '0') goto a; /* transition to state a */
        if (x == '1') goto b; /* transition to state b */
        goto finis;

        /* state b */
    b:  putchar('0');
        x = getchar();
        if (x == '0') goto a; /* transition to state a */
        if (x == '1') goto c; /* transition to state c */
        goto finis;

        /* state c (accepting) */
    c:  putchar('1');
        x = getchar();
        if (x == '0') goto d; /* transition to state d */
        if (x == '1') goto c; /* transition to state c */
        goto finis;

        /* state d (accepting) */
    d:  putchar('1');
        x = getchar();
        if (x == '0') goto a; /* transition to state a */
        if (x == '1') goto c; /* transition to state c */
        goto finis;

        /* any input other than 0 or 1, including end of input, ends the run */
    finis: ;
    }

Although it is easy to convert a DFA into a program, designing the DFA in the first place is more difficult. In fact, there is a generalization of DFAs that is conceptually more natural. Automata of this kind, called nondeterministic finite automata (NFA for short), may have two or more transitions containing the same symbol out of one state.

Note that a DFA is technically an NFA as well, one that happens not to have multiple transitions on one symbol.

NFAs are not directly implementable by programs, but they are useful conceptual tools for a number of applications.
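Although an NFA cannot be turned into code state by state the way a DFA can, a program can still follow one indirectly by keeping track of the set of states the NFA might be in. The sketch below does this for a hypothetical four-state NFA, chosen for this example only, that accepts strings ending in man: state 0 loops on any symbol and also moves to state 1 on m, state 1 moves to state 2 on a, state 2 moves to accepting state 3 on n. The function name nfa_accepts and the bit-mask representation are likewise assumptions of the sketch.

```c
/* Hypothetical NFA simulation. Bit i of a mask is set when the NFA
   could currently be in state i.
   Returns 1 if the NFA can be in accepting state 3 after reading s. */
int nfa_accepts(const char *s)
{
    unsigned current = 1u << 0;   /* initially only the start state 0 */

    for (; *s != '\0'; s++) {
        unsigned next = 0;

        if (current & (1u << 0)) {
            next |= 1u << 0;          /* state 0 loops on any symbol */
            if (*s == 'm')
                next |= 1u << 1;      /* second transition on the same m */
        }
        if ((current & (1u << 1)) && *s == 'a')
            next |= 1u << 2;
        if ((current & (1u << 2)) && *s == 'n')
            next |= 1u << 3;
        /* state 3 has no outgoing transitions */

        current = next;
    }
    return (current & (1u << 3)) != 0;
}
```

For instance, nfa_accepts("woman") returns 1 and nfa_accepts("command") returns 0. On the symbol m, state 0 has two transitions, which is exactly what makes this automaton nondeterministic.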
Moreover, by using the "subset construction", it is possible to convert any NFA to a DFA that accepts the same set of character strings, but this topic is beyond the scope of our discussion. As an illustration, I show only an NFA in the following figure. Note that we use the symbol Λ to indicate any legal symbol.

[Figure: NFA with states 0 through 3; start arrow into state 0; arcs labeled Λ, m, a, n.]

7.4 GLOSSARY

Pattern: Mẫu. See the definition in the text.
Pattern Matching: Đối sánh mẫu, so mẫu.
Recognition: Nhận dạng.
Identifier: Định danh. A name of a data object in a program.
Character: Ký tự. Any symbol that we may input from the keyboard, including letters, digits, special symbols such as +, ^, and some nonprintable symbols.
Letter: Chữ cái.
Digit: Ký số, chữ số.
Underscore: Dấu gạch thấp _.
Input: Nguyên liệu, dữ liệu nhập.
Output: Thành phẩm, dữ liệu xuất.
Code: Mã lệnh, mã chương trình. A full program or program segment in any form, such as a high-level language or machine language.
Compilation: Quá trình biên dịch. Sometimes also translation.
Compiler: Trình biên dịch.
Interpreter: Trình thông dịch.
Translator: Chương trình dịch (nói chung).
Lexeme: Từ tố.
Token: Thẻ từ.
Assignment operator: Toán tử gán.
Statement-terminator: Dấu kết thúc câu lệnh.
Instance: Thể hiện.
Automaton, automata (pl.): Automat, Ôtômat.
Deterministic finite automata: Automat hữu hạn đơn định (tất định).
Nondeterministic finite automata: Automat hữu hạn đa định (không đơn định, không tất định).
State: Trạng thái.
Transition: Chuyển vị.
Start state: Khởi trạng.
Accepting state, final state: Trạng thái kiểm nhận, kết trạng.
Finite control: Bộ điều khiển hữu hạn.
Input tape: Băng nguyên liệu.
Head: Đầu đọc.

Posted: 09/08/2014, 11:21
