Chapter 4
Automata
Discrete Mathematics II

(Materials in this chapter are drawn from:
- Peter Linz. An Introduction to Formal Languages and Automata (5th Ed.), Jones & Bartlett Learning, 2011.
- John E. Hopcroft, Rajeev Motwani and Jeffrey D. Ullman. Introduction to Automata Theory, Languages, and Computation (3rd Ed.), Prentice Hall, 2006.
- Antal Iványi. Algorithms of Informatics, Kempelen Farkas Hallgatói Információs Központ, 2011.)

Nguyen An Khuong, Huynh Tuong Nguyen, Bui Hoai Thang
Faculty of Computer Science and Engineering
University of Technology, VNU-HCM

Contents
1 Motivation
2 Alphabets, words and languages
3 Regular expression or rational expression
4 Nondeterministic finite automata
5 Deterministic finite automata
6 Recognized languages
7 Determinisation

Introduction

Standard states of a process in an operating system:
• circles with labels: states
• arrows (→): transitions

[Figure: state diagram of a process with the states Waiting, Blocked and Running; the transitions are labelled CPU and Resource.]

Why study automata theory?
Automata are a useful model for many important kinds of software and hardware:
• designing and checking the behaviour of digital circuits;
• the lexical analyser of a typical compiler, that is, the compiler component that breaks the input text into logical units;
• scanning large bodies of text, such as collections of Web pages, to find occurrences of words, phrases or other patterns;
• verifying practical systems of all types that have a finite number of distinct states, such as communications protocols or protocols for the secure exchange of information.

Alphabets, symbols

Definition
An alphabet Σ (bảng chữ cái) is a finite, non-empty set of symbols (or characters).

For example:
• Σ = {a, b}
• the binary alphabet: Σ = {0, 1}
• the set of all lower-case letters: Σ = {a, b, …, z}
• the set of all ASCII characters

Remark
In practice Σ is almost always the set of all available characters (lower-case letters, capital letters, digits, symbols and special characters such as space or newline), but nothing prevents us from choosing other sets.

Strings (words)

Definition
• A string/word u (chuỗi/từ) over Σ is a finite, possibly empty, sequence of symbols (or characters) of Σ.
• The empty string is denoted by ε.
• The length of a string u, denoted by |u|, is the number of characters in u.
• The set of all strings over Σ is denoted by Σ*.
• A language L over Σ is a subset of Σ*.
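As a small illustration of these definitions, the sketch below enumerates the strings of Σ* up to a given length and represents a language as a membership test. The names strings_up_to and in_language, and the choice of example language (as many 0s as 1s over Σ = {0, 1}), are assumptions made for this example only.

```python
from itertools import product


def strings_up_to(sigma, max_len):
    """Yield every string over `sigma` of length 0..max_len, shortest first."""
    for n in range(max_len + 1):
        for letters in product(sorted(sigma), repeat=n):
            yield "".join(letters)


SIGMA = {"0", "1"}
EPSILON = ""  # the empty string ε; len(EPSILON) == 0


def in_language(u):
    """Hypothetical example language: strings with as many 0s as 1s."""
    return u.count("0") == u.count("1")


words = list(strings_up_to(SIGMA, 2))
print(words)                                 # all strings of length 0, 1 and 2
print([u for u in words if in_language(u)])  # those that belong to the language
```

Since Σ* is infinite, only a finite prefix of it can be listed; a language is therefore more conveniently handled through a test such as in_language than through an explicit set.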

Remark
The goal is to analyse a string of Σ* in order to decide whether or not it belongs to L.

Example

Let Σ = {0, 1}.
• ε is the string of length 0.
• 0 and 1 are the strings of length 1.
• 00, 01, 10 and 11 are the strings of length 2.
• ∅ is a language over Σ; it is called the empty language.
• Σ* is a language over Σ; it is called the universal language.
• {ε} is a language over Σ.
• {0, 00, 001} is also a language over Σ.
• The set of strings that contain an odd number of occurrences of a fixed symbol (0 or 1) is a language over Σ.
• The set of strings that contain as many 0s as 1s is a language over Σ.

String concatenation

Intuitively, the concatenation of the two strings 01 and 10 is 0110. Concatenating the empty string ε and the string 110 gives the string 110.

Definition
String concatenation is a mapping from Σ* × Σ* to Σ*. The concatenation of two strings u and v in Σ* is the string u.v.

Languages

Specifying languages
A language can be specified in several ways:
a) by enumerating its words, for example:
• $L_1 = \{\varepsilon, 0, 1\}$,
• $L_2 = \{a, aa, aaa, ab, ba\}$,
• $L_3 = \{\varepsilon, ab, aabb, aaabbb, aaaabbbb, \dots\}$;
b) by a property that all words of the language have and no other word has, for example:
• $L_4 = \{a^n b^n \mid n = 0, 1, 2, \dots\}$,
• $L_5 = \{u u^{-1} \mid u \in \Sigma^*\}$, where $u^{-1}$ denotes the reversal of $u$,
• $L_6 = \{u \in \{a, b\}^* \mid n_a(u) = n_b(u)\}$, where $n_a(u)$ denotes the number of occurrences of the letter a in the word $u$;
c) by its grammar, for
example:
• Let G = (N, T, P, S), where N = {S}, T = {a, b} and P = {S → aSb, S → ab}. Then $L(G) = \{a^n b^n \mid n \ge 1\}$, since $S \Rightarrow aSb \Rightarrow a^2Sb^2 \Rightarrow \dots \Rightarrow a^{n-1}Sb^{n-1} \Rightarrow a^nb^n$, where the last step applies the rule S → ab.

Operations on languages

$L$, $L_1$, $L_2$ are languages over Σ.
• union: $L_1 \cup L_2 = \{u \in \Sigma^* \mid u \in L_1 \text{ or } u \in L_2\}$,
• intersection: $L_1 \cap L_2 = \{u \in \Sigma^* \mid u \in L_1 \text{ and } u \in L_2\}$,
• difference: $L_1 \setminus L_2 = \{u \in \Sigma^* \mid u \in L_1 \text{ and } u \notin L_2\}$,
• complement: $\overline{L} = \Sigma^* \setminus L$,
• multiplication: $L_1L_2 = \{uv \mid u \in L_1, v \in L_2\}$,
• power: $L^0 = \{\varepsilon\}$, $L^n = L^{n-1}L$ if $n \ge 1$,
• iteration or star operation:
$$L^* = \bigcup_{i=0}^{\infty} L^i = L^0 \cup L \cup L^2 \cup \dots \cup L^i \cup \dots$$
We will also use the notation $L^+$:
$$L^+ = \bigcup_{i=1}^{\infty} L^i = L \cup L^2 \cup \dots \cup L^i \cup \dots$$
The union, product and iteration are called regular operations.

Exercise

Consider the set of strings over {a, b} in which every aa is followed immediately by b. For example, aab, aaba and aabaabbaab are in the language, but aaab and aabaa are not. Construct an accepting NFA.

Exercise

Let Σ = {a, b, c}. Construct an accepting finite automaton for each of the languages represented by the following regular expressions:
• $E_1 = a^* + b$,
• $E_2 = b^*$,
• $E_3 = aab + cab^*ac$,
• $E_4 = b(ca + ac)(aa)^* + a^*(a + b)$,
• $E_5 = (aaaabaaa)^{2*}c$,
• $E_6 = b^+ac$ (where $b^+ = bb^*$),
• $E_7 = (b + c)ab + (ba(c + ab^2 + a^3 + a^4 + b)^*)^*$,
• $E_8 = [a(b + c)^*abc]^*$.

Deterministic finite automata

Definition
A deterministic finite automaton (DFA, Ôtômat hữu hạn đơn định) is given by a 5-tuple $(Q, \Sigma, q_0, \delta, F)$, where
• $Q$ is a finite set of states,
• $\Sigma$ is the input alphabet of the automaton,
• $q_0 \in Q$ is the initial state,
• $\delta : Q \times \Sigma \to Q$ is the transition function,
• $F \subseteq Q$ is the set of final (accepting) states.

Condition
The transition function δ is a mapping: for every state q and every symbol a, δ(q, a) is exactly one state.

Example

Let Σ = {a, b}. The following deterministic and complete automaton recognizes the set of strings that contain an odd number of a's.

[Figure: DFA with the states q0 (initial) and q1 (final); the a-transitions lead from q0 to q1 and from q1 to q0, and each state has a loop labelled b.]

• Q = {q0, q1},
• δ(q0, a) = q1, δ(q0, b) = q0, δ(q1, a) = q0, δ(q1, b) = q1,
• F = {q1}.

Configurations and executions

Let $A = (Q, \Sigma, q_0, \delta, F)$. A configuration (cấu hình) of the automaton A is a pair (q, u), where q ∈ Q and u ∈ Σ*. We define the derivation relation → between configurations by
$$(q, a.u) \to (q', u) \quad \text{iff} \quad \delta(q, a) = q'.$$
An execution (thực thi) of the automaton A is a sequence of configurations $(q_0, u_0), \dots, (q_n, u_n)$ such that $(q_i, u_i) \to (q_{i+1}, u_{i+1})$ for $i = 0, 1, \dots, n - 1$.
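The definitions of configuration and execution can be sketched in a few lines of Python. The sketch below uses the DFA of the example above (odd number of a's). The dictionary encoding of δ and the function names execution and accepts are assumptions made for illustration, and the acceptance test follows the usual convention that a word is accepted when its execution ends in a configuration (q, ε) with q ∈ F.

```python
def execution(delta, q0, word):
    """Return the configurations (q, u) visited while reading `word`."""
    q, u = q0, word
    configs = [(q, u)]
    while u:
        a, u = u[0], u[1:]   # consume the first symbol: (q, a.u) -> (q', u)
        q = delta[(q, a)]    # delta is total, so the lookup always succeeds
        configs.append((q, u))
    return configs


def accepts(delta, q0, final, word):
    """Accept iff the execution ends in (q, "") with q a final state."""
    last_state, rest = execution(delta, q0, word)[-1]
    return rest == "" and last_state in final


# DFA of the example: strings over {a, b} with an odd number of a's.
DELTA = {
    ("q0", "a"): "q1", ("q0", "b"): "q0",
    ("q1", "a"): "q0", ("q1", "b"): "q1",
}
FINAL = {"q1"}

for q, u in execution(DELTA, "q0", "aba"):
    print(q, repr(u))
print(accepts(DELTA, "q0", FINAL, "aba"))  # two a's: rejected
print(accepts(DELTA, "q0", FINAL, "ab"))   # one a: accepted
```

Storing δ as a dictionary keyed by (state, symbol) mirrors the requirement that δ is a mapping: a missing key would correspond to an incomplete automaton and raises a KeyError here.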

Exercise

Let Σ = {0, 1}.
• Give an automaton that accepts all words in which the number of 0s is a multiple of 3.
• Give an execution of this automaton on 1101010.
Let Σ = {a, b}.
• Give an automaton that accepts all strings containing characters a.
• Give an execution of this automaton on aabb, ababb and bbaa.

Recognized languages

Definition
A language L over an alphabet Σ, that is, a subset of Σ*, is recognized if there exists a finite automaton that accepts exactly the strings of L.

Proposition
If $L_1$ and $L_2$ are two recognized languages, then
• $L_1 \cup L_2$ and $L_1 \cap L_2$ are also recognized;
• $L_1L_2$ and $L_1^*$ are also recognized.

Example

Substring ab
Construct a DFA that recognizes the language over the alphabet {a, b} of the strings containing the substring ab.

Regular expression: (a + b)* ab (a + b)*

Transition table
          a     b
→ q0      q1    q0
  q1      q1    q2
* q2      q2    q2

[Figure: state diagram of this DFA with the states q0, q1 and q2.]

Example

Build a DFA that recognizes the language over the alphabet {a, b} of the strings with an even number of a's and an even number of b's.

Transition table
            a     b
→ * q0      q2    q1
    q1      q3    q0
    q2      q0    q3
    q3      q1    q2

→: start state
*: final state(s)

[Figure: state diagram of this DFA with the states q0, q1, q2 and q3.]

Equivalent automata

Are the following two DFAs equivalent?

[Figure: two DFAs over {a, b}, the first with the states q0, q1, q2 and the second with the states p0, p1, p2, p3.]

Equivalent automata

Are the following two DFAs equivalent?

[Figure: a second pair of DFAs over {a, b} on the same states q0, q1, q2 and p0, p1, p2, p3.]

From NFA to DFA

Given an NFA:

[Figure: NFA with the states 0, 1 and 2 and transitions labelled a, b and ε.]

Transition table
              a         b
→ {0}         {1}       {0}
  {1}         {0, 2}    {1}
* {0, 2}      {1}       {0, 2}

Corresponding DFA:

[Figure: DFA with the states {0}, {1} and {0, 2}.]

Exercise

Let Σ = {a, b, c}. Determine the DFAs that correspond to the following NFAs:

[Figure: NFAs over {a, b, c} with transitions labelled a, b, c and ε.]

Exercise

Let Σ = {a, b, c}. Determine finite automata, not necessarily deterministic, recognizing the following languages:
• $L_1 = \{a, ab, ca, cab, acc\}$,
• $L_2$ = {set of words with an even number
of a's},
• $L_3$ = {set of words containing ab and ending with b}.
Then determine the corresponding DFAs.

Exercise

Let Σ = {a, b, c}. Construct accepting DFAs for the languages represented by the following regular expressions:
• $E_1 = a^* + b$,
• $E_2 = b^*$,
• $E_3 = aab + cab^*ac$,
• $E_4 = b(ca + ac)(aa)^* + a^*(a + b)$,
• $E_5 = (aaaabaaa)^{2*}c$,
• $E_6 = b^+ac$ (where $b^+ = bb^*$),
• $E_7 = (b + c)ab + (ba(c + ab^2 + a^3 + a^4 + b)^*)^*$,
• $E_8 = [a(b + c)^*abc]^*$.

[...]

... followed by a 'c'

Exercise

Let Σ = {a, b, c} and L = {ab, aa, b, ca, bac}. Which of the following strings are in L*:
1) $aaa = a^3$,
2) $abaabaaabaa = aba^2ba^3ba^2$,
3) $bbb$,
4) $aab$,
5) $cc$,
6) $aaaabaaaa = a^4ba^4$,
7) ...

... $b^*a$,
3) $b(ca + ac)(aa)^* + a^*(a + b)$,
4) $(a^*b + b^*a)^*$.

Example
$a^*b = \{b, ab, a^2b, a^3b, \dots, aaa \dots ab\}$, ...

Exercise

Let Σ = {a, b, c} and L = {ab, aa, b, ca, ...
4) $b(ca + ac)(aa)^* + a^*(a + b)$,
5) $(aaaabaaa)^{2*}c$,
6) $b^+ac$ (where $b^+ = bb^*$),
7) $(b + c)ab + (ba(c + ab^2 + a^3 + a^4 + b)^*)^*$?
Define a (simple) regular expression representing the language L*.

Finite automata

Finite automata (Ôtômat ...

[Figure: automaton with the states q0 and q1 and a transition labelled a, b.]

Regular expression: b*(a + b)

Exercise

Let Σ = {a, b}. Which of the strings
1) $a^3b$,
2) $aba^2b$,
3) $a^4b^2ab^3a$,
4) $a^4ba^4$,
5) $ab^4a^9b$,
6) $ba^5ba^4b$,
7) $ba^5b^2$,
8) $bab^2a$
are accepted by the following finite automaton?

[Figure: finite automaton with the states q0, q1 and q2 and transitions labelled a and b.]

Exercise

Give a regular expression for the following finite automaton.

[Figure: finite automaton with the states q0, q1 and q2 and transitions labelled a and b.]

Nondeterministic finite automata

Definition
A nondeterministic finite automaton (NFA, Ôtômat hữu hạn phi đơn định) is mathematically represented by a 5-tuple $(Q, \Sigma, q_0, \delta, F)$, where
• $Q$ is a finite set of states,
• $\Sigma$ is the alphabet of the automaton,
...
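As a complement to the "From NFA to DFA" slide, the determinisation step can be sketched with the standard subset construction, in which each DFA state is a set of NFA states closed under ε-moves. Everything named below is an assumption made for illustration: the function names, the encoding of ε as the empty string, and in particular the NFA itself, which is a hypothetical automaton chosen so that its determinisation yields the same transition table as on that slide. It is not claimed to be the NFA drawn there.

```python
def epsilon_closure(states, delta):
    """All states reachable from `states` by ε-moves only (symbol "")."""
    stack, closure = list(states), set(states)
    while stack:
        q = stack.pop()
        for r in delta.get((q, ""), set()):
            if r not in closure:
                closure.add(r)
                stack.append(r)
    return frozenset(closure)


def determinise(alphabet, delta, start, final):
    """Subset construction; returns (dfa_delta, dfa_start, dfa_final)."""
    dfa_start = epsilon_closure({start}, delta)
    dfa_delta, todo, seen = {}, [dfa_start], {dfa_start}
    while todo:
        subset = todo.pop()
        for a in sorted(alphabet):
            moved = set()
            for q in subset:
                moved |= delta.get((q, a), set())
            target = epsilon_closure(moved, delta)
            dfa_delta[(subset, a)] = target
            if target not in seen:
                seen.add(target)
                todo.append(target)
    # A subset is final as soon as it contains one final NFA state.
    dfa_final = {s for s in seen if s & final}
    return dfa_delta, dfa_start, dfa_final


# Hypothetical NFA (assumption): states 0, 1, 2; state 2 is final.
NFA = {
    (0, "a"): {1}, (0, "b"): {0},
    (1, "a"): {2}, (1, "b"): {1},
    (2, "b"): {2}, (2, ""): {0},
}
table, start, final = determinise({"a", "b"}, NFA, 0, {2})
for (subset, a), target in sorted(table.items(),
                                  key=lambda kv: (sorted(kv[0][0]), kv[0][1])):
    print(sorted(subset), a, "->", sorted(target))
```

Only the subsets reachable from the start state are generated, which is why this NFA gives three DFA states, {0}, {1} and {0, 2}, rather than all eight subsets of {0, 1, 2}.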