An Introduction to
FORMAL LANGUAGES and AUTOMATA
Fifth Edition

PETER LINZ
University of California at Davis

JONES & BARTLETT LEARNING

World Headquarters
Jones & Bartlett Learning
40 Tall Pine Drive
Sudbury, MA 01776
978-443-5000
info@jblearning.com
www.jblearning.com

Jones & Bartlett Learning Canada
6339 Ormindale Way
Mississauga, Ontario L5V 1J2
Canada

Jones & Bartlett Learning International
Barb House, Barb Mews
London W6 7PA
United Kingdom

Jones & Bartlett Learning books and products are available through most bookstores and online booksellers. To contact Jones & Bartlett Learning directly, call 800-832-0034, fax 978-443-8000, or visit our website, www.jblearning.com.

Substantial discounts on bulk quantities of Jones & Bartlett Learning publications are available to corporations, professional associations, and other qualified organizations. For details and specific discount information, contact the special sales department at Jones & Bartlett Learning via the above contact information or send an email to specialsales@jblearning.com.

Copyright © 2012 by Jones & Bartlett Learning, LLC

All rights reserved. No part of the material protected by this copyright may be reproduced or utilized in any form, electronic or mechanical, including photocopying, recording, or by any information storage and retrieval system, without written permission from the copyright owner.

Production Credits
Publisher: Cathleen Sether
Senior Acquisitions Editor: Timothy Anderson
Senior Editorial Assistant: Stephanie Sguigna
Production Director: Amy Rose
Senior Marketing Manager: Andrea DeFronzo
V.P., Manufacturing and Inventory Control: Therese Connell
Composition: Northeast Compositors, Inc.
Cover and Title Page Design: Kristin E. Parker
Cover Image: © Alexis Puentes/ShutterStock, Inc.
Printing and Binding: Malloy, Inc.
Cover Printing: Malloy, Inc.

Library of Congress Cataloging-in-Publication Data
Linz, Peter.
An introduction to formal languages and automata / Peter Linz. -- 5th ed.
p. cm.
Includes
bibliographical references and index.
ISBN 978-1-4496-1552-9 (casebound)
Formal languages. Machine theory. I. Title.
QA267.3.L56 2011
005.13’1--dc22
2010040050

6048
Printed in the United States of America
15 14 13 12 11 10

To the Memory of my Parents

Contents

Preface
1 Introduction to the Theory of Computation
  1.1 Mathematical Preliminaries and Notation
    Sets
    Functions and Relations
    Graphs and Trees
    Proof Techniques
  1.2 Three Basic Concepts
    Languages
    Grammars
    Automata
  1.3 Some Applications*
2 Finite Automata
  2.1 Deterministic Finite Accepters
    Deterministic Accepters and Transition Graphs
    Languages and Dfa’s
    Regular Languages
  2.2 Nondeterministic Finite Accepters
    Definition of a Nondeterministic Accepter
    Why Nondeterminism?
  2.3 Equivalence of Deterministic and Nondeterministic Finite Accepters
  2.4 Reduction of the Number of States in Finite Automata*
3 Regular Languages and Regular Grammars
  3.1 Regular Expressions
    Formal Definition of a Regular Expression
    Languages Associated with Regular Expressions
  3.2 Connection Between Regular Expressions and Regular Languages
    Regular Expressions Denote Regular Languages
    Regular Expressions for Regular Languages
    Regular Expressions for Describing Simple Patterns
  3.3 Regular Grammars
    Right- and Left-Linear Grammars
    Right-Linear Grammars Generate Regular Languages
    Right-Linear Grammars for Regular Languages
    Equivalence of Regular Languages and Regular Grammars
4 Properties of Regular Languages
  4.1 Closure Properties of Regular Languages
    Closure under Simple Set Operations
    Closure under Other Operations
  4.2 Elementary Questions about Regular Languages
  4.3 Identifying Nonregular Languages
    Using the Pigeonhole Principle
    A Pumping Lemma
5 Context-Free Languages
  5.1 Context-Free Grammars
    Examples of Context-Free Languages
    Leftmost and Rightmost Derivations
    Derivation Trees
    Relation Between Sentential Forms and Derivation Trees
  5.2 Parsing and Ambiguity
    Parsing and Membership
    Ambiguity in Grammars and Languages
  5.3 Context-Free Grammars and Programming Languages
6 Simplification of Context-Free Grammars and Normal Forms
  6.1 Methods for Transforming Grammars
    A Useful Substitution Rule
    Removing Useless Productions
    Removing λ-Productions
    Removing Unit-Productions
  6.2 Two Important Normal Forms
    Chomsky Normal Form
    Greibach Normal Form
  6.3 A Membership Algorithm for Context-Free Grammars*
7 Pushdown Automata
  7.1 Nondeterministic Pushdown Automata
    Definition of a Pushdown Automaton
    The Language Accepted by a Pushdown Automaton
  7.2 Pushdown Automata and Context-Free Languages
    Pushdown Automata for Context-Free Languages
    Context-Free Grammars for Pushdown Automata
  7.3 Deterministic Pushdown Automata and Deterministic Context-Free Languages
  7.4 Grammars for Deterministic Context-Free Languages*
8 Properties of Context-Free Languages
  8.1 Two Pumping Lemmas
    A Pumping Lemma for Context-Free Languages
    A Pumping Lemma for Linear Languages
  8.2 Closure Properties and Decision Algorithms for Context-Free Languages
    Closure of Context-Free Languages
    Some Decidable Properties of Context-Free Languages
9 Turing Machines
  9.1 The Standard Turing Machine
    Definition of a Turing Machine
    Turing Machines as Language Accepters
    Turing Machines as Transducers
  9.2 Combining Turing Machines for Complicated Tasks
  9.3 Turing’s Thesis
10 Other Models of Turing Machines
  10.1 Minor Variations on the Turing Machine Theme
    Equivalence of Classes of Automata
    Turing Machines with a Stay-Option
    Turing Machines with Semi-Infinite Tape
    The Off-Line Turing Machine
  10.2 Turing Machines with More Complex Storage
    Multitape Turing Machines
    Multidimensional Turing Machines
  10.3 Nondeterministic Turing Machines
  10.4 A Universal Turing Machine
  10.5 Linear Bounded Automata
11 A Hierarchy of Formal Languages and Automata
  11.1 Recursive and Recursively Enumerable Languages
    Languages That Are Not Recursively Enumerable
    A Language That Is Not Recursively Enumerable
    A Language That Is Recursively Enumerable but Not Recursive
  11.2 Unrestricted Grammars
  11.3 Context-Sensitive Grammars and Languages
    Context-Sensitive Languages and Linear Bounded Automata
    Relation Between Recursive and Context-Sensitive Languages
  11.4 The Chomsky Hierarchy
12 Limits of Algorithmic Computation
  12.1 Some Problems That Cannot Be Solved by Turing Machines
    Computability and Decidability
    The Turing Machine Halting Problem
    Reducing One Undecidable Problem to Another
  12.2 Undecidable Problems for Recursively Enumerable Languages
  12.3 The Post Correspondence Problem
  12.4 Undecidable Problems for Context-Free Languages
  12.5 A Question of Efficiency
13 Other Models of Computation
  13.1 Recursive Functions
    Primitive Recursive Functions
    Ackermann’s Function
    μ Recursive Functions
  13.2 Post Systems
  13.3 Rewriting Systems
    Matrix Grammars

References for Further Reading

A. V. Aho and J. D. Ullman. 1972. The Theory of Parsing, Translation, and Compiling. Vol. 1. Englewood Cliffs, N.J.: Prentice Hall.
P. J. Denning, J. B. Dennis, and J. E. Qualitz. 1978. Machines, Languages, and Computation. Englewood Cliffs, N.J.: Prentice Hall.
M. R. Garey and D. Johnson. 1979. Computers and Intractability. Freeman.
M. A. Harrison. 1978. Introduction to Formal Language Theory. Reading, Mass.: Addison-Wesley.
J. E. Hopcroft and J. D. Ullman. 1979. Introduction to Automata Theory, Languages and Computation. Reading, Mass.: Addison-Wesley.
R. Hunter. 1981. The Design and Construction of Compilers. Chichester, New York: John Wiley.
R. Johnsonbaugh. 1996. Discrete Mathematics. Fourth Ed. New York: Macmillan.
Z. Kohavi and N. K. Jha. 2010. Switching and Finite Automata Theory. Third Edition. Cambridge University Press.
C. H. Papadimitriou. 1994. Computational Complexity. Reading, Mass.: Addison-Wesley.
G. E. Revesz. 1983. Introduction to Formal Languages. McGraw-Hill.
A. Salomaa. 1973. Formal Languages. New York: Academic Press.
A. Salomaa. 1985. “Computations and Automata,” in Encyclopedia of Mathematics and Its Applications. Cambridge: Cambridge University Press.

Index

The index that appeared in the print version of this title does not match the pages in
your eBook. Please use the search function on your eReading device to search for terms of interest. For your reference, the terms that appear in the print index are listed below.

A
accepter
Ackermann’s function
algorithm
alphabet
ambiguity
  of a grammar
  inherent
automata
  deterministic
  general characteristics
  nondeterministic
axioms

B
Backus-Naur form
base of a cycle
basis for induction
binary tree
blank
blank-tape halting problem
BNF

C
Cartesian product of sets
child-parent relation in a tree
Chomsky hierarchy
Chomsky normal form
Church’s thesis
Church-Turing thesis
clique problem
closure
  positive
  star
closure properties
  of context-free languages
  of regular languages
complement
  of a language
  of a set
complete systems
complexity
  space
  time
complexity class NP
complexity class P
composition
computability
computable function
computation
computational complexity
concatenation
  of languages
  of strings
configuration of an automaton
conjunctive normal form
consistent systems
context-free grammars
context-free languages
context-sensitive grammars
context-sensitive languages
control unit of an automaton
Cook-Karp thesis
Cook’s theorem
cycles in a graph
CYK algorithm

D
dead configuration
decidability
deciding a language
DeMorgan’s laws
dependency graphs
derivation
  leftmost
  rightmost
derivation trees
  partial
  yield
determinism
deterministic context-free languages
deterministic finite accepters
deterministic pushdown automata
dfa
diagonalization
disjoint sets
dpda

E
edge of a graph
efficiency of computation
empty set
end markers for an lba
enumeration procedure
equivalence
  classes of automata
  classes of dfa’s and nfa’s
  of grammars
  Mealy and Moore machines
  relation
extended output function
extended transition function
  for dfa’s
  for finite-state transducers
  for nfa’s

F
final state
finite automata
finite-state transducer
fst
formal languages
functions
  computable
  domain
  partial
  range
  total

G
Gödel, K.
grammars
  context-free
  context-sensitive
  left-linear
  linear
  regular
  right-linear
  simple
  unrestricted
graphs
  labeled
Greibach normal form
GTG

H
halting problem
halt state of a Turing machine
Hamiltonian path problem
hierarchy of formal languages
homomorphic image
homomorphism

I
incompleteness theorem
inherent ambiguity
initial state
input alphabet
input file
instantaneous description
  of a pushdown automaton
  of a Turing machine
internal states of an automaton
intractable problems

J
JFLAP

L
lambda-productions
languages
  accepted by a dfa
  accepted by a dpda
  accepted by an lba
  accepted by an nfa
  accepted by an npda
  accepted by a Turing machine
  associated with a regular expression
  generated by a grammar
  generated by a Markov algorithm
  generated by Post system
lba
left-linear grammars
leftmost derivation
limitations of finite-state transducers
linear bounded automata
LL grammars
loop
L-systems

M
µ-recursive functions
Markov algorithm
matrix grammar
Mealy machines
membership algorithm
  for context-free languages
  for context-sensitive languages
  for regular languages
minimal dfa
minimalization operator
monus
Moore machines
MPC solution
M-translation

N
nfa
noncontracting grammars
nondeterministic finite accepter
nondeterminism
nonterminal constant
normal form of a grammar
NP problems
NP-complete problems
npda
null set

O
Ogden’s lemma
order
  proper
  relation in a tree
order of magnitude notation

P
parsing
  brute force
  exhaustive search
  of context-free grammars
  top-down
partition
path in a graph
  labeled
  simple
pattern matching
PC-solution
pda
phrase-structure grammar
pigeonhole principle
polynomial-time reduction
Post correspondence problem
  modified
Post systems
powerset
prefix
primitive recursion
primitive recursive functions
primitive regular expressions
productions of a grammar
program of a Turing machine
projector function
proof techniques
  contradiction
  induction
proper order
proper subset
pumping lemma
  for context-free languages
  for linear languages
  for regular languages
pushdown automata
  deterministic
  nondeterministic

R
read-write head
recursive functions
recursive languages
recursively enumerable languages
reduction
  of number of states in a dfa
  of undecidable problems
  polynomial-time
regular expressions
regular grammars
regular languages
relations
reverse
  of a language
  of string
rewriting systems
Rice’s theorem
right-linear grammar
rightmost derivation
right quotient of a language
root of a tree

S
satisfiability problem
3SAT
semantics of a programming language
sentence
sentential form
sets
  countable
  size
  uncountable
set operations
s-grammar
simulation
stack alphabet
start symbol
standard representation of a regular language
state-entry problem
storage of an automaton
string
  empty
  length
  operations
  prefix
  suffix
subset
  proper
substring
successor function
suffix
syntax of a programming language

T
tape alphabet
tape of a Turing machine
terminal constants
terminal symbol
theory of computation
time-complexity
tracks on a tape
tractable problems
transducer
transition function
  extended
transition graph
  generalized
  of a finite accepter
  of a pushdown automaton
  of a Turing machine
trap state
trees
Turing-computable
Turing machine
  multidimensional
  with multiple tracks
  multitape
  nondeterministic
  off-line
  with semi-infinite tape
  standard
  with stay-option
  universal
Turing’s thesis

U
unit-productions
universal set
universal Turing machine
unrestricted grammars
useless productions

V
variables of a grammar
  nullable
  start
  useless
vertex
  final
  initial
  of a graph

W
walk in a graph

Y
yield of a derivation tree

Z
zero function