
Data Structures and Algorithms in Python (Michael T. Goodrich)


Data Structures and Algorithms in Python

Michael T. Goodrich
Department of Computer Science
University of California, Irvine

Roberto Tamassia
Department of Computer Science
Brown University

Michael H. Goldwasser
Department of Mathematics and Computer Science
Saint Louis University

VP & PUBLISHER: Don Fowley
EXECUTIVE EDITOR: Beth Lang Golub
EDITORIAL PROGRAM ASSISTANT: Katherine Willis
MARKETING MANAGER: Christopher Ruel
DESIGNER: Kenji Ngieng
SENIOR PRODUCTION MANAGER: Janis Soo
ASSOCIATE PRODUCTION MANAGER: Joyce Poh

This book was set in LaTeX by the authors. Printed and bound by Courier Westford. The cover was printed by Courier Westford. This book is printed on acid free paper.

Founded in 1807, John Wiley & Sons, Inc. has been a valued source of knowledge and understanding for more than 200 years, helping people around the world meet their needs and fulfill their aspirations. Our company is built on a foundation of principles that include responsibility to the communities we serve and where we live and work. In 2008, we launched a Corporate Citizenship Initiative, a global effort to address the environmental, social, economic, and ethical challenges we face in our business. Among the issues we are addressing are carbon impact, paper specifications and procurement, ethical conduct within our business and among our vendors, and community and charitable support. For more information, please visit our website: www.wiley.com/go/citizenship

Copyright © 2013 John Wiley & Sons, Inc. All rights reserved. No part of this publication may be reproduced, stored in a retrieval system or transmitted in any form or by any means, electronic, mechanical, photocopying, recording, scanning or otherwise, except as permitted under Sections 107 or 108 of the 1976 United States Copyright Act, without either the prior written permission of the Publisher, or authorization through payment of the appropriate per-copy fee to the Copyright Clearance Center, Inc., 222 Rosewood Drive, Danvers, MA 01923, website
www.copyright.com. Requests to the Publisher for permission should be addressed to the Permissions Department, John Wiley & Sons, Inc., 111 River Street, Hoboken, NJ 07030-5774, (201)748-6011, fax (201)748-6008, website http://www.wiley.com/go/permissions

Evaluation copies are provided to qualified academics and professionals for review purposes only, for use in their courses during the next academic year. These copies are licensed and may not be sold or transferred to a third party. Upon completion of the review period, please return the evaluation copy to Wiley. Return instructions and a free of charge return mailing label are available at www.wiley.com/go/returnlabel. If you have chosen to adopt this textbook for use in your course, please accept this book as your complimentary desk copy. Outside of the United States, please contact your local sales representative.

Printed in the United States of America

To Karen, Paul, Anna, and Jack – Michael T. Goodrich
To Isabel – Roberto Tamassia
To Susan, Calista, and Maya – Michael H. Goldwasser

Preface

The design and analysis of efficient data structures has long been recognized as a vital subject in computing and is part of the core curriculum of computer science and computer engineering undergraduate degrees. Data Structures and Algorithms in Python provides an introduction to data structures and algorithms, including their design, analysis, and implementation. This book is designed for use in a beginning-level data structures course, or in an intermediate-level introduction to algorithms course. We discuss its use for such courses in more detail later in this preface.

To promote the development of robust and reusable software, we have tried to take a consistent object-oriented viewpoint throughout this text. One of the main ideas of the object-oriented approach is that data should be presented as being encapsulated with the methods that access and modify them. That is, rather than simply viewing data as a collection of bytes and
addresses, we think of data objects as instances of an abstract data type (ADT), which includes a repertoire of methods for performing operations on data objects of this type. We then emphasize that there may be several different implementation strategies for a particular ADT, and explore the relative pros and cons of these choices. We provide complete Python implementations for almost all data structures and algorithms discussed, and we introduce important object-oriented design patterns as means to organize those implementations into reusable components.

Desired outcomes for readers of our book include that:
• They have knowledge of the most common abstractions for data collections (e.g., stacks, queues, lists, trees, maps).
• They understand algorithmic strategies for producing efficient realizations of common data structures.
• They can analyze algorithmic performance, both theoretically and experimentally, and recognize common trade-offs between competing strategies.
• They can wisely use existing data structures and algorithms found in modern programming language libraries.
• They have experience working with concrete implementations for most foundational data structures and algorithms.
• They can apply data structures and algorithms to solve complex problems.

In support of the last goal, we present many example applications of data structures throughout the book, including the processing of file systems, matching of tags in structured formats such as HTML, simple cryptography, text frequency analysis, automated geometric layout, Huffman coding, DNA sequence alignment, and search engine indexing.

Book Features

This book is based upon the book Data Structures and Algorithms in Java by Goodrich and Tamassia, and the related Data Structures and Algorithms in C++ by Goodrich, Tamassia, and Mount. However, this book is not simply a translation of those other books to Python. In adapting the material for this book, we have significantly redesigned the organization
and content of the book as follows:
• The code base has been entirely redesigned to take advantage of the features of Python, such as use of generators for iterating elements of a collection.
• Many algorithms that were presented as pseudo-code in the Java and C++ versions are directly presented as complete Python code.
• In general, ADTs are defined to have consistent interface with Python’s built-in data types and those in Python’s collections module.
• Chapter 5 provides an in-depth exploration of the dynamic array-based underpinnings of Python’s built-in list, tuple, and str classes. New Appendix A serves as an additional reference regarding the functionality of the str class.
• Over 450 illustrations have been created or revised.
• New and revised exercises bring the overall total number to 750.

Online Resources

This book is accompanied by an extensive set of online resources, which can be found at the following Web site: www.wiley.com/college/goodrich

Students are encouraged to use this site along with the book, to help with exercises and increase understanding of the subject. Instructors are likewise welcome to use the site to help plan, organize, and present their course materials. Included on this Web site is a collection of educational aids that augment the topics of this book, for both students and instructors. Because of their added value, some of these online resources are password protected.

For all readers, and especially for students, we include the following resources:
• All the Python source code presented in this book.
• PDF handouts of Powerpoint slides (four-per-page) provided to instructors.
• A database of hints to all exercises, indexed by problem number.

For instructors using this book, we include the following additional teaching aids:
• Solutions to hundreds of the book’s exercises.
• Color versions of all figures and illustrations from the book.
• Slides in Powerpoint and PDF (one-per-page) format.

The slides are fully editable, so as to allow an instructor
using this book full freedom in customizing his or her presentations. All the online resources are provided at no extra charge to any instructor adopting this book for his or her course.

Contents and Organization

The chapters for this book are organized to provide a pedagogical path that starts with the basics of Python programming and object-oriented design. We then add foundational techniques like algorithm analysis and recursion. In the main portion of the book, we present fundamental data structures and algorithms, concluding with a discussion of memory management (that is, the architectural underpinnings of data structures). Specifically, the chapters for this book are organized as follows:

1. Python Primer
2. Object-Oriented Programming
3. Algorithm Analysis
4. Recursion
5. Array-Based Sequences
6. Stacks, Queues, and Deques
7. Linked Lists
8. Trees
9. Priority Queues
10. Maps, Hash Tables, and Skip Lists
11. Search Trees
12. Sorting and Selection
13. Text Processing
14. Graph Algorithms
15. Memory Management and B-Trees
A. Character Strings in Python
B. Useful Mathematical Facts

A more detailed table of contents follows this preface.

Prerequisites

We assume that the reader is at least vaguely familiar with a high-level programming language, such as C, C++, Python, or Java, and that he or she understands the main constructs from such a high-level language, including:
• Variables and expressions.
• Decision structures (such as if-statements and switch-statements).
• Iteration structures (for loops and while loops).
• Functions (whether stand-alone or object-oriented methods).

For readers who are familiar with these concepts, but not with how they are expressed in Python, we provide a primer on the Python language in Chapter 1. Still, this book is primarily a data structures book, not a Python book; hence, it does not give a comprehensive treatment of Python.

We delay treatment of object-oriented programming in Python until Chapter 2. This chapter is useful for
those new to Python, and for those who may be familiar with Python, yet not with object-oriented programming.

In terms of mathematical background, we assume the reader is somewhat familiar with topics from high-school mathematics. Even so, in Chapter 3, we discuss the seven most-important functions for algorithm analysis. In fact, sections that use something other than one of these seven functions are considered optional, and are indicated with a star (★). We give a summary of other useful mathematical facts, including elementary probability, in Appendix B.

Relation to Computer Science Curriculum

To assist instructors in designing a course in the context of the IEEE/ACM 2013 Computing Curriculum, the following table describes curricular knowledge units that are covered within this book.

Knowledge Unit: Relevant Material
AL/Basic Analysis: Chapter 3 and Sections 4.2 & 12.2.4
AL/Algorithmic Strategies: Sections 12.2.1, 13.2.1, 13.3, & 13.4.2
AL/Fundamental Data Structures and Algorithms: Sections 4.1.3, 5.5.2, 9.4.1, 9.3, 10.2, 11.1, 13.2, Chapter 12 & much of Chapter 14
AL/Advanced Data Structures: Sections 5.3, 10.4, 11.2 through 11.6, 12.3.1, 13.5, 14.5.1, & 15.3
AR/Memory System Organization and Architecture: Chapter 15
DS/Sets, Relations and Functions: Sections 10.5.1, 10.5.2, & 9.4
DS/Proof Techniques: Sections 3.4, 4.2, 5.3.2, 9.3.6, & 12.4.1
DS/Basics of Counting: Sections 2.4.2, 6.2.2, 12.2.4, 8.2.2 & Appendix B
DS/Graphs and Trees: Much of Chapters 8 and 14
DS/Discrete Probability: Sections 1.11.1, 10.2, 10.4.2, & 12.3.1
PL/Object-Oriented Programming: Much of the book, yet especially Chapter 2 and Sections 7.4, 9.5.1, 10.1.3, & 11.2.1
PL/Functional Programming: Section 1.10
SDF/Algorithms and Design: Sections 2.1, 3.3, & 12.2.1
SDF/Fundamental Programming Concepts: Chapters 1 & 4
SDF/Fundamental Data Structures: Chapters 6 & 7, Appendix A, and Sections 1.2.1, 5.2, 5.4, 9.1, & 10.1
SDF/Developmental Methods: Sections 1.7 & 2.2
SE/Software Design: Sections 2.1 & 2.1.3

Mapping IEEE/ACM 2013
Computing Curriculum knowledge units to coverage in this book.

About the Authors

Michael Goodrich received his Ph.D. in Computer Science from Purdue University in 1987. He is currently a Chancellor’s Professor in the Department of Computer Science at University of California, Irvine. Previously, he was a professor at Johns Hopkins University. He is a Fulbright Scholar and a Fellow of the American Association for the Advancement of Science (AAAS), Association for Computing Machinery (ACM), and Institute of Electrical and Electronics Engineers (IEEE). He is a recipient of the IEEE Computer Society Technical Achievement Award, the ACM Recognition of Service Award, and the Pond Award for Excellence in Undergraduate Teaching.

Roberto Tamassia received his Ph.D. in Electrical and Computer Engineering from the University of Illinois at Urbana-Champaign in 1988. He is the Plastech Professor of Computer Science and the Chair of the Department of Computer Science at Brown University. He is also the Director of Brown’s Center for Geometric Computing. His research interests include information security, cryptography, analysis, design, and implementation of algorithms, graph drawing and computational geometry. He is a Fellow of the American Association for the Advancement of Science (AAAS), Association for Computing Machinery (ACM) and Institute of Electrical and Electronics Engineers (IEEE). He is also a recipient of the Technical Achievement Award from the IEEE Computer Society.

Michael Goldwasser received his Ph.D. in Computer Science from Stanford University in 1997. He is currently a Professor in the Department of Mathematics and Computer Science at Saint Louis University and the Director of their Computer Science program. Previously, he was a faculty member in the Department of Computer Science at Loyola University Chicago. His research interests focus on the design and implementation of algorithms, having published work involving approximation algorithms, online computation,
computational biology, and computational geometry. He is also active in the computer science education community.

Additional Books by These Authors
• M.T. Goodrich and R. Tamassia, Data Structures and Algorithms in Java, Wiley.
• M.T. Goodrich, R. Tamassia, and D.M. Mount, Data Structures and Algorithms in C++, Wiley.
• M.T. Goodrich and R. Tamassia, Algorithm Design: Foundations, Analysis, and Internet Examples, Wiley.
• M.T. Goodrich and R. Tamassia, Introduction to Computer Security, Addison-Wesley.
• M.H. Goldwasser and D. Letscher, Object-Oriented Programming in Python, Prentice Hall.

Acknowledgments

We have depended greatly upon the contributions of many individuals as part of the development of this book. We begin by acknowledging the wonderful team at Wiley. We are grateful to our editor, Beth Golub, for her enthusiastic support of this project, from beginning to end. The efforts of Elizabeth Mills and Katherine Willis were critical in keeping the project moving, from its early stages as an initial proposal, through the extensive peer review process. We greatly appreciate the attention to detail demonstrated by Julie Kennedy, the copyeditor for this book. Finally, many thanks are due to Joyce Poh for managing the final months of the production process.

We are truly indebted to the outside reviewers and readers for their copious comments, emails, and constructive criticism, which were extremely useful in writing this edition. We therefore thank the following reviewers for their comments and suggestions: Claude Anderson (Rose-Hulman Institute of Technology), Alistair Campbell (Hamilton College), Barry Cohen (New Jersey Institute of Technology), Robert Franks (Central College), Andrew Harrington (Loyola University Chicago), Dave Musicant (Carleton College), and Victor Norman (Calvin College). We wish to particularly acknowledge Claude for going above and beyond the call of duty, providing us with an enumeration of 400 detailed corrections or suggestions.

We thank David Mount,
of University of Maryland, for graciously sharing the wisdom gained from his experience with the C++ version of this text. We are grateful to Erin Chambers and David Letscher, of Saint Louis University, for their intangible contributions during many hallway conversations about the teaching of data structures, and to David for comments on early versions of the Python code base for this book. We thank David Zampino, a student at Loyola University Chicago, for his feedback while using a draft of this book during an independent study course, and Andrew Harrington for supervising David’s studies. We also wish to reiterate our thanks to the many research collaborators and teaching assistants whose feedback shaped the previous Java and C++ versions of this material. The benefits of those contributions carry forward to this book.

Finally, we would like to warmly thank Susan Goldwasser, Isabel Cruz, Karen Goodrich, Giuseppe Di Battista, Franco Preparata, Ioannis Tollis, and our parents for providing advice, encouragement, and support at various stages of the preparation of this book, and Calista and Maya Goldwasser for offering their advice regarding the artistic merits of many illustrations. More importantly, we thank all of these people for reminding us that there are things in life beyond writing books.

Michael T. Goodrich
Roberto Tamassia
Michael H. Goldwasser

Contents

Preface

1 Python Primer
  1.1 Python Overview
    1.1.1 The Python Interpreter
    1.1.2 Preview of a Python Program
  1.2 Objects in Python
    1.2.1 Identifiers, Objects, and the Assignment Statement
    1.2.2 Creating and Using Objects
    1.2.3 Python’s Built-In Classes
  1.3 Expressions, Operators, and Precedence
    1.3.1 Compound Expressions and Operator Precedence
  1.4 Control Flow
    1.4.1 Conditionals
    1.4.2 Loops
  1.5 Functions
    1.5.1 Information Passing
    1.5.2 Python’s Built-In Functions
  1.6 Simple Input and Output
    1.6.1 Console Input and Output
    1.6.2 Files
  1.7 Exception Handling
    1.7.1 Raising an Exception
    1.7.2 Catching an Exception
  1.8 Iterators and Generators
  1.9 Additional Python Conveniences
    1.9.1 Conditional Expressions
    1.9.2 Comprehension Syntax
    1.9.3 Packing and Unpacking of Sequences
  1.10 Scopes and Namespaces
  1.11 Modules and the Import Statement
    1.11.1 Existing Modules
  1.12 Exercises

2 Object-Oriented Programming
  2.1 Goals, Principles, and Patterns
    2.1.1 Object-Oriented Design Goals
    2.1.2 Object-Oriented Design Principles
    2.1.3 Design Patterns
  2.2 Software Development
    2.2.1 Design
    2.2.2 Pseudo-Code
    2.2.3 Coding Style and Documentation
    2.2.4 Testing and Debugging
  2.3 Class Definitions
    2.3.1 Example: CreditCard Class
    2.3.2 Operator Overloading and Python’s Special Methods
    2.3.3 Example: Multidimensional Vector Class
    2.3.4 Iterators
    2.3.5 Example: Range Class
  2.4 Inheritance
    2.4.1 Extending the CreditCard Class
    2.4.2 Hierarchy of Numeric Progressions
    2.4.3 Abstract Base Classes
  2.5 Namespaces and Object-Orientation
    2.5.1 Instance and Class Namespaces
    2.5.2 Name Resolution and Dynamic Dispatch
  2.6 Shallow and Deep Copying
  2.7 Exercises

3 Algorithm Analysis
  3.1 Experimental Studies
    3.1.1 Moving Beyond Experimental Analysis
  3.2 The Seven Functions Used in This Book
    3.2.1 Comparing Growth Rates
  3.3 Asymptotic Analysis
    3.3.1 The “Big-Oh” Notation
    3.3.2 Comparative Analysis
    3.3.3 Examples of Algorithm Analysis
  3.4 Simple Justification Techniques
    3.4.1 By Example
    3.4.2 The “Contra” Attack
    3.4.3 Induction and Loop Invariants
  3.5 Exercises

4 Recursion
  4.1 Illustrative Examples
    4.1.1 The Factorial Function
    4.1.2 Drawing an English Ruler
    4.1.3 Binary Search
    4.1.4 File Systems
  4.2 Analyzing Recursive Algorithms
  4.3 Recursion Run Amok
    4.3.1 Maximum Recursive Depth in Python
  4.4 Further Examples of Recursion
    4.4.1 Linear Recursion
    4.4.2 Binary Recursion
    4.4.3 Multiple Recursion
  4.5 Designing Recursive Algorithms
  4.6 Eliminating Tail Recursion
  4.7 Exercises

5 Array-Based Sequences
  5.1 Python’s Sequence Types
  5.2 Low-Level Arrays
    5.2.1 Referential Arrays
    5.2.2 Compact Arrays in Python
  5.3 Dynamic Arrays and Amortization
    5.3.1 Implementing a Dynamic Array
    5.3.2 Amortized Analysis of Dynamic Arrays
    5.3.3 Python’s List Class
  5.4 Efficiency of Python’s Sequence Types
    5.4.1 Python’s List and Tuple Classes
    5.4.2 Python’s String Class
  5.5 Using Array-Based Sequences
    5.5.1 Storing High Scores for a Game
    5.5.2 Sorting a Sequence
    5.5.3 Simple Cryptography
  5.6 Multidimensional Data Sets
  5.7 Exercises

6 Stacks, Queues, and Deques
  6.1 Stacks
    6.1.1 The Stack Abstract Data Type
    6.1.2 Simple Array-Based Stack Implementation
    6.1.3 Reversing Data Using a Stack
    6.1.4 Matching Parentheses and HTML Tags
  6.2 Queues
    6.2.1 The Queue Abstract Data Type
    6.2.2 Array-Based Queue Implementation
  6.3 Double-Ended Queues
    6.3.1 The Deque Abstract Data Type
    6.3.2 Implementing a Deque with a Circular Array
    6.3.3 Deques in the Python Collections Module
  6.4 Exercises

7 Linked Lists
  7.1 Singly Linked Lists
    7.1.1 Implementing a Stack with a Singly Linked List
    7.1.2 Implementing a Queue with a Singly Linked List
  7.2 Circularly Linked Lists
    7.2.1 Round-Robin Schedulers
    7.2.2 Implementing a Queue with a Circularly Linked List
  7.3 Doubly Linked Lists
    7.3.1 Basic Implementation of a Doubly Linked List
    7.3.2 Implementing a Deque with a Doubly Linked List
  7.4 The Positional List ADT
    7.4.1 The Positional List Abstract Data Type
    7.4.2 Doubly Linked List Implementation
  7.5 Sorting a Positional List
  7.6 Case Study: Maintaining Access Frequencies
    7.6.1 Using a Sorted List
    7.6.2 Using a List with the Move-to-Front Heuristic
  7.7 Link-Based vs. Array-Based Sequences
  7.8 Exercises

8 Trees
  8.1 General Trees
    8.1.1 Tree Definitions and Properties
    8.1.2 The Tree Abstract Data Type
    8.1.3 Computing Depth and Height
  8.2 Binary Trees
    8.2.1 The Binary Tree Abstract Data Type
    8.2.2 Properties of Binary Trees
  8.3 Implementing Trees
    8.3.1 Linked Structure for Binary Trees
    8.3.2 Array-Based Representation of a Binary Tree
    8.3.3 Linked Structure for General Trees
  8.4 Tree Traversal Algorithms
    8.4.1 Preorder and Postorder Traversals of General Trees
    8.4.2 Breadth-First Tree Traversal
    8.4.3 Inorder Traversal of a Binary Tree
    8.4.4 Implementing Tree Traversals in Python
    8.4.5 Applications of Tree Traversals
    8.4.6 Euler Tours and the Template Method Pattern ★
  8.5 Case Study: An Expression Tree
  8.6 Exercises

9 Priority Queues
  9.1 The Priority Queue Abstract Data Type
    9.1.1 Priorities
    9.1.2 The Priority Queue ADT
  9.2 Implementing a Priority Queue
    9.2.1 The Composition Design Pattern
    9.2.2 Implementation with an Unsorted List
    9.2.3 Implementation with a Sorted List
  9.3 Heaps
    9.3.1 The Heap Data Structure
    9.3.2 Implementing a Priority Queue with a Heap
    9.3.3 Array-Based Representation of a Complete Binary Tree
    9.3.4 Python Heap Implementation
    9.3.5 Analysis of a Heap-Based Priority Queue
    9.3.6 Bottom-Up Heap Construction ★
    9.3.7 Python’s heapq Module
  9.4 Sorting with a Priority Queue
    9.4.1 Selection-Sort and Insertion-Sort
    9.4.2 Heap-Sort
  9.5 Adaptable Priority Queues
    9.5.1 Locators
    9.5.2 Implementing an Adaptable Priority Queue
  9.6 Exercises

10 Maps, Hash Tables, and Skip Lists
  10.1 Maps and Dictionaries
    10.1.1 The Map ADT
    10.1.2 Application: Counting Word Frequencies
    10.1.3 Python’s MutableMapping Abstract Base Class
    10.1.4 Our MapBase Class
    10.1.5 Simple Unsorted Map Implementation
  10.2 Hash Tables
    10.2.1 Hash Functions
    10.2.2 Collision-Handling Schemes
    10.2.3 Load Factors, Rehashing, and Efficiency
    10.2.4 Python Hash Table Implementation
  10.3 Sorted Maps
    10.3.1 Sorted Search Tables
    10.3.2 Two Applications of Sorted Maps
  10.4 Skip Lists
    10.4.1 Search and Update Operations in a Skip List
    10.4.2 Probabilistic Analysis of Skip Lists ★
  10.5 Sets, Multisets, and Multimaps
    10.5.1 The Set ADT
    10.5.2 Python’s MutableSet Abstract Base Class
    10.5.3 Implementing Sets, Multisets, and Multimaps
  10.6 Exercises

11 Search Trees
  11.1 Binary Search Trees
    11.1.1 Navigating a Binary Search Tree
    11.1.2 Searches
    11.1.3 Insertions and Deletions
    11.1.4 Python Implementation
    11.1.5 Performance of a Binary Search Tree
  11.2 Balanced Search Trees
    11.2.1 Python Framework for Balancing Search Trees
  11.3 AVL Trees
    11.3.1 Update Operations
    11.3.2 Python Implementation
  11.4 Splay Trees
    11.4.1 Splaying
    11.4.2 When to Splay
    11.4.3 Python Implementation
    11.4.4 Amortized Analysis of Splaying ★
  11.5 (2,4) Trees
    11.5.1 Multiway Search Trees
    11.5.2 (2,4)-Tree Operations
  11.6 Red-Black Trees
    11.6.1 Red-Black Tree Operations
    11.6.2 Python Implementation
  11.7 Exercises

12 Sorting and Selection
  12.1 Why Study Sorting Algorithms?
  12.2 Merge-Sort
    12.2.1 Divide-and-Conquer
    12.2.2 Array-Based Implementation of Merge-Sort
    12.2.3 The Running Time of Merge-Sort
    12.2.4 Merge-Sort and Recurrence Equations ★
    12.2.5 Alternative Implementations of Merge-Sort
  12.3 Quick-Sort
    12.3.1 Randomized Quick-Sort
    12.3.2 Additional Optimizations for Quick-Sort
  12.4 Studying Sorting through an Algorithmic Lens
    12.4.1 Lower Bound for Sorting
    12.4.2 Linear-Time Sorting: Bucket-Sort and Radix-Sort
  12.5 Comparing Sorting Algorithms
  12.6 Python’s Built-In Sorting Functions
    12.6.1 Sorting According to a Key Function
  12.7 Selection
    12.7.1 Prune-and-Search
    12.7.2 Randomized Quick-Select
    12.7.3 Analyzing Randomized Quick-Select
  12.8 Exercises

13 Text Processing
  13.1 Abundance of Digitized Text
    13.1.1 Notations for Strings and the Python str Class
  13.2 Pattern-Matching Algorithms
    13.2.1 Brute Force
    13.2.2 The Boyer-Moore Algorithm
    13.2.3 The Knuth-Morris-Pratt Algorithm
  13.3 Dynamic Programming
    13.3.1 Matrix Chain-Product
    13.3.2 DNA and Text Sequence Alignment
  13.4 Text Compression and the Greedy Method
    13.4.1 The Huffman Coding Algorithm
    13.4.2 The Greedy Method
  13.5 Tries
    13.5.1 Standard Tries
    13.5.2 Compressed Tries
    13.5.3 Suffix Tries
    13.5.4 Search Engine Indexing
  13.6 Exercises

14 Graph Algorithms
  14.1 Graphs
    14.1.1 The Graph ADT
  14.2 Data Structures for Graphs
    14.2.1 Edge List Structure
    14.2.2 Adjacency List Structure
    14.2.3 Adjacency Map Structure
    14.2.4 Adjacency Matrix Structure
    14.2.5 Python Implementation
  14.3 Graph Traversals
    14.3.1 Depth-First Search
    14.3.2 DFS Implementation and Extensions
    14.3.3 Breadth-First Search
  14.4 Transitive Closure
  14.5 Directed Acyclic Graphs
    14.5.1 Topological Ordering
  14.6 Shortest Paths
    14.6.1 Weighted Graphs
    14.6.2 Dijkstra’s Algorithm
  14.7 Minimum Spanning Trees
    14.7.1 Prim-Jarník Algorithm
    14.7.2 Kruskal’s Algorithm
    14.7.3 Disjoint Partitions and Union-Find Structures
  14.8 Exercises

15 Memory Management and B-Trees
  15.1 Memory Management
    15.1.1 Memory Allocation
    15.1.2 Garbage Collection
    15.1.3 Additional Memory Used by the Python Interpreter
  15.2 Memory Hierarchies and Caching
    15.2.1 Memory Systems
    15.2.2 Caching Strategies
  15.3 External Searching and B-Trees
    15.3.1 (a,b) Trees
    15.3.2 B-Trees
  15.4 External-Memory Sorting
    15.4.1 Multiway Merging
  15.5 Exercises

A Character Strings in Python
B Useful Mathematical Facts
Bibliography
Index

Chapter 1 Python Primer

1.1 Python Overview

Building data structures and algorithms requires that we communicate detailed instructions to a computer. An excellent way to perform such communications is using a high-level computer language, such as Python. The Python programming language was originally
developed by Guido van Rossum in the early 1990s, and has since become a prominently used language in industry and education. The second major version of the language, Python 2, was released in 2000, and the third major version, Python 3, released in 2008. We note that there are significant incompatibilities between Python 2 and Python 3. This book is based on Python 3 (more specifically, Python 3.1 or later). The latest version of the language is freely available at www.python.org, along with documentation and tutorials.

In this chapter, we provide an overview of the Python programming language, and we continue this discussion in the next chapter, focusing on object-oriented principles. We assume that readers of this book have prior programming experience, although not necessarily using Python. This book does not provide a complete description of the Python language (there are numerous language references for that purpose), but it does introduce all aspects of the language that are used in code fragments later in this book.

1.1.1 The Python Interpreter

Python is formally an interpreted language. Commands are executed through a piece of software known as the Python interpreter. The interpreter receives a command, evaluates that command, and reports the result of the command. While the interpreter can be used interactively (especially when debugging), a programmer typically defines a series of commands in advance and saves those commands in a plain text file known as source code or a script. For Python, source code is conventionally stored in a file named with the .py suffix (e.g., demo.py).

On most operating systems, the Python interpreter can be started by typing python from the command line. By default, the interpreter starts in interactive mode with a clean workspace. Commands from a predefined script saved in a file (e.g., demo.py) are executed by invoking the interpreter with the filename as an argument (e.g., python demo.py), or using an additional -i flag in order to execute a
script and then enter interactive mode (e.g., python -i demo.py).

Many integrated development environments (IDEs) provide richer software development platforms for Python, including one named IDLE that is included with the standard Python distribution. IDLE provides an embedded text-editor with support for displaying and editing Python code, and a basic debugger, allowing step-by-step execution of a program while examining key variable values.

1.1.2 Preview of a Python Program

As a simple introduction, Code Fragment 1.1 presents a Python program that computes the grade-point average (GPA) for a student based on letter grades that are entered by a user. Many of the techniques demonstrated in this example will be discussed in the remainder of this chapter. At this point, we draw attention to a few high-level issues, for readers who are new to Python as a programming language.

Python’s syntax relies heavily on the use of whitespace. Individual statements are typically concluded with a newline character, although a command can extend to another line, either with a concluding backslash character (\), or if an opening delimiter has not yet been closed, such as the { character in defining the points map.

Whitespace is also key in delimiting the bodies of control structures in Python. Specifically, a block of code is indented to designate it as the body of a control structure, and nested control structures use increasing amounts of indentation. In Code Fragment 1.1, the body of the while loop consists of the subsequent lines, including a nested conditional structure.

Comments are annotations provided for human readers, yet ignored by the Python interpreter. The primary syntax for comments in Python is based on use of the # character, which designates the remainder of the line as a comment.

    print('Welcome to the GPA calculator.')
    print('Please enter all your letter grades, one per line.')
    print('Enter a blank line to designate the end.')
    # map from letter grade to point value
    points = {'A+': 4.0, 'A': 4.0, 'A-': 3.67, 'B+': 3.33, 'B': 3.0, 'B-': 2.67,
              'C+': 2.33, 'C': 2.0, 'C-': 1.67, 'D+': 1.33, 'D': 1.0, 'F': 0.0}
    num_courses = 0
    total_points = 0
    done = False
    while not done:
        grade = input()                      # read line from user
        if grade == '':                      # empty line was entered
            done = True
        elif grade not in points:            # unrecognized grade entered
            print("Unknown grade {0} being ignored".format(grade))
        else:
            num_courses += 1
            total_points += points[grade]
    if num_courses > 0:                      # avoid division by zero
        print('Your GPA is {0:.3}'.format(total_points / num_courses))

Code Fragment 1.1: A Python program that computes a grade-point average (GPA).
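
Because Code Fragment 1.1 reads its grades from the console, it is awkward to try out on fixed data. The following is a minimal sketch, not part of the original program, that packages the same table and averaging logic as a function. The names POINTS and compute_gpa are hypothetical, and the table assumes that the second C entry of the fragment was meant to be C- with value 1.67.

```python
# Hypothetical refactoring of Code Fragment 1.1 (names are illustrative only).
# Assumption: the grade table's second 'C' entry was intended to be 'C-'.
POINTS = {'A+': 4.0, 'A': 4.0, 'A-': 3.67, 'B+': 3.33, 'B': 3.0, 'B-': 2.67,
          'C+': 2.33, 'C': 2.0, 'C-': 1.67, 'D+': 1.33, 'D': 1.0, 'F': 0.0}


def compute_gpa(grades):
    """Return the average point value of the recognized grades.

    Unrecognized grades are skipped, mirroring the 'being ignored' branch
    of the console program. Returns None when no recognized grade is given,
    which avoids the division by zero.
    """
    num_courses = 0
    total_points = 0
    for grade in grades:
        if grade not in POINTS:
            continue                  # unrecognized grade is ignored
        num_courses += 1
        total_points += POINTS[grade]
    if num_courses == 0:
        return None
    return total_points / num_courses
```

For instance, compute_gpa(['A', 'B']) evaluates to 3.5, while an empty list or a list containing only unknown grades gives None.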

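The whitespace rules described before Code Fragment 1.1 (continuation inside an unclosed delimiter, continuation with a backslash, and indentation for nested bodies) can also be seen in isolation. This is a small illustrative sketch; the names weights, total, and count are made up for the example and do not come from the book.

```python
# Continuation inside an open delimiter: the statement runs on until the
# closing brace, so no backslash is needed.
weights = {'homework': 0.4,
           'exam': 0.6}

# Continuation with a concluding backslash character.
total = weights['homework'] + \
        weights['exam']

# Indentation designates bodies; a nested structure is indented further.
count = 0
for name in weights:              # body of the for loop: one level
    if weights[name] > 0.5:       # body of the nested if: two levels
        count += 1                # only 'exam' exceeds 0.5
```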
Date posted: 13/04/2019, 01:33