In Praise of Computer Architecture: A Quantitative Approach
Fourth Edition

“The multiprocessor is here and it can no longer be avoided. As we bid farewell to single-core processors and move into the chip multiprocessing age, it is great timing for a new edition of Hennessy and Patterson’s classic. Few books have had as significant an impact on the way their discipline is taught, and the current edition will ensure its place at the top for some time to come.”
—Luiz André Barroso, Google Inc.

“What do the following have in common: Beatles’ tunes, HP calculators, chocolate chip cookies, and Computer Architecture? They are all classics that have stood the test of time.”
—Robert P. Colwell, Intel lead architect

“Not only does the book provide an authoritative reference on the concepts that all computer architects should be familiar with, but it is also a good starting point for investigations into emerging areas in the field.”
—Krisztián Flautner, ARM Ltd.

“The best keeps getting better! This new edition is updated and very relevant to the key issues in computer architecture today. Plus, its new exercise paradigm is much more useful for both students and instructors.”
—Norman P. Jouppi, HP Labs

“Computer Architecture builds on fundamentals that yielded the RISC revolution, including the enablers for CISC translation. Now, in this new edition, it clearly explains and gives insight into the latest microarchitecture techniques needed for the new generation of multithreaded multicore processors.”
—Marc Tremblay, Fellow & VP, Chief Architect, Sun Microsystems

“This is a great textbook on all key accounts: pedagogically superb in exposing the ideas and techniques that define the art of computer organization and design, stimulating to read, and comprehensive in its coverage of topics. The first edition set a standard of excellence and relevance; this latest edition does it again.”
—Miloš Ercegovac, UCLA

“They’ve done it again. Hennessy and Patterson emphatically demonstrate why they are
the doyens of this deep and shifting field. Fallacy: Computer architecture isn’t an essential subject in the information age. Pitfall: You don’t need the 4th edition of Computer Architecture.”
—Michael D. Smith, Harvard University

“Hennessy and Patterson have done it again! The 4th edition is a classic encore that has been adapted beautifully to meet the rapidly changing constraints of ‘late-CMOS-era’ technology. The detailed case studies of real processor products are especially educational, and the text reads so smoothly that it is difficult to put down. This book is a must-read for students and professionals alike!”
—Pradip Bose, IBM

“This latest edition of Computer Architecture is sure to provide students with the architectural framework and foundation they need to become influential architects of the future.”
—Ravishankar Iyer, Intel Corp.

“As technology has advanced, and design opportunities and constraints have changed, so has this book. The 4th edition continues the tradition of presenting the latest in innovations with commercial impact, alongside the foundational concepts: advanced processor and memory system design techniques, multithreading and chip multiprocessors, storage systems, virtual machines, and other concepts. This book is an excellent resource for anybody interested in learning the architectural concepts underlying real commercial products.”
—Gurindar Sohi, University of Wisconsin–Madison

“I am very happy to have my students study computer architecture using this fantastic book and am a little jealous for not having written it myself.”
—Mateo Valero, UPC, Barcelona

“Hennessy and Patterson continue to evolve their teaching methods with the changing landscape of computer system design. Students gain unique insight into the factors influencing the shape of computer architecture design and the potential research directions in the computer systems field.”
—Dan Connors, University of Colorado at Boulder

“With this revision, Computer Architecture will
remain a must-read for all computer architecture students in the coming decade.”
—Wen-mei Hwu, University of Illinois at Urbana–Champaign

“The 4th edition of Computer Architecture continues in the tradition of providing a relevant and cutting edge approach that appeals to students, researchers, and designers of computer systems. The lessons that this new edition teaches will continue to be as relevant as ever for its readers.”
—David Brooks, Harvard University

“With the 4th edition, Hennessy and Patterson have shaped Computer Architecture back to the lean focus that made the 1st edition an instant classic.”
—Mark D. Hill, University of Wisconsin–Madison

Computer Architecture
A Quantitative Approach
Fourth Edition

John L. Hennessy is the president of Stanford University, where he has been a member of the faculty since 1977 in the departments of electrical engineering and computer science. Hennessy is a Fellow of the IEEE and ACM, a member of the National Academy of Engineering and the National Academy of Sciences, and a Fellow of the American Academy of Arts and Sciences. Among his many awards are the 2001 Eckert-Mauchly Award for his contributions to RISC technology, the 2001 Seymour Cray Computer Engineering Award, and the 2000 John von Neumann Award, which he shared with David Patterson. He has also received seven honorary doctorates.

In 1981, he started the MIPS project at Stanford with a handful of graduate students. After completing the project in 1984, he took a one-year leave from the university to cofound MIPS Computer Systems, which developed one of the first commercial RISC microprocessors. After being acquired by Silicon Graphics in 1991, MIPS Technologies became an independent company in 1998, focusing on microprocessors for the embedded marketplace. As of 2006, over 500 million MIPS microprocessors have been shipped in devices ranging from video games and palmtop computers to laser printers and network switches.

David A. Patterson has been teaching computer
architecture at the University of California, Berkeley, since joining the faculty in 1977, where he holds the Pardee Chair of Computer Science. His teaching has been honored by the Abacus Award from Upsilon Pi Epsilon, the Distinguished Teaching Award from the University of California, the Karlstrom Award from ACM, and the Mulligan Education Medal and Undergraduate Teaching Award from IEEE. Patterson received the IEEE Technical Achievement Award for contributions to RISC and shared the IEEE Johnson Information Storage Award for contributions to RAID. He then shared the IEEE John von Neumann Medal and the C & C Prize with John Hennessy. Like his co-author, Patterson is a Fellow of the American Academy of Arts and Sciences, ACM, and IEEE, and he was elected to the National Academy of Engineering, the National Academy of Sciences, and the Silicon Valley Engineering Hall of Fame. He served on the Information Technology Advisory Committee to the U.S. President, as chair of the CS division in the Berkeley EECS department, as chair of the Computing Research Association, and as President of ACM. This record led to a Distinguished Service Award from CRA.

At Berkeley, Patterson led the design and implementation of RISC I, likely the first VLSI reduced instruction set computer. This research became the foundation of the SPARC architecture, currently used by Sun Microsystems, Fujitsu, and others. He was a leader of the Redundant Arrays of Inexpensive Disks (RAID) project, which led to dependable storage systems from many companies. He was also involved in the Network of Workstations (NOW) project, which led to cluster technology used by Internet companies. These projects earned three dissertation awards from the ACM. His current research projects are the RAD Lab, which is inventing technology for reliable, adaptive, distributed Internet services, and the Research Accelerator for Multiple Processors (RAMP) project, which is developing and distributing low-cost, highly scalable, parallel
computers based on FPGAs and open-source hardware and software.

Computer Architecture
A Quantitative Approach
Fourth Edition

John L. Hennessy, Stanford University
David A. Patterson, University of California at Berkeley

With Contributions by

Andrea C. Arpaci-Dusseau, University of Wisconsin–Madison
Remzi H. Arpaci-Dusseau, University of Wisconsin–Madison
Krste Asanovic, Massachusetts Institute of Technology
Robert P. Colwell, R&E Colwell & Associates, Inc.
Thomas M. Conte, North Carolina State University
José Duato, Universitat Politècnica de València and Simula
Diana Franklin, California Polytechnic State University, San Luis Obispo
David Goldberg, Xerox Palo Alto Research Center
Wen-mei W. Hwu, University of Illinois at Urbana–Champaign
Norman P. Jouppi, HP Labs
Timothy M. Pinkston, University of Southern California
John W. Sias, University of Illinois at Urbana–Champaign
David A. Wood, University of Wisconsin–Madison

Amsterdam • Boston • Heidelberg • London • New York • Oxford • Paris • San Diego • San Francisco • Singapore • Sydney • Tokyo

Publisher: Denise E. M. Penrose
Project Manager: Dusty Friedman, The Book Company
In-house Senior Project Manager: Brandy Lilly
Developmental Editor: Nate McFadden
Editorial Assistant: Kimberlee Honjo
Cover Design: Elisabeth Beller and Ross Carron Design
Cover Image: Richard I’Anson’s Collection: Lonely Planet Images
Composition: Nancy Logan
Text Design: Rebecca Evans & Associates
Technical Illustration: David Ruppe, Impact Publications
Copyeditor: Ken Della Penta
Proofreader: Jamie Thaman
Indexer: Nancy Ball
Printer: Maple-Vail Book Manufacturing Group

Morgan Kaufmann Publishers is an Imprint of Elsevier
500 Sansome Street, Suite 400, San Francisco, CA 94111

This book is printed on acid-free paper.

© 1990, 1996, 2003, 2007 by Elsevier, Inc. All rights reserved.

Published 1990. Fourth edition 2007.

Designations used by companies to distinguish their products are often claimed as trademarks or registered trademarks. In all instances in which Morgan Kaufmann Publishers is aware
of a claim, the product names appear in initial capital or all capital letters. Readers, however, should contact the appropriate companies for more complete information regarding trademarks and registration.

Permissions may be sought directly from Elsevier’s Science & Technology Rights Department in Oxford, UK: phone: (+44) 1865 843830, fax: (+44) 1865 853333, e-mail: permissions@elsevier.com. You may also complete your request on-line via the Elsevier Science homepage (http://elsevier.com), by selecting “Customer Support” and then “Obtaining Permissions.”

Library of Congress Cataloging-in-Publication Data

Hennessy, John L.
Computer architecture : a quantitative approach / John L. Hennessy, David A. Patterson ; with contributions by Andrea C. Arpaci-Dusseau [et al.]. —4th ed.
p. cm.
Includes bibliographical references and index.
ISBN 13: 978-0-12-370490-0 (pbk. : alk. paper)
ISBN 10: 0-12-370490-1 (pbk. : alk. paper)
1. Computer architecture. I. Patterson, David A. II. Arpaci-Dusseau, Andrea C. III. Title.
QA76.9.A73P377 2006
004.2'2—dc22
2006024358

For all information on all Morgan Kaufmann publications, visit our website at www.mkp.com or www.books.elsevier.com

Printed in the United States of America
06 07 08 09 10

To Andrea, Linda, and our four sons

Foreword
by Fred Weber, President and CEO of MetaRAM, Inc.

I am honored and privileged to write the foreword for the fourth edition of this most important book in computer architecture. In the first edition, Gordon Bell, my first industry mentor, predicted the book’s central position as the definitive text for computer architecture and design. He was right. I clearly remember the excitement generated by the introduction of this work. Rereading it now, with significant extensions added in the three new editions, has been a pleasure all over again. No other work in computer architecture—frankly, no other work I have read in any field—so quickly and effortlessly takes the reader from ignorance to a breadth and depth of knowledge.

This book is dense
in facts and figures, in rules of thumb and theories, in examples and descriptions. It is stuffed with acronyms, technologies, trends, formulas, illustrations, and tables. And, this is thoroughly appropriate for a work on architecture. The architect’s role is not that of a scientist or inventor who will deeply study a particular phenomenon and create new basic materials or techniques. Nor is the architect the craftsman who masters the handling of tools to craft the finest details. The architect’s role is to combine a thorough understanding of the state of the art of what is possible, a thorough understanding of the historical and current styles of what is desirable, a sense of design to conceive a harmonious total system, and the confidence and energy to marshal this knowledge and available resources to go out and get something built. To accomplish this, the architect needs a tremendous density of information with an in-depth understanding of the fundamentals and a quantitative approach to ground his thinking. That is exactly what this book delivers.

As computer architecture has evolved—from a world of mainframes, minicomputers, and microprocessors, to a world dominated by microprocessors, and now into a world where microprocessors themselves are encompassing all the complexity of mainframe computers—Hennessy and Patterson have updated their book appropriately. The first edition showcased the IBM 360, DEC VAX, and Intel 80x86, each the pinnacle of its class of computer, and helped introduce the world to RISC architecture. The later editions focused on the details of the 80x86 and RISC processors, which had come to dominate the landscape. This latest edition expands the coverage of threading and multiprocessing, virtualization and memory hierarchy, and storage systems, giving the reader context appropriate to today’s most important directions and setting the stage for the next decade of design. It highlights the AMD Opteron and SUN Niagara as the
best examples of the x86 and SPARC (RISC) architectures brought into the new world of multiprocessing and system-on-a-chip architecture, thus grounding the art and science in real-world commercial examples.

The first chapter, in less than 60 pages, introduces the reader to the taxonomies of computer design and the basic concerns of computer architecture, gives an overview of the technology trends that drive the industry, and lays out a quantitative approach to using all this information in the art of computer design. The next two chapters focus on traditional CPU design and give a strong grounding in the possibilities and limits in this core area. The final three chapters build out an understanding of system issues with multiprocessing, memory hierarchy, and storage. Knowledge of these areas has always been of critical importance to the computer architect. In this era of system-on-a-chip designs, it is essential for every CPU architect.

Finally, the appendices provide a great depth of understanding by working through specific examples in great detail. In design it is important to look at both the forest and the trees and to move easily between these views. As you work through this book you will find plenty of both.

The result of great architecture, whether in computer design, building design or textbook design, is to take the customer’s requirements and desires and return a design that causes that customer to say, “Wow, I didn’t know that was possible.” This book succeeds on that measure and will, I hope, give you as much pleasure and value as it has me.

Contents

Foreword
Preface
Acknowledgments

Chapter 1 Fundamentals of Computer Design
1.1 Introduction
1.2 Classes of Computers
1.3 Defining Computer Architecture
1.4 Trends in Technology
1.5 Trends in Power in Integrated Circuits
1.6 Trends in Cost
1.7 Dependability
1.8 Measuring, Reporting, and Summarizing Performance
1.9 Quantitative Principles of Computer Design
1.10 Putting It All Together: Performance and Price-Performance
1.11 Fallacies and Pitfalls
1.12 Concluding Remarks
1.13 Historical Perspectives and References
Case Studies with Exercises by Diana Franklin

Chapter 2 Instruction-Level Parallelism and Its Exploitation
2.1 Instruction-Level Parallelism: Concepts and Challenges
2.2 Basic Compiler Techniques for Exposing ILP
2.3 Reducing Branch Costs with Prediction
2.4 Overcoming Data Hazards with Dynamic Scheduling
2.5 Dynamic Scheduling: Examples and the Algorithm
2.6 Hardware-Based Speculation
2.7 Exploiting ILP Using Multiple Issue and Static Scheduling
2.8 Exploiting ILP Using Dynamic Scheduling, Multiple Issue, and Speculation
2.9 Advanced Techniques for Instruction Delivery and Speculation
2.10 Putting It All Together: The Intel Pentium 4
2.11 Fallacies and Pitfalls
2.12 Concluding Remarks
2.13 Historical Perspective and References
Case Studies with Exercises by Robert P. Colwell

Chapter 3 Limits on Instruction-Level Parallelism
3.1 Introduction
3.2 Studies of the Limitations of ILP
3.3 Limitations on ILP for Realizable Processors
3.4 Crosscutting Issues: Hardware versus Software Speculation
3.5 Multithreading: Using ILP Support to Exploit Thread-Level Parallelism
3.6 Putting It All Together: Performance and Efficiency in Advanced Multiple-Issue Processors
3.7 Fallacies and Pitfalls
3.8 Concluding Remarks
3.9 Historical Perspective and References
Case Study with Exercises by Wen-mei W. Hwu and John W. Sias

Chapter 4 Multiprocessors and Thread-Level Parallelism
4.1 Introduction
4.2 Symmetric Shared-Memory Architectures
4.3 Performance of Symmetric Shared-Memory Multiprocessors
4.4 Distributed Shared Memory and Directory-Based Coherence
4.5 Synchronization: The Basics
4.6 Models of Memory Consistency: An Introduction
4.7 Crosscutting Issues
4.8 Putting It All Together: The Sun T1 Multiprocessor
4.9 Fallacies and Pitfalls
4.10 Concluding Remarks
4.11 Historical Perspective and References
Case Studies with Exercises by David A. Wood

Chapter 5 Memory Hierarchy Design
5.1 Introduction
5.2 Eleven Advanced Optimizations of Cache Performance
5.3 Memory Technology and Optimizations
5.4 Protection: Virtual Memory and Virtual Machines
5.5 Crosscutting Issues: The Design of Memory Hierarchies
5.6 Putting It All Together: AMD Opteron Memory Hierarchy
5.7 Fallacies and Pitfalls
5.8 Concluding Remarks
5.9 Historical Perspective and References
Case Studies with Exercises by Norman P. Jouppi

Chapter 6 Storage Systems
6.1 Introduction
6.2 Advanced Topics in Disk Storage
6.3 Definition and Examples of Real Faults and Failures
6.4 I/O Performance, Reliability Measures, and Benchmarks
6.5 A Little Queuing Theory
6.6 Crosscutting Issues
6.7 Designing and Evaluating an I/O System—The Internet Archive Cluster
6.8 Putting It All Together: NetApp FAS6000 Filer
6.9 Fallacies and Pitfalls
6.10 Concluding Remarks
6.11 Historical Perspective and References
Case Studies with Exercises by Andrea C. Arpaci-Dusseau and Remzi H. Arpaci-Dusseau

Appendix A Pipelining: Basic and Intermediate Concepts
A.1 Introduction
A.2 The Major Hurdle of Pipelining—Pipeline Hazards
A.3 How Is Pipelining Implemented?
A.4 What Makes Pipelining Hard to Implement?
A.5 Extending the MIPS Pipeline to Handle Multicycle Operations
A.6 Putting It All Together: The MIPS R4000 Pipeline
A.7 Crosscutting Issues
A.8 Fallacies and Pitfalls
A.9 Concluding Remarks
A.10 Historical Perspective and References

Appendix B Instruction Set Principles and Examples
B.1 Introduction
B.2 Classifying Instruction Set Architectures
B.3 Memory Addressing
B.4 Type and Size of Operands
B.5 Operations in the Instruction Set
B.6 Instructions for Control Flow
B.7 Encoding an Instruction Set
B.8 Crosscutting Issues: The Role of Compilers
B.9 Putting It All Together: The MIPS Architecture
B.10 Fallacies and Pitfalls
B.11 Concluding Remarks
B.12 Historical Perspective and References

Appendix C Review of Memory Hierarchy
C.1 Introduction
C.2 Cache Performance
C.3 Six Basic Cache Optimizations
C.4 Virtual Memory
C.5 Protection and Examples of Virtual Memory
C.6 Fallacies and Pitfalls
C.7 Concluding Remarks
C.8 Historical Perspective and References

Companion CD Appendices
Appendix D Embedded Systems (updated by Thomas M. Conte)
Appendix E Interconnection Networks (revised by Timothy M. Pinkston and José Duato)
Appendix F Vector Processors (revised by Krste Asanovic)
Appendix G Hardware and Software for VLIW and EPIC
Appendix H Large-Scale Multiprocessors and Scientific Applications
Appendix I Computer Arithmetic (by David Goldberg)
Appendix J Survey of Instruction Set Architectures
Appendix K Historical Perspectives and References

Online Appendix (textbooks.elsevier.com/0123704901)
Appendix L Solutions to Case Study Exercises

References
Index