PARALLEL PROGRAMMING WITH INTEL® PARALLEL STUDIO XE

FOREWORD
INTRODUCTION

PART I AN INTRODUCTION TO PARALLELISM
CHAPTER 1 Parallelism Today
CHAPTER 2 An Overview of Parallel Studio XE
CHAPTER 3 Parallel Studio XE for the Impatient

PART II USING PARALLEL STUDIO XE
CHAPTER 4 Producing Optimized Code
CHAPTER 5 Writing Secure Code
CHAPTER 6 Where to Parallelize
CHAPTER 7 Implementing Parallelism
CHAPTER 8 Checking for Errors
CHAPTER 9 Tuning Parallel Applications
CHAPTER 10 Parallel Advisor–Driven Design
CHAPTER 11 Debugging Parallel Applications
CHAPTER 12 Event-Based Analysis with VTune Amplifier XE

PART III CASE STUDIES
CHAPTER 13 The World’s First Sudoku “Thirty-Niner”
CHAPTER 14 Nine Tips to Parallel-Programming Heaven
CHAPTER 15 Parallel Track Fitting in the CERN Collider
CHAPTER 16 Parallelizing Legacy Code

INDEX

Parallel Programming with Intel® Parallel Studio XE
Stephen Blair-Chappell
Andrew Stokes

Published by
John Wiley & Sons, Inc.
10475 Crosspoint Boulevard
Indianapolis, IN 46256
www.wiley.com

Copyright © 2012 by John Wiley & Sons, Inc., Indianapolis, Indiana
Published simultaneously in Canada
ISBN: 978-0-470-89165-0
ISBN: 978-1-118-22113-6 (ebk)
ISBN: 978-1-118-23488-4 (ebk)
ISBN: 978-1-118-25954-2 (ebk)
Manufactured in the United States of America
10 9 8 7 6 5 4 3 2 1

No part of this publication may be reproduced, stored in a retrieval system or transmitted in any form or by any means, electronic, mechanical, photocopying, recording, scanning or otherwise, except as permitted under Sections 107 or 108 of the 1976 United States Copyright Act, without either the prior written permission of the Publisher, or authorization through payment of the appropriate per-copy fee to the Copyright Clearance Center, 222 Rosewood Drive, Danvers, MA 01923, (978) 750-8400, fax (978) 646-8600. Requests to the Publisher for permission should be addressed to the Permissions Department, John Wiley & Sons, Inc., 111 River Street, Hoboken, NJ 07030, (201) 748-6011, fax (201) 748-6008, or online at http://www.wiley.com/go/permissions.

Limit of Liability/Disclaimer of Warranty: The publisher and the author make no representations or warranties with respect to the accuracy or completeness of the contents of this work and specifically disclaim all warranties, including without limitation warranties of fitness for a particular purpose. No warranty may be created or extended by sales or promotional materials. The advice and strategies contained herein may not be suitable for every situation. This work is sold with the understanding that the publisher is not engaged in rendering legal, accounting, or other professional services. If professional assistance is required, the services of a competent professional person should be sought. Neither the publisher nor the author shall be liable for damages arising herefrom.
The fact that an organization or Web site is referred to in this work as a citation and/or a potential source of further information does not mean that the author or the publisher endorses the information the organization or Web site may provide or recommendations it may make. Further, readers should be aware that Internet Web sites listed in this work may have changed or disappeared between when this work was written and when it is read.

For general information on our other products and services please contact our Customer Care Department within the United States at (877) 762-2974, outside the United States at (317) 572-3993 or fax (317) 572-4002.

Wiley publishes in a variety of print and electronic formats and by print-on-demand. Some material included with standard print versions of this book may not be included in e-books or in print-on-demand. If this book refers to media such as a CD or DVD that is not included in the version you purchased, you may download this material at http://booksupport.wiley.com. For more information about Wiley products, visit www.wiley.com.

Library of Congress Control Number: 2011945570

Trademarks: Wiley, the Wiley logo, Wrox, the Wrox logo, Programmer to Programmer, and related trade dress are trademarks or registered trademarks of John Wiley & Sons, Inc. and/or its affiliates, in the United States and other countries, and may not be used without written permission. Intel is a registered trademark of Intel Corporation. All other trademarks are the property of their respective owners. John Wiley & Sons, Inc., is not associated with any product or vendor mentioned in this book.

ABOUT THE AUTHORS

STEPHEN BLAIR-CHAPPELL has been working for Intel in the Software and Services Group (SSG) for the past 15 years.
During his time with Intel, Stephen has worked on the compiler team as a developer and, more recently, as a technical consulting engineer helping users make the best use of the Intel software tools. Prior to working with Intel, Stephen was managing director of the UK office of CAD-UL, a German-based compiler and debugger company. During his time at CAD-UL Stephen was primarily responsible for technical support in the UK. Projects he worked on during that time included the design and specification of a graphical linker; the development and teaching of protected mode programming courses to programmers; and support to many varied companies in the telecoms, automotive, and embedded industries. Stephen first studied electronics as a technician at Matthew Boulton Technical College, and later studied Applied Software Engineering at Birmingham City University (BCU), where he also eventually taught. Outside work, Stephen is a regular contributor to the life of his local church, St Martin in the Bull Ring, Birmingham, where he plays the organ, preaches, and leads the occasional service.

ANDREW STOKES is a retired lecturer in software and electronics at Birmingham City University (BCU), UK. Prior to lecturing, Andrew was a software developer in the research and commercial fields. He first started software development in the 1980s at Cambridge University Engineering Laboratory, where he worked on software for scanning electron microscopes. These software developments continued in the commercial field, where he worked on graphical programs in support of a Finite Element Analysis package. During his time at BCU, Andrew developed many software simulation tools, including programs for artificial neural network simulation, CPU simulation, processor design, code development tools, and a PROLOG expert system.
Andrew continues these software interests during retirement, with a healthy interest in games programming, such as 3-D chess, where parallel programming is paramount. Away from computing, Andrew is a keen gardener and particularly likes the vibrant colors of the typical English garden.

ABOUT THE TECHNICAL EDITORS

KITTUR GANESH is a Senior Technical Consulting Engineer at Intel, providing consulting, support, and training for more than 7 years on various software products targeting Intel architecture. Previously, for more than 6 years at Intel, Kittur designed and developed software primarily used for fracturing design data of Intel chips. Prior to joining Intel more than 13 years ago, Kittur was involved in developing commercial software in the EDA industry for more than 10 years. Kittur has an M.S. (Computer Science), an M.S. (Industrial Engineering), and a B.S. (Mechanical Engineering).

PABLO HALPERN is a Senior Software Engineer at Intel Corporation, working in the parallel runtime libraries group. He is a member of the C++ Standards Committee and helped produce the recent C++11 revision of the standard. Pablo is the author of the well-received book The C++ Standard Library from Scratch and a coauthor of the paper Reducers and Other Cilk++ Hyperobjects, which was named best paper at ACM SPAA in 2009. He has more than three decades of experience in the software industry, with expertise in C++, language and compiler design, large-scale development and testing, and network management protocols. During this time, he has developed and taught both beginning and advanced courses on C++ programming. He currently lives in New Hampshire with his wife and two children.

[...]
Scalability
Calculating Speedup
Predicting Scalability
Parallelism and Real-Time Systems
Hard and Soft Real-Time
A Hard Real-Time Example using RTX
Advice for Real-Time Programmers
Summary

CHAPTER 2: AN OVERVIEW OF PARALLEL STUDIO XE
Why Parallel Studio XE?
What’s in Parallel Studio XE?
Intel Parallel Studio XE
Intel Parallel Advisor
The Advisor Workflow
Surveying...

... are adding parallelism to their code. The required technical skill is “average” to “experienced.” Knowledge of C programming is a prerequisite.
• Students and academics who are looking to gain practical experience in making code parallel
• Owners and users of Intel Parallel Studio XE

WHAT THIS BOOK COVERS

This book, written using Parallel Studio XE 2011, shows how you can profile, optimize, and parallelize...

Examples of parallelism are provided using Cilk Plus, OpenMP, and Threading Building Blocks. The case studies are based on larger projects and show how Parallel Studio XE was used to parallelize them.

WHAT YOU NEED TO USE THIS BOOK

You need the following to use this book:
• Intel Parallel Studio XE. You can download an evaluation version from the Intel Software Evaluation Center (http://software.intel.com/en-us/articles/...

... Correctness
Replacing Annotations
Intel Parallel Composer XE
Intel C/C++ Optimizing Compiler
Profile-Guided Optimization
Cilk Plus
OpenMP
Intel Threading Building Blocks
Intel Integrated Performance Primitives
An Application Example
IPP and Threading
Intel Parallel Debugger Extension
Intel Debugger
Math Kernel Library
VTune Amplifier XE
Hotspot Analysis...
Disassembly Source View
Parallel Inspector XE
Predefined Analysis Types
Errors and Warnings
Static Security Analysis
Different Approaches to Using Parallel Studio XE
Summary

CHAPTER 3: PARALLEL STUDIO XE FOR THE IMPATIENT
The Four-Step Methodology
Example 1: Working with Cilk Plus
Obtaining...

... consultation with a group of senior programming engineers revealed the top three hurdles in adopting parallelism: the challenges of porting legacy code, the lack of education, and the lack of the right kinds of programming tools. This book helps to address some of these hurdles.

This book was written to help you use Intel Parallel Studio XE to write programs that use the latest features of multi-core CPUs. With...

... or Ubuntu* 10.04
• A PC based on an IA-32 or Intel 64 architecture processor supporting the Intel Streaming SIMD Extensions 2 (Intel SSE2) instructions (Intel Pentium 4 processor or later), or compatible non-Intel processor. If you use a non-Intel processor, you will not be able to carry out the activities in Chapter 12, “Event-Based Analysis with VTune Amplifier XE.”

PART I
An Introduction to Parallelism
CHAPTER 1: Parallelism Today
CHAPTER 2: An Overview of Parallel Studio XE
CHAPTER 3: Parallel Studio XE for the Impatient
PART I: AN INTRODUCTION TO PARALLELISM

CHAPTER 1: PARALLELISM TODAY
The Arrival of Parallelism
The Power Density Race
The Emergence of Multi-Core and Many-Core Computing
The Top Six Challenges
Legacy Code
Tools
Education
Fear of Many-Core Computing
Maintainability
Return on Investment
Parallelism and the Programmer
Types of Parallelism
Intel’s Family of Parallel Models
Cilk Plus...

... nature of parallelism affects every aspect of programming today. I’m encouraged by Stephen’s work, which walks through each aspect instead of just coding. Covering the issues of discovery, debugging, and tuning is critical to understanding the challenges of parallel programming. I hope this book is an inspiration to all who read it. “Think Parallel.”

James Reinders
Director, Parallel Evangelist, Intel
Portland, ...