Multiprocessors

Group
1. Đỗ Luật Khoa
2. Lương Quang Tùng
3. Trần Thanh Phương
4. Phan Thanh Duy
5. Đặng Thanh Hùng
6. Thái Tiểu Minh
7. Nguyễn Thị Thanh Xuân

Multiprocessor system

What is a multiprocessor system?
• A multiprocessor is a tightly coupled computer system having two or more processing units (multiple processors), each sharing main memory and peripherals, in order to process programs simultaneously (according to Wikipedia).

Why multiprocessor systems?
• To get high performance and to reduce energy consumption.

Flynn Classification
• Flynn [1966] proposed a simple model for categorizing all computers that is still useful today.
• He looked at the parallelism in the instruction and data streams.
• Flynn: born May 20, 1934, New York City.

Which categories does the Flynn Classification define?
The four classifications defined by Flynn are based on the number of concurrent instruction (or control) streams and data streams available in the architecture:
• Single instruction stream, single data stream (SISD): this category is the uniprocessor.
• Single instruction stream, multiple data streams (SIMD): the same instruction is executed by multiple processors using different data streams.
• Multiple instruction streams, single data stream (MISD): no commercial multiprocessor of this type has been built to date.
• Multiple instruction streams, multiple data streams (MIMD): each processor fetches its own instructions and operates on its own data.

Because the MIMD model can exploit thread-level parallelism, it is the architecture of choice for general-purpose multiprocessors:
• MIMDs offer flexibility.
• MIMDs can build on the cost-performance advantages of off-the-shelf processors.

MIMD Model
• Each processor executes its own instruction stream.
• In many cases, each processor executes a different process.
• A process is a segment of code that may be run independently; it contains all the information necessary to execute that program on a processor.
• Threads: multiple processors execute a single program and share the code and most of their address space.
• To use an MIMD multiprocessor with n processors, we must usually have at least n threads or processes to execute.
• Thread-level parallelism is identified at a high level by the software system.
• Threads consist of hundreds to millions of instructions that may be executed in parallel.

MIMD Model
MIMD multiprocessors fall into classes depending on the number of processors involved, which in turn dictates a memory organization and interconnect strategy.

Centralized shared-memory architectures
• At most a few dozen processor chips (and fewer than 100 cores) in 2006.
• Used for multiprocessors with small processor counts.
• Scaled by using multiple point-to-point connections, or a switch, and adding additional memory banks.
• Because there is a single main memory that has a symmetric relationship to all processors and a uniform access time from any processor, these machines are called symmetric (shared-memory) multiprocessors (SMPs), and the organization is called uniform memory access (UMA).
• This type of symmetric shared-memory architecture is currently by far the most popular organization.

MESI Protocol

MESI Writeback Invalidation Protocol
MESI aims to remove two types of unnecessary bus transactions:
• A BusRdX that is snooped and converts the block from S to M when you are already the sole owner of the block.
• A BusRd that gets the line in the S state when there are no sharers.
The fix is to introduce the Exclusive state:
• A processor can write to an Exclusive copy without generating a BusRdX.
Illinois Protocol:
• Proposed by Papamarcos and Patel in 1984.
• Employed in Intel, PowerPC, and MIPS processors.

MESI Writeback Invalidation Protocol (Processor Request)
Processor-initiated transitions, written as event / bus action. S is the shared signal: BusRd (S) means another cache asserted it, BusRd (not-S) means no cache did.

Current state   Event / bus action      Next state
Invalid         PrRd / BusRd (not-S)    Exclusive
Invalid         PrRd / BusRd (S)        Shared
Invalid         PrWr / BusRdX           Modified
Shared          PrRd / -                Shared
Shared          PrWr / BusRdX           Modified
Exclusive       PrRd / -                Exclusive
Exclusive       PrWr / -                Modified
Modified        PrRd, PrWr / -          Modified

MESI Writeback Invalidation Protocol (Bus Transactions)
• Whenever possible, the Illinois protocol performs a cache-to-cache ($-to-$) transfer rather than having memory supply the data.
• A selection algorithm is needed if there are multiple suppliers (alternative: add an O state or force a memory update).
• Most MESI implementations simply write to memory.

Bus-snooper-initiated transitions, written as snooped transaction / action:

Current state   Snooped / action        Next state
Exclusive       BusRd / Flush (or -)    Shared
Exclusive       BusRdX / -              Invalid
Modified        BusRd / Flush           Shared
Modified        BusRdX / Flush          Invalid
Shared          BusRd / Flush*          Shared
Shared          BusRdX / Flush*         Invalid

Flush*: Flush for the data supplier; no action for other sharers.

MESI Example
Three processors P1, P2, P3, each with its own cache, share one bus and one memory. Memory initially holds X=1. A dash in a state column means the cache holds no valid copy.

Step  Processor action  P1  P2  P3  Bus transaction  Data supplier  X in memory
1     P1 reads X        E   -   -   BusRd            Memory         1
2     P1 writes X       M   -   -   -                Own cache      1
3     P3 reads X        S   -   S   BusRd/Flush      P1 cache       2
4     P3 writes X       I   -   M   BusRdX           Memory         2
5     P1 reads X        S   -   S   BusRd/Flush      P3 cache       3
6     P3 reads X        S   -   S   -                Own cache      3
7     P2 reads X        S   S   S   BusRd/Flush*     P1/P3 cache    3

P1 writes X=2 in step 2 and P3 writes X=3 in step 4. In both cases memory keeps the old value until the modified copy is flushed by the next BusRd (steps 3 and 5).

Thanks

[...]

... imbalances in the resource needs and resource availability over multiple threads.

Multithreading
Figure: The speedup from using multithreading on one core of an i7 processor averages 1.31 for the PARSEC benchmarks; the energy efficiency improvement is 1.07. These data were collected and analyzed by Esmaeilzadeh [2011].

Cache for Multiprocessors
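
The MESI transitions and the seven-step example from the slides above can be replayed with a small simulation. This is only a minimal sketch under stated assumptions, not the Illinois protocol as implemented in any real processor: the names `MesiBus` and `run_slide_example` are hypothetical, only a single block X is modeled, and a flushed block is written to memory and then read from there instead of being transferred cache to cache.

```python
from enum import Enum


class State(Enum):
    M = "M"  # Modified
    E = "E"  # Exclusive
    S = "S"  # Shared
    I = "I"  # Invalid


class MesiBus:
    """Hypothetical model: one block X, n snooping caches, one memory."""

    def __init__(self, n_caches, memory_value):
        self.states = [State.I] * n_caches
        self.values = [None] * n_caches
        self.memory = memory_value

    def _snoop(self, requester, exclusive):
        """Other caches react to BusRd (exclusive=False) or BusRdX (True).

        Returns True if any other cache held a valid copy (shared signal S).
        """
        shared = False
        for i, st in enumerate(self.states):
            if i == requester or st is State.I:
                continue
            shared = True
            if st is State.M:
                # Flush: the dirty copy is written back (assumption: via memory).
                self.memory = self.values[i]
            self.states[i] = State.I if exclusive else State.S
        return shared

    def read(self, p):
        """PrRd on cache p; returns the bus transaction it generates."""
        if self.states[p] is not State.I:
            return "-"  # hit in M, E or S: no bus traffic
        shared = self._snoop(p, exclusive=False)
        self.values[p] = self.memory
        self.states[p] = State.S if shared else State.E
        return "BusRd"

    def write(self, p, value):
        """PrWr on cache p; returns the bus transaction it generates."""
        txn = "-"
        if self.states[p] in (State.I, State.S):
            self._snoop(p, exclusive=True)  # invalidate all other copies
            txn = "BusRdX"
        # From E or M the write is silent: no BusRdX is needed.
        self.values[p] = value
        self.states[p] = State.M
        return txn


def run_slide_example():
    """Replay the seven steps; P1, P2, P3 are caches 0, 1, 2."""
    bus = MesiBus(n_caches=3, memory_value=1)
    steps = [("read", 0, None), ("write", 0, 2), ("read", 2, None),
             ("write", 2, 3), ("read", 0, None), ("read", 2, None),
             ("read", 1, None)]
    history = []
    for op, p, value in steps:
        txn = bus.read(p) if op == "read" else bus.write(p, value)
        states = "".join(s.value for s in bus.states)
        history.append((states, txn, bus.memory))
    return history


if __name__ == "__main__":
    for row in run_slide_example():
        print(row)
```

Each row of the result holds the states of P1, P2, P3 (with I where the slides show a dash), the requester's bus transaction, and the value of X in memory after the step. The write in step 2 produces no bus transaction because P1 holds X in the Exclusive state, which is the saving the E state was introduced for.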