
Advanced Computer Architecture - Lecture 12: Instruction level parallelism


DOCUMENT INFORMATION

Advanced Computer Architecture - Lecture 12: Instruction level parallelism. This lecture covers the following: introduction to the multi-cycle pipelined datapath; longer pipelines – FP instructions; loop level parallelism; FP loop hazards; typical MIPS FP pipeline; hazards in longer latency pipelines;...

CS 704 Advanced Computer Architecture
Lecture 12: Instruction Level Parallelism (Introduction to multi-cycle pipelined datapath)
Prof. Dr. M. Ashraf Chughtai

MAC/VU-Advanced Computer Architecture, Lecture 12 – Instruction Level Parallelism (1)

Today's Topics
- Recap: Pipelining Basics
- Longer Pipelines – FP Instructions
- Loop Level Parallelism
- FP Loop Hazards
- Summary

Recap: Pipelined datapath and control
- In the previous lecture we reviewed the pipelined datapath to understand the basics of ILP: overlapping the execution of instructions to enhance performance.
- Key components of the pipelined datapath
- Performance enhancement due to pipelining: pipelining improves instruction bandwidth (throughput) but not the latency of an individual instruction.

Recap: Pipeline Hazards
- Structural hazards
- Data hazards

Recap: Three Generic Data Hazards
Read After Write (RAW), a true dependence: instruction j tries to read an operand before instruction i writes it.
  i: add r1,r2,r3
  j: sub r4,r1,r3

Write After Read (WAR), an anti-dependence: instruction j tries to write an operand before instruction i reads it. This is a name dependence, which renaming can remove.
  i: sub r4,r1,r3
  j: add r1,r2,r3

Write After Write (WAW): instruction j tries to write an operand before instruction i writes it.
  i: sub r1,r4,r3
  j: add r1,r2,r3

Recap: Pipeline Hazards (cont'd)
- Control hazards
How to overcome hazards? Stall.

Recap: How to remove Hazards?
- Structural hazards: multiple functional units
- Data hazards: forwarding (bypassing)
- Control hazards: prediction, delayed branch

Working of extended FP Pipeline
- Additional pipeline registers have been inserted between the intervening stages, e.g., A1/A2, A2/A3, ...
- The ID/EX register must be expanded to connect ID to the A1, M1, EX and DIV functional units.
- The FP divide unit is not pipelined; it requires 24 clock cycles to complete.

FP Pipeline Timing: Example

Hazards in Longer Latency Pipeline
- The functional units are not all fully pipelined, so structural hazards may occur.
- Instructions have varying running times, so more than one register write may be required in the same cycle.
- Instructions no longer reach the WB stage in order, so WAW data hazards may occur.
- WAR hazards are not possible, since registers are read in the ID stage.
- Stalls for RAW data hazards may be more frequent because of the longer latency of operations.

FP Pipeline Hazards - RAW
(Timing diagram; only the first six clock cycles are recoverable.)

Instruction        1      2      3      4      5      6
L.D   F4,0(R2)     IF     ID     EX     Me     WB
MUL.D F0,F4,F6            IF     ID     stall  M1     M2
ADD.D F2,F0,F8                   IF     stall  ID     stall
S.D   F2,0(R2)                                 IF     stall

FP Pipeline Structural Hazard
(Timing diagram; the instruction names are not preserved, so the rows are numbered.)

Clock cycle     1   2   3   4   5   6   7   8   9   10  11
Instruction 1   IF  ID  M1  M2  M3  M4  M5  M6  M7  Me  WB
Instruction 2       IF  ID  Ex  Me  WB
Instruction 3           IF  ID  Ex  Me  WB
Instruction 4               IF  ID  A1  A2  A3  A4  Me  WB
Instruction 5                   IF  ID  Ex  Me  WB
Instruction 6                       IF  ID  Ex  Me  WB
Instruction 7                           IF  ID  Ex  Me  WB

Instructions 1, 4 and 7 all reach WB in clock cycle 11, so three register writes are requested in the same cycle.

Conclusion about FP Pipeline
1. Check for structural hazards: wait until the required functional unit is available.
2. Check for RAW data hazards: wait until the source registers are no longer listed as pending destinations of instructions whose results will not be available when this instruction needs them.
3. Check for WAW data hazards: determine whether any instruction in A1, A2, ..., D, M1, M2, ... has the same destination register as this instruction; if so, stall the issue of this instruction.

Precise Exceptions: Out-of-order Completion!
In the program:
  DIV.D F0,F2,F4
  ADD.D F10,F10,F8
  SUB.D F12,F12,F14

Overcoming the Data Hazard by Scheduling
- Static scheduling: compiler based
- Dynamic scheduling: hardware based
Statically Scheduled Pipeline:

Dynamic Scheduling: Overcoming the Data Hazard
Dynamically Scheduled Pipeline:
Advantages:
- Handles cases where a dependence is unknown at compile time.
- Allows code compiled for one pipeline to run on another pipeline.

Concept of Dynamic Scheduling (cont'd)
In the program:
  DIV.D F0,F2,F4
  ADD.D F10,F0,F8
  SUB.D F12,F8,F14

Problems of Out-of-order Execution: WAR and WAW
In the program:
  DIV.D F0,F2,F4
  ADD.D F6,F0,F8
  SUB.D F8,F10,F14
  MUL.D F6,F10,F8

Exception due to out-of-order execution
- Already completed instructions
- Not yet completed instructions

Overcoming Exceptions
Split the ID pipe stage into two:
- Issue: decode instructions and check for structural hazards.
- Read Operands: wait until there are no data hazards, then read the operands.

Summary
We have talked about longer FP pipelines.

Assalam-u-Alaikum and Allah Hafiz
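The three generic data hazards from the recap can be checked mechanically. The sketch below is illustrative only and is not part of the lecture: `Instr`, `hazards` and `all_hazards` are hypothetical names, and the check is a simple pairwise register comparison that ignores intervening redefinitions and memory operands. It is applied to the four-instruction program from "Problems of Out-of-order execution".

```python
from typing import NamedTuple


class Instr(NamedTuple):
    """A register-to-register instruction: one destination, some sources."""
    op: str
    dest: str
    srcs: tuple


def hazards(earlier, later):
    """Return the hazard kinds that `later` has with respect to `earlier`."""
    found = set()
    if earlier.dest in later.srcs:
        found.add("RAW")  # later reads what earlier writes
    if later.dest in earlier.srcs:
        found.add("WAR")  # later writes what earlier reads
    if later.dest == earlier.dest:
        found.add("WAW")  # both write the same register
    return found


def all_hazards(prog):
    """List (kind, earlier op, later op) for every ordered pair in `prog`."""
    result = []
    for i, a in enumerate(prog):
        for b in prog[i + 1:]:
            for kind in sorted(hazards(a, b)):
                result.append((kind, a.op, b.op))
    return result


# The program from the slide on WAR and WAW problems.
program = [
    Instr("DIV.D", "F0", ("F2", "F4")),
    Instr("ADD.D", "F6", ("F0", "F8")),
    Instr("SUB.D", "F8", ("F10", "F14")),
    Instr("MUL.D", "F6", ("F10", "F8")),
]

for kind, first, second in all_hazards(program):
    print(f"{kind}: {first} -> {second}")
```

Under these assumptions the sketch reports a RAW between DIV.D and ADD.D on F0, a WAR between ADD.D and SUB.D on F8, a WAW between ADD.D and MUL.D on F6, and a RAW between SUB.D and MUL.D on F8. The WAR and WAW pairs are the ones that become a problem once instructions may execute out of order.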
... Finally, discard all but the correct stream
Superscalar Design ...
... (page A-48) Explanation next please
Typical MIPS FP Pipeline ...
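The point under "Hazards in Longer Latency Pipeline" that more than one register write may occur in a cycle can be illustrated with a small calculation. This is a minimal sketch rather than the lecture's own method: it assumes one instruction is fetched per cycle with no stalls, and that the stage counts are 1 for the integer EX stage, 4 for FP add (A1 to A4) and 7 for FP multiply (M1 to M7), as the stage names on the slides suggest. The names `EX_STAGES`, `wb_cycle` and `write_port_conflicts` are hypothetical, and the unit sequence is one reading of the "FP Pipeline Structural Hazard" diagram.

```python
from collections import Counter

# Assumed number of execution stages per functional unit.
EX_STAGES = {"int": 1, "fp_add": 4, "fp_mul": 7}


def wb_cycle(issue_index, unit):
    """Clock cycle in which an instruction reaches WB, assuming no stalls.

    The instruction at position `issue_index` (0-based) is fetched in cycle
    issue_index + 1, decoded in the next cycle, spends EX_STAGES[unit] cycles
    executing, then one cycle in MEM, and reaches WB in the cycle after that.
    """
    if_cycle = issue_index + 1
    return if_cycle + 1 + EX_STAGES[unit] + 2


def write_port_conflicts(units):
    """Map each clock cycle with more than one WB to the number of writers."""
    counts = Counter(wb_cycle(i, u) for i, u in enumerate(units))
    return {cycle: n for cycle, n in counts.items() if n > 1}


# One FP multiply, two single-EX instructions, one FP add,
# then three more single-EX instructions.
sequence = ["fp_mul", "int", "int", "fp_add", "int", "int", "int"]
print(write_port_conflicts(sequence))
```

With this sequence the first, fourth and seventh instructions all reach WB in clock cycle 11, which is the kind of collision the issue logic has to detect and resolve by stalling.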

Date posted: 05/07/2022, 11:49
