The Architecture of Computer Hardware and Systems Software: An Information Technology Approach, Chapter 8

DOCUMENT INFORMATION

Basic information

Format
Pages	38
Size	1.39 MB

Content

Chapter 8: CPU and Memory: Design, Implementation, and Enhancement
The Architecture of Computer Hardware and Systems Software: An Information Technology Approach
3rd Edition, Irv Englander
John Wiley and Sons, 2003

CPU Architecture Overview (8-2)
  - CISC – Complex Instruction Set Computer
  - RISC – Reduced Instruction Set Computer
  - CISC vs RISC Comparisons
  - VLIW – Very Long Instruction Word
  - EPIC – Explicitly Parallel Instruction Computer

CISC Architecture (8-3)
  - Examples: Intel x86, IBM Z-Series mainframes, older CPU architectures
  - Characteristics
      - Few general purpose registers
      - Many addressing modes
      - Large number of specialized, complex instructions
      - Instructions are of varying sizes

Limitations of CISC Architecture (8-4)
  - Complex instructions are infrequently used by programmers and compilers
  - Memory references (loads and stores) are slow and account for a significant fraction of all instructions
  - Procedure and function calls are a major bottleneck
      - Passing arguments
      - Storing and retrieving values in registers

RISC Features (8-5)
  - Examples: PowerPC, Sun SPARC, Motorola 68000 (though the 68000 family is generally regarded as a CISC design)
  - Limited and simple instruction set
  - Fixed length, fixed format instruction words
      - Enable pipelining, parallel fetches and executions
  - Limited addressing modes
      - Reduce complicated hardware
  - Register-oriented instruction set
      - Reduce memory accesses
  - Large bank of registers
      - Reduce memory accesses
      - Efficient procedure calls

CISC vs RISC Processing (8-6)

Circular Register Buffer (8-7)

Circular Register Buffer - After Procedure Call (8-8)

CISC vs RISC Performance Comparison (8-9)
  - RISC -> simpler instructions -> more instructions -> more memory accesses
  - RISC -> more bus traffic and increased cache memory misses
  - More registers would improve CISC performance, but no space is available for them
  - Modern CISC and RISC architectures are becoming similar

VLIW Architecture (8-10)
  - Transmeta Crusoe CPU
  - 128-bit instruction bundle = molecule
      - 32-bit atoms (atom = instruction)
      - Parallel processing of instructions
  - 64 general purpose registers
  - Code morphing layer
      - Translates instructions written for other CPUs into molecules
      - Instructions are not written directly for the Crusoe CPU

Two-level Caches (8-24)
  - Why do the sizes of the caches have to be different?

Cache vs Virtual Memory (8-25)
  - Cache speeds up memory access
  - Virtual memory increases the amount of perceived storage
      - Independence from the configuration and capacity of the memory system
      - Low cost per bit

Modern CPU Processing Methods (8-26)
  - Timing Issues
  - Separate Fetch/Execute Units
  - Pipelining
  - Scalar Processing
  - Superscalar Processing

Timing Issues (8-27)
  - Computer clock used for timing purposes
  - MHz – million steps per second
  - GHz – billion steps per second
  - Instructions can (and often do) take more than one step
  - Data word width can require multiple steps

Separate Fetch-Execute Units (8-28)
  - Fetch Unit
      - Instruction fetch unit
      - Instruction decode unit
          - Determine opcode
          - Identify type of instruction and operands
      - Several instructions are fetched in parallel and held in a buffer until decoded and executed
      - IP – Instruction Pointer register
  - Execute Unit
      - Receives instructions from the decode unit
      - Appropriate execution unit services the instruction

Alternative CPU Organization (8-29)

Instruction Pipelining (8-30)
  - Assembly-line technique to allow overlapping between fetch-execute cycles of sequences of instructions
  - Only one instruction is being executed to completion at a time
  - Scalar processing
      - Average instruction execution rate is approximately equal to the clock speed of the CPU (about one instruction completed per clock cycle)
  - Problems from stalling
      - Instructions have different numbers of steps
  - Problems from branching

Branch Problem Solutions (8-31)
  - Separate pipelines for both possibilities
  - Probabilistic approach
  - Requiring the following instruction to not be dependent on the branch
  - Instruction reordering (superscalar processing)

Pipelining Example (8-32)

Superscalar Processing (8-33)
  - Process more than one instruction per clock cycle
  - Separate fetch and execute cycles as much as possible
  - Buffers for fetch and decode phases
  - Parallel execution units

Superscalar CPU Block Diagram (8-34)

Scalar vs Superscalar Processing (8-35)

Superscalar Issues (8-36)
  - Out-of-order processing – dependencies (hazards)
  - Data dependencies
  - Branch (flow) dependencies and speculative execution
      - Parallel speculative execution or branch prediction
      - Branch History Table
  - Register access conflicts
      - Logical registers

Hardware Implementation (8-37)
  - Hardware – operations are implemented by logic gates
  - Advantages
      - Speed
  - RISC designs are simple and typically implemented in hardware

Microprogrammed Implementation (8-38)
  - Microcode consists of tiny programs stored in ROM that replace CPU instructions
  - Advantages
      - More flexible
      - Easier to implement complex instructions
      - Can emulate other CPUs
  - Disadvantage
      - Requires more clock cycles

... (8-20)

Step-by-Step Use of Cache (8-21)

Step-by-Step Use of Cache (8-22)

Performance Advantages
  - Hit ratios of 90% common...

... and compilers follow guidelines to ensure parallel execution of instructions (8-11)

Paging
  - Managed by the operating system
  - Built into the hardware
  - Independent of...
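
The "Performance Advantages" slide notes that hit ratios of 90% are common. As a rough illustration of why that matters, the sketch below computes an average memory access time for a single-level cache. The function name, the latency figures, and the assumption that a miss costs one cache lookup plus one main-memory access are all hypothetical choices made for this example; they do not come from the slides.

```python
def effective_access_time(hit_ratio, cache_ns, memory_ns):
    """Average access time, assuming a miss pays for the cache lookup
    and then a full main-memory access (an assumption of this sketch)."""
    if not 0.0 <= hit_ratio <= 1.0:
        raise ValueError("hit_ratio must be between 0 and 1")
    miss_ratio = 1.0 - hit_ratio
    return hit_ratio * cache_ns + miss_ratio * (cache_ns + memory_ns)


# Hypothetical latencies, chosen only for illustration.
CACHE_NS = 2.0
MEMORY_NS = 60.0

if __name__ == "__main__":
    for ratio in (0.80, 0.90, 0.99):
        t = effective_access_time(ratio, CACHE_NS, MEMORY_NS)
        print(f"hit ratio {ratio:.2f}: {t:.1f} ns average")
```

Under these assumed numbers, a 90% hit ratio brings the average access time much closer to the cache latency than to the main-memory latency, which is the point the slide is making.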
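
The slides on Instruction Pipelining and Superscalar Processing describe overlapping fetch-execute cycles and processing more than one instruction per clock cycle. The sketch below compares idealized cycle counts for the three cases. It assumes every pipeline stage takes exactly one clock cycle and that there are no stalls, branches, or dependencies, which the slides themselves list as real problems; the function names and the 4-stage, 2-wide figures are hypothetical.

```python
import math


def cycles_unpipelined(n_instructions, n_stages):
    """Each instruction runs all of its stages before the next one starts."""
    return n_instructions * n_stages


def cycles_scalar_pipeline(n_instructions, n_stages):
    """Ideal pipeline: fill it once, then complete one instruction per cycle."""
    if n_instructions == 0:
        return 0
    return n_stages + (n_instructions - 1)


def cycles_superscalar(n_instructions, n_stages, issue_width):
    """Ideal superscalar: issue_width instructions enter the pipeline per cycle."""
    if n_instructions == 0:
        return 0
    groups = math.ceil(n_instructions / issue_width)
    return n_stages + (groups - 1)


if __name__ == "__main__":
    n, stages = 100, 4  # hypothetical workload and pipeline depth
    print("unpipelined:       ", cycles_unpipelined(n, stages))
    print("scalar pipeline:   ", cycles_scalar_pipeline(n, stages))
    print("2-wide superscalar:", cycles_superscalar(n, stages, 2))
```

For long instruction sequences the scalar pipeline approaches one instruction per clock cycle, matching the slide's statement about scalar processing, while the superscalar case exceeds it. Real processors fall short of these figures because of the stalling and branching problems listed above.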

Date posted: 10/01/2018, 16:23