ECE391 Ch1: Basics of Computer Systems (8/29/2022)

Outline:
1 History of CPUs
2 Intel x86 Processors
3 ARM processors
4 Memory
5 Computer Software
Ho Chi Minh City University of Technology, Department of Electrical and Electronics

History of CPUs

1950s:
• Ferranti Mark 1 (1951), from the University of Manchester: a single 80-bit accumulator and a 40-bit "multiplicand/quotient register"
• UNIVAC I (UNIVersal Automatic Computer I): designed principally by J. Presper Eckert and John Mauchly, the inventors of the ENIAC; 1,905 operations per second on a 2.25 MHz clock
• IBM 704 (pictured at NACA in 1957)
[Figures: Ferranti Mark 1, c. 1951; an IBM 704 computer at NACA in 1957]

1960s and 1970s:
• IBM System/360 (S/360): 34,500 instructions per second, with memory from 8 to 64 KB
• PDP-11 (1970): developed by Digital Equipment Corporation; a 16-bit processor whose later models address up to 4 MB of physical memory
• Motorola 68000 (1979): initial speed grades were 4, 6, and 8 MHz; introduced the 68k instruction set
[Figures: IBM System/360; PDP-11/40; Motorola MC68000]

1970s:
• Intel 4004 (1971): a single instruction cycle takes 10.8 microseconds; the maximum clock rate is 740 kHz
• Intel 8008 (1972) / 8080 (1974) / 8086 (1978):
  - 8008: 8-bit CPU with an external 14-bit address bus; clock frequency 0.2-0.8 MHz
  - 8080: clock frequency 2 MHz
  - 8086: clock frequency 5-10 MHz
• VAX (1977): 32-bit, based on DEC's earlier PDP-11; supports virtual memory
[Figures: Intel 4004; Intel 8088; Intel 8086]

A Brief History of Computer (YouTube link): https://www.youtube.com/watch?v=iK0PT5q7GlE

Intel x86 Processors
• Dominate the laptop/desktop/server market
• Evolutionary design
  - Backwards compatible all the way back to the 8086, introduced in 1978
  - Added more features as time went on
• Complex instruction set computer (CISC)
  - Many different instructions with many different formats
  - But only a small subset is encountered in Linux programs
  - Hard to match the performance of Reduced Instruction Set Computers (RISC)
  - But Intel has done just that!
    In terms of speed; less so for low power

Intel x86 Evolution: Milestones

Name             Date  Transistors  MHz
8086             1978  29K          5-10
    First 16-bit Intel processor; basis for IBM PC & DOS; 1 MB address space
386              1985  275K         16-33
    First 32-bit Intel processor, referred to as IA32; the 386SX variant has a 16-bit external data path; added "flat addressing", capable of running Unix
486
    32-bit registers, 32-bit data path; the 486DX includes an FPU (Floating Point Unit)
Pentium 4E       2004  125M         2800-3800
    First 64-bit Intel x86 processor, referred to as x86-64
Core 2           2006  291M         1060-3500
    First multi-core Intel processor
Core i3, i5, i7  2008  731M         1700-3900
    Two cores / four cores

Intel x86 Processors, cont.

Machine Evolution
Name         Date  Transistors
386          1985  0.3M
Pentium      1993  3.1M
Pentium/MMX  1997  4.5M
Pentium Pro  1995  6.5M
Pentium III  1999  8.2M
Pentium 4    2001  42M
Core 2 Duo   2006  291M
Core i7      2008  731M

Added Features
• Instructions to support multimedia operations
• Instructions to enable more efficient conditional operations
• Transition from 32 bits to 64 bits
• More cores

2015 State of the Art: Core i7 Broadwell (2015)
• Desktop model: 4 cores, integrated graphics, 3.3-3.8 GHz, 65 W
• Server model: 8 cores, integrated I/O, 2-2.6 GHz, 45 W

Intel x86 Processors: the 8086 processor
• 40-pin dual in-line package
• 16-bit wide data bus
• 16-bit registers
• 20-bit external address bus provides a 1 MB physical address space
• The maximum linear address space is limited to 64 KB
• Max CPU clock: 5-10 MHz

CPU - x86 Processor
• CPU, memory, input/output devices
• Instruction set, interfacing C to assembly, macros, stack frame and calling convention
• Interrupts, exceptions

The architecture of the 8086 microprocessor
Two major units:
• BIU (Bus Interface Unit): bus interface, segment registers, fetch queue
• EU (Execution Unit): control unit, ALU, registers

x86 Processors - 8086
Instructions:
• One-address or two-address operations
• Support for assembly and high-level programming languages (C, Pascal)
Main registers (also called data registers or general registers):
• Hold 16-bit data
• Each can also be accessed as two 8-bit registers

High  Low  16-bit
AH    AL   AX (primary accumulator)
BH    BL   BX (base, accumulator)
CH    CL   CX (counter, accumulator)
DH    DL   DX (accumulator, other functions)

Index registers (for addressing):
SI  Source Index
DI  Destination Index
BP  Base Pointer
SP  Stack Pointer
Program counter:
IP  Instruction Pointer
Segment registers:
CS  Code Segment
DS  Data Segment
ES  Extra Segment
SS  Stack Segment

x86 Processors - 8086: segment registers
• A way to allow programs to address more than 64 KB
• The registers CS, DS, SS, and ES point to the currently used program code segment (CS), the current data segment (DS), the current stack segment (SS), and one extra segment determined by the programmer (ES)
• The 16-bit segment is shifted 4 bits left and added to the 16-bit offset to form the 20-bit address:

      0110 1000 1000 0111 0000   Segment, 16 bits, shifted 4 bits left
    +      0011 0100 1010 1001   Offset, 16 bits
    = 0110 1011 1101 0001 1001   Address, 20 bits

[Figure: examples for x86 memory segmentation]

x86 Processors after the 8086
x86-32: 80386, 80486
• Registers extended to 32 bits: EAX, EBX, ECX, EDX, ESI, EDI, EBP, ESP, EIP, EFLAGS
• Two new segment registers (FS and GS) were added; they serve as extra data segment registers
x86-64: AMD64, Core i5, Core i7, ...
• An R prefix identifies the 64-bit registers (RAX, RBX, RCX, RDX, RSI, RDI, RBP, RSP, RFLAGS, RIP)
• Adds eight additional 64-bit general registers (R8-R15)

Some History: IA32 Registers

32-bit  16-bit  8-bit high  8-bit low  Origin (mostly obsolete)
%eax    %ax     %ah         %al        accumulate
%ecx    %cx     %ch         %cl        counter
%edx    %dx     %dh         %dl        data
%ebx    %bx     %bh         %bl        base
%esi    %si                            source index
%edi    %di                            destination index
%esp    %sp                            stack pointer
%ebp    %bp                            base pointer

• These are the general purpose registers
• The 16-bit names are virtual registers (backwards compatibility)

x86-64 Integer Registers

64-bit  32-bit    64-bit  32-bit
%rax    %eax      %r8     %r8d
%rbx    %ebx      %r9     %r9d
%rcx    %ecx      %r10    %r10d
%rdx    %edx      %r11    %r11d
%rsi    %esi      %r12    %r12d
%rdi    %edi      %r13    %r13d
%rsp    %esp      %r14    %r14d
%rbp    %ebp      %r15    %r15d

• Can reference the low-order 4 bytes (also the low-order 1 & 2 bytes)

ARM Processors
• ARM (Acorn RISC Machine) started as a
new, powerful CPU design to replace the 8-bit 6502 in Acorn Computers (Cambridge, UK, 1985)
• First models had only a 26-bit program counter, limiting the memory space to 64 MB (not much by today's standards, but a lot at that time)
• 1990 spin-off: ARM renamed Advanced RISC Machines

ARM Processors
• ARM now focuses on embedded CPU cores
• IP licensing: almost every silicon manufacturer sells some microcontroller with an ARM core; some even compete with their own designs
• Processing power with low current consumption
  - Good MIPS/Watt figure
  - Ideal for portable devices
• Compact memories: 16-bit opcodes (Thumb)
• New cores with added features
  - Harvard architecture (ARM9, ARM11, Cortex)
  - Floating point arithmetic
  - Vector computing
  - Java language

ARM Processors
• 32-bit CPU, Harvard architecture (ARM9s and newer; ARM7s and older are Von Neumann)
• 3-operand instructions (typical): ADD Rd,Rn,Operand2
• RISC design:
  - Few, simple instructions
  - Load/store architecture (instructions operate on registers, not memory)
  - Large register set
  - Pipelined execution

Von Neumann vs. Harvard
• Von Neumann (ARM7s and older): instructions and data share one AHB bus to memory and I/O
• Harvard (ARM9s and newer): separate instruction (I) and data (D) caches, connected through a bus interface to the AHB bus, memory and I/O
• Memory-mapped I/O:
  - No specific instructions for I/O (use load/store instructions instead)
  - Peripherals' registers sit at some memory addresses

Pipelines (one stage per clock cycle)
• ARM7TDMI pipeline: FETCH, DECODE, EXECUTE (Reg Read, Shift, ALU, Reg Write)
• ARM9TDMI pipeline: FETCH, DECODE (Reg Read), EXECUTE (Shift, ALU), MEMORY access, WRITE (Reg Write)
• Fetch: read the opcode from memory into the internal instruction register
• Decode: activate the appropriate control lines depending on the opcode
• Execute: do the actual processing

[Figure: three instructions overlapped in the FETCH/DECODE/EXECUTE pipeline along the time axis]
• Simple instructions (like ADD) complete at a rate of one per cycle
• More complex instructions:
[Figure: pipeline timing for the sequence ADD, STR, ADD, ADD; STR needs an address calculation and a data transfer, so the instructions behind it stall]
STR: 2 effective clock cycles (+1 cycle)

Data Sizes and Instruction Sets
• The ARM is a 32-bit architecture
• When used in relation to the ARM:
  - Byte means 8 bits
  - Halfword means 16 bits (two bytes)
  - Word means 32 bits (four bytes)
• Most ARMs implement two instruction sets
  - 32-bit ARM Instruction Set
  - 16-bit Thumb Instruction Set

Processor Modes
The ARM has seven operating modes:
• User: unprivileged mode under which most tasks run
• FIQ: entered when a high priority (fast) interrupt is raised
• IRQ: entered when a low priority (normal) interrupt is raised
• SVC (Supervisor): entered on reset and when a Software Interrupt instruction is executed
• Abort: used to handle memory access violations
• Undef: used to handle undefined instructions
• System: privileged mode using the same registers as user mode

The Registers
• ARM has 37 registers, all of which are 32 bits long
  - 1 dedicated program counter
  - 1 dedicated current program status register
  - 5 dedicated saved program status registers
  - 30 general purpose registers
• The current processor mode governs which of several banks is accessible. Each mode can access
  - a particular set of r0-r12 registers
  - a particular r13 (the stack pointer, sp) and r14 (the link register, lr)
  - the program counter, r15 (pc)
  - the current program status register, cpsr
• Privileged modes (except System) can also access a particular spsr (saved program status register)

The ARM Register Set
• Current visible registers in any mode: r0-r12, r13 (sp), r14 (lr), r15 (pc), cpsr, plus spsr in every mode except User and System
• Banked registers (each mode's private copies; the other modes' copies are banked out):

Mode           Banked registers
User / System  none (share one set)
FIQ            r8-r12, r13 (sp), r14 (lr), spsr
IRQ            r13 (sp), r14 (lr), spsr
SVC            r13 (sp), r14 (lr), spsr
Undef          r13 (sp), r14 (lr), spsr
Abort          r13 (sp), r14 (lr), spsr

Special Registers
Special function registers:
• PC (R15): Program Counter. Any instruction with PC as
its destination register is a program branch
• LR (R14): Link Register. Saves a copy of PC when executing the BL instruction (subroutine call) or when jumping to an exception or interrupt routine
  - It is copied back to PC on the return from those routines
• SP (R13): Stack Pointer. The ARM architecture itself defines no stack; even so, R13 is usually reserved as a pointer for the program-managed stack
• CPSR: Current Program Status Register. Holds the visible status register
• SPSR: Saved Program Status Register. Holds a copy of the previous status register while executing exception or interrupt routines
  - It is copied back to CPSR on the return from the exception or interrupt
  - No SPSR is available in User or System modes

Memory
The purpose of memory is data storage. There are two major types of memory:
• Primary memory: holds data and instructions during processing, e.g. RAM
  - Relatively limited capacity and volatile
• Secondary memory: provides permanent long-term storage, e.g. hard disk
  - High capacity and non-volatile
[Figures: RAM banks; hard disk; NAND flash chip]

Memory
• Primary memory consists of a set of locations defined by sequentially numbered addresses
• Each location contains a binary number that can be interpreted as data or an instruction
• The 8086 uses a 20-bit physical address: it can manage 1 MB of memory
• The 80386 uses a 32-bit physical address: it can manage 4 GB of memory
• x86-64 uses a 64-bit physical address: it can manage ???
of memory

Memory
• Memory locations are called words
• Words are 8 bits (one byte) in size, or a multiple of 8
• Common word sizes are 16, 32 and 64 bits
[Figure: memory locations, using an 8-bit word]

Memory
Memory is commonly measured in multiples of bits and bytes:
• bit = binary digit (0 or 1)
• 1 byte = 8 bits
• 1 KB = 1024 bytes = 2^10 bytes
• 1 MB = 1024 KB = 2^20 bytes
• 1 GB = 1024 MB = 2^30 bytes
• 1 TB = 1024 GB = 2^40 bytes

Big Endian vs Little Endian
• x86 processors are little-endian
• IBM z/Architecture mainframes are big-endian processors
• Example: storing the register value FE ED FA CE

Address  Big Endian (Others)  Little Endian (Intel)
0x5      00                   00                     (high memory addresses)
0x4      00                   00
0x3      CE                   FE
0x2      FA                   ED
0x1      ED                   FA
0x0      FE                   CE                     (low memory addresses)

Computer Software

Assembly/Machine Code View
[Figure: the CPU (PC, registers, condition codes) sends addresses to memory (code, data, stack) and exchanges data and instructions with it]
Programmer-visible state:
• PC: program counter
  - Address of next instruction
  - Called "RIP" (x86-64)
• Register file
  - Heavily used program data
• Condition codes
  - Store status information about the most recent arithmetic or logical operation
  - Used for conditional branching
• Memory
  - Byte-addressable array
  - Code and user data
  - Stack to support procedures

Compiling Into Assembly
C code (sum.c):

    long plus(long x, long y);

    void sumstore(long x, long y, long *dest)
    {
        long t = plus(x, y);
        *dest = t;
    }

Generated x86-64 assembly:

    sumstore:
        pushq   %rbx
        movq    %rdx, %rbx
        call    plus
        movq    %rax, (%rbx)
        popq    %rbx
        ret

• Obtain (on shark machine) with the command: gcc -Og -S sum.c
• Produces file sum.s
• Warning: you will get very different results on non-Shark machines (Andrew Linux, Mac OS-X, ...) due to different versions of gcc and different compiler settings

Quiz
1) Pick the correct choice for the 8086 CPU
   A. 16 bit word size, bit data path
   B. bit word size, bit data path
   C. 16 bit word size, 16 bit data path
   D. bit word size, bit data path
   E. bit word size, 16 bit data path
2) Pick the correct choice for the 80386SX CPU
   A. 16 bit
word size, 16 bit data path
   B. 32 bit word size, 16 bit data path
   C. bit word size, 32 bit data path
   D. 32 bit word size, bit data path
   E. 32 bit word size, 32 bit data path
3) Pick the correct choice for the 80486DX CPU
   A. 32 bit word size, 16 bit data path
   B. 64 bit word size, 32 bit data path
   C. 32 bit word size, 32 bit data path
   D. 32 bit word size, 16 bit data path
   E. 32 bit word size, 64 bit data path
4) What is the first CPU to include an internal math coprocessor?
   A. 386DX
   B. 486SX
   C. 486DX
   D. Pentium
5) What are the two main components of the CPU?
   A. The Control Unit and ALU
   B. The Registers and Output/Input management
   C. The ALU and FPU
6) What are the two main desktop CPU manufacturers?
   A. Intel and AMD
   B. Via and Power PC
   C. Marek and Sun UltraSparc
7) What is the 32-bit value read when a double-word is read at address 0x4000 in Big Endian mode?

   Address  Content
   0x4000   2F
   0x4001   65
   0x4002   7E
   0x4003   AC

   A. 0xAC7E652F
   B. 0x2F657EAC
   C. 0xCAE756F2
8) Pick the correct choice for the ARM processor
   A. 16 bit word size, 16 bit data path
   B. 32 bit word size, 16 bit data path
   C. bit word size, 32 bit data path
   D. 32 bit word size, bit data path
   E. 32 bit word size, 32 bit data path
9) Pick the wrong choice for ARM architecture
   A. Von Neumann architecture
   B. Harvard architecture
   C. stage pipeline architecture
   D. 32-bit ARM Instruction Set
10) Pick the wrong choice for ARM registers
   A. ARM has 37 32-bit registers
   B. There are 13 general purpose registers
   C. R13 is Stack Pointer
   D. R14 is the program counter

Exercises
• Suppose that you discover that RAM addresses 000C0000 to 000C7FFF are reserved for a PC's video adapter. How many bytes of memory is this?
• Suppose that you have an Intel 8086. Find the five-hex-digit address that corresponds to each of these segment:offset pairs:
  (a) 2B8C:8D21
  (b) 059A:7A04
  (c) 1234:5678
• In an 8086 program, suppose that the data segment register DS contains the segment number 23D1 and that an instruction fetches a word at offset 7B86 in the data segment. What is the five-hex-digit address of the word that is fetched?
• In an 8086 program, suppose that the code segment register CS contains the segment number 014C and that the instruction pointer IP contains 15FE. What is the five-hex-digit address of the next instruction to be fetched?
• What are the advantages and disadvantages of secondary memory?
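Worked sketch (not part of the original slides): the two address computations used in this chapter, 8086 segment:offset translation and byte ordering, can be checked with a few lines of Python. The function names physical_address and read_u32 are hypothetical helpers chosen for illustration. The sample values are the ones from the segmentation and endianness slides, not the exercise values.

```python
def physical_address(segment: int, offset: int) -> int:
    """8086 address formation: shift the 16-bit segment left by 4 bits
    and add the 16-bit offset, giving a 20-bit physical address."""
    if not (0 <= segment <= 0xFFFF and 0 <= offset <= 0xFFFF):
        raise ValueError("segment and offset must be 16-bit values")
    # The 8086 has only 20 address lines, so the sum wraps around at 1 MB.
    return ((segment << 4) + offset) & 0xFFFFF


def read_u32(memory: bytes, byteorder: str) -> int:
    """Interpret the first four bytes of memory (lowest address first)
    as an unsigned 32-bit value; byteorder is "big" or "little"."""
    return int.from_bytes(memory[:4], byteorder)


# Segmentation slide: segment 0x6887, offset 0x34A9.
print(f"{physical_address(0x6887, 0x34A9):05X}")

# Endianness slide: the register value 0xFEEDFACE stored in memory.
big_layout = bytes([0xFE, 0xED, 0xFA, 0xCE])     # addresses 0x0..0x3, big-endian
little_layout = bytes([0xCE, 0xFA, 0xED, 0xFE])  # addresses 0x0..0x3, little-endian
print(f"{read_u32(big_layout, 'big'):08X}")
print(f"{read_u32(little_layout, 'little'):08X}")
```

The same helpers can be applied to the exercise values; reading a byte sequence with the wrong byte order returns the bytes reversed, which is a quick way to see why the two layouts differ.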