
Advanced Computer Architecture - Lecture 27: Memory hierarchy design


DOCUMENT INFORMATION

Basic information

Title: Memory Hierarchy Design
Instructor: Prof. Dr. M. Ashraf Chughtai
School: mac/vu
Major: advanced computer architecture
Type: lecture
Format:
Number of pages: 54
File size: 1.61 MB

Content

Advanced Computer Architecture - Lecture 27: Memory hierarchy design. This lecture will cover the following: cache design techniques; cache performance metrics; cache designs; addressing techniques; CPU clock cycles; memory stall cycles; block size tradeoff; categories of cache organization;...

CS 704 Advanced Computer Architecture
Lecture 27: Memory Hierarchy Design (Cache Design Techniques)
Prof. Dr. M. Ashraf Chughtai
MAC/VU-Advanced Computer Architecture Lecture 27 Memory Hierarchy (3)

Today's Topics
• Recap: Caching and Locality
• Cache Performance Metrics
• Cache Designs
• Addressing Techniques
• Summary

Recap: Memory Hierarchy Principles
The goal is high-speed storage at the cheapest cost per byte. Different types of memory modules are organized in a hierarchy, based on:
• the concept of caching
• the principle of locality

Recap: Concept of Caching
A small, fast, and expensive storage is used as a staging area, or temporary place, to:
• store a frequently used subset of the data or instructions from the relatively cheaper, larger, and slower memory; and
• avoid having to go to main memory every time this information is needed.

Recap: Principle of Locality
To obtain the data or instructions of a program, the processor accesses a relatively small portion of the address space at any instant of time.

Recap: Types of Locality
There are two different types of locality:
• Temporal locality
• Spatial locality

Recap: Working of Memory Hierarchy
• The memory hierarchy keeps the more recently accessed data items closer to the processor, because chances are the processor will access them again soon.
• We move NOT ONLY the item that has just been accessed closer to the processor, but ALSO the data items adjacent to it.

Recap: Cache Devices
• A cache device is a small SRAM that is made directly accessible to the processor.
• The cache sits between normal main memory and the CPU, as data and instruction caches, and may be located on the CPU chip or as a separate module.
• Data transfer between the cache and the CPU, and between the cache and main memory, is performed by the cache controller.
• Cache and main memory are organized in equal-sized blocks.

Recap: Cache/Main Memory Data Transfer
• An address tag is associated with each cache block; it defines the relationship of the cache block with the higher-level memory (say, main memory).
• Data transfer between the CPU and the caches takes place as word transfers.
• Data transfer between the cache and main memory takes place as block transfers.

...

Block Size Tradeoff: Miss Penalty
[Figure: miss penalty plotted against block size]

...

Associative Mapping: Address Structure
Address fields: Tag (22 bits) | Word (2 bits)
• A 22-bit tag is stored with each 32-bit block of data.
• The tag field is compared with the tag entries in the cache to check for a hit.
• The least significant bits of the address (the word field) identify which 16-bit word is required from the 32-bit data block.
• Example:

Address   Tag      Data       Cache line
FFFFFC    3FFFFF   24682468   3FFF

(The tag is the upper 22 bits of the 24-bit address.)

Fully Associative
[Figure: Fully associative cache. The address is split into a 27-bit cache tag (starting at address bit 31) and a byte select field (example: 0x01). Each entry holds a valid bit, a cache tag, and a 32-byte block of cache data (Byte 0 to Byte 31, Byte 32 to Byte 63, and so on).]

Fully Associative Cache Organization
• The address is sent to all entries at once and compared in parallel; only the entry that matches is sent to the output. This is called an associative lookup.
• It is hardware intensive.
• A fully associative cache is limited to 64 or fewer entries.
• A fully associative cache has no conflict misses.
• Assume we have 64 entries. The first 64 items we access can fit in, but when we try to bring in the 65th item, we need to throw one of them out to make room for the new item. This brings us to cache misses of the type Capacity Miss.

Set Associative Mapping Summary
• Address length = (s + w) bits
• Number of addressable units = 2^(s+w) words or bytes
• Block size = line size = 2^w words or bytes
• Number of blocks in main memory = 2^s
• Number of lines in set = k
• Number of sets = v = 2^d
• Number of lines in cache = kv = k * 2^d
• Size of tag = (s - d) bits

Set Associative Mapping
• The cache is divided into a number of sets.
• Each set contains a number of lines.
• A given block maps to any line in a given set; e.g., block B can be in any line of set i.
• With 2 lines per set we have 2-way associative mapping: a given block can be in one of 2 lines in only one set.

Set Associative Cache
• This organization allows a block to be placed in a restricted set of places in the cache, where a set is a group of blocks in the cache at each index value.
• A block is first mapped onto a set (i.e., mapped at an index value), and then it can be placed anywhere within that set.
• The set is usually chosen by bit selection, i.e., (block address) MOD (number of sets in cache).
• If there are n blocks in a set, the cache placement is referred to as n-way set associative.

Set Associative Mapping Address Structure
Address fields: Tag (9 bits) | Set (13 bits) | Word (2 bits)
• Use the set field to determine which cache set to look in.
• Compare the tag field to see if we have a hit.
• Example:

Address     Tag   Data       Set number
1FF 7FFC    1FF   12345678   1FFF
001 7FFC    001   11223344   1FFF

Two Way Set Associative Mapping Example

Two-way Set Associative Cache
Consider this example of a 2-way set associative cache. Two cache entries are possible for each index, i.e., two direct mapped caches are working in parallel.
[Figure: Two-way set associative cache. The cache index selects one entry (valid bit, cache tag, cache data block) from each of the two ways. Two comparators check the address tag against both cache tags; their outputs (Sel0, Sel1) are ORed to produce the Hit signal and also drive the multiplexer that selects the cache block.]

Working of Two-way Set Associative Cache
Let us see how it works:
• The cache index selects a set from the cache. The two tags in the set are compared in parallel with the upper bits of the memory address.
• If neither tag matches the incoming address tag, we have a cache miss.
• Otherwise, we have a cache hit, and we select the data on the side where the tag match occurs.
This is simple enough. What are its disadvantages?

Disadvantage of Set Associative Cache
• First of all, an N-way set associative cache needs N comparators instead of just one (use the right side of the diagram for the direct mapped cache).
• An N-way set associative cache is also slower than a direct mapped cache because of the extra multiplexer delay.
• Finally, for an N-way set associative cache, the data is available only AFTER the hit/miss signal becomes valid, because the hit/miss signal is needed to control the data MUX.
• For a direct mapped cache, that is, everything before the MUX on the right or left side, the cache block is available BEFORE the hit/miss signal (the AND gate output), because the data does not have to go through the comparator.
• This can be an important consideration, because the processor can go ahead and use the data without knowing whether it is a hit or a miss.
• If the processor assumes a hit, it will be ahead more than 90% of the time, because cache hit rates are in the upper 90% range; for the remaining time, when the guess is wrong, it just has to make sure that it can recover.
• We cannot play this speculation game with an N-way set associative cache because, as said earlier, the data is not available until the hit/miss signal is valid.

Allah Hafiz and Aslam-U-Alacum
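The set associative address split and the two-way lookup described in the slides can be sketched in a few lines of Python. This is only an illustrative model, not the lecture's own code: the 9-bit tag and 2-bit word widths are assumptions derived from the 24-bit example addresses and the 13-bit set field, the names `split_address`, `Line`, and `TwoWayCache` are hypothetical, and the sequential loop stands in for the two comparators that hardware would run in parallel.

```python
from dataclasses import dataclass

# Assumed field widths for a 24-bit address: 9-bit tag, 13-bit set, 2-bit word.
TAG_BITS, SET_BITS, WORD_BITS = 9, 13, 2


def split_address(addr: int) -> tuple[int, int, int]:
    """Return (tag, set_index, word) for a 24-bit address."""
    word = addr & ((1 << WORD_BITS) - 1)
    set_index = (addr >> WORD_BITS) & ((1 << SET_BITS) - 1)
    tag = addr >> (WORD_BITS + SET_BITS)
    return tag, set_index, word


@dataclass
class Line:
    valid: bool = False
    tag: int = 0
    data: int = 0


class TwoWayCache:
    """Hypothetical 2-way set associative cache model (no replacement policy)."""

    def __init__(self, ways: int = 2):
        self.sets = [[Line() for _ in range(ways)] for _ in range(1 << SET_BITS)]

    def fill(self, addr: int, data: int, way: int) -> None:
        # Place a block in the chosen way of the set selected by the index.
        tag, set_index, _ = split_address(addr)
        self.sets[set_index][way] = Line(True, tag, data)

    def lookup(self, addr: int):
        # The index selects a set; every tag in that set is compared.
        tag, set_index, _ = split_address(addr)
        for line in self.sets[set_index]:
            if line.valid and line.tag == tag:
                return line.data  # hit: data from the matching way
        return None  # miss: no tag in the set matched


cache = TwoWayCache()
cache.fill(0xFFFFFC, 0x12345678, way=0)  # address written as "1FF 7FFC" in the slides
cache.fill(0x00FFFC, 0x11223344, way=1)  # address written as "001 7FFC" in the slides
```

Both example addresses map to set 1FFF but carry different tags (1FF and 001), so they can sit side by side in the two ways of that set; a direct mapped cache would have to evict one to hold the other.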

Date posted: 05/07/2022, 11:54