University of Washington
Section 7: Memory and Caches

• Cache basics
• Principle of locality
• Memory hierarchies
• Cache organization
• Program optimizations that consider caches

General Cache Organization (S, E, B)

• S = 2^s sets
• E = 2^e lines per set
• B = 2^b bytes of data per cache line (the data block)
• Each line holds a valid bit (v), a tag, and data bytes 0 through B-1
• Cache size: S x E x B data bytes

Cache Read

Address of byte in memory:

  | tag (t bits) | set index (s bits) | block offset (b bits) |

• Locate set
• Check if any line in set has matching tag
• Yes + line valid: hit
• Locate data starting at offset (the data begins at this offset within the block)

Example: Direct-Mapped Cache (E = 1)

Direct-mapped: one line per set
Assume: cache block size 8 bytes; S = 2^s sets

Address of int:

  | tag (t bits) | set index 0…01 | block offset 100 |

• Find set: the set index bits 0…01 select the set
• Valid? + tag match?: yes = hit
• Block offset 100: the int (4 bytes) is here, starting at byte 4 of the block
• No match: old line is evicted and replaced

E-way Set-Associative Cache (Here: E = 2)

E = 2: two lines per set
Assume: cache block size 8 bytes

Address of short int:

  | tag (t bits) | set index 0…01 | block offset 100 |

• Find set: the set index bits 0…01 select the set
• Compare both lines in the selected set
• Valid? + tag match: yes = hit
• Block offset 100: the short int (2 bytes) is here
• No match:
  - One line in set is selected for eviction and replacement
  - Replacement policies: random, least recently used (LRU), …

Types of Cache Misses

• Cold (compulsory) miss
  - Occurs on first access to a block
• Conflict miss
  - Most hardware caches limit blocks to a small subset (sometimes just one) of the available cache slots
  - If one (e.g., block i must be placed in slot (i mod size)): direct-mapped
  - If more than one: n-way set-associative (where n is a power of 2)
  - Conflict misses occur when the cache is large enough, but multiple data objects all map to the same slot
  - E.g., referencing blocks 0, 8, 0, 8, ... would miss every time
• Capacity miss
  - Occurs when the set of active cache blocks (the working set) is larger than the cache (just won't fit)

What about writes?

• Multiple copies of data exist:
  - L1, L2, possibly L3, main memory
• What is the main problem with that?
• What to do on a write-hit?
  - Write-through (write immediately to memory)
  - Write-back (defer write to memory until line is evicted)
    Need a dirty bit to indicate if line is different from memory or not
• What to do on a write-miss?
  - Write-allocate (load into cache, update line in cache)
    Good if more writes to the location follow
  - No-write-allocate (just write immediately to memory)
• Typical caches:
  - Write-back + write-allocate, usually
  - Write-through + no-write-allocate, occasionally

Intel Core i7 Cache Hierarchy

Processor package:
• Each core: Regs, L1 d-cache, L1 i-cache, L2 unified cache
• L3 unified cache (shared by all cores)
Main memory

• L1 i-cache and d-cache: 32 KB, 8-way, Access: 4 cycles
• L2 unified cache: 256 KB, 8-way, Access: 11 cycles
• L3 unified cache: MB, 16-way, Access: 30-40 cycles
• Block size: 64 bytes for all caches
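
The read steps and miss types in these slides can be tried out with a small simulation. The sketch below is illustrative only and is not part of the course material. The names `split_address`, `Cache`, and `run` are hypothetical, Python is used purely for brevity, and LRU is assumed as the replacement policy (one of the options the slides list). It models reads only and stores only tags, not data. It splits an address into tag, set index, and block offset, then replays the block sequence 0, 8, 0, 8 on two caches of equal capacity (64 bytes, 8-byte blocks): one direct-mapped and one 2-way set-associative.

```python
from collections import OrderedDict


def split_address(addr, s_bits, b_bits):
    """Split an address into (tag, set index, block offset)."""
    offset = addr & ((1 << b_bits) - 1)
    set_index = (addr >> b_bits) & ((1 << s_bits) - 1)
    tag = addr >> (s_bits + b_bits)
    return tag, set_index, offset


class Cache:
    """Read-only model of an (S, E, B) cache; LRU replacement is an assumption."""

    def __init__(self, s_bits, lines_per_set, b_bits):
        self.s_bits = s_bits                # S = 2**s_bits sets
        self.lines_per_set = lines_per_set  # E lines per set
        self.b_bits = b_bits                # B = 2**b_bits bytes per block
        # One OrderedDict per set. Keys are the tags of the valid lines,
        # ordered from least recently used to most recently used.
        self.sets = [OrderedDict() for _ in range(1 << s_bits)]

    def read(self, addr):
        """Return "hit" or "miss" for a read of the byte at addr."""
        tag, set_index, _ = split_address(addr, self.s_bits, self.b_bits)
        lines = self.sets[set_index]        # locate set
        if tag in lines:                    # valid line with matching tag
            lines.move_to_end(tag)          # mark as most recently used
            return "hit"
        if len(lines) == self.lines_per_set:
            lines.popitem(last=False)       # evict the least recently used line
        lines[tag] = True                   # load the block into the set
        return "miss"


def run(cache, block_numbers):
    """Read the first byte of each numbered block and collect the results."""
    block_size = 1 << cache.b_bits
    return [cache.read(n * block_size) for n in block_numbers]


# Address 0b01100 with 4 sets and 8-byte blocks: tag 0, set 1, offset 4.
print(split_address(0b01100, 2, 3))  # (0, 1, 4)

direct_mapped = Cache(s_bits=3, lines_per_set=1, b_bits=3)  # 8 sets x 1 line
two_way = Cache(s_bits=2, lines_per_set=2, b_bits=3)        # 4 sets x 2 lines

print(run(direct_mapped, [0, 8, 0, 8]))  # ['miss', 'miss', 'miss', 'miss']
print(run(two_way, [0, 8, 0, 8]))        # ['miss', 'miss', 'hit', 'hit']
```

In the direct-mapped cache, blocks 0 and 8 both map to set 0 and keep evicting each other, so every access is a conflict miss. In the 2-way cache the two blocks share a set, so only the first access to each block (a cold miss) misses.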