Abstract:
The increasing speed gap between processor and memory and the limited memory bandwidth make last-level cache performance crucial for CMP architectures. Non-Uniform Cache Architectures (NUCA) have been introduced to deal with this problem. This memory organization divides the whole memory space into smaller pieces, or banks, allowing nearer banks to have better access latencies than farther banks. Moreover, an adaptive replacement policy that efficiently reduces misses in the last-level cache could boost performance, particularly if set associativity is assumed. Unfortunately, traditional replacement policies do not behave properly, as they were designed for single processors. This paper focuses on Bank Replacement. This policy involves three key decisions when there is a miss: where to place the data within the cache set, which data to evict from the cache set, and, finally, where to place the evicted data. We propose a novel replacement technique that enables more intelligent replacement decisions to be taken, based on the observation that some types of data are less commonly accessed depending on the bank where they reside. We call this technique LRU-PEA (Least Recently Used with a Priority Eviction Approach). We show that the proposed technique significantly reduces the requests to the off-chip memory by increasing the hit ratio in the NUCA cache. This translates into an average IPC improvement of 8% and an Energy per Instruction (EPI) reduction of 5%.
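
To make the three bank-replacement decisions concrete, the following is a minimal Python sketch of an LRU policy with a per-bank eviction priority. It is an illustration under stated assumptions, not the LRU-PEA algorithm itself: the abstract does not specify the data categories, the priority order, or the placement rules. The category names `type_a` and `type_b`, the `BankSet` class, the `insert` function, the choice to place incoming data in a caller-supplied entry bank, and the choice to move evicted data to the next farther bank are all hypothetical.

```python
from collections import OrderedDict


class BankSet:
    """One cache set inside one NUCA bank, kept in LRU order (oldest first)."""

    def __init__(self, ways):
        self.ways = ways
        self.lines = OrderedDict()  # tag -> category (hypothetical labels)

    def touch(self, tag):
        # On a hit, mark the line as most recently used.
        self.lines.move_to_end(tag)

    def pick_victim(self, eviction_priority):
        # Decision 2: which data to evict. Scan categories from most to
        # least evictable and return the oldest line of the first match.
        for category in eviction_priority:
            for tag, cat in self.lines.items():  # iterates oldest first
                if cat == category:
                    return tag
        return next(iter(self.lines))  # fall back to plain LRU


def insert(banks, bank_idx, tag, category, priority):
    """Handle a miss. Returns the tags that were pushed off-chip.

    banks    -- list of BankSet, ordered from nearest to farthest (assumed)
    bank_idx -- decision 1: bank where the incoming data is placed (assumed
                to be chosen by the caller)
    priority -- per-bank list of categories, most evictable first (assumed)
    """
    off_chip = []
    while bank_idx < len(banks):
        bank = banks[bank_idx]
        victim = None
        if len(bank.lines) >= bank.ways:
            victim_tag = bank.pick_victim(priority[bank_idx])
            victim = (victim_tag, bank.lines.pop(victim_tag))
        bank.lines[tag] = category  # inserted in the MRU position
        if victim is None:
            return off_chip
        # Decision 3: where to place the evicted data. Here it is demoted
        # to the next farther bank (an assumption of this sketch).
        tag, category = victim
        bank_idx += 1
    off_chip.append(tag)  # no farther bank left: the line leaves the cache
    return off_chip
```

In this sketch, a bank whose priority list starts with `type_a` evicts its oldest `type_a` line even when a `type_b` line is older, which is the sense in which eviction is priority-driven rather than purely LRU.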