Abstract:
|
Portable devices often demand powerful processors to run compute-intensive applications, such as video playback or gaming, together with ultra-low energy consumption to extend device uptime. These conflicting requirements are hard to satisfy at once and call for adaptive hardware that consumes energy only when required. LP-NUCA is a tiled cache organization aimed at high-performance, low-power embedded processors; it looks up blocks sequentially in groups of small tiles, with blocks ordered by temporal locality. Unfortunately, LP-NUCA has two main sources of wasted dynamic energy: (a) blocks migrate continuously among tiles even during low-locality phases, and (b) to reduce cache latency, the tag and data arrays of the tiles are always accessed in parallel. This paper proposes a learning-based controller that dynamically tunes block migration and switches the cache access policy between parallel and serial. During phases of low temporal locality, the controller drops blocks from the LP-NUCA root tile (L1) and forces serial access to the tag and data arrays in the tiles, thus reducing the energy waste. Evaluated with a cycle-accurate simulator and energy estimates derived from an LP-NUCA layout, the proposed controller reduces dynamic energy by 20% on average for single- and multi-threaded workloads. |