Publication date

2021-01-18



Abstract

Frontend stalls caused by instruction misses and branch target buffer (BTB) misses are a significant source of performance degradation. Server processors commonly use prefetchers to mitigate this frontend bottleneck. However, the next-line prefetchers available in server processors cannot eliminate many L1 instruction misses. Temporal prefetchers, on the other hand, eliminate most of these misses but impose significant area overhead. Finally, BTB-directed prefetchers offer low area overhead, but because they rely on BTB content for prefetching, a BTB miss stalls the prefetcher, which likely leads to costly instruction misses. In this talk, I present a divide-and-conquer approach to address the frontend bottleneck. The proposal, named SN4L+Dis+BTB, imposes the same area overhead as the state-of-the-art BTB-directed prefetcher while outperforming it by 5% on average and by up to 16%.

Document Type

Conference report

Language

English

Publisher

Severo Ochoa Research Seminars at BSC

Rights

http://creativecommons.org/licenses/by-nc-nd/4.0/

Open Access

Attribution-NonCommercial-NoDerivatives 4.0 International
