Publication date

2021-01-18



Abstract

Frontend stalls caused by instruction and BTB (branch target buffer) misses are a significant source of performance degradation. Server processors commonly use prefetchers to mitigate this frontend bottleneck. However, the next-line prefetchers available in server processors cannot eliminate many L1 instruction misses. Temporal prefetchers, on the other hand, eliminate most of the misses but impose significant area overhead. Finally, BTB-directed prefetchers offer low area overhead, but because they rely on the BTB content for prefetching, BTB misses stall the prefetcher, which likely leads to costly instruction misses. In this talk, I present a divide-and-conquer approach to address the frontend bottleneck. The proposal, named SN4L+Dis+BTB, imposes the same area overhead as the state-of-the-art BTB-directed prefetcher while outperforming it by 5% on average and by up to 16%.

Document type

Conference report

Language

English

Published by

Severo Ochoa Research Seminars at BSC

Rights

http://creativecommons.org/licenses/by-nc-nd/4.0/

Open Access

Attribution-NonCommercial-NoDerivatives 4.0 International

This item appears in the following collection(s)

Conferences [11159]