Universitat Politècnica de Catalunya. Departament d'Arquitectura de Computadors
Barcelona Supercomputing Center
Universitat Politècnica de Catalunya. VIRTUOS - Virtualisation and Operating Systems
2020-03-01
Some high performance computing (HPC) applications exhibit increasing real-time requirements, which call for effective means to predict the distribution of their high execution times. This is a new challenge for HPC applications but a well-known problem for real-time embedded applications, where solutions already exist, although they target low-performance systems running single-threaded applications. In this paper, we show how some performance validation and measurement-based practices for real-time execution time prediction can be leveraged in the context of HPC applications on high-performance platforms, thus enabling reliable means to obtain real-time guarantees for those applications. In particular, the proposed methodology combines techniques that randomly explore the potential timing behavior of the application with Extreme Value Theory (EVT) to predict rare (and high) execution times and, eventually, derive probabilistic Worst-Case Execution Time (pWCET) curves. We demonstrate the effectiveness of this approach for an acoustic wave inversion application used for geophysical exploration.
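As a rough illustration of how measured execution times can be turned into a pWCET estimate, the sketch below fits an exponential tail to the largest observations (a peaks-over-threshold simplification of EVT) and inverts it for a target exceedance probability. The exponential tail, the function names `fit_exponential_tail` and `pwcet`, the `tail_fraction` parameter, and the synthetic measurements are all assumptions made for illustration; they are not taken from the paper and may differ from the method it actually applies.

```python
import math
import random


def fit_exponential_tail(samples, tail_fraction=0.1):
    """Fit an exponential tail to the largest execution times.

    Returns (threshold, scale, tail_rate), where tail_rate is the observed
    fraction of runs above the threshold. Assumes an exponential tail,
    which is a simplification chosen for this sketch.
    """
    ordered = sorted(samples)
    n_tail = max(2, int(len(ordered) * tail_fraction))
    if n_tail >= len(ordered):
        raise ValueError("not enough samples for the requested tail")
    threshold = ordered[-n_tail - 1]
    excesses = [x - threshold for x in ordered[-n_tail:]]
    scale = sum(excesses) / len(excesses)  # maximum-likelihood estimate
    return threshold, scale, n_tail / len(ordered)


def pwcet(threshold, scale, tail_rate, exceedance_prob):
    """Execution time exceeded with probability exceedance_prob per run.

    Inverts P(X > t) = tail_rate * exp(-(t - threshold) / scale).
    """
    if not 0 < exceedance_prob <= tail_rate:
        raise ValueError("exceedance_prob must lie in (0, tail_rate]")
    return threshold + scale * math.log(tail_rate / exceedance_prob)


if __name__ == "__main__":
    # Synthetic stand-in for measured execution times (not real data).
    random.seed(0)
    times = [100.0 + random.expovariate(1.0) for _ in range(5000)]
    threshold, scale, rate = fit_exponential_tail(times)
    for p in (1e-3, 1e-6, 1e-9):
        print(f"p={p:g}: pWCET estimate {pwcet(threshold, scale, rate, p):.2f}")
```

Evaluating `pwcet` over a range of exceedance probabilities yields the points of a pWCET curve. In practice, the measurements would also need to pass independence and identical-distribution checks before a tail model of this kind is trusted.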
This research was funded by the Horizon 2020 Framework Programme, grant number 801137, project RECIPE
Peer Reviewed
Postprint (published version)
Article
English
UPC subject areas::Computer science::Computer architecture; High performance computing; Embedded computer systems; Real-time data processing; WCET; Probabilistic timing analysis; Randomization; Measurement-based; HPC applications
Multidisciplinary Digital Publishing Institute (MDPI)
https://www.mdpi.com/2227-7390/8/3/314#cite
info:eu-repo/grantAgreement/EC/H2020/801137/EU/REliable power and time-ConstraInts-aware Predictive management of heterogeneous Exascale systems/RECIPE
https://creativecommons.org/licenses/by/4.0/
Open Access
Attribution 4.0 International