Author:
|
Terboven, Christian; Hahnfeld, Jonas; Teruel, Xavier; Mateo, Sergi; Duran, Alejandro; Klemm, Michael; Olivier, Stephen L.; de Supinski, Bronis R.
|
Abstract:
|
OpenMP tasking supports parallelization of irregular algorithms. Recent OpenMP specifications extended tasking to increase functionality and to support optimizations, for instance with the taskloop construct. However, task scheduling remains opaque, which leads to inconsistent performance on NUMA architectures. We assess design issues for task affinity and explore several approaches to enable it. We evaluate these proposals with implementations in the Nanos++ and LLVM OpenMP runtimes that improve performance by up to 40 % and significantly reduce execution time variation. |
Abstract:
|
Sandia National Laboratories is a multi-program laboratory managed and operated by Sandia Corporation, a wholly owned subsidiary of Lockheed Martin Corporation, for the U.S. Department of Energy's National Nuclear Security Administration under contract DE-AC04-94AL85000.
This work has been developed with the support of the grant SEV-2011-00067 of the
Severo Ochoa Program, awarded by the Spanish Government, by the Spanish Ministry
of Science and Innovation (TIN2015-65316-P, Computacion de Altas Prestaciones VII)
and by the Intel-BSC Exascale Lab collaboration project.
Some of the experiments were performed with computing resources granted by JARA-HPC
from RWTH Aachen University under project jara0001. Parts of this work were
funded by the German Federal Ministry of Education and Research (BMBF) under grant
number 01IH13008A (ELP).
Intel and Xeon are trademarks or registered trademarks of Intel Corporation or its
subsidiaries in the United States and other countries.
* Other names and brands are the property of their respective owners.
Software and workloads used in performance tests may have been optimized for performance only on Intel microprocessors. Performance tests, such as SYSmark and MobileMark, are measured using specific computer systems, components, software, operations and functions. Any change to any of those factors may cause the results to vary. You should consult other information and performance tests to assist you in fully evaluating your contemplated purchases, including the performance of that product when combined with other products. For more information go to http://www.intel.com/performance.
Intel’s compilers may or may not optimize to the same degree for non-Intel microprocessors for optimizations that are not unique to Intel microprocessors. These optimizations include SSE2, SSE3, and SSSE3 instruction sets and other optimizations. Intel does not guarantee the availability, functionality, or effectiveness of any optimization on microprocessors not manufactured by Intel. Microprocessor-dependent optimizations in this product are intended for use with Intel microprocessors. Certain optimizations not specific to Intel microarchitecture are reserved for Intel microprocessors. Please refer to the applicable product User and Reference Guides for more information regarding the specific instruction sets covered by this notice. |