To access the full-text documents, please follow this link: http://hdl.handle.net/2117/96856

POSTER: collective dynamic parallelism for directive based GPU programming languages and compilers
Ozen, Guray; Ayguadé Parra, Eduard; Labarta Mancho, Jesús José
Universitat Politècnica de Catalunya. Departament d'Arquitectura de Computadors; Universitat Politècnica de Catalunya. CAP - Grup de Computació d'Altes Prestacions
Early programs for GPU (Graphics Processing Unit) acceleration were based on a flat, bulk-parallel programming model, in which programs had to perform a sequence of kernel launches from the host CPU. The latest releases of these devices support dynamic (or nested) parallelism, making it possible to launch kernels from threads running on the device, without host intervention. Unfortunately, launching kernels from the device incurs higher overhead than launching them from the host CPU, which makes the exploitation of dynamic parallelism unprofitable. This paper proposes and evaluates the basic idea behind a user-directed code transformation technique, named collective dynamic parallelism (CollectiveDP), that targets the effective exploitation of nested parallelism in modern GPUs. The technique packs dynamic parallelism kernel invocations at run time and postpones their execution until a batch of them is available. We show that for sparse matrix-vector multiplication, CollectiveDP outperforms well-optimized libraries, making GPUs useful when matrices are highly irregular.
Peer Reviewed
-UPC subject areas::Computer science::Computer architecture::Parallel architectures
-Parallel processing (Electronic computers)
-Application programming interfaces (API)
-Computer graphics
-Concurrency control
-Cosine transforms
-Memory architecture
-Parallel architectures
-Parallel programming
-Program processors
-CUDA
-Graphics Processing Unit
-Languages and compilers
-Nested Parallelism
-OpenACC
-OpenMP
-Parallel programming model
-Sparse matrix-vector multiplication
-Parallel processing (Computers)
Article - Published version
Conference object
Association for Computing Machinery (ACM)