Para acceder a los documentos con el texto completo, por favor, siga el siguiente enlace: http://hdl.handle.net/2117/104446

Coarse grain parallelization of deep neural networks
González Tallada, Marc
Universitat Politècnica de Catalunya. Departament d'Arquitectura de Computadors; Universitat Politècnica de Catalunya. CAP - Grup de Computació d'Altes Prestacions
Deep neural networks (DNN) have recently achieved extraordinary results in domains like computer vision and speech recognition. An essential element for this success has been the introduction of high performance computing (HPC) techniques in the critical step of training the neural network. This paper describes the implementation and analysis of a network-agnostic and convergence-invariant coarse-grain parallelization of the DNN training algorithm. The coarse-grain parallelization is achieved through the exploitation of the batch-level parallelism. This strategy is independent from the support of specialized and optimized libraries. Therefore, the optimization is immediately available for accelerating the DNN training. The proposal is compatible with multi-GPU execution without altering the algorithm convergence rate. The parallelization has been implemented in Caffe, a state-of-the-art DNN framework. The paper describes the code transformations for the parallelization and we also identify the limiting performance factors of the approach. We show competitive performance results for two state-of-the-art computer vision datasets, MNIST and CIFAR-10. In particular, on a 16-core Xeon E5-2667v2 at 3.30GHz we observe speedups of 8x over the sequential execution, at similar performance levels of those obtained by the GPU optimized Caffe version in a NVIDIA K40 GPU.
Peer Reviewed
Àrees temàtiques de la UPC::Informàtica::Programació
Neural networks (Computer science)
Parallel programming (Computer science)
Performance
Coarse-grain parallelism
Shared memory algorithms
Deep learning
OpenMP
Stochastic gradient descent
Xarxes neuronals (Informàtica)
Programació en paral·lel (Informàtica)
info:eu-repo/semantics/publishedVersion
info:eu-repo/semantics/conferenceObject
Institute of Electrical and Electronics Engineers (IEEE)
         

Mostrar el registro completo del ítem

Documentos relacionados

Otros documentos del mismo autor/a

Ferrer, Roger; Planas Carbonell, Judit; Bellens, Pieter; Duran González, Alejandro; González Tallada, Marc; Martorell Bofill, Xavier; Badia Sala, Rosa Maria; Ayguadé Parra, Eduard; Labarta Mancho, Jesús José
Balart, J; González Tallada, Marc; Martorell Bofill, Xavier; Ayguadé Parra, Eduard; Sura, Z; Chen, T; Zhang, T; O'Brien, Kevin; O'Brien, Kathryn
Ferrer, Roger; Planas Carbonell, Judit; Bellens, Pieter; Duran Gonzalez, Alejandro; González Tallada, Marc; Martorell Bofill, Xavier; Badia Sala, Rosa Maria; Ayguadé Parra, Eduard; Labarta Mancho, Jesús José
Quintana-Ortí, Enrique S.; Planas, Judit; Pérez Cáncer, Josep Maria; Mayo, Rafael; Martorell Bofill, Xavier; Martinell, Lluis; Labarta Mancho, Jesús José; Jiménez González, Daniel; Ayguadé Parra, Eduard; Badia Sala, Rosa Maria; Bellens, Pieter; Cabrera, Daniel; Duran González, Alejandro; Ferrer, Roger; González Tallada, Marc; Igual, Francisco D.
Vujic, N; González Tallada, Marc; Martorell Bofill, Xavier; Ayguadé Parra, Eduard