To access the full-text document, please follow this link: http://hdl.handle.net/2117/125141

The dark side of DNN pruning
Yazdani Aminabadi, Reza; Arnau Montañés, José María; González Colás, Antonio María; Riera Villanueva, Marc
Universitat Politècnica de Catalunya. Departament d'Arquitectura de Computadors; Universitat Politècnica de Catalunya. Departament d'Enginyeria Minera, Industrial i TIC; Universitat Politècnica de Catalunya. ARCO - Microarquitectura i Compiladors; Universitat Politècnica de Catalunya. CERCLE - Cercle d'Arquitectura
DNN pruning has recently been proposed as an effective technique to improve the energy efficiency of DNN-based solutions. It is claimed that by removing unimportant or redundant connections, the pruned DNN delivers higher performance and energy efficiency with negligible impact on accuracy. However, DNN pruning has an important side effect: it may reduce the confidence of DNN predictions. We show that, although top-1 accuracy may be maintained with DNN pruning, the likelihood of the top-1 class is significantly reduced when using the pruned models. For applications such as Automatic Speech Recognition (ASR), where the DNN scores are consumed by a subsequent stage, the workload of that stage can increase dramatically due to the loss of confidence in the DNN. An ASR system consists of a DNN for computing acoustic scores, followed by a Viterbi beam search to find the most likely sequence of words. We show that, when pruning the DNN model used for acoustic scoring, the Word Error Rate (WER) is maintained but the execution time of the ASR system increases by 33%. Although pruning improves the efficiency of the DNN, it causes a large increase in activity in the Viterbi search, since the output scores of the pruned model are less reliable. Based on this observation, we propose a novel hardware-based ASR system that effectively integrates a DNN accelerator for pruned models with a Viterbi accelerator. To avoid the aforementioned increase in Viterbi search workload, our system loosely selects the N-best hypotheses at every time step, exploring only the N most likely paths. To avoid an expensive sort of the hypotheses by likelihood, our accelerator employs a set-associative hash table to keep track of the best paths mapped to each set. In practice, this solution approaches the selection of the exact N-best while requiring much simpler hardware. Our approach efficiently combines DNN pruning and Viterbi search, achieving 9x energy savings and a 4.2x speedup with respect to state-of-the-art ASR solutions.
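As a rough illustration of this selection scheme, the sketch below models the set-associative hash table in Python. It is a minimal sketch, not the paper's hardware design: the table geometry (NUM_SETS, NUM_WAYS) and the Hypothesis fields are illustrative assumptions.

    from dataclasses import dataclass

    NUM_SETS = 64  # assumed geometry; NUM_SETS * NUM_WAYS plays the role of N
    NUM_WAYS = 4

    @dataclass
    class Hypothesis:
        state_id: int      # search state reached by this path (assumed field)
        likelihood: float  # accumulated path log-likelihood (higher is better)

    class SetAssociativeNBest:
        """Keeps roughly the N best hypotheses of a time step without sorting.

        Each hypothesis is mapped to one set by hashing its state id; within
        a set, only the NUM_WAYS most likely hypotheses survive, so inserting
        costs a few comparisons inside one small set instead of a global sort.
        """

        def __init__(self):
            self.sets = [[] for _ in range(NUM_SETS)]

        def insert(self, hyp: Hypothesis) -> None:
            ways = self.sets[hyp.state_id % NUM_SETS]
            if len(ways) < NUM_WAYS:
                ways.append(hyp)
                return
            # Evict the worst hypothesis in the set if the new one beats it.
            worst = min(range(NUM_WAYS), key=lambda i: ways[i].likelihood)
            if hyp.likelihood > ways[worst].likelihood:
                ways[worst] = hyp

        def survivors(self) -> list:
            # Hypotheses that get expanded in the next Viterbi time step.
            return [h for ways in self.sets for h in ways]

The trade-off is that a globally strong hypothesis can be evicted when its set happens to be crowded, so the surviving hypotheses only approximate the exact N-best; in exchange, no sorter over all hypotheses is required, which is why the hardware stays simple.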
Peer Reviewed
-UPC subject areas::Computer science::Computer architecture::Parallel architectures
-Automatic speech recognition (ASR)
-Computer architecture
-Computer hardware
-Deep learning
-Deep neural networks
-DNN pruning
-Energy conservation
-Energy efficiency
-Hardware accelerators
-N-best hypothesis
-Redundant connections
-Solution approach
-State of the art
-Viterbi algorithm
-Viterbi search
Article - Published version
Conference Object
Institute of Electrical and Electronics Engineers (IEEE)

Other documents of the same author

Yazdani Aminabadi, Reza; Segura Salvador, Albert; Arnau Montañés, José María; González Colás, Antonio María
Yazdani Aminabadi, Reza; Arnau Montañés, José María; González Colás, Antonio María
Riera Villanueva, Marc; Arnau Montañés, José María; González Colás, Antonio María