<?xml version="1.0" encoding="UTF-8"?><?xml-stylesheet type="text/xsl" href="static/style.xsl"?><OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://www.openarchives.org/OAI/2.0/ http://www.openarchives.org/OAI/2.0/OAI-PMH.xsd"><responseDate>2026-04-14T07:41:11Z</responseDate><request verb="GetRecord" identifier="oai:www.recercat.cat:2117/103823" metadataPrefix="oai_dc">https://recercat.cat/oai/request</request><GetRecord><record><header><identifier>oai:recercat.cat:2117/103823</identifier><datestamp>2026-01-16T08:00:09Z</datestamp><setSpec>com_2072_1033</setSpec><setSpec>col_2072_452950</setSpec></header><metadata><oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:doc="http://www.lyncode.com/xoai" xsi:schemaLocation="http://www.openarchives.org/OAI/2.0/oai_dc/ http://www.openarchives.org/OAI/2.0/oai_dc.xsd">
   <dc:title>Automatic speech recognition with deep neural networks for impaired speech</dc:title>
   <dc:creator>España-i-Bonet, Cristina</dc:creator>
   <dc:creator>Rodríguez Fonollosa, José Adrián</dc:creator>
   <dc:contributor>Universitat Politècnica de Catalunya. Departament de Teoria del Senyal i Comunicacions</dc:contributor>
   <dc:contributor>Universitat Politècnica de Catalunya. VEU - Grup de Tractament de la Parla</dc:contributor>
   <dc:subject>Àrees temàtiques de la UPC::Informàtica::Intel·ligència artificial::Llenguatge natural</dc:subject>
   <dc:subject>Automatic Speech Recognition</dc:subject>
   <dc:subject>Database systems</dc:subject>
   <dc:subject>Network architecture</dc:subject>
   <dc:subject>Neural networks – Speech</dc:subject>
   <dc:subject>Automatic speech recognition</dc:subject>
   <dc:subject>Deep learning</dc:subject>
   <dc:subject>Deep neural networks</dc:subject>
   <dc:subject>Dysarthria</dc:subject>
   <dc:subject>Human performance</dc:subject>
   <dc:subject>Kaldi</dc:subject>
   <dc:subject>Speaker adaptation</dc:subject>
   <dc:subject>Word error rate</dc:subject>
   <dc:subject>Reconeixement automàtic de la parla</dc:subject>
   <dc:description>The final publication is available at https://link.springer.com/chapter/10.1007%2F978-3-319-49169-1_10</dc:description>
   <dc:description>Automatic Speech Recognition has reached almost human performance in some controlled scenarios. However, recognition of impaired speech remains a difficult task for two main reasons: data are (i) scarce and (ii) heterogeneous. In this work we train different architectures on a database of dysarthric speech. A comparison between architectures shows that, even with a small database, hybrid DNN-HMM models outperform classical GMM-HMM models in terms of word error rate. A DNN improves the word error rate by 13% for subjects with dysarthria with respect to the best classical architecture. This improvement is larger than that obtained with other deep neural networks such as CNNs, TDNNs and LSTMs. All experiments were carried out with the Kaldi speech recognition toolkit, for which we have adapted several recipes to deal with dysarthric speech and to work on the TORGO database. These recipes are publicly available.</dc:description>
   <dc:description>Peer Reviewed</dc:description>
   <dc:description>Postprint (author's final draft)</dc:description>
   <dc:date>2016</dc:date>
   <dc:type>Conference report</dc:type>
   <dc:identifier>España-i-Bonet, C., Fonollosa, J. A. R. Automatic speech recognition with deep neural networks for impaired speech. A: International Conference on Advances in Speech and Language Technologies for Iberian Languages. "Advances in Speech and Language Technologies for Iberian Languages: Third International Conference, IberSPEECH 2016: Lisbon, Portugal, November 23-25, 2016: proceedings". Lisbon: Springer, 2016, p. 97-107.</dc:identifier>
   <dc:identifier>978-3-319-49169-1</dc:identifier>
   <dc:identifier>https://hdl.handle.net/2117/103823</dc:identifier>
   <dc:identifier>10.1007/978-3-319-49169-1_10</dc:identifier>
   <dc:language>eng</dc:language>
   <dc:relation>https://link.springer.com/chapter/10.1007%2F978-3-319-49169-1_10</dc:relation>
   <dc:relation>info:eu-repo/grantAgreement/MINECO//IPT-2012-0914-300000/ES/Programa segunda voz/</dc:relation>
   <dc:relation>info:eu-repo/grantAgreement/MINECO//TEC2015-69266-P/ES/TECNOLOGIAS DE APRENDIZAJE PROFUNDO APLICADAS AL PROCESADO DE VOZ Y AUDIO/</dc:relation>
   <dc:rights>Open Access</dc:rights>
   <dc:format>11 p.</dc:format>
   <dc:format>application/pdf</dc:format>
   <dc:publisher>Springer</dc:publisher>
</oai_dc:dc></metadata></record></GetRecord></OAI-PMH>