
oai:recercat.cat:2117/103813 2025-07-17T08:51:59Z com_2072_1033 col_2072_452950

Improving the robustness of the usual FBE-based ASR front-end

Nadeu Camprubí, Climent; Macho, D; Hernando Pericás, Francisco Javier

UPC subject areas::Telecommunications engineering; Telecommunication

All speech recognition systems require some form of signal representation that parametrically models the temporal evolution of the spectral envelope. Current parameterizations involve, either explicitly or implicitly, a set of energies from frequency bands which are often distributed on a mel scale. The computation of those filterbank energies (FBEs) always includes smoothing of basic spectral measurements and non-linear amplitude compression. A variety of linear transformations are typically applied to this time-frequency representation prior to the Hidden Markov Model (HMM) pattern-matching stage of recognition. In this paper, we discuss some robustness issues involved in both the computation of the FBEs and the subsequent linear transformations, and present alternative techniques that can improve robustness in additive noise conditions. In particular, we consider the root non-linearity, a voicing-dependent FBE computation technique, and a time and frequency filtering (tiffing) technique. Recognition results for the Aurora database are shown to illustrate the potential of these alternative techniques for enhancing the robustness of speech recognition systems.

Peer Reviewed
Postprint (published version)
2000
Conference report
http://creativecommons.org/licenses/by-nc-nd/3.0/es/
Open Access
Attribution-NonCommercial-NoDerivs 3.0 Spain
Mergablum
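As a rough illustration of the FBE computation the abstract describes (smoothing of spectral measurements over mel-spaced bands, followed by non-linear amplitude compression), the following sketch compares logarithmic and root compression on a single frame. It is not the authors' implementation: the mel formula is a common textbook convention, and the triangular filter shape, the band count, the root exponent `gamma`, the energy floor, and all function names are assumptions chosen for the example. The frame length is assumed to be even.

```python
import cmath
import math


def hz_to_mel(f):
    # Common textbook mel mapping (an assumption, not taken from the paper).
    return 2595.0 * math.log10(1.0 + f / 700.0)


def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)


def power_spectrum(frame):
    """Naive DFT power spectrum for bins 0..N/2 (adequate for a short demo frame)."""
    n = len(frame)
    spec = []
    for k in range(n // 2 + 1):
        acc = sum(x * cmath.exp(-2j * math.pi * k * t / n)
                  for t, x in enumerate(frame))
        spec.append(abs(acc) ** 2)
    return spec


def mel_filterbank_energies(power, sample_rate, num_bands):
    """Smooth the power spectrum with triangular filters evenly spaced in mel."""
    n_bins = len(power)
    nyquist = sample_rate / 2.0
    # num_bands + 2 edge points, equally spaced on the mel axis.
    top = hz_to_mel(nyquist)
    hz_edges = [mel_to_hz(top * i / (num_bands + 1))
                for i in range(num_bands + 2)]
    energies = []
    for b in range(1, num_bands + 1):
        lo, mid, hi = hz_edges[b - 1], hz_edges[b], hz_edges[b + 1]
        e = 0.0
        for k in range(n_bins):
            f = k * nyquist / (n_bins - 1)  # centre frequency of bin k
            if lo < f <= mid:
                e += power[k] * (f - lo) / (mid - lo)   # rising slope
            elif mid < f < hi:
                e += power[k] * (hi - f) / (hi - mid)   # falling slope
        energies.append(e)
    return energies


def compress(energies, kind="log", gamma=0.1, floor=1e-10):
    """Non-linear amplitude compression: logarithm or root (e ** gamma)."""
    if kind == "log":
        return [math.log(max(e, floor)) for e in energies]
    if kind == "root":
        return [e ** gamma for e in energies]
    raise ValueError("kind must be 'log' or 'root'")


# Demo frame: a 1 kHz sinusoid, 256 samples at 8 kHz (hypothetical values).
sample_rate = 8000
frame = [math.sin(2 * math.pi * 1000 * t / sample_rate) for t in range(256)]
fbe = mel_filterbank_energies(power_spectrum(frame), sample_rate, num_bands=8)
log_fbe = compress(fbe, kind="log")
root_fbe = compress(fbe, kind="root")
```

One visible difference between the two compressions in this sketch: the logarithm needs an explicit floor because it is unbounded as a band energy approaches zero, whereas the root stays finite and non-negative for nearly empty bands.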