Impact of Speech Enhancement on ASR time stamping
|Authors:||Theologos Athanaselis; Stylianos Bakamidis; Stavroula-Evita Fotinea; Ioannis Dologlou|
|Book title:||Proceedings of the Fourth European Symposium on Intelligent Technologies, Hybrid Systems and their implementation on Smart Adaptive Systems, EUNITE-2004|
This paper examines how a parametric signal enhancement method improves the accuracy of speech recognition time stamps in the presence of noise. More accurate time stamps for each recognised word improve the performance of multimodal systems. A recognised word’s time stamp reflects how accurately the word’s boundaries are located within the recognised utterance. This matters because recognition errors are more frequent in noise and tend to make applications such as spoken dialogue systems too cumbersome to use. The input signal is corrupted with coloured noise at varying signal-to-noise ratios (SNR). As the noise level increases, not only does the word error rate (WER) rise, but the accuracy of the word time stamps also deteriorates. A non-linear spectral subtraction (NSS) method is used in conjunction with the Continuous Speech Recognition system of the ERMIS project to quantify the impact of speech enhancement on word time stamp accuracy.
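The evaluation setup described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names (`scale_noise_to_snr`, `snr_db`, `mean_boundary_error`), the `(word, start, end)` tuple layout, and the choice of mean absolute boundary deviation as the time stamp error measure are all assumptions made for illustration. The first function scales a noise signal so that the mixture reaches a target SNR; the last one compares recognised word boundaries against reference boundaries.

```python
import math


def scale_noise_to_snr(signal, noise, target_snr_db):
    """Scale `noise` so that the signal-to-noise power ratio equals target_snr_db.

    Hypothetical helper: `signal` and `noise` are equal-rate sample lists.
    """
    p_signal = sum(s * s for s in signal) / len(signal)
    p_noise = sum(n * n for n in noise) / len(noise)
    # SNR_dB = 10 * log10(p_signal / (gain^2 * p_noise)), solved for gain
    gain = math.sqrt(p_signal / (p_noise * 10 ** (target_snr_db / 10)))
    return [gain * n for n in noise]


def snr_db(signal, noise):
    """Signal-to-noise ratio in decibels, computed from mean powers."""
    p_signal = sum(s * s for s in signal) / len(signal)
    p_noise = sum(n * n for n in noise) / len(noise)
    return 10 * math.log10(p_signal / p_noise)


def mean_boundary_error(reference, hypothesis):
    """Mean absolute deviation (seconds) of word start and end times.

    Assumption: both lists hold (word, start, end) tuples and are already
    aligned position by position; pairs whose words differ are skipped.
    """
    deviations = []
    for (ref_word, ref_start, ref_end), (hyp_word, hyp_start, hyp_end) in zip(
        reference, hypothesis
    ):
        if ref_word != hyp_word:
            continue  # misrecognised word: counts towards WER, not time stamp error
        deviations.append(abs(ref_start - hyp_start))
        deviations.append(abs(ref_end - hyp_end))
    if not deviations:
        raise ValueError("no correctly recognised words to compare")
    return sum(deviations) / len(deviations)
```

Under these assumptions, the time stamp error would be computed once on the noisy input and once on the enhanced input at each SNR, so that the difference between the two values indicates the effect of the enhancement step.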