Publications

Refine Results

(Filters Applied) Clear All

High-performance low-complexity wordspotting using neural networks

Published in:
IEEE Trans. Signal Process., Vol. 45, No. 11, November 1997, pp. 2864-2870.

Summary

A high-performance low-complexity neural network wordspotter was developed using radial basis function (RBF) neural networks in a hidden Markov model (HMM) framework. Two new complementary approaches substantially improve performance on the talker independent Switchboard corpus. Figure of Merit (FOM) training adapts wordspotter parameters to directly improve the FOM performance metric, and voice transformations generate additional training examples by warping the spectra of training data to mimic across-talker vocal tract variability.
READ LESS

Summary

A high-performance low-complexity neural network wordspotter was developed using radial basis function (RBF) neural networks in a hidden Markov model (HMM) framework. Two new complementary approaches substantially improve performance on the talker independent Switchboard corpus. Figure of Merit (FOM) training adapts wordspotter parameters to directly improve the FOM performance metric...

READ MORE

Improving wordspotting performance with artificially generated data

Published in:
Proc. IEEE Int. Conf. on Acoustics, Speech, and Signal Processing, ICASSP, 9 May 1996, pp. 526-9.

Summary

Lack of training data is a major problem that limits the performance of speech recognizers. Performance can often only be improved by expensive collection of data from many different talkers. This paper demonstrates that artificially transformed speech can increase the variability of training data and increase the performance of a wordspotter without additional expensive data collection. This approach was shown to be effective on a high-performance whole-word wordspotter on the Switchboard Credit Card database. The proposed approach used in combination with a discriminative training approach increased the Figure of Merit of the wordspotting system by 9.4% percentage points (62.5% to 71.9%). The increase in performance provided by artificially transforming speech was roughly equivalent to the increase that would have been provided by doubling the amount of training data. The performance of the wordspotter was also compared to that of human listeners who were able to achieve lower error rates because of improved consonant recognition.
READ LESS

Summary

Lack of training data is a major problem that limits the performance of speech recognizers. Performance can often only be improved by expensive collection of data from many different talkers. This paper demonstrates that artificially transformed speech can increase the variability of training data and increase the performance of a...

READ MORE

Wordspotter training using figure-of-merit back propagation

Published in:
Proc. IEEE Int. Conf. on Acoustics, Speech and Signal Processing, ICASSP, Vol. 1, Speech Processing, 19-22 April 1994, pp. 389-392.

Summary

A new approach to wordspotter training is presented which directly maximizes the Figure of Merit (FOM) defined as the average detection rate over a specified range of false alarm rates. This systematic approach to discriminant training for wordspotters eliminates the necessity of ad hoc thresholds and tuning. It improves the FOM of wordspotters tested using cross-validation on the credit-card speech corpus training conversations by 4 to 5 percentage points to roughly 70% This improved performance requires little extra complexity during wordspotting and only two extra passes through the training data during training. The FOM gradient is computed analytically for each putative hit, back-propagated through HMM word models using the Viterbi alignment, and used to adjust RBF hidden node centers and state-weights associated with every node in HMM keyword models.
READ LESS

Summary

A new approach to wordspotter training is presented which directly maximizes the Figure of Merit (FOM) defined as the average detection rate over a specified range of false alarm rates. This systematic approach to discriminant training for wordspotters eliminates the necessity of ad hoc thresholds and tuning. It improves the...

READ MORE

Figure of merit training for detection and spotting

Published in:
Proc. Neural Information Processing Systems, NIPS, 29 November - 2 December 1993.

Summary

Spotting tasks require detection of target patterns from a background of richly varied non-target inputs. The performance measure of interest for these tasks, called the figure of merit (FOM), is the detection rate for target patterns when the false alarm rate is in an acceptable range. A new approach to training spotters is presented which computes the FOM gradient for each input pattern and then directly maximizes the FOM using back propagation. This eliminates the need for thresholds during training. It also uses network resources to model Bayesian a posteriori probability functions accurately only for patterns which have a significant effect on the detection accuracy over the false alarm rate of interest. FOM training increased detection accuracy by 5 percentage points for a hybrid radial basis function (RBF) - hidden Markov model (HMM) wordspotter on the credit-card speech corpus.
READ LESS

Summary

Spotting tasks require detection of target patterns from a background of richly varied non-target inputs. The performance measure of interest for these tasks, called the figure of merit (FOM), is the detection rate for target patterns when the false alarm rate is in an acceptable range. A new approach to...

READ MORE

Showing Results

1-4 of 4