Publications
Two-talker pitch tracking for co-channel talker interference suppression
Summary
Almost all co-channel talker interference suppression systems use the difference in the pitches of the target and jammer speakers to suppress the jammer and enhance the target. While joint pitch estimators outputting two pitch estimates as a function of time have been proposed, the task of proper assignment of pitch...
An integrated speech-background model for robust speaker identification
Summary
This paper examines a procedure for text-independent speaker identification in noisy environments where the interfering background signals cannot be characterized using traditional broadband or impulsive noise models. In the procedure, both the speaker and the background processes are modeled using mixtures of Gaussians. Speaker and background models are integrated...
A speech recognizer using radial basis function neural networks in an HMM framework
Summary
A high performance speaker-independent isolated-word speech recognizer was developed which combines hidden Markov models (HMMs) and radial basis function (RBF) neural networks. RBF networks in this recognizer use discriminant training techniques to estimate Bayesian probabilities for each speech frame while HMM decoders estimate overall word likelihood scores for network outputs...
Shape invariant time-scale and pitch modification of speech
Summary
The simplified linear model of speech production predicts that when the rate of articulation is changed, the resulting waveform takes on the appearance of the original, except for a change in the time scale. The goal of this paper is to develop a time-scale modification system that preserves this shape-invariance...
Improved hidden Markov model speech recognition using radial basis function networks
Summary
A high performance speaker-independent isolated-word hybrid speech recognizer was developed which combines Hidden Markov Models (HMMs) and Radial Basis Function (RBF) neural networks. In recognition experiments using a speaker-independent E-set database, the hybrid recognizer had an error rate of 11.5% compared to 15.7% for the robust unimodal Gaussian HMM recognizer...
Neural network classifiers estimate Bayesian a posteriori probabilities
Summary
Many neural network classifiers provide outputs which estimate Bayesian a posteriori probabilities. When the estimation is accurate, network outputs can be treated as probabilities and sum to one. Simple proofs show that Bayesian probabilities are estimated when desired network outputs are 1 of M (one output unity, all others zero)...
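The claim that 1-of-M targets lead to posterior estimates can be checked numerically. The sketch below is not taken from the paper: it assumes a squared-error cost, replaces the network with a lookup table holding one free output vector per input value, and uses made-up priors and likelihoods (`PRIORS`, `P_X1_GIVEN_C`). Under squared error, the best output for each input is the mean of the one-hot targets seen with that input, which should approach the Bayes posterior.

```python
import random

PRIORS = [0.5, 0.3, 0.2]        # hypothetical class priors P(c), M = 3 classes
P_X1_GIVEN_C = [0.9, 0.5, 0.1]  # hypothetical likelihoods P(x = 1 | c), binary feature x


def true_posterior(x):
    """Bayes-rule posterior P(c | x) for the toy model above."""
    likelihood = [p if x == 1 else 1.0 - p for p in P_X1_GIVEN_C]
    joint = [prior * lik for prior, lik in zip(PRIORS, likelihood)]
    total = sum(joint)
    return [j / total for j in joint]


def mse_optimal_outputs(num_samples=50000, seed=0):
    """Squared-error-optimal outputs for 1-of-M targets, one vector per x.

    The minimizer of squared error for a fixed input is the sample mean of
    the targets observed with that input, so no gradient descent is needed.
    """
    rng = random.Random(seed)
    target_sums = {0: [0.0] * len(PRIORS), 1: [0.0] * len(PRIORS)}
    counts = {0: 0, 1: 0}
    for _ in range(num_samples):
        c = rng.choices(range(len(PRIORS)), weights=PRIORS)[0]
        x = 1 if rng.random() < P_X1_GIVEN_C[c] else 0
        target_sums[x][c] += 1.0  # one-hot target: unity for the true class
        counts[x] += 1
    return {x: [s / counts[x] for s in target_sums[x]] for x in (0, 1)}


if __name__ == "__main__":
    outputs = mse_optimal_outputs()
    for x in (0, 1):
        print(x, [round(v, 3) for v in outputs[x]],
              [round(v, 3) for v in true_posterior(x)])
```

In this toy setting the outputs for each input sum to one by construction, and they track the true posterior more closely as the sample count grows.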
Opportunities for advanced speech processing in military computer-based systems
Summary
This paper presents a study of military applications of advanced speech processing technology which includes three major elements: 1) review and assessment of current efforts in military applications of speech technology; 2) identification of opportunities for future military applications of advanced speech technology; and 3) identification of problem areas where...
Low-rate speech coding based on the sinusoidal model
Summary
One approach to the problem of representation of speech signals is to use the speech production model in which speech is viewed as the result of passing a glottal excitation waveform through a time-varying linear filter that models the resonant characteristics of the vocal tract. In many applications it suffices...
Speech nonlinearities, modulations, and energy operators
Summary
In this paper, we investigate an AM-FM model for representing modulations in speech resonances. Specifically, we propose a frequency modulation (FM) model for the time-varying formants whose amplitude varies as the envelope of an amplitude-modulated (AM) signal. To detect the modulations we apply the energy operator (psi)(x) = (x)^2 -...
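The formula in the summary above is cut off, and the sketch below does not attempt to reconstruct it. It illustrates the discrete-time Teager-Kaiser energy operator as it is commonly defined, x[n]^2 - x[n-1]*x[n+1], on the assumption that this is the operator family the abstract refers to. The function name `teager_kaiser` and the test tone are illustrative. For a pure tone A*cos(omega*n + phi), the operator output is the constant A^2 * sin^2(omega), which is why it is useful for tracking amplitude and frequency together.

```python
import math


def teager_kaiser(x):
    """Discrete energy operator x[n]^2 - x[n-1]*x[n+1] at interior samples."""
    return [x[n] * x[n] - x[n - 1] * x[n + 1] for n in range(1, len(x) - 1)]


# Illustrative test tone: amplitude 2.0, 0.3 rad/sample, arbitrary phase.
amplitude, omega, phase = 2.0, 0.3, 0.7
tone = [amplitude * math.cos(omega * n + phase) for n in range(200)]

energy = teager_kaiser(tone)
expected = (amplitude * math.sin(omega)) ** 2  # A^2 sin^2(omega), constant in n

if __name__ == "__main__":
    print(min(energy), max(energy), expected)
```

Because the output depends on both A and omega, a signal whose amplitude or frequency varies slowly produces a slowly varying operator output, which is the property a modulation detector exploits.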
Peak-to-rms reduction of speech based on a sinusoidal model
Summary
In a number of applications, a speech waveform is processed using phase dispersion and amplitude compression to reduce its peak-to-rms ratio so as to increase loudness and intelligibility while minimizing perceived distortion. In this paper, a sinusoidal-based analysis/synthesis system is used to apply a radar design solution to the problem...
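The effect of phase dispersion on peak-to-rms ratio can be seen with a toy harmonic signal. The sketch below is not the paper's method: it assumes 16 equal-amplitude harmonics and compares zero phases against a quadratic phase rule (Schroeder's rule for flat spectra), used here only as a stand-in for a dispersive phase design. The names `crest_factor`, `zero_phase`, and `quadratic_phase` are illustrative.

```python
import math

NUM_HARMONICS = 16


def crest_factor(phases, num_points=2048):
    """Peak-to-rms ratio of sum_k cos(2*pi*k*t + phases[k-1]) over one period."""
    peak = 0.0
    power = 0.0
    for n in range(num_points):
        t = n / num_points
        sample = sum(math.cos(2.0 * math.pi * k * t + ph)
                     for k, ph in enumerate(phases, start=1))
        peak = max(peak, abs(sample))
        power += sample * sample
    return peak / math.sqrt(power / num_points)


# All harmonics aligned: every cosine peaks together at t = 0.
zero_phase = [0.0] * NUM_HARMONICS

# Quadratic (Schroeder-style) phases spread the energy across the period.
quadratic_phase = [-math.pi * k * (k - 1) / NUM_HARMONICS
                   for k in range(1, NUM_HARMONICS + 1)]

if __name__ == "__main__":
    print("zero phase:     ", crest_factor(zero_phase))
    print("quadratic phase:", crest_factor(quadratic_phase))
```

Both signals have the same magnitude spectrum and the same rms value; only the phases differ, so the drop in peak-to-rms ratio comes entirely from dispersion.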