Publications
Sinusoidal coding
Summary
This chapter summarizes the sinewave-based pitch extractor and the high-order all-pole modelling techniques that provided the basis for the multirate Sinusoidal Transform Coder and its application to multi-speaker conferencing.
Speaker identification and verification using Gaussian mixture speaker models
Summary
This paper presents high-performance speaker identification and verification systems based on Gaussian mixture speaker models: robust, statistically based representations of speaker identity. The identification system is a maximum likelihood classifier and the verification system is a likelihood ratio hypothesis tester using background speaker normalization. The systems are evaluated on...
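As a rough illustration of the two decision rules named in this summary, the sketch below scores feature frames against diagonal-covariance Gaussian mixture models, picks the maximum-likelihood speaker for identification, and thresholds a background-normalized log-likelihood ratio for verification. The function names, the toy model parameters, and the particular normalization (subtracting the mean log-likelihood of the background speakers) are assumptions made for illustration, not details taken from the paper.

```python
import numpy as np


def gmm_avg_loglik(frames, weights, means, variances):
    """Average per-frame log-likelihood under a diagonal-covariance GMM.

    frames: (T, D); weights: (M,); means and variances: (M, D).
    """
    frames = np.asarray(frames, dtype=float)
    weights = np.asarray(weights, dtype=float)
    means = np.asarray(means, dtype=float)
    variances = np.asarray(variances, dtype=float)

    diff = frames[:, None, :] - means[None, :, :]            # (T, M, D)
    log_comp = -0.5 * (
        np.sum(diff ** 2 / variances, axis=2)
        + np.sum(np.log(2.0 * np.pi * variances), axis=1)
    )                                                        # (T, M)
    log_weighted = log_comp + np.log(weights)

    # Log-sum-exp over mixture components, for numerical stability.
    peak = log_weighted.max(axis=1, keepdims=True)
    log_frame = peak[:, 0] + np.log(np.exp(log_weighted - peak).sum(axis=1))
    return float(log_frame.mean())


def identify(frames, models):
    """Maximum-likelihood identification: return the best-scoring speaker."""
    return max(models, key=lambda name: gmm_avg_loglik(frames, *models[name]))


def verify(frames, claimed, models, background, threshold=0.0):
    """Likelihood-ratio test with an assumed, simple background normalization."""
    target = gmm_avg_loglik(frames, *models[claimed])
    cohort = np.mean([gmm_avg_loglik(frames, *models[b]) for b in background])
    score = float(target - cohort)
    return score > threshold, score


# Hypothetical toy models: (weights, means, variances) per speaker.
models = {
    "alice": ([0.5, 0.5], [[0.0, 0.0], [1.0, 1.0]], [[1.0, 1.0], [1.0, 1.0]]),
    "bob": ([1.0], [[5.0, 5.0]], [[1.0, 1.0]]),
}
```

In practice the model parameters would be trained from each speaker's enrollment speech, and the threshold would be tuned on held-out data; neither step is shown here.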
Energy onset times for speaker identification
Summary
Onset times of resonant energy pulses are measured with the high-resolution Teager operator and used as features in the Reynolds Gaussian-mixture speaker identification algorithm. Feature sets are constructed with primary pitch and secondary pulse locations derived from low and high speech formants. Preliminary testing was performed with a confusable 40-speaker...
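For readers unfamiliar with the Teager operator, the sketch below implements the standard discrete Teager-Kaiser energy operator, psi[x](n) = x(n)^2 - x(n-1) x(n+1). It only illustrates the operator itself; the onset-time detection and feature construction described in the summary are not reproduced, and the function name is a hypothetical choice.

```python
import numpy as np


def teager_energy(x):
    """Discrete Teager-Kaiser energy: x[n]^2 - x[n-1] * x[n+1].

    Returns an array two samples shorter than the input, since the
    operator needs one neighbour on each side.
    """
    x = np.asarray(x, dtype=float)
    return x[1:-1] ** 2 - x[:-2] * x[2:]


# For a pure sinusoid A*cos(omega*n) the output is the constant
# A^2 * sin(omega)^2, which is why the operator tracks both amplitude
# and frequency using only three samples.
n = np.arange(100)
tone = 2.0 * np.cos(0.3 * n)
energy = teager_energy(tone)
```

Because the operator uses only three adjacent samples, it responds almost instantly to changes in the signal, which is the property that makes it attractive for locating energy onsets.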
Formant AM-FM for speaker identification
Summary
The performance of systems for speaker identification (SID) can be quite good with clean speech, though much lower with degraded speech. Thus it is useful to search for new features for SID, particularly features that are robust over a degraded channel. This paper investigates features that are robust over a...
Experimental evaluation of features for robust speaker identification
Summary
This correspondence presents an experimental evaluation of different features and channel compensation techniques for robust speaker identification. The goal is to keep all processing and classification steps constant and to vary only the features and compensations used to allow a controlled comparison. A general, maximum-likelihood classifier based on Gaussian mixture...
Large population speaker recognition using wideband and telephone speech
Summary
The two largest factors affecting automatic speaker identification performance are the size of the population to be distinguished among and the degradations introduced by noisy communication channels (e.g. telephone transmission). To experimentally examine these two factors, this paper presents text-independent speaker identification results for varying speaker population sizes up to...
Wordspotter training using figure-of-merit back propagation
Summary
A new approach to wordspotter training is presented which directly maximizes the Figure of Merit (FOM) defined as the average detection rate over a specified range of false alarm rates. This systematic approach to discriminant training for wordspotters eliminates the necessity of ad hoc thresholds and tuning. It improves the...
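To make the Figure of Merit definition concrete, the sketch below computes a simplified, discrete version of it: putative keyword detections are ranked by score, the detection rate is recorded each time another false alarm is admitted, and those rates are averaged. The function name, its arguments, and the use of a raw false-alarm count (rather than a rate per keyword per hour) are assumptions for illustration. The sketch only evaluates the metric and does not show the back-propagation training the paper proposes.

```python
def figure_of_merit(scores, is_hit, n_true, max_false_alarms):
    """Average detection rate over operating points with 1..max_false_alarms
    false alarms (simplified, assumed form of the FOM).

    scores: detection scores for putative hits.
    is_hit: True where the putative hit is a real keyword occurrence.
    n_true: total number of real keyword occurrences in the test data.
    """
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    hits = 0
    rates = []
    for i in order:
        if is_hit[i]:
            hits += 1
        else:
            # Detection rate achieved when this false alarm is admitted.
            rates.append(hits / n_true)
            if len(rates) == max_false_alarms:
                break
    # If the list ran out of false alarms, the remaining operating points
    # keep the final detection rate.
    rates.extend([hits / n_true] * (max_false_alarms - len(rates)))
    return sum(rates) / max_false_alarms


example_fom = figure_of_merit(
    scores=[0.9, 0.8, 0.7, 0.6],
    is_hit=[True, False, True, False],
    n_true=2,
    max_false_alarms=2,
)
```

Because the metric depends only on the ranking of scores, no detection threshold has to be chosen to evaluate it, which matches the summary's point about avoiding ad hoc thresholds.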
Automatic language identification of telephone speech messages using phoneme recognition and N-gram modeling
Summary
This paper compares the performance of four approaches to automatic language identification (LID) of telephone speech messages: Gaussian mixture model classification (GMM), language-independent phoneme recognition followed by language-dependent language modeling (PRLM), parallel PRLM (PRLM-P), and language-dependent parallel phoneme recognition (PPR). These approaches span a wide range of training requirements and...
Demonstrations and applications of spoken language technology: highlights and perspectives from the 1993 ARPA Spoken Language Technology and Applications Day
Summary
The ARPA Spoken Language Technology and Applications Day (SLTA'93) was a special workshop which presented a set of live, state-of-the-art demonstrations of speech recognition and Spoken Language Understanding systems. The purpose of this paper is to provide perspective on current opportunities for applications that these systems can enable, and to review the...
Integrated models of signal and background with application to speaker identification in noise
Summary
This paper is concerned with the problem of robust parametric model estimation and classification in noisy acoustic environments. Characterization and modeling of the external noise sources in these environments is in itself an important issue in noise compensation. The techniques described here provide a mechanism for integrating parametric models of...