Publication Abstract

Singer, E., Torres-Carrasquillo, P. A., Gleason, T. P., Campbell, W. M., and Reynolds, D. A. Acoustic, Phonetic, and Discriminative Approaches to Automatic Language Recognition. In Proc. Eurospeech, Geneva, Switzerland, ISCA, pp. 1345-1348, 1-4 September 2003.

Abstract

Formal evaluations conducted by NIST in 1996 demonstrated that systems using parallel banks of tokenizer-dependent language models produced the best language recognition performance. Since then, other approaches to language recognition have been developed that match or surpass the performance of phone-based systems. This paper describes and evaluates three techniques that have been applied to the language recognition problem: phone recognition, Gaussian mixture modeling, and support vector machine classification. A recognizer that fuses the scores of three systems employing these techniques produces a 2.7% equal error rate (EER) on the 1996 NIST evaluation set and a 2.8% EER on the NIST 2003 primary condition evaluation set. An approach to dealing with the problem of out-of-set data is also discussed.
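
For readers unfamiliar with the metric, the equal error rate is the operating point at which the miss rate on target trials equals the false alarm rate on non-target trials. The sketch below is illustrative only: the function names `fuse_scores` and `equal_error_rate` are hypothetical, and the equal-weight averaging is an assumption chosen for brevity, not the fusion method used in the paper.

```python
def fuse_scores(system_scores):
    """Average per-trial scores across systems.

    system_scores: one list of scores per system, all of the same length.
    Equal-weight averaging is an assumption made for illustration.
    """
    n_systems = len(system_scores)
    return [sum(trial) / n_systems for trial in zip(*system_scores)]


def equal_error_rate(target_scores, nontarget_scores):
    """Approximate the EER by sweeping a threshold over all observed scores.

    A trial is accepted when its score is >= the threshold. The EER is taken
    at the threshold where miss and false alarm rates are closest, and is
    reported as the mean of the two rates at that point.
    """
    thresholds = sorted(set(target_scores) | set(nontarget_scores))
    best_gap, eer = float("inf"), None
    for t in thresholds:
        miss = sum(s < t for s in target_scores) / len(target_scores)
        false_alarm = sum(s >= t for s in nontarget_scores) / len(nontarget_scores)
        gap = abs(miss - false_alarm)
        if gap < best_gap:
            best_gap, eer = gap, (miss + false_alarm) / 2
    return eer


# Toy scores only; these are not data from the paper.
targets = [0.9, 0.8, 0.7, 0.4]
nontargets = [0.6, 0.3, 0.2, 0.1]
eer = equal_error_rate(targets, nontargets)
```

With a finite set of trials the two error rates may never be exactly equal, which is why the sketch reports their mean at the closest point rather than interpolating.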