Predicting, diagnosing, and improving automatic language identification performance
September 22, 1997
Conference Paper
Author:
Published in:
5th European Conf. on Speech Communication and Technology, EUROSPEECH, 22-25 September 1997.
R&D Area:
Summary
Language-identification (LID) techniques that use multiple single-language phoneme recognizers followed by n-gram language models have consistently yielded top performance at NIST evaluations. In our study of such systems, we have recently cut our LID error rate by modeling the output of n-gram language models more carefully. Additionally, we are now able to produce meaningful confidence scores along with our LID hypotheses. Finally, we have developed some diagnostic measures that can predict performance of our LID algorithms.