Summary
This paper compares the performance of four approaches to automatic language identification (LID) of telephone speech messages: Gaussian mixture model (GMM) classification, language-independent phoneme recognition followed by language-dependent language modeling (PRLM), parallel PRLM (PRLM-P), and language-dependent parallel phoneme recognition (PPR). These approaches span a wide range of training requirements and levels of recognition complexity. All approaches were tested on the development test subset of the OGI multi-language telephone speech corpus. In general, system performance was directly related to system complexity, with PRLM-P and PPR performing best. On 45-second test utterances, average two-language, closed-set, forced-choice classification accuracy reached 94.5% correct. The best ten-language, closed-set, forced-choice performance was 79.2% correct.