The 2004 MIT Lincoln Laboratory speaker recognition system
March 19, 2005
The MIT Lincoln Laboratory submission for the 2004 NIST Speaker Recognition Evaluation (SRE) was built upon seven core systems using speaker information from short-term acoustics, pitch and duration prosodic behavior, and phoneme and word usage. These different levels of information were modeled and classified using Gaussian Mixture Models, Support Vector Machines and N-gram language models and were combined using a single layer perception fuser. The 2004 SRE used a new multi-lingual, multi-channel speech corpus that provided a challenging speaker detection task for the above systems. In this paper we describe the core systems used and provide an overview of their performance on the 2004 SRE detection tasks.