Compensating for mismatch in high-level speaker recognition

June 28, 2006

Conference Paper

Author:

William M. Campbell

Published in:

2006 IEEE Odyssey, the Speaker and Language Recognition Workshop, 28-30 June 2006.

R&D Area:

Cyber Security and Information Sciences

R&D Group:

Artificial Intelligence Technology and Systems

Summary

Speaker recognition using high-level features has been a successful area of exploration. Features obtained from many different levels (phones, words, prosodic events, etc.) are used to characterize the speaker. A good modeling technique for these features is the support vector machine (SVM). SVMs model the n-gram frequencies from speaker utterances in a high-dimensional SVM feature space and have shown excellent performance over a wide variety of high-level features. A complementary method of recent exploration in SVM speaker recognition is the use of nuisance attribute projection (NAP). NAP removes directions from SVM feature space that are superfluous to the task of speaker recognition (channel information, session variability, etc.). In this paper, we consider the application of NAP to high-level speaker recognition. We describe the difficulties in applying this method and propose solutions. We also conduct experiments showing that NAP can reduce variability in SVM feature space, leading to improved performance.
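
To make the idea of removing directions from SVM feature space concrete, the following is a minimal sketch of a NAP-style projection. It is not taken from the paper. It assumes the nuisance directions can be estimated as the top principal directions of within-speaker variability across several sessions per speaker, and the function names, the `num_directions` parameter, and the toy data are hypothetical choices made for illustration.

```python
import numpy as np


def nap_projection(features, speaker_ids, num_directions):
    """Estimate an orthonormal basis U (dim x k) of nuisance directions.

    Assumption for this sketch: nuisance variability is whatever makes
    sessions of the same speaker differ, so we center each speaker's
    vectors on that speaker's mean and keep the top principal directions.
    """
    features = np.asarray(features, dtype=float)
    speaker_ids = np.asarray(speaker_ids)
    centered = np.empty_like(features)
    for spk in np.unique(speaker_ids):
        mask = speaker_ids == spk
        centered[mask] = features[mask] - features[mask].mean(axis=0)
    # Rows of vt are the principal directions of the centered data,
    # ordered by decreasing singular value.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return vt[:num_directions].T


def apply_nap(features, U):
    """Apply P = I - U U^T to each row, removing the span of U."""
    features = np.asarray(features, dtype=float)
    return features - (features @ U) @ U.T


if __name__ == "__main__":
    # Toy data: 3 speakers, 4 sessions each, with a strong session
    # effect along the first coordinate (purely illustrative).
    rng = np.random.default_rng(0)
    dim = 6
    ids = np.repeat(np.arange(3), 4)
    speaker_means = rng.normal(size=(3, dim))
    session_effect = np.zeros(dim)
    session_effect[0] = 1.0
    X = speaker_means[ids] + 5.0 * rng.normal(size=(12, 1)) * session_effect
    U = nap_projection(X, ids, num_directions=1)
    X_nap = apply_nap(X, U)
    print("component along U after NAP:", np.abs(X_nap @ U).max())
```

In a setup like the one the summary describes, the rows of `features` would be the n-gram frequency vectors after mapping into SVM feature space, and the projected vectors would then be used for SVM training and scoring. How many directions to remove is a tuning choice that this sketch leaves open.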