Large population speaker identification using clean and telephone speech

March 1, 1995

Journal Article

Author:

Douglas A. Reynolds

Published in:

IEEE Signal Process. Lett., Vol. 2, No. 3, March 1995, pp. 46-48.

R&D Area:

Cyber Security and Information Sciences

R&D Group:

Artificial Intelligence Technology and Systems

Large population speaker identification using clean and telephone speech

Summary

This paper presents text-independent speaker identification results for varying speaker population sizes up to 630 speakers for both clean, wideband speech, and telephone speech. A system based on Gaussian mixture speaker models is used for speaker identification, and experiments are conducted on the TIMIT and NTIMIT databases. The TIMIT results show large population performance under near-ideal conditions, and the NTIMIT results show the corresponding accuracy loss due to telephone transmission. These are believed to be the first speaker identification experiments on the complete 630 speaker TIMIT and NTIMIT databases and the largest text-independent speaker identification task reported to date. Identification accuracies of 99.5 and 60.7% were achieved on the TIMIT and NTIMIT databases, respectively.