Beyond cepstra: exploiting high-level information in speaker recognition

December 11, 2003

Conference Paper

Author:

Douglas A. Reynolds

…

Published in:

Proc. Multimodal User Authentication Workshop, 11-12 December, 2003, pp. 223-9.

R&D Area:

Cyber Security and Information Sciences

R&D Group:

Artificial Intelligence Technology and Systems

Beyond cepstra: exploiting high-level information in speaker recognition

Summary

Traditionally speaker recognition techniques have focused on using short-term, low-level acoustic information such as cepstra features extracted over 20-30 ms windows of speech. But speech is a complex behavior conveying more information about the speaker than merely the sounds that are characteristic of his vocal apparatus. This higher-level information includes speaker-specific prosodics, pronunciations, word usage and conversational style. In this paper, we review some of the techniques to extract and apply these sources of high-level information with results from the NIST 2003 Extended Data Task.