Publications
Speaker diarisation for broadcast news
Summary
Summary
It is often important to be able to automatically label 'who spoke when' during some audio data. This paper describes two systems for audio segmentation developed at CUED and MIT-LL and evaluates their performance using the speaker diarisation score defined in the 2003 Rich Transcription Evaluation. A new clustering procedure...
Language recognition with support vector machines
Summary
Summary
Support vector machines (SVMs) have become a popular tool for discriminative classification. Powerful theoretical and computational tools for support vector machines have enabled significant improvements in pattern classification in several areas. An exciting area of recent application of support vector machines is in speech processing. A key aspect of applying...
High-level speaker verification with support vector machines
Summary
Summary
Recently, high-level features such as word idiolect, pronunciation, phone usage, prosody, etc., have been successfully used in speaker verification. The benefit of these features was demonstrated in the NIST extended data task for speaker verification; with enough conversational data, a recognition system can become familiar with a speaker and achieve...
A tutorial on text-independent speaker verification
Summary
Summary
This paper presents an overview of a state-of-the-art text-independent speaker verification system. First, an introduction proposes a modular scheme of the training and test phases of a speaker verification system. Then, the most commonly speech parameterization used in speaker verification, namely, cepstral analysis, is detailed. Gaussian mixture modeling, which is...
Analysis of multitarget detection for speaker and language recognition
Summary
Summary
The general multitarget detection (or open-set identification) task is the intersection of the more common tasks of close-set identification and open-set verification/detection. In this task, a bank of parallel detectors process an input and must decide if the input is from one of the target classes and, if so, which...
Beyond cepstra: exploiting high-level information in speaker recognition
Summary
Summary
Traditionally speaker recognition techniques have focused on using short-term, low-level acoustic information such as cepstra features extracted over 20-30 ms windows of speech. But speech is a complex behavior conveying more information about the speaker than merely the sounds that are characteristic of his vocal apparatus. This higher-level information includes...
Biometrically enhanced software-defined radios
Summary
Summary
Software-defined radios and cognitive radios offer tremendous promise, while having great need for user authentication. Authenticating users is essential to ensuring authorized access and actions in private and secure communications networks. User authentication for software-defined radios and cognitive radios is our focus here. We present various means of authenticating users...
Acoustic, phonetic, and discriminative approaches to automatic language identification
Summary
Summary
Formal evaluations conducted by NIST in 1996 demonstrated that systems that used parallel banks of tokenizer-dependent language models produced the best language identification performance. Since that time, other approaches to language identification have been developed that match or surpass the performance of phone-based systems. This paper describes and evaluates three...
Fusing high- and low-level features for speaker recognition
Summary
Summary
The area of automatic speaker recognition has been dominated by systems using only short-term, low-level acoustic information, such as cepstral features. While these systems have produced low error rates, they ignore higher levels of information beyond low-level acoustics that convey speaker information. Recently published works have demonstrated that such high-level...
Integration of speaker recognition into conversational spoken dialogue systems
Summary
Summary
In this paper we examine the integration of speaker identification/verification technology into two dialogue systems developed at MIT: the Mercury air travel reservation system and the Orion task delegation system. These systems both utilize information collected from registered users that is useful in personalizing the system to specific users and...