Publications
Biometrically enhanced software-defined radios
Summary
Summary
Software-defined radios and cognitive radios offer tremendous promise, while having great need for user authentication. Authenticating users is essential to ensuring authorized access and actions in private and secure communications networks. User authentication for software-defined radios and cognitive radios is our focus here. We present various means of authenticating users...
Fusing high- and low-level features for speaker recognition
Summary
Summary
The area of automatic speaker recognition has been dominated by systems using only short-term, low-level acoustic information, such as cepstral features. While these systems have produced low error rates, they ignore higher levels of information beyond low-level acoustics that convey speaker information. Recently published works have demonstrated that such high-level...
Person authentication by voice: a need for caution
Summary
Summary
Because of recent events and as members of the scientific community working in the field of speech processing, we feel compelled to publicize our views concerning the possibility of identifying or authenticating a person from his or her voice. The need for a clear and common message was indeed shown...
Combining cross-stream and time dimensions in phonetic speaker recognition
Summary
Summary
Recent studies show that phonetic sequences from multiple languages can provide effective features for speaker recognition. So far, only pronunciation dynamics in the time dimension, i.e., n-gram modeling on each of the phone sequences, have been examined. In the JHU 2002 Summer Workshop, we explored modeling the statistical pronunciation dynamics...
Conditional pronunciation modeling in speaker detection
Summary
Summary
In this paper, we present a conditional pronunciation modeling method for the speaker detection task that does not rely on acoustic vectors. Aiming at exploiting higher-level information carried by the speech signal, it uses time-aligned streams of phones and phonemes to model a speaker's specific Pronunciation. Our system uses phonemes...
Phonetic speaker recognition using maximum-likelihood binary-decision tree models
Summary
Summary
Recent work in phonetic speaker recognition has shown that modeling phone sequences using n-grams is a viable and effective approach to speaker recognition, primarily aiming at capturing speaker-dependent pronunciation and also word usage. This paper describes a method involving binary-tree-structured statistical models for extending the phonetic context beyond that of...
The SuperSID project : exploiting high-level information for high-accuracy speaker recognition
Summary
Summary
The area of automatic speaker recognition has been dominated by systems using only short-term, low-level acoustic information, such as cepstral features. While these systems have indeed produced very low error rates, they ignore other levels of information beyond low-level acoustics that convey speaker information. Recently published work has shown examples...
Phonetic speaker recognition with support vector machines
Summary
Summary
A recent area of significant progress in speaker recognition is the use of high level features-idiolect, phonetic relations, prosody, discourse structure, etc. A speaker not only has a distinctive acoustic sound but uses language in a characteristic manner. Large corpora of speech data available in recent years allow experimentation with...
Gender-dependent phonetic refraction for speaker recognition
Summary
Summary
This paper describes improvement to an innovative high-performance speaker recognition system. Recent experiments showed that with sufficient training data phone strings from multiple languages are exceptional features for speaker recognition. The prototype phonetic speaker recognition system used phone sequences from six languages to produce an equal error rate of 11.5%...
Speaker recognition from coded speech and the effects of score normalization
Summary
Summary
We investigate the effect of speech coding on automatic speaker recognition when training and testing conditions are matched and mismatched. Experiments used standard speech coding algorithms (GSM, G.729, G.723, MELP) and a speaker recognition system based on Gaussian mixture models adapted from a universal background model. There is little loss...