Publications
Conditional pronunciation modeling in speaker detection
Summary
Summary
In this paper, we present a conditional pronunciation modeling method for the speaker detection task that does not rely on acoustic vectors. Aiming at exploiting higher-level information carried by the speech signal, it uses time-aligned streams of phones and phonemes to model a speaker's specific Pronunciation. Our system uses phonemes...
Phonetic speaker recognition using maximum-likelihood binary-decision tree models
Summary
Summary
Recent work in phonetic speaker recognition has shown that modeling phone sequences using n-grams is a viable and effective approach to speaker recognition, primarily aiming at capturing speaker-dependent pronunciation and also word usage. This paper describes a method involving binary-tree-structured statistical models for extending the phonetic context beyond that of...
The SuperSID project : exploiting high-level information for high-accuracy speaker recognition
Summary
Summary
The area of automatic speaker recognition has been dominated by systems using only short-term, low-level acoustic information, such as cepstral features. While these systems have indeed produced very low error rates, they ignore other levels of information beyond low-level acoustics that convey speaker information. Recently published work has shown examples...
Using prosodic and conversational features for high-performance speaker recognition : report from JHU WS'02
Summary
Summary
While there has been a long tradition of research seeking to use prosodic features, especially pitch, in speaker recognition systems, results have generally been disappointing when such features are used in isolation and only modest improvements have been set when used in conjunction with traditional cepstral GMM systems. In contrast...
A multi-threaded fast convolver for dynamically parallel image filtering
Summary
Summary
2D convolution is a staple of digital image processing. The advent of large format imagers makes it possible to literally ''pave'' with silicon the focal plane of an optical sensor, which results in very large images that can require a significant amount computation to process. Filtering of large images via...
Phonetic speaker recognition with support vector machines
Summary
Summary
A recent area of significant progress in speaker recognition is the use of high level features-idiolect, phonetic relations, prosody, discourse structure, etc. A speaker not only has a distinctive acoustic sound but uses language in a characteristic manner. Large corpora of speech data available in recent years allow experimentation with...
Modeling prosodic dynamics for speaker recognition
Summary
Summary
Most current state-of-the-art automatic speaker recognition systems extract speaker-dependent features by looking at short-term spectral information. This approach ignores long-term information that can convey supra-segmental information, such as prosodics and speaking style. We propose two approaches that use the fundamental frequency and energy trajectories to capture long-term information. The first...
Cluster detection in databases : the adaptive matched filter algorithm and implementation
Summary
Summary
Matched filter techniques are a staple of modern signal and image processing. They provide a firm foundation (both theoretical and empirical) for detecting and classifying patterns in statistically described backgrounds. Application of these methods to databases has become increasingly common in certain fields (e.g. astronomy). This paper describes an algorithm...
The effect of identifying vulnerabilities and patching software on the utility of network intrusion detection
Summary
Summary
Vulnerability scanning and installing software patches for known vulnerabilities greatly affects the utility of network-based intrusion detection systems that use signatures to detect system compromises. A detailed timeline analysis of important remote-to-local vulnerabilities demonstrates (1) vulnerabilities in widely-used server software are discovered infrequently (at most 6 times a year) and...
2-D processing of speech with application to pitch estimation
Summary
Summary
In this paper, we introduce a new approach to two-dimensional (2-D) processing of the one-dimensional (1-D) speech signal in the time-frequency plane. Specifically, we obtain the shortspace 2-D Fourier transform magnitude of a narrowband spectrogram of the signal and show that this 2-D transformation maps harmonically-related signal components to a...