Publications
A comparison of subspace feature-domain methods for language recognition
Summary
Compensation of cepstral features for mismatch due to dissimilar train and test conditions has been critical for good performance in many speech applications. Mismatch is typically due to variability from changes in speaker, channel, gender, and environment. Common methods for compensation include RASTA, mean and variance normalization, VTLN, and feature...
A hybrid SVM/MCE training approach for vector space topic identification of spoken audio recordings
Summary
The success of support vector machines (SVMs) for classification problems is often dependent on an appropriate normalization of the input feature space. This is particularly true in topic identification, where the relative contribution of the common but uninformative function words can overpower the contribution of the rare but informative content...
Dialect recognition using adapted phonetic models
Summary
In this paper, we introduce a dialect recognition method that makes use of phonetic models adapted per dialect without phonetically labeled data. We show that this method can be implemented efficiently within an existing PRLM system. We compare the performance of this system with other state-of-the-art dialect recognition methods (both...
Eigen-channel compensation and discriminatively trained Gaussian mixture models for dialect and accent recognition
Summary
This paper presents a series of dialect/accent identification results for three sets of dialects with discriminatively trained Gaussian mixture models and feature compensation using eigen-channel decomposition. The classification tasks evaluated in the paper include: 1) the Chinese language classes, 2) American and Indian accented English, and 3) discrimination between three Arabic...
The MITLL NIST LRE 2007 language recognition system
Summary
This paper presents a description of the MIT Lincoln Laboratory language recognition system submitted to the NIST 2007 Language Recognition Evaluation. This system consists of a fusion of four core recognizers, two based on tokenization and two based on spectral similarity. Results for NIST's 14-language detection task are presented for...
Two protocols comparing human and machine phonetic discrimination performance in conversational speech
Summary
This paper describes two experimental protocols for direct comparison of human and machine phonetic discrimination performance in continuous speech. These protocols attempt to isolate phonetic discrimination while controlling for language and segmentation biases. Results of two human experiments are described, including comparisons with automatic phonetic recognition baselines. Our experiments suggest...
Beyond frame independence: parametric modelling of time duration in speaker and language recognition
Summary
In this work, we address the question of generating accurate likelihood estimates from multi-frame observations in speaker and language recognition. Using a simple theoretical model, we extend the basic assumption of independent frames to include two refinements: a local correlation model across neighboring frames, and a global uncertainty due to...
Detection probability modeling for airport wind-shear sensors
Summary
An objective wind-shear detection probability estimation model is developed for radar, lidar, and sensor combinations. The model includes effects of system sensitivity, site-specific wind-shear, clutter, and terrain blockage characteristics, range-aliased obscuration statistics, antenna beam filling and attenuation, and signal processing differences which allow a sensor- and site-specific performance analysis of...
Amplitude spectroscopy of a solid-state artificial atom
Summary
The energy-level structure of a quantum system, which has a fundamental role in its behaviour, can be observed as discrete lines and features in absorption and emission spectra. Conventionally, spectra are measured using frequency spectroscopy, whereby the frequency of a harmonic electromagnetic driving field is tuned into resonance with a...
A 64 x 64-pixel CMOS test chip for the development of large-format ultra-high-speed snapshot imagers
Summary
A 64 x 64-pixel test circuit was designed and fabricated in 0.18-µm CMOS technology for investigating high-speed imaging with large-format imagers. Several features are integrated into the circuit architecture to achieve fast exposure times with low skew and jitter for simultaneous pixel snapshots. These features include an H-tree clock distribution...