Publications
Adaptive short-time analysis-synthesis for speech enhancement
Summary
Summary
In this paper we propose a multiresolution short-time analysis method for speech enhancement. It is well known that fixed resolution methods such as the traditional short-time Fourier transform do not generally match the time-frequency structure of the signal being analyzed resulting in poor estimates of the speech and noise spectra...
A covariance kernel for SVM language recognition
Summary
Summary
Discriminative training for language recognition has been a key tool for improving system performance. In addition, recognition directly from shifted-delta cepstral features has proven effective. A recent successful example of this paradigm is SVM-based discrimination of languages based on GMM mean supervectors (GSVs). GSVs are created through MAP adaptation of...
A multi-class MLLR kernel for SVM speaker recognition
Summary
Summary
Speaker recognition using support vector machines (SVMs) with features derived from generative models has been shown to perform well. Typically, a universal background model (UBM) is adapted to each utterance yielding a set of features that are used in an SVM. We consider the case where the UBM is a...
Exploiting temporal change in pitch in formant estimation
Summary
Summary
This paper considers the problem of obtaining an accurate spectral representation of speech formant structure when the voicing source exhibits a high fundamental frequency. Our work is inspired by auditory perception and physiological modeling studies implicating the use of temporal changes in speech by humans. Specifically, we develop and assess...
Language recognition with discriminative keyword selection
Summary
Summary
One commonly used approach for language recognition is to convert the input speech into a sequence of tokens such as words or phones and then to use these token sequences to determine the target language. The language classification is typically performed by extracting N-gram statistics from the token sequences and...
Multisensor very low bit rate speech coding using segment quantization
Summary
Summary
We present two approaches to noise robust very low bit rate speech coding using wideband MELP analysis/synthesis. Both methods exploit multiple acoustic and non-acoustic input sensors, using our previously-presented dynamic waveform fusion algorithm to simultaneously perform waveform fusion, noise suppression, and crosschannel noise cancellation. One coder uses a 600 bps...
Improved GMM-based language recognition using constrained MLLR transforms
Summary
Summary
In this paper we describe the application of a feature-space transform based on constrained maximum likelihood linear regression for unsupervised compensation of channel and speaker variability to the language recognition problem. We show that use of such transforms can improve baseline GMM-based language recognition performance on the 2005 NIST Language...
Experimental demonstration of remote optical detection of trace explosives.
Summary
Summary
MIT Lincoln Laboratory has developed a concept that could enable remote (10s of meters) detection of trace explosives' residues via a field-portable laser system. The technique relies upon laser-induced photodissociation of nitro-bearing explosives into vibrationally excited nitric oxide (NO) fragments. Subsequent optical probing of the first vibrationally excited state at...
Analytic theory of power law graphs
Summary
Summary
An analytical theory of power law graphs is presented basedon the Kronecker graph generation technique. The analysisuses Kronecker exponentials of complete bipartite graphsto formulate the sub-structure of such graphs. This allows various high level quantities (e.g. degree distribution,betweenness centrality, diameter, eigenvalues, and isoparametric ratio) to be computed directly from the...
Integration of high-speed surface-channel charge coupled devices into an SOI CMOS process using strong phase shift lithography
Summary
Summary
To enable development of novel signal processing circuits, a high-speed surface-channel charge coupled device (CCD) process has been co-integrated with the Lincoln Laboratory 180-nm RF fully depleted silicon-on-insulator (FDSOI) CMOS technology. The CCDs support charge transfer clock speeds in excess of 1 GHz while maintaining high charge transfer efficiency (CTE)...