Publications

Refine Results

(Filters Applied) Clear All

Digital signal processing applications in cochlear-implant research

Published in:
Lincoln Laboratory Journal, Vol. 7, No. 1, Spring 1994, pp. 31-62.

Summary

We have developed a facility that enables scientists to investigate a wide range of sound-processing schemes for human subjects with cochlear implants. This digital signal processing (DSP) facility-named the Programmable Interactive System for Cochlear Implant Electrode Stimulation (PISCES)-was designed, built, and tested at Lincoln Laboratory and then installed at the Cochlear Implant Research Laboratory (CIRL) of the Massachusetts Eye and Ear Infirmary (MEEI). New stimulator algorithms that we designed and ran on PISCES have resulted in speech-reception improvements for implant subjects relative to commercial implant stimulators. These improvements were obtained as a result of interactive algorithm adjustment in the clinic, thus demonstrating the importance of a flexible signal-processing facility. Research has continued in the development of a laboratory-based, sohare-controlled, real-time, speech processing system; the exploration of new sound-processing algorithms for improved electrode stimulation; and the design of wearable stimulators that will allow subjects full-time use of stimulator algorithms developed and tested in a laboratory setting.
READ LESS

Summary

We have developed a facility that enables scientists to investigate a wide range of sound-processing schemes for human subjects with cochlear implants. This digital signal processing (DSP) facility-named the Programmable Interactive System for Cochlear Implant Electrode Stimulation (PISCES)-was designed, built, and tested at Lincoln Laboratory and then installed at the...

READ MORE

Neural networks, Bayesian a posteriori probabilities, and pattern classification

Published in:
Chapter 4 in From Statistics to Neural Networks: Theory and Pattern Recognition Applications, 1994, pp. 83-104.

Summary

Researchers in the fields of neural networks, statistics, machine learning, and artificial intelligence have followed three basic approaches to developing new pattern classifiers. Probability Density Function (PDF) classifiers include Gaussian and Gaussian Mixture classifiers which estimate distributions or densities of input features separately for each class. Posterior probability classifiers include multilayer perceptron neural networks with sigmoid nonlinearities and radial basis function networks. These classifiers estimate minimum-error Bayesian a posteriori probabilities (hereafter referred to as posterior probabilities) simultaneously for all classes. Boundary forming classifiers include hard-limiting single-layer perceptrons, hypersphere classifiers, and nearest neighbor classifiers. These classifiers have binary indicator outputs which form decision regions that specify the class of any input pattern. Posterior probability and boundary-forming classifiers are trained using discriminant training. All training data is used simultaneously to estimate Bayesian posterior probabilities or minimize overall classification error rates. PDF classifiers are trained using maximum likelihood approaches which individually model class distributions without regard to overall classification performance. Analytic results are presented which demonstrate that many neural network classifiers can accurately estimate posterior probabilities and that these neural network classifiers can sometimes provide lower error rates than PDF classifiers using the same number of trainable parameters. Experiments also demonstrate how interpretation of network outputs as posterior probabilities makes it possible to estimate the confidence of a classification decision, compensate for differences in class prior probabilities between test and training data, and combine outputs of multiple classifiers over time for speech recognition.
READ LESS

Summary

Researchers in the fields of neural networks, statistics, machine learning, and artificial intelligence have followed three basic approaches to developing new pattern classifiers. Probability Density Function (PDF) classifiers include Gaussian and Gaussian Mixture classifiers which estimate distributions or densities of input features separately for each class. Posterior probability classifiers include...

READ MORE

Predicting the risk of complications in coronary artery bypass operations using neural networks

Published in:
Proc. 7th Int. Conf. on Neural Information Processing Systems, NIPS, 1994, pp. 1055-62.

Summary

Experiments demonstrated that sigmoid multilayer perceptron (MLP) networks provide slightly better risk prediction than conventional logistic regression when used to predict the risk of death, stroke, and renal failure on 1257 patients who underwent coronary artery bypass operations at the Lahey Clinic. MLP networks with no hidden layer and networks with one hidden layer were trained using stochastic gradient descent with early stopping. MLP networks and logistic regression used the same input features and were evaluated using bootstrap sampling with 50 replications. ROC areas for predicting mortality using preoperative input features were 70.5% for logistic regression and 76.0% for MLP networks. Regularization provided by early stopping was an important component of improved performance. A simplified approach to generating confidence intervals for MLP risk predictions using an auxiliary "confidence MLP" was developed. The confidence MLP is trained to reproduce confidence intervals that were generated during training using the outputs of 50 MLP networks trained with different bootstrap samples.
READ LESS

Summary

Experiments demonstrated that sigmoid multilayer perceptron (MLP) networks provide slightly better risk prediction than conventional logistic regression when used to predict the risk of death, stroke, and renal failure on 1257 patients who underwent coronary artery bypass operations at the Lahey Clinic. MLP networks with no hidden layer and networks...

READ MORE

Figure of merit training for detection and spotting

Published in:
Proc. Neural Information Processing Systems, NIPS, 29 November - 2 December 1993.

Summary

Spotting tasks require detection of target patterns from a background of richly varied non-target inputs. The performance measure of interest for these tasks, called the figure of merit (FOM), is the detection rate for target patterns when the false alarm rate is in an acceptable range. A new approach to training spotters is presented which computes the FOM gradient for each input pattern and then directly maximizes the FOM using back propagation. This eliminates the need for thresholds during training. It also uses network resources to model Bayesian a posteriori probability functions accurately only for patterns which have a significant effect on the detection accuracy over the false alarm rate of interest. FOM training increased detection accuracy by 5 percentage points for a hybrid radial basis function (RBF) - hidden Markov model (HMM) wordspotter on the credit-card speech corpus.
READ LESS

Summary

Spotting tasks require detection of target patterns from a background of richly varied non-target inputs. The performance measure of interest for these tasks, called the figure of merit (FOM), is the detection rate for target patterns when the false alarm rate is in an acceptable range. A new approach to...

READ MORE

Energy separation in signal modulations with application to speech analysis

Published in:
IEEE Trans. Signal Process., Vol. 41, No. 10, October 1993, pp. 3024-3051.

Summary

Oscillatory signals that have both an amplitude-modulation (AM) and a frequency-modulation (FM) structure are encountered in almost all communication systems. We have also used these structures recently for modeling speech resonances, being motivated by previous work on investigating fluid dynamics phenomena during speech production that provide evidence for the existence of modulations in speech signals. In this paper, we use a nonlinear differential operator that can detect modulations in AM-FM signals by estimating the product of their time-varying amplitude and frequency. This operator essentially tracks the energy needed by a source to produce the oscillatory signal. To solve the fundamental problem of estimating both the amplitude envelope and instantaneous frequency of an AM-FM signal we develop a novel approach that uses nonlinear combinations of instantaneous signal outputs from the energy operator to separate its output energy product into its amplitude modulation and frequency modulation components. The theoretical analysis is done first for continuous-time signals. Then several efficient algorithms are developed and compared for estimating the amplitude envelope and instantaneous frequency of discrete-time AM-FM signals. These energy separation algorithms are then applied to search for modulations in speech resonances, which we model using AM-FM signals to account for time-varying amplitude envelopes and instantaneous frequencies. Our experimental results provide evidence that bandpass filtered speech signals around speech formants contain amplitude and frequency modulations within a pitch period. Overall, the energy separation algorithms, due to their very low computational complexity and instantaneously-adapting nature, are very useful in detecting modulation patterns in speech and other time-varying signals.
READ LESS

Summary

Oscillatory signals that have both an amplitude-modulation (AM) and a frequency-modulation (FM) structure are encountered in almost all communication systems. We have also used these structures recently for modeling speech resonances, being motivated by previous work on investigating fluid dynamics phenomena during speech production that provide evidence for the existence...

READ MORE

LNKnet: Neural network, machine-learning, and statistical software for pattern classification

Published in:
Lincoln Laboratory Journal, Vol. 6, No. 2, Summer/Fall 1993, pp. 249-268.

Summary

Pattern-classification and clustering algorithms are key components of modern information processing systems used to perform tasks such as speech and image recognition, printed-character recognition, medical diagnosis, fault detection, process control, and financial decision making. To simplify the task of applying these types of algorithms in new application areas, we have developed LNKnet-a software package that provides access to more than 20 pattern-classification, clustering, and feature-selection algorithms. Included are the most important algorithms from the fields of neural networks, statistics, machine learning, and artificial intelligence. The algorithms can be trained and tested on separate data or tested with automatic cross-validation. LNKnet runs under the UNM operating system and access to the different algorithms is provided through a graphical point-and-click user interface. Graphical outputs include two-dimensional (2-D) scatter and decision-region plots and 1-D plots of data histograms, classifier outputs, and error rates during training. Parameters of trained classifiers are stored in files from which the parameters can be translated into source-code subroutines (written in the C programming language) that can then be embedded in a user application program. Lincoln Laboratory and other research laboratories have used LNKnet successfully for many diverse applications.
READ LESS

Summary

Pattern-classification and clustering algorithms are key components of modern information processing systems used to perform tasks such as speech and image recognition, printed-character recognition, medical diagnosis, fault detection, process control, and financial decision making. To simplify the task of applying these types of algorithms in new application areas, we have...

READ MORE

Automatic language identification using Gaussian mixture and hidden Markov models

Author:
Published in:
Proc. IEEE Int. Conf. on Acoustics, Speech and Signal Processing, Vol. 2, Speech Processing, ICASSP, 27-30 April 1993, pp. 399-402.

Summary

Ergodic, continuous-observation, hidden Markov models (HMMs) were used to perform automatic language classification and detection of speech messages. State observation probability densities were modeled as tied Gaussian mixtures. The algorithm was evaluated on four multilanguage speech databases: a three language subset of the Spoken Language Library, a three language subset of a five language Rome Laboratory database, the 20 language CCITT database, and the ten language OGI telephone speech database. Generally, performance of a single state HMM (i.e. a static Gaussian mixture classifier) was comparable to the multistate HMMs, indicating that the sequential modeling capabilities of HMMs were not exploited.
READ LESS

Summary

Ergodic, continuous-observation, hidden Markov models (HMMs) were used to perform automatic language classification and detection of speech messages. State observation probability densities were modeled as tied Gaussian mixtures. The algorithm was evaluated on four multilanguage speech databases: a three language subset of the Spoken Language Library, a three language subset...

READ MORE

Detection of transient signals using the energy operator

Published in:
Proc. IEEE Int. Conf. on Acoustics, Speech and Signal Processing, Vol. 3, ICASSP, 27-30 April 1993, pp. 145-148.

Summary

A function of the Teager-Kaiser energy operator is introduced as a method for detecting transient signals in the presence of amplitude-modulated and frequency-modulated tonal interference. This function has excellent time resolution and is robust in the presence of white noise. The output of the detection function is also independent of the interference-to-transient ratio when that ratio is large. It is demonstrated that the detection function can be applied to interference signals with multiple amplitude-modulated and frequency-modulated tonal components.
READ LESS

Summary

A function of the Teager-Kaiser energy operator is introduced as a method for detecting transient signals in the presence of amplitude-modulated and frequency-modulated tonal interference. This function has excellent time resolution and is robust in the presence of white noise. The output of the detection function is also independent of...

READ MORE

Time-scale modification of complex acoustic signals

Published in:
Proc. IEEE Int. Conf. on Acoustics, Speech and Signal Processing, ICASSP, Vol. 1, Plenary, Special, Audio, Underwater Acoustics, VLSI, Neural Networks, 27-30 April 1993, pp. 213-216.

Summary

A new approach is introduced for time-scale modification of short-duration complex acoustic signals to improve their audibility. The technique constrains the modified signal to take on a specified spectral characteristic while imposing a time-scaled version of the original temporal envelope. Both full-band and sub-band representations of the temporal envelope are considered. In the full-band case, the modified signal is obtained by appropriate selection of its Fourier transform phase. In the sub-band case, using locations of maxima in the sub-band temporal envelopes, the phase of each bandpass signal is formed to preserve "events" in the envelope of the composite signal. The approach is applied to synthetic and actual short-duration acoustic signals consisting of closely-spaced and overlapping sequential time components.
READ LESS

Summary

A new approach is introduced for time-scale modification of short-duration complex acoustic signals to improve their audibility. The technique constrains the modified signal to take on a specified spectral characteristic while imposing a time-scaled version of the original temporal envelope. Both full-band and sub-band representations of the temporal envelope are...

READ MORE

Time-scale modification with temporal envelope invariance

Published in:
Proc. IEEE Workshop on Applications of Signal Processing to Audio and Acoustics, 17-20 October 1993, pp. 127-130.

Summary

A new approach is introduced for time-scale modification of short-duration complex acoustic signals to improve their audibility. The method preserves the time-scaled temporal envelope of a signal and for enhancement capitalizes on the perceptual importance of a signal's temporal structure. The basis for the approach is a sub-band representation whose channel phases are controlled to shape the temporal envelope of the time-scaled signal. The phase control is derived from locations of events which occur within filterbank outputs. A frame-based generalization of the method imposes phase consistency across consecutive synthesis frames. The approach is applied to synthetic and actual short-duration acoustic signals consisting of closely-spaced and overlapping sequential time components.
READ LESS

Summary

A new approach is introduced for time-scale modification of short-duration complex acoustic signals to improve their audibility. The method preserves the time-scaled temporal envelope of a signal and for enhancement capitalizes on the perceptual importance of a signal's temporal structure. The basis for the approach is a sub-band representation whose...

READ MORE