Publications

A speech recognizer using radial basis function neural networks in an HMM framework

Published in:
ICASSP'92, Proc. IEEE Int. Conf. on Acoustics, Speech and Signal Processing, Vol. 1, Speech Processing 1, 23-26 March 1992, pp. 629-632.

Summary

A high performance speaker-independent isolated-word speech recognizer was developed which combines hidden Markov models (HMMs) and radial basis function (RBF) neural networks. RBF networks in this recognizer use discriminant training techniques to estimate Bayesian probabilities for each speech frame while HMM decoders estimate overall word likelihood scores for network outputs. RBF training is performed after the HMM recognizer has automatically segmented training tokens using forced Viterbi alignment. In recognition experiments using a speaker-independent E-set database, the hybrid recognizer had an error rate of 11.5% compared to 15.7% for the robust unimodal Gaussian HMM recognizer upon which the hybrid system was based. The error rate was also lower than that of a tied-mixture HMM recognizer with the same number of centers. These results demonstrate that RBF networks can be successfully incorporated in hybrid recognizers and suggest that they may be capable of good performance with fewer parameters than required by Gaussian mixture classifiers.
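The division of labor described above (per-frame scores from the network, word-level scores from an HMM decoder) can be illustrated with a minimal Viterbi sketch. This is not the paper's implementation: the left-to-right topology, the shared transition log probabilities `log_stay` and `log_next`, and the direct use of log frame scores as state scores are all assumptions made for illustration.

```python
import math

def viterbi_log_score(frame_scores, log_stay, log_next):
    """Best-path log score through a left-to-right HMM (hypothetical topology).

    frame_scores[t][s] is the log score of state s at frame t, e.g. the log
    of a per-frame network output. The path must start in state 0 and end
    in the last state; each step either stays or advances by one state.
    """
    n_states = len(frame_scores[0])
    neg_inf = float("-inf")
    best = [neg_inf] * n_states
    best[0] = frame_scores[0][0]  # assumed: every path starts in state 0
    for t in range(1, len(frame_scores)):
        new = [neg_inf] * n_states
        for s in range(n_states):
            stay = best[s] + log_stay
            move = best[s - 1] + log_next if s > 0 else neg_inf
            new[s] = max(stay, move) + frame_scores[t][s]
        best = new
    return best[-1]

# Two frames, two states: the only valid path is state 0 then state 1.
scores = [[math.log(0.9), math.log(0.1)],
          [math.log(0.2), math.log(0.8)]]
word_score = viterbi_log_score(scores, math.log(0.5), math.log(0.5))
```

In an isolated-word recognizer of this kind, one such score would be computed per word model and the highest-scoring word chosen.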

Initialization for improved IIR filter performance

Published in:
IEEE Trans. Signal Process., Vol. 40, No. 3, March 1992, pp. 543-550.

Summary

A new method for initializing the memory registers of IIR filters is introduced. In addition to providing improved performance as compared to other methods of initialization, this method is unique in that it makes no a priori assumptions regarding the input-signal content. Therefore, this method applies equally well to a variety of IIR filter designs and applications. The method is best suited for signal-processing applications in which "batch" processing of the data is used. However, sequential processing can be accommodated when delays at the beginning of a processing segment can be tolerated.

Shape invariant time-scale and pitch modification of speech

Published in:
IEEE Trans. Signal Process., Vol. 40, No. 3, March 1992, pp. 497-510.

Summary

The simplified linear model of speech production predicts that when the rate of articulation is changed, the resulting waveform takes on the appearance of the original, except for a change in the time scale. The goal of this paper is to develop a time-scale modification system that preserves this shape-invariance property during voicing. This is done using a version of the sinusoidal analysis-synthesis system that models and independently modifies the phase contributions of the vocal tract and vocal cord excitation. An important property of the system is its capability of performing time-varying rates of change. Extensions of the method are applied to fixed and time-varying pitch modification of speech. The sine-wave analysis-synthesis system also allows for shape-invariant joint time-scale and pitch modification, and allows for the adjustment of the time scale and pitch according to speech characteristics such as the degree of voicing.

Terminal Doppler weather radar/low-level wind shear alert system integration algorithm specification, version 1.1

Published in:
MIT Lincoln Laboratory Report ATC-187

Summary

A number of airports will receive both a Terminal Doppler Weather Radar (TDWR) windshear detection system and a phase III Low-Level Wind Shear Alert System (LLWAS). At those airports, the two systems will need to be combined into a single windshear detection system. This report specifies the algorithm to be used to integrate the two subsystems. The algorithm takes in the alphanumeric runway alert messages generated by each subsystem and joins them into integrated alert messages. The design goals of this windshear detection system are (1) to maintain the probability of detection for hazardous events while reducing the number of false alerts and microburst overwarnings and (2) to increase the accuracy of the loss/gain estimates. The first design goal is accomplished by issuing an integrated alert for an operational runway whenever either subsystem issues a 'strong' alert for that runway; by canceling a 'weak' windshear alert on an operational runway if only one subsystem is making the declaration; and by reducing a 'weak' microburst alert on an operational runway to a 'strong' windshear alert if only one subsystem is making the declaration. The second design goal is accomplished by using the average of the two loss/gain values, when appropriate. Keywords: TDWR, windshear, LLWAS, algorithm specification.
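The three alert rules and the averaging step can be sketched as follows. This is an illustrative reading of the summary only, not the specification in the report: the `Alert` type, its field names, the treatment of the case where both subsystems alert (stronger strength, more severe kind, averaged value), and the interpretation of "when appropriate" as "when both subsystems report" are assumptions.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass(frozen=True)
class Alert:
    kind: str         # "windshear" or "microburst"
    strength: str     # "weak" or "strong"; thresholds are not given in the summary
    loss_gain: float  # hypothetical units (knots), negative for a loss

def integrate(tdwr: Optional[Alert], llwas: Optional[Alert]) -> Optional[Alert]:
    """Combine the alerts of two subsystems for one operational runway."""
    alerts = [a for a in (tdwr, llwas) if a is not None]
    if not alerts:
        return None
    if len(alerts) == 1:
        a = alerts[0]
        if a.strength == "strong":
            return a          # a strong alert from either subsystem is issued
        if a.kind == "windshear":
            return None       # a lone weak windshear alert is canceled
        # a lone weak microburst alert is reduced to a strong windshear alert
        return Alert("windshear", "strong", a.loss_gain)
    # Assumed handling when both subsystems alert: keep the more severe
    # kind and strength, and average the two loss/gain values.
    kind = "microburst" if any(a.kind == "microburst" for a in alerts) else "windshear"
    strength = "strong" if any(a.strength == "strong" for a in alerts) else "weak"
    value = sum(a.loss_gain for a in alerts) / len(alerts)
    return Alert(kind, strength, value)
```

For example, `integrate(None, Alert("microburst", "weak", -30.0))` yields a strong windshear alert with the same loss value, while a weak windshear alert declared by only one subsystem yields no integrated alert.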

Improved hidden Markov model speech recognition using radial basis function networks

Published in:
Advances in Neural Information Processing Systems, Denver, CO, 2-5 December 1991.

Summary

A high performance speaker-independent isolated-word hybrid speech recognizer was developed which combines Hidden Markov Models (HMMs) and Radial Basis Function (RBF) neural networks. In recognition experiments using a speaker-independent E-set database, the hybrid recognizer had an error rate of 11.5% compared to 15.7% for the robust unimodal Gaussian HMM recognizer upon which the hybrid system was based. These results and additional experiments demonstrate that RBF networks can be successfully incorporated in hybrid recognizers and suggest that they may be capable of good performance with fewer parameters than required by Gaussian mixture classifiers. A global parameter optimization method designed to minimize the overall word error rather than the frame recognition error failed to reduce the error rate.

Neural network classifiers estimate Bayesian a posteriori probabilities

Published in:
Neural Comput., Vol. 3, No. 4, Winter 1991, pp. 461-483.

Summary

Many neural network classifiers provide outputs which estimate Bayesian a posteriori probabilities. When the estimation is accurate, network outputs can be treated as probabilities and sum to one. Simple proofs show that Bayesian probabilities are estimated when desired network outputs are 1 of M (one output unity, all others zero) and a squared-error or cross-entropy cost function is used. Results of Monte Carlo simulations performed using multilayer perceptron (MLP) networks trained with backpropagation, radial basis function (RBF) networks, and high-order polynomial networks graphically demonstrate that network outputs provide good estimates of Bayesian probabilities. Estimation accuracy depends on network complexity, the amount of training data, and the degree to which training data reflect true likelihood distributions and a priori class probabilities. Interpretation of network outputs as Bayesian probabilities allows outputs from multiple networks to be combined for higher level decision making, simplifies creation of rejection thresholds, makes it possible to compensate for differences between pattern class probabilities in training and test data, allows outputs to be used to minimize alternative risk functions, and suggests alternative measures of network performance.
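One of the uses listed above, compensating for class probabilities that differ between training and test data, follows directly from Bayes' rule: if an output estimates the posterior under the training priors, dividing by the training prior and multiplying by the test prior gives a quantity proportional to the posterior under the test priors. A minimal sketch, assuming accurate posterior estimates and known priors (the function name and arguments are illustrative, not taken from the paper):

```python
def adjust_posteriors(posteriors, train_priors, test_priors):
    """Rescale posterior estimates from training priors to test priors.

    Each output is multiplied by test_prior / train_prior for its class,
    then the results are renormalized to sum to one.
    """
    scaled = [p * t / r for p, r, t in zip(posteriors, train_priors, test_priors)]
    total = sum(scaled)
    return [s / total for s in scaled]

# A network trained on balanced classes is undecided about this input;
# with a 9:1 class ratio in the test data, the estimate shifts to match.
adjusted = adjust_posteriors([0.5, 0.5], [0.5, 0.5], [0.9, 0.1])
```

The same rescaled values could then be compared against a rejection threshold, which is meaningful only because the outputs are interpreted as probabilities.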

Air-to-air visual acquisition handbook

Published in:
MIT Lincoln Laboratory Report ATC-151

Summary

The document describes a set of computer programs that provide a practical means for predicting air-to-air visual acquisition performance for aircraft on collision courses. The programs are based upon a mathematical model of pilot visual acquisition performance. Guidelines are provided for selecting model parameters based upon previously collected flight test data. Selected results of computer analysis are provided.

Unalerted air-to-air visual acquisition

Published in:
MIT Lincoln Laboratory Report ATC-152

Summary

A series of flight tests were flown to measure pilot air-to-air visual acquisition performance for pilots employing unalerted visual search. Twenty-four general aviation subject pilots flew a cross-country route while an intercepting aircraft was controlled to produce three intercepts with altitude separation of 500 feet. Pilots received no traffic advisory information to alert them to the possible presence of the intercepting aircraft. Results were analyzed to estimate the instantaneous rate of visual acquisition for a visual target of specified size and contrast. The results were used to calibrate a mathematical model of visual acquisition that can be used to predict pilot performance under a range of conditions.
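To illustrate how an instantaneous acquisition rate turns into a prediction, the sketch below treats acquisition as a rate process, so the probability of having acquired the target by a given time is one minus the exponential of the negative accumulated rate. This is a generic relation for rate processes, not the report's calibrated model: how the rate depends on target size and contrast is not given in the summary, so the rate samples here are purely hypothetical inputs.

```python
import math

def acquisition_probability(rates, dt):
    """Probability of acquisition by the end of the sampled interval.

    rates: acquisition rate (per second) sampled every dt seconds.
    The integral of the rate is approximated by a simple sum.
    """
    return 1.0 - math.exp(-sum(rates) * dt)

# Hypothetical constant rate of 0.1 per second over 10 seconds.
p = acquisition_probability([0.1] * 10, 1.0)
```

With a rate that grows as the intercepting aircraft closes, the same function would give the cumulative probability of acquisition along the encounter.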

Terminal Doppler Weather Radar test bed operation, Orlando, January - June 1990

Published in:
MIT Lincoln Laboratory Report ATC-180

Summary

This semiannual report for the Terminal Doppler Weather Radar program, sponsored by the Federal Aviation Administration (FAA), covers the period from 1 January 1990 through 30 June 1990. The principal activity of this period was the transport and reassembly of the FL-2 weather radar test site from Kansas City, MO to Orlando, FL and the change of radar frequency from S-band used in Kansas City to C-band for Orlando operations. Site operations to prepare the FL-2C radar site for summer testing began in January and continued through May, when testing began. This report describes the RF hardware, the data collection, the computer systems at site, and the networks between Orlando, FL and Lexington, MA. Also included are discussions of the microburst and gust front algorithm development, data collection, display terminals, and training for Air Traffic Control (ATC) supervisors and controllers.

Opportunities for advanced speech processing in military computer-based systems

Published in:
Proc. IEEE, Vol. 79, No. 11, November 1991, pp. 1626-1641.

Summary

This paper presents a study of military applications of advanced speech processing technology which includes three major elements: 1) review and assessment of current efforts in military applications of speech technology; 2) identification of opportunities for future military applications of advanced speech technology; and 3) identification of problem areas where research in speech processing is needed to meet application requirements, and of current research thrusts which appear promising. The relationship of this study to previous assessments of military applications of speech technology is discussed and substantial recent progress is noted. Current efforts in military applications of speech technology which are highlighted include: 1) narrow-band (2400 b/s) and very low-rate (50-1200 b/s) secure voice communication; 2) voice/data integration in computer networks; 3) speech recognition in fighter aircraft, military helicopters, battle management, and air traffic control training systems; and 4) noise and interference removal for human listeners. Opportunities for advanced applications are identified by means of descriptions of several generic systems which would be possible with advances in speech technology and in system integration. These generic systems include 1) an integrated multirate voice data communications terminal; 2) an interactive speech enhancement system; 3) a voice-controlled pilot's associate system; 4) advanced air traffic control training systems; 5) a battle management command and control support system with spoken natural language interface; and 6) a spoken language translation system. In identifying problem areas and research efforts to meet application requirements, it is observed that some of the most promising research involves the integration of speech algorithm techniques including speech coding, speech recognition, and speaker recognition.