Publications

Convergence of iterative nonexpansive signal reconstruction algorithms

Published in:
IEEE Trans. Acoust. Speech Signal Process., Vol. ASSP-29, No. 5, October 1981, pp. 1052-1059.

Summary

Iterative algorithms for signal reconstruction from partial time- and frequency-domain knowledge have proven useful in a number of application areas. In this paper, a general convergence proof, applicable to a general class of such iterative reconstruction algorithms, is presented. The proof relies on the concept of a nonexpansive mapping in both the time and frequency domains. Two examples studied in detail are time-limited extrapolation (equivalently, band-limited extrapolation) and phase-only signal reconstruction. The proof of convergence for the phase-only iteration is a new result obtained by this method of proof. The generality of the approach allows the incorporation of nonlinear constraints such as time- (or space-) domain positivity or minimum and maximum value constraints. Finally, the underrelaxed form of these iterations is also shown to converge even when the solution is not guaranteed to be unique.
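As an illustration only (not the paper's own code), the band-limited extrapolation example can be sketched as an alternation between a time-domain constraint and a frequency-domain constraint, with an optional underrelaxation factor. The function names, the naive DFT, the test signal, and the parameter `relax` are all assumptions made for this sketch. Because each constraint step is nonexpansive and leaves the true signal fixed, the distance to the true signal should not grow from one iteration to the next.

```python
import cmath
import math


def dft(x, inverse=False):
    """Naive O(N^2) discrete Fourier transform (forward or inverse)."""
    n = len(x)
    sign = 1 if inverse else -1
    out = []
    for k in range(n):
        acc = 0j
        for t in range(n):
            acc += x[t] * cmath.exp(sign * 2j * math.pi * k * t / n)
        out.append(acc / n if inverse else acc)
    return out


def band_limit(x, band):
    """Frequency-domain constraint: zero every DFT bin outside `band`.

    `band` is assumed symmetric (bin k and bin N-k together), so a real
    input stays real; taking .real only discards roundoff.
    """
    spectrum = dft(x)
    kept = [v if k in band else 0j for k, v in enumerate(spectrum)]
    return [v.real for v in dft(kept, inverse=True)]


def impose_known(x, known):
    """Time-domain constraint: overwrite the samples whose values are known."""
    return [known.get(i, v) for i, v in enumerate(x)]


def extrapolate(known, n, band, iterations=50, relax=1.0):
    """Iterate x <- x + relax * (T(x) - x), where T applies both constraints.

    relax = 1.0 is the plain iteration; 0 < relax < 1 is the underrelaxed form.
    """
    x = [0.0] * n
    for _ in range(iterations):
        target = band_limit(impose_known(x, known), band)
        x = [a + relax * (b - a) for a, b in zip(x, target)]
    return x


def error_norm(x, reference):
    """Euclidean distance between an estimate and the reference signal."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, reference)))


# Hypothetical test case: a real signal occupying DFT bins 1, 2, 30, 31 of 32,
# observed only on its first 20 samples.
N = 32
BAND = {0, 1, 2, 30, 31}
truth = [math.cos(2 * math.pi * t / N) + 0.5 * math.sin(4 * math.pi * t / N)
         for t in range(N)]
known = {t: truth[t] for t in range(20)}
estimate = extrapolate(known, N, BAND, iterations=50)
print(error_norm(estimate, truth))
```

Nonlinear constraints such as positivity could be added as a further step inside the loop, provided that step is also nonexpansive.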

Data traffic performance of an integrated circuit and packet-switched multiplex structure

Published in:
IEEE Trans. on Commun., Vol. COM-28, No. 6, June 1980, pp. 873-878.

Summary

Results are developed for data traffic performance in an integrated multiplex structure which includes circuit-switching for voice and packet-switching for data. The results are obtained both through simulation and analysis, and show that excessive data queues and delays will build up under heavy loading conditions. These large data delays occur during periods of time when the voice traffic load through the multiplexer exceeds its statistical average. A variety of flow control mechanisms to reduce data packet delays are investigated. These mechanisms include control of voice bit rate, limitation of the data buffer, and combinations of voice rate and data buffer control. Simulations indicate that these flow control mechanisms provide substantial improvements in system performance.

A split band adaptive predictive coding (SBAPC) speech system

Published in:
IEEE Int. Conf. on Acoustics, Speech, & Signal Processing, 9-11 April 1980.

Summary

As developed by Atal and Schroeder [1], conventional Adaptive Predictive Coding (APC) of speech employs both vocal tract and pitch prediction to achieve a low energy, spectrally flattened residual. Errors in the pitch predictor can result in clipping errors which can propagate in the system for relatively long periods of time and degrade the quality of the synthesized speech. Makhoul and Berouti [2] have developed a high quality 16 kbps APC system which eliminates the pitch predictor by using a multi-level variable rate quantizer. In order to achieve comparable quality at even lower data rates, a split band APC (SBAPC) structure is proposed which employs the multi-level quantizer on the low frequency portion of the residual and a 1-bit quantizer on the high frequency portion of the residual.

The tradeoff between delay and TASI advantage in a packetized speech multiplexer

Published in:
IEEE Trans. on Commun., Vol. COM-27, No. 11, November 1979, pp. 1716-1720.

Summary

A packetized speech multiplexer differs from a circuit-switched TASI system in that the presence of a packet buffer allows a tradeoff where the TASI advantage can be increased at a cost in packet delay. This tradeoff is investigated via a simulation. Results are presented to show the relations between TASI advantage and delay, for both an average delay criterion and a maximum delay criterion. It is shown that, particularly for the case where small numbers of talkers are multiplexed, the packetized system offers significant improvements in TASI advantage over the conventional circuit-switched multiplexer, at modest costs in packet delay.
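The tradeoff can be illustrated with a toy simulation; this is a sketch under simplifying assumptions and not the simulation used in the paper. It assumes each talker independently produces one packet per frame with a fixed probability (`activity`, a hypothetical parameter that ignores talkspurt structure), an unbounded packet buffer, and a link that serves `channels` packets per frame. TASI advantage is taken as the ratio of talkers to channels, and the time needed to drain the leftover backlog serves as a rough proxy for packet delay.

```python
import random


def simulate_multiplexer(talkers, channels, activity=0.4, frames=20000, seed=1):
    """Return (tasi_advantage, mean_delay_proxy_in_frames).

    Each frame, every talker emits a packet with probability `activity`;
    the link removes up to `channels` packets; the rest wait in the buffer.
    """
    rng = random.Random(seed)
    backlog = 0
    backlog_sum = 0
    for _ in range(frames):
        arrivals = sum(1 for _ in range(talkers) if rng.random() < activity)
        backlog = max(0, backlog + arrivals - channels)
        backlog_sum += backlog
    mean_backlog = backlog_sum / frames
    # Delay proxy: frames needed to drain the average leftover backlog.
    return talkers / channels, mean_backlog / channels


# Fewer channels for the same talkers: higher TASI advantage, more delay.
for channels in (10, 6, 5):
    advantage, delay = simulate_multiplexer(10, channels)
    print(channels, round(advantage, 2), round(delay, 3))
```

With as many channels as talkers the buffer never fills, which corresponds to a TASI advantage of 1 and zero queueing delay in this model.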

A phrase recognizer using syllable-based acoustic measurements

Published in:
IEEE Trans. Acoust. Speech Signal Process., Vol. ASSP-26, No. 5, October 1978, pp. 409-418.

Summary

A system for the recognition of spoken phrases is described. The recognizer assumes that the input utterance contains one of a known set of allowable phrases, which may be spoken within a longer carrier sentence. Analysis is performed on a syllable-by-syllable basis with only the strong syllables considered in the recognition process. Each strong syllable is represented in terms of a set of distinguishing acoustic measurements taken at time points in and around the syllable nucleus. Phrases are represented as sequences of strong syllables. All parameters used in recognition are derived from LPC coefficients. Input speech is limited to 3.3 kHz upper frequency. Recognition is completed within 1-3 s after the utterance is spoken. An interactive training facility allows flexible composition of key phrase sets. Testing was performed for a number of phrase sets each containing ten or fewer phrases, and included equal numbers of talkers used in training and talkers not used in training. Average phrase recognition accuracy was 95 percent when parameters were derived from unquantized (i.e., 16 bit) LPC coefficients and 90 percent when the LPC coefficients were transmitted to the recognizer across the ARPA network at 3500 bits/s. The recognizer has been incorporated into a user interface system where the parameters required to set up a point-to-point ARPANET voice connection can be established remotely by voice.

A linear prediction vocoder with voice excitation

Published in:
Proc. EASCON, 29 September - 1 October 1975, pp. 30-a-30-g.

Summary

A speech bandwidth compression system, which employs voice excitation in conjunction with a Linear Predictive Coding (LPC) parameterization of the vocal tract filter, is described. To generate the excitation signal, the transmitted speech baseband is broadened at the receiver with a nonlinear distorter, and spectrally flattened by means of an adaptive inverse filter whose parameters are obtained through LPC analysis of the distorted baseband. The voice-excited linear prediction (VELP) system has been implemented in real time on the Fast Digital Processor at Lincoln Laboratory. A detailed description of an 8 kbps version of VELP is given. VELP offers promise as a good quality, medium rate speech compression system which, by avoiding the pitch problem, performs relatively well for telephone quality input speech.

A system for acoustic-phonetic analysis of continuous speech

Published in:
Proc. IEEE Symp. on Speech Recognition, 15-19 April 1974, pp. 54-67.

Summary

A system for acoustic-phonetic analysis of continuous speech is being developed to serve as part of an automatic speech understanding system. The acoustic system accepts the speech waveform as an input and produces as output a string of phoneme-like units referred to as acoustic phonetic elements (APEL's). This paper should be considered as a progress report, since the system is still under active development. The initial phase of the acoustic analysis consists of signal processing and parameter extraction, and includes spectrum analysis via linear prediction, computation of a number of parameters of the spectrum, and fundamental frequency extraction. This is followed by a preliminary segmentation of the speech into a few broad acoustic categories and formant tracking during vowel-like segments. The next phase consists of more detailed segmentation and classification intended to meet the needs of subsequent linguistic analysis. The preliminary segmentation and segment classification yield the following categories: vowel-like sound; volume dip within vowel-like sound; fricative-like sound; stop consonants, including silence or voice bar, and associated burst. These categories are produced by a decision tree based upon energy measurements in selected frequency bands, derivatives and ratios of these measurements, a voicing detector, and a few editing rules.
The more detailed classification algorithms include: 1) detection and identification of some diphthongs, semivowels, and nasals, through analysis of formant motions, positions, and amplitudes; 2) a vowel identifier, which determines three ranked choices for each vowel based on a comparison of the formant positions in the detected vowel segment to stored formant positions in a speaker-normalized vowel table; 3) a fricative identifier, which employs measurement of relative spectral energies in several bands to group the fricative segments into phoneme-like categories; 4) stop consonant classification based on the properties of the plosive burst. The above algorithms have been tested on a substantial corpus of continuous speech data. Performance results, as well as detailed descriptions of the algorithms, are given.

Effects of finite register length in digital filtering and the fast Fourier transform

Published in:
Proceedings of the IEEE, Vol. 60, No. 8, August 1972, pp. 957-976.

Summary

When digital signal processing operations are implemented on a computer or with special-purpose hardware, errors and constraints due to finite word length are unavoidable. The main categories of finite register length effects are errors due to A/D conversion, errors due to roundoffs in the arithmetic, constraints on signal levels imposed by the need to prevent overflow, and quantization of system coefficients. The effects of finite register length on implementations of linear recursive difference equation digital filters, and the fast Fourier transform (FFT), are discussed in some detail. For these algorithms, the differing quantization effects of fixed point, floating point, and block floating point arithmetic are examined and compared. The paper is intended primarily as a tutorial review of a subject which has received considerable attention over the past few years. The groundwork is set through a discussion of the relationship between the binary representation of numbers and truncation or rounding, and a formulation of a statistical model for arithmetic roundoff. The analyses presented here are intended to illustrate techniques of working with particular models. Results of previous work are discussed and summarized when appropriate. Some examples are presented to indicate how the results developed for simple digital filters and the FFT can be applied to the analysis of more complicated systems which use these algorithms as building blocks.
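A common form of the statistical model for arithmetic roundoff treats the rounding error as uniformly distributed over one quantization step q = 2^-b, which gives zero mean and variance q^2/12. The following is a minimal sketch that checks this model empirically for fixed-point rounding. The function names and the choice of test inputs (uniform on [-1, 1]) are assumptions made for illustration and are not taken from the paper.

```python
import random


def round_to_bits(value, frac_bits):
    """Round `value` to the nearest multiple of 2**-frac_bits (fixed point)."""
    step = 2.0 ** -frac_bits
    return round(value / step) * step


def model_variance(frac_bits):
    """Variance predicted by the uniform-error model: q^2 / 12."""
    step = 2.0 ** -frac_bits
    return step * step / 12.0


def roundoff_error_stats(frac_bits, trials=20000, seed=0):
    """Empirical mean and variance of the rounding error on random inputs."""
    rng = random.Random(seed)
    errors = []
    for _ in range(trials):
        value = rng.uniform(-1.0, 1.0)
        errors.append(round_to_bits(value, frac_bits) - value)
    mean = sum(errors) / trials
    variance = sum((e - mean) ** 2 for e in errors) / trials
    return mean, variance


mean, variance = roundoff_error_stats(8)
print(mean, variance, model_variance(8))
```

The measured variance should land close to the model value, and each additional bit of register length lowers the predicted noise variance by a factor of four.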

A theory of multiple antenna AMTI radar

Published in:
MIT Lincoln Laboratory Report TN-1971-21

Summary

This note presents a detailed mathematical analysis of a multiple-antenna AMTI radar system capable of detecting moving targets over a significantly wider velocity range than is achievable with a single-antenna system. The general system configuration and signaling strategy are defined, and relationships among system and signaling parameters are investigated. A deterministic model for the target return and a statistical model for the clutter and noise returns are obtained, and an optimum processor for target detection is derived. A performance measure applicable to a large class of processors, including the optimum processor, is defined and some of its analytical properties are investigated. It is shown that an easily implementable sub-optimum processor, based on two-dimensional spectral analysis, performs nearly as well as the optimum processor. The resolution and ambiguity properties of this sub-optimum processor are studied and a detailed numerical investigation of system performance is presented, including a study of how performance varies with basic system parameters such as the number of antennas.

Predictive coding in a homomorphic vocoder

Published in:
IEEE Trans. Audio Electroacoust., Vol. AU-19, No. 3, September 1971, pp. 243-248.

Summary

Application of a type of predictive coding to the channel signals of a homomorphic vocoder has produced sizable bit rate reductions. With only slight degradation in speech quality, reduction (for the spectral envelope information) from 7800 to 4000 bits/s was achieved. A technique for obtaining the formant frequencies from the predictive coding parameters is described; this approach promises further bit rate reductions. As a byproduct of this study of predictive coding, direct and cascade form speech synthesizers are compared on the basis of differing quantization effects.