Publications

Audio signal processing based on sinusoidal analysis/synthesis

Published in:
Chapter 9 in Applications of Digital Signal Processing to Audio and Acoustics, 1998, pp. 343-416.

Summary

Based on a sinusoidal model, an analysis/synthesis technique is developed that characterizes audio signals, such as speech and music, in terms of the amplitudes, frequencies, and phases of the component sine waves. These parameters are estimated by applying a peak-picking algorithm to the short-time Fourier transform of the input waveform. Rapid changes in the highly resolved spectral components are tracked by using a frequency-matching algorithm and the concept of "birth" and "death" of the underlying sine waves. For a given frequency track, a cubic phase function is applied to the sine-wave generator, whose output is amplitude-modulated and added to sines for other frequency tracks. The resulting synthesized signal preserves the general waveform shape and is nearly perceptually indistinguishable from the original, thus providing the basis for a variety of applications including signal modification, sound splicing, morphing and extrapolation, and estimation of sound characteristics such as vibrato. Although this sine-wave analysis/synthesis is applicable to arbitrary signals, tailoring the system to a specific sound class can improve performance. A source/filter phase model is introduced within the sine-wave representation to improve signal modification, as in time-scale and pitch change and dynamic range compression, by attaining phase coherence where sine-wave phase relations are preserved or controlled. A similar method of achieving phase coherence is also applied in revisiting the classical phase vocoder to improve modification of certain signal classes. A second refinement of the sine-wave analysis/synthesis invokes an additive deterministic/stochastic representation of sounds consisting of simultaneous harmonic and aharmonic contributions. A method of frequency tracking is given for the separation of these components, and is used in a number of applications.
The sine-wave model is also extended to two additively combined signals for the separation of simultaneous talkers or music duets. Finally, the use of sine-wave analysis/synthesis in providing insight for FM synthesis is described, and remaining challenges, such as an improved sine-wave representation of rapid attacks and other transient events, are presented.
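
The peak-picking step mentioned in this summary can be illustrated with a minimal sketch. It is not the chapter's implementation: the frame length, the Hann window, the threshold, and the function names `dft_magnitudes` and `pick_peaks` are all assumptions chosen for illustration, and a direct DFT stands in for one frame of a short-time Fourier transform.

```python
import cmath
import math

def dft_magnitudes(frame):
    """Magnitude of the DFT of a real frame, for bins 0..N/2."""
    n = len(frame)
    mags = []
    for k in range(n // 2 + 1):
        acc = sum(x * cmath.exp(-2j * math.pi * k * i / n)
                  for i, x in enumerate(frame))
        mags.append(abs(acc))
    return mags

def pick_peaks(mags, threshold):
    """Return the bins that are local maxima above an (assumed) threshold."""
    return [k for k in range(1, len(mags) - 1)
            if mags[k] > threshold
            and mags[k] > mags[k - 1]
            and mags[k] >= mags[k + 1]]

# Hypothetical test frame: two sine waves centered exactly on bins 5 and 12.
N = 64
window = [0.5 - 0.5 * math.cos(2 * math.pi * i / N) for i in range(N)]  # Hann
frame = [w * (math.cos(2 * math.pi * 5 * i / N)
              + 0.5 * math.cos(2 * math.pi * 12 * i / N))
         for i, w in enumerate(window)]

peaks = pick_peaks(dft_magnitudes(frame), threshold=1.0)
print(peaks)
```

In a full system, each picked peak would contribute an amplitude, frequency, and phase estimate for one frame, and the frequency-matching step would link peaks across frames into tracks.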

Noise reduction based on spectral change

Published in:
Proc. of the 1997 IEEE ASSP Workshop on Applications of Signal Processing to Audio and Acoustics, Session 8: Noise Reduction, 19-22 October 1997, 4 pages.

Summary

A noise reduction algorithm is designed for the aural enhancement of short-duration wideband signals. The signal of interest contains components that may be only a few milliseconds in duration and are corrupted by a nonstationary noise background. The essence of the enhancement technique is a Wiener filter that uses a desired signal spectrum whose estimation adapts to the "degree of stationarity" of the measured signal. The degree of stationarity is derived from a short-time spectral derivative measurement, motivated by sensitivity of biological systems to spectral change. Adaptive filter design tradeoffs are described, reflecting the accuracy of signal attack, background fidelity, and perceptual quality of the desired signal. Residual representations for binaural presentation are also considered.
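
As a rough illustration of the idea in this summary, the sketch below combines the standard per-bin Wiener gain, H = S / (S + N), with a signal-spectrum estimate whose smoothing depends on a frame-to-frame spectral change measure. The change measure and the smoothing rule are assumptions made for illustration only; the paper's actual spectral derivative and adaptation are not reproduced here, and all function names are hypothetical.

```python
def wiener_gain(signal_psd, noise_psd):
    """Per-bin Wiener gain H = S / (S + N)."""
    return [s / (s + n) if (s + n) > 0 else 0.0
            for s, n in zip(signal_psd, noise_psd)]

def spectral_change(prev_psd, cur_psd):
    """Normalized frame-to-frame spectral difference in [0, 1].

    0 means the spectrum did not change (stationary). This particular
    measure is an assumption, not the paper's spectral derivative.
    """
    num = sum(abs(c - p) for p, c in zip(prev_psd, cur_psd))
    den = sum(c + p for p, c in zip(prev_psd, cur_psd))
    return num / den if den > 0 else 0.0

def update_signal_psd(prev_est, measured_psd, noise_psd, change):
    """Smooth the signal-spectrum estimate less when the spectrum changes fast."""
    alpha = 1.0 - change  # hypothetical rule: more change, less smoothing
    raw = [max(m - n, 0.0) for m, n in zip(measured_psd, noise_psd)]
    return [alpha * p + (1.0 - alpha) * r for p, r in zip(prev_est, raw)]

# Hypothetical four-bin example: a short event appears in bin 1.
noise_psd = [1.0, 1.0, 1.0, 1.0]
prev_measured = [1.0, 1.0, 1.0, 1.0]
cur_measured = [1.0, 9.0, 1.0, 1.0]

change = spectral_change(prev_measured, cur_measured)
signal_est = update_signal_psd([0.0] * 4, cur_measured, noise_psd, change)
gain = wiener_gain(signal_est, noise_psd)
print(change, gain)
```

The design intent of such a rule is that a sudden event reduces the smoothing, so the gain can open quickly on a short attack, while a stationary background keeps the estimate heavily smoothed.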

A subband approach to time-scale expansion of complex acoustic signals

Published in:
IEEE Trans. Speech Audio Process., Vol. 3, No. 6, November 1995, pp. 515-519.

Summary

A new approach to time-scale expansion of short-duration complex acoustic signals is introduced. Using a subband signal representation, channel phases are selected to preserve a desired time-scaled temporal envelope. The phase representation is derived from locations of events that occur within filter bank outputs. A frame-based generalization of the method imposes phase consistency across consecutive synthesis frames. The method is applied to synthetic and actual complex acoustic signals consisting of closely spaced, rapidly damped sine waves. Time-frequency resolution limitations are discussed.

Time-scale modification with inconsistent constraints

Published in:
Proc. 1995 Workshop on Applications of Signal Processing to Audio and Acoustics, 15-18 October 1995.

Summary

A set-theoretic estimation approach is introduced for time-scale modification of complex acoustic signals. The method determines a signal that meets, in a least-squared error sense, desired temporal and spectral envelope constraints that are inconsistent. These constraints are generalized within the set-theoretic framework to include other signal characteristics such as instantaneous frequency and group delay. The approach can enhance acoustic signals consisting of closely spaced sequential time components, and is applicable to biological, underwater, and music sound processing.

Time-scale modification of complex acoustic signals

Published in:
Proc. IEEE Int. Conf. on Acoustics, Speech and Signal Processing, ICASSP, Vol. 1, Plenary, Special, Audio, Underwater Acoustics, VLSI, Neural Networks, 27-30 April 1993, pp. 213-216.

Summary

A new approach is introduced for time-scale modification of short-duration complex acoustic signals to improve their audibility. The technique constrains the modified signal to take on a specified spectral characteristic while imposing a time-scaled version of the original temporal envelope. Both full-band and sub-band representations of the temporal envelope are considered. In the full-band case, the modified signal is obtained by appropriate selection of its Fourier transform phase. In the sub-band case, using locations of maxima in the sub-band temporal envelopes, the phase of each bandpass signal is formed to preserve "events" in the envelope of the composite signal. The approach is applied to synthetic and actual short-duration acoustic signals consisting of closely-spaced and overlapping sequential time components.

Time-scale modification with temporal envelope invariance

Published in:
Proc. IEEE Workshop on Applications of Signal Processing to Audio and Acoustics, 17-20 October 1993, pp. 127-130.

Summary

A new approach is introduced for time-scale modification of short-duration complex acoustic signals to improve their audibility. The method preserves the time-scaled temporal envelope of a signal and for enhancement capitalizes on the perceptual importance of a signal's temporal structure. The basis for the approach is a sub-band representation whose channel phases are controlled to shape the temporal envelope of the time-scaled signal. The phase control is derived from locations of events which occur within filterbank outputs. A frame-based generalization of the method imposes phase consistency across consecutive synthesis frames. The approach is applied to synthetic and actual short-duration acoustic signals consisting of closely-spaced and overlapping sequential time components.

Shape invariant time-scale and pitch modification of speech

Published in:
IEEE Trans. Signal Process., Vol. 40, No. 3, March 1992, pp. 497-510.

Summary

The simplified linear model of speech production predicts that when the rate of articulation is changed, the resulting waveform takes on the appearance of the original, except for a change in the time scale. The goal of this paper is to develop a time-scale modification system that preserves this shape-invariance property during voicing. This is done using a version of the sinusoidal analysis-synthesis system that models and independently modifies the phase contributions of the vocal tract and vocal cord excitation. An important property of the system is its capability of performing time-varying rates of change. Extensions of the method are applied to fixed and time-varying pitch modification of speech. The sine-wave analysis-synthesis system also allows for shape-invariant joint time-scale and pitch modification, and allows for the adjustment of the time scale and pitch according to speech characteristics such as the degree of voicing.

Peak-to-rms reduction of speech based on a sinusoidal model

Published in:
IEEE Trans. Signal Process., Vol. 39, No. 2, February 1991, pp. 273-288.

Summary

In a number of applications, a speech waveform is processed using phase dispersion and amplitude compression to reduce its peak-to-rms ratio so as to increase loudness and intelligibility while minimizing perceived distortion. In this paper, a sinusoidal-based analysis/synthesis system is used to apply a radar design solution to the problem of dispersing the phase of a speech waveform. Unlike conventional methods of phase dispersion, this solution technique adapts dynamically to the pitch and spectral characteristics of the speech, while maintaining the original spectral envelope. The solution can also be used to drive the sine-wave amplitude modification for amplitude compression, and is coupled to the desired shaping of the speech spectrum. The new dispersion solution, when integrated with amplitude compression, results in a significant reduction in the peak-to-rms ratio of the speech waveform with acceptable loss in quality. Application of a real-time prototype sine-wave preprocessor to AM radio broadcasting is described.
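
The effect of phase dispersion on the peak-to-rms ratio can be seen in a small numerical sketch. It assumes a flat-spectrum sum of 16 equal-amplitude harmonics and uses a quadratic (Schroeder-style) phase as a stand-in for dispersion; this is not the paper's radar-derived solution, which adapts to pitch and spectral envelope. The name `crest_factor` and all parameter values are illustrative assumptions.

```python
import math

def crest_factor(phases, num_samples=1024):
    """Peak-to-rms ratio over one period of a sum of equal-amplitude harmonics."""
    samples = []
    for n in range(num_samples):
        t = 2 * math.pi * n / num_samples
        samples.append(sum(math.cos((k + 1) * t + ph)
                           for k, ph in enumerate(phases)))
    peak = max(abs(x) for x in samples)
    rms = math.sqrt(sum(x * x for x in samples) / num_samples)
    return peak / rms

K = 16  # number of harmonics (assumed)

# All harmonics in phase: every cosine peaks at the same instant.
zero_phase = [0.0] * K

# Quadratic phase across harmonic number h = 1..K (assumed dispersion rule).
quadratic_phase = [-math.pi * h * (h - 1) / K for h in range(1, K + 1)]

crest_zero = crest_factor(zero_phase)
crest_dispersed = crest_factor(quadratic_phase)
print(crest_zero, crest_dispersed)
```

Both signals have the same magnitude spectrum, so the comparison isolates the contribution of phase alone to the peak-to-rms ratio.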

Noise reduction using a soft-decision sine-wave vector quantizer

Published in:
Proc. IEEE Int. Conf. on Acoustics, Speech and Signal Processing, ICASSP, Vol. 2, Speech Processing 2; VLSI, Audio and Electroacoustics, 3-6 April 1990, pp. 821-824.

Summary

The need for noise reduction arises in speech communication channels, such as ground-to-air transmission and ground-based cellular radio, to improve vocoder quality and speech recognition accuracy. In this paper, noise reduction is performed in the context of a high-quality harmonic zero-phase sine-wave analysis/synthesis system which is characterized by sine-wave amplitudes, a voicing probability, and a fundamental frequency. Least-squared error estimation of a harmonic sine-wave representation leads to a "soft decision" template estimate consisting of sine-wave amplitudes and a voicing probability. The least-squares solution is modified to use template-matching with "nearest neighbors." The reconstruction is improved by using the modified least-squares solution only in spectral regions with low signal-to-noise ratio. The results, although preliminary, provide evidence that harmonic zero-phase sine-wave analysis/synthesis, combined with effective estimation of sine-wave amplitudes and probability of voicing, offers a promising approach to noise reduction.

An approach to co-channel talker interference suppression using a sinusoidal model for speech

Published in:
IEEE Trans. Acoust. Speech Signal Process., Vol. 38, No. 1, January 1990, pp. 56-59.

Summary

This paper describes a new approach to co-channel talker interference suppression based on a sinusoidal representation of speech. The technique fits a sinusoidal model to additive vocalic speech segments such that the least mean-squared error between the model and the summed waveforms is obtained. Enhancement is achieved by synthesizing a waveform from the sine waves attributed to the desired speaker. Least-squares estimation is applied to obtain sine-wave amplitudes and phases of both talkers, based on either a priori sine-wave frequencies or a priori fundamental frequency contours. When the frequencies of the two waveforms are closely spaced, the performance is significantly improved by exploiting the time evolution of the sinusoidal parameters across multiple analysis frames. The least-squared error approach is also extended, under restricted conditions, to estimate fundamental frequency contours of both speakers from the summed waveforms. The results obtained, although limited in their scope, provide evidence that the sinusoidal analysis/synthesis model with effective parameter estimation techniques offers a promising approach to the problem of co-channel talker interference suppression over a range of conditions.
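
The least-squares fit with a priori frequencies can be sketched in a few lines. This is a single-frame toy under stated assumptions, not the paper's system: the frequencies, amplitudes, frame length, and the helpers `solve_linear`, `fit_sines`, and `resynthesize` are all hypothetical, and the two synthetic "talkers" are stationary sums of two sine waves each.

```python
import math

def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))) / m[r][r]
    return x

def fit_sines(waveform, freqs):
    """Least-squares cos/sin coefficients for known frequencies (rad/sample)."""
    basis = []
    for w in freqs:
        basis.append([math.cos(w * n) for n in range(len(waveform))])
        basis.append([math.sin(w * n) for n in range(len(waveform))])
    # Normal equations: (B B^T) c = B y
    gram = [[sum(u * v for u, v in zip(bi, bj)) for bj in basis] for bi in basis]
    rhs = [sum(u * y for u, y in zip(bi, waveform)) for bi in basis]
    return solve_linear(gram, rhs), basis

def resynthesize(coeffs, basis, keep):
    """Sum only the basis pairs whose sine-wave index is in `keep`."""
    out = [0.0] * len(basis[0])
    for idx, (c, b) in enumerate(zip(coeffs, basis)):
        if idx // 2 in keep:
            for i, v in enumerate(b):
                out[i] += c * v
    return out

# Hypothetical frame: two talkers, two sine waves each, frequencies known.
N = 200
freqs_a = [0.20, 0.40]
freqs_b = [0.27, 0.54]
talker_a = [math.cos(0.20 * n + 0.3) + 0.5 * math.cos(0.40 * n - 1.0) for n in range(N)]
talker_b = [0.8 * math.cos(0.27 * n + 1.1) + 0.4 * math.cos(0.54 * n) for n in range(N)]
mixture = [a + b for a, b in zip(talker_a, talker_b)]

coeffs, basis = fit_sines(mixture, freqs_a + freqs_b)
estimate_a = resynthesize(coeffs, basis, keep={0, 1})  # sine waves of talker A
max_error = max(abs(e - a) for e, a in zip(estimate_a, talker_a))
print(max_error)
```

Each cosine/sine coefficient pair encodes one sine wave's amplitude and phase, so keeping only the desired talker's pairs amounts to synthesizing from the sine waves attributed to that talker. When the two sets of frequencies nearly coincide, the normal equations become ill-conditioned, which is the situation the summary says benefits from using multiple analysis frames.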