Publications
Phonologically-based biomarkers for major depressive disorder
Summary
Summary
Of increasing importance in the civilian and military population is the recognition of major depressive disorder at its earliest stages and intervention before the onset of severe symptoms. Toward the goal of more effective monitoring of depression severity, we introduce vocal biomarkers that are derived automatically from phonologically-based measures of...
A time-warping framework for speech turbulence-noise component estimation during aperiodic phonation
Summary
Summary
The accurate estimation of turbulence noise affects many areas of speech processing including separate modification of the noise component, analysis of degree of speech aspiration for treating pathological voice, the automatic labeling of speech voicing, as well as speaker characterization and recognition. Previous work in the literature has provided methods...
Multi-pitch estimation by a joint 2-D representation of pitch and pitch dynamics
Summary
Summary
Multi-pitch estimation of co-channel speech is especially challenging when the underlying pitch tracks are close in pitch value (e.g., when pitch tracks cross). Building on our previous work, we demonstrate the utility of a two-dimensional (2-D) analysis method of speech for this problem by exploiting its joint representation of pitch...
Voice production mechanisms following phonosurgical treatment of early glottic cancer
Summary
Summary
Although near-normal conversational voices can be achieved with the phonosurgical management of early glottic cancer, there are still acoustic and aerodynamic deficits in vocal function that must be better understood to help further optimize phonosurgical interventions. Stroboscopic assessment is inadequate for this purpose. A newly discovered color high-speed videoendoscopy (HSV)...
Preserving the character of perturbations in scaled pitch contours
Summary
Summary
The global and fine dynamic components of a pitch contour in voice production, as in the speaking and singing voice, are important for both the meaning and character of an utterance. In speech, for example, slow pitch inflections, rapid pitch accents, and irregular regions all comprise the pitch contour. In...
High-pitch formant estimation by exploiting temporal change of pitch
Summary
Summary
This paper considers the problem of obtaining an accurate spectral representation of speech formant structure when the voicing source exhibits a high fundamental frequency. Our work is inspired by auditory perception and physiological studies implicating the use of pitch dynamics in speech by humans. We develop and assess signal processing...
Sinewave parameter estimation using the fast Fan-Chirp Transform
Summary
Summary
Sinewave analysis/synthesis has long been an important tool for audio analysis, modification and synthesis [1]. The recently introduced Fan-Chirp Transform (FChT) [2,3] has been shown to improve the fidelity of sinewave parameter estimates for a harmonic audio signal with rapid frequency modulation [4]. A fast version of the FChT [3]...
Towards co-channel speaker separation by 2-D demodulation of spectrograms
Summary
Summary
This paper explores a two-dimensional (2-D) processing approach for co-channel speaker separation of voiced speech. We analyze localized time-frequency regions of a narrowband spectrogram using 2-D Fourier transforms and propose a 2-D amplitude modulation model based on pitch information for single and multi-speaker content in each region. Our model maps...
2-D processing of speech for multi-pitch analysis.
Summary
Summary
This paper introduces a two-dimensional (2-D) processing approach for the analysis of multi-pitch speech sounds. Our framework invokes the short-space 2-D Fourier transform magnitude of a narrowband spectrogram, mapping harmonically related signal components to multiple concentrated entities in a new 2-D space. First, localized time-frequency regions of the spectrogram are...
Time-varying autoregressive tests for multiscale speech analysis
Summary
Summary
In this paper we develop hypothesis tests for speech waveform nonstationarity based on time-varying autoregressive models, and demonstrate their efficacy in speech analysis tasks at both segmental and sub-segmental scales. Key to the successful synthesis of these ideas is our employment of a generalized likelihood ratio testing framework tailored to...