Summary
Regions of nonmodal phonation, which exhibit deviations from uniform glottal-pulse periods and amplitudes, occur often in speech and convey information about linguistic content, speaker identity, and vocal health. Some aspects of these deviations are random, including small perturbations, known as jitter and shimmer, as well as more significant aperiodicities. Other aspects are deterministic, including repeating patterns of fluctuations such as diplophonia and triplophonia. These deviations are often the source of misinterpretation of the spectrum. In this paper, we introduce a general signal-processing framework for interpreting the effects of both stochastic and deterministic aspects of nonmodality on the short-time spectrum. As an example, we show that the spectrum is sensitive to even small perturbations in the timing and amplitudes of glottal pulses. In addition, we illustrate important characteristics that can arise in the spectrum, including apparent shifting of the harmonics and the appearance of multiple pitches. For stochastic perturbations, we arrive at a formulation of the power-spectral density as the sum of a low-pass line spectrum and a high-pass noise floor. Our findings are relevant to a number of speech-processing areas including linear-prediction analysis, sinusoidal analysis-synthesis, spectrally derived features, and the analysis of disordered voices.