Publications
Audio-visual identity grounding for enabling cross media search
Summary
Automatically searching for media clips in large heterogeneous datasets is an inherently difficult challenge, and nearly impossible when searching across distinct media types (e.g., finding audio clips that match an image). In this paper we introduce the exploitation of identity grounding for enabling this cross media search and exploration...
Vocal-source biomarkers for depression - a link to psychomotor activity
Summary
A hypothesis in characterizing human depression is that change in the brain's basal ganglia results in a decline of motor coordination. Such a neuro-physiological change may therefore affect laryngeal control and dynamics. Under this hypothesis, toward the goal of objective monitoring of depression severity, we investigate vocal-source biomarkers for depression...
Investigating acoustic correlates of human vocal fold vibratory phase asymmetry through modeling and laryngeal high-speed videoendoscopy
Summary
Vocal fold vibratory asymmetry is often associated with inefficient sound production through its impact on source spectral tilt. This association is investigated in both a computational voice production model and a group of 47 human subjects. The model provides indirect control over the degree of left-right phase asymmetry within a...
Face recognition despite missing information
Summary
Missing or degraded information continues to be a significant practical challenge facing automatic face representation and recognition. Generally, existing approaches seek either to generatively invert the degradation process or find discriminative representations that are immune to it. Ideally, the solution to this problem exists between these two perspectives. To this...
Automatic detection of depression in speech using Gaussian mixture modeling with factor analysis
Summary
Of increasing importance in the civilian and military population is the recognition of Major Depressive Disorder at its earliest stages and intervention before the onset of severe symptoms. Toward the goal of more effective monitoring of depression severity, we investigate automatic classifiers of depression state that have the important property...
USSS-MITLL 2010 human assisted speaker recognition
Summary
The United States Secret Service (USSS) teamed with MIT Lincoln Laboratory (MIT/LL) in the US National Institute of Standards and Technology's 2010 Speaker Recognition Evaluation of Human Assisted Speaker Recognition (HASR). We describe our qualitative and automatic speaker comparison processes and our fusion of these processes, which are adapted from...
Voice production mechanisms following phonosurgical treatment of early glottic cancer
Summary
Although near-normal conversational voices can be achieved with the phonosurgical management of early glottic cancer, there are still acoustic and aerodynamic deficits in vocal function that must be better understood to help further optimize phonosurgical interventions. Stroboscopic assessment is inadequate for this purpose. A newly discovered color high-speed videoendoscopy (HSV)...
Cognitive services for the user
Summary
Software-defined cognitive radios (CRs) use voice as a primary input/output (I/O) modality and are expected to have substantial computational resources capable of supporting advanced speech- and audio-processing applications. This chapter extends previous work on speech applications (e.g., [1]) to cognitive services that enhance military mission capability by capitalizing on automatic...