MIT Lincoln Laboratory team takes honors at the 2014 Audio/Visual Emotion Challenge and Workshop

The Laboratory's technology for automatic assessment of depression severity earns a second consecutive first place in AVEC subchallenge

A team from MIT Lincoln Laboratory's Bioengineering Systems and Technologies Group was named a first-place subchallenge winner at the 2014 Audio/Visual Emotion Challenge and Workshop (AVEC 2014), the fourth annual competition that invites participants to use multimedia processing and machine learning to analyze subjects’ emotional states or estimate subjects’ level of depression.

Held at the annual Association for Computing Machinery (ACM) International Conference on Multimedia, the challenge gauges the success of entrants’ approaches to automated emotion detection on a set of common benchmarks. In 2014, two subchallenges were presented: continuously distinguishing emotions and estimating the level of subjects’ depression from audio and visual data.

The Lincoln Laboratory team that developed the algorithms to estimate depression severity through vocal and facial biomarkers was awarded first place in the depression assessment category of both the 2013 and 2014 Audio/Visual Emotion Challenges. Members of the team include, standing, left to right, Gregory Ciccarelli, James Williamson, and Thomas Quatieri, and seated, left to right, Rachelle Horwitz-Martin, Brian Helfer, and Bea Yu. Team member Daryush Mehta was not present at the photo session.

Of the 14 groups competing in the 2014 depression assessment subchallenge, Lincoln Laboratory's team was the most successful in predicting a depression score. Participants in this subchallenge estimate the severity of subjects' depression from either vocal characteristics detected in audio or facial signs identified in video recordings, or both. Because people with major depressive disorder often exhibit altered motor control that affects the mechanisms controlling speech production and facial expression, changes in motor outputs inferred from speech acoustics and facial movements may indicate depression. In the subchallenge, competitors' estimations are compared to previously determined self-reported assessment scores of the subjects' depressive severity; scores are based on the Beck Depression Inventory, an evaluation tool used widely by mental-health professionals and researchers. In 2013, the Laboratory's team also took first place in the AVEC depression assessment challenge.

"This year we used both speech and facial expression to determine depression levels. To exploit speech data, our team used novel biomarkers based on phoneme-dependent speaking rate and timing, and on incoordination of vocal tract articulators, as we did in AVEC 2013. In addition, we introduced vocal features that reflect the timing and coordination between articulators and the speech production source at the vocal folds. In 2014, we also introduced biomarkers based on the timing and coordination of facial features that reflect muscle groups underlying facial expression during speech production. Our vocal and facial biomarkers together formed the basis for predicting depression scores," says Dr. Thomas Quatieri, a senior technical staff member on the AVEC team.

Concept illustration for vocal and facial biomarkers: Depression-induced psychomotor retardation alters motor timing and coordination of vocal articulators and facial muscles. The resulting features are based on phoneme-dependent speaking rates and pitch dynamics, coordination of vocal articulators and vocal folds, and coordination and timing of facial muscle units during speech production.

The suite of algorithms that Lincoln Laboratory researchers used to predict Beck Depression Inventory ratings combines complementary features in Gaussian mixture model and extreme learning machine classifiers. "We were given training data (with known Beck scores) from which to build a prediction model. At the challenge, we used this model with new test data to demonstrate our technique's capability in predicting Beck scores. Although the speech samples were in German and our biomarkers were designed with English, the biomarkers were applied effectively, indicating an independence across some languages," says Quatieri, who noted that this year's challenge was more difficult than the 2013 one because far less training data was provided.
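For readers curious about the train-then-predict workflow Quatieri describes, the following is a minimal sketch of an extreme learning machine used as a regressor: a hidden layer with fixed random weights, followed by output weights fit by regularized least squares. It is an illustration only, not the team's implementation. The function names, the hidden-layer size, the ridge penalty, and the synthetic stand-in for features and Beck-style scores are all assumptions made for this example.

```python
import numpy as np


def train_elm(X, y, n_hidden=50, ridge=1e-3, seed=0):
    """Fit a basic extreme learning machine regressor (illustrative only)."""
    rng = np.random.default_rng(seed)
    # Hidden-layer weights are drawn at random and never updated.
    W = rng.normal(size=(X.shape[1], n_hidden))
    b = rng.normal(size=n_hidden)
    H = np.tanh(X @ W + b)
    # Only the output weights are learned, via ridge-regularized least squares.
    beta = np.linalg.solve(H.T @ H + ridge * np.eye(n_hidden), H.T @ y)
    return W, b, beta


def predict_elm(model, X):
    """Apply a trained model to new feature vectors."""
    W, b, beta = model
    return np.tanh(X @ W + b) @ beta


def rmse(y_true, y_pred):
    """Root-mean-square error between known and predicted scores."""
    return float(np.sqrt(np.mean((np.asarray(y_true) - np.asarray(y_pred)) ** 2)))


# Synthetic stand-in data (an assumption): 100 subjects, 5 features each,
# with scores that depend on the first two features plus noise.
rng = np.random.default_rng(1)
X_train = rng.normal(size=(100, 5))
y_train = 20 + 5 * X_train[:, 0] - 3 * X_train[:, 1] + rng.normal(size=100)

model = train_elm(X_train, y_train)
print("training RMSE:", rmse(y_train, predict_elm(model, X_train)))
```

In a setting like the challenge, the model would be built from the labeled training recordings and then applied to held-out test recordings whose scores are not visible to the competitors.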

"It was exciting to extend our previous voice-only feature approaches used in AVEC 2013 to analyzing facial dynamics from video. This extension was accomplished by extracting similar signatures of depression that were based on characterizations of multivariate timing and coordination," says Dr. James Williamson, a technical staff member also on the AVEC team.
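As a rough illustration of what a "characterization of multivariate timing and coordination" could look like, the sketch below stacks time-delayed copies of several signal channels, computes their correlation matrix, and summarizes it by its eigenvalue spectrum. The delay-embedding approach, the function names, and the parameter values are assumptions chosen for this example; the article does not specify the team's actual feature computation. The same routine could in principle be applied to vocal tracks or to facial-movement tracks.

```python
import numpy as np


def delay_embed(signals, n_delays, spacing):
    """Stack time-delayed copies of each channel (rows = channel/delay pairs)."""
    n_channels, n_samples = signals.shape
    usable = n_samples - (n_delays - 1) * spacing
    if usable < 2:
        raise ValueError("signal too short for the requested delays")
    rows = [
        signals[c, d * spacing : d * spacing + usable]
        for c in range(n_channels)
        for d in range(n_delays)
    ]
    return np.vstack(rows)


def coordination_spectrum(signals, n_delays=4, spacing=2):
    """Eigenvalues (largest first) of the channel-delay correlation matrix."""
    embedded = delay_embed(signals, n_delays, spacing)
    corr = np.corrcoef(embedded)  # each row is treated as one variable
    return np.linalg.eigvalsh(corr)[::-1]


# Synthetic example (an assumption): three channels driven by one shared
# slow oscillation versus three unrelated noise channels.
rng = np.random.default_rng(0)
t = np.arange(500)
shared = np.sin(0.05 * t)
coupled = np.vstack([shared + 0.05 * rng.normal(size=t.size) for _ in range(3)])
independent = rng.normal(size=(3, t.size))

print("coupled, top eigenvalue:    ", coordination_spectrum(coupled)[0])
print("independent, top eigenvalue:", coordination_spectrum(independent)[0])
```

Tightly coupled channels concentrate most of the variance in a few large eigenvalues, while loosely coupled channels spread it more evenly, so the shape of the spectrum gives a compact description of how the channels move together over time.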

Quatieri and Williamson, along with colleagues Brian Helfer and Gregory Ciccarelli from the Bioengineering Systems and Technologies Group and consultant Dr. Daryush Mehta of Massachusetts General Hospital, helped develop the technology that led to the 2014 AVEC win. In addition, Lincoln Laboratory staff members Bea Yu and Rachelle Horwitz-Martin contributed to the earlier AVEC 2013 win.

The team's recent work may lead to research on features for depression assessment based on other cross modalities involving muscle coordination and timing, such as coordination between articulators and facial muscle activation. Lincoln Laboratory's biomarker technology, which has shown good results in predicting an individual's cognitive state, is also being explored for use in evaluating the severity of other neurological disorders, such as traumatic brain injury and dementia.

In addition, the team is collaborating with Dr. Satra Ghosh and Dr. John Gabrieli from the MIT Brain and Cognitive Science Department to develop computational models of speech production in the disordered brain by merging knowledge of neurological disorders, computational modeling, and speech signal processing. "We are successfully using the same principles of articulatory timing and coordination in other neurological disorders and thus feel we may have discovered a common vocal feature basis for neurocognitive decline. Our collaborative work with the MIT Brain and Cognitive Science Department may provide a neural foundation for this hypothesis and lead to even more effective biomarkers," says Quatieri.

Both the medical and the military communities are very interested in developing tools that quantitatively measure levels of depression and other neurological disorders. According to the National Institutes of Health, major depressive disorder strikes about 6.7% of U.S. adults each year. The U.S. Department of Veterans Affairs' National Center for Post-traumatic Stress Disorder (PTSD) estimates that PTSD will afflict about 7–8% of the U.S. population sometime during their lives; affects, in any given year, 11–20% of veterans of the various conflicts in the Middle East; and has been diagnosed in almost 30% of Vietnam War veterans. In a March 2014 issue, the Journal of the American Medical Association (JAMA) Psychiatry reported on a study stating that almost 25% of 5500 active-duty, nondeployed military personnel surveyed were assessed as having a mental disorder of some type.

Objective predictors of depression severity, such as the tools used in the AVEC challenge, could supplement the currently used diagnostic techniques that rely on patients' self-reporting and clinicians' subjective assessments. In addition, because these tools exploit subtleties in speech and facial movement that may not be detected by clinicians' observations, they may enable earlier diagnoses of depression.

Helfer summed up the significance of the AVEC recognition: "This win really helps to validate our work and brings us one step closer to transitioning our findings to the general public."

Posted December 2014