Pinpointing the signs of depression
A Lincoln Laboratory team developed a system that uses vocal and facial biomarkers to evaluate depression
By Meg Cichon | Communications and Community Outreach Office
Nearly every hour, one American veteran commits suicide, and nearly every day, one active-duty service member takes his or her life.1 The military suicide rate reached an all-time high in 2012, and this sobering statistic led President Barack Obama to label military suicides an epidemic and to call for an "all-hands-on-deck approach" to mental health.2 According to the U.S. Department of Veterans Affairs, depression is five times more common in soldiers than in civilians,3 and depressed patients are more likely to take their own lives.4 The leading cause of disability in America, depression affects more than 18 million adults and costs more than $210 billion per year in healthcare expenses and lost workplace productivity.5 Both the military and the medical community are interested in new ways to diagnose, treat, and monitor depression.
Answering the president’s call for action on improving mental health services, Lincoln Laboratory researchers began in 2012 to develop a system to improve the accuracy and efficiency of detecting and monitoring depression, which is traditionally diagnosed through subjective testing. The system uses mathematical algorithms that rely on vocal biomarkers to quantitatively predict the severity of the disorder. Since then, the research team has incorporated algorithms that analyze facial biomarkers into the system and, in collaboration with MIT campus, is developing a mobile application that will allow users to evaluate their depression levels daily.
"We're interested in analyzing speech," says Thomas Quatieri of the Bioengineering Systems and Technologies Group at MIT Lincoln Laboratory. Rather than studying what people talk about, Laboratory researchers are examining how people talk. Quatieri and his team are researching speech because they believe that changes in motor outputs inferred from speech acoustics and facial movements during speech may indicate depression: major depressive disorder (MDD) patients often experience neuromotor changes that affect the mechanisms controlling speech production and facial expression. Because depression symptoms are typically subtle, the disease is difficult to diagnose and treat; according to research, doctors misdiagnose depression in about 53% of cases.6 To adequately address the complex problem of detecting depression, Quatieri’s vocal and facial biomarkers (VFB) team members—James Williamson, Gregory Ciccarelli, Brian Helfer, Daryush Mehta, Bea Yu, Rachelle Horwitz, and Jeffrey Palmer—have expertise that spans advanced signal processing, machine learning, neural and speech science, and cognitive psychology.
To detect depression, the VFB system collects training data from dozens of subjects diagnosed with various levels of depression and processes these data to extract vocal and facial biomarkers. Using these biomarkers, the system identifies VFB trends found across all subject data to create a predictive model that estimates depression severity. When a new patient records audio and video samples (i.e., test data) via a microphone and camera attached to a computer, the model estimates the severity of the patient’s depression from vocal characteristics extracted from the audio recordings, facial signs identified in the video recordings, or both. These characteristics present as subtle changes in the timing, coordination, and frequency of speech and facial movements. To gather training and testing data, researchers ask subjects to record a two- to five-minute sample of themselves reading a predetermined passage, repeating certain sounds, and/or speaking freely. Although free speech yields usable data, team member James Williamson prefers that subjects read predetermined passages: when people must think about what they are going to say, their speech may slow, skewing the predictive model with biased timing and movements.
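The train-then-predict pipeline described above can be sketched in a few lines. Everything below is an illustrative assumption — random stand-in "biomarker" features and a simple linear model — since the article does not specify the Laboratory's actual feature set or modeling approach:

```python
import numpy as np

# Hypothetical training data: one row of biomarker features per subject
# (e.g., timing and coordination measures), paired with a clinician-assessed
# depression severity score. Values here are synthetic placeholders.
rng = np.random.default_rng(0)
n_subjects, n_features = 40, 6
X_train = rng.normal(size=(n_subjects, n_features))
true_weights = np.array([2.0, -1.5, 0.0, 3.0, 0.5, -0.5])
y_train = X_train @ true_weights + rng.normal(scale=0.5, size=n_subjects)

# Fit a simple linear predictive model (least squares with an intercept term).
A = np.column_stack([X_train, np.ones(n_subjects)])
coef, *_ = np.linalg.lstsq(A, y_train, rcond=None)

def predict_severity(biomarkers):
    """Estimate depression severity from a new subject's biomarker vector."""
    return float(np.append(biomarkers, 1.0) @ coef)

new_subject = rng.normal(size=n_features)
print(round(predict_severity(new_subject), 2))
```

In a real system the feature extraction step — turning raw audio and video into biomarker vectors — would dominate the work; this sketch only shows the mapping from extracted biomarkers to a severity estimate.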
The VFB system studies speech characteristics in two ways. First, it analyzes each phoneme—a distinct unit of sound—to identify areas of speech degradation, e.g., slowed or slurred speech. A depressed person's speech may slow during certain phonemes, such as the "a" sound, or quicken during others, such as the "s" sound, because of a decline in motor control. Second, the system identifies vocal coordination features. "When people speak, their vocal tracts move, changing the resonances underlying their voices," says Quatieri. "Depressed people have changes in their vocal expressions that may stem in part from resonance fluctuations caused by alterations in articulatory coordination. These vocal changes may not be detected by the human ear, so we are aiming to detect variations in the precise synchrony of the tongue, lips, jaw, and soft palate in the depressed state."
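One common way to quantify the kind of articulatory coordination Quatieri describes is to correlate parallel signal tracks (such as vocal-tract resonance trajectories) at several time delays and summarize the resulting correlation matrix by its eigenvalue spectrum. The sketch below uses synthetic sinusoidal "tracks" and an arbitrary delay set — both are illustrative assumptions, not the team's published configuration:

```python
import numpy as np

def coordination_features(channels, delays=(0, 1, 2, 3)):
    """Build a channel-delay correlation matrix from parallel signal tracks
    (e.g., resonance trajectories, or facial-muscle intensities over time)
    and summarize their coupling via the matrix's eigenvalue spectrum.

    channels: array of shape (n_channels, n_frames)
    """
    n_channels, n_frames = channels.shape
    max_delay = max(delays)
    # Stack time-delayed copies of every channel so that correlations
    # capture how the signals co-vary across time offsets.
    stacked = np.vstack([
        channels[:, d:n_frames - max_delay + d] for d in delays
    ])
    corr = np.corrcoef(stacked)
    # Eigenvalues of the correlation matrix, largest first: a flatter
    # spectrum indicates weaker coupling (coordination) among channels.
    return np.linalg.eigvalsh(corr)[::-1]

# Example: three tightly coupled sinusoidal "tracks" with small jitter.
rng = np.random.default_rng(1)
t = np.linspace(0, 4 * np.pi, 200)
tracks = np.vstack([np.sin(t + phase) for phase in (0.0, 0.3, 0.6)])
tracks += rng.normal(scale=0.05, size=tracks.shape)
spectrum = coordination_features(tracks)
print(spectrum[:3])
```

For tightly coupled tracks like these, most of the variance concentrates in the first few eigenvalues; the hypothesis in the article is that depression subtly alters such coupling in ways the ear cannot detect.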
Using recorded video of depressed patients, the system also examines physical facial expressions through facial action units (FAUs). FAUs reflect the groups of muscles responsible for particular movements, e.g., smiling, frowning, and lowering or raising the eyebrows, with different muscle intensities for each particular expression. Applying principles similar to those used for vocal timing and coordination, the facial component of the VFB system measures muscle movements around the eyes, nose, and mouth and identifies changes in coordination that reflect expressions indicative of depression. Certain expressions in which facial features turn down, especially those involving disgust (e.g., nose wrinkling) or sadness (e.g., frowning), typically indicate depression. "Like vocal cues, most changes in muscle movement coordination are very slight, and clinicians typically cannot objectively recognize them," says Williamson.
With the VFBs derived from patient data, researchers apply statistical algorithms to map the biomarkers to a prediction of depression severity. Forming predictive models requires reference depression assessments, including the Beck Depression Inventory (BDI) and the Hamilton Rating Scale for Depression (HAM-D). The BDI is self-reported and the HAM-D clinician-administered; both consist of answers to questions about the symptoms of depression, such as hopelessness, irritability, insomnia, and anxiety. Values are assigned to each answer, and the total score, which typically ranges from 0 (low) to 50 (high), indicates the severity of the patient’s depression. When creating the predictive model, researchers compared each subject’s VFBs to BDI and HAM-D results to determine which biomarkers correlate most highly with depression. For example, after comparing the facial biomarkers of users with low and high traditional assessment scores, the team found that a jaw drop or a tightened lip is a more significant indicator of depression than a chin raise. With a predictive model in place, a new subject can record data, and the system estimates his or her depression level by using the model formed from the training data.
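Ranking biomarkers by how strongly they correlate with assessment scores, as described above, is straightforward to illustrate. The biomarker names echo the article's jaw-drop/chin-raise example, but the data and correlation strengths below are fabricated for illustration:

```python
import numpy as np

# Hypothetical data: each row is one subject's biomarker measurements;
# names and values are illustrative, not the study's actual data.
biomarker_names = ["jaw_drop", "lip_tightener", "chin_raise"]
rng = np.random.default_rng(2)
n_subjects = 30
bdi_scores = rng.uniform(0, 50, size=n_subjects)  # 0 (low) to 50 (high)
biomarkers = np.column_stack([
    0.8 * bdi_scores + rng.normal(scale=5, size=n_subjects),  # tracks BDI closely
    0.5 * bdi_scores + rng.normal(scale=8, size=n_subjects),  # tracks BDI loosely
    rng.normal(scale=10, size=n_subjects),                    # unrelated to BDI
])

# Rank biomarkers by the strength of their correlation with the BDI score,
# mirroring how one might decide which markers are most indicative.
correlations = {
    name: abs(np.corrcoef(biomarkers[:, i], bdi_scores)[0, 1])
    for i, name in enumerate(biomarker_names)
}
ranked = sorted(correlations, key=correlations.get, reverse=True)
print(ranked)
```

A production model would weigh many biomarkers jointly rather than screening them one at a time, but the correlation-ranking idea is the same.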
The VFB system accurately tracks increases and decreases in depression severity to within about eight points of the BDI score. This close prediction earned the Lincoln Laboratory team first place in the depression assessment category of both the 2013 and 2014 International Audio/Visual Emotion Challenges, held during the Association for Computing Machinery Multimedia Conference. In 2013, the team used algorithms to estimate depression severity through vocal biomarkers only; in 2014, they used both vocal and facial biomarkers. Because the system uses machine learning, its pattern recognition accuracy will increase as more subjects enter data. "As more patients use this system, our database will become broader, allowing the system to recognize more VFBs that indicate depression severity and enabling more accurate evaluation of users," says Williamson. However, the biggest challenge in improving accuracy is accounting for natural human variability and statistical confounders—such as age, physical sickness, and other co-occurring neurological disorders—that tend to skew scores. "Our team is continuously working to pinpoint features that only vary with depression," says Quatieri.
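The "within about eight points" figure describes prediction error against the BDI scale. A tiny, made-up example shows the kind of error metric involved:

```python
import numpy as np

# Illustrative accuracy check: mean absolute error between hypothetical
# system estimates and clinician-assessed BDI scores (all numbers invented).
bdi_true = np.array([5, 12, 20, 28, 35, 41], dtype=float)
bdi_pred = np.array([9, 10, 26, 24, 40, 35], dtype=float)
mae = np.mean(np.abs(bdi_pred - bdi_true))
print(mae)  # 4.5 here; the article reports errors within about 8 BDI points
```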
In collaboration with Satrajit Ghosh from the McGovern Institute for Brain Research at MIT, the Lincoln Laboratory research team is developing a mobile application called VoiceUp that will give users daily access to the VFB system. Users will speak into their device’s microphone while its camera records—either freely, akin to journaling, or by reading a standard passage and reciting specific sounds—and receive a depression-level evaluation, allowing them to consistently monitor their mental health. Quatieri sees the application as a potentially useful tool for monitoring and treating depression: "In the short term, VFBs may be used as a complement to psychiatric evaluation. In the long term, as more patients use the system and as more data are accumulated, this technology could help doctors diagnose patients, prescribe them a treatment plan, and assess whether or not the treatment plan works." VoiceUp currently supports data collection only on Android devices, but within the next year, the researchers aim to establish iPhone compatibility and to display depression-level predictions directly on the mobile device.
Looking beyond depression, the research team believes that the VFB system has the potential to evaluate other cognitive disorders, such as traumatic brain injury, dementia, post-traumatic stress disorder, and Parkinson’s disease. Because these disorders present, in part, neuromotor symptoms similar to those of depression, the timing and coordination principles of the VFB system may reveal a common vocal feature basis for neurocognitive decline in a range of disorders and thus may predict the presence or severity of those disorders. In addition, the team is collaborating with Ghosh and John Gabrieli, also from the McGovern Institute, to develop neural computational models of speech production in the disordered brain by combining current knowledge of neurological disorders, computational modeling, and speech signal processing. The collaborative work with the McGovern Institute may provide a neural foundation for vocal timing and coordination principles and lead to stronger clinical support.
"I think the VFB system will be effective, but we cannot be certain until we can validate system performance with large populations and until doctors and clinicians start using it," Quatieri says. "While we are using data collected from more than 100 participants to evaluate depression, we need hundreds more people to start using the system to regularly monitor their mental health and doctors to start using it in conjunction with treatment plans." Williamson adds, "With more data to analyze, we should be able to determine whether or not the VFB system can identify specific symptoms to predict mental disorders in a more proactive way."
Posted June 2016
1 U.S. Department of Veterans Affairs, "VA Issues New Report on Suicide Data," Feb. 2013, available at http://www.va.gov/opa/pressrel/pressrelease.cfm?id=2427
2 The White House Office of the Press Secretary, "Remarks by The First Lady and The President at Disabled American Veterans Convention," 10 Aug. 2013, available at https://www.whitehouse.gov/the-press-office/2013/08/10/remarks-first-lady-and-president-disabled-american-veterans-convention
3 American Foundation for Suicide Prevention, "Key Statistics," available at https://www.afsp.org/understanding-suicide/key-research-findings
4 Science Daily, "How can we prevent suicide? Major study shows risk factors associated with depression," 30 Aug. 2015, available at http://www.sciencedaily.com/releases/2015/08/150830152601.htm
5 World Health Organization, "Depression" Fact Sheet N°369, April 2016, available at http://www.who.int/mediacentre/factsheets/fs369/en/
6 Medscape Medical News, "Depression Often Misdiagnosed in Primary Care," 29 July 2009, available at http://www.medscape.com/viewarticle/706714