Publications

Multi-modal audio, video and physiological sensor learning for continuous emotion prediction

Summary

The automatic determination of emotional state from multimedia content is an inherently challenging problem with a broad range of applications including biomedical diagnostics, multimedia retrieval, and human-computer interfaces. The Audio Video Emotion Challenge (AVEC) 2016 provides a well-defined framework for developing and rigorously evaluating innovative approaches for estimating the arousal and valence states of emotion as a function of time. It presents the opportunity for investigating multimodal solutions that include audio, video, and physiological sensor signals. This paper provides an overview of our AVEC Emotion Challenge system, which uses multi-feature learning and fusion across all available modalities. It includes a number of technical contributions, including the development of novel high- and low-level features for modeling emotion in the audio, video, and physiological channels. Low-level features include modeling arousal in audio with minimal prosodic-based descriptors. High-level features are derived from supervised and unsupervised machine learning approaches based on sparse coding and deep learning. Finally, a state-space estimation approach is applied for score fusion that demonstrates the importance of exploiting the time-series nature of the arousal and valence states. The resulting system outperforms the baseline systems [10] on the test evaluation set, with an achieved Concordance Correlation Coefficient (CCC) of 0.770 vs. 0.702 (baseline) for arousal and 0.687 vs. 0.638 for valence. Future work will focus on exploiting the time-varying nature of individual channels in the multimodal framework.
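
The evaluation metric quoted above, the Concordance Correlation Coefficient, is standard and simple to compute. A minimal NumPy sketch (the function name and toy data below are ours, not from the paper):

```python
import numpy as np

def concordance_cc(pred, gold):
    """Concordance correlation coefficient (CCC), the AVEC evaluation metric.

    CCC = 2*cov(x, y) / (var(x) + var(y) + (mean(x) - mean(y))**2).
    Unlike plain Pearson correlation, it also penalizes predictions that
    are correlated with, but miscalibrated against, the gold-standard trace.
    """
    pred = np.asarray(pred, dtype=float)
    gold = np.asarray(gold, dtype=float)
    mx, my = pred.mean(), gold.mean()
    vx, vy = pred.var(), gold.var()            # population variances
    cov = ((pred - mx) * (gold - my)).mean()   # population covariance
    return 2.0 * cov / (vx + vy + (mx - my) ** 2)

# Toy check: a slightly noisy copy of the gold arousal trace scores near 1.
t = np.linspace(0.0, 10.0, 500)
gold = np.sin(t)
print(concordance_cc(gold + 0.05 * np.random.randn(t.size), gold))
```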

Detecting depression using vocal, facial and semantic communication cues

Summary

Major depressive disorder (MDD) is known to result in neurophysiological and neurocognitive changes that affect control of motor, linguistic, and cognitive functions. MDD's impact on these processes is reflected in an individual's communication via coupled mechanisms: vocal articulation, facial gesturing, and choice of content to convey in a dialogue. In particular, MDD-induced neurophysiological changes are associated with a decline in the dynamics and coordination of speech and facial motor control, while neurocognitive changes influence dialogue semantics. In this paper, biomarkers are derived from all of these modalities, drawing first from previously developed neurophysiologically motivated speech and facial coordination and timing features. In addition, a novel indicator of lower vocal tract constriction in articulation is incorporated that relates to vocal projection. Semantic features are analyzed for subject/avatar dialogue content using a sparse-coded lexical embedding space, and for contextual clues related to the subject's present or past depression status. The features and depression classification system were developed for the 6th International Audio/Video Emotion Challenge (AVEC), which provides data consisting of audio, video-based facial action units, and transcribed text of individuals communicating with a human-controlled avatar. A clinical Patient Health Questionnaire (PHQ) score and binary depression decision are provided for each participant. PHQ predictions were obtained by fusing outputs from a Gaussian staircase regressor for each feature set, with results on the development set of mean F1=0.81, RMSE=5.31, and MAE=3.34. These compare favorably to the challenge baseline development results of mean F1=0.73, RMSE=6.62, and MAE=5.52. On test set evaluation, our system obtained a mean F1=0.70, which is similar to the challenge baseline test result. Future work calls for consideration of joint feature analyses across modalities in an effort to detect neurological disorders based on the interplay of motor, linguistic, affective, and cognitive components of communication.
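
For reference, the three reported figures of merit are straightforward to reproduce from per-participant outputs. A minimal sketch, assuming "mean F1" is the macro-average of per-class F1 scores and that the binary decision is obtained by thresholding the predicted PHQ score at the usual clinical cutoff of 10 (both are our assumptions; the toy numbers are illustrative):

```python
import numpy as np
from sklearn.metrics import f1_score, mean_absolute_error, mean_squared_error

# Hypothetical per-participant PHQ predictions and ground truth.
phq_true = np.array([4, 12, 7, 18, 2, 10])
phq_pred = np.array([6, 11, 5, 15, 3, 9])

# Binary depression decision via the assumed clinical cutoff (PHQ >= 10).
y_true = (phq_true >= 10).astype(int)
y_pred = (phq_pred >= 10).astype(int)

mean_f1 = f1_score(y_true, y_pred, average="macro")   # mean of per-class F1
rmse = mean_squared_error(phq_true, phq_pred) ** 0.5
mae = mean_absolute_error(phq_true, phq_pred)
print(f"mean F1={mean_f1:.2f}, RMSE={rmse:.2f}, MAE={mae:.2f}")
```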

State of the art focal plane arrays of InP/InGaAsP Geiger-mode avalanche photodiodes for active electro-optical applications

Summary

MIT Lincoln Laboratory has developed InP/InGaAsP Geiger-mode avalanche photodiodes (Gm-APDs) and associated readout integrated circuits (ROICs) that have enabled numerous active optical systems over the past decade. Framed and asynchronous photon-timing ROIC architectures have been demonstrated. In recent years, efforts at MITLL have focused on technology development to advance the state of the art of framed Gm-APD focal plane arrays (FPAs), and a 256×128-pixel FPA with on-chip data thinning has been demonstrated.

Effects of humidity and surface on photoalignment of brilliant yellow

Summary

Controlling and optimising the alignment of liquid crystals is a crucial process for display applications. Here, we investigate the effects of humidity and surface type on photoalignment of the azo-dye brilliant yellow (BY). Specifically, the effect of humidity on the photoalignment of BY was studied at the stage of substrate storage before coating, during the spin-coating process, between film coating and exposure, and after exposure. Surprising results include the drastic effect of humidity during the spin-coating process, the ability of humidity annealing to increase the order of the BY layer after exposure, and the ability of dry annealing to stabilise the layer. Our results are interpreted in terms of the effect of water on the aggregation of BY. The type of surface studied had minimal effects: thin BY films (about 3 nm thick) were sensitive to the hydrophilicity of the surface, while thick BY films (about 30 nm thick) were not affected by changing the surface. The results of this paper allow for the optimisation of BY photoalignment for liquid crystal display applications as well as a better understanding of the BY photoalignment mechanism.

Use of Photoacoustic Excitation and Laser Vibrometry to Remotely Detect Trace Explosives

Summary

In this paper, we examine a laser-based approach to remotely initiate, measure, and differentiate acoustic and vibrational emissions from trace quantities of explosive materials against their environment. Using a pulsed ultraviolet laser (266 nm), we induce a significant (>100 Pa) photoacoustic response from small quantities of military-grade explosives. The photoacoustic signal, with frequencies predominantly between 100 and 500 kHz, is detected remotely via a wideband laser Doppler vibrometer. This two-laser system can be used to rapidly detect and discriminate explosives from ordinary background materials, which have a significantly weaker photoacoustic response. A 100 ng/cm² limit of detection is estimated. Photoablation is proposed as the dominant mechanism for the large photoacoustic signals generated by explosives.

Application of a resilience framework to military installations: a methodology for energy resilience business case decisions

Published in:
MIT Lincoln Laboratory Report TR-1216

Summary

The goal of the study was to develop and demonstrate an energy resilience framework at four DoD installations. This framework, predominantly focused on developing a business case, was established for broader application across the DoD. The methodology involves gathering data from an installation on critical energy load requirements and on energy costs and usage, quantifying the cost and performance of the installation's existing energy resilience solution, and then conducting an analysis of alternatives to evaluate new system designs. Improvements in data collection at the installation level, as recommended in this report, will further increase the fidelity of future analyses and the accuracy of the recommendations. Most importantly, increased collaboration between facility personnel and mission operators at the installation will encourage holistic solutions that improve both the life cycle costs and the resilience of the installation's energy systems and supporting infrastructure.
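
To make the analysis-of-alternatives step concrete, a minimal sketch of the kind of business-case comparison the methodology calls for: annualize each candidate design's life-cycle cost with a standard capital recovery factor and pair it with a simple resilience measure. All names and numbers below are illustrative assumptions, not figures from the report:

```python
# Annualized life-cycle cost via the standard capital recovery factor:
# CRF = i(1+i)^n / ((1+i)^n - 1), applied to capital cost, plus annual O&M.
def annualized_cost(capex, annual_om, lifetime_yr, discount=0.03):
    crf = discount * (1 + discount) ** lifetime_yr / ((1 + discount) ** lifetime_yr - 1)
    return capex * crf + annual_om

# Hypothetical alternatives, each with an illustrative resilience measure:
# hours of critical load served during an extended grid outage.
alternatives = {
    "existing diesel gensets": dict(capex=0, annual_om=250_000, lifetime_yr=10, hours_served=36),
    "gensets + PV + battery":  dict(capex=4_000_000, annual_om=120_000, lifetime_yr=20, hours_served=120),
}

for name, a in alternatives.items():
    cost = annualized_cost(a["capex"], a["annual_om"], a["lifetime_yr"])
    print(f"{name}: ${cost:,.0f}/yr for {a['hours_served']} h of critical load coverage")
```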

Crosstalk characterization and mitigation in Geiger-mode avalanche photodiode arrays

Summary

Intra-focal-plane-array (FPA) crosstalk is a primary limiter in the development of large, fine-pixel Geiger-mode avalanche photodiode (Gm-APD) arrays beyond 256×256 pixels. General analysis methods and results from MIT Lincoln Laboratory (MIT/LL) InP-based detector arrays are presented.

Biomimetic antenna array using non-Foster network to enhance directional sensitivity over broad frequency band

Published in:
IEEE Trans. Antennas Propag., Vol. 64, No. 10, October 2016, pp. 4297-4305.

Summary

Biologically inspired antenna arrays that mimic the hearing mechanism of insects are called biomimetic antenna arrays (BMAAs). They are attractive for microwave applications such as compact direction-finding systems. Earlier BMAAs were designed for phase enhancement over a narrow frequency band, whereas we now propose designing them around a non-Foster coupling network (NFC). Because NFCs are not restricted by the gain-bandwidth product, their incorporation in the design can provide wideband phase enhancement. A method for designing an NFC-based BMAA (NFC-BMAA) and for ensuring system stability is presented. Simulated and measured results of the fabricated structure are also presented and discussed.
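
The quantity being enhanced is the small inherent phase difference between two closely spaced elements, 2πd·sin(θ)/λ for element spacing d and arrival angle θ. A minimal sketch of that geometry (the enhancement factor of 10 is an illustrative assumption, not a figure from the paper):

```python
import numpy as np

def phase_difference_deg(d_m, freq_hz, theta_deg, enhancement=1.0):
    """Output phase difference between two elements spaced d_m apart."""
    lam = 3e8 / freq_hz                          # free-space wavelength
    dphi = 2 * np.pi * d_m / lam * np.sin(np.radians(theta_deg))
    return np.degrees(enhancement * dphi)

# Elements lambda/20 apart at 2.4 GHz, source 30 degrees off boresight:
d = (3e8 / 2.4e9) / 20
print(phase_difference_deg(d, 2.4e9, 30))                   # ~9 deg, bare array
print(phase_difference_deg(d, 2.4e9, 30, enhancement=10))   # ~90 deg, coupled
```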

Side channel authenticity discriminant analysis for device class identification

Summary

Counterfeit microelectronics present a significant challenge to commercial and defense supply chains. Many modern anti-counterfeit strategies rely on manufacturer cooperation to include additional identification components. We instead propose Side Channel Authenticity Discriminant Analysis (SICADA), which leverages physical phenomena that manifest during device operation to match suspect parts to a class of authentic parts. This paper examines the extent to which power dissipation information can be used to separate unique classes of devices. A methodology for distinguishing device types is presented and tested on both simulation data from a custom circuit and empirical measurements of Microchip dsPIC33F microcontrollers. Experimental results show that power side channels contain sufficient distinguishing information to identify parts as authentic or suspect counterfeit.
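
As a rough illustration of the matching step (the paper's actual features and classifier are not detailed here, so the statistics, simulated traces, and use of scikit-learn's LDA below are all our assumptions), one can summarize each power trace with a few features, fit a discriminant model on known-authentic device classes, and classify new traces against those classes:

```python
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

rng = np.random.default_rng(0)

def trace_features(trace):
    """Toy feature vector: mean, spread, and low-frequency spectral energy."""
    return [trace.mean(), trace.std(), np.abs(np.fft.rfft(trace))[1:4].sum()]

# Simulated power traces for two authentic device classes.
class_a = [trace_features(1.0 + 0.05 * rng.standard_normal(1000)) for _ in range(40)]
class_b = [trace_features(1.2 + 0.08 * rng.standard_normal(1000)) for _ in range(40)]

X = np.array(class_a + class_b)
y = np.array([0] * 40 + [1] * 40)
lda = LinearDiscriminantAnalysis().fit(X, y)

# A suspect part is matched to the authentic class it most resembles;
# a real system would also threshold the likelihood to flag outliers.
suspect = np.array([trace_features(1.05 + 0.06 * rng.standard_normal(1000))])
print(lda.predict(suspect), lda.predict_proba(suspect).round(3))
```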

How deep neural networks can improve emotion recognition on video data

Published in:
ICIP: 2016 IEEE Int. Conf. on Image Processing, 25-28 September 2016.

Summary

We consider the task of dimensional emotion recognition on video data using deep learning. While several previous methods have shown the benefits of training temporal neural network models such as recurrent neural networks (RNNs) on hand-crafted features, few works have considered combining convolutional neural networks (CNNs) with RNNs. In this work, we present a system that performs emotion recognition on video data using both CNNs and RNNs, and we also analyze how much each neural network component contributes to the system's overall performance. We present our findings on videos from the Audio/Visual+Emotion Challenge (AV+EC 2015). In our experiments, we analyze the effects of several hyperparameters on overall performance while also achieving performance superior to the baseline and other competing methods.
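
The CNN-plus-RNN pattern the paper studies can be sketched compactly: a small CNN embeds each frame, an RNN integrates the embeddings over time, and a linear head regresses a continuous emotion value per frame. Layer sizes and the choice of an LSTM below are our assumptions for illustration, not the paper's architecture:

```python
import torch
import torch.nn as nn

class CnnRnnRegressor(nn.Module):
    def __init__(self, hidden=128):
        super().__init__()
        self.cnn = nn.Sequential(                       # per-frame embedding
            nn.Conv2d(3, 16, 5, stride=2, padding=2), nn.ReLU(),
            nn.Conv2d(16, 32, 5, stride=2, padding=2), nn.ReLU(),
            nn.AdaptiveAvgPool2d(4), nn.Flatten(),      # -> 32*4*4 = 512
        )
        self.rnn = nn.LSTM(512, hidden, batch_first=True)  # temporal model
        self.head = nn.Linear(hidden, 1)                # e.g., valence

    def forward(self, frames):                          # (B, T, 3, H, W)
        b, t = frames.shape[:2]
        feats = self.cnn(frames.flatten(0, 1)).view(b, t, -1)
        out, _ = self.rnn(feats)
        return self.head(out).squeeze(-1)               # (B, T) per-frame value

model = CnnRnnRegressor()
print(model(torch.randn(2, 8, 3, 96, 96)).shape)        # torch.Size([2, 8])
```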