Publications

How deep neural networks can improve emotion recognition on video data

Published in:
ICIP: 2016 IEEE Int. Conf. on Image Processing, 25-28 September 2016.

Summary

We consider the task of dimensional emotion recognition on video data using deep learning. While several previous methods have shown the benefits of training temporal neural network models such as recurrent neural networks (RNNs) on hand-crafted features, few works have considered combining convolutional neural networks (CNNs) with RNNs. In this work, we present a system that performs emotion recognition on video data using both CNNs and RNNs, and we also analyze how much each neural network component contributes to the system's overall performance. We present our findings on videos from the Audio/Visual+Emotion Challenge (AV+EC2015). In our experiments, we analyze the effects of several hyperparameters on overall performance while also achieving superior performance to the baseline and other competing methods.
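The CNN-plus-RNN pipeline described above can be sketched roughly as follows. This is a minimal illustration, not the paper's architecture: the feature dimensions, frame count, and the simple Elman-style recurrence are all assumptions chosen for brevity, and random vectors stand in for per-frame CNN features.

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-in for per-frame CNN features: T frames, each a D-dim vector.
T, D, H = 10, 8, 4
frames = rng.standard_normal((T, D))

# Elman-style RNN cell: h_t = tanh(W_x x_t + W_h h_{t-1})
W_x = rng.standard_normal((H, D)) * 0.1
W_h = rng.standard_normal((H, H)) * 0.1
w_out = rng.standard_normal(H) * 0.1   # regress a scalar (e.g., valence)

h = np.zeros(H)
predictions = []
for x in frames:
    h = np.tanh(W_x @ x + W_h @ h)     # recurrent state carries temporal context
    predictions.append(w_out @ h)      # per-frame dimensional emotion estimate

predictions = np.array(predictions)
print(predictions.shape)               # one estimate per frame -> (10,)
```

The key structural point is the division of labor: the CNN summarizes each frame independently, while the recurrence lets each prediction depend on the preceding frames.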

Sparse-coded net model and applications

Summary

As an unsupervised learning method, sparse coding can discover high-level representations for an input in a large variety of learning problems. Under semi-supervised settings, sparse coding is used to extract features for a supervised task such as classification. While sparse representations learned from unlabeled data independently of the supervised task perform well, we argue that sparse coding should also be built as a holistic learning unit optimizing on the supervised task objectives more explicitly. In this paper, we propose sparse-coded net, a feedforward model that integrates sparse coding and task-driven output layers, and describe training methods in detail. After pretraining a sparse-coded net via semi-supervised learning, we optimize its task-specific performance in a novel backpropagation algorithm that can traverse nonlinear feature pooling operators to update the dictionary. Thus, sparse-coded net can be applied to supervised dictionary learning. We evaluate sparse-coded net with classification problems in sound, image, and text data. The results confirm a significant improvement over semi-supervised learning as well as superior classification performance against deep stacked autoencoder neural networks and GMM-SVM pipelines in small to medium-scale settings.
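The sparse-coding step at the heart of this model solves an L1-regularized reconstruction problem. A minimal sketch using iterative soft-thresholding (ISTA) is below; the dictionary size, signal, and regularization weight are illustrative assumptions, and this shows only the inference step, not the supervised dictionary update described in the paper.

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy dictionary D (n features x k atoms) and an input signal x.
n, k = 16, 32
D = rng.standard_normal((n, k))
D /= np.linalg.norm(D, axis=0)          # unit-norm atoms, as is conventional
x = rng.standard_normal(n)

# ISTA for: min_z 0.5*||x - Dz||^2 + lam*||z||_1
lam = 0.5
L = np.linalg.norm(D, 2) ** 2           # Lipschitz constant of the gradient
z = np.zeros(k)
for _ in range(200):
    grad = D.T @ (D @ z - x)            # gradient of the reconstruction term
    z = z - grad / L
    z = np.sign(z) * np.maximum(np.abs(z) - lam / L, 0.0)  # soft threshold

print(np.count_nonzero(z), "of", k, "atoms active")
```

The soft-threshold step is what produces sparsity: most coefficients are driven exactly to zero, and the surviving ones serve as the feature representation passed to the task-driven layers.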

Cross-domain entity resolution in social media

Summary

The challenge of associating entities across multiple domains is a key problem in social media understanding. Successful cross-domain entity resolution provides integration of information from multiple sites to create a complete picture of user and community activities, characteristics, and trends. In this work, we examine the problem of entity resolution across Twitter and Instagram using general techniques. Our methods fall into three categories: profile-, content-, and graph-based. For the profile-based methods, we consider techniques based on approximate string matching. For content-based methods, we perform author identification. Finally, for graph-based methods, we apply novel cross-domain community detection methods and generate neighborhood-based features. The three categories of methods are applied to a large graph of users in Twitter and Instagram to understand challenges, determine performance, and understand fusion of multiple methods. Final results demonstrate an equal error rate of less than 1%.
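The profile-based branch above rests on approximate string matching between usernames. A minimal sketch is below; the similarity function (difflib's gestalt matcher from the Python standard library) and the example handles are illustrative assumptions, not the paper's actual matcher or data.

```python
from difflib import SequenceMatcher

def name_similarity(a: str, b: str) -> float:
    """Approximate string similarity in [0, 1], case-insensitive."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()

# Hypothetical cross-platform profile pairs.
pairs = [("jane_doe42", "janedoe_42"),   # likely the same user
         ("jane_doe42", "mike1987")]     # likely different users
for a, b in pairs:
    print(a, b, round(name_similarity(a, b), 2))
```

In a full pipeline, a score like this would be one feature among many, fused with the content-based and graph-based signals before thresholding at the desired operating point.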

Joint audio-visual mining of uncooperatively collected video: FY14 Line-Supported Information, Computation, and Exploitation Program

Summary

The rate at which video is being created and gathered is rapidly accelerating as access to means of production and distribution expands. This rate of increase, however, is greatly outpacing the development of content-based tools to help users sift through this unstructured, multimedia data. The need for such technologies becomes more acute when considering their potential value in critical, media-rich government applications such as Seized Media Analysis, Social Media Forensics, and Foreign Media Monitoring. A fundamental challenge in developing technologies in these application areas is that they typically involve low-resource data domains. Low-resource domains are ones in which the lack of ground-truth labels and statistical support prevents the direct application of traditional machine learning approaches. To help bridge this capability gap, the Joint Audio and Visual Mining of Uncooperatively Collected Video ICE Line Program (2236-1301) is developing new technologies for better content-based search, summarization, and browsing of large collections of unstructured, uncooperatively collected multimedia. In particular, this effort seeks to improve capabilities in video understanding by jointly exploiting time-aligned audio, visual, and text information, an approach that has been underutilized in both the academic and commercial communities. Exploiting subtle connections between and across multiple modalities in low-resource multimedia data helps enable deeper video understanding, and in some cases provides new capability where none previously existed. This report outlines work done in Fiscal Year 2014 (FY14) by the cross-divisional, interdisciplinary team tasked to meet these objectives. In the following sections, we highlight technologies developed in FY14 to support efficient Query-by-Example, Attribute, and Keyword Search, as well as Cross-Media Exploration and Summarization. Additionally, we preview work proposed for Fiscal Year 2015 and summarize our external sponsor interactions and publications/presentations.

NEU_MITLL @ TRECVid 2015: multimedia event detection by pre-trained CNN models

Summary

We introduce a framework for multimedia event detection (MED), which was developed for TRECVID 2015 using convolutional neural networks (CNNs) to detect complex events via deterministic models trained on video frame data. We used several well-known CNN models designed to detect objects, scenes, and a combination of both (i.e., Hybrid-CNN). We also experimented with features from different networks fused together in different ways. The best score achieved was by fusing objects and scene detections at the feature-level (i.e., early fusion), resulting in a mean average precision (MAP) of 16.02%. Results showed that our framework is capable of detecting various complex events in videos when there are only a few instances of each within a large video search pool.
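The feature-level ("early") fusion that produced the best score above amounts to concatenating the per-video descriptors from the different networks before training a single event classifier. A minimal sketch follows; the descriptor dimensions and the random stand-in features are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(2)

# Hypothetical per-video descriptors from two pretrained CNNs.
object_feats = rng.standard_normal((5, 4096))  # e.g., object-network layer
scene_feats = rng.standard_normal((5, 4096))   # e.g., scene-network layer

# Early fusion: concatenate per video, then train one event detector
# on the joint representation.
fused = np.concatenate([object_feats, scene_feats], axis=1)
print(fused.shape)  # (5, 8192)
```

The contrast is with late fusion, where separate classifiers are trained per feature type and only their scores are combined; early fusion lets the classifier exploit correlations between object and scene evidence directly.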

Social network analysis with content and graphs

Published in:
Lincoln Laboratory Journal, Vol. 20, No. 1, 2013, pp. 62-81.

Summary

Social network analysis has undergone a renaissance with the ubiquity and quantity of content from social media, web pages, and sensors. This content is a rich data source for constructing and analyzing social networks, but its enormity and unstructured nature also present multiple challenges. Work at Lincoln Laboratory is addressing the problems in constructing networks from unstructured data, analyzing the community structure of a network, and inferring information from networks. Graph analytics have proven to be valuable tools in solving these challenges. Through the use of these tools, Laboratory researchers have achieved promising results on real-world data. A sampling of these results is presented in this article.

Individual and group dynamics in the reality mining corpus

Published in:
Proc. 2012 ASE/IEEE Int. Conf. on Social Computing, 3-5 September 2012, pp. 61-70.

Summary

Though significant progress has been made in recent years, traditional work in social networks has focused on static network analysis or dynamics in a large-scale sense. In this work, we explore ways in which temporal information from sociographic data can be used for the analysis and prediction of individual and group behavior in dynamic, real-world situations. Using the MIT Reality Mining corpus, we show how temporal information in highly instrumented sociographic data can be used to gain insights otherwise unavailable from static snapshots. We show how pattern-of-life features extend from the individual to the group level. In particular, we show how anonymized location information can be used to infer individual identity. Additionally, we show how proximity information can be used in a multilinear clustering framework to detect interesting group behavior over time. Experimental results and discussion suggest temporal information has great potential for improving both individual and group level understanding of real-world, dense social network data.

Face recognition despite missing information

Published in:
HST 2011, IEEE Int. Conf. on Technologies for Homeland Security, 15-17 November 2011, pp. 475-480.

Summary

Missing or degraded information continues to be a significant practical challenge facing automatic face representation and recognition. Generally, existing approaches seek either to generatively invert the degradation process or find discriminative representations that are immune to it. Ideally, the solution to this problem exists between these two perspectives. To this end, in this paper we show the efficacy of using probabilistic linear subspace models (in particular, variational probabilistic PCA) for both modeling and recognizing facial data under disguise or occlusion. From a discriminative perspective, we verify in several verification experiments that this approach attenuates the effect of missing data due to disguise and non-linear specularities. From a generative view, we show its usefulness in not only estimating missing information but also understanding facial covariates for image reconstruction. In addition, we present a least-squares connection to the maximum likelihood solution under missing data and show its intuitive connection to the geometry of the subspace learning problem.
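The least-squares view mentioned above can be illustrated with a small sketch: learn a linear subspace from complete data, then fill in a sample's missing entries by fitting the subspace coefficients to the observed entries only. Plain PCA stands in here for variational probabilistic PCA, and the data dimensions, noise level, and occlusion pattern are all illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(3)

# Toy "face" data: samples lying near a low-dimensional subspace plus noise.
n, d, q = 200, 12, 3
basis = rng.standard_normal((d, q))
X = rng.standard_normal((n, q)) @ basis.T + 0.01 * rng.standard_normal((n, d))
mean = X.mean(axis=0)

# Learn a q-dimensional principal subspace from the complete training data.
U, _, _ = np.linalg.svd((X - mean).T @ (X - mean))
W = U[:, :q]                       # d x q subspace basis

# A new sample with "occluded" (missing) entries.
x = rng.standard_normal(q) @ basis.T
observed = np.ones(d, dtype=bool)
observed[:4] = False               # first 4 entries treated as missing

# Least-squares fit of subspace coefficients using observed entries only.
c, *_ = np.linalg.lstsq(W[observed], (x - mean)[observed], rcond=None)
x_hat = mean + W @ c               # reconstruction fills in missing entries

err = np.abs(x_hat - x)[~observed].max()
print("max error on missing entries:", err)
```

Because the sample truly lies near the learned subspace, the coefficients recovered from the visible entries also predict the occluded ones; the probabilistic formulation in the paper additionally supplies uncertainty over those estimates.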