Publications

The JHU-MIT System Description for NIST SRE19 AV

Summary

This document describes the SRE19 AV submission by the team composed of JHU-CLSP, JHU-HLTCOE and MIT Lincoln Labs. All the developed systems for the audio and video conditions consisted of neural network embeddings with some flavor of PLDA/cosine back-end. Primary fusions obtained Actual DCF of 0.250 on SRE18 VAST eval, 0.183 on SRE19 AV dev audio, 0.140 on SRE19 AV dev video and 0.054 on SRE19 AV multi-modal.

Graph matching via multi-scale heat diffusion

Published in:
IEEE Intl. Conf. on Big Data, 9-12 December 2019.

Summary

We propose a novel graph matching algorithm that uses ideas from graph signal processing to match vertices of graphs using alternative graph representations. Specifically, we consider a multi-scale heat diffusion on the graphs to create multiple weighted graph representations that incorporate both direct adjacencies as well as local structures induced from the heat diffusion. Then a multi-objective optimization method is used to match vertices across all pairs of graph representations simultaneously. We show that our proposed algorithm performs significantly better than the algorithm that only uses the adjacency matrices, especially when the number of known latent alignments between vertices (seeds) is small. We test the algorithm on a set of graphs and show that at the low seed level, the proposed algorithm performs at least 15–35% better than the traditional graph matching algorithm.
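As a rough illustration of the multi-scale heat diffusion idea, the sketch below computes the graph heat kernel exp(-tL) at a few diffusion scales t, giving one weighted representation of the graph per scale. This is only a minimal sketch under stated assumptions: the combinatorial Laplacian L = D - A, the truncated Taylor series, the example path graph, the scale values, and all function names are illustrative choices, not details taken from the paper, and the matching step itself is not shown.

```python
def laplacian(adj):
    """Combinatorial Laplacian L = D - A for a symmetric adjacency matrix."""
    n = len(adj)
    return [[(sum(adj[i]) if i == j else 0.0) - adj[i][j] for j in range(n)]
            for i in range(n)]


def matmul(a, b):
    """Product of two square matrices stored as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def heat_kernel(adj, t, terms=40):
    """Approximate exp(-t * L) with a truncated Taylor series.

    Adequate for small graphs and moderate t only (an assumption of this sketch).
    """
    n = len(adj)
    lap = laplacian(adj)
    step = [[-t * lap[i][j] for j in range(n)] for i in range(n)]
    result = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]
    for k in range(1, terms + 1):
        # After this update, term holds (-t L)^k / k!
        term = [[v / k for v in row] for row in matmul(term, step)]
        result = [[result[i][j] + term[i][j] for j in range(n)]
                  for i in range(n)]
    return result


# Hypothetical example: the 4-vertex path graph 0-1-2-3.
path = [[0, 1, 0, 0],
        [1, 0, 1, 0],
        [0, 1, 0, 1],
        [0, 0, 1, 0]]

# Hypothetical diffusion scales; each yields one weighted representation.
scales = [0.5, 1.0, 2.0]
representations = {t: heat_kernel(path, t) for t in scales}
```

Small scales keep weight concentrated near each vertex, while larger scales spread it across the local neighborhood, which is one way such representations can carry structure beyond direct adjacency.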

This looks like that: deep learning for interpretable image recognition

Published in:
Neural Info. Process., NIPS, 8-14 December 2019.

Summary

When we are faced with challenging image classification tasks, we often explain our reasoning by dissecting the image and pointing out prototypical aspects of one class or another. The mounting evidence for each of the classes helps us make our final decision. In this work, we introduce a deep network architecture that reasons in a similar way: the network dissects the image by finding prototypical parts, and combines evidence from the prototypes to make a final classification. The algorithm thus reasons in a way that is qualitatively similar to the way ornithologists, physicians, geologists, architects, and others would explain to people how to solve challenging image classification tasks. The network uses only image-level labels for training, meaning that there are no labels for parts of images. We demonstrate the method on the CIFAR-10 dataset and 10 classes from the CUB-200-2011 dataset.

Prototype and analytics for discovery and exploitation of threat networks on social media

Published in:
2019 European Intelligence and Security Informatics Conference, EISIC, 26-27 November 2019.

Summary

Identifying and profiling threat actors are high-priority tasks for a number of governmental organizations. These threat actors may operate actively, using the Internet to promote propaganda, recruit new members, or exert command and control over their networks. Alternatively, threat actors may operate passively, demonstrating operational security awareness online while using their Internet presence to gather information they need to pose an offline physical threat. This paper presents a flexible new prototype system that allows analysts to automatically detect, monitor and characterize threat actors and their networks using publicly available information. The proposed prototype system fills a need in the intelligence community for a capability to automate manual construction and analysis of online threat networks. Leveraging graph sampling approaches, we perform targeted data collection of extremist social media accounts and their networks. We design and incorporate new algorithms for role classification and radicalization detection using insights from the social science literature on extremism. Additionally, we develop and implement analytics to facilitate monitoring the dynamic social networks over time. The prototype also incorporates several novel machine learning algorithms for threat actor discovery and characterization, such as classification of user posts into discourse categories, user post summaries and gender prediction.

Identification and detection of human trafficking using language models

Published in:
European Intelligence and Security Informatics Conf., EISIC, 26-27 November 2019.

Summary

In this paper, we present a novel language model-based method for detecting both human trafficking ads and trafficking indicators. The proposed system leverages language models to learn language structures in adult service ads, automatically select a list of keyword features, and train a machine learning model to detect human trafficking ads. The method is interpretable and adaptable to changing keywords used by traffickers. We apply this method to the Trafficking-10k dataset and show that it achieves better results than the previous models that leverage both ad text and images for detection. Furthermore, we demonstrate that our system can be successfully applied to detect suspected human trafficking organizations and rank these organizations based on their risk scores. This method provides a powerful new capability for law enforcement to rapidly identify ads and organizations that are suspected of human trafficking and allow more proactive policing using data.

Characterization of disinformation networks using graph embeddings and opinion mining

Published in:
2019 European Intelligence and Security Informatics Conference, EISIC, 26-27 November 2019.

Summary

Global social media networks' omnipresent access, real-time responsiveness and ability to connect with and influence people have been responsible for these networks' sweeping growth. However, as an unintended consequence, these defining characteristics helped create a powerful new technology for the spread of propaganda and false information. We present a novel approach for characterizing disinformation networks on social media and distinguishing between different network roles using graph embeddings and hierarchical clustering. In addition, using topic filtering, we correlate the node characterization results with proxy opinion estimates. We plan to study opinion dynamics using signal-processing-on-graphs approaches and longer-timescale social media datasets, with the goal of modeling and inferring influence among users in social media networks.

FirmFuzz: automated IOT firmware introspection and analysis

Published in:
2nd Workshop on the Internet of Things Security and Privacy, IoT S&P '19, 15 November 2019.

Summary

While the number of IoT devices grows at an exhilarating pace, their security remains stagnant. Imposing secure coding standards across all vendors is infeasible. Testing individual devices allows an analyst to evaluate their security post deployment. Any discovered vulnerabilities can then be disclosed to the vendors in order to assist them in securing their products. The search for vulnerabilities should ideally be automated for efficiency and furthermore be device-independent for scalability. We present FirmFuzz, an automated device-independent emulation and dynamic analysis framework for Linux-based firmware images. It employs a greybox-based generational fuzzing approach coupled with static analysis and system introspection to provide targeted and deterministic bug discovery within a firmware image. We evaluate FirmFuzz by emulating and dynamically analyzing 32 images (from 27 unique devices) with a network accessible from the host performing the emulation. During testing, FirmFuzz discovered seven previously undisclosed vulnerabilities across six different devices: two IP cameras and four routers. So far, four CVEs have been assigned.

On-demand forensic video analytics for large-scale surveillance systems

Published in:
2019 IEEE Intl. Symp. on Technologies for Homeland Security, 5-6 November 2019.

Summary

This work presents FOVEA, an add-on suite of analytic tools for the forensic review of video in large-scale surveillance systems. While significant investment has been made toward improving camera coverage and quality, the burden on video operators for reviewing and extracting useful information from the video has only increased. Daily investigation tasks (such as searching through video, investigating abandoned objects, or piecing together information from multiple cameras) still require a significant amount of manual review by video operators. In contrast to other tools which require exporting video data or otherwise curating the video collection before analysis, FOVEA is designed to integrate with existing surveillance systems. Tools can be applied to any video stream in an on-demand fashion without additional hardware. This paper details the technical approach, underlying algorithms, and effects on video operator performance.

Cultivating professional technical skills and understanding through hands-on online learning experiences

Published in:
2019 IEEE Learning with MOOCS, LWMOOCS, 23-25 October 2019.

Summary

Life-long learning is necessary for all professions because the technologies, tools and skills required for success over the course of a career expand and change. Professionals in science, technology, engineering and mathematics (STEM) fields face particular challenges as new multi-disciplinary methods, e.g. Machine Learning and Artificial Intelligence, mature to replace those learned in undergraduate or graduate programs. Traditionally, industry, professional societies and university programs have provided professional development. While these provide opportunities to develop deeper understanding in STEM specialties and stay current with new techniques, the constraints on formal classes and workshops preclude the possibility of Just-In-Time Mastery Learning, particularly for new domains. The MIT Lincoln Laboratory Supercomputing Center (LLSC) and MIT Supercloud teams have developed online course offerings specifically designed to provide a way for adult learners to build their own educational path based on their immediate needs, problems and schedules. To satisfy adult learners, the courses are formulated as a series of challenges and strategies. Using this perspective, the courses incorporate targeted theory supported by hands-on practice. The focus of this paper is the design of Mastery, Just-in-Time MOOC courses that address the full space of hands-on learning requirements, from digital to analog. The discussion centers on the design of project-based exercises for professional technical education courses. The case studies highlight examples from courses that incorporate practice ranging from the construction of a small radar used for real world data collection and processing to the development of high performance computing applications.

Investigation of the relationship of vocal, eye-tracking, and fMRI ROI time-series measures with preclinical mild traumatic brain injury

Summary

In this work, we are examining correlations between vocal articulatory features, ocular smooth pursuit measures, and features from the fMRI BOLD response in regions of interest (ROI) time series in a high school athlete population susceptible to repeated head impact within a sports season. Initial results have indicated relationships between vocal features and brain ROIs that may show which components of the neural speech networks are affected by preclinical mild traumatic brain injury (mTBI). The data used for this study was collected by Purdue University on 32 high school athletes over the entirety of a sports season (Helfer, et al., 2014), and includes fMRI measurements made pre-season, in-season, and postseason. The athletes are 25 male football players and 7 female soccer players. The Immediate Post-Concussion Assessment and Cognitive Testing suite (ImPACT) was used as a means of assessing cognitive performance (Broglio, Ferrara, Macciocchi, Baumgartner, & Elliott, 2007). The test is made up of six sections, which measure verbal memory, visual memory, visual motor speed, reaction time, impulse control, and a total symptom composite. For each test, a threshold is set for a change in cognitive performance. The threshold for each test is defined as a decline from baseline that exceeds one standard deviation, where the standard deviation is computed over the change from baseline across all subjects' test scores. Speech features were extracted from audio recordings of the Grandfather Passage, which provides a standardized and phonetically balanced sample of speech. Oculomotor testing included two experimental conditions. In the smooth pursuit condition, a single target moved circularly at constant speed. In the saccade condition, a target jumped among three locations along the horizontal midline of the screen. In both trial types, subjects visually tracked the targets during the trials, which lasted for one minute.
The fMRI features are derived from the BOLD time-series data from resting state fMRI scans of the subjects. The pre-processing of the resting state fMRI and accompanying structural MRI data (for atlas registration) was performed with the toolkit CONN (Whitfield-Gabrieli & Nieto-Castanon, 2012). Functional connectivity was generated using cortical and sub-cortical atlas registrations. This investigation explores correlations between these three modalities and a cognitive performance assessment.
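The decline threshold described above can be sketched for a single test as follows. This is a minimal sketch under assumptions the summary does not state: that higher scores mean better performance (for measures such as reaction time the sign would flip), that the sample standard deviation is the one intended, and that the scores shown are made-up values for illustration. The function name `flag_declines` is hypothetical.

```python
from statistics import stdev


def flag_declines(baseline, followup):
    """Flag subjects whose score fell by more than one standard deviation.

    The standard deviation is taken over the change from baseline across
    all subjects. Assumes higher scores are better (an assumption).
    """
    changes = [after - before for before, after in zip(baseline, followup)]
    threshold = stdev(changes)  # sample standard deviation (an assumption)
    return [change < -threshold for change in changes]


# Hypothetical scores for five subjects on one test section.
baseline_scores = [80, 85, 90, 75, 88]
followup_scores = [79, 86, 70, 76, 87]

flags = flag_declines(baseline_scores, followup_scores)
```

With these made-up scores the changes are -1, 1, -20, 1 and -1, their sample standard deviation is 9, and only the third subject is flagged.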