Publications


Social network analysis with content and graphs

Published in:
Lincoln Laboratory Journal, Vol. 20, No. 1, 2013, pp. 62-81.

Summary

Social network analysis has undergone a renaissance with the ubiquity and quantity of content from social media, web pages, and sensors. This content is a rich data source for constructing and analyzing social networks, but its enormity and unstructured nature also present multiple challenges. Work at Lincoln Laboratory is addressing the problems in constructing networks from unstructured data, analyzing the community structure of a network, and inferring information from networks. Graph analytics have proven to be valuable tools in solving these challenges. Through the use of these tools, Laboratory researchers have achieved promising results on real-world data. A sampling of these results is presented in this article.
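The community-structure analysis mentioned in this abstract can be illustrated with a minimal label-propagation pass over a toy graph. This is a hypothetical sketch of the general technique, not the Laboratory's algorithm, and the node names are made up:

```python
from collections import Counter, defaultdict

def label_propagation(edges, n_rounds=10):
    """Detect communities: each node repeatedly adopts the most
    common label among its neighbors until labels stabilize."""
    graph = defaultdict(set)
    for u, v in edges:
        graph[u].add(v)
        graph[v].add(u)
    labels = {node: node for node in graph}  # start: every node is its own label
    for _ in range(n_rounds):
        changed = False
        for node in sorted(graph):
            counts = Counter(labels[nbr] for nbr in graph[node])
            # deterministic tie-break: highest count, then largest label
            best = max(counts, key=lambda lab: (counts[lab], lab))
            if labels[node] != best:
                labels[node] = best
                changed = True
        if not changed:
            break
    return labels

# Two triangles joined by a single bridge edge
edges = [("a", "b"), ("b", "c"), ("a", "c"),
         ("x", "y"), ("y", "z"), ("x", "z"),
         ("c", "x")]
labels = label_propagation(edges)
```

On this graph the two triangles settle into two distinct communities despite the bridge edge.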

The application of statistical relational learning to a database of criminal and terrorist activity

Published in:
SIAM Conf. on Data Mining, 29 April - 1 May 2010.

Summary

We apply statistical relational learning to a database of criminal and terrorist activity to predict attributes and event outcomes. The database stems from a collection of news articles and court records which are carefully annotated with a variety of variables, including categorical and continuous fields. Manual analysis of this data can help inform decision makers seeking to curb violent activity within a region. We use this data to build relational models from historical data to predict attributes of groups, individuals, or events. Our first example involves predicting social network roles within a group under a variety of different data conditions. Collective classification can be used to boost the accuracy under data-poor conditions. Additionally, we were able to predict the outcome of hostage negotiations using models trained on previous kidnapping events. The overall framework and techniques described here are flexible enough to be used to predict a variety of variables. Such predictions could be used as input to a more complex system to recognize intent of terrorist groups, or to inform human decision makers.
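The collective classification mentioned in the abstract can be sketched with a simple relational-neighbor classifier: unlabeled actors iteratively adopt the majority label of their neighbors. The actor names and roles below are hypothetical, and the paper's relational models are far richer than this sketch:

```python
from collections import Counter, defaultdict

def relational_neighbor_classify(edges, known, n_iters=5):
    """Collective classification sketch: unlabeled nodes repeatedly
    adopt the majority label among their labeled neighbors."""
    graph = defaultdict(set)
    for u, v in edges:
        graph[u].add(v)
        graph[v].add(u)
    labels = dict(known)
    for _ in range(n_iters):
        updated = {}
        for node in graph:
            if node in known:
                continue  # seed labels stay fixed
            votes = Counter(labels[nbr] for nbr in graph[node] if nbr in labels)
            if votes:
                # deterministic tie-break: highest count, then largest label
                updated[node] = max(votes, key=lambda lab: (votes[lab], lab))
        if all(labels.get(n) == lab for n, lab in updated.items()):
            break  # converged
        labels.update(updated)
    return labels

# Hypothetical actors: "u" is unlabeled and linked to three known actors
known = {"p1": "leader", "p2": "courier", "p3": "courier"}
edges = [("p1", "u"), ("p2", "u"), ("p3", "u")]
result = relational_neighbor_classify(edges, known)
```

Here "u" is assigned the majority role of its neighbors, which is the intuition behind the accuracy boost under data-poor conditions.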

Detection and simulation of scenarios with hidden Markov models and event dependency graphs

Published in:
Proc. IEEE Int. Conf. on Acoustics, Speech, and Signal Processing, ICASSP, 15 March 2010, pp. 5434-5437.

Summary

The wide availability of signal processing and language tools to extract structured data from raw content has created a new opportunity for the processing of structured signals. In this work, we explore models for the simulation and recognition of scenarios - i.e., time sequences of structured data. For simulation, we construct two models - hidden Markov models (HMMs) and event dependency graphs. Combined, these two simulation methods allow the specification of dependencies in event ordering, simultaneous execution of multiple scenarios, and evolving networks of data. For scenario recognition, we consider the application of multi-grained HMMs. We explore, in detail, mismatch between training scenarios and simulated test scenarios. The methods are applied to terrorist scenario detection with a simulation coded by a subject matter expert.
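The event dependency graphs described here constrain which event orderings are possible; one way to simulate a scenario from such a graph is to repeatedly fire a random event whose prerequisites have all completed. This is a generic sketch with hypothetical event names, not the paper's simulator:

```python
import random
from collections import defaultdict

def simulate_scenario(deps, seed=0):
    """Sample one event ordering from an event dependency graph.
    `deps` maps each event to the set of events it depends on;
    an event may fire only after all its prerequisites have fired."""
    rng = random.Random(seed)
    indegree = {e: len(pre) for e, pre in deps.items()}
    dependents = defaultdict(list)
    for e, pre in deps.items():
        for p in pre:
            dependents[p].append(e)
    ready = [e for e, d in indegree.items() if d == 0]
    order = []
    while ready:
        e = rng.choice(ready)  # random choice among currently enabled events
        ready.remove(e)
        order.append(e)
        for nxt in dependents[e]:
            indegree[nxt] -= 1
            if indegree[nxt] == 0:
                ready.append(nxt)
    return order

# Hypothetical scenario: two independent chains converging on one event
deps = {
    "recruit": set(),
    "acquire_funds": set(),
    "buy_materials": {"acquire_funds"},
    "train": {"recruit"},
    "attack": {"buy_materials", "train"},
}
order = simulate_scenario(deps)
```

Each run yields a valid ordering that respects every dependency, which is the property that distinguishes dependency-graph simulation from a flat HMM state sequence.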

Modeling and detection techniques for counter-terror social network analysis and intent recognition

Summary

In this paper, we describe our approach and initial results on modeling, detection, and tracking of terrorist groups and their intents based on multimedia data. While research on automated information extraction from multimedia data has yielded significant progress in areas such as the extraction of entities, links, and events, less progress has been made in the development of automated tools for analyzing the results of information extraction to "connect the dots." Hence, our Counter-Terror Social Network Analysis and Intent Recognition (CT-SNAIR) work focuses on development of automated techniques and tools for detection and tracking of dynamically changing terrorist networks as well as recognition of capability and potential intent. In addition to obtaining and working with real data for algorithm development and test, we have a major focus on modeling and simulation of terrorist attacks based on real information about past attacks. We describe the development and application of a new Terror Attack Description Language (TADL), which is used as a basis for modeling and simulation of terrorist attacks. Examples are shown which illustrate the use of TADL and a companion simulator based on a Hidden Markov Model (HMM) structure to generate transactions for attack scenarios drawn from real events. We also describe our techniques for generating realistic background clutter traffic to enable experiments to estimate performance in the presence of a mix of data. An important part of our effort is to produce scenarios and corpora for use in our own research, which can be shared with a community of researchers in this area. We describe our scenario and corpus development, including specific examples from the September 2004 bombing of the Australian embassy in Jakarta and a fictitious scenario which was developed in a prior project for research in social network analysis. The scenarios can be created by subject matter experts using a graphical editing tool.
Given a set of time ordered transactions between actors, we employ social network analysis (SNA) algorithms as a filtering step to divide the actors into distinct communities before determining intent. This helps reduce clutter and enhances the ability to determine activities within a specific group. For modeling and simulation purposes, we generate random networks with structures and properties similar to real-world social networks. Modeling of background traffic is an important step in generating classifiers that can separate harmless activities from suspicious activity. An algorithm for recognition of simulated potential attack scenarios in clutter based on Support Vector Machine (SVM) techniques is presented. We show performance examples, including probability of detection versus probability of false alarm tradeoffs, for a range of system parameters.
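The random networks mentioned above need real-world structure, and one common way to obtain heavy-tailed degree distributions is preferential attachment: each new node links to existing nodes with probability proportional to their degree. The abstract does not specify CT-SNAIR's generator, so this is only an assumed illustration of the general idea:

```python
import random

def preferential_attachment(n, m, seed=0):
    """Grow a random network of n nodes where each new node attaches
    to m existing nodes chosen with probability proportional to degree
    (a node's frequency in `repeated` tracks its degree)."""
    rng = random.Random(seed)
    edges = []
    repeated = list(range(m))  # degree-weighted endpoint pool
    for new in range(m, n):
        chosen = set()
        while len(chosen) < m:
            chosen.add(rng.choice(repeated))  # degree-proportional pick
        for target in chosen:
            edges.append((target, new))
        repeated.extend(chosen)
        repeated.extend([new] * m)
    return edges

edges = preferential_attachment(100, 2)
```

A few early nodes accumulate many links while most stay near degree m, mimicking the hub-dominated structure of real social networks.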

Cognitive services for the user

Published in:
Chapter 10, Cognitive Radio Technology, 2009, pp. 305-324.

Summary

Software-defined cognitive radios (CRs) use voice as a primary input/output (I/O) modality and are expected to have substantial computational resources capable of supporting advanced speech- and audio-processing applications. This chapter extends previous work on speech applications (e.g., [1]) to cognitive services that enhance military mission capability by capitalizing on automatic processes, such as speech information extraction and understanding the environment. Such capabilities go beyond interaction with the intended user of the software-defined radio (SDR) - they extend to speech and audio applications that can be applied to information that has been extracted from voice and acoustic noise gathered from other users and entities in the environment. For example, in a military environment, situational awareness and understanding could be enhanced by informing users based on processing voice and noise from both friendly and hostile forces operating in a given battle space. This chapter provides a survey of a number of speech- and audio-processing technologies and their potential applications to CR, including:

- A description of the technology and its current state of practice.
- An explanation of how the technology is currently being applied, or could be applied, to CR.
- Descriptions and concepts of operations for how the technology can be applied to benefit users of CRs.
- A description of relevant future research directions for both the speech and audio technologies and their applications to CR.

A pictorial overview of many of the core technologies with some applications presented in the following sections is shown in Figure 10.1. Also shown are some overlapping components between the technologies. For example, Gaussian mixture models (GMMs) and support vector machines (SVMs) are used in both speaker and language recognition technologies [2]. These technologies and components are described in further detail in the following sections.
Speech and concierge cognitive services and their corresponding applications are covered in the following sections. The services covered include speaker recognition, language identification (LID), text-to-speech (TTS) conversion, speech-to-text (STT) conversion, machine translation (MT), background noise suppression, speech coding, speaker characterization, noise management, noise characterization, and concierge services. These technologies and their potential applications to CR are discussed at varying levels of detail commensurate with their innovation and utility.

Exploiting nonacoustic sensors for speech encoding

Summary

The intelligibility of speech transmitted through low-rate coders is severely degraded when high levels of acoustic noise are present in the acoustic environment. Recent advances in nonacoustic sensors, including microwave radar, skin vibration, and bone conduction sensors, provide the exciting possibility of both glottal excitation and, more generally, vocal tract measurements that are relatively immune to acoustic disturbances and can supplement the acoustic speech waveform. We are currently investigating methods of combining the output of these sensors for use in low-rate encoding according to their capability in representing specific speech characteristics in different frequency bands. Nonacoustic sensors have the ability to reveal certain speech attributes lost in the noisy acoustic signal; for example, low-energy consonant voice bars, nasality, and glottalized excitation. By fusing nonacoustic low-frequency and pitch content with acoustic-microphone content, we have achieved significant intelligibility performance gains on the diagnostic rhyme test (DRT) across a variety of environments over the government standard 2400-bps MELPe coder. By fusing quantized high-band 4-to-8-kHz speech, requiring only an additional 116 bps, we obtain further DRT performance gains by exploiting the ear's insensitivity to fine spectral detail in this frequency region.

Measuring translation quality by testing English speakers with a new Defense Language Proficiency Test for Arabic

Published in:
Int. Conf. on Intelligence Analysis, 2-5 May 2005.

Summary

We present results from an experiment in which educated native English speakers answered questions from a machine translated version of a standardized Arabic language test. We compare the machine translation (MT) results with professional reference translations as a baseline for the purpose of determining the level of Arabic reading comprehension that current machine translation technology enables an English speaker to achieve. Furthermore, we explore the relationship between the current, broadly accepted automatic measures of performance for machine translation and the Defense Language Proficiency Test, a broadly accepted measure of effectiveness for evaluating foreign language proficiency. In doing so, we intend to help translate MT system performance into terms that are meaningful for satisfying Government foreign language processing requirements. The results of this experiment suggest that machine translation may enable Interagency Language Roundtable (ILR) Level 2 performance, but is not yet adequate to achieve ILR Level 3. Our results are based on 69 human subjects reading 68 documents and answering 173 questions, giving a total of 4,692 timed document trials and 7,950 question trials. We propose Level 3 as a reasonable near-term target for machine translation research and development.

Measuring human readability of machine generated text: three case studies in speech recognition and machine translation

Published in:
Proc. IEEE Int. Conf. on Acoustics, Speech and Signal Processing, Vol. 5, ICASSP, 19-23 March 2005, pp. V-1009 - V-1012.

Summary

We present highlights from three experiments that test the readability of current state-of-the-art system output from (1) an automated English speech-to-text system, (2) a text-based Arabic-to-English machine translation system, and (3) an audio-based Arabic-to-English MT process. We measure readability in terms of reaction time and passage comprehension in each case, applying standard psycholinguistic testing procedures and a modified version of the standard Defense Language Proficiency Test for Arabic called the DLPT*. We learned that: (1) subjects are slowed down about 25% when reading system STT output, (2) text-based MT systems enable an English speaker to pass Arabic Level 2 on the DLPT*, and (3) audio-based MT systems do not enable English speakers to pass Arabic Level 2. We intend for these generic measures of readability to predict performance of more application-specific tasks.

New measures of effectiveness for human language technology

Summary

The field of human language technology (HLT) encompasses algorithms and applications dedicated to processing human speech and written communication. We focus on two types of HLT systems: (1) machine translation systems, which convert text and speech files from one human language to another, and (2) speech-to-text (STT) systems, which produce text transcripts when given audio files of human speech as input. Although both processes are subject to machine errors and can produce varying levels of garbling in their output, HLT systems are improving at a remarkable pace, according to system-internal measures of performance. To learn how these system-internal measurements correlate with improved capabilities for accomplishing real-world language-understanding tasks, we have embarked on a collaborative, interdisciplinary project involving Lincoln Laboratory, the MIT Department of Brain and Cognitive Sciences, and the Defense Language Institute Foreign Language Center to develop new techniques to scientifically measure the effectiveness of these technologies when they are used by human subjects.

Robust collaborative multicast service for airborne command and control environment

Summary

RCM (Robust Collaborative Multicast) is a communication service designed to support collaborative applications operating in dynamic, mission-critical environments. RCM implements a set of well-specified message ordering and reliability properties that balance two conflicting goals: a) providing low-latency, highly available, reliable communication service, and b) guaranteeing global consistency in how different participants perceive their communication. Both of these goals are important for collaborative applications. In this paper, we describe RCM, its modular and flexible design, and a collection of simple, light-weight protocols that implement it. We also report on several experiments with an RCM prototype in a test-bed environment.
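The global-consistency goal above is the classic total-ordering problem: all participants must deliver messages in the same order even when they arrive differently. One textbook approach, shown below as a sketch, is a sequencer that stamps every multicast with a global sequence number; the abstract does not describe RCM's actual protocols, so this is only an assumed illustration:

```python
import itertools

class Sequencer:
    """Coordinator that stamps each multicast with a global sequence
    number, defining one delivery order for all participants."""
    def __init__(self):
        self._counter = itertools.count()

    def stamp(self, msg):
        return (next(self._counter), msg)

class Participant:
    """Delivers stamped messages strictly in sequence order,
    buffering anything that arrives ahead of a gap."""
    def __init__(self):
        self._pending = {}
        self._next = 0
        self.delivered = []

    def receive(self, stamped):
        seq, msg = stamped
        self._pending[seq] = msg
        while self._next in self._pending:  # flush any contiguous run
            self.delivered.append(self._pending.pop(self._next))
            self._next += 1

seq = Sequencer()
msgs = [seq.stamp(m) for m in ["a", "b", "c"]]
p1, p2 = Participant(), Participant()
for s in msgs:            # arrives in order
    p1.receive(s)
for s in reversed(msgs):  # arrives out of order
    p2.receive(s)
```

Both participants deliver the same sequence regardless of arrival order; the trade-off is the latency and availability cost of routing through the sequencer, which is exactly the tension between goals a) and b) above.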