Publications

Resilience of cyber systems with over- and underregulation

Published in:
Risk Analysis, Vol. 37, No. 9, 2017, pp. 1644-51, DOI:10.1111/risa.12729.

Summary

Recent cyber attacks provide evidence of increased threats to our critical systems and infrastructure. A common reaction to a new threat is to harden the system by adding new rules and regulations. As federal and state governments request new procedures to follow, each of their organizations implements its own cyber defense strategies. This unintentionally increases the time and effort that employees spend on training and policy implementation and decreases the time and latitude to perform critical job functions, thus raising overall levels of stress. People's performance under stress, coupled with an overabundance of information, results in even more vulnerabilities for adversaries to exploit. In this article, we embed a simple regulatory model that accounts for cybersecurity human factors and an organization's regulatory environment in a model of a corporate cyber network under attack. The resulting model demonstrates the effect of under- and overregulation on an organization's resilience with respect to insider threats. Currently, there is a tendency to use ad hoc approaches to account for human factors rather than to incorporate them into cyber resilience modeling. It is clear that using a systematic approach utilizing behavioral science, which already exists in cyber resilience assessment, would provide a more holistic view for decision makers.

Intersection and convex combination in multi-source spectral planted cluster detection

Published in:
IEEE Global Conf. on Signal and Information Processing, GlobalSIP, 7-9 December 2016.

Summary

Planted cluster detection is an important form of signal detection when the data are in the form of a graph. When there are multiple graphs representing multiple connection types, the method of aggregation can have a significant impact on the results of a detection algorithm. This paper addresses the tradeoff between two possible aggregation methods: convex combination and intersection. For a spectral detection method, convex combination dominates when the cluster is relatively sparse in at least one graph, while the intersection method dominates when the cluster is dense across graphs. Experimental results confirm the theory. We also consider the context of adversarial cluster placement and determine how an adversary would distribute connections among the graphs to best avoid detection.
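The two aggregation methods named above can be illustrated with a minimal Python sketch. It assumes symmetric 0/1 adjacency matrices and uses the largest eigenvalue of the modularity matrix as a stand-in detection statistic; the summary does not specify the paper's actual statistic, and the function names here are hypothetical.

```python
import numpy as np


def convex_combination(graphs, weights):
    """Aggregate adjacency matrices as a weighted sum.

    The weights must be nonnegative and sum to 1.
    """
    weights = np.asarray(weights, dtype=float)
    if np.any(weights < 0) or not np.isclose(weights.sum(), 1.0):
        raise ValueError("weights must be nonnegative and sum to 1")
    return sum(w * a for w, a in zip(weights, graphs))


def intersection(graphs):
    """Keep an edge only if it appears in every graph (0/1 matrices)."""
    out = np.array(graphs[0], dtype=float)
    for a in graphs[1:]:
        out = out * a  # elementwise product acts as logical AND on 0/1 entries
    return out


def spectral_statistic(adj):
    """Largest eigenvalue of the modularity matrix B = A - k k^T / sum(k).

    This is an assumed stand-in statistic, not necessarily the one
    used in the paper.
    """
    adj = np.asarray(adj, dtype=float)
    degrees = adj.sum(axis=1)
    total = degrees.sum()
    if total == 0:
        return 0.0  # empty graph: nothing to detect
    modularity = adj - np.outer(degrees, degrees) / total
    # eigvalsh returns eigenvalues of a symmetric matrix in ascending order
    return float(np.linalg.eigvalsh(modularity)[-1])
```

Computing `spectral_statistic` on both `convex_combination(graphs, weights)` and `intersection(graphs)` for the same set of graphs gives a simple way to compare the two aggregations on synthetic data.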

Making #sense of #unstructured text data

Published in:
30th Conf. on Neural Info. Processing Syst., NIPS 2016, 5-10 December 2016.

Summary

Automatic extraction of intelligent and useful information from data is one of the main goals in data science. Traditional approaches have focused on learning from structured features, i.e., information in a relational database. However, most of the data encountered in practice are unstructured (e.g., social media posts, forums, emails, and web logs); they do not have a predefined schema or format. In this work, we examine unsupervised methods for processing unstructured text data, extracting relevant information, and transforming it into structured information that can then be leveraged in various applications such as graph analysis and matching entities across different platforms. Various algorithms have been proposed for processing unstructured text data. At a top level, text can be either summarized by document-level features (e.g., language, topic, genre) or analyzed at a word or sub-word level. Text analytics can be unsupervised, semi-supervised, or supervised. In this work, we focus on word analysis and unsupervised methods. Unsupervised (or semi-supervised) methods require less human annotation and can easily fulfill the role of automatic analysis. For text analysis, we focus on methods for finding relevant words in the text. Specifically, we look at social media data and attempt to predict hashtags for users' posts. The resulting hashtags can be used for downstream processing such as graph analysis. Automatic hashtag annotation is closely related to automatic tag extraction and keyword extraction. Techniques for hashtag extraction include topic analysis, supervised classifiers, machine translation methods, and collaborative filtering. Methods for keyword extraction include graph-based and topical analysis of text.
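As a rough illustration of unsupervised selection of relevant words, the sketch below ranks the words of each post by TF-IDF and proposes the top-scoring ones as hashtags. TF-IDF is a swapped-in baseline chosen for brevity, not the method of the paper, and the function names are hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase a post and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def suggest_hashtags(posts, top_k=2):
    """Propose up to top_k hashtags per post using TF-IDF scores.

    Words that are frequent in a post but rare across the collection
    score highest. Ties are broken alphabetically for determinism.
    """
    docs = [tokenize(p) for p in posts]
    n_docs = len(docs)

    # Document frequency: number of posts containing each word
    doc_freq = Counter()
    for doc in docs:
        doc_freq.update(set(doc))

    suggestions = []
    for doc in docs:
        term_freq = Counter(doc)
        scores = {
            word: (count / len(doc)) * math.log(n_docs / doc_freq[word])
            for word, count in term_freq.items()
        }
        ranked = sorted(scores, key=lambda w: (-scores[w], w))
        suggestions.append(["#" + w for w in ranked[:top_k]])
    return suggestions
```

A word that occurs in every post gets a score of zero, so common function words drop to the bottom of the ranking without a stop-word list.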

High performance, 3D-printable dielectric nanocomposites for millimeter wave devices

Summary

The creation of a millimeter-wave, 3D-printable dielectric nanocomposite is demonstrated. Alumina nanoparticles were combined with styrenic block copolymers and solvent to create shear-thinning, viscoelastic inks that are printable at room temperature. Particle loadings of up to 41 vol % were achieved. Upon being dried, the highest-performing of these materials has a permittivity of 4.61 and a loss tangent of 0.00298 in the Ka band (26.5-40 GHz), a combination not previously demonstrated for 3D printing. These nanocomposite materials were used to print a simple resonator device with predictable pass-band features.

LLTools: machine learning for human language processing

Summary

Machine learning methods in Human Language Technology have reached a stage of maturity where widespread use is both possible and desirable. The MIT Lincoln Laboratory LLTools software suite takes a step toward this goal by providing a set of easily accessible frameworks for incorporating speech, text, and entity resolution components into larger applications. For the speech processing component, the pySLGR (Speaker, Language, Gender Recognition) tool provides signal processing, standard feature analysis, speech utterance embedding, and machine learning modeling methods in Python. The text processing component in LLTools extracts semantically meaningful insights from unstructured data via entity extraction, topic modeling, and document classification. The entity resolution component in LLTools provides approximate string matching, author recognition, and graph-based methods for identifying and linking different instances of the same real-world entity. We show through two applications that LLTools can be used to rapidly create and train research prototypes for human language processing.

An overview of the DARPA Data Driven Discovery of Models (D3M) Program

Published in:
30th Conf. on Neural Information Processing Systems, NIPS, 5-10 December 2016.

Summary

A new DARPA program called Data Driven Discovery of Models (D3M) aims to develop automated model discovery systems that can be used by researchers with specific subject matter expertise to create empirical models of real, complex processes. Two major goals of this program are to allow experts to create empirical models without the need for data scientists and to increase the productivity of data scientists via automation. The automated model discovery systems developed will be tested on real-world problems that get progressively harder during the course of the program. Toward the end of the program, problems will be both unsolved and underspecified in terms of data and desired outcomes. The program will emphasize creating and leveraging open-source technology and architecture. Our presentation reviews the goals and structure of this program, which will begin early in 2017. Although the deadline for submitting proposals has passed, we welcome suggestions concerning challenge tasks, evaluations, or new open-source data sets to be included for system development and evaluation that would supplement data currently being curated from many sources.

Bootstrapping and maintaining trust in the cloud

Published in:
32nd Annual Computer Security Applications Conf., ACSAC 2016, 5-9 December 2016.

Summary

Today's infrastructure as a service (IaaS) cloud environments rely upon full trust in the provider to secure applications and data. Cloud providers do not offer the ability to create hardware-rooted cryptographic identities for IaaS cloud resources or sufficient information to verify the integrity of systems. Trusted computing protocols and hardware like the TPM have long promised a solution to this problem. However, these technologies have not seen broad adoption because of their complexity of implementation, low performance, and lack of compatibility with virtualized environments. In this paper we introduce keylime, a scalable trusted cloud key management system. keylime provides an end-to-end solution for both bootstrapping hardware-rooted cryptographic identities for IaaS nodes and for system integrity monitoring of those nodes via periodic attestation. We support these functions in both bare-metal and virtualized IaaS environments using a virtual TPM. keylime provides a clean interface that allows higher-level security services like disk encryption or configuration management to leverage trusted computing without being trusted computing aware. We show that our bootstrapping protocol can derive a key in less than two seconds, that we can detect system integrity violations in as little as 110 ms, and that keylime can scale to thousands of IaaS cloud nodes.

Predicting and analyzing factors in patent litigation

Published in:
30th Conf. on Neural Information Processing Systems, NIPS 2016, 5-10 December 2016.

Summary

Patent litigation is an expensive and time-consuming process. To minimize its impact on the participants in the patent lifecycle, automatic determination of litigation potential is a compelling machine learning application. In this paper, we consider preliminary methods for the prediction of a patent being involved in litigation using metadata, content, and graph features. Metadata features are top-level, easily extractable features, e.g., assignee and number of claims. The content feature performs lexical analysis of the claims associated with a patent. Graph features use relational learning to summarize patent references. We apply our methods to US patents using a labeled data set. Prior work has focused on metadata-only features, but we show that both graph and content features have significant predictive capability. Additionally, fusing all features results in improved performance. We also perform a preliminary examination of some of the qualitative factors that may have significant importance in patent litigation.

Biomimetic sniffing improves the detection performance of a 3D printed nose of a dog and a commercial trace vapor detector

Published in:
Scientific Reports, Vol. 6, art. no. 36876, December 2016. DOI: 10.1038/srep36876.

Summary

Unlike current chemical trace detection technology, dogs actively sniff to acquire an odor sample. Flow visualization experiments with an anatomically similar 3D-printed dog's nose revealed the external aerodynamics during canine sniffing, where ventral-laterally expired air jets entrain odorant-laden air toward the nose, thereby extending the "aerodynamic reach" for inspiration of otherwise inaccessible odors. Chemical sampling and detection experiments quantified two modes of operation with the artificial nose (active sniffing and continuous inspiration) and demonstrated an increase in odorant detection by a factor of up to 18 for active sniffing. A 16-fold improvement in detection was demonstrated with a commercially available explosives detector by applying this bio-inspired design principle and making the device "sniff" like a dog. These lessons learned from the dog may benefit the next generation of vapor samplers for explosives, narcotics, pathogens, or even cancer, and could inform future bio-inspired designs for optimized sampling of odor plumes.

Terminal Flight Data Manager (TFDM) environmental benefits assessment

Published in:
MIT Lincoln Laboratory Report ATC-420

Summary

This work monetizes the environmental benefits of Terminal Flight Data Manager (TFDM) capabilities that reduce fuel burn and gaseous emissions and, in turn, reduce climate change and air quality effects. A methodology is created that takes TFDM "engines-on" taxi time savings and converts them to fuel and carbon dioxide (CO2) emissions savings, accounting for the aircraft fleet mix at each of 27 TFDM analysis airports over a 2016-2048 analysis timeframe. Total fuel reductions of approximately 300 million U.S. gallons are estimated, resulting in monetized benefits from all TFDM capabilities of $65m-$582m undiscounted and $23m-$310m discounted, depending on the Social Cost of CO2 (SCC) and discount rate used. A similar methodology is used to estimate monetized benefits of reduced air quality emissions as well.
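The chain described above (taxi time savings to fuel, fuel to CO2, CO2 to dollars) can be sketched in Python as follows. All numeric defaults and function names are illustrative assumptions, not values or methods from the report: the emission factor of 3.16 kg CO2 per kg of jet fuel is a commonly used figure, and per-aircraft burn rates, SCC values, and discount rates must be supplied by the caller.

```python
# Assumed emission factor (kg CO2 per kg jet fuel); not taken from the report.
CO2_KG_PER_KG_FUEL = 3.16


def fuel_saved_kg(minutes_saved_by_type, burn_rate_kg_per_min_by_type):
    """Convert engines-on taxi time savings to fuel savings.

    Both arguments are dicts keyed by aircraft type, so the fleet mix
    is reflected in how the saved minutes are distributed across types.
    """
    return sum(
        minutes * burn_rate_kg_per_min_by_type[aircraft]
        for aircraft, minutes in minutes_saved_by_type.items()
    )


def co2_saved_kg(fuel_kg, emission_factor=CO2_KG_PER_KG_FUEL):
    """Convert fuel savings to CO2 savings with a fixed emission factor."""
    return fuel_kg * emission_factor


def monetized_benefit(co2_kg, scc_per_tonne):
    """Undiscounted benefit for a given Social Cost of CO2 ($/tonne)."""
    return co2_kg / 1000.0 * scc_per_tonne


def present_value(benefit_by_year, discount_rate, base_year):
    """Discount a stream of yearly benefits back to base_year."""
    return sum(
        benefit / (1.0 + discount_rate) ** (year - base_year)
        for year, benefit in benefit_by_year.items()
    )
```

Running the same yearly benefit stream through `present_value` with different discount rates, and through `monetized_benefit` with different SCC values, shows why the report quotes ranges rather than single figures.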