Publications
Predicting exploitation of disclosed software vulnerabilities using open-source data
Summary
Each year, thousands of software vulnerabilities are discovered and reported to the public. Unpatched known vulnerabilities are a significant security risk. It is imperative that software vendors provide patches quickly once vulnerabilities are known and that users install those patches as soon as they are available. However, most vulnerabilities are...
High-efficiency large-angle Pancharatnam phase deflector based on dual-twist design
Summary
We have previously shown through simulation that an optical beam deflector based on the Pancharatnam (geometric) phase can provide high efficiency with up to 80° deflection using a dual-twist structure for polarization-state control [Appl. Opt. 54, 10035 (2015)]. In this report, we demonstrate that its optical performance is as predicted...
Causal inference under network interference: a framework for experiments on social networks
Summary
No man is an island, as individuals interact and influence one another daily in our society. When social influence takes place in experiments on a population of interconnected individuals, the treatment on a unit may affect the outcomes of other units, a phenomenon known as interference. This thesis develops a...
Intersection and convex combination in multi-source spectral planted cluster detection
Summary
Planted cluster detection is an important form of signal detection when the data are in the form of a graph. When there are multiple graphs representing multiple connection types, the method of aggregation can have significant impact on the results of a detection algorithm. This paper addresses the tradeoff between...
LLTools: machine learning for human language processing
Summary
Machine learning methods in Human Language Technology have reached a stage of maturity where widespread use is both possible and desirable. The MIT Lincoln Laboratory LLTools software suite provides a step towards this goal by providing a set of easily accessible frameworks for incorporating speech, text, and entity resolution components...
Sparse-coded net model and applications
Summary
As an unsupervised learning method, sparse coding can discover high-level representations for an input in a large variety of learning problems. Under semi-supervised settings, sparse coding is used to extract features for a supervised task such as classification. While sparse representations learned from unlabeled data independently of the supervised task...
Cross-domain entity resolution in social media
Summary
The challenge of associating entities across multiple domains is a key problem in social media understanding. Successful cross-domain entity resolution provides integration of information from multiple sites to create a complete picture of user and community activities, characteristics, and trends. In this work, we examine the problem of entity resolution...
Generating a multiple-prerequisite attack graph
Summary
In one aspect, a method to generate an attack graph includes determining if a potential node provides a first precondition equivalent to one of the preconditions provided by a group of preexisting nodes on the attack graph. The group of preexisting nodes includes a first state node, a first vulnerability instance...
A reverse approach to named entity extraction and linking in microposts
Summary
In this paper, we present a pipeline for named entity extraction and linking that is designed specifically for noisy, grammatically inconsistent domains where traditional named entity techniques perform poorly. Our approach leverages a large knowledge base to improve entity recognition, while maintaining the use of traditional NER to identify mentions...
Named entity recognition in 140 characters or less
Summary
In this paper, we explore the problem of recognizing named entities in microposts, a genre with notoriously little context surrounding each named entity and inconsistent use of grammar, punctuation, capitalization, and spelling conventions by authors. In spite of the challenges associated with information extraction from microposts, it remains an increasingly...