Publications


LLTools: machine learning for human language processing

December 5, 2016

Conference Paper

Author:

Cagri K. Dagli

…

Published in:

30th Conf. on Neural Info. Processing Syst., NIPS 2016, 5-10 December 2016.

Topic:

big data

R&D area:

Cyber Security and Information Sciences

Summary

Machine learning methods in Human Language Technology have reached a stage of maturity where widespread use is both possible and desirable. The MIT Lincoln Laboratory LLTools software suite takes a step toward this goal by providing a set of easily accessible frameworks for incorporating speech, text, and entity resolution components into larger applications. For the speech processing component, the pySLGR (Speaker, Language, Gender Recognition) tool provides signal processing, standard feature analysis, speech utterance embedding, and machine learning modeling methods in Python. The text processing component in LLTools extracts semantically meaningful insights from unstructured data via entity extraction, topic modeling, and document classification. The entity resolution component in LLTools provides approximate string matching, author recognition, and graph-based methods for identifying and linking different instances of the same real-world entity. We show through two applications that LLTools can be used to rapidly create and train research prototypes for human language processing.

Cross-domain entity resolution in social media

July 11, 2016

Conference Paper

Author:

William M. Campbell

…

Published in:

4th Int. Workshop on Natural Language Processing for Social Media, SocialNLP with IJCAI, 11 July 2016.

Topic:

social network

R&D area:

Cyber Security and Information Sciences

Summary

The challenge of associating entities across multiple domains is a key problem in social media understanding. Successful cross-domain entity resolution integrates information from multiple sites to create a complete picture of user and community activities, characteristics, and trends. In this work, we examine the problem of entity resolution across Twitter and Instagram using general techniques. Our methods fall into three categories: profile-based, content-based, and graph-based. For the profile-based methods, we consider techniques based on approximate string matching. For content-based methods, we perform author identification. Finally, for graph-based methods, we apply novel cross-domain community detection methods and generate neighborhood-based features. The three categories of methods are applied to a large graph of users in Twitter and Instagram to understand challenges, determine performance, and understand fusion of multiple methods. Final results demonstrate an equal error rate of less than 1%.
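As a rough illustration of the profile-based idea, approximate string matching between usernames on two sites can be sketched with Python's standard `difflib` module. This is only a minimal sketch, not the method used in the paper: the function names `username_similarity` and `best_match` and the sample usernames are hypothetical, and the paper does not state which similarity measure it uses.

```python
from difflib import SequenceMatcher


def username_similarity(a: str, b: str) -> float:
    """Return a similarity score in [0, 1] for two usernames, ignoring case."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def best_match(username: str, candidates: list[str]) -> tuple[str, float]:
    """Return the candidate most similar to `username`, with its score."""
    scored = [(c, username_similarity(username, c)) for c in candidates]
    return max(scored, key=lambda pair: pair[1])


if __name__ == "__main__":
    # Hypothetical usernames: one account on site A, three candidates on site B.
    match, score = best_match("jane_doe92", ["janedoe92", "john_smith", "j.doe"])
    print(match, round(score, 3))
```

In practice a score like this would be one feature among several, combined with content and graph features rather than thresholded on its own.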

VizLinc: integrating information extraction, search, graph analysis, and geo-location for the visual exploration of large data sets

August 24, 2014

Conference Paper

Author:

Joel Acevedo-Aviles

…

Published in:

Proc. KDD 2014 Workshop on Interactive Data Exploration and Analytics, IDEA, 24 August 2014, pp. 10-18.

Topic:

human language technology

R&D area:

Cyber Security and Information Sciences

R&D group:

Artificial Intelligence Technology and Systems

Summary

In this demo paper we introduce VizLinc, an open-source software suite that integrates automatic information extraction, search, graph analysis, and geo-location for interactive visualization and exploration of large data sets. VizLinc helps users 1) understand the type of information the data set under study might contain, 2) find patterns and connections between entities, and 3) narrow down the corpus to a small fraction of relevant documents that users can quickly read. We apply the tools offered by VizLinc to a subset of the New York Times Annotated Corpus and present use cases that demonstrate VizLinc's search and visualization features.

Detection and simulation of scenarios with hidden Markov models and event dependency graphs

March 15, 2010

Conference Paper

Author:

William M. Campbell

…

Published in:

Proc. IEEE Int. Conf. on Acoustics, Speech, and Signal Processing, ICASSP, 15 March 2010, pp. 5434-5437.

Topic:

social network

R&D area:

Cyber Security and Information Sciences

R&D group:

Artificial Intelligence Technology and Systems

Summary

The wide availability of signal processing and language tools to extract structured data from raw content has created a new opportunity for the processing of structured signals. In this work, we explore models for the simulation and recognition of scenarios - i.e., time sequences of structured data. For simulation, we construct two models - hidden Markov models (HMMs) and event dependency graphs. Combined, these two simulation methods allow the specification of dependencies in event ordering, simultaneous execution of multiple scenarios, and evolving networks of data. For scenario recognition, we consider the application of multi-grained HMMs. We explore, in detail, mismatch between training scenarios and simulated test scenarios. The methods are applied to terrorist scenario detection with a simulation coded by a subject matter expert.
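To make the HMM part of this summary concrete, the sketch below simulates a scenario as a time sequence of events from a small left-to-right HMM and scores an observed sequence with the forward algorithm. It is a toy example under stated assumptions: the states, event names, and probabilities are all hypothetical, and it does not reproduce the paper's event dependency graphs or multi-grained HMMs.

```python
import random

# Hypothetical scenario phases (hidden states) and observable event types.
STATES = ["planning", "acquisition", "execution"]
START = {"planning": 1.0, "acquisition": 0.0, "execution": 0.0}
TRANS = {
    "planning": {"planning": 0.6, "acquisition": 0.4, "execution": 0.0},
    "acquisition": {"planning": 0.0, "acquisition": 0.5, "execution": 0.5},
    "execution": {"planning": 0.0, "acquisition": 0.0, "execution": 1.0},
}
EMIT = {
    "planning": {"meeting": 0.7, "purchase": 0.2, "travel": 0.1},
    "acquisition": {"meeting": 0.2, "purchase": 0.7, "travel": 0.1},
    "execution": {"meeting": 0.1, "purchase": 0.1, "travel": 0.8},
}


def draw(dist, rng):
    """Sample one key from a {key: probability} dictionary."""
    keys = list(dist)
    return rng.choices(keys, weights=[dist[k] for k in keys])[0]


def simulate(length, seed=0):
    """Generate `length` (state, event) pairs from the HMM."""
    rng = random.Random(seed)
    state = draw(START, rng)
    sequence = []
    for _ in range(length):
        sequence.append((state, draw(EMIT[state], rng)))
        state = draw(TRANS[state], rng)
    return sequence


def forward_likelihood(events):
    """Probability of an observed event sequence under the HMM (forward algorithm)."""
    alpha = {s: START[s] * EMIT[s][events[0]] for s in STATES}
    for event in events[1:]:
        alpha = {
            s: EMIT[s][event] * sum(alpha[p] * TRANS[p][s] for p in STATES)
            for s in STATES
        }
    return sum(alpha.values())


if __name__ == "__main__":
    scenario = simulate(6, seed=1)
    print(scenario)
    print(forward_likelihood([event for _, event in scenario]))
```

Recognition in this simplified setting amounts to comparing such likelihoods across competing scenario models and choosing the one that best explains the observed events.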
