Publications

VizLinc: integrating information extraction, search, graph analysis, and geo-location for the visual exploration of large data sets

Published in:
Proc. KDD 2014 Workshop on Interactive Data Exploration and Analytics, IDEA, 24 August 2014, pp. 10-18.

Summary

In this demo paper we introduce VizLinc, an open-source software suite that integrates automatic information extraction, search, graph analysis, and geo-location for interactive visualization and exploration of large data sets. VizLinc helps users 1) understand the type of information the data set under study might contain, 2) find patterns and connections between entities, and 3) narrow down the corpus to a small fraction of relevant documents that users can quickly read. We apply the tools offered by VizLinc to a subset of the New York Times Annotated Corpus and present use cases that demonstrate VizLinc's search and visualization features.

Content + context networks for user classification in Twitter

Published in:
Frontiers of Network Analysis, NIPS Workshop, 9 December 2013.

Summary

Twitter is a massive platform for open communication between diverse groups of people. While traditional media segregates the world's population along lines of language, age, physical location, social status, and many other characteristics, Twitter cuts through these divides. The result is an extremely diverse social network. In this work, we combine features of this network structure with content analytics on the tweets in order to create a content + context network, capturing the relations not only between people, but also between people and content and between content and content. This rich structure allows deep analysis of many aspects of communication over Twitter. We focus on predicting user classifications by using relational probability trees with features from content + context networks. Experiments demonstrate that these features are salient and complementary for user classification.

Link prediction methods for generating speaker content graphs

Published in:
ICASSP 2013, Proc. IEEE Int. Conf. on Acoustics, Speech and Signal Processing, 25-31 May 2013.

Summary

In a speaker content graph, vertices represent speech signals and edges represent speaker similarity. Link prediction methods calculate which potential edges are most likely to connect vertices from the same speaker; those edges are included in the generated speaker content graph. Since a variety of speaker recognition tasks can be performed on a content graph, we provide a set of metrics for evaluating the graph's quality independently of any recognition task. We then describe novel global and incremental algorithms for constructing accurate speaker content graphs that outperform the existing k-nearest-neighbors link prediction method. We evaluate those algorithms on a NIST speaker recognition corpus.

RECOG: Recognition and Exploration of Content Graphs

Published in:
Pacific Vision, 26 February - 1 March 2013.

Summary

We present RECOG (Recognition and Exploration of COntent Graphs), a system for visualizing and interacting with speaker content graphs constructed from large data sets of speech recordings. In a speaker content graph, nodes represent speech signals and edges represent speaker similarity. First, we describe a layout algorithm that optimizes content graphs for ease of navigation. We then present an interactive tool set that allows an end user to find and explore interesting occurrences in the corpus. We also present a tool set that allows a researcher to visualize the shortcomings of current content graph generation algorithms. RECOG's layout and tool sets were implemented as Gephi plugins [1].