Publications
Link prediction methods for generating speaker content graphs
Summary
In a speaker content graph, vertices represent speech signals and edges represent speaker similarity. Link prediction methods calculate which potential edges are most likely to connect vertices from the same speaker; those edges are included in the generated speaker content graph. Since a variety of speaker recognition tasks can be...
Large-scale community detection on speaker content graphs
Summary
We consider the use of community detection algorithms to perform speaker clustering on content graphs built from large audio corpora. We survey the application of agglomerative hierarchical clustering, modularity optimization methods, and spectral clustering as well as two random walk algorithms: Markov clustering and Infomap. Our results on graphs built...
RECOG: Recognition and Exploration of Content Graphs
Summary
We present RECOG (Recognition and Exploration of COntent Graphs), a system for visualizing and interacting with speaker content graphs constructed from large data sets of speech recordings. In a speaker content graph, nodes represent speech signals and edges represent speaker similarity. First, we describe a layout algorithm that optimizes content...
Social network analysis with content and graphs
Summary
Social network analysis has undergone a renaissance with the ubiquity and quantity of content from social media, web pages, and sensors. This content is a rich data source for constructing and analyzing social networks, but its enormity and unstructured nature also present multiple challenges. Work at Lincoln Laboratory is addressing...
Graph embedding for speaker recognition
Summary
This chapter presents applications of graph embedding to the problem of text-independent speaker recognition. Speaker recognition is a general term encompassing multiple applications. At the core is the problem of speaker comparison: given two speech recordings (utterances), produce a score that measures speaker similarity. Using speaker comparison, other applications can be...
Query-by-example using speaker content graphs
Summary
We describe methods for constructing and using content graphs for query-by-example speaker recognition tasks within a large speech corpus. This goal is achieved as follows: First, we describe an algorithm for constructing speaker content graphs, where nodes represent speech signals and edges represent speaker similarity. Speech signal similarity can be...
Individual and group dynamics in the reality mining corpus
Summary
Though significant progress has been made in recent years, traditional work in social networks has focused on static network analysis or dynamics in a large-scale sense. In this work, we explore ways in which temporal information from sociographic data can be used for the analysis and prediction of individual and...
Exploring the impact of advanced front-end processing on NIST speaker recognition microphone tasks
Summary
The NIST speaker recognition evaluation (SRE) featured microphone data in the 2005-2010 evaluations. The preprocessing and use of this data have typically been performed with telephone bandwidth and quantization. Although this approach is viable, it ignores the richer properties of the microphone data: multiple channels, high-rate sampling, linear encoding, ambient noise...
Graph relational features for speaker recognition and mining
Summary
Recent advances in the field of speaker recognition have resulted in highly efficient speaker comparison algorithms. The advent of these algorithms allows for leveraging a background set, consisting of a large number of unlabeled recordings, to improve recognition. In this work, a relational graph, where nodes represent utterances and links represent...
NAP for high level language identification
Summary
Varying channel conditions present a difficult problem for many speech technologies such as language identification (LID). Channel compensation techniques have been shown to significantly improve performance in LID for acoustic systems. For high-level token systems, nuisance attribute projection (NAP) has been shown to perform well in the context of speaker...