Publications
Attacking Embeddings to Counter Community Detection
Summary
Community detection can be an extremely useful data triage tool, enabling a data analyst to split a large network into smaller portions for a deeper analysis. If, however, a particular node wanted to avoid scrutiny, it could strategically create new connections that make it seem uninteresting. In this work, we investigate...
Seasonal Inhomogeneous Nonconsecutive Arrival Process Search and Evaluation
Summary
Seasonal data may display different distributions throughout the period of seasonality. We fit this type of model by determining the appropriate change points of the distribution and fitting parameters to each interval. This offers the added benefit of searching for disjoint regimes, which may denote the same distribution occurring nonconsecutively. Our algorithm...
Complex Network Effects on the Robustness of Graph Convolutional Networks
Summary
Vertex classification, the problem of identifying the class labels of nodes in a graph, has applicability in a wide variety of domains. Examples include classifying subject areas of papers in citation networks or roles of machines in a computer network. Recent work has demonstrated that vertex classification using graph convolutional networks is...
Bayesian estimation of PLDA with noisy training labels, with applications to speaker verification
Summary
This paper proposes a method for Bayesian estimation of probabilistic linear discriminant analysis (PLDA) when training labels are noisy. Label errors can be expected, e.g., in large or distributed data collections or in crowd-sourced data labeling. By interpreting true labels as latent random variables, the observed labels are modeled as...
Discriminative PLDA for speaker verification with X-vectors
Summary
This paper proposes a novel approach to discriminative training of probabilistic linear discriminant analysis (PLDA) for speaker verification with x-vectors. Model over-fitting is a well-known issue with discriminative PLDA (D-PLDA) for speaker verification. As opposed to prior approaches that address this by limiting the number of trainable parameters, the proposed method...
Topological effects on attacks against vertex classification
Summary
Vertex classification is vulnerable to perturbations of both graph topology and vertex attributes, as shown in recent research. As in other machine learning domains, concerns about robustness to adversarial manipulation can prevent potential users from adopting proposed methods when the consequence of action is very high. This paper considers two...
The JHU-MIT System Description for NIST SRE19 AV
Summary
This document represents the SRE19 AV submission by the team composed of JHU-CLSP, JHU-HLTCOE and MIT Lincoln Labs. All the developed systems for the audio and video conditions consisted of neural network embeddings with some flavor of PLDA/cosine back-end. Primary fusions obtained Actual DCF of 0.250 on SRE18 VAST eval, 0.183...
Graph matching via multi-scale heat diffusion
Summary
We propose a novel graph matching algorithm that uses ideas from graph signal processing to match vertices of graphs using alternative graph representations. Specifically, we consider a multi-scale heat diffusion on the graphs to create multiple weighted graph representations that incorporate both direct adjacencies as well as local structures induced...
Prototype and analytics for discovery and exploitation of threat networks on social media
Summary
Identifying and profiling threat actors are high-priority tasks for a number of governmental organizations. These threat actors may operate actively, using the Internet to promote propaganda, recruit new members, or exert command and control over their networks. Alternatively, threat actors may operate passively, demonstrating operational security awareness online while...
Identification and detection of human trafficking using language models
Summary
In this paper, we present a novel language model-based method for detecting both human trafficking ads and trafficking indicators. The proposed system leverages language models to learn language structures in adult service ads, automatically select a list of keyword features, and train a machine learning model to detect human trafficking...