Publications

Refine Results

(Filters Applied) Clear All

PATHATTACK: attacking shortest paths in complex networks

Summary

Shortest paths in complex networks play key roles in many applications. Examples include routing packets in a computer network, routing traffic on a transportation network, and inferring semantic distances between concepts on the World Wide Web. An adversary with the capability to perturb the graph might make the shortest path between two nodes route traffic through advantageous portions of the graph (e.g., a toll road he owns). In this paper, we introduce the Force Path Cut problem, in which there is a specific route the adversary wants to promote by removing a minimum number of edges in the graph. We show that Force Path Cut is NP-complete, but also that it can be recast as an instance of the Weighted Set Cover problem, enabling the use of approximation algorithms. The size of the universe for the set cover problem is potentially factorial in the number of nodes. To overcome this hurdle, we propose the PATHATTACK algorithm, which via constraint generation considers only a small subset of paths|at most 5% of the number of edges in 99% of our experiments. Across a diverse set of synthetic and real networks, the linear programming formulation of Weighted Set Cover yields the optimal solution in over 98% of cases. We also demonstrate a time/cost tradeoff using two approximation algorithms and greedy baseline methods. This work provides a foundation for addressing similar problems and expands the area of adversarial graph mining beyond recent work on node classification and embedding.
READ LESS

Summary

Shortest paths in complex networks play key roles in many applications. Examples include routing packets in a computer network, routing traffic on a transportation network, and inferring semantic distances between concepts on the World Wide Web. An adversary with the capability to perturb the graph might make the shortest path...

READ MORE

Health-informed policy gradients for multi-agent reinforcement learning

Summary

This paper proposes a definition of system health in the context of multiple agents optimizing a joint reward function. We use this definition as a credit assignment term in a policy gradient algorithm to distinguish the contributions of individual agents to the global reward. The health-informed credit assignment is then extended to a multi-agent variant of the proximal policy optimization algorithm and demonstrated on simple particle environments that have elements of system health, risk-taking, semi-expendable agents, and partial observability. We show significant improvement in learning performance compared to policy gradient methods that do not perform multi-agent credit assignment.
READ LESS

Summary

This paper proposes a definition of system health in the context of multiple agents optimizing a joint reward function. We use this definition as a credit assignment term in a policy gradient algorithm to distinguish the contributions of individual agents to the global reward. The health-informed credit assignment is then...

READ MORE

Combating Misinformation: HLT Highlights from MIT Lincoln Laboratory

Published in:
Human Language Technology Conference (HLTCon), 16-18 March 2021.

Summary

Dr. Joseph Campbell shares several human language technologies highlights from MIT Lincoln Laboratory. These include key enabling technologies in combating misinformation to link personas, analyze content, and understand human networks. Developing operationally relevant technologies requires access to corresponding data with meaningful evaluations, as Dr. Douglas Reynolds presented in his keynote. As Dr. Danelle Shah discussed in her keynote, it’s crucial to develop these technologies to operate at deeper levels than the surface. Producing reliable information from the fusion of missing and inherently unreliable information channels is paramount. Furthermore, the dynamic misinformation environment and the coevolution of allied methods with adversarial methods represent additional challenges
READ LESS

Summary

Dr. Joseph Campbell shares several human language technologies highlights from MIT Lincoln Laboratory. These include key enabling technologies in combating misinformation to link personas, analyze content, and understand human networks. Developing operationally relevant technologies requires access to corresponding data with meaningful evaluations, as Dr. Douglas Reynolds presented in his keynote...

READ MORE

Combating Misinformation: What HLT Can (and Can't) Do When Words Don't Say What They Mean

Author:
Published in:
Human Language Technology Conference (HLTCon), 16-18 March 2021.

Summary

Misinformation, disinformation, and “fake news” have been used as a means of influence for millennia, but the proliferation of the internet and social media in the 21st century has enabled nefarious campaigns to achieve unprecedented scale, speed, precision, and effectiveness. In the past few years, there has been significant recognition of the threats posed by malign influence operations to geopolitical relations, democratic institutions and processes, public health and safety, and more. At the same time, the digitization of communication offers tremendous opportunities for human language technologies (HLT) to observe, interpret, and understand this publicly available content. The ability to infer intent and impact, however, remains much more elusive.
READ LESS

Summary

Misinformation, disinformation, and “fake news” have been used as a means of influence for millennia, but the proliferation of the internet and social media in the 21st century has enabled nefarious campaigns to achieve unprecedented scale, speed, precision, and effectiveness. In the past few years, there has been significant recognition...

READ MORE

More than a fair share: Network Data Remanence attacks against secret sharing-based schemes

Published in:
Network and Distributed Systems Security Symp., NDSS, 23-26 February 2021.

Summary

With progress toward a practical quantum computer has come an increasingly rapid search for quantum-safe, secure communication schemes that do not rely on discrete logarithm or factorization problems. One such encryption scheme, Multi-path Switching with Secret Sharing (MSSS), combines secret sharing with multi-path switching to achieve security as long as the adversary does not have global observability of all paths and thus cannot capture enough shares to reconstruct messages. MSSS assumes that sending a share on a path is an atomic operation and all paths have the same delay. In this paper, we identify a side-channel vulnerability for MSSS, created by the fact that in real networks, sending a share is not an atomic operation as paths have multiple hops and different delays. This channel, referred to as Network Data Remanence (NDR), is present in all schemes like MSSS whose security relies on transfer atomicity and all paths having same delay. We demonstrate the presence of NDR in a physical testbed. We then identify two new attacks that aim to exploit the side channel, referred to as NDR Blind and NDR Planned, propose an analytical model to analyze the attacks, and demonstrate them using an implementation of MSSS based on the ONOS SDN controller. Finally, we present a countermeasure for the attacks and show its effectiveness in simulations and Mininet experiments.
READ LESS

Summary

With progress toward a practical quantum computer has come an increasingly rapid search for quantum-safe, secure communication schemes that do not rely on discrete logarithm or factorization problems. One such encryption scheme, Multi-path Switching with Secret Sharing (MSSS), combines secret sharing with multi-path switching to achieve security as long as...

READ MORE

Beyond expertise and roles: a framework to characterize the stakeholders of interpretable machine learning and their needs

Published in:
Proc. Conf. on Human Factors in Computing Systems, 8-13 May 2021, article no. 74.

Summary

To ensure accountability and mitigate harm, it is critical that diverse stakeholders can interrogate black-box automated systems and find information that is understandable, relevant, and useful to them. In this paper, we eschew prior expertise- and role-based categorizations of interpretability stakeholders in favor of a more granular framework that decouples stakeholders' knowledge from their interpretability needs. We characterize stakeholders by their formal, instrumental, and personal knowledge and how it manifests in the contexts of machine learning, the data domain, and the general milieu. We additionally distill a hierarchical typology of stakeholder needs that distinguishes higher-level domain goals from lower-level interpretability tasks. In assessing the descriptive, evaluative, and generative powers of our framework, we find our more nuanced treatment of stakeholders reveals gaps and opportunities in the interpretability literature, adds precision to the design and comparison of user studies, and facilitates a more reflexive approach to conducting this research.
READ LESS

Summary

To ensure accountability and mitigate harm, it is critical that diverse stakeholders can interrogate black-box automated systems and find information that is understandable, relevant, and useful to them. In this paper, we eschew prior expertise- and role-based categorizations of interpretability stakeholders in favor of a more granular framework that decouples...

READ MORE

Seasonal Inhomogeneous Nonconsecutive Arrival Process Search and Evaluation

Published in:
25th International Conference on Pattern Recognition [submitted]

Summary

Time series often exhibit seasonal patterns, and identification of these patterns is essential to understanding thedata and predicting future behavior. Most methods train onlarge datasets and can fail to predict far past the training data. This limitation becomes more pronounced when data is sparse. This paper presents a method to fit a model to seasonal time series data that maintains predictive power when data is limited. This method, called SINAPSE, combines statistical model fitting with an information criteria to search for disjoint, andpossibly nonconsecutive, regimes underlying the data, allowing for a sparse representation resistant to overfitting.
READ LESS

Summary

Time series often exhibit seasonal patterns, and identification of these patterns is essential to understanding thedata and predicting future behavior. Most methods train onlarge datasets and can fail to predict far past the training data. This limitation becomes more pronounced when data is sparse. This paper presents a method to...

READ MORE

The Speech Enhancement via Attention Masking Network (SEAMNET): an end-to-end system for joint suppression of noise and reverberation [early access]

Published in:
IEEE/ACM Trans. on Audio, Speech, and Language Processing, Vol. 29, 2021, pp. 515-26.

Summary

This paper proposes the Speech Enhancement via Attention Masking Network (SEAMNET), a neural network-based end-to-end single-channel speech enhancement system designed for joint suppression of noise and reverberation. It formalizes an end-to-end network architecture, referred to as b-Net, which accomplishes noise suppression through attention masking in a learned embedding space. A key contribution of SEAMNET is that the b-Net architecture contains both an enhancement and an autoencoder path. This paper proposes a novel loss function which simultaneously trains both the enhancement and the autoencoder paths, so that disabling the masking mechanism during inference causes SEAMNET to reconstruct the input speech signal. This allows dynamic control of the level of suppression applied by SEAMNET via a minimum gain level, which is not possible in other state-of-the-art approaches to end-to-end speech enhancement. This paper also proposes a perceptually-motivated waveform distance measure. In addition to the b-Net architecture, this paper proposes a novel method for designing target waveforms for network training, so that joint suppression of additive noise and reverberation can be performed by an end-to-end enhancement system, which has not been previously possible. Experimental results show the SEAMNET system to outperform a variety of state-of-the-art baselines systems, both in terms of objective speech quality measures and subjective listening tests. Finally, this paper draws parallels between SEAMNET and conventional statistical model-based enhancement approaches, offering interpretability of many network components.
READ LESS

Summary

This paper proposes the Speech Enhancement via Attention Masking Network (SEAMNET), a neural network-based end-to-end single-channel speech enhancement system designed for joint suppression of noise and reverberation. It formalizes an end-to-end network architecture, referred to as b-Net, which accomplishes noise suppression through attention masking in a learned embedding space. A...

READ MORE

Information Aware max-norm Dirichlet networks for predictive uncertainty estimation

Published in:
Neural Netw., Vol. 135, 2021, pp. 105–114.

Summary

Precise estimation of uncertainty in predictions for AI systems is a critical factor in ensuring trust and safety. Deep neural networks trained with a conventional method are prone to over-confident predictions. In contrast to Bayesian neural networks that learn approximate distributions on weights to infer prediction confidence, we propose a novel method, Information Aware Dirichlet networks, that learn an explicit Dirichlet prior distribution on predictive distributions by minimizing a bound on the expected max norm of the prediction error and penalizing information associated with incorrect outcomes. Properties of the new cost function are derived to indicate how improved uncertainty estimation is achieved. Experiments using real datasets show that our technique outperforms, by a large margin, state-of-the-art neural networks for estimating within-distribution and out-of-distribution uncertainty, and detecting adversarial examples.
READ LESS

Summary

Precise estimation of uncertainty in predictions for AI systems is a critical factor in ensuring trust and safety. Deep neural networks trained with a conventional method are prone to over-confident predictions. In contrast to Bayesian neural networks that learn approximate distributions on weights to infer prediction confidence, we propose a...

READ MORE

The 2019 NIST Speaker Recognition Evaluation CTS Challenge

Published in:
The Speaker and Language Recognition Workshop: Odyssey 2020, 1-5 November 2020.

Summary

In 2019, the U.S. National Institute of Standards and Technology (NIST) conducted a leaderboard style speaker recognition challenge using conversational telephone speech (CTS) data extracted from the unexposed portion of the Call My Net 2 (CMN2) corpus previously used in the 2018 Speaker Recognition Evaluation (SRE). The SRE19 CTS Challenge was organized in a similar manner to SRE18, except it offered only the open training condition. In addition, similar to the NIST i-vector challenge, the evaluation set consisted of two subsets: a progress subset, and a test subset. The progress subset comprised 30% of the trials and was used to monitor progress on the leaderboad, while the remaining 70% of the trials formed the test subset, which was used to generate the official final results determined at the end of the challenge. Which subset (i.e., progress or test) a trial belonged to was unknown to challenge participants, and each system submission had to contain outputs for all of trials. The CTS Challenge also served as a prerequisite for entrance to the main SRE19 whose primary task was audio-visual person recognition. A total of 67 organizations (forming 51 teams) from academia and industry participated in the CTS Challenge and submitted 1347 valid system outputs. This paper presents an overview of the evaluation and several analyses of system performance for all primary conditions in the CTS Challenge. Compared to the CTS track of the SRE18, the SRE19 CTS Challenge results indicate remarkable improvements in performance which are mainly attributed to 1) the availability of large amounts of in-domain development data from a large number of labeled speakers, 2) speaker representations (aka embeddings) extracted using extended and more complex end-to-end neural network frameworks, and 3) effective use of the provided large development set.
READ LESS

Summary

In 2019, the U.S. National Institute of Standards and Technology (NIST) conducted a leaderboard style speaker recognition challenge using conversational telephone speech (CTS) data extracted from the unexposed portion of the Call My Net 2 (CMN2) corpus previously used in the 2018 Speaker Recognition Evaluation (SRE). The SRE19 CTS Challenge...

READ MORE