
Hardware foundation for secure computing

Published in:
2020 IEEE High Performance Extreme Computing Conf., HPEC, 22-24 September 2020.


Software security solutions are often considered to be more adaptable than their hardware counterparts. However, software has to work within the limitations of the system hardware platform, of which the selection is often dictated by functionality rather than security. Performance issues of security solutions without proper hardware support are easy to understand. The real challenge, however, is in the dilemma of "what should be done?" vs. "what could be done?" Security software could become ineffective if its "liberal" assumptions, e.g., the availability of a substantial trusted computing base (TCB) on the hardware platform, are violated. To address this dilemma, we have been developing and prototyping a security-by-design hardware foundation platform that enhances mainstream microprocessors with proper hardware security primitives to support and enhance software security solutions. This paper describes our progress in the use of a customized security co-processor to provide security services.




Towards a distributed framework for multi-agent reinforcement learning research


Some of the most important publications in deep reinforcement learning over the last few years have been fueled by access to massive amounts of computation through large-scale distributed systems. The success of these approaches in achieving human-expert-level performance on several complex video-game environments has motivated further exploration into the limits of these approaches as computation increases. In this paper, we present a distributed RL training framework designed for supercomputing infrastructures such as the MIT SuperCloud. We review a collection of challenging learning environments (such as Google Research Football, StarCraft II, and Multi-Agent Mujoco) that are at the frontier of reinforcement learning research. We provide results on these environments that illustrate the current state of the field on these problems. Finally, we quantify and discuss the computational requirements of RL research by enumerating all experiments performed on these environments.




Attacking Embeddings to Counter Community Detection

Published in:
Network Science Society Conference 2020 [submitted]


Community detection can be an extremely useful data triage tool, enabling a data analyst to split a large network into smaller portions for a deeper analysis. If, however, a particular node wanted to avoid scrutiny, it could strategically create new connections that make it seem uninteresting. In this work, we investigate the use of a state-of-the-art attack against node embedding as a means of countering community detection while being blind to the attributes of others. The attack proposed in [1] attempts to maximize the loss function being minimized by a random-walk-based embedding method (where two nodes are made closer together the more often a random walk starting at one node ends at the other). We propose using this method to attack the community structure of the graph, specifically attacking the community assignment of an adversarial vertex. Since nodes in the same community tend to appear near each other in a random walk, their continuous-space embeddings also tend to be close. Thus, we aim to use the general embedding attack in an attempt to shift the community membership of the adversarial vertex. To test this strategy, we adopt an experimental framework as in [2], where each node is given a “temperature” indicating how interesting it is. A node’s temperature can be “hot,” “cold,” or “unknown.” A node can perturb itself by adding new edges to any other node in the graph. The node’s goal is to be placed in a community that is cold, i.e., where the average node temperature is less than 0. Of the 5 attacks proposed in [2], we use 2 in our experiments. The simpler attack is Cold and Lonely, which first connects to cold nodes, then unknown, then hot, and connects within each temperature in order of increasing degree. The more sophisticated attack is Stable Structure.
The procedure for this attack is to (1) identify stable structures (containing nodes assigned to the same community each time for several trials), (2) connect to nodes in order of increasing average temperature of their stable structures (randomly within a structure), and (3) connect to nodes with no stable structure in order of increasing temperature. As in [2], we use the Louvain modularity maximization technique for community detection. We slightly modify the embedding attack of [1] by only allowing addition of new edges and requiring that they include the adversary vertex. Since the embedding attack is blind to the temperatures of the nodes, experimenting with these attacks gives insight into how much this attribute information helps the adversary. Experimental results are shown in Figure 1. Graphs considered in these experiments are (1) a 500-node Erdos-Renyi graph with edge probability p = 0.02; (2) a stochastic block model with 5 communities of 100 nodes each and edge probabilities of p_in = 0.06 and p_out = 0.01; (3) the network of the Abu Sayyaf Group (ASG), a violent non-state Islamist group operating in the Philippines, where two nodes are linked if they both participate in at least one kidnapping event, with labels derived from stable structures (nodes together in at least 95% of 1000 Louvain trials); and (4) the Cora machine learning citation graph, with 7 classes based on subject area. Temperature is assigned to the Erdos-Renyi nodes randomly with probability 0.25, 0.5, and 0.25 for hot, unknown, and cold, respectively. For the other graphs, nodes with the same label as the target are hot, unknown, and cold with probability 0.35, 0.55, and 0.1, respectively, and the hot and cold probabilities are swapped for other labels.
The results demonstrate that, even without the temperature information, the embedding method is about as effective as Cold and Lonely when there is community structure to exploit, though it is not as effective as Stable Structure, which leverages both community structure and temperature information.
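
The connection order used by Cold and Lonely, as described in the abstract, can be sketched in a few lines of Python. This is an illustrative sketch only, not the authors' code: the function name, the dictionary-based graph representation, the exclusion of existing neighbors, and the tie-breaking by insertion order are all assumptions made here.

```python
# Hypothetical sketch of the Cold and Lonely ordering: cold nodes first,
# then unknown, then hot, and increasing degree within each temperature.
TEMPERATURE_RANK = {"cold": 0, "unknown": 1, "hot": 2}


def cold_and_lonely_order(adjacency, temperature, adversary):
    """Return candidate nodes in the order the adversary would connect to them.

    adjacency:   dict mapping each node to the set of its neighbors
    temperature: dict mapping each node to "cold", "unknown", or "hot"
    adversary:   the node that is adding edges
    """
    current_neighbors = adjacency[adversary]
    # Assumption: nodes already adjacent to the adversary are skipped.
    candidates = [
        node
        for node in adjacency
        if node != adversary and node not in current_neighbors
    ]
    # Sort by temperature rank first, then by degree (ascending).
    return sorted(
        candidates,
        key=lambda node: (TEMPERATURE_RANK[temperature[node]], len(adjacency[node])),
    )


# Toy graph: "a" is the adversary and is already connected to "b".
toy_adjacency = {
    "a": {"b"},
    "b": {"a", "c", "d"},
    "c": {"b"},
    "d": {"b", "e"},
    "e": {"d"},
    "f": set(),
}
toy_temperature = {
    "a": "unknown", "b": "hot", "c": "hot",
    "d": "cold", "e": "cold", "f": "unknown",
}
toy_order = cold_and_lonely_order(toy_adjacency, toy_temperature, "a")
```

In the toy graph, the two cold nodes come first (the lower-degree one leading), followed by the unknown node and then the remaining hot node.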




Seasonal Inhomogeneous Nonconsecutive Arrival Process Search and Evaluation

Published in:
International Conference on Artificial Intelligence and Statistics, 26-28 August 2020 [submitted]


Seasonal data may display different distributions throughout the period of seasonality. We fit this type of model by determining the appropriate change points of the distribution and fitting parameters to each interval. This offers the added benefit of searching for disjoint regimes, which may denote the same distribution occurring nonconsecutively. Our algorithm outperforms SARIMA for prediction.




Complex Network Effects on the Robustness of Graph Convolutional Networks


Vertex classification, the problem of identifying the class labels of nodes in a graph, has applicability in a wide variety of domains. Examples include classifying subject areas of papers in citation networks or roles of machines in a computer network. Recent work has demonstrated that vertex classification using graph convolutional networks is susceptible to targeted poisoning attacks, in which both graph structure and node attributes can be changed in an attempt to misclassify a target node. This vulnerability decreases users’ confidence in the learning method and can prevent adoption in high-stakes contexts. This paper presents the first work aimed at leveraging network characteristics to improve robustness of these methods. Our focus is on using network features to choose the training set, rather than selecting the training set at random. Our alternative methods of selecting training data are (1) to select the highest-degree nodes in each class and (2) to iteratively select the node with the most neighbors minimally connected to the training set. In the datasets on which the original attack was demonstrated, we show that changing the training set can make the network much harder to attack. To maintain a given probability of attack success, the adversary must use far more perturbations, often by a factor of 2–4 over the random training baseline. This increase in robustness is often as substantial as tripling the amount of randomly selected training data. Even in cases where success is relatively easy for the attacker, we show that classification performance degrades much more gradually using the proposed methods, with weaker incorrect predictions for the attacked nodes. Finally, we investigate the potential tradeoff between robustness and performance in various datasets.
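
The first selection method named in the abstract, choosing the highest-degree nodes in each class, can be illustrated with a minimal Python sketch. The function name, the `per_class` parameter, and the dictionary-based graph representation are assumptions introduced here for illustration; the second method (iterative selection) is not sketched because the abstract does not specify it in enough detail.

```python
from collections import defaultdict


def select_highest_degree_per_class(adjacency, labels, per_class):
    """Pick the `per_class` highest-degree nodes of every class as training data.

    adjacency: dict mapping each node to the set of its neighbors
    labels:    dict mapping each node to its class label
    per_class: number of training nodes to keep per class (hypothetical knob)
    """
    nodes_by_class = defaultdict(list)
    for node, label in labels.items():
        nodes_by_class[label].append(node)

    training = {}
    for label, nodes in nodes_by_class.items():
        # Highest degree first; ties keep their original order.
        ranked = sorted(nodes, key=lambda n: len(adjacency[n]), reverse=True)
        training[label] = ranked[:per_class]
    return training


# Toy graph with two classes, "x" and "y".
toy_graph = {1: {2, 3, 4}, 2: {1}, 3: {1, 4}, 4: {1, 3, 5}, 5: {4}}
toy_labels = {1: "x", 2: "x", 3: "y", 4: "y", 5: "y"}
toy_training = select_highest_degree_per_class(toy_graph, toy_labels, per_class=2)
```

Here class "x" contributes nodes 1 and 2, and class "y" contributes nodes 4 and 3, in decreasing order of degree.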




Deep implicit coordination graphs for multi-agent reinforcement learning [e-print]


Multi-agent reinforcement learning (MARL) requires coordination to efficiently solve certain tasks. Fully centralized control is often infeasible in such domains due to the size of joint action spaces. Coordination-graph-based formalization allows reasoning about the joint action based on the structure of interactions. However, coordination graphs often require domain expertise in their design. This paper introduces the deep implicit coordination graph (DICG) architecture for such scenarios. DICG consists of a module for inferring the dynamic coordination graph structure, which is then used by a graph-neural-network-based module to learn to implicitly reason about the joint actions or values. DICG allows learning the tradeoff between full centralization and decentralization via standard actor-critic methods to significantly improve coordination for domains with a large number of agents. We apply DICG to both centralized-training-centralized-execution and centralized-training-decentralized-execution regimes. We demonstrate that DICG solves the relative overgeneralization pathology in predator-prey tasks and outperforms various MARL baselines on the challenging StarCraft II Multi-agent Challenge (SMAC) and traffic junction environments.




Control systems need software security too: cyber-physical systems and safety-critical application domains must adopt widespread effective software defenses

Published in:
SIGNAL Mag., 1 June 2020.


Low-level embedded control systems are increasingly being targeted by adversaries, and there is a pressing need for stronger software defenses for such systems. The cyber-physical nature of such systems imposes real-time performance constraints not seen in enterprise computing systems, and such constraints fundamentally alter how software defenses should be designed and applied. MIT Lincoln Laboratory scientists demonstrated that current randomization-based defenses, which have low average-case overhead, can incur significant worst-case overhead that may be untenable in real-time applications, while some low-overhead enforcement-based defenses have low worst-case performance overheads, making them more amenable to real-time applications. Such defenses should be incorporated into a comprehensive resilient architecture with a strategy for failover and timely recovery in the case of a cyber threat.




75,000,000,000 streaming inserts/second using hierarchical hypersparse GraphBLAS matrices


The SuiteSparse GraphBLAS C library implements high-performance hypersparse matrices with bindings to a variety of languages (Python, Julia, and Matlab/Octave). GraphBLAS provides a lightweight in-memory database implementation of hypersparse matrices that are ideal for analyzing many types of network data, while providing rigorous mathematical guarantees, such as linearity. Streaming updates of hypersparse matrices put enormous pressure on the memory hierarchy. This work benchmarks an implementation of hierarchical hypersparse matrices that reduces memory pressure and dramatically increases the update rate into a hypersparse matrix. The parameters of hierarchical hypersparse matrices rely on controlling the number of entries in each level in the hierarchy before an update is cascaded. The parameters are easily tunable to achieve optimal performance for a variety of applications. Hierarchical hypersparse matrices achieve over 1,000,000 updates per second in a single instance. Scaling to 31,000 instances of hierarchical hypersparse matrices on 1,100 server nodes on the MIT SuperCloud achieved a sustained update rate of 75,000,000,000 updates per second. This capability allows the MIT SuperCloud to analyze extremely large streaming network data sets.
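
The cascading idea described in the abstract can be illustrated with a small Python sketch. It does not use GraphBLAS: a `collections.Counter` keyed by (row, column) stands in for a hypersparse matrix, and the class name, method names, and cutoff values are assumptions made here for illustration. Each level has an entry-count cutoff; when a level exceeds its cutoff, its contents are added into the next level and the level is cleared, so most updates touch only the small first level.

```python
from collections import Counter


class HierarchicalCounter:
    """Toy stand-in for a hierarchical hypersparse matrix (not GraphBLAS)."""

    def __init__(self, cutoffs):
        # cutoffs[i] is the maximum number of stored entries allowed in level i.
        self.cutoffs = list(cutoffs)
        # One level more than the number of cutoffs: the last level is unbounded.
        self.levels = [Counter() for _ in range(len(self.cutoffs) + 1)]

    def update(self, row, col, value=1):
        """Add `value` at (row, col) in the smallest level, cascading if needed."""
        self.levels[0][(row, col)] += value
        for i, cutoff in enumerate(self.cutoffs):
            if len(self.levels[i]) <= cutoff:
                break
            # Level i is too full: add it into level i + 1 and empty it.
            self.levels[i + 1].update(self.levels[i])
            self.levels[i].clear()

    def total(self):
        """Sum all levels; by linearity this equals the full matrix."""
        result = Counter()
        for level in self.levels:
            result.update(level)
        return result


matrix = HierarchicalCounter(cutoffs=[2, 4])
for row, col in [(0, 1), (0, 2), (0, 3), (0, 1)]:
    matrix.update(row, col)
combined = matrix.total()
```

After the third insert, the first level exceeds its cutoff of 2 and is folded into the second level; the fourth insert lands in the now-empty first level. Summing the levels recovers the same counts a single flat structure would hold.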




Bayesian estimation of PLDA with noisy training labels, with applications to speaker verification

Published in:
2020 IEEE Intl. Conf. on Acoustics, Speech and Signal Processing, ICASSP, 4-8 May 2020.


This paper proposes a method for Bayesian estimation of probabilistic linear discriminant analysis (PLDA) when training labels are noisy. Label errors can be expected during, e.g., large or distributed data collections, or for crowd-sourced data labeling. By interpreting true labels as latent random variables, the observed labels are modeled as outputs of a discrete memoryless channel, and the maximum a posteriori (MAP) estimate of the PLDA model is derived via Variational Bayes. The proposed framework can be used for PLDA estimation, PLDA domain adaptation, or to infer the reliability of a PLDA training list. Although presented as a general method, the paper discusses specific applications for speaker verification. When applied to the Speakers in the Wild (SITW) Task, the proposed method achieves graceful performance degradation when label errors are introduced into the training or domain adaptation lists. When applied to the NIST 2018 Speaker Recognition Evaluation (SRE18) Task, which includes adaptation data with noisy speaker labels, the proposed technique provides performance improvements relative to unsupervised domain adaptation.
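
The role of the discrete memoryless channel can be illustrated in isolation with a short Python sketch: given a prior over true labels and a channel matrix of label-flip probabilities, Bayes' rule yields a posterior over the true label for an observed label. This is only the channel component under assumed toy numbers; the paper's method additionally involves the PLDA model and Variational Bayes, which are not reproduced here. The function name and all numeric values are hypothetical.

```python
def true_label_posterior(prior, channel, observed):
    """Posterior over the true label given one observed (possibly wrong) label.

    prior:    prior[k] = P(true label = k)
    channel:  channel[k][j] = P(observed label = j | true label = k)
    observed: index of the observed label
    """
    # Joint probability of each true label together with the observation.
    joint = [p * channel[k][observed] for k, p in enumerate(prior)]
    normalizer = sum(joint)
    return [value / normalizer for value in joint]


# Toy two-speaker example with asymmetric label noise (assumed values).
toy_prior = [0.5, 0.5]
toy_channel = [
    [0.9, 0.1],  # true label 0 is observed correctly 90% of the time
    [0.2, 0.8],  # true label 1 is observed correctly 80% of the time
]
toy_posterior = true_label_posterior(toy_prior, toy_channel, observed=0)
```

With these numbers, observing label 0 leaves roughly an 18% chance that the true label is 1, which is the kind of residual uncertainty a noise-aware estimator can account for.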




One giant leap for computer security


Today's computer systems trace their roots to an era of trusted users and highly constrained hardware; thus, their designs fundamentally emphasize performance and discount security. This article presents a vision for how small steps using existing technologies can be combined into one giant leap for computer security.

