Publications

Interactive supercomputing on 40,000 cores for machine learning and data analysis

Summary

Interactive massively parallel computations are critical for machine learning and data analysis. These computations are a staple of the MIT Lincoln Laboratory Supercomputing Center (LLSC) and have required the LLSC to develop unique interactive supercomputing capabilities. Scaling interactive machine learning frameworks, such as TensorFlow, and data analysis environments, such as MATLAB/Octave, to tens of thousands of cores presents many technical challenges, in particular rapidly dispatching many tasks through a scheduler, such as Slurm, and starting many instances of applications with thousands of dependencies. Careful tuning of launches and prepositioning of applications overcome these challenges and allow thousands of tasks to be launched in seconds on a 40,000-core supercomputer. Specifically, this work demonstrates launching 32,000 TensorFlow processes in 4 seconds and launching 262,000 Octave processes in 40 seconds. These capabilities allow researchers to rapidly explore novel machine learning architectures and data analysis algorithms.
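
The launch pattern described above can be sketched in a few lines. This is a minimal illustration only, assuming an already-held Slurm allocation; the task count, script name, and node-local path are hypothetical placeholders, not the LLSC's actual configuration.

    import subprocess

    N_TASKS = 32768                   # illustrative task count
    LOCAL_APP = "/tmp/run_model.py"   # hypothetical node-local destination

    # Preposition the application on node-local storage so thousands of
    # simultaneous process starts do not all hit the shared filesystem;
    # Slurm's sbcast broadcasts a file to every node in the allocation.
    subprocess.run(["sbcast", "--force", "run_model.py", LOCAL_APP],
                   check=True)

    # Dispatch every task through a single srun call so the scheduler
    # services one launch request rather than tens of thousands.
    subprocess.run(["srun", f"--ntasks={N_TASKS}", "python", LOCAL_APP],
                   check=True)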

Large-scale Bayesian kinship analysis

Summary

Kinship prediction in forensics is limited to first-degree relatives due to the small number of short tandem repeat loci characterized. The Genetic Chain Rule for Probabilistic Kinship Estimation can leverage large panels of single nucleotide polymorphisms (SNPs), or sets of sequence-linked SNPs called haploblocks, to estimate more distant relationships between individuals. This method uses allele frequencies and Markov chain Monte Carlo methods to determine kinship probabilities. Allele frequencies are a crucial input to this method. Since these frequencies are estimated from finite populations and many alleles are rare, a Bayesian extension to the algorithm has been developed to determine credible intervals for kinship estimates as a function of the certainty in allele frequency estimates. Generating samples large enough to accurately estimate credible intervals can take significant computational resources. In this paper, we leverage hundreds of compute cores to generate large numbers of Dirichlet random samples for Bayesian kinship prediction. We show that it is possible to generate 2,097,152 random samples on 32,768 cores at a rate of 29.68 samples per second. The ability to generate extremely large numbers of samples enables the computation of more statistically significant results from a Bayesian approach to kinship analysis.
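
A minimal sketch of the Dirichlet-sampling step, using NumPy on a single core; the allele counts are hypothetical, and the paper's parallel generation across thousands of cores is not reproduced here.

    import numpy as np

    rng = np.random.default_rng(0)

    # Hypothetical observed allele counts at one locus. With a uniform
    # Dirichlet prior, the posterior over allele frequencies is
    # Dirichlet(counts + 1).
    counts = np.array([412, 97, 8, 3])
    samples = rng.dirichlet(counts + 1, size=100_000)

    # 95% credible interval for the rare fourth allele's frequency, the
    # kind of uncertainty that propagates into the kinship estimate.
    lo, hi = np.percentile(samples[:, 3], [2.5, 97.5])
    print(f"95% credible interval: [{lo:.5f}, {hi:.5f}]")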

GraphChallenge.org: raising the bar on graph analytic performance

Summary

The rise of graph analytic systems has created a need for new ways to measure and compare their capabilities. The MIT/Amazon/IEEE Graph Challenge has been developed to provide a well-defined community venue for stimulating research and highlighting innovations in graph analysis software, hardware, algorithms, and systems. GraphChallenge.org provides a wide range of preparsed graph data sets, graph generators, mathematically defined graph algorithms, example serial implementations in a variety of languages, and specific metrics for measuring performance. Graph Challenge 2017 received 22 submissions by 111 authors from 36 organizations. The submissions highlighted graph analytic innovations in hardware, software, algorithms, systems, and visualization. These submissions produced many comparable performance measurements that can be used for assessing the current state of the art of the field. Numerous submissions implemented the triangle counting challenge, resulting in over 350 distinct measurements. Analysis of these submissions shows that their execution time is a strong function of the number of edges in the graph, Ne, and is typically proportional to Ne^(4/3) for large values of Ne. Combining the model fits of the submissions presents a picture of the current state of the art of graph analysis, which is typically 10^8 edges processed per second for graphs with 10^8 edges. These results are 30 times faster than serial implementations commonly used by many graph analysts and underscore the importance of making these performance benefits available to the broader community. Graph Challenge provides a clear picture of current graph analysis systems and underscores the need for new innovations to achieve high performance on very large graphs.
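
For reference, the triangle counting challenge admits a compact serial formulation; the sketch below uses the standard adjacency-matrix identity rather than any particular submission's optimized kernel.

    import numpy as np

    def count_triangles(A: np.ndarray) -> int:
        # For a symmetric 0/1 adjacency matrix with zero diagonal, each
        # triangle yields 6 closed length-3 walks, so the triangle count
        # is trace(A^3) / 6.
        return int(np.trace(A @ A @ A)) // 6

    # Four vertices, five edges, two triangles: {0,1,3} and {1,2,3}.
    A = np.array([[0, 1, 0, 1],
                  [1, 0, 1, 1],
                  [0, 1, 0, 1],
                  [1, 1, 1, 0]])
    print(count_triangles(A))  # 2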

Functionality and security co-design environment for embedded systems

Published in:
IEEE High Performance Extreme Computing Conf., HPEC, 25-27 September 2018.

Summary

For decades, embedded systems, ranging from intelligence, surveillance, and reconnaissance (ISR) sensors to electronic warfare and electronic signal intelligence systems, have been an integral part of U.S. Department of Defense (DoD) mission systems. These embedded systems are increasingly the targets of deliberate and sophisticated attacks. Developers thus need to focus equally on functionality and security in both hardware and software development. For critical missions, these systems must be entrusted to perform their intended functions, prevent attacks, and even operate with resilience under attacks. The processor in a critical system must thus provide not only a root of trust, but also a foundation to monitor mission functions, detect anomalies, and perform recovery. We have developed a Lincoln Asymmetric Multicore Processing (LAMP) architecture, which mitigates adversarial cyber effects with separation and cryptography and provides a foundation to build a resilient embedded system. We will describe a design environment that we have created to enable the co-design of functionality and security for mission assurance.

Linear and rotational microhydraulic actuators driven by electrowetting

Published in:
Sci. Robot., Vol. 3, No. 22, 19 September 2018.

Summary

Microhydraulic actuators offer a new way to convert electrical power to mechanical power on a microscale with an unmatched combination of power density and efficiency. Actuators work by combining surface tension force contributions from a large number of droplets distorted by electrowetting electrodes. This paper reports on the behavior of microgram-scale linear and rotational microhydraulic actuators with output force/weight ratios of 5500, cycle frequencies of 4 kilohertz, <1-micrometer movement precision, and accelerations of 3 kilometers/second². The power density and the efficiency of the actuators were characterized by simultaneously measuring the mechanical work performed and the electrical power applied. Maximum output power density was 0.93 kilowatt/kilogram, comparable with the best electric motors. At maximum power, the actuator was 60% efficient, but efficiencies were as high as 83% at lower power. Rotational actuators demonstrated a torque density of 79 newton-meters/kilogram, substantially more than electric motors of comparable diameter. Scaling the droplet pitch from 100 to 48 micrometers increased power density from 0.27 to 0.93 kilowatt/kilogram, validating the quadratic scaling of actuator power with inverse droplet pitch.
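
As a rough consistency check on the claimed pitch scaling (all numbers taken from the summary above), halving the droplet pitch should roughly quadruple the power density:

    \[
    \frac{P(48\,\mu\text{m})}{P(100\,\mu\text{m})}
      \approx \left(\frac{100}{48}\right)^{2} \approx 4.3,
    \]

which is in line with the measured ratio 0.93/0.27 ≈ 3.4.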

Neural network topologies for sparse training

Published in:
https://arxiv.org/abs/1809.05242

Summary

The sizes of deep neural networks (DNNs) are rapidly outgrowing the capacity of hardware to store and train them. Research over the past few decades has explored the prospect of sparsifying DNNs before, during, and after training by pruning edges from the underlying topology. The resulting neural network is known as a sparse neural network. More recent work has demonstrated the remarkable result that certain sparse DNNs can train to the same precision as dense DNNs at lower runtime and storage cost. An intriguing class of these sparse DNNs is the X-Nets, which are initialized and trained upon a sparse topology with neither reference to a parent dense DNN nor subsequent pruning. We present an algorithm that deterministically generates sparse DNN topologies that, as a whole, are much more diverse than X-Net topologies, while preserving X-Nets' desired characteristics.
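
A toy sketch of training on a fixed sparse topology; the deterministic banded pattern below is a hypothetical stand-in for illustration, not the paper's generation algorithm or the X-Net construction.

    import numpy as np

    def banded_mask(n_in: int, n_out: int, fan_out: int) -> np.ndarray:
        # Deterministic sparse connectivity: input unit i connects to
        # fan_out consecutive outputs at a stride-mapped offset, keeping
        # every unit's degree identical.
        mask = np.zeros((n_in, n_out), dtype=bool)
        for i in range(n_in):
            start = (i * n_out) // n_in
            for k in range(fan_out):
                mask[i, (start + k) % n_out] = True
        return mask

    mask = banded_mask(8, 8, fan_out=3)
    weights = np.random.randn(8, 8) * mask  # only masked edges are trained
    print(int(mask.sum()), "of", mask.size, "edges kept")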

Don't even ask: database access control through query control

Summary

This paper presents a vision and description for query control, which is a paradigm for database access control. In this model, individual queries are examined before being executed and are either allowed or denied by a pre-defined policy. Traditional view-based database access control requires the enforcer to view the query, the records, or both. That may present difficulty when the enforcer is not allowed to view database contents or the query itself. This discussion of query control arises from our experience with privacy-preserving encrypted databases, in which no single entity learns both the query and the database contents. Query control is also a good fit for enforcing rules and regulations that are not well-addressed by view-based access control. With the rise of federated database management systems, we believe that new approaches to access control will be increasingly important.
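
A minimal plaintext sketch of the query control paradigm, with a hypothetical deny-rule policy; real policies would be richer than regular expressions, and the paper's encrypted-database setting, where no single entity sees both the query and the data, requires more machinery than shown here.

    import re

    # Hypothetical policy: a query executes only if no rule denies it.
    DENY_RULES = [
        ("no SELECT *", re.compile(r"select\s+\*", re.I)),
        ("no access to the salaries table", re.compile(r"\bsalaries\b", re.I)),
    ]

    def allow(query: str) -> bool:
        # Query control: examine the query before execution and allow or
        # deny it against a predefined policy, never inspecting records.
        return not any(rule.search(query) for _, rule in DENY_RULES)

    print(allow("SELECT name FROM employees WHERE dept = 'radar'"))  # True
    print(allow("SELECT * FROM salaries"))                           # False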

Human-machine collaborative optimization via apprenticeship scheduling

Summary

Coordinating agents to complete a set of tasks with intercoupled temporal and resource constraints is computationally challenging, yet human domain experts can solve these difficult scheduling problems using paradigms learned through years of apprenticeship. A process for manually codifying this domain knowledge within a computational framework is necessary to scale beyond the "single-expert, single-trainee" apprenticeship model. However, human domain experts often have difficulty describing their decision-making processes. We propose a new approach for capturing this decision-making process through counterfactual reasoning in pairwise comparisons. Our approach is model-free and does not require iterating through the state space. We demonstrate that this approach accurately learns multifaceted heuristics on synthetic and real-world data sets. We also demonstrate that policies learned from human scheduling demonstrations via apprenticeship learning can substantially improve the efficiency of schedule optimization. We employ this human-machine collaborative optimization technique on a variant of the weapon-to-target assignment problem and demonstrate that it generates optimal solutions up to 9.5 times faster than a state-of-the-art optimization algorithm.
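
A compact sketch of the pairwise-comparison learning scheme, with synthetic features and a toy "expert" rule standing in for real demonstrations; the feature set and classifier here are illustrative assumptions, not the paper's exact formulation.

    import numpy as np
    from sklearn.linear_model import LogisticRegression

    def pairwise_examples(cands: np.ndarray, chosen: int):
        # Counterfactual pairwise comparisons: (chosen - alternative) is a
        # positive example, (alternative - chosen) a negative one.
        X, y = [], []
        for j in range(len(cands)):
            if j != chosen:
                X += [cands[chosen] - cands[j], cands[j] - cands[chosen]]
                y += [1, 0]
        return X, y

    rng = np.random.default_rng(1)
    X, y = [], []
    for _ in range(200):                      # synthetic decision points
        cands = rng.normal(size=(5, 3))       # 5 candidate tasks, 3 features
        chosen = int(np.argmin(cands[:, 0]))  # toy expert: min deadline slack
        Xd, yd = pairwise_examples(cands, chosen)
        X += Xd; y += yd

    policy = LogisticRegression().fit(np.array(X), np.array(y))

    # At run time, rank candidates by how often the learned policy prefers
    # them over the alternatives, and schedule the top-ranked task.
    new = rng.normal(size=(5, 3))
    scores = [policy.predict_proba(new[i] - np.delete(new, i, 0))[:, 1].mean()
              for i in range(5)]
    print("schedule task", int(np.argmax(scores)))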

Valleytronics: opportunities, challenges, and paths forward

Summary

A lack of inversion symmetry coupled with the presence of time-reversal symmetry endows 2D transition metal dichalcogenides with individually addressable valleys in momentum space at the K and K' points in the first Brillouin zone. This valley addressability opens up the possibility of using the momentum state of electrons, holes, or excitons as a completely new paradigm in information processing. The opportunities and challenges associated with manipulation of the valley degree of freedom for practical quantum and classical information processing applications were analyzed during the 2017 Workshop on Valleytronic Materials, Architectures, and Devices; this Review presents the major findings of the workshop.

Modeling and validation of a mm-wave shaped dielectric lens antenna

Published in:
2018 Int. Applied Computational Electromagnetics Society Symp., ACES, 29 July - 1 August 2018.

Summary

The modeling and validation of a 33 GHz shaped dielectric antenna design is investigated. The electromagnetic modeling was performed in both WIPL-D and FEKO, and was used to validate the antenna design prior to fabrication of the lens. It is shown that both WIPL-D and FEKO yield similarly accurate results as compared to measured far-field gain radiation patterns.