Publications


Security Design of Mission-Critical Embedded Systems

Published in:
HPEC 2019: IEEE Conf. on High Performance Extreme Computing, 22-24 September 2019.

Summary

This tutorial explains a systematic approach to co-designing functionality and security into mission-critical embedded systems. The tutorial starts by reviewing common issues in embedded applications to define mission objectives, threat models, and security/resilience goals. We then present an overview of security technologies for achieving confidentiality, integrity, and availability given design criteria and a realistic threat model. The technologies range from practical cryptography and key management to protection of data at rest, data in transit, and data in use, and tamper resistance.

A major portion of the tutorial is dedicated to exploring the mission-critical embedded system solution space. We discuss the search for security vulnerabilities (red teaming) and the search for solutions (blue teaming). Besides the lecture, attendees, under instructor guidance, will perform realistic and meaningful hands-on exercises: defining mission and security objectives, assessing principal issues, applying technologies, and understanding their interactions. The instructor will provide an example application (distributed sensing, communicating, and computing) to be used in these exercises. Attendees may also bring their own applications for the exercises. Attendees are encouraged to work collaboratively throughout the development process, creating opportunities to learn from each other. During the exercises, attendees will consider the use of various security/resilience features, articulate and justify the use of resources, and assess the system's suitability for mission assurance. Attendees can expect to gain valuable insight and experience in the subject after completing the lecture and exercises.

The instructor, an expert and practitioner in the field, will offer insight, advice, and concrete examples and discussions. The tutorial draws from the instructor's decades of experience in secure, resilient systems and technology.

Streaming 1.9 billion hyperspace network updates per second with D4M

Summary

The Dynamic Distributed Dimensional Data Model (D4M) library implements associative arrays in a variety of languages (Python, Julia, and Matlab/Octave) and provides a lightweight in-memory database implementation of hypersparse arrays that are ideal for analyzing many types of network data. D4M relies on associative arrays, which combine properties of spreadsheets, databases, matrices, graphs, and networks while providing rigorous mathematical guarantees, such as linearity. Streaming updates of D4M associative arrays put enormous pressure on the memory hierarchy. This work describes the design and performance optimization of an implementation of hierarchical associative arrays that reduces memory pressure and dramatically increases the update rate into an associative array. The parameters of a hierarchical associative array control the number of entries held at each level of the hierarchy before an update is cascaded to the next; they are easily tunable to achieve optimal performance for a variety of applications. Hierarchical arrays achieve over 40,000 updates per second in a single instance. Scaling to 34,000 instances of hierarchical D4M associative arrays on 1,100 server nodes on the MIT SuperCloud achieved a sustained update rate of 1,900,000,000 updates per second. This capability allows the MIT SuperCloud to analyze extremely large streaming network data sets.
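
The cascading design lends itself to a short sketch. The following Python fragment is a minimal, hypothetical illustration of the hierarchical scheme described above (class and parameter names are ours, not the D4M API): each level is a small in-memory map with a capacity cutoff, and when a level fills, its entries are merged into the next, larger level.

```python
# Minimal sketch of a hierarchical associative array with cascading updates.
# Illustrative only: not the D4M library API; names and cutoffs are hypothetical.

class HierarchicalAssocArray:
    def __init__(self, cutoffs=(1_000, 100_000, 10_000_000)):
        # cutoffs[i] is the max number of entries held at level i
        # before its contents cascade into level i + 1.
        self.cutoffs = cutoffs
        self.levels = [dict() for _ in cutoffs]

    def update(self, key, value):
        # All updates land in the smallest (fastest) level first.
        level0 = self.levels[0]
        level0[key] = level0.get(key, 0) + value  # '+' as the collision rule
        self._cascade(0)

    def _cascade(self, i):
        # When level i exceeds its cutoff, merge it into level i + 1.
        while i + 1 < len(self.levels) and len(self.levels[i]) >= self.cutoffs[i]:
            nxt = self.levels[i + 1]
            for k, v in self.levels[i].items():
                nxt[k] = nxt.get(k, 0) + v
            self.levels[i].clear()
            i += 1

    def get(self, key):
        # A key may have partial sums at several levels; combine them.
        return sum(lvl.get(key, 0) for lvl in self.levels if key in lvl)

a = HierarchicalAssocArray(cutoffs=(4, 16, 64))
for i in range(100):
    a.update(("src%d" % (i % 8), "dst%d" % (i % 3)), 1)
print(a.get(("src0", "dst0")))
```

Tuning the per-level cutoffs trades insert latency against merge frequency, which is the knob the paper describes as easily tunable per application.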

Survey and benchmarking of machine learning accelerators

Published in:
IEEE High Performance Extreme Computing Conf., HPEC, 24-26 September 2019.

Summary

Advances in multicore processors and accelerators have opened the floodgates to greater exploration and application of machine learning techniques to a variety of applications. These advances, along with breakdowns of several trends, including Moore's Law, have prompted an explosion of processors and accelerators that promise even greater computational and machine learning capabilities. These processors and accelerators come in many forms, from CPUs and GPUs to ASICs, FPGAs, and dataflow accelerators. This paper surveys the current state of such processors and accelerators that have been publicly announced with performance and power consumption numbers. The performance and power values are plotted on a scatter graph, and a number of trends on this plot are discussed and analyzed; for instance, there are interesting trends regarding power consumption, numerical precision, and inference versus training. We then select and benchmark two commercially available low size, weight, and power (SWaP) accelerators, as these processors are the most interesting for the embedded and mobile machine learning inference applications most applicable to the DoD and other SWaP-constrained users. We determine how they actually perform with real-world images and neural network models, compare those results to the reported performance and power consumption values, and evaluate them against an Intel CPU used in some embedded applications.
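
The survey's central artifact is a scatter plot of peak performance against peak power on logarithmic axes. As a hedged illustration of that presentation (with made-up placeholder points, not the paper's data), a minimal matplotlib sketch might look like:

```python
# Sketch of a peak-performance vs. peak-power scatter plot on log-log axes,
# in the style of the survey. The data points below are placeholders, not
# values from the paper.
import matplotlib.pyplot as plt

accelerators = {
    # name: (peak power in watts, peak performance in GOPS/s) -- invented
    "embedded-a": (2, 1_000),
    "mobile-b": (10, 10_000),
    "datacenter-c": (300, 100_000),
}

fig, ax = plt.subplots()
for name, (watts, gops) in accelerators.items():
    ax.scatter(watts, gops)
    ax.annotate(name, (watts, gops))

ax.set_xscale("log")
ax.set_yscale("log")
ax.set_xlabel("Peak power (W)")
ax.set_ylabel("Peak performance (GOPS/s)")
ax.set_title("Accelerator survey (placeholder data)")
plt.show()
```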

Toward technically feasible and economically efficient integration of distributed energy resources

Published in:
57th Annual Allerton Conf. on Communication, Control, and Computing, 24-27 September 2019.

Summary

This paper first formulates the efficient and feasible participation of distributed energy resources (DERs) in complex electricity services as a centralized nonlinear optimization problem. This problem is then restated in a novel energy/power transformed state space, in which the closed-loop DER dynamics can be made linear. Decision making by the DERs then becomes a distributed model predictive control problem and forms the basis for deriving physically implementable convex market bids. A multi-layered interactive optimization for clearing the distributed bids by higher-layer decision makers, such as market aggregators, is posed and shown to lead to near-optimal system-level performance at slower market clearing rates. A proof-of-concept example is presented involving close to one hundred heterogeneous controllable DERs, with real consumption data from a distribution feeder in Texas, contributing to automatic generation control (AGC).
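
The structural point is that in the energy/power transformed state space the dynamics become linear: stored energy simply integrates power, e(t+1) = e(t) + Δt·p(t). The following toy numpy sketch, a deliberate simplification of our own rather than the paper's distributed MPC formulation, shows a single DER tracking a regulation signal while keeping that linear energy state within bounds. All limits and signals are invented placeholders.

```python
# Toy sketch: one DER tracking an AGC-like regulation signal under the
# linear energy/power model e[t+1] = e[t] + dt * p[t]. This is our own
# greedy simplification for illustration, not the paper's distributed
# model predictive control formulation. Units are abstract.
import numpy as np

dt = 1.0                   # time step (abstract units)
p_max = 5.0                # symmetric power limit (hypothetical)
e_min, e_max = 0.0, 10.0   # energy limits (hypothetical)

rng = np.random.default_rng(0)
agc = rng.uniform(-p_max, p_max, size=200)  # placeholder regulation requests

e = 5.0
delivered = []
for r in agc:
    # Clip the request so both power and energy constraints stay feasible.
    p_lo = max(-p_max, (e_min - e) / dt)
    p_hi = min(p_max, (e_max - e) / dt)
    p = float(np.clip(r, p_lo, p_hi))
    e += dt * p            # linear energy update
    delivered.append(p)

print("tracking RMS error:", np.sqrt(np.mean((agc - np.array(delivered)) ** 2)))
```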

Corpora design and score calibration for text dependent pronunciation proficiency recognition

Published in:
8th ISCA Workshop on Speech and Language Technology in Education, SLaTe 2019, 20-21 September 2019.

Summary

This work investigates methods for improving a pronunciation proficiency recognition system, both in phonetic-level posterior probability calibration and in ordinal utterance-level classification, for Modern Standard Arabic (MSA), Spanish, and Russian. To support this work, utterance-level labels were obtained by crowd-sourcing the annotation of language learners' recordings. Phonetic posterior probabilities extracted using automatic speech recognition systems trained in each language were calibrated using a beta calibration approach [1], and language proficiency level was estimated using an ordinal regression [2]. Fusion with language recognition (LR) scores from an i-vector system [3] trained on 23 languages is also explored. Initial results were promising for all three languages, and it was demonstrated that the calibrated posteriors were effective for predicting pronunciation proficiency. Significant relative gains of 16% in mean absolute error for the ordinal regression and 17% in normalized cross entropy for the binary beta regression were achieved on MSA through fusion with LR scores.
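
For reference, the beta calibration of [1] reduces to a logistic regression in a transformed score space: the calibrated posterior is sigma(a·ln p - b·ln(1 - p) + c). A minimal sklearn sketch on synthetic data (the data and variable names are ours, purely illustrative):

```python
# Minimal sketch of beta calibration for posterior estimates: fit a
# logistic regression on the features (ln p, -ln(1 - p)). The posteriors
# and labels below are synthetic placeholders, not the paper's data.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
raw = np.clip(rng.beta(2, 2, size=1000), 1e-6, 1 - 1e-6)  # uncalibrated posteriors
labels = (rng.uniform(size=1000) < raw ** 2).astype(int)  # synthetic ground truth

# Beta calibration = logistic regression in the (ln p, -ln(1 - p)) space.
feats = np.column_stack([np.log(raw), -np.log1p(-raw)])
cal = LogisticRegression().fit(feats, labels)

calibrated = cal.predict_proba(feats)[:, 1]
print(calibrated[:5])
```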

Using K-means in SVR-based text difficulty estimation

Published in:
8th ISCA Workshop on Speech and Language Technology in Education, SLaTE, 20-21 September 2019.

Summary

A challenge for second language learners, educators, and test creators is the identification of authentic materials at the right level of difficulty. In this work, we present an approach to automatically measuring text difficulty, integrated into Auto-ILR, a web-based system that helps find text material at the right level for learners in 18 languages. The Auto-ILR subscription service scans web feeds, extracts article content, evaluates the difficulty, and notifies users of documents that match their skill levels. Difficulty is measured on the standard ILR scale with language-specific support vector regression (SVR) models built from vectors incorporating length features, term frequencies, relative entropy, and K-means clustering.
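
A hedged sketch of that pipeline shape, term features plus a length feature plus K-means centroid distances feeding an SVR, is below; the corpus, labels, and feature choices are illustrative placeholders, not Auto-ILR internals.

```python
# Sketch of the feature + regression setup described above: term vectors,
# a length feature, and K-means centroid distances feeding support vector
# regression. Corpus and ILR-like labels are invented placeholders.
import numpy as np
from sklearn.cluster import KMeans
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.svm import SVR

docs = [
    "the cat sat on the mat",
    "news report about local elections and weather",
    "an analysis of macroeconomic policy implications",
    "dense legal commentary on treaty interpretation",
]
ilr = np.array([0.5, 1.5, 2.5, 3.0])  # placeholder difficulty labels

# Term-frequency features plus a simple document-length feature.
X = TfidfVectorizer().fit_transform(docs).toarray()
lengths = np.array([[len(d.split())] for d in docs])

# K-means over the term vectors; distance to each centroid becomes a feature.
km = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X)
X_full = np.hstack([X, lengths, km.transform(X)])

model = SVR(kernel="rbf").fit(X_full, ilr)
print(model.predict(X_full))
```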

State-of-the-art speaker recognition for telephone and video speech: the JHU-MIT submission for NIST SRE18

Summary

We present a condensed description of the joint effort of JHU-CLSP, JHU-HLTCOE, MIT-LL, MIT CSAIL, and LSE-EPITA for NIST SRE18. All the developed systems consisted of x-vector/i-vector embeddings with some flavor of PLDA backend. Very deep x-vector architectures (extended and factorized TDNN, and ResNets) clearly outperformed shallower x-vectors and i-vectors. The systems were tailored to the video (VAST) or the telephone (CMN2) condition. The VAST data was challenging, yielding 4 times worse performance than other video-based datasets like Speakers in the Wild. We were able to calibrate the VAST data with very few development trials by using careful adaptation and score normalization methods. The VAST primary fusion yielded EER = 10.18% and Cprimary = 0.431; by improving calibration post-evaluation, we reached Cprimary = 0.369. In CMN2, we used unsupervised SPLDA adaptation based on agglomerative clustering and score normalization to correct the domain shift between English and Tunisian Arabic models. The CMN2 primary fusion yielded EER = 4.5% and Cprimary = 0.313. The extended TDNN x-vector was the best single system, obtaining EER = 11.1% and Cprimary = 0.452 in VAST, and 4.95% and 0.354 in CMN2.
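
As one concrete instance of the score normalization mentioned above, symmetric normalization (s-norm) z-normalizes a trial score against cohort scores from both the enrollment and test sides and averages the two. A minimal numpy sketch with synthetic cohort scores (we are not asserting this is the exact variant the submission used):

```python
# Sketch of symmetric score normalization (s-norm) for speaker verification
# trial scores. Cohort score distributions below are synthetic placeholders.
import numpy as np

def snorm(raw, enroll_cohort, test_cohort):
    """Average of the score z-normalized against each side's cohort."""
    ze = (raw - enroll_cohort.mean()) / enroll_cohort.std()
    zt = (raw - test_cohort.mean()) / test_cohort.std()
    return 0.5 * (ze + zt)

rng = np.random.default_rng(0)
enroll_cohort = rng.normal(-1.0, 1.0, 500)  # enroll model vs. cohort scores
test_cohort = rng.normal(-1.2, 0.9, 500)    # test segment vs. cohort scores
print(snorm(2.3, enroll_cohort, test_cohort))
```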

The leakage-resilience dilemma

Published in:
Proc. European Symp. on Research in Computer Security, ESORICS 2019, pp. 87-106.

Summary

Many control-flow-hijacking attacks rely on information leakage to disclose the location of gadgets. To address this, several leakage-resilient defenses have been proposed that fundamentally limit the power of information leakage. Examples of such defenses include address-space re-randomization, destructive code reads, and execute-only code memory. Underlying all of these defenses is some form of code randomization. In this paper, we illustrate that randomization at the granularity of a page or coarser is not secure and can be exploited by generalizing the idea of partial pointer overwrites, which we call the Relative ROP (RelROP) attack. We then analyze more than 1,300 common binaries and find that 94% of them contain sufficient gadgets for an attacker to spawn a shell. To demonstrate this concretely, we built a proof-of-concept exploit against PHP 7.0.0. Furthermore, randomization at a granularity finer than a memory page faces practicality challenges when applied to shared libraries. Our findings highlight the dilemma that faces randomization techniques: coarse-grained techniques are efficient but insecure, and fine-grained techniques are secure but impractical.
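
The arithmetic underlying the attack is compact: page-aligned randomization changes only address bits 12 and up, so the low 12 bits of every code address survive re-randomization, and overwriting just the low byte(s) of a code pointer retargets it to a known in-page offset without ever learning the randomized base. A small Python sketch of that observation, with made-up offsets:

```python
# Sketch of why page-granularity (4 KiB) randomization enables partial
# pointer overwrites: the low 12 bits of every address are invariant, so
# rewriting only a pointer's low byte retargets it relative to the unknown
# random base. Offsets below are invented for illustration.
import random

PAGE = 0x1000            # 4 KiB pages: randomization changes only bits >= 12
target_off = 0x7b0       # known in-page offset of the pointer's original target
gadget_off = 0x7c2       # known in-page offset of a gadget (shares bits 8-11)

for _ in range(3):
    base = random.randrange(1, 1 << 20) * PAGE   # unknown randomized page base
    ptr = base + target_off                      # legitimate code pointer
    ptr = (ptr & ~0xff) | (gadget_off & 0xff)    # overwrite ONLY the low byte
    assert ptr == base + gadget_off              # lands on the gadget every time

print("1-byte partial overwrite retargets the pointer despite randomization")
```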

Monetized weather radar network benefits for tornado cost reduction

Published in:
MIT Lincoln Laboratory Report NOAA-35

Summary

A monetized tornado benefit model is developed for arbitrary weather radar network configurations. Geospatial regression analyses indicate that improvements in two key radar coverage parameters--fraction of vertical space observed and cross-range horizontal resolution--lead to better tornado warning performance, as characterized by tornado detection probability and false alarm ratio. Previous experimental results showing that faster volume scan rates yield greater warning performance, including increased lead times, are also incorporated into the model. Enhanced tornado warning performance, in turn, reduces casualty rates. Taken together, these results establish that better and faster radar observations reduce tornado casualty rates. Furthermore, lower false alarm ratios save costs by cutting down on the time people lose when taking shelter.

Guest editorial: special issue on hardware solutions for cyber security

Published in:
J. Hardw. Syst. Secur., Vol. 3, No. 199, 2019.

Summary

A cyber system could be viewed as an architecture consisting of application software, system software, and system hardware. The hardware layer, being at the foundation of the overall architecture, must be secure itself and also provide effective security features to the software layers. In order to seamlessly integrate security hardware into a system with minimal performance compromises, designers must develop and understand tangible security specifications and metrics to trade between security, performance, and cost for an optimal solution. Hardware security components, libraries, and reference architectures are increasingly important in system design and security. This special issue includes four exciting manuscripts on several aspects of developing hardware-oriented security for systems.