Publications

Large scale parallelization using file-based communications

Summary

In this paper, we present a novel file-based communication architecture that uses the local filesystem for large-scale parallelization. This approach eliminates the filesystem overload and resource contention that arise when large parallel jobs use the central filesystem. It incurs additional overhead for inter-node message file transfers when the sending and receiving processes are not on the same node. Even with this added cost, however, the benefits to overall cluster operation, along with the improved message-communication performance for large-scale parallel jobs, far outweigh the overhead. For example, a 2048-process parallel job achieved about 34 times better MPI_Bcast() performance when using the local filesystem. Furthermore, since security for transferring message files is handled entirely by the secure copy protocol (scp) and filesystem permissions, no additional security measures or ports are required beyond those typically required on an HPC system.
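
To make the idea concrete, here is a minimal sketch of file-based point-to-point messaging over a per-node local spool directory, with scp used only when the receiver is on another node. The spool path, function names, and polling scheme are illustrative assumptions, not the paper's implementation.

```python
import os
import subprocess
import time

SPOOL = "/tmp/msg_spool"  # hypothetical per-node local spool directory

def send(msg: bytes, dest_host: str, dest_rank: int, tag: int) -> None:
    """Write the message to the local spool; if the receiver is on another
    node, ship the file with scp (same-node receivers just read locally)."""
    os.makedirs(SPOOL, exist_ok=True)
    tmp = os.path.join(SPOOL, f".msg_{dest_rank}_{tag}.part")
    final = os.path.join(SPOOL, f"msg_{dest_rank}_{tag}")
    with open(tmp, "wb") as f:
        f.write(msg)
    os.rename(tmp, final)  # atomic rename marks the message as complete
    if dest_host != os.uname().nodename:
        # scp provides the transport security; filesystem permissions do the rest
        subprocess.run(["scp", "-q", final, f"{dest_host}:{final}"], check=True)
        os.remove(final)

def recv(my_rank: int, tag: int, poll_s: float = 0.01) -> bytes:
    """Poll the local spool until the expected message file appears."""
    path = os.path.join(SPOOL, f"msg_{my_rank}_{tag}")
    while not os.path.exists(path):
        time.sleep(poll_s)
    with open(path, "rb") as f:
        data = f.read()
    os.remove(path)
    return data
```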

Optimizing the visualization pipeline of a 3-D monitoring and management system

Published in:
2019 IEEE High Performance Extreme Computing Conf., HPEC, 24-26 September 2019.

Summary

Monitoring and managing High Performance Computing (HPC) systems and environments generate an ever-growing amount of data. Making sense of these data, and building a platform where system administrators and management can visualize them to proactively identify system failures or understand the state of the system, requires the platform to be as efficient and scalable as the underlying database tools used to store and analyze the data. In this paper we show how we leverage Accumulo, D4M, and Unity to build a 3-D visualization platform for monitoring and managing the Lincoln Laboratory Supercomputer systems, and how we have had to retool our approach to scale with our systems.
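
The scaling concern can be illustrated with a simple pre-aggregation step: binning raw monitoring samples before they reach the renderer, so the visualization draws one object per node per time bin rather than one per raw sample. This is a generic sketch with hypothetical record shapes, not the Accumulo/D4M/Unity pipeline itself.

```python
from collections import defaultdict
from statistics import mean

def bin_metrics(samples, bin_seconds=60):
    """Aggregate raw (timestamp, node, value) samples into per-node,
    per-bin means so the 3-D renderer draws one object per node per
    time bin instead of one per raw sample."""
    bins = defaultdict(list)
    for ts, node, value in samples:
        bins[(node, int(ts // bin_seconds))].append(value)
    return {key: mean(vals) for key, vals in bins.items()}

# Hypothetical raw samples streamed out of the metrics store
raw = [(0, "node-1", 0.91), (30, "node-1", 0.95), (65, "node-1", 0.40)]
print(bin_metrics(raw))  # {('node-1', 0): 0.93, ('node-1', 1): 0.4}
```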

Introducing DyMonDS-as-a-Service (DyMaaS) for Internet of Things

Published in:
2019 IEEE High Performance Extreme Computing Conf., HPEC, 24-26 September 2019.

Summary

With recent trends in computation and communication architecture, it is becoming possible to simulate complex networked dynamical systems using high-fidelity models. The inherent spatial and temporal complexity of these systems, however, still acts as a roadblock. It is thus desirable to have an adaptive platform design that facilitates zooming in and out of the models to emulate the time evolution of processes at a desired spatial and temporal granularity. In this paper, we propose new computing and networking abstractions that can embrace physical dynamics and computations in a unified manner by taking advantage of the systems' inherent structure. We further design multi-rate numerical methods, implementable on computing architectures, that facilitate adaptive zooming in and out of models spanning multiple spatial and temporal layers. These methods are all embedded in a platform called Dynamic Monitoring and Decision Systems (DyMonDS). We introduce a new cloud computing service model called DyMonDS-as-a-Service (DyMaaS), for use by operators at various spatial granularities to efficiently emulate the interconnection of IoT devices. The use of this platform is described in the context of an electric microgrid system emulation.
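
A minimal sketch of the multi-rate idea, assuming a toy two-timescale system: the fast state is stepped with a fine step while the slow state is held constant, then the slow state takes one coarse step. The specific scheme (forward Euler with zero-order hold) is an illustrative assumption, not the paper's methods.

```python
def multirate_euler(f_fast, f_slow, x_f, x_s, t_end, h_slow, substeps):
    """Toy multi-rate forward-Euler scheme: the fast state takes `substeps`
    small steps (with the slow state held constant) for every single
    coarse step of the slow state."""
    h_fast = h_slow / substeps
    t = 0.0
    while t < t_end:
        for _ in range(substeps):                 # zoom in: fine temporal layer
            x_f = x_f + h_fast * f_fast(x_f, x_s)
        x_s = x_s + h_slow * f_slow(x_f, x_s)     # zoom out: coarse layer
        t += h_slow
    return x_f, x_s

# Hypothetical two-timescale dynamics: a fast state tracking a slowly decaying one
f_fast = lambda xf, xs: -50.0 * (xf - xs)
f_slow = lambda xf, xs: -0.5 * xs
print(multirate_euler(f_fast, f_slow, 0.0, 1.0, t_end=2.0, h_slow=0.1, substeps=20))
```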

Toward technically feasible and economically efficient integration of distributed energy resources

Published in:
57th Annual Allerton Conf. on Communication, Control, and Computing, 24-27 September 2019.

Summary

This paper first formulates the efficient and feasible participation of distributed energy resources (DERs) in complex electricity services as a centralized nonlinear optimization problem. The problem is then restated in a novel energy/power transformed state space, in which it is shown that the closed-loop DER dynamics can be made linear. Decision making by the DERs then becomes a distributed model predictive control problem, which forms the basis for deriving physically implementable convex market bids. A multi-layered interactive optimization for clearing the distributed bids by higher-layer decision makers, such as market aggregators, is posed and shown to lead to near-optimal system-level performance at the slower market-clearing rates. A proof-of-concept example is presented involving close to one hundred heterogeneous controllable DERs, with real consumption data from a distribution feeder in Texas, contributing to automatic generation control (AGC).
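
As a toy illustration of bid clearing by an aggregator, the sketch below clears hypothetical quadratic bids by bisecting on a common marginal price subject to device power limits. It is a simplified stand-in for the paper's multi-layered interactive optimization, with all cost curves and limits made up.

```python
import numpy as np

def clear_bids(a, p_min, p_max, demand):
    """Clear hypothetical quadratic bids c_i(p) = 0.5 * a_i * p_i**2 by
    equalizing marginal cost (lambda = a_i * p_i) subject to box limits,
    using bisection on the common price lambda."""
    def total(lam):
        return np.clip(lam / a, p_min, p_max).sum()
    lo, hi = 0.0, 1e6
    for _ in range(100):  # bisection on the shadow price
        mid = 0.5 * (lo + hi)
        lo, hi = (mid, hi) if total(mid) < demand else (lo, mid)
    return np.clip(hi / a, p_min, p_max)

# Hypothetical fleet of three DERs with different cost curvatures
a = np.array([1.0, 2.0, 4.0])
print(clear_bids(a, np.zeros(3), np.full(3, 5.0), demand=7.0))  # ~[4, 2, 1]
```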

Corpora design and score calibration for text dependent pronunciation proficiency recognition

Published in:
8th ISCA Workshop on Speech and Language Technology in Education, SLaTE 2019, 20-21 September 2019.

Summary

This work investigates methods for improving a pronunciation proficiency recognition system, both in terms of phonetic-level posterior probability calibration and in ordinal utterance-level classification, for Modern Standard Arabic (MSA), Spanish, and Russian. To support this work, utterance-level labels were obtained by crowd-sourcing the annotation of language learners' recordings. Phonetic posterior probabilities, extracted using automatic speech recognition systems trained in each language, were calibrated using a beta calibration approach [1], and language proficiency level was estimated using an ordinal regression [2]. Fusion with language recognition (LR) scores from an i-vector system [3] trained on 23 languages is also explored. Initial results were promising for all three languages, and the calibrated posteriors proved effective for predicting pronunciation proficiency. Significant relative gains of 16% in mean absolute error for the ordinal regression and 17% in normalized cross entropy for the binary beta regression were achieved on MSA through fusion with LR scores.
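
A sketch of the calibration step in the spirit of [1]: raw posteriors s are mapped through a logistic regression on the features [ln s, -ln(1-s)], which realizes the beta-calibration family. The data and variable names are hypothetical; this shows the general recipe, not the authors' code.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

def beta_features(scores):
    """Map raw posteriors s to the beta-calibration features [ln s, -ln(1-s)]."""
    s = np.clip(scores, 1e-6, 1 - 1e-6)
    return np.column_stack([np.log(s), -np.log(1.0 - s)])

# Hypothetical over-confident raw posteriors with binary proficiency labels
raw = np.array([0.99, 0.95, 0.90, 0.40, 0.30, 0.10])
y = np.array([1, 1, 0, 0, 0, 0])

cal = LogisticRegression().fit(beta_features(raw), y)   # fits the beta map
print(cal.predict_proba(beta_features(raw))[:, 1])      # calibrated posteriors
```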

Using K-means in SVR-based text difficulty estimation

Published in:
8th ISCA Workshop on Speech and Language Technology in Education, SLaTE 2019, 20-21 September 2019.

Summary

A challenge for second-language learners, educators, and test creators is identifying authentic materials at the right level of difficulty. In this work, we present an approach to automatically measuring text difficulty, integrated into Auto-ILR, a web-based system that helps find text material at the right level for learners in 18 languages. The Auto-ILR subscription service scans web feeds, extracts article content, evaluates the difficulty, and notifies users of documents that match their skill level. Difficulty is measured on the standard ILR scale with language-specific support vector regression (SVR) models built from feature vectors incorporating length features, term frequencies, relative entropy, and K-means clustering.
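
A minimal sketch of the feature construction the abstract describes, assuming a toy corpus: tf-idf term frequencies, a document-length feature, and K-means cluster distances are concatenated and fed to an SVR. The relative-entropy features are omitted for brevity, and all data are made up.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.svm import SVR

# Hypothetical toy corpus with ILR-style difficulty ratings
docs = ["the cat sat on the mat",
        "students read short simple stories",
        "the committee deliberated on fiscal policy reform",
        "macroeconomic externalities complicate multilateral negotiations"]
y = np.array([0.5, 1.0, 2.5, 3.5])

tfidf = TfidfVectorizer().fit(docs)
X_tf = tfidf.transform(docs)

# Distances to K-means cluster centers serve as extra features
km = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X_tf)
lengths = np.array([[len(d.split())] for d in docs])          # length feature
X = np.hstack([X_tf.toarray(), km.transform(X_tf), lengths])  # combined vector

model = SVR(kernel="rbf").fit(X, y)
print(model.predict(X))
```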

The leakage-resilience dilemma

Published in:
Proc. European Symp. on Research in Computer Security, ESORICS 2019, pp. 87-106.

Summary

Many control-flow-hijacking attacks rely on information leakage to disclose the locations of gadgets. To address this, several leakage-resilient defenses have been proposed that fundamentally limit the power of information leakage. Examples of such defenses include address-space re-randomization, destructive code reads, and execute-only code memory. Underlying all of these defenses is some form of code randomization. In this paper, we show that randomization at the granularity of a page or coarser is not secure and can be exploited by generalizing the idea of partial pointer overwrites, an attack we call Relative ROP (RelROP). We then analyzed more than 1,300 common binaries and found that 94% of them contained sufficient gadgets for an attacker to spawn a shell. To demonstrate this concretely, we built a proof-of-concept exploit against PHP 7.0.0. Furthermore, randomization at a granularity finer than a memory page faces practicality challenges when applied to shared libraries. Our findings highlight the dilemma facing randomization techniques: coarse-grained techniques are efficient but insecure, and fine-grained techniques are secure but impractical.
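
The core arithmetic behind a partial pointer overwrite can be shown in a few lines: because page-aligned randomization never changes the low 12 bits of an address, overwriting only a pointer's low byte(s) retargets it relative to its page without leaking the randomized bits. The addresses below are made up for illustration.

```python
PAGE = 0x1000  # 4 KiB pages: page-granularity randomization never
               # changes the low 12 bits of any code address

def partial_overwrite(ptr: int, new_low: int, nbytes: int = 1) -> int:
    """Replace the low `nbytes` bytes of `ptr`, leaving the randomized
    high bits intact (what a RelROP-style partial write achieves)."""
    mask = (1 << (8 * nbytes)) - 1
    return (ptr & ~mask) | (new_low & mask)

func_ptr = 0x7F3A_1C40_5A10  # hypothetical code pointer the attacker can corrupt
gadget = partial_overwrite(func_ptr, 0x37)  # retarget within the same page
assert gadget // PAGE == func_ptr // PAGE   # randomized page base unchanged
print(hex(gadget))                          # 0x7f3a1c405a37
```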

State-of-the-art speaker recognition for telephone and video speech: the JHU-MIT submission for NIST SRE18

Summary

We present a condensed description of the joint effort of JHU-CLSP, JHU-HLTCOE, MIT-LL, MIT CSAIL, and LSE-EPITA for NIST SRE18. All the developed systems consisted of x-vector/i-vector embeddings with some flavor of PLDA backend. Very deep x-vector architectures (Extended and Factorized TDNN, and ResNets) clearly outperformed shallower x-vectors and i-vectors. The systems were tailored to the video (VAST) or the telephone (CMN2) condition. The VAST data was challenging, yielding 4 times worse performance than other video-based datasets such as Speakers in the Wild. We were able to calibrate on the VAST data with very few development trials by using careful adaptation and score normalization methods. The VAST primary fusion yielded EER=10.18% and Cprimary=0.431. By improving calibration post-evaluation, we reached Cprimary=0.369. In CMN2, we used unsupervised SPLDA adaptation based on agglomerative clustering and score normalization to correct the domain shift between English and Tunisian Arabic models. The CMN2 primary fusion yielded EER=4.5% and Cprimary=0.313. Extended TDNN x-vector was the best single system, obtaining EER=11.1% and Cprimary=0.452 in VAST, and EER=4.95% and Cprimary=0.354 in CMN2.
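
As an example of the score normalization mentioned above, here is a sketch of symmetric score normalization (S-norm), one common variant: a raw trial score is z-normalized against an enrollment-side cohort and a test-side cohort, and the two results are averaged. The cohort scores here are random stand-ins, not SRE18 data.

```python
import numpy as np

def s_norm(raw, enroll_cohort, test_cohort):
    """Symmetric score normalization (S-norm): average the trial score
    z-normalized against an enrollment-side cohort and a test-side cohort."""
    z = (raw - enroll_cohort.mean()) / enroll_cohort.std()
    t = (raw - test_cohort.mean()) / test_cohort.std()
    return 0.5 * (z + t)

# Hypothetical cohort score distributions (e.g., PLDA scores vs. a background set)
rng = np.random.default_rng(0)
print(s_norm(2.5, rng.normal(0.0, 1.0, 200), rng.normal(0.3, 1.2, 200)))
```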

Monetized weather radar network benefits for tornado cost reduction

Published in:
MIT Lincoln Laboratory Report NOAA-35

Summary

A monetized tornado benefit model is developed for arbitrary weather radar network configurations. Geospatial regression analyses indicate that improvements in two key radar coverage parameters, the fraction of vertical space observed and the cross-range horizontal resolution, lead to better tornado warning performance, as characterized by tornado detection probability and false alarm ratio. Previous experimental results showing that faster volume scan rates yield greater warning performance, including increased lead times, are also incorporated into the model. Enhanced tornado warning performance, in turn, reduces casualty rates. Taken together, these results establish that better and faster radar observations reduce tornado casualty rates. Furthermore, lower false alarm ratios save costs by reducing the time people lose when taking shelter.
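
The structure of the monetization can be sketched as a simple benefit function: improved detection probability scales down expected casualty costs, and a reduced false alarm ratio scales down sheltering costs. The linear form and every number below are hypothetical, intended only to show how the pieces combine, not the report's actual model.

```python
def tornado_cost_reduction(pod_gain, far_reduction, cost_per_pod, cost_per_far):
    """Toy monetized-benefit structure: casualty costs fall as detection
    probability (POD) rises, and sheltering costs fall as the false alarm
    ratio (FAR) drops. The linear form and coefficients are hypothetical."""
    return cost_per_pod * pod_gain + cost_per_far * far_reduction

# Entirely hypothetical numbers, shown only to illustrate how the terms combine
print(tornado_cost_reduction(pod_gain=0.10, far_reduction=0.15,
                             cost_per_pod=5e8, cost_per_far=2e8))
```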

Guest editorial: special issue on hardware solutions for cyber security

Published in:
J. Hardw. Syst. Secur., Vol. 3, No. 199, 2019.

Summary

A cyber system could be viewed as an architecture consisting of application software, system software, and system hardware. The hardware layer, being at the foundation of the overall architecture, must be secure itself and also provide effective security features to the software layers. In order to seamlessly integrate security hardware into a system with minimal performance compromises, designers must develop and understand tangible security specifications and metrics to trade between security, performance, and cost for an optimal solution. Hardware security components, libraries, and reference architecture are increasingly important in system design and security. This special issue includes four exciting manuscripts on several aspects of developing hardware-oriented security for systems.