Publications
FastDAWG: improving data migration in the BigDAWG polystore system
Summary
The problem of data integration has been around for decades, yet a satisfactory solution has not yet emerged. A new type of system called a polystore has surfaced to partially address the integration problem. Based on experience with our own polystore called BigDAWG, we identify three major roadblocks to an...
Scaling big data platform for big data pipeline
Summary
Monitoring and Managing High Performance Computing (HPC) systems and environments generate an ever growing amount of data. Making sense of this data and generating a platform where the data can be visualized for system administrators and management to proactively identify system failures or understand the state of the system requires...
Guidelines for secure small satellite design and implementation: FY18 Cyber Security Line-Supported Program
Summary
We are on the cusp of a computational renaissance in space, and we should not bring past terrestrial missteps along. Commercial off-the-shelf (COTS) processors -- much more powerful than traditional rad-hard devices -- are increasingly used in a variety of low-altitude, short-duration CubeSat class missions. With this new-found headroom, the...
A billion updates per second using 30,000 hierarchical in-memory D4M databases
Summary
Analyzing large scale networks requires high performance streaming updates of graph representations of these data. Associative arrays are mathematical objects combining properties of spreadsheets, databases, matrices, and graphs, and are well-suited for representing and analyzing streaming network data. The Dynamic Distributed Dimensional Data Model (D4M) library implements associative arrays in...
Shining light on thermophysical Near-Earth Asteroid modeling efforts
Summary
Comprehensive thermophysical analyses of Near-Earth Asteroids (NEAs) provide important information about their physical properties, including visible albedo, diameter, composition, and thermal inertia. These details are integral to defining asteroid taxonomy and understanding how these objects interact with the solar system. Since infrared (IR) asteroid observations are not widely available, thermophysical...
Artificial intelligence: short history, present developments, and future outlook, final report
Summary
The Director's Office at MIT Lincoln Laboratory (MIT LL) requested a comprehensive study on artificial intelligence (AI) focusing on present applications and future science and technology (S&T) opportunities in the Cyber Security and Information Sciences Division (Division 5). This report elaborates on the main results from the study. Since the...
Secure input validation in Rust with parsing-expression grammars
Summary
Accepting input from the outside world is one of the most dangerous things a system can do. Since type information is lost across system boundaries, systems must perform type-specific input handling routines to recover this information. Adversaries can carefully craft input data to exploit any bugs or vulnerabilities in these...
Security and performance analysis of custom memory allocators
Summary
Computer programmers use custom memory allocators as an alternative to built-in or general-purpose memory allocators with the intent to improve performance and minimize human error. However, it is difficult to achieve both memory safety and performance gains on custom memory allocators. In this thesis, we study the relationship between memory...
Rulemaking for insider threat mitigation
Summary
This chapter continues the topic we started to discuss in the previous chapter – the human factors. However, it focuses on a specific method of enhancing cyber resilience via establishing appropriate rules for employees of an organization under consideration. Such rules aim at reducing threats from, for example, current or...
Detecting food safety risks and human trafficking using interpretable machine learning methods
Summary
Black box machine learning methods have allowed researchers to design accurate models using large amounts of data at the cost of interpretability. Model interpretability not only improves user buy-in, but in many cases provides users with important information. Especially in the case of the classification problems addressed in this thesis...