Publications

Toward improving EN adoption: Bridging the gap between stated intention and actual use

Summary

As the COVID-19 pandemic swept the globe in the spring of 2020, technologists sought to enlist technology to assist public health authorities (PHAs) and help stem the tide of infections. As part of this push, experts in health care, cryptography, and other related fields developed the Private Automated Contact Tracing (PACT) protocol and related projects to support the public health objective of slowing the spread of SARS-CoV-2 through digital contact tracing. The protocol jointly deployed by Google and Apple (Google-Apple Exposure Notifications, also known as GAEN or EN), which became the de facto standard in the U.S., employs the same features detailed by PACT. The protocol leverages smartphone Bluetooth communications to alert users of potential contact with individuals known to be infected with SARS-CoV-2, in a way that preserves the privacy of both the infected individual and the users receiving the alert. Contact tracing and subsequent personal precautions are more effective at reducing disease spread when more of the population participates, but there are known difficulties with the adoption of novel technology. To help the U.S. Centers for Disease Control and Prevention (CDC) and U.S. state-level public health teams address these difficulties, a team of staff from MIT's Lincoln Laboratory (MIT LL) and Computer Science and Artificial Intelligence Laboratory (MIT CSAIL) focused on studying user perception and information needs.
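
To make the privacy model concrete, the following is a minimal, hypothetical Python sketch of the rotating-identifier matching idea underlying PACT/EN. The hash-based derivation, interval length, and function names are illustrative simplifications, not the actual GAEN cryptography (which uses HKDF- and AES-based derivations).

```python
# Hypothetical sketch of privacy-preserving exposure matching (simplified;
# not the real GAEN/PACT key schedule).
import hashlib
import os

def rolling_identifier(daily_key: bytes, interval: int) -> bytes:
    """Pseudorandom identifier broadcast over Bluetooth during one time interval."""
    return hashlib.sha256(daily_key + interval.to_bytes(4, "big")).digest()[:16]

# Each phone generates its own keys and records identifiers it overhears;
# no identity or location ever leaves the device.
my_daily_key = os.urandom(16)
heard_identifiers: set[bytes] = set()  # filled from nearby Bluetooth broadcasts

def check_exposure(published_keys: list[bytes], intervals_per_day: int = 144) -> bool:
    """Locally re-derive identifiers from keys published by diagnosed users and
    match them against identifiers this device overheard."""
    return any(
        rolling_identifier(key, interval) in heard_identifiers
        for key in published_keys
        for interval in range(intervals_per_day)
    )
```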

Cross-language attacks

Published in:
Network and Distributed System Security (NDSS) Symposium 2022.

Summary

Memory corruption attacks against unsafe programming languages like C/C++ have been a major threat to computer systems for multiple decades. Various sanitizers and runtime exploit mitigation techniques have been shown to only provide partial protection at best. Recently developed ‘safe’ programming languages such as Rust and Go hold the promise to change this paradigm by preventing memory corruption bugs using a strong type system and proper compile-time and runtime checks. Gradual deployment of these languages has been touted as a way of improving the security of existing applications before entire applications can be developed in safe languages. This is notable in popular applications such as Firefox and Tor. In this paper, we systematically analyze the security of multi-language applications. We show that because language safety checks in safe languages and exploit mitigation techniques applied to unsafe languages (e.g., Control-Flow Integrity) break different stages of an exploit to prevent control hijacking attacks, an attacker can carefully maneuver between the languages to mount a successful attack. In essence, we illustrate that the incompatible set of assumptions made in various languages enables attacks that are not possible in each language alone. We study different variants of these attacks and analyze Firefox to illustrate the feasibility and extent of this problem. Our findings show that gradual deployment of safe programming languages, if not done with extreme care, can indeed be detrimental to security.

Preventing Kernel Hacks with HAKCs

Published in:
Network and Distributed System Security (NDSS) Symposium 2022.

Summary

Commodity operating system kernels remain monolithic for practical and historical reasons. All kernel code shares a single address space, executes with elevated processor privileges, and has largely unhindered access to all data, including data irrelevant to the completion of a specific task. Applying the principle of least privilege, which limits available resources only to those needed to perform a particular task, to compartmentalize the kernel would realize major security gains, similar to microkernels yet without the major redesign effort. Here, we introduce a compartmentalization design, called Hardware-Assisted Kernel Compartmentalization (HAKC), that approximates least-privilege separation while minimizing both developer effort and performance overhead. HAKC divides code and data into separate partitions, and specifies an access policy for each partition. Data is owned by a single partition, and a partition’s access-control policy is enforced at runtime, preventing unauthorized data access. When a partition needs to transfer control flow to outside itself, data ownership is transferred to the target, and transferred back upon return. The HAKC design allows for isolating code and data from the rest of the kernel, without utilizing any additional Trusted Computing Base while compartmentalized code is executing. Instead, HAKC relies on hardware for enforcement. Loadable kernel modules (LKMs), which dynamically load kernel code and data providing specialized functionality, are the single largest part of the Linux source base. Unfortunately, their collective size and complexity make LKMs the cause of the majority of CVEs issued for the Linux kernel. The combination of a large attack surface in kernel modules and the monolithic design of the Linux kernel makes LKMs ideal candidates for compartmentalization. To demonstrate the effectiveness of our approach, we implement HAKC in Linux v5.10 using extensions to the Arm v8.5-A ISA, and compartmentalize the ipv6.ko LKM, which consists of over 55k LOC. The average overhead measured in ApacheBench tests ranged from 1.6% to 24%. Additionally, we compartmentalize the nf_tables.ko packet filtering LKM, and measure the combined impact of using both LKMs. We find a reasonable linear growth in overhead when both compartmentalized LKMs are used. Finally, we measure no significant difference in performance when using the compartmentalized ipv6.ko LKM over the unmodified LKM during real-world web browsing experiments on the Alexa Top 50 websites.

Quantifying bias in face verification systems

Summary

Machine learning models perform face verification (FV) for a variety of highly consequential applications, such as biometric authentication, face identification, and surveillance. Many state-of-the-art FV systems suffer from unequal performance across demographic groups, which is commonly overlooked by evaluation measures that do not assess population-specific performance. Deployed systems with bias may result in serious harm against individuals or groups who experience underperformance. We explore several fairness definitions and metrics, attempting to quantify bias in Google’s FaceNet model. In addition to statistical fairness metrics, we analyze clustered face embeddings produced by the FV model. We link well-clustered embeddings (well-defined, dense clusters) for a demographic group to biased model performance against that group. We present the intuition that FV systems underperform on protected demographic groups because they are less sensitive to differences between features within those groups, as evidenced by clustered embeddings. We show how this performance discrepancy results from a combination of representation and aggregation bias.
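
As a rough illustration of the kind of per-group analysis the summary describes (not the authors' code), the sketch below computes a per-group false non-match rate and a per-group measure of embedding cluster density; the cosine threshold, feature layout, and function names are assumptions.

```python
# Hypothetical sketch: per-group error rates and embedding-cluster density.
# Assumes unit-normalized face embeddings plus an identity label and a
# demographic group label per sample; the verification threshold is arbitrary.
import numpy as np

def per_group_fnmr(embeddings, identities, groups, threshold=0.6):
    """False non-match rate per group: fraction of same-identity pairs whose
    cosine distance exceeds the verification threshold."""
    rates = {}
    for g in np.unique(groups):
        idx = np.flatnonzero(groups == g)
        dists = [1.0 - embeddings[i] @ embeddings[j]
                 for a, i in enumerate(idx) for j in idx[a + 1:]
                 if identities[i] == identities[j]]
        rates[g] = float(np.mean(np.array(dists) > threshold)) if dists else float("nan")
    return rates

def group_cluster_density(embeddings, groups):
    """Mean within-group cosine distance; a denser (smaller) value for a group
    suggests the model is less sensitive to differences within that group."""
    density = {}
    for g in np.unique(groups):
        e = embeddings[groups == g]
        sims = e @ e.T
        density[g] = float(np.mean(1.0 - sims[np.triu_indices(len(e), k=1)]))
    return density
```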

Bayesian estimation of PLDA in the presence of noisy training labels, with applications to speaker verification

Published in:
IEEE/ACM Trans. Audio, Speech, Language Process., Vol. 30, 2022, pp. 414-28.

Summary

This paper presents a Bayesian framework for estimating a Probabilistic Linear Discriminant Analysis (PLDA) model in the presence of noisy labels. True class labels are interpreted as latent random variables, which are transmitted through a noisy channel, and received as observed speaker labels. The labeling process is modeled as a Discrete Memoryless Channel (DMC). PLDA hyperparameters are interpreted as random variables, and their joint posterior distribution is derived using mean-field Variational Bayes, allowing maximum a posteriori (MAP) estimates of the PLDA model parameters to be determined. The proposed solution, referred to as VB-MAP, is presented as a general framework, but is studied in the context of speaker verification, and a variety of use cases are discussed. Specifically, VB-MAP can be used for PLDA estimation with unreliable labels, unsupervised PLDA estimation, and to infer the reliability of a PLDA training set. Experimental results show the proposed approach to provide significant performance improvements on a variety of NIST Speaker Recognition Evaluation (SRE) tasks, both for data sets with simulated mislabels and for data sets with naturally occurring missing or unreliable labels.
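
For orientation, a generic two-covariance PLDA generative model with a DMC label channel can be written as follows; the notation is mine, and the paper's exact parameterization and priors may differ:

$$ \phi_{ij} = \mu + F\,y_i + \epsilon_{ij}, \qquad y_i \sim \mathcal{N}(0, I), \qquad \epsilon_{ij} \sim \mathcal{N}(0, \Sigma), \qquad p(\tilde{\ell}_n \mid \ell_n = k) = C_{k\,\tilde{\ell}_n}, $$

where $\phi_{ij}$ is the $j$-th embedding of speaker $i$, $\ell_n$ and $\tilde{\ell}_n$ are the true and observed labels of vector $n$, and $C$ is the DMC confusion matrix. Mean-field VB then maintains a factorized posterior $q(\Theta)\prod_n q(\ell_n)$ over the PLDA hyperparameters $\Theta = \{\mu, F, \Sigma\}$ and the latent labels, from which MAP estimates of $\Theta$ are read off.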

Tools and practices for responsible AI engineering

Summary

Responsible Artificial Intelligence (AI)—the practice of developing, evaluating, and maintaining accurate AI systems that also exhibit essential properties such as robustness and explainability—represents a multifaceted challenge that often stretches standard machine learning tooling, frameworks, and testing methods beyond their limits. In this paper, we present two new software libraries—hydra-zen and the rAI-toolbox—that address critical needs for responsible AI engineering. hydra-zen dramatically simplifies the process of making complex AI applications configurable, and their behaviors reproducible. The rAI-toolbox is designed to enable methods for evaluating and enhancing the robustness of AI models in a way that is scalable and that composes naturally with other popular ML frameworks. We describe the design principles and methodologies that make these tools effective, including the use of property-based testing to bolster the reliability of the tools themselves. Finally, we demonstrate the composability and flexibility of the tools by showing how various use cases from adversarial robustness and explainable AI can be concisely implemented with familiar APIs.
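
As a flavor of the configurability workflow hydra-zen enables, here is a minimal sketch; the task function and its parameters are made up, and the hydra-zen calls shown (builds, instantiate, to_yaml) should be checked against the library's documentation rather than taken as authoritative.

```python
# Minimal hydra-zen sketch: auto-generate a structured config for a task
# function, serialize it for reproducibility, and instantiate it.
from hydra_zen import builds, instantiate, to_yaml

def train(model_width: int, lr: float, seed: int) -> str:
    # Placeholder task; stands in for a real training entry point.
    return f"trained: width={model_width} lr={lr} seed={seed}"

# `builds` creates a dataclass-based config capturing how `train` is called.
TrainConf = builds(train, model_width=128, lr=0.01, seed=0)

print(to_yaml(TrainConf))      # human-readable record of the run's configuration
print(instantiate(TrainConf))  # calls train(model_width=128, lr=0.01, seed=0)
```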

Adapting deep learning models to new meteorological contexts using transfer learning

Published in:
2021 IEEE International Conference on Big Data (Big Data), 2021, pp. 4169-4177, doi: 10.1109/BigData52589.2021.9671451.

Summary

Meteorological applications such as precipitation nowcasting, synthetic radar generation, statistical downscaling, and others have benefited from deep learning (DL) approaches; however, several challenges remain for widespread adaptation of these complex models in operational systems. One of these challenges is adequate generalizability: deep learning models trained on datasets collected in specific contexts should not be expected to perform as well when applied to different contexts required by large operational systems. One obvious mitigation is to collect massive amounts of training data that cover all expected meteorological contexts; however, this is not only costly and difficult to manage, but is also not possible in many parts of the globe where certain sensing platforms are sparse. In this paper, we describe an application of transfer learning to perform domain transfer for deep learning models. We demonstrate a transfer learning algorithm called weight superposition to adapt a Convolutional Neural Network trained in a source context to a new target context. Weight superposition is a method for storing multiple models within a single set of parameters, thus greatly simplifying model maintenance and training. This approach also addresses the issue of catastrophic forgetting, where a model, once adapted to a new context, performs poorly in the original context. We apply weight superposition to the problem of synthetic weather radar generation and show that in scenarios where the target context has less data, a model adapted with weight superposition is better at maintaining performance when compared to simpler methods. Conversely, the simple adapted model performs better on the source context when the source and target contexts have comparable amounts of data.
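
The numpy sketch below illustrates the general idea of parameter superposition (storing and retrieving multiple weight sets in one parameter vector via random context vectors). It is a conceptual illustration under my own assumptions; the paper's weight-superposition method and its exact binding operation may differ.

```python
# Conceptual numpy sketch of weight superposition: two weight vectors are
# bound to random sign contexts and stored in a single parameter vector.
import numpy as np

rng = np.random.default_rng(0)
dim = 10_000

def make_context() -> np.ndarray:
    # Random +/-1 context vector; distinct contexts are nearly orthogonal.
    return rng.choice([-1.0, 1.0], size=dim)

w_source = rng.standard_normal(dim)   # e.g., weights adapted to the source context
w_target = rng.standard_normal(dim)   # e.g., weights adapted to the target context
c_source, c_target = make_context(), make_context()

superposed = w_source * c_source + w_target * c_target   # one shared parameter set

# Unbinding with the matching context recovers the stored weights up to
# zero-mean interference from the other stored model (correlation ~0.7 here).
w_source_hat = superposed * c_source
print(np.corrcoef(w_source, w_source_hat)[0, 1])
```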

Keeping Safe Rust safe with Galeed

Published in:
Annual Computer Security Applications Conf., ACSAC, December 2021, pp. 824-36.

Summary

Rust is a programming language that simultaneously offers high performance and strong security guarantees. Safe Rust (i.e., Rust code that does not use the unsafe keyword) is memory and type safe. However, these guarantees are violated when safe Rust interacts with unsafe code, most notably code written in other programming languages, including in legacy C/C++ applications that are incrementally deploying Rust. This is a significant problem, as major applications such as Firefox, Chrome, AWS, Windows, and Linux have either deployed Rust or are exploring doing so. It is important to emphasize that unsafe code is not only unsafe itself, but also breaks the safety guarantees of ‘safe’ Rust; e.g., a dangling pointer in a linked C/C++ library can access and overwrite memory allocated to Rust even when the Rust code is fully safe. This paper presents Galeed, a technique to keep safe Rust safe from interference from unsafe code. Galeed has two components: a runtime defense to prevent unintended interactions between safe Rust and unsafe code, and a sanitizer to secure intended interactions. The runtime component works by isolating Rust’s heap from any external access and is enforced using Intel Memory Protection Keys (MPK) technology. The sanitizer uses a smart data structure that we call pseudo-pointer, along with automated code transformation, to avoid passing raw pointers across safe/unsafe boundaries during intended interactions (e.g., when Rust and C++ code exchange data). We implement and evaluate the effectiveness and performance of Galeed via micro- and macro-benchmarking, and use it to secure a widely used component of Firefox.

Detecting pathogen exposure during the non-symptomatic incubation period using physiological data: proof of concept in non-human primates

Summary

Background and Objectives: Early warning of bacterial and viral infection, prior to the development of overt clinical symptoms, not only allows for improved patient care and outcomes but also enables faster implementation of public health measures (patient isolation and contact tracing). Our primary objectives in this effort are threefold. First, we seek to determine the upper limits of early warning detection through physiological measurements. Second, we investigate whether the detected physiological response is specific to the pathogen. Third, we explore the feasibility of extending early warning detection with wearable devices.

Research Methods: For the first objective, we developed a supervised random forest algorithm to detect pathogen exposure in the asymptomatic period prior to overt symptoms (fever). We used high-resolution physiological telemetry data (aortic blood pressure, intrathoracic pressure, electrocardiograms, and core temperature) from non-human primate models exposed to two viral pathogens: Ebola and Marburg (N = 20). Second, to determine reusability across different pathogens, we evaluated our algorithm against three independent physiological datasets from non-human primate models (N = 13) exposed to three different pathogens: Lassa and Nipah viruses and Y. pestis. For the third objective, we evaluated performance degradation when the algorithm was restricted to features derived from electrocardiogram (ECG) waveforms to emulate data from a non-invasive wearable device.

Results: First, our cross-validated random forest classifier provides a mean early warning of 51 ± 12 h, with an area under the receiver-operating characteristic curve (AUC) of 0.93 ± 0.01. Second, our algorithm achieved comparable performance when applied to datasets from different pathogen exposures – a mean early warning of 51 ± 14 h and AUC of 0.95 ± 0.01. Last, with a degraded feature set derived solely from ECG, we observed minimal degradation – a mean early warning of 46 ± 14 h and AUC of 0.91 ± 0.001.

Conclusion: Under controlled experimental conditions, physiological measurements can provide over 2 days of early warning with high AUC. Deviations in physiological signals following exposure to a pathogen are due to the underlying host’s immunological response and are not specific to the pathogen. Pre-symptomatic detection is strong even when features are limited to ECG derivatives, suggesting that this approach may translate to non-invasive wearable devices.
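
To make the classification setup concrete, here is a schematic scikit-learn sketch of the kind of supervised random-forest pipeline described (not the study's code); the file paths, feature windows, and cross-validation scheme are placeholders.

```python
# Schematic sketch: random forest over windows of physiological features,
# labeled pre- vs. post-exposure, scored by cross-validated ROC AUC.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GroupKFold, cross_val_score

# Placeholders: one row per telemetry window (e.g., heart-rate variability,
# core temperature, blood-pressure statistics); y = 1 for windows after
# exposure but before fever onset; groups = animal ID, so no animal appears
# in both training and test folds.
X = np.load("physiology_features.npy")
y = np.load("exposure_labels.npy")
groups = np.load("animal_ids.npy")

clf = RandomForestClassifier(n_estimators=500, random_state=0)
auc = cross_val_score(clf, X, y, groups=groups, cv=GroupKFold(n_splits=5),
                      scoring="roc_auc")
print(f"cross-validated AUC: {auc.mean():.2f} ± {auc.std():.2f}")
```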

Unsupervised Bayesian adaptation of PLDA for speaker verification

Published in:
Interspeech, 30 August - 3 September 2021.

Summary

This paper presents a Bayesian framework for unsupervised domain adaptation of Probabilistic Linear Discriminant Analysis (PLDA). By interpreting class labels as latent random variables, Variational Bayes (VB) is used to derive a maximum a posteriori (MAP) solution of the adapted PLDA model when labels are missing, referred to as VB-MAP. The VB solution iteratively infers class labels and updates PLDA hyperparameters, offering a systematic framework for dealing with unlabeled data. While presented as a general solution, this paper includes experimental results for domain adaptation in speaker verification. VB-MAP estimation is applied to the 2016 and 2018 NIST Speaker Recognition Evaluations (SREs), both of which included small and unlabeled in-domain data sets, and is shown to provide performance improvements over a variety of state-of-the-art domain adaptation methods. Additionally, VB-MAP estimation is used to train a fully unsupervised PLDA model, suffering only minor performance degradation relative to conventional supervised training, offering promise for training PLDA models when no relevant labeled data exists.
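
Schematically, and with notation of my own (assuming for simplicity that the model factorizes over vectors $\phi_n$ given the parameters $\Theta$), the VB iteration the summary describes alternates between a label update and a parameter update:

$$ q(\ell_n) \propto \exp\!\big\{\mathbb{E}_{q(\Theta)}[\log p(\phi_n, \ell_n \mid \Theta)]\big\}, \qquad q(\Theta) \propto p(\Theta)\,\exp\!\Big\{\textstyle\sum_n \mathbb{E}_{q(\ell_n)}[\log p(\phi_n, \ell_n \mid \Theta)]\Big\}, $$

with the adapted PLDA model taken as the MAP point of the final $q(\Theta)$.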