Publications

GraphChallenge.org triangle counting performance [e-print]

Summary

The rise of graph analytic systems has created a need for new ways to measure and compare the capabilities of graph processing systems. The MIT/Amazon/IEEE Graph Challenge has been developed to provide a well-defined community venue for stimulating research and highlighting innovations in graph analysis software, hardware, algorithms, and systems. GraphChallenge.org provides a wide range of preparsed graph data sets, graph generators, mathematically defined graph algorithms, example serial implementations in a variety of languages, and specific metrics for measuring performance. The triangle counting component of GraphChallenge.org tests the performance of graph processing systems in counting all the triangles in a graph and exercises key graph operations found in many graph algorithms. In 2017, 2018, and 2019, many triangle counting submissions were received from a wide range of authors and organizations. This paper presents a performance analysis of the best performers of these submissions. These submissions show that state-of-the-art triangle counting execution time, Ttri, is a strong function of the number of edges in the graph, Ne. Performance improved significantly from 2017 (Ttri \approx (Ne/10^8)^{4/3}) to 2018 (Ttri \approx Ne/10^9) and remained comparable from 2018 to 2019. Graph Challenge provides a clear picture of current graph analysis systems and underscores the need for new innovations to achieve high performance on very large graphs.
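
A common formulation of triangle counting is linear-algebraic: for a symmetric 0/1 adjacency matrix A, every triangle contributes 6 to sum_ij (A^2)_ij * A_ij. The Python/SciPy sketch below illustrates that identity; it is a minimal serial example, not one of the benchmarked submissions, and the function name and edge-list input format are assumptions made for the illustration.

import numpy as np
import scipy.sparse as sp

def count_triangles(edges, num_vertices):
    # Build a symmetric 0/1 adjacency matrix, dropping self-loops and
    # collapsing duplicate edges.
    rows, cols = [], []
    for u, v in edges:
        if u != v:
            rows += [u, v]
            cols += [v, u]
    a = sp.csr_matrix((np.ones(len(rows)), (rows, cols)),
                      shape=(num_vertices, num_vertices))
    a.data[:] = 1.0
    # Each triangle contributes 6 to sum_ij (A^2)_ij * A_ij:
    # 3 edges per triangle, each counted in both directions.
    return int((a @ a).multiply(a).sum()) // 6

# A 4-cycle with one chord (0-2) contains exactly two triangles.
print(count_triangles([(0, 1), (1, 2), (2, 3), (3, 0), (0, 2)], 4))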

COVID-19: famotidine, histamine, mast cells, and mechanisms [e-print]

Summary

SARS-CoV-2 infection is required for COVID-19, but many signs and symptoms of COVID-19 differ from common acute viral diseases. Currently, there are no pre- or post-exposure prophylactic COVID-19 medical countermeasures. Clinical data suggest that famotidine may mitigate COVID-19 disease, but both mechanism of action and rationale for dose selection remain obscure. We explore several plausible avenues of activity including antiviral and host-mediated actions. We propose that the principal famotidine mechanism of action for COVID-19 involves on-target histamine receptor H2 activity, and that development of clinical COVID-19 involves dysfunctional mast cell activation and histamine release.

Kawasaki disease, multisystem inflammatory syndrome in children: antibody-induced mast cell activation hypothesis

Published in:
J Pediatrics & Pediatr Med. 2020; 4(2): 1-7

Summary

Multisystem Inflammatory Syndrome in Children (MIS-C) is appearing in infants, children, and young adults in association with infection by SARS-CoV-2, the virus that causes COVID-19 (coronavirus disease 2019). Kawasaki Disease (KD) is one of the most common vasculitides of childhood. KD presents with symptoms similar to MIS-C, especially in severe forms such as Kawasaki Disease Shock Syndrome (KDSS). The observed symptoms for MIS-C and KD are consistent with Mast Cell Activation Syndrome (MCAS), which is characterized by inflammatory molecules released from activated mast cells. Based on the associations of KD with multiple viral and bacterial pathogens, we put forward the hypothesis that KD and MIS-C result from antibody activation of mast cells, in which Fc receptor-bound pathogen antibodies cause a hyperinflammatory response upon a second pathogen exposure. Within this hypothesis, MIS-C may be atypical KD or a KD-like disease associated with SARS-CoV-2. We extend the mast cell hypothesis to propose that increased histamine levels induce contraction of effector cells, impeding blood flow through cardiac capillaries. In some patients, pressure from impeded blood flow within cardiac capillaries may increase coronary artery blood pressure, leading to aneurysms, a well-known complication of KD.

Medical countermeasures analysis of 2019-nCoV and vaccine risks for antibody-dependent enhancement (ADE)

Published in:
https://www.preprints.org/manuscript/202003.0138/v1

Summary

Background: In 80% of patients, COVID-19 presents as mild disease; 20% of cases develop severe (13%) or critical (6%) illness. More severe forms of COVID-19 present as clinical severe acute respiratory syndrome, but include a T-predominant lymphopenia, high circulating levels of proinflammatory cytokines and chemokines, accumulation of neutrophils and macrophages in lungs, and immune dysregulation including immunosuppression. Methods: All major SARS-CoV-2 proteins were characterized using an amino acid residue variation analysis method. Results predict that most SARS-CoV-2 proteins are evolutionarily constrained, with the exception of the spike (S) protein extended outer surface. Results were interpreted based on known SARS-like coronavirus virology and pathophysiology, with a focus on medical countermeasure development implications. Findings: Non-neutralizing antibodies to variable S domains may enable an alternative infection pathway via Fc receptor-mediated uptake. This may be a gating event for the immune response dysregulation observed in more severe COVID-19 disease. Prior studies involving vaccine candidates for FCoV, SARS-CoV-1, and Middle East Respiratory Syndrome coronavirus (MERS-CoV) demonstrate vaccination-induced antibody-dependent enhancement of disease (ADE), including infection of phagocytic antigen-presenting cells (APC). T effector cells are believed to play an important role in controlling coronavirus infection; pan-T depletion is present in severe COVID-19 disease and may be accelerated by APC infection. Sequence and structural conservation of S motifs suggests that SARS and MERS vaccine ADE risks may foreshadow SARS-CoV-2 S-based vaccine risks. Autophagy inhibitors may reduce APC infection and T-cell depletion. Amino acid residue variation analysis identifies multiple constrained domains suitable as T cell vaccine targets. Evolutionary constraints on proven antiviral drug targets present in SARS-CoV-1 and SARS-CoV-2 may reduce the risk of developing antiviral drug escape mutants. Interpretation: Safety testing of COVID-19 S protein-based B cell vaccines in animal models is strongly encouraged prior to clinical trials to reduce the risk of ADE upon virus exposure.

AI enabling technologies: a survey

Summary

Artificial Intelligence (AI) has the opportunity to revolutionize the way the United States Department of Defense (DoD) and Intelligence Community (IC) address the challenges of evolving threats, data deluge, and rapid courses of action. Developing an end-to-end artificial intelligence system involves parallel development of different pieces that must work together in order to provide capabilities that can be used by decision makers, warfighters, and analysts. These pieces include data collection, data conditioning, algorithms, computing, robust artificial intelligence, and human-machine teaming. While much of the popular press today surrounds advances in algorithms and computing, most modern AI systems leverage advances across numerous different fields. Further, while certain components may not be as visible to end-users as others, our experience has shown that each of these interrelated components plays a major role in the success or failure of an AI system. This article is meant to highlight many of the technologies that are involved in an end-to-end AI system. The goal of this article is to provide readers with an overview of terminology, technical details, and recent highlights from academia, industry, and government. Where possible, we indicate relevant resources that can be used for further reading and understanding.

Artificial intelligence: short history, present developments, and future outlook, final report

Summary

The Director's Office at MIT Lincoln Laboratory (MIT LL) requested a comprehensive study on artificial intelligence (AI) focusing on present applications and future science and technology (S&T) opportunities in the Cyber Security and Information Sciences Division (Division 5). This report elaborates on the main results from the study. Since the AI field is evolving so rapidly, the study scope was to look at the recent past and ongoing developments to lead to a set of findings and recommendations. It was important to begin with a short AI history and a lay-of-the-land on representative developments across the Department of Defense (DoD), intelligence communities (IC), and Homeland Security. These areas are addressed in more detail within the report. A main deliverable from the study was to formulate an end-to-end AI canonical architecture that was suitable for a range of applications. The AI canonical architecture, formulated in the study, serves as the guiding framework for all the sections in this report. Even though the study primarily focused on cyber security and information sciences, the enabling technologies are broadly applicable to many other areas. Therefore, we dedicate a full section to enabling technologies in Section 3. The discussion on enabling technologies helps the reader clarify the distinction among AI, machine learning algorithms, and the specific techniques needed to make an end-to-end AI system viable.

In order to understand the lay-of-the-land in AI, study participants performed a fairly wide reach within MIT LL and external to the Laboratory (government, commercial companies, defense industrial base, peers, academia, and AI centers). In addition to the study participants (shown in the next section under acknowledgements), we also assembled an internal review team (IRT). The IRT was extremely helpful in providing feedback and in helping with the formulation of the study briefings as we transitioned from data-gathering mode to study synthesis. The format followed throughout the study was to highlight relevant content that substantiates the study findings and to identify a set of recommendations.

An important finding is the significant AI investment by the so-called "big 6" commercial companies. These major commercial companies are Google, Amazon, Facebook, Microsoft, Apple, and IBM. They dominate the AI ecosystem research and development (R&D) investments within the U.S. According to a recent McKinsey Global Institute report, cumulative R&D investment in AI amounts to about $30 billion per year. This amount is substantially higher than the R&D investment within the DoD, IC, and Homeland Security. Therefore, the DoD will need to be very strategic about investing where needed, while at the same time leveraging the technologies already developed and available from a wide range of commercial applications.

As we will discuss in Section 1 as part of the AI history, MIT LL has been instrumental in developing advanced AI capabilities. For example, MIT LL has a long history in the development of human language technologies (HLT), successfully applying machine learning algorithms to difficult problems in speech recognition, machine translation, and speech understanding. Section 4 elaborates on prior applications of these technologies, as well as newer applications in the context of multiple modalities (e.g., speech, text, images, and video). An end-to-end AI system is very well suited to enhancing the capabilities of human language analysis.

Section 5 discusses AI's nascent role in cyber security. There have been cases where AI has already provided important benefits. However, much more research is needed in both the application of AI to cyber security and the associated vulnerability to so-called adversarial AI. Adversarial AI is an area of critical importance to the DoD, IC, and Homeland Security, where malicious adversaries can disrupt AI systems and make them untrusted in operational environments. This report concludes with specific recommendations formulating the way forward for Division 5 and with a discussion of S&T challenges and opportunities. The S&T challenges and opportunities are centered on the key elements of the AI canonical architecture to strengthen the AI capabilities across the DoD, IC, and Homeland Security in support of national security.

Large-scale Bayesian kinship analysis

Summary

Kinship prediction in forensics is limited to first degree relatives due to the small number of short tandem repeat loci characterized. The Genetic Chain Rule for Probabilistic Kinship Estimation can leverage large panels of single nucleotide polymorphisms (SNPs) or sets of sequence-linked SNPs, called haploblocks, to estimate more distant relationships between individuals. This method uses allele frequencies and Markov Chain Monte Carlo methods to determine kinship probabilities. Allele frequencies are a crucial input to this method. Since these frequencies are estimated from finite populations and many alleles are rare, a Bayesian extension to the algorithm has been developed to determine credible intervals for kinship estimates as a function of the certainty in allele frequency estimates. Generation of sufficiently large samples to accurately estimate credible intervals can take significant computational resources. In this paper, we leverage hundreds of compute cores to generate large numbers of Dirichlet random samples for Bayesian kinship prediction. We show that it is possible to generate 2,097,152 random samples on 32,768 cores at a rate of 29.68 samples per second. The ability to generate extremely large numbers of samples enables the computation of more statistically significant results from a Bayesian approach to kinship analysis.
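
As a concrete illustration of the Bayesian step described above, the sketch below draws allele-frequency samples for a single hypothetical locus from a Dirichlet posterior and summarizes the spread of a simple downstream statistic as a credible interval. The allele counts, the flat prior, and the match-probability statistic are assumptions made for the example; the actual method propagates such samples through the Genetic Chain Rule kinship likelihoods and distributes the sampling across many compute cores.

import numpy as np

rng = np.random.default_rng(0)

observed_allele_counts = np.array([412, 63, 17, 8])  # hypothetical locus
prior = np.ones_like(observed_allele_counts)         # flat Dirichlet prior

# Draw allele-frequency vectors from the Dirichlet posterior.
n_samples = 100_000
freq_samples = rng.dirichlet(observed_allele_counts + prior, size=n_samples)

# Downstream statistic per sampled frequency vector (stand-in for the
# paper's kinship likelihoods): probability two random haplotypes match.
match_prob = (freq_samples ** 2).sum(axis=1)

lo, hi = np.percentile(match_prob, [2.5, 97.5])
print(f"95% credible interval for the match probability: [{lo:.4f}, {hi:.4f}]")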

Simulation approach to sensor placement using Unity3D

Summary

3D game simulation engines have demonstrated utility in the areas of training, scientific analysis, and knowledge solicitation. This paper will make the case for the use of 3D game simulation engines in the field of sensor placement optimization. Our study used a series of parallel simulations in the Unity3D simulation framework to answer the questions of how many sensors of various modalities are required, and where they should be placed, to meet a desired threat detection threshold. The result is a framework that not only answers this sensor placement question, but can also be easily extended to different optimization criteria and to assessing how a particular configuration responds to differing crowd flows or informed/uninformed adversaries. Additionally, we demonstrate the scalability of this framework by running parallel instances on a supercomputing grid and illustrate the processing speed gained.
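
As a rough illustration of the placement question posed above, the sketch below greedily adds sensors from a set of candidate sites on a 2D grid until an assumed coverage threshold is met. The grid, sensor range, threshold, and greedy criterion are all assumptions for the example; the paper's framework evaluates placements with full Unity3D simulations rather than a geometric coverage model.

import itertools
import numpy as np

grid = np.array(list(itertools.product(range(20), range(20))))  # points to protect
candidates = grid[::7]                                          # candidate sensor sites
radius, threshold = 4.0, 0.95                                   # assumed range / coverage goal

covered = np.zeros(len(grid), dtype=bool)
placed = []
while covered.mean() < threshold and len(placed) < len(candidates):
    # Pick the candidate that covers the most currently uncovered points.
    gains = [np.sum(~covered & (np.linalg.norm(grid - c, axis=1) <= radius))
             for c in candidates]
    best = int(np.argmax(gains))
    placed.append(tuple(candidates[best]))
    covered |= np.linalg.norm(grid - candidates[best], axis=1) <= radius

print(f"{len(placed)} sensors reach {covered.mean():.0%} coverage")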

Detecting pathogen exposure during the non-symptomatic incubation period using physiological data

Summary

Early pathogen exposure detection allows better patient care and faster implementation of public health measures (patient isolation, contact tracing). Existing exposure detection most frequently relies on overt clinical symptoms, namely fever, during the infectious prodromal period. We have developed a robust machine learning based method to better detect asymptomatic states during the incubation period using subtle, sub-clinical physiological markers. Starting with high-resolution physiological waveform data from non-human primate studies of viral (Ebola, Marburg, Lassa, and Nipah viruses) and bacterial (Y. pestis) exposure, we processed the data to reduce short-term variability and normalize diurnal variations, then provided these to a supervised random forest classification algorithm and a post-classifier declaration logic step to reduce false alarms. In most subjects, detection is achieved well before the onset of fever; subject cross-validation across exposure studies (varying viruses, exposure routes, animal species, and target dose) leads to a mean early detection time of 51 h (at 0.93 area under the receiver-operating characteristic curve [AUCROC]). Evaluating the algorithm against entirely independent datasets for Lassa, Nipah, and Y. pestis exposures unused in algorithm training and development yields a mean early warning time of 51 h (at AUCROC = 0.95). We discuss which physiological indicators are most informative for early detection and options for extending this capability to limited datasets such as those available from wearable, non-invasive, ECG-based sensors.
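
As a schematic of the classification stage described above, the sketch below trains a random forest on synthetic per-window physiological features and applies a simple persistence rule before declaring an exposure, standing in for the paper's post-classifier declaration logic. The features, thresholds, and the "k consecutive positive windows" rule are illustrative assumptions, not the published pipeline.

import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(1)

# Synthetic per-window features (e.g., heart rate and temperature residuals);
# labels: 0 = baseline windows, 1 = incubation-period windows after exposure.
X_train = np.vstack([rng.normal(0.0, 1.0, (500, 4)), rng.normal(0.6, 1.0, (500, 4))])
y_train = np.repeat([0, 1], 500)

clf = RandomForestClassifier(n_estimators=200, random_state=0).fit(X_train, y_train)

def declare_exposure(window_features, k=3, p_thresh=0.7):
    """Declare exposure only after k consecutive windows exceed p_thresh."""
    probs = clf.predict_proba(window_features)[:, 1]
    run = 0
    for i, p in enumerate(probs):
        run = run + 1 if p > p_thresh else 0
        if run >= k:
            return i  # index of the declaring window
    return None

# Simulated stream: 20 baseline windows followed by 20 post-exposure windows.
test_stream = np.vstack([rng.normal(0.0, 1.0, (20, 4)), rng.normal(0.8, 1.0, (20, 4))])
print("declared at window:", declare_exposure(test_stream))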

A cloud-based brain connectivity analysis tool

Summary

With advances in high throughput brain imaging at the cellular and sub-cellular level, there is growing demand for platforms that can support high performance, large-scale brain data processing and analysis. In this paper, we present a novel pipeline that combines Accumulo, D4M, geohashing, and parallel programming to manage large-scale neuron connectivity graphs in a cloud environment. Our brain connectivity graph is represented using vertices (fiber start/end nodes), edges (fiber tracks), and the 3D coordinates of the fiber tracks. For optimal performance, we take the hybrid approach of storing vertices and edges in Accumulo and saving the fiber track 3D coordinates in flat files. Accumulo database operations offer low latency on sparse queries while flat files offer high throughput for storing, querying, and analyzing bulk data. We evaluated our pipeline using 250 gigabytes of mouse neuron connectivity data. Benchmarking experiments on retrieving vertices and edges from Accumulo demonstrate that we can achieve 1-2 orders of magnitude speedup in retrieval time when compared to the same operation from traditional flat files. The implementation of graph analytics such as Breadth First Search using Accumulo and D4M offers consistently good performance regardless of data size and density, and is thus scalable to very large datasets. Indexing of neuron subvolumes is simple and logical with geohashing-based binary tree encoding. This hybrid data management backend is used to drive an interactive web-based 3D graphical user interface, where users can examine the 3D connectivity map in a Google Maps-like viewer. Our pipeline is scalable and extensible to other data modalities.
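
As an illustration of the geohashing-based indexing idea mentioned above, the sketch below interleaves the bits of quantized (x, y, z) coordinates into a single key (a 3D Morton-style code) so that nearby fiber-track points share key prefixes and can be retrieved with prefix-range scans in a sorted key-value store such as Accumulo. The bit depth, voxel size, and key format are assumptions for the example; this is not the paper's D4M schema.

def interleave3(x, y, z, bits=10):
    """Interleave `bits` bits of integer x, y, z into a single Morton key."""
    key = 0
    for i in range(bits):
        key |= ((x >> i) & 1) << (3 * i)
        key |= ((y >> i) & 1) << (3 * i + 1)
        key |= ((z >> i) & 1) << (3 * i + 2)
    return key

def subvolume_key(coord_um, voxel_um=10.0, bits=10):
    """Quantize micrometer coordinates to voxels and encode as a hex row key."""
    voxels = [int(c / voxel_um) for c in coord_um]
    return format(interleave3(*voxels, bits=bits), "08x")

# Two nearby points map to keys with a long shared prefix; a distant one does not.
print(subvolume_key((123.0, 456.0, 789.0)))
print(subvolume_key((125.0, 455.0, 790.0)))
print(subvolume_key((9000.0, 20.0, 40.0)))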
