Publications
Big Data dimensional analysis
Summary
The ability to collect and analyze large amounts of data is a growing problem within the scientific community. The growing gap between data and users calls for innovative tools that address the challenges faced by big data volume, velocity and variety. One of the main challenges associated with big data...
A test-suite generator for database systems
Summary
In this paper, we describe the SPAR Test Suite Generator (STSG), a new test-suite generator for SQL style database systems. This tool produced an entire test suite (data, queries, and ground-truth answers) as a unit and in response to a user's specification. Thus, database evaluators could use this tool to...
Sparse matrix partitioning for parallel eigenanalysis of large static and dynamic graphs
Summary
Numerous applications focus on the analysis of entities and the connections between them, and such data are naturally represented as graphs. In particular, the detection of a small subset of vertices with anomalous coordinated connectivity is of broad interest, for problems such as detecting strange traffic in a computer network...
Computing on masked data: a high performance method for improving big data veracity
Summary
The growing gap between data and users calls for innovative tools that address the challenges faced by big data volume, velocity and variety. Along with these standard three V's of big data, an emerging fourth "V" is veracity, which addresses the confidentiality, integrity, and availability of the data. Traditional cryptographic...
Genetic sequence matching using D4M big data approaches
Summary
Recent technological advances in Next Generation Sequencing tools have led to increasing speeds of DNA sample collection, preparation, and sequencing. One instrument can produce over 600 Gb of genetic sequence data in a single run. This creates new opportunities to efficiently handle the increasing workload. We propose a new method...
Adaptive optics program at TMT
Summary
The TMT first light Adaptive Optics (AO) facility consists of the Narrow Field Infra-Red AO System (NFIRAOS) and the associated Laser Guide Star Facility (LGSF). NFIRAOS is a 60 x 60 laser guide star (LGS) multi-conjugate AO (MCAO) system, which provides uniform, diffraction-limited performance in the J, H, and K...
Detecting small asteroids with the Space Surveillance Telescope
Summary
The ability of the Space Surveillance Telescope (SST) to find small (2-15 m diameter) NEAs suitable for the NASA asteroid retrieval mission is investigated. Orbits from a simulated population of targetable small asteroids were propagated and observations with the SST were simulated. Different search patterns and telescope time allocation cases...
Comparisons between the extended Kalman filter and the state-dependent Riccati estimator
Summary
The state-dependent Riccati equation-based estimator is becoming a popular estimation tool for nonlinear systems since it does not use system linearization. In this paper, the state-dependent Riccati equation-based estimator is compared with the widely used extended Kalman filter for three simple examples that appear in the open literature. It is...
VizLinc: integrating information extraction, search, graph analysis, and geo-location for the visual exploration of large data sets
Summary
In this demo paper we introduce VizLinc, an open-source software suite that integrates automatic information extraction, search, graph analysis, and geo-location for interactive visualization and exploration of large data sets. VizLinc helps users in: 1) understanding the type of information the data set under study might contain, 2) finding patterns...
Content+context=classification: examining the roles of social interactions and linguist content in Twitter user classification
Summary
Twitter users demonstrate many characteristics via their online presence. Connections, community memberships, and communication patterns reveal both idiosyncratic and general properties of users. In addition, the content of tweets can be critical for distinguishing the role and importance of a user. In this work, we explore Twitter user classification using...