Publications


Multitap RF canceller for in-band full-duplex wireless communications

Published in:
IEEE Trans. Wireless Commun., Vol. 15, No. 6, June 2016, pp. 4321-4334.

Summary

In-band full-duplex wireless communications are challenging because effective operation requires mitigating the self-interference caused by the co-located transmitter. This paper presents a novel tapped delay line RF canceller architecture with multiple non-uniform pre-weighted taps to improve system isolation by cancelling both the direct antenna coupling and the multipath effects that comprise a typical interference channel. A four-tap canceller prototype was measured over several different operating conditions and was found to provide an average of 30 dB of signal cancellation over a 30 MHz bandwidth centered at 2.45 GHz in isolated scenarios. When combined with an omnidirectional high-isolation antenna, the canceller improved the overall analog isolation to 90 dB for these cases. In an indoor setting, the canceller suppressed a +30 dBm OFDM signal by 22 dB over a 20 MHz bandwidth centered at 2.45 GHz, producing 78 dB of total analog isolation. This complete evaluation demonstrates not only the performance limits of an optimized multitap RF canceller, but also establishes the amount of analog interference suppression that can be expected in the environments considered.
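The tapped-delay-line idea can be illustrated numerically. The sketch below is not the paper's analog hardware: the channel gains, tap delays, signal model, and the least-squares weighting are all illustrative assumptions. It models self-interference as a few delayed, attenuated copies of the transmit signal and subtracts weighted delayed taps, leaving a weak unmodeled echo to limit the achievable cancellation:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 4096
tx = rng.standard_normal(n)          # white baseband transmit signal (illustrative)

# Self-interference: direct coupling plus multipath echoes, as (gain, delay) pairs.
# The weak 60-sample echo is deliberately left outside the canceller's taps.
channel = [(0.1, 3), (0.03, 17), (0.01, 42), (0.005, 60)]
si = np.zeros(n)
for g, d in channel:
    si[d:] += g * tx[:n - d]

# Four-tap canceller: delayed copies of tx with weights fit by least squares
# (an analog canceller would pre-weight fixed taps; this digital fit is a sketch).
taps = [0, 3, 17, 42]
A = np.column_stack([np.concatenate([np.zeros(d), tx[:n - d]]) for d in taps])
w, *_ = np.linalg.lstsq(A, si, rcond=None)
residual = si - A @ w

cancellation_db = 10 * np.log10(np.sum(si**2) / np.sum(residual**2))
print(f"cancellation: {cancellation_db:.1f} dB")
```

The cancellation depth here is set entirely by the unmodeled echo: taps that span the dominant channel components cancel them almost completely, which mirrors why the paper's multitap design targets multipath as well as direct coupling.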

Operational assessment of keyword search on oral history

Published in:
10th Language Resources and Evaluation Conf., LREC 2016, 23-28 May 2016.

Summary

This project assesses the resources necessary to make oral history searchable by means of automatic speech recognition (ASR). There are many inherent challenges in applying ASR to conversational speech: smaller training set sizes and varying demographics, among others. We assess the impact of dataset size, word error rate, and term-weighted value on human search capability through an information retrieval task on Mechanical Turk. We use English oral history data collected by StoryCorps, a national organization that provides all people with the opportunity to record, share, and preserve their stories, and control for a variety of demographics, including age, gender, birthplace, and dialect, across four different training set sizes. We show that search performance with a standard speech recognition system is comparable to that with hand-transcribed data, a promising result for increasing the accessibility of conversational speech and oral history archives.
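Word error rate, one of the quantities assessed, is the word-level edit distance between a reference transcript and the ASR hypothesis, normalized by the number of reference words. A minimal implementation (the example sentences are illustrative):

```python
def wer(ref, hyp):
    """Word error rate: word-level edit distance / number of reference words."""
    r, h = ref.split(), hyp.split()
    # Standard dynamic-programming edit distance over words.
    d = [[0] * (len(h) + 1) for _ in range(len(r) + 1)]
    for i in range(len(r) + 1):
        d[i][0] = i                     # all deletions
    for j in range(len(h) + 1):
        d[0][j] = j                     # all insertions
    for i in range(1, len(r) + 1):
        for j in range(1, len(h) + 1):
            sub = d[i - 1][j - 1] + (r[i - 1] != h[j - 1])
            d[i][j] = min(sub, d[i - 1][j] + 1, d[i][j - 1] + 1)
    return d[-1][-1] / len(r)

print(wer("the cat sat on the mat", "the cat sat mat"))  # 2 deletions / 6 words
```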

A fun and engaging interface for crowdsourcing named entities

Published in:
10th Language Resources and Evaluation Conf., LREC 2016, 23-28 May 2016.

Summary

There are many current problems in natural language processing that are best solved by training algorithms on an annotated in-language, in-domain corpus. The more representative the training corpus is of the test data, the better the algorithm will perform, but also the less likely it is that such a corpus has already been annotated. Annotating corpora for natural language processing tasks is typically a time-consuming and expensive process. In this paper, we provide a case study in using crowdsourcing to curate an in-domain corpus for named entity recognition, a common problem in natural language processing. In particular, we present our use of fun, engaging user interfaces as a way to entice workers to partake in our crowdsourcing task while avoiding inflating our payments in a way that would attract more mercenary workers than conscientious ones. Additionally, we provide a survey of alternative interfaces for collecting annotations of named entities and compare our approach to those systems.

Enforced sparse non-negative matrix factorization

Published in:
30th IEEE Int. Parallel and Distributed Processing Symp., IPDPS 2016, 23-27 May 2016.

Summary

Non-negative matrix factorization (NMF) is a dimensionality reduction algorithm for data that can be represented as an undirected bipartite graph. It has become a common method for generating topic models of text data because it is known to produce good results despite its relative simplicity of implementation and ease of computation. One challenge in applying NMF to large datasets is that intermediate matrix products often become dense, stressing the memory and compute elements of the underlying system. In this article, we investigate a simple but powerful modification of the alternating least squares method of determining the NMF of a sparse matrix that enforces the generation of sparse intermediate and output matrices. This method enables the application of NMF to large datasets through improved memory and compute performance. Further, we demonstrate empirically that this method of enforcing sparsity in the NMF either preserves or improves both the accuracy of the resulting topic model and the convergence rate of the underlying algorithm.
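The sparsity-enforcing idea can be sketched in a few lines. The following is an illustrative alternating-least-squares NMF that hard-thresholds all but the largest entries of each factor after every update; the particular thresholding rule, rank, and `keep` parameter are assumptions for illustration, not the paper's exact algorithm:

```python
import numpy as np

def sparse_nmf_als(A, k, iters=50, keep=5, seed=0):
    """ALS-style NMF that, after each update, zeroes all but the `keep`
    largest entries per row of H and per column of W (illustrative)."""
    rng = np.random.default_rng(seed)
    m, n = A.shape
    W = rng.random((m, k))
    H = rng.random((k, n))
    for _ in range(iters):
        # Unconstrained least-squares update for H, clipped to nonnegative.
        H = np.linalg.lstsq(W, A, rcond=None)[0].clip(min=0)
        # Enforce sparsity: drop all but the `keep` largest entries per row,
        # so intermediate and output factors never densify.
        np.put_along_axis(H, np.argsort(H, axis=1)[:, :-keep], 0.0, axis=1)
        # Symmetric update for W, with per-column sparsification.
        W = np.linalg.lstsq(H.T, A.T, rcond=None)[0].T.clip(min=0)
        np.put_along_axis(W, np.argsort(W, axis=0)[:-keep, :], 0.0, axis=0)
    return W, H

rng = np.random.default_rng(1)
A = rng.random((20, 30))
W, H = sparse_nmf_als(A, k=4)
print(np.count_nonzero(H), "nonzeros in H out of", H.size)
```

By construction, every row of H and every column of W carries at most `keep` nonzeros, which is the property that keeps memory and compute bounded on large inputs.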

LLMapReduce: multi-level map-reduce for high performance data analysis

Summary

The map-reduce parallel programming model has become extremely popular in the big data community. Many big data workloads can benefit from the enhanced performance offered by supercomputers. LLMapReduce provides the familiar map-reduce parallel programming model to big data users running on a supercomputer. LLMapReduce dramatically simplifies map-reduce programming by providing simple parallel programming capability in one line of code. LLMapReduce supports all programming languages and many schedulers. LLMapReduce can work with any application without the need to modify the application. Furthermore, LLMapReduce can overcome scaling limits in the map-reduce parallel programming model via options that allow the user to switch to the more efficient single-program-multiple-data (SPMD) parallel programming model. These features allow users to reduce the computational overhead by more than 10x compared to standard map-reduce for certain applications. LLMapReduce is widely used by hundreds of users at MIT. Currently LLMapReduce works with several schedulers such as SLURM, Grid Engine and LSF.
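The programming model itself is simple: independent map tasks over inputs, followed by a reduce that combines the results. The sketch below illustrates only that pattern in plain Python (the mapper, reducer, and inputs are stand-ins; it does not use LLMapReduce's actual command-line interface):

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor  # stand-in for scheduler-launched tasks

def mapper(text):
    """Per-input work; under LLMapReduce this would be an unmodified user program."""
    return Counter(text.split())

def reducer(partials):
    """Combine the independent map results into one answer."""
    total = Counter()
    for p in partials:
        total += p
    return total

inputs = ["a b a", "b c"]   # stand-ins for input files on a shared filesystem
with ThreadPoolExecutor() as pool:
    result = reducer(pool.map(mapper, inputs))
print(result)
```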

Generating a multiple-prerequisite attack graph

A data-stream classification system for investigating terrorist threats

Published in:
Proc. SPIE 9851, Next-Generation Analyst IV, 98510L (May 12, 2016); doi:10.1117/12.2224104.

Summary

The role of cyber forensics in criminal investigations has greatly increased in recent years due to the wealth of data that is collected and available to investigators. Physical forensics has also experienced a data volume and fidelity revolution due to advances in methods for DNA and trace evidence analysis. Key to extracting insight is the ability to correlate across multi-modal data, which depends critically on identifying a touch-point connecting the separate data streams. Separate data sources may be connected because they refer to the same individual, entity or event. In this paper we present a data source classification system tailored to facilitate the investigation of potential terrorist activity. This taxonomy is structured to illuminate the defining characteristics of a particular terrorist effort and designed to guide reporting to decision makers that is complete, concise, and evidence-based. The classification system has been validated and empirically utilized in the forensic analysis of a simulated terrorist activity. Next-generation analysts can use this schema to label and correlate across existing data streams, assess which critical information may be missing from the data, and identify options for collecting additional data streams to fill information gaps.

Feedback-based social media filtering tool for improved situational awareness

Published in:
15th Annual IEEE Int. Symp. on Technologies for Homeland Security, HST 2016, 10-12 May 2016.

Summary

This paper describes a feature-rich model of data relevance, designed to aid first responder retrieval of useful information from social media sources during disasters or emergencies. The approach is meant to address the failure of traditional keyword-based methods to sufficiently suppress clutter during retrieval. The model iteratively incorporates relevance feedback to update feature space selection and classifier construction across a multimodal set of diverse content characterization techniques. This approach is advantageous because the aspects of the data (or even the modalities of the data) that signify relevance cannot always be anticipated ahead of time. Experiments with both microblog text documents and coupled imagery and text documents demonstrate the effectiveness of this model on sample retrieval tasks, in comparison to more narrowly focused models operating in a priori selected feature spaces. The experiments also show that even relatively low feedback levels (i.e., tens of examples) can lead to a significant performance boost during the interactive retrieval process.
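A minimal sketch of the relevance-feedback loop, using a classic Rocchio-style query update over bag-of-words vectors (the toy corpus, weights, and update rule are illustrative assumptions, not the paper's multimodal model):

```python
import numpy as np

# Toy corpus standing in for social media posts during a flood (illustrative).
docs = [
    "flood water rising downtown bridge closed",
    "concert tickets on sale downtown tonight",
    "rescue boats deployed as flood water spreads",
    "new cafe opens downtown great coffee",
]
vocab = sorted({w for d in docs for w in d.split()})
col = {w: i for i, w in enumerate(vocab)}

def vec(text):
    """Bag-of-words term-count vector over the corpus vocabulary."""
    v = np.zeros(len(vocab))
    for w in text.split():
        if w in col:
            v[col[w]] += 1.0
    return v

X = np.array([vec(d) for d in docs])

def rank(query):
    """Documents ordered by cosine similarity to the query vector."""
    sims = X @ query / (np.linalg.norm(X, axis=1) * (np.linalg.norm(query) + 1e-9))
    return np.argsort(-sims)

q = vec("downtown")  # keyword query alone: three posts tie, heavy clutter

# Rocchio update: responder marks doc 0 relevant and doc 1 non-relevant;
# the query vector moves toward the relevant post and away from the clutter.
alpha, beta, gamma = 1.0, 0.75, 0.25
q_fb = alpha * q + beta * X[0] - gamma * X[1]

print("top post after feedback:", docs[rank(q_fb)[0]])
```

Even this single round of feedback pushes the marked non-relevant post to the bottom of the ranking, illustrating why a handful of labeled examples can beat a static keyword filter.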

Polymer dielectrics for 3D-printed RF devices in the Ka band

Summary

Direct-write printing allows the fabrication of centimeter-wave radio devices. Most polymer dielectric materials become lossy at frequencies above 10 GHz. Presented here is a printable dielectric material with low loss in the Ka band (26.5–40 GHz). This process allows the fabrication of resonator filter devices and a radio antenna.

A key-centric processor architecture for secure computing

Published in:
2016 IEEE Int. Symp. on Hardware-Oriented Security and Trust, HOST 2016, 3-5 May 2016.

Summary

We describe a novel key-centric processor architecture in which each piece of data or code can be protected by encryption while at rest, in transit, and in use. Using embedded key management for cryptographic key handling, our processor permits mutually distrusting software written by different entities to work closely together without divulging algorithmic parameters or secret program data. Since the architecture performs encryption, decryption, and key management deep within the processor hardware, the attack surface is minimized without significant impact on performance or ease of use. The current prototype implementation is based on the SPARC architecture and is highly applicable to small to medium-sized processing loads.