Publications

Twitter language identification of similar languages and dialects without ground truth

Published in:
Proc. 4th Workshop on NLP for Similar Languages, Varieties and Dialects, 3 April 2017, pp. 73-83.

Summary

We present a new method to bootstrap filter Twitter language ID labels in our dataset for automatic language identification (LID). Our method combines geolocation, original Twitter LID labels, and Amazon Mechanical Turk to resolve missing and unreliable labels. We are the first to compare LID classification performance using the MIRA algorithm and langid.py. We show classifier performance on different versions of our dataset with high accuracy using only Twitter data, without ground truth, and very few training examples. We also show how Platt Scaling can be used to calibrate MIRA classifier output values into a probability distribution over candidate classes, making the output more intuitive. Our method allows for fine-grained distinctions between similar languages and dialects and allows us to rediscover the language composition of our Twitter dataset.
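The calibration step mentioned in this summary can be illustrated with a minimal sketch of Platt scaling in plain Python. It fits a sigmoid p = 1 / (1 + exp(a*s + b)) to raw classifier scores by gradient descent on the log loss, then renormalizes per-class probabilities into a distribution. The function names, toy scores, class labels, learning rate, and epoch count are all assumptions made for illustration; the MIRA classifier and the authors' actual fitting procedure are not reproduced here.

```python
import math

def fit_platt(scores, labels, lr=0.1, epochs=2000):
    """Fit p(y=1|s) = 1 / (1 + exp(a*s + b)) by gradient descent on log loss."""
    a, b = 0.0, 0.0
    n = len(scores)
    for _ in range(epochs):
        grad_a = grad_b = 0.0
        for s, y in zip(scores, labels):
            p = 1.0 / (1.0 + math.exp(a * s + b))
            # With z = a*s + b, the derivative of the log loss w.r.t. z is (y - p).
            grad_a += (y - p) * s
            grad_b += (y - p)
        a -= lr * grad_a / n
        b -= lr * grad_b / n
    return a, b

def platt_prob(score, a, b):
    """Map one raw score to a calibrated probability."""
    return 1.0 / (1.0 + math.exp(a * score + b))

def calibrate_classes(class_scores, params):
    """Apply per-class Platt scaling, then normalize so the values sum to 1."""
    probs = {c: platt_prob(s, *params[c]) for c, s in class_scores.items()}
    total = sum(probs.values())
    return {c: p / total for c, p in probs.items()}

# Toy held-out scores (hypothetical): positive scores belong to the target class.
scores = [-2.0, -1.0, -0.5, 0.5, 1.0, 2.0]
labels = [0, 0, 0, 1, 1, 1]
a, b = fit_platt(scores, labels)

# Hypothetical class labels; both classes reuse the same fitted (a, b) for brevity.
dist = calibrate_classes({"pt-BR": 1.5, "pt-PT": -1.0},
                         {"pt-BR": (a, b), "pt-PT": (a, b)})
```

In practice each class would get its own (a, b) fitted on held-out scores for that class; the shared parameters above only keep the example short.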

Bounded-collusion attribute-based encryption from minimal assumptions

Published in:
IACR 20th Int. Conf. on Practice and Theory of Public Key Cryptography, PKC 2017, 28-31 March 2017.

Summary

Attribute-based encryption (ABE) enables encryption of messages under access policies so that only users with attributes satisfying the policy can decrypt the ciphertext. In standard ABE, an arbitrary number of colluding users, each without an authorized attribute set, cannot decrypt the ciphertext. However, all existing ABE schemes rely on concrete cryptographic assumptions such as the hardness of certain problems over bilinear maps or integer lattices. Furthermore, it is known that ABE cannot be constructed from generic assumptions such as public-key encryption using black-box techniques. In this work, we revisit the problem of constructing ABE that tolerates collusions of arbitrary but a priori bounded size. We present two ABE schemes secure against bounded collusions that require only semantically secure public-key encryption. Our schemes achieve significant improvement in the size of the public parameters, secret keys, and ciphertexts over the previous construction of bounded-collusion ABE from minimal assumptions by Gorbunov et al. (CRYPTO 2012). In fact, in our second scheme, the size of ABE secret keys does not grow at all with the collusion bound. As a building block, we introduce a multidimensional secret-sharing scheme that may be of independent interest. We also obtain bounded-collusion symmetric-key ABE (which requires the secret key for encryption) by replacing the public-key encryption with symmetric-key encryption, which can be built from the minimal assumption of one-way functions.

Predicting exploitation of disclosed software vulnerabilities using open-source data

Published in:
3rd ACM Int. Workshop on Security and Privacy Analytics, IWSPA 2017, 24 March 2017.

Summary

Each year, thousands of software vulnerabilities are discovered and reported to the public. Unpatched known vulnerabilities are a significant security risk. It is imperative that software vendors quickly provide patches once vulnerabilities are known and users quickly install those patches as soon as they are available. However, most vulnerabilities are never actually exploited. Since writing, testing, and installing software patches can involve considerable resources, it would be desirable to prioritize the remediation of vulnerabilities that are likely to be exploited. Several published research studies have reported moderate success in applying machine learning techniques to the task of predicting whether a vulnerability will be exploited. These approaches typically use features derived from vulnerability databases (such as the summary text describing the vulnerability) or social media posts that mention the vulnerability by name. However, these prior studies share multiple methodological shortcomings that inflate the predictive power of these approaches. We replicate key portions of the prior work, compare their approaches, and show how selection of training and test data critically affects the estimated performance of predictive models. The results of this study point to important methodological considerations that should be taken into account so that results reflect real-world utility.
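One way the choice of training and test data can matter (an assumption here; the summary does not say which choices the authors examine) is temporal leakage: a random split lets a model train on vulnerabilities disclosed after the ones it is tested on. The sketch below contrasts a random split with a time-ordered split. The record fields (`cve_id`, `disclosed`, `exploited`), the synthetic records, and the cutoff date are hypothetical.

```python
import random
from datetime import date

def random_split(records, test_fraction=0.25, seed=0):
    """Shuffle records and cut them, ignoring disclosure dates."""
    shuffled = records[:]
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * (1 - test_fraction))
    return shuffled[:cut], shuffled[cut:]

def temporal_split(records, cutoff):
    """Train on everything disclosed before the cutoff, test on the rest."""
    train = [r for r in records if r["disclosed"] < cutoff]
    test = [r for r in records if r["disclosed"] >= cutoff]
    return train, test

def leaks_future(train, test):
    """True if any training record was disclosed after the earliest test record."""
    if not train or not test:
        return False
    earliest_test = min(r["disclosed"] for r in test)
    return any(r["disclosed"] > earliest_test for r in train)

# Synthetic records with distinct disclosure dates spread over one year.
records = [
    {"cve_id": f"EXAMPLE-{i}",
     "disclosed": date(2015, 1 + i % 12, 1 + i % 28),
     "exploited": i % 7 == 0}
    for i in range(48)
]

rand_train, rand_test = random_split(records)
time_train, time_test = temporal_split(records, cutoff=date(2015, 10, 1))
```

Calling `leaks_future` on each pair shows whether the evaluation would let the model see the future; only the time-ordered split is leak-free by construction.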

High-efficiency large-angle Pancharatnam phase deflector based on dual-twist design

Summary

We have previously shown through simulation that an optical beam deflector based on the Pancharatnam (geometric) phase can provide high efficiency with up to 80° deflection using a dual-twist structure for polarization-state control [Appl. Opt. 54, 10035 (2015)]. In this report, we demonstrate that its optical performance is as predicted and far beyond what could be expected for a conventional diffractive optical device. We provide details about construction and characterization of a ±40° beam-steering device with 90% diffraction efficiency based on our dual-twist design at a 633 nm wavelength.

Wind information requirements for NextGen applications phase 4 final report

Summary

The success of many NextGen applications with time-based control elements, such as Required Time of Arrival (RTA) at a meter fix under 4D-Trajectory Based Operations (4D-TBO)/Time of Arrival Control (TOAC) procedures or compliance with an Assigned Spacing Goal (ASG) between aircraft under Interval Management (IM) procedures, is subject to the quality of the atmospheric forecast utilized by participating aircraft. Erroneous information derived from provided forecast data, such as the magnitude of future headwinds relative to the headwinds actually experienced during flight, or forecast data that is insufficient to fully describe the forthcoming atmospheric conditions, can significantly degrade the performance of an attempted procedure. The work described in this report summarizes the major activities conducted in Fiscal Year 2015.

Detecting virus exposure during the pre-symptomatic incubation period using physiological data

Summary

Early pathogen exposure detection allows better patient care and faster implementation of public health measures (patient isolation, contact tracing). Existing exposure detection most frequently relies on overt clinical symptoms, namely fever, during the infectious prodromal period. We have developed a robust machine learning method to better detect asymptomatic states during the incubation period using subtle, sub-clinical physiological markers. Using high-resolution physiological data from non-human primate studies of Ebola and Marburg viruses, we pre-processed the data to reduce short-term variability and normalize diurnal variations, then provided these to a supervised random forest classification algorithm. In most subjects detection is achieved well before the onset of fever; subject cross-validation led to 52±14h mean early detection (at >0.90 area under the receiver-operating characteristic curve). Cross-cohort tests across pathogens and exposure routes also led to successful early detection (28±16h and 43±22h, respectively). We discuss which physiological indicators are most informative for early detection and options for extending this capability to lower data resolution and wearable, non-invasive sensors.
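The two pre-processing steps named in this summary can be sketched as follows: a trailing moving average to reduce short-term variability, and subtraction of a per-time-of-day baseline to normalize diurnal variation. This is only an illustration under stated assumptions: hourly sampling, a 3-sample window, the function names, and the synthetic temperature signal are all hypothetical, and neither the authors' actual filters nor the random forest classifier is reproduced.

```python
import math
from statistics import mean

def moving_average(values, window=3):
    """Trailing moving average; shorter windows are used at the start."""
    out = []
    for i in range(len(values)):
        lo = max(0, i - window + 1)
        out.append(mean(values[lo:i + 1]))
    return out

def diurnal_baseline(baseline_values, samples_per_day=24):
    """Mean value at each time-of-day slot over a pre-exposure baseline period."""
    slots = [[] for _ in range(samples_per_day)]
    for i, v in enumerate(baseline_values):
        slots[i % samples_per_day].append(v)
    return [mean(s) for s in slots]

def remove_diurnal(values, baseline, start_index=0):
    """Subtract the matching time-of-day baseline from each sample."""
    n = len(baseline)
    return [v - baseline[(start_index + i) % n] for i, v in enumerate(values)]

def synthetic_temp(hour, offset=0.0):
    """Hypothetical core temperature with a 24 h cycle plus an optional shift."""
    return 37.0 + 0.4 * math.sin(2 * math.pi * hour / 24) + offset

# Three baseline days, then one day shifted upward by 0.5 degrees.
baseline_days = [synthetic_temp(h) for h in range(72)]
test_day = [synthetic_temp(h, offset=0.5) for h in range(72, 96)]

baseline = diurnal_baseline(baseline_days)
residual = remove_diurnal(test_day, baseline)   # hour 72 falls on slot 0
smoothed = moving_average(residual, window=3)
```

After both steps the daily cycle is gone and only the 0.5 degree shift remains, which is the kind of sub-clinical deviation a downstream classifier could use as a feature.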

Characterization of nitrated sugar alcohols by atmospheric-pressure chemical-ionization mass spectrometry

Published in:
Rapid Commun. Mass Spectrom., Vol. 33, 2017, pp. 333-43.

Summary

RATIONALE: The nitrated sugar alcohols mannitol hexanitrate (MHN), sorbitol hexanitrate (SHN) and xylitol pentanitrate (XPN) are in the same class of compounds as the powerful military-grade explosive pentaerythritol tetranitrate (PETN) and the homemade explosive erythritol tetranitrate (ETN) but, unlike for PETN and ETN, ways to detect MHN, SHN and XPN by mass spectrometry (MS) have not been fully investigated. METHODS: Atmospheric-pressure chemical-ionization mass spectrometry (APCI-MS) was used to detect ions characteristic of nitrated sugar alcohols. APCI time-of-flight mass spectrometry (APCI-TOF MS) and collision-induced dissociation tandem mass spectrometry (CID MS/MS) were used for confirmation of each ion assignment. In addition, the use of the chemical ionization reagent dichloromethane was investigated to improve sensitivity and selectivity for detection of MHN, SHN and XPN. RESULTS: All the nitrated sugar alcohols studied followed similar fragmentation pathways in the APCI source. MHN, SHN and XPN were detectable as fragment ions formed by the loss of NO2, HNO2, NO3, and CH2NO2 groups, and, in the presence of dichloromethane, chlorinated adduct ions were observed. It was determined that in MS/MS mode, chlorinated adducts of MHN and SHN had the lowest limits of detection (LODs), while for XPN the lowest LOD was for the [XPN-NO2]- fragment ion. Partially nitrated analogs of each of the three compounds were also present in the starting materials, and ions attributable to these compounds versus those formed from in-source fragmentation of MHN, SHN, and XPN were distinguished and assigned using liquid chromatography APCI-MS and ESI-MS. CONCLUSIONS: The APCI-MS technique provides a selective and sensitive method for the detection of nitrated sugar alcohols. The methods disclosed here will benefit the area of explosives trace detection for counterterrorism and forensics.

High-resolution, high-throughput, CMOS-compatible electron beam patterning

Published in:
SPIE Advanced Lithography, 26 February - 2 March 2017.

Summary

Two scanning electron beam lithography (SEBL) patterning processes have been developed, one positive and one negative tone. The processes feature nanometer-scale resolution, chemical amplification for faster throughput, long film life under vacuum, and sufficient etch resistance to enable patterning of a variety of materials with a metal-free (CMOS/MEMS compatible) tool set. These resist processes were developed to address two limitations of conventional SEBL resist processes: (1) low areal throughput and (2) limited compatibility with the traditional microfabrication infrastructure.

SIAM data mining "brings it" to annual meeting

Summary

The Data Mining Activity Group is one of SIAM's most vibrant and dynamic activity groups. To better share our enthusiasm for data mining with the broader SIAM community, our activity group organized six minisymposia at the 2016 Annual Meeting. These minisymposia included 48 talks organized by 11 SIAM members.

Picosecond kilohertz-class cryogenically cooled multistage Yb-doped chirped pulse amplifier

Published in:
Opt. Lett., Vol. 42, No. 4, 15 February 2017, pp. 707-710.

Summary

A multistage cryogenic chirped pulse amplifier has been developed, utilizing two different Yb-doped gain materials in subsequent amplifier stages. A Yb:GSAG regenerative amplifier followed by a Yb:YAG power amplifier is able to deliver pulses with a broader bandwidth than a system using only one of these two gain media throughout. We demonstrate 90 mJ of pulse energy (113 W of average power) uncompressed and 67 mJ (84 W of average power) compressed at 1.25 kHz pulse repetition frequency, 3.0 ps FWHM Gaussian pulse width, and near-diffraction-limited (M^2 < 1.3) beam quality.
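The reported figures can be cross-checked with simple arithmetic: average power is pulse energy times repetition rate. The short sketch below does this and also gives an idealized peak-power estimate for a Gaussian pulse. The peak-power value is not stated in the summary; it assumes a perfect Gaussian temporal profile containing all of the compressed pulse energy, and the function names are hypothetical.

```python
import math

def average_power_w(pulse_energy_mj, rep_rate_khz):
    """Average power in watts; millijoules times kilohertz gives watts."""
    return pulse_energy_mj * rep_rate_khz

def gaussian_peak_power_w(pulse_energy_j, fwhm_s):
    """Peak power of an ideal Gaussian pulse: 2*sqrt(ln 2 / pi) * E / FWHM."""
    return 2.0 * math.sqrt(math.log(2) / math.pi) * pulse_energy_j / fwhm_s

uncompressed_w = average_power_w(90, 1.25)   # 112.5 W, reported as 113 W
compressed_w = average_power_w(67, 1.25)     # 83.75 W, reported as 84 W

# Fraction of pulse energy retained after compression.
compression_efficiency = 67 / 90

# Idealized estimate only: 67 mJ in a 3.0 ps FWHM Gaussian pulse.
peak_w = gaussian_peak_power_w(67e-3, 3.0e-12)
```

Both average-power values agree with the summary after rounding, and the energy ratio implies that roughly three quarters of the pulse energy survives compression.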