Publications

Multimodal representation learning via maximization of local mutual information [e-print]

Published in:
Intl. Conf. on Medical Image Computing and Computer Assisted Intervention, MICCAI, 27 September-1 October 2021.

Summary

We propose and demonstrate a representation learning approach by maximizing the mutual information between local features of images and text. The goal of this approach is to learn useful image representations by taking advantage of the rich information contained in the free text that describes the findings in the image. Our method learns image and text encoders by encouraging the resulting representations to exhibit high local mutual information. We make use of recent advances in mutual information estimation with neural network discriminators. We argue that, typically, the sum of local mutual information is a lower bound on the global mutual information. Our experimental results in the downstream image classification tasks demonstrate the advantages of using local features for image-text representation learning.

Priority scheduling for multi-function apertures with hard- and soft-time constraints

Published in:
2021 IEEE Aerospace Conf., 6-13 March 2021.

Summary

A multi-function aperture (MFA) is an antenna array that supports multiple RF signals for a diverse set of activities. An MFA may support multiple activities simultaneously if they are compatible, and platforms may utilize multiple MFAs to meet field-of-regard and frequency range requirements. Efficient MFA utilization requires a Resource Manager (RM) that routes signals to the correct MFA based on field-of-view and other requirements, and schedules MFA access to resolve conflicts based on request priority. An efficient RM scheduler time-interleaves requests from different activities as needed. Requested access events may be hard-time—that is, the event must be scheduled at a specified time or not at all, or soft-time, indicating it may be scheduled anytime in a specified window. Hard-time events include communications channels with assigned time slots, and soft-time events include asynchronous communications channels. This paper describes and evaluates an optimal algorithm to jointly schedule sequences of hard-time requests, maximizing the number of scheduled events while meeting priority requirements. An extension of this algorithm provides near-optimal schedules for sequences of soft-time or mixed hard- and soft-time events. Algorithms are evaluated by simulation, using two conflict models. The first is based on fixed signal paths that conflict if two paths share a common resource. The second model assumes the RM dynamically assigns resources. As implemented, these algorithms are too slow for real-time operation, and further work is required. They do provide insight into the MFA management problem, a useful metric for evaluating resource sharing and scheduling approaches, and may suggest efficient sub-optimal algorithms.
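
To make the first conflict model concrete, here is a minimal sketch of a greedy priority scheduler for hard-time requests. It is not the optimal algorithm evaluated in the paper. The `Request` fields, the resource names, and the rule "higher priority wins" are all assumptions made for illustration; two requests are treated as conflicting only when they overlap in time and share a resource.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Request:
    """Hypothetical hard-time access request (field names are assumptions)."""
    name: str
    priority: int         # assumed convention: larger value = more important
    start: float          # fixed start time of the hard-time event
    duration: float
    resources: frozenset  # resources on the fixed signal path


def conflicts(a, b):
    """Two requests conflict if they overlap in time and share a resource."""
    overlap = a.start < b.start + b.duration and b.start < a.start + a.duration
    return overlap and bool(a.resources & b.resources)


def greedy_schedule(requests):
    """Accept requests in priority order, skipping any that conflict.

    This is a simple greedy baseline, not an optimal scheduler.
    """
    accepted = []
    for req in sorted(requests, key=lambda r: (-r.priority, r.start)):
        if not any(conflicts(req, other) for other in accepted):
            accepted.append(req)
    return sorted(accepted, key=lambda r: r.start)


if __name__ == "__main__":
    demo = [
        Request("A", 3, 0.0, 2.0, frozenset({"amp1"})),
        Request("B", 1, 1.0, 2.0, frozenset({"amp1"})),  # overlaps A on amp1
        Request("C", 2, 1.0, 1.0, frozenset({"amp2"})),  # overlaps A, no shared resource
    ]
    print([r.name for r in greedy_schedule(demo)])
```

A greedy pass like this can drop several low-priority events that a joint optimization would keep, which is one reason an optimal schedule is useful as a reference metric.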

Using oculomotor features to predict changes in optic nerve sheath diameter and ImPACT scores from contact-sport athletes

Summary

There is mounting evidence linking the cumulative effects of repetitive head impacts to neuro-degenerative conditions. Robust clinical assessment tools to identify mild traumatic brain injuries are needed to assist with timely diagnosis for return-to-field decisions and appropriately guide rehabilitation. The focus of the present study is to investigate the potential for oculomotor features to complement existing diagnostic tools, such as measurements of Optic Nerve Sheath Diameter (ONSD) and Immediate Post-concussion Assessment and Cognitive Testing (ImPACT). Thirty-one high school American football and soccer athletes were tracked through the course of a sports season. Given the high risk of repetitive head impacts associated with both soccer and football, our hypotheses were that (1) ONSD and ImPACT scores would worsen through the season and (2) oculomotor features would effectively capture both neurophysiological changes reflected by ONSD and neuro-functional status assessed via ImPACT. Oculomotor features were used as input to Linear Mixed-Effects Regression models to predict ONSD and ImPACT scores as outcomes. Prediction accuracy was evaluated to identify explicit relationships between eye movements, ONSD, and ImPACT scores. Significant Pearson correlations were observed between predicted and actual outcomes for ONSD (Raw = 0.70; Normalized = 0.45) and for ImPACT (Raw = 0.86; Normalized = 0.71), demonstrating the capability of oculomotor features to capture neurological changes detected by both ONSD and ImPACT. The most predictive features were found to relate to motor control and visual-motor processing. In future work, oculomotor models, linking neural structures to oculomotor function, can be built to gain extended mechanistic insights into neurophysiological changes observed through seasons of participation in contact sports.
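
The evaluation metric named above, the Pearson correlation between predicted and actual outcomes, can be sketched as follows. This is a generic implementation of the standard formula, not the study's analysis code, and the function name and sample values are illustrative only.

```python
import math


def pearson_r(predicted, actual):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(predicted)
    if n != len(actual) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_p = sum(predicted) / n
    mean_a = sum(actual) / n
    cov = sum((p - mean_p) * (a - mean_a) for p, a in zip(predicted, actual))
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    var_a = sum((a - mean_a) ** 2 for a in actual)
    if var_p == 0 or var_a == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_p * var_a)


if __name__ == "__main__":
    # Made-up numbers, only to show the call.
    print(pearson_r([1.0, 2.0, 3.0], [2.0, 4.0, 6.0]))
```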

Speaker separation in realistic noise environments with applications to a cognitively-controlled hearing aid

Summary

Future wearable technology may provide for enhanced communication in noisy environments and for the ability to pick out a single talker of interest in a crowded room simply by the listener shifting their attentional focus. Such a system relies on two components, speaker separation and decoding the listener's attention to acoustic streams in the environment. To address the former, we present a system for joint speaker separation and noise suppression, referred to as the Binaural Enhancement via Attention Masking Network (BEAMNET). The BEAMNET system is an end-to-end neural network architecture based on self-attention. Binaural input waveforms are mapped to a joint embedding space via a learned encoder, and separate multiplicative masking mechanisms are included for noise suppression and speaker separation. Pairs of output binaural waveforms are then synthesized using learned decoders, each capturing a separated speaker while maintaining spatial cues. A key contribution of BEAMNET is that the architecture contains a separation path, an enhancement path, and an autoencoder path. This paper proposes a novel loss function which simultaneously trains these paths, so that disabling the masking mechanisms during inference causes BEAMNET to reconstruct the input speech signals. This allows dynamic control of the level of suppression applied by BEAMNET via a minimum gain level, which is not possible in other state-of-the-art approaches to end-to-end speaker separation. This paper also proposes a perceptually-motivated waveform distance measure. Using objective speech quality metrics, the proposed system is demonstrated to perform well at separating two equal-energy talkers, even in high levels of background noise. Subjective testing shows an improvement in speech intelligibility across a range of noise levels, for signals with artificially added head-related transfer functions and background noise. 
Finally, when used as part of an auditory attention decoder (AAD) system using existing electroencephalogram (EEG) data, BEAMNET is found to maintain the decoding accuracy achieved with ideal speaker separation, even in severe acoustic conditions. These results suggest that this enhancement system is highly effective at decoding auditory attention in realistic noise environments, and could possibly lead to improved speech perception in a cognitively controlled hearing aid.

Learning emergent discrete message communication for cooperative reinforcement learning

Published in:
37th Conf. on Uncertainty in Artificial Intelligence, UAI 2021, early access, 26-30 July 2021.

Summary

Communication is an important factor that enables agents to work cooperatively in multi-agent reinforcement learning (MARL). Most previous work uses continuous message communication, whose high representational capacity comes at the expense of interpretability. Allowing agents to learn their own discrete message communication protocol, emerging from a variety of domains, can increase interpretability for human designers and other agents. This paper proposes a method to generate discrete messages analogous to human languages and to achieve communication through a broadcast-and-listen mechanism based on self-attention. We show that discrete message communication has performance comparable to continuous message communication but with a much smaller vocabulary size. Furthermore, we propose an approach that allows humans to interactively send discrete messages to agents.

More than a fair share: Network Data Remanence attacks against secret sharing-based schemes

Published in:
Network and Distributed Systems Security Symp., NDSS, 23-26 February 2021.

Summary

With progress toward a practical quantum computer has come an increasingly rapid search for quantum-safe, secure communication schemes that do not rely on discrete logarithm or factorization problems. One such encryption scheme, Multi-path Switching with Secret Sharing (MSSS), combines secret sharing with multi-path switching to achieve security as long as the adversary does not have global observability of all paths and thus cannot capture enough shares to reconstruct messages. MSSS assumes that sending a share on a path is an atomic operation and all paths have the same delay. In this paper, we identify a side-channel vulnerability for MSSS, created by the fact that in real networks, sending a share is not an atomic operation as paths have multiple hops and different delays. This channel, referred to as Network Data Remanence (NDR), is present in all schemes like MSSS whose security relies on transfer atomicity and all paths having the same delay. We demonstrate the presence of NDR in a physical testbed. We then identify two new attacks that aim to exploit the side channel, referred to as NDR Blind and NDR Planned, propose an analytical model to analyze the attacks, and demonstrate them using an implementation of MSSS based on the ONOS SDN controller. Finally, we present a countermeasure for the attacks and show its effectiveness in simulations and Mininet experiments.
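
The idea that an adversary must capture enough shares to reconstruct a message can be illustrated with a small sketch. It uses simple n-of-n XOR splitting, chosen for brevity; this is an assumption, and MSSS itself may use a different construction such as a threshold scheme. The function names are hypothetical.

```python
import secrets


def split_secret(message, n_paths):
    """Split `message` (bytes) into n_paths XOR shares, one per network path.

    Assumed n-of-n scheme: all shares are required for reconstruction.
    """
    if n_paths < 2:
        raise ValueError("need at least two paths")
    shares = [secrets.token_bytes(len(message)) for _ in range(n_paths - 1)]
    last = message
    for share in shares:
        last = bytes(x ^ y for x, y in zip(last, share))
    return shares + [last]


def combine_shares(shares):
    """XOR all shares together to recover the original message."""
    out = bytes(len(shares[0]))
    for share in shares:
        out = bytes(x ^ y for x, y in zip(out, share))
    return out


if __name__ == "__main__":
    shares = split_secret(b"hello", 3)
    print(combine_shares(shares))
```

Under this scheme any proper subset of the shares is statistically independent of the message, so an observer on only some of the paths learns nothing. A timing side channel matters precisely because it can let an observer see shares that were meant to travel out of reach.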

Beyond expertise and roles: a framework to characterize the stakeholders of interpretable machine learning and their needs

Published in:
Proc. Conf. on Human Factors in Computing Systems, 8-13 May 2021, article no. 74.

Summary

To ensure accountability and mitigate harm, it is critical that diverse stakeholders can interrogate black-box automated systems and find information that is understandable, relevant, and useful to them. In this paper, we eschew prior expertise- and role-based categorizations of interpretability stakeholders in favor of a more granular framework that decouples stakeholders' knowledge from their interpretability needs. We characterize stakeholders by their formal, instrumental, and personal knowledge and how it manifests in the contexts of machine learning, the data domain, and the general milieu. We additionally distill a hierarchical typology of stakeholder needs that distinguishes higher-level domain goals from lower-level interpretability tasks. In assessing the descriptive, evaluative, and generative powers of our framework, we find our more nuanced treatment of stakeholders reveals gaps and opportunities in the interpretability literature, adds precision to the design and comparison of user studies, and facilitates a more reflexive approach to conducting this research.

Seasonal Inhomogeneous Nonconsecutive Arrival Process Search and Evaluation

Published in:
25th International Conference on Pattern Recognition [submitted]

Summary

Time series often exhibit seasonal patterns, and identification of these patterns is essential to understanding the data and predicting future behavior. Most methods train on large datasets and can fail to predict far past the training data. This limitation becomes more pronounced when data is sparse. This paper presents a method to fit a model to seasonal time series data that maintains predictive power when data is limited. This method, called SINAPSE, combines statistical model fitting with an information criterion to search for disjoint, and possibly nonconsecutive, regimes underlying the data, allowing for a sparse representation resistant to overfitting.

Automatic detection of influential actors in disinformation networks

Summary

The weaponization of digital communications and social media to conduct disinformation campaigns at immense scale, speed, and reach presents new challenges to identify and counter hostile influence operations (IO). This paper presents an end-to-end framework to automate detection of disinformation narratives, networks, and influential actors. The framework integrates natural language processing, machine learning, graph analytics, and a novel network causal inference approach to quantify the impact of individual actors in spreading IO narratives. We demonstrate its capability on real-world hostile IO campaigns with Twitter datasets collected during the 2017 French presidential elections, and known IO accounts disclosed by Twitter. Our system detects IO accounts with 96% precision, 79% recall, and 96% area-under-the-PR-curve, maps out salient network communities, and discovers high-impact accounts that escape the lens of traditional impact statistics based on activity counts and network centrality. Results are corroborated with independent sources of known IO accounts from U.S. Congressional reports, investigative journalism, and IO datasets provided by Twitter.
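
For readers unfamiliar with the detection metrics quoted above, the following sketch shows how precision and recall are computed from a set of flagged accounts and a set of known IO accounts. The account names are invented and the function is a generic illustration, not part of the paper's framework.

```python
def precision_recall(flagged, known):
    """Return (precision, recall) for a set of flagged accounts.

    precision = fraction of flagged accounts that are truly IO accounts
    recall    = fraction of known IO accounts that were flagged
    """
    true_positives = len(flagged & known)
    precision = true_positives / len(flagged) if flagged else 0.0
    recall = true_positives / len(known) if known else 0.0
    return precision, recall


if __name__ == "__main__":
    # Invented account identifiers, for illustration only.
    flagged = {"acct_a", "acct_b", "acct_c", "acct_d"}
    known = {"acct_a", "acct_b", "acct_c", "acct_e", "acct_f"}
    print(precision_recall(flagged, known))
```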

The Speech Enhancement via Attention Masking Network (SEAMNET): an end-to-end system for joint suppression of noise and reverberation [early access]

Published in:
IEEE/ACM Trans. on Audio, Speech, and Language Processing, Vol. 29, 2021, pp. 515-26.

Summary

This paper proposes the Speech Enhancement via Attention Masking Network (SEAMNET), a neural network-based end-to-end single-channel speech enhancement system designed for joint suppression of noise and reverberation. It formalizes an end-to-end network architecture, referred to as b-Net, which accomplishes noise suppression through attention masking in a learned embedding space. A key contribution of SEAMNET is that the b-Net architecture contains both an enhancement and an autoencoder path. This paper proposes a novel loss function which simultaneously trains both the enhancement and the autoencoder paths, so that disabling the masking mechanism during inference causes SEAMNET to reconstruct the input speech signal. This allows dynamic control of the level of suppression applied by SEAMNET via a minimum gain level, which is not possible in other state-of-the-art approaches to end-to-end speech enhancement. This paper also proposes a perceptually-motivated waveform distance measure. In addition to the b-Net architecture, this paper proposes a novel method for designing target waveforms for network training, so that joint suppression of additive noise and reverberation can be performed by an end-to-end enhancement system, which has not been previously possible. Experimental results show the SEAMNET system to outperform a variety of state-of-the-art baseline systems, both in terms of objective speech quality measures and subjective listening tests. Finally, this paper draws parallels between SEAMNET and conventional statistical model-based enhancement approaches, offering interpretability of many network components.
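
One way to picture the minimum gain control is as a floor on a multiplicative mask. The sketch below is an assumption about the general mechanism, not SEAMNET's actual implementation: it assumes mask values in [0, 1] applied elementwise to an embedding, with `min_gain` as a hypothetical parameter name.

```python
def apply_mask(embedding, mask, min_gain=0.0):
    """Scale each embedding element by its mask value, floored at min_gain.

    Assumption: mask values lie in [0, 1]. With min_gain = 1.0 every gain
    becomes 1.0, so the embedding passes through unchanged, which mirrors
    the "masking disabled" case that relies on the autoencoder path.
    """
    return [x * max(m, min_gain) for x, m in zip(embedding, mask)]


if __name__ == "__main__":
    embedding = [2.0, -4.0]
    mask = [0.0, 0.5]
    print(apply_mask(embedding, mask))                 # full suppression
    print(apply_mask(embedding, mask, min_gain=0.25))  # limited suppression
    print(apply_mask(embedding, mask, min_gain=1.0))   # pass-through
```

Raising the floor trades suppression for fidelity to the input, which only yields a sensible signal if the decoder has been trained to reconstruct unmasked embeddings.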