
A neurophysiological-auditory "listen receipt" for communication enhancement

Published in:
49th IEEE Intl. Conf. on Acoustics, Speech, and Signal Processing, ICASSP, 14-19 April 2024.


Information overload, and specifically auditory overload, is common in critical situations and detrimental to communication. Currently, there is no auditory equivalent of an email read receipt to know if a person has heard a message, other than waiting for a reply. This work hypothesizes that it may be possible to decode whether a person has indeed heard a message, or in other words, create an auditory "listen receipt," through use of non-invasive physiological or neural monitoring. We extracted a variety of features derived from electrodermal activity (EDA), electroencephalography (EEG), and the correlations between the acoustic envelope of the radio message and EEG to use in the decoder. We were able to classify the cases in which the subject responded correctly to the question in the message, versus the cases where they missed or heard the message incorrectly, with an accuracy of 79% and a receiver operating characteristic (ROC) area under the curve (AUC) of 0.83. This work suggests that the concept of a "listen receipt" may be possible, and future wearable machine-brain interface technologies may be able to automatically determine if an important radio message has been missed for both human-to-human and human-to-machine communication.
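For readers unfamiliar with the two metrics quoted above, the following is a minimal sketch of how accuracy and ROC AUC can be computed from classifier scores. It is not the authors' code: the function names, the 0.5 threshold, and the toy scores are assumptions made purely for illustration.

```python
import numpy as np

def roc_auc(scores, labels):
    """ROC AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a correct pair.
    """
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels, dtype=bool)
    pos, neg = scores[labels], scores[~labels]
    diff = pos[:, None] - neg[None, :]          # every positive vs. every negative
    correct = (diff > 0).sum() + 0.5 * (diff == 0).sum()
    return correct / (pos.size * neg.size)

def accuracy(scores, labels, threshold=0.5):
    """Fraction of trials where the thresholded score matches the label."""
    preds = np.asarray(scores, dtype=float) >= threshold
    return float(np.mean(preds == np.asarray(labels, dtype=bool)))

# Hypothetical decoder scores: label 1 = message heard correctly, 0 = missed or misheard.
toy_scores = [0.1, 0.4, 0.35, 0.8]
toy_labels = [0, 0, 1, 1]
toy_auc = roc_auc(toy_scores, toy_labels)
toy_acc = accuracy(toy_scores, toy_labels)
```

Accuracy depends on the chosen threshold, whereas AUC summarizes ranking quality across all thresholds, which is why abstracts such as this one commonly report both.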




EEG alpha and pupil diameter reflect endogenous auditory attention switching and listening effort

Published in:
Eur. J. Neurosci., 2022, pp. 1-16.


Everyday environments often contain distracting competing talkers and background noise, requiring listeners to focus their attention on one acoustic source and reject others. During this auditory attention task, listeners may naturally interrupt their sustained attention and switch attended sources. The effort required to perform this attention switch has not been well studied in the context of competing continuous speech. In this work, we developed two variants of endogenous attention switching and a sustained attention control. We characterized these three experimental conditions under the context of decoding auditory attention, while simultaneously evaluating listening effort and neural markers of spatial-audio cues. A least-squares, electroencephalography (EEG) based, attention decoding algorithm was implemented across all conditions. It achieved an accuracy of 69.4% and 64.0% when computed over non-overlapping 10 and 5-second correlation windows, respectively. Both decoders illustrated smooth transitions in the attended talker prediction through switches at approximately half of the analysis window size (e.g. the mean lag taken across the two switch conditions was 2.2 seconds when the 5-second correlation window was used). Expended listening effort, as measured by simultaneous EEG and pupillometry, was also a strong indicator of whether the listeners sustained attention or performed an endogenous attention switch (peak pupil diameter measure (p = 0.034) and minimum parietal alpha power measure (p = 0.016)). We additionally found evidence of talker spatial cues in the form of centrotemporal alpha power lateralization (p = 0.0428). These results suggest that listener effort and spatial cues may be promising features to pursue in a decoding context, in addition to speech-based features.
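The abstract above names a least-squares, EEG-based attention decoder evaluated over non-overlapping correlation windows. As a rough sketch of that general technique (stimulus reconstruction followed by per-window correlation), and not the authors' implementation, the example below uses synthetic data. The sampling rate, lag count, ridge term, channel gains, and all function names are assumptions chosen only for illustration.

```python
import numpy as np

def lagged(eeg, n_lags):
    """Stack time-advanced copies of EEG (samples x channels) into a design matrix."""
    n, c = eeg.shape
    X = np.zeros((n, c * n_lags))
    for k in range(n_lags):
        # The envelope at time t is predicted from EEG at t + k,
        # because the neural response follows the stimulus.
        X[: n - k, k * c:(k + 1) * c] = eeg[k:]
    return X

def train_decoder(eeg, envelope, n_lags=8, ridge=1e-3):
    """Regularized least-squares map from lagged EEG to the attended envelope."""
    X = lagged(eeg, n_lags)
    return np.linalg.solve(X.T @ X + ridge * np.eye(X.shape[1]), X.T @ envelope)

def decode_windows(eeg, env_a, env_b, weights, win, n_lags=8):
    """Pick talker 0 or 1 in each non-overlapping window by correlation."""
    recon = lagged(eeg, n_lags) @ weights
    picks = []
    for start in range(0, len(recon) - win + 1, win):
        seg = slice(start, start + win)
        r_a = np.corrcoef(recon[seg], env_a[seg])[0, 1]
        r_b = np.corrcoef(recon[seg], env_b[seg])[0, 1]
        picks.append(0 if r_a > r_b else 1)
    return np.array(picks)

# Synthetic demo (all values hypothetical): the listener attends talker A throughout.
rng = np.random.default_rng(0)
fs = 64                                   # assumed sampling rate in Hz
n = fs * 120                              # two minutes of data
env_a = rng.standard_normal(n)            # stand-in for talker A's envelope
env_b = rng.standard_normal(n)            # stand-in for talker B's envelope
gains = np.array([1.0, -0.5, 0.8, 0.3])   # assumed per-channel coupling
eeg = np.roll(env_a, 2)[:, None] * gains + 0.5 * rng.standard_normal((n, 4))

half = n // 2
weights = train_decoder(eeg[:half], env_a[:half])
picks = decode_windows(eeg[half:], env_a[half:], env_b[half:], weights, win=5 * fs)
```

Shorter windows give faster decisions but noisier correlation estimates, which is consistent with the lower accuracy the abstract reports for 5-second windows than for 10-second windows.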




Predicting cognitive load and operational performance in a simulated marksmanship task


Modern operational environments can place significant demands on a service member's cognitive resources, increasing the risk of errors or mishaps due to overburden. The ability to monitor cognitive burden and associated performance within operational environments is critical to improving mission readiness. As a key step toward a field-ready system, we developed a simulated marksmanship scenario with an embedded working memory task in an immersive virtual reality environment. As participants performed the marksmanship task, they were instructed to remember numbered targets and recall the sequence of those targets at the end of the trial. Low and high cognitive load conditions were defined as the recall of three- and six-digit strings, respectively. Physiological and behavioral signals recorded included speech, heart rate, breathing rate, and body movement. These features were input into a random forest classifier that significantly discriminated between the low- and high-cognitive load conditions (AUC=0.94). Behavioral features of gait were the most informative, followed by features of speech. We also showed the capability to predict performance on the digit recall (AUC = 0.71) and marksmanship (AUC = 0.58) tasks. The experimental framework can be leveraged in future studies to quantify the interaction of other types of stressors and their impact on operational cognitive and physical performance.




Comparison of two-talker attention decoding from EEG with nonlinear neural networks and linear methods


Auditory attention decoding (AAD) through a brain-computer interface has had a flowering of developments since it was first introduced by Mesgarani and Chang (2012) using electrocorticograph recordings. AAD has been pursued for its potential application to hearing-aid design in which an attention-guided algorithm selects, from multiple competing acoustic sources, which should be enhanced for the listener and which should be suppressed. Traditionally, researchers have separated the AAD problem into two stages: reconstruction of a representation of the attended audio from neural signals, followed by determining the similarity between the candidate audio streams and the reconstruction. Here, we compare the traditional two-stage approach with a novel neural-network architecture that subsumes the explicit similarity step. We compare this new architecture against linear and non-linear (neural-network) baselines using both wet and dry electroencephalogram (EEG) systems. Our results indicate that the new architecture outperforms the baseline linear stimulus-reconstruction method, improving decoding accuracy from 66% to 81% using wet EEG and from 59% to 87% for dry EEG. Also of note was the finding that the dry EEG system can deliver comparable or even better results than the wet, despite the dry system having one third as many EEG channels as the wet system. The 11-subject, wet-electrode AAD dataset for two competing, co-located talkers, the 11-subject, dry-electrode AAD dataset, and our software are available for further validation, experimentation, and modification.



