Publications

Image processing pipeline for liver fibrosis classification using ultrasound shear wave elastography

Published in:
Ultrasound in Med. & Biol., Vol. 46, No. 10, October 2020, pp. 2667-2676.

Summary

The purpose of this study was to develop an automated method for classifying liver fibrosis stage ≥F2 based on ultrasound shear wave elastography (SWE) and to assess the system's performance in comparison with a reference manual approach. The reference approach consists of manually selecting a region of interest from each of eight or more SWE images, computing the mean tissue stiffness within each region of interest and taking the resulting stiffness value as the median of those means. The 527-subject database consisted of 5526 SWE images and pathologist-scored biopsies, with data collected from a single system at a single site. The automated method integrates three modules that assess SWE image quality, select a region of interest from each SWE measurement and perform machine learning-based, multi-image SWE classification for fibrosis stage ≥F2. Several classification methods were developed and tested using fivefold cross-validation with training, validation and test sets partitioned by subject. Performance metrics were area under the receiver operating characteristic curve (AUROC), specificity at 95% sensitivity and the number of SWE images required. The final automated method yielded an AUROC of 0.93 (95% confidence interval: 0.90-0.94) versus 0.69 (95% confidence interval: 0.65-0.72) for the reference method, 71% specificity at 95% sensitivity versus 5% and four images per decision versus eight or more. In conclusion, the automated method reported in this study significantly improved the accuracy of ≥F2 classification of SWE measurements and reduced the number of measurements needed, which has the potential to streamline the clinical workflow.
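
For concreteness, here is a minimal sketch of the median-of-means computation described for the reference manual approach; the function name and the example stiffness values are illustrative only:

```python
import numpy as np

def reference_stiffness(roi_means_kpa):
    """Median-of-means stiffness, as in the manual reference approach:
    one mean stiffness per ROI (one ROI per SWE image, eight or more),
    summarized by the median across images."""
    if len(roi_means_kpa) < 8:
        raise ValueError("reference approach uses eight or more SWE images")
    return float(np.median(roi_means_kpa))

# Example: per-image ROI mean stiffness values (kPa, illustrative numbers).
means = [7.1, 6.8, 7.4, 8.0, 6.9, 7.2, 7.5, 7.0]
print(reference_stiffness(means))  # 7.15
```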

This looks like that: deep learning for interpretable image recognition

Published in:
Adv. Neural Info. Process. Syst. (NeurIPS), 8-14 December 2019.

Summary

When we are faced with challenging image classification tasks, we often explain our reasoning by dissecting the image and pointing out prototypical aspects of one class or another. The mounting evidence for each of the classes helps us make our final decision. In this work, we introduce a deep network architecture that reasons in a similar way: the network dissects the image by finding prototypical parts, and combines evidence from the prototypes to make a final classification. The algorithm thus reasons in a way that is qualitatively similar to the way ornithologists, physicians, geologists, architects, and others would explain to people how to solve challenging image classification tasks. The network uses only image-level labels for training, meaning that there are no labels for parts of images. We demonstrate the method on the CIFAR-10 dataset and 10 classes from the CUB-200-2011 dataset.
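
As a rough illustration of the prototype-evidence idea, the sketch below max-pools a similarity score between convolutional patches and learned prototype vectors; the log-form activation and all shapes are assumptions for illustration, not necessarily the paper's exact formulation:

```python
import numpy as np

def prototype_evidence(feature_map, prototypes, eps=1e-4):
    """Score an image by its prototypical parts (illustrative sketch).

    feature_map: (H, W, D) convolutional features for one image.
    prototypes:  (P, D) learned prototype vectors, each standing for a
                 prototypical part of some class.
    Returns one similarity score per prototype: the strongest match,
    max-pooled over all spatial patches."""
    H, W, D = feature_map.shape
    patches = feature_map.reshape(-1, D)  # every spatial patch
    # Squared L2 distance from each patch to each prototype.
    d2 = ((patches[:, None, :] - prototypes[None, :, :]) ** 2).sum(-1)
    # Small distance -> large similarity (the log form is an assumption).
    sim = np.log((d2 + 1.0) / (d2 + eps))
    return sim.max(axis=0)  # best-matching patch per prototype

# Class scores would then be a weighted sum of prototype similarities,
# so the decision can be read as "this part looks like that prototype".
```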

Feature forwarding for efficient single image dehazing

Published in:
IEEE/CVF Conf. on Computer Vision and Pattern Recognition Workshops, CVPRW, 16-17 June 2019.

Summary

Haze degrades content and obscures information of images, which can negatively impact vision-based decision-making in real-time systems. In this paper, we propose an efficient fully convolutional neural network (CNN) image dehazing method designed to run on edge graphics processing units (GPUs). We utilize three variants of our architecture to explore the dependency of dehazed image quality on parameter count and model design. The first two variants presented, a small and a big version, make use of a single efficient encoder-decoder convolutional feature extractor. The final variant utilizes a pair of encoder-decoders for atmospheric light and transmission map estimation. Each variant ends with an image refinement pyramid pooling network to form the final dehazed image. For the big variant of the single-encoder network, we demonstrate state-of-the-art performance on the NYU Depth dataset. For the small variant, we maintain competitive performance on the super-resolution O/I-HAZE datasets without the need for image cropping. Finally, we examine some challenges presented by the Dense-Haze dataset when leveraging CNN architectures for dehazing of dense haze imagery and examine the impact of loss function selection on image quality. Benchmarks are included to show the feasibility of introducing this approach into real-time systems.
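
The two-encoder variant estimates atmospheric light and a transmission map; assuming the standard atmospheric scattering model (which the abstract does not spell out), those two estimates suffice to invert the haze, as in this sketch:

```python
import numpy as np

def dehaze(hazy, transmission, atmospheric_light, t_min=0.1):
    """Invert the standard scattering model I = J*t + A*(1 - t)
    (assumed here; the abstract's networks estimate A and t).

    hazy:              (H, W, 3) observed image in [0, 1].
    transmission:      (H, W) estimated transmission map t.
    atmospheric_light: (3,) estimated global atmospheric light A."""
    t = np.clip(transmission, t_min, 1.0)[..., None]  # avoid division blow-up
    return np.clip((hazy - atmospheric_light) / t + atmospheric_light, 0.0, 1.0)
```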

Machine learning for medical ultrasound: status, methods, and future opportunities

Published in:
Abdom. Radiol., 2018, doi: 10.1007/s00261-018-1517-0.

Summary

Ultrasound (US) imaging is the most commonly performed cross-sectional diagnostic imaging modality in the practice of medicine. It is low-cost, non-ionizing, portable, and capable of real-time image acquisition and display. US is a rapidly evolving technology with significant challenges and opportunities. Challenges include high inter- and intra-operator variability and limited image quality control. Tremendous opportunities have arisen in the last decade as a result of exponential growth in available computational power coupled with progressive miniaturization of US devices. As US devices become smaller, enhanced computational capability can contribute significantly to decreasing variability through advanced image processing. In this paper, we review leading machine learning (ML) approaches and research directions in US, with an emphasis on recent ML advances. We also present our outlook on future opportunities for ML techniques to further improve clinical workflow and US-based disease diagnosis and characterization.

Multi-modal audio, video and physiological sensor learning for continuous emotion prediction

Summary

The automatic determination of emotional state from multimedia content is an inherently challenging problem with a broad range of applications, including biomedical diagnostics, multimedia retrieval, and human-computer interfaces. The Audio Video Emotion Challenge (AVEC) 2016 provides a well-defined framework for developing and rigorously evaluating innovative approaches for estimating the arousal and valence states of emotion as a function of time. It presents the opportunity for investigating multimodal solutions that include audio, video, and physiological sensor signals. This paper provides an overview of our AVEC Emotion Challenge system, which uses multi-feature learning and fusion across all available modalities. It includes a number of technical contributions, including the development of novel high- and low-level features for modeling emotion in the audio, video, and physiological channels. Low-level features include modeling arousal in audio with minimal prosodic-based descriptors. High-level features are derived from supervised and unsupervised machine learning approaches based on sparse coding and deep learning. Finally, a state space estimation approach is applied for score fusion that demonstrates the importance of exploiting the time-series nature of the arousal and valence states. The resulting system outperforms the baseline systems [10] on the test evaluation set, with an achieved Concordance Correlation Coefficient (CCC) for arousal of 0.770 vs. 0.702 (baseline) and for valence of 0.687 vs. 0.638. Future work will focus on exploiting the time-varying nature of individual channels in the multi-modal framework.
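
For reference, the challenge metric is Lin's concordance correlation coefficient, which can be computed as in this short sketch (the formula is standard, not specific to this paper):

```python
import numpy as np

def ccc(pred, gold):
    """Lin's concordance correlation coefficient, the AVEC metric:
    2*cov(x, y) / (var(x) + var(y) + (mean(x) - mean(y))**2)."""
    pred, gold = np.asarray(pred, float), np.asarray(gold, float)
    cov = np.mean((pred - pred.mean()) * (gold - gold.mean()))
    return 2 * cov / (pred.var() + gold.var() + (pred.mean() - gold.mean()) ** 2)
```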

Detecting depression using vocal, facial and semantic communication cues

Summary

Major depressive disorder (MDD) is known to result in neurophysiological and neurocognitive changes that affect control of motor, linguistic, and cognitive functions. MDD's impact on these processes is reflected in an individual's communication via coupled mechanisms: vocal articulation, facial gesturing and choice of content to convey in a dialogue. In particular, MDD-induced neurophysiological changes are associated with a decline in dynamics and coordination of speech and facial motor control, while neurocognitive changes influence dialogue semantics. In this paper, biomarkers are derived from all of these modalities, drawing first from previously developed neurophysiologically motivated speech and facial coordination and timing features. In addition, a novel indicator of lower vocal tract constriction in articulation is incorporated that relates to vocal projection. Semantic features are analyzed for subject/avatar dialogue content using a sparse coded lexical embedding space, and for contextual clues related to the subject's present or past depression status. The features and depression classification system were developed for the 6th International Audio/Video Emotion Challenge (AVEC), which provides data consisting of audio, video-based facial action units, and transcribed text of individuals communicating with the human-controlled avatar. A clinical Patient Health Questionnaire (PHQ) score and binary depression decision are provided for each participant. PHQ predictions were obtained by fusing outputs from a Gaussian staircase regressor for each feature set, with results on the development set of mean F1=0.81, RMSE=5.31, and MAE=3.34. These compare favorably to the challenge baseline development results of mean F1=0.73, RMSE=6.62, and MAE=5.52. On test set evaluation, our system obtained a mean F1=0.70, which is similar to the challenge baseline test result. Future work calls for consideration of joint feature analyses across modalities in an effort to detect neurological disorders based on the interplay of motor, linguistic, affective, and cognitive components of communication.
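
The fusion step combines per-feature-set regressor outputs into a single PHQ prediction; the weighted average below is a simplified stand-in for the paper's Gaussian-staircase fusion, with hypothetical modality names and scores:

```python
import numpy as np

def fuse_phq(predictions, weights=None):
    """Late fusion of per-feature-set PHQ score predictions.

    The paper fuses Gaussian staircase regressor outputs; this weighted
    average is a simplified stand-in for that step, not the paper's
    exact method.

    predictions: dict mapping modality name -> predicted PHQ score."""
    scores = np.array(list(predictions.values()), float)
    w = np.ones_like(scores) if weights is None else np.asarray(weights, float)
    return float(np.dot(w, scores) / w.sum())

# Example with hypothetical per-modality outputs:
print(fuse_phq({"vocal": 9.0, "facial": 11.5, "semantic": 10.0}))
```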

How deep neural networks can improve emotion recognition on video data

Published in:
ICIP: 2016 IEEE Int. Conf. on Image Processing, 25-28 September 2016.

Summary

We consider the task of dimensional emotion recognition on video data using deep learning. While several previous methods have shown the benefits of training temporal neural network models such as recurrent neural networks (RNNs) on hand-crafted features, few works have considered combining convolutional neural networks (CNNs) with RNNs. In this work, we present a system that performs emotion recognition on video data using both CNNs and RNNs, and we also analyze how much each neural network component contributes to the system's overall performance. We present our findings on videos from the Audio/Visual+Emotion Challenge (AV+EC2015). In our experiments, we analyze the effects of several hyperparameters on overall performance while also achieving superior performance to the baseline and other competing methods.
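
A minimal sketch of the CNN-plus-RNN pipeline the abstract describes: per-frame CNN features feed a recurrent layer that emits arousal and valence over time. Layer sizes and choices are illustrative, not the paper's architecture:

```python
import torch
import torch.nn as nn

class CnnRnnEmotion(nn.Module):
    """Per-frame CNN features -> RNN over time -> arousal/valence per frame.
    Sizes and layer choices are illustrative, not the paper's architecture."""
    def __init__(self, hidden=128):
        super().__init__()
        self.cnn = nn.Sequential(
            nn.Conv2d(3, 16, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(16, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
        )
        self.rnn = nn.GRU(32, hidden, batch_first=True)
        self.head = nn.Linear(hidden, 2)  # arousal, valence

    def forward(self, frames):            # frames: (B, T, 3, H, W)
        B, T = frames.shape[:2]
        feats = self.cnn(frames.flatten(0, 1)).view(B, T, -1)
        out, _ = self.rnn(feats)
        return self.head(out)              # (B, T, 2)
```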

The Offshore Precipitation Capability

Summary

In this work, machine learning and image processing methods are used to estimate radar-like precipitation intensity and echo top heights beyond the range of weather radar. The technology, called the Offshore Precipitation Capability (OPC), combines global lightning data with existing radar mosaics, five Geostationary Operational Environmental Satellite (GOES) channels, and several fields from the Rapid Refresh (RAP) 13 km numerical weather prediction model to create precipitation and echo top fields similar to those provided by existing Federal Aviation Administration (FAA) weather systems. Preprocessing and feature extraction methods are described to construct inputs for model training. A variety of machine learning algorithms are investigated to identify which provides the most accuracy. Output from the machine learning model is blended with existing radar mosaics to create weather radar-like analyses that extend into offshore regions. The resulting fields are validated using land radars and satellite precipitation measurements provided by the National Aeronautics and Space Administration (NASA) Global Precipitation Measurement Mission (GPM) core observatory satellite. This capability is initially being developed for the Miami Oceanic airspace with the goal of providing improved situational awareness for offshore air traffic control.
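
Schematically, the OPC learning step is a supervised regression from per-grid-cell features to radar-like intensity. The paper compares several algorithms; the random forest and placeholder arrays below are purely illustrative:

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

# Hypothetical feature matrix: one row per grid cell, with columns built
# from lightning density, GOES channels, and RAP model fields, as the
# abstract describes. Placeholder random data stands in for real inputs.
X_train = np.random.rand(1000, 8)   # placeholder features
y_train = np.random.rand(1000)      # placeholder radar intensity labels

model = RandomForestRegressor(n_estimators=100, random_state=0)
model.fit(X_train, y_train)
offshore_estimate = model.predict(np.random.rand(50, 8))
```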

Time delay integration and in-pixel spatiotemporal filtering using a nanoscale digital CMOS focal plane readout

Summary

A digital focal plane array (DFPA) architecture has been developed that incorporates per-pixel full-dynamic-range analog-to-digital conversion and orthogonal-transfer-based real-time digital signal processing capability. Several long-wave infrared-optimized pixel processing focal plane readout integrated circuit (ROIC) designs have been implemented, each accommodating a 256 x 256 30-µm-pitch detector array. Demonstrated in this paper is the application of this DFPA ROIC architecture to problems of background pedestal mitigation, wide-field imaging, image stabilization, edge detection, and velocimetry. The DFPA architecture is reviewed, and pixel performance metrics are discussed in the context of the application examples. The measured data reported here are for DFPA ROICs implemented in 90-nm CMOS technology and hybridized to HgCdTe (MCT) detector arrays with cutoff wavelengths ranging from 7 to 14.5 µm and a specified operating temperature of 60-80 K.
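
The orthogonal-transfer TDI operation can be sketched as shifting the per-pixel count image in step with scene motion and accumulating; this toy model (with assumed one-row-per-frame motion) is illustrative, not the ROIC implementation:

```python
import numpy as np

def digital_tdi(frames):
    """Time delay integration with per-pixel digital counters (sketch).

    Each frame, the accumulated counts are shifted one row in the scan
    direction (the orthogonal-transfer operation) and the new frame is
    added, so a scene moving one row per frame integrates in place.

    frames: (N, H, W) per-frame count images."""
    acc = np.zeros_like(frames[0])
    for frame in frames:
        acc = np.roll(acc, 1, axis=0)
        acc[0, :] = 0  # rows shifted off the array are read out, not wrapped
        acc += frame
    return acc
```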

A method for correcting Fourier transform spectrometer (FTS) dynamic alignment errors

Published in:
SPIE Vol. 5425, Algorithms and Technologies for Multispectral, Hyperspectral, and Ultraspectral Imagery X, 12-15 April 2004, pp. 443-455.

Summary

The Cross-track Infrared Sounder (CrIS), like most Fourier transform spectrometers, can be sensitive to mechanical disturbances while spectral data are collected. The Michelson interferometer within the spectrometer modulates input radiation at a frequency equal to the product of the wavenumber of the radiation and the constant optical path difference (OPD) velocity associated with the moving mirror. The modulation efficiency depends on the angular alignment of the two wavefronts exiting the spectrometer. Mechanical disturbances can cause errors in the alignment of the wavefronts, which manifest as noise in the spectrum. To mitigate these effects, CrIS will employ a laser to monitor alignment and dynamically correct the errors. Additionally, a vibration isolation system will damp disturbances imparted to the sensor from the spacecraft. Despite these efforts, residual noise may remain under certain conditions. Through simulation of CrIS data, we demonstrated an algorithmic technique to correct residual dynamic alignment errors. The technique requires only the time-dependent wavefront angle, sampled coincidentally with the interferogram, and the second derivative of the erroneous interferogram as inputs to compute the correction. The technique can function with raw interferograms on board the spacecraft, or with decimated interferograms on the ground. We were able to reduce the dynamic alignment noise by approximately a factor of ten in both cases. Performing the correction on the ground would require an increase in data rate of 1-2% over what is currently planned, in the form of 8-bit digitized angle data.
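
Written out, the modulation relation stated above is a standard FTS identity; the numbers in the comment are illustrative only:

```latex
% Interferogram modulation frequency: wavenumber times OPD velocity.
f(\sigma) = \sigma \, v_{\mathrm{OPD}}
% Illustrative example: \sigma = 1000\,\mathrm{cm}^{-1} and
% v_{\mathrm{OPD}} = 4\,\mathrm{cm/s} give f = 4\,\mathrm{kHz}.
```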
