Abstract: In this paper, we present a novel generalization of the Volterra Series, which can be viewed as a higher-order convolution, to manifold-valued functions. A special case of the manifold-valued Volterra Series (MVVS) gives us a natural extension of the ordinary convolution to manifold-valued functions, which we call the manifold-valued convolution (MVC). We prove that these generalizations preserve the equivariance properties of the Euclidean Volterra Series and the traditional convolution operator. We present novel deep network architectures using the MVVS and the MVC operations, which are then validated via two experiments: (i) movement disorder classification from diffusion magnetic resonance images (dMRI), and (ii) fODF reconstruction from compressed-sensed dMRI. In both experiments, our methods outperform the state of the art.
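For context, the classical Euclidean Volterra series that the MVVS generalizes expresses an output signal as a sum of higher-order convolutions of the input (a standard form; the manifold-valued construction itself is defined in the paper):

```latex
y(t) = h_0 + \sum_{n=1}^{N} \int \cdots \int h_n(\tau_1, \ldots, \tau_n)
       \prod_{i=1}^{n} x(t - \tau_i) \, d\tau_1 \cdots d\tau_n
```

The first-order term ($n = 1$) reduces to the ordinary convolution $\int h_1(\tau)\, x(t - \tau)\, d\tau$, which is the special case that the MVC extends to manifold-valued functions.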
Authors: Shuo Han (Johns Hopkins University)*; Samuel W Remedios (Johns Hopkins University); Aaron Carass (Johns Hopkins University, USA); Michael Schär (Johns Hopkins University School of Medicine); Jerry L Prince (Johns Hopkins University)
Abstract: To super-resolve the through-plane direction of a multi-slice 2D magnetic resonance (MR) image, its slice selection profile can be used as the degradation model from high resolution (HR) to low resolution (LR) to create paired data when training a supervised algorithm. Existing super-resolution algorithms make assumptions about the slice selection profile since it is not readily known for a given image. In this work, we estimate the slice selection profile of a specific image by learning to match its internal patch distributions. Specifically, we assume that after applying the correct slice selection profile, the image patch distribution along the HR in-plane directions should match the distribution along the LR through-plane direction. Therefore, we incorporate the estimation of a slice selection profile as part of learning a generator in a generative adversarial network (GAN). In this way, the slice selection profile can be learned without any external data. Our algorithm was tested using simulations from isotropic MR images, incorporated in a through-plane super-resolution algorithm to demonstrate its benefits, and also used as a tool to measure image resolution. Our code is at https://github.com/shuohan/espreso2.
Abstract: Data augmentation has proved extremely useful by increasing training data variance to alleviate overfitting and improve deep neural networks’ generalization performance. In medical image analysis, a well-designed augmentation policy usually requires much expert knowledge and is difficult to generalize to multiple tasks due to the vast discrepancies among pixel intensities, image appearances, and object shapes in different medical tasks. To automate medical data augmentation, we propose a regularized adversarial training framework via two min-max objectives and three differentiable augmentation models covering affine transformation, deformation, and appearance changes. Our method is more automatic and efficient than previous automatic augmentation methods, which still rely on pre-defined operations with human-specified ranges and costly bi-level optimization. Extensive experiments demonstrated that our approach, with less training overhead, achieves superior performance over state-of-the-art auto-augmentation methods on both tasks of 2D skin cancer classification and 3D organs-at-risk segmentation.
Abstract: Deep segmentation neural networks require large training datasets with pixel-wise segmentations, which are expensive to obtain in practice. Mixed supervision could mitigate this difficulty, with a small fraction of the data containing complete pixel-wise annotations, while the rest is less supervised, e.g., only a handful of pixels are labeled. In this work, we propose a dual-branch architecture, where the upper branch (teacher) receives strong annotations, while the bottom one (student) is driven by limited supervision and guided by the upper branch. In conjunction with a standard cross-entropy over the labeled pixels, our novel formulation integrates two important terms: (i) a Shannon entropy loss defined over the less-supervised images, which encourages confident student predictions at the bottom branch; and (ii) a Kullback-Leibler (KL) divergence, which transfers the knowledge from the predictions generated by the strongly supervised branch to the less-supervised branch, and guides the entropy (student-confidence) term to avoid trivial solutions. Very interestingly, we show that the synergy between the entropy and KL divergence yields substantial improvements in performance. Furthermore, we discuss an interesting link between Shannon-entropy minimization and standard pseudo-mask generation, and argue that the former should be preferred over the latter for leveraging information from unlabeled pixels. Through a series of quantitative and qualitative experiments, we show the effectiveness of the proposed formulation in segmenting the left-ventricle endocardium in MR images. We demonstrate that our method significantly outperforms other strategies that tackle semantic segmentation within a mixed-supervision framework. More interestingly, and in line with recent observations in classification, we show that the branch trained with reduced supervision and guided by the top branch largely outperforms the latter.
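The two loss terms named in this abstract (Shannon entropy over student predictions, and KL divergence from teacher to student) can be sketched as follows. This is a minimal illustrative implementation of the standard definitions, not the authors' code; the combination weights and branch architectures are in the paper.

```python
import numpy as np

def softmax(logits, axis=-1):
    # Numerically stable softmax over class logits.
    z = logits - logits.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def shannon_entropy(probs, eps=1e-12):
    # Mean per-pixel Shannon entropy; minimising it encourages
    # confident (low-entropy) student predictions.
    return float(-(probs * np.log(probs + eps)).sum(axis=-1).mean())

def kl_divergence(teacher_probs, student_probs, eps=1e-12):
    # KL(teacher || student): distils knowledge from the strongly
    # supervised branch and anchors the entropy term away from
    # trivial (collapsed) solutions.
    return float(
        (teacher_probs
         * (np.log(teacher_probs + eps) - np.log(student_probs + eps))
         ).sum(axis=-1).mean()
    )
```

In training, the student's total loss would combine cross-entropy on labeled pixels with weighted sums of these two terms on the less-supervised images.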
Authors: Shaheer Ullah Saeed (University College London); Yunguan Fu (University College London); Zachary M C Baum (University College London); Qianye Yang (University College London); Mirabela Rusu (Stanford University); Richard Fan (Stanford University); Geoffrey A Sonn (Stanford University); Dean Barratt (University College London); Yipeng Hu (University College London)
Abstract: In this paper, we consider a type of image quality assessment (IQA) as a task-specific measurement, which can be used to select images that are more amenable to a given target task, such as image classification or segmentation. We propose to train simultaneously two neural networks for image selection and a target task using reinforcement learning. A controller network learns an image selection policy by maximising an accumulated reward based on the target task performance on the controller-selected validation set, whilst the target task predictor is optimised using the training set. The trained controller is therefore able to reject images that lead to poor accuracy in the target task. In this work, we show that controller-predicted IQA can be significantly different from task-specific quality labels manually defined by humans. Furthermore, we demonstrate that it is possible to learn effective IQA without a "clean" validation set, thereby avoiding the requirement for human labels of task amenability. Using 6712 labelled and segmented clinical ultrasound images from 259 patients, experimental results on holdout data show that the proposed IQA achieved a mean classification accuracy of 0.94±0.01 and a mean segmentation Dice of 0.89±0.02, by discarding 5% and 15% of the acquired images, respectively. Significantly improved performance was observed for both tested tasks, compared with the respective 0.90±0.01 and 0.82±0.02 from networks without considering task amenability. This enables IQA feedback during real-time ultrasound acquisition, among many other medical imaging applications.
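One plausible reading of the controller's reward signal, sketched below purely for illustration (the function name, selection rule, and `keep_fraction` parameter are my assumptions, not the paper's): the reward is the mean task metric over the validation subset the controller chooses to keep.

```python
import numpy as np

def controller_reward(selection_scores, per_image_metric, keep_fraction=0.95):
    # Illustrative sketch: reward = mean task performance (e.g. Dice or
    # accuracy) over the controller-selected validation images. Keeping
    # only the highest-scored images rewards policies that discard
    # images the target task handles poorly.
    n_keep = max(1, int(len(selection_scores) * keep_fraction))
    keep = np.argsort(selection_scores)[-n_keep:]
    return float(np.mean(np.asarray(per_image_metric)[keep]))
```

A reinforcement-learning controller would then be updated to maximise this accumulated reward while the task predictor trains on the training set.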
Abstract: Estimation of bone age from hand radiographs is essential to determine skeletal bone age in diagnosing endocrine disorders and assessing the growth status of children. However, existing automatic methods only apply their models to test images without considering the discrepancy between training samples and test samples, which leads to lower generalization to the test data. In this paper, we propose an adversarial regression learning network (ARLNet) for bone age estimation. Specifically, we first extract bone features from a fine-tuned Inception V3 neural network and propose a regression percentage loss for the general training procedure. To reduce the discrepancy between training and test data, we then propose an adversarial regression loss and a feature reconstruction loss to guarantee the transition from training data to test data and vice versa, preserving invariant features from both training and test data. Experimental results show that the proposed model outperforms state-of-the-art methods.
Authors: Zhongyi Han (Shandong University)*; Rundong He (Shandong University); Tianyang Li (Shandong University of Traditional Chinese Medicine); Benzheng Wei (Shandong University of Traditional Chinese Medicine); Jian Wang (ShanDong JiaoTong University); Yilong Yin (Shandong University)
Abstract: With the COVID-19 pandemic bringing about a severe global crisis, our health systems are under tremendous pressure. Automated screening plays a critical role in the fight against this pandemic, and much of the previous work has been very successful in designing effective screening models. However, these models lose effectiveness in a semi-supervised learning environment with only positive and unlabeled (PU) data, which is easy to collect clinically. In this paper, we report our attempt towards achieving semi-supervised screening of COVID-19 from PU data. We propose a new PU learning method called Constraint Non-Negative Positive Unlabeled Learning (cnPU). It introduces the constraint non-negative risk estimator, which is more robust against overfitting than previous PU learning methods when given limited positive data. It also embodies a new and efficient optimization algorithm that allows the model to learn well on positive data and avoid overfitting on unlabeled data. To the best of our knowledge, this is the first work that realizes PU learning for COVID-19 screening. A series of empirical studies show that our algorithm remarkably outperforms the state of the art on real datasets of two medical imaging modalities, including X-ray and computed tomography. These advantages establish our algorithm as a robust and useful computer-assisted tool in the semi-supervised screening of COVID-19.
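For readers unfamiliar with PU risk estimation, the standard non-negative PU risk estimator that cnPU builds on and constrains can be sketched as below. This follows the widely used nnPU formulation (with a sigmoid surrogate loss); the paper's constraint and optimization algorithm are not reproduced here.

```python
import numpy as np

def sigmoid_loss(scores, label):
    # Surrogate loss l(z, y) = sigmoid(-y * z), common in PU learning.
    return 1.0 / (1.0 + np.exp(label * scores))

def nn_pu_risk(pos_scores, unl_scores, prior):
    # Non-negative PU risk with assumed class prior pi = P(y = +1):
    #   R = pi * R_p^+ + max(0, R_u^- - pi * R_p^-)
    # The max(0, .) clamp keeps the estimated negative risk
    # non-negative, which guards against overfitting when positive
    # data is limited.
    r_p_pos = sigmoid_loss(pos_scores, +1).mean()  # loss of positives as +1
    r_p_neg = sigmoid_loss(pos_scores, -1).mean()  # loss of positives as -1
    r_u_neg = sigmoid_loss(unl_scores, -1).mean()  # loss of unlabeled as -1
    return prior * r_p_pos + max(0.0, r_u_neg - prior * r_p_neg)
```

A classifier scoring positives high and true negatives low yields a small, non-negative risk under this estimator.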
Abstract: Most existing deep learning-based frameworks for image segmentation assume that a unique ground truth is known and can be used for performance evaluation. This is true for many applications, but not all. Myocardial segmentation of Myocardial Contrast Echocardiography (MCE), a critical task in automatic myocardial perfusion analysis, is an example. Due to the low resolution and serious artifacts in MCE data, annotations from different cardiologists can vary significantly, and it is hard to tell which one is the best. In this case, how can we find a good way to evaluate segmentation performance, and how do we train the neural network? In this paper, we address the first problem by proposing a new extended Dice to effectively evaluate segmentation performance when multiple accepted ground truths are available. Then, based on our proposed metric, we solve the second problem by further incorporating the new metric into a loss function that enables neural networks to flexibly learn general features of the myocardium. Experimental results on our clinical MCE data set demonstrate that the neural network trained with the proposed loss function outperforms existing ones that try to obtain a unique ground truth from multiple annotations, both quantitatively and qualitatively. Finally, our grading study shows that using extended Dice as an evaluation metric can better identify segmentation results that need manual correction compared with using Dice.
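As a baseline for readers, the standard (single-ground-truth) Dice coefficient that the paper's extended Dice generalizes can be sketched as below; the exact multi-ground-truth extension is defined in the paper and is not reproduced here.

```python
import numpy as np

def dice(pred, gt, eps=1e-8):
    # Standard Dice overlap: 2|A ∩ B| / (|A| + |B|).
    # The paper's extended Dice generalizes this to a set of multiple
    # accepted ground-truth annotations.
    pred, gt = pred.astype(bool), gt.astype(bool)
    return 2.0 * np.logical_and(pred, gt).sum() / (pred.sum() + gt.sum() + eps)
```

Dice is 1 for a perfect match and 0 for disjoint masks, which is why a loss built on it rewards overlap with the accepted annotations.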
Authors: Long Xie (University of Pennsylvania); Laura Wisse (Lund University); Jiancong Wang (University of Pennsylvania); Sadhana Ravikumar (University of Pennsylvania); Trevor Glenn (University of Pennsylvania); Anica Luther (Lund University); Sydney Lim (University of Pennsylvania); David Wolk (University of Pennsylvania); Paul Yushkevich (University of Pennsylvania)
Abstract: Deep learning (DL) is the state-of-the-art methodology in various medical image segmentation tasks. However, it requires relatively large amounts of manually labeled training data, which may be infeasible to generate in some applications. In addition, DL methods have relatively poor generalizability to out-of-sample data. Multi-atlas segmentation (MAS), on the other hand, has promising performance using limited amounts of training data and good generalizability. A hybrid method that integrates the high accuracy of DL and the good generalizability of MAS is highly desired and could play an important role in segmentation problems where manually labeled data is hard to generate. Most of the prior work focuses on improving single components of MAS using DL rather than directly optimizing the final segmentation accuracy via an end-to-end pipeline. Only one study explored this idea in binary segmentation of 2D images, and it remains unknown whether it generalizes well to multi-class 3D segmentation problems. In this study, we propose a 3D end-to-end hybrid pipeline, named deep label fusion (DLF), that takes advantage of the strengths of MAS and DL. Experimental results demonstrate that DLF yields significant improvements over conventional label fusion methods as well as U-Net, a direct DL approach, in the context of segmenting medial temporal lobe subregions using 3T T1-weighted and T2-weighted structural MRI. Further, when applied to an unseen similar dataset acquired at 7T, DLF maintains its superior performance, which demonstrates its good generalizability.
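To make "conventional label fusion" concrete, the simplest such baseline (majority voting across registered atlas label maps) can be sketched as below. This is the kind of fixed, non-learned fusion rule that a learned method like DLF aims to improve on; it is not the DLF algorithm itself.

```python
import numpy as np

def majority_vote_fusion(atlas_labels):
    # Conventional label fusion baseline: each voxel takes the label
    # chosen by the most atlases.
    # atlas_labels: integer array of shape (n_atlases, *volume_shape).
    atlas_labels = np.asarray(atlas_labels)
    n_classes = atlas_labels.max() + 1
    # Count, per class, how many atlases voted for it at each voxel.
    votes = np.stack([(atlas_labels == c).sum(axis=0)
                      for c in range(n_classes)])
    return votes.argmax(axis=0)
```

More sophisticated fusion (e.g. weighted voting) replaces the raw counts with per-atlas, per-voxel weights; DLF instead learns the fusion end-to-end.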
Authors: Yirui Wang (PAII Inc.); Kang Zheng (PAII Inc.); Chi Tung Cheng (Chang Gung Memorial Hospital, Linkou); Xiao-Yun Zhou (PAII Inc.); Zhilin Zheng (PingAn Technology); Jing Xiao (Ping An Insurance (Group) Company of China); Le Lu (PAII Inc.); Chien-Hung Liao (Chang Gung Memorial Hospital); Shun Miao (PAII Inc.)
Abstract: Exploiting available medical records to train high-performance computer-aided diagnosis (CAD) models via the semi-supervised learning (SSL) setting is emerging as a way to tackle the prohibitively high labor costs involved in large-scale medical image annotation. Despite the extensive attention received by SSL, previous methods failed to 1) account for the low disease prevalence in medical records and 2) utilize the image-level diagnoses indicated in the medical records. Both issues are unique to SSL for CAD models. In this work, we propose a new knowledge distillation method that effectively exploits large-scale image-level labels extracted from the medical records, augmented with limited expert-annotated region-level labels, to train a rib and clavicle fracture CAD model for chest X-ray (CXR). Our method leverages the teacher-student model paradigm and features a novel adaptive asymmetric label sharpening (AALS) algorithm to address the label imbalance problem that especially exists in the medical domain. Our approach is extensively evaluated on all CXRs (N=65,845) from the trauma registry of Chang Gung Memorial Hospital over a period of 9 years (2008-2016), on the most common rib and clavicle fractures. The experimental results demonstrate that our method achieves state-of-the-art fracture detection performance, i.e., an area under the receiver operating characteristic curve (AUROC) of 0.9318 and a free-response receiver operating characteristic (FROC) score of 0.8914 on rib fractures, significantly outperforming previous approaches by an AUROC gap of 1.63% and an FROC improvement of 3.74%. Consistent performance gains are also observed for clavicle fracture detection.
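To illustrate what "asymmetric" sharpening could mean in this teacher-student setting, here is a deliberately simplified sketch; the sharpening rule, its exponent, and the function name below are my assumptions for illustration, not the paper's AALS formula.

```python
import numpy as np

def asymmetric_sharpen(p, t=0.5):
    # Illustrative asymmetric sharpening (my reading, not the paper's
    # exact rule): with exponent t < 1, teacher probabilities in (0, 1)
    # are pushed toward 1, amplifying scarce positive (fracture)
    # evidence; probabilities are never pushed downward, hence
    # "asymmetric" -- this counteracts the low disease prevalence.
    p = np.asarray(p, dtype=float)
    return np.maximum(p, p ** t)
```

The student would then be trained against these sharpened teacher soft labels, with the adaptivity of AALS governing how strongly sharpening is applied.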