Abstract: This paper introduces an extension of diffeomorphic registration to enable the morphological analysis of data structures with inherent density variations and imbalances. Building on the Large Deformation Diffeomorphic Metric Mapping (LDDMM) registration framework and measure representations of shapes, we propose to augment previous measure deformation approaches with an additional density (or mass) transformation process. We then derive a variational formulation for the joint estimation of the optimal deformation and density change between two measures. Based on the resulting optimality conditions, we deduce a shooting algorithm to numerically estimate solutions and illustrate the practical interest of this model for several types of geometric data, such as fiber bundles with inconsistent fiber densities or incomplete surfaces.
Authors: Daniel H. Pak (Yale University)*; Minliang Liu (Georgia Institute of Technology); Shawn Ahn (Yale University); Andres Caballero (Georgia Institute of Technology); John A Onofrey (Yale University); Liang Liang (University of Miami); Wei Sun (Georgia Institute of Technology); James S Duncan (Yale University)
Abstract: Finite Element Analysis (FEA) is useful for simulating Transcatheter Aortic Valve Replacement (TAVR), but has a significant bottleneck at input mesh generation. Existing automated methods for imaging-based valve modeling often make heavy assumptions about imaging characteristics and/or output mesh topology, limiting their adaptability. In this work, we propose a deep learning-based deformation strategy for producing aortic valve FE meshes from noisy 3D CT scans of TAVR patients. In particular, we propose a novel image analysis problem formulation that allows for training of mesh prediction models using segmentation labels (i.e., weak supervision), and identify a unique set of losses that improve model performance within this framework. Our method can handle images with large amounts of calcification and low contrast, and is compatible with predicting both surface and volumetric meshes. The predicted meshes have good surface and correspondence accuracy, and produce reasonable FEA results.
Abstract: Probabilistic image segmentation encodes varying prediction confidence and inherent ambiguity in the segmentation problem. While different probabilistic segmentation models are designed to capture different aspects of segmentation uncertainty and ambiguity, these modelling differences are rarely discussed in the context of applications of uncertainty. We consider two common use cases of segmentation uncertainty, namely assessment of segmentation quality and active learning. We consider four established strategies for probabilistic segmentation, discuss their modelling capabilities, and investigate their performance in these two tasks. We find that for all models and both tasks, returned uncertainty correlates positively with segmentation error, but does not prove to be useful for active learning.
Authors: Munan Ning (National University of Defense Technology)*; Cheng Bian (Tencent); Dong Wei (Tencent Jarvis Lab); Shuang Yu (Tencent); Chenglang Yuan (Tencent); Yaohua Wang (National University of Defense Technology); Yang Guo (National University of Defense Technology); Kai Ma (Tencent); Yefeng Zheng (Tencent)
Abstract: Domain shift commonly occurs in cross-domain scenarios because of the wide gaps between different domains: when a deep learning model well-trained in one domain is applied to another target domain, it usually performs poorly. To tackle this problem, unsupervised domain adaptation (UDA) techniques are proposed to bridge the gap between different domains and improve model performance without annotations in the target domain. UDA is of particular value for multimodal medical image analysis, where annotation difficulty is a practical concern. However, most existing UDA methods achieve satisfactory improvements in only one adaptation direction (e.g., MRI to CT) and often perform poorly in the other (CT to MRI), limiting their practical usage. In this paper, we propose a bidirectional UDA (BiUDA) framework based on disentangled representation learning for equally competent two-way UDA performance. This framework employs a unified domain-aware pattern encoder which not only adaptively encodes images from different domains through a domain controller, but also improves model efficiency by eliminating redundant parameters. Furthermore, to avoid distortion of the contents and patterns of input images during adaptation, a content-pattern consistency loss is introduced. Additionally, for better UDA segmentation performance, a label consistency strategy is proposed to provide extra supervision by recomposing target-domain-styled images with the corresponding source-domain annotations. Comparison experiments and ablation studies on two public datasets demonstrate the superiority of our BiUDA framework over current state-of-the-art UDA methods and the effectiveness of its novel designs. By successfully addressing two-way adaptation, the BiUDA framework offers a flexible way to apply UDA techniques in real-world scenarios.
Authors: Klemens Kasseroller (Graz University of Technology)*; Christian Payer (Graz University of Technology); Franz Thaler (Institute of Computer Graphics and Vision, Graz University of Technology); Darko Stern (Medical University of Graz)
Abstract: We propose a reinforcement learning (RL) based approach for anatomical landmark localization in medical images, where the agent can move in arbitrary directions with a variable step size. The use of a continuous action space reduces the average number of steps required to locate a landmark by more than a factor of 30 compared to localization using discrete actions. Our approach outperforms a state-of-the-art RL method based on a discrete action space and is on par with the state-of-the-art supervised regression-based method. Furthermore, we extend our approach to a multi-agent setting, where we allow collaboration between agents to enable learning of the landmarks’ spatial configuration. The results of the multi-agent RL approach show that the positions of occluded landmarks can be successfully estimated from the relative positions predicted for the visible landmarks.
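The benefit of a continuous action space described above can be illustrated with a toy step count: unit axis-aligned moves (discrete actions) versus variable-length moves in arbitrary directions. This is a minimal sketch under assumed settings (the `max_step` cap and the example distances are illustrative, not the paper's configuration):

```python
import numpy as np

def steps_discrete(start, landmark):
    """Unit axis-aligned moves (discrete action space): one voxel per step."""
    return int(np.abs(np.asarray(landmark) - np.asarray(start)).sum())

def steps_continuous(start, landmark, max_step=32.0):
    """Variable-length moves in arbitrary directions: the agent may cover
    up to max_step voxels per step along the predicted direction."""
    pos = np.asarray(start, dtype=float)
    goal = np.asarray(landmark, dtype=float)
    steps = 0
    while np.linalg.norm(goal - pos) > 1e-6:
        delta = goal - pos
        dist = np.linalg.norm(delta)
        pos = pos + delta / dist * min(dist, max_step)
        steps += 1
    return steps

# From (0, 0) to a landmark 96 voxels away on each axis:
d = steps_discrete((0, 0), (96, 96))      # 192 unit moves
c = steps_continuous((0, 0), (96, 96))    # a handful of variable-length moves
```

The continuous agent reaches the target in a few long moves along the straight-line direction, while the discrete agent must take one unit move per voxel of Manhattan distance.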
Authors: Yubo Zhang (The University of North Carolina at Chapel Hill)*; Shuxian Wang (The University of North Carolina at Chapel Hill); Ruibin Ma (University of North Carolina at Chapel Hill); Sarah McGill (University of North Carolina at Chapel Hill); Julian Rosenman (University of North Carolina at Chapel Hill); Steve Pizer (University of North Carolina)
Abstract: High screening coverage during colonoscopy is crucial to effectively prevent colon cancer. Previous work has allowed alerting the doctor to unsurveyed regions by reconstructing the 3D colonoscopic surface from colonoscopy videos in real time. However, the lighting inconsistency of colonoscopy videos can cause a key component of the colonoscopic reconstruction system, the SLAM optimization, to fail. In this work we focus on the lighting problem in colonoscopy videos. To improve the lighting consistency of colonoscopy videos, we have found it necessary to apply a lighting correction that adapts to the intensity distribution of recent video frames. To achieve this in real time, we have designed and trained an RNN that adapts the gamma value in a gamma-correction process. Applied in the colonoscopic surface reconstruction system, our lightweight model significantly boosts the reconstruction success rate, making a larger proportion of colonoscopy video segments reconstructable and improving the reconstruction quality of the already reconstructed segments.
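The adaptive gamma-correction step can be sketched as follows. The `adapt_gamma` heuristic below is a hypothetical stand-in for the paper's RNN: it simply solves for the gamma that moves the mean intensity of recent frames toward an assumed target brightness (`target_mean` is not from the paper):

```python
import numpy as np

def gamma_correct(frame, gamma):
    """Apply gamma correction to a frame with intensities in [0, 1];
    gamma < 1 brightens, gamma > 1 darkens."""
    return np.clip(frame, 0.0, 1.0) ** gamma

def adapt_gamma(recent_frames, target_mean=0.5):
    """Hypothetical stand-in for the RNN: choose gamma so that
    mean ** gamma == target_mean for the recent-frame mean intensity."""
    mean = float(np.mean(recent_frames))
    mean = min(max(mean, 1e-3), 1.0 - 1e-3)   # guard the log
    return np.log(target_mean) / np.log(mean)

frames = np.full((4, 8, 8), 0.25)    # a window of dark recent frames
g = adapt_gamma(frames)              # gamma < 1, so correction brightens
corrected = gamma_correct(frames, g)
```

In the actual system the gamma value is predicted by a trained RNN from the intensity statistics of recent frames, rather than by this closed-form rule.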
Authors: Junbo Ma (University of North Carolina at Chapel Hill); Oleh Krupa (UNC-Ch); Rose Glass (University of North Carolina at Chapel Hill); Carolyn McCormick (University of North Carolina at Chapel Hill); David Borland (Renaissance Computing Institute); Minjeong Kim (University of North Carolina at Greensboro); Jason Stein (University of North Carolina at Chapel Hill, USA); Guorong Wu (University of North Carolina)*
Abstract: Tissue clearing and light-sheet microscopy technologies offer new opportunities to quantify three-dimensional (3D) neural structure at a cellular or even sub-cellular resolution. Although many efforts have been made to recognize nuclei in 3D using deep learning techniques, current state-of-the-art approaches often work in a two-step manner, i.e., first segmenting nucleus regions within a 2D optical slice and then assembling the regions into 3D nucleus instances. Due to the poor inter-slice resolution of many volumetric microscopy images and the lack of contextual information across image slices, these two-step approaches yield less accurate instance segmentation results. To address these limitations, we propose a novel neural network for 3D nucleus instance segmentation (NIS), called NIS-Net, which jointly segments and assembles the 3D instances of nuclei. Specifically, a pretext task is designed to predict the image appearance of the to-be-processed slice using the learned context from the processed slices, where the well-characterized contextual information is leveraged to guide the assembly of 3D nucleus instances. Since our NIS-Net progressively identifies nucleus instances by sliding over the entire image stack, our method is capable of segmenting nucleus instances for the whole mouse brain. Experimental results show that our proposed NIS-Net achieves higher accuracy and produces more plausible nucleus instances than competing methods.
Abstract: Deep learning models with large learning capacities often overfit to medical imaging datasets. This is because training sets are often relatively small due to the significant time and financial costs incurred in medical data acquisition and labelling. Data augmentation is therefore often used to expand the availability of training data and to increase generalization. However, augmentation strategies are often chosen on an ad-hoc basis without justification. In this paper, we present an augmentation policy search method with the goal of improving model classification performance. We include in the augmentation policy search additional transformations that are often used in medical image analysis and evaluate their performance. In addition, we extend the augmentation policy search to include non-linear mixed-example data augmentation strategies. Using these learned policies, we show that principled data augmentation for medical image model training can lead to significant improvements in ultrasound fetal standard plane classification, with an average F1-score improvement of 7.0% over naive data augmentation strategies. We find that the learned representations of ultrasound images are better clustered and defined with optimized data augmentation.
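A classic mixed-example strategy of the kind such policy searches can select is mixup, which blends pairs of images and their labels. The sketch below is a generic illustration, not the paper's exact search space; the Beta parameter `alpha=0.4` is an assumed value:

```python
import numpy as np

def mixup(x1, y1, x2, y2, alpha=0.4, rng=None):
    """Linearly mix two labelled examples: draw lambda ~ Beta(alpha, alpha)
    and interpolate both images and (one-hot) labels."""
    rng = rng or np.random.default_rng(0)
    lam = float(rng.beta(alpha, alpha))
    x = lam * x1 + (1.0 - lam) * x2
    y = lam * y1 + (1.0 - lam) * y2
    return x, y, lam

# Two toy "images" with one-hot labels for a 2-class problem.
x1, x2 = np.zeros((8, 8)), np.ones((8, 8))
y1, y2 = np.array([1.0, 0.0]), np.array([0.0, 1.0])
x, y, lam = mixup(x1, y1, x2, y2)
```

Non-linear variants replace the pixel-wise interpolation with operations such as regional patch swapping, while keeping the same soft-label idea.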
Authors: Yuting He (Southeast University); Rongjun Ge (Southeast University); Xiaoming Qi (Southeast University); Guanyu Yang (Southeast University)*; Yang Chen (Southeast University); Youyong Kong (Southeast University); Huazhong Shu (Southeast University); Jean-Louis Coatrieux (LTSI, Rennes, France); Shuo Li (the University of Western Ontario)
Abstract: 3D complete renal structures (CRS) segmentation aims to segment the kidneys, tumors, and renal arteries and veins in a single inference. Once successful, it will provide preoperative plans and intraoperative guidance for laparoscopic partial nephrectomy (LPN), playing a key role in renal cancer treatment. However, no success has been reported in 3D CRS segmentation due to the complex shapes of renal structures, low contrast and large anatomical variation. In this study, we utilize adversarial ensemble learning and propose the Ensemble Multi-condition GAN (EnMcGAN) for 3D CRS segmentation for the first time. Its contribution is three-fold. 1) Inspired by windowing, we propose a multi-windowing committee which divides the CTA image into multiple narrow windows with different window centers and widths, enhancing the contrast of salient boundaries and soft tissues. It then builds an ensemble segmentation model on these narrow windows to fuse their segmentation strengths and improve overall segmentation quality. 2) We propose a multi-condition GAN which equips the segmentation model with multiple discriminators to encourage the segmented structures to meet their real shape conditions, thus improving the shape feature extraction ability. 3) We propose an adversarial weighted ensemble module which uses the trained discriminators to evaluate the quality of the segmented structures and normalizes these evaluation scores into input-specific ensemble weights, thus enhancing the ensemble results. 122 patients are enrolled in this study, and the mean Dice coefficient over the renal structures reaches 84.6%. Extensive experiments demonstrate strong segmentation accuracy and promising clinical value for renal cancer treatment.
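The windowing operation that inspires the multi-windowing committee can be sketched as follows: each (center, width) pair maps Hounsfield units to a normalized range, yielding one enhanced "view" per window. The specific window values below are illustrative assumptions, not the paper's settings:

```python
import numpy as np

def apply_window(hu, center, width):
    """Map Hounsfield units into [0, 1] within a (center, width) window;
    values outside the window saturate at 0 or 1."""
    lo, hi = center - width / 2.0, center + width / 2.0
    return np.clip((hu - lo) / (hi - lo), 0.0, 1.0)

# Illustrative (center, width) pairs: soft-tissue-, vessel-, and bone-like
# windows. A committee member segments each windowed channel separately.
windows = [(40, 400), (300, 600), (700, 1500)]
hu = np.array([-1000.0, 40.0, 300.0, 1000.0])   # air, tissue, contrast, bone
channels = np.stack([apply_window(hu, c, w) for c, w in windows])
```

Each narrow window saturates irrelevant intensities and spreads the remaining dynamic range over the structures of interest, which is what lets each committee member specialize.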
Abstract: Typical methods for semantic image segmentation rely on large training sets comprising pixel-level segmentations and pixel-level classifications. In medical applications, a large number of training images with per-pixel segmentations are difficult to obtain. In addition, many applications involve images or image tiles containing a single object/region of interest, where the image/tile-level information about the object/region class is readily available. We propose a novel deep-neural-network (DNN) framework for joint segmentation and recognition of objects relying on weakly-supervised learning from training sets having very few expert segmentations, but with object-class labels available for all images/tiles. For weakly-supervised learning, we propose a variational-learning framework relying on Monte Carlo expectation maximization (MCEM), inferring a posterior distribution on the missing segmentations. We design an effective Metropolis-Hastings posterior sampler coupled with suitable sample reparametrizations to enable end-to-end backpropagation. Our end-to-end learning DNN first produces probabilistic segmentations of objects, and then their probabilistic classifications. Results on two publicly available real-world datasets show the benefits of our strategies of (i) joint object segmentation and recognition and (ii) weakly-supervised MCEM-based learning.
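The Metropolis-Hastings machinery behind the MCEM E-step can be illustrated with a minimal random-walk sampler. This is a toy stand-in on a simple 1-D target, not the paper's sampler over segmentation posteriors; the step size and target density are assumptions:

```python
import numpy as np

def metropolis_hastings(log_post, init, n_steps, step=0.5, rng=None):
    """Minimal random-walk Metropolis-Hastings: propose a Gaussian
    perturbation and accept with probability min(1, p(prop)/p(x))."""
    rng = rng or np.random.default_rng(0)
    x = np.asarray(init, dtype=float)
    samples = []
    for _ in range(n_steps):
        prop = x + rng.normal(scale=step, size=x.shape)
        if np.log(rng.uniform()) < log_post(prop) - log_post(x):
            x = prop
        samples.append(x.copy())
    return np.array(samples)

# Toy unnormalized log-posterior: a standard normal in 1-D.
def log_post(z):
    return -0.5 * float(z @ z)

samples = metropolis_hastings(log_post, np.zeros(1), n_steps=5000)
```

In the MCEM setting, the chain would instead sample plausible missing segmentations under the current model, and the accepted samples (suitably reparametrized) would feed the M-step gradient updates.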