Categories
IPMI 2021 poster

Collaborative Multi-Agent Reinforcement Learning for Landmark Localization Using Continuous Action Space

Authors: Klemens Kasseroller (Graz University of Technology)*; Christian Payer (Graz University of Technology); Franz Thaler (Institute of Computer Graphics and Vision, Graz University of Technology); Darko Stern (Medical University of Graz)

Abstract: We propose a reinforcement learning (RL) based approach for anatomical landmark localization in medical images, where the agent can move in arbitrary directions with a variable step size. The use of a continuous action space reduces the average number of steps required to locate a landmark by a factor of more than 30 compared to localization using discrete actions. Our approach outperforms a state-of-the-art RL method based on a discrete action space and is in line with the state-of-the-art supervised regression-based method. Furthermore, we extend our approach to a multi-agent setting that allows collaboration between agents to enable learning of the landmarks’ spatial configuration. The results of the multi-agent RL approach show that the positions of occluded landmarks can be successfully estimated from the relative positions predicted for the visible landmarks.
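
To make the continuous-action idea concrete, below is a minimal, hypothetical PyTorch sketch: an agent network regresses a 2D displacement vector (direction and step size in a single output) from the image patch around its current position, and inference iterates until the predicted step becomes small. The architecture, patch size, stopping tolerance, and the extract_patch helper are illustrative assumptions, not the authors' implementation, and training is omitted.

```python
import torch
import torch.nn as nn

class ContinuousAgent(nn.Module):
    """Illustrative policy network: image patch in, 2D displacement out."""
    def __init__(self, patch_size: int = 32):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Flatten(),
        )
        # Unbounded regression head: direction and step size in one vector.
        self.head = nn.Linear(32 * (patch_size // 4) ** 2, 2)

    def forward(self, patch):
        return self.head(self.features(patch))

def extract_patch(image, pos, size):
    # Hypothetical helper: crop a size x size patch centred on pos,
    # clamped so the crop stays inside the image.
    h, w = image.shape
    y = int(pos[0].clamp(size // 2, h - size // 2))
    x = int(pos[1].clamp(size // 2, w - size // 2))
    return image[y - size // 2:y + size // 2, x - size // 2:x + size // 2]

def localize(agent, image, start, patch_size=32, max_steps=100, tol=1.0):
    """Iterate until the predicted step is shorter than `tol` pixels."""
    pos = torch.tensor(start, dtype=torch.float32)
    for _ in range(max_steps):
        patch = extract_patch(image, pos, patch_size)
        step = agent(patch.reshape(1, 1, patch_size, patch_size)).squeeze(0)
        pos = pos + step
        if step.norm() < tol:  # a tiny step signals convergence on the landmark
            break
    return pos
```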

Categories
IPMI 2021 poster

3D Nucleus Instance Segmentation for Whole-Brain Microscopy Images

Authors: Junbo Ma (University of North Carolina at Chapel Hill); Oleh Krupa (UNC-Ch); Rose Glass (University of North Carolina at Chapel Hill); Carolyn McCormick (University of North Carolina at Chapel Hill); David Borland (Renaissance Computing Institute); Minjeong Kim (University of North Carolina at Greensboro); Jason Stein (University of North Carolina at Chapel Hill, USA); Guorong Wu (University of North Carolina)*

Abstract: Tissue clearing and light-sheet microscopy technologies offer new opportunities to quantify three-dimensional (3D) neural structures at cellular or even sub-cellular resolution. Although many efforts have been made to recognize nuclei in 3D using deep learning techniques, current state-of-the-art approaches often work in a two-step manner, i.e., first segmenting nucleus regions within a 2D optical slice and then assembling the regions into the 3D instance of a nucleus. Due to the poor inter-slice resolution of many volumetric microscopy images and the lack of contextual information across image slices, these two-step approaches yield less accurate instance segmentation results. To address these limitations, a novel neural network for 3D nucleus instance segmentation (NIS), called NIS-Net, is proposed, which jointly segments and assembles the 3D instances of nuclei. Specifically, a pretext task is designed to predict the image appearance of the to-be-processed slice using the context learned from the already-processed slices, and this well-characterized contextual information is leveraged to guide the assembly of 3D nucleus instances. Since NIS-Net progressively identifies nucleus instances by sliding over the entire image stack, our method is capable of segmenting nucleus instances for the whole mouse brain. Experimental results show that our proposed NIS-Net achieves higher accuracy and yields more plausible nucleus instances than competing methods.
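
The pretext task can be pictured with a small, assumed PyTorch sketch: a network predicts the appearance of the next slice from the k previously processed slices, and the prediction error while sliding along the z-axis serves as a crude proxy for the contextual cue that guides 3D assembly. The architecture, the choice of k, and the error-based scoring are illustrative assumptions, not NIS-Net itself.

```python
import torch
import torch.nn as nn

class SlicePredictor(nn.Module):
    """Assumed pretext model: k context slices in (as channels), next slice out."""
    def __init__(self, k: int = 3):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(k, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 1, 3, padding=1),
        )

    def forward(self, context):  # context: (B, k, H, W)
        return self.net(context)

@torch.no_grad()
def slide_over_stack(model, volume, k=3):
    """Slide along z and score how well each slice matches its context-based
    prediction; low error hints that the slice continues existing nuclei,
    while high error can flag instances starting or ending during assembly."""
    errors = []
    for z in range(k, volume.shape[0]):      # volume: (Z, H, W)
        context = volume[z - k:z].unsqueeze(0)   # (1, k, H, W)
        predicted = model(context)               # (1, 1, H, W)
        errors.append(torch.mean((predicted[0, 0] - volume[z]) ** 2).item())
    return errors
```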

Categories
IPMI 2021 oral

Segmenting two-dimensional structures with strided tensor networks

Authors: Raghavendra Selvan (University of Copenhagen)*; Erik B B Dam (University of Copenhagen); Jens Petersen (University of Copenhagen)

Abstract: Tensor networks provide an efficient approximation of operations involving high-dimensional tensors and have been extensively used in modelling quantum many-body systems. More recently, supervised learning has been attempted with tensor networks, primarily focusing on tasks such as image classification. In this work, we propose a novel formulation of tensor networks for supervised image segmentation which allows them to operate on high-resolution medical images. We use the matrix product state (MPS) tensor network on non-overlapping patches of a given input image to predict the segmentation mask by learning a pixel-wise linear classification rule in a high-dimensional space. The proposed model is end-to-end trainable using backpropagation and is implemented as a strided tensor network to reduce parameter complexity. The performance of the proposed method is evaluated on two public medical imaging datasets and compared to relevant baselines. The evaluation shows that the strided tensor network yields competitive performance compared to CNN-based models while using fewer resources, such as GPU memory. Additionally, based on these experiments, we discuss the feasibility of using fully linear models for segmentation tasks.
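
For readers unfamiliar with MPS classifiers, the following toy sketch shows the core contraction on a single flattened patch: each pixel is lifted by a local feature map, and a chain of 3-way tensors is contracted into a vector that a linear layer could turn into pixel-wise class scores. The sine/cosine feature map, tiny bond dimension, and random cores are standard simplifications from the tensor-network learning literature, not the paper's strided implementation.

```python
import math
import torch

def local_feature_map(x):
    # Lift each pixel intensity x in [0, 1] into a 2D local feature space.
    return torch.stack([torch.cos(math.pi / 2 * x),
                        torch.sin(math.pi / 2 * x)], dim=-1)

def mps_contract(pixels, cores):
    """Contract an MPS with one (D, 2, D) core per pixel.
    pixels: (N,) intensities; returns a length-D vector that a linear
    output layer could map to segmentation logits."""
    feats = local_feature_map(pixels)            # (N, 2)
    v = torch.ones(cores[0].shape[0])            # left boundary vector
    for f, core in zip(feats, cores):
        m = torch.einsum('s,dse->de', f, core)   # local matrix for this pixel
        v = v @ m
    return v

# Usage on one random 4x4 patch with bond dimension D = 8:
N, D = 16, 8
patch = torch.rand(N)
cores = [torch.randn(D, 2, D) / math.sqrt(D) for _ in range(N)]
print(mps_contract(patch, cores))
```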

Categories
IPMI 2021 oral

Future Frame Prediction for Robot-assisted Surgery

Authors: Xiaojie Gao (The Chinese University of Hong Kong)*; Yueming Jin (The Chinese University of Hong Kong); Zixu Zhao (The Chinese University of Hong Kong); Qi Dou (The Chinese University of Hong Kong); Pheng-Ann Heng (The Chinese University of Hong Kong)

Abstract: Predicting future frames for robotic surgical video is an interesting, important, yet extremely challenging problem, given that the operative tasks may have complex dynamics. Existing approaches to future-frame prediction in natural videos are based on either deterministic or stochastic models, including deep recurrent neural networks, optical flow, and latent space modeling. However, the potential of predicting meaningful movements of dual-arm robots in surgical scenarios has not yet been tapped, and this setting is typically more challenging than forecasting the independent motions of single-arm robots in natural scenarios. In this paper, we propose a ternary prior guided variational autoencoder (TPG-VAE) model for future frame prediction in robotic surgical video sequences. Besides the content distribution, our model learns a motion distribution, which is novel for handling the small movements of surgical tools. Furthermore, we add invariant prior information from the gesture class to the generation process to constrain the latent space of our model. To the best of our knowledge, this is the first time that the future frames of dual-arm robots have been predicted while considering their unique characteristics relative to general robotic videos. Experiments on the suturing task of the public JIGSAWS dataset demonstrate that our model produces more stable and realistic future frames.
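
A schematic sketch of the "ternary prior" idea, under assumed shapes and layers (this is not the released TPG-VAE): the decoder is conditioned jointly on a content latent, a motion latent sampled with the usual VAE reparameterization, and an embedding of the invariant gesture class.

```python
import torch
import torch.nn as nn

class TernaryPriorDecoder(nn.Module):
    """Assumed decoder conditioned on content, motion, and gesture priors."""
    def __init__(self, z_content=64, z_motion=32, n_gestures=10, g_dim=16):
        super().__init__()
        self.gesture_emb = nn.Embedding(n_gestures, g_dim)  # invariant prior
        self.fc = nn.Linear(z_content + z_motion + g_dim, 128 * 4 * 4)
        self.deconv = nn.Sequential(
            nn.ConvTranspose2d(128, 64, 4, stride=2, padding=1), nn.ReLU(),
            nn.ConvTranspose2d(64, 32, 4, stride=2, padding=1), nn.ReLU(),
            nn.ConvTranspose2d(32, 3, 4, stride=2, padding=1), nn.Sigmoid(),
        )

    def forward(self, z_content, z_motion, gesture_id):
        g = self.gesture_emb(gesture_id)
        h = self.fc(torch.cat([z_content, z_motion, g], dim=-1))
        return self.deconv(h.view(-1, 128, 4, 4))  # (B, 3, 32, 32) future frame

def reparameterize(mu, logvar):
    # Standard VAE sampling, used here for the stochastic motion latent.
    return mu + torch.randn_like(mu) * torch.exp(0.5 * logvar)
```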

Categories
IPMI 2021 oral

Hierarchical Morphology-Guided Tooth Instance Segmentation from CBCT Images

Authors: Zhiming Cui (HKU); Bojun Zhang (Shanghai Jiao Tong University); Chunfeng Lian (Xi’an Jiaotong University); Changjian Li (University College London); Lei Yang (The University of Hong Kong); Min Zhu (Shanghai Jiao Tong University); Wenping Wang (The University of Hong Kong); Dinggang Shen (United Imaging Intelligence)

Abstract: Automatic and accurate segmentation of individual teeth, i.e., tooth instance segmentation, from CBCT images is an essential step for computer-aided dentistry. Previous works have typically overlooked the rich morphological features of teeth, such as tooth root apices, which are critical for successful treatment outcomes. This paper presents a two-stage learning-based framework that explicitly leverages the comprehensive geometric guidance provided by a hierarchical tooth morphological representation for tooth instance segmentation. Given input CBCT images, our method first learns to extract tooth centroids and skeletons to identify each tooth’s rough position and topological structure, respectively. Based on the outputs of the first step, a multi-task learning mechanism is further designed to estimate each tooth’s volumetric mask while simultaneously regressing tooth boundaries and root apices as auxiliary tasks. Extensive evaluations, ablation studies, and comparisons with existing methods show that our approach achieves state-of-the-art segmentation performance, especially around challenging dental parts (i.e., tooth roots and boundaries). These results suggest the potential applicability of our framework in real-world clinical scenarios.
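
The second-stage multi-task design can be sketched as a shared backbone with three output heads: one for the volumetric tooth mask and two auxiliary heads for boundaries and root apices, combined in a weighted loss. The backbone, head shapes, and loss weights below are illustrative assumptions, not the paper's exact network.

```python
import torch
import torch.nn as nn

class MultiTaskToothNet(nn.Module):
    """Shared backbone (stand-in for a real 3D U-Net) with three output heads."""
    def __init__(self, in_ch=1, feat=16):
        super().__init__()
        self.backbone = nn.Sequential(
            nn.Conv3d(in_ch, feat, 3, padding=1), nn.ReLU(),
            nn.Conv3d(feat, feat, 3, padding=1), nn.ReLU(),
        )
        self.mask_head = nn.Conv3d(feat, 1, 1)      # volumetric tooth mask
        self.boundary_head = nn.Conv3d(feat, 1, 1)  # auxiliary: tooth boundary
        self.apex_head = nn.Conv3d(feat, 1, 1)      # auxiliary: root apices

    def forward(self, x):  # x: (B, 1, D, H, W) cropped tooth region
        f = self.backbone(x)
        return self.mask_head(f), self.boundary_head(f), self.apex_head(f)

def multitask_loss(preds, targets, w_boundary=0.5, w_apex=0.5):
    # Hypothetical weighting: the auxiliary tasks regularize the mask branch.
    bce = nn.functional.binary_cross_entropy_with_logits
    m, b, a = preds
    tm, tb, ta = targets
    return bce(m, tm) + w_boundary * bce(b, tb) + w_apex * bce(a, ta)
```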