Authors: Akshay V Gaikwad (Indian Institute of Technology (IIT) Bombay)*; Suyash P. Awate (Indian Institute of Technology (IIT) Bombay)
Abstract: Typical methods for semantic image segmentation rely on large training sets comprising pixel-level segmentations and pixel-level classifications. In medical applications, a large number of training images with per-pixel segmentations is difficult to obtain. In addition, many applications involve images or image tiles containing a single object/region of interest, where the image/tile-level information about object/region class is readily available. We propose a novel deep-neural-network (DNN) framework for joint segmentation and recognition of objects relying on weakly-supervised learning from training sets having very few expert segmentations, but with object-class labels available for all images/tiles. For weakly-supervised learning, we propose a variational-learning framework relying on Monte Carlo expectation maximization (MCEM), inferring a posterior distribution on the missing segmentations. We design an effective Metropolis-Hastings posterior sampler coupled with suitable sample reparametrizations to enable end-to-end backpropagation. Our end-to-end learning DNN first produces probabilistic segmentations of objects, and then their probabilistic classifications. Results on two publicly available real-world datasets show the benefits of our strategies of (i) joint object segmentation and recognition as well as (ii) weakly-supervised MCEM-based learning.
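For readers unfamiliar with Metropolis-Hastings sampling, the sketch below illustrates the generic accept/reject step on a toy one-dimensional target. It is not the authors' sampler: the function names (`metropolis_hastings`, `log_standard_normal`), the Gaussian random-walk proposal, the step size, and the standard-normal target are all assumptions chosen for illustration. In the paper's setting the target would instead be a posterior over missing segmentations, which is not reproduced here.

```python
import math
import random


def metropolis_hastings(log_target, init, n_samples, step=1.0, seed=0):
    """Random-walk Metropolis-Hastings for a 1-D unnormalized log-density.

    Illustrative only: proposal, step size, and target are assumptions.
    """
    rng = random.Random(seed)
    x = init
    log_p = log_target(x)
    samples = []
    for _ in range(n_samples):
        # Symmetric Gaussian proposal, so the Hastings ratio reduces to
        # the ratio of target densities.
        proposal = x + rng.gauss(0.0, step)
        log_p_new = log_target(proposal)
        # Accept with probability min(1, p(proposal) / p(x)).
        # The tiny offset guards against log(0).
        if math.log(rng.random() + 1e-300) < log_p_new - log_p:
            x, log_p = proposal, log_p_new
        # On rejection the current state is repeated in the chain.
        samples.append(x)
    return samples


def log_standard_normal(x):
    # Toy target: standard normal, up to an additive constant.
    return -0.5 * x * x


samples = metropolis_hastings(log_standard_normal, init=0.0, n_samples=20000)
kept = samples[2000:]  # discard an initial burn-in segment
mean = sum(kept) / len(kept)
var = sum((s - mean) ** 2 for s in kept) / len(kept)
```

In an MCEM loop, samples of this kind would stand in for the intractable expectation in the E-step. The sample mean and variance computed above should be close to 0 and 1 for this toy target.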