Authors: Zhongyi Han (Shandong University)*; Rundong He (Shandong University); Tianyang Li (Shandong University of Traditional Chinese Medicine); Benzheng Wei (Shandong University of Traditional Chinese Medicine); Jian Wang (ShanDong JiaoTong University); Yilong Yin (Shandong University)
Abstract: The COVID-19 pandemic has caused a severe global crisis and placed health systems under tremendous pressure. Automated screening plays a critical role in the fight against the pandemic, and much previous work has succeeded in designing effective screening models. However, these models lose effectiveness in a semi-supervised learning setting where only positive and unlabeled (PU) data are available, even though such data are easy to collect clinically. In this paper, we report our attempt at semi-supervised screening of COVID-19 from PU data. We propose a new PU learning method called Constraint Non-Negative Positive Unlabeled Learning (cnPU). It introduces the constraint non-negative risk estimator, which is more robust against overfitting than previous PU learning methods when positive data are limited. It also includes a new and efficient optimization algorithm that lets the model learn well on positive data while avoiding overfitting on unlabeled data. To the best of our knowledge, this is the first work to apply PU learning to COVID-19 screening. A series of empirical studies shows that our algorithm remarkably outperforms the state of the art on real datasets from two medical imaging modalities, X-ray and computed tomography. These advantages make our algorithm a robust and useful computer-assisted tool for the semi-supervised screening of COVID-19.
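For readers unfamiliar with PU risk estimation, the following is a minimal background sketch of the standard non-negative PU risk estimator from earlier PU learning work, which is presumably the family of estimators the abstract compares against. It is not the cnPU estimator: the abstract does not specify the constraint or the optimization algorithm, so neither is shown. The function names `sigmoid_loss` and `nn_pu_risk`, the choice of sigmoid loss, and the `prior` parameter (the assumed fraction of positives in the unlabeled data) are illustrative assumptions.

```python
import numpy as np


def sigmoid_loss(scores, label):
    """Sigmoid surrogate loss l(z, y) = 1 / (1 + exp(y * z)) for y in {+1, -1}."""
    return 1.0 / (1.0 + np.exp(label * scores))


def nn_pu_risk(scores_pos, scores_unl, prior):
    """Standard non-negative PU risk (background sketch, NOT the cnPU estimator).

    scores_pos: classifier outputs on labeled positive samples
    scores_unl: classifier outputs on unlabeled samples
    prior:      assumed class prior of positives in the unlabeled data
    """
    # Risk of positives when treated as positive.
    risk_pos_as_pos = sigmoid_loss(scores_pos, +1).mean()
    # Risk of positives when treated as negative (used as a correction term).
    risk_pos_as_neg = sigmoid_loss(scores_pos, -1).mean()
    # Risk of unlabeled samples when treated as negative.
    risk_unl_as_neg = sigmoid_loss(scores_unl, -1).mean()

    # Estimated negative-class risk; it can go below zero when the model
    # overfits, so it is clamped at zero.
    neg_risk = risk_unl_as_neg - prior * risk_pos_as_neg
    return prior * risk_pos_as_pos + max(0.0, neg_risk)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    pos = rng.normal(loc=1.0, size=20)    # hypothetical scores on positives
    unl = rng.normal(loc=0.0, size=200)   # hypothetical scores on unlabeled data
    print(nn_pu_risk(pos, unl, prior=0.3))
```

The clamp at zero is what keeps the estimated negative-class risk from becoming negative when few positive samples are available, which is the overfitting symptom the abstract refers to.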