Generative Adversarial Network with Robust Discriminator Through Multi-Task Learning for Low-Dose CT Denoising
Sunggu Kyung, Jongjun Won, Seongyong Pak, Sunwoo Kim, Sangyoon Lee, Kanggil Park, Gil-Sun Hong, Namkug Kim
IEEE Transactions on Medical Imaging, published 2024-08-26. DOI: 10.1109/TMI.2024.3449647 (https://doi.org/10.1109/TMI.2024.3449647)
Citations: 0
Abstract
Reducing the radiation dose in computed tomography (CT) is vital to decreasing the risk of secondary cancer. However, low-dose CT (LDCT) images carry increased noise that can negatively affect diagnosis. Although numerous deep learning algorithms have been developed for LDCT denoising, several challenges persist, including the visual incongruence perceived by radiologists, unsatisfactory performance across various metrics, and insufficient exploration of the networks' robustness in other CT domains. To address these issues, this study proposes three novel contributions. First, we propose a generative adversarial network (GAN) with a robust discriminator trained through multi-task learning that simultaneously performs three vision tasks: restoration, image-level decisions, and pixel-level decisions. The more tasks the discriminator performs, the better the generator's denoising performance, which suggests that multi-task learning enables the discriminator to provide more meaningful feedback to the generator. Second, two regulatory mechanisms, restoration consistency (RC) and non-difference suppression (NDS), are introduced to improve the discriminator's representation capabilities. These mechanisms eliminate irrelevant regions and compare the discriminator's results from the input and restoration, thus facilitating effective GAN training. Lastly, we incorporate residual fast Fourier transform with convolution (Res-FFT-Conv) blocks into the generator to exploit both frequency and spatial representations. This approach provides mixed receptive fields by combining spatial (local), spectral (global), and residual connections. Our model was evaluated using various pixel- and feature-space metrics in two denoising tasks. Additionally, we conducted visual scoring with radiologists. The results indicate superior performance in both quantitative and qualitative measures compared with state-of-the-art denoising techniques.
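To make the idea of mixed receptive fields concrete, the following is a minimal NumPy sketch of a block that sums a spatial (local) branch, a spectral (global) branch, and a residual connection. It is an illustration under stated assumptions, not the authors' implementation: the abstract does not specify layer counts, channel widths, or activations, so the single-channel input, the function names `conv3x3` and `res_fft_conv_block`, and the 2x2 `freq_mix` matrix (a stand-in for a 1x1 convolution over the real and imaginary parts) are all hypothetical.

```python
import numpy as np


def conv3x3(x, kernel):
    """3x3 cross-correlation with zero padding, so the output keeps x's shape."""
    padded = np.pad(x, 1)  # one pixel of zeros on every side
    out = np.zeros(x.shape, dtype=float)
    for dy in range(3):
        for dx in range(3):
            out += kernel[dy, dx] * padded[dy:dy + x.shape[0], dx:dx + x.shape[1]]
    return out


def res_fft_conv_block(x, spatial_kernel, freq_mix):
    """Hypothetical single-channel block: residual + spatial + spectral branches.

    x              : 2D array (H, W), one image channel
    spatial_kernel : (3, 3) weights for the local branch (assumed size)
    freq_mix       : (2, 2) matrix mixing real/imaginary parts at every
                     frequency, standing in for a 1x1 convolution (assumption)
    """
    # Spatial branch: small kernel followed by ReLU, so each output pixel
    # only sees its 3x3 neighbourhood.
    spatial = np.maximum(conv3x3(x, spatial_kernel), 0.0)

    # Spectral branch: every frequency coefficient depends on all pixels,
    # so a pointwise operation here has an image-wide receptive field.
    freq = np.fft.rfft2(x)
    real = freq_mix[0, 0] * freq.real + freq_mix[0, 1] * freq.imag
    imag = freq_mix[1, 0] * freq.real + freq_mix[1, 1] * freq.imag
    spectral = np.fft.irfft2(real + 1j * imag, s=x.shape)

    # Residual connection: the input is added back unchanged.
    return x + spatial + spectral


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    image = rng.normal(size=(8, 8))
    kernel = rng.normal(scale=0.1, size=(3, 3))
    mix = rng.normal(scale=0.1, size=(2, 2))
    print(res_fft_conv_block(image, kernel, mix).shape)
```

With both sets of weights at zero the block reduces to the identity, which is the usual motivation for a residual connection: the branches only need to learn a correction to the input. In a real generator the weights would be learned per channel and the block would be stacked many times.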