Domain adaptation and representation transfer : 4th MICCAI Workshop, DART 2022, held in conjunction with MICCAI 2022, Singapore, September 22, 2022, proceedings. Domain Adaptation and Representation Transfer (Workshop) (4th : 2022 : Sin...最新文献
Pub Date : 2022-09-01Epub Date: 2022-09-15DOI: 10.1007/978-3-031-16852-9_8
Jiaxuan Pang, Fatemeh Haghighi, DongAo Ma, Nahid Ul Islam, Mohammad Reza Hosseinzadeh Taher, Michael B Gotway, Jianming Liang
Vision transformer-based self-supervised learning (SSL) approaches have recently shown substantial success in learning visual representations from unannotated photographic images. However, their acceptance in medical imaging is still lukewarm, due to the significant discrepancy between medical and photographic images. Consequently, we propose POPAR (patch order prediction and appearance recovery), a novel vision transformer-based self-supervised learning framework for chest X-ray images. POPAR leverages the benefits of vision transformers and unique properties of medical imaging, aiming to simultaneously learn patch-wise high-level contextual features by correcting shuffled patch orders and fine-grained features by recovering patch appearance. We transfer POPAR pretrained models to diverse downstream tasks. The experiment results suggest that (1) POPAR outperforms state-of-the-art (SoTA) self-supervised models with vision transformer backbone; (2) POPAR achieves significantly better performance over all three SoTA contrastive learning methods; and (3) POPAR also outperforms fully-supervised pretrained models across architectures. In addition, our ablation study suggests that to achieve better performance on medical imaging tasks, both fine-grained and global contextual features are preferred. All code and models are available at GitHub.com/JLiangLab/POPAR.
基于视觉变换器的自监督学习(SSL)方法最近在从无标注的摄影图像中学习视觉表征方面取得了巨大成功。然而,由于医学图像和摄影图像之间存在巨大差异,它们在医学成像中的应用仍然不温不火。因此,我们提出了 POPAR(补丁顺序预测和外观恢复),这是一种基于视觉变换器的新型胸部 X 光图像自监督学习框架。POPAR 充分利用了视觉变换器的优势和医学影像的独特属性,旨在通过校正洗牌补丁顺序来同时学习补丁的高级上下文特征,并通过恢复补丁外观来同时学习细粒度特征。我们将 POPAR 预训练模型应用于各种下游任务。实验结果表明:(1) POPAR 优于采用视觉转换器骨干的最先进(SoTA)自监督模型;(2) POPAR 的性能明显优于所有三种 SoTA 对比学习方法;(3) POPAR 还优于跨架构的全监督预训练模型。此外,我们的消融研究表明,要在医学成像任务中取得更好的性能,细粒度和全局上下文特征都是首选。所有代码和模型均可从 GitHub.com/JLiangLab/POPAR 获取。
{"title":"POPAR: Patch Order Prediction and Appearance Recovery for Self-supervised Medical Image Analysis.","authors":"Jiaxuan Pang, Fatemeh Haghighi, DongAo Ma, Nahid Ul Islam, Mohammad Reza Hosseinzadeh Taher, Michael B Gotway, Jianming Liang","doi":"10.1007/978-3-031-16852-9_8","DOIUrl":"10.1007/978-3-031-16852-9_8","url":null,"abstract":"<p><p>Vision transformer-based self-supervised learning (SSL) approaches have recently shown substantial success in learning visual representations from unannotated photographic images. However, their acceptance in medical imaging is still lukewarm, due to the significant discrepancy between medical and photographic images. Consequently, we propose POPAR (patch order prediction and appearance recovery), a novel vision transformer-based self-supervised learning framework for chest X-ray images. POPAR leverages the benefits of vision transformers and unique properties of medical imaging, aiming to simultaneously learn patch-wise high-level contextual features by correcting shuffled patch orders and fine-grained features by recovering patch appearance. We transfer POPAR pretrained models to diverse downstream tasks. The experiment results suggest that (1) POPAR outperforms state-of-the-art (SoTA) self-supervised models with vision transformer backbone; (2) POPAR achieves significantly better performance over all three SoTA contrastive learning methods; and (3) POPAR also outperforms fully-supervised pretrained models across architectures. In addition, our ablation study suggests that to achieve better performance on medical imaging tasks, both fine-grained and global contextual features are preferred. All code and models are available at GitHub.com/JLiangLab/POPAR.</p>","PeriodicalId":72837,"journal":{"name":"Domain adaptation and representation transfer : 4th MICCAI Workshop, DART 2022, held in conjunction with MICCAI 2022, Singapore, September 22, 2022, proceedings. Domain Adaptation and Representation Transfer (Workshop) (4th : 2022 : Sin...","volume":"13542 ","pages":"77-87"},"PeriodicalIF":0.0,"publicationDate":"2022-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9728135/pdf/nihms-1846235.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"10361125","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2022-09-01Epub Date: 2022-09-15DOI: 10.1007/978-3-031-16852-9_7
Zuwei Guo, Nahid Ui Islam, Michael B Gotway, Jianming Liang
Uniting three self-supervised learning (SSL) ingredients (discriminative, restorative, and adversarial learning) enables collaborative representation learning and yields three transferable components: a discriminative encoder, a restorative decoder, and an adversary encoder. To leverage this advantage, we have redesigned five prominent SSL methods, including Rotation, Jigsaw, Rubik's Cube, Deep Clustering, and TransVW, and formulated each in a United framework for 3D medical imaging. However, such a United framework increases model complexity and pretraining difficulty. To overcome this difficulty, we develop a stepwise incremental pretraining strategy, in which a discriminative encoder is first trained via discriminative learning, the pretrained discriminative encoder is then attached to a restorative decoder, forming a skip-connected encoder-decoder, for further joint discriminative and restorative learning, and finally, the pretrained encoder-decoder is associated with an adversarial encoder for final full discriminative, restorative, and adversarial learning. Our extensive experiments demonstrate that the stepwise incremental pretraining stabilizes United models training, resulting in significant performance gains and annotation cost reduction via transfer learning for five target tasks, encompassing both classification and segmentation, across diseases, organs, datasets, and modalities. This performance is attributed to the synergy of the three SSL ingredients in our United framework unleashed via stepwise incremental pretraining. All codes and pretrained models are available at GitHub.com/JLiangLab/StepwisePretraining.
{"title":"Discriminative, Restorative, and Adversarial Learning: Stepwise Incremental Pretraining.","authors":"Zuwei Guo, Nahid Ui Islam, Michael B Gotway, Jianming Liang","doi":"10.1007/978-3-031-16852-9_7","DOIUrl":"10.1007/978-3-031-16852-9_7","url":null,"abstract":"<p><p>Uniting three self-supervised learning (SSL) ingredients (discriminative, restorative, and adversarial learning) enables collaborative representation learning and yields three transferable components: a discriminative encoder, a restorative decoder, and an adversary encoder. To leverage this advantage, we have redesigned five prominent SSL methods, including Rotation, Jigsaw, Rubik's Cube, Deep Clustering, and TransVW, and formulated each in a <i>United</i> framework for 3D medical imaging. However, such a United framework increases model complexity and pretraining difficulty. To overcome this difficulty, we develop a stepwise incremental pretraining strategy, in which a discriminative encoder is first trained via discriminative learning, the pretrained discriminative encoder is then attached to a restorative decoder, forming a skip-connected encoder-decoder, for further joint discriminative and restorative learning, and finally, the pretrained encoder-decoder is associated with an adversarial encoder for final full discriminative, restorative, and adversarial learning. Our extensive experiments demonstrate that the stepwise incremental pretraining stabilizes United models training, resulting in significant performance gains and annotation cost reduction via transfer learning for five target tasks, encompassing both classification and segmentation, across diseases, organs, datasets, and modalities. This performance is attributed to the synergy of the three SSL ingredients in our United framework unleashed via stepwise incremental pretraining. All codes and pretrained models are available at GitHub.com/JLiangLab/StepwisePretraining.</p>","PeriodicalId":72837,"journal":{"name":"Domain adaptation and representation transfer : 4th MICCAI Workshop, DART 2022, held in conjunction with MICCAI 2022, Singapore, September 22, 2022, proceedings. Domain Adaptation and Representation Transfer (Workshop) (4th : 2022 : Sin...","volume":"13542 ","pages":"66-76"},"PeriodicalIF":0.0,"publicationDate":"2022-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9728134/pdf/nihms-1846234.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"10729956","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2022-09-01Epub Date: 2022-09-15DOI: 10.1007/978-3-031-16852-9_2
DongAo Ma, Mohammad Reza Hosseinzadeh Taher, Jiaxuan Pang, Nahid Ui Islam, Fatemeh Haghighi, Michael B Gotway, Jianming Liang
Visual transformers have recently gained popularity in the computer vision community as they began to outrank convolutional neural networks (CNNs) in one representative visual benchmark after another. However, the competition between visual transformers and CNNs in medical imaging is rarely studied, leaving many important questions unanswered. As the first step, we benchmark how well existing transformer variants that use various (supervised and self-supervised) pre-training methods perform against CNNs on a variety of medical classification tasks. Furthermore, given the data-hungry nature of transformers and the annotation-deficiency challenge of medical imaging, we present a practical approach for bridging the domain gap between photographic and medical images by utilizing unlabeled large-scale in-domain data. Our extensive empirical evaluations reveal the following insights in medical imaging: (1) good initialization is more crucial for transformer-based models than for CNNs, (2) self-supervised learning based on masked image modeling captures more generalizable representations than supervised models, and (3) assembling a larger-scale domain-specific dataset can better bridge the domain gap between photographic and medical images via self-supervised continuous pre-training. We hope this benchmark study can direct future research on applying transformers to medical imaging analysis. All codes and pre-trained models are available on our GitHub page https://github.com/JLiangLab/BenchmarkTransformers.
{"title":"Benchmarking and Boosting Transformers for Medical Image Classification.","authors":"DongAo Ma, Mohammad Reza Hosseinzadeh Taher, Jiaxuan Pang, Nahid Ui Islam, Fatemeh Haghighi, Michael B Gotway, Jianming Liang","doi":"10.1007/978-3-031-16852-9_2","DOIUrl":"10.1007/978-3-031-16852-9_2","url":null,"abstract":"<p><p>Visual transformers have recently gained popularity in the computer vision community as they began to outrank convolutional neural networks (CNNs) in one representative visual benchmark after another. However, the competition between visual transformers and CNNs in medical imaging is rarely studied, leaving many important questions unanswered. As the first step, we benchmark how well existing transformer variants that use various (supervised and self-supervised) pre-training methods perform against CNNs on a variety of medical classification tasks. Furthermore, given the data-hungry nature of transformers and the annotation-deficiency challenge of medical imaging, we present a practical approach for bridging the domain gap between photographic and medical images by utilizing unlabeled large-scale in-domain data. Our extensive empirical evaluations reveal the following insights in medical imaging: (1) good initialization is more crucial for transformer-based models than for CNNs, (2) self-supervised learning based on masked image modeling captures more generalizable representations than supervised models, and (3) assembling a larger-scale domain-specific dataset can better bridge the domain gap between photographic and medical images via self-supervised continuous pre-training. We hope this benchmark study can direct future research on applying transformers to medical imaging analysis. All codes and pre-trained models are available on our GitHub page https://github.com/JLiangLab/BenchmarkTransformers.</p>","PeriodicalId":72837,"journal":{"name":"Domain adaptation and representation transfer : 4th MICCAI Workshop, DART 2022, held in conjunction with MICCAI 2022, Singapore, September 22, 2022, proceedings. Domain Adaptation and Representation Transfer (Workshop) (4th : 2022 : Sin...","volume":" ","pages":"12-22"},"PeriodicalIF":0.0,"publicationDate":"2022-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9646404/pdf/nihms-1846236.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"40490559","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2022-01-01DOI: 10.1007/978-3-031-16852-9
{"title":"Domain Adaptation and Representation Transfer: 4th MICCAI Workshop, DART 2022, Held in Conjunction with MICCAI 2022, Singapore, September 22, 2022, Proceedings","authors":"","doi":"10.1007/978-3-031-16852-9","DOIUrl":"https://doi.org/10.1007/978-3-031-16852-9","url":null,"abstract":"","PeriodicalId":72837,"journal":{"name":"Domain adaptation and representation transfer : 4th MICCAI Workshop, DART 2022, held in conjunction with MICCAI 2022, Singapore, September 22, 2022, proceedings. Domain Adaptation and Representation Transfer (Workshop) (4th : 2022 : Sin...","volume":"4 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2022-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"78895976","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Domain adaptation and representation transfer : 4th MICCAI Workshop, DART 2022, held in conjunction with MICCAI 2022, Singapore, September 22, 2022, proceedings. Domain Adaptation and Representation Transfer (Workshop) (4th : 2022 : Sin...