{"title":"Distilling BlackBox to Interpretable Models for Efficient Transfer Learning.","authors":"Shantanu Ghosh, Ke Yu, Kayhan Batmanghelich","doi":"10.1007/978-3-031-43895-0_59","DOIUrl":null,"url":null,"abstract":"<p><p>Building generalizable AI models is one of the primary challenges in the healthcare domain. While radiologists rely on generalizable descriptive rules of abnormality, Neural Network (NN) models suffer even with a slight shift in input distribution (<i>e.g</i>., scanner type). Fine-tuning a model to transfer knowledge from one domain to another requires a significant amount of labeled data in the target domain. In this paper, we develop an interpretable model that can be efficiently fine-tuned to an unseen target domain with minimal computational cost. We assume the interpretable component of NN to be approximately domain-invariant. However, interpretable models typically underperform compared to their Blackbox (BB) variants. We start with a BB in the source domain and distill it into a <i>mixture</i> of shallow interpretable models using human-understandable concepts. As each interpretable model covers a subset of data, a mixture of interpretable models achieves comparable performance as BB. Further, we use the pseudo-labeling technique from semi-supervised learning (SSL) to learn the concept classifier in the target domain, followed by fine-tuning the interpretable models in the target domain. We evaluate our model using a real-life large-scale chest-X-ray (CXR) classification dataset. The code is available at: https://github.com/batmanlab/MICCAI-2023-Route-interpret-repeat-CXRs.</p>","PeriodicalId":94280,"journal":{"name":"Medical image computing and computer-assisted intervention : MICCAI ... International Conference on Medical Image Computing and Computer-Assisted Intervention","volume":"14221 ","pages":"628-638"},"PeriodicalIF":0.0000,"publicationDate":"2023-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11141113/pdf/","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Medical image computing and computer-assisted intervention : MICCAI ... International Conference on Medical Image Computing and Computer-Assisted Intervention","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1007/978-3-031-43895-0_59","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 0
Abstract
Building generalizable AI models is one of the primary challenges in the healthcare domain. While radiologists rely on generalizable descriptive rules of abnormality, Neural Network (NN) models degrade even under a slight shift in input distribution (e.g., scanner type). Fine-tuning a model to transfer knowledge from one domain to another requires a significant amount of labeled data in the target domain. In this paper, we develop an interpretable model that can be efficiently fine-tuned to an unseen target domain with minimal computational cost. We assume the interpretable component of the NN to be approximately domain-invariant. However, interpretable models typically underperform compared to their Blackbox (BB) variants. We start with a BB in the source domain and distill it into a mixture of shallow interpretable models defined over human-understandable concepts. As each interpretable model covers a subset of the data, the mixture achieves performance comparable to the BB. Further, we use the pseudo-labeling technique from semi-supervised learning (SSL) to learn the concept classifier in the target domain, and then fine-tune the interpretable models there. We evaluate our model on a real-life, large-scale chest X-ray (CXR) classification dataset. The code is available at: https://github.com/batmanlab/MICCAI-2023-Route-interpret-repeat-CXRs.
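The two steps of the pipeline can be sketched in a few lines of PyTorch. This is a minimal illustration under stated assumptions, not the authors' released implementation (see the repository linked above): the ShallowExpert and MixtureOfExperts classes, the tensor sizes, the distillation temperature T, and the pseudo-label confidence threshold are all hypothetical choices made here for clarity.

```python
# Minimal sketch of (1) distilling a blackbox (BB) into a mixture of shallow
# interpretable models over human-understandable concepts, and (2) SSL-style
# pseudo-labeling of the concept classifier in an unlabeled target domain.
# All names and sizes below are illustrative, not the paper's actual values.
import torch
import torch.nn as nn
import torch.nn.functional as F

N_CONCEPTS, N_CLASSES, N_EXPERTS = 100, 14, 3  # hypothetical sizes


class ShallowExpert(nn.Module):
    """One interpretable expert: a linear classifier over concept scores."""
    def __init__(self):
        super().__init__()
        self.fc = nn.Linear(N_CONCEPTS, N_CLASSES)

    def forward(self, concepts):
        return self.fc(concepts)


class MixtureOfExperts(nn.Module):
    """A selector softly routes each sample to shallow interpretable experts,
    so each expert only needs to cover a subset of the data."""
    def __init__(self):
        super().__init__()
        self.selector = nn.Linear(N_CONCEPTS, N_EXPERTS)
        self.experts = nn.ModuleList([ShallowExpert() for _ in range(N_EXPERTS)])

    def forward(self, concepts):
        w = F.softmax(self.selector(concepts), dim=-1)               # (B, K)
        logits = torch.stack([g(concepts) for g in self.experts], 1)  # (B, K, C)
        return (w.unsqueeze(-1) * logits).sum(dim=1)                 # (B, C)


def distill_step(mixture, bb_logits, concepts, optimizer, T=2.0):
    """Match the mixture's predictions to the frozen source-domain BB
    via temperature-scaled KL distillation."""
    optimizer.zero_grad()
    student = mixture(concepts)
    loss = F.kl_div(F.log_softmax(student / T, dim=-1),
                    F.softmax(bb_logits / T, dim=-1),
                    reduction="batchmean") * T * T
    loss.backward()
    optimizer.step()
    return loss.item()


def pseudo_label_loss(concept_clf, target_images, threshold=0.9):
    """Pseudo-labels for target-domain concepts: keep only confident
    predictions, then train on them as if they were ground truth."""
    with torch.no_grad():
        probs = torch.sigmoid(concept_clf(target_images))  # multi-label concepts
        mask = (probs > threshold) | (probs < 1 - threshold)
        pseudo = (probs > 0.5).float()
    logits = concept_clf(target_images)
    return F.binary_cross_entropy_with_logits(logits[mask], pseudo[mask])
```

The design choice mirrored here is that each expert stays shallow (a linear map over concepts), so once pseudo-labeled concepts are available in the target domain, re-fitting the mixture is far cheaper than fine-tuning the full blackbox.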