Xueting Ren, Guohua Ji, Surong Chu, Shinichi Yoshida, Juanjuan Zhao, Baoping Jia, Yan Qiang
Title: A multimodal similarity-aware and knowledge-driven pre-training approach for reliable pneumoconiosis diagnosis
Journal: Journal of X-Ray Science and Technology, pp. 229-248
Published: 2025-01-01 (Epub 2025-01-13)
DOI: 10.1177/08953996241296400 (https://doi.org/10.1177/08953996241296400)
Journal metrics: Impact Factor 1.7; JCR Q3 (Instruments & Instrumentation)
Abstract
Background: Pneumoconiosis staging is challenging due to the low clarity of X-ray images and the small, diffuse nature of the lesions. Additionally, the scarcity of annotated data makes it difficult to develop accurate staging models. Although clinical text reports provide valuable contextual information, existing works primarily focus on designing multimodal image-text contrastive learning tasks, neglecting the high similarity of pneumoconiosis imaging representations. This results in inadequate extraction of fine-grained multimodal information and underutilization of domain knowledge, limiting their application in medical tasks.
Objective: The study aims to address the limitations of current multimodal methods by proposing a new approach that improves the precision of pneumoconiosis diagnosis and staging through enhanced fine-grained learning and better utilization of domain knowledge.
Methods: The proposed Multimodal Similarity-aware and Knowledge-driven Pre-Training (MSK-PT) approach involves two stages. In the first stage, we deeply analyze the similar features of pneumoconiosis images and use a similarity-aware modality alignment strategy to explore the fine-grained representations and associated disturbances of pneumoconiosis lesions between images and texts, guiding the model to match more appropriate feature representations. In the second stage, we utilize data-associated features and pre-stored domain knowledge features as priors and constraints to guide the downstream model in the visual domain without annotations. To address potential erroneous labels generated by model predictions, we further introduce an uncertainty threshold strategy to mitigate the negative impact of imperfect prediction labels and enhance model interpretability.
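The abstract does not say how prediction uncertainty is measured or what threshold is used. As an illustration only, the sketch below assumes uncertainty is the normalized entropy of the softmax output and that pseudo-labels above a cutoff are discarded. The function names, the four-class setup, and the cutoff of 0.5 are all hypothetical and are not taken from the paper.

```python
import math


def softmax(logits):
    """Convert raw scores to probabilities (shifted by the max for stability)."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def normalized_entropy(probs):
    """Shannon entropy scaled to [0, 1]; assumes at least two classes."""
    h = -sum(p * math.log(p) for p in probs if p > 0.0)
    return h / math.log(len(probs))


def filter_pseudo_labels(batch_logits, max_uncertainty=0.5):
    """Keep only predictions whose uncertainty is at or below the cutoff.

    Returns a list of (sample_index, predicted_class, uncertainty) tuples.
    The cutoff value is an assumption made for this sketch.
    """
    kept = []
    for idx, logits in enumerate(batch_logits):
        probs = softmax(logits)
        u = normalized_entropy(probs)
        if u <= max_uncertainty:
            label = max(range(len(probs)), key=probs.__getitem__)
            kept.append((idx, label, u))
    return kept


# Hypothetical logits for two unlabeled images over four assumed stage classes:
# the first prediction is confident, the second is close to uniform.
batch = [[8.0, 0.0, 0.0, 0.0], [0.1, 0.0, 0.1, 0.0]]
kept = filter_pseudo_labels(batch)
```

Under these assumptions only the confident first sample survives the filter, so a near-uniform prediction never becomes a training label for the downstream model.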
Results: We collected and created the pneumoconiosis chest X-ray (PneumoCXR) dataset to evaluate our proposed MSK-PT method. The experimental results show that our method achieved a classification accuracy of 81.73%, outperforming the state-of-the-art algorithms by 2.53%.
Conclusions: MSK-PT showed diagnostic performance that matches or exceeds the average radiologist's level, even with limited labeled data, highlighting the method's effectiveness and robustness.
About the journal:
Research areas within the scope of the journal include:
Interaction of x-rays with matter: x-ray phenomena, biological effects of radiation, radiation safety and optical constants
X-ray sources: x-rays from synchrotrons, x-ray lasers, plasmas, and other sources, conventional or unconventional
Optical elements: grazing incidence optics, multilayer mirrors, zone plates, gratings, other diffraction optics
Optical instruments: interferometers, spectrometers, microscopes, telescopes, microprobes