{"title":"OmniFuse: A general modality fusion framework for multi-modality learning on low-quality medical data","authors":"Yixuan Wu, Jintai Chen, Lianting Hu, Hongxia Xu, Huiying Liang, Jian Wu","doi":"10.1016/j.inffus.2024.102890","DOIUrl":null,"url":null,"abstract":"Mirroring the practice of human medical experts, the integration of diverse medical examination modalities enhances the performance of predictive models in clinical settings. However, traditional multi-modal learning systems face significant challenges when dealing with low-quality medical data, which is common due to factors such as inconsistent data collection across multiple sites and varying sensor resolutions, as well as information loss due to poor data management. To address these issues, in this paper, we identify and explore three core technical challenges surrounding multi-modal learning on low-quality medical data: (i) the absence of informative modalities, (ii) imbalanced clinically useful information across modalities, and (iii) the entanglement of valuable information with noise in the data. To fully harness the potential of multi-modal low-quality data for automated high-precision disease diagnosis, we propose a general medical multi-modality learning framework that addresses these three core challenges on varying medical scenarios involving multiple modalities. To compensate for the absence of informative modalities, we utilize existing modalities to selectively integrate valuable information and then perform imputation, which is effective even in extreme absence scenarios. For the issue of modality information imbalance, we explicitly quantify the relationships between different modalities for individual samples, ensuring that the effective information from advantageous modalities is fully utilized. Moreover, to mitigate the conflation of information with noise, our framework traceably identifies and activates lazy modality combinations to eliminate noise and enhance data quality. Extensive experiments demonstrate the superiority and broad applicability of our framework. In predicting in-hospital mortality using joint EHR, Chest X-ray, and Report dara, our framework surpasses existing methods, improving the AUROC from 0.811 to 0.872. When applied to lung cancer pathological subtyping using PET, CT, and Report data, our approach achieves an impressive AUROC of 0.894.","PeriodicalId":50367,"journal":{"name":"Information Fusion","volume":"6 1","pages":""},"PeriodicalIF":14.7000,"publicationDate":"2024-12-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Information Fusion","FirstCategoryId":"94","ListUrlMain":"https://doi.org/10.1016/j.inffus.2024.102890","RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE","Score":null,"Total":0}
Citations: 0
Abstract
Mirroring the practice of human medical experts, the integration of diverse medical examination modalities enhances the performance of predictive models in clinical settings. However, traditional multi-modal learning systems face significant challenges when dealing with low-quality medical data, which is common due to factors such as inconsistent data collection across multiple sites, varying sensor resolutions, and information loss caused by poor data management. To address these issues, in this paper, we identify and explore three core technical challenges surrounding multi-modal learning on low-quality medical data: (i) the absence of informative modalities, (ii) imbalanced clinically useful information across modalities, and (iii) the entanglement of valuable information with noise in the data. To fully harness the potential of multi-modal low-quality data for automated high-precision disease diagnosis, we propose a general medical multi-modality learning framework that addresses these three core challenges across varying medical scenarios involving multiple modalities. To compensate for the absence of informative modalities, we utilize existing modalities to selectively integrate valuable information and then perform imputation, which is effective even in extreme absence scenarios. For the issue of modality information imbalance, we explicitly quantify the relationships between different modalities for individual samples, ensuring that the effective information from advantageous modalities is fully utilized. Moreover, to mitigate the conflation of information with noise, our framework traceably identifies and activates lazy modality combinations to eliminate noise and enhance data quality. Extensive experiments demonstrate the superiority and broad applicability of our framework. In predicting in-hospital mortality using joint EHR, Chest X-ray, and Report data, our framework surpasses existing methods, improving the AUROC from 0.811 to 0.872. When applied to lung cancer pathological subtyping using PET, CT, and Report data, our approach achieves an impressive AUROC of 0.894.
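To make the three mechanisms in the abstract concrete, the sketch below illustrates, in simplified form, two of them: imputing a missing modality embedding from the available modalities, and weighting modalities per sample before fusion. This is a minimal, hypothetical PyTorch example, not the authors' OmniFuse implementation; the module names, encoder choices, and tensor shapes are assumptions made for illustration only.

```python
# Hypothetical sketch (NOT the OmniFuse implementation): impute a missing
# modality from the available ones, then weight modalities per sample.
import torch
import torch.nn as nn


class ToyModalityFusion(nn.Module):
    def __init__(self, dims=(64, 32, 16), hidden=128, num_classes=2):
        super().__init__()
        # One encoder per modality (e.g. EHR features, image embedding, report embedding).
        self.encoders = nn.ModuleList([nn.Linear(d, hidden) for d in dims])
        # Imputation head: reconstructs a missing modality embedding from the
        # mean of the available ones (a simplified stand-in for selective integration).
        self.imputer = nn.Linear(hidden, hidden)
        # Gating head: scores each modality per sample (information imbalance).
        self.gate = nn.Linear(hidden, 1)
        self.classifier = nn.Linear(hidden, num_classes)

    def forward(self, inputs, masks):
        # inputs: list of M tensors of shape [B, d_m]; masks: [B, M], 1 = modality present.
        feats = torch.stack([enc(x) for enc, x in zip(self.encoders, inputs)], dim=1)  # [B, M, H]
        present = masks.unsqueeze(-1)                                                  # [B, M, 1]
        # Mean of the available modalities serves as the imputation context.
        context = (feats * present).sum(dim=1) / present.sum(dim=1).clamp(min=1.0)     # [B, H]
        imputed = self.imputer(context).unsqueeze(1)                                   # [B, 1, H]
        feats = torch.where(present.bool(), feats, imputed)                            # fill gaps
        # Per-sample modality weights (imputed modalities compete with observed ones).
        weights = torch.softmax(self.gate(feats).squeeze(-1), dim=1).unsqueeze(-1)     # [B, M, 1]
        fused = (weights * feats).sum(dim=1)                                           # [B, H]
        return self.classifier(fused)


if __name__ == "__main__":
    model = ToyModalityFusion()
    x = [torch.randn(4, d) for d in (64, 32, 16)]
    mask = torch.tensor([[1, 1, 1], [1, 0, 1], [1, 1, 0], [1, 0, 0]], dtype=torch.float)
    print(model(x, mask).shape)  # torch.Size([4, 2])
```

The mask-driven `torch.where` step stands in for the paper's absence-compensation idea, and the softmax gate stands in for per-sample quantification of modality relationships; the noise-handling ("lazy modality combination") component is not sketched here.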
Journal description:
Information Fusion serves as a central platform for showcasing advancements in multi-sensor, multi-source, multi-process information fusion, fostering collaboration among the diverse disciplines driving its progress. It is the leading outlet for sharing research and development in this field, focusing on architectures, algorithms, and applications. Papers dealing with fundamental theoretical analyses, as well as those demonstrating their application to real-world problems, are welcome.