Felix Krones, Umar Marikkar, Guy Parsons, Adam Szmul, Adam Mahdi
Review of multimodal machine learning approaches in healthcare
Information Fusion, volume 114, Article 102690. Published 2024-09-14.
DOI: 10.1016/j.inffus.2024.102690
URL: https://www.sciencedirect.com/science/article/pii/S1566253524004688
Journal impact factor: 14.7; JCR: Q1 (Computer Science, Artificial Intelligence)
Citations: 0
Abstract
Machine learning methods in healthcare have traditionally focused on using data from a single modality, limiting their ability to effectively replicate the clinical practice of integrating multiple sources of information for improved decision making. Clinicians typically rely on a variety of data sources including patients’ demographic information, laboratory data, vital signs and various imaging data modalities to make informed decisions and contextualise their findings. Recent advances in machine learning have facilitated the more efficient incorporation of multimodal data, resulting in applications that better represent the clinician’s approach. Here, we provide an overview of multimodal machine learning approaches in healthcare, encompassing various data modalities commonly used in clinical diagnoses, such as imaging, text, time series and tabular data. We discuss key stages of model development, including pre-training, fine-tuning and evaluation. Additionally, we explore common data fusion approaches used in modelling, highlighting their advantages and performance challenges. An overview is provided of 17 multimodal clinical datasets with detailed description of the specific data modalities used in each dataset. Over 50 studies have been reviewed, with a predominant focus on the integration of imaging and tabular data. While multimodal techniques have shown potential in improving predictive accuracy across many healthcare areas, our review highlights that the effectiveness of a method is contingent upon the specific data and task at hand.
About the journal:
Information Fusion serves as a central platform for showcasing advancements in multi-sensor, multi-source, multi-process information fusion, fostering collaboration among diverse disciplines driving its progress. It is the leading outlet for sharing research and development in this field, focusing on architectures, algorithms, and applications. Papers dealing with fundamental theoretical analyses as well as those demonstrating their application to real-world problems will be welcome.