BEEx Is an Open-Source Tool That Evaluates Batch Effects in Medical Images to Enable Multicenter Studies
Authors: Yuxin Wu, Xiongjun Xu, Yuan Cheng, Xiuming Zhang, Fanxi Liu, Zhenhui Li, Lei Hu, Anant Madabhushi, Peng Gao, Zaiyi Liu, Cheng Lu
Journal: Cancer Research, pages 218-230 (impact factor 12.5; JCR Q1, Oncology)
Publication date: 2025-01-15
Publication type: Journal Article
DOI: 10.1158/0008-5472.CAN-23-3846 (https://doi.org/10.1158/0008-5472.CAN-23-3846)
PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11735318/pdf/
Citations: 0
Abstract
The batch effect is a nonbiological variation that arises from technical differences across different batches of data during the data generation process for acquisition-related reasons, such as collection of images at different sites or using different scanners. This phenomenon can affect the robustness and generalizability of computational pathology- or radiology-based cancer diagnostic models, especially in multicenter studies. To address this issue, we developed an open-source platform, Batch Effect Explorer (BEEx), that is designed to qualitatively and quantitatively determine whether batch effects exist among medical image datasets from different sites. A suite of tools was incorporated into BEEx that provide visualization and quantitative metrics based on intensity, gradient, and texture features to allow users to determine whether there are any image variables or combinations of variables that can distinguish datasets from different sites in an unsupervised manner. BEEx was designed to support various medical imaging techniques, including microscopy and radiology. Four use cases clearly demonstrated the ability of BEEx to identify batch effects and validated the effectiveness of rectification methods for batch effect reduction. Overall, BEEx is a scalable and versatile framework designed to read, process, and analyze a wide range of medical images to facilitate the identification and mitigation of batch effects, which can enhance the reliability and validity of image-based studies.
Significance: BEEx is a prescreening tool for image-based analyses that allows researchers to evaluate batch effects in multicenter studies and determine their origin and magnitude to facilitate development of accurate AI-based cancer models.
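To make the idea in the abstract concrete, the following is a minimal sketch of how intensity and gradient features might be used to check whether images from two sites separate in an unsupervised embedding. It is not BEEx's code or API: the function names (`image_features`, `pca_embed`, `site_separation`, `simulate_site`, `batch_score`), the choice of PCA, the separation ratio, and the synthetic "scanner shift" are all assumptions made for illustration, and texture features are omitted for brevity.

```python
import numpy as np


def image_features(img):
    """Hypothetical per-image features: intensity mean, intensity std, mean gradient magnitude."""
    img = np.asarray(img, dtype=float)
    gy, gx = np.gradient(img)          # finite-difference gradients along rows and columns
    grad_mag = np.hypot(gx, gy)
    return np.array([img.mean(), img.std(), grad_mag.mean()])


def pca_embed(features, n_components=2):
    """Unsupervised step: standardize features and project onto the top principal components."""
    x = np.asarray(features, dtype=float)
    x = (x - x.mean(axis=0)) / (x.std(axis=0) + 1e-12)
    _, _, vt = np.linalg.svd(x, full_matrices=False)
    return x @ vt[:n_components].T


def site_separation(embedding, site_labels):
    """Assumed summary metric: mean distance between site centroids / mean within-site spread.

    Site labels are used only here, after the embedding was computed without them.
    """
    labels = np.asarray(site_labels)
    sites = np.unique(labels)
    centroids = np.array([embedding[labels == s].mean(axis=0) for s in sites])
    within = np.mean([
        np.linalg.norm(embedding[labels == s] - c, axis=1).mean()
        for s, c in zip(sites, centroids)
    ])
    pairwise = np.linalg.norm(centroids[:, None, :] - centroids[None, :, :], axis=-1)
    between = pairwise[np.triu_indices(len(sites), k=1)].mean()
    return between / (within + 1e-12)


def simulate_site(rng, n_images, mean, noise_sd, shape=(32, 32)):
    """Synthetic stand-in for one site's images (Gaussian noise around a site-specific mean)."""
    return [rng.normal(mean, noise_sd, shape) for _ in range(n_images)]


def batch_score(images, site_labels):
    """Features -> PCA embedding -> separation ratio between sites."""
    feats = np.array([image_features(im) for im in images])
    return site_separation(pca_embed(feats), site_labels)


rng = np.random.default_rng(0)
labels = ["A"] * 20 + ["B"] * 20
# Site B simulates a different scanner: brighter and noisier images.
shifted = simulate_site(rng, 20, 100, 10) + simulate_site(rng, 20, 120, 20)
# Control: both sites drawn from the same distribution.
control = simulate_site(rng, 20, 100, 10) + simulate_site(rng, 20, 100, 10)
print("separation with simulated scanner shift:", batch_score(shifted, labels))
print("separation without shift:", batch_score(control, labels))
```

A ratio well above that of the control suggests the features alone can tell the sites apart, which is the kind of signal a batch-effect prescreen looks for. A real analysis would read actual image files, add texture features, and plot the embedding colored by site.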
About the journal:
Cancer Research, published by the American Association for Cancer Research (AACR), is a journal that focuses on impactful original studies, reviews, and opinion pieces relevant to the broad cancer research community. Manuscripts that present conceptual or technological advances leading to insights into cancer biology are particularly sought after. The journal also places emphasis on convergence science, which involves bridging multiple distinct areas of cancer research.
With primary subsections including Cancer Biology, Cancer Immunology, Cancer Metabolism and Molecular Mechanisms, Translational Cancer Biology, Cancer Landscapes, and Convergence Science, Cancer Research has a comprehensive scope. It is published twice a month and has one volume per year, with a print ISSN of 0008-5472 and an online ISSN of 1538-7445.
Cancer Research is abstracted and/or indexed in various databases and platforms, including BIOSIS Previews (R) Database, MEDLINE, Current Contents/Life Sciences, Current Contents/Clinical Medicine, Science Citation Index, Scopus, and Web of Science.