Fan Fan, Georgia Martinez, Thomas DeSilvio, John Shin, Yijiang Chen, Jackson Jacobs, Bangchen Wang, Takaya Ozeki, Maxime W. Lafarge, Viktor H. Koelzer, Laura Barisoni, Anant Madabhushi, Satish E. Viswanath, Andrew Janowczyk
npj Imaging, pages 1-7. Published 2024-07-01. DOI: 10.1038/s44303-024-00018-2. Full text: https://www.nature.com/articles/s44303-024-00018-2
CohortFinder: an open-source tool for data-driven partitioning of digital pathology and imaging cohorts to yield robust machine-learning models
Batch effects (BEs) are systematic technical differences in data collection that are unrelated to biological variation, and the noise they introduce has been shown to reduce the generalizability of machine learning (ML) models. Here we release CohortFinder, an open-source tool that aims to mitigate BEs via data-driven cohort partitioning. We demonstrate that CohortFinder improves ML model performance in downstream digital pathology and medical image processing tasks. CohortFinder is freely available for download at http://cohortfinder.com.
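
To illustrate what data-driven cohort partitioning can look like, the following is a minimal sketch and not CohortFinder's actual implementation or API. It assumes that each image is summarized by a few quality metrics (the two-value tuples below are made-up numbers), that batch-effect groups can be approximated by clustering those metrics, and that each group is then divided between the training and testing sets so that both sets see every group. The function names `kmeans` and `partition` are hypothetical.

```python
import random
from collections import defaultdict


def dist2(p, q):
    """Squared Euclidean distance between two equal-length tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeans(points, k, iters=20):
    """Tiny k-means with farthest-first initialization; returns one label per point."""
    centers = [points[0]]
    while len(centers) < k:
        # Next center: the point farthest from all centers chosen so far.
        centers.append(max(points, key=lambda p: min(dist2(p, c) for c in centers)))
    labels = [0] * len(points)
    for _ in range(iters):
        # Assign each point to its nearest center.
        labels = [min(range(k), key=lambda c: dist2(p, centers[c])) for p in points]
        # Move each center to the mean of its members (unchanged if it has none).
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centers[c] = tuple(sum(x) / len(members) for x in zip(*members))
    return labels


def partition(points, k=2, test_fraction=0.5, seed=0):
    """Cluster the points, then split every cluster between train and test."""
    labels = kmeans(points, k)
    groups = defaultdict(list)
    for idx, lab in enumerate(labels):
        groups[lab].append(idx)
    rng = random.Random(seed)
    train, test = [], []
    for members in groups.values():
        rng.shuffle(members)
        n_test = round(len(members) * test_fraction)  # per-cluster test quota
        test.extend(members[:n_test])
        train.extend(members[n_test:])
    return sorted(train), sorted(test), labels


# Hypothetical (brightness, contrast) metrics for eight images from two batches.
metrics = [(0.1, 0.2), (0.2, 0.1), (0.15, 0.15), (0.1, 0.1),
           (0.9, 0.8), (0.8, 0.9), (0.85, 0.85), (0.9, 0.9)]
train_idx, test_idx, labels = partition(metrics, k=2, test_fraction=0.5)
print("cluster labels:", labels)
print("train:", train_idx, "test:", test_idx)
```

Splitting within each cluster, rather than across the whole cohort at once, is what keeps a single batch-effect group from landing entirely in one set.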