PROSurvival: A Technical Case Report on Creating and Publishing a Dataset for Federated Learning on Survival Prediction of Prostate Cancer Patients.

Tingyan Xu, Timo Wolters, Johannes Lotz, Tom Bisson, Tim-Rasmus Kiehl, Nadine Flinner, Norman Zerbe, Marco Eichelberg
Studies in Health Technology and Informatics, vol. 321, pp. 220-224. Published 2024-11-22. DOI: 10.3233/SHTI241096 (https://doi.org/10.3233/SHTI241096)

Abstract

The PROSurvival project aims to improve the prediction of recurrence-free survival in prostate cancer by applying federated machine learning to whole slide images combined with selected clinical data. Both the image and clinical data will be aggregated into an anonymized dataset compliant with the General Data Protection Regulation and published under the principles of findable, accessible, interoperable, and reusable data. The DICOM standard will be used for the image data. For the accompanying clinical data, a human-readable, compact and flexible standard is yet to be defined. From the set of existing standards, mostly extendable with varying degrees of modifications, we chose oBDS as a starting point and modified it to include missing data points and to remove mandatory items not applicable to our dataset. Clinical and survival data from clinic-specific spreadsheets were converted into this modified standard, ensuring on-site data privacy during processing. For publication of the dataset, both image and clinical data are anonymized using established methods. The key challenges arose during the clinical data anonymization and in identifying research repositories meeting all of our requirements. Each clinic had to coordinate the publication with their responsible data protection officers, requiring different approval processes due to the individual states' differing interpretations of the legal regulations. The newly established German Health Data Utilization Act is expected to simplify future data sharing in a responsible and powerful way.
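
The spreadsheet-to-standard conversion described above can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the column names, the XML element names, and the sample row are hypothetical placeholders and do not reflect the actual oBDS schema or the project's modified version of it. Dropping direct identifiers, as done here, is only a first step and would not by itself amount to anonymization.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Hypothetical column names; each clinic's spreadsheet would need its own mapping.
DIRECT_IDENTIFIERS = {"patient_name", "date_of_birth"}


def rows_to_xml(csv_text: str) -> ET.Element:
    """Convert spreadsheet rows into a simple XML tree, dropping direct identifiers."""
    root = ET.Element("Patients")  # placeholder element names, not the oBDS schema
    reader = csv.DictReader(io.StringIO(csv_text))
    for index, row in enumerate(reader, start=1):
        # Replace the patient's identity with a running pseudonymous ID.
        patient = ET.SubElement(root, "Patient", {"id": f"P{index:04d}"})
        for column, value in row.items():
            if column in DIRECT_IDENTIFIERS or not value:
                continue  # skip identifying fields and empty cells
            ET.SubElement(patient, column).text = value
    return root


# Made-up sample row for illustration only.
sample = (
    "patient_name,date_of_birth,gleason_score,recurrence_free_months\n"
    "Max Mustermann,1950-01-01,7,48\n"
)
tree = rows_to_xml(sample)
print(ET.tostring(tree, encoding="unicode"))
```

Because the conversion runs locally on each clinic's export, a script of this kind could be executed on-site so that identifying columns never leave the clinic, which is in the spirit of the on-site processing the abstract mentions.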
