Preconditioning of clinical data for intraocular lens formula constant optimisation using Random Forest Quantile Regression Trees

Zeitschrift fur Medizinische Physik, vol. 34, no. 4, pp. 632-640. Pub Date: 2024-11-01. DOI: 10.1016/j.zemedi.2022.11.009
IF 2.4; Zone 4 (Medicine); JCR Q2 (Radiology, Nuclear Medicine & Medical Imaging)
Achim Langenbucher, Nóra Szentmáry, Alan Cayless, Jascha Wendelstein, Peter Hoffmann
https://www.sciencedirect.com/science/article/pii/S0939388922001295
Citations: 0

Abstract

Purpose

To implement a fully data-driven strategy for identifying outliers in clinical datasets used for formula constant optimisation, in order to achieve proper formula-predicted refraction after cataract surgery, and to assess the capabilities of this outlier detection method.

Methods

Two clinical datasets (DS1/DS2: N = 888/403) of eyes treated with a monofocal aspherical intraocular lens (Hoya XY1/Johnson & Johnson Vision Z9003), containing preoperative biometric data, the power of the lens implant and the postoperative spherical equivalent (SEQ), were transferred to us for formula constant optimisation. The original datasets were used to generate baseline formula constants. A random forest quantile regression algorithm was set up using bootstrap resampling with replacement. Quantile regression trees were grown, and the 25% and 75% quantiles and the interquartile range (IQR) were extracted from SEQ and the formula-predicted refraction (REF) for the SRKT, Haigis and Castrop formulae. Fences were defined from the quantiles, and data points outside the fences were marked as outliers and removed before the formula constants were recalculated.
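The quantile estimation step described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' implementation: it uses a single predictor (REF) and a single response (SEQ), grows plain variance-minimising regression trees on bootstrap samples, and approximates the conditional quantile at a query point by pooling the responses found in the matching leaf of every tree. The function names, the `min_leaf` and `n_trees` parameters and the synthetic data are all hypothetical.

```python
import random

def best_split(points, min_leaf):
    """Return (i, sorted_points) for the variance-minimising split, or None."""
    pts = sorted(points)
    n = len(pts)
    total = sum(y for _, y in pts)
    best_i, best_score, left_sum = None, float("-inf"), 0.0
    for i in range(1, n):
        left_sum += pts[i - 1][1]
        if i < min_leaf or n - i < min_leaf or pts[i - 1][0] == pts[i][0]:
            continue
        # Maximising this score is equivalent to minimising the summed squared error.
        score = left_sum ** 2 / i + (total - left_sum) ** 2 / (n - i)
        if score > best_score:
            best_i, best_score = i, score
    return None if best_i is None else (best_i, pts)

def grow_tree(points, min_leaf):
    """A leaf is a list of responses; an inner node is (threshold, left, right)."""
    split = best_split(points, min_leaf)
    if split is None:
        return [y for _, y in points]
    i, pts = split
    return (pts[i - 1][0], grow_tree(pts[:i], min_leaf), grow_tree(pts[i:], min_leaf))

def leaf_values(tree, x0):
    """Walk down the tree and return the responses stored in the leaf containing x0."""
    while isinstance(tree, tuple):
        threshold, left, right = tree
        tree = left if x0 <= threshold else right
    return tree

def grow_forest(x, y, n_trees=100, min_leaf=10, seed=0):
    rng = random.Random(seed)
    data = list(zip(x, y))
    # Each tree sees a bootstrap sample: same size as the data, drawn with replacement.
    return [grow_tree(rng.choices(data, k=len(data)), min_leaf) for _ in range(n_trees)]

def conditional_quantile(forest, x0, q):
    """Assumed simplification: pool leaf responses over all trees, then interpolate."""
    pooled = sorted(v for tree in forest for v in leaf_values(tree, x0))
    pos = q * (len(pooled) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(pooled) - 1)
    return pooled[lo] + (pos - lo) * (pooled[hi] - pooled[lo])

# Synthetic stand-in for (REF, SEQ) pairs; not clinical data.
rng = random.Random(1)
ref = [rng.uniform(-3.0, 1.0) for _ in range(400)]
seq = [r + rng.gauss(0.0, 0.4) for r in ref]
forest = grow_forest(ref, seq, n_trees=50, min_leaf=20)
q25, q50, q75 = (conditional_quantile(forest, -1.0, q) for q in (0.25, 0.5, 0.75))
print(q25, q50, q75)
```

Pooling leaf responses is a simpler stand-in for the observation weighting used in quantile regression forests; it keeps the sketch short but should not be read as equivalent to a production implementation.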

Results

N_B = 1000 bootstrap samples were derived from both datasets, and random forest quantile regression trees were grown to model SEQ versus REF and to estimate the median and the 25% and 75% quantiles. The fence boundaries were defined as running from the 25% quantile - 1.5·IQR to the 75% quantile + 1.5·IQR, with data points outside the fence being marked as outliers. In total, 25/27/32 data points in DS1 and 4/5/4 data points in DS2 were identified as outliers for the SRKT/Haigis/Castrop formulae, respectively. After outlier removal, the root mean squared formula prediction error for the SRKT/Haigis/Castrop formulae decreased slightly, from 0.4370/0.3625/0.3376 dpt to 0.4271/0.3528/0.3277 dpt for DS1 and from 0.4449/0.4056/0.3532 dpt to 0.4348/0.3952/0.3432 dpt for DS2.
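The fence rule and the error metric quoted above are simple enough to write down directly. The following is a minimal sketch, assuming the conditional 25% and 75% quantiles for each data point have already been estimated by some quantile regression model; the function names and the toy values are hypothetical and are not taken from the study. It does not re-optimise formula constants after removing outliers, which the study does before recomputing the prediction error.

```python
import math

def fences(q25, q75, k=1.5):
    """Lower and upper fence: q25 - k*IQR and q75 + k*IQR."""
    iqr = q75 - q25
    return q25 - k * iqr, q75 + k * iqr

def mark_outliers(seq, q25, q75, k=1.5):
    """Flag each observed SEQ that lies outside the fences of its own conditional quantiles."""
    flags = []
    for s, lo_q, hi_q in zip(seq, q25, q75):
        lower, upper = fences(lo_q, hi_q, k)
        flags.append(s < lower or s > upper)
    return flags

def rmse(observed, predicted):
    """Root mean squared prediction error."""
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / len(observed))

# Toy values in dioptres (illustrative only).
seq = [-0.25, 0.10, -0.50, 2.00]          # observed spherical equivalent
ref = [-0.20, 0.00, -0.45, -0.30]         # formula-predicted refraction
q25 = [r - 0.25 for r in ref]             # assumed conditional 25% quantiles
q75 = [r + 0.25 for r in ref]             # assumed conditional 75% quantiles

flags = mark_outliers(seq, q25, q75)
kept = [(s, r) for s, r, f in zip(seq, ref, flags) if not f]
print(flags)
print(rmse(seq, ref), rmse([s for s, _ in kept], [r for _, r in kept]))
```

Because the fences are built from conditional quantiles, a point is judged against the spread expected at its own predicted refraction, not against the spread of the whole dataset.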

Conclusion

We were able to show that, with random forest quantile regression trees, a fully data-driven outlier identification strategy acting in the response space is achievable. In a real-life scenario, this strategy has to be complemented by an outlier identification method acting in the parameter space in order to qualify datasets properly prior to formula constant optimisation.
Source journal
CiteScore: 3.70
Self-citation rate: 10.00%
Articles published: 69
Review time: 65 days
Journal description: Zeitschrift fur Medizinische Physik (Journal of Medical Physics) is an official organ of the German and Austrian Society of Medical Physics and the Swiss Society of Radiobiology and Medical Physics. The Journal is a platform for basic research and practical applications of physical procedures in medical diagnostics and therapy. The articles are reviewed following international standards of peer reviewing. The articles focus on:
- Biophysical methods in radiation therapy and nuclear medicine
- Dosimetry and radiation protection
- Radiological diagnostics and quality assurance
- Modern imaging techniques, such as computed tomography, magnetic resonance imaging, positron emission tomography
- Ultrasonography diagnostics, application of laser and UV rays
- Electronic processing of biosignals
- Artificial intelligence and machine learning in medical physics
In the Journal, the latest scientific insights find their expression in the form of original articles, reviews, technical communications, and information for the clinical practice.