Achim Langenbucher, Nóra Szentmáry, Alan Cayless, Jascha Wendelstein, Peter Hoffmann
Zeitschrift fur Medizinische Physik, pp. 632-640. Journal Article. Published 2024-11-01 (Epub 2023-02-20). DOI: 10.1016/j.zemedi.2022.11.009 (https://doi.org/10.1016/j.zemedi.2022.11.009). Impact factor 2.4; JCR Q2 (Radiology, Nuclear Medicine & Medical Imaging).
Preconditioning of clinical data for intraocular lens formula constant optimisation using Random Forest Quantile Regression Trees.
Purpose: To implement a fully data-driven strategy for identifying outliers in clinical datasets used for formula constant optimisation, in order to achieve proper formula predicted refraction after cataract surgery, and to assess the capabilities of this outlier detection method.
Methods: Two clinical datasets (DS1/DS2: N = 888/403) of eyes treated with a monofocal aspherical intraocular lens (Hoya XY1/Johnson & Johnson Vision Z9003), containing preoperative biometric data, the power of the lens implant and the postoperative spherical equivalent (SEQ), were transferred to us for formula constant optimisation. The original datasets were used to generate baseline formula constants. A random forest quantile regression algorithm was set up using bootstrap resampling with replacement. Quantile regression trees were grown, and the 25% and 75% quantiles and the interquartile range (IQR) were extracted from SEQ and the formula predicted refraction (REF) for the SRKT, Haigis and Castrop formulae. Fences were defined from the quantiles, and data points outside the fences were marked and removed as outliers before the formula constants were recalculated.
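The resampling and quantile extraction described in the Methods can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes a single predictor (REF) and a single response (SEQ), splits each tree at the median of the predictor instead of searching for an optimised split, and pools the leaf responses of all trees before taking quartiles. The function names (`grow_tree`, `grow_forest`, `conditional_quartiles`), the parameters `n_trees` and `min_leaf`, and the synthetic data are all hypothetical.

```python
import random
import statistics


def grow_tree(points, min_leaf=20):
    """Recursively split (x, y) pairs at the median x; leaves keep their y values."""
    if len(points) < 2 * min_leaf:
        return {"leaf": [y for _, y in points]}
    points = sorted(points)
    split = points[len(points) // 2][0]
    left = [p for p in points if p[0] < split]
    right = [p for p in points if p[0] >= split]
    # Stop if ties in x would produce a child below the minimum leaf size.
    if len(left) < min_leaf or len(right) < min_leaf:
        return {"leaf": [y for _, y in points]}
    return {
        "split": split,
        "left": grow_tree(left, min_leaf),
        "right": grow_tree(right, min_leaf),
    }


def leaf_values(tree, x):
    """Walk down the tree and return the responses stored in the leaf containing x."""
    while "leaf" not in tree:
        tree = tree["left"] if x < tree["split"] else tree["right"]
    return tree["leaf"]


def grow_forest(xs, ys, n_trees=100, min_leaf=20, seed=0):
    """Grow one tree per bootstrap sample (drawn with replacement)."""
    rng = random.Random(seed)
    data = list(zip(xs, ys))
    forest = []
    for _ in range(n_trees):
        sample = rng.choices(data, k=len(data))  # bootstrap resampling
        forest.append(grow_tree(sample, min_leaf))
    return forest


def conditional_quartiles(forest, x):
    """Pool the leaf responses of all trees at x and return (Q1, median, Q3)."""
    pooled = [y for tree in forest for y in leaf_values(tree, x)]
    q1, median, q3 = statistics.quantiles(pooled, n=4)
    return q1, median, q3


if __name__ == "__main__":
    # Synthetic stand-in data (assumption): SEQ scatters around REF.
    rng = random.Random(42)
    ref = [rng.uniform(-2.0, 2.0) for _ in range(400)]
    seq = [r + rng.gauss(0.0, 0.4) for r in ref]
    forest = grow_forest(ref, seq, n_trees=50)
    print(conditional_quartiles(forest, 0.0))
```

A production analysis would more likely rely on an established quantile regression forest implementation; the sketch only shows how bootstrap samples, trees and conditional quartiles fit together.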
Results: N_B = 1000 bootstrap samples were derived from both datasets, and random forest quantile regression trees were grown to model SEQ versus REF and to estimate the median and the 25% and 75% quantiles. The fence boundaries were defined as running from the 25% quantile - 1.5·IQR to the 75% quantile + 1.5·IQR, with data points outside the fence being marked as outliers. In total, 25/27/32 data points in DS1 and 4/5/4 data points in DS2 were identified as outliers for the SRKT/Haigis/Castrop formulae respectively. The root mean squared formula prediction error was slightly reduced for all three formulae. SRKT: from 0.4370 dpt to 0.4271 dpt (DS1) and from 0.4449 dpt to 0.4348 dpt (DS2). Haigis: from 0.3625 dpt to 0.3528 dpt (DS1) and from 0.4056 dpt to 0.3952 dpt (DS2). Castrop: from 0.3376 dpt to 0.3277 dpt (DS1) and from 0.3532 dpt to 0.3432 dpt (DS2).
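The fence rule and the root mean squared prediction error quoted in the Results can be written down in a few lines. The following is a sketch under stated assumptions: the per-point quartiles are taken as given inputs (however they were estimated), the prediction error is assumed to be SEQ minus REF, and the helper names `fence`, `mark_outliers` and `rmse` as well as the toy numbers are hypothetical and not taken from the paper.

```python
import math


def fence(q1, q3, k=1.5):
    """Return (lower, upper) fence: Q1 - k*IQR and Q3 + k*IQR."""
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


def mark_outliers(values, fences):
    """Flag each value that lies outside its own (lower, upper) fence."""
    return [not (lo <= v <= hi) for v, (lo, hi) in zip(values, fences)]


def rmse(errors):
    """Root mean squared error of a list of prediction errors."""
    return math.sqrt(sum(e * e for e in errors) / len(errors))


if __name__ == "__main__":
    # Toy values (assumption): SEQ, REF and conditional quartiles of SEQ per eye.
    seq = [-0.25, 0.50, 2.00, -0.75]
    ref = [-0.10, 0.30, 0.20, -0.50]
    quartiles = [(-0.40, 0.20), (0.00, 0.60), (-0.10, 0.50), (-0.80, -0.20)]

    fences = [fence(q1, q3) for q1, q3 in quartiles]
    flags = mark_outliers(seq, fences)

    errors = [s - r for s, r in zip(seq, ref)]
    kept = [e for e, is_out in zip(errors, flags) if not is_out]
    print("outlier flags:", flags)
    print("RMSE before:", rmse(errors), "after:", rmse(kept))
```

Note that in the study the formula constants were recalculated after the flagged points were removed, so the reported error reduction also reflects the re-optimised constants; the toy example only drops the flagged points.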
Conclusion: We were able to prove that, with random forest quantile regression trees, a fully data-driven outlier identification strategy acting in the response space is achievable. In a real-life scenario this strategy has to be complemented by an outlier identification method acting in the parameter space for a proper qualification of datasets prior to formula constant optimisation.
About the journal:
Zeitschrift fur Medizinische Physik (Journal of Medical Physics) is an official organ of the German and Austrian Society of Medical Physics and the Swiss Society of Radiobiology and Medical Physics. The Journal is a platform for basic research and practical applications of physical procedures in medical diagnostics and therapy. The articles are reviewed following international standards of peer review.
The articles focus on:
-Biophysical methods in radiation therapy and nuclear medicine
-Dosimetry and radiation protection
-Radiological diagnostics and quality assurance
-Modern imaging techniques, such as computed tomography, magnetic resonance imaging, positron emission tomography
-Ultrasonography diagnostics, application of laser and UV rays
-Electronic processing of biosignals
-Artificial intelligence and machine learning in medical physics
In the Journal, the latest scientific insights find their expression in the form of original articles, reviews, technical communications, and information for the clinical practice.