Objectives: To compare the radiologic assessment of Hirschsprung disease (HD) based on contrast enema with automated image analysis using a deep neural network (DNN) for image recognition.
Materials and methods: A retrospective observational single-centre study was conducted at a tertiary care hospital, including paediatric patients who underwent contrast enema between January 2011 and December 2023, either for suspected HD or for other clinical indications. A classifier based on a pretrained DNN (DenseNet121) was developed to detect HD in contrast enema images. DNN performance was assessed using balanced accuracy, sensitivity, the area under the receiver-operating characteristic curve (AUC-ROC), and the area under the precision-recall curve (AUC-PR). Rectal biopsy was the reference standard, with clinical follow-up in cases where a biopsy was not performed. The DNN classification performance was compared to historical expert radiologic assessment.
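As a rough illustration of the threshold-based metrics named above, the following minimal Python sketch derives sensitivity, specificity, and balanced accuracy from binary labels. The function names and the toy labels are hypothetical and are not taken from the study; the sketch only shows how such metrics are conventionally defined.

```python
# Toy illustration only: the labels below are made up, not study data.
# Convention assumed here: 1 = HD, 0 = no HD.

def confusion_counts(y_true, y_pred):
    """Return (tp, fn, tn, fp) for two equal-length lists of binary labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    return tp, fn, tn, fp

def sensitivity(tp, fn):
    """Fraction of true HD cases that were flagged as HD."""
    return tp / (tp + fn)

def specificity(tn, fp):
    """Fraction of non-HD cases that were correctly left unflagged."""
    return tn / (tn + fp)

def balanced_accuracy(tp, fn, tn, fp):
    """Mean of sensitivity and specificity; robust to class imbalance."""
    return (sensitivity(tp, fn) + specificity(tn, fp)) / 2

# Hypothetical reference labels and classifier outputs.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 0, 1]

counts = confusion_counts(y_true, y_pred)
print(counts, balanced_accuracy(*counts))
```

Balanced accuracy is a natural choice when the positive class (here, HD) is less frequent than the negative class, because it weights both classes equally regardless of their prevalence.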
Results: A total of 278 contrast enemas were performed in 221 patients (64.8% male, 35.2% female; mean age 4.14 years, median 2.65 years). DenseNet121 achieved 75.3% balanced accuracy, 58.5% sensitivity, and 92.1% specificity per individual image, improving to 82.8%, 72.7%, and 93.0%, respectively, at the contrast enema level. The model achieved an AUC-ROC similar to that of expert radiologists in their original reports (0.830 vs 0.804), and the interobserver agreement was moderate (Cohen's kappa = 0.475).
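The reported figures can be sanity-checked with a short Python sketch: balanced accuracy should equal the mean of sensitivity and specificity, up to rounding. The sketch also shows the standard definition of Cohen's kappa for two binary raters. The rater lists are hypothetical toy data; the study's kappa of 0.475 cannot be reproduced from the abstract alone.

```python
# Percentages as reported in the Results; everything else is illustrative.
reported = {
    "image": {"sens": 58.5, "spec": 92.1, "bal_acc": 75.3},
    "enema": {"sens": 72.7, "spec": 93.0, "bal_acc": 82.8},
}

def mean_of_sens_spec(level):
    """Balanced accuracy recomputed as the mean of sensitivity and specificity."""
    r = reported[level]
    return (r["sens"] + r["spec"]) / 2

def cohen_kappa(rater_a, rater_b):
    """Cohen's kappa for two equal-length lists of binary (0/1) ratings."""
    n = len(rater_a)
    # Observed agreement: fraction of cases where both raters agree.
    p_o = sum(1 for a, b in zip(rater_a, rater_b) if a == b) / n
    # Chance agreement from each rater's marginal rate of positive calls.
    pa, pb = sum(rater_a) / n, sum(rater_b) / n
    p_e = pa * pb + (1 - pa) * (1 - pb)
    return (p_o - p_e) / (1 - p_e)

for level in reported:
    diff = abs(mean_of_sens_spec(level) - reported[level]["bal_acc"])
    print(level, "consistent within rounding:", diff < 0.06)

# Hypothetical ratings (1 = HD suspected), not study data.
model_calls = [1, 1, 0, 0, 1, 0, 1, 0]
radiologist_calls = [1, 0, 0, 0, 1, 0, 1, 1]
kappa = cohen_kappa(model_calls, radiologist_calls)
print("toy kappa:", kappa)
```

Kappa corrects raw agreement for the agreement expected by chance, which is why two raters can agree on most cases and still show only moderate kappa.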
Conclusion: The DNN model demonstrated higher specificity than radiologists in the interpretation of contrast enemas in patients with suspected HD. Moderate interobserver agreement underscores the model's potential value as a tool for diagnostic support and standardisation, particularly in settings where access to experienced specialists may be limited or in borderline cases.
Key points: Question Contrast enema is commonly used to evaluate suspected HD, but its diagnostic accuracy is variable and dependent on the radiologist's expertise. Findings A deep learning model outperformed radiologists in specificity (93.0% vs 79.1%); however, the difference was not statistically significant, and the interobserver agreement was moderate (Cohen's kappa = 0.475). Clinical relevance A DNN trained for automated analysis of contrast enema can identify patterns suggestive of HD with performance comparable to conventional radiological assessment, underscoring its value as a tool for diagnostic support in borderline cases or when access to experienced specialists may be limited.
