Microsatellite instability (MSI) is an important biomarker in colorectal cancer, influencing both patient prognosis and treatment decisions. Current approaches for MSI prediction from hematoxylin and eosin-stained whole-slide images (WSI) rely on end-to-end deep learning (“black-box”) models with limited interpretability, typically offering only heatmaps for visualization. However, experienced pathologists can intuitively identify MSI through specific histologic features and have developed manual classification systems such as MS-Path for Lynch syndrome screening. We present a novel hybrid approach that combines computational and pathologist expertise to create an explainable and verifiable method for MSI prediction in colorectal cancer, applicable to resection and biopsy WSI. Our proposed method uses nuclei and tissue segmentation models to automatically quantify MSI-associated histologic features outlined in the Bethesda guidelines, including intraepithelial lymphocytes, grade of differentiation, mucinous components, and tertiary lymphoid structures. After validation on annotated data sets, these features are integrated with clinical data and used in logistic regression and random forest models to predict MSI status. We validated our approach using 3256 WSI from 2267 patients across 7 cohorts from 5 centers. The method achieved an area under the curve of up to 0.88 across all resection cohorts, and 0.90 on biopsies, performing on par with published black-box deep learning models. Importantly, the learned variable importances strongly correlated with manual scoring systems and aligned with manual pathologist assessments. We observed significant intrapatient heterogeneity in predicted scores, emphasizing the importance of whole-case analysis. Our approach also shows potential as a screening tool that could exclude 41% of patients from gold-standard MSI testing while maintaining 95% sensitivity.
This study demonstrates that classifiers based on clinical and validated histologic information can predict MSI status as effectively as black-box models while providing complete interpretability. Our method offers an alternative pathway for understandable, explainable, and trustworthy biomarker prediction in computational pathology.
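The screening use case mentioned above (excluding patients from gold-standard testing at a fixed sensitivity) can be illustrated with a minimal sketch. It assumes each patient already has one predicted MSI score and a reference label; the function names `rule_out_threshold` and `excluded_fraction` are hypothetical and are not taken from the study, and the thresholding rule shown is one simple choice rather than the authors' procedure.

```python
import math


def rule_out_threshold(scores, labels, target_sensitivity=0.95):
    """Return the highest score cutoff that still flags at least
    `target_sensitivity` of the MSI-positive cases (label == 1).

    A patient is flagged for confirmatory testing when score >= cutoff.
    Illustrative assumption: one score per patient, higher means more MSI-like.
    """
    positives = sorted(
        (s for s, y in zip(scores, labels) if y == 1), reverse=True
    )
    if not positives:
        raise ValueError("need at least one MSI-positive case")
    # Minimum number of positive cases that must remain flagged.
    k = max(1, math.ceil(target_sensitivity * len(positives)))
    # The k-th highest positive score keeps at least k positives flagged.
    return positives[k - 1]


def excluded_fraction(scores, cutoff):
    """Fraction of all patients scoring below the cutoff, i.e. those
    who would be spared gold-standard MSI testing."""
    return sum(1 for s in scores if s < cutoff) / len(scores)
```

In practice the cutoff would be chosen on a development cohort and then evaluated on held-out cohorts, since a threshold tuned and assessed on the same patients tends to overstate the achievable exclusion rate.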
