Development of a natural language processing algorithm for the detection of spinal metastasis based on magnetic resonance imaging reports
Evan Mostafa MD, Aaron Hui BS, Boudewijn Aasman BS, Kamlesh Chowdary BS, Kyle Mani BS, Edward Mardakhaev MD, Richard Zampolin MD, Einat Blumfield MD, Jesse Berman MD, Rafael De La Garza Ramos MD, Mitchell Fourman MD, Reza Yassari MD, Ananth Eleswarapu MD, Parsa Mirhaji PhD
North American Spine Society Journal, volume 19, Article 100513. Published 2024-07-03.
DOI: 10.1016/j.xnsj.2024.100513
Background
Metastasis to the spinal column is a common complication of malignancy, potentially causing pain and neurologic injury. An automated system to identify and refer patients with spinal metastases can help overcome barriers to timely treatment. We describe the training, optimization and validation of a natural language processing algorithm to identify the presence of vertebral metastasis and metastatic epidural cord compression (MECC) from radiology reports of spinal MRIs.
Methods
Reports from patients with spine MRI studies performed between January 1, 2008 and April 14, 2019 were reviewed by a team of radiologists to assess for the presence of cancer and to generate a labeled dataset for model training. Using regular expressions, the impression sections were extracted from the reports, converted to lower case, and stripped of all nonalphabetic characters. The resulting text was then tokenized and vectorized using the doc2vec algorithm, and the vectors were used to train a neural network to predict the likelihood of spinal tumor or MECC. For each report, the model output a number from 0 to 1 based on the report's impression section. To test the model's performance, we then obtained 111 MRI reports from outside the test set: 92 manually labeled negative and 19 labeled as MECC.
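The text-preparation steps described above (extract the impression section, lower-case it, drop nonalphabetic characters, tokenize) can be sketched in Python as follows. This is a minimal illustration, not the authors' code: the `IMPRESSION:` header pattern, the function names, and the choice to replace nonalphabetic characters with spaces (so that adjacent words do not merge) are all assumptions. The doc2vec vectorization and the neural network are omitted.

```python
import re

# Hypothetical pattern: assumes the impression section begins at an
# "IMPRESSION:" header and runs to the end of the report. Real reports
# may use other headers or trailing sections.
IMPRESSION_RE = re.compile(r"impression\s*:\s*(.*)", re.IGNORECASE | re.DOTALL)


def extract_impression(report: str) -> str:
    """Return the text after the impression header, or '' if none is found."""
    match = IMPRESSION_RE.search(report)
    return match.group(1) if match else ""


def normalize(text: str) -> str:
    """Lower-case the text and drop every nonalphabetic character."""
    text = text.lower()
    # Assumption: replace with a space rather than delete outright,
    # so "cord-compression" becomes two words instead of one.
    text = re.sub(r"[^a-z\s]", " ", text)
    # Collapse the runs of whitespace left behind by the substitution.
    return re.sub(r"\s+", " ", text).strip()


def preprocess(report: str) -> list[str]:
    """Extract, normalize, and whitespace-tokenize the impression section."""
    return normalize(extract_impression(report)).split()
```

A call such as `preprocess(report_text)` returns a list of lower-case word tokens, which is the form a document-embedding model would typically take as input.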
Results
A total of 37,579 radiology reports were reviewed; 36,676 were labeled negative and 903 were labeled as MECC. We chose a cutoff of 0.02 for a positive result in order to keep the false negative rate low. At this threshold, the model achieved a sensitivity of 100% with a false positive rate of 2.2%.
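To make the thresholding step concrete, the sketch below shows how sensitivity and false positive rate follow from model scores at a given cutoff. The function name and the use of `>=` at the cutoff are assumptions. The example scores are synthetic, constructed only to match the reported test-set sizes (19 positive, 92 negative); the count of 2 false positives is inferred from the reported 2.2% (2/92 ≈ 2.17%) and is not stated in the abstract.

```python
def evaluate(scores, labels, cutoff=0.02):
    """Return (sensitivity, false_positive_rate) at the given cutoff.

    scores: model outputs in [0, 1]; labels: 1 for MECC, 0 for negative.
    Assumption: a score at or above the cutoff counts as positive.
    """
    tp = fn = fp = tn = 0
    for score, label in zip(scores, labels):
        predicted_positive = score >= cutoff
        if label == 1:
            if predicted_positive:
                tp += 1
            else:
                fn += 1
        else:
            if predicted_positive:
                fp += 1
            else:
                tn += 1
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    false_positive_rate = fp / (fp + tn) if (fp + tn) else float("nan")
    return sensitivity, false_positive_rate


# Synthetic scores (not study data): 19 positives above the cutoff,
# 2 negatives above it, and 90 negatives below it.
scores = [0.9] * 19 + [0.5] * 2 + [0.001] * 90
labels = [1] * 19 + [0] * 92
sensitivity, fpr = evaluate(scores, labels)
```

A cutoff this low trades specificity for sensitivity: any report with even a small predicted probability is flagged, which suits a screening tool where a missed case costs more than an unnecessary review.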
Conclusions
The NLP model described here predicts the presence of spinal tumor and MECC in spine MRI reports with high accuracy. We plan to integrate the algorithm into our electronic medical record (EMR) so that these patients can be referred to appropriate specialists sooner, with the aim of reducing morbidity and improving survival.