Giorgio Cazzaniga, Albino Eccher, Enrico Munari, Stefano Marletta, Emanuela Bonoldi, Vincenzo Della Mea, Moris Cadei, Marta Sbaraglia, Angela Guerriero, Angelo Paolo Dei Tos, Fabio Pagni, Vincenzo L'Imperio
PATHOLOGICA, vol. 115, no. 6, pp. 318-324. Published 2023-12-01. DOI: 10.32074/1591-951X-952 (https://doi.org/10.32074/1591-951X-952). PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10767798/pdf/
Natural Language Processing to extract SNOMED-CT codes from pathological reports.
Objective: Standardized structured reports (SSR) and suitable terminologies such as SNOMED-CT can enhance data retrieval and analysis, fostering large-scale studies and collaboration. However, narrative reports are still highly prevalent in our laboratories, which warrants alternative, automated labeling approaches. In this project, natural language processing (NLP) methods were used to assign SNOMED-CT codes to structured and unstructured reports from an Italian Digital Pathology Department.
Methods: Two NLP-based automatic coding systems, a support vector machine (SVM) and a long short-term memory (LSTM) network, were trained and applied to a series of narrative reports.
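The abstract does not state which features, hyperparameters, or software were used for the SVM, so the following is only a minimal sketch of the general idea rather than the authors' pipeline. It assumes L2-normalised bag-of-words features, a one-vs-rest linear SVM trained with Pegasos-style subgradient steps on the hinge loss, invented toy report snippets, and placeholder labels (`CODE_A`, `CODE_B`, `CODE_C`) that are not real SNOMED-CT codes. All function names are hypothetical.

```python
import math
from collections import Counter


def featurize(text):
    """Return a sparse, L2-normalised bag-of-words vector as {token: weight}."""
    counts = Counter(text.lower().split())
    norm = math.sqrt(sum(c * c for c in counts.values()))
    return {tok: c / norm for tok, c in counts.items()}


def dot(weights, x):
    """Sparse dot product between a weight dict and a feature dict."""
    return sum(weights.get(tok, 0.0) * v for tok, v in x.items())


def train_binary_svm(samples, targets, epochs=50, lam=0.01):
    """Linear SVM without bias term; targets are +1 or -1.

    Uses Pegasos-style updates: step size 1 / (lam * t), L2 shrinkage on
    every step, and a hinge-loss correction when the margin is below 1.
    """
    w = {}
    t = 0
    for _ in range(epochs):
        for x, y in zip(samples, targets):
            t += 1
            eta = 1.0 / (lam * t)
            margin = y * dot(w, x)
            shrink = 1.0 - eta * lam
            for tok in w:
                w[tok] *= shrink
            if margin < 1.0:
                for tok, v in x.items():
                    w[tok] = w.get(tok, 0.0) + eta * y * v
    return w


def train_one_vs_rest(texts, labels, **kwargs):
    """Train one binary SVM per label (that label vs. all others)."""
    samples = [featurize(t) for t in texts]
    return {
        label: train_binary_svm(
            samples, [1 if l == label else -1 for l in labels], **kwargs
        )
        for label in sorted(set(labels))
    }


def predict(models, text):
    """Return the label whose classifier gives the highest score."""
    x = featurize(text)
    return max(models, key=lambda label: dot(models[label], x))


# Invented toy snippets; labels are placeholders, NOT real SNOMED-CT codes.
texts = [
    "colon adenocarcinoma moderately differentiated",
    "skin compound melanocytic nevus",
    "gastric mucosa chronic gastritis",
    "colon tubular adenoma",
    "skin junctional nevus",
    "antral gastritis",
]
labels = ["CODE_A", "CODE_B", "CODE_C", "CODE_A", "CODE_B", "CODE_C"]

models = train_one_vs_rest(texts, labels)
for report in ("colon adenocarcinoma", "melanocytic nevus", "chronic gastritis"):
    print(report, "->", predict(models, report))
```

In practice a library implementation with tuned features would normally replace this hand-written trainer; the sketch only illustrates how free-text reports can be mapped to a fixed set of codes by a linear classifier.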
Results: Both algorithms were tested on 1163 cases and performed well in terms of accuracy, precision, recall, and F1 score, with SVM performing slightly better than LSTM (0.84, 0.87, 0.83, 0.82 vs 0.83, 0.85, 0.83, 0.82, respectively). Integrating explainability allowed the identification of important terms and groups of words, enabling fine-tuning that balances semantic meaning and model performance.
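For readers unfamiliar with the four reported metrics, the sketch below shows how they can be computed for a multi-class coding task. The abstract does not say how precision, recall, and F1 were averaged across codes, so macro-averaging is an assumption here, and the labels and predictions are invented toy values, not data from the study.

```python
def classification_metrics(y_true, y_pred):
    """Accuracy plus macro-averaged precision, recall and F1 (assumed averaging)."""
    labels = sorted(set(y_true) | set(y_pred))
    pairs = list(zip(y_true, y_pred))
    accuracy = sum(t == p for t, p in pairs) / len(pairs)

    precisions, recalls, f1s = [], [], []
    for label in labels:
        tp = sum(t == label and p == label for t, p in pairs)
        fp = sum(t != label and p == label for t, p in pairs)
        fn = sum(t == label and p != label for t, p in pairs)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = (
            2 * precision * recall / (precision + recall)
            if precision + recall
            else 0.0
        )
        precisions.append(precision)
        recalls.append(recall)
        f1s.append(f1)

    n = len(labels)
    return {
        "accuracy": accuracy,
        "precision": sum(precisions) / n,
        "recall": sum(recalls) / n,
        "f1": sum(f1s) / n,
    }


# Toy example with three placeholder codes (not real SNOMED-CT codes).
y_true = ["A", "A", "B", "B", "C", "C"]
y_pred = ["A", "B", "B", "B", "C", "A"]
metrics = classification_metrics(y_true, y_pred)
print({name: round(value, 3) for name, value in metrics.items()})
```

With a different averaging scheme (for example, weighting each code by its frequency) the precision, recall, and F1 values would differ, which is why the averaging choice matters when comparing figures like those reported above.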
Conclusions: AI tools allow automatic SNOMED-CT labeling of pathology archives, providing a retrospective remedy for the widespread lack of organization in narrative reports.