David Begert, Justin Granek, Brian Irwin, Chris Brogly
Canada communicable disease report = Releve des maladies transmissibles au Canada, 46(6), 174-179. Published 2020-06-04. DOI: https://doi.org/10.14745/ccdr.v46i06a04. Full text (PDF): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11182649/pdf/
Towards automating systematic reviews on immunization using an advanced natural language processing-based extraction system.
Evidence-informed decision making is based on the premise that the entirety of information on a topic is collected and analyzed. Systematic reviews allow for data from different studies to be rigorously assessed according to PICO principles (population, intervention, control, outcomes). However, conducting a systematic review is generally a slow process that is a significant drain on resources. The fundamental problem is that the current approach to creating a systematic review cannot scale to meet the challenges resulting from the massive body of unstructured evidence. For this reason, the Public Health Agency of Canada has been examining the automation of different stages of evidence synthesis to increase efficiencies. In this article, we present an overview of an initial version of a novel machine learning-based system that is powered by recent advances in natural language processing (NLP), such as BioBERT, with further optimizations completed using a new immunization-specific document database. The resulting optimized NLP model at the core of this system is able to identify and extract PICO-related fields from publications on immunization with an average accuracy of 88% across five classes of text. Functionality is provided through a straightforward web interface.
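The abstract reports a single figure, an average accuracy of 88% across five classes of text, without spelling out how the average is formed. The sketch below shows one plausible reading: score each class separately, then take the unweighted mean. It is an illustration, not the authors' evaluation code. The label names, the toy data, and the helper functions `per_class_accuracy` and `average_accuracy` are all assumptions introduced here; the abstract does not name the five classes.

```python
from collections import defaultdict

# Hypothetical label set: the abstract says "five classes of text"
# but does not name them, so these names are placeholders.
LABELS = ["population", "intervention", "control", "outcome", "other"]


def per_class_accuracy(gold, predicted):
    """Return, for each gold class, the fraction of its items labelled correctly.

    Strictly speaking this is per-class recall; it is used here as one
    possible interpretation of "accuracy across classes".
    """
    if len(gold) != len(predicted):
        raise ValueError("gold and predicted must have the same length")
    correct = defaultdict(int)
    total = defaultdict(int)
    for g, p in zip(gold, predicted):
        total[g] += 1
        if g == p:
            correct[g] += 1
    return {label: correct[label] / total[label] for label in total}


def average_accuracy(gold, predicted):
    """Unweighted mean of the per-class scores (a macro average)."""
    scores = per_class_accuracy(gold, predicted)
    return sum(scores.values()) / len(scores)


# Toy data, invented for illustration: two sentences per class,
# with one "intervention" sentence mislabelled as "other".
gold = [label for label in LABELS for _ in range(2)]
predicted = list(gold)
predicted[3] = "other"

scores = per_class_accuracy(gold, predicted)
for label in LABELS:
    print(f"{label}: {scores[label]:.2f}")
print(f"average: {average_accuracy(gold, predicted):.2f}")
```

A macro average treats every class equally regardless of how many sentences it contains. If the reported figure were instead computed over all sentences pooled together, frequent classes would dominate the result, so the two readings can differ on imbalanced data.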