Evidence-informed decision making is based on the premise that the entirety of information on a topic is collected and analyzed. Systematic reviews allow for data from different studies to be rigorously assessed according to PICO principles (population, intervention, control, outcomes). However, conducting a systematic review is generally a slow and resource-intensive process. The fundamental problem is that the current approach to creating a systematic review cannot scale to meet the challenges posed by the massive body of unstructured evidence. For this reason, the Public Health Agency of Canada has been examining the automation of different stages of evidence synthesis to increase efficiency. In this article, we present an overview of an initial version of a novel machine learning-based system that is powered by recent advances in natural language processing (NLP), such as BioBERT, with further optimizations completed using a new immunization-specific document database. The resulting optimized NLP model at the core of this system can identify and extract PICO-related fields from publications on immunization with an average accuracy of 88% across five classes of text. Functionality is provided through a straightforward web interface.
Open Data is part of a broad global movement that is not only advancing science and scientific communication but also transforming modern society and how decisions are made. What began with a call for Open Science and the rise of online journals has extended to Open Data, based on the premise that if reports on data are open, then the generated or supporting data should be open as well. There have been a number of advances in Open Data over the last decade, spearheaded largely by governments. A real benefit of Open Data is not simply that single databases can be used more widely; it is that these data can also be leveraged, shared and combined with other data. Open Data facilitates scientific collaboration, enriches research and advances analytical capacity to inform decisions. In the human and environmental health realms, for example, the ability to access and combine diverse data can advance early signal detection, improve analysis and evaluation, inform program and policy development, increase capacity for public participation, enable transparency and improve accountability. However, challenges remain. Enormous resources are needed to make the technological shift to open and interoperable databases that are accessible through common protocols and terminology. Among data generators and users, this shift also involves a cultural change: from regarding databases as restricted intellectual property to considering data as a common good. There is a need to address legal and ethical considerations in making this shift. Finally, along with efforts to modify infrastructure and address the cultural, legal and ethical issues, it is important to share the information equitably and effectively. While there is great potential in the open, timely, equitable and straightforward sharing of data, fully realizing the myriad benefits of Open Data will depend on how effectively these challenges are addressed.
The Canadian Notifiable Disease Surveillance System (CNDSS) provides data on diseases that have been identified as priorities for public health monitoring and control. Several advances that have been made on Notifiable Diseases Online, the CNDSS interactive website, are consistent with the Government of Canada's commitment to Open Data. This article provides an update on changes made to the case definitions since they were last published in 2009, and describes updates that have been made to the interactive website since 2013. Changes were made to the case definitions of five diseases. For hepatitis C, the new case definition now distinguishes between acute and chronic infection. For cyclosporiasis, the probable case definition requires an epidemiologic link, with the clarification that this would likely be due to exposure to a common food source. For rabies, the probable case definition now refers to detection of rabies-neutralizing antibody instead of specific antibody titres. For Lyme disease, the revised confirmed and probable case definitions now identify five options for Lyme disease risk areas instead of endemic areas. For tuberculosis, the revised case definition now includes nucleic acid amplification testing in addition to culture for diagnosis. The Notifiable Diseases Online website is an interactive tool that enables users to create customized figures and tables. Since a major redesign in 2013, numerous changes have been made to the look and feel of the site. Figures and tables can now be extracted as Excel or PDF files, and large datasets can be exported to Excel files for further analysis. Case definitions in the national surveillance system will be updated as needed, and its interactive website will continue to be improved and updated in response to user comments.