{"title":"Ontology-Based Data Collection for a Hybrid Outbreak Detection Method Using Social Media","authors":"Ghazaleh Babanejaddehaki;Aijun An;Heidar Davoudi","doi":"10.1109/TNB.2024.3442912","DOIUrl":null,"url":null,"abstract":"Given the persistent global challenge presented by rapidly spreading diseases, as evidenced notably by the widespread impact of the COVID-19 pandemic on both human health and economies worldwide, the necessity of developing effective infectious disease prediction models has become of utmost importance. In this context, the utilization of online social media platforms as valuable tools in healthcare settings has gained prominence, offering direct avenues for disseminating critical health information to the public in a timely and accessible manner. Propelled by the ubiquitous accessibility of the internet through computers and mobile devices, these platforms promise to revolutionize traditional detection methods, providing more immediate and reliable epidemiological insights. Leveraging this paradigm shift, our proposed framework harnesses Twitter data associated with infectious disease symptoms, employing ontology to identify and curate relevant tweets. Central to our methodology is a hybrid model that integrates XGBoost and Bidirectional Long Short-Term Memory (BiLSTM) architectures. The integration of XGBoost addresses the challenge of handling small dataset sizes, inherent during outbreaks due to limited time series data. XGBoost serves as a cornerstone for minimizing the loss function and identifying optimal features from our multivariate time series data. Subsequently, the combined dataset, comprising original features and predicted values by XGBoost, is channeled into the BiLSTM for further processing. Through extensive experimentation with a dataset spanning multiple infectious disease outbreaks, our hybrid model demonstrates superior predictive performance compared to state-of-the-art and baseline models. By enhancing forecasting accuracy and outbreak tracking capabilities, our model offers promising prospects for assisting health authorities in mitigating fatalities and proactively preparing for potential outbreaks.","PeriodicalId":13264,"journal":{"name":"IEEE Transactions on NanoBioscience","volume":null,"pages":null},"PeriodicalIF":3.7000,"publicationDate":"2024-08-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"IEEE Transactions on NanoBioscience","FirstCategoryId":"99","ListUrlMain":"https://ieeexplore.ieee.org/document/10634582/","RegionNum":4,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"BIOCHEMICAL RESEARCH METHODS","Score":null,"Total":0}
引用次数: 0
Abstract
Given the persistent global challenge presented by rapidly spreading diseases, as evidenced notably by the widespread impact of the COVID-19 pandemic on both human health and economies worldwide, the necessity of developing effective infectious disease prediction models has become of utmost importance. In this context, the utilization of online social media platforms as valuable tools in healthcare settings has gained prominence, offering direct avenues for disseminating critical health information to the public in a timely and accessible manner. Propelled by the ubiquitous accessibility of the internet through computers and mobile devices, these platforms promise to revolutionize traditional detection methods, providing more immediate and reliable epidemiological insights. Leveraging this paradigm shift, our proposed framework harnesses Twitter data associated with infectious disease symptoms, employing ontology to identify and curate relevant tweets. Central to our methodology is a hybrid model that integrates XGBoost and Bidirectional Long Short-Term Memory (BiLSTM) architectures. The integration of XGBoost addresses the challenge of handling small dataset sizes, inherent during outbreaks due to limited time series data. XGBoost serves as a cornerstone for minimizing the loss function and identifying optimal features from our multivariate time series data. Subsequently, the combined dataset, comprising original features and predicted values by XGBoost, is channeled into the BiLSTM for further processing. Through extensive experimentation with a dataset spanning multiple infectious disease outbreaks, our hybrid model demonstrates superior predictive performance compared to state-of-the-art and baseline models. By enhancing forecasting accuracy and outbreak tracking capabilities, our model offers promising prospects for assisting health authorities in mitigating fatalities and proactively preparing for potential outbreaks.
期刊介绍:
The IEEE Transactions on NanoBioscience reports on original, innovative and interdisciplinary work on all aspects of molecular systems, cellular systems, and tissues (including molecular electronics). Topics covered in the journal focus on a broad spectrum of aspects, both on foundations and on applications. Specifically, methods and techniques, experimental aspects, design and implementation, instrumentation and laboratory equipment, clinical aspects, hardware and software data acquisition and analysis and computer based modelling are covered (based on traditional or high performance computing - parallel computers or computer networks).