Pub Date: 2021-01-01; Epub Date: 2021-05-01; DOI: 10.1007/s41666-021-00094-8
Jeff S Wesner, Dan Van Peursem, José D Flores, Yuhlong Lio, Chelsea A Wesner
Anticipating the number of hospital beds needed for patients with COVID-19 remains a challenge. Early efforts to predict hospital bed needs focused on deriving predictions from SIR models, largely at the level of countries, provinces, or states. In the USA, these models rely on data reported by state health agencies. However, predicting disease and hospitalization dynamics at the state level is complicated by geographic variation in disease parameters. In addition, it is difficult to make forecasts early in a pandemic due to minimal data. Bayesian approaches allow models to be specified with informed priors, so that estimates from areas that have already completed a disease curve can serve as prior information for areas that are beginning their curve. Here, a Bayesian non-linear regression (Weibull function) was used to forecast cumulative and active COVID-19 hospitalizations for South Dakota, USA, based on data available up to 2020-07-22. As expected, early forecasts were dominated by prior information, which was derived from New York City. Importantly, hospitalization trends differed within South Dakota due to early peaks in an urban area, followed by later peaks in rural areas of the state. Combining these trends led to altered forecasts with relevant policy implications.
Supplementary information: The online version contains supplementary material available at 10.1007/s41666-021-00094-8.
Title: Forecasting Hospitalizations Due to COVID-19 in South Dakota, USA. Journal: Journal of Healthcare Informatics Research. Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8088317/pdf/
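The Weibull-shaped cumulative curve mentioned in the abstract above can be sketched as a plain function. This is only an illustration of the curve's mean shape, not the authors' model: the parameterization (an asymptote multiplied by the Weibull cumulative distribution function) and all parameter values are assumptions, and the Bayesian part (placing informed priors on these parameters, for example from New York City data) is not shown.

```python
import math


def weibull_cumulative(day, asymptote, scale, shape):
    """Cumulative count by `day` under a Weibull-shaped growth curve.

    Assumed form: asymptote * (1 - exp(-(day / scale) ** shape)).
    """
    if day <= 0:
        return 0.0
    return asymptote * (1.0 - math.exp(-((day / scale) ** shape)))


# Hypothetical parameter values, chosen only for illustration.
ASYMPTOTE = 1000.0  # eventual total hospitalizations
SCALE = 60.0        # day by which about 63% of the total has occurred
SHAPE = 2.0         # controls how sharply the curve rises

curve = [
    weibull_cumulative(day, ASYMPTOTE, SCALE, SHAPE)
    for day in range(0, 181, 30)
]
```

In a Bayesian fit, the three parameters would each receive a prior distribution, which is why early forecasts with little local data stay close to the prior.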
Pub Date: 2021-01-01; Epub Date: 2021-09-07; DOI: 10.1007/s41666-021-00103-w
Kai Lisa Lo, Minglei Zhang, Yanhui Chen, Jinhong Jackson Mi
Purpose: COVID-19 continues to spread around the world. To improve the subsequent control of COVID-19, it is essential to measure and predict the future scale of the outbreak.
Methods: This paper uses a rolling mechanism and grid search to find the best fractional order of the Fractional Order Accumulation Grey Model (FGM). A buffer level is proposed, based on the general form of the weakening buffer operator, to measure the effect of government control measures on the epidemic. The buffer level is associated with the Government Response Stringency Index and the Mobility Index.
Results: Firstly, the model proposed in this paper dominates the ARIMA model, which has been widely used in predicting confirmed COVID-19 cases. Secondly, in the process of using the buffer level to modify the FGM, this paper finds that government measures require the active cooperation of the public and often take effect with a time lag. Only when the government increases its stringency and the public observes the orders can the spread of COVID-19 be slowed down. If there is only a control measure and the public does not respond actively, the epidemic will not slow down. Thirdly, according to the Mobility Index and Government Response Stringency Index in December, this paper predicts the cumulative confirmed cases at the end of January in different scenarios according to different buffer levels. The study suggests that the world should continue to maintain high vigilance and take corresponding control measures for the outbreak of COVID-19.
Conclusions: Government control measures and public compliance are both important in this battle with COVID-19. Government control measures have a time-lag effect, and the time lag is about 9 days. When the government increases its stringency and the public cooperates with the government, the weakening buffer operator with a proper buffer level must be considered in the prediction process. These prediction methods can be considered for predicting confirmed COVID-19 cases in the future or the trend of other epidemics.
Title: Forecasting the Trend of COVID-19 Considering the Impacts of Public Health Interventions: An Application of FGM and Buffer Level. Journal: Journal of Healthcare Informatics Research. Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8422838/pdf/
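The fractional-order accumulation at the core of an FGM can be illustrated in a few lines. The sketch below uses the standard definition from the fractional grey model literature, in which the order-r accumulated value at step k is a weighted sum of all earlier observations with weights Gamma(r + j) / (Gamma(j + 1) * Gamma(r)) for lag j. Whether the paper uses exactly this form is an assumption, and the function name and the sample case counts are hypothetical.

```python
def fractional_accumulate(series, order):
    """Return the order-`order` accumulated series.

    weights[j] equals Gamma(order + j) / (Gamma(j + 1) * Gamma(order)),
    built iteratively to avoid evaluating large Gamma values.
    order = 1 gives the ordinary cumulative sum; order = 0 returns the input.
    """
    n = len(series)
    weights = [1.0]
    for lag in range(1, n):
        weights.append(weights[-1] * (order + lag - 1) / lag)
    return [
        sum(weights[k - i] * series[i] for i in range(k + 1))
        for k in range(n)
    ]


# Hypothetical daily case counts, for illustration only.
cases = [1.0, 2.0, 3.0, 4.0]
ordinary = fractional_accumulate(cases, 1.0)   # same as a running total
fractional = fractional_accumulate(cases, 0.5)  # older points weighted less
```

A grid search over the order, as described in the Methods, would call this function for each candidate order, fit the grey model to the accumulated series, and keep the order with the lowest validation error.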
Countries across the world are in different stages of the COVID-19 trajectory, and many have implemented lockdown measures to prevent its spread. Although a lockdown is effective in such prevention, it may put the economy into a depression. Predicting the epidemic progression as the government switches the lockdown on or off is therefore critical. We propose a transfer learning approach called ALeRT-COVID, which uses an attention-based recurrent neural network (RNN) architecture to predict the epidemic trends for different countries. A source model was trained on pre-defined source countries and then transferred to each target country. The lockdown measure was introduced to our model as a predictor, and the attention mechanism was used to learn the different contributions of the confirmed cases in past days to the future trend. Results demonstrated that the transfer learning strategy is helpful, especially for early-stage countries. By introducing the lockdown predictor and the attention mechanism, ALeRT-COVID showed a significant improvement in prediction performance. We predicted confirmed cases 1 week ahead under two scenarios: extending the lockdown and easing it. Our results show that lockdown measures are still necessary for several countries. We expect that our research can help different countries make better decisions on lockdown measures.
Title: ALeRT-COVID: Attentive Lockdown-awaRe Transfer Learning for Predicting COVID-19 Pandemics in Different Countries. Journal: Journal of Healthcare Informatics Research.
Authors: Yingxue Li, Wenxiao Jia, Junmei Wang, Jianying Guo, Qin Liu, Xiang Li, Guotong Xie, Fei Wang
Pub Date: 2021-01-01; DOI: 10.1007/s41666-020-00088-y
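The idea of letting an attention mechanism weigh the contributions of past days can be sketched with generic dot-product attention. This is not the ALeRT-COVID architecture: the abstract does not give the scoring function, so the dot-product score, the function names, and the toy vectors below are all assumptions made for illustration.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_pool(hidden_states, query):
    """Weight per-day hidden states by their dot product with `query`.

    Returns (weights, context), where the weights sum to 1 and the context
    is the weighted average of the hidden states.
    """
    scores = [
        sum(h_d * q_d for h_d, q_d in zip(state, query))
        for state in hidden_states
    ]
    weights = softmax(scores)
    context = [
        sum(w * state[d] for w, state in zip(weights, hidden_states))
        for d in range(len(query))
    ]
    return weights, context


# Toy hidden states for three past days (hypothetical values).
states = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
weights, context = attention_pool(states, query=[1.0, 1.0])
```

In a recurrent model, the hidden states would come from the RNN run over past confirmed cases (with the lockdown indicator as an extra input), and the learned weights would show which past days the forecast relies on most.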
Pub Date: 2020-12-01; Epub Date: 2020-08-03; DOI: 10.1007/s41666-020-00077-1
Xinmeng Zhang, Chao Yan, Cheng Gao, Bradley A Malin, You Chen
Purpose: The data in a patient's laboratory test results are a notable resource to support clinical investigation and enhance medical research. However, for a variety of reasons, this type of data often contains a non-trivial number of missing values. For example, physicians may neglect to order tests or document the results. Such a phenomenon reduces the degree to which this data can be used to learn efficient and effective predictive models. To address this problem, various approaches have been developed to impute missing laboratory values; however, their performance has been limited. This is due, in part, to the fact that no approaches effectively leverage the contextual information 1) within individual laboratory test variables or 2) between laboratory test variables.
Method: We introduce an approach that combines an unsupervised prefilling strategy with a supervised machine learning approach, in the form of extreme gradient boosting (XGBoost), to leverage both types of context for imputation purposes. We evaluated the methodology through a series of experiments on approximately 8,200 patients' records in the MIMIC-III dataset.
Result: The results demonstrate that the new model outperforms baseline and state-of-the-art models on 13 commonly collected laboratory test variables. In terms of the normalized root mean square deviation (nRMSD), our model improves imputation by over 20%, on average.
Conclusion: Missing data imputation on temporal variables can be substantially improved via a prefilling strategy and a supervised training technique that leverages both the longitudinal and cross-sectional context simultaneously.
Title: Predicting Missing Values in Medical Data via XGBoost Regression. Journal: Journal of Healthcare Informatics Research.
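The prefill-then-evaluate workflow in the abstract above can be illustrated with a minimal sketch. It does not reproduce the authors' pipeline: the mean-based prefilling stands in for whatever unsupervised strategy the paper uses, the supervised XGBoost step is omitted, and normalizing the error by the range of the true values is an assumed definition of nRMSD. Function names and sample values are hypothetical.

```python
import math


def prefill_with_mean(series):
    """Replace None entries with the mean of the observed entries."""
    observed = [v for v in series if v is not None]
    if not observed:
        raise ValueError("series has no observed values")
    mean = sum(observed) / len(observed)
    return [mean if v is None else v for v in series]


def nrmsd(true_values, imputed_values, masked_indices):
    """Root mean square deviation on masked entries, scaled by the range
    of the true values (assumed normalization)."""
    value_range = max(true_values) - min(true_values)
    if value_range == 0:
        raise ValueError("true values have zero range")
    squared = [
        ((imputed_values[i] - true_values[i]) / value_range) ** 2
        for i in masked_indices
    ]
    return math.sqrt(sum(squared) / len(squared))


# Hypothetical lab series: hide two known values, prefill, then score.
truth = [1.0, 2.0, 3.0, 4.0, 5.0]
masked = [1, 3]
with_gaps = [None if i in masked else v for i, v in enumerate(truth)]
prefilled = prefill_with_mean(with_gaps)
score = nrmsd(truth, prefilled, masked)
```

In the approach described, a supervised regressor would then be trained on the prefilled data to refine each masked value, and the same metric would be used to compare it against the prefilling baseline.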
Pub Date: 2020-11-09; eCollection Date: 2021-03-01; DOI: 10.1007/s41666-020-00081-5
Christian Alvin H Buhat, Jessa Camille C Duero, Edd Francis O Felix, Jomar F Rabajante, Jonathan B Mamplata
Testing is crucial for early detection, isolation, and treatment of coronavirus disease (COVID-19)-infected individuals. However, in resource-constrained countries such as the Philippines, test kits have limited availability. As of 11 April 2020, 11 testing centers in the country had been accredited by the Department of Health (DOH) to conduct testing. In this paper, we use nonlinear programming (NLP) to determine the optimal percentage allocation of COVID-19 test kits among accredited testing centers in the Philippines that gives all infected individuals an equitable chance to be tested. Heterogeneity in testing accessibility, population density of municipalities, and the capacity of testing facilities are included in the model. Our results show that the ranges of optimal allocation per testing center are as follows: Research Institute for Tropical Medicine (4.17-6.34%), San Lazaro Hospital (14.65-24.03%), University of the Philippines-National Institutes of Health (16.25-44.80%), Lung Center of the Philippines (15.8-26.40%), Baguio General Hospital Medical Center (0.58-0.76%), The Medical City, Pasig City (5.96-25.51%), St. Luke's Medical Center, Quezon City (1.09-6.70%), Bicol Public Health Laboratory (0.06-0.08%), Western Visayas Medical Center (0.71-4.52%), Vicente Sotto Memorial Medical Center (1.02-2.61%), and Southern Philippines Medical Center (≈ 0.01%). Our results can serve as a guide to the authorities in distributing the COVID-19 test kits. They can also be used for proposing additional testing centers and for using the available test kits properly and equitably, which helps in "flattening" the epidemic curve.
Title: Optimal Allocation of COVID-19 Test Kits Among Accredited Testing Centers in the Philippines. Journal: Journal of Healthcare Informatics Research.
Pub Date: 2020-10-18; DOI: 10.1007/s41666-020-00075-3
Kejing Yin, Liaoliao Feng, W. K. Cheung
Title: Context-Aware Time Series Imputation for Multi-Analyte Clinical Data. Journal: Journal of Healthcare Informatics Research.
Pub Date: 2020-10-16; DOI: 10.1007/s41666-020-00079-z
Sapna Trivedi, R. Gildersleeve, Sandra Franco, A. Kanter, A. Chaudhry
Title: Evaluation of a Concept Mapping Task Using Named Entity Recognition and Normalization in Unstructured Clinical Text. Journal: Journal of Healthcare Informatics Research.
Pub Date: 2020-10-09; eCollection Date: 2021-06-01; DOI: 10.1007/s41666-020-00078-0
Ramalingam Shanmugam
This article constructs and demonstrates an alternative probabilistic approach (using an incidence rate restricted model), in contrast to deterministic mathematical models such as SIR, to capture the impact of healthcare efforts on the prevalence rates of COVID-19 infectivity, hospitalization, recovery, and mortality in the Eastern, Central, Mountain, and Pacific time zone states of the USA. We add new properties to the incidence rate restricted Poisson probability distribution. With these properties, our method makes it feasible not only to understand the patterns of the prevalence rates of COVID-19 infectivity, hospitalization, recovery, and mortality but also to quantitatively assess the effectiveness of social distancing, healthcare management's efforts to hospitalize patients, patients' immunity to recover, and lastly the unfortunate mortality itself. To make regional comparisons (as people's daily movement is far more frequent within than outside a regional zone), we group the COVID-19 data by Eastern, Central, Mountain, and Pacific zone states. Several non-intuitive findings emerge from the data. They include the existence of imbalance, different vulnerability, and risk reduction in these four regions. For example, the impact of healthcare efforts is high in the recovery category in the Pacific states. The impact is lower in the hospitalization category in the Mountain states. The least impact is seen in the infectivity category in the Eastern zone states. A few thoughts on future research are offered; such work requires collecting rich data on COVID-19 and extracting valuable information for better public health policies.
Title: Restricted Prevalence Rates of COVID-19's Infectivity, Hospitalization, Recovery, Mortality in the USA and Their Implications. Journal: Journal of Healthcare Informatics Research.
Pub Date: 2020-06-05; DOI: 10.1101/2020.06.04.20122150
Hegler C. Tissot, L. Pedebôs
Miscarriages are the most common type of pregnancy loss, mostly occurring in the first 12 weeks of pregnancy. Pregnancy risk assessment aims to quantify evidence to reduce such maternal morbidities, and personalized decision support systems are the cornerstone of high-quality, patient-centered care to improve diagnosis, treatment selection, and risk assessment. However, data sparsity and the increasing number of patient-level observations require more effective forms of representing clinical knowledge to encode known information in a way that enables inference and reasoning. Whereas knowledge embedding representation has been widely explored in open domain data, there have been few efforts to apply it in the clinical domain. In this study, we contrast multiple embedding strategies and demonstrate how these methods can assist in performing risk assessment of miscarriage before and during pregnancy. Our experiments show that simple knowledge embedding approaches that use domain-specific metadata perform better than complex embedding strategies, although both improve results relative to a population probabilistic baseline in AUPRC, F1-score, and a proposed normalized version of these evaluation metrics that better reflects accuracy for unbalanced datasets. Finally, embedding approaches provide evidence about each individual, supporting explainability of model predictions in a way that humans can understand.
Title: Improving Risk Assessment of Miscarriage During Pregnancy with Knowledge Graph Embeddings. Journal: Journal of Healthcare Informatics Research.
Pub Date: 2020-05-07; DOI: 10.1007/s41666-020-00073-5
A. Jazayeri, Ou Stella Liang, Christopher C. Yang
Title: Imputation of Missing Data in Electronic Health Records Based on Patients' Similarities. Journal: Journal of Healthcare Informatics Research.