Application of recommender systems and time series models to monitor quality at HIV/AIDS health facilities

Journal: Data & policy | IF: 1.8 | JCR: Q3 (Public Administration) | Pub Date: 2022-07-11 | DOI: 10.1017/dap.2022.15
J. Friedman, Zola Allen, Allison Fox, Jose Webert, A. Devlin
Citations: 0

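Abstract

The US government invests substantial sums to control the HIV/AIDS epidemic. To monitor progress toward epidemic control, the President's Emergency Plan for AIDS Relief (PEPFAR) oversees a data reporting system that includes standard indicators, reporting formats, information systems, and data warehouses. These data, reported quarterly, inform understanding of the global epidemic, resource allocation, and identification of trouble spots. PEPFAR has developed tools to assess the quality of the data reported. These tools have made important contributions but are limited in the methods they use to identify anomalous data points. The most advanced tool considers univariate probability distributions, whereas correlations between indicators suggest that a multivariate approach is better suited. For temporal analysis, the same tool compares values to the averages of preceding periods but does not consider underlying trends and seasonal factors. To address these limitations, we apply two methods to identify anomalous data points among routinely collected facility-level HIV/AIDS data. The first is recommender systems, an unsupervised machine learning method that captures relationships between users and items. We apply the approach in a novel way by predicting reported values, comparing predicted to reported values, and identifying the greatest deviations. For a temporal perspective, we apply time series models that are flexible enough to include trend and seasonality. Results of these methods were validated against manual review (recommender systems: 95% agreement on non-anomalies, 56% agreement on anomalies; time series: 96% agreement on non-anomalies, 91% agreement on anomalies). This tool will apply greater methodological sophistication to monitoring data quality in an accelerated and standardized manner.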
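The recommender-system idea in the abstract (predict each reported value, compare it with what was reported, and surface the greatest deviations) can be sketched as follows. This is an illustration only, not the authors' implementation: the abstract does not name the algorithm, so a truncated-SVD low-rank reconstruction stands in for the recommender model, facilities are assumed to play the role of "users" and indicators the role of "items", and the function name `flag_largest_deviations` and its parameters `rank` and `top_k` are hypothetical.

```python
import numpy as np


def flag_largest_deviations(values, rank=2, top_k=3):
    """Rank cells of a facility x indicator matrix by deviation from a low-rank fit.

    Assumption: a truncated SVD stands in for the recommender model.
    Returns a list of (row, col, reported, predicted), largest deviation first.
    """
    values = np.asarray(values, dtype=float)

    # Standardize each indicator (column) so high-volume indicators
    # do not dominate the factorization.
    mean = values.mean(axis=0)
    std = values.std(axis=0)
    std[std == 0] = 1.0
    z = (values - mean) / std

    # Low-rank reconstruction: the "predicted" value of each cell is what
    # the shared facility/indicator structure implies it should be.
    u, s, vt = np.linalg.svd(z, full_matrices=False)
    z_hat = (u[:, :rank] * s[:rank]) @ vt[:rank]
    predicted = z_hat * std + mean

    # Rank cells by absolute standardized residual, largest first.
    residual = np.abs(z - z_hat)
    order = np.argsort(residual, axis=None)[::-1][:top_k]
    rows, cols = np.unravel_index(order, residual.shape)
    return [
        (int(r), int(c), float(values[r, c]), float(predicted[r, c]))
        for r, c in zip(rows, cols)
    ]
```

Because each prediction draws on all indicators at once, a value that looks plausible on its own but is inconsistent with the facility's other indicators can still rank high, which is the multivariate behavior the abstract argues for. Flagged cells would still need manual review, as in the validation the abstract reports.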
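The temporal check described in the abstract replaces a comparison against the averages of preceding periods with a model that allows for trend and seasonality. A minimal sketch of that idea, assuming quarterly data, fits a linear trend plus one level per quarter by ordinary least squares and scores the newest report against the one-step forecast. The abstract does not specify the model family, so this regression is a stand-in, and the names `score_latest_quarter`, `period`, and `threshold` are hypothetical.

```python
import numpy as np


def _design(t, period):
    # Columns: linear trend, then one indicator column per season
    # (the seasonal columns jointly act as the intercept).
    return np.column_stack([t, np.eye(period)[t % period]])


def score_latest_quarter(history, latest, period=4, threshold=3.0):
    """Score the newest value against a trend + seasonality forecast.

    Assumption: OLS with a linear trend and seasonal levels stands in
    for the time series model. Returns (forecast, score, is_anomalous).
    """
    y = np.asarray(history, dtype=float)
    n = y.size

    # Fit the model on preceding periods only.
    X = _design(np.arange(n), period)
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)

    # Residual standard deviation of the historical fit.
    resid = y - X @ coef
    dof = max(n - X.shape[1], 1)
    sigma = float(np.sqrt(resid @ resid / dof))

    # One-step-ahead forecast for the period being checked.
    forecast = float((_design(np.array([n]), period) @ coef)[0])

    deviation = abs(float(latest) - forecast)
    if sigma == 0:
        score = float("inf") if deviation > 0 else 0.0
    else:
        score = deviation / sigma
    return forecast, score, score > threshold
```

A facility whose reports grow steadily, or dip in the same quarter every year, would be flagged by a plain comparison with past averages but not by this check, since the forecast already accounts for both patterns. The threshold of 3.0 is an arbitrary illustrative default, not a value taken from the paper.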