International Journal of Population Data Science: Latest Articles

Challenges and lessons learned from two countries using linked administrative data to evaluate the Family Nurse Partnership.
Q3 HEALTH CARE SCIENCES & SERVICES Pub Date : 2022-08-25 DOI: 10.23889/ijpds.v7i3.1833
F. Cavallaro, R. Cannings‐John, F. Lugg-Widger, J. H. van der Meulen, R. Gilbert, E. Kennedy, M. Robling, Hywel Jones
Objectives: We describe the challenges and lessons learned from two studies using linked administrative data from the health, education and social care sectors to evaluate the Family Nurse Partnership (FNP), an intervention supporting adolescent mothers in England (E) and Scotland (S). We present recommendations for studies using linked administrative data to evaluate complex interventions.
Approach: We constructed two cohorts of all mothers aged 13-19 giving birth in NHS hospitals in England and Scotland between 2010 and 2016/17 using linkage of mothers and babies in hospital admissions data (E: Hospital Episode Statistics / S: Maternity Inpatient and Day Case), and identified FNP participation through linkage to FNP programme data. We additionally linked to health, educational and social care data for mothers and their babies (E: National Pupil Database / S: eDRIS). We used these data to identify key risk factors for enrolment in the FNP, assess the effect of the FNP on maternal and child outcomes, and determine programme characteristics modifying the effect of the FNP.
Results: Key challenges: characterising the intervention and usual care, understanding the quality of multi-sector data linkage, data access delays, constructing appropriate comparator groups and interpreting outcomes captured in administrative data. Lessons learned: evaluations require detailed data on intervention activity (dates/geography) and an assessment of usual care, which are rarely readily available and are time-consuming to gather; data linkage quality is variable or unavailable, making it challenging to define denominators; data access delays reduced the time available for analysis; unmeasured confounders not captured in administrative data may prevent generation of an appropriate comparator group. Recommendations: characteristics informing targeting should be explicitly documented, and could be enhanced using linked primary care data and information on household members (e.g. fathers). Process evaluation and qualitative research could help to provide a better understanding of mechanisms of effect.
Conclusion: Linkage of administrative data presents exciting opportunities for efficient evaluation of large-scale, complex public health interventions. However, sufficient information is needed on programme meta-data, targeting and important confounders in order to generate meaningful results. Study findings should help stimulate exploration with practitioners about how programmes can be improved.
Citations: 0
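The abstract above notes that linkage quality is variable, which makes denominators hard to define. As a minimal sketch of how linkage rates might be checked by subgroup, the Python below counts the share of cohort records that found a match in a second dataset. The field names (`id`, `country`), the toy records and the function `linkage_rates` are hypothetical illustrations, not the studies' actual data structures or method.

```python
def linkage_rates(cohort, linked_ids, group_key):
    """Return the share of cohort records with a link, per subgroup.

    cohort     : list of dicts, each with an "id" field and a grouping field
    linked_ids : set of ids that were matched in the second dataset
    group_key  : name of the field to group by (hypothetical, e.g. "country")
    """
    totals, linked = {}, {}
    for record in cohort:
        group = record[group_key]
        totals[group] = totals.get(group, 0) + 1
        if record["id"] in linked_ids:
            linked[group] = linked.get(group, 0) + 1
    return {group: linked.get(group, 0) / n for group, n in totals.items()}


# Toy cohort (invented for illustration only).
toy_cohort = [
    {"id": 1, "country": "E"},
    {"id": 2, "country": "E"},
    {"id": 3, "country": "E"},
    {"id": 4, "country": "E"},
    {"id": 5, "country": "S"},
    {"id": 6, "country": "S"},
]
toy_linked_ids = {1, 2, 3, 5}

rates = linkage_rates(toy_cohort, toy_linked_ids, "country")
print(rates)
```

Reporting rates by subgroup rather than one overall figure helps reveal whether unlinked records are concentrated in particular groups, which is one way missed links can bias a comparator group.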
The changing nature of patient attributes available for matching.
Q3 HEALTH CARE SCIENCES & SERVICES Pub Date : 2022-08-25 DOI: 10.23889/ijpds.v7i3.2079
Yu Deng, Lacey P. Gleason, Adam Culbertson, Don Asmonga, S. Grannis, A. Kho
Objectives: Patient matching rates between organizations can be as low as fifty percent. Challenges to matching include variation in the quality and availability of patient attributes. Here we describe the changing nature of patient attributes available over the past 11 years across a diversity of care settings in the United States.
Approach: Our expert panel identified 64 patient attributes that are currently used or could potentially be candidates for patient matching. We identified a national sample of 14 health care sites that sent us aggregated information on the 64 patient attributes from 2010 to 2020 (inclusive). The information included overall counts and percent availability, overall counts and percent availability by race, and counts and availability by year. Only patients who had at least one visit to the site since 2010 and who were between 18 and 89 years of age at the time of extraction were included.
Results: The aggregated results revealed that first name, last name, gender, postal code, and date of birth are highly available (>90%) across healthcare organizations and time. Availability of patient-reported social security number, work phone number, and emergency contact declined markedly, potentially reflecting privacy concerns. Availability of email addresses (from 18.0% to 63.7%) and phone numbers (from 14.7% to 69.4%) increased greatly over the past 11 years. Novel patient matching attributes such as blood type, facial image, thumb print, or eye color were rarely collected at any site in any year. We observed emerging attributes including sexuality, occupation, and nickname, with a small number of sites collecting these at over 70% availability, reflecting the feasibility of wider adoption in the future.
Conclusion: In this study, we examined the availability of 64 patient attributes across 14 sites from 2010 to 2020. Our findings could inform policy makers and readers about patient attributes that are used for current patient matching and emerging data attributes that could be considered for incorporation into future matching algorithms.
Citations: 0
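The "percent availability" measure in the abstract above can be illustrated with a short Python sketch. It assumes an attribute counts as available when it is neither missing nor an empty string; the abstract does not state the sites' exact rule, and the attribute names and toy records below are invented for illustration.

```python
def percent_availability(records, attributes):
    """Percent of records in which each attribute is populated.

    An attribute counts as populated when it is present and is neither
    None nor an empty string (an assumption made for this sketch).
    """
    total = len(records)
    result = {}
    for attr in attributes:
        present = sum(1 for r in records if r.get(attr) not in (None, ""))
        result[attr] = 100.0 * present / total if total else 0.0
    return result


# Toy patient records (invented for illustration only).
toy_records = [
    {"first_name": "A", "email": "a@example.org", "work_phone": None},
    {"first_name": "B", "email": "", "work_phone": None},
    {"first_name": "C", "email": "c@example.org", "work_phone": "555-0100"},
    {"first_name": "D", "email": None, "work_phone": None},
]

availability = percent_availability(
    toy_records, ["first_name", "email", "work_phone"]
)
print(availability)
```

Running the same function on records grouped by year or by race would give the stratified figures the abstract describes, since each stratum is just a separate list of records.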
Data First: Criminal Courts Linked Data research report.
Q3 HEALTH CARE SCIENCES & SERVICES Pub Date : 2022-08-25 DOI: 10.23889/ijpds.v7i3.1920
Georgina Eaton, Kylie Hill, A. Summerfield
Objectives: The Ministry of Justice’s pioneering data linking programme Data First, funded by Administrative Data Research UK, links administrative datasets across the justice system and with other government departments to enable research providing critical new insights on justice system users, their pathways, and outcomes across a range of public services.
Approach: The first two datasets shared under the Data First project are magistrates’ courts and Crown Court data, which have been deidentified, deduplicated and linked to provide a joined-up picture of criminal court defendant and case journeys. Accredited researchers can access these data through the ONS Secure Research Service to conduct research. Administrative Data Research UK has funded four Research Fellows to conduct analysis using these linked data. Additionally, analysts within the Ministry of Justice Data First team have published a research report showcasing the potential of the linked criminal courts data. The presentation will primarily focus on this work.
Results: The Data First criminal courts datasets have enabled, for the first time, the extent and nature of repeat users to be explored at scale for research. In March 2022, the Ministry of Justice published exploratory analysis of returning defendants and the potential of linked criminal courts data. The key findings of this report will be covered in the presentation. The research demonstrates that more than half of defendants returned to the courts within the data period, and that the proportion was highest for specific offence groups, including theft, robbery and drug offences. Locality-based analysis of Crown Court defendants highlights important insights on the backgrounds of justice system users, showing an over-representation of defendants residing in the most deprived areas in England and Wales compared to the general population.
Conclusion: The presentation will demonstrate how linked administrative data available through the ground-breaking Data First programme can be effectively used for research. This insight improves our understanding of individuals in the justice system as well as providing a rich resource to develop the evidence base for government policy and practice.
Citations: 2
The impact of COVID-19 pandemic on trends in the recorded incidence of Long-Term Conditions identified from routine electronic health records between 2000 and 2021 in Wales: a population data linkage study.
Q3 HEALTH CARE SCIENCES & SERVICES Pub Date : 2022-08-25 DOI: 10.23889/ijpds.v7i3.2011
Cathy Qi, T. Osborne, R. Bailey, J. Hollinghurst, A. Akbari, A. Cooper, Ruth Crowder, H. Peters, R. Law, Anthony Davies, R. Lewis, Mark C Walker, Adrian Edwards, R. Lyons
Background: The COVID-19 pandemic has resulted in delayed diagnosis and treatment for cancer patients and increases in elective surgery waiting lists. The impact on other ‘long-term’ conditions (LTCs) is unclear. We examined the effects of the pandemic on the recorded incidence of 20 LTCs to inform decisions on treatment pathways and resource allocation.
Approach: We included Welsh residents diagnosed with any of 20 LTCs for the first time between 2000 and 2021. Data were accessed and analysed within the Secure Anonymised Information Linkage (SAIL) Databank. The primary aim was to assess the impact of the COVID-19 pandemic on trends in recorded incidence. Secondarily, we examined incidence by socio-demographic and clinical subgroups: age, sex, deprivation quintile, ethnicity, frailty score and learning disability. Incidence was presented as monthly rates for each LTC. We performed interrupted time series analyses to estimate the immediate and long-term change in rates following the pandemic, and the size of the undiagnosed population.
Results: We included 2,206,070 individuals diagnosed with at least one LTC. An immediate reduction in the recording of new diagnoses was observed in April 2020 across all 20 LTCs, followed by a gradual recovery towards pre-pandemic levels over the next 18 months, though at different rates across conditions. The largest differences between observed and expected incidence (as predicted using pre-pandemic trends) between January 2020 and June 2021 were in the diagnoses of COPD (-43%, 95% CI (-50%, -34%)), asthma, hypertension and depression, and the smallest differences were in Type 1 diabetes, dementia, stroke and TIA (-8%, 95% CI (-19%, 5%)). Differences in the proportions of incidence by socio-demographic and clinical subgroups in the years preceding and following the pandemic have also been analysed (results to be finalised).
Conclusion: There was an abrupt reduction in the observed incidence of all 20 LTCs after March 2020, followed by a gradual recovery over subsequent months towards pre-pandemic levels. Of the 20 LTCs, 15 strongly indicate a reservoir of as yet undiagnosed patients. The results from this study will have implications for resource allocation.
Citations: 0
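The interrupted time series analysis mentioned in the abstract above can be sketched as a segmented linear regression. The Python below is a simplified illustration only: it assumes a linear pre-pandemic trend, an immediate level change and a post-interruption slope change, and it ignores seasonality and autocorrelation, which a real analysis would normally need to address. The abstract does not give the authors' model specification, and the function name `fit_its` and the toy series are hypothetical.

```python
import numpy as np


def fit_its(rates, interruption_index):
    """Fit rate = b0 + b1*t + b2*post + b3*(t - t0)*post by least squares.

    rates              : monthly incidence rates, in time order
    interruption_index : index t0 of the first post-interruption month
    """
    rates = np.asarray(rates, dtype=float)
    t = np.arange(len(rates), dtype=float)
    post = (t >= interruption_index).astype(float)   # 1 after the interruption
    since = (t - interruption_index) * post          # months since interruption
    design = np.column_stack([np.ones_like(t), t, post, since])
    coef, *_ = np.linalg.lstsq(design, rates, rcond=None)

    # Counterfactual: the pre-interruption trend extended forward.
    expected = coef[0] + coef[1] * t
    # Rough proxy for missed diagnoses: expected minus observed, summed
    # over the post-interruption months.
    shortfall = float(np.sum((expected - rates)[post == 1]))
    return {
        "baseline_level": float(coef[0]),
        "baseline_slope": float(coef[1]),
        "level_change": float(coef[2]),
        "slope_change": float(coef[3]),
        "shortfall": shortfall,
    }


# Noise-free toy series (invented): 24 pre-interruption months, then a
# drop of 40 followed by a recovery of 2 per month.
months = np.arange(36)
after = (months >= 24).astype(float)
toy_rates = 100 + 0.5 * months - 40 * after + 2 * (months - 24) * after

result = fit_its(toy_rates, interruption_index=24)
print(result)
```

In this parameterisation `level_change` corresponds to the immediate effect and `slope_change` to the longer-term change in trend, while `shortfall` compares observed rates against the extrapolated pre-interruption trend.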
Anti-asthmatic prescriptions in children with and without congenital anomalies: a European data linkage study.
Q3 HEALTH CARE SCIENCES & SERVICES Pub Date : 2022-08-25 DOI: 10.23889/ijpds.v7i3.1889
Natalie Divin, E. Garne, J. Morris, M. Loane
Objectives: Asthma is the most common chronic disease in childhood, yet little is known about rates of asthma and wheezing in children with congenital anomalies. This study explored the prevalence and risk of receiving anti-asthmatic prescriptions in children with congenital anomalies compared to children without anomalies in six European regions/countries.
Approach: This was a EUROlinkCAT population-based linkage cohort study involving children from 0-9 years of age born between 2000 and 2014. Congenital anomaly data from six EUROCAT registries were linked to births data in national/vital statistics and to electronic prescription databases. Prescription/pharmacy dispensing records across regions were standardised to a Common Data Model. Anatomical Therapeutic Chemical classification codes beginning with R03 were used to identify anti-asthmatic prescriptions. Random-effects meta-analyses were performed to identify both the relative risk (RR) of receiving >1 anti-asthmatic prescription in a year relative to the reference group, and the heterogeneity of prevalence rates across registries and age groups.
Results: A total of 5.1% of children with congenital anomalies and 4.9% of reference children were dropped from the study as they were not linked. Children with congenital anomalies (n=60,662) had a higher prevalence of >1 anti-asthmatic prescription and a significantly higher risk of being prescribed anti-asthmatics (RR=1.41, 95% CI 1.35-1.48) compared to reference children (n=1,722,912). The increased risk was consistent across all age groups. Children with congenital anomalies were more likely to be prescribed beta-2 agonists (RR=1.71, 95% CI 1.60-1.83) and inhaled corticosteroids (RR=1.74, 95% CI 1.61-1.87). Children with oesophageal atresia, diaphragmatic hernia, genetic syndromes and chromosomal anomalies had over twice the risk of being prescribed anti-asthmatics compared to reference children. Regional differences in prevalence and risk of anti-asthmatic prescriptions were identified.
Conclusion: Children aged <10 years with congenital anomalies consistently had higher prevalence and risk of receiving >1 anti-asthmatic prescription across age groups and across European regions. This study demonstrates that information on the prevalence of anti-asthmatic prescriptions issued/dispensed can be obtained through data linkage to monitor changes in prevalence over time.
Citations: 0
Feasibility of evaluating COVID-19 vaccine effectiveness and variant severity through a routine health information exchange.
Q3 HEALTH CARE SCIENCES & SERVICES Pub Date : 2022-08-25 DOI: 10.23889/ijpds.v7i3.2066
A. Boulle, A. Heekes, H. Hussey, Reshna Kassanjee, M. Davies
Objectives: To date, South Africa has experienced four distinct COVID-19 waves due to the ancestral, Beta, Delta and Omicron SARS-CoV-2 variants. We sought to answer pertinent public health questions in a timely manner as new COVID-19 variants emerge(d), using routine health service data linked through a service-facing health information exchange (HIE).
Approach: A population cohort was defined amongst regular health service users in the Western Cape Province of South Africa, based on recent utilisation of public sector services as reflected in the Provincial Health Data Centre (PHDC), which functions as an HIE. Infection, hospitalisation and mortality data were derived from routinely linked laboratory, service and national vital registration data sources. Serology results from residual specimens of patients monitored for HIV and diabetes treatment progress were linked to the PHDC, as were vaccination data from the national vaccination information system. A single linked and de-identified dataset was exported for analysis.
Results: Based on accessing services in the preceding 3 years, a cohort of 3.5 million adult patients could be enumerated and linked to co-morbidity and SARS-CoV-2 outcome data. Serology from 16,000 specimens spread across the three inter-wave periods, and vaccine data from amongst the 5 million vaccine doses given in the Province, could also be linked. Variants could be identified by wave or by PCR assay target anomalies during cross-over periods. Publishable variant severity analyses were feasible from the sub-cohort of patients with diagnosed COVID-19, and variant-specific vaccine effectiveness was assessable amongst cases, in the population cohort, and in patients with HIV. The impact of prior infection and the marginal value of vaccination in those with prior infection were assessable within the serology sub-cohort.
Conclusion: A single linked, de-identified dataset derived from an operational HIE was able to quickly address critical public health questions related to COVID-19 variants in a privacy-preserving manner.
Citations: 0
A Policy Impact Case Study Using Real World Data from Welsh Government Fuel Poverty Schemes to Inform Scheme Design.
Q3 HEALTH CARE SCIENCES & SERVICES Pub Date : 2022-08-25 DOI: 10.23889/ijpds.v7i3.1894
Sarah E. Lowe, S. Morrison-Rees
Objectives: To reduce fuel poverty in Wales, the Welsh Government developed schemes to provide energy efficiency improvements to lower-income households. To inform scheme design, health impacts were investigated by linking scheme data to health records. Presented objective: to demonstrate how research findings using real-world data can impact policy focus.
Approach: The research was conducted by an independent researcher at Swansea University who co-produced research questions with the Welsh Government Fuel Poverty Policy Team. A longitudinal dataset was created by linking anonymised ‘Warm Homes: Nest’ improvements data to residents’ routine health records in the SAIL Databank at Swansea University. We examined recipient health before and after intervention, compared with controls. A high-level policy briefing and research report were published in the Welsh Government Social Research – Analysis for Policy series. Findings were used to design and pilot new eligibility criteria to capture low-income individuals with a respiratory, circulatory or mental health condition.
Results: This presentation will describe the policy impact pathway from initial discussions with policymakers to real-world change, including:
securing ESRC funding for a Knowledge Transfer Fellowship, which included a 2013 data linking demonstration project…
…which allowed funding to be secured for a 2015-18 research project focused on the impact of improvements on recipient health…
…which published emerging findings in 2016…
…and substantive findings in 2017, showing a significant positive impact of improvements on recipient health…
…which policymakers used to design a pilot to test ways to widen eligibility criteria to include individuals on a low income with a respiratory, circulatory or mental health condition…
…which led to scheme criteria being widened in 2019.
By 2021, 25% of recipients entered the scheme via the ‘health route’.
Conclusion: By delivering research findings generated using linked real-world data, and focused on questions co-produced with policymakers, researchers can successfully influence the design and implementation of government policy, thereby improving the lives of people in the real world - in this case, the health of the citizens of Wales.
Citations: 0
Analysing Siamese Neural Network Architectures for Computing Name Similarity.
Q3 HEALTH CARE SCIENCES & SERVICES Pub Date : 2022-08-25 DOI: 10.23889/ijpds.v7i3.2077
Nicholas Vinden, Jérémy Foxcroft, L. Antonie
Introduction: A wide assortment of string similarity measures can be used to determine how similar two names are. A diverse set of discriminating and independent features for name similarity is important for classification during record linkage. A Siamese neural network could surpass traditional string similarity measures for the name similarity problem.
Objectives and Approach: This research aims to compare a classifier based on the Siamese network architecture with a Random Forest classifier. In addition to comparing overall performance, we seek to answer whether there are any special properties of certain matching name pairs where the complexity of the Siamese network offers particular benefit. Our data consist of 25,000 last name pairings, with each pair being two variants of a family name. Name similarity predictions from the Siamese network are compared to a Random Forest model that serves as an ensemble of existing string similarity measures.
Results: We compare the similarity scores yielded by the two methods and discuss the results. We describe how names are represented for each method; name representation is computed formulaically for the traditional measures but is learned by the Siamese network during training. The methods are compared in terms of both the quality of their similarity predictions and the computational cost of generating them. As expected, the Siamese network requires a significant computational cost to train. Unexpectedly, the ensemble of traditional measures yields almost identical overall classification performance. However, we expect that further analysis of false positives and false negatives will yield some insight into when practitioners should consider one method over the other.
Conclusions/Implications: Results suggest that there may be instances where a Siamese network outperforms other similarity measures, although training a Siamese network comes at a considerable computational cost. It is worth considering this approach to name similarity as an additional similarity feature when performing record linkage tasks.
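To make the "ensemble of existing string similarity measures" concrete, the following is a minimal sketch of the kind of formulaic features a classifier such as a Random Forest could consume for one name pair. The abstract does not say which measures the authors used, so the three features here (normalised edit similarity, bigram Jaccard overlap, shared initial) and all function names are illustrative assumptions, not the paper's feature set.

```python
from typing import Dict, Set


def levenshtein(a: str, b: str) -> int:
    """Edit distance computed with the standard dynamic-programming recurrence."""
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(previous[j] + 1,          # delete from a
                               current[j - 1] + 1,       # insert into a
                               previous[j - 1] + cost))  # substitute
        previous = current
    return previous[-1]


def bigrams(s: str) -> Set[str]:
    """Set of adjacent character pairs in s."""
    return {s[i:i + 2] for i in range(len(s) - 1)}


def jaccard_bigram(a: str, b: str) -> float:
    """Jaccard overlap of the two bigram sets (1.0 when both sets are empty)."""
    ga, gb = bigrams(a), bigrams(b)
    if not ga and not gb:
        return 1.0
    return len(ga & gb) / len(ga | gb)


def name_pair_features(a: str, b: str) -> Dict[str, float]:
    """Hypothetical feature vector for one name pair (assumed, not from the paper)."""
    a, b = a.lower(), b.lower()
    longest = max(len(a), len(b)) or 1  # avoid division by zero for empty names
    return {
        "norm_edit_sim": 1.0 - levenshtein(a, b) / longest,
        "jaccard_bigram": jaccard_bigram(a, b),
        "same_initial": float(a[:1] == b[:1]),
    }
```

Calling `name_pair_features("Smith", "Smyth")` returns one row of features; stacking such rows for all 25,000 pairs would give a matrix that a tree ensemble could be trained on, whereas a Siamese network would learn its own representation from the raw characters.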
Citations: 0
Care trajectory in homes care users across mortality-risk profiles: an observational study.
Q3 HEALTH CARE SCIENCES & SERVICES Pub Date : 2022-08-25 DOI: 10.23889/ijpds.v7i3.1847
Maya Murmann, D. Manuel, P. Tanuseputro, C. Bennett, M. Pugliese, Rhiannon Roberts, Wenshan Li, A. Hsu
Objectives: RESPECT is a prognostic tool, developed using linked population-based data, to predict 6-month mortality in community-dwelling older adults. RESPECT is implemented and openly accessible as a web-based tool on ProjectBigLife.ca, where over 700,000 calculations have been performed to date. Our objective was to describe healthcare utilization patterns among home care (HC) users across mortality risk profiles generated from RESPECT, to inform care planning for older persons who have varying mortality risks and levels of care needs as they decline.
Approach: We conducted a retrospective cohort study examining healthcare use among HC users in Ontario, Canada, who received at least one interRAI HC assessment between April 2018 and September 2019. Using linked health administrative data at the individual level, we examined the use of acute care (hospitalizations and emergency department (ED) visits), long-term care (LTC), and palliative home care within 6 months of each assessment and prognostication using RESPECT. Mortality risk profiles from RESPECT were created based on the median survival.
Results: The cohort comprised 247,377 community-dwelling older adults; 14.3% died within 6 months of an assessment. Among decedents, half (51.51%) of HC users with a predicted median survival of less than 3 months received at least one palliative care home visit; 39.17%, 34.82% and 13.84% visited the ED, were hospitalized, or were admitted to LTC, respectively. The proportion of assessments followed by at least one palliative HC visit declined to 43.11% among assessments with a median survival between 3 and 6 months and to 30.28% among those between 6 and 12 months. The proportion of assessments followed by acute care use increased with increasing median survival.
Conclusion: A considerable proportion of people at the end of life do not receive any palliative home care and continue to be institutionalized. This may indicate that the reduced life expectancies and palliative care needs of many older adults are not being recognized, thus demonstrating the value of prognostic models like RESPECT to inform care planning for individuals in their final years of life.
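The grouping of assessments into mortality risk profiles can be sketched as a simple binning of predicted median survival. The three bands below come from the results reported above; the handling of exact boundary values and the catch-all band of 12 months or more are assumptions made for illustration, and the function name is hypothetical rather than part of RESPECT.

```python
def risk_profile(median_survival_months: float) -> str:
    """Assign an assessment to a risk band by predicted median survival.

    Bands below 12 months follow the abstract; boundary handling
    (half-open intervals) and the final band are assumptions.
    """
    if median_survival_months < 0:
        raise ValueError("median survival cannot be negative")
    if median_survival_months < 3:
        return "<3 months"
    if median_survival_months < 6:
        return "3-6 months"
    if median_survival_months < 12:
        return "6-12 months"
    return ">=12 months"  # assumed catch-all band, not stated in the abstract
```

With each assessment labelled this way, the reported proportions (for example, the share followed by a palliative home care visit) are simply per-band means of a yes/no indicator.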
Citations: 0
Exploring Goldstein et al.’s Scalelink method of data linkage.
Q3 HEALTH CARE SCIENCES & SERVICES Pub Date : 2022-08-25 DOI: 10.23889/ijpds.v7i3.2042
M. A. M. Cleaton, Josie Plachta, R. Shipsey
Objectives: Scalelink is an innovative probabilistic data linkage method based on correspondence analysis. Unlike the widely used Fellegi-Sunter algorithm, it does not assume that linkage variables are independent. It also claims to be more intuitive and computationally efficient. We aim to test this method for the first time on real-world big data.
Approach: Scalelink uses agreement states for each linkage variable and candidate pair. These are compared to determine how frequently, across all candidate pairs, any given agreement state is held at the same time as any other agreement state (this accounts for variable dependence). The results of this comparison are input into a loss function, and this function is minimised within constraints to produce weights. Currently, the method is accessible via Goldstein et al.’s paper and R package. We are translating it into PySpark to enable testing on datasets that are too large to link without using distributed computing.
Results: Initial testing of Goldstein et al.’s Scalelink method on small samples of real-world datasets shows that it performs as expected for a probabilistic linkage method, although it cannot currently deal with missingness. To test the quality of the method on real-world big data, a high-quality linked dataset of the 2021 England and Wales Census and the follow-up Census Coverage Survey will be used as a Gold Standard (GS). After developing a method that enables Scalelink to deal with missingness, we will apply Scalelink and automatic Fellegi-Sunter probabilistic linkage to this GS. We can thus establish and compare the precision and recall of both methods. We will also investigate linkage bias for particular demographics, test computational efficiency and estimate the clerical review burden for each method.
Conclusion: Goldstein et al.’s Scalelink algorithm shows promise as a high-quality, scalable, dependence-free linkage algorithm for use in any matching project. Here, for the first time, we research the method’s quality and feasibility with real-world big data. From this we will produce recommendations regarding its utility.
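The comparison step described in the Approach, counting how often one agreement state is held together with another across all candidate pairs, can be illustrated with a small sketch. This is not Goldstein et al.'s R package or the authors' PySpark translation: the data layout, the variable names and the function `cooccurrence_counts` are assumptions, and the subsequent loss minimisation that turns these counts into weights is not shown.

```python
from collections import Counter
from itertools import combinations
from typing import Dict, List

# One candidate pair: linkage variable -> agreement state (hypothetical layout).
AgreementVector = Dict[str, str]


def cooccurrence_counts(pairs: List[AgreementVector]) -> Counter:
    """Count how often two (variable, state) combinations occur in the same pair."""
    counts: Counter = Counter()
    for vector in pairs:
        # Sort by variable name so each unordered combination has one canonical key.
        states = sorted(vector.items())
        for left, right in combinations(states, 2):
            counts[(left, right)] += 1
    return counts


# Toy candidate pairs, invented purely for illustration.
candidate_pairs = [
    {"forename": "agree", "surname": "agree", "dob": "agree"},
    {"forename": "disagree", "surname": "agree", "dob": "agree"},
    {"forename": "disagree", "surname": "disagree", "dob": "disagree"},
]
counts = cooccurrence_counts(candidate_pairs)
```

Because the counts are taken jointly over pairs of variables, dependence between, say, surname and date-of-birth agreement shows up directly in the table, which is the information an independence-assuming method would discard.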
Citations: 0