Pub Date : 2022-08-25 DOI: 10.23889/ijpds.v7i3.1833
F. Cavallaro, R. Cannings‐John, F. Lugg-Widger, J. H. van der Meulen, R. Gilbert, E. Kennedy, M. Robling, Hywel Jones
Objectives: We describe the challenges and lessons learned from two studies using linked administrative data from the health, education and social care sectors to evaluate the Family Nurse Partnership (FNP), an intervention supporting adolescent mothers in England (E) and Scotland (S). We present recommendations for studies using linked administrative data to evaluate complex interventions.
Approach: We constructed two cohorts of all mothers aged 13-19 giving birth in NHS hospitals in England and Scotland between 2010 and 2016/17 by linking mothers and babies in hospital admissions data (E: Hospital Episode Statistics; S: Maternity Inpatient and Day Case), and identified FNP participation through linkage to FNP programme data. We additionally linked to health, educational and social care data for mothers and their babies (E: National Pupil Database; S: eDRIS). We used these data to identify key risk factors for enrolment in the FNP, assess the effect of the FNP on maternal and child outcomes, and determine programme characteristics modifying the effect of the FNP.
Results: Key challenges: characterising the intervention and usual care, understanding the quality of multi-sector data linkage, data access delays, constructing appropriate comparator groups and interpreting outcomes captured in administrative data. Lessons learned: evaluations require detailed data on intervention activity (dates/geography) and an assessment of usual care, which are rarely readily available and are time-consuming to gather; data linkage quality is variable or not reported, which makes defining denominators challenging; data access delays reduced the time available for data analysis; unmeasured confounders not captured in administrative data may prevent generation of an appropriate comparator group. Recommendations: characteristics informing targeting should be explicitly documented, and could be enhanced using linked primary care data and information on household members (e.g. fathers). Process evaluation and qualitative research could help to provide a better understanding of mechanisms of effect.
Conclusion: Linkage of administrative data presents exciting opportunities for efficient evaluation of large-scale, complex public health interventions. However, sufficient information is needed on programme meta-data, targeting and important confounders in order to generate meaningful results. Study findings should help stimulate exploration with practitioners about how programmes can be improved.
Title : Challenges and lessons learned from two countries using linked administrative data to evaluate the Family Nurse Partnership. Journal : International Journal of Population Data Science
Pub Date : 2022-08-25 DOI: 10.23889/ijpds.v7i3.2079
Yu Deng, Lacey P. Gleason, Adam Culbertson, Don Asmonga, S. Grannis, A. Kho
Objectives: Patient matching rates between organizations can be as low as fifty percent. Challenges to matching include variation in the quality and availability of patient attributes. Here we describe the changing nature of the patient attributes available over the past 11 years across a diversity of care settings in the United States.
Approach: Our expert panel identified 64 patient attributes that are currently used, or could potentially be candidates, for patient matching. We identified a national sample of 14 health care sites that sent us aggregated information on the 64 patient attributes from 2010 to 2020 (inclusive). The information included overall counts and percent availability, counts and percent availability by race, and counts and availability by year. Only patients who had at least one visit to the site since 2010 and who were between 18 and 89 years of age at the time of extraction were included.
Results: The aggregated results revealed that first name, last name, gender, postal code, and date of birth are highly available (>90%) across healthcare organizations and time. Patient-reported social security number, work phone number, and emergency contact declined markedly, potentially reflecting privacy concerns. Email addresses (from 18.0% to 63.7%) and phone numbers (from 14.7% to 69.4%) increased greatly over the past 11 years. Novel patient matching attributes such as blood type, facial image, thumb print, or eye color were rarely collected at any site in any year. We observed emerging attributes including sexuality, occupation, and nickname, with a small number of sites collecting these at over 70% availability, reflecting the feasibility of wider adoption in the future.
Conclusion: In this study, we examined the availability of 64 patient attributes across 14 sites from 2010 to 2020. Our findings could inform policy makers and readers about patient attributes that are used for current patient matching and emerging data attributes that could be considered for incorporation into future matching algorithms.
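The "percent availability by year" measure described in the Approach can be illustrated with a minimal sketch. The function names and the counts below are hypothetical: the abstract reports only percentages, so the denominators here are invented purely so that the arithmetic reproduces the reported email-address figures (18.0% and 63.7%). This is not the authors' code.

```python
def percent_availability(populated: int, total: int) -> float:
    """Share of patient records in which an attribute is populated, as a percentage."""
    if total <= 0:
        raise ValueError("total must be a positive patient count")
    return round(100.0 * populated / total, 1)


def availability_by_year(counts: dict) -> dict:
    """Map each year to percent availability.

    `counts` maps year -> (records with the attribute populated, all records).
    """
    return {
        year: percent_availability(populated, total)
        for year, (populated, total) in sorted(counts.items())
    }


# Hypothetical aggregated counts for one attribute (email address) at one site.
email_counts = {2010: (180, 1000), 2020: (637, 1000)}
email_trend = availability_by_year(email_counts)
```

Because sites shared only aggregated counts, a calculation of this kind can be run without access to any patient-level record.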
Title : The changing nature of patient attributes available for matching. Journal : International Journal of Population Data Science
Pub Date : 2022-08-25 DOI: 10.23889/ijpds.v7i3.1920
Georgina Eaton, Kylie Hill, A. Summerfield
Objectives: The Ministry of Justice’s pioneering data linking programme Data First, funded by Administrative Data Research UK, links administrative datasets across the justice system and with other government departments to enable research providing critical new insights on justice system users, their pathways, and outcomes across a range of public services.
Approach: The first two datasets shared under the Data First project are magistrates’ courts and Crown Court data, which have been de-identified, deduplicated and linked to provide a joined-up picture of criminal court defendant and case journeys. Accredited researchers can access these data through the ONS Secure Research Service. Administrative Data Research UK has funded four Research Fellows to conduct analysis using the linked data. Additionally, analysts within the Ministry of Justice Data First team have published a research report showcasing the potential of the linked criminal courts data. The presentation will primarily focus on this work.
Results: The Data First criminal courts datasets have enabled, for the first time, the extent and nature of repeat users to be explored at scale for research. In March 2022, the Ministry of Justice published exploratory analysis of returning defendants and the potential of linked criminal courts data. The key findings of this report will be covered in the presentation. The research shows that more than half of defendants returned to the courts within the data period, with the highest rates in specific offence groups, including theft, robbery and drug offences. Locality-based analysis of Crown Court defendants highlights important insights on the backgrounds of justice system users, showing an over-representation of defendants residing in the most deprived areas in England and Wales compared to the general population.
Conclusion: The presentation will demonstrate how linked administrative data available through the ground-breaking Data First programme can be effectively used for research. This insight improves our understanding of individuals in the justice system and provides a rich resource to develop the evidence base for government policy and practice.
Title : Data First: Criminal Courts Linked Data research report. Journal : International Journal of Population Data Science
Pub Date : 2022-08-25 DOI: 10.23889/ijpds.v7i3.2011
Cathy Qi, T. Osborne, R. Bailey, J. Hollinghurst, A. Akbari, A. Cooper, Ruth Crowder, H. Peters, R. Law, Anthony Davies, R. Lewis, Mark C Walker, Adrian Edwards, R. Lyons
Background: The COVID-19 pandemic has resulted in delayed diagnosis and treatment for cancer patients and increases in elective surgery waiting lists. The impact on other ‘long-term’ conditions (LTCs) is unclear. We examined the effects of the pandemic on the recorded incidence of 20 LTCs to inform decisions on treatment pathways and resource allocation.
Approach: We included Welsh residents diagnosed with any of 20 LTCs for the first time between 2000 and 2021. Data were accessed and analysed within the Secure Anonymised Information Linkage (SAIL) Databank. The primary aim was to assess the impact of the COVID-19 pandemic on trends in recorded incidence. Secondarily, we examined incidence by socio-demographic and clinical subgroups: age, sex, deprivation quintile, ethnicity, frailty score and learning disability. Incidence was presented as monthly rates for each LTC. We performed interrupted time series analyses to estimate the immediate and long-term change in rates following the pandemic, and the size of the undiagnosed population.
Results: We included 2,206,070 individuals diagnosed with at least one LTC. An immediate reduction in the recording of new diagnoses was observed in April 2020 across all 20 LTCs, followed by a gradual recovery towards pre-pandemic levels over the next 18 months, though at different rates across conditions. The largest differences between observed and expected (as predicted using pre-pandemic trends) incidence between January 2020 and June 2021 were in the diagnoses of COPD (-43%, 95% CI (-50%, -34%)), asthma, hypertension and depression, and the smallest differences were in Type 1 diabetes, dementia, stroke and TIA (-8%, 95% CI (-19%, 5%)). Differences in the proportions of incidence by socio-demographic and clinical subgroups in the years preceding and following the pandemic have also been analysed (results to be finalised).
Conclusion: There was an abrupt reduction in the observed incidence of all 20 LTCs after March 2020, followed by a gradual recovery over subsequent months towards pre-pandemic levels. Of the 20 LTCs, 15 strongly indicate a reservoir of as yet undiagnosed patients. The results from this study will have implications for resource allocation.
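The comparison of observed incidence with the incidence "expected" from pre-pandemic trends can be sketched as follows. This is a deliberately simplified illustration, not the authors' interrupted time series model: it assumes a plain linear pre-pandemic trend with no seasonality or autocorrelation, produces no confidence interval, and all function names and input values are hypothetical.

```python
from statistics import mean


def fit_linear_trend(values):
    """Least-squares line through the points (0, values[0]), (1, values[1]), ...

    Returns (intercept, slope).
    """
    n = len(values)
    if n < 2:
        raise ValueError("need at least two pre-period observations")
    t_mean = (n - 1) / 2
    y_mean = mean(values)
    sxx = sum((t - t_mean) ** 2 for t in range(n))
    sxy = sum((t - t_mean) * (y - y_mean) for t, y in enumerate(values))
    slope = sxy / sxx
    return y_mean - slope * t_mean, slope


def percent_difference(pre_rates, post_rates):
    """Percent difference between observed and trend-projected totals.

    The trend is fitted on `pre_rates` only and extrapolated over the months
    covered by `post_rates`; a negative result means fewer diagnoses than expected.
    """
    intercept, slope = fit_linear_trend(pre_rates)
    start = len(pre_rates)
    expected = sum(intercept + slope * (start + i) for i in range(len(post_rates)))
    observed = sum(post_rates)
    return 100.0 * (observed - expected) / expected


# Hypothetical monthly incidence rates: a steady pre-period trend, then a
# post-period in which only half of the projected diagnoses are recorded.
pre = [100 + 2 * t for t in range(12)]
post = [0.5 * (100 + 2 * t) for t in range(12, 18)]
shortfall = percent_difference(pre, post)
```

The gap between the projected and observed totals is also what an estimate of the undiagnosed population rests on, which is why the choice of pre-pandemic trend model matters.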
Title : The impact of COVID-19 pandemic on trends in the recorded incidence of Long-Term Conditions identified from routine electronic health records between 2000 and 2021 in Wales: a population data linkage study. Journal : International Journal of Population Data Science
Pub Date : 2022-08-25 DOI: 10.23889/ijpds.v7i3.1889
Natalie Divin, E. Garne, J. Morris, M. Loane
Objectives: Asthma is the most common chronic disease in childhood, yet little is known about rates of asthma and wheezing in children with congenital anomalies. This study explored the prevalence and risk of receiving anti-asthmatic prescriptions in children with congenital anomalies compared to children without anomalies in six European regions/countries.
Approach: This was a EUROlinkCAT population-based linkage cohort study involving children aged 0-9 years born between 2000 and 2014. Congenital anomaly data from six EUROCAT registries were linked to births data in national/vital statistics and to electronic prescription databases. Prescription/pharmacy dispensing records across regions were standardised to a Common Data Model. Anatomical Therapeutic Chemical classification codes beginning with R03 were used to identify anti-asthmatic prescriptions. Random-effects meta-analyses were performed to estimate both the relative risk (RR) of receiving >1 anti-asthmatic prescription in a year relative to the reference group, and the heterogeneity of prevalence rates across registries and age groups.
Results: A total of 5.1% of children with congenital anomalies and 4.9% of reference children were dropped from the study because they could not be linked. Children with congenital anomalies (n=60,662) had a higher prevalence of >1 anti-asthmatic prescription and a significantly higher risk of being prescribed anti-asthmatics (RR=1.41, 95% CI 1.35-1.48) compared to reference children (n=1,722,912). The increased risk was consistent across all age groups. Children with congenital anomalies were more likely to be prescribed beta-2 agonists (RR=1.71, 95% CI 1.60-1.83) and inhaled corticosteroids (RR=1.74, 95% CI 1.61-1.87). Children with oesophageal atresia, diaphragmatic hernia, genetic syndromes and chromosomal anomalies had over twice the risk of being prescribed anti-asthmatics compared to reference children. Regional differences in prevalence and risk of anti-asthmatic prescriptions were identified.
Conclusion: Children aged <10 years with congenital anomalies consistently had a higher prevalence and risk of receiving >1 anti-asthmatic prescription across age groups and across European regions. This study demonstrates that information on the prevalence of anti-asthmatic prescriptions issued/dispensed can be obtained through data linkage to monitor changes in prevalence over time.
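Two steps named in the Approach, flagging anti-asthmatic prescriptions by ATC code and computing a relative risk, can be sketched as below. This is a sketch under stated assumptions, not the study's code: it shows an unpooled RR for a single hypothetical registry with a standard log-scale 95% confidence interval, whereas the study pooled registries with random-effects meta-analysis. The function names and counts are invented for illustration.

```python
import math


def is_anti_asthmatic(atc_code: str) -> bool:
    """True if the ATC code falls under R03, the prefix used in the abstract."""
    return atc_code.strip().upper().startswith("R03")


def relative_risk(exposed_events, exposed_total, reference_events, reference_total):
    """Relative risk with a 95% CI computed on the log scale.

    Returns (rr, lower, upper). All four counts must be positive.
    """
    counts = (exposed_events, exposed_total, reference_events, reference_total)
    if min(counts) <= 0:
        raise ValueError("all counts must be positive")
    rr = (exposed_events / exposed_total) / (reference_events / reference_total)
    # Standard error of log(RR) for two independent binomial proportions.
    se = math.sqrt(
        1 / exposed_events - 1 / exposed_total
        + 1 / reference_events - 1 / reference_total
    )
    lower = math.exp(math.log(rr) - 1.96 * se)
    upper = math.exp(math.log(rr) + 1.96 * se)
    return rr, lower, upper


# Hypothetical counts: 20 of 100 children with anomalies versus 10 of 100
# reference children received more than one anti-asthmatic prescription.
rr, lower, upper = relative_risk(20, 100, 10, 100)
```

A pooled analysis would combine the per-registry log relative risks and their standard errors, with weights that also account for between-registry variance.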
Title : Anti-asthmatic prescriptions in children with and without congenital anomalies: a European data linkage study. Journal : International Journal of Population Data Science
Pub Date : 2022-08-25 DOI: 10.23889/ijpds.v7i3.2066
A. Boulle, A. Heekes, H. Hussey, Reshna Kassanjee, M. Davies
Objectives: To date, South Africa has experienced four distinct COVID-19 waves due to the ancestral, Beta, Delta and Omicron SARS-CoV-2 variants. We sought to answer pertinent public health questions in a timely manner as new COVID-19 variants emerge(d), using routine health service data linked through a service-facing health information exchange (HIE).
Approach: A population cohort was defined amongst regular health service users in the Western Cape Province of South Africa based on recent utilisation of public sector services as reflected in the Provincial Health Data Centre (PHDC), which functions as an HIE. Infection, hospitalisation and mortality data were derived from routinely linked laboratory, service and national vital registration data sources. Serology results from residual specimens of patients monitored for HIV and diabetes treatment progress were linked to the PHDC, as were vaccination data from the national vaccination information system. A single linked and de-identified dataset was exported for analysis purposes.
Results: Based on accessing services in the preceding 3 years, a cohort of 3.5 million adult patients could be enumerated and linked to co-morbidity and SARS-CoV-2 outcome data. Serology from 16,000 specimens spread across the three inter-wave periods, and vaccine data from amongst the 5 million vaccine doses given in the Province, could also be linked. Variants could be identified by wave or by PCR assay target anomalies during cross-over periods. Publishable variant severity analyses were feasible from the sub-cohort of patients with diagnosed COVID-19, and variant-specific vaccine effectiveness was assessable amongst cases, in the population cohort, and in patients with HIV. The impact of prior infection and the marginal value of vaccination in those with prior infection were assessable within the serology sub-cohort.
Conclusion: A single linked de-identified dataset derived from an operational HIE was able to quickly address critical public health questions related to COVID-19 variants in a privacy-preserving manner.
Title : Feasibility of evaluating COVID-19 vaccine effectiveness and variant severity through a routine health information exchange. Journal : International Journal of Population Data Science
Pub Date : 2022-08-25 DOI: 10.23889/ijpds.v7i3.1894
Sarah E. Lowe, S. Morrison-Rees
Objectives: To reduce fuel poverty in Wales, the Welsh Government developed schemes to provide energy efficiency improvements to lower-income households. To inform scheme design, health impacts were investigated by linking scheme data to health records. The objective of this presentation is to demonstrate how research findings using real-world data can influence policy focus.
Approach: The research was conducted by an independent researcher at Swansea University who co-produced research questions with the Welsh Government Fuel Poverty Policy Team. A longitudinal dataset was created by linking anonymised ‘Warm Homes: Nest’ improvements data to residents’ routine health records in the SAIL Databank at Swansea University. We examined recipient health before and after intervention compared with controls. A high-level policy briefing and research report were published in the Welsh Government Social Research – Analysis for Policy series. Findings were used to design and pilot new eligibility criteria to capture low-income individuals with a respiratory, circulatory or mental health condition.
Results: This presentation will describe the policy impact pathway from initial discussions with policymakers to real-world change, including:
securing ESRC funding for a Knowledge Transfer Fellowship, which included a 2013 data linking demonstration project…
…which allowed funding to be secured for a 2015-18 research project focused on the impact of improvements on recipient health…
…which published emerging findings in 2016…
…and substantive findings in 2017, showing a significant positive impact of improvements on recipient health…
…which policymakers used to design a pilot to test ways to widen eligibility criteria to include individuals on a low income with a respiratory, circulatory or mental health condition…
…which led to scheme criteria being widened in 2019.
By 2021, 25% of recipients entered the scheme via the ‘health route’.
Conclusion: By delivering research findings generated using linked real-world data, and focused on questions co-produced with policymakers, researchers can successfully influence the design and implementation of government policy, thereby improving the lives of people in the real world - in this case, the health of the citizens of Wales.
{"title":"A Policy Impact Case Study Using Real World Data from Welsh Government Fuel Poverty Schemes to Inform Scheme Design.","authors":"Sarah E. Lowe, S. Morrison-Rees","doi":"10.23889/ijpds.v7i3.1894","DOIUrl":"https://doi.org/10.23889/ijpds.v7i3.1894","journal":{"name":"International Journal of Population Data Science"},"publicationDate":"2022-08-25","publicationTypes":"Journal Article"}
Pub Date : 2022-08-25 DOI: 10.23889/ijpds.v7i3.2077
Nicholas Vinden, Jérémy Foxcroft, L. Antonie
Introduction: A wide assortment of string similarity measures can be used to determine how similar two names are. A diverse set of discriminating and independent features for name similarity is important for classification during record linkage. A Siamese neural network could surpass traditional string similarity measures for the name similarity problem.
Objectives and Approach: This research aims to compare a classifier based on the Siamese network architecture with a Random Forest classifier. In addition to comparing overall performance, we seek to answer whether there are any special properties of certain matching name pairs where the complexity of the Siamese network offers particular benefit. Our data consist of 25,000 last name pairings, with each pair being two variants of a family name. Name similarity predictions from the Siamese network are compared to a Random Forest model that serves as an ensemble of existing string similarity measures.
Results: We compare the similarity scores yielded by the two methods and discuss the results. We describe how names are represented for each method; the representation is computed formulaically for the traditional measures but is learned by the Siamese network during training. The methods are compared both in terms of the quality of their similarity predictions and the computational cost of generating them. As expected, the Siamese network incurs a significant computational cost to train. Unexpectedly, the ensemble of traditional measures yields almost identical overall classification performance. However, we expect that further analysis of false positives and false negatives will yield some insight into when practitioners should consider one method over the other.
Conclusions/Implications: Results suggest that there may be instances where a Siamese network outperforms other similarity measures, although training a Siamese network comes at a considerable computational cost. It is worth considering this approach to name similarity as an additional similarity feature when performing record linkage tasks.
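To make the "ensemble of existing string similarity measures" concrete, the sketch below computes a small feature vector for one name pair. It is an illustration only: the abstract does not say which measures the authors used, so the three chosen here (normalised edit distance, bigram Jaccard overlap, and the `difflib` sequence ratio) and the function names are assumptions. A Random Forest, or any other classifier, could then be trained on such vectors.

```python
from difflib import SequenceMatcher


def levenshtein(a: str, b: str) -> int:
    """Edit distance via the classic row-by-row dynamic programme."""
    if len(a) < len(b):
        a, b = b, a
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]


def bigrams(s: str) -> set:
    """Character bigrams of s, padded so first/last letters count."""
    padded = f"#{s}#"
    return {padded[i:i + 2] for i in range(len(padded) - 1)}


def jaccard_bigrams(a: str, b: str) -> float:
    """Share of bigrams the two names have in common."""
    set_a, set_b = bigrams(a), bigrams(b)
    union = set_a | set_b
    return len(set_a & set_b) / len(union) if union else 1.0


def name_features(a: str, b: str) -> dict:
    """Hypothetical feature vector for one candidate name pair."""
    a, b = a.strip().lower(), b.strip().lower()
    longest = max(len(a), len(b))
    return {
        "edit_sim": 1.0 - levenshtein(a, b) / longest if longest else 1.0,
        "jaccard_bigram": jaccard_bigrams(a, b),
        "seq_ratio": SequenceMatcher(None, a, b).ratio(),
    }


if __name__ == "__main__":
    print(name_features("Smith", "Smyth"))
```

Each feature lies in [0, 1], with 1 meaning identical strings. By contrast, a Siamese network would learn its own representation of each name rather than relying on fixed formulas like these.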
{"title":"Analysing Siamese Neural Network Architectures for Computing Name Similarity.","authors":"Nicholas Vinden, Jérémy Foxcroft, L. Antonie","doi":"10.23889/ijpds.v7i3.2077","DOIUrl":"https://doi.org/10.23889/ijpds.v7i3.2077","journal":{"name":"International Journal of Population Data Science"},"publicationDate":"2022-08-25","publicationTypes":"Journal Article"}
Pub Date : 2022-08-25 DOI: 10.23889/ijpds.v7i3.1847
Maya Murmann, D. Manuel, P. Tanuseputro, C. Bennett, M. Pugliese, Rhiannon Roberts, Wenshan Li, A. Hsu
Objectives: RESPECT is a prognostic tool, developed using linked population-based data, to predict 6-month mortality in community-dwelling older adults. RESPECT is implemented and openly accessible as a web-based tool on ProjectBigLife.ca, where over 700,000 calculations have been performed to date. Our objective was to describe healthcare utilization patterns among home care (HC) users across mortality risk profiles generated from RESPECT to inform care planning for older persons who have varying mortality risks and levels of care needs as they decline.
Approach: We conducted a retrospective cohort study examining healthcare use among HC users in Ontario, Canada, who received at least one interRAI HC assessment between April 2018 and September 2019. Using linked health administrative data at the individual level, we examined the use of acute care (hospitalizations and emergency department (ED) visits), long-term care (LTC), and palliative home care within 6 months of each assessment and prognostication using RESPECT. Mortality risk profiles from RESPECT were created based on the median survival.
Results: The cohort comprised 247,377 community-dwelling older adults; 14.3% died within 6 months of an assessment. Among decedents, half (51.51%) of HC users with a predicted median survival of less than 3 months received at least one palliative care home visit; 39.17%, 34.82% and 13.84% visited the ED, were hospitalized, or were admitted to LTC, respectively. The proportion of assessments that received at least one palliative HC visit declined to 43.11% and 30.28% for assessments with a median survival between 3 and 6 months and between 6 and 12 months, respectively. The proportion of assessments with acute care use increased with increasing median survival.
Conclusion: A considerable proportion of people at the end of life do not receive any palliative home care and continue to be institutionalized. This may be an indication that the reduced life expectancies and palliative care needs of many older adults are not being recognized, thus demonstrating the value of prognostic models like RESPECT to inform care planning for individuals in their final years of life.
{"title":"Care trajectory in homes care users across mortality-risk profiles: an observational study.","authors":"Maya Murmann, D. Manuel, P. Tanuseputro, C. Bennett, M. Pugliese, Rhiannon Roberts, Wenshan Li, A. Hsu","doi":"10.23889/ijpds.v7i3.1847","DOIUrl":"https://doi.org/10.23889/ijpds.v7i3.1847","journal":{"name":"International Journal of Population Data Science"},"publicationDate":"2022-08-25","publicationTypes":"Journal Article"}
Pub Date : 2022-08-25 DOI: 10.23889/ijpds.v7i3.2042
M. A. M. Cleaton, Josie Plachta, R. Shipsey
Objectives: Scalelink is an innovative probabilistic data linkage method based on correspondence analysis. Unlike the popular and widely used Fellegi-Sunter algorithm, it does not assume linkage variable independence. It also claims to be more intuitive and computationally efficient. We aim to test this method for the first time on real-world big data.
Approach: Scalelink uses agreement states for each linkage variable and candidate pair. These are compared to determine how frequently, for all candidate pairs, any given agreement state is held at the same time as any other agreement state (this accounts for variable dependence). The results of this comparison are input into a loss function, and the minimisation of this function is derived within constraints to produce weights. Currently, the method is accessible via Goldstein et al.’s paper and R package. We are translating it into PySpark to enable testing on datasets that are too large to link without using distributed computing.
Results: Initial testing of Goldstein et al.’s Scalelink method on small samples of real-world datasets shows that it performs as expected for a probabilistic linkage method, although it cannot currently deal with missingness. To test the quality of the method on real-world big data, a high-quality linked dataset of the 2021 England and Wales Census and follow-up Census Coverage Survey will be used as a Gold Standard (GS). After developing a method that enables Scalelink to deal with missingness, we will apply Scalelink and automatic Fellegi-Sunter probabilistic linkage to this GS. We can thus establish and compare the precision and recall of both methods. We will also investigate linkage bias for particular demographics, test computational efficiency and estimate the clerical review burden for each method.
Conclusion: Goldstein et al.’s Scalelink algorithm shows promise as a high-quality, scalable, dependence-free linkage algorithm for use in any matching project. Here, for the first time, we research the method’s quality and feasibility with real-world big data. From this we will produce recommendations regarding its utility.
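The first step described in the Approach, counting how often pairs of agreement states are held together across candidate pairs, can be sketched as follows. This is not the Scalelink R package or the authors' PySpark translation, and it stops before the loss-function minimisation that produces the weights. The binary "agree"/"disagree" states, the function names and the toy records are assumptions made for illustration, and missing values are not handled.

```python
from collections import Counter
from itertools import combinations


def agreement_states(rec_a: dict, rec_b: dict, variables: list) -> tuple:
    """One (variable, state) entry per linkage variable for a candidate pair.

    Assumes binary states and no missing values (hypothetical simplification).
    """
    return tuple(
        (v, "agree" if rec_a[v] == rec_b[v] else "disagree") for v in variables
    )


def cooccurrence(candidate_pairs: list, variables: list) -> Counter:
    """Count how often each pair of agreement states is held simultaneously.

    Diagonal entries (s, s) give the marginal frequency of state s;
    off-diagonal entries are stored symmetrically.
    """
    counts = Counter()
    for rec_a, rec_b in candidate_pairs:
        states = agreement_states(rec_a, rec_b, variables)
        for s in states:
            counts[(s, s)] += 1
        for s, t in combinations(states, 2):
            counts[(s, t)] += 1
            counts[(t, s)] += 1
    return counts


if __name__ == "__main__":
    # Toy candidate pairs with two linkage variables.
    pairs = [
        ({"forename": "ann", "surname": "lee"}, {"forename": "ann", "surname": "lee"}),
        ({"forename": "ann", "surname": "lee"}, {"forename": "ann", "surname": "roe"}),
    ]
    table = cooccurrence(pairs, ["forename", "surname"])
    for key, n in sorted(table.items()):
        print(key, n)
```

Because the table records joint frequencies rather than only per-variable agreement rates, it carries the dependence between linkage variables that the Fellegi-Sunter independence assumption ignores.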
{"title":"Exploring Goldstein et al.’s Scalelink method of data linkage.","authors":"M. A. M. Cleaton, Josie Plachta, R. Shipsey","doi":"10.23889/ijpds.v7i3.2042","DOIUrl":"https://doi.org/10.23889/ijpds.v7i3.2042","journal":{"name":"International Journal of Population Data Science"},"publicationDate":"2022-08-25","publicationTypes":"Journal Article"}