
International Journal of Population Data Science: Latest Articles

Neonatal mortality in NHS maternity units by timing of birth and method of delivery: a retrospective linked cohort study.
Q3 HEALTH CARE SCIENCES & SERVICES Pub Date : 2022-08-25 DOI: 10.23889/ijpds.v7i3.2024
Lucy Carty, M. Cortina-Borja, Rachel Plachcinsk, C. Grollman, A. Macfarlane
Objectives: Potential ‘weekend effects’ in healthcare prompt concerns that care could be of lower quality during non-working hours, but they may instead reflect differences in case mix or other factors. This research aimed to compare neonatal mortality in English hospitals from 2005 to 2014 by time of day and day of the week.
Approach: We analysed data from a retrospective cohort of 6,054,536 singleton births in England from 2005 to 2014, created by linking ONS birth and death registration and birth notification data with Hospital Episode Statistics. Working hours were defined as 07:00 to 19:00 on weekdays; non-working hours were all other times on weekdays plus all weekends and public holidays. The primary outcome was all-cause neonatal mortality unattributed to congenital anomaly. We also modelled cause-specific neonatal mortality attributed to asphyxia, anoxia or trauma (AAT). On advice received through our public involvement and strategy, the analysis was stratified by mode of onset of labour and method of delivery.
Results: After adjustment for confounders, the odds of all-cause neonatal mortality outside working hours were similar to those during working hours for spontaneous births, instrumental births and emergency caesareans. Planned caesareans in non-working hours had a higher crude risk than planned caesareans in working hours, but they were considered to be unreliably recorded and likely to reflect emergency caesarean delivery of babies originally scheduled for planned caesarean birth. Further stratification of emergency caesareans by onset of labour showed higher odds of cause-specific neonatal mortality (AAT) in non-working than in working hours for emergency caesareans with no labour recorded, but not for emergency caesareans after spontaneous or induced onset of labour.
Conclusion: The apparent ‘weekend effect’ may be caused by deaths among the relatively small number of babies born by caesarean section, apparently without labour, outside normal working hours. Obstetric staffing should be planned to allow for these relatively unusual emergencies.
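The working-hours definition in the abstract above (07:00 to 19:00 on weekdays, excluding public holidays) can be sketched in Python as follows. This is only an illustration, not the authors' code: the function name, the two-date holiday set, and the choice to treat 07:00 as inside and 19:00 as outside working hours are all assumptions.

```python
from datetime import date, datetime, time

# Illustrative holiday set only (two substitute bank holidays in December 2010);
# a real analysis would need every public holiday in England for 2005 to 2014.
PUBLIC_HOLIDAYS = {date(2010, 12, 27), date(2010, 12, 28)}

WORK_START = time(7, 0)
WORK_END = time(19, 0)


def is_working_hours(born_at: datetime, holidays=PUBLIC_HOLIDAYS) -> bool:
    """Return True if a birth time falls in working hours as the abstract defines them."""
    if born_at.weekday() >= 5:  # Monday is 0, so Saturday = 5 and Sunday = 6
        return False
    if born_at.date() in holidays:
        return False
    # Assumption: the interval is half-open, so 07:00 counts and 19:00 does not.
    return WORK_START <= born_at.time() < WORK_END
```

A birth at 10:30 on Wednesday 9 June 2010 would be classed as working hours, while the same time on Saturday 12 June 2010 or on the 27 December 2010 bank holiday would not.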
Citations: 0
A population data-driven approach to identifying ‘Long COVID’ cases in support of diagnosis and treatment.
Q3 HEALTH CARE SCIENCES & SERVICES Pub Date : 2022-08-25 DOI: 10.23889/ijpds.v7i3.1924
J. Enns, A. Katz, M. Yogendran, Marcelo L. Urquia, S. Muthukumarana, Surani Matharaarachchi, A. Singer, Nathan C. Nickel, L. Star, Teresa Cavett, Y. Keynan, L. Lix, D. Sanchez-Ramirez
Objective: Post-acute COVID-19 (or ‘long COVID’) manifests as a wide range of long-lasting symptoms affecting multiple organ systems. We are developing criteria for identifying long COVID cases using administrative, clinical, survey and other data from Manitoba, Canada, with the ultimate goal of examining long COVID prevalence, risk factors, prognosis and recovery.
Approach: Given the lack of an accepted clinical definition and the resulting lack of diagnostic codes, we are adopting several creative and complementary strategies to identify long COVID cases. We are examining administrative and clinical data sources (laboratory data, physician claims, drug prescriptions, and electronic medical records) for information on positive COVID tests, common symptoms and complaints, and treatment provided. To identify people with long COVID who may not have sought healthcare, we are collecting survey data from a convenience community sample (members of a medical health fitness facility) and mining data on long COVID symptoms from Twitter.
Results: The combination of approaches we have adopted and the expanding scientific literature on long COVID are contributing to a more comprehensive understanding of the impacts of long COVID in Manitoba. Through preliminary work on the laboratory data (positive COVID tests from March 2020 to June 2021), we have developed and characterized a COVID-positive cohort (n=47,515). Work is now underway to develop an algorithm for long COVID using symptoms from free text in electronic medical records, ICD-9 codes, and changes in health-seeking behaviour (compared with the period before the positive COVID test and with a matched sample). This population data-driven approach will then allow us to examine how multiple underlying health conditions, COVID illness severity, COVID vaccination status, and various socio-demographic factors are related to the risk of long COVID.
Conclusion: This research is generating actionable information by identifying risk factors to support clinical diagnosis of long COVID, making it easier for clinicians to recognize this new illness and develop plans to manage it. It will also inform healthcare system planning by quantifying the burden of long COVID at the population level.
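One ingredient named in the abstract above, a change in health-seeking behaviour relative to the period before the positive test, could be computed along the following lines. This is a rough sketch under stated assumptions rather than the study's algorithm: the function names, the 180-day comparison window, the 28-day acute phase, and the "at least two more visits" rule are all hypothetical, and the matched-sample comparison the abstract mentions is not covered.

```python
from datetime import date, timedelta


def visit_counts_around_test(visit_dates, test_date, window_days=180, acute_days=28):
    """Count visits in equal-length windows before the test and after the acute phase.

    The window lengths are illustrative assumptions, not values from the study.
    Returns a (visits_before, visits_after) tuple.
    """
    pre_start = test_date - timedelta(days=window_days)
    post_start = test_date + timedelta(days=acute_days)
    post_end = post_start + timedelta(days=window_days)
    before = sum(pre_start <= d < test_date for d in visit_dates)
    after = sum(post_start <= d < post_end for d in visit_dates)
    return before, after


def flag_candidate(visit_dates, test_date, min_increase=2):
    """Flag a person whose visit count rose by at least min_increase (hypothetical rule)."""
    before, after = visit_counts_around_test(visit_dates, test_date)
    return after - before >= min_increase
```

Visits during the acute phase are deliberately excluded from both windows so that care for the initial infection does not count as evidence of a lasting change.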
Citations: 0
ICES Data and Analytic Services: Eight Years Young.
Q3 HEALTH CARE SCIENCES & SERVICES Pub Date : 2022-08-25 DOI: 10.23889/ijpds.v7i3.2096
M. Ho, Stefana Jovanovska, Jennafer Novess, Dina Skvirsky, R. Saskin, J. C. Victor
Objective: In March 2014, ICES launched Data & Analytic Services (DAS), expanding access to ICES data and analytics beyond ICES scientists and analytic staff. In eight years, DAS has grown and evolved to offer more high-quality services to an expanding client base of external researchers.
Approach: At the inception of DAS, two services were offered to public sector researchers: data access and analytics. Data access enabled researchers to analyze coded record-level data through a secure virtual environment. Analytics, conducted by DAS staff in the ICES analytic environment, provided researchers with risk-cleared summary-level reports. In response to growing demand from an increasingly diverse range of researchers, ICES consulted extensively with internal and external stakeholders to re-evaluate and operationalize new services. Compliance with contractual obligations and Ontario law, organizational capacity to scale up, and alignment with ICES’ mission, vision and values were the cornerstones of establishing new offerings.
Results: Analytic services became available to private sector researchers in June 2016. In March 2017, support for cohort and longitudinal follow-up studies became the newest service offering (researchers are provided with a list of applicable individuals defined for the purposes of conducting publicly funded research). As more data assets become available to researchers, requests continue to increase in volume and complexity, particularly for projects seeking to import external data for linkage to ICES data. A second high-performance computing virtual environment began onboarding researchers in September 2021, while the original analytic environment has undergone multiple upgrades and will soon be fully refreshed. Regular solicitation of feedback has enabled DAS to increase staffing and diversify the resources available, improving the client experience at all stages.
Conclusions: Since its inception, DAS has expanded from five to thirty personnel, grown and diversified its base of new and returning clients, and responded to demand for new services. DAS continues to provide high-quality services that enable impactful research, and it remains responsive to new opportunities for collaboration and service provision.
Citations: 0
Education and social care predictors of offending trajectories: A UK administrative data linkage study.
Q3 HEALTH CARE SCIENCES & SERVICES Pub Date : 2022-08-25 DOI: 10.23889/ijpds.v7i3.1928
Hannah Dickson, G. Vamvakas, N. Blackwood
Objectives: The total annual cost of crime in England and Wales is estimated at £50bn. The age-crime curve indicates that criminal behaviour peaks in adolescence and decreases in adulthood. However, evidence suggests that this curve conceals distinct developmental trajectories. Life-course persistent offenders begin to behave antisocially early in childhood and continue this behaviour into adulthood. By contrast, adolescent-limited offenders exhibit most of their antisocial behaviour during adolescence, with a minority continuing to offend into adulthood. Prospective cohort study data have highlighted distinct risk factors for these offending trajectories, but this research is limited by small sample sizes for disadvantaged groups, selection bias and infrequent data collection.
Approach: The current study began in February 2022 and is one of the first to use linked UK national crime and education records. The aims are to: (1) establish the offending trajectories of individuals between the ages of 10 and 32 years following their first recorded conviction or caution, using national crime records; and (2) develop prediction models of these offending trajectories using administrative education and social care data.
Results: In my talk, I will share findings on the offending trajectories identified and present some early results on the key education and social care drivers of those trajectories.
Conclusions: Findings from the project have the potential to identify previously unknown offending trajectories, or confirm lesser-known ones, using real-world data based on the UK population. The project may also detect previously unknown risk or protective factors for offending, which has implications for early intervention and could help inform criminal justice system responses to early antisocial behaviour.
Citations: 0
Māori and Linked Administrative Data: A Critical Review of the Literature and Suggestions to Realise Māori Data Aspirations.
Q3 HEALTH CARE SCIENCES & SERVICES Pub Date : 2022-08-25 DOI: 10.23889/ijpds.v7i3.1793
Lara M. Greaves, Cinnamon-Jo Lindsay, Eileen Li, Emerald Muriwai, A. Sporle
Linked data presents different social and ethical issues for different contexts and communities. The Statistics New Zealand Integrated Data Infrastructure (IDI) is a collection of de-identified whole-population administrative datasets that researchers are increasingly using to answer pressing social and policy research questions. Our work seeks to provide an overview of the IDI, associated issues for Māori (the Indigenous peoples of New Zealand), and steps to realise Māori data aspirations. In this paper, we first introduce the IDI, including what it is and how it developed. We then give an overview of Māori Data Sovereignty and turn to examples of organisations, agreements, and frameworks that seek to make the IDI and data better for Māori communities. Next, we discuss the main issues with the IDI for Māori, including technical issues, deficit-framed work, involvement from communities, consent, social license, further data linkage, and barriers to access for Māori. We finish with a set of recommendations on how to improve the IDI for Māori, making sure that Māori can get the most out of administrative data for our communities. These include building data researcher capacity and capability for Māori; Māori data co-governance and accountability; reducing practical and skill barriers to access by Māori and Māori organisations; providing robust, consistent and transparent exemplars of best practice; and potentially even abolishing the IDI and starting again. These issues are being worked through via Indigenous engagement and co-governance processes that could provide useful exemplars for Indigenous and community engagement with linked data resources.
Citations: 0
Using linkage to assess coverage of population estimates.
Q3 HEALTH CARE SCIENCES & SERVICES Pub Date : 2022-08-25 DOI: 10.23889/ijpds.v7i3.2038
Sarah Collyer, Josie Plachta
Objectives: The Demographic Index (DI) comprises five linked administrative datasets used for population estimation. Current linkage methods are not ideal for utilising the power of this asset. Using the 2021 England and Wales Census, we are developing an innovative composite linkage method to fully utilise the power of the DI.
Approach: Using non-greedy deterministic and probabilistic linkage methods, we will link the DI to the Census at a composite level where we believe links exist; that is, we will link a Census cluster (consisting of linked Census and Census Coverage Survey (CCS) records) with a DI cluster (consisting of linked records from the data sources used to make the DI). We will then conduct a pairwise linkage of records from these linked clusters to link individual source records to the Census. We will use clerical review to resolve uncertain and conflicting links and to inform the quality of our linkage.
Results: We anticipate producing a high-quality linkage that will show how the coverage of the DI compares with the Census (through the composite-level linkage) and indicate the quality of the DI itself (through the pairwise-level linkage). We have developed a clerical matching system that can display composite-level linkage, i.e., candidate cluster-pairs. We will tailor our clerical review and quality assessment to records that fall within carefully chosen postcode areas, to ensure all hard-to-count groups and geographical areas are sampled. Working with large datasets is a challenge we are overcoming by using distributed computing and search space reduction. The 2021 Census has previously been linked to the CCS with high accuracy; these records are considered intrinsically linked.
Conclusion: To assess the quality of national population estimates, and of the policy decisions based upon them, we are linking a key composite population-level dataset to the 2021 England and Wales Census. The presentation will showcase the methods we are developing and how we are ensuring the highest quality possible.
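To make the probabilistic, non-greedy linkage idea in the abstract above more concrete, here is a minimal Python sketch. It is not the authors' method: it works on individual records rather than clusters, it uses textbook Fellegi-Sunter style agreement weights, and it reads "non-greedy" as choosing the one-to-one pairing with the best total weight instead of accepting the best pair first. The field names and the m- and u-probabilities are hypothetical.

```python
from itertools import permutations
from math import log2

# Hypothetical per-field probabilities: m = P(field agrees | same person),
# u = P(field agrees | different people). Real values would be estimated from data.
FIELD_PARAMS = {"surname": (0.95, 0.01), "dob": (0.98, 0.003), "postcode": (0.85, 0.001)}


def match_weight(rec_a, rec_b, params=FIELD_PARAMS):
    """Sum log2 agreement or disagreement ratios over the compared fields."""
    total = 0.0
    for field, (m, u) in params.items():
        if rec_a[field] == rec_b[field]:
            total += log2(m / u)  # agreement adds positive evidence
        else:
            total += log2((1 - m) / (1 - u))  # disagreement adds negative evidence
    return total


def best_one_to_one(left, right, params=FIELD_PARAMS):
    """Return the one-to-one pairing (left index, right index) with the highest total weight.

    Exhaustive search, so suitable only for tiny candidate sets;
    assumes len(left) <= len(right).
    """
    best_total, best_pairs = float("-inf"), []
    for perm in permutations(range(len(right)), len(left)):
        total = sum(match_weight(left[i], right[j], params) for i, j in enumerate(perm))
        if total > best_total:
            best_total, best_pairs = total, list(enumerate(perm))
    return best_pairs
```

At realistic scale the exhaustive search would be replaced by blocking to cut the candidate space plus an assignment solver, which is in the spirit of the search space reduction the abstract mentions.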
Citations: 0
A project designed to examine, for the first time, the health records of adult prisoners in Northern Ireland and their linkage to other available health data: the test case of prisoner post-release mortality risk.
Q3 HEALTH CARE SCIENCES & SERVICES Pub Date : 2022-08-25 DOI: 10.23889/ijpds.v7i3.2057
J. Cooper, D. O’Reilly, Richard Kirk, Trish Kelly, Rachel Gibbs, M. Donnelly
Objectives: The linkage of routinely collected administrative data for research purposes has the potential to improve knowledge and public benefit. We describe a novel data linkage study between Northern Ireland (NI) Healthcare in Prisons and the Business Services Organisation (BSO). This work is undertaken within the Administrative Data Research Centre-NI (ADRC-NI).
Approach: This joint project between ADRC-NI Queen’s University Belfast and NI Healthcare in Prisons (South Eastern Health and Social Care Trust) will test linkage of prisoner health records to health data held in the BSO, and the potential to generate a population-based cohort for a retrospective analysis of prisoner health (2012-2021). The analysis will attempt to characterise prisoners according to socio-demographic, health and committal factors; compare post-release mortality rates with a reference group from the NI population using indirect standardisation; and estimate post-release mortality risk using Cox proportional hazards models.
Results: Using novel data linkages, a dataset will be created to examine the health of prisoners (and former prisoners) in NI. Ethics and governance approvals are in place for this data linkage. The linkage will be undertaken via the Honest Broker Service (HBS) in NI and the dataset will be accessed in the safe setting at the BSO. We will discuss the processes involved, our experiences (including significant delays or difficulties), and recommendations for future data-linkage studies. A key deliverable of this project will be an assessment of the access and linkage capabilities of the prisoner health data, with metadata created and made available to future researchers. We also plan to present preliminary results relating to the test research question.
Conclusion: We will describe the processes involved and first-hand research experience in the development of a novel data-linkage project. We will also detail access and linkage capabilities in relation to this new dataset for examining the health of prisoners (and former prisoners) in NI.
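
Indirect standardisation, as named in the Approach, compares the deaths observed in the cohort with the deaths that would be expected if reference-population rates applied to the cohort's person-years in each stratum. The sketch below shows only the arithmetic. The strata, person-years, death counts and reference rates are invented toy figures, not study data or results.

```python
def standardised_mortality_ratio(cohort, reference_rates):
    """Indirectly standardised mortality ratio (SMR).

    cohort: {stratum: (person_years, observed_deaths)}
    reference_rates: {stratum: deaths per person-year in the reference population}
    Returns (observed, expected, observed / expected).
    """
    observed = sum(deaths for _, deaths in cohort.values())
    expected = sum(
        person_years * reference_rates[stratum]
        for stratum, (person_years, _) in cohort.items()
    )
    return observed, expected, observed / expected


# Toy figures for illustration only (hypothetical age strata).
cohort = {"18-34": (1000.0, 4), "35-54": (500.0, 6)}
reference_rates = {"18-34": 0.001, "35-54": 0.004}

observed, expected, smr = standardised_mortality_ratio(cohort, reference_rates)
print(f"observed={observed}, expected={expected:.1f}, SMR={smr:.2f}")
```

An SMR above 1 means more deaths were observed than the reference rates would predict for a group with the same stratum mix.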
Citations: 0
Semantic-based Privacy-preserving Record Linkage.
Q3 HEALTH CARE SCIENCES & SERVICES Pub Date : 2022-08-25 DOI: 10.23889/ijpds.v7i3.1956
Yang Lu
Introduction: Demand is growing for sharing aggregated electronic health records (EHRs) for integrated health care and public health studies. Patient privacy requires that anonymisation procedures are in place for data sharing.
Objective: Traditional methods such as k-anonymity and its derivatives often overgeneralise, which lowers data accuracy. To tackle this issue, we proposed the Semantic Linkage K-Anonymity (SLKA) approach, which balances privacy and utility preservation by detecting risky combinations hidden in record linkage releases.
Approach: Applying k-anonymity to the quasi-identifiers of data may lead to ‘over generalisation’ when dealing with linkage data sets. Because most linkage cases do not include all local patients, not all data needs to be modified for privacy-preserving purposes. We therefore proposed linkage k-anonymity (LKA), under which only the obfuscated individuals in a released linkage set are required to be indistinguishable from at least k-1 other individuals in the local dataset. To address inference disclosure, we further designed the semantic-based linkage k-anonymity (SLKA) method, which extends LKA with a semantic rule base for automatic detection (and ruling out) of risky associations from previous linked data releases. Specifically, associations identified from the “previous releases” of the linkage dataset can become the input of semantic reasoning for the “next release”.
Results: The approach is evaluated on a linkage scenario in which researchers apply to link data from an Australia-wide national type-1 diabetes platform with survey results from 25,000+ Victorians about their health and wellbeing. Comparing the information loss of the three methods, we find that SLKA can incur extra cost for dealing with risky individuals, e.g., 13.7% vs 5.9% (LKA, k=4); however, it performs much better than k-anonymity, which can cause 24% information loss (k=4). The value of k also affects the level of distortion in SLKA, e.g., 11.5% (k=2) vs 12.9% (k=3).
Conclusion: The SLKA framework provides dynamic protection for repeated linkage releases while preserving data utility by avoiding the unnecessary generalisation typified by k-anonymity.
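
The difference between plain k-anonymity and the linkage k-anonymity (LKA) condition described in the Approach can be illustrated with a small check. This is a sketch of the stated definitions only, not the authors' implementation: the quasi-identifiers (`age_band`, `postcode`) and the records are hypothetical, and the semantic rule base that distinguishes SLKA is not modelled.

```python
from collections import Counter

QUASI_IDENTIFIERS = ("age_band", "postcode")  # hypothetical quasi-identifiers


def qi_key(record):
    """Project a record onto its quasi-identifier values."""
    return tuple(record[q] for q in QUASI_IDENTIFIERS)


def is_k_anonymous(dataset, k):
    """Plain k-anonymity: every quasi-identifier combination occurs at least k times."""
    counts = Counter(qi_key(r) for r in dataset)
    return all(c >= k for c in counts.values())


def lka_violations(released, local, k):
    """LKA check: return the released records that share their quasi-identifier
    combination with fewer than k records in the local dataset."""
    counts = Counter(qi_key(r) for r in local)
    return [r for r in released if counts[qi_key(r)] < k]


local = [
    {"id": 1, "age_band": "30-39", "postcode": "3000"},
    {"id": 2, "age_band": "30-39", "postcode": "3000"},
    {"id": 3, "age_band": "40-49", "postcode": "3000"},
    {"id": 4, "age_band": "40-49", "postcode": "3001"},
    {"id": 5, "age_band": "30-39", "postcode": "3000"},
]
released = [local[0], local[2]]  # only these two individuals appear in the linkage release

whole_dataset_ok = is_k_anonymous(local, k=2)
risky_ids = [r["id"] for r in lka_violations(released, local, k=2)]
```

In this toy case the local dataset as a whole is not 2-anonymous, yet only one released record (id 3) would need further generalisation under LKA. Record 4 is also unique but is not in the release, so it can be left untouched, which is the source of the lower information loss the abstract reports for LKA.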
Citations: 0
Accelerate the Creation of the cross agency Human Services Dataset.
Q3 HEALTH CARE SCIENCES & SERVICES Pub Date : 2022-08-25 DOI: 10.23889/ijpds.v7i3.1963
P. Nair, Michael Smith, M. Theochari
Objective: To develop a digital solution for automated data ingestion and rapid update of the large-scale Human Services Dataset (HSDS), which brings together data from across government to give a powerful view of service usage and improve outcomes for communities.
Approach: The Centre for Health Record Linkage (CHeReL) hosts a secure, high-performing data linkage system, including a Master Linkage Key (MLK) of administrative health datasets, and generates linked data to inform policy decisions. Since 2018, CHeReL has also linked over 70 frontline datasets each year to create a large-scale longitudinal linked dataset of over 2.5 billion records. Over the course of 2021, the CHeReL led a project to incrementally improve the currency of the HSDS in compressed timeframes. This provided an opportunity to assess the value and feasibility of more frequent updates to the dataset within the evaluation and investment context.
Results: Automated data ingestion and validation significantly reduced the data processing timeframes for the accelerated linkage. We observed an 80% reduction in data ingestion and a 75% reduction in data validation. The digital solution also allows asset owners to register and approve new data providers, monitor their data provision in real time and report on data sourcing. This provides transparency to the asset owner and reduces the need for time-intensive manual processes to jointly monitor data provision with the Data Linkage Centre. The digital solution can also support data providers in automating their data feeds and providing data on a regular basis through a secure non-touch process. This reduces ongoing workload and ensures on-time provision.
Conclusion: The process requires a systematic change in the upstream data source, and we requested participating agencies to send us data in an agreed format. Receiving files in a standard format is pivotal for reducing the overall timeframe of HSDS creation and for leveraging the dataset for policy and investment purposes.
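
Automated validation against an agreed file format, as described above, might look something like the sketch below. Everything specific here is an assumption made for illustration: the abstract does not describe the agreed layout, so the CSV format, the column names and the two checks (exact header match, no empty required values) are hypothetical.

```python
import csv
import io

# Hypothetical agreed layout; the real HSDS file specification is not given in the abstract.
AGREED_COLUMNS = ["record_id", "date_of_birth", "service_date"]


def validate_file(text, expected_columns=AGREED_COLUMNS):
    """Return a list of problems found in a CSV extract; an empty list means it passed."""
    reader = csv.DictReader(io.StringIO(text))
    problems = []
    if reader.fieldnames != expected_columns:
        # Stop early: row-level checks are meaningless if the header is wrong.
        problems.append(f"unexpected header: {reader.fieldnames}")
        return problems
    for line_no, row in enumerate(reader, start=2):  # line 1 is the header
        empty = [c for c in expected_columns if not (row[c] or "").strip()]
        if empty:
            problems.append(f"line {line_no}: missing {', '.join(empty)}")
    return problems


good = "record_id,date_of_birth,service_date\n1,1990-01-01,2021-03-04\n"
bad = "record_id,date_of_birth,service_date\n2,,2021-03-04\n"

print(validate_file(good))
print(validate_file(bad))
```

Rejecting a file at the header check, before any record is read, is what makes a standard format valuable: a provider learns about a layout problem immediately rather than after manual inspection.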
Citations: 0
A configurable software platform for creating, reviewing and adjudicating annotation of unstructured text.
Q3 HEALTH CARE SCIENCES & SERVICES Pub Date : 2022-08-25 DOI: 10.23889/ijpds.v7i3.1953
R. Beare, Adam Morris, Tanya Ravipati, Elizabeth Le, T. Collyer, Helene Roberts, V. Srikanth, Nadine E. Andrew
Objectives: To develop a flexible platform for creating, reviewing and adjudicating annotation of unstructured text. Natural Language Processing models and statistical classifiers use the results for analysis of large databases of text, such as electronic health records, that are curated by the National Centre for Healthy Ageing (NCHA) Data Platform.
Approach: Automated approaches are essential for large-scale extraction of structured data from unstructured documents. We applied the CogStack suite to annotate clinical text from hospital inpatient records based on the Unified Medical Language System (UMLS) for classifying dementia status. We trained a logistic regression classifier to determine dementia/non-dementia status within two cohorts, based on the frequency of occurrence of a set of terms provided by experts: one cohort with dementia confirmed by clinical assessment and the other with non-dementia confirmed by telephone cognitive interview. We used our annotation platform to review the accuracy of concepts assigned by CogStack.
Results: There were 368 people with clinically confirmed dementia and 218 who screened negative for dementia. Of these, 259 with dementia and 195 without dementia had documents in the inpatient electronic health record system, with 84,045 and 16,950 inpatient documents for the dementia and non-dementia cohorts respectively. A set of key words pertaining to dementia was generated by a specialist neurologist and a health information manager, and matched to UMLS concepts. The NCHA data platform holds a copy of the inpatient text records (>13 million documents) that has been annotated using CogStack. Annotated documents corresponding to the study cohort were extracted. We tested true positive rates of annotation against 50 concepts judged by a neurologist and a health information manager to be relevant to dementia patients, by manual review of 100 documents.
Conclusion: Automated annotations must be validated. The platform we have developed allows efficient review and correction of annotations, so that models can be trained further or so that users can be confident that accuracy is sufficient for subsequent analysis. Implementation within our linked NCHA data platform will allow incorporation of text-based data at scale.
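
The validation step described in the Results (manually reviewing automated annotations and reporting a true positive rate per concept) reduces to a simple tally once reviewers have adjudicated each annotation. The sketch below assumes a hypothetical review export of `(concept_id, reviewer_says_correct)` pairs; the concept identifiers are placeholders rather than real UMLS codes, and the review data are invented.

```python
from collections import defaultdict


def true_positive_rates(reviewed):
    """Per-concept share of automated annotations that a reviewer judged correct.

    reviewed: iterable of (concept_id, reviewer_says_correct) pairs.
    Returns {concept_id: fraction of reviewed annotations marked correct}.
    """
    totals = defaultdict(int)
    correct = defaultdict(int)
    for concept_id, is_correct in reviewed:
        totals[concept_id] += 1
        correct[concept_id] += int(is_correct)
    return {concept: correct[concept] / totals[concept] for concept in totals}


# Invented adjudication results with placeholder concept ids.
reviewed = [
    ("CONCEPT_A", True),
    ("CONCEPT_A", True),
    ("CONCEPT_A", False),
    ("CONCEPT_B", True),
]

rates = true_positive_rates(reviewed)
for concept, rate in sorted(rates.items()):
    print(f"{concept}: {rate:.2f}")
```

Concepts with a low rate are natural candidates for correction in the annotation platform before term frequencies are passed to a downstream classifier.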
Citations: 0