
International Journal of Population Data Science: Latest Articles

AD|ARC: Construction of a research ready dataset to better understand farmers and farming households.
Q3 HEALTH CARE SCIENCES & SERVICES Pub Date : 2022-08-25 DOI: 10.23889/ijpds.v7i3.1905
J. Hampton, Nick Webster, Sophie Jordan, S. Morrison-Rees, O. Bateson, James Watson, S. McFarlane, Alastair McAlpine, L. Cavin
Objectives: AD|ARC (Administrative Data: Agriculture Research Collection) is an ambitious and original linkage project, bringing together information about farmers and farming households from several sources. When complete, this research-ready dataset will assist in addressing three broad themes: health and well-being, prosperity and resilience, and engagement with agri-environment.
Approach: The dataset is being constructed from information drawn from survey, census, and administrative sources. Necessarily, this includes working across government departments to ensure comprehensive coverage of farm, business, education, and health data. Similarly, data owners, processors, and researchers are working closely to ensure the resultant dataset meets expectations. Alongside this cross-sectoral aspect, the work is also cross-jurisdictional: the intention is for the data to capture information about farms, farmers, and farming households from across the UK.
Results: Rather than focus on the detail of the substantive research that AD|ARC will enable, this paper discusses some of the challenges and successes of this linkage project to date. Drawing on the experience of the teams from across the UK (England, Northern Ireland, Scotland, and Wales), the first part will discuss challenges faced in linkage of this multi-faceted project, alongside how the population census is being utilised to better understand farming communities through the identification of both farming households and workers. Secondly, a broader discussion of the challenges and sensitivities of working across government departments and administrations will be presented, alongside ways of working developed to recognise and overcome these.
Conclusion: The AD|ARC project will result in an invaluable resource to better understand the farming community, which in turn will help to better inform policy debate and decision making. Alongside this, the process of creating the dataset has offered opportunities for learning and insight across a range of issues.
Citations: 0
Mental health, firearm ownership, and risk of death by suicide: a population-wide data linkage study.
Q3 HEALTH CARE SCIENCES & SERVICES Pub Date : 2022-08-25 DOI: 10.23889/ijpds.v7i3.1935
E. Ross, D. O’Reilly, M. Aideen
Objectives: There is clear evidence from the USA that access to firearms increases suicide risk, but little equivalent evidence exists in the UK. The aim of the current study is to examine the risk of suicide and all-cause mortality for people in Northern Ireland (NI) who hold a licensed firearm.
Approach: We link information on all registrations from the Firearms Certificate (FAC) Register between 2010 and 2020 to the health service population spine for NI residents born before 1st January 2005. Further linkage includes prescription medication data and death records, with follow-up until 31st December 2020.
Results: 68,831 individuals held a FAC during the study period. FAC holders were more likely to be older, to reside in rural areas (OR 4.99, 4.89-7.83), and to come from more affluent areas (OR for the most deprived areas 0.46, 0.43-0.50). During follow-up, 3,704 FAC holders died. 36 deaths were due to suicide, of which 16 were suicides by firearm. Only 23% of those who died by firearm suicide in NI were FAC holders. Preliminary findings indicate that, after adjustment for age, area-level deprivation, and urbanicity, FAC holders had a lower risk of all-cause mortality (HR 0.64, 0.61-0.66) and of death by suicide (HR 0.54, 0.39-0.76).
Conclusion: In contrast to findings from previous studies, individuals with a licensed firearm were less likely to die by suicide.
Acknowledgement: The authors would like to acknowledge the help provided by the staff of the Honest Broker Service (HBS) within the Business Services Organisation Northern Ireland (BSO). The HBS is funded by the BSO and the Department of Health (DoH). The authors alone are responsible for the interpretation of the data, and any views or opinions presented are solely those of the authors and do not necessarily represent those of the BSO.
Citations: 0
The National Centre for Healthy Ageing data platform: establishing an Electronic Health Record derived linked geographic cohort.
Q3 HEALTH CARE SCIENCES & SERVICES Pub Date : 2022-08-25 DOI: 10.23889/ijpds.v7i3.1949
Nadine E. Andrew, R. Beare, Tanya Ravipati, E. Parker, T. Collyer, David Ung, V. Srikanth
Objectives: Electronic Health Record (EHR) data have created unique opportunities for research. However, these data are not curated, are siloed, and are poorly integrated. We describe linkage of EHR data from an entire health service with government datasets to establish a linked geographic cohort within the Australian National Centre for Healthy Ageing (NCHA).
Approach: Research-suitable EHR items were identified from Peninsula Health (NCHA partner) data systems based on published research, availability, and quality. Items underwent end-user Delphi processes to identify core research items (consensus=70%). Approvals were obtained from the Australian Institute of Health and Welfare (AIHW) for linkage with Medicare, medication dispensings, Aged Care, and death registry data through the AIHW spine, created using identifiers from the Medicare Consumer Directory (MCD); and from the Centre for Victorian Data Linkage for linkage to state-wide hospital data. Identifiers for local residents aged ≥60 years who attended Peninsula Health were submitted for probabilistic data linkage.
Results: Delphi participants included 10 researchers from 8 fields/departments and 13 clinicians from 11 clinical areas. To date, 7 of the 11 datasets have been reviewed. N=107 potentially suitable data items were identified, and 96 gained consensus for inclusion in the core dataset. Of the 49,767 Health Service users (episodes: Jan 2010-Dec May 2021) submitted for linkage, 98.4% were successfully linked to the MCD (median age 72.2 years, 52.2% female, 1.8% regional residence). An additional 172,290 individuals living within the geographic region but not contained within the EHR dataset were identified in the MCD for linkage to the government datasets. Linkage accuracy was impacted by inaccurate/incomplete address fields (~30%) and lack of adherence to naming conventions within the EHR data.
Conclusion: Linking with EHR data is complex. Having an established EHR research dataset will improve the feasibility of data linkage and the potential for future expansion of linkages within the NCHA. Once merged, the data will be used to underpin a range of research activities related to ageing and dementia.
Citations: 1
Understanding South Australia’s blood products usage patterns and outcomes, using data linkage.
Q3 HEALTH CARE SCIENCES & SERVICES Pub Date : 2022-08-25 DOI: 10.23889/ijpds.v7i3.1975
M. Palfy, Christopher Radbone
Objectives: The purpose of this analytical activity was to establish confidence in the technical capability for extracting, linking, and integrating public hospital inpatient data, public pathology blood transfusion records, and blood tests, optimising record linkage so that patterns and trends can then be analysed with confidence.
Approach: The SURE secure data platform was essential to ensure data governance and security requirements were met while integrating health data spanning 18 months (January 2018 - June 2019). Data sources came in multiple formats of varying quality. R was chosen for its data wrangling abilities and reproducibility.
The phases were:
1. Source data loading and cleaning
2. Linking hospital inpatient and blood transfusion records
3. Summarising linked transfusion data
4. Linking inpatient and blood test data
5. Summarising linked test data
6. Integrating hospital data with summarised transfusion and summarised test data
7. Deriving additional variables based on summarised data
Results: Of 143,192 transfusion records, 55,053 (38.4%) were excluded as they did not meet the inclusion criteria (e.g., hospital or blood product out of scope). Of 7,897,451 blood test records, 238,013 (3.0%) were excluded, mostly because of poor quality (missing/invalid hospital code). Initially, 91.4% of transfusion records were matched with hospital inpatient records. The linkage rate for blood test records was 62.3%; the low match rate was attributed to the blood test data being state-wide and therefore including tests not performed on public hospital patients. The linkage process was improved by adding additional patient codes from public pathology’s internal patient identifiers. The linkage rate improved to 95.5% for transfusion records and 64.4% for test records.
Conclusion: Twelve different data sources, with differing file types and formats, needed coding to achieve standardised results, enabling future reproducibility. Over one hundred business rules were implemented to produce a robust solution for future data updates. End results were analysed, and it was determined that linkage and integration quality exceeded previous similar attempts in terms of match rate and accuracy.
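The effect of adding a secondary patient identifier on the linkage rate can be illustrated with a minimal Python sketch. The original work was done in R, and the abstract does not describe its code, so everything here is an assumption made for illustration: the field names `hospital_id` and `pathology_id`, the `crosswalk` mapping from pathology identifiers to hospital identifiers, and the toy records. Only the two exclusion counts at the end come from the abstract.

```python
def linkage_rate(records, inpatient_ids, crosswalk=None):
    """Share of records whose patient can be found among inpatient_ids.

    crosswalk is a hypothetical mapping from a secondary (pathology)
    identifier to a hospital identifier, used only as a fallback.
    """
    crosswalk = crosswalk or {}
    linked = 0
    for rec in records:
        pid = rec.get("hospital_id")
        if pid not in inpatient_ids:
            # Primary identifier failed: try the secondary identifier.
            pid = crosswalk.get(rec.get("pathology_id"))
        if pid in inpatient_ids:
            linked += 1
    return linked / len(records) if records else 0.0


# Toy data, invented for illustration only.
inpatient_ids = {"H1", "H2", "H3"}
transfusions = [
    {"hospital_id": "H1", "pathology_id": "P1"},
    {"hospital_id": None, "pathology_id": "P2"},  # missing primary id
    {"hospital_id": "H9", "pathology_id": "P3"},  # not an inpatient
    {"hospital_id": "H3", "pathology_id": None},
]
crosswalk = {"P2": "H2"}

rate_before = linkage_rate(transfusions, inpatient_ids)
rate_after = linkage_rate(transfusions, inpatient_ids, crosswalk)

# Exclusion shares recomputed from the counts reported in the abstract.
transfusion_excluded_pct = round(55_053 / 143_192 * 100, 1)
test_excluded_pct = round(238_013 / 7_897_451 * 100, 1)
```

In this toy case the second record only links once the crosswalk is supplied, which mirrors in miniature how extra patient codes can lift a match rate; the third record stays unlinked because the patient is genuinely absent from the inpatient data.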
Citations: 0
Efficient population record linkage with temporal and spatial constraints.
Q3 HEALTH CARE SCIENCES & SERVICES Pub Date : 2022-08-25 DOI: 10.23889/ijpds.v7i3.1854
C. Nanayakkara, P. Christen
Objectives: Population databases containing birth, death, and marriage certificates or census records are increasingly used for studies in a variety of research domains. Their large scale and complexity make linking such databases highly challenging. We present a scalable blocking and linking technique that exploits temporal and spatial constraints in personal data.
Approach: Based on a state-of-the-art blocking method using locality sensitive hashing (LSH), we incorporate (a) attribute similarities, (b) temporal constraints (for example, a mother cannot give birth to two babies less than nine months apart, except in a multiple birth), and (c) spatial constraints (two births by the same mother are more likely to happen in the same location than far apart). In an iterative fashion, we identify highly confident matches first, and use these matches to further refine our constraints. We adopt a block size and frequency-based filtering approach to further enhance the efficiency of the record linkage comparison step.
Results: We conducted experiments on a Scottish data set containing 17,613 birth certificates from 1861 to 1901, where the application of standard LSH blocking generated approximately 15 million candidate record pairs, with a recall of 0.999 and a precision of 0.003. With the application of our block size and frequency-based filtering approach, we obtained a ten-fold and a hundred-fold reduction of this candidate record pair set, with a small reduction of recall to 0.984 and 0.962, respectively. The comparison of record pairs in the hundred-fold reduction using our iterative linking technique achieved up to 0.961 precision and 0.811 recall. This means that our method can achieve a reduction in computational effort and an improvement in precision of over 99%, at the cost of a decline in recall of less than 19%.
Conclusion: We presented a method to reduce the computational complexity of linking large and complex population databases while ensuring high linkage quality. Our method can be generalised to population databases where temporal and spatial constraints can be defined. We plan to apply our method on a Scottish database with 24 million records.
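The temporal constraint described in the approach (two births by the same mother are either a multiple birth or at least about nine months apart) can be sketched as a filter over candidate record pairs, together with the precision and recall measures quoted in the results. This is a minimal illustration and not the authors' implementation: it does not include the LSH blocking step, and the 270-day threshold, the 2-day multiple-birth window, the function names, and the toy certificates are all assumptions.

```python
from datetime import date

MIN_GAP_DAYS = 270        # assumed stand-in for "nine months"
MULTIPLE_BIRTH_DAYS = 2   # assumed window for twins, triplets, etc.


def births_compatible(d1, d2):
    """True if two births could plausibly belong to the same mother."""
    gap = abs((d2 - d1).days)
    return gap <= MULTIPLE_BIRTH_DAYS or gap >= MIN_GAP_DAYS


def filter_pairs(pairs, birth_dates):
    """Drop candidate pairs that violate the temporal constraint."""
    return [
        (a, b) for a, b in pairs
        if births_compatible(birth_dates[a], birth_dates[b])
    ]


def precision_recall(predicted, truth):
    """Precision and recall of a set of predicted pairs against true pairs."""
    true_positives = len(predicted & truth)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(truth) if truth else 0.0
    return precision, recall


# Toy birth certificates, invented for illustration only.
birth_dates = {
    "b1": date(1870, 1, 10),
    "b2": date(1870, 1, 10),  # same day as b1: possible twin
    "b3": date(1870, 5, 1),   # 111 days after b1: too close
    "b4": date(1871, 3, 1),
}
candidates = [("b1", "b2"), ("b1", "b3"), ("b1", "b4"), ("b3", "b4")]
truth = {("b1", "b2"), ("b1", "b4")}

kept = filter_pairs(candidates, birth_dates)
before = precision_recall(set(candidates), truth)
after = precision_recall(set(kept), truth)
```

Here the filter removes one implausible pair, so precision rises while recall is unchanged. In the reported experiments the trade-off is less favourable, since stricter filtering also removes some true matches.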
Citations: 0
Bespoke automated linkage to enable analysis of covid deaths by ethnicity.
Q3 HEALTH CARE SCIENCES & SERVICES Pub Date : 2022-08-25 DOI: 10.23889/ijpds.v7i3.2050
Shelley Gammon, R. Shipsey, Charlie Tomlin, Josie Plachta
In early 2020 there was intense media speculation that ethnicity and Covid-19 deaths were correlated. However, the existing method of adding ethnicity to death records resulted in low linkage rates for very recent deaths. We designed and implemented a bespoke linkage in three days, enabling accurate reporting to the nation.

We linked the 2011 England and Wales Census to death records using a range of personal identifiers. Due to time pressure, we focused on executing a single linkage method well. Deterministic linkage was chosen, using a variety of matchkeys which were tested via clerical review. To overcome the issue of addresses changing since 2011, we also linked 2020 death record residuals to the 2019 Patient Register (PR) and then made use of the 2011 PR address where it existed. This additionally provided an indication of whether unmatched death records might be attributable to migration into England and Wales post-2011.

The prior linking method used NHS Number only. Although its overall linkage rate was approximately 90%, the rate for recent deaths (2nd March 2020 to 10th April 2020 in the first iteration of the linkage) was closer to 30% due to an administrative lag in adding NHS Numbers to death records. Our novel bespoke linkage method linked over 39,000 extra death records. Whilst this had minimal impact on the overall linkage rate, it improved the linkage rate for recent deaths to approximately 90%. This was achieved without an impact on accuracy: clerical review demonstrated that the false positive rate was approximately 0.2%. A report was published using this data showing that the risk of death involving Covid-19 was significantly higher among some ethnic groups than among others.

Determining whether Covid-19 disproportionately affected certain ethnicities was of crucial importance in the early phase of the pandemic to enable appropriate government strategies to be developed. We delivered a bespoke linkage under an exceptional time limit without compromising on accuracy, enabling an analysis of nationwide interest and impact.
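The deterministic, matchkey-based approach described in this abstract can be illustrated with a minimal Python sketch. Everything specific here is an assumption made for illustration: the field names, the two matchkeys, the rule that a link is accepted only when a key is unique on both sides, and the toy records. The abstract does not list the matchkeys actually used.

```python
from collections import defaultdict

def make_key(record, fields):
    """Build a normalised matchkey, or None if any required field is missing."""
    values = tuple(record.get(f, "").strip().upper() for f in fields)
    return values if all(values) else None

# Hypothetical matchkeys, strictest first (not the keys used in the study).
MATCHKEYS = [
    ("forename", "surname", "dob", "postcode"),
    ("forename", "surname", "dob"),
]

def index_by_key(records, fields):
    """Group record ids by their matchkey value."""
    index = defaultdict(list)
    for record_id, record in records.items():
        key = make_key(record, fields)
        if key is not None:
            index[key].append(record_id)
    return index

def link(deaths, census, matchkeys=MATCHKEYS):
    """Apply matchkeys in order; accept a link only if the key is unique on both sides."""
    links = {}
    deaths_left, census_left = dict(deaths), dict(census)
    for fields in matchkeys:
        death_index = index_by_key(deaths_left, fields)
        census_index = index_by_key(census_left, fields)
        for key, death_ids in death_index.items():
            census_ids = census_index.get(key, [])
            if len(death_ids) == 1 and len(census_ids) == 1:
                links[death_ids[0]] = (census_ids[0], fields)
                # Linked records leave the residual pool before the next matchkey.
                del deaths_left[death_ids[0]]
                del census_left[census_ids[0]]
    return links

# Toy records, invented for illustration only.
deaths = {
    "d1": {"forename": "Ann", "surname": "Lee", "dob": "1950-01-02", "postcode": "AB1 2CD"},
    "d2": {"forename": "Bob", "surname": "Ray", "dob": "1940-05-06", "postcode": "ZZ9 9ZZ"},
    "d3": {"forename": "Cy", "surname": "Fox", "dob": "1931-07-08", "postcode": "EF5 6GH"},
}
census = {
    "c1": {"forename": "ann", "surname": "lee", "dob": "1950-01-02", "postcode": "AB1 2CD"},
    "c2": {"forename": "Bob", "surname": "Ray", "dob": "1940-05-06", "postcode": "XY3 4ZZ"},
    "c3": {"forename": "Dee", "surname": "Orr", "dob": "1962-09-10", "postcode": "EF5 6GH"},
}

links = link(deaths, census)
linkage_rate = len(links) / len(deaths)
print(links, round(linkage_rate, 2))
```

In this toy run the second record links only on the looser key because its postcode changed between the two sources, which is the kind of case the abstract addresses by bringing in Patient Register addresses. In practice each matchkey would be checked by clerical review before use, as the abstract notes.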
The COVID - Curated and Open aNalysis aNd rEsearCh plaTform (CO-CONNECT).
Q3 HEALTH CARE SCIENCES & SERVICES Pub Date : 2022-08-25 DOI: 10.23889/ijpds.v7i3.1792
E. Jefferson, Aziz Sheik, S. Hopkins, P. Quinlan
Objectives: CO-CONNECT is making UK COVID-19 data Findable, Accessible, Interoperable and Reusable (FAIR) through a federated platform, which supports secure, anonymised research at scale and pace. This interdisciplinary project, spanning 22 organisations, is connecting data from >50 large research cohorts and data collected through routine healthcare provision across the UK.

Approach: Across the UK, data has been collected that can help us answer key questions about COVID-19. Because the data are held in many places and governed by many different processes, it is difficult and complex for public health groups, researchers, policymakers, and government to find and access large amounts of high-quality data quickly and efficiently to make decisions. In collaboration with Health Data Research UK, CO-CONNECT is streamlining the processes for accessing data for research.

Results: 1) Discovering data and meta-analysis: CO-CONNECT enables researchers to determine how many people meet their research criteria within the various datasets across the UK through the Health Data Research Innovation Gateway Cohort Discovery tool, e.g. “How many people in each dataset have had a PCR test which was positive and were under the age of 40?” Only summary-level, anonymous data are provided, so researchers can answer such questions rapidly without requiring multiple data governance permissions or directly contacting each data source. The tool also supports aggregate-level meta-analysis of the data. 2) Detailed analysis: With data governance approvals, researchers can analyse detailed, standardised, linked, pseudonymised data in a Trusted Research Environment. The common format reduces the effort required on each research project, supporting rapid research.

Conclusion: Providing data in this de-identified, safe way enables rapid, robust research. For example, COVID-19 results from a test centre can be linked to hospital records along with prescriptions from pharmacies, enabling researchers to understand whether people with different existing health conditions are more or less susceptible to COVID-19. To find out more, visit https://co-connect.ac.uk.
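The cohort discovery idea described above, in which each data source answers a question locally and returns only a summary count, can be sketched in a few lines of Python. This is an illustrative sketch only: the `Person` type, the source names, and the function names are hypothetical and do not reflect the CO-CONNECT or Cohort Discovery implementation.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Person:
    age: int
    pcr_positive: bool

def local_count(records, predicate):
    """Runs inside a data source; only the integer count leaves it."""
    return sum(1 for record in records if predicate(record))

def cohort_discovery(sources, predicate):
    """Ask every source the same question and collect one count per dataset."""
    return {name: local_count(records, predicate) for name, records in sources.items()}

# Toy datasets, invented for illustration only.
sources = {
    "cohort_a": [Person(34, True), Person(52, True), Person(29, False)],
    "cohort_b": [Person(38, True), Person(21, True), Person(45, False)],
}

# "How many people in each dataset have had a positive PCR test and were under 40?"
counts = cohort_discovery(sources, lambda p: p.pcr_positive and p.age < 40)
print(counts)
```

The design point is that row-level records never cross the boundary of `local_count`; the caller only ever sees one integer per dataset.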
GRAIMatter: Guidelines and Resources for AI Model Access from TrusTEd Research environments (GRAIMatter).
Q3 HEALTH CARE SCIENCES & SERVICES Pub Date : 2022-08-25 DOI: 10.23889/ijpds.v7i3.2005
E. Jefferson, Christian Cole, Alba Crespi i Boixader, Simon Rogers, Maeve Malone, F. Ritchie, Jim Q. Smith, Francesco Tava, A. Daly, J. Beggs, Antony Chuter
Objectives: To assess a range of tools and methods that support Trusted Research Environments (TREs) in checking output from AI methods for potentially identifiable information, to investigate the legal and ethical implications and controls, and to produce a set of guidelines and recommendations to support all TREs with export controls of AI algorithms.

Approach: TREs provide secure facilities to analyse confidential personal data, with staff checking outputs for disclosure risk before publication. Artificial intelligence (AI) has high potential to improve the linking and analysis of population data, and TREs are well suited to supporting AI modelling. However, TRE governance focuses on classical statistical data analysis. The size and complexity of AI models present significant challenges for the disclosure-checking process. Models may be susceptible to external attacks that use complicated methods to reverse engineer the learning process and find out about the data used for training, with more potential to lead to re-identification than conventional statistical methods.

Results: GRAIMatter is: (1) quantitatively assessing the risk of disclosure from different AI models, exploring different models, hyper-parameter settings and training algorithms over common data types; (2) evaluating a range of tools to determine their effectiveness for disclosure control; (3) assessing the legal and ethical implications of TREs supporting AI development and identifying aspects of existing legal and regulatory frameworks requiring reform; (4) running 4 PPIE workshops to understand participants' priorities and beliefs around safeguarding and securing data; and (5) developing a set of recommendations including (a) suggested open-source toolsets for TREs to use to measure and reduce disclosure risk; (b) descriptions of the technical and legal controls and policies TREs should implement across the 5 Safes to support AI algorithm disclosure control; and (c) training implications for TRE staff and for how they validate researchers.

Conclusion: GRAIMatter is developing a set of usable recommendations for TREs to use to guard against the additional risks when disclosing trained AI models from TREs.
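To make the disclosure risk concrete, here is a deliberately simple Python sketch of one kind of attack: guessing whether a record was in the training data from how confident the model is about it. The toy nearest-neighbour model, its confidence score, and the 0.99 threshold are all assumptions chosen for illustration; the abstract does not say which models or attacks GRAIMatter evaluated.

```python
import math

class NearestNeighbourModel:
    """Toy model that keeps its training points, as instance-based models do."""

    def __init__(self, training_points):
        self._points = list(training_points)

    def confidence(self, x):
        # Hypothetical score: 1.0 when x coincides with a stored training point,
        # falling towards 0 as x moves away from all of them.
        nearest = min(math.dist(x, p) for p in self._points)
        return 1.0 / (1.0 + nearest)

def infer_membership(model, candidate, threshold=0.99):
    """Attacker's guess: very high confidence suggests a training record."""
    return model.confidence(candidate) >= threshold

# Toy data, invented for illustration only.
train = [(1.0, 2.0), (3.5, 0.5), (2.0, 4.0)]
held_out = [(1.2, 2.1), (5.0, 5.0)]

model = NearestNeighbourModel(train)
flagged_train = [infer_membership(model, x) for x in train]
flagged_held_out = [infer_membership(model, x) for x in held_out]
print(flagged_train, flagged_held_out)
```

Here the attacker only needs query access to the released model, which is why checking a trained model before export differs from checking a table of aggregate statistics.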
Challenges in public healthcare research data warehouse integration and operationalisation.
Q3 HEALTH CARE SCIENCES & SERVICES Pub Date : 2022-08-25 DOI: 10.23889/ijpds.v7i3.1859
Tanya Ravipati, N. Andrew, V. Srikanth, R. Beare
Objectives: Public health service organisations use multiple patient administration and electronic health record systems. We describe the implementation of a data warehouse automation tool within the National Centre for Healthy Ageing (NCHA) data platform to operationalise a research data warehouse and optimise data quality and data provision for health services research.

Approach: The traditional data warehouse life cycle comprises repetitive manual tasks and dependency on specialist developers. Automation tools overcome most of these inefficiencies. We conducted an internal risk benefit analysis, which was validated against published literature on data warehouse optimisation and automation. Industry-based data warehouse automation tools were reviewed to align the NCHA requirements with each tool’s functionality. Tools were then shortlisted and evaluated over a six-week period on: (1) automation of standard tasks; (2) data pipeline alignment with the World Health Organization’s (WHO) Data Quality Review Framework; and (3) resource dependency risk mitigation through a Proof of Concept (PoC).

Results: The priority areas identified by the risk benefit analysis included: end-to-end data warehouse automation; auto scripting; connectivity/linkage with multiple sources; reverse/forward engineering; audit trail conformance; scalability; support for multiple data warehouse architectures; automated documentation; data management including data quality; and post-subscription independence. Twenty scientific publications were included in the final literature review (10% within healthcare) and supported the majority of the identified priority areas. The industry-based review identified 11 suitable data warehouse/Extract-Transform-Load (ETL) automation tools. Five tools demonstrated adequate performance for task automation, data quality management, reduced dependency on specialist developers and on-premise linkage compatibility. Two automation tools were each tested for 6 weeks through PoC development. One automation tool met 8 out of the 10 automation requirements and was selected for implementation.

Conclusion: Data warehouse development processes are complex and time consuming. Tools that automate repetitive tasks and scripting increase consistency while reducing the dependency on specialist staff. Integrated data quality management minimises the time researchers spend pre-processing patient-level data sourced through a semi-automated data warehouse.
Is PAX-Good Behaviour Game (PAX) Associated with Better Mental Health and Educational Outcomes for First Nations Children?
Q3 HEALTH CARE SCIENCES & SERVICES Pub Date : 2022-08-25 DOI: 10.23889/ijpds.v7i3.1957
M. Chartier, G. Munro, D. Jiang, Scott C McCulloch, Wendy Au, M. Brownell, Rob Santos, F. Turner, Leanne Boyd, Nora Murdock, J. Bolton, J. Sareen
Objectives: PAX, a mental health promotion approach, has been shown to decrease negative mental health outcomes and improve academic achievement. These effects have yet to be shown among Indigenous children. We evaluated PAX for improving First Nations children’s outcomes, following a research process in which community members and researchers work more collaboratively.

Approach: Building on a long-term relationship with Swampy Cree Tribal Council, community members, First Nations leaders and researchers worked together through all phases of the project. This cluster randomized controlled trial used population-based health, social services, and education administrative data that allowed de-identified individual-level linkages across all databases through a scrambled health number. Our cohort of 725 children from 20 First Nations schools was randomized to PAX (n=469, 11 schools) or wait-list control (n=256, 9 schools). We used propensity score weighting and multi-level modeling to estimate the differences over time (2011 up to 2020) between children exposed to PAX and those who were not.

Results: Differences in baseline characteristics were found between the two groups of children, despite the cluster randomization. After applying propensity score weights, children in the PAX group had significantly greater decreases in conduct problems (β:-1.08, standard error (se):0.2505, p<.0001), hyperactivity (β:-1.13, se:0.3617, p=.0018), and peer problems (β:-1.10, se:0.3043, p=.0003) and a greater increase in prosocial scores (β:2.68, se:0.4139, p<.0001) than control group children. The percentage of children who met academic expectations was higher in the PAX group than in the control group; however, only grade 3 numeracy (odds ratio (OR):4.30, confidence interval (CI):1.34 – 13.77) and grade 8 reading and writing (OR:2.78, CI:1.01 – 7.67) reached statistical significance. We found no evidence that PAX was associated with fewer emotional problems, fewer diagnosed mental disorders or better student engagement.

Conclusion: These findings suggest that PAX was effective in improving the mental health and academic outcomes of children in First Nations communities. Examining what works in Indigenous communities is crucial because approaches that are effective in some populations may not necessarily be culturally appropriate for remote Indigenous communities.
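The reported coefficients, standard errors and intervals can be sanity-checked with a short Python sketch. It rests on two assumptions that the abstract does not confirm: that the p-values come from two-sided Wald tests against a standard normal distribution, and that the odds ratio intervals are symmetric on the log scale (so the point estimate is the geometric mean of the bounds). The function names are hypothetical.

```python
import math

def wald_test(beta, se):
    """Return the Wald z statistic and its two-sided normal p-value."""
    z = beta / se
    p = math.erfc(abs(z) / math.sqrt(2))  # equals 2 * (1 - Phi(|z|))
    return z, p

# Coefficients and standard errors as reported in the abstract.
reported = {
    "conduct problems": (-1.08, 0.2505),
    "hyperactivity": (-1.13, 0.3617),
    "peer problems": (-1.10, 0.3043),
    "prosocial": (2.68, 0.4139),
}
results = {name: wald_test(beta, se) for name, (beta, se) in reported.items()}
for name, (z, p) in results.items():
    print(f"{name}: z = {z:.2f}, p = {p:.4f}")

def geometric_midpoint(lower, upper):
    """Point estimate implied by an interval that is symmetric on the log scale."""
    return math.exp((math.log(lower) + math.log(upper)) / 2)

or_numeracy = geometric_midpoint(1.34, 13.77)  # reported OR: 4.30
or_literacy = geometric_midpoint(1.01, 7.67)   # reported OR: 2.78
print(round(or_numeracy, 2), round(or_literacy, 2))
```

Under these assumptions the recomputed values agree with the reported p-values and odds ratios to the precision given. The study's multi-level models may have used a different reference distribution, so small discrepancies would not indicate an error.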