
International Journal of Population Data Science: Latest Articles

Data Note: Challenges when combining housing data from multiple sources to identify overcrowded households.
IF 1.6 Q3 HEALTH CARE SCIENCES & SERVICES Pub Date : 2025-05-20 eCollection Date: 2023-01-01 DOI: 10.23889/ijpds.v8i2.2927
Laura Scott, Yan Weigang, Marcella Ucci, Jessica Sheringham

Background: This project in one urban local authority in London (England) sought to assess the feasibility of generating locally-derived indices of overcrowding using data available to local councils on the population and their homes. We merged data at household level using the Unique Property Reference Number from publicly available Energy Performance Certificates and commercial property platforms, with data available to councils on the population and their housing characteristics, drawn from multiple sources including council tax bands and council housing databases. Multiple imputation was used to address missing data. Using the dataset, it was possible to generate two indices of overcrowding for households with dependent children, based on the bedroom standard and the space standard, which could be compared with nationally derived estimates.

Data challenges: We encountered three challenges with data. 1. Individuals in the population were excluded through linkage with household-level data. 2. Definitions of overcrowding are ambiguous and variably applied. 3. Many local areas face high proportions of missing household data, particularly numbers of bedrooms. We discuss how we addressed such problems and illustrate with a local example how they could affect estimates of overcrowding prevalence.

Lessons learned: Further clarity is needed in how bedrooms are defined to compare overcrowding prevalence generated locally and nationally. Access to national records on bedroom numbers would help local areas identify overcrowding in their own populations. Despite these challenges, we demonstrate it is feasible to generate overcrowding indices that can be useful for researchers and local policy makers seeking to develop or evaluate strategies to address household overcrowding.
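The first and third data challenges above (people dropped during household-level linkage, and missing bedroom counts) can be illustrated with a minimal sketch. It is not the authors' pipeline: the records, the field names (`person_id`, `uprn`, `bedrooms`) and the helper `link_households` are all hypothetical, and a real linkage would also need to handle malformed or duplicated property references.

```python
from collections import defaultdict

# Hypothetical records; field names and values are illustrative only.
residents = [
    {"person_id": 1, "uprn": "100"},
    {"person_id": 2, "uprn": "100"},
    {"person_id": 3, "uprn": "200"},
    {"person_id": 4, "uprn": None},   # no property reference recorded
    {"person_id": 5, "uprn": "300"},  # reference absent from the property data
]
properties = {
    "100": {"bedrooms": 2},
    "200": {"bedrooms": None},        # bedroom count missing
}


def link_households(residents, properties):
    """Group residents by UPRN; return linked households and unlinked people."""
    households = defaultdict(list)
    unlinked = []
    for person in residents:
        uprn = person["uprn"]
        if uprn in properties:
            households[uprn].append(person["person_id"])
        else:
            unlinked.append(person["person_id"])
    return dict(households), unlinked


households, unlinked = link_households(residents, properties)

# Linked households that still lack a bedroom count.
missing_bedrooms = [u for u in households if properties[u]["bedrooms"] is None]

share_excluded = len(unlinked) / len(residents)          # people lost in linkage
share_missing = len(missing_bedrooms) / len(households)  # households needing imputation
```

Reporting both shares alongside any overcrowding estimate makes it easier to judge how much the estimate depends on who was excluded and on how missing bedroom counts were filled in.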

Citations: 0
Open science and phenotyping in UK administrative health, education and social care data: the ECHILD phenotype code list repository.
IF 1.6 Q3 HEALTH CARE SCIENCES & SERVICES Pub Date : 2025-05-13 eCollection Date: 2025-01-01 DOI: 10.23889/ijpds.v10i2.2943
Matthew A Jay, Kate Lewis, Difei Shi, Rebecca Langella, Tony Stone, Sorcha Ní Chobhthaigh, Ania Zylbersztejn, Ruth Blackburn, Katie Harron

Administrative health data, such as the Hospital Episode Statistics (HES), can be used to identify groups of people with a particular target condition, a process known as phenotyping. Clinical phenotypes are useful as exposures, covariates and outcomes in research studies using administrative data, including health data linked to other sources such as the Education and Child Health Insights from Linked Data (ECHILD) project. ECHILD brings together HES and other national health datasets with the National Pupil Database and children's social care data for all of England as a data asset that can be accessed by researchers at UK institutions. Because using linked administrative data is complex, the ECHILD team has created additional resources to improve the accessibility of ECHILD. One such initiative is the ECHILD Phenotype Code List Repository. The Repository is a fully open and searchable website containing phenotype code lists that can be used in ECHILD and beyond. As well as a primer on phenotyping, it includes summaries of each code list and R and Stata implementation scripts. The Repository was designed according to a set of principles to ensure that finding and using code lists is easy and standardised. The ECHILD Phenotype Code List Repository is a step forward in the findability and use of phenotype code lists in ECHILD and its constituent datasets.
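As a rough illustration of what applying a phenotype code list involves, the sketch below flags patients with at least one diagnosis code matching a list of ICD-10 prefixes. The Repository itself supplies R and Stata scripts; this Python version is only an assumption-laden analogue. The code list (`J45`, `J46`, the ICD-10 asthma categories), the episode records and the function names are illustrative and are not taken from the Repository, and codes are assumed to be stored without a decimal point.

```python
# Illustrative ICD-10 prefixes (asthma); not taken from the ECHILD Repository.
ASTHMA_CODES = ("J45", "J46")

# Hypothetical hospital episodes, with diagnosis codes stored without dots.
episodes = [
    {"patient_id": "A", "diagnoses": ["J459", "Z380"]},
    {"patient_id": "B", "diagnoses": ["S060"]},
    {"patient_id": "A", "diagnoses": ["K359"]},
    {"patient_id": "C", "diagnoses": ["J46X"]},
]


def has_phenotype(diagnoses, code_prefixes):
    """True if any diagnosis code starts with one of the listed prefixes."""
    return any(code.startswith(code_prefixes) for code in diagnoses)


def phenotype_patients(episodes, code_prefixes):
    """Return the set of patients with at least one matching episode."""
    return {
        episode["patient_id"]
        for episode in episodes
        if has_phenotype(episode["diagnoses"], code_prefixes)
    }


cases = phenotype_patients(episodes, ASTHMA_CODES)
```

Prefix matching is only one possible rule; a published code list may instead require exact codes, specific diagnosis positions or age restrictions, so the list's own documentation should take precedence.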

Citations: 0
Spatio-temporal forecasting of COVID-19 cases in the Netherlands for source and contact tracing.
IF 1.6 Q3 HEALTH CARE SCIENCES & SERVICES Pub Date : 2025-05-07 eCollection Date: 2025-01-01 DOI: 10.23889/ijpds.v10i1.2703
Max C Keuken, Jizzo R Bosdriesz, Anders Boyd, Elisabeth M den Boogert, Ivo K Joore, Nicole H T M Dukers-Muijrers, Gini van Rijckevorsel, Hannelore M Götz, Irene E Goverse, Mariska W F Petrignani, Stijn F H Raven, Susan van den Hof, Kirsten V C Wevers-de Boer, Maarten F Schim van der Loeff, Amy Matser

Source and contact tracing (SCT) is a core public health measure that is used to contain the spread of infectious diseases. It aims to identify a source of infection, and to advise those who have been exposed to this source. Due to the rapid increases in incidence of COVID-19 in the Netherlands, the capacity to conduct a full SCT quickly became insufficient. Therefore, the public health services (PHS) might benefit from a restricted strategy targeted to geographical regions where (predicted) case-to-case transmission is high. In this study, we set out to develop a prediction model for the number of COVID-19 cases per postal code within the Netherlands using geographic and demographic features. The study population consists of individuals residing in one of the participating nine Dutch PHS regions who tested positive for SARS-CoV-2 between 1 June 2020 and 27 February 2021. Using a machine learning random forest regression model, we predicted the top 100 postal codes with the highest number of cases with an accuracy of 49% for the current week, 42% for next week, and 44% for two weeks from present. In addition, the age groups of 20-39 and 40-64 years had a higher prediction accuracy than groups outside these age ranges. The developed model provides a starting point for targeted preventive SCT efforts that incorporate geospatial and demographic characteristics of a neighbourhood. It should nonetheless be noted that during the early stages of the outbreak, the number of available datapoints needed to inform such models is likely insufficient. Given the accuracy and data requirements of the developed model, it is unlikely that this class of models can play a pivotal role in informing policy during the early phases of a future epidemic.
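The abstract does not say how the top-100 accuracy was computed. One plausible reading, assumed here purely for illustration, is the share of the predicted top-k postal codes that also appear in the observed top-k. The sketch below implements that overlap measure with made-up postal codes and counts; the function names are hypothetical and no random forest model is fitted.

```python
def top_k(counts, k):
    """Return the k keys with the highest values (ties broken by key)."""
    ranked = sorted(counts, key=lambda code: (-counts[code], code))
    return set(ranked[:k])


def top_k_overlap(predicted, observed, k):
    """Share of the predicted top-k that is also in the observed top-k."""
    return len(top_k(predicted, k) & top_k(observed, k)) / k


# Made-up weekly case counts per postal code (predicted vs. observed).
predicted = {"1011": 30.2, "1012": 12.5, "1013": 25.1, "1014": 3.0, "1015": 18.7}
observed = {"1011": 28, "1012": 20, "1013": 9, "1014": 4, "1015": 22}

overlap = top_k_overlap(predicted, observed, k=3)
```

Under this definition, a value of 49% with k = 100 would mean that 49 of the 100 predicted postal codes were among the 100 with the most observed cases.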

Citations: 0
Discovering linked data collections through a new national metadata platform.
IF 1.6 Q3 HEALTH CARE SCIENCES & SERVICES Pub Date : 2025-04-30 eCollection Date: 2025-01-01 DOI: 10.23889/ijpds.v10i1.2461
Kate M Miller, Felicity S Flack, Merran B Smith, Vicki Bennett, Carina Ecremen Marshall

Background: Metadata plays a crucial role in the health research infrastructure ecosystem. Despite the abundance of metadata for data collections in Australia, the vast and diverse data custodian landscape poses challenges for linked data researchers to find relevant information for multiple data collections, often making it an arduous and time-intensive task.

Methods: The project comprised three phases: an initial scoping exercise to understand the current state of metadata and related best practice; a national consultation involving researchers, data linkage staff and data custodians to develop a high-fidelity prototype of a metadata platform; and a final build and implementation phase. The platform underwent several prototyping and testing cycles to refine the digital experience.

Results: Expert interviews confirmed that there is a wealth of metadata available, but it is difficult for researchers to access and evaluate. Consultations with researchers identified opportunities to standardise metadata across collections and provide a centralised platform to enhance the discoverability of data collections for research using linked data. High value platform features included searching, browsing and filtering capabilities, data item list metadata, standardised formats, sample data, and frequently asked questions. The final design and functionality reflected user consultations and data custodian input on feasibility.

Conclusion: The Population Health Research Network developed a metadata platform to enable researchers to evaluate the suitability of Australian data collections for linked data projects more effectively. The platform has standardised the way in which metadata is presented for data collections nationally. Improved metadata quality, readability and accessibility will save time and enhance the quality of applications for linked data.

Citations: 0
Transparency in the existence, use, and output of a mental health data resource: a descriptive paper from the National Institute for Health and Care Research (NIHR) Maudsley Biomedical Research Centre (BRC) Clinical Record Interactive Search (CRIS) Platform.
IF 1.6 Q3 HEALTH CARE SCIENCES & SERVICES Pub Date : 2025-04-10 eCollection Date: 2025-01-01 DOI: 10.23889/ijpds.v10i2.2945
Amelia Jewell, Matthew Broadbent, Claire Delaney-Pope, Megan Pritchard, Hannah Woods, Robert Stewart

Background: Transparency in the use of routinely collected mental health data for research is essential in maintaining public support and trust, as well as for supporting the sharing of information and data resources amongst the academic community. The National Institute for Health and Care Research (NIHR) Maudsley Biomedical Research Centre (BRC) Clinical Record Interactive Search (CRIS) enables a case register of deidentified mental health records from the South London and Maudsley NHS Foundation Trust (SLaM). CRIS supports mental health research across the lifespan from children and adolescents to older adults.

Aim: This paper aims to describe the activities which contribute to ensuring that transparency is maintained throughout the journey of data in CRIS: from data collection, through application in research, to dissemination of findings.

Approach: A communications plan is in place to support Patient and Public Involvement (PPI) and transparency initiatives for all CRIS stakeholders, including patients and carers, academic users, and the general public. Activities can be divided into three categories of transparency: existence, use, and output.

Discussion: There are challenges to maintaining transparency, including ensuring that activities are varied enough to reach all stakeholders, including harder to reach groups, and presenting information in a way that is appropriate for the relevant audience. However, greater transparency has led to more opportunities for researchers to engage with patients and the CRIS model is widely accepted by patients.

Conclusion: This paper set out to describe CRIS communications and transparency activities. We believe the material covered will be of interest to other providers of routinely collected data for research.

Citations: 0
Data resource profile: a guide for constructing school-to-work sequence analysis trajectories using the longitudinal education outcomes (LEO) data.
IF 1.6 Q3 HEALTH CARE SCIENCES & SERVICES Pub Date : 2025-03-25 eCollection Date: 2023-01-01 DOI: 10.23889/ijpds.v8i6.2953
Shivani Sickotra

Introduction: Sequence analysis is a powerful methodology for examining longitudinal school-to-work trajectories. Despite its growing use, there is limited guidance on preparing suitable datasets. This resource details the creation of a dataset specifically designed for sequence analysis, capturing yearly education and employment activity states for 556,182 individuals from England's 2010/11 school-leaver cohort.

Methods: The dataset was constructed using the Department for Education's Longitudinal Education Outcomes (LEO) data. SQL was used to extract relevant variables, and data linkage and preprocessing were performed using R. Data processing was tailored to sequence analysis, including reducing the number of activity states and applying a hierarchy to integrate education and employment data.

Results: The resulting dataset spans activities from the first non-compulsory state in 2011/12 until 2018/19, tracking trajectories from ages 16/17 to 23/24. The dataset was designed with the ability to subset school-leavers by their initial Combined Authority residence to aid in regional analysis of school-to-work trajectories. Individual-level socio-demographic characteristics that can be linked to the longitudinal activity histories were also built, alongside longitudinal geographic locations and employment earnings data. Additionally, the limitations of the developed data are discussed.

Conclusion: This resource provides crucial guidance for researchers and practitioners who may require experience preparing input datasets for sequence analysis, addressing the current gap in available resources. By offering step-by-step instructions and shared code, it empowers users to recreate or adapt the dataset for their specific research needs. Its ability to subset by region further supports localised and comparative studies of school-to-work trajectories, making it a valuable tool for advancing existing research. The LEO data can be accessed by application through the Office for National Statistics Secure Research Service.
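The Methods mention applying a hierarchy to integrate education and employment data into one activity state per year. The sketch below shows the general idea only. The state names, their priority order and the `unknown` fallback are assumptions made for illustration, not the hierarchy used in the paper, and the paper's shared code is written in SQL and R rather than Python.

```python
# Hypothetical priority order: earlier entries win when activities overlap.
HIERARCHY = [
    "higher_education",
    "further_education",
    "employment",
    "out_of_work_benefits",
]
FALLBACK = "unknown"  # assumed label for years with no recorded activity


def yearly_state(activities):
    """Collapse a set of concurrent activities into one state."""
    for state in HIERARCHY:
        if state in activities:
            return state
    return FALLBACK


def build_sequence(activities_by_year, years):
    """Return one state per academic year, in order."""
    return [yearly_state(activities_by_year.get(year, set())) for year in years]


years = ["2011/12", "2012/13", "2013/14", "2014/15"]
person = {
    "2011/12": {"further_education", "employment"},  # studying while working
    "2012/13": {"further_education"},
    "2013/14": {"employment"},
}

sequence = build_sequence(person, years)
```

The resulting list has one entry per year, which is the shape that sequence analysis tools generally expect as input; the choice of priority order directly changes the trajectories, so it should be reported with any results.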

引用次数: 0
The application of population data linkage to capture sibling health outcomes among children and young adults with neurodevelopmental conditions. A scoping review.
IF 1.6 Q3 HEALTH CARE SCIENCES & SERVICES Pub Date : 2025-03-18 eCollection Date: 2025-01-01 DOI: 10.23889/ijpds.v10i1.2413
Caitlin Gray, Helen Leonard, Matthew N Cooper, Dheeraj Rai, Emma J Glasson

Introduction: Siblings of children with neurodevelopmental conditions have unique experiences and challenges related to their sibling role. Some develop mental health concerns as measured by self-reported surveys or parent report. Few data are available at the population level, owing to difficulties capturing wide-scale health data for siblings. Data linkage is a technique that can facilitate such research.

Objective: To explore the application of population data linkage as a research method to capture health outcomes of siblings of children with neurodevelopmental conditions.

Inclusion criteria: Peer-reviewed papers that captured health outcomes for siblings of children and young adults with neurodevelopmental conditions using population data linkage.

Methods: JBI scoping review methods were followed. Papers were searched within CINAHL, Ovid, Scopus, and Web of Science from 2000 to 2024 using search terms relating to 'data linkage', 'neurodevelopmental conditions', 'siblings', and 'health outcomes'.

Results: The final data extraction included 31 papers. The neurodevelopmental conditions of index children were autism, attention deficit hyperactivity disorder, intellectual disability, cerebral palsy and developmental delay. The mean follow-up time was 31 years, and the majority of studies originated from Scandinavia. Sibling health outcomes observed were psychiatric diagnoses, self-harm and suicide, other neurodevelopmental conditions, and medical conditions such as atopic disease, cancer and obesity.

Conclusion: Data linkage can help capture sibling health outcomes quickly across large cohorts with a range of neurodevelopmental conditions. Future research could be enhanced by focusing on siblings as the primary group of interest, integrating more genealogical data, and comparing diagnostic groups and severity levels. Adopting established, rigorous reporting methods will increase the replicability of this type of research and provide a stronger evidence base from which to inform sibling supports.

Data Note: Alternative Name Encodings - Using Jyutping or Pinyin as tonal representations of Chinese names for data linkage.
IF 1.6 Q3 HEALTH CARE SCIENCES & SERVICES Pub Date : 2025-03-11 eCollection Date: 2023-01-01 DOI: 10.23889/ijpds.v8i5.2935
Joseph Lam, Mario Cortina-Borja, Robert Aldridge, Ruth Blackburn, Katie Harron

Accurate data linkage across large administrative databases is crucial for addressing complex research and policy questions, yet linkage errors, stemming from inconsistent name representations, can introduce biases, predominantly for names not given in English. This data note examines the impact of romanisation on linkage accuracy, focusing on Chinese names and comparing standardised systems (Jyutping and Pinyin) with the non-standardised Hong Kong Government Cantonese Romanisation (HKG-romanisation). We identify three primary issues: language-specific variations in romanisation, the loss of tonal information inherent to tonal languages, and discrepancies in name order conventions. Using a dataset of 771 Hong Kong student names, our analysis reveals that standardised romanisation systems enhance the uniqueness and consistency of name representations, thereby improving linkage precision and recall compared to HKG-romanisation. Specifically, Jyutping and Pinyin achieved over 95% recall in blocking strategies, whereas HKG-romanisation only reached 68.8%. Incorporating tonal information further improved recall. These findings underscore the necessity of adopting standardised, tone-sensitive romanisation systems and flexible database designs to reduce linkage errors and promote data equity for under-represented groups. We advocate for the implementation of phonetic encodings in databases, alongside language-specific pre-processing protocols, to ensure more inclusive and accurate data linkage processes.
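
To make the notion of blocking recall concrete, the following is a minimal Python sketch, not the authors' code. It assumes each record carries a name encoding used as its blocking key, and it measures the share of true matching pairs that land in the same block. The record IDs, the toy names, the variant spelling, and the helpers `strip_tones` and `blocking_recall` are all hypothetical and chosen only for illustration.

```python
import re


def strip_tones(encoding: str) -> str:
    """Drop the trailing tone digit of each syllable, e.g. 'can4 daai6' -> 'can daai'."""
    return re.sub(r"(?<=[a-z])[1-6]\b", "", encoding)


def blocking_recall(true_pairs, keys_a, keys_b):
    """Share of true matching pairs whose two records share a blocking key."""
    if not true_pairs:
        return 0.0
    hits = sum(1 for a, b in true_pairs if keys_a[a] == keys_b[b])
    return hits / len(true_pairs)


# Illustrative toy data (assumption): two files holding the same two people.
true_pairs = [("a1", "b1"), ("a2", "b2")]

# Non-standardised spellings can differ between files for the same name.
hkg_a = {"a1": "chan tai man", "a2": "wong siu ming"}
hkg_b = {"b1": "chan tai man", "b2": "wong siu meng"}  # made-up variant spelling

# A standardised encoding yields one representation per name in both files.
jyutping_a = {"a1": "can4 daai6 man4", "a2": "wong4 siu2 ming4"}
jyutping_b = {"b1": "can4 daai6 man4", "b2": "wong4 siu2 ming4"}

recall_hkg = blocking_recall(true_pairs, hkg_a, hkg_b)
recall_jyutping = blocking_recall(true_pairs, jyutping_a, jyutping_b)

# Toneless keys can be derived from the toned ones when a coarser block is wanted.
toneless_a = {rec: strip_tones(enc) for rec, enc in jyutping_a.items()}
```

In this toy setting the inconsistent spelling splits one true pair across blocks, while the standardised encoding keeps both pairs together. The figures reported in the abstract come from the authors' dataset of 771 names, not from this sketch.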

Cohort Profile Update: Reflecting back and looking ahead: Updating the Comparative Outcomes and Service Utilization Trends (COAST) Study to include 28 years of linked data from people with and without HIV in British Columbia, Canada.
IF 2.2 Q3 HEALTH CARE SCIENCES & SERVICES Pub Date : 2025-03-06 eCollection Date: 2025-01-01 DOI: 10.23889/ijpds.v10i1.2496
Michael O Budu, Katherine W Kooij, Kate Heath, Taylor McLinden, Claudette Cardinal, Scott D Emerson, Paul Sereda, Jason Trigg, Jenny Li, Erin Ding, Mark W Hull, Kate Salters, Viviane D Lima, Rolando Barrios, Julio S G Montaner, Robert S Hogg

Introduction: The Comparative Outcomes and Service Utilization Trends (COAST) study compares health outcomes among People With HIV (PWH) and People Without HIV (PWoH) in British Columbia (BC), Canada. The cohort was recently updated to include persons diagnosed with HIV after March 31, 2013, and expanded to broaden research applications.

Methods: COAST includes PWH and a 10% random sample of the general population without HIV, all aged ≥19. Our study links an HIV registry to healthcare practitioner billing, hospital and emergency department attendance data, prescription drug dispensations, and a cancer registry. Our cohort update included new sampling strategies, adding data on emergency department visits not previously captured, and extending our follow-up period to 28 years (from 1992 to 2020). COAST now includes 17,119 PWH and 615,264 PWoH.

Findings to date: COAST has contributed to our understanding of combination antiretroviral therapy (ART) use, health service utilization, chronic diseases, mental health and substance use disorders, and mortality among PWH in BC. Key findings include earlier age at diagnosis of certain chronic conditions, a higher incidence of mood disorders among PWH, and noteworthy shifts in causes of death among PWH on ART. The updated cohort will provide insights into the changing nature of the population living with HIV in BC and serve as a novel foundation for further research.

Future plans: To explore and extend knowledge of the evolving trends among people living and aging with HIV in BC, regular data linkage updates and the inclusion of additional datasets are scheduled every two years.

Using a deterministic matching computer routine to identify hospital episodes in a Brazilian de-identified administrative database for the analysis of obstetrics hospitalisations.
IF 1.6 Q3 HEALTH CARE SCIENCES & SERVICES Pub Date : 2025-03-03 eCollection Date: 2025-01-01 DOI: 10.23889/ijpds.v10i1.2467
Claudia Medina Coeli, Rosa Maria Soares Madeira Domingues, Lana Meijinhos, Daniela Medina Coeli Bastos, Rejane Sobrino Pinheiro, Valeria Saraceni, Marcos Augusto Bastos Dias, Natália Santana Paiva, Kenneth Rochel de Camargo

Introduction: The absence of a unique patient identifier in the Brazilian hospital administrative database prevents the identification of hospital episodes with multiple hospitalisations of the same patient.

Objectives: This study aims to evaluate the information gained by using a computer routine to identify acute obstetric hospital episodes and its impact on assessing markers of case severity.

Methods: The data source was a de-identified Brazilian hospital administrative database from 2017 to 2020, including hospitalisation records of women of reproductive age (10 to 49 years old) treated for acute conditions (N = 16,087,490). We processed this database by combining C++ and Python routines to create a hospital episodes database. From the latter, we selected obstetrics hospital episodes from 2018 to 2019 (N = 4,926,877). We compared selected characteristics of the hospital episodes according to their type (multiple vs single records per episode), testing for differences using effect size measures. We compared relative differences in case severity markers when using the hospital episode as the unit of analysis with those obtained from isolated hospitalisations (N = 5,018,350).

Results: Compared to single-record episodes, multiple-record episodes had a longer length of stay, a higher amount reimbursed, and a lower proportion of patients discharged alive. When comparing the analysis of isolated hospitalisations with that of hospital episodes, we observed an increase in all case severity indicators, especially for hospital deaths, with an increment of 13.15%. The computer routine reduced the share of hospital admissions whose reason for discharge did not indicate the outcome (hospital stay or inter-hospital transfer) from 2.29% to 0.73%.

Conclusions: The deterministic matching computer routine proved valuable for identifying records that refer to the same hospital episode, which improved the assessment of severe cases.
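
The abstract does not spell out the matching rules, so the following Python sketch shows only the general idea of deterministic episode building under stated assumptions. It assumes records of the same patient agree exactly on a set of de-identified attributes (here, hypothetically, date of birth and postcode) and that a new admission within `max_gap_days` of the previous discharge belongs to the same episode. The `Stay` fields, the `build_episodes` function, and the one-day gap are illustrative choices, not the authors' routine.

```python
from dataclasses import dataclass
from datetime import date
from itertools import groupby


@dataclass(frozen=True)
class Stay:
    birth_date: date   # hypothetical matching attribute
    postcode: str      # hypothetical matching attribute
    admitted: date
    discharged: date


def build_episodes(stays, max_gap_days=1):
    """Group stays that agree exactly on the key and follow each other closely in time."""
    def patient_key(s):
        return (s.birth_date, s.postcode)

    episodes = []
    ordered = sorted(stays, key=lambda s: (patient_key(s), s.admitted))
    for _, group in groupby(ordered, key=patient_key):
        current, end = [], None
        for stay in group:
            # Start a new episode when the gap since the episode's end is too long.
            if current and (stay.admitted - end).days > max_gap_days:
                episodes.append(current)
                current = []
            current.append(stay)
            end = stay.discharged if len(current) == 1 else max(end, stay.discharged)
        episodes.append(current)
    return episodes


# Toy records (assumption): one woman transferred between hospitals, later readmitted,
# plus one unrelated admission.
stays = [
    Stay(date(1990, 1, 1), "20000", date(2018, 3, 1), date(2018, 3, 3)),
    Stay(date(1990, 1, 1), "20000", date(2018, 3, 3), date(2018, 3, 10)),
    Stay(date(1990, 1, 1), "20000", date(2018, 6, 1), date(2018, 6, 2)),
    Stay(date(1985, 5, 5), "30000", date(2018, 3, 1), date(2018, 3, 2)),
]
episodes = build_episodes(stays)
episode_sizes = sorted(len(e) for e in episodes)
```

Here the transfer is merged into a single two-record episode, so an outcome such as death recorded on the second record is attributed to the whole episode rather than to an isolated hospitalisation. Exact-agreement keys are simple and reproducible, but they miss records with any discrepancy in the matching attributes.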
