Public engagement is an important mechanism for ensuring that the voices of the public are integrated into study design and data use. The commissioning of a new UK-wide birth cohort study by the UKRI Economic and Social Research Council (ESRC), the Early Life Cohort Feasibility Study (ELC-FS), necessitated renewed dialogue with the public about the acceptability of conducting a large-scale study of this kind. The ELC-FS recruited several thousand children in their first year of life, using an administrative data sampling frame, an 'opt-out' recruitment approach, and embedded linkages to education, health and social care administrative data. Achieving this involved many complexities and challenges: the sampling frame had not been used for this purpose before, its use required negotiation with different data holders in the four UK nations, and the study needed to ensure transparency around how participants' administrative and survey data would be used. Conducting public engagement projects with parents of young children before the study's fieldwork was essential to understanding the public acceptability of data use in ELC-FS. Evidence from these projects was used to support negotiations with data holders and to guide best practice for informing participants about data use and data linkage. This paper summarises the evidence from these public engagement projects relating to data transparency and to giving participants choice and control over the use of their data in the study. We describe how this evidence was implemented in three key study design areas: sampling and recruitment, the collection and use of survey data, and seeking participant consent to link administrative records to individual-level survey data. We also present evidence from the study's fieldwork on how acceptable participants found the survey design and the transparency around data use, from recruitment to data collection and processing.
Alyce Raybould, Karen Dennison, Orla McBride, Erica Wong, Lisa Calderwood, Pasco Fearon, Alissa Goodman. Integrating public engagement to promote transparent data use in a new UK-wide birth cohort. International Journal of Population Data Science 10(2):2965. Pub Date: 2025-12-03. DOI: 10.23889/ijpds.v10i2.2965. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12805980/pdf/
Introduction: Lifestyle choices encompassing dietary habits, physical activity levels, alcohol consumption, and tobacco use have been consistently shown to significantly impact individual health outcomes and overall well-being.
Objectives: This study proposes a novel composite index to measure the adoption of healthy lifestyles among the Italian population aged 18 years and over.
Methods: The Healthy Lifestyle Composite Index (HLCI) is constructed by aggregating four key dimensions: diet, physical activity, alcohol consumption, and tobacco use. The dimensions are structured as ordinal variables derived from the comprehensive Aspects of Daily Life (AVQ) multipurpose household survey conducted annually by the Italian National Statistical Institute (ISTAT). A formative approach is employed, involving defining the dimensions, determining weights through the Analytic Hierarchy Process based on expert evaluations, and specifying an aggregation procedure using a weighted Borda rule.
Results: The resulting HLCI provides a score from 0 to 100, with higher values indicating healthier lifestyles. Analysis of the HLCI and its dimensions using the 2022 AVQ data (n=32,600) reveals an average score of 61.77, with substantial variation across demographic groups. Descriptive analysis of the HLCI revealed significantly higher scores for females compared to males, driven by better performance in the alcohol and tobacco consumption dimensions. An inverted U-shaped trend emerged for age, with the youngest (18-19 years) and oldest (75+) groups exhibiting higher HLCI values. Educational level was positively associated with HLCI, with graduates scoring highest, excelling in physical activity. Geographically, the North-East region had the highest HLCI. Quantile regression on the first decile highlighted at-risk profiles with extremely low HLCI values, such as 35-44-year-old separated/divorced males with middle school education residing in South Italy.
Conclusion: Constructed using reliable data from an annually updated national survey, the HLCI allows for monitoring lifestyle dynamics across different demographic groups and geographic regions. The findings highlight specific segments of the population that may benefit from targeted interventions promoting a healthier lifestyle.
5 bullet points:
- Proposal of a new Healthy Lifestyle Composite Index (HLCI) to measure adoption of healthy lifestyles in the Italian population.
- HLCI aggregates four dimensions: diet, physical activity, alcohol consumption, and tobacco use, using data from an annual national survey.
- HLCI employs a formative approach with expert-weighted dimensions and a weighted Borda aggregation rule to calculate the 0-100 score.
- Analysis of 2022 survey data shows average HLCI of 61.77 with variations across demographics like age, marital status, and educational level.
- Monitoring heal
Manuela Scioni, Chiara Baldan, Alessia Ghirardo, Giovanna Boccuzzo. Construction of a healthy lifestyle index using Italian national survey data. International Journal of Population Data Science 10(3):2977. Pub Date: 2025-12-01. DOI: 10.23889/ijpds.v01i3.2977. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12668253/pdf/
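The HLCI abstract above names two building blocks: expert weights from the Analytic Hierarchy Process and a weighted aggregation of ordinal dimensions into a 0-100 score. The sketch below illustrates the general idea only. The pairwise judgements are invented, the row geometric mean is a common approximation to the AHP principal-eigenvector weights, and `weighted_score` is a simplified Borda-style rule assumed for illustration, not the authors' exact procedure.

```python
from math import prod

DIMENSIONS = ["diet", "physical_activity", "alcohol", "tobacco"]

def ahp_weights(matrix):
    """Approximate AHP priority weights using normalised row geometric means."""
    n = len(matrix)
    geometric_means = [prod(row) ** (1 / n) for row in matrix]
    total = sum(geometric_means)
    return [g / total for g in geometric_means]

def weighted_score(levels, max_levels, weights):
    """Borda-style score on 0-100: each ordinal level earns points in
    proportion to its position, then dimensions are combined by weight."""
    return 100 * sum(w * lvl / mx for w, lvl, mx in zip(weights, levels, max_levels))

# Hypothetical expert judgements: tobacco is rated twice as important as
# each other dimension; entry [i][j] is the importance of i relative to j.
pairwise = [
    [1, 1, 1, 0.5],
    [1, 1, 1, 0.5],
    [1, 1, 1, 0.5],
    [2, 2, 2, 1],
]
weights = ahp_weights(pairwise)

# Hypothetical respondent: ordinal levels from 0 (worst) to 3 (best).
score = weighted_score(levels=[2, 1, 3, 3], max_levels=[3, 3, 3, 3], weights=weights)
print(dict(zip(DIMENSIONS, weights)), round(score, 2))
```

Because the invented matrix is perfectly consistent, the geometric-mean weights coincide with the eigenvector weights here; with real expert judgements a consistency check would normally accompany this step.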
Pub Date: 2025-11-24; eCollection Date: 2025-01-01; DOI: 10.23889/ijpds.v10i1.2958
Marta Wilk, Gill Harper, Nicola Firman, Chris Dibben, Rich Fry, Carol Dezateux
Introduction: Up-to-date, high-quality estimates of population and households are essential for planning the provision of local and central infrastructure.
Objectives: We aimed to derive estimates of population size, household numbers and household size on Census date (21/03/2021) from north-east London primary care Electronic Health Records (EHR), and to calculate their level of agreement with the publicly available official Census 2021 estimates, in order to assess whether health data can be used to create reliable statistics.
Methods: We compared EHR and Census population estimates by sex, age, local authority, and IMD quintile, and EHR and Census household estimates by number, size, and local authority. We estimated 95% Limits of Agreement between EHR and Census household and population estimates using the Bland and Altman method. In sensitivity analyses, we excluded people with no General Practice encounter within 12 months and compared the size of the adjusted population to the Census estimate. We also compared the EHR and the administrative Statistical Population Dataset (SPD) to Census population estimates by sex and age, and the EHR and the Admin-based Occupied Address Dataset (ABOAD) to Census household estimates by local authority and household size.
Results: The EHR population estimate was 2,130,965, i.e. 7.1% higher than the Census estimate of 1,990,087. The EHR household estimate was 658,264, i.e. 9.1% lower than the Census estimate of 724,045. The estimate of the population with a recent GP encounter was 11.6% lower than the Census estimate. Compared to Census, both SPD and EHR overcounted the population of males (10.7% and 7.9% respectively) and females (3.6% and 2.7% respectively). Both ABOAD and EHR undercounted households compared to Census (-7.3% and -9.1% respectively).
Conclusions: Reliable, up-to-date population and household estimates can be derived from health records. High residential mobility increases the complexity of deriving these estimates. Excluding people without GP encounters does not improve agreement with Census. Future work will focus on comparing Census and EHR estimates using individual-level data.
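As a worked illustration of the comparisons in this abstract, the sketch below recomputes the two headline percentage differences from the reported totals and shows the standard Bland and Altman calculation (mean difference ± 1.96 standard deviations of the paired differences). The per-area counts are invented for illustration and are not study data; the function names are assumptions.

```python
from statistics import mean, stdev

def percent_difference(estimate, reference):
    """Percentage by which `estimate` differs from `reference`."""
    return (estimate - reference) / reference * 100

def limits_of_agreement(a, b, z=1.96):
    """Bland-Altman bias and 95% limits of agreement for paired values."""
    diffs = [x - y for x, y in zip(a, b)]
    bias = mean(diffs)
    sd = stdev(diffs)  # sample standard deviation of the paired differences
    return bias, bias - z * sd, bias + z * sd

# Totals reported in the abstract (EHR vs Census).
pop_diff = percent_difference(2_130_965, 1_990_087)
hh_diff = percent_difference(658_264, 724_045)
print(f"population: {pop_diff:+.1f}%")  # +7.1%
print(f"households: {hh_diff:+.1f}%")   # -9.1%

# Hypothetical per-area population counts, for illustration only.
ehr = [310_000, 275_000, 402_000, 198_000]
census = [290_000, 262_000, 371_000, 190_000]
bias, lower, upper = limits_of_agreement(ehr, census)
print(f"bias={bias:.0f}, 95% LoA=({lower:.0f}, {upper:.0f})")
```

Limits of agreement describe how far individual paired estimates are likely to differ, which a single overall percentage cannot show.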
Title: Estimating households and populations from primary care electronic health records: comparison with Office for National Statistics Census 2021 aggregated estimates. International Journal of Population Data Science 10(1):2958. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12668252/pdf/
Pub Date: 2025-11-19; eCollection Date: 2025-01-01; DOI: 10.23889/ijpds.v10i3.2974
Mirjam Allik, Elzo Pereira Pinto-Júnior, Dandara Ramos, Andrêa J F Ferreira, Flavia Jose Alves, Camila Teixeira, Marilyn Agranonik, Renzo Flores-Ortiz, Poliana Rebouças, Rita de Cássia Ribeiro-Silva, Mauro Sanchez, Srinivasa Vittal Katikireddi, Mauricio L Barreto, Alastair H Leyland, Maria Yury Ichihara, Ruth Dundas
Introduction: Monitoring and addressing health inequalities is important. However, socioeconomic variables are usually unavailable within health datasets. Area deprivation measures provide access to open-source reliable socioeconomic data within low/middle-income countries and can contribute to the monitoring of the Sustainable Development Goals and assessing the growing burden of health inequalities.
Objective: To create a small-area deprivation measure for the whole of Brazil - the Brazilian Deprivation Index (Índice Brasileiro de Privação - IBP).
Methods: Using Census Sector data (mean population size=615) from the most recently available Brazilian Demographic Census (2010), variables measuring literacy, household income and housing conditions were standardised using z-scores and summed into a single measure. The IBP was validated using regional small-area measures of vulnerability: Belo Horizonte's Health Vulnerability Index (IVS) and São Paulo's Social Vulnerability Index (IPVS). Mortality data from Minas Gerais were used to estimate age-standardised mortality rates (ASMR) by ill-defined causes across IBP deprivation quintiles.
Results: The IBP was created for 303,218 (97.8%) census sectors (99.7% of the population). Substantial regional variation in deprivation was found using the IBP measure, with higher deprivation in rural than urban areas. The IBP was correlated with the other indicators used for validation: the IVS (r = 0.96) and the IPVS (r = 0.68). We found gradients across the ill-defined causes ASMR: in Minas Gerais, mortality was 2.6 times higher in the most deprived quintile of IBP than in the least deprived. The main challenges in creating a deprivation measure for LMICs, and possible solutions, are demonstrated.
Conclusion: A small area deprivation index was created for Brazil, a large and highly diverse middle-income country. The IBP improves our understanding and monitoring of inequalities, serving as a valuable tool for informing targeted public policies. Although the index is based on Brazil's specific context, the challenges faced and the strategies implemented to tackle them are relevant for other low- and middle-income countries aiming to develop similar tools.
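The construction described in the Methods above (standardise each indicator with z-scores, sum them, then group areas into deprivation quintiles) can be sketched as follows. This is a minimal illustration under stated assumptions: the indicator values are invented, each indicator is assumed to be oriented so that higher means more deprived, the population standard deviation is used, and no population weighting is applied. The abstract does not specify these details.

```python
from statistics import mean, pstdev

def z_scores(values):
    """Standardise values to mean 0 and (population) standard deviation 1."""
    m, s = mean(values), pstdev(values)
    return [(v - m) / s for v in values]

def deprivation_index(indicators):
    """Sum the z-scores of all indicators for each area."""
    standardised = [z_scores(column) for column in indicators.values()]
    return [sum(area) for area in zip(*standardised)]

def quintiles(scores):
    """Rank areas and assign 1 (least deprived) to 5 (most deprived)."""
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    n = len(scores)
    result = [0] * n
    for rank, i in enumerate(order):
        result[i] = rank * 5 // n + 1
    return result

# Hypothetical percentages for ten small areas (not census data).
indicators = {
    "pct_illiterate": [2, 5, 9, 14, 20, 3, 7, 11, 16, 25],
    "pct_low_income_households": [10, 18, 30, 45, 60, 12, 22, 35, 50, 70],
    "pct_inadequate_housing": [1, 4, 8, 15, 22, 2, 6, 10, 18, 30],
}
index = deprivation_index(indicators)
print(quintiles(index))  # [1, 2, 3, 4, 5, 1, 2, 3, 4, 5]
```

Summing z-scores gives each indicator equal influence regardless of its original scale, which is one reason the approach is practical when only a few census variables are available.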
Title: A small area deprivation index for monitoring and evaluating health inequalities in a diverse, low and middle income country: the Índice Brasileiro de Privação (IBP). International Journal of Population Data Science 10(3):2974. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12629185/pdf/
Pub Date: 2025-11-17; eCollection Date: 2025-01-01; DOI: 10.23889/ijpds.v10i1.2984
Leslie L Roos, Gilles Detillieux, Gillian Fransoo
Introduction: Childhood exposure to and duration of poverty can affect several individual characteristics related to intellectual development.
Objectives: This paper examines the implications of movement in and out of childhood poverty using a unique linkable database from the Canadian province of Manitoba. Differences in measurement of poverty and intellectual development are explored.
Methods: Almost 90,000 children were followed using two definitions of poverty - neighborhood and household poverty. The large database permitted exploring the role of another variable - maternal mental health.
Results: The association of household poverty with poorer intellectual outcomes has been shown to be stronger than the association of neighborhood poverty with such outcomes. This was true using various outcome measures appropriate across childhood (from age 5 to age 17). Comparisons with the role of maternal mental health were made and further analyses suggested.
Conclusion: The richness of the data has facilitated the study of childhood intellectual development. Household poverty appears to play an important role; neighborhood poverty and maternal mental health also seem to influence such development, but less strongly.
Title: Poverty and intellectual development in childhood. International Journal of Population Data Science 10(1):2984. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12625802/pdf/
Pub Date: 2025-10-30; eCollection Date: 2025-01-01; DOI: 10.23889/ijpds.v10i1.2968
Isobel Sharpe, Amreen Babujee, George Foussias, Simone N Vigod, Paul Kurdyak
Introduction: Psychotic disorders are associated with high levels of disability and poor clinical outcomes but little is known about the regional incidence of psychosis in Ontario.
Objective: This study aimed to understand regional incidence variation and demographic and regional characteristics of individuals who may be suitable for receiving early psychosis intervention (EPI) services, as well as evaluate post-diagnosis healthcare utilisation.
Methods: A population-based retrospective cohort study captured incident affective and non-affective psychosis cases among residents of Ontario, Canada, aged 12-50 years, from 2017 to 2021. The sociodemographic characteristics of the cohort were described, including Ontario Health region of residence. Incident cases were followed for 6 months post-diagnosis to capture health service utilisation. Logistic regression was used to model post-diagnosis hospitalisations and Poisson regression to model outpatient psychiatrist visits.
Results: The cohort contained 44,188 individuals (41,257 non-affective psychosis; 3,058 affective psychosis). We observed substantial regional variation in incidence rates, which were higher in the North Western region for non-affective psychosis (167.44/100,000) and North Eastern region for affective psychosis (14.23/100,000) compared to the provincial average (92.24; 6.84/100,000, respectively). Compared to the Toronto region, post-diagnosis hospitalisations were significantly higher in the North East (non-affective psychosis aOR 1.14, 95%CI 1.01-1.30; affective psychosis aOR 1.69, 95%CI 1.13-2.54). Among those with non-affective psychosis, outpatient psychiatrist visits were significantly lower in all regions compared to Toronto (e.g., East aRR 0.61, 95%CI 0.60-0.62; North West aRR 0.34, 95%CI 0.32-0.36).
Conclusions: There is considerable regional variation in incident psychosis and inverse relationships between hospitalisations and outpatient care. To successfully plan for future EPI programs in Ontario, it is essential to understand regional needs using a systematic, population-based approach.
Title: Regional and sociodemographic variation of incident first-episode psychosis in Ontario, Canada. International Journal of Population Data Science 10(1):2968. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12622573/pdf/
Pub Date: 2025-10-25; eCollection Date: 2025-01-01; DOI: 10.23889/ijpds.v10i2.2972
Gillian M Raab, Sophie McCall, Liam Cavin
Confidential administrative data is usually only available to researchers within a Trusted Research Environment (TRE). Recently, some UK groups have proposed that low-fidelity synthetic data (LFSD) be made available to researchers outside the TRE, to allow code-testing and data discovery. There is a need for transparency so that those who access LFSD know how it has been created and what to expect from it. Relationships between variables are not maintained in LFSD, but a real or apparent data breach can still occur from its release. To be useful to researchers for preliminary analyses, LFSD needs to meet some minimum quality standards. Researchers who will use the LFSD need a clear, documented explanation of how it compares with the data they will access in the TRE. We propose that the following four checks should be run by data controllers before releasing LFSD to ensure it is well documented, useful and non-disclosive.
Labelling: To avoid an apparent data breach, steps must be taken to ensure that the synthetic data (SD) is clearly identified as not being real data.
Disclosure: The LFSD should undergo disclosure risk evaluation as described below and any risks identified should be mitigated.
Structure: The structure of the SD should be as similar as possible to the TRE data.
Documentation: Differences in the structure of the SD compared to data in the TRE must be documented, and the way(s) in which analyses of the SD are expected to differ from those of data in the TRE must be clarified.
We propose details of each of these below, but a strict, rule-based approach should not be used. Instead, the data holders should modify the rules to take account of the type of information that may be disclosed and the circumstances of the data release (to whom and under what conditions).
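As an illustration only, the following Python sketch shows one simple way low-fidelity synthetic data could be produced and labelled: each column is resampled independently, so the structure is kept while relationships between variables are not. This is not the authors' procedure, which the abstract does not describe. The function names, the `is_synthetic` flag column and the toy records are all hypothetical, and the exact-replication count is only a crude stand-in for a proper disclosure risk evaluation.

```python
import random

# Hypothetical column name used to label every synthetic row as not real.
SYNTHETIC_FLAG = "is_synthetic"


def make_low_fidelity(rows, n, seed=0):
    """Build n synthetic rows by sampling each column independently.

    Independent sampling keeps column names and value types (structure)
    but deliberately breaks relationships between variables.
    """
    rng = random.Random(seed)
    columns = list(rows[0].keys())
    # Collect the observed values of each column separately.
    pools = {c: [r[c] for r in rows] for c in columns}
    synthetic = []
    for _ in range(n):
        record = {c: rng.choice(pools[c]) for c in columns}
        record[SYNTHETIC_FLAG] = True  # labelling: mark the row as synthetic
        synthetic.append(record)
    return synthetic


def replicated_records(real_rows, synthetic_rows):
    """Count synthetic rows that exactly reproduce a real row."""
    real = {tuple(sorted(r.items())) for r in real_rows}
    return sum(
        tuple(sorted((k, v) for k, v in s.items() if k != SYNTHETIC_FLAG)) in real
        for s in synthetic_rows
    )


# Toy, made-up records standing in for data held inside a TRE.
real = [
    {"age_band": "20-29", "sex": "F", "region": "North"},
    {"age_band": "30-39", "sex": "M", "region": "South"},
    {"age_band": "40-49", "sex": "F", "region": "East"},
]
synthetic = make_low_fidelity(real, n=5, seed=1)
print(replicated_records(real, synthetic), "synthetic rows replicate a real row")
```

Because values are drawn from the observed data, rare or extreme values can still appear in the output, which is one reason a separate disclosure check would be needed before release.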
{"title":"Four checks for low-fidelity synthetic data: recommendations for disclosure control and quality evaluation.","authors":"Gillian M Raab, Sophie McCall, Liam Cavin","doi":"10.23889/ijpds.v10i2.2972","DOIUrl":"10.23889/ijpds.v10i2.2972","url":null,"PeriodicalId":36483,"journal":{"name":"International Journal of Population Data Science","volume":"10 2","pages":"2972"},"PeriodicalIF":2.2,"publicationDate":"2025-10-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12626184/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145557705","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Introduction: Public service leaders face increasing challenges using data effectively due to program silos, limited resources, and the increasing complexity of data. To address these challenges, Iowa's Integrated Data System for Decision-Making (I2D2) partnered with state and local leaders in early childhood to curate key indicators and develop population-level data tools and training to promote policy and practice improvements.
Methods: We relied on a mixed-methods, participatory approach to understand early childhood data and reporting requirements and how state and local leaders leverage data to meet these requirements and inform decisions. We conducted a Data Landscape Overview consisting of interviews, surveys, document review, and meetings with state and local leaders. Public deliberation facilitated iterative feedback and collective decision-making through stakeholder discussions.
Results: Our participatory approach resulted in three actions to improve data collection and use within Iowa's early childhood system: curating a set of early childhood indicators; developing training and strategic planning tools for effective data use; and building the Iowa Data Drive (IDD), an interactive data portal for accessing key early childhood indicators and population-level insights.
Conclusions: A robust IDS can promote systems change when grounded in strong partnerships, phased implementation, and a commitment to clear communication. By centering local voices and fostering trust, we developed indicators and tools that support data-informed decisions and improved services for young children and their families.
{"title":"Building the Iowa Data Drive: a participatory approach to developing early childhood indicators for state and local policymaking.","authors":"Heather Rouse, Sharon Zanti, Hannah Kim, Cassandra Dorius, Todd Abraham, Giorgi Chighladze","doi":"10.23889/ijpds.v10i3.2969","DOIUrl":"10.23889/ijpds.v10i3.2969","url":null,"PeriodicalId":36483,"journal":{"name":"International Journal of Population Data Science","volume":"10 3","pages":"2969"},"PeriodicalIF":2.2,"publicationDate":"2025-10-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12626109/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145557843","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2025-10-16; eCollection Date: 2025-01-01; DOI: 10.23889/ijpds.v10i1.2971
Richard Kjellgren, Jan Savinc, Nadine Dougall, Amanj Kurdi, Alastair Leyland, Emily Tweed, Jim Watson, Kate Hunt, Catriona Connell
Introduction: Mental health and substance use (MH/SU) problems are highly prevalent among the prison population. However, early and preventative post-imprisonment care appears to be insufficient to meet the MH/SU needs of people released. This is demonstrated by elevated rates of MH/SU-related emergency care and deaths attributable to alcohol, drugs and suicide. Studies examining post-imprisonment healthcare contacts across community, outpatient, inpatient and emergency services for MH/SU are required to address this issue. This protocol paper describes the outcome of data linkage and details our plans for data cleaning and analysis.
Methods: The RELEASE study will follow a retrospective observational cohort design. This is the first study using national individual-level linked administrative health and prison data from Scotland. We report the results of creating the cohort, and outline proposed methods for data preparation and analysis. Within the cohort, the exposed group comprises everyone released from prison in 2015, and the unexposed group consists of a random sample of the general population matched (1:5 ratio) on age, sex, postcode and postcode-derived index of multiple deprivation, and with no prison exposure in the preceding 5 years. Health data (community prescribing, outpatient visits, specialist substance use, psychiatric inpatient, general inpatient, out-of-hours general practice, 24-hour National Health Service [NHS] helpline, ambulance, and emergency services), deaths data, and prison data (admissions, releases, demographic data) were linked to the cohort using unique identifiers. Service contacts associated with MH/SU will be quantified and compared across the two groups using regression modelling, controlling for potential confounding variables, reimprisonment and deaths.
Conclusion: RELEASE is a comprehensive study with potential to inform post-imprisonment MH/SU service delivery, whilst the dataset holds significant potential for exploring other health conditions and outcomes. This research will allow for an unprecedented understanding of post-imprisonment service use patterns in Scotland, and RELEASE will make a significant public health contribution given the overrepresentation of people released in costly emergency care contact and death rates.
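The 1:5 matching described in the Methods above can be sketched in Python as follows. This is a minimal illustration, not the RELEASE team's linkage or sampling code: it assumes exact matching on every key and sampling without replacement, neither of which the abstract specifies, and the field names (`age`, `sex`, `postcode`, `deprivation`, `prison_exposure_5y`, `id`) are hypothetical.

```python
import random
from collections import defaultdict

# Hypothetical field names for the matching variables named in the Methods.
MATCH_KEYS = ("age", "sex", "postcode", "deprivation")


def match_controls(exposed, population, ratio=5, seed=0):
    """Draw up to `ratio` unexposed people per exposed person.

    Controls are matched exactly on MATCH_KEYS, must have no prison
    exposure in the preceding 5 years, and are never reused.
    """
    rng = random.Random(seed)
    # Group eligible controls by their combination of matching values.
    strata = defaultdict(list)
    for person in population:
        if not person["prison_exposure_5y"]:
            strata[tuple(person[k] for k in MATCH_KEYS)].append(person)
    # Shuffle each stratum so that popping from it is a random draw.
    for pool in strata.values():
        rng.shuffle(pool)
    matches = {}
    for case in exposed:
        pool = strata[tuple(case[k] for k in MATCH_KEYS)]
        # Pop from the pool so each control is matched at most once.
        matches[case["id"]] = [pool.pop() for _ in range(min(ratio, len(pool)))]
    return matches
```

An exposed person whose stratum holds fewer than five eligible controls simply receives fewer matches here; a real study would need an explicit rule for such cases, for example relaxing one of the matching keys.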
{"title":"Access to services for mental ill-health and substance use among people released from prison in Scotland (RELEASE): Retrospective observational cohort study protocol.","authors":"Richard Kjellgren, Jan Savinc, Nadine Dougall, Amanj Kurdi, Alastair Leyland, Emily Tweed, Jim Watson, Kate Hunt, Catriona Connell","doi":"10.23889/ijpds.v10i1.2971","DOIUrl":"10.23889/ijpds.v10i1.2971","url":null,"PeriodicalId":36483,"journal":{"name":"International Journal of Population Data Science","volume":"10 1","pages":"2971"},"PeriodicalIF":2.2,"publicationDate":"2025-10-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12530171/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145330145","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2025-10-15; eCollection Date: 2025-01-01; DOI: 10.23889/ijpds.v10i2.2967
Lora Frayling, Shah Suraj Bharat, Elizabeth Pattinson, Joshua Stock, Fiona Lugg-Widger, Emma Gordon, Emily Oliver
Synthetic data is emerging as a key area of development for supporting research that involves secure forms of administrative and health data, both in the United Kingdom and globally. In practice, key challenges in the generation and adoption of synthetic data are closely tied to the need for agreed and consistent terminology for describing it. The absence of standardised language hinders the setting of quality standards, establishment of governance and guidelines and effective sharing of knowledge and best practices. This has implications for research that uses synthetic healthcare and administrative data, particularly when such data are generated from protected personal data. This commentary paper reviews existing literature on synthetic data to explore how key terms are currently defined in practice, with a focus on privacy-preserving use cases. Our analysis reveals that terms describing properties of synthetic data are often lacking and inconsistent, largely due to the breadth of synthetic data types, contexts and use cases. Context-specific terminology with nuanced meanings complicates efforts for the development of universally agreed definitions, particularly for privacy-preserving synthetic data that captures characteristics from protected data sources. To address this, we propose broad definitions for key terms including synthetic data, utility, utility measure and fidelity. We conclude by offering a set of recommendations emphasising the need for consensus on terminology and encouraging clearer descriptions in future literature that specify both the intended use of the data and the measures used to describe it.
{"title":"A review of synthetic data terminology for privacy preserving use cases.","authors":"Lora Frayling, Shah Suraj Bharat, Elizabeth Pattinson, Joshua Stock, Fiona Lugg-Widger, Emma Gordon, Emily Oliver","doi":"10.23889/ijpds.v10i2.2967","DOIUrl":"10.23889/ijpds.v10i2.2967","url":null,"PeriodicalId":36483,"journal":{"name":"International Journal of Population Data Science","volume":"10 2","pages":"2967"},"PeriodicalIF":2.2,"publicationDate":"2025-10-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12622363/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145551373","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}