Pub Date: 2024-10-17; eCollection Date: 2024-01-01; DOI: 10.23889/ijpds.v9i1.2412
Anousheh Marouzi, Charles Plante, Barbara Fornssler
Introduction: Research on substance use harm in Canada has been hampered by an absence of linked data to analyse and report on the social drivers of substance use harm.
Objectives: This study aims to address this gap by providing a fully annotated Stata do-file that links sociodemographic data to 11 years of hospitalisation and death outcomes. This do-file will greatly facilitate the creation of provincial and national substance use cohorts using line-level data available through Statistics Canada's Research Data Centres (RDC) program.
Methods: We used Canadian Census Health and Environment Cohorts (CanCHEC) 2006 to create a cohort of Saskatchewanians followed from 2006 to 2016. We linked sociodemographic information of the 2006 Census (long-form) respondents to their hospitalisation data captured in the Discharge Abstract Database (DAD) (2006 to 2016) and their mortality records in the Canadian Vital Statistics Death Database (CVSD) (2006 to 2016). We developed an algorithm to identify Saskatchewanians who experienced a substance use harm event. We validated the cohort by comparing our descriptive findings with those from other Canadian studies on substance use.
Results: We used CanCHEC, a national data resource, whereas most previous studies have used provincial data resources. Despite this difference in constructing the cohorts, our results showed trends consistent with previous studies, including an overrepresentation of individuals with lower socioeconomic status among the people who experienced substance use harm (PESUH). Similar to other Canadian studies, our results indicate an increasing rate of substance use harm from 2006 to 2016.
Conclusion: This study provides a Stata do-file that compiles a validated substance use cohort using CanCHEC, enabling comprehensive substance use research by linking sociodemographic data with health outcomes. The do-file is likely to save researchers hundreds of hours and accelerate research on the drivers of substance use harms in Canada.
Title: Creating an 11-year longitudinal substance use harm cohort from linked health and census data to analyse social drivers of health. International Journal of Population Data Science, 9(1): 2412. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11606630/pdf/
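The linkage and case-identification steps described in the Methods above can be illustrated with a minimal Python sketch. It is not the authors' Stata do-file. The record layouts, the field names (`person_id`, `dx_codes`, `cause_code`), the exact follow-up dates, and the use of the ICD-10 block F10-F19 as the definition of a substance use harm event are all assumptions made for illustration; the published algorithm may use a different code list and different rules.

```python
from datetime import date

# Hypothetical toy records; real CanCHEC, DAD and CVSD files are laid out differently.
census = [
    {"person_id": 1, "income_quintile": 1},
    {"person_id": 2, "income_quintile": 4},
    {"person_id": 3, "income_quintile": 2},
]
hospitalisations = [
    {"person_id": 1, "admit_date": date(2009, 5, 2), "dx_codes": ["F10.2", "K70.3"]},
    {"person_id": 2, "admit_date": date(2012, 1, 9), "dx_codes": ["S52.5"]},
    {"person_id": 1, "admit_date": date(2017, 3, 1), "dx_codes": ["F11.1"]},  # after follow-up
]
deaths = [
    {"person_id": 3, "death_date": date(2015, 8, 20), "cause_code": "F14.2"},
]

# Assumed follow-up window (the abstract only says "2006 to 2016").
FOLLOW_UP_START = date(2006, 1, 1)
FOLLOW_UP_END = date(2016, 12, 31)

# Assumption: ICD-10 block F10-F19 stands in for "substance use harm".
SUBSTANCE_PREFIXES = tuple(f"F1{digit}" for digit in range(10))


def is_substance_code(code):
    """True if an ICD-10 code falls in the assumed F10-F19 block."""
    return code.startswith(SUBSTANCE_PREFIXES)


def in_follow_up(day):
    return FOLLOW_UP_START <= day <= FOLLOW_UP_END


def build_cohort(census, hospitalisations, deaths):
    """Attach a harm_event flag to each census respondent."""
    cohort = {row["person_id"]: {**row, "harm_event": False} for row in census}
    for stay in hospitalisations:
        person = cohort.get(stay["person_id"])
        if (person is not None and in_follow_up(stay["admit_date"])
                and any(is_substance_code(c) for c in stay["dx_codes"])):
            person["harm_event"] = True
    for death in deaths:
        person = cohort.get(death["person_id"])
        if (person is not None and in_follow_up(death["death_date"])
                and is_substance_code(death["cause_code"])):
            person["harm_event"] = True
    return cohort


cohort = build_cohort(census, hospitalisations, deaths)
for person_id, row in cohort.items():
    print(person_id, row["income_quintile"], row["harm_event"])
```

Keeping the census row as the base record means every respondent stays in the cohort whether or not an event is found, which is what allows sociodemographic comparisons between people with and without a harm event.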
Pub Date: 2024-10-14; eCollection Date: 2024-01-01; DOI: 10.23889/ijpds.v9i1.2372
Kimberlyn M McGrail, Jack Teng, Colene Bentley, Kieran C O'Doherty, Michael M Burgess
Background: Sources of public and private data and ways to link them continue to evolve. This offers new opportunities for research, and new reasons for data-holding organisations to form partnerships. While research using these data can be beneficial, there is also a potential for negative consequences for some individuals or groups, including unintended or unanticipated effects. It is important to consult the public on how we might achieve both opportunities to link different types of data for research purposes, and protections against the misuse of data and the possibility of negative consequences.
Methods: Combining data sources for research was the topic of four days of deliberation held in British Columbia, Canada in late 2019. Public deliberation events bring diverse groups of people together to give direct input to policy makers, through carefully structured in-depth discussion on issues that are controversial and/or a source of public concern. Participants discussed whether data from electronic medical records should be used for research purposes, whether it is acceptable to combine data from public and private sources, who should authorise its use in research, and how a public advisory group on data use might be structured.
Results: Over four days, 29 residents of BC developed 17 deliberative conclusions that can be grouped into four broad topic areas: balancing benefit and potential harms when linking data; the protections that are expected to govern use of data; the type of authorisation required; and how the public should be involved in an ongoing way. Overall, the public is very supportive of research as long as oversight and controls are in place, including ongoing input from members of the public.
Conclusion: Deliberative conclusions from this event provide essential public input on the use of linked data for research, in particular when those data come from multiple sources. This is important information as policy-makers continue to develop legislation and practices around the use and linkage of both public and private sources of data.
Title: Research data use in a digital society: a deliberative public engagement. International Journal of Population Data Science, 9(1): 2372. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11606539/pdf/
Pub Date: 2024-10-09; eCollection Date: 2024-01-01; DOI: 10.23889/ijpds.v9i1.2143
Ana Corina Miller, Dermot O'Reilly, David Wright
The trade-off between the costs of childcare provision and the benefits of having an increased proportion of women, particularly women with dependent children, in employment is one of the most taxing social issues for Western governments. In countries like Northern Ireland, the limited subsidised childcare provision for preschool and primary school children has been partially offset by a rise in informal childcare, though this has been difficult to assess in terms of both magnitude and effect. Using the entire 2011 Census cohort of mothers with children aged 1 to 16 years, we argue that co-resident grandparents have a substantial positive impact on maternal labour force participation in Northern Ireland. The presence of a co-resident grandparent was associated with an increase of 3.7 percentage points in employment for single-parent mothers and 2 percentage points for mothers in two-parent households. Compared with mothers without a co-resident grandparent, mothers with one were also more likely to be in full-time employment: by 2.7 percentage points for single mothers and by 3.7 percentage points for mothers in two-parent households. Overall, the presence of a co-resident grandparent was associated with at least a 3.2 percentage point increase in labour force participation among mothers with primary-school-age children.
Title: Co-resident grandparent and maternal employment. A Northern Ireland cross-sectional administrative data analysis. International Journal of Population Data Science, 9(1): 2143. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11606508/pdf/
Pub Date: 2024-10-08; eCollection Date: 2024-01-01; DOI: 10.23889/ijpds.v9i1.2380
Deepak Louis, Peace Eshemokhai, Chelsea Ruth, Kristene Cheung, Lisa M Lix, Lisa Flaten, Prakesh S Shah, Allan Garland
Introduction: The Canadian Institute for Health Information's (CIHI) Discharge Abstract Database (DAD) contains standardised administrative data on all hospitalisations in Canada, excluding Quebec.
Objectives: We aimed to validate preterm birth related perinatal and neonatal data in DAD by assessing its accuracy against the reference standard of the Canadian Neonatal Network (CNN) database.
Methods: We linked birth hospitalisation data between the DAD and CNN databases for all neonates born <33 weeks gestational age (GA) admitted to the Neonatal Intensive Care Units in Winnipeg, Canada, between 2010 and 2022. A comprehensive list of maternal and neonatal variables relevant to preterm birth was chosen a priori for validation. We measured agreement using Cohen's weighted kappa (k) for categorical variables and Lin's concordance correlation coefficient (LCCC) for continuous variables.
Results: 2084 neonates were included (mean GA 29.4 ± 2.4 weeks; birth weight 1430 ± 461 g). Baseline continuous maternal and neonatal variables showed excellent accuracy in DAD [Maternal age: LCCC = 0.99 (0.99, 0.99); GA: LCCC = 0.95 (0.95, 0.96); birth weight: LCCC = 0.97 (0.96, 0.97); sex: k = 0.99 (0.98-0.99)]. In contrast, the accuracy of the maternal baseline categorical variables and neonatal outcomes and interventions ranged from very good to poor [e.g., Caesarean section: k = 0.91 (0.89-0.93), pre-gestational diabetes: k = 0.04 (0.03-0.05), neonatal sepsis: k = 0.37 (0.31-0.42), bronchopulmonary dysplasia: k = 0.26 (0.19-0.33), neonatal laparotomy: k = 0.55 (0.43-0.67)].
Conclusion: Neonatal variables such as gestational age and birth weight had high accuracy in DAD, while the accuracy of maternal and neonatal morbidities and interventions was variable, with some being poor. Reasons for the inaccuracy of these variables should be identified and measures taken to improve them.
Title: Validation of preterm birth related perinatal and neonatal data in the Canadian discharge abstract database to facilitate long-term outcomes research of individuals born preterm. International Journal of Population Data Science, 9(1): 2380. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11636633/pdf/
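The two agreement statistics named in the Methods above can be written out in a few lines of Python. This is an illustrative sketch on made-up values, not the study's analysis code. It computes unweighted Cohen's kappa, which coincides with the weighted form only when a variable has two categories, and Lin's concordance correlation coefficient using population (1/n) variances; confidence intervals are not computed.

```python
from collections import Counter
from statistics import fmean


def cohens_kappa(rater_a, rater_b):
    """Unweighted Cohen's kappa: (observed - expected) / (1 - expected)."""
    n = len(rater_a)
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    count_a, count_b = Counter(rater_a), Counter(rater_b)
    # Chance agreement from the two sources' marginal category frequencies.
    expected = sum(count_a[c] * count_b[c] for c in count_a) / (n * n)
    return (observed - expected) / (1 - expected)


def lins_ccc(x, y):
    """Lin's CCC: 2*cov / (var_x + var_y + (mean_x - mean_y)^2)."""
    n = len(x)
    mean_x, mean_y = fmean(x), fmean(y)
    var_x = sum((v - mean_x) ** 2 for v in x) / n
    var_y = sum((v - mean_y) ** 2 for v in y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / n
    return 2 * cov / (var_x + var_y + (mean_x - mean_y) ** 2)


# Hypothetical values: a binary flag and a continuous measure recorded
# in two sources for the same four (or three) records.
flag_source_1 = [1, 1, 0, 0]
flag_source_2 = [1, 1, 0, 1]
weight_source_1 = [1.0, 2.0, 3.0]
weight_source_2 = [2.0, 3.0, 4.0]  # perfectly correlated but shifted by 1

print(cohens_kappa(flag_source_1, flag_source_2))
print(lins_ccc(weight_source_1, weight_source_2))
```

The second example shows why a concordance coefficient is used rather than Pearson's correlation: the two series are perfectly correlated, yet the constant shift between them pulls the CCC well below 1.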
Pub Date: 2024-10-02; eCollection Date: 2023-01-01; DOI: 10.23889/ijpds.v8i6.2384
Steve Childs, Chris Farmer, Abraham George, Elizabeth Ford, Melanie Rees-Roberts
Introduction: In England, life expectancy has stalled, and significant decreases have been observed in certain geographical areas and populations. The cause of this involves complex dynamics between an individual's health, characteristics, lifestyle, and their wider environment, known as the wider determinants of health, which are key to good life expectancy, healthy life expectancy, and prevention of long-term medical conditions. Knowing the availability, breadth, features, and linkage potential of datasets relevant to wider determinants of health is important for exploring trends and associations for policy and public health planning.
Methods: A systematic mapping of internet content identified accessible datasets relevant to wider determinants of health in England with town level geographical granularity or lower. Search terms were used in search engines and chatbots to identify weblinks subsequently examined for eligible datasets.
Results: 105 potential weblinks to datasets were identified. Of these, twenty-one weblinks were explored further after excluding those that were not accessible or currently live (n = 13), were duplicated across search engines (n = 17), provided information only (i.e. no raw data; n = 14), did not provide freely accessible data (n = 3), were not relevant to wider determinants of health (n = 17), or lacked geographical granularity (n = 26). Eighty-nine datasets of interest were compiled with sub-town level data aggregation. Approximately half (n = 47, 52%) were from the England and Wales census 2021, with the remaining sources including government bodies, public services, and research datasets. Datasets covered many valuable categories of wider determinants of health. Key data gaps included food consumption, social care data and community/voluntary services.
Conclusion: In England, access to data relevant to wider determinants of health is good and available at relatively small geographical resolution. Accessible datasets were identified and compiled within multiple categories of wider determinants of health as a useful data resource to explore wider determinants of health at place if linked to relevant health data or population studies.
Key features: This data resource profile describes a systematic mapping of freely accessible population data on wider determinants of health in England. To the authors' knowledge this is the first comprehensive compilation of freely accessible data resources of this kind. This data resource profile was created to support research into the mechanisms and impact of wider determinants on the health of populations in England but is applicable to research and population studies wider than this. Eighty-nine datasets were identified that may be of use to researchers in health and other population data fields. Datasets are held separately but many have the potential to be linked through common geographical are
Title: Data resource profile: Exploring freely accessible data describing wider determinants of health in England. International Journal of Population Data Science, 8(6): 2384. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11951243/pdf/
Pub Date: 2024-09-30; eCollection Date: 2023-01-01; DOI: 10.23889/ijpds.v8i6.2388
Belinda C Davey, Wesley Billingham, Jacqueline A Davis, Lisa Gibson, Nina D'Vaz, Susan L Prescott, Desiree T Silva, Sarah Whalan
Introduction: The ORIGINS Project ("ORIGINS") is a longitudinal, population-level birth cohort with data and biosample collections that aim to facilitate research to reduce non-communicable diseases (NCDs) and encourage 'a healthy start to life'. ORIGINS has gathered millions of datapoints and over 400,000 biosamples over 15 timepoints, antenatally through to five years of age, from mothers, non-birthing partners and the child, across four health and wellness domains: 'Growth and development', 'Medical, biological and genetic', 'Biopsychosocial and cognitive', 'Lifestyle, environment and nutrition'.
Methods: Mothers, non-birthing partners and their offspring were recruited antenatally (between 18 and 38 weeks' gestation) from the Joondalup and Wanneroo communities of Perth, Western Australia from 2017 to 2024. Data come from several sources, including routine hospital antenatal and birthing data, ORIGINS clinical appointments, and online self-completed surveys comprising several standardised measures. Data are merged using the Medical Record Number (MRN), the ORIGINS Unique Identifier and the ORIGINS Pregnancy Number, as well as additional demographic data (e.g. name and date of birth) when necessary.
Results: The data are held on an integrated data platform that extracts, links, ingests, integrates and stores ORIGINS' data on an Amazon Web Services (AWS) cloud-based data warehouse. Data are linked, transformed for cleaning and coding, and catalogued, ready to provide to sub-projects (independent researchers that apply to use ORIGINS data) to prepare for their own analyses. ORIGINS maximises data quality by checking and replacing missing and erroneous data across the various data sources.
Conclusion: As a wide array of data across several different domains and timepoints has been collected, the options for future research and utilisation of the data and biosamples are broad. As ORIGINS aims to extend into middle childhood, researchers can examine which antenatal and early childhood factors predict middle childhood outcomes. ORIGINS also aims to link to State and Commonwealth data sets (e.g. Medicare, the National Assessment Program - Literacy and Numeracy, the Pharmaceutical Benefits Scheme) which will cater to a wide array of research questions.
Title: Data resource profile: the ORIGINS project databank: a collaborative data resource for investigating the developmental origins of health and disease. International Journal of Population Data Science, volume 8: 2388. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11949255/pdf/
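The merge strategy described in the Methods above, a primary identifier with demographic data as a fallback, can be sketched as a small deterministic linkage routine. This is an illustration only and not the ORIGINS pipeline: the field names (`mrn`, `origins_id`, `name`, `dob`), the name normalisation, and the choice to fall back on exact name plus date of birth are assumptions.

```python
def normalise_name(name):
    """Lower-case and collapse whitespace so trivial differences do not block a match."""
    return " ".join(name.lower().split())


def link_record(record, index_by_mrn, index_by_demographics):
    """Return (participant_id, method); try the MRN first, then name + date of birth."""
    mrn = record.get("mrn")
    if mrn and mrn in index_by_mrn:
        return index_by_mrn[mrn], "mrn"
    key = (normalise_name(record["name"]), record["dob"])
    if key in index_by_demographics:
        return index_by_demographics[key], "name+dob"
    return None, "unlinked"


# Hypothetical participant register.
participants = [
    {"origins_id": "P001", "mrn": "A123", "name": "Jane Doe", "dob": "1990-04-01"},
    {"origins_id": "P002", "mrn": "B456", "name": "Mary  Smith", "dob": "1988-11-30"},
]
index_by_mrn = {p["mrn"]: p["origins_id"] for p in participants}
index_by_demographics = {
    (normalise_name(p["name"]), p["dob"]): p["origins_id"] for p in participants
}

# Hypothetical incoming survey rows, some without an MRN.
survey_rows = [
    {"mrn": "A123", "name": "J. Doe", "dob": "1990-04-01"},
    {"mrn": None, "name": "mary smith", "dob": "1988-11-30"},
    {"mrn": None, "name": "Unknown Person", "dob": "2000-01-01"},
]
links = [link_record(row, index_by_mrn, index_by_demographics) for row in survey_rows]
for row, (participant_id, method) in zip(survey_rows, links):
    print(row["name"], "->", participant_id, f"({method})")
```

Recording which key produced each link makes it possible to review the demographic matches separately, since they are more error-prone than identifier matches.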
Pub Date: 2024-09-19; eCollection Date: 2024-01-01; DOI: 10.23889/ijpds.v9i1.2381
Sarah K Harding, Beverley Samways, Amy Dillon, Sandra Butcher, Andy Boyd, Raja Mukherjee, Penny A Cook, Cheryl McQuire
Introduction: Fetal Alcohol Spectrum Disorder (FASD) is one of the leading non-genetic causes of developmental disability worldwide and is thought to be particularly common in the UK. Despite this, there is a lack of data on FASD in the UK.
Objective: To conduct public and professional involvement work to establish stakeholder views on the feasibility, acceptability, key purposes, and design of a national linked longitudinal research database for FASD in the UK.
Methods: We consulted stakeholders through online workshops (one for adults with FASD and their supporters, N = 5; one for caregivers of people with FASD, N = 7); one-to-one or small-team video calls and email communication with clinicians, policymakers, data-governance experts, third-sector representatives, and researchers (N = 35); and one hybrid clinical workshop (N = 17). Discussions covered data availability, benefits, challenges, and design preferences for a national pseudonymised linked database for FASD. We derived key themes from the notes and recordings collected across all involvement activities.
Results: Our tailored, multi-method approach generated high levels of stakeholder engagement. Stakeholders expressed support for a pseudonymised national linked database for FASD. Key anticipated benefits were: increased awareness and understanding of FASD, leading to better support; new insights into clinical profiles, leading to greater diagnostic efficiency; easier international collaboration; and increased knowledge of the long-term impacts of FASD on health, social care, education, economic and criminal justice outcomes. Given the rich data infrastructure established in the UK, stakeholders felt that a national linked FASD database could be world-leading. Common stakeholder concerns related to privacy and data-sharing, and to the importance of retaining space for clinical judgement alongside insights gained from quantitative analyses.
Conclusions: Multi-method and multidisciplinary public and professional involvement activities demonstrated support for a national linked database for FASD in the UK. Flexible, diverse, embedded stakeholder collaboration will be essential as we establish this database.
{"title":"Establishing a national linked database for Fetal Alcohol Spectrum Disorder (FASD) in the UK: multi-method public and professional involvement to determine acceptability and feasibility.","authors":"Sarah K Harding, Beverley Samways, Amy Dillon, Sandra Butcher, Andy Boyd, Raja Mukherjee, Penny A Cook, Cheryl McQuire","doi":"10.23889/ijpds.v9i1.2381","DOIUrl":"10.23889/ijpds.v9i1.2381","url":null,"abstract":"<p><strong>Introduction: </strong>Fetal Alcohol Spectrum Disorder (FASD) is one of the leading non-genetic causes of developmental disability worldwide and is thought to be particularly common in the UK. Despite this, there is a lack of data on FASD in the UK.</p><p><strong>Objective: </strong>To conduct public and professional involvement work to establish stakeholder views on the feasibility, acceptability, key purposes, and design of a national linked longitudinal research database for FASD in the UK.</p><p><strong>Methods: </strong>We consulted with stakeholders using online workshops (one for adults with FASD [and their supporters] N = 5; one for caregivers of people with FASD (N=7), 1:1/small-team video calls/email communication with clinicians, policymakers, data-governance experts, third-sector representatives, and researchers [N=35]), and one hybrid clinical workshop (N = 17). Discussions covered data availability, benefits, challenges, and design preferences for a national pseudonymised linked database for FASD. We derived key themes from the notes and recordings collected across all involvement activities.</p><p><strong>Results: </strong>Our tailored, multi-method approach generated high levels of stakeholder engagement. Stakeholders expressed support for a pseudonymised national linked database for FASD. 
Key anticipated benefits were the potential for: increased awareness and understanding of FASD leading to better support; new insights into clinical profiles leading to greater diagnostic efficiency; facilitating international collaboration; and increased knowledge of the long-term impacts of FASD on health, social care, education, economic and criminal justice outcomes. Given the rich data infrastructure established in the UK, stakeholders expressed that a national linked FASD database could be world-leading. Common stakeholder concerns were around privacy and data-sharing and the importance of retaining space for clinical judgement alongside insights gained from quantitative analyses.</p><p><strong>Conclusions: </strong>Multi-method and multidisciplinary public and professional involvement activities demonstrated support for a national linked database for FASD in the UK. Flexible, diverse, embedded stakeholder collaboration will be essential as we establish this database.</p>","PeriodicalId":36483,"journal":{"name":"International Journal of Population Data Science","volume":"6 1","pages":"2381"},"PeriodicalIF":1.6,"publicationDate":"2024-09-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11636589/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142819457","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Background: Child poverty remains a major global concern and a child's experience of deprivation is heavily shaped by where they live and the stability of their local neighbourhood. This study examines frequencies and patterns of residential mobility in children and young people (CYP) at a population level using novel geospatial techniques to assess how often their physical environment changes and to identify geographical variations in social mobility.
Methods: We used routinely collected administrative records held in the Secure Anonymised Information Linkage (SAIL) Databank for CYP aged under 18 years living in Wales between 2012 and 2022. We calculated the Moran's I statistic to assess the magnitude of Lower layer Super Output Area (LSOA)-level geographic variation in residential mobility and used the Local Indicator of Spatial Association (LISA) to identify clusters of LSOAs where there are higher rates of residential mobility.
Results: This study included 923,531 CYP, 58% of whom moved at least once during the study period. A total of 1,209,102 house moves were recorded, 59% of which occurred between the ages of 0 and 5 years. Almost 10% of the cohort resided in five or more dwellings before the age of 18 years. In terms of area-level (LSOA) deprivation, 75% of house moves were to areas with the same or higher levels of deprivation, so only 25% of moves were to less deprived areas and thus achieved upward social mobility. Clustering of residential mobility was identified predominantly in areas of high deprivation.
Conclusion: The findings of this study show that residential mobility is linked with socio-economic circumstances and is experienced by over half of CYP in Wales. Understanding where CYP live, their mobility patterns and which areas have high levels of influx and efflux is crucial for policymakers to generate well-informed, targeted and effective child-focused interventions.
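The Methods above name two spatial statistics, Moran's I and the Local Indicator of Spatial Association (LISA). As an illustration only, and not the authors' SAIL analysis code, the following pure-Python sketch computes a global Moran's I and unstandardised local Moran values for a toy set of areas. The function names, the binary adjacency weights and the mobility rates are assumptions made up for the example; significance testing, which is usually done by permutation, is omitted.

```python
def morans_i(values, weights):
    """Global Moran's I for n areas.

    values  : list of n numbers (e.g. a hypothetical mobility rate per area)
    weights : n x n list of lists; weights[i][j] > 0 when areas i and j are
              neighbours, with a zero diagonal
    """
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    s0 = sum(sum(row) for row in weights)   # sum of all spatial weights
    ss = sum(d * d for d in dev)            # sum of squared deviations
    if s0 == 0 or ss == 0:
        raise ValueError("need at least one neighbour pair and non-constant values")
    cross = sum(weights[i][j] * dev[i] * dev[j]
                for i in range(n) for j in range(n))
    return (n / s0) * (cross / ss)


def local_morans_i(values, weights):
    """Local Moran's I_i for each area (no row standardisation of weights)."""
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    m2 = sum(d * d for d in dev) / n        # second moment of the values
    if m2 == 0:
        raise ValueError("values must not all be equal")
    return [(dev[i] / m2) * sum(weights[i][j] * dev[j] for j in range(n))
            for i in range(n)]


# Toy example (invented): four areas in a row; adjacent areas are neighbours.
W = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]
rates = [10.0, 10.0, 1.0, 1.0]              # high rates sit next to each other

global_i = morans_i(rates, W)
local_i = local_morans_i(rates, W)
print("global Moran's I:", round(global_i, 4))
print("local Moran's I :", [round(x, 4) for x in local_i])
```

A positive global value indicates that similar rates cluster in neighbouring areas, and a negative value indicates that neighbouring areas tend to differ. With the scaling used here, the local values sum to the global statistic multiplied by the sum of the weights.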
{"title":"Residential mobility amongst children and young people in Wales: A longitudinal study using linked administrative records.","authors":"Jo Davies, Rowena Bailey, Amy Mizen, Theordora Pouliou, Richard Fry, Rebecca Pedrick-Case, Gareth Stratton, Rhodri Johnson, Hayley Christian, Ronan Lyons, Lucy Griffiths","doi":"10.23889/ijpds.v9i1.2398","DOIUrl":"https://doi.org/10.23889/ijpds.v9i1.2398","url":null,"abstract":"<p><strong>Background: </strong>Child poverty remains a major global concern and a child's experience of deprivation is heavily shaped by where they live and the stability of their local neighbourhood. This study examines frequencies and patterns of residential mobility in children and young people (CYP) at a population level using novel geospatial techniques to assess how often their physical environment changes and to identify geographical variations in social mobility.</p><p><strong>Methods: </strong>We used routinely collected administrative records held in the Secure Anonymised Information Linkage (SAIL) Databank for CYP aged under 18 years living in Wales between 2012 and 2022. We calculated the Moran's I statistic to assess the magnitude of Lower layer Super Output Area (LSOA)-level geographic variation in residential mobility and used the Local Indicator of Spatial Association (LISA) to identify clusters of LSOAs where there are higher rates of residential mobility.</p><p><strong>Results: </strong>This study included 923,531 CYP, with 58% having moved at least once during the study period. A total number of 1,209,102 house moves were recorded, 59% of which occurred between the ages of 0 and 5 years. Almost 10% of the cohort resided in five or more dwellings before the age of 18 years. In terms of area-level (LSOA) deprivation, 75% of house moves were to areas with the same or higher levels of deprivation, leaving only 25% of house moves that achieved upward social mobility. 
Clustering of residential mobility was identified predominantly in areas of high deprivation.</p><p><strong>Conclusion: </strong>The findings of this study show that residential mobility is linked with socio-economic circumstances and is experienced by over half of CYP in Wales. Understanding where CYP live, their mobility patterns and which areas have high levels of influx and efflux is crucial for policymakers to generate well-informed, targeted and effective child-focused interventions.</p>","PeriodicalId":36483,"journal":{"name":"International Journal of Population Data Science","volume":"6 1","pages":"2398"},"PeriodicalIF":1.6,"publicationDate":"2024-09-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11606628/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142773211","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2024-09-12; eCollection Date: 2024-01-01; DOI: 10.23889/ijpds.v9i2.2406
Mike Seaborne, Hope Jones, Neil Cockburn, Stevo Durbaba, Arturo González-Izquierdo, Amy Hough, Dan Mason, Carlos Sánchez-Soriano, Chris Orton, Armando Méndez-Villalon, Tom Giles, David Ford, Phillip Quinlan, Krish Nirantharakumar, Lucilla Poston, Rebecca Reynolds, Gillian Santorelli, Sinead Brophy
Introduction: Birth cohorts are valuable resources for studying early life, the determinants of health, disease, and development. They are essential for studying the life course. Dynamic longitudinal electronic cohorts use routinely collected data, are live, and can reduce selection bias specifically associated with direct recruitment in traditional birth cohorts. However, they are limited to health and administrative data and may lack contextual information. The MIREDA (Mother and Infant Research Electronic Data Analysis) partnership creates a UK-wide birth cohort by aligning existing electronic birth cohorts to have the same structure, content, and vocabularies, enabling UK-wide federated analyses.
Objectives: Create a core dynamic, live UK-wide electronic birth cohort with approximately 500,000 new births per year using a common data model (CDM). Provide data linkage and automation for long-term follow-up of births from the Clinical Practice Research Datalink (CPRD), MuM-PreDiCT and the 'Born in' initiatives of Bradford, Wales, Scotland, and South London for comparable analyses.
Methods: We will establish core data content and collate linkable data. A suite of extraction, transformation, and load (ETL) tools will be used to transform data for each birth cohort into the CDM. Transformed datasets will remain within each cohort's trusted research environment (TRE). Metadata will be made publicly available on the Health Data Research UK (HDR UK) Innovation Gateway. We will develop a single online data access request for researchers. A cohort profile will be developed for researchers to reference the resource.
Ethics: Each cohort has approval from their TRE through compliance with their project application processes and information governance.
Dissemination: We will engage with researchers in the field to promote our resource through partnership networking, publication, research collaborations, conferences, social media, and marketing communications strategies.
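The protocol above describes transforming each cohort's data into a common data model (CDM) and keeping the transformed data inside each trusted research environment, so that only aggregate results are combined. The sketch below illustrates that general pattern in Python under stated assumptions: the cohort names, source field names, sex vocabulary and the `CdmBirth` structure are all hypothetical and are not the MIREDA CDM or its ETL tools.

```python
from collections import Counter
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class CdmBirth:
    """Hypothetical common-data-model record for one birth."""
    cohort: str
    person_id: str
    birth_date: date
    sex: str  # harmonised vocabulary: "F", "M" or "U" (unknown)


# Hypothetical per-cohort mappings from source field names to CDM fields.
FIELD_MAPS = {
    "cohort_a": {"person_id": "child_id", "birth_date": "dob", "sex": "sex"},
    "cohort_b": {"person_id": "pid", "birth_date": "birth_dt", "sex": "gender_code"},
}

# Hypothetical vocabulary alignment for the sex field.
SEX_VOCAB = {"f": "F", "female": "F", "2": "F", "m": "M", "male": "M", "1": "M"}


def to_cdm(cohort, row):
    """Transform one source row into the common structure and vocabulary."""
    fields = FIELD_MAPS[cohort]
    raw_sex = str(row[fields["sex"]]).strip().lower()
    return CdmBirth(
        cohort=cohort,
        person_id=str(row[fields["person_id"]]),
        birth_date=date.fromisoformat(row[fields["birth_date"]]),
        sex=SEX_VOCAB.get(raw_sex, "U"),
    )


def local_summary(records):
    """Aggregate computed inside one environment: births per calendar year."""
    return Counter(r.birth_date.year for r in records)


def combine(summaries):
    """Federated step: only the aggregate counts leave each environment."""
    total = Counter()
    for summary in summaries:
        total.update(summary)
    return total


# Invented source rows with different field names and codings.
cohort_a_rows = [
    {"child_id": 101, "dob": "2021-03-14", "sex": "Female"},
    {"child_id": 102, "dob": "2022-07-02", "sex": "Male"},
]
cohort_b_rows = [{"pid": "B-7", "birth_dt": "2021-11-30", "gender_code": "1"}]

a = [to_cdm("cohort_a", r) for r in cohort_a_rows]
b = [to_cdm("cohort_b", r) for r in cohort_b_rows]
births_by_year = combine([local_summary(a), local_summary(b)])
print(dict(births_by_year))
```

Keeping the field and vocabulary mappings as data rather than code is one possible design choice: adding a cohort then means adding a mapping entry, while the transformation and the federated summary stay unchanged.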
{"title":"Mother and Infant Research Electronic Data Analysis (MIREDA): A protocol for creating a common data model for federated analysis of UK birth cohorts and the life course.","authors":"Mike Seaborne, Hope Jones, Neil Cockburn, Stevo Durbaba, Arturo González-Izquierdo, Amy Hough, Dan Mason, Carlos Sánchez-Soriano, Chris Orton, Armando Méndez-Villalon, Tom Giles, David Ford, Phillip Quinlan, Krish Nirantharakumar, Lucilla Poston, Rebecca Reynolds, Gillian Santorelli, Sinead Brophy","doi":"10.23889/ijpds.v9i2.2406","DOIUrl":"https://doi.org/10.23889/ijpds.v9i2.2406","url":null,"abstract":"<p><strong>Introduction: </strong>Birth cohorts are valuable resources for studying early life, the determinants of health, disease, and development. They are essential for studying life course. Dynamic longitudinal electronic cohorts use routinely collected data, are live, and can reduce selection bias specifically associated with direct recruitment in traditional birth cohorts. However, they are limited to health and administrative data and may lack contextual information. The MIREDA (Mother and Infant Research Electronic Data Analysis) partnership creates a UK-wide birth cohort by aligning existing electronic birth cohorts to have the same structure, content, and vocabularies, enabling UK-wide federated analyses.</p><p><strong>Objectives: </strong>Create a core dynamic, live UK-wide electronic birth cohort with approximately 500,000 new births per year using a common data model (CDM). Provide data linkage and automation for long-term follow up of births from the Clinical Practice Research Datalink (CPRD), MuM-PreDiCT and the 'Born in' initiatives of Bradford, Wales, Scotland, and South London for comparable analyses.</p><p><strong>Methods: </strong>We will establish core data content and collate linkable data. A suite of extraction, transformation, and load (ETL) tools will be used to transform data for each birth cohort into the CDM. 
Transformed datasets will remain within each cohort's trusted research environment (TRE). Metadata will be uploaded for the public to the Health Data Research (HDRUK) Innovation Gateway. We will develop a single online data access request for researchers. A cohort profile will be developed for researchers to reference the resource.</p><p><strong>Ethics: </strong>Each cohort has approval from their TRE through compliance with their project application processes and information governance.</p><p><strong>Dissemination: </strong>We will engage with researchers in the field to promote our resource through partnership networking, publication, research collaborations, conferences, social media, and marketing communications strategies.</p>","PeriodicalId":36483,"journal":{"name":"International Journal of Population Data Science","volume":"9 2","pages":"2406"},"PeriodicalIF":1.6,"publicationDate":"2024-09-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12039474/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144048647","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2024-09-11; eCollection Date: 2024-01-01; DOI: 10.23889/ijpds.v9i2.2399
Alix Bukkfalvi-Cadotte, Ashra Khanom, Amy Brown, Helen Snooks
Introduction: People seeking sanctuary, including refugees and asylum seekers, face barriers and challenges in accessing high-quality healthcare. In maternity care specifically, asylum-seeking and refugee women are less likely to access timely and adequate antenatal care and may be more likely to experience adverse perinatal outcomes.
Objectives: We aim to describe maternity care service users seeking sanctuary in Wales and determine whether their perinatal health outcomes and use of maternity care services differ from women born in the UK.
Methods: We will conduct a retrospective cohort study. Linking six datasets held by SAIL Databank, we will identify individuals recorded as refugees or asylum seekers in General Practitioner (GP) records. We will conduct a descriptive analysis of their demographic and health characteristics and conduct a comparative analysis of maternity care service use and perinatal health outcomes between refugees and asylum seekers and UK-born individuals. We will identify statistically significant differences between groups, and where the completeness and quality of the data allow, we will adjust for known covariates.
Results: This study will enable us to report on the characteristics of maternity care service users seeking sanctuary in Wales, and to compare their maternity care service use and perinatal health outcomes with those of UK-born women.
Conclusions: This data linkage study will enhance our understanding of health inequities in maternity care and perinatal outcomes related to asylum seeker or refugee status. Results will inform policy and practice to improve provision of maternity care to women seeking sanctuary.
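The Methods above plan to identify statistically significant differences between groups but do not name a specific test. As one possible unadjusted comparison, assumed here purely for illustration, the sketch below applies a two-proportion z-test to a binary outcome in two groups. The function name and the counts are invented, and the covariate adjustment mentioned in the protocol (for example via regression) is not shown.

```python
import math


def two_proportion_z_test(events_a, n_a, events_b, n_b):
    """Two-sided z-test for a difference between two independent proportions.

    Returns (z, p_value). Uses the pooled proportion for the standard error,
    which is the usual choice under the null hypothesis of no difference.
    """
    if n_a <= 0 or n_b <= 0:
        raise ValueError("group sizes must be positive")
    p_a = events_a / n_a
    p_b = events_b / n_b
    pooled = (events_a + events_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    if se == 0:
        raise ValueError("outcome does not vary, so the test is undefined")
    z = (p_a - p_b) / se
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Invented counts: 30 of 100 in group A and 50 of 100 in group B
# have the outcome of interest.
z, p = two_proportion_z_test(30, 100, 50, 100)
print(f"z = {z:.3f}, p = {p:.4f}")
```

A test like this ignores differences in age, parity or other covariates between the groups, so it would only be a first descriptive step before any adjusted analysis.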
{"title":"Maternity care experiences and outcomes of people seeking sanctuary in Wales: a data linkage study protocol.","authors":"Alix Bukkfalvi-Cadotte, Ashra Khanom, Amy Brown, Helen Snooks","doi":"10.23889/ijpds.v9i2.2399","DOIUrl":"https://doi.org/10.23889/ijpds.v9i2.2399","url":null,"abstract":"<p><strong>Introduction: </strong>People seeking sanctuary, including refugees and asylum seekers, face barriers and challenges in accessing high quality healthcare. In maternity care specifically, asylum-seeking and refugee women are less likely to access timely and adequate antenatal care and may be more likely to experience adverse perinatal outcomes.</p><p><strong>Objectives: </strong>We aim to describe maternity care service users seeking sanctuary in Wales and determine whether their perinatal health outcomes and use of maternity care services differ from women born in the UK.</p><p><strong>Methods: </strong>We will conduct a retrospective cohort study. Linking six datasets held by SAIL Databank, we will identify individuals recorded as refugees or asylum seekers in General Practitioner (GP) records. We will conduct a descriptive analysis of their demographic and health characteristics and conduct a comparative analysis of maternity care service use and perinatal health outcomes between refugees and asylum seekers and UK-born individuals. We will identify statistically significant differences between groups, and where the completeness and quality of the data allow, we will adjust for known covariates.</p><p><strong>Results: </strong>This study will enable us to report on the characteristics of maternity care service users seeking sanctuary in Wales, their maternity care service use and perinatal health outcomes compared to UK-born women.</p><p><strong>Conclusions: </strong>This data linkage study will enhance our understanding of health inequities in maternity care and perinatal outcomes related to asylum seeker or refugee status. 
Results will inform policy and practice to improve provision of maternity care to women seeking sanctuary.</p>","PeriodicalId":36483,"journal":{"name":"International Journal of Population Data Science","volume":"9 2","pages":"2399"},"PeriodicalIF":1.6,"publicationDate":"2024-09-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12046472/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143986523","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}