Pub Date: 2023-09-14 DOI: 10.23889/ijpds.v8i2.2347
Archie Campbell, Robin Flaig, Cathie Sudlow
Objectives: We started a family-based genetic epidemiology study that, between 2006 and 2011, recruited ~24,000 adult volunteers from ~7,000 families across Scotland, with consent for follow-up through medical record linkage and re-contact. In 2022-23 we are recruiting another 20,000 participants, with consent extended to administrative records and the age range now extended to 12+.
Methods: Original volunteers completed a demographic, health and lifestyle questionnaire, provided biological samples, and underwent detailed clinical assessment. The samples, phenotype and genotype data form a resource for research on the genetics of conditions of public health importance. This has become a longitudinal dataset through linkage to routine NHS records (hospital, maternity, laboratory tests, prescriptions, dentistry, mortality, imaging, cancer screening, GP data, and Covid-19 testing and vaccinations) and through follow-up questionnaires. The new wave of recruitment is entirely online and can be completed on a smartphone, with saliva samples for DNA collected by post. Teenagers aged 12-15 can join with parental consent.
Results: Genome-wide association studies (GWAS) have been carried out on quantitative traits and biomarkers, and DNA methylation and proteomics data are available for most of the cohort. Our “CovidLife” surveys collected data on the effects of the pandemic.
Researchers can identify prevalent and incident disease cases and controls to test research hypotheses on a stratified population. They can also carry out targeted recruitment of participants to new studies, including recall by genotype. We have established and validated electronic health record (EHR) linkage with the NHS Scotland CHI Register, overcoming technical and governance issues in the process. We contribute to major international consortia, with academic and commercial collaborators from institutions worldwide. Recruits are asked to consent to linkage to other administrative data and to the reuse of samples from routine NHS tests for medical research.
Conclusion: We plan to extend the linkage process to other administrative data from national datasets as and when approvals are obtained. New types of data can also be collected through online questionnaires. The Research Tissue Bank resources are available to academic and commercial researchers through a managed access process.
{"title":"Generation Scotland: Linking all the records we can","authors":"Archie Campbell, Robin Flaig, Cathie Sudlow","doi":"10.23889/ijpds.v8i2.2347","DOIUrl":"https://doi.org/10.23889/ijpds.v8i2.2347","journal":{"name":"International Journal for Population Data Science"},"publicationDate":"2023-09-14","publicationTypes":"Journal Article","platform":"Semanticscholar","paperid":"134913839"}
Pub Date: 2023-09-14 DOI: 10.23889/ijpds.v8i2.2206
Hannah Dickson, George Vamvakas, Roxanna Short, Nigel Blackwood
Objectives: The age-crime curve indicates that criminal behaviour peaks in adolescence and declines in adulthood, but longitudinal studies suggest that this curve conceals distinct patterns, or trajectories, of (re)-offending. Some trajectories (e.g., life course persistent offenders) are reported to have distinct risk factors and more negative outcomes than others (e.g., adolescent limited offenders).
Methods: The current study had two main objectives: (1) to use UK administrative crime data to identify trajectories of (re)-offending; and (2) to prospectively identify (re)-offending trajectories using longitudinal administrative education and social care data. This project uses linked UK administrative data containing the anonymised education and social care records of individuals born between September 1985 and August 1999, linked to later official crime records up to the end of 2017. To identify offending trajectories, we used information on offence type, age at first conviction/caution, age at last recorded conviction/caution and offending history at three age points (Juvenile: 10-17 years; Young adult: 18-20 years; Adult: 21-32 years).
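The indicators described above could be derived along the following lines. This is only an illustrative sketch: the function name `offending_profile`, the input format (a list of ages at each recorded conviction/caution) and the output keys are assumptions, not the study's actual data model, and offence type is left out for brevity.

```python
# Age bands as stated in the abstract (inclusive bounds, in years).
AGE_BANDS = {
    "juvenile": (10, 17),
    "young_adult": (18, 20),
    "adult": (21, 32),
}


def offending_profile(ages_at_offence):
    """Summarise one person's conviction/caution ages into simple indicators.

    `ages_at_offence` is a hypothetical input: one integer age per recorded
    conviction or caution for a single individual.
    """
    if not ages_at_offence:
        raise ValueError("at least one recorded conviction/caution is required")
    profile = {
        "age_first": min(ages_at_offence),
        "age_last": max(ages_at_offence),
    }
    # One yes/no indicator per age band: any recorded offence in that band?
    for band, (low, high) in AGE_BANDS.items():
        profile[f"offended_{band}"] = any(low <= age <= high for age in ages_at_offence)
    return profile
```

Binary indicators of this kind are the sort of categorical input a latent class model typically takes; the study's own variable coding may differ.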
Results: Latent Class Analyses, with and without ‘Gender’ and ‘Ever served a custodial sentence’ as covariates, were conducted to identify trajectories of (re)-offending. We are currently developing statistical models to test whether prospective longitudinal education and social care factors can discriminate between these trajectories. In my talk, I will share findings on the offending trajectories identified and present some early results on the key education and social care drivers of those trajectories.
Conclusion: Findings from this study have the potential to provide deeper insights into how these education and social care factors might affect (re)-offending patterns. This could inform education, social care and criminal justice system responses to offending behaviours which seek to reduce offending and its associated social and economic costs.
{"title":"Education and social care predictors of offending trajectories: A UK administrative data linkage study","authors":"Hannah Dickson, George Vamvakas, Roxanna Short, Nigel Blackwood","doi":"10.23889/ijpds.v8i2.2206","DOIUrl":"https://doi.org/10.23889/ijpds.v8i2.2206","journal":{"name":"International Journal for Population Data Science"},"publicationDate":"2023-09-14","publicationTypes":"Journal Article","platform":"Semanticscholar","paperid":"134913843"}
Pub Date: 2023-09-14 DOI: 10.23889/ijpds.v8i2.2298
Magdalena Rossetti, Rick Hood
Objectives: The main objective of the project was to link case-level data from Children’s Social Care (CSC) services with household-level data on means-tested benefits, in order to analyse the dynamic relationship between families’ financial situation and the demand for CSC services at the household level.
Methods: The study was a secondary quantitative analysis of administrative data from one local authority in England. After completing research ethics review and data governance procedures, monthly data on families receiving housing benefits and council tax benefits payments were linked to child-level data on referrals to CSC services over a two-year period. Records were matched on personal identifiers; once linkage was complete, a pseudonymised linked dataset (containing no personal identifiers) was used for all subsequent analyses.
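The match-then-pseudonymise workflow described above can be sketched as follows. Everything here is an assumption made for illustration: the field names (`household_key`, `month`, `benefit`, `referral_month`), the use of a keyed hash for the pseudonym, and the exact-match join are not taken from the study, which does not describe its matching rules in this abstract.

```python
import hashlib
import hmac


def pseudonym(identifier: str, secret_key: bytes) -> str:
    """Keyed hash, so the pseudonym cannot be recomputed without the key."""
    digest = hmac.new(secret_key, identifier.encode("utf-8"), hashlib.sha256)
    return digest.hexdigest()[:16]


def link_and_pseudonymise(benefit_rows, referral_rows, secret_key):
    """Join monthly benefit rows to CSC referrals on a shared identifier,
    then return rows that carry only a pseudonym (no personal identifier)."""
    # Index referral months by the (hypothetical) household identifier.
    referral_months = {}
    for row in referral_rows:
        referral_months.setdefault(row["household_key"], set()).add(row["referral_month"])

    linked = []
    for row in benefit_rows:
        key = row["household_key"]
        linked.append({
            "pseudo_id": pseudonym(key, secret_key),
            "month": row["month"],
            "benefit": row["benefit"],
            "referred_this_month": row["month"] in referral_months.get(key, set()),
        })
    return linked
```

In a sketch like this the secret key would stay with whoever performs the linkage, so that analysts working on the linked rows cannot reverse the pseudonyms.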
Results: We find that it is feasible to link children’s social care and benefits data. Our findings demonstrate a significant overlap between households receiving means-tested benefits and those referred to CSC services, underscoring the fact that most referrals involve low-income families. Our study further indicates that the children referred to CSC services primarily reside in deprived areas characterized by limited access to housing and services, as well as poor housing conditions. Additionally, we observed that children in households experiencing financial strain are twice as likely to be referred to CSC services.
Conclusion: Linking benefits data with CSC referrals can shed light on important questions about the relationship between poverty and demand for child welfare services. These include the mechanisms through which poverty drives demand for child welfare services, such as the role of persistent poverty, financial precarity, reductions or disruptions to benefits payments, unemployment, overcrowding, rent increases and evictions.
{"title":"The potential of data linkage for improving social care provision","authors":"Magdalena Rossetti, Rick Hood","doi":"10.23889/ijpds.v8i2.2298","DOIUrl":"https://doi.org/10.23889/ijpds.v8i2.2298","journal":{"name":"International Journal for Population Data Science"},"publicationDate":"2023-09-14","publicationTypes":"Journal Article","platform":"Semanticscholar","paperid":"134913940"}
Pub Date: 2023-09-14 DOI: 10.23889/ijpds.v8i2.2353
Paul Longley, Justin Van Dijk, Bin Chi
We review the creation and maintenance of nationwide individual-level Linked Consumer Registers as Digital Footprints Data, and their use to create timely, inclusive, annual, neighbourhood-scale research-ready datasets on social and spatial mobility. Outputs include annual estimates of neighbourhood churn, neighbourhood deprivation following moves, energy usage and ‘housing career’ measures.
Individual-level names and addresses are harvested from public Electoral Registers and consumer sources for 1997-2023. A novel ‘migration model’ is developed to georeference records and link them across years. The provenance of data and methods is documented in metadata that accompany derivative research-ready data extracts on residential mobility occurrences and outcomes. Novel methods are developed to reveal the probable gender, ethnicity and age characteristics of all households. Data are then linked to property-level Zoopla rental listings, Land Registry/Registers of Scotland transactions and energy performance statistics, to link household characteristics to the properties occupied before and after moves.
Results provide annual nationwide updates of neighbourhood household structure, ethnicity and demography that, subject to disclosure controls, can be honed to any convenient geography. They are validated against decennial census statistics and compared with midyear population estimates. Linkage to external datasets enables further external validation of the methods used to infer moves and plug known omissions in the registers. Application of individual-level demographic models makes it possible to model household structure and individual ethnic, age and gender characteristics. Summary linked and annually updated research-ready datasets on neighbourhood residential churn, ethnicity, distances of residential moves, housing careers and domestic energy usage are then produced.
The research is an ambitious linkage of individual and property-level consumer and administrative datasets. Individual-level linkage and modelling provide analytical flexibility in research-ready data creation, and data linkage can be expedited for any period for which name and address data are available.
{"title":"Linked Consumer Registers as data infrastructure for timely and inclusive monitoring of community characteristics","authors":"Paul Longley, Justin Van Dijk, Bin Chi","doi":"10.23889/ijpds.v8i2.2353","DOIUrl":"https://doi.org/10.23889/ijpds.v8i2.2353","journal":{"name":"International Journal for Population Data Science"},"publicationDate":"2023-09-14","publicationTypes":"Journal Article","platform":"Semanticscholar","paperid":"134913944"}
Pub Date: 2023-09-14 DOI: 10.23889/ijpds.v8i2.2342
Oliver Hugh, Jason Gardosi
Objective: We wanted to create a dashboard that allows midwives and doctors to monitor how well their NHS Trust identifies babies at risk because they are not growing well in the womb (small for gestational age, SGA), the single largest cause of stillbirths and a subject of national guidelines and reporting requirements. Antenatal detection of SGA allows clinicians to undertake further investigations and plan for a timely delivery.
Method: Using Power BI rather than a solution requiring software development allowed the data analysis team to control the creation of the application and streamlined the updating process. We used the dashboard feature, dropdowns and radio buttons to display the rate of identification of SGA at antenatal ultrasound scan as a proportion of all babies that are SGA at birth (sensitivity), as well as false positivity. The dashboard is contained within the web-based application used to monitor growth during antenatal care. By applying the row-level security feature with a DAX formula, we were able to personalise the report for each Trust depending on who logged into the web app.
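The two headline statistics can be illustrated with a small calculation. This is a sketch in Python rather than the authors' Power BI/DAX implementation. Sensitivity follows the definition given above; "false positivity" is not defined in the abstract, so the false positive rate used here (babies flagged antenatally but not SGA at birth, as a proportion of all babies not SGA at birth) is an assumption, as are the function name and input format.

```python
def detection_stats(records):
    """Compute sensitivity and false positive rate for antenatal SGA detection.

    `records` is a hypothetical input: an iterable of
    (flagged_antenatally, sga_at_birth) boolean pairs, one per baby.
    """
    tp = fp = fn = tn = 0
    for flagged, sga in records:
        if sga and flagged:
            tp += 1      # SGA at birth and detected antenatally
        elif sga:
            fn += 1      # SGA at birth but missed
        elif flagged:
            fp += 1      # flagged antenatally but not SGA at birth
        else:
            tn += 1      # correctly not flagged
    sensitivity = tp / (tp + fn) if (tp + fn) else None
    # Assumed definition: FP / (FP + TN); the source does not spell this out.
    false_positive_rate = fp / (fp + tn) if (fp + tn) else None
    return {"sensitivity": sensitivity, "false_positive_rate": false_positive_rate}
```

Filtering the records by Trust before calling the function would give the per-Trust view that the row-level security provides in the dashboard.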
Results: The application facilitates monitoring of service performance in real time, longitudinally by month, quarter and year, as well as cross-sectional benchmarking against network/regional and national averages. The dashboard lets clinicians access their information in a clear and secure manner without the need for a separate link. The ready availability of data allows Trusts to enact policies to improve their performance and ultimately prevent avoidable deaths, and has contributed to the year-on-year decline in stillbirth rates in units that have been running this application.
Conclusion: Development of this dashboard has resulted in Trusts being more aware of their own data, promoting improvements in antenatal care.
{"title":"Use of Microsoft Power BI to display pregnancy related performance statistics within NHS trusts","authors":"Oliver Hugh, Jason Gardosi","doi":"10.23889/ijpds.v8i2.2342","DOIUrl":"https://doi.org/10.23889/ijpds.v8i2.2342","journal":{"name":"International Journal for Population Data Science"},"publicationDate":"2023-09-14","publicationTypes":"Journal Article","platform":"Semanticscholar","paperid":"134913946"}
Pub Date: 2023-09-14 DOI: 10.23889/ijpds.v8i2.2194
Morag Treanor, Patricio Troncoso, Lee Williamson
Objectives: This paper explores the patterning of educational exclusions in Scottish secondary schools by variation across schools and council areas, and by structural socioeconomic factors and demographic characteristics of the pupils, their families, their schools and the areas in which they reside.
Methods: This research uses the newly linked administrative database created under the “Children’s Lives and Outcomes” research strand of the Scottish Centre for Administrative Data Research (SCADR). This linkage, the first of its kind in Scotland, includes data from Education Analytical Services and the Information Services Division of NHS Public Health Scotland for the period 2007-2019, and from the 2001 and 2011 Censuses. We adopt a Multilevel Modelling approach to ascertain the extent of the variation in the likelihood of a student being excluded across schools and council areas, and its association with individual, school and area-level characteristics.
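One common way to write such a model, given here only as a hedged illustration, is a three-level random-intercept logistic regression for pupil $i$ in school $j$ in council area $k$. The binary outcome, the logit link and the symbols below are assumptions; the abstract does not state the exact specification. Under the usual latent-response formulation, the share of variation attributable to each level is given by a variance partition coefficient (VPC), with $\pi^2/3$ the variance of the standard logistic distribution.

```latex
\begin{aligned}
\operatorname{logit}\,\Pr(y_{ijk}=1) &= \beta_0 + \mathbf{x}_{ijk}^{\top}\boldsymbol{\beta} + u_{jk} + v_k,\\
u_{jk} &\sim N(0,\sigma_u^2), \qquad v_k \sim N(0,\sigma_v^2),\\
\mathrm{VPC}_{\text{school}} &= \frac{\sigma_u^2}{\sigma_u^2+\sigma_v^2+\pi^2/3}, \qquad
\mathrm{VPC}_{\text{council}} = \frac{\sigma_v^2}{\sigma_u^2+\sigma_v^2+\pi^2/3}.
\end{aligned}
```

Here $y_{ijk}=1$ if the pupil was excluded, $\mathbf{x}_{ijk}$ collects individual, school and area-level covariates, and comparing $\sigma_u^2$ with $\sigma_v^2$ corresponds to comparing between-school with between-council variation.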
Results: Preliminary results suggest that the variation in exclusions across secondary schools in Scotland is substantial and significant in terms of size and importance. Variation across council areas is also non-negligible, although smaller than the variation between schools. This suggests that the effect of policy and/or practice at the school level is greater than that at the local authority level. Our analyses continue and are currently focusing on prior exclusions in primary school, deprivation, mental health, household and demographic characteristics, as well as school and area-level indicators. We expect to be able to elucidate further how schools, areas and family circumstances relate and interrelate in shaping the likelihood of being excluded from school.
Conclusion: Our findings are pertinent to policymakers and practitioners seeking to reduce inequalities in exclusions and ultimately improve school experiences and outcomes, in the context of a widening socio-economic gap exacerbated by COVID-19 restrictions and the current economic turmoil.
{"title":"School and area-level disparities in exclusions in Scottish secondary schools","authors":"Morag Treanor, Patricio Troncoso, Lee Williamson","doi":"10.23889/ijpds.v8i2.2194","DOIUrl":"https://doi.org/10.23889/ijpds.v8i2.2194","journal":{"name":"International Journal for Population Data Science"},"publicationDate":"2023-09-14","publicationTypes":"Journal Article","platform":"Semanticscholar","paperid":"134914019"}
Pub Date: 2023-09-14 DOI: 10.23889/ijpds.v8i2.2350
Alma Sobrevilla, Zoe Mackay, Martyna Walczak, Malcolm Greig
Objectives: We investigate the relationship between apprenticeship employment and productivity. Creating a new linked dataset allowed us to explore this research question for Scotland for the first time. Skills Development Scotland (SDS) holds data on employers of Modern Apprentices (MAs), but does not collect industry, size or economic performance measures such as Gross Value Added (GVA). It was therefore necessary to match our employer records to the Inter Departmental Business Register (IDBR) using the Enterprise Reference Number (ERN).
Methods: The Office for National Statistics (ONS) identified multiple matches for many Company IDs, so a cleaning process was required to identify a single match for each record. This involved matching records based primarily on company name and address.
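A cleaning step of this kind, choosing one candidate per employer record by name and address, might look like the sketch below. The abstract does not describe the actual matching rules, so the similarity measure (`difflib.SequenceMatcher`), the 60/40 weighting and the field names `name`, `address` and `ern` are all assumptions for illustration.

```python
from difflib import SequenceMatcher


def _normalise(text: str) -> str:
    """Lower-case and collapse whitespace before comparing strings."""
    return " ".join(text.lower().split())


def similarity(a: str, b: str) -> float:
    """String similarity in [0, 1] based on matching subsequences."""
    return SequenceMatcher(None, _normalise(a), _normalise(b)).ratio()


def best_match(employer, candidates, name_weight=0.6):
    """Pick the single candidate whose name and address best match the employer.

    `name_weight` is a hypothetical tuning parameter: the weight given to the
    company name, with the remainder given to the address.
    """
    def score(candidate):
        return (name_weight * similarity(employer["name"], candidate["name"])
                + (1 - name_weight) * similarity(employer["address"], candidate["address"]))

    return max(candidates, key=score)
```

In practice a minimum score threshold and manual review of close ties would probably be needed before accepting a match.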
We carried out Random Effects, Fixed Effects and System Generalised Method of Moments (GMM) regressions to analyse the relationship between productivity (real GVA per worker) and apprenticeship employment intensity (number of in-training apprentices as a proportion of total employment).
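The intensity measure and the idea behind the Fixed Effects estimator can be sketched in a few lines. This is a single-regressor illustration of the within (demeaning) transformation only, not the authors' specification: it omits control variables and does not attempt Random Effects or System GMM, and the function names and panel layout are assumptions.

```python
from collections import defaultdict


def apprenticeship_intensity(apprentices_in_training: int, total_employment: int) -> float:
    """In-training apprentices as a proportion of total employment."""
    return apprentices_in_training / total_employment


def within_slope(panel):
    """Fixed-effects (within) slope of y on x.

    `panel` is a hypothetical list of (enterprise_id, x, y) tuples, e.g. x =
    apprenticeship intensity and y = real GVA per worker in a given year.
    Demeaning within each enterprise removes time-invariant enterprise effects.
    """
    by_enterprise = defaultdict(list)
    for enterprise_id, x, y in panel:
        by_enterprise[enterprise_id].append((x, y))

    sxy = sxx = 0.0
    for observations in by_enterprise.values():
        mean_x = sum(x for x, _ in observations) / len(observations)
        mean_y = sum(y for _, y in observations) / len(observations)
        for x, y in observations:
            sxy += (x - mean_x) * (y - mean_y)
            sxx += (x - mean_x) ** 2
    if sxx == 0:
        raise ValueError("x does not vary within any enterprise")
    return sxy / sxx
```

Because each enterprise's own mean is subtracted, two enterprises with very different baseline productivity still contribute only their within-enterprise changes to the slope.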
Results: When the final dataset is summarised by enterprise, 42,486 company IDs were matched to 19,180 unique enterprises. We were able to link our SDS MA employer dataset to the following ONS datasets: Annual Business Survey, Business Register and Employment Survey, Business Structure Dataset, Business Enterprise Research and Development, Labour Force Survey, Employer Skills Survey and data on the Producer Price Index.
Using this matched dataset we found a significant positive relationship between productivity and apprenticeship employment, which is robust to the inclusion of enterprise-level fixed effects (factors that are specific to each enterprise that could affect productivity but that do not change over time) and the use of a System GMM framework.
Conclusion Our results suggest that enterprises with a high proportion of apprentices are more productive, even after controlling for enterprise and industry-level characteristics. To study this relationship, it was crucial to construct a matched dataset containing information from different sources (SDS and ONS datasets).
{"title":"Productivity and apprenticeship employment intensity in Scotland: A longitudinal study at the enterprise level","authors":"Alma Sobrevilla, Zoe Mackay, Martyna Walczak, Malcolm Greig","doi":"10.23889/ijpds.v8i2.2350","DOIUrl":"https://doi.org/10.23889/ijpds.v8i2.2350","journal":{"name":"International Journal for Population Data Science"},"publicationDate":"2023-09-14","publicationTypes":"Journal Article"}
Pub Date : 2023-09-14 DOI: 10.23889/ijpds.v8i2.2229
Joseph Lam, Robert Aldridge, Ruth Blackburn, Katie Harron
Improved availability of population-based data via data linkage enables researchers to develop deeper insight into racial health inequities in the UK. We set out to review how ethnicity is asked, reported, categorised and analysed in order to generate policy-relevant evidence to tackle racial health inequities.
We systematically reviewed the top 1% most-cited quantitative papers in the UK that report racial groups or ethnicity, and any health outcomes. We searched the Web of Science and MEDLINE databases from 1946 to Week 5 of July 2022, and divided the papers into 3 timeframes (1946-2000, 2001-2019, 2020-2022). From 44 papers, we extracted, as advised by our lay advisory group, how ethnicity was reported, what ethnic categories were used, whether ethnicity was aggregated when reported or analysed, whether the aggregation was justified, how ethnicity was used in analysis, and how ethnicity was theorised to relate to the health outcomes.
Of the reviewed papers, 26 used self-reported ethnicity (including 12 using medical records, which may include interviewer-rated ethnicity); 7 used prescribed ethnicity based on a range of variables such as appearance, family origin and place of birth; 2 used name-based ethnicity prediction; 5 described ethnicity as self-reported, but did not report how it was asked; 4 did not describe how ethnicity was asked.
Of the 26 papers that aggregated ethnicity, 12 provided some justification of why ethnicity was aggregated (3 to minimise disclosure risk, 5 because of small sample size, 1 for statistical regression, 3 theory-based). Only 9 papers explicitly theorised the role of ethnicity in their analysis, and how it related to the relevant health outcomes. Missing, mixed or other ethnicity were treated variably across studies.
Ethnicity is a multi-dimensional construct. Researchers should communicate clearly how ethnicity is operationalised for their studies, with appropriate justification for clustering and analysis that is meaningfully theorised. We can only start to tackle racial health inequity by treating ethnicity as rigorously as any other variable in our research.
{"title":"Reporting and analysing ethnicity in populational health data and linkage research: A bibliographical review","authors":"Joseph Lam, Robert Aldridge, Ruth Blackburn, Katie Harron","doi":"10.23889/ijpds.v8i2.2229","DOIUrl":"https://doi.org/10.23889/ijpds.v8i2.2229","journal":{"name":"International Journal for Population Data Science"},"publicationDate":"2023-09-14","publicationTypes":"Journal Article"}
Pub Date : 2023-09-14 DOI: 10.23889/ijpds.v8i2.2246
Alison Teyhan, Rosie Cornish, Kate Tilling, John Macleod, Iain Brennan
Objective To use longitudinal birth cohort data linked to police records to examine whether the relationship between adverse childhood experiences (ACEs) and police-recorded serious violence depends on the type, timing or duration of ACEs.
Methods The sample comprises 5070 participants (born 1991-1992) from the Avon Longitudinal Study of Parents and Children who allowed linkage to Avon and Somerset (A&S) local police data, lived in A&S from age 16 to 24 years, and had exposure and confounder data. The binary outcome (no, yes) is having a police record for a serious violence (SV) offence between ages 16 and 24. ACEs were parent-reported from birth to age 11 and include measures of parental physical and emotional abuse. Logistic regression was used to examine associations between the timing of different ACEs and SV, adjusted for child sex, ethnicity, and family socioeconomic position.
Results 6% of the participants had experienced physical abuse, 17% emotional abuse, and 121 individuals (2.4%) had at least one SV record. In adjusted models, there was evidence of an association between physical (OR 1.90, 95% CI 1.08-3.35) but not emotional (OR 0.96, 95% CI 0.60-1.54) abuse and risk of SV. Results suggest that those who experienced physical abuse in both early (<4 years) and later (4-11 years) childhood, or later childhood only, might have been at greater risk of SV than those who experienced it only during early childhood, although numbers were small and confidence intervals were consequently wide.
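The reported figures can be checked with a short calculation. The sketch below assumes the intervals are 95% Wald intervals that are symmetric on the log-odds scale, which is usual for logistic regression but is not stated in the abstract. Under that assumption it recovers the standard error of each log odds ratio, an approximate z statistic and a two-sided p-value. The function name is hypothetical.

```python
import math

Z_95 = 1.959964  # standard normal quantile for a two-sided 95% interval


def log_scale_summary(odds_ratio, lower, upper):
    """Recover the log-OR standard error, z statistic, two-sided p-value and
    geometric midpoint from a reported odds ratio and its 95% interval."""
    se = (math.log(upper) - math.log(lower)) / (2 * Z_95)
    z = math.log(odds_ratio) / se
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal p-value
    midpoint = math.sqrt(lower * upper)   # should be close to the reported OR
    return se, z, p, midpoint


# Values as reported in the abstract.
se_phys, z_phys, p_phys, mid_phys = log_scale_summary(1.90, 1.08, 3.35)
se_emot, z_emot, p_emot, mid_emot = log_scale_summary(0.96, 0.60, 1.54)

# Share of the 5070 participants with at least one SV record.
sv_share = 121 / 5070 * 100

print(f"physical:  SE={se_phys:.3f}, z={z_phys:.2f}, p={p_phys:.3f}")
print(f"emotional: SE={se_emot:.3f}, z={z_emot:.2f}, p={p_emot:.3f}")
print(f"SV record: {sv_share:.1f}%")
```

The geometric midpoint of each interval lands close to the reported odds ratio, which is consistent with the log-scale assumption. The recovered p-values are approximations derived from rounded figures, not results reported by the authors.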
Conclusion Results to date suggest that associations with SV differ between ACE types, and that timing may be important. In our presentation, we will also present findings for other ACEs.
{"title":"Adversity in childhood and later involvement in serious violent crime","authors":"Alison Teyhan, Rosie Cornish, Kate Tilling, John Macleod, Iain Brennan","doi":"10.23889/ijpds.v8i2.2246","DOIUrl":"https://doi.org/10.23889/ijpds.v8i2.2246","journal":{"name":"International Journal for Population Data Science"},"publicationDate":"2023-09-14","publicationTypes":"Journal Article"}
Pub Date : 2023-09-14 DOI: 10.23889/ijpds.v8i2.2325
Richard Welpton
Objectives Our Future Data Services (FDS) work has shown that researchers find it difficult to start conversations with the public about their work and the use of data. This session will explore how data services could effectively facilitate conversations between the public and researchers in the future.
Method We know that the public is much more aware that data about them are collected, sometimes because of mistakes made, which leads to public engagement being treated as an afterthought.
Researchers have told ESRC that they would like to engage with the public, but barriers prevent them from doing so. ESRC would like to run this workshop to find out from researchers what barriers they believe prevent their engagement with the public.
We’ll facilitate face-to-face discussions as a deep dive to understand what these barriers are, and what resources would help researchers to meaningfully connect with the public in the future.
Results The input collected from researchers at this workshop will help us draft recommendations about how ESRC/UKRI-funded data services can in the future support researchers to engage with the public when designing research proposals that use personal data.
These recommendations will likely identify specific funding opportunities that ESRC can target to enable data services to support researchers to engage with the public. For example, resources could be provided through data service websites, such as training materials (including crib sheets and videos) and signposting to organisations that can enable researchers to speak to the public about how they would like to use the public's data for research.
These recommendations will be published in Spring 2024 and funding opportunities could be made available from 2025 onwards.
Conclusion ESRC has heard researchers say they would like to work with the public when designing and undertaking research with personal data in the future. But routine engagement with the public is a hard ask; unless supported through specific programmes, navigating public engagement remains niche, yet essential.
{"title":"We need to talk about data! Transforming the conversation between researchers and the public","authors":"Richard Welpton","doi":"10.23889/ijpds.v8i2.2325","DOIUrl":"https://doi.org/10.23889/ijpds.v8i2.2325","journal":{"name":"International Journal for Population Data Science"},"publicationDate":"2023-09-14","publicationTypes":"Journal Article"}