Pub Date: 2023-09-18
DOI: 10.23889/ijpds.v8i3.2268
Nandini Iyer, Ronaldo Menezes, Hugo Barbosa
Introduction & Background
Public transportation is one of many factors that influence the level of disadvantage in a city. By facilitating movement within urban areas, transit systems can democratise access to resources while also fostering social integration among individuals from different areas and sociodemographic backgrounds. Conversely, inequalities in transport services can hinder individuals from fulfilling their travel demands. In this work, we explore socioeconomic segregation in cities from the perspective of their transit systems and how it intersects with residential and employment segregation.
Objectives & Approach
In our analyses, we combine socioeconomic data from the 2020 American Community Survey with amenity visitation patterns from anonymised mobile phone traces, provided by SafeGraph, to estimate the mobility flows between areas (i.e., Census Block Groups, or CBGs) in a given city. We define a CBG's segregation level using the Index of Concentration at the Extremes, which ranges from -1 to 1, reflecting extreme concentration of individuals from low- and high-income groups, respectively. Moreover, we retrieve General Transit Feed Specification (GTFS) and OpenStreetMap data to construct transit-pedestrian networks for various US cities.
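As an illustration of the Index of Concentration at the Extremes described above, the following minimal sketch uses the standard form of the index, (high-income count minus low-income count) divided by the total count. The abstract does not state which income thresholds define the two extreme groups, so the counts are treated as inputs, and the function name and CBG figures are hypothetical.

```python
def index_of_concentration_at_extremes(n_high_income: int,
                                       n_low_income: int,
                                       n_total: int) -> float:
    """Return ICE = (high - low) / total, which lies in [-1, 1]."""
    if n_total <= 0:
        raise ValueError("n_total must be positive")
    if (n_high_income < 0 or n_low_income < 0
            or n_high_income + n_low_income > n_total):
        raise ValueError("group counts must be non-negative and sum to at most n_total")
    return (n_high_income - n_low_income) / n_total


# Hypothetical CBGs: (high-income households, low-income households, all households).
example_cbgs = {
    "cbg_a": (80, 5, 100),   # mostly high-income, ICE close to +1
    "cbg_b": (10, 70, 100),  # mostly low-income, ICE close to -1
    "cbg_c": (30, 30, 100),  # balanced extremes, ICE of 0
}

ice_by_cbg = {
    name: index_of_concentration_at_extremes(*counts)
    for name, counts in example_cbgs.items()
}
```

A value of -1 means every household in the area falls in the low-income group, and +1 means every household falls in the high-income group, matching the range given in the abstract.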
Relevance to Digital Footprints
We leverage digital footprints, in the form of mobility flows between CBGs, to estimate the socioeconomic composition of different public transport routes within a city. By combining digital footprints with the respective economic breakdowns of trip origins, and transit-pedestrian networks, we can develop a better understanding of how segregated individuals are throughout various contexts of urban life.
Results
While segregation still exists in the transport and amenity dimensions, our findings suggest that individuals are exposed to the highest magnitudes of segregation in the residential dimension, with the amenity and transit dimensions offering potential avenues for reducing experiential segregation. However, we observe that the transit service in many cities hinders individuals in low-income neighbourhoods from accessing more affluent areas.
Conclusions & Implications
These results underscore research showing that neighbourhoods with a high concentration of underprivileged demographics, be they immigrants or ethnic minorities, tend to have more constrained activity spaces than their privileged counterparts. Although it is unclear whether mobility patterns are influenced by the segregation levels of neighbourhoods, it is apparent that, by limiting exposure to different types of neighbourhoods, transit systems impose constraints on the activity space and urban experience of individuals, namely those without access to personal vehicles. We highlight the benefit of analysing segregation as a spatio-temporal experience rather than a static variable, showing how mobility can be used as a tool to try to overcome residential segregation. Furthermore, identifying inequalities within transit systems is a first step towards providing better transit services, especially for individuals from particularly vulnerable demographic groups. Ultimately, by identifying how transport infrastructure perpetuates segregation, we take the first of many steps towards reimagining transit as a point of inclusion in the urban realm.
Title: Transport and Mobility Segregation in Urban Spaces
Journal: International Journal for Population Data Science
Pub Date: 2023-09-18
DOI: 10.23889/ijpds.v8i3.2272
Torty Sivill, Vanja Ljevar Ljevar, James Goulding, Anya Skatova
Introduction & Background
It has been reported that up to 91% of those who menstruate experience associated pain. Despite its ubiquity, the prevalence of menstrual pain has been under-researched due to stigma, disregard from medical professionals and a lack of data. It has also been reported that different demographic groups experience menstrual pain differently, yet the impact of socio-demographic factors on menstrual pain remains to be explored on a national scale due to data scarcity.
Objectives & Approach
In this study, we propose one way to overcome this data barrier, using a novel measure of menstrual pain extracted from supermarket shopping data. We use these national datasets to identify individual customer behaviour patterns. Specifically, we use transactions involving both a pain relief item and a menstrual item as a proxy measure for menstrual pain. We investigate national menstrual pain sales and whether there are significant differences between deprived and less deprived areas of England.
Relevance to Digital Footprints
This paper brings together data from multiple sources to provide a population-level analysis of the prevalence of menstrual pain in England. We use transactional data from a pharmaceutical retailer to develop a novel proxy measure for menstrual pain. We use various machine learning algorithms to explore the relationship between transactional data and various data sources pertaining to social deprivation.
Results
Our findings indicate that there is a high prevalence of menstrual pain, with at least 26.7% of customers who purchase menstrual items also purchasing pain relief simultaneously. These customers are nearly four times more likely to purchase pain relief with a menstrual item than they are without. In addition, our results indicate a significant geographical disparity in menstrual pain transactions. We examine the relationship between a variety of deprivation factors and regional menstrual pain transactions and find that average regional income has the highest predictive impact on menstrual pain sales. Contrary to what would be expected from previous research, customers from the region with the lowest regional income were a third less likely (32%) to make a menstrual pain transaction than those from the highest income region.
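The "times more likely" comparison above can be read as a ratio of two conditional purchase rates. The sketch below computes such a ratio per basket on toy data; the abstract reports the figure per customer, so this is a simplification, and the item labels, basket contents and function name are hypothetical.

```python
def pain_relief_rates(baskets):
    """Compare pain-relief purchase rates in baskets with and without a menstrual item.

    Each basket is a set of item-category labels. Returns
    (rate_with_menstrual_item, rate_without, ratio_of_the_two).
    """
    with_item = [b for b in baskets if "menstrual" in b]
    without_item = [b for b in baskets if "menstrual" not in b]
    if not with_item or not without_item:
        raise ValueError("need baskets both with and without a menstrual item")

    rate_with = sum("pain_relief" in b for b in with_item) / len(with_item)
    rate_without = sum("pain_relief" in b for b in without_item) / len(without_item)
    if rate_without == 0:
        raise ValueError("ratio undefined: no pain relief bought without a menstrual item")
    return rate_with, rate_without, rate_with / rate_without


# Toy baskets (not study data): 4 contain a menstrual item, 8 do not.
toy_baskets = (
    [{"menstrual", "pain_relief"}] * 2
    + [{"menstrual"}] * 2
    + [{"pain_relief", "other"}] * 1
    + [{"other"}] * 7
)

rate_with, rate_without, ratio = pain_relief_rates(toy_baskets)
```

In this toy example half of the menstrual-item baskets also contain pain relief, against one in eight of the other baskets, which gives a ratio of four.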
Conclusions & Implications
This work motivates further research into the national prevalence of menstrual pain to understand why this regional disparity exists and whether it is a consequence of "period poverty". A better understanding of the sociodemographic factors associated with menstrual pain will help healthcare professionals stratify patients by risk, and could inform strategies to predict and prevent menstrual pain and its adverse impacts.
Title: What can transactional data reveal about the prevalence of menstrual pain in England?
Journal: International Journal for Population Data Science
Pub Date: 2023-09-18
DOI: 10.23889/ijpds.v8i3.2292
Hannah Brewer, Qianhui Jiang, Sudha Sundar, Yasemin Hirst, James Flanagan
Introduction & Background
Antihistamine use has been associated with a reduction in ovarian cancer incidence. Herein, we investigate antihistamine exposure in relation to ovarian cancer risk using a novel data resource by examining purchase histories from retailer loyalty card data.
Objectives & Approach
Participants from the Cancer Loyalty Card Study (CLOCS) included ovarian cancer patients (cases, n=153) and women without a diagnosis of ovarian cancer (controls, n=120). Up to 6 years of purchase history, covering 2014 to 2022, was retrieved from two participating high street retailers. Logistic regression was used to estimate the odds ratio (OR) and 95% confidence intervals for ovarian cancer associated with antihistamine purchases, adjusting for confounders. The association was stratified by season of purchase, age, histology, and family history.
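To make the odds ratio and its confidence interval concrete, here is a minimal sketch of the unadjusted calculation from a 2x2 table with a Wald interval on the log scale. It is not the study's adjusted logistic regression, and the cell counts and function name are hypothetical placeholders rather than CLOCS data.

```python
import math


def odds_ratio_with_ci(exposed_cases, unexposed_cases,
                       exposed_controls, unexposed_controls, z=1.96):
    """Unadjusted odds ratio with a Wald confidence interval (z=1.96 gives about 95%)."""
    cells = (exposed_cases, unexposed_cases, exposed_controls, unexposed_controls)
    if min(cells) <= 0:
        raise ValueError("all four cells must be positive for this estimate")

    # Odds of exposure among cases divided by odds of exposure among controls.
    odds_ratio = (exposed_cases * unexposed_controls) / (unexposed_cases * exposed_controls)

    # Standard error of log(OR) is the square root of the summed reciprocal cell counts.
    se_log_or = math.sqrt(sum(1 / c for c in cells))
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Hypothetical counts: 20 of 100 cases and 40 of 100 controls bought the product.
or_estimate, ci_lower, ci_upper = odds_ratio_with_ci(20, 80, 40, 60)
```

An interval that lies entirely below 1 corresponds to a statistically significant reduced-odds association, while an interval that spans 1 does not.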
Relevance to Digital Footprints
This study is one of the first to utilise transaction data from high street retailers to investigate associations with cancer risk, based on what participants are buying.
Results
Ever purchasing antihistamines was not significantly associated with ovarian cancer overall in this small study (OR=0.68 (0.39-1.19)). However, antihistamine purchases were significantly associated with reduced ovarian cancer risk when purchased only in spring and/or summer (OR=0.37 (0.17-0.82)) and in non-serous ovarian cancer (OR=0.41 (0.18-0.93)) in stratified analyses.
Conclusions & Implications
Antihistamine purchase is associated with reduced ovarian cancer risk when purchased seasonally. However, larger studies are required to understand the mechanisms of reduced ovarian cancer risk related to seasonal purchases of antihistamines and allergies.
Title: Seasonal purchase of antihistamines and ovarian cancer risk in the Cancer Loyalty Card Study (CLOCS): results from an observational case-control study
Journal: International Journal for Population Data Science
Pub Date: 2023-09-18
DOI: 10.23889/ijpds.v8i3.2280
Abigail Brake, Daniel Birks, Mark Mon-Williams, Sam Relins
Introduction & Background
The types of challenges police and ambulance services deal with often overlap, for instance supporting those who suffer from mental ill-health. Research has shown that emergency service problems often concentrate, but also that some individuals who come to the attention of one service may not be as visible to another despite their overlap in roles.
Objectives & Approach
This study explored how routinely collected 999 data may reveal insights into how these services support potentially vulnerable populations. We argue that better understanding the nature and distribution of vulnerability-related calls may help to inform future preventative or harm reduction-based interventions. We analysed administrative data provided by Yorkshire Ambulance Service for the Bradford region through the Connected Bradford research database, posing the following questions: (1) can 999 call data provide insights into vulnerability-related incidents attended by ambulances; (2) where and when are these incidents most prevalent; and (3) what are the spatial patterns of calls and the patient home locations associated with them?
Relevance to Digital Footprints
We first select calls associated with nine callout reasons indicative of vulnerability. Patients can choose to share their data with each healthcare service they use, so we harnessed this digital footprint to analyse the spatial distribution of call locations (at postcode sector level) and patient home location (at MSOA level).
Results
Results indicate substantial concentrations of vulnerability-related calls in multiple postcode sectors, including the City Centre (where we estimate 18% of calls may be vulnerability-related) and several other areas which are associated with deprivation. Exploring flows of people from their home location to incident location, we also see substantial spatial variation in the locations in which patients involved in these types of incidents reside.
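A per-area figure such as the estimated share of vulnerability-related calls can be sketched as a simple grouped proportion. The example below assumes each call record is a (postcode sector, callout reason) pair; the sector names, reason labels and function name are hypothetical, since the abstract does not list the nine callout reasons.

```python
from collections import Counter


def vulnerability_share_by_sector(calls, vulnerability_reasons):
    """Return, for each sector, the fraction of calls whose reason is flagged as vulnerability-related."""
    totals = Counter()
    flagged = Counter()
    for sector, reason in calls:
        totals[sector] += 1
        if reason in vulnerability_reasons:
            flagged[sector] += 1
    return {sector: flagged[sector] / totals[sector] for sector in totals}


# Toy call records (not study data): (sector, callout reason).
toy_calls = [
    ("sector_1", "reason_a"), ("sector_1", "reason_x"),
    ("sector_1", "reason_b"), ("sector_1", "reason_y"),
    ("sector_2", "reason_x"), ("sector_2", "reason_x"),
    ("sector_2", "reason_y"), ("sector_2", "reason_a"),
]
toy_vulnerability_reasons = {"reason_a", "reason_b"}  # stand-ins for the nine reasons

shares = vulnerability_share_by_sector(toy_calls, toy_vulnerability_reasons)
```

The same grouping applied to patient home areas instead of call locations would give the second spatial view described in the abstract.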
Conclusions & Implications
These analyses represent initial efforts to better understand how vulnerable groups are supported by public services, and have the potential to inform future resource allocation and targeting of upstream interventions.
Title: Exploring spatial patterns of vulnerability using linked health data
Journal: International Journal for Population Data Science
Pub Date: 2023-09-18
DOI: 10.23889/ijpds.v8i3.2287
Rebecca McDonald, Anya Skatova, Carsten Maple
Introduction & Background
Digital footprints data are key for the economy, underpinning business models and service provision. This information can also benefit the public good, yet the sharing of digital footprints data is predicated on individual attitudes, which in turn depend on the value these data have to consumers. In this study, we investigated how individuals make decisions about sharing their digital footprints data, as well as which features of the data sharing scenario affect their decision to share the data.
Objectives & Approach
We used responses from a nationally representative sample of 2,087 UK residents to estimate public preferences towards sharing different types of digital footprint data in scenarios with different features. The main part of our experiment consisted of a Discrete Choice Experiment, which allows the relative importance of the different features of data sharing scenarios to be established, revealing the trade-offs participants make between them. Participants made a series of choices between two hypothetical data sharing scenario options or could “opt out” by choosing neither specified option. For example, we examined the differences in responses when data are shared for different purposes (e.g., for research vs private benefit), as well as when data are shared with more or less granular details about identity or location. The data were analysed using a logistic regression with an alternative-specific constant.
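One common way to model choices of this kind is a conditional (multinomial) logit, in which each alternative receives a utility and choice probabilities follow a softmax over those utilities. The sketch below assumes that form, with the alternative-specific constant attached to the opt-out alternative; the abstract does not give the exact specification, and the attribute names and coefficient values are hypothetical.

```python
import math


def choice_probabilities(option_a, option_b, coefficients, asc_opt_out):
    """Return [P(choose A), P(choose B), P(opt out)] under a conditional logit.

    Options are dicts mapping attribute name to value. Utility is a linear
    combination of attributes; the opt-out utility is the constant alone.
    """
    def utility(option):
        return sum(coefficients[name] * value for name, value in option.items())

    utilities = [utility(option_a), utility(option_b), asc_opt_out]
    max_u = max(utilities)  # subtract the maximum for numerical stability
    weights = [math.exp(u - max_u) for u in utilities]
    total = sum(weights)
    return [w / total for w in weights]


# Hypothetical coefficients: sharing with a university raises utility,
# sharing identity alongside the data lowers it.
toy_coefficients = {"recipient_university": 0.8, "identity_shared": -1.2}
scenario_a = {"recipient_university": 1, "identity_shared": 0}
scenario_b = {"recipient_university": 0, "identity_shared": 1}

probabilities = choice_probabilities(scenario_a, scenario_b, toy_coefficients, asc_opt_out=0.0)
```

Because each coefficient enters the utility additively, the ratio of two coefficients expresses how much of one feature a respondent would trade for another, which is the trade-off reading the abstract refers to.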
Relevance to Digital Footprints
We focused on understanding whether varied features of six different types of digital footprints data - namely banking transactions, electricity use at home, retail loyalty card use, browsing history, social media, and physical activity data - affect people’s decision whether to share these data.
Results
Participants were more likely to share their data with a university for academic research than with a private company or government. Participants were also most reluctant to share data alongside their personal identity. Participants were concerned with the recipient of the data and their purpose in requesting it; whether the data would be shared along with their location and if so, to what specificity; and with the level of aggregation of the data (i.e. whether it would be shared in fine detail or as a monthly summary). In addition, we demonstrated the importance of the type of data to be shared, with people most reluctant to share bank transactions data, but relatively unconcerned about sharing their physical activity, electricity use and loyalty cards data.
Conclusions & Implications
We contribute by highlighting the trade-offs individuals are willing to make between different elements of a data sharing situation, and the relative importance of these different aspects. We also demonstrate that individuals have positive attitudes towards sharing digital footprints data for research benefiting the public good. By integrating these preferences into ethic
Title: Attitudes towards Sharing Digital Footprint Data: a Discrete Choice Experiment
Journal: International Journal for Population Data Science
Pub Date: 2023-09-18
DOI: 10.23889/ijpds.v8i3.2294
Alexandra Dalton, Emily Ennis, Melinda Green, Michelle A Morris
Introduction & Background
Food production is a substantial contributor to greenhouse gas emissions and climate change. A more sustainable diet is often a healthier one, so making lower carbon food choices benefits both the planet and the person. In order to understand the carbon footprint of food choices, linkage of recipe information to carbon footprint data and transaction records is required. To inform positive change, insights from data linkage must be communicated to the target audience, in this case schoolchildren.
Objectives & Approach
School dinner recipe information and transaction records for school meals at five schools for a six-week period were acquired. Carbon footprint estimates were calculated for each recipe, using published data. An automated dashboard was created so that these calculations could be replicated by catering teams. Carbon footprints were appended to the school transaction records for meal choices. An interactive web game was created in 'Top Trumps' style using a selection of the recipes, with carbon footprint and popularity ranking generated from the transaction records.
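The linkage described above can be sketched as two steps: estimate a per-portion footprint for each recipe from its ingredients, then join that figure to meal transactions to obtain popularity alongside footprint. The sketch assumes a footprint is the sum of ingredient mass times an emission factor in kg CO2e per kg; the ingredient names, factor values, meals and function names are hypothetical placeholders, not the published data used in the study.

```python
from collections import Counter


def recipe_footprint(ingredients_kg, emission_factors):
    """Sum ingredient mass (kg per portion) times emission factor (kg CO2e per kg)."""
    return sum(mass * emission_factors[name] for name, mass in ingredients_kg.items())


def rank_meals(recipes, emission_factors, transactions):
    """Return one row per meal with footprint and sales, lowest footprint first."""
    sales = Counter(transactions)  # each transaction is the name of the meal sold
    rows = [
        {
            "meal": meal,
            "kg_co2e_per_portion": recipe_footprint(ingredients, emission_factors),
            "portions_sold": sales[meal],
        }
        for meal, ingredients in recipes.items()
    ]
    return sorted(rows, key=lambda row: row["kg_co2e_per_portion"])


# Placeholder inputs for illustration only.
toy_factors = {"ingredient_x": 2.0, "ingredient_y": 0.5}
toy_recipes = {
    "meal_a": {"ingredient_x": 0.1, "ingredient_y": 0.2},
    "meal_b": {"ingredient_y": 0.4},
}
toy_transactions = ["meal_a", "meal_b", "meal_a"]

ranked = rank_meals(toy_recipes, toy_factors, toy_transactions)
```

Each resulting row carries the two values a card-style game needs for a recipe: its estimated footprint and how often it was chosen.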
Relevance to Digital Footprints: Transactional meal sales data from schools are digital footprint data. In this work we link these digital footprint data to detailed recipe information with estimated carbon footprints from an open data source.
Results: The Consumer Data Research Centre Carbon Calculator and the Planet Plates game were created. The Carbon Calculator is being used in a number of settings to support food procurement and recipe development. The Planet Plates game has been used in Leeds schools to empower schoolchildren to make positive changes that lower the carbon footprint of their meal choices. The children engaged with all the activities and learned not only about the sustainability of their food choices, but also about how the data they generate can be used anonymously for public good.
Conclusions & Implications: Data linkage of digital footprint data is a powerful tool for behaviour change to tackle some of the world’s most pressing challenges. Methods and insights should be shared widely and made accessible to a range of stakeholders wherever possible.
Title: Carbon foot printing school meals: data linkage and engagement activity (DOI: 10.23889/ijpds.v8i3.2294)
Introduction & Background: Human behaviour is multi-faceted and complex: its different dimensions interact with and affect each other, and individuals operate within an environmental context. Combining data from different sources helps to uncover some of these interactions and complexities. We present SynthEco, a multi-layered digital ecosystem based on a data platform that provides a statistically representative synthetic population derived from census data at different geospatial granularities. This platform is enriched with individual data from cohorts and cross-sectional surveys, and with geo-scanning of different layers of socio-environmental actors and conditions, to create a complex digital ecosystem.
Objectives & Approach: The objective of SynthEco is to allow for the analysis of behaviour, as well as health and wellbeing outcomes, by integrating cohort and cross-sectional data into a geospatially anchored synthetic population embedded in the environmental data that forms its backdrop. We demonstrate the use of this platform using the example of Montreal, Canada. The synthetic population is first generated from census data using iterative proportional fitting, which creates a population data set that is artificial yet statistically representative at a given geospatial granularity, such as a city. Each household is assigned a geospatial location, which allows its surrounding environment to be considered, including enterprises and institutions such as schools, hospitals and the local food environment. Through fuzzy matching and statistical extrapolation, different cohort and cross-sectional survey data are then merged onto individual records to describe them in more detail. This includes descriptors of health, financial wellbeing and social environment.
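Iterative proportional fitting, which the abstract names as the method for generating the synthetic population, can be sketched for a two-way table as follows. This is a generic textbook version, not the SynthEco implementation. The seed table, the marginal totals and the variable names are assumptions made up for illustration.

```python
import numpy as np

def ipf(seed, row_targets, col_targets, tol=1e-9, max_iter=1000):
    """Rescale a positive 2-D seed table until its margins match the targets."""
    table = np.asarray(seed, dtype=float).copy()
    row_targets = np.asarray(row_targets, dtype=float)
    col_targets = np.asarray(col_targets, dtype=float)
    for _ in range(max_iter):
        # Scale each row so that row sums match the row targets.
        table *= (row_targets / table.sum(axis=1))[:, None]
        # Scale each column so that column sums match the column targets.
        table *= (col_targets / table.sum(axis=0))[None, :]
        # Columns now match exactly; stop once the rows match as well.
        if np.allclose(table.sum(axis=1), row_targets, atol=tol, rtol=0):
            break
    return table

# Hypothetical example: a sample cross-tabulation (age band x household type)
# fitted to census-style marginal totals. Both margins must share one grand total.
seed = np.array([[1.0, 2.0], [3.0, 4.0]])
age_totals = np.array([40.0, 60.0])
household_totals = np.array([55.0, 45.0])

fitted = ipf(seed, age_totals, household_totals)
print(fitted.round(2))
```

The fitted cell counts can then serve as sampling weights when drawing synthetic individuals or households. Real applications typically fit more than two dimensions and need to handle zero cells.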
Relevance to Digital Footprints: The presented work makes two important points in relation to digital footprints data. The first is the technical approach to merging multiple datasets, which describe different dimensions of interacting human characteristics and behaviour, by anchoring them in a synthetic population through fuzzy record matching. The second is the consideration of a spatial dimension when describing human behaviour. This is especially important when describing behaviour within local environments, such as interaction with local food outlets.
Results: Recent work in this context includes an analysis of the food environment in Montreal, Canada. It introduces a way of using the synthetic population to predict the healthfulness of the local environment in terms of healthy food outlets, and provides a platform for food environment surveillance and intervention simulations. For this purpose, the healthfulness of different census tract regions in Montreal is calculated to identify food de
Title: SynthEco - A multi-layered digital ecosystem for analysing complex human behaviour in context (Antonia Gieschen, Catherine Paquet, Raja Sengupta, Anna-Liisa Aunio, Fares Belkhiria, Shawn Brown, Laurette Dube; Pub Date: 2023-09-18; DOI: 10.23889/ijpds.v8i3.2285)
Pub Date : 2023-09-18 DOI: 10.23889/ijpds.v8i3.2278
Nina Di Cara, Oliver Di Davis, Claire Haworth
Introduction & Background: Social media data is increasingly recognised as an important source of behavioural data. It can provide insights into patterns of life and how individuals and groups are feeling. However, many studies of social media’s relationship to mental health and well-being have suffered from poorly developed ground-truth data, relying on assumed ground-truth labels and on data from single timepoints. This means that the accuracy of models at future timepoints cannot be assessed.
Collecting Twitter data from cohorts provides a solution to this issue, given the many years of high-quality data that can be used as ground truth. Cohorts can also benefit from the higher-resolution data provided by social media, which can supplement their traditional data collection methods.
Objectives & Approach: We used Twitter data collected with consent from two generations of the Avon Longitudinal Study of Parents and Children (ALSPAC) (N=656). The data is linked to two surveys, completed in April-May 2020 and May-July 2020, which provide validated outcome measures of anxiety, depression and general well-being.
Using the LIWC and VADER sentiment algorithms, the sentiment categories most highly associated with each outcome were used to develop a multiple regression model for each of anxiety, depression and general well-being at the first survey timepoint. The error of these models in predicting the second timepoint allowed us to assess how well the different outcomes are predicted for each demographic group.
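The evaluation design described here, fitting on the first timepoint and measuring error on the second by demographic group, can be sketched with synthetic data. Nothing below uses ALSPAC, LIWC or VADER. The three feature columns merely stand in for sentiment scores, and the effect sizes, the noise level and the binary group label are all assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200

# Synthetic stand-ins: three sentiment scores per person at each timepoint.
X_t1 = rng.normal(size=(n, 3))
X_t2 = rng.normal(size=(n, 3))
true_w = np.array([0.5, -0.3, 0.2])  # invented effect sizes
y_t1 = X_t1 @ true_w + rng.normal(scale=0.1, size=n)
y_t2 = X_t2 @ true_w + rng.normal(scale=0.1, size=n)
group = rng.integers(0, 2, size=n)  # hypothetical demographic label

def add_intercept(X):
    """Prepend a column of ones so the model includes an intercept."""
    return np.column_stack([np.ones(len(X)), X])

# Fit ordinary least squares on the first timepoint only.
coef, *_ = np.linalg.lstsq(add_intercept(X_t1), y_t1, rcond=None)

# Predict the second timepoint and summarise absolute error per group.
abs_err = np.abs(add_intercept(X_t2) @ coef - y_t2)
mae_by_group = {int(g): float(abs_err[group == g].mean()) for g in np.unique(group)}
print(mae_by_group)
```

Comparing the per-group errors shows whether a model that looks adequate overall predicts some groups noticeably worse than others at the later timepoint.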
Relevance to Digital Footprints: Digital footprint data can complement traditional data sources to provide a more nuanced view of health inequalities. These data are typically quicker to collect than data from traditional methods (census, survey), allowing a more reactive response to emergent issues such as the cost-of-living crisis.
Results: This study illustrates how the collection of digital footprint data can be integrated into existing long-term studies, which can provide multiple points of ground-truth data.
Conclusions & Implications: This study has shown that collecting Twitter data and integrating it into cohort studies is feasible, and that cohort data provides multiple ground-truth options. This time series data is important for assessing the feasibility of mental health inference from online behavioural data, which this study shows may vary across personal characteristics.
In future research we plan to link subsequent ALSPAC surveys to provide more ground-truth timepoints, and to explore the temporal stability of predictions and the impact of model drift on performance.
Title: Longitudinal reliability of Twitter sentiment for measuring mental health and well-being in a UK birth cohort (DOI: 10.23889/ijpds.v8i3.2278)
Pub Date : 2023-09-18 DOI: 10.23889/ijpds.v8i3.2289
Neo Poon, James Goulding, Anya Skatova
Introduction & Background: The availability of digital footprints data has provided new and invaluable opportunities for personality psychologists. One way to study individual differences with digital footprints data is through the lens of entropy, which is a measure of the degree of randomness of a probabilistic system. When applied to individual behaviour, entropy captures how predictable an individual’s pattern of behaviour (e.g., shopping) is over time. In this study, we proposed that entropy can be conceptualised as a proxy measure of Openness, a Big Five personality trait. We further studied entropy’s associations with an external behavioural outcome, namely the voting outcomes of the 2016 EU ‘Brexit’ referendum in the UK. This referendum asked UK citizens whether the UK should stay in the EU (vote Remain) or leave the EU (vote Leave). It has been demonstrated that the Leave (or ‘Brexit’) vote was heavily influenced by attitudes towards immigration, which are associated with being less ‘open’ to other cultures. We therefore expected that entropy, or the tendency to try new things, would be positively associated with voting Remain.
Objectives & Approach: Using a massive data set (20,550,952 customers) provided by a large UK retail chain over a period of 2 years, we computed aggregated entropy for the Local Authority Districts (LADs). We then investigated the relationships between entropy and personality traits, and between entropy and the referendum outcomes, at geographically aggregated levels.
Relevance to Digital Footprints: This study brought together digital footprints data with external sources. It also identified population-level insights by examining personality traits and their utility in predicting sociopolitical outcomes.
Results: A linear regression model showed strong evidence of a positive relationship between entropy and Openness (b = 0.30, t = 3.30, p = .001), and of a negative relationship between entropy and Neuroticism (b = -0.48, t = -3.53, p < .001). Further, entropy was associated with the outcomes of the EU referendum in each LAD. Another linear regression model showed strong evidence of a positive relationship between the percentage of Remain votes and entropy (b = 0.28, t = 4.80, p < .001).
Conclusions & Implications: The relationship between the Big Five trait Openness and entropy supports the idea that personality can be inferred from digital footprints data such as shopping history records. The positive relationship between entropy and the proportion of Remain votes demonstrated that people who are more open to new experiences voted Remain. Our findings have broader implications, showing that it is possible to find associations between personality traits extrapolated from shopping data and real-world choices.
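As an illustration of the entropy measure discussed in this abstract, the sketch below computes Shannon entropy over the categories a customer buys, then averages customer entropies within an area. The abstract does not say how behaviour was encoded or how entropy was aggregated to LADs. The purchase lists, the use of product names as categories and the simple mean are therefore assumptions.

```python
import math
from collections import Counter

def shannon_entropy(items):
    """Shannon entropy (in bits) of the empirical distribution of items."""
    counts = Counter(items)
    total = sum(counts.values())
    if total == 0:
        return 0.0
    return sum((c / total) * math.log2(total / c) for c in counts.values())

# Hypothetical purchase histories for two customers.
habitual = ["bread"] * 8
varied = ["bread", "milk", "tea", "rice", "eggs", "jam", "fish", "kale"]

h_habitual = shannon_entropy(habitual)  # fully predictable basket
h_varied = shannon_entropy(varied)      # eight equally likely items

# Assumed aggregation: the mean of customer entropies within one area.
area_entropy = (h_habitual + h_varied) / 2
print(h_habitual, h_varied, area_entropy)
```

A customer who always buys the same item has zero entropy, while one who spreads purchases evenly over eight items reaches the maximum of three bits for that basket size.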
Title: Behavioural entropy as an individual difference (DOI: 10.23889/ijpds.v8i3.2289)
Pub Date : 2023-09-18 DOI: 10.23889/ijpds.v8i3.2267
Victoria Jenneson, Darren Greenwood, Graham Clarke, Timothy Rains, Becky Shute, Michelle Morris
Introduction & Background: Supermarket transactions leave a digital footprint which offers insight into dietary habits. Use of transactions in nutrition research has increased, but these data are rarely validated. The STRIDE (Supermarket Transaction Records In Dietary Evaluation) study compares dietary estimates from supermarket transactions with self-reported intake from an online Food Frequency Questionnaire (FFQ).
Objectives & Approach: Working with a large UK supermarket, we recruited loyalty card customers to one of four waves (to account for seasonal dietary variation). Participants completed an online FFQ and consented to sharing their transaction records for one year during the study and one year prior. The Bland-Altman method was used to calculate agreement, and limits of agreement, between transactions and intake for daily energy, sugar, total fat, saturated fat, protein and sodium (absolute and energy-adjusted).
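The Bland-Altman method named above summarises agreement between two measurements as a mean difference (bias) with limits of agreement around it. The sketch below shows the standard calculation with 1.96 standard deviations. It does not reproduce the STRIDE analysis, and the energy values are invented.

```python
import numpy as np

def bland_altman(method_a, method_b):
    """Return (bias, lower limit, upper limit) for paired measurements."""
    a = np.asarray(method_a, dtype=float)
    b = np.asarray(method_b, dtype=float)
    diff = a - b
    bias = diff.mean()
    sd = diff.std(ddof=1)  # sample standard deviation of the differences
    return bias, bias - 1.96 * sd, bias + 1.96 * sd

# Hypothetical daily energy (kcal): purchase-based estimate vs FFQ self-report.
purchases = [1800, 2100, 1950, 2400]
ffq_intake = [2000, 2000, 2000, 2000]

bias, lower, upper = bland_altman(purchases, ffq_intake)
print(round(bias, 1), round(lower, 1), round(upper, 1))
```

A bias near zero with wide limits would indicate good agreement on average but poor agreement for individuals, which is the pattern the abstract reports for absolute measures.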
Relevance to Digital Footprints: Supermarket transactions are a form of digital footprints data with advantages over survey methods, with regard to scalability and objectivity, for monitoring population-level diets.
Results: 1,788 participants from four UK regions gave consent. The 686 participants who completed the FFQ and made purchases during the same period were included in the analysis. Participants were mostly female (72%), with a mean age of 56 years (SD 13). A regression equation for agreement is presented for estimating intake from purchases. Agreement for absolute measures was poor overall, but higher for single-person households and for households reporting a higher proportion of total food purchases from the study retailer. Agreement was stronger for energy-adjusted nutrient estimates, particularly fat, with purchase records under-estimating the proportion of total energy intake from fat by just 2%.
Conclusions & Implications: The STRIDE study found that household purchases from a single retailer were a poor proxy for individual-level nutrient intakes. However, close agreement on average energy-adjusted estimates suggests purchases are a good indicator of dietary composition. Supermarket transactions have utility for population dietary assessment, ecological studies, and identifying intervention targets based on dietary patterns. Digital footprint data from transactions can contribute to the design and monitoring of national and local-level interventions.
Title: Supermarket Transaction Records In Dietary Evaluation – the STRIDE validation study. (DOI: 10.23889/ijpds.v8i3.2267)