International Journal of Population Data Science: Latest Articles

Has HFSS legislation led to healthier food and beverage sales? The DIO-Food protocol – using supermarket sales data for policy evaluation
Pub Date : 2024-06-10 DOI: 10.23889/ijpds.v9i4.2426
V. Jenneson, F.L. Pontin, Emily Ennis, Alison Fildes, Michelle A. Morris
Introduction & Background: On 1 October 2022, new legislation came into force in England restricting the placement of some food and drink products high in fat, sugar and salt (HFSS). Products such as confectionery can no longer be placed at store entrances, at the ends of aisles, or at the checkout in large retail stores and their online equivalents.
Objectives & Approach: Our protocol sets out how daily sales and product data from multiple retailers will be used to evaluate the legislation’s success in relation to HFSS sales, product portfolios and equitability. Food and drink sales data covering 18 months before and 12 months after the introduction of the policy will be obtained from multiple large UK retailers. Online sales are excluded. Eligible stores were defined as supermarkets from our partner retailer brands with store areas larger than 280 square metres. From the eligible store sample, we selected 160 intervention stores (England) and 50 control stores (Scotland and Wales) from each partner retailer. The sample provides equal store numbers across each decile of the Priority Places for Food Index (PPFI) from each retailer (n = 16), capturing food insecurity risk, and maximum coverage of store characteristics (store size) and store area characteristics (urban/rural status). A controlled interrupted time-series analysis will be used to estimate the effects of the policy, with stores in Scotland and Wales (where the legislation has not been implemented) acting as controls.
Relevance to Digital Footprints: This protocol sets out the first independent, multiple-retailer analysis of the HFSS legislation, demonstrating how business digital footprints data can contribute to policy evaluation.
Results: Outcomes will include sales of HFSS products and changes to available product portfolios. We will explore whether the impacts of the legislation were equitable across stores in areas with different demographic characteristics, according to the English Indices of Multiple Deprivation and the PPFI. Findings at the retailer and cross-retailer levels will inform sector-level insights regarding impact and potential next steps for policy and business practice.
Conclusions & Implications: Our conclusions will contribute to policy-relevant discussions around the effectiveness of HFSS government policy, with potential to influence future decision-making across the UK Devolved Nations.
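The controlled interrupted time-series design named in this abstract can be illustrated with a minimal sketch. This is not the authors' analysis code: the synthetic monthly sales series, the function and variable names, and the simplified segmented OLS specification (a level change only, with no handling of autocorrelation or seasonality) are all assumptions made for illustration. The 18 pre-policy and 12 post-policy periods mirror the window described in the protocol.

```python
import numpy as np

def controlled_its_effect(sales_intervention, sales_control, policy_index):
    """Estimate the policy level change in intervention stores relative to controls.

    Assumed, simplified model fitted by ordinary least squares:
        y = b0 + b1*t + b2*post + b3*group + b4*group*t + b5*group*post
    b5 is the extra level change seen in the intervention group after the policy.
    """
    n = len(sales_intervention)
    t = np.arange(n, dtype=float)
    post = (t >= policy_index).astype(float)

    rows, y = [], []
    for group, series in ((1.0, sales_intervention), (0.0, sales_control)):
        for i in range(n):
            rows.append([1.0, t[i], post[i], group, group * t[i], group * post[i]])
            y.append(series[i])
    X = np.array(rows)
    coef, *_ = np.linalg.lstsq(X, np.array(y, dtype=float), rcond=None)
    return coef[5]

# Synthetic, noise-free data (hypothetical): 18 months before, 12 months after.
months = np.arange(30, dtype=float)
policy_month = 18
control = 100.0 + 0.5 * months                                   # no policy effect
intervention = 120.0 + 0.5 * months - 8.0 * (months >= policy_month)  # drop of 8 units

effect = controlled_its_effect(intervention, control, policy_month)
print(round(effect, 3))
```

A real evaluation would likely also need to model slope changes, seasonality and serial correlation; the sketch only shows how the control series enters the regression.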
Citations: 0
Predicting Healthy Start Scheme Uptake using Deprivation and Food Insecurity Measures.
Pub Date : 2024-06-10 DOI: 10.23889/ijpds.v9i4.2435
Kuzivakwashe Makokoro, Gavin Long, John Harvey, Andrew Smith, Simon Welham, Evgeniya Lukinova, James Goulding
Introduction & Background: Food insecurity in England is rising, with low-income families requiring more support to reduce income inequalities. The government has introduced policies to address these issues, with targeted subsidies on healthy food through programmes such as the Healthy Start Scheme. Despite this, national uptake of the Healthy Start Scheme remains lower than the government target.
Objectives & Approach: Our study aims to predict uptake and take-up discrepancies at the local authority level, and to understand which measures contribute to the prediction, using anonymised supermarket loyalty card records for over 4 million customers together with deprivation and food insecurity measures. We used a machine-learning approach utilising transactional data, ONS Index of Deprivation datasets, neighbourhood statistics, and NHS Healthy Start Scheme uptake data. Regression prediction models were used to evaluate and predict the outcomes, whilst feature importance tools were used to evaluate the weighting of variables within the model.
Relevance to Digital Footprints: This study leverages transaction data from a UK retailer to understand lifestyle factors at the local authority level and assesses their usefulness in predicting the scheme’s uptake. Loyalty card transactional data can provide valuable insight into purchase behaviour linked to health and nutrition.
Results: The Linear and Ridge Regression models performed better than the other prediction models. Analysis of measures revealed that, whilst deprivation and population-related measures contributed strongly to the prediction model, transactional data measures provided valuable insight into shopping behaviour characteristics that contribute to model performance. Results suggested that areas with higher spending on fruits and vegetables and on high-calorie food were associated with higher predicted uptake in the test data, whereas the converse held for high spend on fish.
Conclusions & Implications: Our study indicates that shopping data measures such as spend on fruits and vegetables, high-calorie food, fish and products bought can be utilised in prediction models for uptake and take-up discrepancy of the Healthy Start Scheme. This study highlights the complexity of understanding factors influencing public policy effectiveness and the need for tailored approaches in diverse urban contexts.
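As a rough illustration of the ridge regression and feature-importance step mentioned in this abstract, the sketch below fits a closed-form ridge model and ranks features by the size of their standardised coefficients. Everything here is an assumption: the abstract does not say which feature-importance tools were used, the three feature names are hypothetical, and the data are synthetic, with coefficient signs chosen only to mimic the directions reported (positive for fruit and vegetable spend, negative for fish spend).

```python
import numpy as np

def fit_ridge(X, y, alpha=1.0):
    """Closed-form ridge regression on standardised features.

    Returns (intercept, coefficients). Coefficients are on the standardised
    scale, so their absolute values give a rough importance ranking.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    Xs = (X - X.mean(axis=0)) / X.std(axis=0)   # zero mean, unit variance
    yc = y - y.mean()                            # centre the target
    n_features = Xs.shape[1]
    coef = np.linalg.solve(Xs.T @ Xs + alpha * np.eye(n_features), Xs.T @ yc)
    return y.mean(), coef

# Synthetic local-authority data (hypothetical names and values).
rng = np.random.default_rng(0)
n_areas = 200
feature_names = ["deprivation_score", "fruit_veg_spend", "fish_spend"]
X = rng.normal(size=(n_areas, 3))
uptake = (60.0 + 5.0 * X[:, 0] + 2.0 * X[:, 1] - 1.0 * X[:, 2]
          + rng.normal(scale=0.1, size=n_areas))

intercept, coef = fit_ridge(X, uptake, alpha=1.0)
ranking = [feature_names[i] for i in np.argsort(-np.abs(coef))]
print(ranking)
```

Standardised coefficients are only one of several possible importance measures; permutation importance or similar tools may rank correlated features differently.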
Citations: 0
Earth Observations, Digital Footprints and Machine-Learning: Greenhouse Gas Stocktaking for Climate Change Mitigation
Pub Date : 2024-06-10 DOI: 10.23889/ijpds.v9i4.2423
Keneuoe Maliehe, James Goulding, Salim Alam, Stuart Marsh
Introduction & Background: Methane (CH4) is a powerful greenhouse gas, leaving both a physical and a digital footprint from natural (40%) and human (60%) sources. Its atmospheric concentration has increased from 722 ppb before the industrial age to ~1,922 ppb in recent times. Because of its global warming potential, measuring and monitoring CH4 is crucial to mitigating the impacts of climate change. However, large uncertainties exist in the “bottom-up” inventories reported to the United Nations Framework Convention on Climate Change (a product of activity data, based on counts of components, equipment or throughput, and estimates of gas-loss rates per unit of activity for different land uses), making it difficult for policymakers to set emission reduction targets. To address this, we employ causality-constrained machine learning (ML) to combine different gas observations from satellite sensors onboard the TROPOspheric Monitoring Instrument (which measure a digital footprint of human methane-generating behaviour) with outputs from chemical modelling. These are linked with datasets from the national statistics office, the meteorology office and a comprehensive survey on quality of life in the emission field, to improve bottom-up estimates of CH4 emissions at the Earth’s surface.
Objectives & Approach: The research uses mixed methods to collect and analyse both qualitative and quantitative data, supporting multidisciplinary processing strategies for monitoring CH4 emissions locally and regionally. It also assesses whether additional “digital footprint” variables, besides the well-known chemical sources and sinks, can be studied to improve our understanding of the CH4 budget. We have conducted an “analytical inversion” of satellite observations of CH4 to obtain emission fluxes. These represent the dependent variable for our ML model and are combined with 22 independent variables (co-occurring trace gases, meteorological fields, land use, land cover, population, livestock, and data from a quality-of-life survey by the Gauteng City-Region Observatory covering a broad range of socio-economic, personal and political issues) and near-real-time Earth observation data, to aid the development of a causality-constrained ML model for the prediction of CH4 fluxes.
Relevance to Digital Footprints: We make use of not only satellite imagery but also socio-economic, demographic, and environmental data, and repurpose them for environmental sustainability in the context of mitigating climate change. We are creating unique resources for documenting rapid changes in emissions.
Conclusions & Implications: This research will make important contributions to developing countries with limited resources, enabling them to contribute to the global stocktake towards net-zero by helping policymakers identify the geographic regions that are major emitters and put measures in place to mitigate emissions.
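The “bottom-up” inventory described in this abstract is a product of activity data and per-unit gas-loss rates, summed over sources. A minimal worked sketch of that arithmetic follows; the source categories, counts and emission factors are hypothetical placeholders, not values from the study.

```python
def bottom_up_emissions(activities):
    """Sum activity * emission factor over source categories.

    `activities` maps a source name to (activity_count, emission_factor),
    where the factor is the estimated gas loss per unit of activity.
    """
    return sum(count * factor for count, factor in activities.values())

# Hypothetical categories and numbers, for illustration only
# (units assumed: tonnes of CH4 per unit of activity per year).
inventory = {
    "livestock_head": (1000, 0.05),
    "landfill_sites": (4, 250.0),
    "gas_wells": (20, 12.5),
}
total = bottom_up_emissions(inventory)
print(total)
```

Because each term multiplies an uncertain count by an uncertain loss rate, errors in either input propagate directly into the total, which is one way to read the abstract's point about large inventory uncertainties.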
Citations: 0
Alcohol Interventions on Online Grocery Shopping Platforms
Pub Date : 2024-06-10 DOI: 10.23889/ijpds.v9i4.2430
Eszter Vigh, Angela Attwood, Anne Roudaut
Introduction & Background: There is an opportunity to engage light to moderate drinkers in alcohol reduction interventions as a preventative measure. In online grocery shopping, intervention development faces an added challenge in the form of deceptive patterns, which influence consumer behaviour in unhealthy ways, including automating behaviour and encouraging overconsumption.
Objectives & Approach: The objectives of this study are to: 1) identify deceptive patterns in the online grocery shopping context, 2) develop interventions which support healthier decision making in this context, and 3) apply those interventions to appropriate product categories. The first objective was addressed with a heuristic analysis conducted across eleven major online grocery shopping platforms. The interventions were then developed using the Rapid Iterative Testing and Evaluation (RITE) method, which involved interviewing participants and iterating upon the interventions after every interview. Each interview was analysed using content analysis. When incorporating the interventions into the online grocery shopping environment, interviews were conducted to gain insight into the drinking and purchasing habits of consumers. These final interviews were then analysed with inductive thematic analysis.
Relevance to Digital Footprints: Digital footprints underpin the entire intervention development space. The background of the project is built upon human shopping and interaction behaviour online when encountering deceptive patterns. These deceptive patterns have been established using mobile gaming micro-transaction data and online grocery shopping log-in and rewards data, among other data sources. Digital footprints data can further support the findings from the thematic analysis by showing cultural and social trends around drinking (e.g., increased purchasing of seasonal beers and ciders in the summer and during sporting tournaments). The purposes of drinking identified through those social and cultural trends help gauge the appropriateness of proposed alcohol interventions. Beyond this, digital footprints data on engagement with health and wellness promoting applications (e.g., active users and app downloads) provide greater insight into the types of health messaging that garner attention, and can be used to inform how to approach those currently outside the health-engaged group. Digital footprints serve to attach larger societal trends to the smaller-scale interviews and thematic analysis conducted as part of the study.
Results: Initial findings have shown opportunities for nudging light to moderate drinkers who primarily consume beer, wine, or cider. Spirits have been identified as difficult to substitute because few low-alcohol spirit alternatives are widely available on the consumer market via online grocery retailers.
Conclusions & Implications: Without significant change, costs to the National Health Service (NHS) from alcohol-related harm will increase by more than £1 billion. Alcohol consumption is a public health risk, so there is an opportunity to use digital-footprint-informed interventions to engage the public in preventative interventions.
Citations: 0
Understanding Twitter Usage through Linked Data: An Analysis of Motivations and Online Behavior
Pub Date : 2024-06-10 DOI: 10.23889/ijpds.v9i4.2418
Shujun Liu, Luke Sloan, C. Jessop, Tarek Al Baghal, Paulo Serôdio
Introduction & Background: Uses and gratification (U&G) theory posits that individuals’ engagement with social media is a deliberate effort to fulfill various needs, such as information seeking, entertainment, and networking. However, prior studies predominantly addressed whether individuals use social media to satisfy their needs, leaving a gap in understanding how individuals behave online to satisfy those needs. This study fills this gap by merging survey responses with actual Twitter activity to investigate how individuals behave online to satisfy distinct motivations, including (a) self-expression, (b) seeking entertainment, (c) business and working, (d) staying informed with news, and (e) networking. We also investigated how these online behaviors vary among individuals with different demographic features, including socio-economic class, gender, and age.
Objectives & Approach: Our research addressed these questions by linking survey responses with actual Twitter activity within the U.K. Participants were asked to provide survey responses on age, gender, socio-economic class, and motivations for using social media. They were also asked whether they had a Twitter account, whether they were willing to disclose their Twitter username and, if agreeable, for the username itself. The survey continued until a total of 2,195 individuals had shared Twitter handles. Following the removal of accounts that were either suspended or nonexistent, the study proceeded with a final count of 1,915. We collected each user’s Twitter metadata with the Twitter API, including tweet count, follower count, following count, and bio information, and linked each user’s metadata with their survey responses. To ensure respondents’ anonymity, survey, Twitter and linked data are stored separately and can only be accessed by designated researchers.
Relevance to Digital Footprints: The study’s approach of linking survey responses with actual Twitter activity offers detailed insight into the digital footprints left by users as they engage with social media to satisfy their diverse needs. By analyzing the behaviors associated with motivations, this research illuminates the specific ways individuals curate their digital presence.
Results: Regression analysis indicated that individuals motivated by self-expression tend to tweet (b = .28, SE = .06, p < .001), follow accounts (b = .38, SE = .06, p < .001), gain followers (b = .13, SE = .06, p = .035), and post bio details (b = .89, SE = .13, p < .001). Work and business motivation leads users to post bio information (b = .38, SE = .15, p = .012), while networking leads them to follow more accounts (b = .28, SE = .06, p < .001). Socio-economic class moderated the associations between networking motivation and tweet count (b = -.25, SE = .09, p = .004), and between self-expression and tweet count (b = .20, SE = .08, p = .009). For individuals of higher socio-economic class, self-expression has a larger effect on tweet count, whereas networking motivation has a smaller effect on tweet count. Furthermore, we found that gender moderated the association between self-expression and tweet count (b = .25, SE = .12, p = .04) and between staying informed with news and tweet count (b = .11, SE = .05, p = .03).
Conclusions & Implications: These findings provide a nuanced understanding of social media use, highlighting how different motivations shape specific online behaviors. The novel approach of linking surveys with actual social media activity reflects user behavior more accurately, offering insights for both academia and practical social media strategy and design.
Citations: 0
Augmenting Surveys with Social Media Data: A Probabilistic Framework for LinkedIn Data Linkage.
Pub Date : 2024-06-10 DOI: 10.23889/ijpds.v9i4.2433
Paulo Matos Serodio, Tarek Al Baghal, Luke Sloan, Shujun Liu, C. Jessop
Introduction & Background: LinkedIn, with its extensive global network of over 900 million members across more than 200 countries, presents a unique repository for examining labour market dynamics, professional development, and the impact of social networking on employment opportunities. Despite its potential, LinkedIn's wealth of data on professional trajectories, skills, and labour market outcomes remains largely untapped in survey research due to challenges in data collection.
Objectives & Approach: This paper introduces a novel methodology for integrating LinkedIn data with survey responses, using data from the fourteenth wave of the Innovation Panel (IP14) of Understanding Society: The UK Household Longitudinal Study (UKHLS), conducted in 2021. In IP14, we probed the extent of LinkedIn usage among the UK population and assessed users' willingness to link their LinkedIn profiles with their survey responses. Those consenting to link their accounts were asked for specific details (their first and last names, employer, and job title) to enable profile identification on LinkedIn. Given the unavailability of a unique platform identifier and the cessation of LinkedIn’s API, this information was crucial for matching profiles accurately. We crafted a framework using PhantomBuster for ethical data extraction and a probabilistic string-matching technique to ensure precise linkage between survey responses and LinkedIn profiles. PhantomBuster, a cloud-based tool, efficiently scrapes dynamic content using JavaScript in a headless browser environment, sidestepping IP-related restrictions while adhering to website terms of service, and so streamlines the data collection process. Identified profiles were subjected to iterative probabilistic string matching, using respondent-provided metadata alongside supplementary data, to maximize the accuracy of matching the profiles to our survey participants.
Relevance to Digital Footprints: The described method advances digital footprint research in data collection and linkage. It automates the retrieval of vast online data sets; compiles information efficiently in an organized format; saves time and labour by mechanizing monotonous tasks; circumvents platform-imposed IP restrictions; and lowers barriers to entry, as it requires less technical skill than other scraping tools such as Selenium.
Conclusions & Implications: This approach not only facilitates the precise identification and collection of LinkedIn profile data but also sets a precedent for ethical considerations in web scraping practices. By documenting this methodology, we aim to equip researchers with a scalable and replicable tool for future studies, enriching the analysis of labour market outcomes and the interplay between formal education, informal training, and professional success through the integration of LinkedIn and survey data.
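The abstract does not specify the matching algorithm, so the following is only a minimal sketch of how respondent-provided fields (name, employer, job title) might be scored against candidate profiles. The field weights, the threshold, and all function names are hypothetical, and the standard-library `difflib.SequenceMatcher` stands in for whatever string-similarity measure the authors used. It performs a single scoring pass; an iterative version would rerun it with supplementary fields for respondents left unmatched.

```python
from difflib import SequenceMatcher

# Hypothetical weights and threshold, chosen for illustration only.
FIELD_WEIGHTS = {"name": 0.5, "employer": 0.3, "job_title": 0.2}
MATCH_THRESHOLD = 0.8


def similarity(a: str, b: str) -> float:
    """Case-insensitive similarity in [0, 1]; 0.0 if either value is missing."""
    a, b = a.strip().lower(), b.strip().lower()
    if not a or not b:
        return 0.0
    return SequenceMatcher(None, a, b).ratio()


def match_score(respondent: dict, profile: dict) -> float:
    """Weighted sum of per-field similarities (the weights sum to 1)."""
    return sum(
        weight * similarity(respondent.get(field, ""), profile.get(field, ""))
        for field, weight in FIELD_WEIGHTS.items()
    )


def best_match(respondent: dict, candidates: list, threshold: float = MATCH_THRESHOLD):
    """Return (profile, score) for the top candidate, or (None, score) if below threshold."""
    if not candidates:
        return None, 0.0
    score, profile = max(
        ((match_score(respondent, c), c) for c in candidates),
        key=lambda pair: pair[0],
    )
    return (profile if score >= threshold else None), score


# Made-up example records, not data from the study.
respondent = {"name": "Jane Doe", "employer": "Acme Ltd", "job_title": "Data Analyst"}
candidates = [
    {"name": "John Smith", "employer": "Other Corp", "job_title": "Engineer"},
    {"name": "Jane Doe", "employer": "Acme Limited", "job_title": "Data Analyst"},
]
profile, score = best_match(respondent, candidates)
print(profile, round(score, 2))
```

Returning `None` below the threshold keeps uncertain links out of the linked data set; where the threshold sits is a trade-off between missed matches and false links that would need to be tuned against manually verified cases.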
Citations: 0
Continuous glucose monitoring (CGM) for 308 older-age participants in an English birth cohort: variability and correlates
Pub Date : 2024-06-10 DOI: 10.23889/ijpds.v9i4.2417
Sophie V. Eastwood, Michele Orini, Andrew Wong, Scott T Chiesa, Joshua King-Robson, Jonathan Scott, Nishi Chaturvedi
Introduction & Background: Epochs of hyperglycaemia and hypoglycaemia may each increase the risk of common chronic diseases and impair both cognitive and physical function, even in people without diabetes. Older people may have a greater frequency of adverse glycaemic excursions, partly due to disordered autonomic function and sleep quality. Data for older, non-diabetic people are, however, scant.
Objectives & Approach: 1) To describe blood glucose variability (completed) and 2) its socio-demographic and lifestyle correlates (planned) in a predominantly non-diabetic cohort of older adults. Participants were recruited during 2021-2023 from an English birth cohort (the 1946 National Survey of Health and Development). They wore a continuous glucose monitor (FreeStyle Libre, Abbott), which measured circulating glucose four times per hour, for seven days. Summary statistics and time outside range (4.4-7.8 mmol/L) were calculated. Further information on glycaemic excursions and day-to-day variability will be obtained using the R “iglu” package. For all CGM summary and excursion measures, future analyses will investigate associations with HbA1c, socio-demographics, body composition, physical activity, diet and alcohol use. Results will be stratified by sleep/wake time periods estimated from simultaneous actigraphy (Philips Actiwatch Spectrum Plus). Sensitivity analyses will exclude people taking hypo/hyperglycaemic medications and those with diabetes.
Relevance to Digital Footprints: Derived summary measures can be used by future studies to give insights into glycaemic variability as a population-level risk factor. This work will bring together multiple data sources, i.e. CGM, actigraphy and baseline cohort data.
Results: Participants were aged 75-76 years, 45% were female and 10% had diagnosed diabetes; median (IQR) BMI was 26.8 (24.6-29.2) kg/m². CGM data from 308 participants were collected, for a median (IQR) of 6.9 (6.7-7.6) days. Average glucose over the recording period was 5.7 mmol/L (5.3-6.2 mmol/L), standard deviation was 1.0 mmol/L (0.8-1.3 mmol/L), time outside range was 12.8% (6.2-24.7%), and 16% of participants spent ≥1 hour/day above and ≥1 hour/day below range.
Conclusions & Implications: CGM was feasible for this cohort of older adults, and demonstrated high levels of time outside range for a predominantly non-diabetic group. Future analysis will determine whether enhanced characterisation of glycaemic variability is a potentially more accurate tool for predicting future disease risk than isolated glucose measurements.
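The per-participant summary measures named above (mean, standard deviation, time outside the 4.4-7.8 mmol/L range) can be illustrated with a short sketch. It is not the authors' code: the function name is hypothetical, readings are assumed to be equally spaced (so the share of readings approximates the share of time), and values exactly on the range boundaries are assumed to count as in range.

```python
from statistics import mean, stdev

LOW, HIGH = 4.4, 7.8  # mmol/L, the range quoted in the abstract


def cgm_summary(readings):
    """Summarise one participant's equally spaced CGM readings (mmol/L)."""
    if len(readings) < 2:
        raise ValueError("need at least two readings to compute a standard deviation")
    n = len(readings)
    below = sum(1 for g in readings if g < LOW)   # hypoglycaemic side
    above = sum(1 for g in readings if g > HIGH)  # hyperglycaemic side
    return {
        "mean": mean(readings),
        "sd": stdev(readings),  # sample standard deviation
        "pct_below": 100 * below / n,
        "pct_above": 100 * above / n,
        "pct_outside": 100 * (below + above) / n,
    }


# Made-up readings for illustration, not data from the study.
print(cgm_summary([4.0, 5.0, 6.0, 8.0]))
```

With four readings per hour, one hour per day above or below range corresponds to roughly 4 of the 96 daily readings, which is how a threshold such as "≥1 hour/day" could be derived from the same counts.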
Citations: 0
The dynamics of emotion expression on Twitter and mental health in a UK longitudinal study
Pub Date : 2024-06-10 DOI: 10.23889/ijpds.v9i4.2437
Daniel Joinson, Oliver Davis, Edwin Simpson
Introduction & Background: An estimated 4.95 billion people used social media in 2023, with the average user active on around seven platforms for over two hours per day. This widespread use generates abundant digital footprint data from interactions with social media. These data can be collected continuously and reflect the real behaviour of users in naturalistic settings. These strengths have led researchers to propose the use of social media data in digital phenotyping, where digital footprints are used to quantify and predict health conditions. Mental health assessment in particular could benefit, as existing approaches, such as self-report questionnaires and inpatient assessment, are unable to perform the real-time monitoring that digital phenotyping could potentially achieve. Digital phenotyping models for mental health require careful consideration of which aspects of social media data to include. Including all the data users generate could result in models that are overfitted and difficult to explain. Studies are required that explore the relationship between specific aspects of social media data, such as the time course of expressed emotion, and gold-standard measures of mental health.
Objectives & Approach: With participants’ consent, we linked Twitter data to self-reported measures of mental health from the Avon Longitudinal Study of Parents and Children. We performed sentiment analysis using three different approaches (LIWC, VADER and RoBERTa) to estimate the amount, variability and instability of positive and negative emotional content in each participant’s Tweets over a one-year period. We explored the association between these measures of emotion expression and self-reported scores of depressive symptoms, anxiety symptoms and wellbeing. These mental health measures are the Short Mood and Feelings Questionnaire, the Generalized Anxiety Disorder 7 (GAD-7) and the Warwick-Edinburgh Mental Wellbeing Scale.
Relevance to Digital Footprints: Our research is highly relevant to digital footprint research, as it involves the use of digital footprint data (i.e. Twitter data) to predict mental health outcomes.
Conclusions & Implications: The results of our analysis will inform the development of digital-footprint-based phenotyping for mental health that could one day provide information to supplement clinical assessments.
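The abstract does not define how amount, variability and instability were computed. The sketch below uses one set of definitions that is common in emotion-dynamics work, adopted here purely as an assumption: amount as the mean sentiment score, variability as the standard deviation, and instability as the root mean square of successive differences (RMSSD). The function name and the example scores are hypothetical.

```python
from math import sqrt
from statistics import mean, stdev


def emotion_dynamics(scores):
    """Summarise a time-ordered series of per-tweet (or per-day) sentiment scores.

    Assumed definitions, not confirmed by the abstract:
      amount      = mean score
      variability = sample standard deviation
      instability = root mean square of successive differences (RMSSD)
    """
    if len(scores) < 2:
        raise ValueError("need at least two scores")
    diffs = [later - earlier for earlier, later in zip(scores, scores[1:])]
    return {
        "amount": mean(scores),
        "variability": stdev(scores),
        "instability": sqrt(mean([d * d for d in diffs])),
    }


# Two made-up series with the same values in a different order:
# the mean and SD are identical, but the alternating series is less stable.
print(emotion_dynamics([0.0, 1.0, 0.0, 1.0]))
print(emotion_dynamics([0.0, 0.0, 1.0, 1.0]))
```

The two example series show why instability is tracked separately from variability: it depends on the order of the scores, so it captures moment-to-moment swings that the standard deviation alone cannot distinguish.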
Citations: 0
Studying Health and Illness Experience using Linked Data (SHIELD): Empowering customers to donate shopping data for chronic pain research
Pub Date : 2024-06-10 DOI: 10.23889/ijpds.v9i4.2420
Neo Poon, Claire Haworth, Elizabeth Dolan, A. Skatova
Introduction & Background: Chronic pain is considered a priority in healthcare and a threat to well-being across the globe; it is thus crucial to accurately measure national levels of pain conditions and their impacts on workplace productivity and well-being. Chronic pain has traditionally been studied in isolation, with either self-reported survey data or standalone shopping records. The former are limited in scale and can be marred by response biases, while the latter lack ‘ground truths’: what research teams can usually measure are the purchase patterns of pain relief products, but neither the severity nor the types of pain conditions.
Objectives & Approach: Data donation tools offer a novel approach to studying chronic pain by linking the two aspects and establishing statistical relationships between medicine consumption and the multiple facets of pain experience. In a survey, we asked participants (N = 953) to share their loyalty card data with us, which is made possible by the data portability tool provided by Tesco (the largest supermarket chain in the United Kingdom) as part of the General Data Protection Regulation (GDPR). Based on questions adopted from popular inventories used in health research (e.g., EQ5D Health States, ONS4 Well-being, WEMWBS scales), we also asked participants to report the details of their pain conditions, hours of employment, and both general and mental health states. This allowed us to associate chronic pain, both subjective and objective (i.e., reflected by medicine consumption), with its economic and personal consequences. Data collection was conducted via research panel providers and should therefore approximate national representativeness.
Relevance to Digital Footprints: This work links digital footprints data donated by individuals to self-reported survey data, and develops an infrastructure for these data to be collected and safely stored.
Conclusions & Implications: One key value of this project is to pioneer a measure of chronic pain that can be applied in future analytic work to transactional records that are much larger in scale. Our research team has access to an array of different digital footprints data, including longitudinal transactional data provided by a major pharmacy chain (~20 million customers and ~429 million baskets). In order to associate these data with regional workplace productivity measures and well-being data released by the Office for National Statistics, a metric must be defined to extract the prevalence of chronic pain from shopping data; this metric is informed by the patterns found by the data donation project.
Citations: 0
RADAR-Pipeline: Scalable Feature Generation for Mobile Health Data
Pub Date : 2024-06-10 DOI: 10.23889/ijpds.v9i4.2421
H. Sankesara, Y. Ranjan, P. Conde, Z. Rashid, Akash Roy Choudhury, A. Folarin
Introduction & Background: RADAR-Pipeline is an open-source Python framework designed to simplify and enhance mobile health data analysis. It is built to read and process efficiently the large amount of data generated through the RADAR-base platform. RADAR-base is a scalable, open-source platform for real-time streaming and analytics that supports research access and customisation requirements. Studies using the RADAR-base platform have collected fine-grained longitudinal data from wearables and phones. These data can potentially yield a multitude of digital biomarkers that tell us a great deal about a disease condition. Because of the sheer size of the data, researchers can find it difficult to read and process them. A common task is identifying useful features and the data processing and analysis steps previously used by the community. Up to now, these have been hand-crafted by individual data scientists and are often hard for the community to reuse without author-specific knowledge. Furthermore, generating variables from already established research on a larger scale can be challenging and could hinder replication. Hence, we have designed RADAR-Pipeline to help researchers overcome these challenges. It empowers them to create and share their data analysis and visualisation pipelines, fostering collaboration and knowledge sharing within the research community.
Objectives & Approach: The primary objective of RADAR-Pipeline is to offer researchers a user-friendly and powerful platform on which to develop and share their research. Researchers can build reusable analysis and visualisation pipelines to ensure consistent and reliable results. The framework simplifies big data analysis by leveraging Apache Spark to handle large and complex mobile health datasets efficiently. Researchers can also save time and effort by reusing and extending existing pipelines built by others. Finally, RADAR-Pipeline promotes collaboration and recognition by allowing researchers to share their work through the RADAR-base Analytics Catalogue, making their pipelines citable and accessible to the wider research community. Whilst RADAR-Pipeline has been designed to read data from RADAR-base, it can also read data from any dataset that uses the Hadoop Distributed File System (HDFS) file system namespace.
Relevance to Digital Footprints: Mobile health data is rich and valuable for understanding human behaviour and health. RADAR-Pipeline addresses the challenges associated with analysing large and complex mobile health datasets, enabling researchers to extract valuable insights that can be used for: (1) Improving public health: by enabling efficient analysis of large-scale mobile health data, RADAR-Pipeline can contribute to research efforts aimed at improving population health outcomes and developing effective interventions; (2) Personalised healthcare: by facilitating the extraction of individual-level features from mobile health data, RADAR-Pipeline can integrate seamlessly with Kafka data streams and machine learning pipelines to process data in real time, and those data can then be used to develop more effective and more targeted real-time interventions; (3) Promoting reproducible research: the framework emphasises transparency and reproducibility in research, which aligns with the conference's focus on the responsible use of digital mobile health data.
Conclusions & Implications: RADAR-Pipeline is a valuable tool that gives researchers the means to harness the potential of mobile health data. By adopting the framework, researchers can carry out efficient and scalable data analysis, which simplifies the process of extracting insights from digital footprints. This efficiency lets researchers dig deeper into the data and uncover valuable patterns and trends. RADAR-Pipeline also fosters collaboration and knowledge sharing within the research community: by providing a standardised framework for data analysis, it helps researchers work together, share best practices, and disseminate knowledge.
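The idea of a reusable feature pipeline can be sketched in a few lines. This is only an illustration under stated assumptions: it does not use RADAR-Pipeline's actual API, plain Python stands in for Apache Spark, and the sample layout, the feature names, and the `run_pipeline` helper are all hypothetical.

```python
from collections import defaultdict
from datetime import date, datetime, timezone
from statistics import mean

# Hypothetical wearable samples: (participant_id, timestamp, heart_rate_bpm)
samples = [
    ("p1", datetime(2024, 6, 10, 8, 0, tzinfo=timezone.utc), 60.0),
    ("p1", datetime(2024, 6, 10, 20, 0, tzinfo=timezone.utc), 70.0),
    ("p1", datetime(2024, 6, 11, 8, 0, tzinfo=timezone.utc), 80.0),
    ("p2", datetime(2024, 6, 10, 9, 0, tzinfo=timezone.utc), 90.0),
    ("p2", datetime(2024, 6, 10, 12, 0, tzinfo=timezone.utc), 100.0),
    ("p2", datetime(2024, 6, 10, 18, 0, tzinfo=timezone.utc), 110.0),
]

# A "pipeline" here is just a named set of feature functions (hypothetical names).
# Sharing this mapping is what makes the same features reusable across studies.
HEART_RATE_FEATURES = {
    "mean_hr": mean,
    "max_hr": max,
    "n_samples": len,
}


def group_by_participant_day(rows):
    """Group raw values by (participant_id, calendar day)."""
    groups = defaultdict(list)
    for pid, ts, value in rows:
        groups[(pid, ts.date())].append(value)
    return groups


def run_pipeline(rows, features):
    """Apply every feature function to every participant-day group."""
    grouped = group_by_participant_day(rows)
    return {
        key: {name: fn(values) for name, fn in features.items()}
        for key, values in sorted(grouped.items())
    }


daily_features = run_pipeline(samples, HEART_RATE_FEATURES)
for key, feats in daily_features.items():
    print(key, feats)
```

Separating the feature definitions from the grouping and execution logic is what allows another researcher to swap in a different feature set, or a different execution engine, without rewriting the rest.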
Citations: 0