Augmented Functional Analysis of Variance (A-fANOVA): Theory and Application to Google Trends for Detecting Differences in Abortion Drugs Queries
Authors: Fabrizio Maturo, Annamaria Porreca
DOI: 10.1016/j.bdr.2022.100354
Published: 2022-11-28
Publication type: Journal Article
URL: https://www.sciencedirect.com/science/article/pii/S221457962200048X
Citations: 4
Abstract
The World Wide Web (WWW) has become a popular and readily accessible source of big data in recent decades. Its information comes in many forms, for example Google Trends, which offers insight into the queries people submit to the Google Search engine. Analysing such data is not straightforward: because they can be collected over long periods, they usually take the form of high-dimensional data. Comparing the mean Google Trends curves of different groups of people or countries can help explain many phenomena and reveal populations' interests in specific periods and areas. However, because of the well-known curse of dimensionality, appropriate statistical techniques are needed to inspect and test differences in such data. This paper proposes an original approach to Google Trends data, focusing on searches for the abortion drug “Cytotec”. The aim of the application is to determine whether countries' abortion legislation influences search trends. The research relies on Functional Data Analysis (FDA) to handle high-dimensional data and proposes a generalisation of the classical functional analysis of variance model, namely the Augmented Functional Analysis of Variance (A-fANOVA). To test for statistically significant differences among groups of countries, A-fANOVA considers additional characteristics of the curves, namely the velocity and acceleration of the original Google queries over time. The proposed methodology captures additional information about the curves' behaviour, with the final aim of offering a monitoring tool for policy-makers.
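The idea of testing group differences on a curve together with its velocity and acceleration can be illustrated with a small sketch. This is not the authors' A-fANOVA procedure: it assumes curves observed on a common uniform grid, replaces basis smoothing with finite differences, and uses a simple permutation test on an F-type statistic. The function names (`augment`, `f_statistic`, `permutation_pvalues`) and the synthetic data are hypothetical.

```python
import numpy as np

def augment(curves, t):
    """Stack the curves with finite-difference velocity and acceleration."""
    vel = np.gradient(curves, t, axis=1)   # first derivative over time
    acc = np.gradient(vel, t, axis=1)      # second derivative over time
    return np.stack([curves, vel, acc])    # shape (3, n_curves, n_points)

def f_statistic(values, labels):
    """Ratio of between-group to within-group variability, summed over the grid.

    On a uniform grid the grid spacing cancels in the ratio, so plain sums
    stand in for the integrals over time.
    """
    groups = np.unique(labels)
    n, k = values.shape[0], len(groups)
    grand_mean = values.mean(axis=0)
    between = 0.0
    within = 0.0
    for g in groups:
        block = values[labels == g]
        group_mean = block.mean(axis=0)
        between += block.shape[0] * ((group_mean - grand_mean) ** 2).sum()
        within += ((block - group_mean) ** 2).sum()
    return (between / (k - 1)) / (within / (n - k))

def permutation_pvalues(curves, t, labels, n_perm=500, seed=0):
    """Permutation p-values for the level, velocity and acceleration layers."""
    rng = np.random.default_rng(seed)
    layers = augment(curves, t)
    observed = np.array([f_statistic(layer, labels) for layer in layers])
    exceed = np.zeros(len(layers))
    for _ in range(n_perm):
        shuffled = rng.permutation(labels)
        permuted = np.array([f_statistic(layer, shuffled) for layer in layers])
        exceed += permuted >= observed
    return (exceed + 1) / (n_perm + 1)

# Synthetic illustration only: two groups of 10 curves that differ in level.
rng = np.random.default_rng(1)
t = np.linspace(0.0, 1.0, 50)
labels = np.repeat([0, 1], 10)
curves = (np.sin(2 * np.pi * t)
          + 2.0 * labels[:, None]
          + rng.normal(scale=0.1, size=(20, 50)))
pvals = permutation_pvalues(curves, t, labels)
print(dict(zip(["level", "velocity", "acceleration"], pvals)))
```

Each layer gets its own p-value, so a difference that shows up only in how fast search interest changes would appear in the velocity or acceleration layer even when the mean levels look alike. Real Google Trends series would normally be smoothed before differentiation, since finite differences amplify noise.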