Of babies, bathwater, and big data: Going beneath the surface of Franzén’s (2023) Google Trends recommendations
J. Raubenheimer
Acta Sociologica (journal article), published 2023-07-18
DOI: https://doi.org/10.1177/00016993231187489
Citation count: 0
Abstract
Franzén (2023: 1) has warned of ‘big problems’ when researchers attempt to use Google Trends (GT) data. The evidence she provides is examined, additional evidence is obtained and analysed, and a new set of conclusions is derived. The anomalies previously encountered are due to a combination of factors: Google samples its data to provide GT results; these data are also scaled, which can exacerbate variation between samples; and Denmark is a small country and Jakob Scharf a low-probability search term, both of which would increase variation in the search probabilities provided by GT. When multiple samples are obtained and aggregated (medians are best for low-probability search terms), this variation is controlled and a stable time series is derived. Researchers should not see GT as an easy source of data, but should do the work required to understand the data, and should use them, and interpret their results, within the limitations inherent in these data. It is important to aggregate multiple samples (preferably with the median for each time point) to obtain more stable estimates from GT.
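The aggregation step recommended in the abstract can be illustrated with a minimal sketch. It is not the author's code: the function name `aggregate_samples` and the sample values are hypothetical, and it assumes the repeated GT samples have already been retrieved and cover the same time points.

```python
from statistics import mean, median


def aggregate_samples(samples, how=median):
    """Collapse repeated samples of one series into one value per time point."""
    if not samples:
        raise ValueError("need at least one sample")
    length = len(samples[0])
    if any(len(s) != length for s in samples):
        raise ValueError("all samples must cover the same time points")
    # zip(*samples) groups the values that share a time point across samples.
    return [how(values) for values in zip(*samples)]


# Invented data (an assumption, not real GT output): five samples of the
# same four-period series for a low-probability search term, on a 0-100 scale.
samples = [
    [0, 12, 100, 0],
    [0, 0, 100, 9],
    [7, 0, 100, 0],
    [0, 15, 100, 0],
    [0, 0, 100, 0],
]

median_series = aggregate_samples(samples)           # median per time point
mean_series = aggregate_samples(samples, how=mean)   # mean, for comparison
```

In this invented example the sporadic non-zero values that appear in only one or two samples pull the mean upwards but leave the median unchanged, which is consistent with the abstract's preference for medians when the search term is low-probability.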
About the journal:
Acta Sociologica is a peer-reviewed journal which publishes papers on high-quality innovative sociology carried out from different theoretical and methodological starting points, in the form of full-length original articles and review essays, as well as book reviews and commentaries. Articles that present Nordic sociology or help mediate between Nordic and international scholarly discussions are encouraged.