Comparing Price Indices of Clothing and Footwear for Scanner Data and Web Scraped Data

A. Chessa, R. Griffioen

Economie et Statistique / Economics and Statistics. Journal article, published 2019-10-03.
DOI: 10.24187/ecostat.2019.509.1984 (https://doi.org/10.24187/ecostat.2019.509.1984)
Citations: 5

Abstract

Statistical institutes are considering web scraping of online prices of consumer goods as a feasible alternative to scanner data. Because web scraped data contain no transaction data, the question arises whether they are suited for price index calculation. This article investigates this question by comparing price indices based on web scraped and scanner data for clothing and footwear in the same webshop. Scanner data prices and web scraped prices are often equal, with the latter being slightly higher on average. The numbers of web scraped product prices and the numbers of products sold show remarkably high correlations. Given the high churn rates of clothing products, a multilateral method (Geary-Khamis) was used to calculate price indices. For 16 product categories, the indices show small overall differences between the two data sources, with year-on-year indices differing by only 0.3 percentage points at COICOP level (men's and women's clothing). It remains to be investigated whether such promising results for web scraped data will also be found for other retailers.
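For readers unfamiliar with the multilateral method named in the abstract, the following is a minimal sketch of a Geary-Khamis index computed by fixed-point iteration. It is not the authors' implementation. The function name `geary_khamis`, the dict-based data layout, and the toy data are assumptions made for illustration. The sketch also assumes that a quantity (or some proxy weight) is available for every product in every period, which is exactly what web scraped data lack.

```python
def geary_khamis(prices, quantities, tol=1e-10, max_iter=1000):
    """Illustrative Geary-Khamis price index (hypothetical helper, not from the paper).

    prices, quantities: one dict per period, mapping product id -> price / quantity.
    Returns a list of index values with the first period normalised to 1.0.
    """
    n_periods = len(prices)
    index = [1.0] * n_periods  # starting guess: no price change

    for _ in range(max_iter):
        # Reference value per product: quantity-weighted mean of deflated prices.
        deflated_turnover = {}
        total_quantity = {}
        for t in range(n_periods):
            for item, q in quantities[t].items():
                if q <= 0:
                    continue  # products without sales carry no weight
                deflated_turnover[item] = (
                    deflated_turnover.get(item, 0.0) + q * prices[t][item] / index[t]
                )
                total_quantity[item] = total_quantity.get(item, 0.0) + q
        ref_value = {i: deflated_turnover[i] / total_quantity[i] for i in total_quantity}

        # Index per period: turnover divided by quantities valued at reference values.
        raw = []
        for t in range(n_periods):
            sold = {i: q for i, q in quantities[t].items() if q > 0}
            turnover = sum(prices[t][i] * q for i, q in sold.items())
            volume = sum(ref_value[i] * q for i, q in sold.items())
            raw.append(turnover / volume)
        new_index = [r / raw[0] for r in raw]

        if max(abs(a - b) for a, b in zip(new_index, index)) < tol:
            return new_index
        index = new_index
    return index


# Toy data (invented): every price doubles, and product "C" exists only in period 1,
# mimicking the product churn that motivates a multilateral method.
toy_prices = [{"A": 10.0, "B": 20.0}, {"A": 20.0, "B": 40.0, "C": 30.0}]
toy_quantities = [{"A": 3, "B": 1}, {"A": 1, "B": 5, "C": 2}]
toy_index = geary_khamis(toy_prices, toy_quantities)
```

Because products are valued at a common reference price across all periods, items that enter or leave the assortment still contribute in the periods where they are sold, which is why such a method suits categories with high churn. In the toy data all prices double, so the second index value should approach 2.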