Introduction
Although price is critical in determining alcohol purchasing and subsequent harms, researchers rarely have access to comprehensive alcohol price data. Web scraping is an advanced data collection technique that uses automated computer scripts to efficiently gather extensive website data. This paper aims to demonstrate web scraping's capacity to generate alcohol policy-relevant data, and to assess the method's consistency by comparing datasets collected by a commercial provider with those produced by a university-developed scraper.
Methods
Price and product data from the entire online catalogues of major retailers, which together represent the majority of the Australian market, have been scraped daily by the commercial provider since 2020. Data were collected from all jurisdictions, and products sold by multiple retailers were matched. As a reliability cross-check, a university-developed web scraper collected a single day's catalogue data from the country's largest alcohol retailer for comparison with the commercial dataset.
Results
For the 16,409 products identified in both the commercial and university databases, product prices showed excellent agreement (intraclass correlation coefficient = 0.997 [95% CI: 0.9972–0.9973]). A visualisation of data from the three largest Australian retailers showed how daily prices varied over a 12-month period: price changes were more frequent for Australia's largest retailer than for the second and third largest, and prices differed across jurisdictions, with some deeper discounting in Victoria.
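As an illustration of the agreement statistic reported above, the following is a minimal sketch of how an intraclass correlation coefficient could be computed for prices of matched products from two scrapers. The abstract does not state which ICC form was used, so this sketch assumes the two-way random-effects, absolute-agreement, single-measurement form, ICC(2,1). The function name `icc_2_1` and the example prices are hypothetical and are not taken from the study.

```python
def icc_2_1(ratings):
    """ICC(2,1): two-way random effects, absolute agreement, single measurement.

    `ratings` is a list of rows, one per matched product; each row holds the
    price recorded by each data source (e.g. [commercial, university]).
    The choice of ICC(2,1) is an assumption, not the study's stated method.
    """
    n = len(ratings)          # number of matched products
    k = len(ratings[0])       # number of data sources
    grand = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(row[j] for row in ratings) / n for j in range(k)]

    # Sums of squares from a two-way ANOVA without replication
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in ratings for x in row)
    ss_err = ss_total - ss_rows - ss_cols

    msr = ss_rows / (n - 1)               # between-product mean square
    msc = ss_cols / (k - 1)               # between-source mean square
    mse = ss_err / ((n - 1) * (k - 1))    # residual mean square

    return (msr - mse) / (msr + (k - 1) * mse + k * (msc - mse) / n)


# Hypothetical prices (AUD) for three matched products: [commercial, university]
example_prices = [[10.0, 11.0], [20.0, 21.0], [30.0, 31.0]]
print(icc_2_1(example_prices))
```

Because ICC(2,1) measures absolute agreement, a constant offset between the two sources lowers the coefficient slightly even when the prices are perfectly correlated, which is the behaviour one would want when checking whether two scrapers record the same price.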
Discussion
This study presented an independently cross-checked, large-scale, longitudinal web scraping approach to collecting alcohol price data, and demonstrated that the adapted data could aid understanding of the alcohol retail market. Web scraping is a feasible method of collecting price data to support the development of evidence-based alcohol price policy.
