On Landing and Internal Web Pages: The Strange Case of Jekyll and Hyde in Web Performance Measurement

Waqar Aqeel, B. Chandrasekaran, A. Feldmann, B. Maggs
Published in: Proceedings of the ACM Internet Measurement Conference, 2020-10-27
DOI: 10.1145/3419394.3423626 (https://doi.org/10.1145/3419394.3423626)
Citations: 39

Abstract

There is a rich body of literature on measuring and optimizing nearly every aspect of the web, including characterizing the structure and content of web pages, devising new techniques to load pages quickly, and evaluating such techniques. Virtually all of this prior work used a single page, namely the landing page (i.e., root document, "/"), of each web site as the representative of all pages on that site. In this paper, we characterize the differences between landing and internal (i.e., non-root) pages of 1000 web sites to demonstrate that the structure and content of internal pages differ substantially from those of landing pages, as well as from one another. We review more than a hundred studies published at top-tier networking conferences between 2015 and 2019, and highlight how, in light of these differences, the insights and claims of nearly two-thirds of the relevant studies would need to be revised for them to apply to internal pages. Going forward, we urge the networking community to include internal pages for measuring and optimizing the web. This recommendation, however, poses a non-trivial challenge: How do we select a set of representative internal web pages from a web site? To address the challenge, we have developed Hispar, a "top list" of 100,000 pages updated weekly comprising both the landing pages and internal pages of around 2000 web sites. We make Hispar and the tools to recreate or customize it publicly available.
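
The abstract's distinction between a landing page (the root document, "/") and internal (non-root) pages can be illustrated with a small sketch. This is not the authors' tooling and not how Hispar is built; the function names `is_landing_page` and `split_pages` are hypothetical, and two choices are assumptions made only for illustration: a "site" is approximated by the URL's hostname, and a root URL that carries a query string is treated as an internal page.

```python
from collections import defaultdict
from urllib.parse import urlsplit


def is_landing_page(url: str) -> bool:
    """Return True if the URL names the root document ("/") of its host."""
    parts = urlsplit(url)
    # An empty path is equivalent to "/". Assumption: a query string
    # selects a different document, so such URLs count as internal.
    return parts.path in ("", "/") and not parts.query


def split_pages(urls):
    """Group URLs by hostname into landing and internal pages."""
    sites = defaultdict(lambda: {"landing": [], "internal": []})
    for url in urls:
        # Assumption: the hostname stands in for the "web site".
        host = urlsplit(url).hostname or ""
        kind = "landing" if is_landing_page(url) else "internal"
        sites[host][kind].append(url)
    return dict(sites)


if __name__ == "__main__":
    sample = [
        "https://example.com/",
        "https://example.com/news/article-1",
        "https://example.com/about",
    ]
    for host, pages in split_pages(sample).items():
        print(host, len(pages["landing"]), len(pages["internal"]))
```

A measurement study that follows the paper's recommendation would draw pages from both groups for each site rather than from the landing group alone.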