Exploiting the untapped functional potential of Memento aggregators beyond aggregation

IF 1.7 Q2 INFORMATION SCIENCE & LIBRARY SCIENCE International Journal on Digital Libraries Pub Date : 2024-01-27 DOI:10.1007/s00799-023-00391-0

Mat Kelly

{"title":"Exploiting the untapped functional potential of Memento aggregators beyond aggregation","authors":"Mat Kelly","doi":"10.1007/s00799-023-00391-0","DOIUrl":null,"url":null,"abstract":"<p>Web archives capture, retain, and present historical versions of web pages. Viewing web archives often amounts to a user visiting the Wayback Machine homepage, typing in a URL, then choosing a date and time significant of the capture. Other web archives also capture the web and use Memento as an interoperable point of querying their captures. Memento aggregators are web accessible software packages that allow clients to send requests for past web pages to a single endpoint source that then relays that request to a set of web archives. Though few deployed aggregator instances exist that exhibit this aggregation trait, they all, for the most part, align to a model of serving a request for a URI of an original resource (URI-R) to a client by first querying then aggregating the results of the responses from a collection of web archives. This single tier querying need not be the logical flow of an aggregator, so long as a user can still utilize the aggregator from a single URL. In this paper, we discuss theoretical aggregation models of web archives. We first describe the status quo as the conventional behavior exhibited by an aggregator. We then build on prior work to describe a multi-tiered, structured querying model that may be exhibited by an aggregator. We highlight some potential issues and high-level optimization to ensure efficient aggregation while also extending on the state-of-the-art of memento aggregation. Part of our contribution is the extension of an open-source, user-deployable Memento aggregator to exhibit the capability described in this paper. We also extend a browser extension that typically consults an aggregator to have the ability to aggregate itself rather than needing to consult an external service. A purely client-side, browser-based Memento aggregator is novel to this work.</p>","PeriodicalId":44974,"journal":{"name":"International Journal on Digital Libraries","volume":"4 1","pages":""},"PeriodicalIF":1.7000,"publicationDate":"2024-01-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"International Journal on Digital Libraries","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1007/s00799-023-00391-0","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q2","JCRName":"INFORMATION SCIENCE & LIBRARY SCIENCE","Score":null,"Total":0}

引用次数: 0

Abstract

Web archives capture, retain, and present historical versions of web pages. Viewing web archives often amounts to a user visiting the Wayback Machine homepage, typing in a URL, then choosing a date and time significant of the capture. Other web archives also capture the web and use Memento as an interoperable point of querying their captures. Memento aggregators are web accessible software packages that allow clients to send requests for past web pages to a single endpoint source that then relays that request to a set of web archives. Though few deployed aggregator instances exist that exhibit this aggregation trait, they all, for the most part, align to a model of serving a request for a URI of an original resource (URI-R) to a client by first querying then aggregating the results of the responses from a collection of web archives. This single tier querying need not be the logical flow of an aggregator, so long as a user can still utilize the aggregator from a single URL. In this paper, we discuss theoretical aggregation models of web archives. We first describe the status quo as the conventional behavior exhibited by an aggregator. We then build on prior work to describe a multi-tiered, structured querying model that may be exhibited by an aggregator. We highlight some potential issues and high-level optimization to ensure efficient aggregation while also extending on the state-of-the-art of memento aggregation. Part of our contribution is the extension of an open-source, user-deployable Memento aggregator to exhibit the capability described in this paper. We also extend a browser extension that typically consults an aggregator to have the ability to aggregate itself rather than needing to consult an external service. A purely client-side, browser-based Memento aggregator is novel to this work.

Abstract Image

查看原文

微信好友朋友圈 QQ好友复制链接

本刊更多论文

挖掘 Memento 聚合器聚合之外的未开发功能潜力

网络档案可捕获、保留和展示网页的历史版本。查看网络档案通常只需用户访问 Wayback Machine 主页，输入 URL，然后选择捕获的重要日期和时间即可。其他网络档案也会捕获网页，并使用 Memento 作为查询其捕获内容的互操作点。Memento 聚合器是一种可访问网络的软件包，允许客户端向一个单一的端点源发送对过去网页的请求，然后该端点源再将请求转发给一组网络档案。尽管很少有已部署的聚合器实例显示出这种聚合特性，但它们在大多数情况下都符合一种模式，即通过首先查询然后聚合来自网络档案集的响应结果，向客户端提供对原始资源 URI（URI-R）的请求。这种单层查询不一定是聚合器的逻辑流程，只要用户仍可通过单个 URL 使用聚合器即可。本文将讨论网络档案的理论聚合模型。我们首先将现状描述为聚合器表现出的传统行为。然后，我们在先前工作的基础上，描述了聚合器可能表现出的多层次、结构化查询模型。我们强调了一些潜在问题和高层次优化，以确保高效聚合，同时也扩展了最先进的记忆聚合技术。我们的部分贡献是扩展了一个开源的、用户可部署的 Memento 聚合器，以展示本文所述的功能。我们还扩展了一个浏览器扩展，该浏览器通常会咨询聚合器，使其能够自行聚合，而无需咨询外部服务。纯客户端、基于浏览器的 Memento 聚合器是这项工作的新颖之处。

本文章由计算机程序翻译，如有差异，请以英文原文为准。

求助全文

约1分钟内获得全文去求助

来源期刊

International Journal on Digital Libraries

CiteScore

4.30

自引率

6.70%

发文量

期刊介绍： The International Journal on Digital Libraries (IJDL) examines the theory and practice of acquisition definition organization management preservation and dissemination of digital information via global networking. It covers all aspects of digital libraries (DLs) from large-scale heterogeneous data and information management & access to linking and connectivity to security privacy and policies to its application use and evaluation.The scope of IJDL includes but is not limited to: The FAIR principle and the digital libraries infrastructure Findable: Information access and retrieval; semantic search; data and information exploration; information navigation; smart indexing and searching; resource discovery Accessible: visualization and digital collections; user interfaces; interfaces for handicapped users; HCI and UX in DLs; Security and privacy in DLs; multimodal access Interoperable: metadata (definition management curation integration); syntactic and semantic interoperability; linked data Reusable: reproducibility; Open Science; sustainability profitability repeatability of research results; confidentiality and privacy issues in DLs Digital Library Architectures including heterogeneous and dynamic data management; data and repositories Acquisition of digital information: authoring environments for digital objects; digitization of traditional content Digital Archiving and Preservation Digital Preservation and curation Digital archiving Web Archiving Archiving and preservation Strategies AI for Digital Libraries Machine Learning for DLs Data Mining in DLs NLP for DLs Applications of Digital Libraries Digital Humanities Open Data and their reuse Scholarly DLs (incl. bibliometrics altmetrics) Epigraphy and Paleography Digital Museums Future trends in Digital Libraries Definition of DLs in a ubiquitous digital library world Datafication of digital collections Interaction and user experience (UX) in DLs Information visualization Collection understanding Privacy and security Multimodal user interfaces Accessibility (or "Access for users with disabilities") UX studies