Proactive Institutional Repository Collection Development Techniques: Archiving Gold Open Access Articles and Metadata Retrieved with Web Scraping
Author: Brian Clark
Journal: Journal of Library Administration (JCR Q3, Information Science & Library Science)
Published: 2023-08-10
DOI: 10.1080/01930826.2023.2240190 (https://doi.org/10.1080/01930826.2023.2240190)
Citations: 0
Abstract
Many institutions face low deposit rates in their institutional repositories (IRs) despite investing substantial resources in implementing and supporting these systems. Deposit rates are higher in IRs that offer mediated deposits; however, mediated deposit can be a time- and labor-intensive process. This article describes a method for copying open access articles and their corresponding descriptive metadata from open repositories for archiving in an institutional repository, using Beautiful Soup and Selenium as web scraping tools. The method quickly added hundreds of articles to an IR without relying on faculty participation or consulting publisher policies, increasing repository downloads and usage.
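The abstract does not spell out the scraping steps, so the following is only a minimal sketch of the metadata-extraction idea, not the article's actual code. It swaps in Python's standard-library `html.parser` for Beautiful Soup to stay dependency-free, and it assumes the source repository's article pages expose descriptive metadata in `citation_*` meta tags (a convention many repositories follow for scholarly indexing). The class name, function name, and sample page are all hypothetical.

```python
from html.parser import HTMLParser


class CitationMetaParser(HTMLParser):
    """Collect <meta name="citation_*" content="..."> tags from a page."""

    def __init__(self):
        super().__init__()
        self.metadata = {}

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = dict(attrs)
        name = attr.get("name") or ""
        content = attr.get("content")
        if name.startswith("citation_") and content:
            # Repeatable fields such as citation_author accumulate in a list.
            self.metadata.setdefault(name, []).append(content)


def extract_citation_metadata(html):
    """Return a dict mapping each citation_* field to its list of values."""
    parser = CitationMetaParser()
    parser.feed(html)
    return parser.metadata


# Invented sample page; a real workflow would fetch the HTML first
# (for example with Selenium when the page is rendered by JavaScript).
SAMPLE_PAGE = """
<html><head>
<meta name="citation_title" content="An Example Open Access Article">
<meta name="citation_author" content="Doe, Jane">
<meta name="citation_author" content="Roe, Richard">
<meta name="citation_pdf_url" content="https://example.org/article.pdf">
<meta name="viewport" content="width=device-width">
</head><body></body></html>
"""

record = extract_citation_metadata(SAMPLE_PAGE)
```

Here `record` holds only the `citation_*` fields, each as a list of values, which could then be mapped onto the deposit fields of the target IR. With Beautiful Soup, the equivalent step would be a search over `meta` tags in the parsed document.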
About the journal:
The Journal of Library Administration is the primary source of information on all aspects of the effective management of libraries. Stressing the practical, this valuable journal provides information that administrators need to efficiently and effectively manage their libraries. The journal seeks out the most modern advances being made in professional management and applies them to the library setting.