A preliminary evaluation of HathiTrust metadata: Assessing the sufficiency of legacy records

Katrina Fenlon, Colleen Fallaw, Timothy W. Cole, Myung-Ja K. Han
Published in: Proceedings of the ... ACM/IEEE Joint Conference on Digital Libraries, pages 317-320
Publication date: 2014-09-08
DOI: https://doi.org/10.1109/JCDL.2014.6970186
Citation count: 5

Abstract

Print-based libraries use metadata (specifically MARC catalog records) both for bibliographic control and to support discovery through online public access catalogs. Depending on its accuracy, completeness, and detail, metadata can afford an aerial view of a collection's topical strengths, scope of coverage, and item-to-item relationships, but the view offered is in part a function of metadata design. Most MARC records were created to support management of large print collections and were optimized to meet the requirements of library online public access catalogs. How well do pre-existing MARC records serve the discovery needs of scholars using a large-scale digital library that hosts collections of retrospectively digitized books and serials? This paper reports on an ongoing assessment of the utility of the MARC-based metadata underlying the HathiTrust Digital Library and explores the implications for advanced computational access to texts in the HathiTrust. We consider here the utility of metadata to scholars creating worksets for analysis, examining three user scenarios gleaned from an ongoing user-requirements study conducted for the HathiTrust Research Center: (1) using metadata fields in combination for corpus characterization and discovery; (2) relying on metadata to identify resources of interest; and (3) using bibliographies of known items to seed research worksets. Our goal is to better understand the need for metadata remediation and augmentation and to assess the scope of additional work required.
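As an illustration of scenario (1), the following is a minimal sketch of how metadata fields might be used in combination to characterize a corpus and select a workset. It is not the authors' method. The records, the field names (language, pub_date, subjects), and both helper functions are hypothetical and do not reflect the actual HathiTrust or MARC schema; the sketch only shows why incomplete fields matter when several are combined.

```python
from collections import Counter

# Hypothetical, hand-made records. Field names are illustrative only and
# are NOT the HathiTrust or MARC schema.
RECORDS = [
    {"id": "rec1", "title": "Poems", "language": "eng", "pub_date": "1850", "subjects": ["Poetry"]},
    {"id": "rec2", "title": "Essays", "language": "eng", "pub_date": "", "subjects": []},
    {"id": "rec3", "title": "Gedichte", "language": "ger", "pub_date": "1861", "subjects": ["Poetry"]},
    {"id": "rec4", "title": "Ballads", "language": "eng", "pub_date": "1872", "subjects": ["Poetry", "Ballads"]},
]

FIELDS = ("title", "language", "pub_date", "subjects")


def field_completeness(records, fields=FIELDS):
    """Return the fraction of records with a non-empty value for each field."""
    if not records:
        return {}
    filled = Counter()
    for record in records:
        for field in fields:
            if record.get(field):  # empty strings and empty lists count as missing
                filled[field] += 1
    return {field: filled[field] / len(records) for field in fields}


def select_workset(records, language, subject, start_year, end_year):
    """Select ids of records matching language, subject, and a year range.

    Records with a missing or non-numeric date are excluded, which is how
    incomplete metadata silently shrinks a workset.
    """
    selected = []
    for record in records:
        date = record.get("pub_date", "")
        if not date.isdigit():
            continue
        if (record.get("language") == language
                and subject in record.get("subjects", [])
                and start_year <= int(date) <= end_year):
            selected.append(record["id"])
    return selected


if __name__ == "__main__":
    print(field_completeness(RECORDS))
    print(select_workset(RECORDS, "eng", "Poetry", 1800, 1899))
```

In this toy data, the record without a publication date or subject heading cannot be found by any query that combines those fields, even though it may be relevant. That is the kind of gap that metadata remediation and augmentation would aim to close.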