The impact of user corrections on a crawl-based digital library: A CiteSeerX perspective

Jian Wu, Kyle Williams, Madian Khabsa, C. Lee Giles
10th IEEE International Conference on Collaborative Computing: Networking, Applications and Worksharing
Published: 2014-11-11
DOI: https://doi.org/10.4108/icst.collaboratecom.2014.257563
Citations: 3

Abstract

CiteSeerX is a crawl-based digital library search engine providing free access to more than 4 million academic papers. Since metadata in the digital library is obtained through automatic extraction, it is inevitable that errors will occur. CiteSeerX offers a feature allowing registered users to correct paper metadata including titles, authors, abstracts, publication years, venues, etc. We claim that user corrections, as a form of crowd-collaboration, provide a useful and efficient way to improve metadata quality and the impact of the digital library. As evidence to support this claim, we investigate user corrections from the last 5 years and analyze: the nature of the corrections; the quality of the corrections; and the impact of the corrections on downloads.