Exploring the Proportion of Content Represented by the Metadata of Research Articles

Shahzad Nazir, M. Asif, Shahbaz Ahmad
2020 3rd International Conference on Advancements in Computational Sciences (ICACS), published 2020-02-01
DOI: 10.1109/ICACS47775.2020.9055955 (https://doi.org/10.1109/ICACS47775.2020.9055955)
Citations: 3

Abstract

Finding relevant research articles is an important task for tracking the state of the art, and systems that support it are termed research paper recommender systems. Given the massive growth of research corpora, the research community has turned its focus towards finding the most relevant research papers. Researchers have adopted techniques based on bibliographic information, content, and collaborative filtering. The content-based approach is the most common: according to a survey, 55% of research paper recommender systems use it. However, due to the unavailability of the full text of research papers, researchers have started using metadata instead. It remains unclear what proportion of the full content the metadata can represent. This research explores how much of the full content is captured by the metadata of research articles. We applied two techniques. In the first, we applied TF-IDF to the metadata and to the full content and took the intersection of the resulting key terms. In the second, we calculated similarity scores between the metadata and the full content using cosine similarity. The approach was assessed on a dataset of 271 research articles that were automatically downloaded from CiteseerX. The results revealed that the metadata of research articles could effectively represent 47% of the full content.
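The two measurements described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not specify the tokenizer, the IDF formula, or how many key terms were compared, so the regex tokenizer, the smoothed IDF, the top-k cutoff, and all function names below are assumptions made for illustration only.

```python
import math
import re
from collections import Counter


def tokenize(text):
    # Assumption: lowercase alphabetic tokens; the paper's preprocessing is not stated.
    return re.findall(r"[a-z]+", text.lower())


def tfidf_vectors(docs):
    """Return one {term: weight} dict per document (raw TF times a smoothed IDF)."""
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(tokenized)
    # Document frequency: number of documents in which each term occurs.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    # Assumption: smoothed IDF, so terms present in every document keep a nonzero weight.
    idf = {t: math.log((1 + n_docs) / (1 + c)) + 1 for t, c in df.items()}
    return [{t: tf * idf[t] for t, tf in Counter(tokens).items()} for tokens in tokenized]


def top_terms(vec, k):
    # Highest-weighted k terms; ties are broken alphabetically for determinism.
    ranked = sorted(vec.items(), key=lambda kv: (-kv[1], kv[0]))
    return {term for term, _ in ranked[:k]}


def key_term_overlap(meta_vec, full_vec, k=10):
    """Fraction of the full text's top-k key terms that are also top-k metadata terms."""
    full_top = top_terms(full_vec, k)
    meta_top = top_terms(meta_vec, k)
    return len(full_top & meta_top) / len(full_top) if full_top else 0.0


def cosine(a, b):
    """Cosine similarity between two sparse {term: weight} vectors."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


# Toy strings standing in for one article's metadata and full text (hypothetical data).
metadata = "Exploring metadata of research articles with TF-IDF and cosine similarity"
full_text = (
    "We apply TF-IDF to metadata and to the full content of research articles, "
    "intersect the key terms, and compute cosine similarity between the two."
)
meta_vec, full_vec = tfidf_vectors([metadata, full_text])
print("key-term overlap:", key_term_overlap(meta_vec, full_vec, k=5))
print("cosine similarity:", cosine(meta_vec, full_vec))
```

In a real experiment the IDF statistics would be computed over the whole corpus rather than a single pair of strings, and the per-article scores would then be aggregated across all articles.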