Siamese Long Short-Term Memory for Detecting Conflict of Interest on Scientific Papers

Akhmad Bakhrul Ilmi, D. Purwitasari, C. Fatichah
DOI: 10.12962/j20882033.v30i2.5008
URL: https://doi.org/10.12962/j20882033.v30i2.5008
Journal: IPTEK: The Journal for Technology and Science
Publication date: 2019-07-26
Publication type: Journal Article
Citations: 1

Abstract

Scientific articles that are cited by other researchers increase their authors' credibility. However, the citation process may be misused to artificially raise a bibliometric indicator such as a researcher's h-index. Researchers may cite their own works excessively, a practice referred to as self-citation, even when the topics of the references are unrelated to the current article. A further form of misconduct is excessive citation of works by people connected to the researcher, which may or may not be coercive and is referred to as conflict of interest (CoI). The proposed method uses a deep learning approach, Siamese Long Short-Term Memory (LSTM), to recognize subject similarities between a scientific article and its references. Standard text similarity measures fail to do so because the contextual relatedness of sentences in the articles must be learned. Siamese-LSTM learns the contextual relatedness of sentences in the article using two identical LSTMs. The steps of the proposed method are (i) word embedding to obtain weight values for terms while still considering their semantic relations, (ii) k-means clustering to generate training data, reducing the time complexity of Siamese-LSTM learning on scientific articles, (iii) learning the Siamese-LSTM weights from the training data to identify the contextual relatedness of sentences, and (iv) calculating the similarity of a scientific article to its references based on Siamese-LSTM. Empirical experiments are used to analyze the similarity values and the possibility of conflict of interest in an article.

Keywords: Citation, Conflict of Interest, Scientific Text, Deep Learning, Similarity, Text Processing.
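The Siamese arrangement described in the abstract, in which two identical LSTMs encode an article and a reference before the two encodings are compared, can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration and is not taken from the paper: the `TinyLSTM` class, the random untrained weights, the toy three-dimensional embeddings, and the similarity function exp(-L1 distance), which is a common choice in Siamese-LSTM work but is not stated in the abstract. The word-embedding, k-means, and training steps (i) to (iii) are omitted.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


class TinyLSTM:
    """Single-layer LSTM encoder with random, untrained weights (illustrative only)."""

    def __init__(self, input_dim, hidden_dim, seed=0):
        rng = random.Random(seed)
        self.hidden_dim = hidden_dim
        # One weight row per gate unit, ordered: input, forget, output, candidate.
        # Each row covers [x_t, h_prev] plus a trailing bias term.
        row_len = input_dim + hidden_dim + 1
        self.weights = [[rng.uniform(-0.5, 0.5) for _ in range(row_len)]
                        for _ in range(4 * hidden_dim)]

    def encode(self, sequence):
        """Return the final hidden state for a list of word vectors."""
        n = self.hidden_dim
        h = [0.0] * n
        c = [0.0] * n
        for x in sequence:
            z = list(x) + h + [1.0]  # the constant 1.0 multiplies the bias
            pre = [sum(w * v for w, v in zip(row, z)) for row in self.weights]
            i = [sigmoid(p) for p in pre[0:n]]
            f = [sigmoid(p) for p in pre[n:2 * n]]
            o = [sigmoid(p) for p in pre[2 * n:3 * n]]
            g = [math.tanh(p) for p in pre[3 * n:4 * n]]
            c = [f[k] * c[k] + i[k] * g[k] for k in range(n)]
            h = [o[k] * math.tanh(c[k]) for k in range(n)]
        return h


def siamese_similarity(encoder, seq_a, seq_b):
    """Encode both inputs with the SAME encoder, then score them in (0, 1]."""
    h_a = encoder.encode(seq_a)
    h_b = encoder.encode(seq_b)
    l1 = sum(abs(a - b) for a, b in zip(h_a, h_b))
    return math.exp(-l1)  # assumed similarity function, not from the abstract


# Toy 3-dimensional "word embeddings" (made-up values, not from the paper).
article = [[0.2, 0.1, 0.7], [0.9, 0.3, 0.1]]
reference = [[0.1, 0.8, 0.2], [0.4, 0.4, 0.5], [0.6, 0.0, 0.3]]

encoder = TinyLSTM(input_dim=3, hidden_dim=4)
score_same = siamese_similarity(encoder, article, article)
score_diff = siamese_similarity(encoder, article, reference)
print(score_same, score_diff)
```

Because both inputs pass through one shared encoder, comparing a text with itself always yields the maximum score of 1.0, and inputs of different lengths remain comparable through their fixed-size final hidden states. With untrained weights the scores carry no semantic meaning; in the method the abstract describes, the weights would first be learned from the training data.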