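Development of a Modified Winnowing Method for Aggregating Bibliographic Information Data from Citation Systems under the Conditions of Incomplete Information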

I. Bolodurina, Yu. P. Ivanova, L. M. Antsiferova, V. D. Blinov
Journal: Bulletin of the South Ural State University. Ser. Computer Technologies, Automatic Control & Radioelectronics
Published: 2020-12-01 (journal article)
DOI: 10.14529/ctcr200413 (https://doi.org/10.14529/ctcr200413)

Abstract

The transition to electronic presentation of bibliographic information about scientific works has increased interest in scientometric research. At the same time, existing scientometric methods are criticized by scientists, because an incomplete bibliographic base and the tools used to assess it do not allow the contribution of a scientific work to be assessed accurately. Scientometric assessments are, as a rule, based on the data of a single citation system, which does not include complete information about all of the authors' publications that are indexed in other citation systems. Aim. This study aims to develop an adaptive approach for forming aggregated bibliographic data for a scientific organization under conditions of incomplete information from the citation systems RSCI (Russian Science Citation Index), Google Scholar, and Scopus. Methods. The aggregated list of publications used for the analysis of scientometric indicators was built with the Winnowing method, the Levenshtein algorithm, the shingling method, and the Jaro–Winkler method. In the experimental study, the effectiveness of these methods for aggregating information from citation systems was assessed in terms of precision, recall, and F-measure. Results. Experiments on test data drawn from the lists of publications by authors from Orenburg State University in RSCI, Google Scholar, and Scopus showed that the Winnowing method produced the most accurate lists of publications by the F-measure criterion. To improve the performance of this algorithm, a two-stage optimization of the aggregation process was carried out, which reduced the running time of the algorithm when generating a list of bibliographic descriptions. Conclusion. The proposed approach for forming aggregated bibliographic data for a scientific organization under conditions of incomplete information from the Russian Science Citation Index, Google Scholar, and Scopus increases productivity in forming a list of authors' publications and shows good efficiency in determining the scientometric characteristics of authors.
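
The abstract names the Winnowing method and the precision, recall, and F-measure criteria but gives no implementation details. The following is only a minimal sketch of how textbook winnowing fingerprints could be used to merge near-duplicate bibliographic records and how such a merge could be scored. It is not the authors' modified method and does not reproduce their two-stage optimization. The k-gram length, window size, CRC32 hash, Jaccard comparison, the 0.6 threshold, and all function names are assumptions chosen for illustration.

```python
from __future__ import annotations

import re
import zlib


def normalize(text: str) -> str:
    """Lowercase and drop everything except letters and digits (Unicode-aware)."""
    return re.sub(r"[\W_]+", "", text.lower())


def winnow(text: str, k: int = 5, window: int = 4) -> set[int]:
    """Return the winnowing fingerprints of `text` as a set of hash values.

    Every k-gram of the normalized text is hashed, and the minimum hash of
    each sliding window of `window` consecutive hashes is kept. Positions
    are discarded because only set overlap is needed here.
    """
    s = normalize(text)
    if not s:
        return set()
    if len(s) < k:
        # Too short for a single k-gram: hash the whole string instead.
        return {zlib.crc32(s.encode("utf-8"))}
    hashes = [zlib.crc32(s[i:i + k].encode("utf-8")) for i in range(len(s) - k + 1)]
    if len(hashes) <= window:
        return {min(hashes)}
    return {min(hashes[i:i + window]) for i in range(len(hashes) - window + 1)}


def similarity(a: str, b: str) -> float:
    """Jaccard similarity of the two fingerprint sets, in [0, 1]."""
    fa, fb = winnow(a), winnow(b)
    if not fa or not fb:
        return 0.0
    return len(fa & fb) / len(fa | fb)


def aggregate(records: list[str], threshold: float = 0.6) -> list[str]:
    """Greedily keep a record only if it is not similar to one already kept."""
    merged: list[str] = []
    for rec in records:
        if all(similarity(rec, kept) < threshold for kept in merged):
            merged.append(rec)
    return merged


def f_measure(predicted: set, relevant: set) -> tuple[float, float, float]:
    """Return (precision, recall, F1) of `predicted` against `relevant`."""
    tp = len(predicted & relevant)
    precision = tp / len(predicted) if predicted else 0.0
    recall = tp / len(relevant) if relevant else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


if __name__ == "__main__":
    # Hypothetical titles standing in for records from different citation systems.
    titles = [
        "Deep Learning: A Survey",
        "deep learning - a survey",
        "Graph Theory Basics",
    ]
    print(aggregate(titles))
```

The greedy merge compares each record against every record already kept, so its cost grows quadratically with the number of records. The abstract reports that the authors reduced running time with a two-stage optimization, which this sketch does not attempt.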