国家科学界合作作者数据的质量问题

IF 1.4 Q2 SOCIAL SCIENCES, INTERDISCIPLINARY Network Science Pub Date : 2023-01-20 DOI:10.1017/nws.2022.40
Domenico De Stefano, V. Fuccella, M. P. Vitale, S. Zaccarin
{"title":"国家科学界合作作者数据的质量问题","authors":"Domenico De Stefano, V. Fuccella, M. P. Vitale, S. Zaccarin","doi":"10.1017/nws.2022.40","DOIUrl":null,"url":null,"abstract":"Abstract A stream of research on co-authorship, used as a proxy of scholars’ collaborative behavior, focuses on members of a given scientific community defined at discipline and/or national basis for which co-authorship data have to be retrieved. Recent literature pointed out that international digital libraries provide partial coverage of the entire scholar scientific production as well as under-coverage of the scholars in the community. Bias in retrieving co-authorship data of the community of interest can affect network construction and network measures in several ways, providing a partial picture of the real collaboration in writing papers among scholars. In this contribution, we collected bibliographic records of Italian academic statisticians from an online platform (IRIS) available at most universities. Even if it guarantees a high coverage rate of our population and its scientific production, it is necessary to deal with some data quality issues. Thus, a web scraping procedure based on a semi-automatic tool to retrieve publication metadata, as well as data management tools to detect duplicate records and to reconcile authors, is proposed. As a result of our procedure, it emerged that collaboration is an active and increasing practice for Italian academic statisticians with some differences according to the gender, the academic ranking, and the university location of scholars. The heuristic procedure to accomplish data quality issues in the IRIS platform can represent a working case report to adapt to other bibliographic archives with similar characteristics.","PeriodicalId":51827,"journal":{"name":"Network Science","volume":null,"pages":null},"PeriodicalIF":1.4000,"publicationDate":"2023-01-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"1","resultStr":"{\"title\":\"Quality issues in co-authorship data of a national scientific community\",\"authors\":\"Domenico De Stefano, V. Fuccella, M. P. Vitale, S. Zaccarin\",\"doi\":\"10.1017/nws.2022.40\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Abstract A stream of research on co-authorship, used as a proxy of scholars’ collaborative behavior, focuses on members of a given scientific community defined at discipline and/or national basis for which co-authorship data have to be retrieved. Recent literature pointed out that international digital libraries provide partial coverage of the entire scholar scientific production as well as under-coverage of the scholars in the community. Bias in retrieving co-authorship data of the community of interest can affect network construction and network measures in several ways, providing a partial picture of the real collaboration in writing papers among scholars. In this contribution, we collected bibliographic records of Italian academic statisticians from an online platform (IRIS) available at most universities. Even if it guarantees a high coverage rate of our population and its scientific production, it is necessary to deal with some data quality issues. Thus, a web scraping procedure based on a semi-automatic tool to retrieve publication metadata, as well as data management tools to detect duplicate records and to reconcile authors, is proposed. As a result of our procedure, it emerged that collaboration is an active and increasing practice for Italian academic statisticians with some differences according to the gender, the academic ranking, and the university location of scholars. The heuristic procedure to accomplish data quality issues in the IRIS platform can represent a working case report to adapt to other bibliographic archives with similar characteristics.\",\"PeriodicalId\":51827,\"journal\":{\"name\":\"Network Science\",\"volume\":null,\"pages\":null},\"PeriodicalIF\":1.4000,\"publicationDate\":\"2023-01-20\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"1\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Network Science\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1017/nws.2022.40\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q2\",\"JCRName\":\"SOCIAL SCIENCES, INTERDISCIPLINARY\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Network Science","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1017/nws.2022.40","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q2","JCRName":"SOCIAL SCIENCES, INTERDISCIPLINARY","Score":null,"Total":0}
引用次数: 1

摘要

作为学者合作行为的代理,关于共同作者的研究流侧重于特定科学共同体的成员,这些成员以学科和/或国家为基础,必须检索共同作者数据。最近的文献指出,国际数字图书馆提供了部分覆盖整个学者的科学成果,以及对社区学者的覆盖不足。检索共同作者数据的偏见会从几个方面影响网络建设和网络措施,从而提供了学者之间真正合作撰写论文的部分情况。在这篇文章中,我们从大多数大学可用的在线平台(IRIS)收集了意大利学术统计学家的书目记录。即使保证了我国人口及其科学生产的高覆盖率,也需要处理一些数据质量问题。因此,提出了一种基于半自动工具检索出版物元数据的web抓取程序,以及基于数据管理工具检测重复记录和协调作者的web抓取程序。根据我们的程序,意大利学术统计学家的合作是一种积极的、日益增加的做法,根据性别、学术排名和学者所在大学的位置,合作存在一些差异。在IRIS平台中完成数据质量问题的启发式过程可以代表一个工作案例报告,以适应其他具有相似特征的书目档案。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
查看原文
分享 分享
微信好友 朋友圈 QQ好友 复制链接
本刊更多论文
Quality issues in co-authorship data of a national scientific community
Abstract A stream of research on co-authorship, used as a proxy of scholars’ collaborative behavior, focuses on members of a given scientific community defined at discipline and/or national basis for which co-authorship data have to be retrieved. Recent literature pointed out that international digital libraries provide partial coverage of the entire scholar scientific production as well as under-coverage of the scholars in the community. Bias in retrieving co-authorship data of the community of interest can affect network construction and network measures in several ways, providing a partial picture of the real collaboration in writing papers among scholars. In this contribution, we collected bibliographic records of Italian academic statisticians from an online platform (IRIS) available at most universities. Even if it guarantees a high coverage rate of our population and its scientific production, it is necessary to deal with some data quality issues. Thus, a web scraping procedure based on a semi-automatic tool to retrieve publication metadata, as well as data management tools to detect duplicate records and to reconcile authors, is proposed. As a result of our procedure, it emerged that collaboration is an active and increasing practice for Italian academic statisticians with some differences according to the gender, the academic ranking, and the university location of scholars. The heuristic procedure to accomplish data quality issues in the IRIS platform can represent a working case report to adapt to other bibliographic archives with similar characteristics.
求助全文
通过发布文献求助,成功后即可免费获取论文全文。 去求助
来源期刊
Network Science
Network Science SOCIAL SCIENCES, INTERDISCIPLINARY-
CiteScore
3.50
自引率
5.90%
发文量
24
期刊介绍: Network Science is an important journal for an important discipline - one using the network paradigm, focusing on actors and relational linkages, to inform research, methodology, and applications from many fields across the natural, social, engineering and informational sciences. Given growing understanding of the interconnectedness and globalization of the world, network methods are an increasingly recognized way to research aspects of modern society along with the individuals, organizations, and other actors within it. The discipline is ready for a comprehensive journal, open to papers from all relevant areas. Network Science is a defining work, shaping this discipline. The journal welcomes contributions from researchers in all areas working on network theory, methods, and data.
期刊最新文献
Guiding prevention initiatives by applying network analysis to systems maps of adverse childhood experiences and adolescent suicide The latent cognitive structures of social networks Algorithmic aspects of temporal betweenness When can networks be inferred from observed groups? Generating preferential attachment graphs via a Pólya urn with expanding colors
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
现在去查看 取消
×
提示
确定
0
微信
客服QQ
Book学术公众号 扫码关注我们
反馈
×
意见反馈
请填写您的意见或建议
请填写您的手机或邮箱
已复制链接
已复制链接
快去分享给好友吧!
我知道了
×
扫码分享
扫码分享
Book学术官方微信
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术
文献互助 智能选刊 最新文献 互助须知 联系我们:info@booksci.cn
Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。
Copyright © 2023 Book学术 All rights reserved.
ghs 京公网安备 11010802042870号 京ICP备2023020795号-1