Pub Date: 2024-08-01; DOI: 10.1007/s11192-024-05114-z
Biao Zhang, Yunwei Chen
Research on innovative content within academic articles plays a vital role in exploring the frontiers of scientific and technological innovation while facilitating the integration of scientific and technological evaluation into academic discourse. To gather the latest innovative concepts efficiently, it is essential to accurately recognize innovative sentences within academic articles. Although several supervised methods exist for classifying article sentences, such as citation function sentences, future work sentences, and formal citation sentences, most of them rely on manual annotation or rule-based matching to construct datasets and often neglect an in-depth exploration of how to improve model performance. To address these limitations, this study introduces a semi-automatic annotation method for innovative sentences (IS) that draws on expert comment information, and proposes a data augmentation method based on SAO reconstruction to enlarge the training dataset. We compared and analyzed the effectiveness of multiple algorithms for recognizing IS within academic articles. The study took the full text of academic articles as its research subject and used the semi-automatic method to annotate IS and create the training dataset. It then validated the semi-automatic annotation method through manual inspection and compared it with rule-based annotation methods. The impact of different augmentation ratios on model performance was also explored. The empirical results reveal the following: (1) The semi-automatic annotation method proposed in this study achieves an accuracy of 0.87239, ensuring the validity of the annotated data while reducing the manual annotation cost. (2) Data augmentation by SAO reconstruction significantly improved the accuracy of machine learning and deep learning algorithms in recognizing IS. (3) When the augmentation ratio in the training set was set to 50%, the trained GPT-2 model outperformed the other algorithms, achieving an ACC of 0.97883 on the test set and an F1 score of 0.95505 in practical application.
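The abstract reports accuracy (ACC) and F1 for the binary task of recognizing innovative sentences. As a reminder of how these two scores relate to each other, here is a minimal sketch that computes both from predicted and gold labels. The function name `accuracy_and_f1` and the toy labels are assumptions made for illustration; they are not the authors' code or data.

```python
def accuracy_and_f1(y_true, y_pred, positive=1):
    """Return (accuracy, F1) for binary labels; `positive` marks the IS class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    correct = sum(1 for t, p in pairs if t == p)

    acc = correct / len(pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    # F1 is the harmonic mean of precision and recall.
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return acc, f1


# Hypothetical toy labels (1 = innovative sentence, 0 = other); not the paper's data.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]
acc, f1 = accuracy_and_f1(y_true, y_pred)
print(f"ACC={acc:.3f}  F1={f1:.3f}")
```

Accuracy counts every correct decision, while F1 ignores true negatives, so the two can diverge sharply when innovative sentences are rare in the evaluated text.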
Title: Automated recognition of innovative sentences in academic articles: semi-automatic annotation for cost reduction and SAO reconstruction for enhanced data. Journal: Scientometrics.
Pub Date: 2024-08-01; DOI: 10.1007/s11192-024-05113-0
Jiaqi Wei, Ying Guo
Knowledge has become a crucial and foundational resource for the development of the digital economy. Using a fixed-effects panel model and panel data from 279 Chinese cities from 2014 to 2019, this study empirically investigates the differential impacts of two distinct knowledge recombination activities (recombinant reuse and recombinant creation) on the development of the digital economy at the city level. The moderating role of knowledge diversification in this relationship is also explored. Our findings reveal that recombinant reuse exerts a negative influence on urban digital economy development, whereas recombinant creation has a positive influence. Furthermore, the study observes that knowledge diversification positively moderates the relationship between each of the two types of knowledge recombination and urban digital economy development. This finding suggests that a higher degree of knowledge diversification may exacerbate the detrimental impact of recombinant reuse in cities where such activities are prevalent. Conversely, cities that prioritize recombinant creation may accrue additional benefits for digital economy growth by fostering a diverse knowledge base. The study emphasizes the significance of knowledge recombination types and knowledge structure features in digital economy development. It enriches theoretical work on the digital economy and offers policymakers in cities insights for formulating digital economy development strategies suited to local knowledge production mechanisms.
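A fixed-effects panel model with a moderator, as described above, is commonly written in the following form. This is a generic sketch and not the authors' specification: the symbols $DE$, $RR$, $RC$, $KD$, the control vector $X$, and the inclusion of year effects $\lambda_t$ are all assumptions introduced here for illustration.

```latex
% Generic two-way fixed-effects model with interaction (moderation) terms.
% i indexes cities (279 in the study), t indexes years (2014-2019).
\begin{equation}
DE_{it} = \beta_1\, RR_{it} + \beta_2\, RC_{it} + \beta_3\, KD_{it}
        + \beta_4\, (RR_{it} \times KD_{it})
        + \beta_5\, (RC_{it} \times KD_{it})
        + \gamma^{\top} X_{it} + \mu_i + \lambda_t + \varepsilon_{it}
\end{equation}
% DE: digital economy development      RR: recombinant reuse
% RC: recombinant creation             KD: knowledge diversification
% mu_i: city fixed effect              lambda_t: year effect (assumed)
```

Under this notation, the reported findings would correspond to $\beta_1 < 0$ and $\beta_2 > 0$, with each interaction coefficient sharing the sign of its main effect ($\beta_4 < 0$, $\beta_5 > 0$), so that diversification amplifies both the harm from reuse and the benefit from creation.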
Title: The effect of urban capacity in knowledge recombination on digital economy development. Journal: Scientometrics.
Pub Date: 2024-08-01; DOI: 10.1007/s11192-024-05124-x
Kyriakos Drivas
We examine the evolution of the order of authorship based on seniority during 1975–2021. Results show that for small teams (≤ 5 authors), the likelihood of placing the most junior author first has been increasing since the 1990s. The likelihood of placing the most senior author last has also been increasing. The results are at least partially driven by the digitization of bibliographic records, which drastically facilitated the assignment of citations to all authors. We interpret our findings as a growing trend of small author teams becoming fairer. We do not find any significant effects for larger teams, suggesting different practices as team size increases. Given that team size has been increasing slowly but steadily over the last decades, the debate over the ethical considerations around authorship practices should place significance on the number of co-authors.
Title: The evolution of order of authorship based on researchers’ age. Journal: Scientometrics.
Pub Date: 2024-08-01; DOI: 10.1007/s11192-024-05121-0
Luis Fernando Gómez, Andrés Felipe Montoya-Rendón, Juan Pablo Vélez-Uribe
The rise of globalization and the advent of the Internet gave birth to a new science model in which national systems compete for a place in a global communication network where their products can circulate and gain notoriety. Several studies have assessed national performance in such a network, particularly in terms of scientific research output and collaboration networks. However, academic journals in specific disciplines have not received the same attention. The purpose of this paper was to evaluate the evolution of journal prestige by country and region of origin in the field of environmental engineering in the SCImago Journal and Country Rank database during 1999–2022. Western countries and private publishers were found to still dominate the discipline in 2022. The United Kingdom, the United States, and the Netherlands housed 51.16% of journals in 2022. Corporate publishers headquartered in these countries also own most of the journals, particularly in the top tier. Elsevier, Springer, and Taylor & Francis had a total of 54 journals indexed in 2022, and 65.9% of journals ranked in the first quartile belonged to these groups. However, Poland, China, and Iran have become major players. By 2022, they had 12, 10, and 7 environmental engineering journals, respectively, indexed in SCImago Journal and Country Rank, and journals from China and Iran have been ranked as Q1.
Title: Distribution by country, region, and publisher in environmental engineering journals in SCImago Journal and Country Rank database (1999–2022). Journal: Scientometrics.
Pub Date: 2024-07-30; DOI: 10.1007/s11192-024-05109-w
Tayyaba Kanwal, Tehmina Amjad
With the tremendous growth in the volume of published scholarly work, it has become quite difficult for researchers to find documents relevant to their research topic. Many research paper recommendation approaches have been proposed and implemented, including collaborative filtering, content-based, metadata-based, link-based, and multi-level citation network approaches. In this research, a novel research paper recommendation system that integrates multiple features (RRMF) is proposed. RRMF constructs a multi-level citation network and an author collaboration network for feature integration. Structural and semantic relationships are identified from the citation network, whereas key authors are extracted from the collaboration network. For experimentation and analysis, the AMiner v12 DBLP-Citation Network is used, which covers 4,894,081 academic papers and 45,564,149 citation relationships. Information retrieval metrics, including Mean Average Precision, Mean Reciprocal Rank, and Normalized Discounted Cumulative Gain, are used to evaluate the performance of the proposed system. The results of RRMF are compared with the baseline Multilevel Simultaneous Citation Network (MSCN) and Google Scholar. In this comparison, RRMF showed 87% better recommendations than the traditional MSCN and Google Scholar.
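The three evaluation metrics named above can be sketched in a few lines. The code below is a minimal illustration with binary relevance, not the authors' evaluation script; the function names and the two toy queries (paper IDs `p1` to `p7`) are hypothetical.

```python
import math


def average_precision(ranked, relevant):
    """Mean of precision@k taken at each rank k where a relevant item appears."""
    hits, score = 0, 0.0
    for rank, doc in enumerate(ranked, start=1):
        if doc in relevant:
            hits += 1
            score += hits / rank
    return score / len(relevant) if relevant else 0.0


def reciprocal_rank(ranked, relevant):
    """1 / rank of the first relevant item, or 0 if none is retrieved."""
    for rank, doc in enumerate(ranked, start=1):
        if doc in relevant:
            return 1.0 / rank
    return 0.0


def ndcg(ranked, relevant):
    """Binary-relevance NDCG: DCG of the ranking divided by the ideal DCG."""
    dcg = sum(1.0 / math.log2(rank + 1)
              for rank, doc in enumerate(ranked, start=1) if doc in relevant)
    ideal_hits = min(len(relevant), len(ranked))
    idcg = sum(1.0 / math.log2(rank + 1) for rank in range(1, ideal_hits + 1))
    return dcg / idcg if idcg else 0.0


# Hypothetical toy queries: (recommended list, set of truly relevant papers).
queries = [
    (["p1", "p2", "p3", "p4"], {"p1", "p3"}),
    (["p5", "p6", "p7"], {"p6"}),
]
map_score = sum(average_precision(r, rel) for r, rel in queries) / len(queries)
mrr = sum(reciprocal_rank(r, rel) for r, rel in queries) / len(queries)
mean_ndcg = sum(ndcg(r, rel) for r, rel in queries) / len(queries)
print(f"MAP={map_score:.3f}  MRR={mrr:.3f}  NDCG={mean_ndcg:.3f}")
```

MRR rewards only the position of the first relevant paper, whereas MAP and NDCG account for every relevant paper in the list, which is why recommendation studies usually report all three together.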
Title: Research paper recommendation system based on multiple features from citation network. Journal: Scientometrics.
Pub Date: 2024-07-27; DOI: 10.1007/s11192-024-05046-8
Giulia Rossello, Arianna Martinelli
This paper investigates the growing evidence of research-related misconduct by developing and testing a theoretical framework. We study the deep causes of misconduct by asking whether the perception of an erosion of the core academic values, formally an ideology-based psychological contract breach, is associated with research-related misconduct. We test our framework by examining the use of Sci-Hub and provide empirical evidence that the loss of faith in scientific research sparks research-related misconduct against publishers. Based on a stratified sample of 2849 academics working in 30 institutions in 6 European countries, we find that ideology-based psychological contract breach explains Sci-Hub usage, even when controlling for other possible motivations. The magnitude of the effect depends on contextual and demographic characteristics. Female, foreign, and tenured scholars are less likely to download papers illegally when experiencing a contract breach of academic values. Our results suggest that policies restoring academic values might also address research-related misconduct.
Title: Breach of academic values and misconduct: the case of Sci-Hub. Journal: Scientometrics.
Pub Date: 2024-07-27; DOI: 10.1007/s11192-024-05115-y
Jens Peter Andersen, Serge P. J. M. Horbach, Tony Ross-Hellauer
This work studies “Contributed” articles in the Proceedings of the National Academy of Sciences of the United States of America (PNAS), a streamlined submission track for members of the US National Academy of Sciences (NAS). We assess the characteristics and impact of those articles, and the background and status of their authors, by comparing them to PNAS articles that followed the traditional editorial process. Analyzing over 46,000 articles published between 2007 and 2020, we find the following: (1) perhaps most centrally, Contributed articles generally appear in lower per-author citation deciles than Direct submissions, but are more likely to appear in the overall top citation deciles of authors; (2) PNAS-Contributed articles tend to spend less time in the review process than Direct submissions; (3) Direct submissions tend to be slightly more highly cited than Contributed articles, which are particularly overrepresented amongst the least-cited PNAS papers, and disciplinary differences were negligible; (4) authors with lower mean normalized citation scores profit most, in terms of citation impact, from articles published as Contributed papers; (5) NAS members tend to publish most of their Contributed articles in the first years after becoming an NAS member, with men publishing more of these articles than women; (6) Contributing authors occupy a unique niche in terms of authorship roles, mainly performing supervisory and conceptualisation tasks, without the administration and funding acquisition tasks usually associated with last authors.
Title: Through the secret gate: a study of member-contributed submissions in PNAS. Journal: Scientometrics.
Pub Date: 2024-07-27; DOI: 10.1007/s11192-024-05110-3
Ziyou Teng, Xuezhong Zhu
Tracing the utilization of science in technological innovations, especially the fraction that concerns public research, is of major importance in science policy. We explore the evolution of the global and domestic technological impact of Chinese scientific output through a detailed analysis of 6,901,428 utility patents granted at the USPTO from 1976 to 2020 and their 337,949 citations to Chinese scientific publications. The results show that Chinese scientific output plays an increasingly critical role in science-based innovations, while its contributions to domestic and foreign technology fluctuate over the period. The domestic use of Chinese research shrank in the late 1990s but has kept increasing since. The technological impact of Chinese scientific output varies across technology sectors. The recent growth in the share of Chinese-invented technology among the citing patents is dominated by Chinese patents in digital communication. The time lag of domestic citations is shorter than that of foreign citations, which is partially owing to the self-citations of Chinese inventors. However, the contributions of self-citations to short knowledge diffusion times are heterogeneous across technology fields. Universities are the largest producers of the cited science, followed by public research organizations. Companies account for a meager share of total citations, and their proportion has been shrinking since 2007. Specifically, private technology depends substantially on public research for scientific knowledge. A national bias is found in the scientific knowledge components of patents assigned to companies, which to a certain extent indicates the areas where academia and industry hold a close relationship in China and where Chinese companies are specialized. Taken together, these findings provide a dynamic country- and sector-dependent linkage of Chinese scientific output to domestic and global technology.
Title: Measuring the global and domestic technological impact of Chinese scientific output: a patent-to-paper citation analysis of science-technology linkage. Journal: Scientometrics.
Pub Date : 2024-07-26 DOI: 10.1007/s11192-024-05111-2
Wentao Cui, Meng Xiao, Ludi Wang, Xuezhi Wang, Yi Du, Yuanchun Zhou
Taxonomy alignment is essential for integrating knowledge across diverse domains and languages, facilitating information retrieval and data integration. Traditional methods heavily reliant on domain experts are time-consuming and resource-intensive. To address this challenge, this paper proposes an automated taxonomy alignment approach leveraging large language models (LLMs). We introduce a method that embeds taxonomy nodes into a continuous low-dimensional vector space, utilizing hierarchical relationships within category concepts to enhance alignment accuracy. Our approach capitalizes on the contextual understanding and semantic information capabilities of LLMs, offering a promising solution to the challenges of taxonomy alignment. We conducted experiments on two pairs of real-world taxonomies and demonstrated that our method is comparable in accuracy to manual alignment, while significantly reducing time, operational, and maintenance costs associated with taxonomy alignment. Our case study showcases the effectiveness of our approach by visualizing the taxonomy alignment results. This automated alignment framework addresses the increasing demand for accurate and efficient alignment processes across diverse knowledge domains.
Title: Automated taxonomy alignment via large language models: bridging the gap between knowledge domains. Scientometrics, published 2024-07-26.
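The embedding-based alignment described in the abstract above can be illustrated with a minimal sketch. It is not the authors' method: the three-dimensional vectors are hand-made stand-ins for LLM embeddings, the parent-blending scheme and its `weight` parameter are assumptions chosen only to show how hierarchy can inform matching, and all function and category names are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def with_parent_context(vectors, parents, weight=0.3):
    """Blend each node vector with its parent's vector.

    Assumption: a simple weighted average stands in for whatever
    hierarchy-aware embedding the paper actually uses.
    """
    blended = {}
    for node, vec in vectors.items():
        parent = parents.get(node)
        if parent is None:
            blended[node] = list(vec)  # root nodes keep their own vector
        else:
            pvec = vectors[parent]
            blended[node] = [(1 - weight) * a + weight * b for a, b in zip(vec, pvec)]
    return blended


def align(source, target):
    """Map each source node to the target node with the highest cosine similarity."""
    return {
        s: max(target, key=lambda t: cosine(svec, target[t]))
        for s, svec in source.items()
    }


# Toy taxonomies with made-up vectors (not real embeddings).
source_vecs = {
    "Science": [1.0, 0.0, 0.0],
    "Physics": [0.9, 0.4, 0.0],
    "Biology": [0.9, 0.0, 0.4],
}
source_parents = {"Physics": "Science", "Biology": "Science"}

target_vecs = {
    "Natural sciences": [1.0, 0.1, 0.1],
    "Physical sciences": [0.8, 0.5, 0.0],
    "Life sciences": [0.8, 0.0, 0.5],
}
target_parents = {
    "Physical sciences": "Natural sciences",
    "Life sciences": "Natural sciences",
}

alignment = align(
    with_parent_context(source_vecs, source_parents),
    with_parent_context(target_vecs, target_parents),
)
print(alignment)
```

Greedy nearest-neighbour matching is used here only for brevity; it allows several source nodes to map to the same target node, which a real alignment pipeline would likely need to handle.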
Pub Date : 2024-07-26 DOI: 10.1007/s11192-024-05117-w
John P. A. Ioannidis, Thomas A. Collins, Jeroen Baas
Extreme publishing behavior may reflect a combination of some authors with genuinely high publication output and of other people who have their names listed too frequently in publications because of consortium agreements, gift authorship or other spurious practices. We aimed to evaluate the evolution of extreme publishing behavior across countries and scientific fields during 2000–2022. Extreme publishing behavior was defined as having more than 60 full articles (original articles, reviews, conference papers) in a single calendar year and indexed in Scopus. We identified 3191 authors with extreme publishing behavior across science excluding Physics and 12624 such authors in Physics. While Physics had much higher numbers of extreme publishing authors in the past, in 2022 extreme publishing authors were almost as numerous in non-Physics as in Physics disciplines (1226 vs. 1480). Excluding Physics, China had the largest number of extreme publishing authors, followed by the USA. The largest fold-wise increases between 2016 and 2022 (5- to 19-fold) occurred in Thailand, Saudi Arabia, Spain, India, Italy, Russia, Pakistan, and South Korea. Excluding Physics, most extreme publishing authors were in Clinical Medicine, but from 2016 to 2022 the largest relative increases (more than sixfold) were seen in Agriculture, Fisheries & Forestry, Biology, and Mathematics and Statistics. Extreme publishing authors accounted for 4360 of the 10000 most-cited authors (based on raw citation count) across science. While most Physics authors with extreme publishing behavior had modest citation impact in a composite citation indicator that adjusts for co-authorship and author positions, 67% of authors with extreme publishing behavior in non-Physics fields remained within the top 2% according to that indicator among all authors with ≥ 5 full articles. 
Extreme publishing behavior has become worryingly common across scientific fields with rapidly increasing rates in some countries and settings and may herald a rapid depreciation of authorship standards.
Title: Evolving patterns of extreme publishing behavior across science. Scientometrics, published 2024-07-26.
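The threshold definition in the abstract above (more than 60 full articles in a single calendar year) can be sketched as a simple count per author and year. This is an illustrative sketch, not the authors' pipeline: the record layout, the document-type labels, and the function name are assumptions, and the sample records are synthetic.

```python
from collections import Counter

# Assumed labels for "full articles" (original articles, reviews, conference
# papers); a real Scopus export may name these document types differently.
FULL_ARTICLE_TYPES = {"article", "review", "conference paper"}
THRESHOLD = 60  # the abstract's cut-off: strictly more than 60 per calendar year


def extreme_author_years(records, threshold=THRESHOLD):
    """Return {(author, year): count} for author-years above the threshold.

    `records` is an iterable of (author, year, doc_type) tuples
    (a hypothetical layout chosen for this sketch).
    """
    counts = Counter(
        (author, year)
        for author, year, doc_type in records
        if doc_type.lower() in FULL_ARTICLE_TYPES
    )
    return {key: n for key, n in counts.items() if n > threshold}


# Synthetic records: A has 61 full articles, B has exactly 60,
# C has 50 full articles plus 20 letters that do not count.
records = (
    [("Author A", 2022, "Article")] * 61
    + [("Author B", 2022, "Review")] * 60
    + [("Author C", 2022, "Article")] * 50
    + [("Author C", 2022, "Letter")] * 20
)
flagged = extreme_author_years(records)
print(flagged)

# Share of the 10000 most-cited authors who are extreme publishing authors,
# using the two figures reported in the abstract.
share_of_top_cited = 4360 / 10000
print(f"{share_of_top_cited:.1%}")
```

Because the comparison is strict, an author with exactly 60 full articles in a year is not flagged, and document types outside the assumed full-article set do not count toward the total.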