Pub Date: 2025-01-21; DOI: 10.1016/j.joi.2025.101641
Zuguang Gu
We introduced a new metric, “citation enrichment”, to measure country-to-country influence using citation data. This metric evaluates the degree to which a country prefers to cite another country compared to a random citation process. We applied the citation enrichment method to over 12 million publications in the life science and biomedical fields and we have the following key findings: 1) The global scientific landscape is divided into two separated worlds where developed Western countries exhibit an overall mutual under-influence with the rest of the world; 2) Within each world, countries form clusters based on their mutual citation preferences, with these groupings strongly associated with their geographical and cultural proximity; 3) The two worlds exhibit distinct patterns of the influence balance among countries, revealing underlying mechanisms that drive influence dynamics. We have constructed a comprehensive world map of scientific influence which greatly enhances the deep understanding of the international exchange of scientific knowledge. The citation enrichment metric is developed under a well-defined statistical framework and has the potential to be extended into a versatile and powerful tool for bibliometrics and related research fields.
Title: Two separated worlds: On the preference of influence in life science and biomedical research. Journal of Informetrics, vol. 19, issue 2, Article 101641.
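The comparison the abstract describes, how often one country cites another versus what a random citation process would give, can be illustrated with a simple observed-over-expected ratio. The sketch below is not the paper's statistical framework, which the abstract does not spell out. It assumes the random baseline is proportional to each country's total outgoing and incoming citations, and the function name `citation_enrichment` and the toy counts are hypothetical.

```python
import math


def citation_enrichment(counts):
    """Log2 ratio of observed to expected citations for each country pair.

    counts[a][b] is the number of citations from country a to country b.
    ASSUMPTION: the expected count under random citing is
    (citations made by a) * (citations received by b) / (all citations).
    """
    countries = sorted(set(counts) | {b for row in counts.values() for b in row})
    made = {a: sum(counts.get(a, {}).values()) for a in countries}
    received = {
        b: sum(counts.get(a, {}).get(b, 0) for a in countries) for b in countries
    }
    total = sum(made.values())

    scores = {}
    for a in countries:
        for b in countries:
            observed = counts.get(a, {}).get(b, 0)
            expected = made[a] * received[b] / total
            # Skip pairs where the log ratio is undefined.
            if observed > 0 and expected > 0:
                scores[(a, b)] = math.log2(observed / expected)
    return scores


# Made-up counts for two countries that mostly cite themselves.
toy_counts = {"A": {"A": 8, "B": 2}, "B": {"A": 2, "B": 8}}
scores = citation_enrichment(toy_counts)
```

With these toy counts every pair has an expected value of 5 citations, so the cross-country pairs score below zero (under-citation) and the within-country pairs score above zero.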
Pub Date: 2024-12-03; DOI: 10.1016/j.joi.2024.101628
Chengjun Zhang, ZhengJu Ren, Gaofeng Xiang, Wenbin Yu, Zeyu Xu, Jin Liu, Yadang Chen
The increasing number of academic practitioners has resulted in a significantly increased volume of scientific papers, attracting considerable interest among researchers examining this correlation. However, little research has been devoted to the phenomenon of scientists monopolizing authorship in academic journals. This study thus introduces the term Publication Monopoly (PM) to describe this effect. The study refers to the prolific authors as Monopoly Authors. In addition, it proposes a Monopoly Index to assess PM severity. For each journal, the Monopoly Contribution (MC) quantifies the impact of Monopoly Authors. Using the Open Academic Graph dataset, our analysis explores the prevalence of PM and the corresponding MC in selected journals and academic fields. The findings demonstrate a positive relationship between the number of articles published and the likelihood of PM occurrence in most journals. Furthermore, fields relying heavily on laboratory environments or specialized equipment are particularly susceptible to PM. Additionally, once a journal becomes entrenched in PM, it is challenging to alleviate this phenomenon over time. Our study of PM aimed to prompt academic practitioners to carefully consider the likelihood of acceptance in journals characterized by high PM levels. Moreover, the study encourages journals to reconsider their need to accept more articles from Monopoly Authors.
Title: A comprehensive comparative analysis of publication monopoly phenomenon in scientific journals. Journal of Informetrics, vol. 19, issue 1, Article 101628.
Pub Date: 2024-11-28; DOI: 10.1016/j.joi.2024.101616
Munan Li, Liang Wang
With the trends of technology convergence and technology interdisciplinarity, technology-field (TF) resolution and classification of patents have become increasingly difficult. For both patent applicants and patent examiners, labeling the TF of a given patent more precisely is important for technological searches. However, determining the TF of a patent may be difficult and may even involve strategic patenting behavior, which can cause noise in patent classification systems (PCSs). In addition, some patents could contain more TFs than claimed or be assigned questionable IPC codes; consequently, a regular search for technology or patents could miss information. Considering the advantages of deep learning over traditional machine learning algorithms in areas such as natural language processing (NLP), text classification and text sentiment analysis, this paper investigates several popular deep learning models and proposes a large-scale multilabel regression (MLR) model to handle specific patent analyses under small-sample learning conditions. To verify the proposed MLR model for patent classification, a case study on smart cities and the industrial Internet of Things (IIoT) is conducted. The MLR experiments on the TF resolution of smart cities and IIoT yielded moderate results compared with those of the latest patent classification studies, which also rely on deep learning models and large language models (LLMs), including RCNN, Bi-LSTM, BERT and GPT-4. Therefore, the proposed MLR model with a customized loss function could be moderately effective for patent classification within a specific technology theme, could have implications for patent classification and the TF resolution of patents, and could further enrich methodologies for patent mining and informetrics based on artificial intelligence (AI).
Title: Leveraging patent classification based on deep learning: The case study on smart cities and industrial Internet of Things. Journal of Informetrics, vol. 19, issue 1, Article 101616.
Pub Date: 2024-11-26; DOI: 10.1016/j.joi.2024.101617
Shuo Xu, Zhen Liu, Xin An, Hong Wang, Hongshen Pang
Compared to the science-technology linkages, the linkages among science, technology, and industry are largely under-studied. Therefore, this paper proposes a framework based on main path analysis to discover the science-technology-industry linkages, in which scientific publications, patents, and products are viewed as respective proxies of scientific research, technological advance, and industrial development. To validate the feasibility and effectiveness of our framework, the DrugBank dataset for the pharmaceutical industry was downloaded in XML form on 1 November 2019; the dataset was then further enriched, drug entity mentions were recognized in scholarly articles and patents, and several citation cycles were eliminated. The scientific publications span from 1871 to 2019, and the patents from 1953 to 2019. The largest weakly connected component of the constructed heterogeneous citation network contains 8,421 article nodes, 5,590 patent nodes, and 2,136 drug nodes, linked by 41,200 citations. From the empirical analysis of this component, the main conclusions can be drawn as follows. (1) The discovered developmental trajectories indeed encode the interactions among science, technology, and industry. Science and technology not only interact with each other, but also jointly promote the development of the industry, and the industry, in turn, influences the advancement of science and technology. (2) The developmental modes in the pharmaceutical industry can be grouped into three categories: pushed by only science, pushed by only technology, and pushed by science and technology simultaneously. (3) The drugs bridge scientific research and technological advance, and thereby help enhance knowledge exchanges between science and technology and shorten the cycle of drug development. This study contributes to discovering the linkages among science, technology, and industry from the perspective of mutual citations among scholarly articles, patents, and products.
However, the framework still needs to be verified in industries other than the pharmaceutical industry.
Title: Linkages among science, technology, and industry on the basis of main path analysis. Journal of Informetrics, vol. 19, issue 1, Article 101617.
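Main path analysis, the technique this abstract builds on, can be sketched on a toy network. The version below uses search path count (SPC) edge weights and a greedy forward search, which is one common variant; the abstract does not say which weighting or search strategy the authors use, so both are assumptions here. The node names and edges are made up, and edges are assumed to point from the cited item to the citing item. The network must be acyclic, which is why citation cycles have to be removed first.

```python
from collections import defaultdict


def search_path_counts(edges):
    """Return SPC weights, the successor map, and the source nodes of a DAG."""
    succ, pred = defaultdict(list), defaultdict(list)
    for src, dst in edges:
        succ[src].append(dst)
        pred[dst].append(src)
    nodes = sorted(set(succ) | set(pred))

    # Topological order (Kahn's algorithm); fails if a cycle remains.
    indegree = {n: len(pred[n]) for n in nodes}
    ready = [n for n in nodes if indegree[n] == 0]
    sources = list(ready)
    order = []
    while ready:
        node = ready.pop()
        order.append(node)
        for nxt in succ[node]:
            indegree[nxt] -= 1
            if indegree[nxt] == 0:
                ready.append(nxt)
    if len(order) != len(nodes):
        raise ValueError("network contains a cycle")

    # Number of paths reaching each node from any source, and from each
    # node to any sink.
    from_source, to_sink = {}, {}
    for n in order:
        from_source[n] = sum(from_source[p] for p in pred[n]) if pred[n] else 1
    for n in reversed(order):
        to_sink[n] = sum(to_sink[s] for s in succ[n]) if succ[n] else 1

    # SPC of an edge = number of source-to-sink paths passing through it.
    spc = {(s, d): from_source[s] * to_sink[d] for s, d in edges}
    return spc, succ, sources


def main_path(edges):
    """Greedy forward search: always follow the outgoing edge with highest SPC."""
    spc, succ, sources = search_path_counts(edges)
    current = max(sources, key=lambda s: max(spc[(s, d)] for d in succ[s]))
    path = [current]
    while succ[current]:
        current = max(succ[current], key=lambda d: spc[(path[-1], d)])
        path.append(current)
    return path, spc


# Hypothetical heterogeneous network of articles, patents, and drugs.
toy_edges = [
    ("article1", "patent1"),
    ("article2", "patent1"),
    ("patent1", "patent2"),
    ("patent2", "drug1"),
    ("patent2", "drug2"),
    ("article3", "drug1"),
]
path, spc = main_path(toy_edges)
```

In this toy network the edge from `patent1` to `patent2` lies on four source-to-sink paths, more than any other edge, so the greedy search routes the main path through it. Ties between equally weighted edges are broken by input order, which a real analysis would need to handle more carefully.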
Pub Date: 2024-11-26; DOI: 10.1016/j.joi.2024.101618
Lorena Delgado-Quirós, José Luis Ortega
The aim of this study is to examine disparities in citation counts amongst scholarly databases and the reasons that contribute to these differences. A random Crossref sample of >115k DOIs was selected and subsequently searched across six databases (Dimensions, Google Scholar, Microsoft Academic, Scilit, Semantic Scholar and The Lens). In July 2021, citation counts and lists of references were extracted from each database for comparative processing and analysis. The findings indicate that publications in Crossref-based databases (Crossref, Dimensions, Scilit and The Lens) have similar citation counts, while documents in search engines (Google Scholar, Microsoft Academic and Semantic Scholar) have a higher number of citations due to a greater coverage of publications, but also to the integration of web copies. Analysis of references has revealed that Scilit only extracts references with Digital Object Identifiers (DOI) and that Semantic Scholar causes significant problems when it adds references from external web versions. Ultimately, the study has shown that all the databases struggle to index references from books and book chapters, which may be attributable to certain academic publishers. The study concludes with a discussion of the potential effects on research evaluation that may arise from this lack of citations.
Title: Citation counts and inclusion of references in seven free-access scholarly databases: A comparative analysis. Journal of Informetrics, vol. 19, issue 1, Article 101618.
Pub Date: 2024-11-21; DOI: 10.1016/j.joi.2024.101615
Yunhan Yang, Chenwei Zhang, Huimin Xu, Yi Bu, Meijun Liu, Ying Ding
The dropout of scholars poses risks by depleting valuable resources and hindering the scientific community. Existing knowledge on this issue lacks consistency across career statuses and overlooks its dynamic nature. To address this gap, we analyzed the career trajectories of over 24 million scholars in 19 fields from the MAG dataset, examining dropout rates by field, career status, and generation. First, we observed an unexpectedly high proportion of transients, who make up a growing proportion of newcomers and account for over 50% of publications in most soft sciences. This highlights the shortage of continuants, such as scholars with full careers, who contribute to scientific communities. Second, our exploration of gender-specific dropout rates revealed that women exhibit significantly higher dropout rates within the first 20 years, a period covering the junior, early-career, and mid-career dropout statuses. Notably, early- and mid-career dropouts demonstrate the lowest and most stable dropout rates. These insights prompted the development of a gendered scientific career model that combines changes in scholar numbers and dropout rates across career statuses. Lastly, our generational analysis spanning four generations unveiled a diminishing gender gap in dropout rates. In hard sciences, women encounter initial career challenges, with the gender gap in dropout rates decreasing over time. In contrast, the gender gap in soft sciences persists longer. These findings hold across six subfields, offering implications for field evaluation, gender disparity policies, and a deeper understanding of scholarly dropout across generations.
Title: Gender differences in dropout rate: From field, career status, and generation perspectives. Journal of Informetrics, vol. 19, issue 1, Article 101615.
Pub Date: 2024-11-21; DOI: 10.1016/j.joi.2024.101609
Linlin Ren, Lei Guo, Hui Yu, Feng Guo, Xinhua Wang, Xiaohui Han
Most previous collaboration studies concentrate on examining cooperation models, often overlooking the pivotal role played by a Top Scientist (TS) in scientific advancements. To the best of our knowledge, only one relevant work delves into the correlation between innovation and collaboration with TSs, and no research has explored this relationship from a causal perspective. More precisely, previous studies suffer from several limitations in their examination of this topic: 1) Existing studies on Papers' Novelty (PN) primarily focus on calculation methods, with limited exploration of its relationship with scientific cooperation. 2) Research that has explored the link between collaboration with TSs and output innovation often adopts a correlational perspective, lacking a causal analysis that could correct for potential confounding factors. 3) Previous methodologies overlook the attributes of citation networks as potential confounding factors, a crucial consideration in identifying identical papers in causal analyses. 4) The impact of the disciplinary diversity of papers on the innovation output when collaborating with TSs is often overlooked in prior research. To address these limitations, we conduct a causal analysis of publications in three subfields of computer science from the Web of Science (WoS) database to demonstrate the impact of collaborating with TSs on PN. Specifically, to tackle Limitations 1) and 2), we employ PN as a metric to assess the quality of academic output and explore its causal relationship with collaborating with TSs using the Propensity Score Matching (PSM) method. To address Limitation 3), we comprehensively consider potential confounding factors influencing PSM matching by further incorporating the attributes of citation networks, thereby minimizing selection bias.
To deal with Limitation 4), we not only focus on the overall treatment effect but also delve into the treatment effect of intra-disciplinary and interdisciplinary collaboration modes. The research findings indicate that the papers collaborating with TSs exhibit lower PN compared to those without the participation of TSs. This suggests that collaboration with TSs may come at the cost of reduced novelty. This discovery prompts profound reflections on scientific collaboration, emphasizing the challenges and trade-offs that may exist in collaboration.
Title: Collaborating with top scientists may not improve paper novelty: A causal analysis based on the propensity score matching method. Journal of Informetrics, vol. 19, issue 1, Article 101609.
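The matching step of Propensity Score Matching, which the abstract relies on, can be sketched as follows. This is a minimal illustration and not the authors' pipeline: it assumes propensity scores have already been estimated (for example by logistic regression on the confounders), uses one-to-one nearest-neighbor matching with replacement and a caliper, and runs on made-up numbers. The function name `att_by_matching` and the caliper value are hypothetical.

```python
def att_by_matching(treated, control, caliper=0.05):
    """Average treatment effect on the treated via nearest-neighbor matching.

    treated, control: lists of (propensity_score, outcome) pairs.
    ASSUMPTION: scores are precomputed; matching is with replacement, and
    treated units with no control within `caliper` are dropped.
    """
    differences = []
    for score_t, outcome_t in treated:
        # Closest control unit on the propensity score.
        score_c, outcome_c = min(control, key=lambda unit: abs(unit[0] - score_t))
        if abs(score_c - score_t) <= caliper:
            differences.append(outcome_t - outcome_c)
    if not differences:
        return None  # no treated unit found an acceptable match
    return sum(differences) / len(differences)


# Made-up (propensity score, novelty) pairs: papers written with a top
# scientist (treated) and without one (control).
with_ts = [(0.70, 0.40), (0.55, 0.35), (0.90, 0.30)]
without_ts = [(0.68, 0.50), (0.52, 0.45), (0.20, 0.60)]
att = att_by_matching(with_ts, without_ts)
```

In this toy data two treated papers find a control within the caliper and the third is dropped, giving an estimated effect of about -0.10 on the novelty score. The sign matches the direction the abstract reports, but the magnitude is purely illustrative.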
Pub Date: 2024-11-20; DOI: 10.1016/j.joi.2024.101614
Giovanni Abramo, Ciriaco Andrea D'Angelo
Just as innovations often succeed in fields beyond their original domains, this study explores whether the same applies to scientific discoveries. We investigate the flow of knowledge across scientific disciplines by analyzing connections between 2015 cited publications indexed in the Web of Science and their citing counterparts. Specifically, we measure the rates of knowledge dissemination within and across different fields. Our study addresses key questions about disparities between inter- and intra-domain dissemination rates, the relationship between dissemination types and scholarly impact, and the evolution of these patterns over time. These findings enhance our understanding of knowledge flows and provide practical insights with significant implications for evaluative bibliometrics.
{"title":"Inter- and intra-domain knowledge flows: Examining their relationship with impact at the field level over time","authors":"Giovanni Abramo , Ciriaco Andrea D'Angelo","doi":"10.1016/j.joi.2024.101614","DOIUrl":"10.1016/j.joi.2024.101614","url":null,"PeriodicalId":48662,"journal":{"name":"Journal of Informetrics","volume":"19 1","pages":"Article 101614"},"PeriodicalIF":3.4,"publicationDate":"2024-11-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142701165","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"管理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
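The abstract of this entry does not define its dissemination rates, so the following is only a minimal sketch under an assumed definition: for each cited field, the intra-domain rate is the share of citations coming from the same field, and the inter-domain rate is the remainder. The function name `dissemination_rates`, the field labels, and the toy citation pairs are hypothetical and do not come from the study.

```python
from collections import defaultdict


def dissemination_rates(citations):
    """Map each cited field to (intra_share, inter_share).

    `citations` is an iterable of (cited_field, citing_field) pairs.
    Assumed definition, not the paper's: intra = same-field share.
    """
    intra = defaultdict(int)
    total = defaultdict(int)
    for cited_field, citing_field in citations:
        total[cited_field] += 1
        if citing_field == cited_field:
            intra[cited_field] += 1
    return {
        field: (intra[field] / n, 1 - intra[field] / n)
        for field, n in total.items()
    }


# Toy data, invented for illustration only.
toy_citations = [
    ("Physics", "Physics"),
    ("Physics", "Chemistry"),
    ("Physics", "Physics"),
    ("Physics", "Biology"),
    ("Biology", "Biology"),
    ("Biology", "Medicine"),
]
rates = dissemination_rates(toy_citations)
for field, (intra_share, inter_share) in sorted(rates.items()):
    print(f"{field}: intra={intra_share:.2f}, inter={inter_share:.2f}")
```

Comparing such shares across publication years would be one simple way to look at how the balance between within-field and cross-field flows changes over time.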
Pub Date : 2024-11-20 DOI: 10.1016/j.joi.2024.101612
Jinqing Yang, Jiming Hu
Scientific knowledge can evolve along several potential patterns. To examine how the function and role of knowledge change during this evolution, we propose a research framework that studies role transitions of scientific knowledge from the perspective of a hierarchical structure. We constructed two classification models, one for transition possibility and one for transition type, to predict whether a piece of scientific knowledge undergoes a role transition and, if so, which type of transition it belongs to. Several datasets were constructed from the entire corpus of publications available in PubMed and the history records of MeSH. In both prediction tasks, the Gradient Boosting classifier performed best. The binary classification model of transition possibility achieved a precision of 72.58%, a recall of 71.04%, and an F1 score of 71.78%. The multi-class model of transition type had a macro-F1 score of 61.29%, a micro-F1 score of 84.07%, and a weighted-F1 score of 82.90%. Further, we found that knowledge genealogy features contribute the most to the prediction of transition possibility, while knowledge attribute and network structure features have a significantly greater influence on the prediction of transition type. Most features have an obvious effect on role transitions of the Content-change type, followed by the Child-generation and Localization-shift types.
{"title":"Scientific knowledge role transition prediction from a knowledge hierarchical structure perspective","authors":"Jinqing Yang , Jiming Hu","doi":"10.1016/j.joi.2024.101612","DOIUrl":"10.1016/j.joi.2024.101612","url":null,"PeriodicalId":48662,"journal":{"name":"Journal of Informetrics","volume":"19 1","pages":"Article 101612"},"PeriodicalIF":3.4,"publicationDate":"2024-11-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142701971","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"管理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
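The role-transition abstract above reports precision, recall, and F1 for a binary model, plus macro-, micro-, and weighted-F1 for a multi-class model. The sketch below shows how these averages are conventionally computed from a confusion matrix. The confusion matrix is toy data invented for illustration; only the 72.58% precision and 71.04% recall are taken from the abstract. Plugging those two values into the harmonic-mean formula gives roughly 71.8%, close to the reported 71.78%; the small gap may come from rounding or from averaging over folds, which is an assumption here, not something the abstract states.

```python
def f1(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


def f1_averages(confusion):
    """Return (macro, micro, weighted) F1 for a single-label problem.

    confusion[i][j] = number of items with true class i predicted as j.
    """
    k = len(confusion)
    total = sum(sum(row) for row in confusion)
    per_class, supports = [], []
    for c in range(k):
        tp = confusion[c][c]
        predicted = sum(confusion[r][c] for r in range(k))
        actual = sum(confusion[c])
        p = tp / predicted if predicted else 0.0
        r = tp / actual if actual else 0.0
        per_class.append(f1(p, r) if p + r else 0.0)
        supports.append(actual)
    macro = sum(per_class) / k  # every class counts equally
    weighted = sum(f * s for f, s in zip(per_class, supports)) / total
    # With one label per item, micro-F1 equals overall accuracy.
    micro = sum(confusion[c][c] for c in range(k)) / total
    return macro, micro, weighted


# Precision and recall as reported in the abstract (binary model).
binary_f1 = f1(0.7258, 0.7104)

# Toy 3-class confusion matrix, invented for illustration only.
toy_confusion = [
    [80, 5, 5],
    [4, 4, 2],
    [3, 2, 5],
]
macro, micro, weighted = f1_averages(toy_confusion)
print(f"binary F1 = {binary_f1:.4f}")
print(f"macro = {macro:.4f}, micro = {micro:.4f}, weighted = {weighted:.4f}")
```

In the toy matrix the two small classes are predicted poorly, which pulls macro-F1 well below micro-F1. A similar gap between the reported 61.29% and 84.07% would be consistent with weaker performance on rarer transition types, though the abstract does not say so.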
Pub Date : 2024-11-19 DOI: 10.1016/j.joi.2024.101606
Dan Wang, Xiao Zhou, Pengwei Zhao, Juan Pang, Qiaoyang Ren
Identifying breakthrough technologies is crucial for advancing technological innovation, and science-driven innovation patterns are considered key pathways to such breakthroughs. Building on this premise, this paper presents a framework for identifying breakthrough technologies that starts from signals of scientific innovation. The first step is to construct a science-technology knowledge network based on papers and patents. A two-stage selection method then funnels the scientific innovation signals, retaining those with the potential to trigger technological breakthroughs. Next, a machine learning-based link prediction model, integrating three types of features, identifies new links between science-driven signals and existing technologies. A community detection algorithm then identifies sub-networks of technologies formed around these new links. Finally, a structural entropy index is used to evaluate these sub-networks and determine potential breakthrough technologies. By systematically characterizing the content and core features of scientific innovation signals, this study reveals the driving sources of technological breakthroughs and sheds light on the absorption and diffusion processes of scientific innovation. We validated the method through a use case in the field of artificial intelligence. Those who manage technological innovation should find the insights of this research particularly valuable.
{"title":"Early identification of breakthrough technologies: Insights from science-driven innovations","authors":"Dan Wang , Xiao Zhou , Pengwei Zhao , Juan Pang , Qiaoyang Ren","doi":"10.1016/j.joi.2024.101606","DOIUrl":"10.1016/j.joi.2024.101606","url":null,"PeriodicalId":48662,"journal":{"name":"Journal of Informetrics","volume":"19 1","pages":"Article 101606"},"PeriodicalIF":3.4,"publicationDate":"2024-11-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142701166","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"管理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
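The breakthrough-technology abstract above ends its pipeline by scoring sub-networks with a structural entropy index, but it does not give the formula. As a hedged illustration only, the sketch below computes the one-dimensional structural entropy of an undirected graph, H = -Σ (d_i / 2m) log2(d_i / 2m), where d_i is the degree of node i and m is the number of edges. Whether the paper uses this exact variant is an assumption; the function name and both toy graphs are hypothetical.

```python
import math


def one_dim_structural_entropy(edges):
    """One-dimensional structural entropy of an undirected graph.

    `edges` is an iterable of (u, v) pairs. Assumed variant: the
    Shannon entropy of the degree-proportional node distribution.
    """
    degree = {}
    for u, v in edges:
        degree[u] = degree.get(u, 0) + 1
        degree[v] = degree.get(v, 0) + 1
    volume = sum(degree.values())  # equals 2 * number of edges
    return -sum(
        (d / volume) * math.log2(d / volume) for d in degree.values()
    )


# Toy sub-networks with five nodes each, invented for illustration only.
star_edges = [("hub", "a"), ("hub", "b"), ("hub", "c"), ("hub", "d")]
ring_edges = [("a", "b"), ("b", "c"), ("c", "d"), ("d", "e"), ("e", "a")]

star_entropy = one_dim_structural_entropy(star_edges)
ring_entropy = one_dim_structural_entropy(ring_edges)
print(f"star: {star_entropy:.4f}, ring: {ring_entropy:.4f}")
```

Under this definition a hub-dominated sub-network scores lower than an evenly connected one of the same size, so the measure distinguishes centralized from distributed structures. How such scores map to breakthrough potential is specific to the paper and is not reproduced here.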