Yang Zhao, Mengting Zhang, Xiaoli Chen, Zhixiong Zhang
Early identification of scientific breakthroughs through outlier analysis based on research entities
Journal of Data and Information Science, published 2024-09-04
DOI: 10.2478/jdis-2024-0027 (https://doi.org/10.2478/jdis-2024-0027)
Impact factor: 1.5; JCR: Q2 (Information Science & Library Science)
Citations: 0
Abstract
Purpose: To address the “anomalies” that occur when scientific breakthroughs emerge, this study examines the early signs and nascent stages of breakthrough innovations from the perspective of outliers, with the aim of identifying scientific breakthroughs in papers early.
Design/methodology/approach: This study uses semantic technology to extract research entities from the titles and abstracts of papers to represent each paper’s research content. Outlier detection methods are then employed to measure and analyze the anomalies in breakthrough papers during their early stages. Their development and evolution are traced using literature time tags. Finally, a case study is conducted using the key publications of the 2021 Nobel Prize laureates in Physiology or Medicine.
Findings: Manual analysis of all identified outlier papers verifies the effectiveness of the proposed method for the early identification of potential scientific breakthroughs.
Research limitations: The method has been empirically tested only in the biomedical field. More data from various fields are needed to validate its robustness and generalizability.
Practical implications: This study provides a valuable supplement to current methods for early identification of scientific breakthroughs, effectively supporting technological intelligence decision-making and services.
Originality/value: The study introduces a novel approach to early identification of scientific breakthroughs by leveraging outlier analysis of research entities. Compared with traditional citation-based evaluations, it offers a more sensitive, precise, and fine-grained alternative, which enhances the ability to identify nascent breakthrough innovations.
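The abstract does not say which outlier detection method or entity representation the authors use, so the following is only a minimal sketch of the general idea under stated assumptions: each paper is reduced to a set of entity strings, and a paper's outlier score is its mean Jaccard distance to every other paper in the collection. The function names, the scoring rule, and the toy entity labels are all hypothetical and are not taken from the study.

```python
def jaccard_distance(a, b):
    """Return 1 - |a & b| / |a | b| for two sets (0.0 if both are empty)."""
    union = a | b
    if not union:
        return 0.0
    return 1.0 - len(a & b) / len(union)


def outlier_scores(papers):
    """Map each paper id to its mean Jaccard distance to all other papers.

    `papers` is a dict of paper id -> set of entity strings.
    Assumption: a higher mean distance means the paper's entities are
    more unusual relative to the rest of the collection.
    """
    scores = {}
    for pid, entities in papers.items():
        others = [e for other_id, e in papers.items() if other_id != pid]
        if not others:
            scores[pid] = 0.0
            continue
        total = sum(jaccard_distance(entities, other) for other in others)
        scores[pid] = total / len(others)
    return scores


def rank_outliers(papers):
    """Return (paper id, score) pairs sorted from most to least outlying."""
    return sorted(outlier_scores(papers).items(), key=lambda kv: kv[1], reverse=True)


# Toy collection with placeholder entity labels (illustrative only).
papers = {
    "p1": {"entity_a", "entity_b", "entity_c"},
    "p2": {"entity_a", "entity_c", "entity_d"},
    "p3": {"entity_a", "entity_b", "entity_d"},
    "p4": {"entity_x", "entity_y"},  # shares no entity with the others
}

scores = outlier_scores(papers)
ranking = rank_outliers(papers)
for pid, score in ranking:
    print(pid, round(score, 3))
```

In this toy collection, `p4` shares no entities with the other papers and therefore ranks first. A real pipeline would also need an entity extraction step and a way to compare papers within publication-time windows, neither of which is shown here.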
About the journal:
JDIS devotes itself to the study and application of the theories, methods, techniques, services, and infrastructural facilities that use big data to support knowledge discovery for decision and policy making. Its basic emphasis is on work that is big-data-based, analytics-centered, knowledge-discovery-driven, and supportive of decision making. Special effort is devoted to knowledge discovery that detects and predicts structures, trends, behaviors, relations, evolutions, and disruptions in research, innovation, business, politics, security, media and communications, and social development, where the big data may include metadata or full content data, text or non-textual data, structured or unstructured data, domain-specific or cross-domain data, and dynamic or interactive data.
The main areas of interest are:
(1) New theories, methods, and techniques of big-data-based data mining, knowledge discovery, and informatics, including but not limited to scientometrics, communication analysis, social network analysis, tech and industry analysis, competitive intelligence, knowledge mapping, evidence-based policy analysis, and predictive analysis.
(2) New methods, architectures, and facilities to develop or improve knowledge infrastructure capable of supporting knowledge organization and sophisticated analytics, including but not limited to ontology construction, knowledge organization, semantic linked data, knowledge integration and fusion, semantic retrieval, domain-specific knowledge infrastructure, and semantic sciences.
(3) New mechanisms, methods, and tools to embed knowledge analytics and knowledge discovery into actual operation, service, or managerial processes, including but not limited to knowledge-assisted scientific discovery and data-mining-driven intelligent workflows in learning, communications, and management.
Specific topic areas may include:
Knowledge organization
Knowledge discovery and data mining
Knowledge integration and fusion
Semantic Web metrics
Scientometrics
Analytic and diagnostic informetrics
Competitive intelligence
Predictive analysis
Social network analysis and metrics
Semantic and interactively analytic retrieval
Evidence-based policy analysis
Intelligent knowledge production
Knowledge-driven workflow management and decision-making
Knowledge-driven collaboration and its management
Domain knowledge infrastructure with knowledge fusion and analytics
Development of data and information services