Early identification of scientific breakthroughs through outlier analysis based on research entities

IF 1.5 | Zone 3, Management | Q2, INFORMATION SCIENCE & LIBRARY SCIENCE | Journal of Data and Information Science | Pub Date: 2024-09-04 | DOI: 10.2478/jdis-2024-0027
Yang Zhao, Mengting Zhang, Xiaoli Chen, Zhixiong Zhang
Citations: 0

Abstract

Purpose: To address the “anomalies” that occur when scientific breakthroughs emerge, this study focuses on identifying early signs and nascent stages of breakthrough innovations from the perspective of outliers, aiming to achieve early identification of scientific breakthroughs in papers.

Design/methodology/approach: This study utilizes semantic technology to extract research entities from the titles and abstracts of papers to represent each paper’s research content. Outlier detection methods are then employed to measure and analyze the anomalies in breakthrough papers during their early stages. The development and evolution process are traced using literature time tags. Finally, a case study is conducted using the key publications of the 2021 Nobel Prize laureates in Physiology or Medicine.

Findings: Through manual analysis of all identified outlier papers, the effectiveness of the proposed method for early identifying potential scientific breakthroughs is verified.

Research limitations: The study’s applicability has only been empirically tested in the biomedical field. More data from various fields are needed to validate the robustness and generalizability of the method.

Practical implications: This study provides a valuable supplement to current methods for early identification of scientific breakthroughs, effectively supporting technological intelligence decision-making and services.

Originality/Value: The study introduces a novel approach to early identification of scientific breakthroughs by leveraging outlier analysis of research entities, offering a more sensitive, precise, and fine-grained alternative method compared to traditional citation-based evaluations, which enhances the ability to identify nascent breakthrough innovations.
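The abstract does not say which entity extractor or outlier detector the authors used, so the following is only a minimal sketch of the general idea under stated assumptions: each paper is represented by a set of already-extracted research entities, its "time tag" is the publication year, and its outlier score is the mean Jaccard distance to its k nearest strictly earlier papers. The names `Paper`, `outlier_score`, `rank_outliers`, the choice of Jaccard distance, and the toy corpus are all hypothetical and are not taken from the paper.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Paper:
    paper_id: str
    year: int                 # the publication "time tag"
    entities: frozenset[str]  # research entities from title and abstract


def jaccard_distance(a: frozenset[str], b: frozenset[str]) -> float:
    """Return 1 - |a & b| / |a | b|; two empty sets count as identical."""
    union = a | b
    if not union:
        return 0.0
    return 1.0 - len(a & b) / len(union)


def outlier_score(paper: Paper, corpus: list[Paper], k: int = 2) -> float | None:
    """Mean distance from `paper` to its k nearest strictly earlier papers.

    Returns None when the corpus holds no earlier paper to compare against.
    """
    distances = sorted(
        jaccard_distance(paper.entities, other.entities)
        for other in corpus
        if other.year < paper.year
    )
    nearest = distances[:k]
    if not nearest:
        return None
    return sum(nearest) / len(nearest)


def rank_outliers(corpus: list[Paper], year: int, k: int = 2) -> list[tuple[str, float]]:
    """Score every paper published in `year`; most anomalous first."""
    scored = []
    for paper in corpus:
        if paper.year != year:
            continue
        score = outlier_score(paper, corpus, k)
        if score is not None:
            scored.append((paper.paper_id, score))
    return sorted(scored, key=lambda item: item[1], reverse=True)


# Toy corpus with made-up entity labels (illustrative only, not real data).
TOY_CORPUS = [
    Paper("p1", 2019, frozenset({"a", "b", "c"})),
    Paper("p2", 2019, frozenset({"a", "b", "d"})),
    Paper("p3", 2020, frozenset({"a", "b", "c"})),  # repeats known entities
    Paper("p4", 2020, frozenset({"x", "y", "a"})),  # mostly unseen entities
]

if __name__ == "__main__":
    for paper_id, score in rank_outliers(TOY_CORPUS, year=2020):
        print(paper_id, round(score, 3))
```

In this toy corpus, `p4` ranks above `p3` because it shares only one entity with any earlier paper, whereas `p3` repeats an entity set already seen in 2019. Comparing only against earlier papers is one simple way to respect the time ordering the abstract mentions; a real pipeline would need an actual entity extractor and a detector validated on the field in question.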
Source journal
Journal of Data and Information Science (INFORMATION SCIENCE & LIBRARY SCIENCE)
CiteScore: 3.50
Self-citation rate: 6.70%
Number of articles published: 495
Journal introduction: JDIS devotes itself to the study and application of the theories, methods, techniques, services, and infrastructural facilities that use big data to support knowledge discovery for decision and policy making. Its basic emphasis is big data-based, analytics-centered, knowledge discovery-driven, and decision making-supporting. Special effort goes to knowledge discovery that detects and predicts structures, trends, behaviors, relations, evolutions, and disruptions in research, innovation, business, politics, security, media and communications, and social development, where the big data may include metadata or full content data, text or non-textual data, structured or unstructured data, domain-specific or cross-domain data, and dynamic or interactive data.

The main areas of interest are:

(1) New theories, methods, and techniques of big data-based data mining, knowledge discovery, and informatics, including but not limited to scientometrics, communication analysis, social network analysis, tech and industry analysis, competitive intelligence, knowledge mapping, evidence-based policy analysis, and predictive analysis.

(2) New methods, architectures, and facilities to develop or improve knowledge infrastructure capable of supporting knowledge organization and sophisticated analytics, including but not limited to ontology construction, knowledge organization, semantic linked data, knowledge integration and fusion, semantic retrieval, domain-specific knowledge infrastructure, and semantic sciences.

(3) New mechanisms, methods, and tools to embed knowledge analytics and knowledge discovery into actual operation, service, or managerial processes, including but not limited to knowledge-assisted scientific discovery and data mining-driven intelligent workflows in learning, communications, and management.

Specific topic areas may include:
Knowledge organization
Knowledge discovery and data mining
Knowledge integration and fusion
Semantic Web metrics
Scientometrics
Analytic and diagnostic informetrics
Competitive intelligence
Predictive analysis
Social network analysis and metrics
Semantic and interactively analytic retrieval
Evidence-based policy analysis
Intelligent knowledge production
Knowledge-driven workflow management and decision-making
Knowledge-driven collaboration and its management
Domain knowledge infrastructure with knowledge fusion and analytics
Development of data and information services
Latest articles from this journal
Early identification of scientific breakthroughs through outlier analysis based on research entities
Community detection on elite mathematicians’ collaboration network
Navigating interdisciplinary research: Historical progression and contemporary challenges
Data-enhanced revealing of trends in Geoscience
Identifying multidisciplinary problems from scientific publications based on a text generation method