Lead–lag effect of research between conference papers and journal papers in data mining

Yue Huang, Runyu Tian
Journal: WIREs Data Mining and Knowledge Discovery
Published: 2024-09-24
DOI: 10.1002/widm.1561 (https://doi.org/10.1002/widm.1561)

Abstract

Examining the lead–lag effect between publication types along a temporal dimension is important for research assessment. In this article, we introduce a novel framework for quantifying the lead–lag effect between the research topics of conference papers and journal papers. We first identify research topics with BERTopic, a text-embedding-based topic modeling technique, and extract the topics of each time slice. We then construct and visualize the topic similarity matrix to reveal the direction of the time lag. Finally, we quantify the lead–lag effect with four proposed indicators and with average influence topic similarity comparison maps. We analyze 19,166 bibliographic records of top conference papers and journal papers published from 2015 to 2019 in the data mining field, and compute the similarity between the BERTopic topics of each pair of quarterly time slices. The results show that journal paper topics lag behind conference paper topics in the data mining field. The most significant lead–lag effect is 2.5 years, with approximately 33.45% of topics affected by this lag. The methodology could be applied to lead–lag analysis in other research areas, offering insight into the state of research development and informing policy decisions.

This article is categorized under: Application Areas > Science and Technology
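
The similarity-matrix step in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation, and it does not reproduce their four indicators. It assumes that each quarterly time slice has already been reduced to a set of topic embedding vectors (for example, from a topic model such as BERTopic), and it uses synthetic random vectors in place of real data. The function names, the best-match averaging rule, and the toy lag of 2 quarters are all assumptions made for illustration. With quarterly slices, the 2.5-year lag reported in the abstract would correspond to 10 slices.

```python
import numpy as np

def cosine_similarity_matrix(a, b):
    """Pairwise cosine similarity between the rows of a and the rows of b."""
    a_unit = a / np.linalg.norm(a, axis=1, keepdims=True)
    b_unit = b / np.linalg.norm(b, axis=1, keepdims=True)
    return a_unit @ b_unit.T

def slice_similarity(conf_slices, jour_slices):
    """sim[i, j]: mean best-match similarity of conference topics in
    slice i against journal topics in slice j (an assumed aggregation)."""
    sim = np.zeros((len(conf_slices), len(jour_slices)))
    for i, conf_topics in enumerate(conf_slices):
        for j, jour_topics in enumerate(jour_slices):
            pairwise = cosine_similarity_matrix(conf_topics, jour_topics)
            sim[i, j] = pairwise.max(axis=1).mean()
    return sim

def mean_similarity_by_lag(sim):
    """Average each diagonal of sim. Lag = journal slice index minus
    conference slice index, so a positive lag means journals follow."""
    n_conf, n_jour = sim.shape
    return {lag: float(np.diagonal(sim, offset=lag).mean())
            for lag in range(-(n_conf - 1), n_jour)}

# Synthetic data (hypothetical): journal topics in slice j are noisy copies
# of conference topics from slice j - true_lag.
rng = np.random.default_rng(0)
n_slices, n_topics, dim, true_lag = 6, 3, 8, 2
conf_slices = [rng.normal(size=(n_topics, dim)) for _ in range(n_slices)]
jour_slices = []
for j in range(n_slices):
    if j >= true_lag:
        noise = 0.01 * rng.normal(size=(n_topics, dim))
        jour_slices.append(conf_slices[j - true_lag] + noise)
    else:
        jour_slices.append(rng.normal(size=(n_topics, dim)))

sim = slice_similarity(conf_slices, jour_slices)
lag_profile = mean_similarity_by_lag(sim)
best_lag = max(lag_profile, key=lag_profile.get)
print(f"best lag: {best_lag} quarters ({best_lag / 4} years)")
```

Averaging along diagonals is one simple way to read a time-lag direction from the matrix: a peak at a positive lag suggests that journal topics resemble earlier conference topics. Long lags are averaged over few slice pairs, so their estimates are noisier.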