Journal of data and information science (Warsaw, Poland): latest articles

Editorial board publication strategy and acceptance rates in Turkish national journals

Journal of data and information science (Warsaw, Poland)

Pub Date : 2023-08-25 DOI: 10.2478/jdis-2023-0019

Lokman Tutuncu

Abstract
Purpose: This study uses newly released journal metrics to investigate whether local journals with more qualified editorial boards have lower acceptance rates, based on data from 219 Turkish national journals and 2,367 editorial board members.
Design/methodology/approach: The study argues that journal editors can signal their scholarly quality by publishing in reputable journals. Conversely, editors who publish "inside" articles in the national journals they are affiliated with send negative signals. The research predicts that high-quality editorial boards will evaluate submissions more selectively, so their journals will have lower acceptance rates, while low-quality boards will evaluate less selectively, so their journals will have higher acceptance rates. Based on the publication strategy of editors, four measures of board quality are defined: number of board inside publications per editor (INSIDER), number of board Social Sciences Citation Index publications per editor (SSCI), inside-to-SSCI article ratio (ISRA), and board citations per editor (CITATION). Predictions are tested by correlation and regression analysis.
Findings: Low-quality board proxies (INSIDER, ISRA) are positively associated with acceptance rates, and high-quality board proxies (SSCI, CITATION) are negatively associated with them. Further, receiving a larger number of submissions, greater representation of women on boards, and Web of Science and Scopus (WOSS) coverage are associated with lower acceptance rates. Acceptance rates range from 12% to 91%, with a mean of 54% and a median of 53%. Law journals have a significantly higher average acceptance rate (68%) than other journals, while WOSS journals have the lowest (43%). These are some of the highest acceptance rates in the Social Sciences literature, and they include competitive Business and Economics journals, which traditionally have low acceptance rates.
Limitations: The research relies on local context to define the publication strategy of editors. Findings may not generalize to mainstream journals and core science countries, where the emphasis on research quality is stronger and editorial selection is based on scientific merit.
Practical implications: The results offer useful insights into the editorial management of national journals and help make sense of local editorial practices. In particular, they highlight the importance of scientific merit in selecting national journal editorial board members, so that submitted manuscripts receive sound editorial evaluation.
Originality/value: This is the first attempt to document a significant relation between acceptance rates and editorial board publication behavior.
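The four board-quality measures named in the abstract are simple per-editor ratios, and a minimal sketch may make the definitions concrete. The `Board` record, its field names, and the handling of a board with no SSCI articles are assumptions made for illustration; the abstract does not say how the study treats that case.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Board:
    """Hypothetical per-journal record; field names are illustrative."""
    editors: int      # number of editorial board members
    inside_pubs: int  # board members' articles in their own journal
    ssci_pubs: int    # board members' SSCI-indexed articles
    citations: int    # total citations to board members' work


def board_measures(board: Board) -> dict[str, Optional[float]]:
    """Compute the four proxies as per-editor ratios."""
    insider = board.inside_pubs / board.editors
    ssci = board.ssci_pubs / board.editors
    # Assumption: ISRA is left undefined when there are no SSCI articles.
    isra = board.inside_pubs / board.ssci_pubs if board.ssci_pubs else None
    citation = board.citations / board.editors
    return {"INSIDER": insider, "SSCI": ssci, "ISRA": isra, "CITATION": citation}


# Toy numbers, not data from the study.
example = board_measures(Board(editors=10, inside_pubs=20, ssci_pubs=5, citations=100))
```

Under the abstract's prediction, journals with higher INSIDER and ISRA values would tend to show higher acceptance rates, and journals with higher SSCI and CITATION values lower ones.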



Citations: 0

Build neural network models to identify and correct news headlines exaggerating obesity-related scientific findings

Journal of data and information science (Warsaw, Poland)

Pub Date : 2023-06-01 DOI: 10.2478/jdis-2023-0014

R. An, Quinlan Batcheller, Junjie Wang, Yuyi Yang

Abstract
Purpose: Media exaggerations of health research may confuse readers, erode public trust in science and medicine, and cause disease mismanagement. This study built artificial intelligence (AI) models to automatically identify and correct news headlines that exaggerate obesity-related research findings.
Design/methodology/approach: We searched popular digital media outlets and collected 523 headlines exaggerating obesity-related research findings. The reasons for exaggeration include inferring causality from observational studies, inferring human outcomes from animal research, inferring distant/end outcomes (e.g., obesity) from immediate/intermediate outcomes (e.g., calorie intake), and generalizing findings from a subgroup or convenience sample to the population. Each headline was paired with the title and abstract of the peer-reviewed journal publication covered by the news article. We drafted an exaggeration-free counterpart for each original headline and fine-tuned a BERT model to differentiate between the two. We further fine-tuned three generative language models (BART, PEGASUS, and T5) to autogenerate exaggeration-free headlines from a journal publication’s title and abstract. Model performance was evaluated with the ROUGE metrics by comparing model-generated headlines with journal publication titles.
Findings: The fine-tuned BERT model achieved 92.5% accuracy in differentiating between exaggeration-free and original headlines. Baseline ROUGE scores averaged 0.311 for ROUGE-1, 0.113 for ROUGE-2, 0.253 for ROUGE-L, and 0.253 for ROUGE-Lsum. PEGASUS, T5, and BART all outperformed the baseline. The best-performing BART model attained 0.447 for ROUGE-1, 0.221 for ROUGE-2, 0.402 for ROUGE-L, and 0.402 for ROUGE-Lsum.
Originality/value: This study demonstrated the feasibility of leveraging AI to automatically identify and correct news headlines exaggerating obesity-related research findings.
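For readers unfamiliar with ROUGE, the sketch below shows how an n-gram overlap score between a generated headline and a reference title can be computed. It is a simplified illustration, assuming lowercase whitespace tokenization with no stemming; the function name `rouge_n` is hypothetical, and the scores reported in the abstract come from the authors' own tooling, whose preprocessing may differ.

```python
from collections import Counter


def rouge_n(candidate: str, reference: str, n: int = 1) -> dict[str, float]:
    """N-gram overlap precision, recall and F1 between two strings."""

    def ngrams(text: str) -> Counter:
        tokens = text.lower().split()
        return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

    cand, ref = ngrams(candidate), ngrams(reference)
    # Clipped overlap: each n-gram counts at most as often as it occurs in both.
    overlap = sum((cand & ref).values())
    precision = overlap / sum(cand.values()) if cand else 0.0
    recall = overlap / sum(ref.values()) if ref else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"precision": precision, "recall": recall, "f1": f1}


# Toy strings, not headlines from the study.
unigram = rouge_n("the cat sat", "the cat slept", n=1)
bigram = rouge_n("the cat sat", "the cat slept", n=2)
```

ROUGE-1 and ROUGE-2 correspond to `n=1` and `n=2`; ROUGE-L instead uses the longest common subsequence and is not covered by this sketch.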


Publication type: Journal Article. Volume: 8 1. Pages: 88-97.

Citations: 0

Multimodal sentiment analysis for social media contents during public emergencies

Journal of data and information science (Warsaw, Poland)

Pub Date : 2023-06-01 DOI: 10.2478/jdis-2023-0012

Tao Fan, Hao Wang, Peng Wu, Chen Ling, Milad Taleby Ahvanooey

Abstract
Purpose: Public opinion during public emergencies is now expressed not only in text but also in images. However, existing work focuses mainly on textual content and, because it does not combine multiple modalities, does not achieve satisfactory sentiment analysis accuracy. In this paper, we propose combining the texts and images generated on social media to perform sentiment analysis.
Design/methodology/approach: We propose a Deep Multimodal Fusion Model (DMFM), which combines textual and visual sentiment analysis. We first train a word2vec model on a large-scale public emergency corpus to obtain semantically rich word vectors as the input for textual sentiment analysis. A BiLSTM is employed to generate encoded textual embeddings. To fully extract visual information from images, a modified pretrained VGG16-based sentiment analysis network is used with the best-performing fine-tuning strategy. A multimodal fusion method then fuses the textual and visual embeddings to produce predicted labels.
Findings: We performed extensive experiments on Weibo and Twitter public emergency datasets to evaluate the proposed model. The experimental results show that DMFM is more accurate than the baseline models, and that introducing images can improve sentiment analysis during public emergencies.
Research limitations: In the future, we will test our model on a wider dataset and consider better ways to learn the multimodal fusion information.
Practical implications: We build an efficient multimodal sentiment analysis model for social media content during public emergencies.
Originality/value: We consider the images posted by online users on social platforms during public emergencies. The proposed method offers a new perspective on sentiment analysis during public emergencies and can provide decision support for governments formulating policy in public emergencies.
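The abstract does not specify how DMFM fuses the two modalities. As a generic illustration only, the sketch below uses simple concatenation fusion: a text embedding and an image embedding are joined into one vector, passed through a linear layer, and turned into label probabilities with a softmax. The function names, weights, and vector sizes are all hypothetical and are not taken from the paper.

```python
import math


def softmax(logits: list[float]) -> list[float]:
    """Numerically stable softmax over a list of logits."""
    peak = max(logits)
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def fuse_and_classify(text_emb, image_emb, weights, bias) -> list[float]:
    """Concatenate the two embeddings, apply one linear layer, return probabilities.

    Assumption: concatenation stands in for the paper's unspecified fusion method.
    """
    fused = list(text_emb) + list(image_emb)
    logits = [sum(w * x for w, x in zip(row, fused)) + b
              for row, b in zip(weights, bias)]
    return softmax(logits)


# Toy 2-d text and image embeddings and a 2-class linear layer.
probs = fuse_and_classify(
    text_emb=[1.0, 0.0],
    image_emb=[0.0, 1.0],
    weights=[[1.0, 0.0, 0.0, 1.0], [0.0, 0.0, 0.0, 0.0]],
    bias=[0.0, 0.0],
)
```

In a real system the text embedding would come from the BiLSTM encoder and the image embedding from the VGG16-based network, with the linear layer learned during training.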


Publication type: Journal Article. Volume: 8 1. Pages: 61-87.

Citations: 0

An author credit allocation method with improved distinguishability and robustness

Journal of data and information science (Warsaw, Poland)

Pub Date : 2023-06-01 DOI: 10.2478/jdis-2023-0016

Yang Li, Tao Jia

Abstract
Purpose: This study proposes an improved credit allocation method that makes the leading author of a paper more distinguishable and makes the identification more robust under malicious manipulation.
Design/methodology/approach: We use a modified Sigmoid function to handle citation counts, which follow a fat-tailed distribution. We also remove the target paper when calculating the contribution of co-citations. Following previous studies, we use 30 Nobel Prize-winning papers and their citation networks, based on the American Physical Society (APS) and Microsoft Academic Graph (MAG) datasets, to test the accuracy of our proposed method (NCCAS). In addition, we use 654,148 articles published in the field of computer science from 2000 to 2009 in the MAG dataset to validate the distinguishability and robustness of NCCAS.
Findings: Compared with state-of-the-art methods, NCCAS gives the most accurate prediction of Nobel laureates. Furthermore, the leading author identified by NCCAS is more distinguishable from the other co-authors, and the results of NCCAS are more robust to malicious manipulation. Finally, we perform ablation studies to show the contribution of the different components of our method.
Research limitations: Because ground truth on the true leading author of a work is limited, the accuracy of NCCAS and other related methods can only be tested on Nobel Physics Prize-winning papers.
Practical implications: NCCAS is successfully applied to a large number of publications, demonstrating its potential for analyzing the relationship between contribution and recognition for authors in different by-line positions.
Originality/value: Compared with existing methods, NCCAS not only identifies the leading author of a paper more accurately, but also makes the identification more distinguishable and more robust, providing a new tool for related studies.
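The abstract does not give the form of the modified Sigmoid function. The sketch below therefore uses a plain logistic curve with two hypothetical parameters (`midpoint` and `steepness`) to illustrate the general idea: a saturating transform bounds the weight of any single paper, so a handful of extremely highly cited papers in the fat tail cannot dominate the credit calculation.

```python
import math


def squash_citations(count: float, midpoint: float = 50.0, steepness: float = 0.1) -> float:
    """Map a raw citation count to (0, 1] with a logistic curve.

    Assumption: a standard logistic function, not the paper's modified Sigmoid.
    The parameter values are arbitrary illustrations.
    """
    return 1.0 / (1.0 + math.exp(-steepness * (count - midpoint)))


# Weights grow with citations but flatten out in the tail.
weights = [squash_citations(c) for c in (10, 50, 100, 1000, 10000)]
```

With these illustrative settings a paper with 1,000 citations and one with 10,000 citations receive almost the same weight, which is the saturation behavior that limits the leverage of artificially inflated counts.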


Publication type: Journal Article. Volume: 8 1. Pages: 15-46.

Citations: 0

Differences between journal and conference in computer science: a bibliometric view based on Bayesian network

Journal of data and information science (Warsaw, Poland)

Pub Date : 2023-06-01 DOI: 10.2478/jdis-2023-0017

Mingyue Sun, Mingliang Yue, Tingcan Ma

Abstract
Purpose: This paper investigates the differences between conference papers and journal papers in the field of computer science using a Bayesian network.
Design/methodology/approach: The analysis is based on a Bayesian network, a knowledge-representation framework that can model the relationships among all variables in the network. We defined the variables required for Bayesian network modeling, calculated the value of each variable from the Aminer dataset (a literature dataset in the field of computer science), learned the Bayesian network, and derived findings through network inference.
Findings: The study found that conferences are more attractive to senior scholars and that the academic impact of conference papers is slightly higher than that of journal papers. Whether conference papers are more innovative than journal papers remains uncertain.
Research limitations: The study was limited to the field of computer science and used the Aminer dataset as its sample. Further studies involving more diverse datasets and other fields could provide a more complete picture.
Practical implications: By demonstrating that Bayesian networks can effectively analyze issues in scientometrics, the study offers insights that may improve researchers’ understanding of the differences between journals and conferences in computer science.
Originality/value: Academic conferences play a crucial role in facilitating scholarly exchange and knowledge dissemination within computer science. Several studies have examined the distinctions between conference papers and journal papers in terms of factors such as authors, citations, and h-index. Those studies were carried out from different, independent perspectives and lack a systematic examination of the connections and interactions between perspectives. This paper addresses that gap through Bayesian network modeling.
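A Bayesian network attaches a conditional probability table to each variable given its parents. The sketch below shows maximum-likelihood estimation of one such table from records, for a single parent-child edge. It covers parameter estimation only, not the structure learning the abstract mentions, and the variable names (`venue`, `senior`) and toy records are hypothetical rather than drawn from the Aminer data.

```python
from collections import Counter


def conditional_table(records: list[dict], child: str, parent: str) -> dict:
    """Estimate P(child | parent) by relative frequency.

    Returns a dict keyed by (parent_value, child_value).
    """
    joint = Counter((r[parent], r[child]) for r in records)
    parent_totals = Counter(r[parent] for r in records)
    return {(p, c): n / parent_totals[p] for (p, c), n in joint.items()}


# Toy records: publication venue type and whether the first author is senior.
papers = [
    {"venue": "conference", "senior": True},
    {"venue": "conference", "senior": False},
    {"venue": "conference", "senior": True},
    {"venue": "journal", "senior": False},
]
cpt = conditional_table(papers, child="senior", parent="venue")
```

Comparing entries such as `cpt[("conference", True)]` with the corresponding journal entry is the kind of query that network inference answers, here reduced to a single edge.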


Publication type: Journal Article. Volume: 8 1. Pages: 47-60.

Citations: 0

Perspectives from a publishing ethics and research integrity team for required improvements

Journal of data and information science (Warsaw, Poland)

Pub Date : 2023-06-01 DOI: 10.2478/jdis-2023-0018

Sabina Alam, L. Wilson

Abstract
It is imperative that all stakeholders in the research ecosystem take responsibility for improving research integrity and the reliability of published research. Drawing on the experience of a specialist publishing ethics and research integrity team at a major publisher, this article describes observed trends in misconduct and how they have evolved over time, and sets out the key actions needed to improve the interface between researchers, funders, institutions and publishers so that they can collectively improve research integrity on a global scale.


Citations: 0

Assessment of retracted papers, and their retraction notices, from a cancer journal associated with “paper mills”

Journal of data and information science (Warsaw, Poland)

Pub Date : 2023-04-01 DOI: 10.2478/jdis-2023-0009

J. A. T. da Silva, Serhii Nazarovets

Abstract
Cancer research is occasionally described as being in a reproducibility crisis. The cancer literature contains many papers retracted for misconduct, including the use of paper mills, invalid authorship, or fake data. The objective of this paper was to assess the retractions, and the associated retraction notices, of 23 retracted Cancer Biotherapy and Radiopharmaceuticals papers associated with paper mills. By 23 March 2023, these retracted papers had already accumulated 287 citations according to Web of Science Core Collection, 253 according to Scopus, and 365 according to Google Scholar, i.e., in metric terms they were well rewarded. All authors had an affiliation in China (71% of them a hospital). Most corresponding authors (12/21; 57%) used e-mail addresses with an @163.com suffix. Four of the retraction notices (17%) explicitly gave paper mills as a reason for retraction although, in general, the retraction notices lacked the details and background that could help readers understand the retractions.
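The percentages in the abstract can be checked directly from the counts it reports. Note that the two figures use different denominators: 21 for corresponding authors and 23 for retraction notices. The helper name `pct` is hypothetical.

```python
def pct(part: int, whole: int) -> int:
    """Percentage of `part` in `whole`, rounded to the nearest integer."""
    return round(100 * part / whole)


corresponding_with_163 = pct(12, 21)   # 12 of 21 corresponding authors
notices_citing_paper_mills = pct(4, 23)  # 4 of 23 retraction notices
```

Both values agree with the rounded figures given in the abstract (57% and 17%).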

Citations: 1

International visibility of Armenian domestic journals: the role of scientific diaspora

Journal of data and information science (Warsaw, Poland)

Pub Date : 2023-04-01 DOI: 10.2478/jdis-2023-0011

E. Gzoyan, A. Mirzoyan, Anush Sargsyan, Mariam Yeghikyan, D. Maisano, Shushanik A. Sargsyan

Abstract Purpose Approximately 122 scientific journals are currently published in Armenia, of which only six are indexed by the WoS and/or Scopus databases. The majority of the national journals are published in Armenian, with only their abstracts written in English, although there are also English-language and multi-language journals with articles in Armenian and in foreign languages. The aim of this article is to study the visibility of the (non-indexed) national Armenian journals in the WoS database through citation analysis. Considering the existence of a relevant Armenian “diaspora” in the world, this article also attempts to estimate its impact in terms of citation statistics. Design/methodology/approach To this end, we identified citations to the national/domestic Armenian journals in the WoS database and compared them with the share of citations received from “diaspora” researchers (researchers of Armenian origin born in foreign countries and those originally from Armenia who have emigrated to foreign countries). Findings Among the 116 Armenian domestic journals analyzed (not indexed by WoS), only 47 were found to be cited in WoS. Of the citations to these journals, almost 12% come from “diaspora” researchers, most of them concerning Social Science and Humanities journals. Research limitations Although Armenian surnames typically end with -i(y)an, the surnames of diaspora Armenians have sometimes been changed or modified, or do not end with -i(y)an; in such cases we may fail to identify them. Practical implications This study can help to build new, deeper and more comprehensive relations with scientific diasporas. Originality/value This study offers a new understanding of multifaceted research collaboration with scientific diasporas and their role in the internationalization of domestic journals.
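
The surname-based identification described in the abstract can be illustrated with a minimal sketch. This is not the authors' procedure: the regular expression, the function names `looks_armenian` and `diaspora_share`, and the sample records are all hypothetical, and the rule only checks for a trailing -ian or -yan. As the stated limitation implies, such a rule misses changed or modified surnames and may also match unrelated names.

```python
import re

# Hypothetical rule: a surname "looks Armenian" if it ends in -ian or -yan.
ARMENIAN_SUFFIX = re.compile(r"(ian|yan)$", re.IGNORECASE)


def looks_armenian(surname: str) -> bool:
    """Return True if the surname ends with -ian or -yan."""
    return bool(ARMENIAN_SUFFIX.search(surname.strip()))


def diaspora_share(citing_authors):
    """Share of citing authors with a matching surname and a non-Armenian affiliation.

    citing_authors: list of (surname, affiliation_country) tuples.
    """
    if not citing_authors:
        return 0.0
    hits = sum(
        1
        for surname, country in citing_authors
        if country != "Armenia" and looks_armenian(surname)
    )
    return hits / len(citing_authors)


# Invented sample data, for illustration only.
sample = [
    ("Sargsyan", "Armenia"),   # matching surname, but affiliated in Armenia
    ("Petrosian", "USA"),      # counted as diaspora
    ("Smith", "UK"),           # no match
    ("Kazanjian", "France"),   # counted as diaspora
]
share = diaspora_share(sample)
```

Any real use would need manual checking of the matched names, since the suffix alone is only a rough signal.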

Citations: 0

Can first or last name uniqueness help to identify diaspora researchers from any country?

Journal of data and information science (Warsaw, Poland)

Pub Date : 2023-04-01 DOI: 10.2478/jdis-2023-0013

M. Thelwall

Abstract Purpose Diaspora researchers work in one country but have ancestral origins in another, either through moves during a research career (mobile diaspora researchers) or by starting research in the target country (embedded diaspora researchers). Whilst mobile researchers might be tracked through affiliation changes in bibliometric databases, embedded researchers cannot. This article reports an evidence-based discussion of which countries’ diaspora researchers can be partially tracked using first or last names, addressing this limitation. Design/methodology/approach A frequency analysis of first and last names of authors of all Scopus journal articles 2001-2021 for 200 countries or regions. Findings There are great variations in the extent to which first or last names are uniquely national, from Montserrat (no unique first names) to Thailand (81% unique last names). Nevertheless, most countries have a subset of first or last names that are relatively unique. For the 50 countries with the most researchers, authors with relatively national names are always more likely to research their name-associated country, suggesting a continued national association. Lists of researchers’ first and last name frequencies and proportions are provided for 200 countries/regions. Research limitations Only one period is tracked (2001-2021) and no attempt was made to validate the ancestral origins of any researcher. Practical implications Simple name heuristics can be used to identify the international spread of a sample of most countries’ diaspora researchers, but some manual checks of individual names are needed to weed out false matches. This can supplement mobile researcher data from bibliometric databases. Originality/value This is the first attempt to list name associations for the authors of all countries and large regions, and to identify the countries for which diaspora researchers could be tracked by name.
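
A name-frequency analysis of this kind could be sketched as follows. The abstract does not give the paper's thresholds or data layout, so the function names `count_names` and `national_names`, the `min_share` and `min_count` parameters, and the sample records are assumptions made for illustration. A name is treated as "relatively national" when at least `min_share` of the authors bearing it are affiliated with one country.

```python
from collections import Counter, defaultdict


def count_names(records):
    """Count, for each name, how many authors are affiliated with each country.

    records: iterable of (name, country) pairs, one per author.
    """
    counts = defaultdict(Counter)
    for name, country in records:
        counts[name][country] += 1
    return counts


def national_names(records, country, min_share=0.9, min_count=2):
    """Names whose bearers are mostly affiliated with `country`.

    min_share and min_count are illustrative thresholds, not values from the paper.
    """
    result = []
    for name, by_country in count_names(records).items():
        total = sum(by_country.values())
        if total >= min_count and by_country[country] / total >= min_share:
            result.append(name)
    return sorted(result)


# Invented sample data, for illustration only.
records = [("Somchai", "Thailand")] * 3 + [
    ("Lee", "Thailand"),
    ("Lee", "South Korea"),
    ("Lee", "USA"),
]
thai_names = national_names(records, "Thailand")
```

The `min_count` floor keeps very rare names from qualifying by chance. As the abstract notes, matches still need manual checks to weed out false positives.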

Citations: 0

Global trends in international research collaboration, 1980-2021

Journal of data and information science (Warsaw, Poland)

Pub Date : 2023-04-01 DOI: 10.2478/jdis-2023-0015

D. Aksnes, G. Sivertsen

Abstract Purpose The aim of this study is to analyze the evolution of international research collaboration from 1980 to 2021. The study examines the main global patterns as well as those specific to individual countries, country groups, and different areas of research. Design/methodology/approach The study is based on the Web of Science Core Collection database. More than 50 million publications are analyzed using co-authorship data. International collaboration is defined as publications having authors affiliated with institutions located in more than one country. Findings At the global level, the share of publications representing international collaboration has gradually increased from 4.7% in 1980 to 25.7% in 2021. The proportion of such publications within each country is higher and, in 2021, varied from less than 30% to more than 90%. There are notable disparities in the temporal trends, indicating that the process of internationalization has impacted countries in different ways. Several factors such as country size, income level, and geopolitics may explain the variance. Research limitations Not all international research collaboration results in joint co-authored scientific publications. International co-authorship is a partial indicator of such collaboration. Another limitation is that the applied full counting method does not take into account the number of authors representing each country in the publication. Practical implications The study provides global averages, indicators, and concepts that can provide a useful framework of reference for further comparative studies of international research collaboration. Originality/value Long-term macro-level studies of international collaboration are rare, and as a novelty, this study includes an analysis by the World Bank’s division of countries into four income groups.
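
The definition of international collaboration and the full counting method mentioned above can be illustrated with a small sketch. The data layout (one list of affiliation countries per publication), the function names, and the sample publications are assumptions for illustration, not the study's actual pipeline. Under full counting, an internationally co-authored paper is credited once to every participating country, regardless of how many of its authors come from each. This is one reason country-level shares can exceed the global share.

```python
def international_share(publications):
    """Global share of publications with authors from more than one country.

    publications: list of lists of affiliation countries, one entry per author.
    """
    if not publications:
        return 0.0
    international = sum(1 for countries in publications if len(set(countries)) > 1)
    return international / len(publications)


def country_international_share(publications, country):
    """Share of one country's publications that are international (full counting).

    Each publication counts once for every country appearing in it,
    no matter how many authors represent that country.
    """
    own = [set(countries) for countries in publications if country in countries]
    if not own:
        return 0.0
    return sum(1 for countries in own if len(countries) > 1) / len(own)


# Invented sample data, for illustration only.
pubs = [
    ["Norway", "Norway"],     # domestic: two authors, one country
    ["Norway", "Sweden"],     # international
    ["China"],                # domestic
    ["China", "USA", "USA"],  # international
]
global_share = international_share(pubs)
norway_share = country_international_share(pubs, "Norway")
usa_share = country_international_share(pubs, "USA")
```

In this toy sample the global share is one half, while the only publication involving the USA is international, so that country's share is 1.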

Citations: 0