
Journal of Information Science: Latest Articles

A polyphony of characteristics: An analysis of the categorisation of music’s subgenres
Zone 4 (Management) Q3 COMPUTER SCIENCE, INFORMATION SYSTEMS Pub Date: 2023-10-24 DOI: 10.1177/01655515231203511
Philip Hider, Deborah Lee
We examine how music subgenres are differentiated from each other within seven parent genres – classical, folk, reggae, country, blues, electronic and jazz – according to two different sources, AllMusic and the Library of Congress Genre/Form Terms. Medium was by far the most common differentiator, but there were many others, with most subgenres defined according to multiple characteristic types, the use of which varied greatly across genres. Overall, differentiation was based more on characteristics intrinsic to the music, but prominent extrinsic characteristic types included culture and period. Also prominent was the identification of characteristics associated with other subgenres and genres, representing hybridisation. The resulting codebook of characteristics only partly overlaps with the major facets of music identified in the knowledge organisation literature. Our research conceptualises the musical subgenre, suggesting that music subgenres are differentiated from and connected to other subgenres, and to higher-level genres, in complex, familial ways – horizontally, vertically and obliquely.
Citations: 0
A deep CNN architecture with novel pooling layer applied to two Sudanese Arabic sentiment data sets
Zone 4 (Management) Q3 COMPUTER SCIENCE, INFORMATION SYSTEMS Pub Date: 2023-10-21 DOI: 10.1177/01655515231188341
Mustafa Mhamed, Richard Sutcliffe, Husam Quteineh, Xia Sun, Eiad Almekhlafi, Ephrem Afele Retta, Jun Feng
Arabic sentiment analysis has become an important research field in recent years. Initially, work focused on Modern Standard Arabic (MSA), which is the most widely used form. Since then, work has been carried out on several different dialects, including Egyptian, Levantine and Moroccan. Moreover, a number of data sets have been created to support such work. However, up until now, no work has been carried out on Sudanese Arabic, a dialect which has 32 million speakers. In this article, two new public data sets are introduced, the two-class Sudanese Sentiment Data set (SudSenti2) and the three-class Sudanese Sentiment Data set (SudSenti3). In the preparation phase, we establish a Sudanese stopword list. Furthermore, a convolutional neural network (CNN) architecture, Sentiment Convolutional MMA (SCM), is proposed, comprising five CNN layers together with a novel Mean Max Average (MMA) pooling layer, to extract the best features. This SCM model is applied to SudSenti2 and SudSenti3 and shown to be superior to the baseline models, with accuracies of 92.25% and 85.23% (Experiments 1 and 2). The performance of MMA is compared with Max, Avg and Min and shown to be better on SudSenti2, the Saudi Sentiment Data set and the MSA Hotel Arabic Review Data set by 1.00%, 0.83% and 0.74%, respectively (Experiment 3). Next, we conduct an ablation study to determine the contribution to performance of text normalisation and the Sudanese stopword list (Experiment 4). For normalisation, this makes a difference of 0.43% on two-class and 0.45% on three-class. For the custom stoplist, the differences are 0.82% and 0.72%, respectively. Finally, the model is compared with other deep learning classifiers, including transformer-based language models for Arabic, and shown to be comparable for SudSenti2 (Experiment 5).
Citations: 0
A social diagnosis mechanism for healthcare knowledge sharing
Zone 4 (Management) Q3 COMPUTER SCIENCE, INFORMATION SYSTEMS Pub Date: 2023-10-19 DOI: 10.1177/01655515231199929
Lien-Fa Lin, Yung-Ming Li, Yen-Chen Lin
In recent years, social networks have grown rapidly, and their applications in the healthcare domain are increasingly proposed. Using the crowd wisdom generated from social networks, we can find similar and reliable people sharing helpful experiences. The existing dedicated social networking services for health mainly focus on sharing, but not categorising and extracting. In this research, we construct an environment for social knowledge sharing and expert referring. Analysing queries from online public health databases and the factors of health similarity, social reliability and social intimacy, we extract health knowledge to recommend relevant social knowledge (also called threads) and helpful experts providing consulting. Specifically, the proposed social diagnosis mechanism helps the health seeker to identify relevant threads and recommends enthusiastic experts for healthcare support. Experimental results reveal that the proposed mechanism can effectively improve healthcare knowledge sharing and realise diagnosis support from the crowd.
Citations: 0
A fieldwork manual as a regulatory device: Instructing, prescribing and describing documentation work
Zone 4 (Management) Q3 COMPUTER SCIENCE, INFORMATION SYSTEMS Pub Date: 2023-10-19 DOI: 10.1177/01655515231203506
Isto Huvila, Olle Sköld
Research on how archaeological fieldwork manuals, a sub-category of methods handbooks, regulate research documentation is limited. Qualitative content analysis of 25 English-language archaeological field manuals from the early 1900s to 2010s showed that they instruct how to describe the documentation work, prescribe practices and workflows, and function as often pre-coordinated descriptions of work. A manual forms a ‘working space’ that is sometimes adopted as such by following the detailed advice given in some of the texts but likely more often used as a more general point of reference. The fact that many manuals do not provide exact recipes for the fieldwork as a whole means that they function as comprehensive representations and documentation (paradata) of actual fieldwork practices only when read in parallel with field documentation.
Citations: 0
MNN4Rec: A relation-aware approach based on multi-view news network for news recommendation
Zone 4 (Management) Q3 COMPUTER SCIENCE, INFORMATION SYSTEMS Pub Date: 2023-10-18 DOI: 10.1177/01655515231182072
Hao Jiang, Chuanzhen Li, Juanjuan Cai, Jingling Wang
Personalised news recommendation comprises two crucial components: news understanding and user modelling. Previous studies have attempted to model news understanding and user interests using various internal news information and external knowledge graphs (KG). However, they have overlooked the collaborative function of the external KG and the internal information among diverse news and user behaviours, resulting in serious news cold-start problems and poor interpretability of user interests. To address these issues, this article proposes a novel approach called Relation-Aware Approach based on Multi-view News Network for News Recommendation (MNN4Rec). Specifically, MNN4Rec first constructs a Multi-view News Network (MNN), which includes candidate news and user-clicked news, and represents their exclusive multi-view information as heterogeneous nodes. Furthermore, we develop explicit and implicit news relationships and design a special sampling algorithm to search for news co-neighbours. We then use a novel dual-channel graph attention mechanism to obtain the fine-grained news understanding representation. Moreover, we construct explainable user interests by modelling the interaction of user-clicked news through the multi-headed self-attention mechanism in both semantic and relation levels. Finally, we match candidate news understanding with user interests to generate a prediction score for recommendation. Experimental results on Microsoft’s news data set MIND demonstrate that MNN4Rec outperforms existing news-recommendation methods while also mitigating the cold-start problem and enhancing the interpretability of user interests. Our code is available at https://github.com/JiangHaoPG11/MNN4Rec_code .
Citations: 0
CHV.br: Exploratory study for the development of a consumer health vocabulary (CHV) supported by a network model for Brazilian Portuguese language
Zone 4 (Management) Q3 COMPUTER SCIENCE, INFORMATION SYSTEMS Pub Date: 2023-09-30 DOI: 10.1177/01655515231196391
Josceli M Tenorio, Fabrício Landi de Moraes, Ivan Torres Pisa
Successful consumer health vocabulary (CHV) models have been engineered and updated by using automatic term extraction techniques from online content. However, the relationship between terms has yet to be mapped. This study aims to describe a CHV model for the Brazilian Portuguese language that is supported by a complex network. The method was split up into three distinct stages: (1) collect and automatically extract terms from structured data sources on the web, such as Unified Medical Language System (UMLS) vocabularies and DBpedia; (2) construct a complex network; and (3) select the terms supported by clustering techniques. A model called CHV.br was developed and supported by a complex network structure which makes connections between the controlled vocabulary and consumer vocabulary and maps semantic relationships as categories, synonyms and related terms. CHV.br contains 146,956 terms, of which 31,439 are UMLS preferred terms and 83,279 are synonyms. The CHV.br is available and powered by Simple Knowledge Organization System and Resource Description Framework standards. The method used in this study showed to be valid for the selection of the candidate terms by connecting the terms from different reliable resources, in addition to expanding the number of terms and their semantic relationships. The content and structure of CHV.br could play a vital role in enhancing the development of consumer-oriented health applications.
Citations: 0
Automatic author name disambiguation by differentiable feature selection
Zone 4 (Management) Q3 COMPUTER SCIENCE, INFORMATION SYSTEMS Pub Date: 2023-09-19 DOI: 10.1177/01655515231193859
ZhiJian Fang, Yue Zhuo, Jinying Xu, Zhechong Tang, Zijie Jia, HuaXiong Zhang
Author name disambiguation (AND) is the task of resolving the ambiguity problem in bibliographic databases, where distinct real-world authors may share the same name or same author may have distinct names. The aim of AND is to split the name-ambiguous entities (articles) into the corresponding authors. Existing AND algorithms mainly focus on designing different similarity metrics between two ambiguous articles. However, most previous methods empirically select and process the features of entities, then use features to predict the similarity by data-driven models. In this article, we are motivated by natural questions: Which features are most useful for splitting name-ambiguous entities? Can they be automatically determined by an optimisation approach rather than heuristic feature engineering? Therefore, we proposed a novel end-to-end differentiable feature selection algorithm, automatically searching the optimal features for AND task (AAND). AAND optimises the discrete feature selection by differentiable Gumbel-Softmax, leading to the joint learning of feature selection policy and similarity prediction model. The experiments are conducted on a benchmark data set, S2AND, which harmonises eight different AND data sets. The results show that the performance of our proposal is superior to the advanced AND methods and feature selection algorithms. Meanwhile, deep insights into AND features are also given.
Citations: 0
Discovering research data management trends from job advertisements using a text-mining approach
Zone 4 (Management) Q3 COMPUTER SCIENCE, INFORMATION SYSTEMS Pub Date: 2023-09-15 DOI: 10.1177/01655515231193845
Naseema Sheriff, R Sevukan
In today’s data-driven culture, research data management (RDM) is essential for the research community. The demand for reusing research datasets is a challenging and diverse process for the scientific community. Despite this, it is essential in RDM to discover trends and themes using text mining, which is scarce. The purpose of this study is to employ text mining to discover insights from job advertisements associated with RDM profiles, which collected 810 advertisements. We found RDM-related patterns using latent Dirichlet allocation (LDA) and identified three key contexts. The first is ‘research services in libraries’, with the topics of research services, research information, research universities, collection processes and library services. The second context is ‘research data’, which includes RDM, business data, university data, research data, health research, science research, social science research, data centres, data services, statistical software, digital scholarship and digital preservation. The third context is ‘workplace environment’, and the topics are leadership, work development and scientific position. Job title normalisation reveals names such as ‘data librarian’, ‘librarian’, ‘director’, ‘data curator’, ‘data manager’, ‘research data librarian’, ‘data specialist’ and ‘data officer’ are frequently employed. Focusing on titles with a single or double occurrence is new and interesting for developing nations. Reputable institutions such as Harvard, Stanford and the Massachusetts Institute of Technology, as well as countries such as the United States, the United Kingdom, Canada and Germany, are the major participants in RDM practises and services. This discovery will assist higher education institutions, RDM stakeholders, which aid in the formulation of curriculum, and job seekers to familiarise themselves with the themes.
Citations: 0
A model of planned and unplanned information-seeking behaviour
Zone 4 (Management) Q3 COMPUTER SCIENCE, INFORMATION SYSTEMS Pub Date: 2023-09-11 DOI: 10.1177/01655515231196390
Hadi Harati, Alireza Isfandyari-Moghaddam
This article presents a model of information-seeking behaviour that emphasises both the unplanned and the planned behaviour of users of library resources and services. Based on a review of the literature and of previous information behaviour models, such as those of Wilson, Ellis, Kuhlthau and Dervin, it proposes a novel model of information-seeking behaviour for library users. The model was developed by combining existing models of planned information-seeking behaviour with a focus on the factors affecting unplanned, rather than planned, behaviour of users in accessing resources or services. The proposed model for the information-seeking behaviour of clients has two main parts. The first part covers planned behaviour resulting from a problem or a specific information need, in response to which the user seeks information in a planned manner. The second part deals with unplanned behaviour shaped by a hidden or uncertain information need. Finally, both types of behaviour can result in the discovery, extraction, collection and use of information.
Citations: 0
Information architecture on the websites of major American political parties: Qualitative-heuristic assessment and comparative analysis
Zone 4 (Management) Q3 COMPUTER SCIENCE, INFORMATION SYSTEMS Pub Date: 2023-09-11 DOI: 10.1177/01655515231196392
Paulina Sajna-Kosobucka, Radoslaw Sajna-Kunowsky
The Internet has become a very important tool of modern political communication. Political parties use it for a variety of purposes, including relationships with their supporters and potential voters. Therefore, the website of any political party is an important channel of communication. Such a website should be user-friendly for various audiences. The purpose of the article is to assess the quality of information architecture (IA) on the websites of the two major American political parties: the Republican Party and the Democratic Party. Triangulation of methods was used in the research: a qualitative-heuristic assessment of the IA on the mentioned websites, expert assessment and a comparative analysis. The examined websites noticeably lack some important components of IA. The website of the Republican Party is slightly better in terms of quality, by 14.09%, than the website of the Democratic Party. The results may affect not only the assessment of the technological advancement of parties but also their image and public perception, and therefore a given party's effectiveness in reaching voters. This approach should contribute to the development of website research in order to improve the quality of user experience and information processes.
Citations: 0