
Journal of Intelligent Information Systems: Latest Articles

CMC-MMR: multi-modal recommendation model with cross-modal correction
IF 3.4, Zone 3 (Computer Science), Q2 Computer Science, Pub Date: 2024-02-20, DOI: 10.1007/s10844-024-00848-x

Abstract

Multi-modal recommendation, which uses multi-modal features (e.g., image and text features), has received significant attention and has been shown to yield more effective recommendations. However, multi-modal recommendation currently has the following problems: (1) Multi-modal recommendation models often handle the raw data of individual modalities directly, so noise degrades the model’s effectiveness and the interconnections between modalities go unexplored; (2) Different users have different preferences, so it is impractical to treat all modalities equally, as this could interfere with the model’s ability to make recommendations. To address these problems, this paper proposes a multi-modal recommendation model with cross-modal correction (CMC-MMR). First, to reduce the effect of noise in the raw data and to take full advantage of the relationships between modalities, we design a cross-modal correction module that denoises and corrects the modalities using a cross-modal correction mechanism. Second, the similarity between the same modalities of each item is used as a benchmark to build item-item graphs for each modality, and user-item graphs with degree-sensitive pruning strategies are also built to mine higher-order information. Finally, we design a self-supervised task to adaptively mine user preferences for each modality. We conducted comparative experiments with eleven baseline models on four real-world datasets. The experimental results show that CMC-MMR improves by 6.202%, 4.975%, 6.054% and 11.368% on average on the four datasets, respectively, demonstrating the effectiveness of CMC-MMR.
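
The item-item graph step can be illustrated with a small sketch. It is not the authors' implementation: the abstract does not say which similarity measure or neighbour count is used, so cosine similarity, the k-nearest-neighbour rule, and the names `cosine` and `knn_item_graph` are all assumptions made for illustration.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def knn_item_graph(features, k):
    """Map each item index to its k most similar other items (one modality)."""
    graph = {}
    for i, fi in enumerate(features):
        scored = [(cosine(fi, fj), j) for j, fj in enumerate(features) if j != i]
        # Highest similarity first; ties broken by the lower item index.
        scored.sort(key=lambda pair: (-pair[0], pair[1]))
        graph[i] = [j for _, j in scored[:k]]
    return graph


# Hypothetical image features for four items; values are made up.
image_features = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
image_graph = knn_item_graph(image_features, k=1)
print(image_graph)
```

Running the same function on text features would give a second, separate graph, which matches the abstract's description of one item-item graph per modality.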

Citations: 0
Querying knowledge graphs through positive and negative examples and feedback
IF 3.4, Zone 3 (Computer Science), Q2 Computer Science, Pub Date: 2024-02-15, DOI: 10.1007/s10844-024-00846-z
Akritas Akritidis, Yannis Tzitzikas

The formulation of structured queries over Knowledge Graphs is not an easy task. To alleviate this problem, we propose a novel interactive method for SPARQL query formulation that enables users (plain and advanced) to formulate queries gradually by providing examples and various kinds of positive and negative feedback, in a manner that does not presuppose knowledge of the query language or the contents of the Knowledge Graph. In comparison to other example-based query approaches, the distinctive features of our approach are the support of negative examples and the positive/negative feedback on the generated constraints. We detail the algorithmic aspect and present an interactive user interface that implements the approach. The application of the model on real datasets from DBpedia (Movies, Actors) and other datasets (scientific papers) showcases the feasibility and the effectiveness of the approach. A task-based evaluation that included users who are not familiar with SPARQL provided positive evidence that the interaction is easy to grasp and enabled most users to formulate the desired queries.

Citations: 0
Semantic-enhanced reasoning question answering over temporal knowledge graphs
IF 3.4, Zone 3 (Computer Science), Q2 Computer Science, Pub Date: 2024-02-02, DOI: 10.1007/s10844-024-00840-5
Chenyang Du, Xiaoge Li, Zhongyang Li

Question Answering Over Temporal Knowledge Graphs (TKGQA) is an important topic in question answering. TKGQA focuses on accurately understanding questions involving temporal constraints and retrieving accurate answers from knowledge graphs. In previous research, the hierarchical structure of question contexts and the constraints imposed by temporal information on different sentence components have been overlooked. In this paper, we propose a framework called “Semantic-Enhanced Reasoning Question Answering” (SERQA) to tackle this problem. First, we adopt a pretrained language model (LM) to obtain the question relation representation vector. Then, we leverage syntactic information from the constituent tree and dependency tree, in combination with Masked Self-Attention (MSA), to enhance temporal constraint features. Finally, we integrate the temporal constraint features into the question relation representation using an information fusion function for answer prediction. Experimental results demonstrate that SERQA achieves better performance on the CRONQUESTIONS and ImConstrainedQuestions datasets. In comparison with existing temporal KGQA methods, our model exhibits outstanding performance in comprehending temporal constraint questions. The ablation experiments verified the effectiveness of combining the constituent tree and the dependency tree with MSA in question answering.

Citations: 0
KIMedQA: towards building knowledge-enhanced medical QA models
IF 3.4, Zone 3 (Computer Science), Q2 Computer Science, Pub Date: 2024-01-25, DOI: 10.1007/s10844-024-00844-1
Aizan Zafar, Sovan Kumar Sahoo, Deeksha Varshney, Amitava Das, Asif Ekbal

Medical question-answering systems require the ability to extract accurate, concise, and comprehensive answers. They will better comprehend the complex text and produce helpful answers if they can reason on the explicit constraints described in the question’s textual context and the implicit, pertinent knowledge of the medical world. Integrating Knowledge Graphs (KGs) with Language Models (LMs) is a common approach to incorporating structured information sources. However, effectively combining and reasoning over KG representations and language context remains an open question. To address this, we propose the Knowledge Infused Medical Question Answering system (KIMedQA), which employs two techniques, viz. relevant knowledge graph selection and pruning of the large-scale graph, to handle Vector Space Inconsistent (VSI) and Excessive Knowledge Information (EKI). The representations of the query and context are then combined with the pruned knowledge network using a pre-trained language model to generate an informed answer. Finally, we demonstrate through in-depth empirical evaluation that our suggested strategy provides cutting-edge outcomes on two benchmark datasets, namely MASH-QA and COVID-QA. We also compared our results to ChatGPT, a robust and very powerful generative model, and discovered that our model outperforms ChatGPT according to the F1 Score and human evaluation metrics such as adequacy.

Citations: 0
Data- & compute-efficient deviance mining via active learning and fast ensembles
IF 3.4, Zone 3 (Computer Science), Q2 Computer Science, Pub Date: 2024-01-23, DOI: 10.1007/s10844-024-00841-4

Abstract

Detecting deviant traces in business process logs is crucial for modern organizations, given the harmful impact of deviant behaviours (e.g., attacks or faults). However, training a Deviance Prediction Model (DPM) solely with supervised learning methods is impractical in scenarios where only a few examples are labelled. To address this challenge, we propose an Active-Learning-based approach that leverages multiple DPMs and a temporal ensembling method that can train and merge them in a few training epochs. Our method needs expert supervision only for a few unlabelled traces exhibiting high prediction uncertainty. Tests on real data (of either complete or ongoing process instances) confirm the effectiveness of the proposed approach.
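
The idea of asking an expert only about traces with high prediction uncertainty can be sketched as follows. This is a generic uncertainty-sampling illustration, not the paper's algorithm: the abstract does not state how uncertainty is measured or how the DPMs are merged, so the averaging of ensemble outputs, the binary-entropy score, and the names `ensemble_mean`, `binary_entropy` and `select_queries` are assumptions.

```python
import math


def ensemble_mean(prob_lists):
    """Average the per-trace deviance probabilities over all ensemble members."""
    n_models = len(prob_lists)
    return [sum(column) / n_models for column in zip(*prob_lists)]


def binary_entropy(p):
    """Entropy in bits of a Bernoulli(p) prediction; 0 when fully certain."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -(p * math.log2(p) + (1.0 - p) * math.log2(1.0 - p))


def select_queries(prob_lists, budget):
    """Return indices of the `budget` unlabelled traces with highest entropy."""
    mean = ensemble_mean(prob_lists)
    ranked = sorted(range(len(mean)), key=lambda i: (-binary_entropy(mean[i]), i))
    return ranked[:budget]


# Hypothetical outputs of two DPMs on four unlabelled traces.
member_probs = [
    [0.95, 0.55, 0.10, 0.40],
    [0.90, 0.45, 0.05, 0.70],
]
to_label = select_queries(member_probs, budget=2)
print(to_label)
```

Traces whose averaged probability sits near 0.5 are ranked first, so the labelling budget goes to the cases the ensemble is least sure about.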

Citations: 0
A novel technique using graph neural networks and relevance scoring to improve the performance of knowledge graph-based question answering systems
IF 3.4, Zone 3 (Computer Science), Q2 Computer Science, Pub Date: 2024-01-22, DOI: 10.1007/s10844-023-00839-4
Sincy V. Thambi, P. C. Reghu Raj

A Knowledge Graph-based Question Answering (KGQA) system attempts to answer a given natural language question using a knowledge graph (KG) rather than text data. Current KGQA methods attempt to determine whether there is an explicit relationship between the entities in the question and a well-structured relationship between them in the KG. However, such strategies are difficult to build and train, limiting their consistency and versatility. The use of language models such as BERT has aided the advancement of natural language question answering. In this paper, we present a novel Graph Neural Network (GNN)-based approach with relevance scoring for improving KGQA. GNNs use the weights of nodes and edges to influence information propagation while updating the node features in the network. The suggested method comprises subgraph construction, weighting of nodes and edges, and pruning processes to obtain meaningful answers. A BERT-based GNN is used to build subgraph node embeddings. We tested the influence of weighting for both nodes and edges and observed that the system performs better for weighted graphs than for unweighted graphs. Additionally, we experimented with several GNN convolutional layers and obtained improved results by combining GENeralised Graph Convolution (GENConv) with node weights for simple questions. Extensive testing on benchmark datasets confirmed the effectiveness of the proposed model in comparison to state-of-the-art KGQA systems.
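
To make the role of node and edge weights in propagation concrete, here is a minimal sketch of one aggregation round. It is a plain weighted-mean update written for illustration, not GENConv and not the authors' model; the function `propagate` and the toy graph are hypothetical.

```python
def propagate(features, edges, node_weight):
    """One round of weighted-mean aggregation over incoming edges.

    features:    {node: feature vector}
    edges:       {(source, target): edge weight}
    node_weight: {node: relevance weight of that node}
    """
    updated = {}
    for node, own in features.items():
        total = list(own)  # the node keeps its own features with weight 1
        norm = 1.0
        for (src, dst), w_edge in edges.items():
            if dst != node:
                continue
            w = w_edge * node_weight[src]  # message strength from src
            total = [t + w * x for t, x in zip(total, features[src])]
            norm += w
        updated[node] = [t / norm for t in total]
    return updated


# Toy subgraph: two candidate nodes point at a question node "q".
features = {"q": [1.0, 0.0], "a": [0.0, 1.0], "b": [0.0, 0.0]}
edges = {("a", "q"): 1.0, ("b", "q"): 1.0}
node_weight = {"q": 1.0, "a": 1.0, "b": 0.0}  # "b" is scored as irrelevant
out = propagate(features, edges, node_weight)
print(out["q"])
```

Setting a node's weight to zero removes its influence on its neighbours without deleting it from the graph, which is one simple way relevance scores can steer propagation.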

Citations: 0
Sentiment analysis of twitter data to detect and predict political leniency using natural language processing
IF 3.4, Zone 3 (Computer Science), Q2 Computer Science, Pub Date: 2024-01-19, DOI: 10.1007/s10844-024-00842-3

Abstract

This paper analyses Twitter data to detect the political lean of a profile by extracting and classifying sentiments expressed through tweets. The work utilizes natural language processing, augmented with sentiment analysis algorithms and machine learning techniques, to classify specific keywords. The proposed methodology first performs data pre-processing, then applies multi-aspect sentiment analysis to compute the sentiment score of the extracted keywords, and finally classifies users into clusters based on their similarity score with respect to a sample user in each cluster. The proposed technique also predicts the sentiment of a profile towards unknown keywords and gauges the bias of an unidentified user towards political events or social issues. The technique was tested on a Twitter dataset with 1.72 million tweets taken from over 10,000 profiles, where it identified the political leniency of the user profiles with a 99% confidence level. It was also tested on a synthetic dataset with 2500 tweets, where the accuracy and F1 score were 0.99 and 0.985 respectively, and 0.97 and 0.975 when neutral users were also considered for classification. The paper also identifies the impact of political decisions on various clusters by analyzing the shift in the number of users belonging to the different clusters.
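
The step of assigning a user to a cluster by similarity to a sample user can be sketched as below. The abstract does not specify the similarity score, so cosine similarity over per-keyword sentiment scores is an assumption, and the names `similarity`, `assign_cluster`, the keyword labels and all scores are hypothetical.

```python
import math


def similarity(a, b):
    """Cosine similarity between two {keyword: sentiment score} profiles."""
    shared = set(a) & set(b)
    dot = sum(a[k] * b[k] for k in shared)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def assign_cluster(profile, prototypes):
    """Return the cluster whose sample user is most similar to the profile."""
    return max(sorted(prototypes), key=lambda name: similarity(profile, prototypes[name]))


# One sample user per cluster, described by keyword sentiment in [-1, 1].
prototypes = {
    "cluster_a": {"keyword_1": 0.8, "keyword_2": -0.6},
    "cluster_b": {"keyword_1": -0.7, "keyword_2": 0.9},
}
user = {"keyword_1": 0.5, "keyword_2": -0.2, "keyword_3": 0.1}
label = assign_cluster(user, prototypes)
print(label)
```

Keywords that only one of the two profiles mentions contribute nothing to the dot product, so a user is matched on the topics they share with each sample user.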

Citations: 0
A qualitative analysis of knowledge graphs in recommendation scenarios through semantics-aware autoencoders
IF 3.4, Zone 3 (Computer Science), Q2 Computer Science, Pub Date: 2024-01-19, DOI: 10.1007/s10844-023-00830-z
Vito Bellini, Eugenio Di Sciascio, Francesco Maria Donini, Claudio Pomo, Azzurra Ragone, Angelo Schiavone

Knowledge Graphs (KGs) have already proven their strength as a source of high-quality information for different tasks such as data integration, search, text summarization, and personalization. Another prominent research field that has been benefiting from the adoption of KGs is that of Recommender Systems (RSs). Feeding a RS with data coming from a KG improves recommendation accuracy, diversity, and novelty, and paves the way to the creation of interpretable models that can be used for explanations. This possibility of combining a KG with a RS raises the question whether such an addition can be performed in a plug-and-play fashion – also with respect to the recommendation domain – or whether each combination needs a careful evaluation. To investigate such a question, we consider all possible combinations of (i) three recommendation tasks (books, music, movies); (ii) three recommendation models fed with data from a KG (and in particular, a semantics-aware deep learning model, that we discuss in detail), compared with three baseline models without KG addition; (iii) two main encyclopedic KGs freely available on the Web: DBpedia and Wikidata. Supported by an extensive experimental evaluation, we show the final results in terms of accuracy and diversity of the various combinations, highlighting that the injection of knowledge does not always pay off. Moreover, we show how the choice of the KG, and the form of data in it, affect the results, depending on the recommendation domain and the learning model.

Citations: 0
Enhancing the fairness of offensive memes detection models by mitigating unintended political bias
IF 3.4, Zone 3 (Computer Science), Q2 Computer Science, Pub Date: 2024-01-06, DOI: 10.1007/s10844-023-00834-9

Abstract

This paper tackles the critical challenge of detecting and mitigating unintended political bias in offensive meme detection. Political memes are a powerful tool that can be used to influence public opinion and disrupt voters’ mindsets. However, current visual-linguistic models for offensive meme detection exhibit unintended bias and struggle to accurately classify non-offensive and offensive memes. This can harm the fairness of the democratic process either by targeting minority groups or by promoting harmful political ideologies. With Hindi being the fifth most spoken language globally and having a significant number of native speakers, it is essential to detect and remove Hindi-based offensive memes to foster a fair and equitable democratic process. To address these concerns, we propose three debiasing techniques to mitigate the overrepresentation of majority group perspectives while addressing the suppression of minority opinions in political discourse. To support our approach, we curate a comprehensive dataset called Pol_Off_Meme, designed especially for the Hindi language. Empirical analysis of this dataset demonstrates the efficacy of our proposed debiasing techniques in reducing political bias in internet memes, promoting a fair and equitable democratic environment. Our debiased model, named $DRTIM^{Adv}_{Att}$, exhibited superior performance compared to the CLIP-based baseline model. It improved the F1-score by 9.72% while reducing the False Positive Rate Difference (FPRD) by 16% and the False Negative Rate Difference (FNRD) by 14.01%. Our efforts strive to cultivate a more informed and inclusive political discourse, ensuring that all opinions, irrespective of their majority or minority status, receive adequate attention and representation.

Citations: 0
Movie tag prediction: An extreme multi-label multi-modal transformer-based solution with explanation
IF 3.4, Zone 3 (Computer Science), Q2 Computer Science, Pub Date: 2024-01-06, DOI: 10.1007/s10844-023-00836-7
Massimo Guarascio, Marco Minici, Francesco Sergio Pisani, Erika De Francesco, Pasquale Lambardi

Providing rich and accurate metadata for indexing media content is a crucial problem for all the companies offering streaming entertainment services. These metadata are commonly employed to enhance search engine results and feed recommendation algorithms to improve the matching with user interests. However, the problem of labeling multimedia content with informative tags is challenging as the labeling procedure, manually performed by domain experts, is time-consuming and prone to error. Recently, the adoption of AI-based methods has been demonstrated to be an effective approach for automating this complex process. However, developing an effective solution requires coping with different challenging issues, such as data noise and the scarcity of labeled examples during the training phase. In this work, we address these challenges by introducing a Transformer-based framework for multi-modal multi-label classification enriched with model prediction explanation capabilities. These explanations can help the domain expert to understand the system’s predictions. Experimentation conducted on two real test cases demonstrates its effectiveness.

Citations: 0