Pub Date: 2024-03-16 DOI: 10.1007/s10844-024-00850-3
Hossein Hajibabaei, Vahid Seydi, Abbas Koochari
Community detection in complex networks is an important task for discovering hidden information in network analysis. Neighborhood density between nodes is one of the fundamental indicators of community presence in a network. A community with high edge density will have correlations between nodes that extend beyond their immediate neighbors, which are captured by motifs. Motifs are repetitive patterns of edges observed with high frequency in the network. We propose the PCDMS method (Probabilistic Community Detection with Motif Structure), which detects communities by estimating the triangular motifs in the network. This study employs structural density between nodes, a key concept in graph analysis. The proposed model has the advantage of using a probabilistic generative model that calculates the latent parameters of the probabilistic model and determines the community based on the likelihood of triangular motifs. The relationship between observing two pairs of nodes in multiple communities leads to an increasing likelihood estimation of the existence of a motif structure between them. The output of the proposed model is the intensity of each node in the communities. The efficiency and validity of the proposed method are evaluated through experiments on both synthetic and real-world networks; the findings show that the communities identified by the proposed method are more accurate and denser than those of other algorithms under the modularity, NMI, and F1-score evaluation metrics.
Title: A motif-based probabilistic approach for community detection in complex networks. Journal: Journal of Intelligent Information Systems.
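The triangular motif that the PCDMS abstract builds on can be made concrete with a small sketch. The code below is not the PCDMS model itself (the abstract does not give its generative equations); it only shows, under the assumption of an undirected, unweighted edge list, how to count the triangles each edge takes part in. The function name `triangle_weights` and the toy graph are hypothetical.

```python
def triangle_weights(edges):
    """Return {edge: number of triangles that contain that edge}.

    Assumes an undirected simple graph given as a list of (u, v) pairs.
    """
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)

    weights = {}
    for u, v in edges:
        key = tuple(sorted((u, v)))
        # Every common neighbour of u and v closes one triangle on (u, v).
        weights[key] = len(adj[u] & adj[v])
    return weights


# Toy graph: one triangle (1, 2, 3) plus a pendant edge (3, 4).
edges = [(1, 2), (2, 3), (1, 3), (3, 4)]
weights = triangle_weights(edges)
# Each triangle is counted once per edge, hence the division by 3.
total_triangles = sum(weights.values()) // 3
```

Edges with a high triangle count sit in dense regions, which is the intuition behind using motifs as evidence of community membership.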
Pub Date: 2024-03-15 DOI: 10.1007/s10844-024-00852-1
Serhat Hakki Akdag, Nihan Kesim Cicekli
In this paper, we present a methodology for the early detection of fake news on emerging topics through the innovative application of weak supervision. Traditional techniques for fake news detection often rely on fact-checkers or supervised learning with labeled data, which is not readily available for emerging topics. To address this, we introduce the Weakly Supervised Text Classification framework (WeSTeC), an end-to-end solution designed to programmatically label large-scale text datasets within specific domains and train supervised text classifiers using the assigned labels. The proposed framework automatically generates labeling functions through multiple weak labeling strategies and eliminates underperforming ones. Labels assigned through the generated labeling functions are then used to fine-tune a pre-trained RoBERTa classifier for fake news detection. By using a weakly labeled dataset, which contains fake news related to the emerging topic, the trained fake news detection model becomes specialized for the topic under consideration. We explore both semi-supervision and domain adaptation setups, utilizing small amounts of labeled data and labeled data from other domains, respectively. The fake news classification model generated by the proposed framework excels when compared with all baselines in both setups. In addition, when compared to its fully supervised counterpart, our fake news detection model trained through weak labels achieves accuracy within 1%, emphasizing the robustness of the proposed framework’s weak labeling capabilities.
Title: Early detection of fake news on emerging topics through weak supervision. Journal: Journal of Intelligent Information Systems.
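The weak-labeling pipeline described in the WeSTeC abstract (generate labeling functions, drop underperforming ones, combine the rest into labels) can be sketched roughly as follows. This is a minimal illustration, not the WeSTeC implementation: the three keyword-based labeling functions, the 0.5 accuracy threshold, the tiny development set, and the majority-vote combiner are all assumptions made for the example.

```python
ABSTAIN, REAL, FAKE = -1, 0, 1


# Hypothetical labeling functions; each votes FAKE, REAL, or abstains.
def lf_clickbait(text):
    return FAKE if "shocking" in text.lower() else ABSTAIN


def lf_cites_source(text):
    return REAL if "according to" in text.lower() else ABSTAIN


def lf_all_caps(text):
    return FAKE if text.isupper() else ABSTAIN


def accuracy(lf, labeled):
    """Accuracy of one labeling function on the examples where it does not abstain."""
    votes = [(lf(text), label) for text, label in labeled if lf(text) != ABSTAIN]
    if not votes:
        return 0.0
    return sum(pred == label for pred, label in votes) / len(votes)


def majority_label(text, lfs):
    """Combine the non-abstaining votes by simple majority; ties abstain."""
    votes = [v for v in (lf(text) for lf in lfs) if v != ABSTAIN]
    fake = sum(v == FAKE for v in votes)
    real = len(votes) - fake
    if fake == real:
        return ABSTAIN
    return FAKE if fake > real else REAL


# A small labeled development set (invented for the example).
dev = [
    ("SHOCKING cure found", FAKE),
    ("According to the ministry, rates fell", REAL),
    ("BREAKING NEWS TODAY", REAL),
]

all_lfs = [lf_clickbait, lf_cites_source, lf_all_caps]
# Eliminate labeling functions that perform poorly on the development set.
kept = [lf for lf in all_lfs if accuracy(lf, dev) >= 0.5]

weak_label = majority_label("Shocking claim spreads online", kept)
no_label = majority_label("Weather update", kept)
```

The weak labels produced this way would then serve as training targets for a supervised classifier; the abstract names a fine-tuned RoBERTa model for that step.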
Recognizing crucial seed spreaders in complex networks is an open problem that involves studying the dynamic spreading process and analyzing network performance. However, most existing methods build the hierarchical model from node degree, as in K-shell decomposition, to obtain global information, and they identify the effect of each layer’s weight only coarsely. In addition, local structural information is not captured effectively when neighborhood nodes are unconnected in the hierarchical structure. To solve these issues, we design a novel hierarchical structure based on the shortest path distance by using the interpretative structure model, and we determine the influence weight of each layer. Furthermore, we design the local neighborhood overlap coefficient and the local index based on the overlap (LIO) by considering both connected and unconnected neighborhood nodes in the hierarchical structure. To reach a comprehensive recognition and find crucial seed spreaders precisely, we combine the influence weight vector, the normalized local evaluation index matrix, and the weight vector of local indexes in a new hybrid recognition framework. We evaluate the diffusion ability of the proposed method on different datasets with a series of indicators, including the monotonicity relation, the Susceptible-Infected-Susceptible model, the complementary cumulative distribution function, Kendall’s coefficient, the spreading scale ratio, and the average shortest path length. Results demonstrate that our method outperforms the compared algorithms in recognition effect and spreading capability.
Title: A hybrid recognition framework of crucial seed spreaders in complex networks with neighborhood overlap. Authors: Tianchi Tong, Min Wang, Wenying Yuan, Qian Dong, Jinsheng Sun, Yuan Jiang. Journal: Journal of Intelligent Information Systems. Pub Date: 2024-03-15. DOI: 10.1007/s10844-024-00849-w.
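The abstract above contrasts its approach with degree-based hierarchies such as K-shell decomposition. For readers unfamiliar with that baseline, here is a minimal sketch of the standard K-shell peeling procedure. It illustrates the baseline only, not the hybrid framework proposed in the paper; the function name `k_shell` and the toy graph are hypothetical.

```python
def k_shell(adjacency):
    """Assign each node its k-shell index by iterative peeling.

    `adjacency` maps each node to the set of its neighbours (undirected graph).
    """
    adj = {u: set(vs) for u, vs in adjacency.items()}  # work on a copy
    shell = {}
    k = 0
    while adj:
        # The shell index never decreases as the peeling proceeds.
        k = max(k, min(len(vs) for vs in adj.values()))
        while True:
            peel = [u for u, vs in adj.items() if len(vs) <= k]
            if not peel:
                break
            for u in peel:
                shell[u] = k
                for v in adj.pop(u):
                    if v in adj:  # neighbour may already have been peeled
                        adj[v].discard(u)
    return shell


# Toy graph: a triangle (1, 2, 3) with a pendant node 4 attached to node 3.
graph = {1: {2, 3}, 2: {1, 3}, 3: {1, 2, 4}, 4: {3}}
shells = k_shell(graph)
```

Nodes 1, 2, and 3 receive the same shell index here, which shows the coarse granularity of purely degree-based layering that the abstract points to.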
Pub Date: 2024-03-02 DOI: 10.1007/s10844-024-00851-2
Kenniy Olorunnimbe, Herna Viktor
The accuracy of price forecasts is important for financial market trading strategies and portfolio management. Compared to traditional models such as ARIMA and other state-of-the-art deep learning techniques, temporal Transformers with similarity embedding perform better for multi-horizon forecasts in financial time series, as they account for the conditional heteroscedasticity inherent in financial data. Despite this, the methods employed in generating these forecasts must be optimized to achieve the highest possible level of precision. One approach that has been shown to improve the accuracy of machine learning models is ensemble techniques. To this end, we present an ensemble approach that efficiently utilizes the available data over an extended timeframe. Our ensemble combines multiple temporal Transformer models learned within sliding windows, thereby making optimal use of the data. As combination methods, along with an averaging approach, we also introduced a stacking meta-learner that leverages a quantile estimator to determine the optimal weights for combining the base models of smaller windows. By decomposing the constituent time series of an extended timeframe, we optimize the utilization of the series for financial deep learning. This simplifies the training process of a temporal Transformer model over an extended time series while achieving better performance, particularly when accounting for the non-constant variance of financial time series. Our experiments, conducted across volatile and non-volatile extrapolation periods, using 20 companies from the Dow Jones Industrial Average show more than 40% and 60% improvement in predictive performance compared to the baseline temporal Transformer.
Title: Ensemble of temporal Transformers for financial time series. Journal: Journal of Intelligent Information Systems.
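Three ingredients named in the abstract above (sliding windows, an averaging combiner, and quantile-based evaluation) can be sketched in a few lines. This is only an illustration under stated assumptions: the base models are replaced by fixed forecast lists, the quantile objective is taken to be the usual pinball loss, and all function names are hypothetical. The paper's stacking meta-learner is not reproduced.

```python
def sliding_windows(series, window, step):
    """Split a series into overlapping windows of fixed length."""
    return [series[i:i + window] for i in range(0, len(series) - window + 1, step)]


def average_ensemble(forecasts):
    """Combine per-model forecasts (equal horizon) by an unweighted mean."""
    return [sum(values) / len(values) for values in zip(*forecasts)]


def pinball_loss(y_true, y_pred, q):
    """Quantile (pinball) loss for quantile level q in (0, 1)."""
    total = 0.0
    for y, p in zip(y_true, y_pred):
        diff = y - p
        # Under-prediction is weighted by q, over-prediction by (1 - q).
        total += q * diff if diff >= 0 else (q - 1) * diff
    return total / len(y_true)


windows = sliding_windows([1, 2, 3, 4, 5], window=3, step=1)
combined = average_ensemble([[1.0, 2.0], [3.0, 4.0]])
loss_under = pinball_loss([10.0], [8.0], q=0.9)   # forecast too low
loss_over = pinball_loss([10.0], [12.0], q=0.9)   # forecast too high
```

At `q=0.9` an under-prediction costs nine times as much as an over-prediction of the same size, which is why quantile objectives suit series with non-constant variance.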
Pub Date: 2024-02-29 DOI: 10.1007/s10844-024-00843-2
Divya Kumari, Asif Ekbal
Producing a high-quality review translation is a multifaceted process. It goes beyond successful semantic transfer and requires conveying the original message’s tone and style in a way that resonates with the target audience, whether they are human readers or Natural Language Processing (NLP) applications. Capturing these subtle nuances of the review text demands a deeper understanding and better encoding of the source message. In order to achieve this goal, we explore the use of self-supervised masked language modeling (MLM) and a variant called polarity masked language modeling (p-MLM) as auxiliary tasks in a multi-learning setup. MLM is widely recognized for its ability to capture rich linguistic representations of the input and has been shown to achieve state-of-the-art accuracy in various language understanding tasks. Motivated by its effectiveness, in this paper we adopt joint learning, combining the neural machine translation (NMT) task with source polarity-masked language modeling within a shared embedding space to induce a deeper understanding of the emotional nuances of the text. We analyze the results and observe that our multi-task model indeed exhibits a better understanding of linguistic concepts like sentiment and emotion. Intriguingly, this is achieved even without explicit training on sentiment-annotated or domain-specific sentiment corpora. Our multi-task NMT model consistently improves the translation quality of affect sentences from diverse domains in three language pairs.
Title: Enhancing sentiment and emotion translation of review text through MLM knowledge integration in NMT. Journal: Journal of Intelligent Information Systems.
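The abstract does not spell out how polarity masked language modeling (p-MLM) selects the tokens to mask. One plausible reading, assumed here purely for illustration, is that polarity-bearing words are masked preferentially so the encoder must recover them from context. The lexicon, the `[MASK]` symbol, and the function name below are hypothetical.

```python
# Hypothetical polarity lexicon; a real setup would use a much larger resource.
POLARITY_LEXICON = {"great", "terrible", "love", "hate", "awful", "excellent"}


def polarity_mask(tokens, lexicon=POLARITY_LEXICON, mask_token="[MASK]"):
    """Mask polarity words and return (masked tokens, prediction targets).

    Targets are None at positions that the auxiliary task does not predict.
    """
    masked, targets = [], []
    for token in tokens:
        if token.lower() in lexicon:
            masked.append(mask_token)
            targets.append(token)
        else:
            masked.append(token)
            targets.append(None)
    return masked, targets


masked, targets = polarity_mask(["The", "battery", "is", "terrible"])
```

In a multi-task setup, the loss on these masked positions would be added to the translation loss while both tasks share the source embeddings.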
Pub Date: 2024-02-20 DOI: 10.1007/s10844-024-00848-x
Multi-modal recommendation using multi-modal features (e.g., image and text features) has received significant attention and has been shown to produce more effective recommendations. However, multi-modal recommendation currently has the following problems: (1) Multi-modal recommendation models often handle the raw data of individual modalities directly, so noise degrades the model’s effectiveness and the interconnections between modalities go unexplored; (2) Different users have different preferences, so it is impractical to treat all modalities equally, as this could interfere with the model’s ability to make recommendations. To address these problems, this paper proposes a multi-modal recommendation model with cross-modal correction (CMC-MMR). First, to reduce the effect of noise in the raw data and to take full advantage of the relationships between modalities, we design a cross-modal correction module that denoises and corrects the modalities using a cross-modal correction mechanism. Second, the similarity between the same modalities of each item is used as a benchmark to build item-item graphs for each modality, and user-item graphs with degree-sensitive pruning strategies are also built to mine higher-order information. Finally, we design a self-supervised task to adaptively mine user preferences for each modality. We conducted comparative experiments with eleven baseline models on four real-world datasets. The experimental results show that CMC-MMR improves by 6.202%, 4.975%, 6.054%, and 11.368% on average on the four datasets, respectively, demonstrating its effectiveness.
Title: CMC-MMR: multi-modal recommendation model with cross-modal correction. Journal: Journal of Intelligent Information Systems.
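The item-item graph construction mentioned in the CMC-MMR abstract can be illustrated with a nearest-neighbour sketch for a single modality. The abstract does not state the similarity measure or the neighbourhood size, so cosine similarity and top-`k` selection are assumptions here, as are the function names and the toy feature vectors.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def knn_item_graph(features, k):
    """Link each item to its k most similar items within one modality."""
    graph = {}
    for item, vec in features.items():
        sims = [(cosine(vec, other_vec), other)
                for other, other_vec in features.items() if other != item]
        sims.sort(reverse=True)  # most similar first
        graph[item] = [other for _, other in sims[:k]]
    return graph


# Toy image-feature vectors for three items (invented for the example).
image_features = {"a": [1.0, 0.0], "b": [0.9, 0.1], "c": [0.0, 1.0]}
image_graph = knn_item_graph(image_features, k=1)
```

Building one such graph per modality (image, text, and so on) gives the separate item-item structures that the abstract describes.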
Pub Date: 2024-02-15 DOI: 10.1007/s10844-024-00846-z
Akritas Akritidis, Yannis Tzitzikas
The formulation of structured queries over Knowledge Graphs is not an easy task. To alleviate this problem, we propose a novel interactive method for SPARQL query formulation that enables users (plain and advanced) to formulate queries gradually by providing examples and various kinds of positive and negative feedback, in a manner that does not presuppose knowledge of the query language or the contents of the Knowledge Graph. In comparison to other example-based query approaches, the distinctive features of our approach are the support of negative examples and the positive/negative feedback on the generated constraints. We detail the algorithmic aspect and present an interactive user interface that implements the approach. The application of the model to real datasets from DBpedia (Movies, Actors) and other datasets (scientific papers) showcases the feasibility and the effectiveness of the approach. A task-based evaluation that included users who are not familiar with SPARQL provided positive evidence that the interaction is easy to grasp and enabled most users to formulate the desired queries.
Title: Querying knowledge graphs through positive and negative examples and feedback. Journal: Journal of Intelligent Information Systems.
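The idea of deriving a query from examples can be sketched on a toy in-memory graph. This is a rough illustration, not the algorithm of the paper: it assumes that candidate constraints are the (predicate, object) pairs shared by all positive examples, and it checks that the negative examples fall outside the result. The entities `m1` to `m3`, the resources `dbr:A`, `dbr:B`, and `dbr:US`, and the function names are hypothetical; prefix declarations are omitted from the generated query.

```python
def common_constraints(kg, positives):
    """Constraints (predicate, object) satisfied by every positive example."""
    return set.intersection(*(kg[entity] for entity in positives))


def matches(kg, constraints):
    """Entities of the toy graph that satisfy all constraints."""
    return {entity for entity, facts in kg.items() if constraints <= facts}


def to_sparql(constraints):
    """Render the constraints as a SPARQL SELECT query (prefixes omitted)."""
    patterns = " ".join(f"?x {p} {o} ." for p, o in sorted(constraints))
    return f"SELECT ?x WHERE {{ {patterns} }}"


# Toy knowledge graph: entity -> set of (predicate, object) pairs.
kg = {
    "m1": {("rdf:type", "dbo:Film"), ("dbo:starring", "dbr:A")},
    "m2": {("rdf:type", "dbo:Film"), ("dbo:starring", "dbr:A"),
           ("dbo:country", "dbr:US")},
    "m3": {("rdf:type", "dbo:Film"), ("dbo:starring", "dbr:B")},
}

constraints = common_constraints(kg, positives=["m1", "m2"])
result = matches(kg, constraints)
negatives_excluded = all(entity not in result for entity in ["m3"])
query = to_sparql(constraints)
```

In an interactive setting, the user's positive or negative feedback on individual constraints would add or remove entries from `constraints` before the query is regenerated.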
Pub Date: 2024-02-02 DOI: 10.1007/s10844-024-00840-5
Chenyang Du, Xiaoge Li, Zhongyang Li
Question Answering Over Temporal Knowledge Graphs (TKGQA) is an important topic in question answering. TKGQA focuses on accurately understanding questions involving temporal constraints and retrieving accurate answers from knowledge graphs. In previous research, the hierarchical structure of question contexts and the constraints imposed by temporal information on different sentence components have been overlooked. In this paper, we propose a framework called “Semantic-Enhanced Reasoning Question Answering” (SERQA) to tackle this problem. First, we adopt a pretrained language model (LM) to obtain the question relation representation vector. Then, we leverage syntactic information from the constituent tree and dependency tree, in combination with Masked Self-Attention (MSA), to enhance temporal constraint features. Finally, we integrate the temporal constraint features into the question relation representation using an information fusion function for answer prediction. Experimental results demonstrate that SERQA achieves better performance on the CRONQUESTIONS and ImConstrainedQuestions datasets. In comparison with existing temporal KGQA methods, our model exhibits outstanding performance in comprehending temporal constraint questions. The ablation experiments verified the effectiveness of combining the constituent tree and the dependency tree with MSA in question answering.
Title: Semantic-enhanced reasoning question answering over temporal knowledge graphs. Journal: Journal of Intelligent Information Systems.
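Masked Self-Attention (MSA), which the SERQA abstract combines with syntactic trees, can be illustrated in a stripped-down form. The sketch below assumes a single head with no learned projections (queries, keys, and values are all the input vectors) and a boolean mask saying which positions each token may attend to; in SERQA the mask would presumably come from the constituent and dependency trees, but that construction is not given in the abstract. Every row of the mask must allow at least one position.

```python
import math


def masked_self_attention(x, mask):
    """Scaled dot-product self-attention restricted by a boolean mask.

    x    : list of n vectors of dimension d (used as queries, keys, and values)
    mask : n x n booleans; mask[i][j] is True if token i may attend to token j
    """
    n, d = len(x), len(x[0])
    output = []
    for i in range(n):
        scores = [
            sum(a * b for a, b in zip(x[i], x[j])) / math.sqrt(d)
            if mask[i][j] else float("-inf")
            for j in range(n)
        ]
        top = max(scores)  # subtract the maximum for numerical stability
        exps = [math.exp(s - top) for s in scores]  # masked positions become 0.0
        total = sum(exps)
        weights = [e / total for e in exps]
        output.append([sum(w * x[j][c] for j, w in enumerate(weights))
                       for c in range(d)])
    return output


tokens = [[1.0, 0.0], [0.0, 1.0]]
# Token 0 may attend to both tokens, token 1 only to itself.
mask = [[True, True], [False, True]]
attended = masked_self_attention(tokens, mask)
```

Because token 1 can only see itself, its output equals its input, while token 0 becomes a weighted mix of both tokens.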
Medical question-answering systems must extract accurate, concise, and comprehensive answers. They comprehend complex text better and produce more helpful answers when they can reason over both the explicit constraints described in the question’s textual context and the implicit, pertinent knowledge of the medical domain. Integrating Knowledge Graphs (KGs) with Language Models (LMs) is a common approach to incorporating structured information sources. However, effectively combining and reasoning over KG representations and language context remains an open question. To address this, we propose the Knowledge Infused Medical Question Answering system (KIMedQA), which employs two techniques, namely relevant knowledge graph selection and pruning of the large-scale graph, to handle Vector Space Inconsistent (VSI) and Excessive Knowledge Information (EKI). The representations of the query and context are then combined with the pruned knowledge network using a pre-trained language model to generate an informed answer. Finally, we demonstrate through in-depth empirical evaluation that our approach achieves state-of-the-art results on two benchmark datasets, MASH-QA and COVID-QA. We also compared our results with ChatGPT, a powerful generative model, and found that our model outperforms ChatGPT on F1 score and on human evaluation metrics such as adequacy.
{"title":"KIMedQA: towards building knowledge-enhanced medical QA models","authors":"Aizan Zafar, Sovan Kumar Sahoo, Deeksha Varshney, Amitava Das, Asif Ekbal","doi":"10.1007/s10844-024-00844-1","DOIUrl":"https://doi.org/10.1007/s10844-024-00844-1","url":null,"abstract":"<p>Medical question-answering systems require the ability to extract accurate, concise, and comprehensive answers. They will better comprehend the complex text and produce helpful answers if they can reason on the explicit constraints described in the question’s textual context and the implicit, pertinent knowledge of the medical world. Integrating Knowledge Graphs (KG) with Language Models (LMs) is a common approach to incorporating structured information sources. However, effectively combining and reasoning over KG representations and language context remains an open question. To address this, we propose the Knowledge Infused Medical Question Answering system <b>(KIMedQA)</b>, which employs two techniques <i>viz.</i> relevant knowledge graph selection and pruning of the large-scale graph to handle Vector Space Inconsistent <i>(VSI)</i> and Excessive Knowledge Information <i>(EKI)</i>. The representation of the query and context are then combined with the pruned knowledge network using a pre-trained language model to generate an informed answer. Finally, we demonstrate through in-depth empirical evaluation that our suggested strategy provides cutting-edge outcomes on two benchmark datasets, namely MASH-QA and COVID-QA. 
We also compared our results to ChatGPT, a robust and very powerful generative model, and discovered that our model outperforms ChatGPT according to the F1 Score and human evaluation metrics such as adequacy.</p>","PeriodicalId":56119,"journal":{"name":"Journal of Intelligent Information Systems","volume":"67 1","pages":""},"PeriodicalIF":3.4,"publicationDate":"2024-01-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139551670","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2024-01-23 DOI: 10.1007/s10844-024-00841-4
Abstract
Detecting deviant traces in business process logs is crucial for modern organizations, given the harmful impact of deviant behaviours (e.g., attacks or faults). However, training a Deviance Prediction Model (DPM) solely with supervised learning methods is impractical in scenarios where only a few examples are labelled. To address this challenge, we propose an Active-Learning-based approach that leverages multiple DPMs and a temporal ensembling method that can train and merge them in a few training epochs. Our method needs expert supervision only for a few unlabelled traces exhibiting high prediction uncertainty. Tests on real data (of either complete or ongoing process instances) confirm the effectiveness of the proposed approach.
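The abstract does not say how prediction uncertainty is measured or how the DPMs are combined, so the following is only a minimal sketch of uncertainty-based query selection under stated assumptions: each model outputs a deviance probability per trace, the ensemble is a plain average (not the temporal ensembling method of the paper), and uncertainty is the binary entropy of that average. All function names are hypothetical.

```python
import math
from typing import List, Sequence


def binary_entropy(p: float) -> float:
    """Entropy in bits of a Bernoulli(p) prediction; 0 when p is 0 or 1."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -(p * math.log2(p) + (1.0 - p) * math.log2(1.0 - p))


def ensemble_mean(predictions: Sequence[Sequence[float]]) -> List[float]:
    """Average the deviance probabilities of several models, trace by trace.

    predictions[m][t] is the probability that model m assigns to trace t.
    A plain average is an assumption made for illustration only.
    """
    n_models = len(predictions)
    return [sum(column) / n_models for column in zip(*predictions)]


def select_for_labelling(
    predictions: Sequence[Sequence[float]], budget: int
) -> List[int]:
    """Return indices of the `budget` traces with the most uncertain prediction."""
    mean_probs = ensemble_mean(predictions)
    ranked = sorted(
        range(len(mean_probs)),
        key=lambda i: binary_entropy(mean_probs[i]),
        reverse=True,  # most uncertain traces first
    )
    return ranked[:budget]


# Hypothetical outputs of three models on four unlabelled traces.
example_predictions = [
    [0.90, 0.5, 0.10, 0.6],
    [0.95, 0.4, 0.05, 0.7],
    [0.85, 0.6, 0.15, 0.8],
]
to_label = select_for_labelling(example_predictions, budget=2)
```

Here the traces whose averaged probability lies closest to 0.5 are the ones sent to the expert, while confidently classified traces are left unlabelled.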
{"title":"Data- & compute-efficient deviance mining via active learning and fast ensembles","authors":"","doi":"10.1007/s10844-024-00841-4","DOIUrl":"https://doi.org/10.1007/s10844-024-00841-4","url":null,"abstract":"<h3>Abstract</h3> <p>Detecting deviant traces in business process logs is crucial for modern organizations, given the harmful impact of deviant behaviours (e.g., attacks or faults). However, training a Deviance Prediction Model (DPM) by solely using supervised learning methods is impractical in scenarios where only few examples are labelled. To address this challenge, we propose an Active-Learning-based approach that leverages multiple DPMs and a temporal ensembling method that can train and merge them in a few training epochs. Our method needs expert supervision only for a few unlabelled traces exhibiting high prediction uncertainty. Tests on real data (of either complete or ongoing process instances) confirm the effectiveness of the proposed approach.</p>","PeriodicalId":56119,"journal":{"name":"Journal of Intelligent Information Systems","volume":"10 1","pages":""},"PeriodicalIF":3.4,"publicationDate":"2024-01-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139551676","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}