EIGP: document-level event argument extraction with information enhancement generated based on prompts
Pub Date: 2024-08-23 | DOI: 10.1007/s10115-024-02213-4
Kai Liu, Hui Zhao, Zicong Wang, Qianxi Hou
The event argument extraction (EAE) task aims to identify event arguments and their specific roles within a given event. Existing generation-based EAE models, including recent ones focused on document-level extraction, emphasize the construction of prompt templates and entity representations. However, they overlook the model's limited comprehension of document context structure and the impact of arguments that span long distances, which reduces extraction accuracy. In this paper, we propose a prompt-based generative EAE model that enhances document structure information for the document-level EAE task. Specifically, we use sentence-level abstract meaning representation (AMR) to represent the contextual structure of the document, then prune redundant parts of this structure through constraints to obtain a constraint graph carrying the document information. Finally, we use an encoder to convert the graph into dense vectors and inject these structure-aware vectors into the prompt-based generation EAE model as a prefix. When the contextual information and the prompt template interact at the model's attention layers, the injected structural information improves generation by shaping attention. We conducted experiments on the RAMS and WIKIEVENTS datasets, and the results show that our model achieves excellent results compared with current state-of-the-art generative EAE models.
{"title":"EIGP: document-level event argument extraction with information enhancement generated based on prompts","authors":"Kai Liu, Hui Zhao, Zicong Wang, Qianxi Hou","doi":"10.1007/s10115-024-02213-4","DOIUrl":"https://doi.org/10.1007/s10115-024-02213-4","url":null,"abstract":"<p>The event argument extraction (EAE) task primarily aims to identify event arguments and their specific roles within a given event. Existing generation-based event argument extraction models, including the recent ones focused on document-level event argument extraction, emphasize the construction of prompt templates and entity representations. However, they overlook the inadequate comprehension of model in document context structure information and the impact of arguments spanning a wide range on event argument extraction. Consequently, this results in reduced model detection accuracy. In this paper, we propose a prompt-based generation event argument extraction model with the ability of document structure information enhancement for document-level event argument extraction task based on prompt generation. Specifically, we use sentence abstract meaning representation (AMR) to represent the contextual structural information of the document, and then remove the redundant parts of the structural information through constraints to obtain the constraint graph with the document information. Finally, we use the encoder to convert the graph into the corresponding dense vector. We inject these vectors with contextual structural information into the prompt-based generation EAE model in a prefixed manner. When contextual information and prompt templates interact at the attention layer of the model, the generated structural information improves the generation by affecting attention. We conducted experiments on RAMS and WIKIEVENTS datasets, and the results show that our model achieves excellent results compared with the current advanced generative EAE model.\u0000</p>","PeriodicalId":54749,"journal":{"name":"Knowledge and Information Systems","volume":"8 1","pages":""},"PeriodicalIF":2.7,"publicationDate":"2024-08-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142203928","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Computing marginal and conditional divergences between decomposable models with applications in quantum computing and earth observation
Pub Date: 2024-08-22 | DOI: 10.1007/s10115-024-02191-7
Loong Kuan Lee, Geoffrey I. Webb, Daniel F. Schmidt, Nico Piatkowski
The ability to compute the exact divergence between two high-dimensional distributions is useful in many applications, but doing so naively is intractable. Computing the αβ-divergence, a family of divergences that includes the Kullback–Leibler divergence and the Hellinger distance, between the joint distributions of two decomposable models, i.e., chordal Markov networks, can be done in time exponential in the treewidth of these models. Extending this result, we propose an approach to compute the exact αβ-divergence between any marginal or conditional distributions of two decomposable models. To do so tractably, we provide a decomposition over the marginal and conditional distributions of decomposable models. We then show how our method can be used to analyze distributional changes, first applying it to the benchmark image dataset QMNIST and to a dataset of observations from various areas of the Roosevelt National Forest and their cover type. Finally, based on our framework, we propose a novel way to quantify the error in contemporary superconducting quantum computers.
{"title":"Computing marginal and conditional divergences between decomposable models with applications in quantum computing and earth observation","authors":"Loong Kuan Lee, Geoffrey I. Webb, Daniel F. Schmidt, Nico Piatkowski","doi":"10.1007/s10115-024-02191-7","DOIUrl":"https://doi.org/10.1007/s10115-024-02191-7","url":null,"abstract":"<p>The ability to compute the exact divergence between two high-dimensional distributions is useful in many applications, but doing so naively is intractable. Computing the <span>(alpha beta )</span>-divergence—a family of divergences that includes the Kullback–Leibler divergence and Hellinger distance—between the joint distribution of two decomposable models, i.e., chordal Markov networks, can be done in time exponential in the treewidth of these models. Extending this result, we propose an approach to compute the exact <span>(alpha beta )</span>-divergence between any marginal or conditional distribution of two decomposable models. In order to do so tractably, we provide a decomposition over the marginal and conditional distributions of decomposable models. We then show how our method can be used to analyze distributional changes by first applying it to the benchmark image dataset QMNIST and a dataset containing observations from various areas at the Roosevelt Nation Forest and their cover type. Finally, based on our framework, we propose a novel way to quantify the error in contemporary superconducting quantum computers.</p>","PeriodicalId":54749,"journal":{"name":"Knowledge and Information Systems","volume":"97 1","pages":""},"PeriodicalIF":2.7,"publicationDate":"2024-08-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142203962","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
GoSum: extractive summarization of long documents by reinforcement learning and graph-organized discourse state
Pub Date: 2024-08-22 | DOI: 10.1007/s10115-024-02195-3
Junyi Bian, Xiaodi Huang, Hong Zhou, Tianyang Huang, Shanfeng Zhu
Summarizing extensive documents involves selecting sentences, with the organizational structure of document sections playing a pivotal role. However, effectively utilizing discourse information for summary generation poses a significant challenge, especially given the inconsistency between training and evaluation in extractive summarization. In this paper, we introduce GoSum, a novel extractive summarizer that integrates a graph-based model with reinforcement learning techniques to summarize long documents. Specifically, GoSum utilizes a graph neural network to encode sentence states, constructing a heterogeneous graph that represents each document at various discourse levels. The edges of this graph capture hierarchical relationships between different document sections. Furthermore, GoSum incorporates offline reinforcement learning, enabling the model to receive ROUGE score feedback on diverse training samples, thereby enhancing the quality of summary generation. On the two scientific article datasets PubMed and arXiv, GoSum achieved the highest performance among extractive models. On the PubMed dataset in particular, GoSum outperformed the other models, with ROUGE-1 and ROUGE-L scores higher by 0.45 and 0.26 points, respectively.
{"title":"GoSum: extractive summarization of long documents by reinforcement learning and graph-organized discourse state","authors":"Junyi Bian, Xiaodi Huang, Hong Zhou, Tianyang Huang, Shanfeng Zhu","doi":"10.1007/s10115-024-02195-3","DOIUrl":"https://doi.org/10.1007/s10115-024-02195-3","url":null,"abstract":"<p>Summarizing extensive documents involves selecting sentences, with the organizational structure of document sections playing a pivotal role. However, effectively utilizing discourse information for summary generation poses a significant challenge, especially given the inconsistency between training and evaluation in extractive summarization. In this paper, we introduce GoSum, a novel extractive summarizer that integrates a graph-based model with reinforcement learning techniques to summarize long documents. Specifically, GoSum utilizes a graph neural network to encode sentence states, constructing a heterogeneous graph that represents each document at various discourse levels. The edges of this graph capture hierarchical relationships between different document sections. Furthermore, GoSum incorporates offline reinforcement learning, enabling the model to receive ROUGE score feedback on diverse training samples, thereby enhancing the quality of summary generation. On the two scientific article datasets PubMed and arXiv, GoSum achieved the highest performance among extractive models. Particularly on the PubMed dataset, GoSum outperformed other models with ROUGE-1 and ROUGE-L scores surpassing by 0.45 and 0.26, respectively.</p>","PeriodicalId":54749,"journal":{"name":"Knowledge and Information Systems","volume":"10 1","pages":""},"PeriodicalIF":2.7,"publicationDate":"2024-08-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142203964","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Probabilistic temporal semantic graph: a holistic framework for event detection in twitter
Pub Date: 2024-08-22 | DOI: 10.1007/s10115-024-02208-1
Hadis Bashiri, Hassan Naderi
Event detection on social media platforms, especially Twitter, poses significant challenges due to the dynamic nature and high volume of data. The rapid flow of tweets and the varied ways users express thoughts complicate the identification of relevant events. Accurately identifying and interpreting events from this noisy and fast-paced environment is crucial for various applications, including crisis management and market analysis. This paper presents a novel unsupervised framework for event detection on social media, designed to enhance the accuracy and efficiency of identifying significant events from Twitter data. The framework incorporates several innovative techniques, including dynamic bandwidth adjustment based on local data density, Mahalanobis distance integration, adaptive kernel density estimation, and an improved Louvain-MOMR method for community detection. Additionally, a new scoring system is implemented to accurately extract trending words that evoke strong emotions, improving the identification of event-related keywords. The proposed framework demonstrates robust performance across three diverse datasets: FACup, Super Tuesday, and US Election, showcasing its effectiveness in capturing temporal and semantic patterns within tweets.
{"title":"Probabilistic temporal semantic graph: a holistic framework for event detection in twitter","authors":"Hadis Bashiri, Hassan Naderi","doi":"10.1007/s10115-024-02208-1","DOIUrl":"https://doi.org/10.1007/s10115-024-02208-1","url":null,"abstract":"<p>Event detection on social media platforms, especially Twitter, poses significant challenges due to the dynamic nature and high volume of data. The rapid flow of tweets and the varied ways users express thoughts complicate the identification of relevant events. Accurately identifying and interpreting events from this noisy and fast-paced environment is crucial for various applications, including crisis management and market analysis. This paper presents a novel unsupervised framework for event detection on social media, designed to enhance the accuracy and efficiency of identifying significant events from Twitter data. The framework incorporates several innovative techniques, including dynamic bandwidth adjustment based on local data density, Mahalanobis distance integration, adaptive kernel density estimation, and an improved Louvain-MOMR method for community detection. Additionally, a new scoring system is implemented to accurately extract trending words that evoke strong emotions, improving the identification of event-related keywords. The proposed framework demonstrates robust performance across three diverse datasets: FACup, Super Tuesday, and US Election, showcasing its effectiveness in capturing temporal and semantic patterns within tweets.</p>","PeriodicalId":54749,"journal":{"name":"Knowledge and Information Systems","volume":"93 1","pages":""},"PeriodicalIF":2.7,"publicationDate":"2024-08-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142203963","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
REDAffectiveLM: leveraging affect enriched embedding and transformer-based neural language model for readers’ emotion detection
Pub Date: 2024-08-19 | DOI: 10.1007/s10115-024-02194-4
Anoop Kadan, P. Deepak, Manjary P. Gangan, Sam Savitha Abraham, V. L. Lajish
Technological advancements in web platforms allow people to express and share emotions toward textual write-ups written and shared by others. This gives rise to two interesting domains of analysis: the emotion expressed by the writer and the emotion elicited from the readers. In this paper, we propose a novel approach for readers’ emotion detection from short-text documents using a deep learning model called REDAffectiveLM. Within state-of-the-art NLP tasks, it is well understood that utilizing context-specific representations from transformer-based pre-trained language models helps achieve improved performance. Within this affective computing task, we explore how incorporating affective information can further enhance performance. Toward this, we leverage context-specific and affect enriched representations by using a transformer-based pre-trained language model in tandem with an affect enriched Bi-LSTM+Attention network. For empirical evaluation, we procure a new dataset, REN-20k, besides using RENh-4k and SemEval-2007. We evaluate the performance of our REDAffectiveLM rigorously across these datasets against a vast set of state-of-the-art baselines, where our model consistently outperforms the baselines and obtains statistically significant results. Our results establish that utilizing affect enriched representations along with context-specific representations within a neural architecture can considerably enhance readers’ emotion detection. Since the impact of affect enrichment specifically on readers’ emotion detection is not well explored, we conduct a detailed analysis of the affect enriched Bi-LSTM+Attention component using qualitative and quantitative model behavior evaluation techniques. We observe that, compared to conventional semantic embeddings, affect enriched embeddings increase the ability of the network to effectively identify and assign weightage to the key terms responsible for readers’ emotion, improving prediction.
{"title":"REDAffectiveLM: leveraging affect enriched embedding and transformer-based neural language model for readers’ emotion detection","authors":"Anoop Kadan, P. Deepak, Manjary P. Gangan, Sam Savitha Abraham, V. L. Lajish","doi":"10.1007/s10115-024-02194-4","DOIUrl":"https://doi.org/10.1007/s10115-024-02194-4","url":null,"abstract":"<p>Technological advancements in web platforms allow people to express and share emotions toward textual write-ups written and shared by others. This brings about different interesting domains for analysis, emotion expressed by the writer and emotion elicited from the readers. In this paper, we propose a novel approach for readers’ emotion detection from short-text documents using a deep learning model called <i>REDAffectiveLM</i>. Within state-of-the-art NLP tasks, it is well understood that utilizing context-specific representations from transformer-based pre-trained language models helps achieve improved performance. Within this affective computing task, we explore how incorporating affective information can further enhance performance. Toward this, we leverage context-specific and affect enriched representations by using a transformer-based pre-trained language model in tandem with affect enriched Bi-LSTM+Attention. For empirical evaluation, we procure a new dataset REN-20k, besides using RENh-4k and SemEval-2007. We evaluate the performance of our <i>REDAffectiveLM</i> rigorously across these datasets, against a vast set of state-of-the-art baselines, where our model consistently outperforms baselines and obtains statistically significant results. Our results establish that utilizing affect enriched representation along with context-specific representation within a neural architecture can considerably enhance readers’ emotion detection. Since the impact of affect enrichment specifically in readers’ emotion detection isn’t well explored, we conduct a detailed analysis over affect enriched Bi-LSTM+Attention using qualitative and quantitative model behavior evaluation techniques. We observe that compared to conventional semantic embedding, affect enriched embedding increases the ability of the network to effectively identify and assign weightage to the key terms responsible for readers’ emotion detection to improve prediction.</p>","PeriodicalId":54749,"journal":{"name":"Knowledge and Information Systems","volume":"29 1","pages":""},"PeriodicalIF":2.7,"publicationDate":"2024-08-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142203965","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Aspect-based sentiment analysis: approaches, applications, challenges and trends
Pub Date: 2024-08-14 | DOI: 10.1007/s10115-024-02200-9
Deena Nath, Sanjay K. Dwivedi
Sentiment analysis (SA) is a technique that employs natural language processing to methodically mine, extract, analyse and comprehend people’s thoughts, feelings, opinions and perceptions, as well as their reactions and attitudes toward various subjects such as topics, commodities and other products and services. However, SA only reveals the overall sentiment. Unlike SA, aspect-based sentiment analysis (ABSA) divides a text into distinct components and determines the appropriate sentiment for each, which makes its predictions more reliable. Hence, ABSA is essential for studying and breaking texts down into their various service elements, assigning the appropriate sentiment polarity (positive, negative or neutral) to every aspect. In this paper, the main task is to critically review research outcomes and examine the various techniques, methods and features used for ABSA. After a brief introduction to SA that establishes a clear relationship between SA and ABSA, we focus on approaches, applications, challenges and trends in ABSA research.
{"title":"Aspect-based sentiment analysis: approaches, applications, challenges and trends","authors":"Deena Nath, Sanjay K. Dwivedi","doi":"10.1007/s10115-024-02200-9","DOIUrl":"https://doi.org/10.1007/s10115-024-02200-9","url":null,"abstract":"<p>Sentiment analysis (SA) is a technique that employs natural language processing to determine the function of mining methodically, extract, analyse and comprehend people’s thoughts, feelings, personal opinions and perceptions as well as their reactions and attitude regarding various subjects such as topics, commodities and various other products and services. However, it only reveals the overall sentiment. Unlike SA, the aspect-based sentiment analysis (ABSA) study categorizes a text into distinct components and determines the appropriate sentiment, which is more reliable in its predictions. Hence, ABSA is essential to study and break down texts into various service elements. It then assigns the appropriate sentiment polarity (positive, negative or neutral) for every aspect. In this paper, the main task is to critically review the research outcomes to look at the various techniques, methods and features used for ABSA. After giving brief introduction of SA in order to establish a clear relationship between SA and ABSA, we focussed on approaches, applications, challenges and trends in ABSA research.</p>","PeriodicalId":54749,"journal":{"name":"Knowledge and Information Systems","volume":"50 1","pages":""},"PeriodicalIF":2.7,"publicationDate":"2024-08-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142203968","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Complementary incomplete weighted concept factorization methods for multi-view clustering
Pub Date: 2024-08-14 | DOI: 10.1007/s10115-024-02197-1
Ghufran Ahmad Khan, Jalaluddin Khan, Taushif Anwar, Zaid Al-Huda, Bassoma Diallo, Naved Ahmad
The main aim of traditional multi-view clustering is to categorize data into separate clusters under the assumption that all views are fully available. However, practical scenarios often arise in which not all aspects of the data are accessible, which hampers the efficacy of conventional multi-view clustering techniques. Recent advancements have made significant progress in addressing incompleteness in multi-view data clustering. Still, current incomplete multi-view clustering methods overlook a number of important factors, such as providing a consensus representation across the kernel space, dealing with the over-fitting issue across different views, and simultaneously modeling how the multiple views relate to each other. To address these challenges, we introduce an innovative multi-view clustering algorithm that manages incomplete data from multiple perspectives. Additionally, we introduce a novel objective function incorporating a weighted concept factorization technique to tackle the absence of data instances within each incomplete view. We use a co-regularization constraint to learn a common shared structure from the different views and a smooth regularization term to prevent view over-fitting. It is noteworthy that the proposed objective function is inherently non-convex, presenting optimization challenges. To obtain a solution, we implement an iterative optimization approach that converges to a local minimum. To underscore the effectiveness and validity of our approach, we conducted experiments on real-world datasets against state-of-the-art methods for comparative evaluation.
{"title":"Complementary incomplete weighted concept factorization methods for multi-view clustering","authors":"Ghufran Ahmad Khan, Jalaluddin Khan, Taushif Anwar, Zaid Al-Huda, Bassoma Diallo, Naved Ahmad","doi":"10.1007/s10115-024-02197-1","DOIUrl":"https://doi.org/10.1007/s10115-024-02197-1","url":null,"abstract":"<p>The main aim of traditional multi-view clustering is to categorize data into separate clusters under the assumption that all views are fully available. However, practical scenarios often arise where not all aspects of the data are accessible, which hampers the efficacy of conventional multi-view clustering techniques. Recent advancements have made significant progress in addressing the incompleteness in multi-view data clustering. Still, current incomplete multi-view clustering methods overlooked a number of important factors, such as providing a consensus representation across the kernel space, dealing with over-fitting issue from different views, and looking at how these multiple views relate to each other at the same time. To deal these challenges, we introduced an innovative multi-view clustering algorithm to manage incomplete data from multiple perspectives. Additionally, we have introduced a novel objective function incorporating a weighted concept factorization technique to tackle the absence of data instances within each incomplete viewpoint. We used a co-regularization constraint to learn a common shared structure from different points of view and a smooth regularization term to prevent view over-fitting. It is noteworthy that the proposed objective function is inherently non-convex, presenting optimization challenges. To obtain the optimal solution, we have implemented an iterative optimization approach to converge the local minima for our method. To underscore the effectiveness and validation of our approach, we conducted experiments using real-world datasets against state-of-the-art methods for comparative evaluation.</p>","PeriodicalId":54749,"journal":{"name":"Knowledge and Information Systems","volume":"57 1","pages":""},"PeriodicalIF":2.7,"publicationDate":"2024-08-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142203967","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Hyperparameter elegance: fine-tuning text analysis with enhanced genetic algorithm hyperparameter landscape
Pub Date: 2024-08-13 | DOI: 10.1007/s10115-024-02202-7
Gyananjaya Tripathy, Aakanksha Sharaff
Due to the massive participation of users, handling the resulting enormous datasets with machine learning algorithms is highly challenging. Deep learning methods are therefore designed with efficient hyperparameter sets to enhance the processing of vast corpora. Different hyperparameter tuning models have been used in previous studies; still, tuning deep learning models over the greatest possible number of hyperparameters has not yet been achieved. This study develops a modified optimization methodology for effective hyperparameter identification, addressing the shortcomings of previous studies. To obtain the optimum outcome, an enhanced genetic algorithm with modified crossover and mutation is used. The method is able to tune several hyperparameters simultaneously. Benchmark datasets of online reviews show outstanding results for the proposed methodology. The outcome demonstrates that the presented enhanced genetic algorithm-based hyperparameter tuning model performs better than other standard approaches, with 88.73% classification accuracy, 87.31% sensitivity, 90.15% specificity, and an 88.58% F-score on the IMDB dataset and 92.17% classification accuracy, 91.89% sensitivity, 92.47% specificity, and a 92.50% F-score on the Yelp dataset, while requiring less processing effort. To further enhance performance, an attention mechanism is applied to the designed model, achieving 89.62% accuracy, 88.59% sensitivity, 91.89% specificity, and an 89.35% F-score on the IMDB dataset and 93.29% accuracy, 92.04% sensitivity, 93.22% specificity, and a 92.98% F-score on the Yelp dataset.
{"title":"Hyperparameter elegance: fine-tuning text analysis with enhanced genetic algorithm hyperparameter landscape","authors":"Gyananjaya Tripathy, Aakanksha Sharaff","doi":"10.1007/s10115-024-02202-7","DOIUrl":"https://doi.org/10.1007/s10115-024-02202-7","url":null,"abstract":"<p>Due to the significant participation of the users, it is highly challenging to handle enormous datasets using machine learning algorithms. Deep learning methods are therefore designed with efficient hyperparameter sets to enhance the processing of the vast corpus. Different hyperparameter tuning models have been used previously in various studies. Still, tuning the deep learning models with the greatest possible number of hyperparameters has not yet been possible. This study developed a modified optimization methodology for effective hyperparameter identification, addressing the shortcomings of the previous studies. To get the optimum outcome, an enhanced genetic algorithm is used with modified crossover and mutation. The method has the ability to tune several hyperparameters simultaneously. The benchmark datasets for online reviews show outstanding results from the proposed methodology. The outcome demonstrates that the presented enhanced genetic algorithm-based hyperparameter tuning model performs better than other standard approaches with 88.73% classification accuracy, 87.31% sensitivity, 90.15% specificity, and 88.58% F-score value for the IMDB dataset and 92.17% classification accuracy, 91.89% sensitivity, 92.47% specificity, and 92.50% F-score value for the Yelp dataset while requiring less processing effort. To further enhance the performance, attention mechanism is applied to the designed model, achieving 89.62% accuracy, 88.59% sensitivity, 91.89% specificity, and 89.35% F-score with the IMDB dataset and 93.29% accuracy, 92.04% sensitivity, 93.22% specificity, and 92.98% F-score with the Yelp dataset.</p>","PeriodicalId":54749,"journal":{"name":"Knowledge and Information Systems","volume":"18 1","pages":""},"PeriodicalIF":2.7,"publicationDate":"2024-08-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142203966","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Adaptive moving average Q-learning
Pub Date: 2024-08-12 | DOI: 10.1007/s10115-024-02190-8
Tao Tan, Hong Xie, Yunni Xia, Xiaoyu Shi, Mingsheng Shang
A variety of algorithms have been proposed to address the long-standing overestimation bias problem of Q-learning. Reducing this overestimation bias may introduce an underestimation bias instead, as in double Q-learning. However, it is still unclear how to strike a good balance between overestimation and underestimation. We present a simple yet effective algorithm to fill this gap, which we call Moving Average Q-learning. Specifically, we maintain two dependent Q-estimators. The first is used to estimate the maximum expected Q-value. The second is used to select the optimal action; it is the moving average of the historical Q-values generated by the first estimator. The second estimator has only one hyperparameter, the moving average parameter, which controls the dependence between the two estimators, ranging from independent to identical. Based on Moving Average Q-learning, we design an adaptive strategy to select the moving average parameter, resulting in AdaMA (Adaptive Moving Average) Q-learning. This adaptive strategy is a simple function in which the moving average parameter increases monotonically with the number of state–action pairs visited. Moreover, we extend AdaMA Q-learning to AdaMA DQN in high-dimensional environments. Extensive experimental results reveal why Moving Average Q-learning and AdaMA Q-learning can mitigate the overestimation bias, and also show that AdaMA Q-learning and AdaMA DQN drastically outperform SOTA baselines. In particular, compared with the overestimated value of 1.66 in Q-learning, AdaMA Q-learning underestimates by only 0.196, an improvement of 88.19%.
{"title":"Adaptive moving average Q-learning","authors":"Tao Tan, Hong Xie, Yunni Xia, Xiaoyu Shi, Mingsheng Shang","doi":"10.1007/s10115-024-02190-8","DOIUrl":"https://doi.org/10.1007/s10115-024-02190-8","url":null,"abstract":"<p>A variety of algorithms have been proposed to address the long-standing overestimation bias problem of Q-learning. Reducing this overestimation bias may lead to an underestimation bias, such as double Q-learning. However, it is still unclear how to make a good balance between overestimation and underestimation. We present a simple yet effective algorithm to fill in this gap and call Moving Average Q-learning. Specifically, we maintain two dependent Q-estimators. The first one is used to estimate the maximum expected Q-value. The second one is used to select the optimal action. In particular, the second estimator is the moving average of historical Q-values generated by the first estimator. The second estimator has only one hyperparameter, namely the moving average parameter. This parameter controls the dependence between the second estimator and the first estimator, ranging from independent to identical. Based on Moving Average Q-learning, we design an adaptive strategy to select the moving average parameter, resulting in AdaMA (<u>Ada</u>ptive <u>M</u>oving <u>A</u>verage) Q-learning. This adaptive strategy is a simple function, where the moving average parameter increases monotonically with the number of state–action pairs visited. Moreover, we extend AdaMA Q-learning to AdaMA DQN in high-dimensional environments. Extensive experiment results reveal why Moving Average Q-learning and AdaMA Q-learning can mitigate the overestimation bias, and also show that AdaMA Q-learning and AdaMA DQN outperform SOTA baselines drastically. In particular, when compared with the overestimated value of 1.66 in Q-learning, AdaMA Q-learning underestimates by 0.196, resulting in an improvement of 88.19%.</p>","PeriodicalId":54749,"journal":{"name":"Knowledge and Information Systems","volume":"372 1","pages":""},"PeriodicalIF":2.7,"publicationDate":"2024-08-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141939781","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Community detection in social networks using machine learning: a systematic mapping study
Pub Date: 2024-08-12 | DOI: 10.1007/s10115-024-02201-8
Mahsa Nooribakhsh, Marta Fernández-Diego, Fernando González-Ladrón-De-Guevara, Mahdi Mollamotalebi
One of the important issues in social networks is the social communities formed by interactions among their members. Three types of community, overlapping, non-overlapping, and hidden, are detected by different approaches. Given the importance of community detection in social networks, this paper provides a systematic mapping of machine learning-based community detection approaches. The study aims to show the types of communities in social networks along with the machine learning algorithms that have been used for community detection. After carrying out the mapping steps and removing unusable references, 246 papers were selected to answer the research questions. The results indicate that unsupervised machine learning-based algorithms (such as k-means), at 41.46%, are the most used category for detecting communities in social networks due to their low processing overheads. On the other hand, there has been a significant increase in the use of deep learning since 2020, which offers sufficient performance for community detection in large-volume data. Owing to its ability to measure the correlation or similarity between communities, NMI is, at 53.25%, the most frequently used metric for evaluating the performance of community identification. Furthermore, considering its availability, small size, and lack of multi-edges and loops, the Zachary’s Karate Club dataset, at 26.42%, is the most used dataset for community detection research in social networks.
{"title":"Community detection in social networks using machine learning: a systematic mapping study","authors":"Mahsa Nooribakhsh, Marta Fernández-Diego, Fernando González-Ladrón-De-Guevara, Mahdi Mollamotalebi","doi":"10.1007/s10115-024-02201-8","DOIUrl":"https://doi.org/10.1007/s10115-024-02201-8","url":null,"abstract":"<p>One of the important issues in social networks is the social communities which are formed by interactions between its members. Three types of community including overlapping, non-overlapping, and hidden are detected by different approaches. Regarding the importance of community detection in social networks, this paper provides a systematic mapping of machine learning-based community detection approaches. The study aimed to show the type of communities in social networks along with the algorithms of machine learning that have been used for community detection. After carrying out the steps of mapping and removing useless references, 246 papers were selected to answer the questions of this research. The results of the research indicated that unsupervised machine learning-based algorithms with 41.46% (such as <i>k</i> means) are the most used categories to detect communities in social networks due to their low processing overheads. On the other hand, there has been a significant increase in the use of deep learning since 2020 which has sufficient performance for community detection in large-volume data. With regard to the ability of NMI to measure the correlation or similarity between communities, with 53.25%, it is the most frequently used metric to evaluate the performance of community identifications. Furthermore, considering availability, low in size, and lack of multiple edge and loops, dataset Zachary’s Karate Club with 26.42% is the most used dataset for community detection research in social networks.</p>","PeriodicalId":54749,"journal":{"name":"Knowledge and Information Systems","volume":"53 1","pages":""},"PeriodicalIF":2.7,"publicationDate":"2024-08-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141939780","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}