Applied Intelligence: Latest Articles

TableGPT: a novel table understanding method based on table recognition and large language model collaborative enhancement
IF 3.4 | Zone 2 (Computer Science) | Q2 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE | Vol. 55, No. 4 | Pub Date: 2025-01-14 | DOI: 10.1007/s10489-024-05937-6
Yi Ren, Chenglong Yu, Weibin Li, Wei Li, Zixuan Zhu, TianYi Zhang, ChenHao Qin, WenBo Ji, Jianjun Zhang

In today's information age, table images play a crucial role in storing structured information, making table image recognition technology an essential component in many fields. However, accurately recognizing the structure and text content of various complex table images has remained a challenge. Recently, large language models (LLMs) have demonstrated exceptional capabilities in various natural language processing tasks. Therefore, applying LLMs to the correction tasks of structure and text content after table image recognition presents a novel solution. This paper introduces a new method, TableGPT, which combines table recognition with LLMs and develops a specialized multimodal agent to enhance the effectiveness of table image recognition. Our approach is divided into four stages. In the first stage, TableGPT_agent initially evaluates whether the input is a table image and, upon confirmation, uses algorithms such as the transformer for preliminary recognition. In the second stage, the agent converts the recognition results into HTML format and autonomously assesses whether corrections are needed. If corrections are needed, the data are input into a trained LLM to achieve more accurate table recognition and optimization. In the third stage, the agent evaluates user satisfaction through feedback and applies superresolution algorithms to low-quality images, as this is often the main reason for user dissatisfaction. Finally, the agent inputs both the enhanced and original images into the trained model, integrating the information to obtain the optimal table text representation. Our research shows that trained LLMs can effectively interpret table images, improving the Tree Edit Distance Similarity (TEDS) score by an average of 4% even when based on the best current table recognition methods, across both public and private datasets. They also demonstrate better performance in correcting structural and textual errors. 
We also explore the impact of image superresolution technology on low-quality table images. Combined with the LLMs, our TEDS score significantly increased by 54%, greatly enhancing the recognition performance. Finally, by leveraging agent technology, our multimodal model improved table recognition performance, with the TEDS score of TableGPT_agent surpassing that of GPT-4 by 34%.
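
The abstract reports its gains in TEDS without defining the metric. As the score is commonly defined for table recognition, with the predicted and the ground-truth table both represented as HTML trees, it reads as follows (the notation is an assumption and is not taken from the paper):

```latex
\mathrm{TEDS}(T_a, T_b) = 1 - \frac{\mathrm{EditDist}(T_a, T_b)}{\max\left(|T_a|, |T_b|\right)}
```

Here $T_a$ and $T_b$ are the two trees, $\mathrm{EditDist}$ is the tree edit distance between them, and $|T|$ is the number of nodes in $T$, so a score of 1 means the two tables match exactly.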

Citations: 0
Self-supervised motion forecasting with local information interaction in autonomous driving
IF 3.4 | Zone 2 (Computer Science) | Q2 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE | Vol. 55, No. 5 | Pub Date: 2025-01-14 | DOI: 10.1007/s10489-024-06030-8
Xinyu Lei, Longjun Liu, Haoteng Li, Haonan Zhang

Motion forecasting presents significant challenges critical for ensuring the safety of autonomous driving systems. The accuracy of these forecasts relies heavily on factors such as map topology and the behaviors of vehicles and pedestrians. However, within vast datasets, certain features with unique properties, capable of enhancing representation generalization often remain hidden and overlooked. While self-supervised learning (SSL) has shown promise in uncovering such hidden features through pretext tasks, its application to motion forecasting remains underexplored. In this paper, we propose a novel self-supervised motion forecasting method that exploits the interaction of map topology and actors’ maneuvers within localized focal points to generate more informative and generalizable representations for forecasting task. Since intersections, characterized by intricate structures and frequent motion state changes among actors, serve as pivotal locations where the topology of the intersection map profoundly influences actors’ intentions to change course, we leverage this interplay by calculating map structure-based actors’ attributes, and actors’ maneuver-based map attributes. These attributes yield significant advantages for motion forecasting tasks. Experimentally, our proposed method outperforms the baseline on both the challenging large-scale Argoverse benchmark (Chang et al. 2019) and local test, which demonstrates the effectiveness of the fusion of cross-domain information in a local neighborhood.

Citations: 0
Personalized federated knowledge graph embedding with client-wise relation graph
IF 3.4 | Zone 2 (Computer Science) | Q2 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE | Vol. 55, No. 5 | Pub Date: 2025-01-14 | DOI: 10.1007/s10489-024-06211-5
Xiaoxiong Zhang, Zhiwei Zeng, Xin Zhou, Dusit Niyato, ZhiQi Shen

Federated Knowledge Graph Embedding (FKGE) has recently garnered considerable interest due to its capacity to extract expressive representations from distributed knowledge graphs, while concurrently safeguarding the privacy of individual clients. Existing FKGE methods typically harness the arithmetic mean of entity embeddings from all clients as the global supplementary knowledge, and learn a replica of global consensus entities embeddings for each client. However, these methods usually neglect the inherent semantic disparities among distinct clients. This oversight not only results in the globally shared complementary knowledge being inundated with too much noise when tailored to a specific client, but also instigates a discrepancy between local and global optimization objectives. Consequently, the quality of the learned embeddings is compromised. To address this, we propose Personalized Federated knowledge graph Embedding with client-wise relation Graph (PFedEG), a novel approach that employs a client-wise relation graph to learn personalized embeddings by discerning the semantic relevance of embeddings from other clients. Specifically, PFedEG learns personalized supplementary knowledge for each client by amalgamating entity embedding from its neighboring clients based on their “affinity” on the client-wise relation graph. Each client then conducts personalized embedding learning based on its local triples and personalized supplementary knowledge. We conduct extensive experiments on four benchmark datasets to evaluate our method against state-of-the-art models and results demonstrate the superiority of our method.
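
The contrast drawn in this abstract, between averaging an entity's embeddings over all clients and combining only the neighbours' embeddings according to their affinity, can be sketched as follows. This is a minimal illustration and not the PFedEG implementation: the function names, the dictionary layout, and the affinity values are all hypothetical, and the abstract does not say how affinities are computed.

```python
from typing import Dict, List

Vector = List[float]


def mean_embedding(embeddings: Dict[str, Vector]) -> Vector:
    """Baseline: arithmetic mean of one entity's embedding over all clients."""
    vectors = list(embeddings.values())
    dim = len(vectors[0])
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]


def personalized_embedding(
    client: str,
    embeddings: Dict[str, Vector],
    affinity: Dict[str, Dict[str, float]],
) -> Vector:
    """Affinity-weighted average of the neighbours' embeddings for `client`.

    `affinity[client][other]` is a hypothetical non-negative edge weight on
    the client-wise relation graph; a missing entry means "not a neighbour".
    """
    weights = {
        other: w
        for other, w in affinity.get(client, {}).items()
        if other != client and other in embeddings and w > 0
    }
    total = sum(weights.values())
    if total == 0:
        raise ValueError(f"client {client!r} has no neighbours with embeddings")
    dim = len(next(iter(embeddings.values())))
    return [
        sum(w * embeddings[other][i] for other, w in weights.items()) / total
        for i in range(dim)
    ]


# Toy data (made up): one shared entity, embedded by three clients.
entity_embeddings = {"A": [1.0, 0.0], "B": [0.0, 1.0], "C": [1.0, 1.0]}
client_affinity = {"A": {"B": 3.0, "C": 1.0}}

global_knowledge = mean_embedding(entity_embeddings)
knowledge_for_a = personalized_embedding("A", entity_embeddings, client_affinity)
```

With these toy values, client A's supplementary knowledge leans towards client B, its closest neighbour, whereas the plain mean treats all three clients alike.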

Citations: 0
FineDiffusion: scaling up diffusion models for fine-grained image generation with 10,000 classes
IF 3.4 | Zone 2 (Computer Science) | Q2 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE | Vol. 55, No. 4 | Pub Date: 2025-01-14 | DOI: 10.1007/s10489-024-06215-1
Ziying Pan, Kun Wang, Gang Li, Feihong He, Yongxuan Lai

The class-conditional image generation based on diffusion models is renowned for generating high-quality and diverse images. However, most prior efforts focus on generating images for general categories, e.g., 1000 classes in ImageNet-1k. A more challenging task, large-scale fine-grained image generation, remains the boundary to explore. In this work, we present a parameter-efficient strategy, called FineDiffusion, to fine-tune large pre-trained diffusion models scaling to large-scale fine-grained image generation with 10,000 categories. FineDiffusion significantly accelerates training and reduces storage overhead by only fine-tuning tiered class embedder, bias terms, and normalization layers’ parameters. To further improve the image generation quality of fine-grained categories, we propose a novel sampling method for fine-grained image generation, which utilizes superclass-conditioned guidance, specifically tailored for fine-grained categories, to replace the conventional classifier-free guidance sampling. Compared to full fine-tuning, FineDiffusion achieves a remarkable 1.56× training speed-up and requires storing merely 1.77% of the total model parameters, while achieving state-of-the-art FID of 9.776 on image generation of 10,000 classes. Extensive qualitative and quantitative experiments demonstrate the superiority of our method compared to other parameter-efficient fine-tuning methods. The code and more generated results are available at our project website: https://finediffusion.github.io/.
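
Two ideas in this abstract lend themselves to a small sketch: fine-tuning only a named subset of parameters, and swapping the reference prediction used during guided sampling. The code below is an illustration under stated assumptions, not the FineDiffusion code: the parameter naming scheme and the toy sizes are hypothetical, and the guidance function assumes that superclass-conditioned guidance keeps the usual classifier-free extrapolation formula while replacing the unconditional prediction with a superclass-conditioned one.

```python
from typing import Dict, List


def is_fine_tuned(param_name: str) -> bool:
    """Select class-embedder, bias, and normalization parameters by name.

    The naming scheme is hypothetical; a real model would need its own rule.
    """
    return (
        param_name.startswith("class_embedder.")
        or param_name.endswith(".bias")
        or ".norm" in param_name
    )


def stored_fraction(param_sizes: Dict[str, int]) -> float:
    """Fraction of all parameters that would be fine-tuned and stored."""
    tuned = sum(n for name, n in param_sizes.items() if is_fine_tuned(name))
    return tuned / sum(param_sizes.values())


def guided_prediction(
    eps_cond: List[float], eps_ref: List[float], scale: float
) -> List[float]:
    """Extrapolate from a reference prediction towards the conditional one.

    With classifier-free guidance, `eps_ref` is the unconditional prediction.
    The assumption here is that superclass-conditioned guidance passes the
    prediction conditioned on the superclass as `eps_ref` instead.
    """
    return [ref + scale * (cond - ref) for cond, ref in zip(eps_cond, eps_ref)]


# Toy parameter counts (made up) for a single block plus a class embedder.
toy_sizes = {
    "class_embedder.weight": 100,
    "blocks.0.attn.weight": 800,
    "blocks.0.attn.bias": 20,
    "blocks.0.norm1.weight": 40,
    "blocks.0.mlp.weight": 1040,
}
fraction = stored_fraction(toy_sizes)
guided = guided_prediction([1.0, 2.0], [0.0, 1.0], scale=2.0)
```

In this toy model 160 of 2,000 parameters are selected, so only 8% would need to be stored per fine-tuned variant; the 1.77% figure in the abstract refers to the authors' actual model.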

Citations: 0
Three-way reductions of conflict analysis based on relation matrices and integration measures
IF 3.4 | Zone 2 (Computer Science) | Q2 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE | Vol. 55, No. 5 | Pub Date: 2025-01-14 | DOI: 10.1007/s10489-024-06020-w
Jiang Chen, Xianyong Zhang

Conflicts serve as an important focus of uncertainty analysis, and their reductions facilitate the issue identification and conflict solving to become valuable but rare. At present, conflict analysis reductions mainly embrace relation matrices, and they never concern uncertainty measures with highly concentrated information. In this paper, three-way reductions of conflict analysis are transferred from relation matrices to integration measures, and corresponding heuristic reduction algorithms are constructed for information systems. At first, three-way membership degrees and three-way similarity degrees are proposed for conflict analysis, and their measurement boundedness, issue monotonicity, calculation algorithm, and transformation interrelationship are researched. Then, alliance, conflict, and neutrality reductions are proposed based on similarity degrees to acquire heuristic reduction algorithms, and they can be equivalently characterized by both membership degrees and relation matrices. Finally by table examples and data experiments, similarity degrees and relevant measurement properties are validated, and two groups of three-way reduction algorithms related to relation matrices and similarity degrees are comparatively analyzed; as a result, three-way reduction algorithms based on similarity degrees become novel and effective for conflict analysis. This study provides an in-depth insight into three-way reductions of conflict analysis from algebraic measurement.
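
For readers unfamiliar with the setting, the relation-matrix view that this abstract starts from can be sketched with the classical Pawlak conflict model, in which each agent rates each issue as +1, 0, or -1. The sketch below is an assumption-laden illustration: it uses Pawlak's distance rather than the similarity degrees proposed in the paper (which the abstract does not define), it checks all three relations at once rather than one reduction type at a time, and the agent table and function names are hypothetical.

```python
from typing import Dict, List, Optional, Sequence, Tuple


def relation(
    x: Sequence[int], y: Sequence[int], issues: Optional[Sequence[int]] = None
) -> str:
    """Classify two agents as 'alliance', 'neutrality', or 'conflict'.

    Attitudes are +1 (support), 0 (neutral), -1 (oppose). Per issue, the
    doubled Pawlak distance is 0 for the same non-zero sign, 2 for opposite
    signs, and 1 when at least one attitude is neutral.
    """
    idx = range(len(x)) if issues is None else issues
    doubled = 0
    count = 0
    for i in idx:
        product = x[i] * y[i]
        doubled += 0 if product == 1 else 2 if product == -1 else 1
        count += 1
    # The average distance is doubled / (2 * count); compare it with 0.5
    # using integers so that the neutral case is detected exactly.
    if doubled < count:
        return "alliance"
    if doubled > count:
        return "conflict"
    return "neutrality"


def relation_matrix(
    table: Dict[str, List[int]], issues: Optional[Sequence[int]] = None
) -> Dict[Tuple[str, str], str]:
    """Relation for every ordered pair of distinct agents."""
    return {
        (a, b): relation(table[a], table[b], issues)
        for a in table
        for b in table
        if a != b
    }


def preserves_relations(table: Dict[str, List[int]], issues: Sequence[int]) -> bool:
    """True if the issue subset yields the same relations as all issues."""
    return relation_matrix(table, issues) == relation_matrix(table)


# Toy information system (made up): four agents, three issues.
agents = {
    "a1": [1, -1, 0],
    "a2": [1, 1, 0],
    "a3": [-1, 1, -1],
    "a4": [1, -1, 1],
}
```

In this toy table the third issue can be dropped without changing any pairwise relation, while the first issue alone is not enough, which is the kind of question a reduction algorithm answers at scale.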

Citations: 0
A meta-heuristic approach to estimate and explain classifier uncertainty
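IF 3.4 | Zone 2 (Computer Science) | Q2 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE | Vol. 55, No. 5 | Pub Date: 2025-01-14 | DOI: 10.1007/s10489-024-06127-0 | PDF: https://link.springer.com/content/pdf/10.1007/s10489-024-06127-0.pdf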
Andrew Houston, Georgina Cosma

Trust is a crucial factor affecting the adoption of machine learning (ML) models. Qualitative studies have revealed that end-users, particularly in the medical domain, need models that can express their uncertainty in decision-making allowing users to know when to ignore the model’s recommendations. However, existing approaches for quantifying decision-making uncertainty are not model-agnostic, or they rely on complex mathematical derivations that are not easily understood by laypersons or end-users, making them less useful for explaining the model’s decision-making process. This work proposes a set of class-independent meta-heuristics that can characterise the complexity of an instance in terms of factors that are mutually relevant to both human and ML decision-making. The measures are integrated into a meta-learning framework that estimates the risk of misclassification. The proposed framework outperformed predicted probabilities and entropy-based methods of identifying instances at risk of being misclassified. Furthermore, the proposed approach resulted in uncertainty estimates that proves more independent of model accuracy and calibration than existing approaches. The proposed measures and framework demonstrate promise for improving model development for more complex instances and provides a new means of model abstention and explanation.
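
The two baselines this abstract compares against, predicted probabilities and entropy-based uncertainty, are simple enough to sketch. The code below illustrates those baselines only, not the proposed meta-heuristics or meta-learning framework; the function names and the abstention threshold of 0.8 are hypothetical choices.

```python
import math
from typing import List


def max_probability(probs: List[float]) -> float:
    """Confidence baseline: the probability of the predicted class."""
    return max(probs)


def normalized_entropy(probs: List[float]) -> float:
    """Shannon entropy divided by log(number of classes), so it lies in [0, 1].

    Expects at least two classes and probabilities that sum to 1.
    """
    entropy = -sum(p * math.log(p) for p in probs if p > 0)
    return entropy / math.log(len(probs))


def should_abstain(probs: List[float], threshold: float = 0.8) -> bool:
    """Flag an instance for abstention when its entropy exceeds the threshold."""
    return normalized_entropy(probs) > threshold


uncertain = should_abstain([0.5, 0.5])
confident = should_abstain([0.9, 0.1])
```

Both baselines are computed from the model's own output probabilities, so they inherit any miscalibration of the model, which is the dependence the abstract says the proposed estimates reduce.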

Citations: 0
Enhanced interpretation of novel datasets by summarizing clustering results using deep-learning based linguistic models
IF 3.4 | Zone 2 (Computer Science) | Q2 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE | Pub Date: 2025-01-14 | DOI: 10.1007/s10489-025-06250-6
Natarajan K, Srikar Verma, Dheeraj Kumar

In today’s technology-driven era, data proliferate across virtually every domain. In engineering, the sciences, and business, particularly in the context of big data, analyzing these data can yield actionable insights that may transform the field. During data management and analysis, patterns or groups of interconnected data points, commonly referred to as clusters, frequently emerge. Each cluster is a distinct subset of closely related data points whose characteristics differ from those of other clusters in the same dataset. Across disciplines such as physics, biology, business, and sales, clustering is important for understanding the essential characteristics of novel datasets, developing complex statistical models, and testing hypotheses. However, interpreting the characteristics and physical implications of the clusters produced by different clustering algorithms is challenging for researchers unfamiliar with those algorithms’ inner workings. This research addresses the difficulty of comprehending data clustering, cluster attributes, and evaluation metrics, especially for individuals who lack proficiency in clustering or in related disciplines such as statistics. The primary objective of this study is to simplify cluster analysis by giving users and analysts from diverse domains succinct linguistic summaries of clustering results, avoiding the need for intricate numerical or mathematical terms. To achieve this, deep learning techniques based on large language models are employed, such as encoder-decoders (for example, the T5 model) and generative pre-trained transformers (GPTs). This study aims to construct a summarization model that ingests data clusters and produces a condensed overview of the insights they contain in a simplified, easily understandable linguistic format.
The evaluation revealed a clear preference among evaluators for the summaries generated by GPT, with T5 summaries following closely behind. Both GPT and T5 summaries were fluent, demonstrating their ability to capture the original content in a human-like manner. In contrast, the linguistic protoform-based approach, while providing a structured framework for summarization, has yet to match the quality and coherence of the GPT and T5 summaries.
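
As a rough illustration of the protoform-style baseline mentioned in the abstract, the sketch below turns per-cluster statistics into sentences of the form "Q points in cluster X have above-average F". The quantifier thresholds, the function names, and the toy data are all assumptions made for this example; the abstract does not describe the authors' actual protoforms.

```python
from statistics import mean

# Hypothetical quantifier thresholds; the abstract does not specify any.
QUANTIFIERS = [(0.9, "almost all"), (0.6, "most"), (0.3, "some"), (0.0, "few")]


def quantifier(fraction):
    """Map a fraction in [0, 1] to a linguistic quantifier."""
    for threshold, word in QUANTIFIERS:
        if fraction >= threshold:
            return word
    return "few"


def summarize_cluster(name, points, feature, overall_mean):
    """Build one protoform-style sentence for a cluster and a feature."""
    values = [p[feature] for p in points]
    # Fraction of the cluster's points that lie above the dataset-wide mean.
    above = sum(v > overall_mean for v in values) / len(values)
    return (f"{quantifier(above).capitalize()} points in {name} have "
            f"above-average {feature} (cluster mean {mean(values):.1f}).")


# Toy clusters, invented purely for illustration.
clusters = {
    "cluster A": [{"income": 82.0}, {"income": 91.0}, {"income": 77.0}],
    "cluster B": [{"income": 31.0}, {"income": 58.0}, {"income": 29.0}],
}
overall = mean(p["income"] for pts in clusters.values() for p in pts)
summaries = [summarize_cluster(n, pts, "income", overall)
             for n, pts in clusters.items()]
for sentence in summaries:
    print(sentence)
```

Template sentences like these are easy to audit but rigid, which is consistent with the abstract's observation that the protoform approach trails the GPT and T5 summaries in quality and coherence.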

Citations: 0
Research on small-scale face detection methods in dense scenes
IF 3.4 Zone 2 Computer Science Q2 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE Pub Date : 2025-01-14 DOI: 10.1007/s10489-025-06231-9
Yuan Cao, Bei Zhang, Changqing Wang, Meng Wang

Face detection is the foundation for applications such as face analysis, recognition, and reconstruction. In dense scenes, target scales vary widely, instances occupy very few pixels, and mutual occlusion is severe, all of which lead to inconspicuous feature representations. Existing detection methods rely on convolutional and pooling layers for feature extraction; their deep feature extraction is insufficient and their inference capability is limited, leading to inaccurate recognition and a high miss rate. Therefore, we propose YOLO-SXS, a small-scale face detection model based on an extended Transformer structure, which makes full use of contextual information and feature fusion networks to significantly improve detection performance for small-scale and occluded faces. Specifically, fusing the Swin Transformer with convolutional neural networks (CNNs) for feature extraction enhances the network’s ability to perceive global features; the space-to-depth (SPD-Conv) mapping improves the network’s feature extraction in low-resolution and small-target detection tasks; furthermore, adding fine-grained features significantly improves YOLO-SXS’s detection of small-scale and occluded faces; in addition, a fine-grained feature fusion layer retains feature information to the maximum extent, which effectively reduces the loss of target information. Performance was evaluated on the WIDER FACE, SCUT-HEAD, and FDDB datasets, and the experimental results show that the proposed method significantly improves the recognition of small faces, achieving a high detection rate and a low error rate.
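
To make the space-to-depth idea concrete, here is a minimal sketch of the rearrangement alone, written with plain nested lists. It moves each 2 x 2 spatial block into the channel dimension, so resolution is halved without discarding any values. The function name, the channel ordering, and the toy input are assumptions for illustration; the convolution that follows the rearrangement in SPD-Conv is omitted, and nothing here reflects the authors' implementation.

```python
def space_to_depth(feature_map, block=2):
    """Rearrange an H x W x C map into (H/block) x (W/block) x (C*block*block)."""
    height, width = len(feature_map), len(feature_map[0])
    if height % block or width % block:
        raise ValueError("height and width must be divisible by block")
    out = []
    for i in range(0, height, block):
        row = []
        for j in range(0, width, block):
            channels = []
            # Stack every pixel of the block into the channel axis.
            for di in range(block):
                for dj in range(block):
                    channels.extend(feature_map[i + di][j + dj])
            row.append(channels)
        out.append(row)
    return out


# Toy 4 x 4 map with one channel holding the values 0..15.
fmap = [[[r * 4 + c] for c in range(4)] for r in range(4)]
packed = space_to_depth(fmap)
print(len(packed), len(packed[0]), len(packed[0][0]))
print(packed[0][0])
```

Unlike strided convolution or pooling, this downsampling step keeps every input value, which is the property that matters when a face covers only a handful of pixels.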

Citations: 0
Intelligent gear shifting strategy of mining truck based on deep learning and real-time vehicle condition
IF 3.4 Zone 2 Computer Science Q2 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE Pub Date : 2025-01-13 DOI: 10.1007/s10489-024-06142-1
Qinghua Su, Xiaoyu Xu, Liyong Wang, Dingge Zhang, Min Xie, Pengbo Zhang

Driving conditions in mining areas are complex, and developing a suitable automatic shifting strategy for mining trucks is crucial. However, developing such strategies is challenging because it relies on experience and historical experimental data, which manufacturers guard as their most sensitive trade secrets. In recent years, some shifting strategies based on artificial intelligence technologies have been implemented. However, many of them shift gears based only on the current state of the vehicle, ignoring the influence of historical data. This creates a risk of mis-shifting when unexpected sensor data are received, and repeated gear shifts within a short period can increase the likelihood of transmission damage and degrade the driving experience. To this end, this study proposes a novel gear shifting prediction method based on a multi-parameter Bi-directional Long Short-Term Memory (Bi-LSTM) network operating over continuous time periods. Real-time vehicle state data are collected via the CAN bus, and nine parameters that are positively correlated with gear shifting are selected through R/S analysis. The values of these nine parameters over continuous time periods are fed into the machine learning model to predict gear shifts. The experimental results show that the model predicts gear shifting with 96.85% accuracy at an average time cost of about 3.86 ms, meeting the real-time processing requirement. The model balances prediction accuracy and time consumption, and it overcomes the impact of transient abnormal sensor data. Hence, it has the potential for wide application in predictive models based on data with temporal characteristics.
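
The phrase "values of the parameters within continuous time periods" suggests that the model sees a window of consecutive readings rather than a single snapshot. The sketch below shows one common way to build such windows for a sequence model. The window length, the choice of label (the gear at the window's last step), the three example signals, and all values are assumptions for illustration; the paper uses nine parameters and may frame the target differently.

```python
def sliding_windows(samples, labels, window):
    """Pair each run of `window` consecutive samples with the label at its last step."""
    pairs = []
    for end in range(window, len(samples) + 1):
        pairs.append((samples[end - window:end], labels[end - 1]))
    return pairs


# Hypothetical readings: (speed km/h, engine rpm, throttle %), invented for this example.
samples = [(10, 1200, 20), (14, 1500, 35), (19, 1900, 50),
           (23, 2100, 55), (26, 1600, 40)]
gears = [1, 1, 1, 2, 2]

windows = sliding_windows(samples, gears, window=3)
for sequence, gear in windows:
    print(len(sequence), "steps ->", "gear", gear)
```

Because each prediction depends on several consecutive readings, a single abnormal sample contributes only one step of the input sequence, which is in line with the abstract's claim about robustness to transient sensor errors.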

Citations: 0
MPBE: Multi-perspective boundary enhancement network for aspect sentiment triplet extraction
IF 3.4 Zone 2 Computer Science Q2 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE Pub Date : 2025-01-13 DOI: 10.1007/s10489-024-06144-z
Kun Yang, Liansong Zong, Mingwei Tang, Yanxi Zheng, Yujun Chen, Mingfeng Zhao, Zhongyuan Jiang

Aspect Sentiment Triplet Extraction (ASTE) is an emerging task in sentiment analysis that aims to extract triplets consisting of an aspect term, an opinion term, and a sentiment polarity from review texts. Previous span-based methods often struggle to identify the boundaries of aspect and opinion terms accurately, especially when multiple word spans appear in a sentence. This limitation arises from their reliance on a single, simplistic approach to constructing contextual features. To address these challenges, we propose the Multi-Perspective Boundary Enhancement Network (MPBE). The network captures rich contextual features by adopting a dual-encoder mechanism and constructs multiple channels to further enhance these features. Specifically, we introduce enhanced semantic and syntactic information in two channels, while the third channel transforms the features using the discrete Fourier transform. In addition, we design a dual-graph cross fusion module to fuse features from different channels for more efficient information interaction and integration. Finally, based on a statistical analysis of the length distribution of aspect and opinion terms, we propose a candidate length-based decoding strategy to achieve more accurate decoding. In experiments, the proposed MPBE model achieved F1 scores of 62.32%, 73.78%, 65.32%, and 73.36% on four benchmark datasets (14Lap, 14Res, 15Res, and 16Res, respectively), demonstrating the superiority of the method.
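
Span-based methods score candidate word spans, and the number of candidates grows quadratically with sentence length unless span length is capped. The sketch below shows only that general idea: enumerating spans up to a maximum length. The function name, the example sentence, and the cap of two tokens are assumptions for illustration; the abstract does not spell out how the authors' candidate length-based decoding strategy works.

```python
def candidate_spans(tokens, max_len):
    """Return (start, end, text) for every span of at most max_len tokens; end is exclusive."""
    spans = []
    for start in range(len(tokens)):
        last_end = min(start + max_len, len(tokens))
        for end in range(start + 1, last_end + 1):
            spans.append((start, end, " ".join(tokens[start:end])))
    return spans


tokens = "The battery life is great".split()
spans = candidate_spans(tokens, max_len=2)
print(len(spans), "candidates with max_len=2")
print(len(candidate_spans(tokens, max_len=len(tokens))), "candidates without a cap")
```

Capping the length at a value informed by how long aspect and opinion terms usually are removes implausible candidates before any scoring takes place.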

Citations: 0