ALDANER: Active Learning based Data Augmentation for Named Entity Recognition
Pub Date: 2024-11-04, DOI: 10.1016/j.knosys.2024.112682
Vincenzo Moscato, Marco Postiglione, Giancarlo Sperlì, Andrea Vignali
Training Named Entity Recognition (NER) models typically necessitates the use of extensively annotated datasets. This requirement presents a significant challenge due to the labor-intensive and costly nature of manual annotation, especially in specialized domains such as medicine and finance. To address data scarcity, two strategies have emerged as effective: (1) Active Learning (AL), which autonomously identifies samples that would most enhance model performance if annotated, and (2) data augmentation, which automatically generates new samples. However, while AL reduces human effort, it does not eliminate it entirely, and data augmentation often leads to incomplete and noisy annotations, presenting new hurdles in NER model training. In this study, we integrate AL principles into a data augmentation framework, named Active Learning-based Data Augmentation for NER (ALDANER), to prioritize the selection of informative samples from an augmented pool and mitigate the impact of noisy annotations. Our experiments across various benchmark datasets and few-shot scenarios demonstrate that our approach surpasses several data augmentation baselines, offering insights into promising avenues for future research.
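As a rough illustration of the active-learning-style selection step described above, the sketch below (not the authors' actual acquisition function) scores each augmented sentence by the mean token-level entropy of a current NER model's label distribution and keeps the most uncertain ones; `predict_probs` is a hypothetical stand-in for any token classifier that returns per-token probabilities over the label set.

```python
import math
from typing import Callable, List, Sequence

def mean_token_entropy(tag_probs: Sequence[Sequence[float]]) -> float:
    """Average entropy of the per-token label distributions (higher = more uncertain)."""
    entropies = [-sum(p * math.log(p + 1e-12) for p in token) for token in tag_probs]
    return sum(entropies) / max(len(entropies), 1)

def select_informative(augmented_pool: List[List[str]],
                       predict_probs: Callable[[List[str]], Sequence[Sequence[float]]],
                       k: int) -> List[List[str]]:
    """Rank augmented sentences by model uncertainty and return the top-k for training."""
    scored = [(mean_token_entropy(predict_probs(sentence)), sentence) for sentence in augmented_pool]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [sentence for _, sentence in scored[:k]]
```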
{"title":"ALDANER: Active Learning based Data Augmentation for Named Entity Recognition","authors":"Vincenzo Moscato, Marco Postiglione, Giancarlo Sperlì, Andrea Vignali","doi":"10.1016/j.knosys.2024.112682","DOIUrl":"10.1016/j.knosys.2024.112682","url":null,"abstract":"<div><div>Training Named Entity Recognition (NER) models typically necessitates the use of extensively annotated datasets. This requirement presents a significant challenge due to the labor-intensive and costly nature of manual annotation, especially in specialized domains such as medicine and finance. To address data scarcity, two strategies have emerged as effective: (1) Active Learning (AL), which autonomously identifies samples that would most enhance model performance if annotated, and (2) data augmentation, which automatically generates new samples. However, while AL reduces human effort, it does not eliminate it entirely, and data augmentation often leads to incomplete and noisy annotations, presenting new hurdles in NER model training. In this study, we integrate AL principles into a data augmentation framework, named Active Learning-based Data Augmentation for NER (ALDANER), to prioritize the selection of informative samples from an augmented pool and mitigate the impact of noisy annotations. Our experiments across various benchmark datasets and few-shot scenarios demonstrate that our approach surpasses several data augmentation baselines, offering insights into promising avenues for future research.</div></div>","PeriodicalId":49939,"journal":{"name":"Knowledge-Based Systems","volume":"305 ","pages":"Article 112682"},"PeriodicalIF":7.2,"publicationDate":"2024-11-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142594098","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Local Metric NER: A new paradigm for named entity recognition from a multi-label perspective
Pub Date: 2024-11-04, DOI: 10.1016/j.knosys.2024.112686
Zaifeng Hua, Yifei Chen
As the field of Nested Named Entity Recognition (NNER) advances, it is marked by growing complexity due to the increasing number of multi-label entity instances. Our work focuses on how to identify multi-label entities more effectively and how to exploit the correlations between labels. Unlike previous models, which cast the task as a single-label multi-class classification problem, we propose a novel multi-label local metric NER model that rethinks nested entity recognition from a multi-label perspective. To address the severe sample imbalance commonly encountered in multi-label scenarios, we also introduce a part-of-speech-based strategy that significantly improves the model's performance on imbalanced datasets. Experiments on nested, multi-label, and flat datasets verify the generalization ability and superiority of our model, with results surpassing the existing state of the art (SOTA) on several multi-label and flat benchmarks. Through a series of experimental analyses, we highlight the persistent challenges in multi-label NER. We hope that the insights derived from our work will not only provide new perspectives on the nested NER landscape but also contribute to the momentum needed to advance research on multi-label NER.
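To make the multi-label viewpoint concrete, here is a minimal sketch (not the authors' local-metric architecture) of a span-level multi-label head: each candidate span receives an independent sigmoid probability per entity type and is trained with binary cross-entropy, so a single span can carry several labels at once.

```python
import torch
import torch.nn as nn

class MultiLabelSpanHead(nn.Module):
    """Scores every candidate span independently for each entity type (multi-label)."""
    def __init__(self, span_dim: int, num_entity_types: int):
        super().__init__()
        self.scorer = nn.Linear(span_dim, num_entity_types)  # one logit per entity type

    def forward(self, span_reprs: torch.Tensor) -> torch.Tensor:
        # span_reprs: (num_spans, span_dim) -> logits: (num_spans, num_entity_types)
        return self.scorer(span_reprs)

head = MultiLabelSpanHead(span_dim=256, num_entity_types=8)
spans = torch.randn(32, 256)                    # 32 candidate span representations
targets = torch.randint(0, 2, (32, 8)).float()  # multi-hot labels: a span may have several types
loss = nn.BCEWithLogitsLoss()(head(spans), targets)
```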
{"title":"Local Metric NER: A new paradigm for named entity recognition from a multi-label perspective","authors":"Zaifeng Hua, Yifei Chen","doi":"10.1016/j.knosys.2024.112686","DOIUrl":"10.1016/j.knosys.2024.112686","url":null,"abstract":"<div><div>As the field of Nested Named Entity Recognition (NNER) advances, it is marked by a growing complexity due to the increasing number of multi-label entity instances. How to more effectively identify multi-label entities and explore the correlation between labels is the focus of our work. Unlike previous models that are modeled in single-label multi-classification problems, we propose a novel multi-label local metric NER model to rethink Nested Entity Recognition from a multi-label perspective. Simultaneously, to address the significant sample imbalance problem commonly encountered in multi-label scenarios, we introduce a parts-of-speech-based strategy that significantly improves the model’s performance on imbalanced datasets. Experiments on nested, multi-label, and flat datasets verify the generalization and superiority of our model, with results surpassing the existing state-of-the-art (SOTA) on several multi-label and flat benchmarks. After a series of experimental analyses, we highlight the persistent challenges in the multi-label NER. We are hopeful that the insights derived from our work will not only provide new perspectives on the nested NER landscape but also contribute to the ongoing momentum necessary for advancing research in the field of multi-label NER.</div></div>","PeriodicalId":49939,"journal":{"name":"Knowledge-Based Systems","volume":"305 ","pages":"Article 112686"},"PeriodicalIF":7.2,"publicationDate":"2024-11-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142594091","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
CRATI: Contrastive representation-based multimodal sound event localization and detection
Pub Date: 2024-11-04, DOI: 10.1016/j.knosys.2024.112692
Shichao Wu, Yongru Wang, Yushan Jiang, Qianyi Zhang, Jingtai Liu
Sound event localization and detection (SELD) refers to classifying sound categories and estimating their locations with acoustic models on the same multichannel audio. Recently, SELD has been evolving rapidly by leveraging advanced approaches from other research areas, and the benchmark SELD datasets have become increasingly realistic, now providing simultaneously captured video. Because vibration produces sound, we usually associate visual objects with the sounds they make, e.g., we hear footsteps from a walking person and a jangle from a ringing bell. It is therefore natural to use multimodal information (image, audio, and text rather than audio alone) to strengthen sound event detection (SED) accuracy and reduce sound source localization (SSL) errors. In this paper, we propose a contrastive representation-based multimodal acoustic model (CRATI) for SELD, which is designed to learn contrastive audio representations from audio, text, and image in an end-to-end manner. Experiments on the real-world STARSS23 dataset and the synthesized TAU-NIGENS Spatial Sound Events 2021 dataset both show that CRATI learns more effective audio features when additional constraints minimize the difference between audio and text (the SED and SSL annotations in this work). Image input does not help improve SELD performance, as only minor visual changes can be observed across consecutive frames. Compared to the baseline system, our model increases the SED F-score by 11% and decreases the SSL error by 31.02° on the STARSS23 dataset.
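The following is a minimal sketch of the kind of audio-text contrastive objective the abstract implies: a symmetric InfoNCE loss that pulls each audio clip toward its SED/SSL text annotation within a batch. The encoders, batch construction, and the exact loss used by CRATI are not reproduced here.

```python
import torch
import torch.nn.functional as F

def contrastive_audio_text_loss(audio_emb: torch.Tensor,
                                text_emb: torch.Tensor,
                                temperature: float = 0.07) -> torch.Tensor:
    # audio_emb, text_emb: (batch, dim); row i of each modality describes the same clip
    a = F.normalize(audio_emb, dim=-1)
    t = F.normalize(text_emb, dim=-1)
    logits = a @ t.T / temperature                      # pairwise similarities
    labels = torch.arange(a.size(0), device=a.device)   # matching pairs lie on the diagonal
    return 0.5 * (F.cross_entropy(logits, labels) + F.cross_entropy(logits.T, labels))

loss = contrastive_audio_text_loss(torch.randn(16, 128), torch.randn(16, 128))
```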
{"title":"CRATI: Contrastive representation-based multimodal sound event localization and detection","authors":"Shichao Wu , Yongru Wang , Yushan Jiang , Qianyi Zhang , Jingtai Liu","doi":"10.1016/j.knosys.2024.112692","DOIUrl":"10.1016/j.knosys.2024.112692","url":null,"abstract":"<div><div>Sound event localization and detection (SELD) refers to classifying sound categories and locating their locations with acoustic models on the same multichannel audio. Recently, SELD has been rapidly evolving by leveraging advanced approaches from other research areas, and the benchmark SELD datasets have become increasingly realistic with simultaneously captured videos provided. Vibration produces sound, we usually associate visual objects with their sound, i.e., we hear footsteps from a walking person, and hear a jangle from one running bell. It comes naturally to think about using multimodal information (image–audio–text vs audio merely), to strengthen sound event detection (SED) accuracies and decrease sound source localization (SSL) errors. In this paper, we propose one contrastive representation-based multimodal acoustic model (CRATI) for SELD, which is designed to learn contrastive audio representations from audio, text, and image in an end-to-end manner. Experiments on the real dataset of STARSS23 and the synthesized dataset of TAU-NIGENS Spatial Sound Events 2021 both show that our CRATI model can learn more effective audio features with additional constraints to minimize the difference among audio and text (SED and SSL annotations in this work). Image input is not conducive to improving SELD performance, as only minor visual changes can be observed from consecutive frames. Compared to the baseline system, our model increases the SED F-score by 11% and decreases the SSL error by 31.02<span><math><mo>°</mo></math></span> on the STARSS23 dataset, respectively.</div></div>","PeriodicalId":49939,"journal":{"name":"Knowledge-Based Systems","volume":"305 ","pages":"Article 112692"},"PeriodicalIF":7.2,"publicationDate":"2024-11-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142594097","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Robust deadline-aware network function parallelization framework under demand uncertainty
Pub Date: 2024-11-03, DOI: 10.1016/j.knosys.2024.112696
Bo Meng, Amin Rezaeipanah
The orchestration of Service Function Chains (SFCs) in Mobile Edge Computing (MEC) is crucial for ensuring efficient service provision, especially under dynamic and uncertain demand. Meanwhile, parallelizing the Virtual Network Functions (VNFs) within an SFC can further optimize resource usage and reduce the risk of deadline violations. However, most existing works formulate the SFC orchestration problem in MEC with deterministic demands and rely on costly runtime resource reprovisioning to handle dynamic demands. This paper introduces a Robust Deadline-aware network function Parallelization framework under Demand Uncertainty (RDPDU) designed to address the challenges posed by unpredictable fluctuations in user demand and resource availability within MEC networks. RDPDU accounts for end-to-end latency during SFC assembly by modeling load-dependent processing latency and load-independent propagation latency. It also formulates the problem under uncertain demand as a Quadratic Integer Program (QIP), making it resistant to dynamic fluctuations in service demand. By discovering dependencies between VNFs, RDPDU effectively assembles multiple sub-SFCs instead of the original SFC. Finally, our framework uses Deep Reinforcement Learning (DRL) to assemble sub-SFCs with guaranteed latency and deadlines. By integrating DRL into the SFC orchestration problem, the framework adapts to changing network conditions and demand patterns, improving the overall system's flexibility and robustness. Experimental evaluations show that the proposed framework effectively handles demand fluctuations, latency, deadlines, and scalability, and improves performance over recent algorithms.
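As a toy illustration of the latency decomposition mentioned above, the sketch below sums a load-dependent processing delay per VNF and load-independent propagation delays per link for one sub-SFC; the M/M/1-style delay form and all parameters are assumptions made for illustration, not the RDPDU formulation.

```python
from dataclasses import dataclass
from typing import List

@dataclass
class VNF:
    service_rate: float  # requests per ms the VNF can process

def processing_latency(vnf: VNF, load: float) -> float:
    """Load-dependent delay: grows as the offered load approaches the VNF's service rate."""
    residual = max(vnf.service_rate - load, 1e-6)
    return 1.0 / residual  # simple M/M/1-style approximation (assumption)

def chain_latency(vnfs: List[VNF], load: float, link_delays: List[float]) -> float:
    """End-to-end latency of one sub-SFC = processing delays plus propagation delays."""
    return sum(processing_latency(v, load) for v in vnfs) + sum(link_delays)

latency = chain_latency([VNF(10.0), VNF(8.0)], load=5.0, link_delays=[0.3, 0.2, 0.3])
```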
{"title":"Robust deadline-aware network function parallelization framework under demand uncertainty","authors":"Bo Meng , Amin Rezaeipanah","doi":"10.1016/j.knosys.2024.112696","DOIUrl":"10.1016/j.knosys.2024.112696","url":null,"abstract":"<div><div>The orchestration of Service Function Chains (SFCs) in Mobile Edge Computing (MEC) becomes crucial for ensuring efficient service provision, especially under dynamic and uncertain demand. Meanwhile, the parallelization of Virtual Network Functions (VNFs) within an SFC can further optimize resource usage and reduce the risk of deadline violations. However, most existing works formulate the SFC orchestration problem in MEC with deterministic demands and costly runtime resource reprovisioning to handle dynamic demands. This paper introduces a Robust Deadline-aware network function Parallelization framework under Demand Uncertainty (RDPDU) designed to address the challenges posed by unpredictable fluctuations in user demand and resource availability within MEC networks. RDPDU to consider end-to-end latency for SFC assembly by modeling load-dependent processing latency and load-independent propagation latency. Also, RDPDU formulates the problem assuming uncertain demand by Quadratic Integer Programming (QIP) to be resistant to dynamic service demand fluctuations. By discovering dependencies between VNFs, the RDPDU effectively assembles multiple sub-SFCs instead of the original SFC. Finally, our framework uses Deep Reinforcement Learning (DRL) to assemble sub-SFCs with guaranteed latency and deadline. By integrating DRL into the SFC orchestration problem, the framework adapts to changing network conditions and demand patterns, improving the overall system's flexibility and robustness. Experimental evaluations show that the proposed framework can effectively deal with demand fluctuations, latency, deadline, and scalability and improve performance against recent algorithms.</div></div>","PeriodicalId":49939,"journal":{"name":"Knowledge-Based Systems","volume":"305 ","pages":"Article 112696"},"PeriodicalIF":7.2,"publicationDate":"2024-11-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142594096","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Multi-view representation learning with dual-label collaborative guidance
Pub Date: 2024-11-02, DOI: 10.1016/j.knosys.2024.112680
Bin Chen, Xiaojin Ren, Shunshun Bai, Ziyuan Chen, Qinghai Zheng, Jihua Zhu
Multi-view Representation Learning (MRL) has recently attracted widespread attention because it can integrate information from diverse data sources to achieve better performance. However, existing MRL methods still have two issues: (1) They typically perform various consistency objectives within the feature space, which might discard complementary information contained in each view. (2) Some methods only focus on handling inter-view relationships while ignoring inter-sample relationships that are also valuable for downstream tasks. To address these issues, we propose a novel Multi-view representation learning method with Dual-label Collaborative Guidance (MDCG). Specifically, we fully excavate and utilize valuable semantic and graph information hidden in multi-view data to collaboratively guide the learning process of MRL. By learning consistent semantic labels from distinct views, our method enhances intrinsic connections across views while preserving view-specific information, which contributes to learning the consistent and complementary unified representation. Moreover, we integrate similarity matrices of multiple views to construct graph labels that indicate inter-sample relationships. With the idea of self-supervised contrastive learning, graph structure information implied in graph labels is effectively captured by the unified representation, thus enhancing its discriminability. Extensive experiments on diverse real-world datasets demonstrate the effectiveness and superiority of MDCG compared with nine state-of-the-art methods. Our code will be available at https://github.com/Bin1Chen/MDCG.
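A minimal sketch, under assumptions, of the graph-label idea: per-view cosine similarity matrices are averaged and reduced to each sample's top-k neighbors to form a graph label, which then serves as the positive mask in a supervised-contrastive-style loss on the unified representation. The fusion rule and loss form here are illustrative, not the exact MDCG construction.

```python
import torch
import torch.nn.functional as F

def build_graph_label(view_features, k: int = 5) -> torch.Tensor:
    # Average cosine-similarity matrices across views, then keep each sample's top-k neighbors.
    sims = [F.normalize(v, dim=-1) @ F.normalize(v, dim=-1).T for v in view_features]
    fused = torch.stack(sims).mean(dim=0)
    topk = fused.topk(k + 1, dim=-1).indices             # +1 because self is always most similar
    graph = torch.zeros_like(fused).scatter_(1, topk, 1.0)
    graph.fill_diagonal_(0.0)                            # self-pairs are excluded
    return graph

def graph_contrastive_loss(unified: torch.Tensor, graph: torch.Tensor, tau: float = 0.5):
    z = F.normalize(unified, dim=-1)
    logits = z @ z.T / tau
    eye = torch.eye(z.size(0), dtype=torch.bool, device=z.device)
    logits = logits.masked_fill(eye, -1e9)               # exclude self-similarity
    log_prob = logits - torch.logsumexp(logits, dim=1, keepdim=True)
    pos_count = graph.sum(dim=1).clamp(min=1.0)
    return -(graph * log_prob).sum(dim=1).div(pos_count).mean()

views = [torch.randn(64, 32), torch.randn(64, 48)]       # two views of 64 samples
loss = graph_contrastive_loss(torch.randn(64, 16), build_graph_label(views))
```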
{"title":"Multi-view representation learning with dual-label collaborative guidance","authors":"Bin Chen , Xiaojin Ren , Shunshun Bai , Ziyuan Chen , Qinghai Zheng , Jihua Zhu","doi":"10.1016/j.knosys.2024.112680","DOIUrl":"10.1016/j.knosys.2024.112680","url":null,"abstract":"<div><div>Multi-view Representation Learning (MRL) has recently attracted widespread attention because it can integrate information from diverse data sources to achieve better performance. However, existing MRL methods still have two issues: (1) They typically perform various consistency objectives within the feature space, which might discard complementary information contained in each view. (2) Some methods only focus on handling inter-view relationships while ignoring inter-sample relationships that are also valuable for downstream tasks. To address these issues, we propose a novel Multi-view representation learning method with Dual-label Collaborative Guidance (MDCG). Specifically, we fully excavate and utilize valuable semantic and graph information hidden in multi-view data to collaboratively guide the learning process of MRL. By learning consistent semantic labels from distinct views, our method enhances intrinsic connections across views while preserving view-specific information, which contributes to learning the consistent and complementary unified representation. Moreover, we integrate similarity matrices of multiple views to construct graph labels that indicate inter-sample relationships. With the idea of self-supervised contrastive learning, graph structure information implied in graph labels is effectively captured by the unified representation, thus enhancing its discriminability. Extensive experiments on diverse real-world datasets demonstrate the effectiveness and superiority of MDCG compared with nine state-of-the-art methods. Our code will be available at <span><span>https://github.com/Bin1Chen/MDCG</span><svg><path></path></svg></span>.</div></div>","PeriodicalId":49939,"journal":{"name":"Knowledge-Based Systems","volume":"305 ","pages":"Article 112680"},"PeriodicalIF":7.2,"publicationDate":"2024-11-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142594095","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
PMCN: Parallax-motion collaboration network for stereo video dehazing
Pub Date: 2024-11-02, DOI: 10.1016/j.knosys.2024.112681
Chang Wu, Gang He, Wanlin Zhao, Xinquan Lai, Yunsong Li
Despite progress in learning-based stereo dehazing, few studies have focused on stereo video dehazing (SVD). Existing methods may fall short on the SVD task by not fully leveraging multi-domain information. To address this gap, we propose a parallax-motion collaboration network (PMCN) that integrates parallax and motion information for efficient stereo video fog removal. We carefully design a parallax-motion collaboration block (PMCB) as the critical component of PMCN. Firstly, to capture binocular parallax correspondences more efficiently, we introduce a window-based parallax attention mechanism (W-PAM) in the parallax interaction module (PIM) of PMCB. By horizontally splitting the whole frame into multiple windows and extracting parallax relationships within each window, memory usage and runtime can be reduced. Meanwhile, we further conduct horizontal feature modulation to handle cross-window disparity variations. Secondly, a motion alignment module (MAM) based on deformable convolution explores the temporal correlation in the feature space for each independent view. Finally, we propose a fog-adaptive refinement module (FARM) to refine the features after interaction and alignment. FARM incorporates fog prior information and guides the network in dynamically generating processing kernels for dehazing to adapt to different fog scenarios. Quantitative and qualitative results demonstrate that the proposed PMCN outperforms state-of-the-art methods on both synthetic and real-world datasets. In addition, PMCN also improves accuracy on high-level vision tasks in foggy scenes, e.g., object detection and stereo matching.
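A rough sketch, under assumptions, of window-based parallax attention: the left and right feature maps are split into horizontal windows along the width (the disparity axis) and attention is computed only inside each window, which is what bounds memory and runtime. The exact W-PAM design, including the cross-window feature modulation, is not reproduced here.

```python
import torch

def windowed_parallax_attention(left: torch.Tensor, right: torch.Tensor, win: int) -> torch.Tensor:
    # left, right: (B, C, H, W); disparity in rectified stereo pairs lies along W,
    # so attention is computed over W, independently per row and per horizontal window.
    B, C, H, W = left.shape
    assert W % win == 0, "width must be divisible by the window size in this sketch"
    def split(x):  # -> (B, H, num_windows, win, C)
        return x.permute(0, 2, 3, 1).reshape(B, H, W // win, win, C)
    q, k, v = split(left), split(right), split(right)
    attn = torch.softmax(q @ k.transpose(-1, -2) / C ** 0.5, dim=-1)  # (B, H, nW, win, win)
    out = attn @ v                                                    # right features warped toward left
    return out.reshape(B, H, W, C).permute(0, 3, 1, 2)

fused = windowed_parallax_attention(torch.randn(2, 32, 64, 128), torch.randn(2, 32, 64, 128), win=32)
```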
{"title":"PMCN: Parallax-motion collaboration network for stereo video dehazing","authors":"Chang Wu , Gang He , Wanlin Zhao , Xinquan Lai , Yunsong Li","doi":"10.1016/j.knosys.2024.112681","DOIUrl":"10.1016/j.knosys.2024.112681","url":null,"abstract":"<div><div>Despite progress in learning-based stereo dehazing, few studies have focused on stereo video dehazing (SVD). Existing methods may fall short in the SVD task by not fully leveraging multi-domain information. To address this gap, we propose a parallax-motion collaboration network (PMCN) that integrates parallax and motion information for efficient stereo video fog removal. We delicately design a parallax-motion collaboration block (PMCB) as the critical component of PMCN. Firstly, to capture binocular parallax correspondences more efficiently, we introduce a window-based parallax attention mechanism (W-PAM) in the parallax interaction module (PIM) of PMCB. By horizontally splitting the whole frame into multiple windows and extracting parallax relationships within each window, memory usage and runtime can be reduced. Meanwhile, we further conduct horizontal feature modulation to handle cross-window disparity variations. Secondly, a motion alignment module (MAM) based on deformable convolution explores the temporal correlation in the feature space for an independent view. Finally, we propose a fog-adaptive refinement module (FARM) to refine the features after interaction and alignment. FARM incorporates fog prior information and guides the network in dynamically generating processing kernels for dehazing to adapt to different fog scenarios. Quantitative and qualitative results demonstrate that the proposed PMCN outperforms state-of-the-art methods on both synthetic and real-world datasets. In addition, our PMCN also benefits the accuracy improvement for high-level vision tasks in fog scenes, e.g., object detection and stereo matching.</div></div>","PeriodicalId":49939,"journal":{"name":"Knowledge-Based Systems","volume":"305 ","pages":"Article 112681"},"PeriodicalIF":7.2,"publicationDate":"2024-11-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142594094","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Path relinking strategies for the bi-objective double floor corridor allocation problem
Pub Date: 2024-10-30, DOI: 10.1016/j.knosys.2024.112666
Nicolás R. Uribe, Alberto Herrán, J. Manuel Colmenar
The bi-objective Double Floor Corridor Allocation Problem is an operational research problem whose goal is to find the best arrangement of facilities in a layout with two corridors located on two floors, in order to minimize both the material handling costs and the corridor length. In this paper, we present a novel approach based on a combination of Path Relinking strategies. To this end, we propose two greedy algorithms to produce an initial set of non-dominated solutions. In the first stage, we apply an Interior Path Relinking to improve this set and, in the second stage, an Exterior Path Relinking to reach solutions that are unreachable in the first stage. Our extensive experimental analysis shows that, after automatic parameter optimization, our method completely dominates the previous benchmarks while requiring shorter computation times. In addition, we provide detailed results for the new instances, including standard metrics for multi-objective problems.
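As a generic illustration of a path relinking move (not the paper's bi-objective model), the sketch below walks from an initiating permutation toward a guiding one by swapping one facility at a time into the position it occupies in the guiding solution, evaluating every intermediate layout and keeping the best; the cost function and encoding are placeholders.

```python
from typing import Callable, List, Tuple

def path_relinking(initiating: List[int], guiding: List[int],
                   cost: Callable[[List[int]], float]) -> Tuple[List[int], float]:
    """Move from the initiating layout toward the guiding layout, keeping the best intermediate."""
    current = initiating[:]
    best, best_cost = current[:], cost(current)
    for i in range(len(guiding)):
        if current[i] != guiding[i]:
            j = current.index(guiding[i])                    # where the guiding facility sits now
            current[i], current[j] = current[j], current[i]  # one relinking move
            c = cost(current)
            if c < best_cost:
                best, best_cost = current[:], c
    return best, best_cost

# Toy usage: minimize the sum of |facility - position| as a stand-in single objective.
layout_a, layout_b = [3, 1, 4, 0, 2], [0, 1, 2, 3, 4]
best, value = path_relinking(layout_a, layout_b, lambda s: sum(abs(f - p) for p, f in enumerate(s)))
```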
{"title":"Path relinking strategies for the bi-objective double floor corridor allocation problem","authors":"Nicolás R. Uribe, Alberto Herrán, J. Manuel Colmenar","doi":"10.1016/j.knosys.2024.112666","DOIUrl":"10.1016/j.knosys.2024.112666","url":null,"abstract":"<div><div>The bi-objective Double Floor Corridor Allocation Problem is an operational research problem with the goal of finding the best arrangement of facilities in a layout with two corridors located in two floors, in order to minimize the material handling costs and the corridor length. In this paper, we present a novel approach based on a combination of Path Relinking strategies. To this aim, we propose two greedy algorithms to produce an initial set of non-dominated solutions. In a first stage, we apply an Interior Path Relinking with the aim of improving this set and, in the second stage, apply an Exterior Path Relinking to reach solutions that are unreachable in the first stage. Our extensive experimental analysis shows that our method, after automatic parameter optimization, completely dominates the previous benchmarks, spending shorter computation times. In addition, we provide detailed results for the new instances, including standard metrics for multi-objective problems.</div></div>","PeriodicalId":49939,"journal":{"name":"Knowledge-Based Systems","volume":"305 ","pages":"Article 112666"},"PeriodicalIF":7.2,"publicationDate":"2024-10-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142587440","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Multi-view multi-behavior interest learning network and contrastive learning for multi-behavior recommendation
Pub Date: 2024-10-28, DOI: 10.1016/j.knosys.2024.112604
Jieyang Su, Yuzhong Chen, Xiuqiang Lin, Jiayuan Zhong, Chen Dong
The recommendation system aims to recommend items to users by capturing their personalized interests. Traditional recommendation systems typically focus on modeling target behaviors between users and items. However, in practical application scenarios, various types of behaviors (e.g., click, favorite, purchase, etc.) occur between users and items. Despite recent efforts in modeling various behavior types, multi-behavior recommendation still faces two significant challenges. The first challenge is how to comprehensively capture the complex relationships between various types of behaviors, including their interest differences and interest commonalities. The second challenge is how to solve the sparsity of target behaviors while ensuring the authenticity of information from various types of behaviors. To address these issues, a multi-behavior recommendation framework based on Multi-View Multi-Behavior Interest Learning Network and Contrastive Learning (MMNCL) is proposed. This framework includes a multi-view multi-behavior interest learning module that consists of two submodules: the behavior difference aware submodule, which captures intra-behavior interests for each behavior type and the correlations between various types of behaviors, and the behavior commonality aware submodule, which captures the information of interest commonalities between various types of behaviors. Additionally, a multi-view contrastive learning module is proposed to conduct node self-discrimination, ensuring the authenticity of information integration among various types of behaviors, and facilitating an effective fusion of interest differences and interest commonalities. Experimental results on three real-world benchmark datasets demonstrate the effectiveness of MMNCL and its advantages over other state-of-the-art recommendation models. Our code is available at https://github.com/sujieyang/MMNCL.
{"title":"Multi-view multi-behavior interest learning network and contrastive learning for multi-behavior recommendation","authors":"Jieyang Su, Yuzhong Chen, Xiuqiang Lin, Jiayuan Zhong, Chen Dong","doi":"10.1016/j.knosys.2024.112604","DOIUrl":"10.1016/j.knosys.2024.112604","url":null,"abstract":"<div><div>The recommendation system aims to recommend items to users by capturing their personalized interests. Traditional recommendation systems typically focus on modeling target behaviors between users and items. However, in practical application scenarios, various types of behaviors (e.g., click, favorite, purchase, etc.) occur between users and items. Despite recent efforts in modeling various behavior types, multi-behavior recommendation still faces two significant challenges. The first challenge is how to comprehensively capture the complex relationships between various types of behaviors, including their interest differences and interest commonalities. The second challenge is how to solve the sparsity of target behaviors while ensuring the authenticity of information from various types of behaviors. To address these issues, a multi-behavior recommendation framework based on Multi-View Multi-Behavior Interest Learning Network and Contrastive Learning (MMNCL) is proposed. This framework includes a multi-view multi-behavior interest learning module that consists of two submodules: the behavior difference aware submodule, which captures intra-behavior interests for each behavior type and the correlations between various types of behaviors, and the behavior commonality aware submodule, which captures the information of interest commonalities between various types of behaviors. Additionally, a multi-view contrastive learning module is proposed to conduct node self-discrimination, ensuring the authenticity of information integration among various types of behaviors, and facilitating an effective fusion of interest differences and interest commonalities. Experimental results on three real-world benchmark datasets demonstrate the effectiveness of MMNCL and its advantages over other state-of-the-art recommendation models. Our code is available at <span><span>https://github.com/sujieyang/MMNCL</span><svg><path></path></svg></span>.</div></div>","PeriodicalId":49939,"journal":{"name":"Knowledge-Based Systems","volume":"305 ","pages":"Article 112604"},"PeriodicalIF":7.2,"publicationDate":"2024-10-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142553654","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
FS-PTL: A unified few-shot partial transfer learning framework for partial cross-domain fault diagnosis under limited data scenarios
Pub Date: 2024-10-28, DOI: 10.1016/j.knosys.2024.112658
Liu Cheng, Haochen Qi, Rongcai Ma, Xiangwei Kong, Yongchao Zhang, Yunpeng Zhu
Traditional supervised learning-based fault-diagnosis models often encounter performance degradation when data distribution shifts occur. Although unsupervised transfer learning can address such issues, most existing methods face challenges arising from partial cross-domain diagnostic scenarios with limited training data. Therefore, this study introduces a unified few-shot partial-transfer learning framework, specifically designed to address the limitations of data scarcity and partial cross-domain diagnosis applicability. Our framework innovatively takes ridge regression-based feature reconstruction as a nexus to integrate episodic learning with an episodic pretext task and weighted feature alignment, thereby enhancing model adaptability across varying working conditions with minimal data. Specifically, the episodic pretext task enables the learned features with generalization abilities in a self-supervised manner to mitigate meta-overfitting. Weighted feature alignment is performed at the reconstructed feature level, allowing partial transfer with a significantly increased number of features, while further reducing overfitting. Experiments conducted on two distinct datasets revealed that the proposed method outperforms existing state-of-the-art approaches, demonstrating superior transfer performance and robustness under the conditions of limited fault samples.
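The ridge-regression-based feature reconstruction mentioned above can be illustrated with a closed-form sketch: a query feature is reconstructed from each class's support features, and the query is assigned to the class with the smallest reconstruction error. Dimensions, the regularization value, and the decision rule are illustrative assumptions, not the full FS-PTL framework.

```python
import numpy as np

def ridge_reconstruction_error(support: np.ndarray, query: np.ndarray, lam: float = 0.1) -> float:
    """support: (n_shots, d) features of one class; query: (d,) feature to reconstruct."""
    S = support
    # Closed-form ridge weights: w = (S S^T + lam I)^-1 S q; reconstruction = S^T w
    w = np.linalg.solve(S @ S.T + lam * np.eye(S.shape[0]), S @ query)
    reconstruction = S.T @ w
    return float(np.linalg.norm(query - reconstruction))

def classify(query: np.ndarray, support_by_class: dict) -> str:
    """Assign the query to the class whose support set reconstructs it best."""
    errors = {c: ridge_reconstruction_error(s, query) for c, s in support_by_class.items()}
    return min(errors, key=errors.get)

rng = np.random.default_rng(0)
support_sets = {"healthy": rng.normal(size=(5, 64)), "faulty": rng.normal(size=(5, 64))}
predicted = classify(rng.normal(size=64), support_sets)
```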
{"title":"FS-PTL: A unified few-shot partial transfer learning framework for partial cross-domain fault diagnosis under limited data scenarios","authors":"Liu Cheng , Haochen Qi , Rongcai Ma , Xiangwei Kong , Yongchao Zhang , Yunpeng Zhu","doi":"10.1016/j.knosys.2024.112658","DOIUrl":"10.1016/j.knosys.2024.112658","url":null,"abstract":"<div><div>Traditional supervised learning-based fault-diagnosis models often encounter performance degradation when data distribution shifts occur. Although unsupervised transfer learning can address such issues, most existing methods face challenges arising from partial cross-domain diagnostic scenarios with limited training data. Therefore, this study introduces a unified few-shot partial-transfer learning framework, specifically designed to address the limitations of data scarcity and partial cross-domain diagnosis applicability. Our framework innovatively takes ridge regression-based feature reconstruction as a nexus to integrate episodic learning with an episodic pretext task and weighted feature alignment, thereby enhancing model adaptability across varying working conditions with minimal data. Specifically, the episodic pretext task enables the learned features with generalization abilities in a self-supervised manner to mitigate meta-overfitting. Weighted feature alignment is performed at the reconstructed feature level, allowing partial transfer with a significantly increased number of features, while further reducing overfitting. Experiments conducted on two distinct datasets revealed that the proposed method outperforms existing state-of-the-art approaches, demonstrating superior transfer performance and robustness under the conditions of limited fault samples.</div></div>","PeriodicalId":49939,"journal":{"name":"Knowledge-Based Systems","volume":"305 ","pages":"Article 112658"},"PeriodicalIF":7.2,"publicationDate":"2024-10-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142573334","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Intelligent fault diagnosis for tribo-mechanical systems by machine learning: Multi-feature extraction and ensemble voting methods
Pub Date: 2024-10-28, DOI: 10.1016/j.knosys.2024.112694
V. Shandhoosh, Naveen Venkatesh S, Ganjikunta Chakrapani, V. Sugumaran, Sangharatna M. Ramteke, Max Marian
Timely fault detection is crucial for preventing issues such as worn clutch plates and excessive friction material degradation, enhancing fuel efficiency, and prolonging clutch lifespan. This study focuses on early fault diagnosis in dry friction clutch systems using machine learning (ML) techniques. Vibration data are analyzed under different load and fault conditions, extracting statistical, histogram, and auto-regressive moving average (ARMA) features. Feature selection employs the J48 decision tree algorithm, evaluated with eight ML classifiers: support vector machines (SVM), k-nearest neighbor (kNN), linear model tree (LMT), random forest (RF), multilayer perceptron (MLP), logistic regression (LR), J48, and Naive Bayes. The evaluation revealed that, with statistical features, individual classifiers achieved their highest testing accuracies of 83% for both MLP and LR at no load, 90% for MLP at 5 kg, and 93% for kNN at 10 kg. With histogram features, kNN and MLP both reached 85% at no load, MLP achieved 91% at 5 kg, and RF attained 97% at 10 kg. With ARMA features, kNN reached 93% at no load, LR achieved 94% at 5 kg, and RF reached 86% at 10 kg. The voting strategy notably improved these results, with the RF-kNN-J48 ensemble reaching 98% for histogram features at 10 kg, the kNN-LMT-RF ensemble achieving 94% for ARMA features at no load, and the SVM-MLP-LMT ensemble attaining 95% for ARMA features at 5 kg. Hence, a combination of three classifiers under the majority voting rule consistently outperforms standalone classifiers, striking a balance between diversity and complexity and facilitating robust decision-making. In practical applications, selecting the optimal combination of feature selection method and classifier is vital for accurate fault classification. This study provides valuable guidance for engineers and practitioners implementing robust load classification systems in industrial settings.
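The hard (majority) voting idea reported above can be sketched with scikit-learn's VotingClassifier. Here DecisionTreeClassifier stands in for J48, the synthetic features stand in for the extracted statistical/histogram/ARMA features, and no hyperparameters from the study are assumed.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier, VotingClassifier
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier

# Stand-in feature matrix for vibration-derived features under several fault classes.
X, y = make_classification(n_samples=300, n_features=12, n_informative=6, n_classes=3,
                           random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=0)

ensemble = VotingClassifier(
    estimators=[("rf", RandomForestClassifier(random_state=0)),
                ("knn", KNeighborsClassifier(n_neighbors=5)),
                ("dt", DecisionTreeClassifier(random_state=0))],
    voting="hard")  # each classifier casts one vote; the majority label wins

ensemble.fit(X_train, y_train)
print("ensemble accuracy:", ensemble.score(X_test, y_test))
```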
{"title":"Intelligent fault diagnosis for tribo-mechanical systems by machine learning: Multi-feature extraction and ensemble voting methods","authors":"V. Shandhoosh , Naveen Venkatesh S , Ganjikunta Chakrapani , V. Sugumaran , Sangharatna M. Ramteke , Max Marian","doi":"10.1016/j.knosys.2024.112694","DOIUrl":"10.1016/j.knosys.2024.112694","url":null,"abstract":"<div><div>Timely fault detection is crucial for preventing issues like worn clutch plates and excessive friction material degradation, enhancing fuel efficiency, and prolonging clutch lifespan. This study focuses on early fault diagnosis in dry friction clutch systems using machine learning (ML) techniques. Vibration data is analyzed under different load and fault conditions, extracting statistical, histogram, and auto-regressive moving average (ARMA) features. Feature selection employs the J48 decision tree algorithm, evaluated with eight ML classifiers: support vector machines (SVM), k-nearest neighbor (kNN), linear model tree (LMT), random forest (RF), multilayer perceptron (MLP), logistic regression (LR), J48, and Naive Bayes. The evaluation revealed that individual classifiers achieved the highest testing accuracies with statistical feature selection as 83% for both MLP and LR at no load, 90% for MLP at 5 kg, and 93% for KNN at 10 kg. For histogram feature selection, KNN and MLP both reached 85% at no load, MLP achieved 91% at 5 kg, and RF attained 97% at 10 kg. With ARMA feature selection, KNN reached 93% at no load, LR achieved 94% at 5 kg, and RF reached 86% at 10 kg. The voting strategy notably improved these results, with the RF-KNN-J48 ensemble reaching 98% for histogram features at 10 kg, the KNN-LMT-RF ensemble achieving 94% for ARMA features at no load, and the SVM-MLP-LMT ensemble attaining 95% for ARMA features at 5 kg. Hence, a combination of three classifiers using the majority voting rule consistently outperforms standalone classifiers, striking a balance between diversity and complexity, facilitating robust decision-making. In practical applications, selecting the optimal combination of feature selection method and classifier is vital for accurate fault classification. This study provides valuable guidance for engineers and practitioners implementing robust load classification systems in industrial settings.</div></div>","PeriodicalId":49939,"journal":{"name":"Knowledge-Based Systems","volume":"305 ","pages":"Article 112694"},"PeriodicalIF":7.2,"publicationDate":"2024-10-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142553657","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}