Demystifying deep credit models in e-commerce lending: An explainable approach to consumer creditworthiness
Chaoqun Wang, Yijun Li, Siyi Wang, Qi Wu
Pub Date: 2025-02-13, DOI: 10.1016/j.knosys.2025.113141
The ‘Buy Now, Pay Later’ service has revolutionized consumer credit, particularly in e-commerce, by offering flexible options and competitive rates. However, assessing credit risk remains challenging due to limited personal information. Given the availability of consumer online activities, including shopping and credit behaviors, and the necessity for model explanation in high-stakes applications such as credit risk management, we propose an intrinsically explainable model, GLEN (GRU-based Linear Explainable Network), to predict consumers’ credit risk. GLEN leverages the sequential behavior processing capabilities of GRU, along with the transparency of linear regression, to predict credit risk and provide explanations simultaneously. Empirically validated on a real-world e-commerce dataset and a public dataset, GLEN demonstrates a good balance between competitive predictive performance and interpretability, highlighting critical factors for credit risk forecasting. Our findings suggest that past credit status is crucial for credit risk forecasting, and that the number of borrowings and repayments is more influential than the amount borrowed or repaid. Additionally, browsing frequency and purchase frequency are also important factors. These insights can provide valuable guidance for platforms to predict credit risk more accurately.
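A minimal PyTorch sketch of the GRU-plus-linear idea described above: the GRU encodes the behavior sequence and a linear head emits per-feature coefficients, so the risk score and a per-feature attribution come from the same forward pass. All names, dimensions, and the specific attribution scheme are illustrative assumptions rather than GLEN's published design.

```python
# Sketch only (not GLEN's published architecture).
import torch
import torch.nn as nn

class GRULinearExplainer(nn.Module):
    def __init__(self, n_features: int, hidden: int = 64):
        super().__init__()
        self.gru = nn.GRU(n_features, hidden, batch_first=True)
        self.coef_head = nn.Linear(hidden, n_features)   # one coefficient per input feature
        self.bias = nn.Parameter(torch.zeros(1))

    def forward(self, x):                  # x: (batch, time, n_features)
        _, h = self.gru(x)                 # h: (1, batch, hidden), final GRU state
        coefs = self.coef_head(h[-1])      # (batch, n_features), linear-regression-style weights
        contrib = coefs * x[:, -1, :]      # per-feature contribution to the score
        logit = contrib.sum(dim=-1) + self.bias
        return torch.sigmoid(logit), contrib   # default probability + explanation

model = GRULinearExplainer(n_features=12)
prob, contrib = model(torch.randn(8, 30, 12))   # 8 consumers, 30 time steps, 12 behavior features
```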
{"title":"Demystifying deep credit models in e-commerce lending: An explainable approach to consumer creditworthiness","authors":"Chaoqun Wang , Yijun Li , Siyi Wang , Qi Wu","doi":"10.1016/j.knosys.2025.113141","DOIUrl":"10.1016/j.knosys.2025.113141","url":null,"abstract":"<div><div>The ‘Buy Now, Pay Later’ service has revolutionized consumer credit, particularly in e-commerce, by offering flexible options and competitive rates. However, assessing credit risk remains challenging due to limited personal information. Given the availability of consumer online activities, including shopping and credit behaviors, and the necessity for model explanation in high-stakes applications such as credit risk management, we propose an intrinsic explainable model, GLEN (GRU-based Linear Explainable Network), to predict consumers’ credit risk. GLEN leverages the sequential behavior processing capabilities of GRU, along with the transparency of linear regression, to predict credit risk and provide explanations simultaneously. Empirically validated on a real-world e-commerce dataset and a public dataset, GLEN demonstrates a good balance between competitive predictive performance and interpretability, highlighting critical factors for credit risk forecasting. Our findings suggest that past credit status is crucial for credit risk forecasting, and the number of borrowings and repayments is more influential than the amount borrowed or repaid. Additionally, browsing frequency and purchase frequency are also important factors. These insights can provide valuable guidance for platforms to predict credit risk more accurately.</div></div>","PeriodicalId":49939,"journal":{"name":"Knowledge-Based Systems","volume":"312 ","pages":"Article 113141"},"PeriodicalIF":7.2,"publicationDate":"2025-02-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143420705","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
A deep learning-based approach with two-step minority classes prediction for intrusion detection in Internet of Things networks
Pub Date: 2025-02-13, DOI: 10.1016/j.knosys.2025.113143
Salah Eddine Maoudj, Aissam Belghiat
The rise of Internet of Things (IoT) technology has significantly enhanced several aspects of modern life, from smart homes and cities to healthcare and industry. However, the distributed nature of IoT devices and the highly dynamic functioning of their environments introduce additional security challenges compared to conventional networks. Moreover, the datasets used to construct intrusion detection systems (IDS) are intrinsically imbalanced. Existing balancing techniques can address this issue for partially imbalanced datasets, but their efficiency is limited when dealing with highly imbalanced ones. As a result, the IDS delivers modest performance that falls short of the requirements of IoT-based systems, so novel approaches must be investigated. In this paper, we propose a deep learning-based approach with two-step minority classes prediction to enhance intrusion detection in IoT networks. As our main model, we employ a one-dimensional convolutional neural network (1-D CNN), which classifies network traffic using a single output to represent all minority classes. A second 1-D CNN is trained on these minority classes and performs a second prediction only when the first model assigns a sample to the minority group. Furthermore, we apply class weighting to achieve a better balance in the models’ learning. We evaluated the proposed approach on the UNSW-NB15 and BoT-IoT datasets, two well-known benchmarks for building IDS for IoT networks. Compared to state-of-the-art methods, our approach shows superior performance, achieving 80.65% and 99.99% multi-class accuracy on the two datasets, respectively.
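A minimal sketch of the two-step inference logic described above, assuming a primary classifier that collapses all minority classes into one label and a secondary classifier trained only on those classes. The label name and the scikit-learn-style `.predict` interface are assumptions, not the paper's code.

```python
# Sketch only: two-step minority-class prediction at inference time.
import numpy as np

MINORITY_LABEL = "minority"   # collapsed label used by the primary model

def two_step_predict(primary_model, minority_model, X):
    coarse = np.asarray(primary_model.predict(X)).astype(object)   # majority labels or "minority"
    final = coarse.copy()
    mask = coarse == MINORITY_LABEL
    if mask.any():
        final[mask] = minority_model.predict(X[mask])   # second pass on minority traffic only
    return final
```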
{"title":"A deep learning-based approach with two-step minority classes prediction for intrusion detection in Internet of Things networks","authors":"Salah Eddine Maoudj, Aissam Belghiat","doi":"10.1016/j.knosys.2025.113143","DOIUrl":"10.1016/j.knosys.2025.113143","url":null,"abstract":"<div><div>The rise of Internet of Things (IoT) technology has significantly enhanced several aspects of our modern life, from smart homes and cities to healthcare and industry. However, the distributed nature of IoT devices and the highly dynamic functioning of their environments introduce additional security challenges compared to conventional networks. Moreover, the datasets used to construct intrusion detection systems (IDS) are intrinsically imbalanced. Existing balancing techniques can address this issue with partially imbalanced datasets. However, their efficiency is limited when dealing with highly imbalanced datasets. As a result, the IDS delivers a humble performance that dissatisfies the IoT-based systems requirements. Therefore, novel approaches must be investigated to address this issue. In this paper, we propose a deep learning-based approach with two-step minority classes prediction to enhance intrusion detection in IoT networks. As our main model, we employ a one-dimensional convolutional neural network (1-D CNN), which predicts network traffic with a single output for the minority classes. Additionally, another 1-D CNN is trained on these minorities, but it only performs a second prediction if the first model classifies the output as the minority group. Furthermore, we utilize the class weight technique to achieve more balance in the models’ learning. We evaluated the proposed approach on the UNSW-NB15 and BoT-IoT datasets, two well-known benchmarks in building IDS for IoT networks. Compared to state-of-the-art methods, our approach revealed superior performance, achieving 80.65% and 99.99% accuracy in the multi-classification, respectively.</div></div>","PeriodicalId":49939,"journal":{"name":"Knowledge-Based Systems","volume":"312 ","pages":"Article 113143"},"PeriodicalIF":7.2,"publicationDate":"2025-02-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143429899","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Noise-tolerant universal representation learning for multivariate time series from global-to-local perspective
Lei Chen, Yepeng Xu, Chaoqun Fan, Yuan Li, Ming Li, Zexin Lu, Xinquan Xie
Pub Date: 2025-02-13, DOI: 10.1016/j.knosys.2025.113137
Representation learning for multivariate time series (MTS) has shown great potential in various analysis tasks. However, most existing representation models are designed for a specific task, such as forecasting or classification, which can result in poor generality and incomplete feature extraction. Moreover, these models are often sensitive to noise, which also degrades performance. To address these issues, an unsupervised, noise-tolerant universal representation learning model, namely TSG2L, is proposed for multivariate time series from a global-to-local perspective. Inspired by the idea of “drawing the outline before filling in the details”, TSG2L adopts a global-to-local learning strategy instead of the traditional local-to-global one. Technically, TSG2L divides the representation learning process into two sequential stages: global feature learning (drawing the outline) and local feature learning (filling in the details). In the first stage, a noise-tolerant multi-scale global reconstruction network is designed to perform variable-independent global feature learning. In the second stage, a noise-tolerant “1+M” prediction network is developed to integrate global features and perform variable-related local feature learning. To the best of our knowledge, this is the first work to explore MTS representation learning from a global-to-local perspective. Extensive experiments on three analysis tasks and eighteen real-world datasets demonstrate that TSG2L outperforms several state-of-the-art models. The source code of TSG2L is available at https://github.com/infogroup502/TSG2L.
{"title":"Noise-tolerant universal representation learning for multivariate time series from global-to-local perspective","authors":"Lei Chen , Yepeng Xu , Chaoqun Fan , Yuan Li , Ming Li , Zexin Lu , Xinquan Xie","doi":"10.1016/j.knosys.2025.113137","DOIUrl":"10.1016/j.knosys.2025.113137","url":null,"abstract":"<div><div>Representation learning for multivariate time series (MTS) has shown great potential in various analysis tasks. However, most existing representation models are designed for a certain task, such as forecasting and classification, which may cause poor generality and incomplete feature extraction. Moreover, these models are often sensitive to noise, which also affects the performance. To address these issues, an unsupervised, noise-tolerant universal representation learning model, namely TSG2L, is proposed for multivariate time series from a global-to-local perspective. Inspired by the idea of “drawing the outline before filling in the details”, TSG2L adopts a global-to-local learning way instead of the traditional local-to-global way. Technically, TSG2L divides the representation learning process into two sequential stages: global feature learning (drawing outline) and local feature learning (filling detail). <em>In the first stage</em>, a noise-tolerant multi-scale global reconstruction network is designed to perform variable-independent global feature learning. <em>In the second stage</em>, a noise-tolerant “1+M” prediction network is developed to integrate global features and perform variable-related local feature learning. To the best of our knowledge, this is the first work to explore MTS representation learning from a global-to-local perspective. Extensive experiments on three analysis tasks and eighteen real-world datasets demonstrate that TSG2L outperforms several state-of-the-art models. The source code of TSG2L is available at <span><span>https://github.com/infogroup502/TSG2L</span><svg><path></path></svg></span>.</div></div>","PeriodicalId":49939,"journal":{"name":"Knowledge-Based Systems","volume":"312 ","pages":"Article 113137"},"PeriodicalIF":7.2,"publicationDate":"2025-02-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143420708","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
OCPNet: A deep learning model for online cloud load prediction
Zhengkai Wang, Hui Liu, Ertong Shang, Quan Wang, Junzhao Du
Pub Date: 2025-02-12, DOI: 10.1016/j.knosys.2025.113142
Accurate prediction of cloud platform load contributes to the optimal allocation of cloud platform resources and is an important means of solving resource scheduling problems and managing cloud resources effectively. However, most previous studies on cloud load prediction are based on offline settings, lacking scalability in realistic scenarios where data streams constantly arrive. Online real-time prediction of cloud loads can improve prediction efficiency, enabling fast responses and dynamic adjustment to sudden loads, minimizing resource wastage, and enhancing system robustness. Therefore, we propose a deep learning-based online cloud load prediction network, OCPNet. It employs a forward architecture of stacked learning modules, which progressively expands the receptive field of the convolutional kernels inside each module by exponentially growing the dilation factor, capturing both short- and long-term features. Additionally, an online learning mechanism incorporating memory capabilities is proposed, which uses a fast learner to learn from the arriving data stream and a Pearson trigger to initiate the dynamic interaction between the memorizer and the fast learner, thereby reducing the impact of concept drift. Moreover, we propose a feature extractor that enriches variable features by extracting inter-variable relationships using flip and multi-attention mechanisms. In experiments on Huawei Cloud and Microsoft Cloud workload datasets, OCPNet is compared with current mainstream deep learning models for cloud workload prediction. Results indicate that OCPNet reduces the online multivariate and univariate prediction mean squared error by 25.5% and 35.5%, respectively, compared with the best deep learning baseline models.
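A minimal PyTorch sketch of the exponentially growing dilation described above: stacked 1-D convolutions whose dilation doubles per layer, so the receptive field expands exponentially with depth and covers both short- and long-term patterns. Channel sizes, depth, and activations are illustrative assumptions, not OCPNet's exact modules.

```python
# Sketch only: a dilated 1-D convolution stack with exponentially growing dilation.
import torch
import torch.nn as nn

class DilatedStack(nn.Module):
    def __init__(self, channels: int = 32, kernel: int = 3, layers: int = 5):
        super().__init__()
        blocks = []
        for i in range(layers):
            d = 2 ** i                                   # dilation: 1, 2, 4, 8, 16
            blocks += [nn.Conv1d(channels, channels, kernel,
                                 dilation=d, padding=d * (kernel - 1) // 2),
                       nn.ReLU()]
        self.net = nn.Sequential(*blocks)

    def forward(self, x):          # x: (batch, channels, time)
        return self.net(x)         # same length, exponentially larger receptive field

out = DilatedStack()(torch.randn(4, 32, 128))
```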
{"title":"OCPNet: A deep learning model for online cloud load prediction","authors":"Zhengkai Wang , Hui Liu , Ertong Shang , Quan Wang , Junzhao Du","doi":"10.1016/j.knosys.2025.113142","DOIUrl":"10.1016/j.knosys.2025.113142","url":null,"abstract":"<div><div>Accurate prediction of cloud platform load contributes to the optimal allocation of cloud platform resources, and is an important means to solve resource scheduling problems and effectively manage cloud resources. However, most previous studies on cloud load prediction are based on offline settings, lacking scalability in realistic scenarios where data streams constantly arrive. Online real-time prediction of cloud loads can improve prediction efficiency, realizing fast response and dynamic adjustment to sudden loads, effectively minimizing resource wastage and enhancing system robustness. Therefore, we propose a deep learning—based online cloud load prediction network, OCPNet. It employs a forward architecture of learning module stacking, which progressively expands the receptive field of the convolutional kernel inside the learning module by exponentially growing the dilation factor to acquire short- and long-term features. Additionally, an online learning mechanism incorporating memory capabilities is proposed, which utilizes a fast learner to complete the learning of data streams, and a Pearson trigger to initiate the dynamic interaction between the memorizer and fast learner, thereby reducing the concept drift’s impact. Moreover, we propose a feature extractor that enriches the data features of variables by accomplishing the extraction of variable relationships using the flip and multi-attention mechanisms. In experiments on Huawei Cloud and Microsoft Cloud workload datasets, OCPNet is compared with current mainstream deep learning models for cloud workload prediction. Results indicate that OCPNet’s online multivariate and univariate prediction mean square error decreases by 25.5% and 35.5%, respectively, compared with the best deep learning baseline models.</div></div>","PeriodicalId":49939,"journal":{"name":"Knowledge-Based Systems","volume":"312 ","pages":"Article 113142"},"PeriodicalIF":7.2,"publicationDate":"2025-02-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143429895","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Dynamic branch layer fusion: A new continual learning method for rotating machinery fault diagnosis
Changqing Shen, Zhenzhong He, Bojian Chen, Weiguo Huang, Lin Li, Dong Wang
Pub Date: 2025-02-12, DOI: 10.1016/j.knosys.2025.113177
In real-world environments, the critical components of rotating machinery often encounter various new fault types because of complex operating conditions. Replay-based continual learning methods in fault diagnosis mitigate catastrophic forgetting by reintroducing previous fault samples. However, retaining previous samples while training on new tasks creates an imbalanced dataset distribution and limits how well catastrophic forgetting can be mitigated. A new continual learning method based on dynamic branch layer fusion is proposed and applied to diagnosis scenarios with imbalanced datasets. In particular, the proposed method builds a branch layer for each old task to retain old knowledge when a new task arrives, and a branch-layer fusion structure is designed to curb model growth. Additionally, a two-stage training process encompassing model adaptation and fusion is proposed. On this basis, an integration loss is used to optimize learning over all fault types across tasks. Finally, the old and new models are assembled through a distillation loss, enhancing the reliability of the models on all tasks. Experimental results indicate that the proposed method effectively alleviates the catastrophic forgetting problem prevalent with imbalanced datasets.
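One common way to realize the distillation step mentioned above is a temperature-scaled KL term that keeps the fused model's outputs close to the frozen old-task model on replayed samples; the sketch below shows that generic form, not necessarily the paper's exact loss.

```python
# Sketch only: generic knowledge-distillation loss between new and frozen old models.
import torch
import torch.nn.functional as F

def distillation_loss(new_logits, old_logits, T: float = 2.0):
    p_old = F.softmax(old_logits / T, dim=-1)          # soft targets from the frozen old model
    log_p_new = F.log_softmax(new_logits / T, dim=-1)
    return F.kl_div(log_p_new, p_old, reduction="batchmean") * (T * T)
```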
{"title":"Dynamic branch layer fusion: A new continual learning method for rotating machinery fault diagnosis","authors":"Changqing Shen , Zhenzhong He , Bojian Chen , Weiguo Huang , Lin Li , Dong Wang","doi":"10.1016/j.knosys.2025.113177","DOIUrl":"10.1016/j.knosys.2025.113177","url":null,"abstract":"<div><div>In real-world environments, the critical components of rotating machinery often encounter various new fault types because of complex operating conditions. The replay-based continual learning method in fault diagnosis mitigates catastrophic forgetting associated with the introduction of previous fault samples. However, the retention of previous samples during the training of new tasks creates an imbalance in the distribution of dataset and limits the mitigation of catastrophic forgetting. A new continual learning method based on dynamic branch layer fusion is proposed and applied to the diagnosis scenarios with imbalanced dataset. In particular, the proposed method builds a branch layer for each old task to retain the old knowledge upon the arrival of a new task, then the branch layers fusion structure is designed to solve the problem of model growth. Additionally, a two-stage training process encompassing model adaptation and fusion is proposed. On this basis, integration loss is used to optimize the learning of models for all types across tasks. Finally, the assembly of the old and new models is achieved through distillation loss, enhancing the reliability of models on all tasks. Experimental results indicate that the catastrophic forgetting problem prevalent in imbalanced dataset can be effectively alleviated by the proposed method.</div></div>","PeriodicalId":49939,"journal":{"name":"Knowledge-Based Systems","volume":"313 ","pages":"Article 113177"},"PeriodicalIF":7.2,"publicationDate":"2025-02-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143444235","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
DPCA: Dynamic multi-prototype cross-attention for change detection unsupervised domain adaptation of remote sensing images
Rongbo Fan, Jialin Xie, Junmin Liu, Yan Zhang, Hong Hou, Jianhua Yang
Pub Date: 2025-02-12, DOI: 10.1016/j.knosys.2025.113135
Unsupervised domain adaptation (UDA) is a key technique for enhancing the generalization and reusability of remote sensing image change detection (CD) models. However, the effectiveness of UDA is often hindered by discrepancies in feature distributions and sample imbalances across disparate CD datasets. To address these issues, we propose the Dynamic Multi-Prototype Cross-Attention (DPCA) model for UDA in CD. This approach enhances the representation of complex land cover features by incorporating multi-prototype features into a cross-attention mechanism, while addressing sample imbalance through a novel pseudo-sample generation strategy. The Multi-prototypes and Difference Feature Cross-Attention Module iteratively updates the multi-prototype features and integrates them with a classical two-stream CD model. This allows the model to achieve domain alignment by minimizing the neighborhood distance between the global multi-prototype features and high-confidence target domain prototype features. In addition, we propose the Sample Fusion and Pasting module, which generates new target-domain-style samples of changed regions to facilitate CD-UDA training. Experimental evaluations on the LEVIR, GZ, WH, and GD datasets confirm that the DPCA model effectively bridges the feature distribution gap between the source and target domains, significantly improving detection performance on the unlabeled target domain. The source code is available at https://github.com/Fanrongbo/DPCA-CD-UDA.
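A rough sketch of what the prototype-level alignment described above could look like: confident target samples form per-class prototypes, and each is pulled toward its nearest global multi-prototype feature for that class. The confidence threshold, distance, and the shape of `global_protos` are assumptions, not DPCA's published formulation.

```python
# Sketch only: align confident target prototypes with nearby global multi-prototypes.
import torch

def prototype_alignment_loss(global_protos, tgt_feat, tgt_probs, conf_thresh: float = 0.9):
    # global_protos: (n_classes, K, dim); tgt_feat: (n, dim); tgt_probs: (n, n_classes)
    conf, pseudo = tgt_probs.max(dim=-1)
    keep = conf > conf_thresh
    if not keep.any():
        return tgt_feat.new_zeros(())
    feats, labels = tgt_feat[keep], pseudo[keep]
    loss, used = tgt_feat.new_zeros(()), 0
    for c in labels.unique():
        t_proto = feats[labels == c].mean(0)                           # target prototype for class c
        d = torch.cdist(t_proto.unsqueeze(0), global_protos[c]).min()  # nearest of the K global prototypes
        loss, used = loss + d, used + 1
    return loss / used
```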
{"title":"DPCA: Dynamic multi-prototype cross-attention for change detection unsupervised domain adaptation of remote sensing images","authors":"Rongbo Fan , Jialin Xie , Junmin Liu , Yan Zhang , Hong Hou , Jianhua Yang","doi":"10.1016/j.knosys.2025.113135","DOIUrl":"10.1016/j.knosys.2025.113135","url":null,"abstract":"<div><div>Unsupervised domain adaptation (UDA) is a key technique for enhancing the generalization and reusability of remote sensing image change detection (CD) models. However, the effectiveness of UDA is often hindered by discrepancies in feature distributions and sample imbalances across disparate CD datasets. To address these issues, we propose the Dynamic Multi-Prototype Cross-Attention model for UDA in CD. This approach enhances the representation of complex land cover features by incorporating multi-prototype features into a cross-attention mechanism, while addressing sample imbalance through a novel pseudo-sample generation strategy. The Multi-prototypes and Difference Feature Cross-Attention Module iteratively updates the multi-prototype features and integrates them with a classical two-stream CD model. This allows the model to achieve domain alignment by minimizing the neighborhood distance between the global multi-prototype features and high-confidence target domain prototype features. In addition, we propose the Sample Fusion and Pasting module, that generates new target domain-style samples of changed regions to facilitate CD-UDA training. Experimental evaluations on the LEVIR, GZ, WH, and GD datasets confirm that DPCA model effectively bridges the feature distribution gap between the source and target domains, significantly improving the detection performance on the unlabeled target domain. The source code is available at <span><span>https://github.com/Fanrongbo/DPCA-CD-UDA</span><svg><path></path></svg></span>.</div></div>","PeriodicalId":49939,"journal":{"name":"Knowledge-Based Systems","volume":"314 ","pages":"Article 113135"},"PeriodicalIF":7.2,"publicationDate":"2025-02-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143465239","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Graph Anomaly Detection via Diffusion Enhanced Multi-View Contrastive Learning
Xiangjie Kong, Jin Liu, Huan Li, Chenwei Zhang, Jiaxin Du, Dongyan Guo, Guojiang Shen
Pub Date: 2025-02-11, DOI: 10.1016/j.knosys.2025.113093
Graph Anomaly Detection (GAD) is of critical importance in areas such as cybersecurity, finance, and healthcare. Detecting anomalous nodes in graph data is a challenging task due to intricate interactions and attribute inconsistencies. Existing methods often distinguish anomalous nodes by using contrasting strategies at various scales. However, they overlook how positive and negative sample pairs are augmented during contrastive learning, which can significantly affect the robustness and accuracy of the model. To address these limitations, we propose an innovative contrastive self-supervised approach called Diffusion Enhanced Multi-View Contrastive Learning (DE-GAD), which jointly optimizes a diffusion-based enhancement module and a multi-view contrastive learning module to better identify anomalous information. Specifically, in the diffusion-based enhancement module, we use the noise-addition and stepwise-denoising outputs of the diffusion model to augment the original graphs, and use the reconstruction loss against the original graphs as one of the criteria for anomaly detection. Second, in the multi-view contrastive module, we establish three contrastive views, namely node–node contrast, node–subgraph contrast, and subgraph–subgraph contrast, to enable the model to better capture the underlying relationships of graph nodes and thereby identify anomalies in the structural space. Finally, the two complementary modules and their corresponding losses are integrated to detect anomalous nodes more accurately. Empirical experiments conducted on six benchmark datasets demonstrate the superiority of our proposed approach over existing methods.
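As an illustration of one of the three views listed above, the sketch below shows a standard InfoNCE-style node-subgraph contrast, where each node embedding is scored against its own subgraph summary (positive) and the other subgraphs in the batch (negatives). The temperature and pairing scheme are assumptions, not DE-GAD's exact objective.

```python
# Sketch only: InfoNCE-style node-subgraph contrastive loss.
import torch
import torch.nn.functional as F

def node_subgraph_infonce(node_emb, subgraph_emb, temperature: float = 0.5):
    # node_emb, subgraph_emb: (batch, dim); row i of each comes from the same node
    node_emb = F.normalize(node_emb, dim=-1)
    subgraph_emb = F.normalize(subgraph_emb, dim=-1)
    logits = node_emb @ subgraph_emb.t() / temperature     # pairwise similarities
    targets = torch.arange(node_emb.size(0), device=node_emb.device)
    return F.cross_entropy(logits, targets)                # diagonal pairs are the positives
```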
{"title":"Graph Anomaly Detection via Diffusion Enhanced Multi-View Contrastive Learning","authors":"Xiangjie Kong , Jin Liu , Huan Li , Chenwei Zhang , Jiaxin Du , Dongyan Guo , Guojiang Shen","doi":"10.1016/j.knosys.2025.113093","DOIUrl":"10.1016/j.knosys.2025.113093","url":null,"abstract":"<div><div>Graph Anomaly Detection (GAD) is of critical importance in areas such as cybersecurity, finance, and healthcare. Detecting anomalous nodes in graph data is a challenging task due to intricate interactions and attribute inconsistencies. Existing methods often distinguish anomalous nodes by using contrasting strategies at various scales. However, they overlook the enhancement methods of positive and negative sample pairs in the contrastive learning process, which can have a significant impact on the robustness and accuracy of the model. To address these limitations, we propose an innovative contrastive self-supervised approach called Diffusion Enhanced Multi-View Contrastive Learning (DE-GAD), which jointly optimizes a diffusion-based enhancement module and a multi-view contrastive learning-based module to better identify anomalous information. Specifically, in the diffusion-based enhancement module, we use the noise addition and stepwise denoising outputs of the diffusion model to enhance the original graphs, and use the loss of reconstruction to the original graphs as one of the criteria for anomaly detection. Second, in the multi-view contrastive module, we establish three contrastive views, namely node–node contrast, node–subgraph contrast, and subgraph–subgraph contrast, to enable the model to better capture the underlying relationships of graph nodes and thereby identify anomalies in the structural space. Finally, two complementary modules and their corresponding losses are integrated to detect anomalous nodes more accurately. Empirical experiments conducted on six benchmark datasets demonstrate the superiority of our proposed approach over existing methods.</div></div>","PeriodicalId":49939,"journal":{"name":"Knowledge-Based Systems","volume":"311 ","pages":"Article 113093"},"PeriodicalIF":7.2,"publicationDate":"2025-02-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143395301","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
KMDSAN: A novel method for cross-domain and unsupervised bearing fault diagnosis
Shuping Wu, Peiming Shi, Xuefang Xu, Xu Yang, Ruixiong Li, Zijian Qiao
Pub Date: 2025-02-11, DOI: 10.1016/j.knosys.2025.113170
Cross-domain bearing fault diagnosis is a serious challenge because the target-domain data are unlabeled. This paper proposes KMDSAN, a deep subdomain adaptation network assisted by the K-means clustering algorithm: a diagnosis method based on a non-adversarial network and subdomain alignment. Taking the deep subdomain adaptation network (DSAN) as the basic framework is the main difference between KMDSAN and most existing methods, because DSAN emphasizes subdomain alignment rather than global alignment. Additionally, the K-means clustering algorithm is utilized to optimize the local maximum mean discrepancy and thus improve the performance of DSAN. Finally, a deep network with an improved attention mechanism is designed for feature extraction from the raw bearing vibration signals. Compared with other methods, KMDSAN is concise yet highly effective, and results on two bearing datasets demonstrate that the proposed method achieves excellent diagnosis accuracy.
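A rough sketch of how K-means pseudo-labels on target features could drive a subdomain alignment term, here with a linear-kernel mean-embedding distance standing in for the local maximum mean discrepancy. The kernel, weighting, and the index-based pairing of clusters with classes are simplifying assumptions, not KMDSAN's exact formulation.

```python
# Sketch only: K-means-assisted subdomain alignment with a linear-kernel stand-in for local MMD.
import torch
from sklearn.cluster import KMeans

def subdomain_alignment_loss(src_feat, src_labels, tgt_feat, n_classes: int):
    clusters = KMeans(n_clusters=n_classes, n_init=10).fit_predict(
        tgt_feat.detach().cpu().numpy())                      # cluster target features
    clusters = torch.as_tensor(clusters, device=tgt_feat.device)
    loss, used = tgt_feat.new_zeros(()), 0
    for c in range(n_classes):
        s, t = src_feat[src_labels == c], tgt_feat[clusters == c]
        if len(s) and len(t):
            loss = loss + ((s.mean(0) - t.mean(0)) ** 2).sum()   # distance between subdomain means
            used += 1
    return loss / max(used, 1)
```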
{"title":"KMDSAN: A novel method for cross-domain and unsupervised bearing fault diagnosis","authors":"Shuping Wu , Peiming Shi , Xuefang Xu , Xu Yang , Ruixiong Li , Zijian Qiao","doi":"10.1016/j.knosys.2025.113170","DOIUrl":"10.1016/j.knosys.2025.113170","url":null,"abstract":"<div><div>Cross-domain bearing fault diagnosis is a serious challenge due to the unlabeled dataset. Deep subdomain adaptation network assisted by K-means clustering algorithm (KMDSAN), a diagnosis method based on non-adversarial network and alignment of subdomain, is proposed in this paper. Taking deep subdomain adaptation network (DSAN) as basic framework is the main difference between KMDSAN and most existing methods, because DSAN emphasizes the subdomain alignment rather than global alignment. Additionally, the K-means clustering algorithm is utilized to optimize the local maximum mean discrepancy to improve the performance of DSAN. Finally, a deep network with an improved attention mechanism is designed for the feature extraction of original bearing vibration signal. In comparison to other methods, KMDSAN is concise yet highly effective, and results from two datasets related to bearing demonstrate that the proposed method achieves excellent diagnosis accuracy.</div></div>","PeriodicalId":49939,"journal":{"name":"Knowledge-Based Systems","volume":"312 ","pages":"Article 113170"},"PeriodicalIF":7.2,"publicationDate":"2025-02-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143436939","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
From local to global: Leveraging document graph for named entity recognition
Yu-Ming Shang, Hongli Mao, Tian Tian, Heyan Huang, Xian-Ling Mao
Pub Date: 2025-02-11, DOI: 10.1016/j.knosys.2025.113017
Named Entity Recognition (NER) is a fundamental task in Natural Language Processing (NLP) that aims to identify the span and category of entities within text. Recent advancements have demonstrated significant improvements in NER performance by incorporating document-level context. However, due to input length limitations, these models only consider the context of nearby sentences, failing to capture global long-range dependencies within the entire document. To address this issue, we propose a novel span-based two-stage method that formulates the document as a span graph, enabling the capture of global long-range dependencies at both token and span levels. Specifically, (1) we first train a binary classifier without considering entity types to extract candidate spans from each sentence. (2) Then, we leverage the robust contextual understanding and structural reasoning capabilities of Large Language Models (LLMs) like GPT to incrementally integrate these spans into the document-level span graph. By utilizing this span graph as a guide, we retrieve relevant contextual sentences for each target sentence and jointly encode them using BERT to capture token-level dependencies. Furthermore, by employing a Graph Transformer with well-designed position encoding to incorporate graph structure, our model effectively exploits span-level dependencies throughout the document. Extensive experiments on resource-rich nested and flat NER datasets, as well as low-resource distantly supervised NER datasets, demonstrate that our proposed model outperforms previous state-of-the-art models, showcasing its effectiveness in capturing long-range dependencies and enhancing NER accuracy.
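A minimal PyTorch sketch of the first stage described above: enumerate candidate spans up to a maximum width and score each with a binary "is this an entity?" head over encoder token embeddings. The span representation (start and end token concatenation) and width limit are illustrative assumptions, not the paper's exact design.

```python
# Sketch only: stage-one candidate span scoring over token embeddings.
import torch
import torch.nn as nn

class SpanCandidateScorer(nn.Module):
    def __init__(self, hidden: int = 768, max_width: int = 8):
        super().__init__()
        self.max_width = max_width
        self.scorer = nn.Linear(2 * hidden, 1)     # [start token; end token] -> logit

    def forward(self, token_embs):                 # (seq_len, hidden) from a BERT-style encoder
        seq_len = token_embs.size(0)
        spans, logits = [], []
        for i in range(seq_len):
            for j in range(i, min(i + self.max_width, seq_len)):
                rep = torch.cat([token_embs[i], token_embs[j]], dim=-1)
                spans.append((i, j))
                logits.append(self.scorer(rep))
        return spans, torch.stack(logits).squeeze(-1)   # keep spans with sigmoid(logit) > 0.5
```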
{"title":"From local to global: Leveraging document graph for named entity recognition","authors":"Yu-Ming Shang , Hongli Mao , Tian Tian , Heyan Huang , Xian-Ling Mao","doi":"10.1016/j.knosys.2025.113017","DOIUrl":"10.1016/j.knosys.2025.113017","url":null,"abstract":"<div><div>Named Entity Recognition (NER) is a fundamental task in Natural Language Processing (NLP) that aims to identify the span and category of entities within text. Recent advancements have demonstrated significant improvements in NER performance by incorporating document-level context. However, due to input length limitations, these models only consider the context of nearby sentences, failing to capture global long-range dependencies within the entire document. To address this issue, we propose a novel span-based two-stage method that formulates the document as a span graph, enabling the capture of global long-range dependencies at both token and span levels. Specifically, (1) we first train a binary classifier without considering entity types to extract candidate spans from each sentence. (2) Then, we leverage the robust contextual understanding and structural reasoning capabilities of Large Language Models (LLMs) like GPT to incrementally integrate these spans into the document-level span graph. By utilizing this span graph as a guide, we retrieve relevant contextual sentences for each target sentence and jointly encode them using BERT to capture token-level dependencies. Furthermore, by employing a Graph Transformer with well-designed position encoding to incorporate graph structure, our model effectively exploits span-level dependencies throughout the document. Extensive experiments on resource-rich nested and flat NER datasets, as well as low-resource distantly supervised NER datasets, demonstrate that our proposed model outperforms previous state-of-the-art models, showcasing its effectiveness in capturing long-range dependencies and enhancing NER accuracy.</div></div>","PeriodicalId":49939,"journal":{"name":"Knowledge-Based Systems","volume":"312 ","pages":"Article 113017"},"PeriodicalIF":7.2,"publicationDate":"2025-02-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143429896","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Optimizing slogan classification in ubiquitous learning environment: A hierarchical multilabel approach with fuzzy neural networks
Pir Noman Ahmad, Adnan Muhammad Shah, KangYoon Lee, Rizwan Ali Naqvi, Wazir Muhammad
Pub Date: 2025-02-11, DOI: 10.1016/j.knosys.2025.113148
Recent social-media analytics research has explored the complex domain of slogans and product or service endorsements, which present classification challenges in marketing owing to their adaptability across different contexts. Existing research emphasizes flat-text classification, neglecting the nuanced hierarchical structure of English at the document and sentence levels. To close this gap, this study introduces a robust slogan identification and classification (RoICS) model within a ubiquitous-learning framework. It uses a new dataset that includes 6,909 ProText and 1,645 propaganda-text corpora (PTC) samples, encompassing both slogan and non-slogan labels. The model investigates the complex hierarchical multilabel structure of slogans using a granular computing-based deep-learning model and fine-grained structures. The proposed RoICS model achieved an accuracy of 84%, outperforming state-of-the-art models. We validated the utility of our contributions through a series of quantitative and qualitative experiments across various openness scenarios (25%, 50%, and 75%) using the ProText and PTC datasets. These findings not only refine our understanding of slogan detection but also hold significant implications for information-systems researchers and practitioners, offering a potent tool for sentence-level ubiquitous-learning data analysis.
{"title":"Optimizing slogan classification in ubiquitous learning environment: A hierarchical multilabel approach with fuzzy neural networks","authors":"Pir Noman Ahmad , Adnan Muhammad Shah , KangYoon Lee , Rizwan Ali Naqvi , Wazir Muhammad","doi":"10.1016/j.knosys.2025.113148","DOIUrl":"10.1016/j.knosys.2025.113148","url":null,"abstract":"<div><div>Recent social-media analytics research has explored the complex domain of slogans and product or service endorsements, which present classification challenges in marketing, owing to their adaptability across different contexts. Existing research emphasizes flat-text classification, neglecting the nuanced hierarchical structure of English at the document and sentence levels. To overcome this gap, this study introduces a robust slogan identification and classification (RoICS) model within a ubiquitous-learning framework. It uses a new dataset that includes 6,909 ProText and 1,645 propaganda-text corpora (PTC) samples, encompassing both slogan and non-slogan labels. This model investigates the complex hierarchical multilabel structure of slogans using a granular computing–based deep-learning model and fine-grained structures. The proposed RoICS model achieved an accuracy of 84%, outperforming state-of-the-art models. We validated the utility of our contributions through a series of quantitative and qualitative experiments across various openness scenarios (25%, 50%, and 75%) using the ProText and PTC datasets. These findings not only refine our understanding of slogan detection, but also hold significant implications for information-systems researchers and practitioners, offering a potent tool for sentence-level ubiquitous-learning data analysis.</div></div>","PeriodicalId":49939,"journal":{"name":"Knowledge-Based Systems","volume":"314 ","pages":"Article 113148"},"PeriodicalIF":7.2,"publicationDate":"2025-02-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143454291","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}