Misalignment-resistant domain adaptive learning for one-stage object detection
Pub Date: 2024-10-11 | DOI: 10.1016/j.knosys.2024.112605
Yunfei Bai, Chang Liu, Rui Yang, Xiaomao Li
Without consideration of task specificity, directly transferring domain adaptive pipelines from classification to one-stage detection tends to introduce more severe misalignments. These misalignments include: (1) Foreground misalignment, in which the domain discriminator concentrates excessively on backgrounds, since one-stage detectors contain no proposals for instance-level discrimination. (2) Localization misalignment, in which the domain-adaptive features supervised by the domain discriminator are ill-suited to localization tasks, as the discriminator is in essence a classifier. To tackle these problems, we propose Misalignment-Resistant Domain Adaptation (MRDA) for one-stage detectors. Specifically, to alleviate foreground misalignment, a mask-based domain discriminator is proposed to perform instance-level discrimination by assigning pixel-level domain labels based on instance-level masks. As for localization misalignment, a localization discriminator is introduced to learn domain-adaptive features for localization tasks. It employs an additional box-regression branch with an IoU loss to perform adversarial mutual supervision with the feature extractor. Comprehensive experiments demonstrate that our method effectively mitigates the misalignments and achieves state-of-the-art detection performance across multiple datasets.
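To make the foreground-misalignment remedy concrete, here is a minimal PyTorch sketch of a pixel-level domain discriminator whose adversarial loss is weighted by an instance mask rasterized from bounding boxes, so that foreground pixels rather than the dominant background drive alignment. The module names, the box-to-mask rasterization, and the feature stride are illustrative assumptions, not the authors' implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class PixelDomainDiscriminator(nn.Module):
    """Per-pixel domain classifier: predicts source (0) vs. target (1) per location."""
    def __init__(self, in_ch):
        super().__init__()
        self.head = nn.Sequential(
            nn.Conv2d(in_ch, 256, kernel_size=1), nn.ReLU(inplace=True),
            nn.Conv2d(256, 1, kernel_size=1),
        )

    def forward(self, feat):              # feat: (B, C, H, W)
        return self.head(feat)            # logits: (B, 1, H, W)

def boxes_to_mask(boxes, h, w, stride):
    """Rasterize (x1, y1, x2, y2) boxes in image coordinates to a feature-map mask."""
    mask = torch.zeros(h, w)
    for x1, y1, x2, y2 in boxes:
        mask[int(y1 / stride):int(y2 / stride) + 1,
             int(x1 / stride):int(x2 / stride) + 1] = 1.0
    return mask

def masked_adversarial_loss(logits, boxes, domain_label, stride=8):
    """BCE domain loss weighted by the instance mask, so foreground pixels,
    not the dominant background, carry the alignment signal."""
    _, _, h, w = logits.shape
    mask = boxes_to_mask(boxes, h, w, stride).to(logits.device)
    target = torch.full_like(logits, float(domain_label))
    per_pixel = F.binary_cross_entropy_with_logits(logits, target, reduction="none")
    return (per_pixel.squeeze(1) * mask).sum() / mask.sum().clamp(min=1.0)

# Toy usage: one source image (domain_label=0) with two instances, stride-8 features.
disc = PixelDomainDiscriminator(in_ch=64)
loss = masked_adversarial_loss(disc(torch.randn(1, 64, 32, 32)),
                               boxes=[(16, 16, 80, 96), (120, 40, 200, 180)],
                               domain_label=0)
```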
{"title":"Misalignment-resistant domain adaptive learning for one-stage object detection","authors":"Yunfei Bai , Chang Liu , Rui Yang , Xiaomao Li","doi":"10.1016/j.knosys.2024.112605","DOIUrl":"10.1016/j.knosys.2024.112605","url":null,"abstract":"<div><div>Without consideration of task specificity, directly transforming domain adaptive pipelines from classification to one-stage detection tends to pose severer misalignments. These misalignments include: (1) Foreground misalignment that the domain discriminator obsessively concentrates on backgrounds since one-stage detectors do not contain proposals for instance-level discrimination. (2) Localization misalignment that domain-adaptive features supervised by the domain discriminator are not suitable for localization tasks, as the discriminator is a classifier in essence. To tackle these problems, we propose the <strong>Misalignment-Resistant Domain Adaption</strong> (MRDA) for one-stage detectors. Specifically, to alleviate foreground misalignment, a mask-based domain discriminator is proposed to perform instance-level discrimination by assigning the pixel-level domain labels based on instance-level masks. As for localization misalignment, a localization discriminator is introduced to learn domain-adaptive features for localization tasks. It employs an additional box-regression branch with an IoU loss to perform adversarial mutual supervision with the feature extractor. Comprehensive experiments demonstrate that our method effectively mitigates the misalignments and achieves state-of-the-art detection across multiple datasets.</div></div>","PeriodicalId":49939,"journal":{"name":"Knowledge-Based Systems","volume":null,"pages":null},"PeriodicalIF":7.2,"publicationDate":"2024-10-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142529487","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Self-labeling in multivariate causality and quantification for adaptive machine learning
Pub Date: 2024-10-11 | DOI: 10.1016/j.knosys.2024.112595
Yutian Ren, Aaron Haohua Yen, G.P. Li
Adaptive machine learning (ML) aims to allow ML models to adapt to ever-changing environments with potential concept drift after model deployment. Traditionally, adaptive ML requires a new dataset to be manually labeled to tailor deployed models to altered data distributions. Recently, an interactive-causality-based self-labeling method was proposed to autonomously associate causally related data streams for domain adaptation, showing promising results compared with traditional feature-similarity-based semi-supervised learning. Several research questions remain unanswered, including self-labeling's compatibility with multivariate causality and the quantitative analysis of the auxiliary models used in self-labeling. These auxiliary models, the interaction time model (ITM) and the effect state detector (ESD), are vital to the success of self-labeling. This paper further develops the self-labeling framework and its theoretical foundations to address these research questions. A framework for applying self-labeling to multivariate causal graphs is proposed using four basic causal relationships, and the impact of non-ideal ITM and ESD performance is analyzed. A simulated experiment is conducted based on a multivariate causal graph, validating the proposed theory.
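As a rough illustration of how the two auxiliary models might cooperate, the toy loop below lets a stand-in ESD fire on the effect stream, a stand-in ITM estimate the cause-to-effect delay, and the matching window of the cause stream become a labeled training pair. Every name here, and the scalar-delay simplification, is a hypothetical reading of the mechanism described in the abstract, not the paper's code.

```python
import numpy as np

def self_label(cause_stream, effect_stream, esd, itm, window=50):
    """Hypothetical self-labeling loop: when the effect state detector (ESD)
    fires on the effect stream, the interaction time model (ITM) estimates
    how long ago the causing interaction occurred, and the corresponding
    cause-stream window becomes a labeled training pair."""
    labeled = []
    for t in range(len(effect_stream)):
        state = esd(effect_stream[t])       # e.g. a class id, or None
        if state is None:
            continue
        delay = itm(effect_stream[t])       # estimated cause-to-effect delay
        t_cause = max(0, t - int(delay))
        labeled.append((cause_stream[t_cause:t_cause + window], state))
    return labeled

# Toy usage with stand-in detector/time models:
rng = np.random.default_rng(0)
pairs = self_label(rng.normal(size=1000), rng.normal(size=1000),
                   esd=lambda x: 1 if x > 2.5 else None,
                   itm=lambda x: 20.0)
```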
{"title":"Self-labeling in multivariate causality and quantification for adaptive machine learning","authors":"Yutian Ren, Aaron Haohua Yen, G.P. Li","doi":"10.1016/j.knosys.2024.112595","DOIUrl":"10.1016/j.knosys.2024.112595","url":null,"abstract":"<div><div>Adaptive machine learning (ML) aims to allow ML models to adapt to ever-changing environments with potential concept drift after model deployment. Traditionally, adaptive ML requires a new dataset to be manually labeled to tailor deployed models to altered data distributions. Recently, an interactive causality based self-labeling method was proposed to autonomously associate causally related data streams for domain adaptation, showing promising results compared to traditional feature similarity-based semi-supervised learning. Several unanswered research questions remain, including self-labeling’s compatibility with multivariate causality and the quantitative analysis of the auxiliary models used in the self-labeling. The auxiliary models, the interaction time model (ITM) and the effect state detector (ESD), are vital to the success of self-labeling. This paper further develops the self-labeling framework and its theoretical foundations to address these research questions. A framework for the application of self-labeling to multivariate causal graphs is proposed using four basic causal relationships, and the impact of non-ideal ITM and ESD performance is analyzed. A simulated experiment is conducted based on a multivariate causal graph, validating the proposed theory.</div></div>","PeriodicalId":49939,"journal":{"name":"Knowledge-Based Systems","volume":null,"pages":null},"PeriodicalIF":7.2,"publicationDate":"2024-10-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142529372","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Enhancing land cover classification via deep ensemble network
Pub Date: 2024-10-11 | DOI: 10.1016/j.knosys.2024.112611
Muhammad Fayaz, L. Minh Dang, Hyeonjoon Moon
The rapid adoption of drones has transformed industries such as agriculture, environmental monitoring, surveillance, and disaster management by enabling more efficient data collection and analysis. However, existing UAV-based image scene classification techniques face limitations, particularly in handling dynamic scenes and varying environmental conditions and in accurately identifying small or partially obscured objects. These challenges necessitate more advanced and robust methods for land cover classification. In response, this study explores ensemble learning (EL) as a powerful alternative to traditional machine learning approaches. By integrating predictions from multiple models, EL enhances accuracy, precision, and robustness in UAV-based land use and land cover classification. This research introduces a two-phase approach that combines data preprocessing with feature extraction using three advanced models, DenseNet201, EfficientNetV2S, and Xception, ensembled via transfer learning. These models were selected based on their superior performance in preliminary evaluations. Furthermore, a soft attention mechanism is incorporated into the ensemble network to optimize feature selection, resulting in improved classification outcomes. The proposed model achieved an accuracy of 97 %, precision of 96 %, recall of 96 %, and an F1-score of 97 % on UAV image datasets. Comparative analysis reveals a 4.2 % accuracy improvement with the ensembled models and a 1 % boost with the advanced hybrid models. This work significantly advances UAV image scene classification, offering a practical solution to enhance decision-making precision in various applications. The ensemble system demonstrates its effectiveness in remote sensing applications, especially in land cover analysis across diverse geographical and environmental settings.
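A minimal PyTorch sketch of the soft-attention fusion idea follows: per-branch features are projected to a common width, scored by a small attention head, and combined as a weighted sum before classification. The stub backbones and the branch widths (1920/1280/2048, matching DenseNet201, EfficientNetV2-S, and Xception feature dimensions) are stand-ins; the paper's exact fusion head is not specified in the abstract.

```python
import torch
import torch.nn as nn

class SoftAttentionFusion(nn.Module):
    """Fuse per-branch feature vectors with learned soft-attention weights,
    then classify the fused representation."""
    def __init__(self, backbones, feat_dims, n_classes, d=256):
        super().__init__()
        self.backbones = nn.ModuleList(backbones)
        self.proj = nn.ModuleList([nn.Linear(f, d) for f in feat_dims])
        self.attn = nn.Linear(d, 1)                   # scores each branch
        self.fc = nn.Linear(d, n_classes)

    def forward(self, x):
        feats = [p(b(x)) for b, p in zip(self.backbones, self.proj)]
        stack = torch.stack(feats, dim=1)             # (B, n_branches, d)
        w = torch.softmax(self.attn(torch.tanh(stack)), dim=1)  # (B, n_branches, 1)
        fused = (w * stack).sum(dim=1)                # attention-weighted sum
        return self.fc(fused)

# Toy check with stub backbones on 64x64 RGB inputs; real branches would be
# pretrained DenseNet201 / EfficientNetV2-S / Xception feature extractors.
def stub(out_dim):
    return nn.Sequential(nn.Flatten(), nn.Linear(3 * 64 * 64, out_dim))

model = SoftAttentionFusion([stub(1920), stub(1280), stub(2048)],
                            feat_dims=[1920, 1280, 2048], n_classes=10)
logits = model(torch.randn(4, 3, 64, 64))             # -> (4, 10)
```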
{"title":"Enhancing land cover classification via deep ensemble network","authors":"Muhammad Fayaz , L. Minh Dang , Hyeonjoon Moon","doi":"10.1016/j.knosys.2024.112611","DOIUrl":"10.1016/j.knosys.2024.112611","url":null,"abstract":"<div><div>The rapid adoption of drones has transformed industries such as agriculture, environmental monitoring, surveillance, and disaster management by enabling more efficient data collection and analysis. However, existing UAV-based image scene classification techniques face limitations, particularly in handling dynamic scenes, varying environmental conditions, and accurately identifying small or partially obscured objects. These challenges necessitate more advanced and robust methods for land cover classification. In response, this study explores ensemble learning (EL) as a powerful alternative to traditional machine learning approaches. By integrating predictions from multiple models, EL enhances accuracy, precision, and robustness in UAV-based land use and land cover classification. This research introduces a two-phase approach combining data preprocessing with feature extraction using three advanced ensemble models DenseNet201, EfficientNetV2S, and Xception employing transfer learning. These models were selected based on their higher performance during preliminary evaluations. Furthermore, a soft attention mechanism is incorporated into the ensembled network to optimize feature selection, resulting in improved classification outcomes. The proposed model achieved an accuracy of 97 %, precision of 96 %, recall of 96 %, and an F1-score of 97 % on UAV image datasets. Comparative analysis reveals a 4.2 % accuracy improvement with the ensembled models and a 1 % boost with the advanced hybrid models. This work significantly advances UAV image scene classification, offering a practical solution to enhance decision-making precision in various applications. The ensemble system demonstrates its effectiveness in remote sensing applications, especially in land cover analysis across diverse geographical and environmental settings.</div></div>","PeriodicalId":49939,"journal":{"name":"Knowledge-Based Systems","volume":null,"pages":null},"PeriodicalIF":7.2,"publicationDate":"2024-10-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142529375","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Modeling group-level public sentiment in social networks through topic and role enhancement
Pub Date: 2024-10-11 | DOI: 10.1016/j.knosys.2024.112594
Ruwen Zhang, Bo Liu, Jiuxin Cao, Hantao Zhao, Xuheng Sun, Yan Liu, Xiangguo Sun
Public sentiment within social networks exerts a profound influence on societal dynamics, underscoring the increasing demand for accurate public opinion prediction. Most existing methods measure sentiment by quantifying user sentiments individually, overlooking group-level factors that crucially contribute to public sentiment. Thus, based on our finding that public sentiment is primarily shaped by user-group interactions and their interplay with evolving topics, we innovatively model the forming process of public sentiment at the group level. In this paper, we propose the Topic and Role Enhanced Group-level Public Sentiment Prediction model (TRESP), which captures the intricate interplay among sentiment, topic, and role. Specifically, an LSTM neural network is first leveraged to trace the temporal correlations between topics and sentiment shifts, yielding a topic-informed content sentiment representation. Subsequently, a specially crafted hierarchical attention network captures social and role attributes, representing the overarching social group environment. Finally, we predict future public sentiment by merging the derived group sentiment representation with the group social representation, providing a holistic view of the sentiment trajectory. Extensive experiments were conducted on two real-world datasets of over 30,000 tweets collected from more than 14,000 users to validate our model. Notably, our model significantly outperforms state-of-the-art approaches in public sentiment prediction, indicating the importance and effectiveness of encapsulating interactions both within and among user subgroups.
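The first stage, tracing temporal correlations between topics and sentiment shifts with an LSTM, can be sketched as below: per-step topic embeddings and group sentiment features are concatenated and fed to an LSTM whose final hidden state predicts future public sentiment. The dimensions and the concatenation scheme are illustrative assumptions, not the TRESP architecture verbatim.

```python
import torch
import torch.nn as nn

class TopicSentimentLSTM(nn.Module):
    """Illustrative encoder: an LSTM consumes per-step [topic embedding ;
    group sentiment] vectors, and its final hidden state summarizes the
    sentiment trajectory for predicting future public sentiment."""
    def __init__(self, topic_dim, senti_dim, hidden=128, n_classes=3):
        super().__init__()
        self.lstm = nn.LSTM(topic_dim + senti_dim, hidden, batch_first=True)
        self.out = nn.Linear(hidden, n_classes)

    def forward(self, topics, sentiments):
        # topics: (B, T, topic_dim), sentiments: (B, T, senti_dim)
        seq = torch.cat([topics, sentiments], dim=-1)
        _, (h, _) = self.lstm(seq)
        return self.out(h[-1])                 # (B, n_classes)

# Ten time steps of 64-d topic embeddings and 8-d group sentiment features.
model = TopicSentimentLSTM(topic_dim=64, senti_dim=8)
logits = model(torch.randn(2, 10, 64), torch.randn(2, 10, 8))
```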
{"title":"Modeling group-level public sentiment in social networks through topic and role enhancement","authors":"Ruwen Zhang , Bo Liu , Jiuxin Cao , Hantao Zhao , Xuheng Sun , Yan Liu , Xiangguo Sun","doi":"10.1016/j.knosys.2024.112594","DOIUrl":"10.1016/j.knosys.2024.112594","url":null,"abstract":"<div><div>Public sentiment within social networks exerts a profound influence on societal dynamics, underscoring the increasing demand for accurate public opinion prediction. Most existing methods predominantly measure sentiment by quantifying user sentiments individually, overlooking group-level factors that crucially contribute to public sentiment. Thus, based on our finding that public sentiment is primarily shaped by user-group interactions and their interplay with evolving topics, we innovatively model the forming process of public sentiment at the group level. In this paper, we propose the Topic and Role Enhanced Group-level Public Sentiment Prediction model (TRESP), capturing the intricate interplay among sentiment, topic, and role. Specifically, an LSTM neural network is firstly leveraged to trace the temporal correlations between topics and sentiment shifts, yielding a topic-informed content sentiment representation. Subsequently, a specially crafted hierarchical attention network captures social and role attributes, representing the overarching social group environment. Finally, we predict future public sentiment by merging the derived group sentiment representation with the group social representation, demonstrating a holistic insight into the sentiment trajectory. Extensive experiments were conducted on two real-world datasets of over 30,000 tweets collected from more than 14,000 users to validate our model. Notably, our model significantly outperforms the state-of-the-art approaches in public sentiment prediction, indicating the importance and effectiveness of encapsulating interactions both within and among user subgroups.</div></div>","PeriodicalId":49939,"journal":{"name":"Knowledge-Based Systems","volume":null,"pages":null},"PeriodicalIF":7.2,"publicationDate":"2024-10-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142446719","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Enhancing the security in IoT and IIoT networks: An intrusion detection scheme leveraging deep transfer learning
Pub Date: 2024-10-11 | DOI: 10.1016/j.knosys.2024.112614
Basharat Ahmad, Zhaoliang Wu, Yongfeng Huang, Sadaqat Ur Rehman
Internet of Things (IoT) networks, defined by their interconnected devices and data streams, present an expanding attack surface for cyber adversaries. The Industrial Internet of Things (IIoT) is a subset of the IoT and is of particular importance in terms of security. Robust intrusion detection systems (IDS) are essential for protecting these critical infrastructures. Our research proposes a novel approach to anomaly detection in IoT and IIoT networks that leverages the capabilities of deep transfer learning. Our methodology begins with the EdgeIIoT dataset, which serves as the basis for our data analysis. We convert the data into an appropriate image format to enable convolutional neural network (CNN)-based processing. The hyper-parameters of the individual machine learning models are then optimized using a random search algorithm; this phase improves each model's performance by tuning the hyper-parameters specific to its learning algorithm. After hyper-parameter optimization, the performance of each model is meticulously assessed. The top-performing models are then strategically selected and combined using an ensemble technique. The IDS scheme's overall detection accuracy and generalizability are improved by integrating the strengths of multiple models. The proposed scheme demonstrates significant effectiveness in identifying a broad spectrum of attacks, encompassing 14 distinct attack types. This comprehensive detection capability contributes to a more secure and resilient IoT ecosystem. Furthermore, applying quantization to our best models reduces resource utilization significantly without compromising accuracy.
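The tabular-to-image step admits a compact sketch. A common recipe, assumed here rather than taken from the paper, is to min-max scale each feature, zero-pad each record to the next perfect square, and reshape it into a single-channel image for the CNN:

```python
import numpy as np

def rows_to_images(X):
    """Turn tabular records into square grayscale images for a CNN:
    min-max scale each feature, zero-pad each row to the next perfect
    square, and reshape to (side, side, 1)."""
    X = np.asarray(X, dtype=np.float32)
    mins, maxs = X.min(axis=0), X.max(axis=0)
    X = (X - mins) / np.where(maxs > mins, maxs - mins, 1.0)
    side = int(np.ceil(np.sqrt(X.shape[1])))
    X = np.pad(X, ((0, 0), (0, side * side - X.shape[1])))
    return X.reshape(-1, side, side, 1)

imgs = rows_to_images(np.random.rand(32, 61))   # 61 features -> (32, 8, 8, 1)
```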
{"title":"Enhancing the security in IoT and IIoT networks: An intrusion detection scheme leveraging deep transfer learning","authors":"Basharat Ahmad , Zhaoliang Wu , Yongfeng Huang , Sadaqat Ur Rehman","doi":"10.1016/j.knosys.2024.112614","DOIUrl":"10.1016/j.knosys.2024.112614","url":null,"abstract":"<div><div>The Internet of Things (IoT) networks, which are defined by their interconnected devices and data streams are an expanding attack surface for cyber adversaries. Industrial Internet of Things (IIoT) is a subset of IoT and has significant importance in-terms of security. Robust intrusion detection systems (IDS) are essential for protecting these critical infrastructures. Our research suggests a novel approach to the detection of anomalies in IoT and IIoT networks that leverages the capabilities of deep transfer learning. Our methodology begins with the EdgeIIoT dataset, which serves as the basis for our data analysis. We convert the data into an appropriate image format to enable Convolutional Neural Network (CNN)-based processing. The hyper-parameters of individual machine learning models are subsequently optimized using a Random Search algorithm. This optimization phase optimizes the performance of each model by modifying the hyper-parameters that are unique to the learning algorithms. The performance of each model is meticulously assessed subsequent to hyper-parameter optimization. The top-performing models are subsequently, strategically selected and combined using the ensemble technique. The IDS scheme’s overall detection accuracy and generalizability are improved by the integration of strengths from multiple models. The proposed scheme demonstrates significant effectiveness in identifying a broad spectrum of attacks, encompassing a total of 14 distinct attack types. This comprehensive detection capability contributes to a more secure and resilient IoT ecosystem. Furthermore, application of quantization to our best models reduces resource utilization significantly without compromising accuracy.</div></div>","PeriodicalId":49939,"journal":{"name":"Knowledge-Based Systems","volume":null,"pages":null},"PeriodicalIF":7.2,"publicationDate":"2024-10-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142529370","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
MAS-DGAT-Net: A dynamic graph attention network with multibranch feature extraction and staged fusion for EEG emotion recognition
Pub Date: 2024-10-11 | DOI: 10.1016/j.knosys.2024.112599
Shuaiqi Liu, Xinrui Wang, Mingqi Jiang, Yanling An, Zhihui Gu, Bing Li, Yudong Zhang
In recent years, with the rise of deep learning technologies, EEG-based emotion recognition has garnered significant attention. However, most existing methods focus on the spatiotemporal information of EEG signals while overlooking the potential topological information of brain regions. To address this issue, this paper proposes a dynamic graph attention network with multi-branch feature extraction and staged fusion (MAS-DGAT-Net), which integrates graph convolutional networks (GCNs) for EEG emotion recognition. Specifically, the differential entropy (DE) features of EEG signals are first reconstructed into a correlation matrix using the Spearman correlation coefficient. Then, the brain-region connectivity-feature extraction (BCFE) module is employed to capture the brain connectivity features associated with emotional activation states. Meanwhile, this paper introduces a dual-branch cross-fusion feature extraction (CFFE) module, which consists of an attention-based cross-fusion feature extraction branch (A-CFFEB) and a cross-fusion feature extraction branch (CFFEB). A-CFFEB efficiently extracts key channel-frequency information from EEG features using an attention mechanism and then fuses it with the output features of the BCFE module. The fused features are subsequently input into the proposed dynamic graph attention module with a broad learning system (DGAT-BLS) to further mine brain connectivity information. Finally, the deep features output by DGAT-BLS and CFFEB are combined for emotion classification. The proposed algorithm has been experimentally validated on the SEED, SEED-IV, and DEAP datasets in both subject-dependent and subject-independent settings, with the results confirming the model's effectiveness. The source code is publicly available at https://github.com/cvmdsp/MAS-DGAT-Net.
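The first step, turning DE features into a graph via the Spearman correlation coefficient, can be sketched directly with SciPy: channels act as graph nodes, and their pairwise rank correlations form the correlation matrix. The thresholding into a binary adjacency is an added assumption for illustration, not a detail given in the abstract.

```python
import numpy as np
from scipy.stats import spearmanr

def de_to_graph(de_feats, thresh=0.3):
    """Build a channel graph from differential-entropy (DE) features:
    de_feats has shape (n_channels, n_features); rows are graph nodes.
    Returns the Spearman correlation matrix and a thresholded adjacency
    (the threshold value is an illustrative choice)."""
    rho, _ = spearmanr(de_feats, axis=1)          # (C, C) rank correlations
    adj = (np.abs(rho) >= thresh).astype(float)   # keep strong dependencies
    np.fill_diagonal(adj, 1.0)                    # self-loops for the GNN
    return rho, adj

# E.g., 62 EEG channels with DE features from 5 frequency bands.
rho, adj = de_to_graph(np.random.rand(62, 5))
```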
{"title":"MAS-DGAT-Net: A dynamic graph attention network with multibranch feature extraction and staged fusion for EEG emotion recognition","authors":"Shuaiqi Liu , Xinrui Wang , Mingqi Jiang , Yanling An , Zhihui Gu , Bing Li , Yudong Zhang","doi":"10.1016/j.knosys.2024.112599","DOIUrl":"10.1016/j.knosys.2024.112599","url":null,"abstract":"<div><div>In recent years, with the rise of deep learning technologies, EEG-based emotion recognition has garnered significant attention. However, most existing methods tend to focus on the spatiotemporal information of EEG signals while overlooking the potential topological information of brain regions. To address this issue, this paper proposes a dynamic graph attention network with multi-branch feature extraction and staged fusion (MAS-DGAT-Net), which integrates graph convolutional neural networks (GCN) for EEG emotion recognition. Specifically, the differential entropy (DE) features of EEG signals are first reconstructed into a correlation matrix using the Spearman correlation coefficient. Then, the brain-region connectivity-feature extraction (BCFE) module is employed to capture the brain connectivity features associated with emotional activation states. Meanwhile, this paper introduces a dual-branch cross-fusion feature extraction (CFFE) module, which consists of an attention-based cross-fusion feature extraction branch (A-CFFEB) and a cross-fusion feature extraction branch (CFFEB). A-CFFEB efficiently extracts key channel-frequency information from EEG features by using an attention mechanism and then fuses it with the output features from the BCFE. The fused features are subsequently input into the proposed dynamic graph attention module with a broad learning system (DGAT-BLS) to mine the brain connectivity feature information further. Finally, the deep features output by DGAT-BLS and CFFEB are combined for emotion classification. The proposed algorithm has been experimentally validated on SEED, SEED-IV, and DEAP datasets in subject-dependent and subject-independent settings, with the results confirming the model's effectiveness. The source code is publicly available at: <span><span>https://github.com/cvmdsp/MAS-DGAT-Net</span><svg><path></path></svg></span></div></div>","PeriodicalId":49939,"journal":{"name":"Knowledge-Based Systems","volume":null,"pages":null},"PeriodicalIF":7.2,"publicationDate":"2024-10-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142529410","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
GAN-based statistical process control for the time series data
Pub Date: 2024-10-11 | DOI: 10.1016/j.knosys.2024.112613
Yu-Jeong Cheon, Wook-Yeon Hwang
The cumulative sum (CUSUM) control chart and multivariate anomaly detection with the generative adversarial network (MAD-GAN) have been compared for monitoring time series data. However, control boundaries constructed via one-class classification, using only normal data in the training phase, are inappropriate for the test phase, where normal and abnormal data must be distinguished. In this regard, we first propose a GAN-based statistical process control (SPC) framework to compare the two methods in detecting a process mean shift from the SPC perspective. Second, we propose the residual MAD-GAN to improve detection performance. Third, we develop the loss function of the MAD-GAN. Finally, we find that the maximum mean discrepancy (MMD), as well as the Nash equilibrium, is useful for the MAD-GAN. Our experiments demonstrate that the residual MAD-GAN is more effective than the residual CUSUM control chart in terms of run lengths on time series data. We therefore recommend that SPC practitioners consider the residual MAD-GAN for detecting process mean shifts in time series data.
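For reference, the CUSUM baseline is a textbook chart, and the run-length metric used in the comparison is simply the first time the chart signals. The sketch below implements a standard two-sided tabular CUSUM; a residual chart applies the same recursion to one-step-ahead prediction residuals instead of raw observations. The parameter values k and h are conventional defaults, not the paper's settings.

```python
import numpy as np

def cusum_run_length(x, k=0.5, h=5.0):
    """Two-sided tabular CUSUM on standardized observations: returns the
    index (1-based) of the first signal, i.e. the run length, or len(x)
    if neither cumulative sum ever exceeds the control limit h."""
    sp = sn = 0.0
    for t, xt in enumerate(x, start=1):
        sp = max(0.0, sp + xt - k)     # detects upward mean shifts
        sn = max(0.0, sn - xt - k)     # detects downward mean shifts
        if sp > h or sn > h:
            return t
    return len(x)

rng = np.random.default_rng(1)
x = np.concatenate([rng.normal(0, 1, 100),     # in-control segment
                    rng.normal(1.0, 1, 100)])  # 1-sigma mean shift at t=101
print(cusum_run_length(x))                     # signals soon after t=100
```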
{"title":"GAN-based statistical process control for the time series data","authors":"Yu-Jeong Cheon , Wook-Yeon Hwang","doi":"10.1016/j.knosys.2024.112613","DOIUrl":"10.1016/j.knosys.2024.112613","url":null,"abstract":"<div><div>The cumulative sum (CUSUM) control chart and the multivariate anomaly detection with the generative adversarial network (MAD-GAN) were compared for monitoring the time series data. However, the control boundaries constructed in terms of the one-class classification with only the normal data for the training phase are inappropriate for the test phase because the normal data and the abnormal data should be classified for the test phase. In this regard, we first propose this GAN-based statistical process control (SPC) framework to compare them in terms of detecting the process mean shift based on the perspective of SPC. Second, we propose the residual MAD-GAN in order to improve the detection performance. Third, we develop the loss function of the MAD-GAN. Finally, we find that the maximum mean discrepancy (MMD) as well as the nash equilibrium is useful for the MAD-GAN. Our experiments demonstrate that the residual MAD-GAN is more effective than the residual CUSUM control chart in terms of the run lengths for the time series data. Therefore, we propose SPC practitioners to consider the residual MAD-GAN for detecting the process mean shift in time series data.</div></div>","PeriodicalId":49939,"journal":{"name":"Knowledge-Based Systems","volume":null,"pages":null},"PeriodicalIF":7.2,"publicationDate":"2024-10-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142529485","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Augmentation-aware self-supervised learning with conditioned projector
Pub Date: 2024-10-11 | DOI: 10.1016/j.knosys.2024.112572
Marcin Przewięźlikowski, Mateusz Pyla, Bartosz Zieliński, Bartłomiej Twardowski, Jacek Tabor, Marek Śmieja
Self-supervised learning (SSL) is a powerful technique for learning from unlabeled data. By learning to remain invariant to applied data augmentations, methods such as SimCLR and MoCo can reach quality on par with supervised approaches. However, this invariance may be detrimental to downstream tasks that depend on traits affected by the augmentations used during pretraining, such as color. In this paper, we propose to foster sensitivity to such characteristics in the representation space by modifying the projector network, a common component of self-supervised architectures. Specifically, we supplement the projector with information about the augmentations applied to images. For the projector to take advantage of this auxiliary conditioning when solving the SSL task, the feature extractor learns to preserve the augmentation information in its representations. Our approach, coined Conditional Augmentation-aware Self-supervised Learning (CASSLE), is directly applicable to typical joint-embedding SSL methods regardless of their objective functions. Moreover, it does not require major changes in the network architecture or prior knowledge of downstream tasks. In addition to an analysis of sensitivity towards different data augmentations, we conduct a series of experiments which show that CASSLE improves over various SSL methods, reaching state-of-the-art performance in multiple downstream tasks.
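The conditioned projector itself is simple to sketch: the backbone representation is concatenated with a vector describing the augmentations applied to that view before entering the projection MLP. The layer sizes and the shape of the augmentation descriptor below are assumptions for illustration; the abstract specifies only that augmentation information is supplied to the projector.

```python
import torch
import torch.nn as nn

class ConditionedProjector(nn.Module):
    """Projector that sees the backbone representation concatenated with a
    descriptor of the augmentations applied to that view (e.g. crop box,
    color-jitter strengths). Solving the SSL task through this conditioning
    lets the backbone keep augmentation-sensitive information."""
    def __init__(self, rep_dim, aug_dim, hidden=512, out_dim=128):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(rep_dim + aug_dim, hidden),
            nn.BatchNorm1d(hidden), nn.ReLU(inplace=True),
            nn.Linear(hidden, out_dim),
        )

    def forward(self, rep, aug_params):
        return self.mlp(torch.cat([rep, aug_params], dim=1))

# Two augmented views of a batch, each with its own augmentation descriptor;
# z1 and z2 then feed the usual joint-embedding loss (e.g. InfoNCE).
proj = ConditionedProjector(rep_dim=2048, aug_dim=10)
z1 = proj(torch.randn(8, 2048), torch.rand(8, 10))
z2 = proj(torch.randn(8, 2048), torch.rand(8, 10))
```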
{"title":"Augmentation-aware self-supervised learning with conditioned projector","authors":"Marcin Przewięźlikowski , Mateusz Pyla , Bartosz Zieliński , Bartłomiej Twardowski , Jacek Tabor , Marek Śmieja","doi":"10.1016/j.knosys.2024.112572","DOIUrl":"10.1016/j.knosys.2024.112572","url":null,"abstract":"<div><div>Self-supervised learning (SSL) is a powerful technique for learning from unlabeled data. By learning to remain invariant to applied data augmentations, methods such as SimCLR and MoCo can reach quality on par with supervised approaches. However, this invariance may be detrimental for solving downstream tasks that depend on traits affected by augmentations used during pretraining, such as color. In this paper, we propose to foster sensitivity to such characteristics in the representation space by modifying the projector network, a common component of self-supervised architectures. Specifically, we supplement the projector with information about augmentations applied to images. For the projector to take advantage of this auxiliary conditioning when solving the SSL task, the feature extractor learns to preserve the augmentation information in its representations. Our approach, coined <strong>C</strong>onditional <strong>A</strong>ugmentation-aware <strong>S</strong>elf-<strong>s</strong>upervised <strong>Le</strong>arning (CASSLE), is directly applicable to typical joint-embedding SSL methods regardless of their objective functions. Moreover, it does not require major changes in the network architecture or prior knowledge of downstream tasks. In addition to an analysis of sensitivity towards different data augmentations, we conduct a series of experiments, which show that CASSLE improves over various SSL methods, reaching state-of-the-art performance in multiple downstream tasks. <span><span><sup>1</sup></span></span> <span><span><sup>2</sup></span></span> <span><span><sup>3</sup></span></span></div></div>","PeriodicalId":49939,"journal":{"name":"Knowledge-Based Systems","volume":null,"pages":null},"PeriodicalIF":7.2,"publicationDate":"2024-10-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142529408","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Automated multiple sclerosis progression rate computation of a patient from 2D FLAIR images with Rayleigh-Weibull-Fuzzy imaging and augmented morphing method
Pub Date: 2024-10-11 | DOI: 10.1016/j.knosys.2024.112580
Orcan Alpar, Ondrej Soukup, Pavel Ryska, Petr Paluska, Martin Valis, Ondrej Krejcar
Multiple sclerosis (MS) is a neurological demyelinating disorder affecting the brain and spinal cord by attacking the myelin sheaths of nerves. Estimating the volumetric changes in MS lesions is a challenging and specialized task that is executed and judged by medical experts. The change in lesion volume provides crucial information on MS progression or regression when magnetic resonance images (MRI) from successive scans are compared. However, visual comparison of the images, even with an expert eye, does not always lead to a conclusive decision or a consensus on progression or regression. Therefore, we present an automated expert system for estimating the MS progression rate by automatic lesion segmentation and volume estimation from two-dimensional MRIs, adaptable to various parameters, slice thicknesses, and increments. A clinical dataset was specially formed for this research; it contains three sets of 135 MR images of an MS patient, acquired consecutively at intervals of approximately 23 and 6 months with identical device parameters. The lesions are segmented by a novel Rayleigh-Weibull-Fuzzy (RWF) imaging method based on the Nakagami distribution and specialized fuzzy 2-means. Subsequently, the segmentation module is trained to fit ground-truth images created by experts, achieving a Dice score of 93.76 % on the 56 images containing lesions. Afterwards, several imaginary image sequences are generated by augmented linear and nonlinear morphing for re-segmentation of imaginary lesions by RWF. Finally, we estimated the volumetric change between the first two MRI sequences to adjust the morphing module and to predict the progression rate of the lesions over time. The framework automatically selected the highest accuracy, 99.9 % in the training session, and estimated the progression rate in the testing phase with 99.69 % accuracy, which would not be achievable without the augmented morphing methodology. To the best of our knowledge, this is the first automated framework in the literature to estimate the MS progression rate from raw MR images, which is the main innovation of this paper; the outputs should benefit experts working in this field.
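The volume-estimation step that underlies the progression rate reduces to simple slice geometry: per-slice lesion area times the slice increment, summed over slices. The sketch below shows that arithmetic, and also why adaptability to slice thickness and increment matters; the pixel spacing and increment values are placeholders, not the clinical dataset's parameters.

```python
import numpy as np

def lesion_volume_ml(masks, pixel_spacing_mm=(0.5, 0.5), increment_mm=3.0):
    """Volume from a stack of binary 2D lesion masks: per-slice lesion area
    (pixel count x in-plane spacing) times the slice increment (thickness
    plus gap), summed over slices. Spacing/increment values are placeholders."""
    px_area = pixel_spacing_mm[0] * pixel_spacing_mm[1]
    area_mm2 = sum(int(m.sum()) for m in masks) * px_area
    return area_mm2 * increment_mm / 1000.0        # mm^3 -> mL

masks = [np.zeros((256, 256), dtype=np.uint8) for _ in range(3)]
masks[1][100:120, 100:140] = 1                     # 800 lesion pixels on one slice
print(lesion_volume_ml(masks))                     # 800*0.25*3 = 600 mm^3 = 0.6 mL
```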
{"title":"Automated multiple sclerosis progression rate computation of a patient from 2D FLAIR images with Rayleigh-Weibull-Fuzzy imaging and augmented morphing method","authors":"Orcan Alpar , Ondrej Soukup , Pavel Ryska , Petr Paluska , Martin Valis , Ondrej Krejcar","doi":"10.1016/j.knosys.2024.112580","DOIUrl":"10.1016/j.knosys.2024.112580","url":null,"abstract":"<div><div>Multiple sclerosis (MS) is a neurological demyelinating disorder affecting brain and spinal cord by attacking the myelin sheaths of nerves. Estimation of the volumetric changes in MS lesions is a challenging and specialized task which is executed and judged by medical experts. The change in the volume of the lesions provides crucial information on MS progression or regression by comparing the magnetic resonance images (MRI) taken in successive scans. However, visual comparison of the images, even with an expert eye, would not always lead to a conclusive decision nor a consensus on progression or regression. Therefore, we present an automated expert system for estimating MS progression rate by automatic lesion segmentation and volume estimation using two-dimensional MRIs, which is also adaptable to various parameters, slice thickness and increment. A clinical dataset is specially formed for this research which contains three sets of 135 MR images of an MS patient generated within approximately 23- and 6-month periods consecutively with identical device parameters. The lesions are segmented by a novel Rayleigh-Weibull-Fuzzy (RWF) imaging method based on the Nakagami distribution and specialized fuzzy 2-means. Subsequently, the segmentation module is trained to fit the ground truths images created by experts to achieve the highest dice score possible for a total number of 56 images containing lesions, which is found as 93.76 %. Afterwards, several imaginary image sequences are generated by augmented linear and nonlinear morphing for re-segmentation of imaginary lesions by RWF. Finally, we estimated the volumetric change between the first two MRI sequences to adjust the morphing module and to predict the progression rate of the lesions in time. The framework automatically selected the highest accuracy, which is 99.9 % in the training session and estimated the progression rate in the testing phase with 99.69 % accuracy, which are not achievable without augmented morphing methodology. For the first time in the literature, an automated framework could estimate the MS progression rate from the raw MR images, which is also the main innovation of this paper and the outputs would be beneficial for the experts working on this field.</div></div>","PeriodicalId":49939,"journal":{"name":"Knowledge-Based Systems","volume":null,"pages":null},"PeriodicalIF":7.2,"publicationDate":"2024-10-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142529409","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
An autoencoder-like deep NMF representation learning algorithm for clustering
Pub Date: 2024-10-10 | DOI: 10.1016/j.knosys.2024.112597
Dexian Wang, Pengfei Zhang, Ping Deng, Qiaofeng Wu, Wei Chen, Tao Jiang, Wei Huang, Tianrui Li
Clustering plays a crucial role in the field of data mining, where deep non-negative matrix factorization (NMF) has attracted significant attention due to its effective data representation. However, autoencoder-based deep matrix factorization is typically constructed from multi-layer matrix factorizations, which ignore nonlinear mappings and lack a learning rate to guide the updates. To address these issues, this paper proposes an autoencoder-like deep NMF representation learning (ADNRL) algorithm for clustering. First, following the principle of the autoencoder, the objective function is constructed based on NMF. Then, the elements of the matrices are decoupled, and a nonlinear activation function is applied to enforce non-negativity constraints on the elements. Subsequently, the learning-rate-guided gradient updates of the elements are transformed into weight values; these weight values are combined with the activation function to construct the ADNRL deep network, and the objective function is minimized through the learning of the network. Finally, extensive experiments are conducted on eight datasets, and the results demonstrate the superior performance of ADNRL.
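For context, the single-layer building block that deep NMF architectures stack is the classic Lee-Seung multiplicative update, whose ratio form keeps both factors non-negative without an explicit learning rate; ADNRL's contribution is to recast such updates as an autoencoder-like network with nonlinear activations and learned weights. The baseline sketch below is standard NMF, not the ADNRL network itself.

```python
import numpy as np

def nmf_multiplicative(X, k, iters=200, eps=1e-9):
    """Single-layer NMF with Lee-Seung multiplicative updates, minimizing
    ||X - W H||_F^2 while keeping W and H non-negative; the ratio-form
    update is what removes the need for an explicit learning rate."""
    m, n = X.shape
    rng = np.random.default_rng(0)
    W, H = rng.random((m, k)), rng.random((k, n))
    for _ in range(iters):
        H *= (W.T @ X) / (W.T @ W @ H + eps)
        W *= (X @ H.T) / (W @ H @ H.T + eps)
    return W, H

X = np.random.default_rng(1).random((50, 40))         # non-negative data
W, H = nmf_multiplicative(X, k=5)
print(np.linalg.norm(X - W @ H) / np.linalg.norm(X))  # relative recon error
```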
{"title":"An autoencoder-like deep NMF representation learning algorithm for clustering","authors":"Dexian Wang , Pengfei Zhang , Ping Deng , Qiaofeng Wu , Wei Chen , Tao Jiang , Wei Huang , Tianrui Li","doi":"10.1016/j.knosys.2024.112597","DOIUrl":"10.1016/j.knosys.2024.112597","url":null,"abstract":"<div><div>Clustering plays a crucial role in the field of data mining, where deep non-negative matrix factorization (NMF) has attracted significant attention due to its effective data representation. However, deep matrix factorization based on autoencoder is typically constructed using multi-layer matrix factorization, which ignores nonlinear mapping and lacks learning rate to guide the update. To address these issues, this paper proposes an autoencoder-like deep NMF representation learning (ADNRL) algorithm for clustering. First, according to the principle of autoencoder, construct the objective function based on NMF. Then, decouple the elements in the matrix and apply the nonlinear activation function to enforce non-negative constraints on the elements. Subsequently, the gradient values corresponding to the elements update guided by the learning rate are transformed into the weight values. This weight values are combined with the activation function to construct the ADNRL deep network, and the objective function is minimized through the learning of the network. Finally, extensive experiments are conducted on eight datasets, and the results demonstrate the superior performance of ADNRL.</div></div>","PeriodicalId":49939,"journal":{"name":"Knowledge-Based Systems","volume":null,"pages":null},"PeriodicalIF":7.2,"publicationDate":"2024-10-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142442341","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}