Pub Date : 2024-10-10 DOI: 10.1016/j.knosys.2024.112588
Chenglong Li, Zheng Zheng, Xiaoting Du, Xiangyue Ma, Zhengqi Wang, Xinheng Li
Understanding and predicting the bug type is crucial for developers striving to enhance testing efficiency and reduce software release problems. Bug reports, although semi-structured, contain valuable semantic information, making their comprehension critical for accurate bug prediction. Recent advances in large language models (LLMs), especially generative LLMs, have demonstrated their power in natural language processing. Many studies have utilized these models to understand various forms of textual data. However, the capability of LLMs to fully understand bug reports remains uncertain. To tackle this challenge, we propose KnowBug, a framework designed to augment LLMs with knowledge from bug reports to improve their ability to predict bug types. In this framework, we utilize bug reports from open-source deep learning frameworks, design specialized prompts, and fine-tune LLMs to assess KnowBug’s proficiency in understanding bug reports and predicting different bug types.
{"title":"KnowBug: Enhancing Large language models with bug report knowledge for deep learning framework bug prediction","authors":"Chenglong Li , Zheng Zheng , Xiaoting Du , Xiangyue Ma , Zhengqi Wang , Xinheng Li","doi":"10.1016/j.knosys.2024.112588","DOIUrl":"10.1016/j.knosys.2024.112588","PeriodicalId":49939,"journal":{"name":"Knowledge-Based Systems"},"PeriodicalIF":7.2,"publicationDate":"2024-10-10","publicationTypes":"Journal Article","isOpenAccess":false,"platform":"Semanticscholar","paperid":"142442342","RegionNum":1,"RegionCategory":"Computer Science","Total":0}
Pub Date : 2024-10-10 DOI: 10.1016/j.knosys.2024.112590
Zhen Li, Kaiyu Wang, Chenxi Xue, Haotian Li, Yuki Todo, Zhenyu Lei, Shangce Gao
In recent years, evolutionary algorithms have achieved outstanding results in addressing increasingly complex optimization problems, with differential evolution (DE) gaining significant attention. However, due to its simple yet efficient evolutionary mechanism, DE has consistently faced challenges in mitigating the risk of premature convergence. This paper introduces a novel Ring Sub-population architecture-based Differential Evolution (RSDE) to address this issue. RSDE incorporates a conditional similarity selection mechanism that integrates multiple strategies. By considering fitness evaluation and population distribution, RSDE facilitates rich information exchange among sub-populations, leading to cyclic optimization. This global conditional interaction mechanism provides a new idea for population structure research, effectively preserves valuable solutions within the population, and prevents stagnation due to rapid convergence. The performance of RSDE is rigorously evaluated using 29 benchmark functions from the IEEE Congress on Evolutionary Computation (CEC) 2017, 22 real-world problems from CEC2011, and 12 complex optimization problems from CEC2022. RSDE is compared with 18 advanced algorithms, including leading DE variants and other state-of-the-art methods. The results demonstrate that RSDE performs well and is highly competitive with these algorithms.
{"title":"Differential evolution with ring sub-population architecture for optimization","authors":"Zhen Li , Kaiyu Wang , Chenxi Xue , Haotian Li , Yuki Todo , Zhenyu Lei , Shangce Gao","doi":"10.1016/j.knosys.2024.112590","DOIUrl":"10.1016/j.knosys.2024.112590","PeriodicalId":49939,"journal":{"name":"Knowledge-Based Systems"},"PeriodicalIF":7.2,"publicationDate":"2024-10-10","publicationTypes":"Journal Article","isOpenAccess":false,"platform":"Semanticscholar","paperid":"142433833","RegionNum":1,"RegionCategory":"Computer Science","Total":0}
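For readers unfamiliar with the building blocks named in the abstract above, the following is a minimal sketch of classic DE/rand/1/bin run on several sub-populations that exchange individuals around a ring. It is not the RSDE algorithm: the abstract does not specify the conditional similarity selection mechanism, so the unconditional best-to-successor migration, the toy `sphere` objective, and all parameter values (`scale`, `cr`, migration every 5 generations) are assumptions made for illustration only.

```python
import random

def sphere(x):
    """Toy objective (an assumption; not one of the CEC benchmarks)."""
    return sum(v * v for v in x)

def de_generation(pop, fitness, f_obj, rng, scale=0.5, cr=0.9):
    """One generation of classic DE/rand/1/bin with greedy selection (in place)."""
    n, dim = len(pop), len(pop[0])
    for i in range(n):
        # Three distinct donors, all different from the target index i.
        a, b, c = rng.sample([j for j in range(n) if j != i], 3)
        j_rand = rng.randrange(dim)  # forces at least one gene from the mutant
        trial = []
        for d in range(dim):
            if rng.random() < cr or d == j_rand:
                trial.append(pop[a][d] + scale * (pop[b][d] - pop[c][d]))
            else:
                trial.append(pop[i][d])
        f_trial = f_obj(trial)
        if f_trial <= fitness[i]:  # keep the trial only if it is no worse
            pop[i], fitness[i] = trial, f_trial

def ring_migrate(subpops, subfits):
    """Hypothetical ring exchange: each best individual may replace the
    worst individual of the next sub-population in the ring."""
    bests = []
    for pop, fit in zip(subpops, subfits):
        k = min(range(len(pop)), key=fit.__getitem__)
        bests.append((list(pop[k]), fit[k]))
    for s, (ind, f) in enumerate(bests):
        nxt = (s + 1) % len(subpops)
        w = max(range(len(subpops[nxt])), key=subfits[nxt].__getitem__)
        if f < subfits[nxt][w]:
            subpops[nxt][w], subfits[nxt][w] = list(ind), f

def run(num_subpops=3, size=6, dim=4, generations=30, seed=0):
    """Return (best initial fitness, best final fitness)."""
    rng = random.Random(seed)
    subpops = [[[rng.uniform(-5, 5) for _ in range(dim)] for _ in range(size)]
               for _ in range(num_subpops)]
    subfits = [[sphere(x) for x in pop] for pop in subpops]
    start = min(min(f) for f in subfits)
    for g in range(generations):
        for pop, fit in zip(subpops, subfits):
            de_generation(pop, fit, sphere, rng)
        if (g + 1) % 5 == 0:
            ring_migrate(subpops, subfits)
    return start, min(min(f) for f in subfits)
```

Because both the greedy selection and the migration step only ever replace an individual with a better or equal one, the best fitness across all sub-populations cannot get worse from one generation to the next.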
Pub Date : 2024-10-10 DOI: 10.1016/j.knosys.2024.112600
Celeste Damiani, Yulia Rodina, Sergio Decherchi
Federated learning is becoming an increasingly viable and accepted strategy for building machine learning models in critical privacy-preserving scenarios such as clinical settings. Often, the data involved is not limited to clinical data but also includes additional omics features (e.g. proteomics). Consequently, data is distributed not only across hospitals but also across omics centers, which are labs capable of generating such additional features from biosamples. This scenario leads to a hybrid setting where data is scattered both in terms of samples and features. In this setting, we present a novel efficient federated reformulation of the Kernel Regularized Least Squares algorithm which leverages a randomized version of the Nyström method, introduce two variants for the optimization process and validate them using well-established datasets. In principle, the presented core ideas could be applied to any other kernel method to make it federated. Lastly, we discuss security measures to defend against possible attacks.
{"title":"A hybrid federated kernel regularized least squares algorithm","authors":"Celeste Damiani , Yulia Rodina , Sergio Decherchi","doi":"10.1016/j.knosys.2024.112600","DOIUrl":"10.1016/j.knosys.2024.112600","PeriodicalId":49939,"journal":{"name":"Knowledge-Based Systems"},"PeriodicalIF":7.2,"publicationDate":"2024-10-10","publicationTypes":"Journal Article","isOpenAccess":false,"platform":"Semanticscholar","paperid":"142529371","RegionNum":1,"RegionCategory":"Computer Science","Total":0}
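As background for the abstract above, here is a minimal single-machine sketch of Kernel Regularized Least Squares with a Nyström approximation, where m landmark points are drawn uniformly at random and the coefficients solve (K_nm^T K_nm + lambda * n * K_mm) alpha = K_nm^T y. It does not show the federated or hybrid part of the paper. The Gaussian kernel, the 1-D inputs, the bandwidth `sigma`, and the helper names are assumptions chosen to keep the example self-contained.

```python
import math
import random

def gaussian_kernel(a, b, sigma=1.0):
    """Gaussian (RBF) kernel on scalars; the kernel choice is an assumption."""
    return math.exp(-((a - b) ** 2) / (2.0 * sigma ** 2))

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(M[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (M[r][n] - s) / M[r][r]
    return x

def fit_nystrom_krls(xs, ys, m, lam, rng):
    """Fit Nystrom KRLS; xs must contain distinct values."""
    n = len(xs)
    landmarks = rng.sample(xs, m)  # uniform random landmark selection
    K_nm = [[gaussian_kernel(x, z) for z in landmarks] for x in xs]
    K_mm = [[gaussian_kernel(z1, z2) for z2 in landmarks] for z1 in landmarks]
    # Normal equations of (1/n)||K_nm a - y||^2 + lam * a^T K_mm a.
    A = [[sum(K_nm[i][p] * K_nm[i][q] for i in range(n)) + lam * n * K_mm[p][q]
          for q in range(m)] for p in range(m)]
    rhs = [sum(K_nm[i][p] * ys[i] for i in range(n)) for p in range(m)]
    return landmarks, solve(A, rhs)

def predict(x, landmarks, alpha):
    """Evaluate the fitted function at a new point x."""
    return sum(a * gaussian_kernel(x, z) for a, z in zip(alpha, landmarks))
```

The appeal of the Nyström form is that the linear system is only m by m rather than n by n, which is what makes it attractive when the data cannot be gathered in one place.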
Pub Date : 2024-10-10 DOI: 10.1016/j.knosys.2024.112610
Shuhuan Wen, Simeng Gong, Ziyuan Zhang, F. Richard Yu, Zhiwen Wang
Vision-and-language navigation (VLN) is a challenging task that requires an agent to navigate an indoor environment using natural language instructions. Traditional VLN employs cross-modal feature fusion, where visual and textual information are combined to guide the agent’s navigation. However, incomplete use of perceptual information, scarcity of domain-specific training data, and diverse image and language inputs result in suboptimal performance. Herein, we propose a VLN method based on history-aware cross-modal feature fusion that leverages an agent’s past experiences to make more informed navigation decisions. A regretful model and a self-monitoring model are added, and the advantage actor-critic (A2C) reinforcement learning algorithm is employed to improve the navigation success rate, reduce action redundancy, and shorten navigation paths. Subsequently, a data augmentation method based on speaker data is introduced to improve the model’s generalizability. We evaluate the proposed algorithm on the room-to-room (R2R) and room-for-room (R4R) benchmarks, and the experimental results demonstrate that it outperforms state-of-the-art methods.
{"title":"Vision-and-language navigation based on history-aware cross-modal feature fusion in indoor environment","authors":"Shuhuan Wen , Simeng Gong , Ziyuan Zhang , F. Richard Yu , Zhiwen Wang","doi":"10.1016/j.knosys.2024.112610","DOIUrl":"10.1016/j.knosys.2024.112610","PeriodicalId":49939,"journal":{"name":"Knowledge-Based Systems"},"PeriodicalIF":7.2,"publicationDate":"2024-10-10","publicationTypes":"Journal Article","isOpenAccess":false,"platform":"Semanticscholar","paperid":"142442325","RegionNum":1,"RegionCategory":"Computer Science","Total":0}
Pub Date : 2024-10-09 DOI: 10.1016/j.knosys.2024.112617
Huan Zhao, Zhiyuan Gong, Keyao Gan, Yujie Gan, Haonan Xing, Shekun Wang
The surrogate-based optimization (SBO) approach is becoming increasingly popular in the expensive aerodynamic design of aircraft. However, as the number of design variables required to parameterize a complex shape grows, SBO suffers severely from the curse of dimensionality. To ameliorate this issue, a supervised nonlinear dimensionality-reduction surrogate modelling method is proposed. The method combines supervised kernel principal component analysis (SKPCA) and polynomial chaos-Kriging (PCK) in a joint surrogate modelling process and adaptively establishes an accurate mapping from the high-dimensional inputs to the output of the system. Because the SKPCA-PCK method fully considers the effect of inputs on outputs and adaptively trains the hyper-parameters during surrogate modelling, it avoids the low prediction accuracy and instability of surrogate models built with current linear or unsupervised dimensionality-reduction methods. Further, an efficient SKPCA-PCK-based global optimization method for high-dimensional aerodynamic design is developed. The performance of the proposed method is examined on two numerical examples, the transonic RAE2822 airfoil and the wing of the NASA Common Research Model. Results demonstrate that the proposed SKPCA-PCK method significantly improves the modelling efficiency and accuracy compared to the unsupervised linear PCA-Kriging method. More importantly, the proposed SKPCA-PCK-based optimization method provides better performance and an appreciably higher optimization efficiency for expensive single-point and robust aerodynamic design involving high-dimensional design variables compared to the Kriging-based optimization method. These results provide further evidence that the proposed method is a promising approach for mitigating the curse of dimensionality in SBO.
{"title":"Supervised kernel principal component analysis-polynomial chaos-Kriging for high-dimensional surrogate modelling and optimization","authors":"Huan Zhao , Zhiyuan Gong , Keyao Gan , Yujie Gan , Haonan Xing , Shekun Wang","doi":"10.1016/j.knosys.2024.112617","DOIUrl":"10.1016/j.knosys.2024.112617","PeriodicalId":49939,"journal":{"name":"Knowledge-Based Systems"},"PeriodicalIF":7.2,"publicationDate":"2024-10-09","publicationTypes":"Journal Article","isOpenAccess":false,"platform":"Semanticscholar","paperid":"142437787","RegionNum":1,"RegionCategory":"Computer Science","Total":0}
Pub Date : 2024-10-09 DOI: 10.1016/j.knosys.2024.112601
Shu Li, Lixin Han, Yang Wang, Yonglin Pu, Jun Zhu, Jingxian Li
Contrastive learning demonstrates remarkable generalization performance but lacks theoretical understanding, while contrastive clustering achieves promising performance but exhibits some shortcomings. We first introduce a generalized bias-variance decomposition to study contrastive learning, then present the concept of the conformal field, which unifies instance-level contrastive loss and cluster-level de-redundancy loss (Barlow Twins). Finally, we integrate the conformal field and self-labeling to propose the outstanding contrastive clustering model D3CF. D3CF consists of two novel stages: the pre-training stage simultaneously performs instance-level contrastive learning and multi-view cluster-level redundancy reduction, bringing positive samples together and separating negative samples in the row and column space of the augmented feature matrix; to alleviate the adverse effects caused by false-negative pairs and misclustered assignments in the pre-training stage, the boosting stage enhances contrastive learning from single-positive pairs to multiple-positive pairs by leveraging cross-sample similarities, while utilizing pseudo-labels with high confidence criteria for self-labeling to correct clustering assignments. Extensive experiments on six image benchmark datasets and two text benchmarks demonstrate D3CF’s superior performance and validate the effectiveness of its components. Particularly on CIFAR-10, ImageNet-10, and STL-10, D3CF achieves average accuracies of 89.5%, 97%, and 91%, improving NMI by 5.2%, 4.8%, and 2.1%, and ARI by 7%, 7.3%, and 7.3% over the closest baseline.
{"title":"Contrastive clustering based on generalized bias-variance decomposition","authors":"Shu Li , Lixin Han , Yang Wang , Yonglin Pu , Jun Zhu , Jingxian Li","doi":"10.1016/j.knosys.2024.112601","DOIUrl":"10.1016/j.knosys.2024.112601","PeriodicalId":49939,"journal":{"name":"Knowledge-Based Systems"},"PeriodicalIF":7.2,"publicationDate":"2024-10-09","publicationTypes":"Journal Article","isOpenAccess":false,"platform":"Semanticscholar","paperid":"142446716","RegionNum":1,"RegionCategory":"Computer Science","Total":0}
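The abstract above refers to a Barlow Twins style de-redundancy loss. As a reminder of what that term computes, the sketch below evaluates the standard Barlow Twins objective on two views of a batch: columns are standardized, a cross-correlation matrix is formed, diagonal entries are pulled toward 1, and off-diagonal entries are pulled toward 0. It is not the D3CF model or its conformal field; the weight `off_diag_weight=0.005` and the plain-list matrix layout (rows are samples, columns are feature or cluster dimensions) are assumptions for illustration.

```python
import math

def standardize_columns(Z):
    """Give every column zero mean and unit population standard deviation.
    Assumes no column is constant."""
    n, d = len(Z), len(Z[0])
    out = [[0.0] * d for _ in range(n)]
    for j in range(d):
        col = [row[j] for row in Z]
        mean = sum(col) / n
        std = math.sqrt(sum((v - mean) ** 2 for v in col) / n)
        for i in range(n):
            out[i][j] = (col[i] - mean) / std
    return out

def barlow_twins_loss(Z1, Z2, off_diag_weight=0.005):
    """Sum of (1 - C_pp)^2 plus weighted sum of C_pq^2 for p != q,
    where C is the cross-correlation matrix between the two views."""
    A, B = standardize_columns(Z1), standardize_columns(Z2)
    n, d = len(A), len(A[0])
    loss = 0.0
    for p in range(d):
        for q in range(d):
            c = sum(A[i][p] * B[i][q] for i in range(n)) / n
            loss += (1.0 - c) ** 2 if p == q else off_diag_weight * c ** 2
    return loss
```

Two identical views with uncorrelated columns give a loss of zero, while two perfectly correlated columns are penalized only through the off-diagonal term, which is the redundancy the loss is meant to remove.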
Pub Date : 2024-10-09 DOI: 10.1016/j.knosys.2024.112548
Siqi Hui, Sanping Zhou, Ye Deng, Yang Wu, Jinjun Wang
Cross-Domain Few-Shot Learning (CD-FSL) addresses few-shot learning under a domain gap between source and target domains, transferring knowledge from a source domain to a target domain with limited labeled samples. Current approaches often incorporate an auxiliary target dataset containing a few labeled samples to enhance model generalization on specific target domains. However, we observe that many models retain a substantial number of channels that learn source-specific knowledge and extract features that perform adequately on the source domain but generalize poorly to the target domain. This often results in compromised performance due to the influence of source-specific knowledge. To address this challenge, we introduce a novel framework, Gradient-Guided Channel Masking (GGCM), designed for CD-FSL to prevent model channels from acquiring too much source-specific knowledge. GGCM quantifies each channel’s contribution to solving target tasks using gradients of the target loss and identifies those with smaller gradients as source-specific. These channels are then masked during the forward propagation of source features to mitigate the learning of source-specific knowledge. Conversely, GGCM mutes non-source-specific channels during the forward propagation of target features, forcing the model to depend on the source-specific channels and thereby enhancing their generalizability. Moreover, we propose a consistency loss that aligns the predictions made by source-specific channels with those made by the entire model. This approach further enhances the generalizability of these channels by enabling them to learn from the generalizable knowledge contained in other non-source-specific channels. Validated across multiple CD-FSL benchmark datasets, our framework demonstrates state-of-the-art performance and effectively suppresses the learning of source-specific knowledge.
{"title":"Gradient-guided channel masking for cross-domain few-shot learning","authors":"Siqi Hui , Sanping Zhou , Ye Deng , Yang Wu , Jinjun Wang","doi":"10.1016/j.knosys.2024.112548","DOIUrl":"10.1016/j.knosys.2024.112548","PeriodicalId":49939,"journal":{"name":"Knowledge-Based Systems"},"PeriodicalIF":7.2,"publicationDate":"2024-10-09","publicationTypes":"Journal Article","isOpenAccess":false,"platform":"Semanticscholar","paperid":"142446714","RegionNum":1,"RegionCategory":"Computer Science","Total":0}
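The masking rule described in the abstract above can be sketched in a few lines. The code below scores each channel by the mean absolute gradient of the target loss, flags the lowest-scoring fraction as source-specific, and then zeroes the flagged channels for source features or the unflagged channels for target features. The scoring statistic, the `ratio` threshold, and the function names are assumptions; the abstract does not give these details, and the consistency loss is not shown.

```python
def source_specific_channels(target_grads, ratio=0.5):
    """Flag the `ratio` fraction of channels with the smallest mean |gradient|.

    target_grads[c] is a list of gradient values of the target loss for
    channel c (hypothetical layout). Returns one boolean per channel.
    """
    scores = [sum(abs(g) for g in grads) / len(grads) for grads in target_grads]
    k = int(len(scores) * ratio)
    order = sorted(range(len(scores)), key=scores.__getitem__)
    flagged = set(order[:k])
    return [c in flagged for c in range(len(scores))]

def mask_features(features, is_source_specific, domain):
    """Zero out channels depending on the domain the features come from."""
    if domain == "source":
        # Source pass: drop the source-specific channels.
        keep = [not s for s in is_source_specific]
    elif domain == "target":
        # Target pass: drop the other channels so the flagged ones must adapt.
        keep = list(is_source_specific)
    else:
        raise ValueError("domain must be 'source' or 'target'")
    return [f if k else 0.0 for f, k in zip(features, keep)]
```

In a real network the features would be tensors with a channel axis and the mask would be broadcast over the spatial dimensions; per-channel scalars are used here only to keep the example dependency-free.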
Pub Date : 2024-10-08 DOI: 10.1016/j.knosys.2024.112602
Lingguang Hao, Kuangrong Hao, Yaochu Jin, Hongzhi Zhao
Transfer-based attacks use a proxy model to craft adversarial examples against a target model and have made significant advances in the realm of black-box attacks. Recent research suggests that these attacks can be enhanced by incorporating adversarial defenses into the training process of adversarial examples. Specifically, adversarial defenses supervise the training process, forcing the attacker to overcome greater challenges and produce more robust adversarial examples with enhanced transferability. However, current methods mainly rely on limited input transformation defenses, which apply only linear affine changes. These defenses are insufficient for effectively removing harmful content from adversarial examples, resulting in restricted improvements in their transferability. To address this issue, we propose a novel training framework named Transfer-based Attacks through Hypothesis Defense (TA-HD). This framework enhances the generalization of adversarial examples by integrating a hypothesis defense mechanism into the proxy model. Specifically, we propose an input denoising network as the hypothesis defense to effectively remove harmful noise from adversarial examples. Furthermore, we introduce an adversarial training strategy and design specific adversarial loss functions to optimize the input denoising network’s parameters. The visualization of the training process demonstrates the effective denoising capability of the hypothesis defense mechanism and the stability of the training process. Extensive experiments show that the proposed training framework significantly improves the success rate of transfer-based attacks by up to 19.9%. The code is available at https://github.com/haolingguang/TA-HD.
{"title":"A hypothetical defenses-based training framework for generating transferable adversarial examples","authors":"Lingguang Hao , Kuangrong Hao , Yaochu Jin , Hongzhi Zhao","doi":"10.1016/j.knosys.2024.112602","DOIUrl":"10.1016/j.knosys.2024.112602","PeriodicalId":49939,"journal":{"name":"Knowledge-Based Systems"},"PeriodicalIF":7.2,"publicationDate":"2024-10-08","publicationTypes":"Journal Article","isOpenAccess":false,"platform":"Semanticscholar","paperid":"142433832","RegionNum":1,"RegionCategory":"Computer Science","Total":0}
Pub Date : 2024-10-06 DOI: 10.1016/j.knosys.2024.112589
Doaa El-Shahat, Mohamed Abdel-Basset, Nourhan Talal, Abduallah Gamal, Mohamed Abouhawwash
Contemporary advancements in technology provide vast quantities of high-dimensional data, leading to high computing burdens. Such data suffer from irrelevant, redundant, and noisy features. Hence, Feature Selection (FS) has become a crucial task for identifying the optimal subsets of features. This research proposes a Binary version of Young's Double-Slit Experiment optimizer (BYDSE) with a crossover operation (BYDSEX) for tackling FS problems. The proposed algorithm employs a V-shaped transfer function to convert the continuous solutions generated by the standard YDSE into binary ones. To assess the new solutions, we employ a well-known wrapper approach based on the K-Nearest Neighbors (KNN) classifier with the Euclidean distance metric. We integrate an adaptive crossover with a bitwise AND operation into the suggested algorithm to enhance its exploration and population diversity. Moreover, the bitwise AND operation transfers the most informative and beneficial features to the new solutions. We compared BYDSEX with nine of the most recent and powerful algorithms using 31 large-scale datasets to demonstrate its efficacy. In addition, the BYDSEX optimizer is used to detect the DDoS attacks faced by most IoT devices and contemporary technologies, using six datasets extracted from CIC-DDoS2019 and NSL-KDD. Various performance metrics are used to assess the algorithms, such as the accuracy, the selected feature size, the fitness values, and the time. Two statistical tests are carried out: the paired-samples t-test and the Wilcoxon signed-rank test. BYDSEX achieved superior results compared to its competitors for most of the datasets. Furthermore, BYDSEX obtains average accuracy values of 99.78%, 99.89%, 99.69% and 99.48% for LDAP, MSSQL, NETBIOS and NSL-KDD, respectively.
{"title":"BYDSEX: Binary Young's double-slit experiment optimizer with adaptive crossover for feature selection: Investigating performance issues of network intrusion detection","authors":"Doaa El-Shahat , Mohamed Abdel-Basset , Nourhan Talal , Abduallah Gamal , Mohamed Abouhawwash","doi":"10.1016/j.knosys.2024.112589","DOIUrl":"10.1016/j.knosys.2024.112589","url":null,"abstract":"<div><div>Contemporary advancements in technology provide vast quantities of data with large dimensions, leading to high computing burdens. These big data quantities suffer from irrelevant, redundant, and noisy features. Hence, Feature Selection (FS) has become a crucial task to identify the optimal subsets of features. This research proposes a Binary version of Young's Double-Slit Experiment optimizer (BYDSE) with crossover operation (BYDSEX) for tackling FS issues. Furthermore, the proposed algorithm employs the <em>V-shaped</em> transfer function to convert continuous solutions generated by the standard YDSE into binary ones. To assess the new solutions, we employ a well-known wrapper approach based on the K-Nearest Neighbors (KNN) classifier, which uses the Euclidean distance metric. We integrate an adaptive crossover with a bitwise AND operation into the suggested algorithm to enhance its exploration and population diversity. Moreover, the bitwise AND operation transfers the most informative and beneficial features to the new solutions. We compared BYDSEX with nine of the most recent and powerful algorithms using 31 large-scale datasets to demonstrate its efficacy. Moreover, our BYDSEX optimizer is utilized to detect the DDoS attacks faced by most IoT devices and contemporary technologies, using six datasets extracted from CIC-DDoS2019 and NSL-KDD. Various performance metrics are utilized to assess the algorithms, such as the accuracy, the selected feature size, the fitness values, and the time. Two statistical tests are carried out: the paired-samples t-test and the Wilcoxon signed-rank test. BYDSEX achieved superior results compared to its competitors for most of the datasets. Furthermore, BYDSEX obtains average accuracy values of 99.78%, 99.89%, 99.69% and 99.48% for LDAP, MSSQL, NETBIOS and NSL-KDD, respectively.</div></div>","PeriodicalId":49939,"journal":{"name":"Knowledge-Based Systems","volume":null,"pages":null},"PeriodicalIF":7.2,"publicationDate":"2024-10-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142433830","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2024-10-06DOI: 10.1016/j.knosys.2024.112603
Boyuan Zhu , Fagui Liu , Xi Chen , Quan Tang , C.L. Philip Chen
Scene text detection is crucial across numerous application fields. However, despite the emphasis on real-time performance in scene text detection, most existing detection models utilize the Feature Pyramid Network (FPN) for feature extraction, often disregarding its inherent limitations. Integrating high-resolution multi-channel features into FPN requires substantial computational resources. While FPN treats local and global features equally and is stable in various applications, its suitability for text-specific features is questionable. To this end, we propose the Asymmetric Center Positioning Network (ACP-Net) to replace FPN, achieving accurate, real-time text detection in complex scenarios. ACP-Net features an asymmetric feature structure with independent branches for global and local information, along with an adaptive weighted fusion module to capture long-range dependencies effectively. In addition, a text center positioning module enhances text feature understanding by learning feature centers. Comprehensive evaluations across various terminals confirmed ACP-Net’s superior accuracy and speed.
{"title":"ACP-Net: Asymmetric Center Positioning Network for Real-Time Text Detection","authors":"Boyuan Zhu , Fagui Liu , Xi Chen , Quan Tang , C.L. Philip Chen","doi":"10.1016/j.knosys.2024.112603","DOIUrl":"10.1016/j.knosys.2024.112603","url":null,"abstract":"<div><div>Scene text detection is crucial across numerous application fields. However, despite the emphasis on real-time performance in scene text detection, most existing detection models utilize the Feature Pyramid Network (FPN) for feature extraction, often disregarding its inherent limitations. Integrating high-resolution multi-channel features into FPN requires substantial computational resources. While FPN treats local and global features equally and is stable in various applications, its suitability for text-specific features is questionable. To this end, we propose the Asymmetric Center Positioning Network (ACP-Net) to replace FPN, achieving accuracy and real-time text detection in complex scenarios. ACP-Net features an asymmetric feature structure with independent branches for global and local information, along with an adaptive weighted fusion module to capture long-range dependencies effectively. In addition, a text center positioning module enhances text feature understanding by learning feature centers. Comprehensive evaluations across various terminals confirmed ACP-Net’s superior accuracy and speed.</div></div>","PeriodicalId":49939,"journal":{"name":"Knowledge-Based Systems","volume":null,"pages":null},"PeriodicalIF":7.2,"publicationDate":"2024-10-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142442340","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}