Pub Date: 2024-09-18, DOI: 10.1007/s13042-024-02376-0
Xueli Zhang, Jiale Chen, Qihua Li, Jianjun Zhang, Wing W. Y. Ng, Ting Wang
Machine learning as a service (MLaaS) has become a widely adopted paradigm, allowing customers to access even the most complex machine learning models through a pay-per-query interface. Black-box distribution is widely used to keep models secret in MLaaS. However, even though black-box distribution alleviates certain risks, a model's functionality can still be compromised once customers gain access to its predictions. To protect the intellectual property of model owners, we propose an effective defense against model stealing attacks based on localized stochastic sensitivity (LSS), named LSSMSD. First, suspicious queries are detected by an out-of-distribution (OOD) detector. Many existing defenses rely too heavily on the OOD detection results, which degrades the model's fidelity; we introduce LSS to address this problem. By calculating the LSS of suspicious queries, we selectively output misleading predictions for queries with high LSS via a misinformation mechanism. Extensive experiments demonstrate that LSSMSD robustly protects victim models against black-box proxy attacks such as Jacobian-based dataset augmentation and Knockoff Nets: it significantly reduces the accuracy of attackers' substitute models (by up to 77.94%) while having minimal impact on benign users (an average accuracy change of −2.72%), thereby maintaining the fidelity of the victim model.
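The detect-then-mislead pipeline the abstract describes can be sketched as follows. This is an illustrative reading, not the paper's implementation: the LSS estimator (mean output deviation under small uniform perturbations), the `q` neighbourhood width, the threshold, and the random-wrong-label misinformation rule are all assumptions.

```python
import numpy as np

def localized_stochastic_sensitivity(model, x, q=0.1, n_samples=50):
    """Approximate LSS: mean deviation of the model's output under small
    uniform perturbations of x within a q-neighbourhood (an assumed
    estimator; the paper's exact formulation may differ)."""
    rng = np.random.default_rng()
    y = model(x)
    perturbed = x + rng.uniform(-q, q, size=(n_samples,) + x.shape)
    deviations = [np.linalg.norm(model(p) - y) for p in perturbed]
    return float(np.mean(deviations))

def defended_predict(model, x, is_ood, lss_threshold=0.5, n_classes=10, seed=None):
    """Honest prediction for benign queries; for suspicious (OOD) queries
    with high LSS, return a misleading label instead."""
    rng = np.random.default_rng(seed)
    honest = int(np.argmax(model(x)))
    if is_ood(x) and localized_stochastic_sensitivity(model, x) > lss_threshold:
        # misinformation mechanism: any label except the honest one
        return int(rng.choice([c for c in range(n_classes) if c != honest]))
    return honest
```

Benign queries pass through untouched, so the defense only spends fidelity on queries that are both OOD-flagged and highly sensitive.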
Title: LSSMSD: defending against black-box DNN model stealing based on localized stochastic sensitivity (International Journal of Machine Learning and Cybernetics)
Pub Date: 2024-09-17, DOI: 10.1007/s13042-024-02375-1
Yuanyuan Lin, Nianrui Wang, Jiangyan Liu, Fangqin Zhang, Zhouchao Wei, Ming Yi
Circular RNAs (circRNAs) are a special class of endogenous non-coding RNA molecules with a closed circular structure. Numerous studies have demonstrated that exploring the associations between circRNAs and diseases helps reveal disease pathogenesis. However, traditional biological experimental methods are time-consuming. Although some methods have explored disease-associated circRNAs from different perspectives, how to effectively integrate multi-perspective circRNA data has not been well studied, and feature aggregation between heterogeneous nodes has not been fully considered. Based on these considerations, a novel computational framework, CHNSCDA, is proposed to efficiently forecast unknown circRNA-disease associations (CDAs). Specifically, we calculate sequence similarity and functional similarity for circRNAs, as well as semantic similarity for diseases. These similarities are combined with Gaussian interaction profile (GIP) kernel similarity, respectively, and fused by taking element-wise maxima. Moreover, strongly correlated circRNA-circRNA and disease-disease associations are selectively combined to construct a heterogeneous network. Subsequently, we predict potential CDAs using a multi-head dynamic attention mechanism and a multi-layer convolutional neural network. Experimental results show that CHNSCDA outperforms four state-of-the-art methods and achieves an area under the ROC curve of 0.9803 in 5-fold cross-validation (5-fold CV). In addition, extensive ablation experiments confirm the contribution of the similarity fusion scheme, the feature aggregation method, and the dynamic attention mechanism. Case studies further demonstrate the outstanding performance of CHNSCDA in predicting potential CDAs.
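The two similarity-building steps named in the abstract — GIP kernel similarity and fusion by taking maxima — have a compact form. The sketch below follows the standard GIP kernel formulation (bandwidth normalised by the mean squared profile norm); CHNSCDA's exact normalisation is an assumption here.

```python
import numpy as np

def gip_kernel(interaction_profiles):
    """Gaussian interaction profile (GIP) kernel similarity.
    Rows are binary association vectors (e.g. one circRNA's links to all
    diseases); K[i, j] = exp(-gamma * ||ip_i - ip_j||^2) with gamma set
    from the mean squared profile norm (common GIP convention)."""
    ip = np.asarray(interaction_profiles, dtype=float)
    gamma = 1.0 / np.mean(np.sum(ip ** 2, axis=1))
    sq_dists = np.sum((ip[:, None, :] - ip[None, :, :]) ** 2, axis=-1)
    return np.exp(-gamma * sq_dists)

def fuse_similarities(*sims):
    """Fuse several similarity matrices by element-wise maxima,
    as the abstract describes."""
    return np.maximum.reduce([np.asarray(s, dtype=float) for s in sims])
```

Taking the maximum keeps a pair similar if *any* view (sequence, functional, semantic, or GIP) considers it similar, at the cost of never down-weighting an overconfident view.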
Title: CHNSCDA: circRNA-disease association prediction based on strongly correlated heterogeneous neighbor sampling (International Journal of Machine Learning and Cybernetics)
Scanned 3D point cloud data is typically noisy and incomplete. Existing point cloud completion methods tend to learn a mapping from the available partial shape to the complete one but ignore the structural relationships in local regions, so they are less competent at learning point distributions and recovering object details. This paper proposes a shape-aware point cloud completion network (SCNet) that employs multi-scale features and a coarse-to-fine strategy to generate detailed, complete point clouds. Firstly, we introduce a K-feature nearest neighbor (KFNN) algorithm to explore local geometric structure and design a novel shape-aware graph convolution that uses multiple learnable filters to perceive local shape changes in different directions. Secondly, we adopt non-local feature expansion to generate a coarse point cloud as the rough shape and merge it with the input data to preserve the original structure. Finally, we employ a residual network to fine-tune the point coordinates and smooth the merged point cloud, which is then optimized into a fine point cloud by a refinement module with shape-aware graph convolution and local attention mechanisms. Extensive experiments demonstrate that SCNet outperforms other methods on the same point cloud completion benchmark and is more stable and robust.
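The KFNN step — grouping points by proximity in *feature* space rather than coordinate space — can be sketched as below. The brute-force distance computation and the exclusion of self-matches are assumptions; the paper's exact metric and any efficiency tricks may differ.

```python
import numpy as np

def k_feature_nearest_neighbors(features, k):
    """Return, for each point, the indices of its k nearest neighbours
    measured in per-point feature space (N, C) rather than xyz space.
    Output shape: (N, k), self-matches excluded."""
    f = np.asarray(features, dtype=float)
    sq_dists = np.sum((f[:, None, :] - f[None, :, :]) ** 2, axis=-1)
    np.fill_diagonal(sq_dists, np.inf)   # a point is not its own neighbour
    return np.argsort(sq_dists, axis=1)[:, :k]
```

A shape-aware graph convolution would then aggregate each point's neighbour features through its bank of learnable directional filters; the index matrix above defines the graph it operates on.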
Title: Scnet: shape-aware convolution with KFNN for point clouds completion (International Journal of Machine Learning and Cybernetics)
Authors: Xiangyang Wu, Ziyuan Lu, Chongchong Qu, Haixin Zhou, Yongwei Miao
Pub Date: 2024-09-16, DOI: 10.1007/s13042-024-02359-1
Pub Date: 2024-09-16, DOI: 10.1007/s13042-024-02355-5
Yunning Cao, Chuanbin Liu, Ye Ma, Min Zhou, Tiezheng Ge, Yuning Jiang, Hongtao Xie
Layout generation is an emerging computer vision task that combines the challenges of object localization and aesthetic evaluation, and is widely used in advertisement, poster, and slide design. An ideal layout should consider both the intra-domain relationships among layout elements and the inter-domain relationships between layout elements and the image. However, most previous methods focus on image-content-agnostic layout generation without leveraging the rich visual information in the image. To address this limitation, we propose a novel paradigm, image-conditioned layout generation, which aims to add text overlays to an image in a semantically coherent manner. Specifically, we introduce the Image-Conditioned Variational Transformer (ICVT), which autoregressively generates diverse layouts for an image. Firstly, a self-attention mechanism models the contextual relationships among layout elements, while a cross-attention mechanism fuses the visual information of the conditional image; these serve as building blocks of a conditional variational autoencoder (CVAE), which exhibits appealing diversity. Secondly, to narrow the gap between the layout-element domain and the visual domain, we design a Geometry Alignment module that aligns the geometric information of the image with the layout representation. Thirdly, we present a self-refinement mechanism that automatically refines failure cases of generated layouts, effectively improving generation quality. Experimental results show that our model can adaptively generate layouts in non-intrusive areas of the image, resulting in harmonious layout designs.
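The cross-attention fusion step — layout-element queries attending over image-patch features — reduces to standard scaled dot-product attention. The single-head, projection-free sketch below is an illustration only; ICVT's real blocks use learned query/key/value projections and multiple heads.

```python
import numpy as np

def cross_attention(layout_tokens, image_tokens):
    """Single-head scaled dot-product cross-attention: each layout-element
    token (query) attends over image-patch tokens (keys == values here,
    learned projections omitted for brevity).
    layout_tokens: (L, C), image_tokens: (P, C) -> output (L, C)."""
    q = np.asarray(layout_tokens, dtype=float)
    kv = np.asarray(image_tokens, dtype=float)
    scores = q @ kv.T / np.sqrt(q.shape[-1])            # (L, P)
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)      # row-wise softmax
    return weights @ kv
```

Self-attention is the same computation with the layout tokens serving as queries, keys, and values at once; stacking both gives the conditional encoder the abstract outlines.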
Title: Self-refined variational transformer for image-conditioned layout generation (International Journal of Machine Learning and Cybernetics)
Pub Date: 2024-09-16, DOI: 10.1007/s13042-024-02348-4
Jinyu Yang, Yanjiao Shi, Ying Jiang, Zixuan Lu, Yugen Yi
Camouflaged object detection (COD) is challenging because camouflaged objects have irregular shapes and colors that resemble, or even blend into, their surroundings. Directly applying salient object detection methods yields unsatisfactory results due to the low contrast with the surrounding environment and the obscure object boundaries. To locate camouflaged objects and achieve accurate segmentation, interaction between features is essential, and an effective feature aggregation method is equally important. In this paper, we propose a contextual fusion and feature refinement network (CFNet). Specifically, we propose a multiple-receptive-fields-based feature extraction module (MFM) that obtains features from receptive fields at multiple scales. The features are then fed into an attention-based information interaction module (AIM), which establishes information flow between adjacent layers through an attention mechanism. Finally, the features are fused and optimized layer by layer using a feature fusion module (FFM). We validate CFNet as an effective COD model on four benchmark datasets, and its generalization ability is verified on the salient object detection task.
Title: Contextual feature fusion and refinement network for camouflaged object detection (International Journal of Machine Learning and Cybernetics)
Ultra-high-definition (UHD) imaging devices are increasingly used for underwater image acquisition. However, due to light scattering and underwater impurities, UHD underwater images often suffer from color deviations and edge blurriness. Many studies have attempted to enhance underwater images by integrating frequency-domain and spatial-domain information. Nonetheless, these approaches typically fuse dual-domain features interactively only in the final fusion module, neglecting the complementary and guiding roles of frequency-domain and spatial-domain features. Additionally, the dual-domain features are extracted independently of each other, so each extracted feature set has pronounced strengths and weaknesses. Consequently, these methods place high demands on the fusion module's feature fusion capability. Yet, to handle UHD underwater images, the fusion modules in these methods often stack only a limited number of convolution and activation operations. The resulting fusion capability is insufficient, leading to defects in the restoration of edges and colors. To address these issues, we develop a dual-domain interaction network for enhancing UHD underwater images. The network lets frequency-domain and spatial-domain features complement and guide each other's feature extraction, and fully integrates the dual-domain features throughout the model to better recover image details and colors. Specifically, the network adopts a U-shaped structure in which each layer is composed of dual-domain interaction transformer blocks containing interactive multi-head attention and interactive simple gate feed-forward networks. The interactive multi-head attention captures local interaction features of frequency-domain and spatial-domain information with convolution operations, followed by multi-head attention to extract global information from the mixed features. The interactive simple gate feed-forward network further enhances the model's dual-domain interaction and cross-dimensional feature extraction, yielding clearer edges and more realistic colors. Experimental results demonstrate that our method significantly outperforms existing underwater image enhancement methods.
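The dual-domain (frequency plus spatial) decomposition such networks consume can be made concrete with a 2-D FFT: amplitude and phase together are a lossless frequency-domain view of the image. This sketches only the input decomposition, not the paper's interaction blocks.

```python
import numpy as np

def dual_domain_views(image):
    """Split an image into its two complementary views: the spatial view
    (raw pixels) and a frequency view (2-D FFT amplitude and phase)."""
    freq = np.fft.fft2(np.asarray(image, dtype=float))
    return image, np.abs(freq), np.angle(freq)

def from_frequency(amplitude, phase):
    """Invert the frequency view; amplitude and phase jointly carry all
    of the image's information, so the roundtrip is exact."""
    return np.real(np.fft.ifft2(amplitude * np.exp(1j * phase)))
```

Color shifts concentrate in the low-frequency amplitude while edge blur shows up as missing high-frequency content, which is why frequency-domain features complement pixel-space ones here.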
Title: Ultra-high-definition underwater image enhancement via dual-domain interactive transformer network (International Journal of Machine Learning and Cybernetics)
Authors: Weiwei Li, Feiyuan Cao, Yiwen Wei, Zhenghao Shi, Xiuyi Jia
Pub Date: 2024-09-15, DOI: 10.1007/s13042-024-02379-x
Pub Date: 2024-09-14, DOI: 10.1007/s13042-024-02354-6
Shouhao Zhao, Shujuan Ji, Jiandong Lv, Xianwen Fang
Rumors spread rapidly on social media and have a detrimental effect on our lives, so detecting them is becoming increasingly important. It has been shown that studying dynamic graphs helps capture the temporal evolution of information transmission and reveals the trends and pattern changes of events. However, current dynamic learning methods do not fully consider the interaction characteristics of the evolutionary process, making it difficult to capture the structural and semantic differences between events. To fully exploit the potential correlations in such temporal information, we propose a novel model, the dynamic evolution characteristics learning (DECL) method, for rumor detection. First, we partition the temporal snapshot sequences based on the propagation structure of rumors. Second, a multi-task graph contrastive learning method enables the graph encoder to capture the essential features of rumors and to fully explore the temporal structural differences and semantic similarities between true-rumor and false-rumor events. Experimental results on three real-world social media datasets confirm the effectiveness of our model for rumor detection tasks.
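The snapshot-partitioning step can be sketched as slicing a timestamped propagation tree into cumulative graphs at evenly spaced time boundaries. This is an assumed construction for illustration; DECL's partitioning criterion, which is based on propagation structure, may differ in detail.

```python
def temporal_snapshots(edges, n_snapshots):
    """Partition a rumor propagation tree into cumulative temporal snapshots.
    edges: iterable of (parent, child, timestamp) retweet/reply links.
    Snapshot i contains every edge observed up to the i-th time boundary,
    so later snapshots extend earlier ones (the evolution sequence)."""
    edges = sorted(edges, key=lambda e: e[2])
    t0, t1 = edges[0][2], edges[-1][2]
    step = (t1 - t0) / n_snapshots or 1.0   # guard against all-equal timestamps
    bounds = [t0 + step * (i + 1) for i in range(n_snapshots)]
    bounds[-1] = t1                          # ensure the final edge is included
    return [[(u, v) for u, v, t in edges if t <= b] for b in bounds]
```

A graph encoder applied to each snapshot then yields the sequence of representations whose evolution the contrastive objectives compare.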
Title: Propagation tree says: dynamic evolution characteristics learning approach for rumor detection (International Journal of Machine Learning and Cybernetics)
Pub Date: 2024-09-14, DOI: 10.1007/s13042-024-02361-7
Wenlu Zuo, Yuelin Gao
Differential evolution (DE) is a widely used meta-heuristic algorithm known for its simplicity and low computational overhead, but traditional DE cannot effectively balance exploration and exploitation. To solve this problem, this paper proposes a dynamic dual-population DE variant (ADPDE). Firstly, a dynamic population division mechanism based on individual potential values splits the population into two subgroups, effectively improving population diversity. Secondly, a nonlinear reduction mechanism dynamically adjusts the size of the potential subgroup to allocate computing resources reasonably. Thirdly, two distinct mutation strategies are adopted for the two subgroups to better utilize the information of potential individuals and ensure fast convergence. Finally, adaptive parameter settings for the two subgroups further balance exploration and exploitation. The effectiveness of the improved strategies is verified on 21 classical benchmark functions. To assess overall performance, ADPDE is compared with three standard DE algorithms, eight strong DE variants, and seven advanced evolutionary algorithms on the CEC2013, CEC2017, and CEC2020 test suites, and the results show that ADPDE achieves higher accuracy and faster convergence. Furthermore, ADPDE is compared with eight well-known optimizers and CEC2020 winner algorithms on nine real-world engineering optimization problems, and the results indicate that ADPDE is also promising for constrained optimization problems.
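The dual-population idea — one subgroup exploring while the other exploits — can be sketched with a minimal DE loop. This is an illustration only: ADPDE's potential-value division rule, nonlinear size reduction, and adaptive F/CR settings are replaced here by a simple fitness-based split and fixed parameters.

```python
import numpy as np

def dual_population_de(f, bounds, pop_size=40, generations=200, seed=0):
    """Minimal dual-population DE sketch for minimizing f over a box.
    Each generation the worse half of the population mutates with the
    explorative DE/rand/1 rule, the better half with the exploitative
    DE/best/1 rule; greedy selection keeps the better of parent/trial."""
    rng = np.random.default_rng(seed)
    lo, hi = np.asarray(bounds[0], float), np.asarray(bounds[1], float)
    pop = rng.uniform(lo, hi, (pop_size, lo.size))
    fit = np.array([f(x) for x in pop])
    F, CR = 0.5, 0.9                       # fixed here; adaptive in ADPDE
    for _ in range(generations):
        best = pop[np.argmin(fit)]
        explorers = set(np.argsort(fit)[pop_size // 2:])  # worse half
        for i in range(pop_size):
            a, b, c = pop[rng.choice(pop_size, 3, replace=False)]
            mutant = a + F * (b - c) if i in explorers else best + F * (b - c)
            cross = rng.random(lo.size) < CR               # binomial crossover
            trial = np.clip(np.where(cross, mutant, pop[i]), lo, hi)
            ft = f(trial)
            if ft < fit[i]:                                # greedy selection
                pop[i], fit[i] = trial, ft
    k = int(np.argmin(fit))
    return pop[k], float(fit[k])
```

On a smooth unimodal function the exploitative half drives rapid convergence while the explorative half guards against premature collapse, which is the balance the abstract describes.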
{"title":"Solving numerical and engineering optimization problems using a dynamic dual-population differential evolution algorithm","authors":"Wenlu Zuo, Yuelin Gao","doi":"10.1007/s13042-024-02361-7","DOIUrl":"https://doi.org/10.1007/s13042-024-02361-7","url":null,"abstract":"<p>Differential evolution (DE) is a cutting-edge meta-heuristic algorithm known for its simplicity and low computational overhead. But the traditional DE cannot effectively balance between exploration and exploitation. To solve this problem, in this paper, a dynamic dual-population DE variant (ADPDE) is proposed. Firstly, the dynamic population division mechanism based on individual potential value is presented to divide the population into two subgroups, effectively improving the population diversity. Secondly, a nonlinear reduction mechanism is designed to dynamically adjust the size of potential subgroup to allocate computing resources reasonably. Thirdly, two unique mutation strategies are adopted for two subgroups respectively to better utilise the effective information of potential individuals and ensure fast convergence speed. Finally, adaptive parameter setting methods of two subgroups further achieve the balance between exploration and exploitation. The effectiveness of improved strategies is verified on 21 classical benchmark functions. Then, to verify the overall performance of ADPDE, it is compared with three standard DE algorithms, eight excellent DE variants and seven advanced evolutionary algorithms on CEC2013, CEC2017 and CEC2020 test suites, respectively, and the results show that ADPDE has higher accuracy and faster convergence speed. 
Furthermore, ADPDE is compared with eight well-known optimizers and CEC2020 winner algorithms on nine real-world engineering optimization problems, and the results indicate ADPDE has the development potential for constrained optimization problems as well.</p>","PeriodicalId":51327,"journal":{"name":"International Journal of Machine Learning and Cybernetics","volume":"104 1","pages":""},"PeriodicalIF":5.6,"publicationDate":"2024-09-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142265566","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2024-09-14DOI: 10.1007/s13042-024-02366-2
Lewei Xie, Ruibo Wan, Yuxin Wang, Fangjian Li
The accuracy of stock price forecasting is of great significance for investment decision-making and risk management. However, the complexity and volatility of stock prices make it difficult for traditional forecasting methods to achieve high accuracy. To improve prediction accuracy, this article proposes a combined prediction method based on ICEEMDAN-FA-BiLSTM–GM. A comprehensive and effective indicator system is constructed, covering 60 indicators that affect stock prices, including traditional factors, market sentiment, macroeconomic indicators, and company financial data. In the data preprocessing stage, to suppress noise, the stock closing-price series is first decomposed with the ICEEMDAN method, which separates it into high-frequency and low-frequency components. The LLE technique is then applied to the remaining indicators for dimensionality reduction, yielding nine features. Each high-frequency subsequence is combined with all of the reduced features to construct new indicator sets as model inputs. In the prediction stage, the hyperparameters of the prediction model for each subseries are determined with the FA algorithm, and the high-frequency and low-frequency components are predicted separately with the BiLSTM and GM methods. Finally, the predictions of the subseries are superimposed to obtain the final stock price forecast. An empirical study is conducted using stock price data including the Shanghai Composite Index. The experimental results show that the proposed ICEEMDAN-FA-BiLSTM–GM model has clear advantages in prediction accuracy and stability over traditional methods and other combined prediction methods. The model thus supports more rational investment decisions and more accurate risk control.
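The decompose-predict-superimpose pipeline can be illustrated with a deliberately simplified stand-in: a trailing moving average replaces ICEEMDAN (low-frequency trend vs. high-frequency residual), linear extrapolation replaces GM for the trend, and the residual mean replaces BiLSTM for the high-frequency part. All names and parameters here are illustrative assumptions, not the paper's implementation.

```python
import numpy as np

def moving_average(series, window=5):
    """Trailing moving average: a crude low-frequency trend extractor
    standing in for ICEEMDAN decomposition."""
    return np.array([series[max(0, i - window + 1): i + 1].mean()
                     for i in range(len(series))])

def combined_forecast(series, window=5):
    """Decompose into trend + residual, forecast each component with a
    different simple model, then superimpose the forecasts -- mirroring
    the per-subseries predict-and-sum structure of the paper's pipeline."""
    trend = moving_average(series, window)
    residual = series - trend
    # Low-frequency component: one-step linear extrapolation (GM stand-in).
    trend_fc = 2 * trend[-1] - trend[-2]
    # High-frequency component: historical mean (BiLSTM stand-in).
    residual_fc = residual.mean()
    return float(trend_fc + residual_fc)

prices = np.array([10.0, 10.2, 10.1, 10.4, 10.5, 10.3, 10.6, 10.8])
forecast = combined_forecast(prices)
print(forecast)
```

The design point carried over from the paper is that each frequency band gets a predictor suited to its character (smooth trend vs. noisy oscillation), and the final forecast is simply the sum of the component forecasts.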
{"title":"Stock closing price prediction based on ICEEMDAN-FA-BiLSTM–GM combined model","authors":"Lewei Xie, Ruibo Wan, Yuxin Wang, Fangjian Li","doi":"10.1007/s13042-024-02366-2","DOIUrl":"https://doi.org/10.1007/s13042-024-02366-2","url":null,"abstract":"<p>The accuracy of stock price forecasting is of great significance in investment decision-making and risk management. However, the complexity and fluctuation of stock prices challenge the traditional forecasting methods to achieve the best accuracy. To improve the accuracy of stock price prediction, a sophisticated combination prediction method based on ICEEMDAN-FA-BiLSTM–GM has been proposed in this article. In this paper, a comprehensive and effective indicator system is constructed, covering 60 indicators such as traditional factors, market sentiment, macroeconomic indicators and company financial data, which affect stock prices. In the data preprocessing stage, in order to eliminate the influence of noise, the stock closing price series is first decomposed by using the ICEEMDAN method, which effectively divides them into high-frequency and low-frequency components according to their respective frequencies. Subsequently, LLE technique is used to narrow down the remaining indicators to obtain 9 narrowed features. Finally, each high-frequency subsequence is combined with all the dimensionality reduction features respectively to construct new indicator sets for input to the model. In the prediction stage, the hyperparameters of the prediction model for each subseries have been determined using the FA algorithm. The prediction has been carried out separately for the high-frequency and low-frequency components, employing the BiLSTM and GM prediction methods. Ultimately, the prediction results of each subseries have been superimposed to obtain the final stock price prediction value. In this paper, an empirical study was conducted using stock price data such as Shanghai composite index. 
The experimental results show that the established stock price prediction model based on ICEEMDAN-FA-BiLSTM–GM has obvious advantages in terms of prediction accuracy and stability compared with traditional methods and other combined prediction methods. This model can provide more accurate stock price prediction and promote the rationalization of investment decision and the accuracy of risk control.</p>","PeriodicalId":51327,"journal":{"name":"International Journal of Machine Learning and Cybernetics","volume":"50 1","pages":""},"PeriodicalIF":5.6,"publicationDate":"2024-09-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142269516","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2024-09-13DOI: 10.1007/s13042-024-02370-6
S. B. Aiswerya, S. Joseph Jawhar
Digital image forensics, particularly the detection of Copy-Move Forgery (CMF), faces significant challenges, especially under intricate adversarial attacks. In response, this paper presents a robust approach for detecting complex CMFs in digital images using the KeyPoint-Siamese Capsule Network (KP-SCN) and evaluates its resilience against adversarial attacks. The KP-SCN architecture incorporates keypoint detection, a Siamese network for feature extraction, and a capsule network for forgery detection. The method shows enhanced robustness against adversarial attacks, specifically image perturbation, patch removal, patch replacement, and spatial transformation. By using hierarchical feature representations and dynamic routing in capsule networks, the model effectively handles complex CMF, including rotation, scaling, and non-linear transformations. KP-SCN is trained on a large dataset, enabling it to identify copy-move forgeries by comparing extracted keypoints and their spatial relationships. It outperforms the state of the art on the CoMoFoD dataset, achieving precision, recall, and F1-score of 95.62%, 93.78%, and 94.69%, respectively, and shows strong results on other datasets: for CASIA v2.0, 90.45%, 88.97%, and 89.70%; for MICC-F2000, 91.32%, 90.27%, and 90.79%; for MICC-F600, 92.21%, 91.10%, and 91.65%; for MICC-F8multi, 89.75%, 87.92%, and 88.83%; and for IMD, 93.14%, 92.58%, and 92.86%. Compared with other methods, KP-SCN maintains high detection rates under various manipulations, including JPEG compression, rotation, scaling, noise, blurring, brightness changes, contrast adjustment, and zoom motion blur. For instance, it achieves an 80.657% detection rate on CoMoFoD under JPEG compression and 97.883% on IMD under a 10-degree rotation. These findings validate the robustness and adaptability of KP-SCN, making it a reliable solution for real-world forensic applications.
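The core intuition behind copy-move detection, matching spatially separated image regions with near-identical feature descriptors, can be sketched without any learned network. In this toy version (all parameters are illustrative assumptions), flattened non-overlapping blocks replace KP-SCN's learned keypoint descriptors, and exhaustive pairwise distance comparison replaces the Siamese matching stage; a real pipeline would use robust descriptors and a tolerance that survives recompression.

```python
import numpy as np

def block_descriptors(img, block=4):
    """Split a grayscale image into non-overlapping blocks and flatten each
    into a descriptor vector (a crude stand-in for learned keypoint features)."""
    h, w = img.shape
    descs, coords = [], []
    for y in range(0, h - block + 1, block):
        for x in range(0, w - block + 1, block):
            descs.append(img[y:y + block, x:x + block].ravel().astype(float))
            coords.append((y, x))
    return np.array(descs), coords

def detect_copy_move(img, block=4, sim_tol=1e-6, min_shift=8):
    """Flag pairs of spatially separated blocks with near-identical
    descriptors. sim_tol=1e-6 catches exact copies only; loosen it to
    tolerate recompressed or lightly post-processed forgeries."""
    descs, coords = block_descriptors(img, block)
    matches = []
    for i in range(len(descs)):
        for j in range(i + 1, len(descs)):
            (yi, xi), (yj, xj) = coords[i], coords[j]
            if abs(yi - yj) + abs(xi - xj) < min_shift:
                continue  # skip neighbours: smooth natural regions match trivially
            if np.linalg.norm(descs[i] - descs[j]) <= sim_tol:
                matches.append((coords[i], coords[j]))
    return matches

# Forge a synthetic image: copy one textured patch to another location.
rng = np.random.default_rng(1)
img = rng.integers(0, 256, size=(16, 16))
img[8:12, 8:12] = img[0:4, 0:4]  # the copy-move forgery
matches = detect_copy_move(img)
print(matches)
```

Running this flags the duplicated block pair at (0, 0) and (8, 8), while the random texture elsewhere produces no spurious matches; the adversarial attacks the paper addresses (rotation, scaling, perturbation) are exactly what defeats such a naive exact-match scheme and motivates learned, transformation-tolerant descriptors.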
{"title":"Detecting complex copy-move forgery using KeyPoint-Siamese Capsule Network against adversarial attacks","authors":"S. B. Aiswerya, S. Joseph Jawhar","doi":"10.1007/s13042-024-02370-6","DOIUrl":"https://doi.org/10.1007/s13042-024-02370-6","url":null,"abstract":"<p>Digital image forensics, particularly in the realm of detecting Copy-Move Forgery (CMF), is exposed to significant challenges, especially in the face of intricate adversarial attacks. In response to these challenges, this paper presents a robust approach for detecting complex CMFs in digital images using the KeyPoint-Siamese Capsule Network (KP-SCN) and evaluates its resilience against adversarial attacks. The KP-SCN architecture incorporates keypoint detection, a Siamese network for feature extraction, and a capsule network for forgery detection. The method showcases enhanced robustness against adversarial attacks, specifically addressing image perturbation, patch removal, patch replacement, and spatial transformation attacks. By using hierarchical feature representations and dynamic routing in capsule networks, the model effectively handles complex CMF, including rotation, scaling, and non-linear transformations. The proposed KP-SCN approach employs a large dataset for training the KP-SCN, enabling it to identify copy-move forgeries by comparing extracted keypoints and their spatial relationships. KP-SCN demonstrates superior performance compared to the state-of-the-art on the CoMoFoD dataset, achieving precision, recall, and F1-score values of 95.62%, 93.78%, and 94.69%, respectively, and shows strong results on other datasets. For CASIA v2.0, the precision, recall, and F1-score are 90.45%, 88.97%, and 89.70%; for MICC-F2000, they are 91.32%, 90.27%, and 90.79%; for MICC-F600, they are 92.21%, 91.10%, and 91.65%; for MICC-F8multi, they are 89.75%, 87.92%, and 88.83%; and for IMD, they are 93.14%, 92.58%, and 92.86%. 
The KP-SCN framework maintains high detection rates under various manipulations, including JPEG compression, rotation, scaling, noise, blurring, brightness changes, contrast adjustment, and zoom motion blur compared to the other methods. For instance, it achieves an 80.657% detection rate for CoMoFoD under JPEG compression and 97.883% for IMD under a 10-degree rotation. These findings validate the robustness and adaptability of KP-SCN, making it a reliable solution for real-world forensic applications.</p>","PeriodicalId":51327,"journal":{"name":"International Journal of Machine Learning and Cybernetics","volume":"4 1","pages":""},"PeriodicalIF":5.6,"publicationDate":"2024-09-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142265564","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}