
International Journal of Machine Learning and Cybernetics: Latest Articles

A multi-strategy hybrid cuckoo search algorithm with specular reflection based on a population linear decreasing strategy
IF 5.6 Zone 3 (Computer Science) Q2 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE Pub Date: 2024-07-05 DOI: 10.1007/s13042-024-02273-6
Chengtian Ouyang, Xin Liu, Donglin Zhu, Yangyang Zheng, Changjun Zhou, Chengye Zou

The cuckoo search algorithm (CS), inspired by the brood-parasitic breeding behavior of cuckoos, has proved effective as a problem-solving approach in many fields since it was proposed. Nevertheless, it still suffers from an imbalance between exploration and exploitation and from a tendency to fall into local optima. In this paper, we propose a new hybrid cuckoo search algorithm (LHCS) based on a linear decrease of the population. First, to improve the local search and speed up convergence, we mix in the solution updating strategy of the Grey Wolf Optimizer (GWO) and use a linear decreasing rule to adjust the calling ratio of the strategy, balancing global exploration and local exploitation. Second, a specular reflection learning strategy enhances the algorithm's ability to jump out of local optima. Finally, a population linear decreasing strategy improves the convergence ability of the algorithm on different intervals and the adaptive ability of population diversity. Experimental results on 29 benchmark functions from the CEC2017 test set show that LHCS is significantly superior to and more stable than other algorithms when the quality of all solutions is considered together. To further verify its performance, we applied the algorithm to engineering problems. Function tests and Wilcoxon test results show that the overall performance of LHCS exceeds that of 14 other state-of-the-art algorithms. On several engineering optimization problems, the practicality and effectiveness of LHCS are verified, and applying it to real engineering problems can greatly reduce design cost.
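The abstract names two linear decreasing rules but does not give their formulas, so the following is only a minimal Python sketch under assumed forms: the population size shrinks linearly from `n_max` to `n_min` over the run, and the probability of calling one update strategy is interpolated linearly between two assumed endpoints. All function names, parameters, and default values are hypothetical and are not taken from the paper.

```python
def linear_population_size(iteration, max_iterations, n_max, n_min):
    """Population size that decreases linearly from n_max to n_min (assumed form)."""
    fraction = iteration / max_iterations  # progress through the run, in [0, 1]
    return round(n_max - (n_max - n_min) * fraction)


def linear_call_ratio(iteration, max_iterations, start=0.9, end=0.1):
    """Linearly interpolated probability of calling one update strategy.

    The endpoints `start` and `end` are illustrative assumptions.
    """
    fraction = iteration / max_iterations
    return start + (end - start) * fraction


# Show how both schedules evolve at the start, middle, and end of a run.
for t in (0, 50, 100):
    print(t, linear_population_size(t, 100, n_max=50, n_min=10),
          linear_call_ratio(t, 100))
```

A schedule of this kind keeps many candidate solutions early, when exploration matters, and spends fewer function evaluations per iteration late in the run.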

Citations: 0
Low-dimensional intrinsic dimension reveals a phase transition in gradient-based learning of deep neural networks
IF 5.6 Zone 3 (Computer Science) Q2 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE Pub Date: 2024-07-04 DOI: 10.1007/s13042-024-02244-x
Chengli Tan, Jiangshe Zhang, Junmin Liu, Zixiang Zhao

Deep neural networks complete a feature extraction task by propagating the inputs through multiple modules. However, how the representations evolve under gradient-based optimization remains unknown. Here we leverage the intrinsic dimension of the representations to study the learning dynamics and find that the training process undergoes a phase transition from expansion to compression under disparate training regimes. Surprisingly, this phenomenon is ubiquitous across a wide variety of model architectures, optimizers, and data sets. We demonstrate that the variation in the intrinsic dimension is consistent with the complexity of the learned hypothesis, which can be quantitatively assessed by the critical sample ratio that is rooted in adversarial robustness. Meanwhile, we mathematically show that this phenomenon can be analyzed in terms of the mutable correlation between neurons. Although the evoked activities obey a power-law decaying rule in biological circuits, we find that the power-law exponent of the representations in deep neural networks predicts adversarial robustness well only at the end of training, not during the training process. These results together suggest that deep neural networks are prone to producing robust representations by adaptively eliminating or retaining redundancies. The code is publicly available at https://github.com/cltan023/learning2022.
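The abstract does not say which intrinsic-dimension estimator is used. As an illustration only, the sketch below uses the TwoNN maximum-likelihood estimator, one common choice, which is swapped in here as an assumption: for each point, take the ratio of the distances to its second and first nearest neighbours, and divide the number of points by the sum of the log-ratios. The function name and the synthetic data are hypothetical.

```python
import math
import random


def two_nn_intrinsic_dimension(points):
    """Estimate intrinsic dimension with the TwoNN maximum-likelihood formula.

    For each point, mu = r2 / r1 is the ratio of the distances to its second
    and first nearest neighbours; the estimate is N / sum(log(mu)).
    """
    log_ratios = []
    for i, p in enumerate(points):
        # Distances from p to every other point, smallest first.
        dists = sorted(math.dist(p, q) for j, q in enumerate(points) if j != i)
        r1, r2 = dists[0], dists[1]
        if r1 > 0:  # skip exact duplicates, which make the ratio undefined
            log_ratios.append(math.log(r2 / r1))
    return len(log_ratios) / sum(log_ratios)


# Synthetic check: 400 points on a 2-D plane embedded in 5-D space.
rng = random.Random(0)
plane = [(u, v, u + v, u - v, 0.5 * u)
         for u, v in ((rng.random(), rng.random()) for _ in range(400))]
print(two_nn_intrinsic_dimension(plane))  # expected to land near 2
```

Applied to the activations of one layer at successive training checkpoints, an estimator like this would give the kind of dimension-versus-epoch curve in which an expansion phase followed by a compression phase could be observed.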

Citations: 0
A novel abstractive summarization model based on topic-aware and contrastive learning
IF 5.6 Zone 3 (Computer Science) Q2 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE Pub Date: 2024-07-04 DOI: 10.1007/s13042-024-02263-8
Huanling Tang, Ruiquan Li, Wenhao Duan, Quansheng Dou, Mingyu Lu

The majority of abstractive summarization models are designed based on the Sequence-to-Sequence (Seq2Seq) architecture. These models are able to capture syntactic and contextual information between words. However, Seq2Seq-based summarization models tend to overlook global semantic information. Moreover, there is an inconsistency between the objective function and the evaluation metrics of these models. To address these limitations, this paper proposes a novel model named ASTCL. It integrates a neural topic model into the Seq2Seq framework to capture the text's global semantic information and guide summary generation. Additionally, it incorporates contrastive learning to mitigate the discrepancy between the objective loss and the evaluation metrics by scoring multiple candidate summaries. Experimental results on the CNN/DM, XSum, and NYT datasets demonstrate that ASTCL outperforms the other generic models on the summarization task.
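The abstract says only that contrastive learning works by "scoring multiple candidate summaries" and gives no loss function. One common way to do this, shown here purely as an assumed illustration and not as ASTCL's actual objective, is a pairwise margin ranking loss: candidates are sorted from best to worst by an evaluation metric, and the model is penalised whenever it scores a worse candidate too close to (or above) a better one. The function name and the `margin` value are hypothetical.

```python
def candidate_ranking_loss(scores, margin=0.01):
    """Pairwise margin ranking loss over candidate summaries.

    `scores` are model scores for candidates already sorted from best to worst
    by an evaluation metric. A pair (i, j) with i < j is penalised when the
    lower-quality candidate j is not scored at least (j - i) * margin below
    candidate i.
    """
    loss = 0.0
    for i in range(len(scores)):
        for j in range(i + 1, len(scores)):
            loss += max(0.0, scores[j] - scores[i] + (j - i) * margin)
    return loss


# Scores that agree with the metric ordering versus scores that contradict it.
print(candidate_ranking_loss([0.9, 0.5, 0.1]))
print(candidate_ranking_loss([0.1, 0.5, 0.9]))
```

A loss of this shape ties the training signal to the same ordering the evaluation metric induces, which is the kind of objective/metric discrepancy the abstract describes.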

Citations: 0
Undersampling based on generalized learning vector quantization and natural nearest neighbors for imbalanced data
IF 5.6 Zone 3 (Computer Science) Q2 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE Pub Date: 2024-07-03 DOI: 10.1007/s13042-024-02261-w
Long-Hui Wang, Qi Dai, Jia-You Wang, Tony Du, Lifang Chen

Imbalanced datasets can adversely affect classifier performance. Conventional undersampling approaches may lose essential information, while oversampling techniques can introduce noise. To address this challenge, we propose an undersampling algorithm called GLNDU (Generalized Learning Vector Quantization and Natural Nearest Neighbors-based Undersampling). GLNDU uses Generalized Learning Vector Quantization (GLVQ) to compute the centroids of positive and negative instances. It also uses the concept of Natural Nearest Neighbors to identify majority-class instances in the overlapping region of the centroids of minority-class instances. These majority-class instances are then removed, yielding a new balanced training dataset that is used to train a base classifier. We conduct extensive experiments on 29 publicly available datasets, evaluating performance with AUC and G_mean values. GLNDU demonstrates significant advantages over established methods across different types of classifiers, such as SVM, CART, and KNN. The Friedman ranking and Nemenyi post-hoc test results further support the experimental findings.
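G_mean, one of the two metrics named above, is commonly defined as the geometric mean of sensitivity and specificity. The sketch below assumes that standard binary definition (the abstract does not spell it out) and shows why it suits imbalanced data: a classifier that ignores the minority class scores zero even when its accuracy is high. The function name and the toy labels are hypothetical.

```python
import math


def g_mean(y_true, y_pred, positive=1):
    """Geometric mean of sensitivity (TPR) and specificity (TNR) for binary labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    sensitivity = tp / (tp + fn) if tp + fn else 0.0
    specificity = tn / (tn + fp) if tn + fp else 0.0
    return math.sqrt(sensitivity * specificity)


# Two minority (positive) samples among ten; the classifier predicts all negative.
y_true = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
y_pred = [0] * 10
print(g_mean(y_true, y_pred))  # accuracy is 0.8, yet no positive is detected
```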

Citations: 0
A copy-move forgery detection technique using DBSCAN-based keypoint similarity matching
IF 5.6 Zone 3 (Computer Science) Q2 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE Pub Date: 2024-07-03 DOI: 10.1007/s13042-024-02268-3
Soumya Mukherjee, Arup Kumar Pal, Soham Maji

In an era marked by the contrast between information and disinformation, the ability to differentiate between authentic and manipulated images holds immense importance for both security professionals and the scientific community. Copy-move forgery is widely practiced and has thus emerged as a prevalent form of image manipulation. In this counterfeiting process, a region of an image is copied and pasted into a different part of the same image to hide or replicate objects. Copy-move forgery is hard to detect and localize, and localizing the forged areas becomes even more difficult when the forged image is subjected to post-processing and geometrical attacks. This paper introduces a robust, translation-invariant, and efficient copy-move forgery detection technique based on AKAZE-driven keypoint detection. AKAZE is applied to the LL sub-band of the SWT-transformed image to extract translation-invariant features, rather than extracting them directly from the original image. We then use the DBSCAN clustering algorithm and a uniform quantizer on each cluster to form group pairs based on their feature descriptor values, which gathers keypoints with similar feature descriptors within each cluster. To mitigate false positives, only keypoint pairs separated by a distance greater than a predefined shift vector distance are retained. Our clustering-based similarity-matching algorithm effectively locates the forged region. To assess the proposed scheme, we test it on different datasets with post-processing attacks including blurring, color reduction, contrast adjustment, brightness change, and noise addition. The method also withstands geometrical manipulations such as rotation, skewing, and other affine transform attacks. Visual outcomes, numerical results, and comparative analysis show that the proposed model accurately detects the forged area with fewer false positives and is more computationally efficient than other methods.
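The false-positive step described above, which compares the separation of matched keypoints with a predefined shift vector distance, can be sketched as a simple filter. This is a minimal illustration under assumptions: pairs are given as pixel coordinates, the distance is Euclidean, and the threshold value is arbitrary. The AKAZE, SWT, and DBSCAN stages are not shown, and the function name is hypothetical.

```python
import math


def filter_by_min_shift(pairs, min_shift):
    """Keep matched keypoint pairs whose spatial separation exceeds min_shift.

    Each pair is ((x1, y1), (x2, y2)). Pairs closer than the threshold are
    treated as self-similar neighbours rather than copy-move evidence.
    """
    return [(a, b) for a, b in pairs if math.dist(a, b) > min_shift]


matches = [
    ((0, 0), (3, 4)),      # 5 pixels apart
    ((10, 10), (11, 10)),  # 1 pixel apart, likely the same local texture
]
print(filter_by_min_shift(matches, min_shift=4))
```

Neighbouring keypoints in smooth or repetitive regions tend to have similar descriptors, so discarding very short shifts removes matches that would otherwise be reported as forged regions.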

Citations: 0
Class-structure preserving multi-view correlated discriminant analysis for multiblock data
IF 5.6 Zone 3 (Computer Science) Q2 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE Pub Date: 2024-07-02 DOI: 10.1007/s13042-024-02270-9
Sankar Mondal, Pradipta Maji

With the rapid development of data acquisition methods, multiple data sources are now available to explain different views of an object. This introduces several new challenges in integrating high-dimensional, distinct, and heterogeneous views under the multi-view learning (MVL) framework. Multiset canonical correlation analysis (MCCA) is a popular subspace learning technique in MVL that forms a common latent space by maximizing the pairwise correlation across all views. However, MCCA does not use the class label information of the objects and cannot handle non-linearity in the data. Although a few supervised extensions of MCCA exist, they do not make productive use of intra-view and inter-view consistency and/or inconsistency information when using the class labels. In this regard, a supervised subspace learning method, termed class-structure preserving multi-view correlated discriminant analysis (CSP-MvCDA), is proposed by judiciously integrating the merits of MCCA, linear discriminant analysis (LDA), and a locality preserving norm. The proposed method jointly optimizes the inter-set correlation across all views and the intra-set discrimination in each view to obtain a common discriminative latent space, in which the shared and complementary information across multiple views is exploited. The locality preserving norm with prior class labels helps to preserve the local class structure of the data, while LDA maintains its global class structure. The effectiveness of the proposed method is evaluated on several cancer and benchmark data sets. The experimental results establish that CSP-MvCDA is superior to several state-of-the-art algorithms in terms of classification performance.
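MCCA is described above as maximizing the pairwise correlation across all views. Assuming each view has already been projected to a single dimension (learning the projections is not shown), the quantity being maximized can be evaluated as a sum of Pearson correlations over all pairs of views. This is a minimal sketch of that objective only, not of CSP-MvCDA; the function names and toy data are hypothetical.

```python
import math
from itertools import combinations


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def pairwise_correlation_sum(projected_views):
    """Sum of Pearson correlations over all pairs of 1-D projected views."""
    return sum(pearson(u, v) for u, v in combinations(projected_views, 2))


# Three toy views of the same four objects, each already projected to 1-D.
views = [[1, 2, 3, 4], [2, 4, 6, 8], [1, 3, 2, 5]]
print(pairwise_correlation_sum(views))
```

Because this objective never looks at class labels, two classes can overlap in the resulting latent space, which is the gap the supervised LDA and locality-preserving terms are meant to close.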

Citations: 0
The concept information of graph granule with application to knowledge graph embedding
IF 5.6 Zone 3 (Computer Science) Q2 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE Pub Date: 2024-07-01 DOI: 10.1007/s13042-024-02267-4
Jiaojiao Niu, Degang Chen, Yinglong Ma, Jinhai Li

Knowledge graph embedding (KGE) has become one of the most effective methods for the numerical representation of entities and their relations in knowledge graphs (KGs). Traditional methods primarily use triple facts, structured as (head entity, relation, tail entity), as the basic knowledge units in the learning process and use additional external information to improve model performance. Since triples are sometimes inadequate and external information is not always available, obtaining structured internal knowledge from KGs naturally becomes a feasible approach to KGE learning. Motivated by this, this paper employs formal concept analysis (FCA) to mine deterministic concept knowledge in KGs and proposes a novel KGE model that takes the concept information into account. More specifically, triples sharing the same head entity are organised into knowledge structures named graph granules, which are then transformed into concept lattices; on this basis, a novel lattice-based KGE model (TransGr) is proposed for knowledge graph completion. TransGr assumes that entities and relations exist in different granules and uses a matrix (obtained by fusing concepts from the concept lattice) to depict the graph granule quantitatively. It then forces entities and relations to meet graph granule constraints when learning vector representations of KGs. Experiments on link prediction and triple classification demonstrate that TransGr is effective on datasets with relatively complete graph granules.
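The first step described above, organising triples that share a head entity into one structure, can be sketched as a simple grouping. This shows only that grouping, under the assumption that a granule can be represented as a list of (relation, tail) pairs; the paper's graph granules may carry more structure, and the concept-lattice construction is not shown. The function name and example triples are hypothetical.

```python
from collections import defaultdict


def group_triples_by_head(triples):
    """Group (head, relation, tail) triples by their head entity."""
    granules = defaultdict(list)
    for head, relation, tail in triples:
        granules[head].append((relation, tail))
    return dict(granules)


triples = [
    ("Paris", "capital_of", "France"),
    ("Paris", "located_in", "Europe"),
    ("Berlin", "capital_of", "Germany"),
]
for head, facts in group_triples_by_head(triples).items():
    print(head, facts)
```

A head entity with many outgoing triples yields a richer group, which fits the abstract's remark that the model works best on datasets with relatively complete graph granules.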

Citations: 0
Sequential attention layer-wise fusion network for multi-view classification
IF 5.6 | Zone 3 (Computer Science) | Q2 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE | Pub Date: 2024-07-01 | DOI: 10.1007/s13042-024-02260-x
Qing Teng, Xibei Yang, Qiguo Sun, Pingxin Wang, Xun Wang, Taihua Xu

Graph convolutional networks have shown excellent performance in multi-view classification. To output a fused node embedding representation in multi-view scenarios, existing studies tend to enforce the consistency of embedded node information across views. However, they focus on immediate-neighbor information rather than multi-order node information, which can capture complex relationships and structures and thereby enhance feature propagation. Furthermore, the embedded node information in each convolutional layer is not fully utilized, because consistency is usually enforced only at the final convolutional layer. To tackle these limitations, we develop a new end-to-end multi-view learning architecture: the Sequential attention Layer-wise Fusion Network for multi-view classification (SLFNet). Because, for each view, multi-order node information is hidden in the layer-wise node embedding representations, a set of sequential attentions can be calculated over those layers, which provides a novel fusion strategy from a multi-order perspective. The contributions of our architecture are: (1) capturing multi-order node information instead of using only the immediate neighbors, thereby obtaining more accurate node embedding representations; (2) designing a sequential attention module that adaptively learns the node embedding representation of each layer, thereby attentively fusing these layer-wise representations. Our experiments, focusing on semi-supervised node classification tasks, highlight the superiority of SLFNet over state-of-the-art approaches. Results with deeper convolutional layers further confirm its effectiveness in addressing the over-smoothing problem.
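
The layer-wise fusion described in the abstract can be illustrated with a plain softmax attention over per-layer embeddings of one node. This is a minimal sketch under stated assumptions: the scoring rule (dot product with a learned `query` vector) and the names `softmax` and `fuse_layers` are hypothetical, and the paper's sequential attention module may well differ.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def fuse_layers(layer_embeddings, query):
    """Fuse per-layer embeddings of one node into a single vector.

    layer_embeddings: one embedding (list of floats) per convolutional layer.
    query: hypothetical learned vector used to score each layer.
    Returns (fused_embedding, attention_weights).
    """
    scores = [sum(h * q for h, q in zip(emb, query)) for emb in layer_embeddings]
    alphas = softmax(scores)
    dim = len(query)
    fused = [sum(a * emb[d] for a, emb in zip(alphas, layer_embeddings))
             for d in range(dim)]
    return fused, alphas
```

Because every layer contributes to the fused vector, information from shallow layers (low-order neighborhoods) is not overwritten by deeper layers, which is one common argument for why layer-wise fusion can mitigate over-smoothing.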

Citations: 0
Federated learning-guided intrusion detection and neural key exchange for safeguarding patient data on the internet of medical things
IF 5.6 | Zone 3 (Computer Science) | Q2 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE | Pub Date: 2024-07-01 | DOI: 10.1007/s13042-024-02269-2
Chongzhou Zhong, Arindam Sarkar, Sarbajit Manna, Mohammad Zubair Khan, Abdulfattah Noorwali, Ashish Das, Koyel Chakraborty

To improve the security of the Internet of Medical Things (IoMT) in healthcare, this paper proposes a Federated Learning (FL)-guided Intrusion Detection System (IDS) and an Artificial Neural Network (ANN)-based key exchange mechanism inside a blockchain framework. An IDS is essential for spotting network anomalies and taking preventative action to guarantee the secure and dependable functioning of IoMT systems. The proposed method integrates the FL-based IDS with a blockchain-based, ANN-driven key exchange mechanism, providing several important benefits: (1) The FL-based IDS creates a shared ledger that aggregates nearby weights and transmits averaged historical weights, lowering computing effort, eliminating poisoning attacks, and improving data visibility and integrity throughout the shared database. (2) The system uses edge-based detection techniques to protect the cloud in the case of a security breach, enabling quicker threat recognition with less computational and processing resource usage. FL’s effectiveness with fewer data samples contributes to this benefit. (3) The bidirectional alignment of ANNs ensures a strong security framework and facilitates the generation of keys inside the IoMT network on the blockchain. (4) Mutual learning approaches synchronize the ANNs, making it easier for IoMT devices to distribute synchronized keys. (5) XGBoost and ANN models were tested on the BoT-IoT datasets to gauge how successful the proposed method is. The findings show that the ANN delivers greater performance and dependability than the alternative approaches examined in this study when dealing with the heterogeneous data found in IoMT, such as Intensive Care Unit (ICU) data. Overall, the method demonstrates improved security and performance, making it an appealing option for protecting IoMT systems, especially in demanding medical settings like ICUs.
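
The weight aggregation in benefit (1) can be illustrated with the standard FedAvg rule, in which each client's weights are averaged in proportion to its local sample count. This is a minimal sketch assuming flat weight vectors. The function name `federated_average` is hypothetical, and the paper's ledger-based scheme with historical weights is not reproduced here.

```python
def federated_average(client_weights, client_sizes):
    """Sample-size-weighted average of client weight vectors (FedAvg-style).

    client_weights: list of equal-length lists of floats, one per client.
    client_sizes: number of local training samples held by each client.
    """
    if len(client_weights) != len(client_sizes) or not client_weights:
        raise ValueError("need one sample count per client")
    total = sum(client_sizes)
    if total <= 0:
        raise ValueError("total sample count must be positive")
    dim = len(client_weights[0])
    averaged = [0.0] * dim
    for weights, size in zip(client_weights, client_sizes):
        if len(weights) != dim:
            raise ValueError("all weight vectors must have the same length")
        share = size / total  # this client's contribution to the global model
        for i, value in enumerate(weights):
            averaged[i] += share * value
    return averaged
```

Only the weight vectors and sample counts leave each device in this scheme, so raw patient records stay local, which is the usual privacy motivation for federated learning.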

Citations: 0
HOGFormer: high-order graph convolution transformer for 3D human pose estimation
IF 5.6 | Zone 3 (Computer Science) | Q2 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE | Pub Date: 2024-06-29 | DOI: 10.1007/s13042-024-02262-9
Yuhong Xie, Chaoqun Hong, Weiwei Zhuang, Lijuan Liu, Jie Li

The combination of graph convolutional networks (GCNs) and Transformers has shown promising results in 3D human pose estimation (HPE) tasks when lifting 2D poses to 3D. However, recent approaches to 3D HPE still face difficulties such as depth ambiguity and occlusion. To address these issues, we propose a novel 3D HPE architecture, termed High-Order Graph Convolution Transformer (HOGFormer). HOGFormer consists of three core components: the Chebyshev Graph Convolution (CGConv) module, the Graph-based Dynamic Adjacency Matrix Transformer (GDAMFormer) module, and the High-Order Graph Convolution (HOGConv) module. In more detail, the CGConv module increases estimation accuracy by approximating the graph convolution with Chebyshev polynomials. The GDAMFormer module addresses issues like self-occlusion and depth blur by using a dynamic adjacency matrix to represent the dynamic relationships among joints. The HOGConv module extracts local features by capturing the local physical dependencies of skeleton connections. By integrating these modules, the proposed architecture captures both global and local information. We evaluate our architecture quantitatively and qualitatively on the popular benchmark dataset Human3.6M. Our experiments demonstrate that HOGFormer achieves state-of-the-art performance.
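
The Chebyshev approximation mentioned for the CGConv module can be illustrated with the standard recurrence T_0 = I, T_1 = L~, T_k = 2 L~ T_{k-1} - T_{k-2}, where L~ is the rescaled graph Laplacian. This is a minimal sketch for a scalar signal per joint, assuming lambda_max is approximately 2 so that L~ = L - I. The function names are hypothetical and the paper's CGConv layer (with learned multi-channel filters) is not reproduced.

```python
import math


def scaled_laplacian(adj):
    """Return L~ = L - I, with L = I - D^{-1/2} A D^{-1/2} (assumes lambda_max ~ 2)."""
    n = len(adj)
    deg = [sum(row) for row in adj]
    inv_sqrt = [1.0 / math.sqrt(d) if d > 0 else 0.0 for d in deg]
    return [[-inv_sqrt[i] * adj[i][j] * inv_sqrt[j] for j in range(n)]
            for i in range(n)]


def matvec(matrix, vec):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(m * v for m, v in zip(row, vec)) for row in matrix]


def cheb_conv(adj, signal, theta):
    """Return sum_k theta[k] * T_k(L~) @ signal using the Chebyshev recurrence."""
    lap = scaled_laplacian(adj)
    t_prev = list(signal)                      # T_0 x = x
    out = [theta[0] * v for v in t_prev]
    if len(theta) == 1:
        return out
    t_curr = matvec(lap, signal)               # T_1 x = L~ x
    out = [o + theta[1] * v for o, v in zip(out, t_curr)]
    for k in range(2, len(theta)):
        lap_t = matvec(lap, t_curr)
        t_next = [2.0 * a - b for a, b in zip(lap_t, t_prev)]  # T_k x
        out = [o + theta[k] * v for o, v in zip(out, t_next)]
        t_prev, t_curr = t_curr, t_next
    return out
```

A filter with K coefficients mixes information from joints up to K-1 hops apart on the skeleton graph, without ever computing an eigendecomposition of the Laplacian.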

Citations: 0