As human-machine interaction (HMI) in healthcare continues to evolve, the issue of trust in healthcare HMI has been raised and explored. It is critical for the development and safety of healthcare that humans place appropriate trust in medical machines. Intelligent machines that apply machine learning (ML) technologies continue to penetrate deeper into the medical environment, which also places higher demands on intelligent healthcare. To make machines play their role in healthcare HMI more effectively and to make human-machine cooperation more harmonious, good human-machine trust (HMT) in healthcare must be built. This article provides a systematic overview of the prominent research on ML and HMT in healthcare. In addition, this study explores and analyses ML and three important factors that influence HMT in healthcare, and then proposes an HMT model for healthcare. Finally, general trends are summarised and issues to be addressed in future research on HMT in healthcare are identified.
Lateral interaction in the biological brain is a key mechanism that underlies higher cognitive functions. The self-organising map (SOM) introduces lateral interaction in a general form in which signals of any modality can be used. Some approaches directly incorporate SOM learning rules into neural networks, but they incur complex operations and poor extendibility. An efficient way to implement lateral interaction in deep neural networks has not been well established. The use of Laplacian Matrix-based Smoothing (LS) regularisation is proposed for implementing lateral interaction in a concise form. The authors' derivation and experiments show that lateral interaction as implemented by the SOM model is a special case of LS-regularised k-means, and both exhibit the topology-preserving capability. The authors also verify that LS-regularisation can be used in conjunction with the end-to-end training paradigm in deep auto-encoders. Additionally, the benefits of LS-regularisation in relaxing the requirement of parameter initialisation in various models and in improving the classification performance of prototype classifiers are evaluated. Furthermore, the topologically ordered structure introduced by LS-regularisation in the feature extractor can improve the generalisation performance on classification tasks. Overall, LS-regularisation is an effective and efficient way to implement lateral interaction and can easily be extended to different models.
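As a rough illustration of the idea (not the authors' exact formulation), a Laplacian-smoothing penalty over a lattice of prototype vectors can be written as tr(W^T L W) with L = D - A the graph Laplacian of the lattice topology, which equals half the sum of squared differences between neighbouring prototypes. The sketch below assumes PyTorch, a 1-D chain topology, and an arbitrary penalty weight.

```python
import torch

def laplacian_smoothing_penalty(W, adjacency):
    """Compute tr(W^T L W) = 1/2 * sum_{i,j} A_ij * ||w_i - w_j||^2,
    where L = D - A is the graph Laplacian of the prototype lattice."""
    degree = torch.diag(adjacency.sum(dim=1))
    laplacian = degree - adjacency
    return torch.trace(W.t() @ laplacian @ W)

# Example: 10 prototypes of dimension 32 on an assumed 1-D chain topology.
num_units, dim = 10, 32
W = torch.randn(num_units, dim, requires_grad=True)
adjacency = torch.zeros(num_units, num_units)
idx = torch.arange(num_units - 1)
adjacency[idx, idx + 1] = 1.0
adjacency[idx + 1, idx] = 1.0

task_loss = torch.tensor(0.0)   # placeholder for the reconstruction/classification loss
lam = 0.1                       # assumed regularisation weight
loss = task_loss + lam * laplacian_smoothing_penalty(W, adjacency)
loss.backward()
```

Encouraging neighbouring prototypes on the lattice to stay close is what gives the topology-preserving behaviour the abstract attributes to lateral interaction.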
Generating a realistic image of a person in a target pose, conditioned on a source image in a different pose, is a promising computer vision task. Previous mainstream methods mainly focus on exploring the transformation relationship between the keypoint-based source pose and the target pose, but rarely investigate region-based human semantic information. Current methods that adopt a parsing map consider neither the precise local pose-semantic matching issue nor the correspondence between the two different poses. In this study, a Region Semantics-Assisted Generative Adversarial Network (RSA-GAN) is proposed for the pose-guided person image generation task. In particular, a regional pose-guided semantic fusion module is first developed to solve the imprecise matching between the semantic parsing map of a source image and the corresponding keypoints in the source pose. To align the style of the human in the source image with the target pose, a pose correspondence guided style injection module is designed to learn the correspondence between the source pose and the target pose. In addition, a gated depth-wise convolutional cross-attention based style integration module is proposed to distribute the well-aligned coarse style information, together with the precisely matched pose-guided semantic information, towards the target pose. The experimental results indicate that the proposed RSA-GAN achieves a 23% reduction in LPIPS compared with the method that does not use semantic maps and a 6.9% reduction in FID compared with the method that does, and it also produces more realistic qualitative results.
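The abstract does not detail the gated depth-wise convolutional cross-attention module; the following is only a hypothetical PyTorch sketch of what such a block might look like, where the module name, layer layout, and gating scheme are assumptions rather than the authors' design.

```python
import torch
import torch.nn as nn

class GatedDWConvCrossAttention(nn.Module):
    """Hypothetical sketch: cross-attention from pose-guided semantic features
    (queries) to coarse style features (keys/values), refined by a depth-wise
    convolution and blended back through a learned gate."""
    def __init__(self, dim):
        super().__init__()
        self.to_q = nn.Conv2d(dim, dim, 1)
        self.to_kv = nn.Conv2d(dim, dim * 2, 1)
        self.dw_conv = nn.Conv2d(dim, dim, 3, padding=1, groups=dim)  # depth-wise
        self.gate = nn.Conv2d(dim * 2, dim, 1)
        self.proj = nn.Conv2d(dim, dim, 1)

    def forward(self, style_feat, semantic_feat):
        b, c, h, w = style_feat.shape
        q = self.to_q(semantic_feat).flatten(2)                 # B x C x HW
        k, v = self.to_kv(style_feat).chunk(2, dim=1)
        k, v = k.flatten(2), v.flatten(2)
        attn = torch.softmax(q.transpose(1, 2) @ k / c ** 0.5, dim=-1)  # B x HW x HW
        out = (attn @ v.transpose(1, 2)).transpose(1, 2).reshape(b, c, h, w)
        out = self.dw_conv(out)
        g = torch.sigmoid(self.gate(torch.cat([out, semantic_feat], dim=1)))
        return self.proj(g * out + (1 - g) * semantic_feat)

# Example: fuse 64-channel style and semantic feature maps of size 32 x 32.
module = GatedDWConvCrossAttention(dim=64)
fused = module(torch.randn(1, 64, 32, 32), torch.randn(1, 64, 32, 32))
```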
The positional information of objects is crucial for enabling robots to perform grasping and pushing manipulations in clutter. To perform these manipulations effectively, robots need to perceive the position information of objects, including their coordinates and the spatial relationships between objects (e.g., proximity, adjacency). The authors propose an end-to-end position-aware deep Q-learning framework to achieve efficient collaborative pushing and grasping in clutter. Specifically, a pair of conjugate pushing and grasping attention modules is proposed to capture the position information of objects and generate high-quality affordance maps of operating positions with features of pushing and grasping operations. In addition, the authors propose an object isolation metric and a clutter metric based on instance segmentation to measure the spatial relationships between objects in cluttered environments. To further enhance the perception of the objects' position information, the authors associate the change in the object isolation and clutter metrics before and after performing an action with the reward function. A series of experiments in simulation and the real world indicates that the method improves sample efficiency, task completion rate, grasping success rate and action efficiency compared with state-of-the-art end-to-end methods. Notably, the authors' system can be robustly applied in the real world and extended to novel objects. Supplementary material is available at https://youtu.be/NhG_k5v3NnM.
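The abstract couples the change in the isolation and clutter metrics to the reward. A minimal sketch of that coupling is given below; the coefficients, the base grasp reward, and the function signature are illustrative assumptions rather than the authors' exact reward.

```python
def position_aware_reward(grasp_success, pushed,
                          isolation_before, isolation_after,
                          clutter_before, clutter_after,
                          alpha=0.5, beta=0.5):
    """Shape the reward with the change in the isolation and clutter metrics
    measured from instance segmentation before and after the action
    (alpha, beta and the base reward values are assumed for illustration)."""
    reward = 1.0 if grasp_success else 0.0
    if pushed:
        # Encourage pushes that better isolate objects and reduce clutter.
        reward += alpha * (isolation_after - isolation_before)
        reward -= beta * (clutter_after - clutter_before)
    return reward
```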
Epilepsy is a serious brain disorder in which patients frequently experience seizures. Seizures are unexpected electrical changes in brain neural activity that can lead to unconsciousness. Existing research has made intense efforts to predict epileptic seizures from brain signal data. However, these models have difficulty capturing patient-specific characteristics, because shifts in the model's distribution lead to spurious predictions and undermine the model's reliability. In addition, the existing prediction models suffer from severe issues such as overfitting and high false positive rates. To overcome these issues, the authors propose a deep learning approach, the deep dual-patch attention mechanism (D2PAM), for classifying the pre-ictal brain signals of people with epilepsy. A deep neural network is integrated with D2PAM, which lowers the effect of differences between patients when predicting epileptic seizures. The multi-network design efficiently enhances the trained model's generalisability and stability. The proposed model also transforms the brain signals into data blocks that are appropriate for pre-ictal classification. The earlier warning of epilepsy provided by the proposed model supports auxiliary diagnosis. Experiments on data from real patients show that D2PAM achieves improved accuracy compared with existing techniques. More specifically, the authors analyse the performance of their work on five patients, and the accuracies are 95%, 97%, 99%, 99%, and 99%, respectively. Overall, the numerical results show that the proposed work outperforms the existing models.
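As a hedged illustration of transforming brain signals into data blocks, the snippet below segments a multi-channel EEG recording into overlapping windows; the channel count, sampling rate, block length, and overlap are assumed values, not the authors' settings.

```python
import numpy as np

def signal_to_blocks(eeg, block_len, stride):
    """Split a multi-channel EEG recording (channels x samples) into
    overlapping data blocks for pre-ictal classification."""
    channels, samples = eeg.shape
    starts = range(0, samples - block_len + 1, stride)
    # Resulting shape: (num_blocks, channels, block_len)
    return np.stack([eeg[:, s:s + block_len] for s in starts])

# Example: a 23-channel recording at 256 Hz, cut into 4-second blocks
# with 50% overlap (assumed values for illustration).
eeg = np.random.randn(23, 256 * 60)
blocks = signal_to_blocks(eeg, block_len=256 * 4, stride=256 * 2)
```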
In the event of a fire breaking out or in other complicated situations, a mobile computing solution combining the Internet of Things and wearable devices can assist tracking solutions for rescuing and evacuating people in multistorey structures. It is therefore crucial to increase the accuracy of the positioning technology. The sequential Monte Carlo (SMC) approach is used in various applications, such as target tracking and intelligent surveillance, which rely on smartphone-based inertial data sequences. However, the SMC method has intrinsic flaws, such as sample impoverishment and particle degeneracy. A novel SMC approach is presented, which is built on the weighted differential evolution (WDE) algorithm. Sequential Monte Carlo approaches start with random particle placements and reach the desired distribution with slow variance reduction in high-dimensional spaces, such as a multistorey structure. Weighted differential evolution is applied before the resampling procedure to guarantee appropriate diversity of the particle set, prevent the use of an inadequate number of valid samples, and preserve the accuracy of the smartphone user's position. In the proposed SMC-based WDE approach, readings from the smartphone sensors and BLE beacons are used as input to the SMC, which helps approximate the posterior distributions quickly and speeds up the particle congregation process. Lastly, the robustness and efficacy of the suggested technique allow it to reflect the actual situation of smartphone users more accurately. According to simulation findings, the suggested approach provides improved location estimation with reduced localisation error and quick convergence. The results confirm that the proposed optimal fusion-based SMC-WDE scheme performs 9.92% better in terms of MAPE, 15.24% better in terms of MAE, and 0.031% better in terms of the R2 score.
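A minimal sketch of an SMC step with a differential-evolution-style move applied before resampling is shown below; the mutation form, the scale factor F, and the motion/likelihood interfaces are assumptions used only to illustrate how such a diversification step could fit into the particle filter loop, not the authors' exact algorithm.

```python
import numpy as np

rng = np.random.default_rng(0)

def de_mutation(particles, weights, F=0.5):
    """Weighted-differential-evolution-style move applied before resampling:
    nudge each particle along the difference of two weight-biased picks
    (assumed form) to keep the particle set diverse."""
    n = len(particles)
    idx = rng.choice(n, size=(n, 2), p=weights)
    return particles + F * (particles[idx[:, 0]] - particles[idx[:, 1]])

def smc_step(particles, weights, motion, likelihood, observation):
    """One SMC iteration: propagate with inertial data, weight with the
    BLE-beacon observation, diversify with WDE, then resample."""
    particles = motion(particles)
    weights = weights * likelihood(observation, particles)
    weights = weights / weights.sum()
    particles = de_mutation(particles, weights)
    idx = rng.choice(len(particles), size=len(particles), p=weights)
    return particles[idx], np.full(len(particles), 1.0 / len(particles))
```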
Due to hardware limitations, existing hyperspectral (HS) cameras often suffer from low spatial/temporal resolution. Recently, it has become prevalent to super-resolve a low-resolution (LR) HS image into a high-resolution (HR) HS image with the guidance of an HR RGB (or multispectral) image. Previous approaches to this guided super-resolution task often model the intrinsic characteristics of the desired HR HS image using hand-crafted priors. More recently, researchers have paid attention to deep learning methods with direct supervised or unsupervised learning, which exploit a deep prior only from the training dataset or the testing data. In this article, an efficient convolutional neural network-based method is presented to progressively super-resolve an HS image with RGB image guidance. Specifically, a progressive HS image super-resolution network is proposed, which progressively super-resolves the LR HS image under the guidance of the pixel-shuffled HR RGB image. The super-resolution network is then trained progressively with supervised pre-training and unsupervised adaptation, where supervised pre-training learns a general prior from the training data and unsupervised adaptation specialises the general prior to the specific prior of each testing scene. The proposed method can effectively exploit priors from the training dataset and from the testing HS and RGB images with a spectral-spatial constraint. It has a good generalisation capability, especially for blind HS image super-resolution. Comprehensive experimental results show that the proposed deep progressive learning method outperforms existing state-of-the-art methods for HS image super-resolution in both non-blind and blind cases.
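As a small illustration of pixel-shuffle-based guidance (assuming a scale factor of 4 and PyTorch's PixelUnshuffle operator; the actual progressive stages are the authors'), the HR RGB guide can be rearranged to the LR spatial size and concatenated with the LR HS input:

```python
import torch
import torch.nn as nn

# Assumed scale factor of 4 between the LR HS image and the HR RGB guide.
scale = 4
lr_hs = torch.randn(1, 31, 32, 32)     # LR hyperspectral image with 31 bands
hr_rgb = torch.randn(1, 3, 128, 128)   # HR RGB guidance image

# Pixel-unshuffle (space-to-depth) brings the HR RGB guide to the LR spatial
# size while keeping all of its pixels as extra channels, so it can be fed
# together with the LR HS image to a super-resolution stage.
unshuffle = nn.PixelUnshuffle(scale)
rgb_guide = unshuffle(hr_rgb)                         # 1 x 48 x 32 x 32
stage_input = torch.cat([lr_hs, rgb_guide], dim=1)    # 1 x 79 x 32 x 32
```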
As deep learning evolves, neural network structures become increasingly sophisticated, bringing a series of new optimisation challenges. For example, deep neural networks (DNNs) are vulnerable to a variety of attacks. Training neural networks under privacy constraints is one way to alleviate privacy leakage, and one approach is to add noise to the gradient. However, existing optimisers suffer from weak convergence in the presence of increased noise during training, which leads to low robustness of the optimiser. To stabilise and improve the convergence of DNNs, the authors propose a neural dynamics (ND) optimiser, which is inspired by the zeroing neural dynamics that originate from zeroing neural networks. The authors first analyse the relationship between DNNs and control systems. Then, they construct the ND optimiser to update the network parameters. Moreover, the proposed ND optimiser alleviates the non-convergence problem that may arise when noise is added to the gradient in different scenarios. Furthermore, experiments are conducted on different neural network structures, including ResNet18, ResNet34, Inception-v3, MobileNet, and long short-term memory networks. Comparative results on the CIFAR, YouTube Faces, and R8 datasets demonstrate that the ND optimiser improves the accuracy and stability of DNNs under noise-free and noise-polluted conditions. The source code is publicly available at https://github.com/LongJin-lab/ND.
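For context, zeroing neural dynamics are typically built on a design formula that drives an error function to zero; applying it to the loss gradient as shown below is only an illustrative assumption, not necessarily the authors' construction:

$$\dot{E}(t) = -\gamma\,\Phi\big(E(t)\big), \qquad \gamma > 0,$$

where $\Phi(\cdot)$ is a monotonically increasing odd activation function, so that $E(t) \to 0$ as $t \to \infty$. If, for illustration, the gradient of the training loss is taken as the error to be zeroed, $E(\theta(t)) = \nabla_{\theta}\mathcal{L}(\theta(t))$, the design formula yields $\nabla^{2}_{\theta}\mathcal{L}(\theta)\,\dot{\theta}(t) = -\gamma\,\Phi\big(\nabla_{\theta}\mathcal{L}(\theta)\big)$, which an optimiser can approximate with a discretised parameter update.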