CAAI Transactions on Intelligence Technology: Latest Articles (Page 5)

An intelligent prediction model of epidemic characters based on multi-feature

IF 5.1 | Zone 2, Computer Science | Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE

CAAI Transactions on Intelligence Technology

Pub Date : 2024-03-05 DOI: 10.1049/cit2.12294

Xiaoying Wang, Chunmei Li, Yilei Wang, Lin Yin, Qilin Zhou, Rui Zheng, Qingwu Wu, Yuqi Zhou, Min Dai

The epidemic characteristics of Omicron (e.g. large-scale transmission) differ significantly from those of the initial variants of COVID-19. The data generated by large-scale transmission are important for predicting the trend of these characteristics. However, current prediction models give inaccurate results because they are not closely tied to the actual situation of Omicron transmission. These inaccurate results in turn harm the manufacturing and service industries, for example, the production of masks and the recovery of tourism. The authors study the epidemic characteristics in two ways: investigation and prediction. First, a large amount of data is collected by using the Baidu index and conducting a questionnaire survey on epidemic characteristics. Second, the β-SEIDR model is established, in which the population is classified into Susceptible, Exposed, Infected, Dead and β-Recovered persons, to intelligently predict the epidemic characteristics of COVID-19. Here, β-Recovered means that Recovered persons may become Susceptible again with probability β. The simulation results show that the model can accurately predict the epidemic characteristics.
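
The compartment structure described in the abstract can be illustrated with a minimal discrete-time sketch. This is not the authors' model: the update equations, the one-day time step, and every parameter name (`contact_rate`, `incubation_rate`, `recovery_rate`, `death_rate`) are assumptions made for illustration, and β is read here as the fraction of Recovered persons who return to Susceptible each day.

```python
def simulate_beta_seidr(days, s0, e0, i0, contact_rate, incubation_rate,
                        recovery_rate, death_rate, beta):
    """Discrete-time SEIDR sketch with reinfection (one step per day).

    All rates are hypothetical per-day fractions; `beta` is the assumed
    fraction of Recovered persons who become Susceptible again each day.
    Returns a list of (S, E, I, D, R) tuples, one per day, including day 0.
    """
    s, e, i, d, r = float(s0), float(e0), float(i0), 0.0, 0.0
    history = [(s, e, i, d, r)]
    for _ in range(days):
        n_alive = s + e + i + r
        # New exposures are proportional to S-I contacts, capped at S.
        new_exposed = min(s, contact_rate * s * i / n_alive) if n_alive > 0 else 0.0
        new_infected = incubation_rate * e   # Exposed -> Infected
        new_recovered = recovery_rate * i    # Infected -> Recovered
        new_dead = death_rate * i            # Infected -> Dead
        relapsed = beta * r                  # Recovered -> Susceptible

        s += relapsed - new_exposed
        e += new_exposed - new_infected
        i += new_infected - new_recovered - new_dead
        d += new_dead
        r += new_recovered - relapsed
        history.append((s, e, i, d, r))
    return history


if __name__ == "__main__":
    # Illustrative values only; they are not fitted to any data.
    final = simulate_beta_seidr(100, 9990, 0, 10, 0.5, 0.2, 0.1, 0.001, 0.01)[-1]
    print("S, E, I, D, R after 100 days:", [round(x, 1) for x in final])
```

Setting `beta=0` reduces the sketch to an SEIDR model without reinfection, which makes the effect of the β term easy to compare.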

Volume 9, Issue 3, pp. 595-607 | Journal Article | PDF: https://onlinelibrary.wiley.com/doi/epdf/10.1049/cit2.12294

Citations: 0

A novel medical image data protection scheme for smart healthcare system

IF 8.4 | Zone 2, Computer Science | Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE

CAAI Transactions on Intelligence Technology

Pub Date : 2024-02-13 DOI: 10.1049/cit2.12292

Mujeeb Ur Rehman, Arslan Shafique, Muhammad Shahbaz Khan, Maha Driss, Wadii Boulila, Yazeed Yasin Ghadi, Suresh Babu Changalasetty, Majed Alhaisoni, Jawad Ahmad

The Internet of Multimedia Things (IoMT) refers to a network of interconnected multimedia devices that communicate with each other over the Internet. Recently, smart healthcare has emerged as a significant application of the IoMT, particularly in the context of knowledge-based learning systems. Smart healthcare systems leverage knowledge-based learning to become more context-aware, adaptable, and auditable while retaining the ability to learn from historical data. In smart healthcare systems, devices capture images such as X-rays and magnetic resonance imaging (MRI) scans. The security and integrity of these images are crucial for the databases used in knowledge-based learning systems to foster structured decision-making and enhance the learning abilities of AI. Moreover, in knowledge-driven systems, the storage and transmission of HD medical images burden the limited bandwidth of the communication channel, leading to data transmission delays. To address these security and latency concerns, this paper presents a lightweight medical image encryption scheme based on bit-plane decomposition and chaos theory. The experiments yield entropy, energy, and correlation values of 7.999, 0.0156, and 0.0001, respectively. This validates the effectiveness of the proposed encryption system, which offers high-quality encryption, a large key space, key sensitivity, and resistance to statistical attacks.
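
The two building blocks named in the abstract, bit-plane decomposition and a chaotic key stream, can be sketched in a few lines, together with the Shannon entropy used as a quality measure. This is an illustrative toy and not the paper's scheme: the logistic map, the seed `x0`, the parameter `r`, and the simple XOR step are all assumptions, and an XOR with a logistic-map stream should not be treated as a secure cipher on its own.

```python
import math


def bit_planes(pixels):
    """Split 8-bit pixel values into 8 bit planes (plane 0 = least significant)."""
    return [[(p >> k) & 1 for p in pixels] for k in range(8)]


def merge_planes(planes):
    """Inverse of bit_planes: rebuild the 8-bit pixel values."""
    return [sum(planes[k][idx] << k for k in range(8))
            for idx in range(len(planes[0]))]


def logistic_keystream(x0, r, n, burn_in=100):
    """Generate n key bytes from the logistic map x -> r * x * (1 - x)."""
    x = x0
    for _ in range(burn_in):  # discard the transient part of the orbit
        x = r * x * (1.0 - x)
    stream = []
    for _ in range(n):
        x = r * x * (1.0 - x)
        stream.append(int(x * 256) % 256)
    return stream


def xor_cipher(pixels, x0=0.3141, r=3.99):
    """XOR pixels with the chaotic key stream; applying it twice decrypts."""
    key = logistic_keystream(x0, r, len(pixels))
    return [p ^ k for p, k in zip(pixels, key)]


def shannon_entropy(pixels):
    """Shannon entropy in bits per pixel."""
    counts = {}
    for p in pixels:
        counts[p] = counts.get(p, 0) + 1
    n = len(pixels)
    return -sum(c / n * math.log2(c / n) for c in counts.values())


if __name__ == "__main__":
    image = [10, 10, 10, 12, 200, 200, 201, 255] * 32  # flattened toy "image"
    cipher = xor_cipher(image)
    print("entropy before:", shannon_entropy(image))
    print("entropy after: ", shannon_entropy(cipher))
```

For 8-bit data the entropy cannot exceed 8 bits per pixel, so the reported value of 7.999 indicates a cipher image whose grey levels are close to uniformly distributed.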

Volume 9, Issue 4, pp. 821-836 | Journal Article | PDF: https://onlinelibrary.wiley.com/doi/epdf/10.1049/cit2.12292

Citations: 0

Knowledge-based deep learning system for classifying Alzheimer's disease for multi-task learning

IF 8.4 | Zone 2, Computer Science | Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE

CAAI Transactions on Intelligence Technology

Pub Date : 2024-02-08 DOI: 10.1049/cit2.12291

Amol Dattatray Dhaygude, Gaurav Kumar Ameta, Ihtiram Raza Khan, Pavitar Parkash Singh, Renato R. Maaliw III, Natrayan Lakshmaiya, Mohammad Shabaz, Muhammad Attique Khan, Hany S. Hussein, Hammam Alshazly

Deep learning has recently become a viable approach for classifying Alzheimer's disease (AD) in medical imaging. However, existing models struggle to extract features from medical images efficiently and may waste additional information that is useful for disease classification. To address these issues, a deep three-dimensional convolutional neural network incorporating multi-task learning and attention mechanisms is proposed. An upgraded primary C3D network is used to create rougher low-level feature maps. It introduces a new convolution block that focuses on the structural aspects of the magnetic resonance imaging image, and another block that extracts attention weights specific to certain pixel positions in the feature map and multiplies them with the feature map output. Then, several fully connected layers are used to achieve multi-task learning, generating three outputs, including the primary classification task. The other two outputs are backpropagated during training to improve the primary classification task. Experimental findings show that the authors' proposed method outperforms current approaches for classifying AD, achieving better classification accuracy and other indicators on the Alzheimer's Disease Neuroimaging Initiative dataset. The approach shows promise for future disease classification studies.
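
Two mechanisms in this abstract lend themselves to a small sketch: position-wise attention weights multiplied into a feature map, and auxiliary outputs that contribute to the training signal. The code below is a simplified 2D illustration and not the authors' network; the sigmoid gating, the additive loss, and the `aux_weight` parameter are assumptions.

```python
import math


def sigmoid(x):
    """Squash a raw score into a weight in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def apply_attention(feature_map, attention_logits):
    """Rescale each feature by the attention weight at the same position."""
    return [[f * sigmoid(a) for f, a in zip(f_row, a_row)]
            for f_row, a_row in zip(feature_map, attention_logits)]


def multitask_loss(primary_loss, auxiliary_losses, aux_weight=0.5):
    """Combine the primary loss with down-weighted auxiliary losses.

    Backpropagating this sum lets the auxiliary tasks shape the shared
    layers that the primary classifier also uses.
    """
    return primary_loss + aux_weight * sum(auxiliary_losses)


if __name__ == "__main__":
    features = [[2.0, 4.0], [1.0, 3.0]]
    logits = [[0.0, 5.0], [-5.0, 0.0]]  # hypothetical attention scores
    print(apply_attention(features, logits))
    print(multitask_loss(1.0, [0.5, 0.5]))
```

In a real framework the same idea would be expressed with tensor operations and automatic differentiation; plain lists are used here only to keep the sketch dependency-free.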

Volume 9, Issue 4, pp. 805-820 | Journal Article | PDF: https://onlinelibrary.wiley.com/doi/epdf/10.1049/cit2.12291

Citations: 0

Hyperspectral image super resolution using deep internal and self-supervised learning

IF 5.1 | Zone 2, Computer Science | Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE

CAAI Transactions on Intelligence Technology

Pub Date : 2024-02-01 DOI: 10.1049/cit2.12285

Zhe Liu, Xian-Hua Han

By automatically learning the priors embedded in images with powerful modelling capabilities, deep learning-based algorithms have recently made considerable progress in reconstructing high-resolution hyperspectral (HR-HS) images. These methods are typically trained on large amounts of previously collected external data under full supervision of the ground truth. Thus, building a database for the research paradigm of merging low-resolution hyperspectral (LR-HS) and high-resolution multispectral (HR-MS) or RGB images, commonly called HSI SR, requires collecting the corresponding training triplets (HR-MS (RGB), LR-HS and HR-HS images) simultaneously, which is often difficult in practice. Moreover, models learned from training datasets collected under controlled conditions may perform significantly worse on real images captured in diverse environments. To handle these limitations, the authors propose leveraging deep internal and self-supervised learning to solve the HSI SR problem. The authors argue that a specific CNN model can be trained at test time, an approach called deep internal learning (DIL), by preparing the training triplet samples online from the observed LR-HS/HR-MS (or RGB) images and the down-sampled LR-HS version. However, the number of training triplets extracted solely from the transformed data of the observation itself is extremely small, particularly for HSI SR tasks with large spatial upscale factors, which limits reconstruction performance. To solve this problem, the authors further exploit deep self-supervised learning (DSL) by treating the observations as unlabelled training samples. Specifically, degradation modules inside the network implement the spatial and spectral down-sampling procedures that transform the generated HR-HS estimate into approximations of the high-resolution RGB and LR-HS observations, and the reconstruction errors of the observations are then used to measure the network's modelling performance. By consolidating DIL and DSL into a unified deep framework, the authors construct a more robust HSI SR method that requires no prior training and has great potential for flexible adaptation to different settings per observation. To verify the effectiveness of the proposed approach, extensive experiments were conducted on two benchmark HS datasets, CAVE and Harvard, and demonstrate a large performance gain over state-of-the-art methods.
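
The self-supervised objective described above can be sketched as follows: the HR-HS estimate is degraded spatially and spectrally, and each degraded version is compared with the corresponding observation. This is a hedged illustration and not the authors' implementation; average pooling as the spatial degradation, the `response` matrix of sensor weights, and the unweighted sum of the two errors are all assumptions. The cube layout is `cube[band][row][col]`.

```python
def spatial_downsample(cube, factor):
    """Average-pool every band by `factor` (assumed spatial degradation)."""
    out = []
    for band in cube:
        h, w = len(band), len(band[0])
        pooled = []
        for r in range(0, h - h % factor, factor):
            row = []
            for c in range(0, w - w % factor, factor):
                block = [band[r + dr][c + dc]
                         for dr in range(factor) for dc in range(factor)]
                row.append(sum(block) / len(block))
            pooled.append(row)
        out.append(pooled)
    return out


def spectral_downsample(cube, response):
    """Mix bands into fewer channels; response[channel][band] are assumed weights."""
    h, w = len(cube[0]), len(cube[0][0])
    return [[[sum(wt * cube[b][r][c] for b, wt in enumerate(weights))
              for c in range(w)] for r in range(h)]
            for weights in response]


def mse(a, b):
    """Mean squared error between two cubes of identical shape."""
    flat_a = [v for band in a for row in band for v in row]
    flat_b = [v for band in b for row in band for v in row]
    return sum((x - y) ** 2 for x, y in zip(flat_a, flat_b)) / len(flat_a)


def self_supervised_loss(hr_hs_estimate, observed_lr_hs, observed_hr_rgb,
                         factor, response):
    """Reconstruction error of both observations from the HR-HS estimate."""
    lr_hs_approx = spatial_downsample(hr_hs_estimate, factor)
    hr_rgb_approx = spectral_downsample(hr_hs_estimate, response)
    return mse(lr_hs_approx, observed_lr_hs) + mse(hr_rgb_approx, observed_hr_rgb)
```

Because the loss depends only on the two observations, it can be evaluated without any ground-truth HR-HS image, which is the property the abstract relies on.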

Volume 9, Issue 1, pp. 128-141 | Journal Article | PDF: https://onlinelibrary.wiley.com/doi/epdf/10.1049/cit2.12285

Citations: 0

Guest Editorial: Special issue on advances in representation learning for computer vision

IF 5.1 | Zone 2, Computer Science | Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE

CAAI Transactions on Intelligence Technology

Pub Date : 2024-02-01 DOI: 10.1049/cit2.12290

Andrew Beng Jin Teoh, Thian Song Ong, Kian Ming Lim, Chin Poo Lee

Deep learning has been a catalyst for a transformative revolution in machine learning and computer vision in the past decade. Within these research domains, methods grounded in deep learning have exhibited exceptional performance across a spectrum of tasks. The success of deep learning methods can be attributed to their capability to derive potent representations from data, which are integral to a myriad of downstream applications. These representations encapsulate the intrinsic structure, features, or latent variables characterising the underlying statistics of visual data. Despite these achievements, effectively conducting representation learning of visual data with deep models remains a challenge, particularly when confronted with vast and noisy datasets. This special issue is a dedicated platform for researchers worldwide to disseminate their latest, high-quality articles, aiming to enhance readers' comprehension of the principles, limitations, and diverse applications of representation learning in computer vision.

Wencheng Yang et al. present the first paper in this special issue. The authors thoroughly review feature extraction and learning methods, specifically focusing on cancellable biometrics, a topic not addressed in previous survey articles. They emphasise the significance of feature representation capacity in cancellable biometrics for achieving good recognition accuracy while preserving user data privacy. The paper states that selecting appropriate feature extraction and learning methods depends on the specific needs and restrictions of individual applications. Deep learning-based feature learning has significantly improved cancellable biometrics in recent years, while hand-crafted feature extraction has matured. In addition, the paper discusses the problems and potential research areas in this field, providing valuable insights for future studies in cancellable biometrics, which attempts to strike a balance between privacy protection and recognition efficiency.

The second paper by Mecheter et al. delves into the intricate realm of medical image analysis, specifically focusing on the segmentation of Magnetic Resonance images. The challenge lies in achieving precise segmentation, particularly when incorporating deep learning networks given the scarcity of sufficient medical images. Mecheter et al. tackle this challenge by proposing a novel approach: transfer learning from T1-weighted to T2-weighted MR sequences. Their work aims to enhance bone segmentation while minimising computational resources. The paper introduces an innovative excitation-based convolutional neural network and explores four transfer learning mechanisms. The hybrid transfer learning approach is particularly interesting, addressing overfitting concerns and preserving features from both modalities with minimal computation time. An evaluation on 14 clinical 3D brain MR and CT images demonstrates the superior performance and efficiency of hy

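To overcome these limitations, the authors propose a new approach that leverages deep internal learning (DIL) and deep self-supervised learning (DSL). DIL trains a specific CNN model at test time by preparing training triplet samples online from the observed LR-HS/HR-MS images and the down-sampled LR-HS version. In addition, DSL uses the observations as unlabelled training samples, which enhances the model's adaptability to different environments. By integrating DIL and DSL in a unified deep framework, the authors present a robust HSI SR method that requires no prior training. They conduct extensive experiments on benchmark hyperspectral datasets such as CAVE and Harvard, demonstrating the method's effectiveness and showing significant performance gains over state-of-the-art methods. We hope that the selected papers will improve the academic community's understanding of current trends and provide guidance for future areas of focus. We sincerely thank all the authors for choosing this special issue as a platform for disseminating their research. Special thanks go to the reviewers, whose valuable and thoughtful feedback greatly benefited the authors. We also thank the IET staff for their support and advice during the preparation of this special issue.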

Volume 9, Issue 1, pp. 1-3 | Journal Article | PDF: https://onlinelibrary.wiley.com/doi/epdf/10.1049/cit2.12290

Citations: 0

Learning to represent 2D human face with mathematical model

IF 5.1 | Zone 2, Computer Science | Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE

CAAI Transactions on Intelligence Technology

Pub Date : 2024-01-30 DOI: 10.1049/cit2.12284

Liping Zhang, Weijun Li, Linjun Sun, Lina Yu, Xin Ning, Xiaoli Dong

How should a human face pattern be represented? While a face is perceived continuously by the human visual system, computers usually store and process it discretely as 2D arrays of pixels. The authors attempt to learn a continuous surface representation of face images with an explicit function. First, an explicit model (EmFace) for human face representation is proposed in the form of a finite sum of mathematical terms, where each term is an analytic function element. Further, to estimate the unknown parameters of EmFace, a novel neural network, EmNet, is designed with an encoder-decoder structure and trained on a massive set of face images, where the encoder is a deep convolutional neural network and the decoder is the explicit mathematical expression of EmFace. The authors demonstrate that EmFace represents face images more accurately than the comparison method, with average mean square errors of 0.000888, 0.000936, and 0.000953 on the LFW, IARPA Janus Benchmark-B, and IJB-C datasets, respectively. Visualisation results show that EmFace has higher representation performance on faces with various expressions, postures, and other factors. Furthermore, EmFace achieves reasonable performance on several face image processing tasks, including face image restoration, denoising, and transformation.
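
The idea of an explicit model built from a finite sum of analytic terms can be sketched with a toy example. The abstract does not state which function elements EmFace uses, so the 2D Gaussian terms below are purely an assumption, as are the function names and the coordinate normalisation to [0, 1]. The sketch evaluates the continuous surface at any point, samples it on a pixel grid, and measures the mean squared error against a target image.

```python
import math


def explicit_surface(x, y, terms):
    """Evaluate a finite sum of analytic terms at a continuous point (x, y).

    Each term is (amplitude, centre_x, centre_y, sigma): a 2D Gaussian bump,
    chosen here only for illustration.
    """
    return sum(a * math.exp(-((x - cx) ** 2 + (y - cy) ** 2) / (2 * sigma ** 2))
               for a, cx, cy, sigma in terms)


def render(terms, height, width):
    """Sample the surface on a pixel grid (height and width must be >= 2)."""
    return [[explicit_surface(c / (width - 1), r / (height - 1), terms)
             for c in range(width)]
            for r in range(height)]


def mean_squared_error(img_a, img_b):
    """Mean squared error between two images of identical shape."""
    diffs = [(a - b) ** 2
             for row_a, row_b in zip(img_a, img_b)
             for a, b in zip(row_a, row_b)]
    return sum(diffs) / len(diffs)


if __name__ == "__main__":
    terms = [(1.0, 0.5, 0.5, 0.2), (0.4, 0.2, 0.3, 0.1)]  # hypothetical parameters
    coarse = render(terms, 8, 8)
    fine = render(terms, 64, 64)  # same surface, any resolution
    print(len(coarse), len(fine))
```

In the setting the abstract describes, an encoder network would predict the term parameters from an input image; here they are simply supplied by hand.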

Volume 9, Issue 1, pp. 54-68 | Journal Article | PDF: https://onlinelibrary.wiley.com/doi/epdf/10.1049/cit2.12284

Citations: 0

Robust zero-watermarking algorithm based on discrete wavelet transform and daisy descriptors for encrypted medical image

IF 5.1 | Zone 2, Computer Science | Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE

CAAI Transactions on Intelligence Technology

Pub Date : 2024-01-29 DOI: 10.1049/cit2.12282

Yiyi Yuan, Jingbing Li, Jing Liu, Uzair Aslam Bhatti, Zilong Liu, Yen-wei Chen

In complex network environments, the secure transmission of medical images faces challenges such as information leakage and malicious tampering, which significantly affect the accuracy of disease diagnoses by medical professionals. To address this problem, the authors propose a robust feature watermarking algorithm for encrypted medical images based on the multi-level discrete wavelet transform (DWT), the Daisy descriptor, and the discrete cosine transform (DCT). The algorithm first encrypts the original medical image through DWT-DCT and Logistic mapping. Subsequently, a three-level DWT is applied to the encrypted medical image, with the centre point of the LL3 sub-band within its low-frequency component serving as the sampling point. The Daisy descriptor matrix for this point is then computed. Finally, a DCT is performed on the Daisy descriptor matrix, and the low-frequency portion is processed using the perceptual hashing algorithm to generate a 32-bit binary feature vector for the medical image. This scheme utilises cryptographic knowledge and a zero-watermarking technique to embed watermarks without modifying medical images and can extract the watermark from test images without the original image, which meets the basic requirements of medical image watermarking. The embedding and extraction of watermarks take only 0.160 s and 0.411 s, respectively, with minimal computational overhead. Simulation results demonstrate the robustness of the algorithm against both conventional attacks and geometric attacks, with notable performance in resisting rotation attacks.
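
The last two steps of the pipeline described above (DCT of a descriptor matrix, then hashing the low-frequency part into 32 bits) can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the 8x8 `descriptor` is a hypothetical stand-in for the Daisy descriptor matrix, the sign-based binarisation is only one common form of perceptual hashing, and the XOR step that ties the feature bits to the watermark is an assumption borrowed from typical zero-watermarking schemes, since the abstract does not state how the two are combined.

```python
import math


def dct2(matrix):
    """Unnormalised 2D DCT-II of a rectangular list-of-lists matrix."""
    rows, cols = len(matrix), len(matrix[0])
    return [
        [
            sum(
                matrix[x][y]
                * math.cos(math.pi * (2 * x + 1) * u / (2 * rows))
                * math.cos(math.pi * (2 * y + 1) * v / (2 * cols))
                for x in range(rows)
                for y in range(cols)
            )
            for v in range(cols)
        ]
        for u in range(rows)
    ]


def feature_bits(descriptor, rows=4, cols=8):
    """Binarise the low-frequency (top-left) DCT block by coefficient sign."""
    coeffs = dct2(descriptor)
    return [1 if coeffs[u][v] >= 0 else 0 for u in range(rows) for v in range(cols)]


def xor_bits(a, b):
    """Element-wise XOR of two equal-length bit lists."""
    return [x ^ y for x, y in zip(a, b)]


# Hypothetical 8x8 stand-in for the Daisy descriptor matrix.
descriptor = [[math.sin(0.7 * x + 0.3 * y) for y in range(8)] for x in range(8)]
watermark = [(i * 7) % 2 for i in range(32)]  # toy 32-bit watermark

# "Embedding" (assumed XOR scheme): store a key; the image itself is untouched.
key = xor_bits(feature_bits(descriptor), watermark)

# "Extraction": recompute the features from the test image and undo the XOR.
recovered = xor_bits(feature_bits(descriptor), key)
```

Because the DCT is linear, multiplying every descriptor value by a positive constant leaves the coefficient signs, and therefore the feature bits, unchanged. That is the kind of invariance a sign-based hash relies on; it says nothing about the rotation robustness reported in the abstract, which comes from the descriptor itself.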


Vol. 9, no. 1, pp. 40-53. PDF: https://onlinelibrary.wiley.com/doi/epdf/10.1049/cit2.12282

Citations: 0

Feature extraction and learning approaches for cancellable biometrics: A survey

IF 5.1 | Zone 2, Computer Science | Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE

CAAI Transactions on Intelligence Technology

Pub Date : 2024-01-22 DOI: 10.1049/cit2.12283

Wencheng Yang, Song Wang, Jiankun Hu, Xiaohui Tao, Yan Li

Biometric recognition is a widely used technology for user authentication. In applying this technology, biometric security and recognition accuracy are two important issues. In terms of biometric security, cancellable biometrics is an effective technique for protecting biometric data. Regarding recognition accuracy, feature representation plays a significant role in the performance and reliability of cancellable biometric systems. How to design good feature representations for cancellable biometrics is a challenging topic that has attracted a great deal of attention from the computer vision community, especially from researchers of cancellable biometrics. Feature extraction and learning in cancellable biometrics aims to find suitable feature representations that achieve satisfactory recognition performance while protecting the privacy of biometric data. This survey reports on the progress, trends and challenges of feature extraction and learning for cancellable biometrics, shedding light on the latest developments and future research in this area.


Vol. 9, no. 1, pp. 4-25. PDF: https://onlinelibrary.wiley.com/doi/epdf/10.1049/cit2.12283

Citations: 0

Image enhancement with intensity transformation on embedding space

IF 5.1 | Zone 2, Computer Science | Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE

CAAI Transactions on Intelligence Technology

Pub Date : 2024-01-18 DOI: 10.1049/cit2.12279

Hanul Kim, Yeji Jeon, Yeong Jun Koh

In recent times, an image enhancement approach that learns a global transformation function using deep neural networks has gained attention. However, many existing methods based on this approach have a limitation: their transformation functions are too simple to imitate complex colour transformations between low-quality images and manually retouched high-quality images. To address this limitation, a simple yet effective approach for image enhancement is proposed. The proposed algorithm is based on a channel-wise intensity transformation; however, the transformation is applied to a learnt embedding space instead of a specific colour space, and the enhanced features are then returned to colours. To this end, the authors define the continuous intensity transformation (CIT) to describe the mapping between input and output intensities on the embedding space. Then, an enhancement network is developed, which produces multi-scale feature maps from input images, derives the set of transformation functions, and performs the CIT to obtain enhanced images. Extensive experiments on the MIT-Adobe 5K dataset demonstrate that the authors’ approach improves the performance of conventional intensity transforms on colour space metrics. Specifically, the authors achieved a 3.8% improvement in peak signal-to-noise ratio, a 1.8% improvement in structural similarity index measure, and a 27.5% improvement in learned perceptual image patch similarity. Also, the authors’ algorithm outperforms state-of-the-art alternatives on three image enhancement datasets: MIT-Adobe 5K, Low-Light, and Google HDR+.
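
To make the idea of a channel-wise intensity transformation concrete, the sketch below maps each channel's intensities through its own curve. It is an illustration under stated assumptions, not the paper's CIT: the piecewise-linear curve with evenly spaced knots is a common parametrisation chosen here for simplicity, and `apply_curve` and `transform_channels` are hypothetical names. In the paper the curves are derived by a network and applied to learnt embedding channels rather than fixed by hand.

```python
def apply_curve(value, knots):
    """Map an intensity in [0, 1] through a piecewise-linear curve.

    knots[i] is the output at input i / (len(knots) - 1); at least two
    knots are required. Inputs outside [0, 1] are clamped.
    """
    value = min(max(value, 0.0), 1.0)
    pos = value * (len(knots) - 1)
    lo = min(int(pos), len(knots) - 2)  # index of the segment's left knot
    frac = pos - lo                     # position within the segment
    return knots[lo] * (1.0 - frac) + knots[lo + 1] * frac


def transform_channels(features, curves):
    """Apply one curve per channel; features[c] is a flat list of intensities."""
    return [
        [apply_curve(v, curves[c]) for v in features[c]]
        for c in range(len(features))
    ]


# Channel 0 keeps the identity curve; channel 1 brightens mid-tones.
enhanced = transform_channels(
    [[0.0, 1.0], [0.5]],
    [[0.0, 1.0], [0.0, 0.8, 1.0]],
)
```

Adding more knots gives a more flexible curve, which is one simple way to address the limitation noted in the abstract that overly simple transformation functions cannot imitate complex retouching.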


Vol. 9, no. 1, pp. 101-115. PDF: https://onlinelibrary.wiley.com/doi/epdf/10.1049/cit2.12279

Citations: 0

Heterogeneous decentralised machine unlearning with seed model distillation

IF 5.1 | Zone 2, Computer Science | Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE

CAAI Transactions on Intelligence Technology

Pub Date : 2024-01-17 DOI: 10.1049/cit2.12281

Guanhua Ye, Tong Chen, Quoc Viet Hung Nguyen, Hongzhi Yin

As some recent information security legislation has endowed users with an unconditional right to be forgotten by any trained machine learning model, personalised IoT service providers have to take unlearning functionality into consideration. The most straightforward method to unlearn a user's contribution is to retrain the model from its initial state, which is not realistic in high-throughput applications with frequent unlearning requests. Though some machine unlearning frameworks have been proposed to speed up the retraining process, they do not fit decentralised learning scenarios. A decentralised unlearning framework called heterogeneous decentralised unlearning framework with seed (HDUS) is designed, which uses distilled seed models to construct erasable ensembles for all clients. Moreover, the framework is compatible with heterogeneous on-device models, giving it stronger scalability in real-world applications. Extensive experiments on three real-world datasets show that HDUS achieves state-of-the-art performance.
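
The notion of an erasable ensemble can be illustrated with a short sketch. It is a guess at the general shape, not the HDUS implementation: the `ErasableEnsemble` class, the averaging of class probabilities, and the representation of a seed model as a plain callable are all assumptions made for illustration. The point it shows is that if each client's contribution lives in its own ensemble member, forgetting that client amounts to dropping the member, with no retraining from the initial state.

```python
class ErasableEnsemble:
    """Toy ensemble whose members can be removed one client at a time."""

    def __init__(self):
        # client id -> callable mapping an input to a list of class probabilities
        self._members = {}

    def add(self, client_id, seed_model):
        """Register the (hypothetical) distilled seed model of one client."""
        self._members[client_id] = seed_model

    def unlearn(self, client_id):
        """Forget a client by dropping its seed model; nothing is retrained."""
        self._members.pop(client_id, None)

    def predict(self, x):
        """Average the class probabilities of all remaining members."""
        if not self._members:
            raise ValueError("ensemble is empty")
        outputs = [model(x) for model in self._members.values()]
        return [sum(column) / len(outputs) for column in zip(*outputs)]


ensemble = ErasableEnsemble()
ensemble.add("client_a", lambda x: [1.0, 0.0])  # stand-in seed models
ensemble.add("client_b", lambda x: [0.0, 1.0])

before = ensemble.predict(None)   # both clients contribute
ensemble.unlearn("client_b")
after = ensemble.predict(None)    # only client_a remains
```

Because members are only required to be callables with a shared output shape, they can have different internal architectures, which loosely mirrors the compatibility with heterogeneous on-device models that the abstract mentions.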


Citations: 0