
Latest publications in CAAI Transactions on Intelligence Technology

Multi-omics graph convolutional networks for digestive system tumour classification and early-late stage diagnosis
IF 7.3 | CAS Zone 2, Computer Science | Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE | Pub Date: 2024-11-01 | DOI: 10.1049/cit2.12395 | Vol. 9(6), pp. 1572-1586
Lin Zhou, Zhengzhi Zhu, Hongbo Gao, Chunyu Wang, Muhammad Attique Khan, Mati Ullah, Siffat Ullah Khan

The prevalence of digestive system tumours (DST) poses a significant challenge in the global fight against cancer: these neoplasms account for 20% of all documented cancer diagnoses and 22.5% of cancer-related fatalities. Accurate diagnosis of DST is paramount for vigilant patient monitoring and the selection of optimal treatments. To address this challenge, the authors introduce the Multi-omics Graph Transformer Convolutional Network (MGTCN), which classifies DST tumour types and discriminates between early- and late-stage tumours with a high degree of accuracy. The MGTCN model incorporates the Graph Transformer Layer framework to transform the multi-omics adjacency matrix, thereby illuminating potential associations among samples. A rigorous experimental evaluation on the DST dataset from The Cancer Genome Atlas underscores the efficiency and precision of MGTCN in diagnosing diverse DST tumour types and discriminating between early- and late-stage DST cases. The source code is available at https://github.com/bigone1/MGTCN.
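The core operation described here, recomputing a sample-to-sample adjacency so that latent associations surface, can be sketched as a single attention pass. The sketch below is illustrative plain Python, not the authors' MGTCN implementation; the feature vectors, the single unprojected head, and the dimensions are all invented for the example.

```python
import math

def attention_adjacency(features):
    """Rebuild a sample-sample adjacency matrix as the row-wise softmax of
    scaled dot-product similarities (one attention head, no learnt
    projections -- an illustrative stand-in for a graph-transformer layer)."""
    n, d = len(features), len(features[0])
    adj = []
    for i in range(n):
        scores = [sum(a * b for a, b in zip(features[i], features[j])) / math.sqrt(d)
                  for j in range(n)]
        m = max(scores)                       # subtract the max for numerical stability
        exps = [math.exp(s - m) for s in scores]
        z = sum(exps)
        adj.append([e / z for e in exps])
    return adj

# Three toy samples, each described by two omics-derived features.
A = attention_adjacency([[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]])
```

Each row of the rebuilt adjacency is a probability distribution, and similar samples (here, the first two) attend to each other more strongly than to dissimilar ones.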

Citations: 0
Rehabilitation exoskeleton system with bidirectional virtual reality feedback training strategy
IF 7.3 | CAS Zone 2, Computer Science | Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE | Pub Date: 2024-10-26 | DOI: 10.1049/cit2.12391 | Vol. 10(3), pp. 728-737
Yongsheng Gao, Guodong Lang, Chenxiao Zhang, Rui Wu, Yanhe Zhu, Yu Zhao, Jie Zhao

Virtual reality (VR) technology revitalises rehabilitation training by creating rich, interactive virtual rehabilitation scenes and tasks that deeply engage patients. Robots paired with immersive VR environments can significantly enhance patients' sense of immersion during training. This paper proposes a rehabilitation robot system that integrates a VR environment, the exoskeleton itself, and rehabilitation assessment metrics derived from surface electromyographic (sEMG) signals. By employing realistic and engaging virtual stimuli, the method guides patients to participate actively, thereby enhancing the reconstruction of neural connections, an essential aspect of rehabilitation. The study also introduces a muscle activation model that merges the linear and non-linear states of muscle, avoiding the impact of non-linear shape factors on model accuracy present in traditional models. A muscle strength assessment model based on optimised generalised regression (WOA-GRNN) is proposed, achieving a root mean square error of 0.017347 and a mean absolute percentage error of 1.2461%, which serve as critical indicators of rehabilitation effectiveness. Finally, the system is preliminarily applied in human movement experiments, validating the practicality and potential effectiveness of VR-centred rehabilitation strategies in medical recovery.
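The two reported assessment figures are the standard RMSE and MAPE metrics. A minimal sketch of how they are computed (the muscle-strength values below are toy numbers, not the paper's data):

```python
def rmse(y_true, y_pred):
    """Root mean square error."""
    return (sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)) ** 0.5

def mape(y_true, y_pred):
    """Mean absolute percentage error, in percent (assumes no zero targets)."""
    return 100.0 * sum(abs((t - p) / t) for t, p in zip(y_true, y_pred)) / len(y_true)

# Toy muscle-strength targets vs. model outputs (illustrative only).
truth = [10.0, 12.0, 9.0, 11.0]
pred = [10.2, 11.8, 9.1, 10.9]
```

A perfect prediction gives RMSE 0 and MAPE 0%; the paper's reported values (0.017347 and 1.2461%) would come out of exactly these formulas applied to its test set.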

Citations: 0
Topological search and gradient descent boosted Runge–Kutta optimiser with application to engineering design and feature selection
IF 7.3 | CAS Zone 2, Computer Science | Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE | Pub Date: 2024-10-24 | DOI: 10.1049/cit2.12387 | Vol. 10(2), pp. 557-614
Jinge Shi, Yi Chen, Ali Asghar Heidari, Zhennao Cai, Huiling Chen, Guoxi Liang

The Runge–Kutta optimiser (RUN) algorithm, renowned for its powerful optimisation capabilities, faces challenges in dealing with increasing complexity in real-world problems. Specifically, it shows deficiencies in terms of limited local exploration capabilities and less precise solutions. Therefore, this research aims to integrate the topological search (TS) mechanism with the gradient search rule (GSR) into the framework of RUN, introducing an enhanced algorithm called TGRUN to improve the performance of the original algorithm. The TS mechanism employs a circular topological scheme to conduct a thorough exploration of the solution regions surrounding each solution, enabling a careful examination of valuable solution areas and enhancing the algorithm's effectiveness in local exploration. To prevent the algorithm from becoming trapped in local optima, the GSR also integrates gradient descent principles to direct the algorithm in a wider investigation of the global solution space. This study conducted a series of experiments on the IEEE CEC2017 comprehensive benchmark functions to assess the enhanced effectiveness of TGRUN. Additionally, the evaluation includes real-world engineering design and feature selection problems serving as an additional test of the algorithm's optimisation capabilities. The validation outcomes indicate a significant improvement in the optimisation capabilities and solution accuracy of TGRUN.
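The circular topological scheme amounts to restricting each solution's local comparison to its neighbours on a ring. A minimal sketch, where the neighbourhood radius, the toy population, and the objective are all invented for illustration (this is not the TGRUN update rule itself):

```python
def ring_neighbours(i, n, k=1):
    """Indices of the k solutions on each side of solution i in a ring of n."""
    return [(i + d) % n for d in range(-k, k + 1) if d != 0]

def best_local_move(population, i, objective, k=1):
    """Index of the best solution (objective minimised) among i and its ring
    neighbours -- the 'careful examination' of the region around solution i."""
    candidates = [i] + ring_neighbours(i, len(population), k)
    return min(candidates, key=lambda j: objective(population[j]))

pop = [3.0, -1.0, 0.5, 2.0, -2.5]     # toy one-dimensional solutions
sq = lambda x: x * x                  # toy objective: minimise x squared
```

For example, solution 1 on this ring sees solutions 0 and 2, and the local move selects solution 2 (the smallest squared value in that neighbourhood).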

Citations: 0
RJAN: Region-based joint attention network for 3D shape recognition
IF 7.3 | CAS Zone 2, Computer Science | Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE | Pub Date: 2024-10-24 | DOI: 10.1049/cit2.12388 | Vol. 10(2), pp. 460-473
Yue Zhao, Weizhi Nie, Jie Nie, Yuyi Zhang, Bo Wang

As an essential field of multimedia and computer vision, 3D shape recognition has attracted much research attention in recent years. Multiview-based approaches have demonstrated their superiority in generating effective 3D shape representations. Typical methods usually extract the multiview global features and aggregate them together to generate 3D shape descriptors. However, there exist two disadvantages: First, the mainstream methods ignore the comprehensive exploration of local information in each view. Second, many approaches roughly aggregate multiview features by adding or concatenating them together. The information loss for some discriminative characteristics limits the representation effectiveness. To address these problems, a novel architecture named region-based joint attention network (RJAN) was proposed. Specifically, the authors first design a hierarchical local information exploration module for view descriptor extraction. The region-to-region and channel-to-channel relationships from different granularities can be comprehensively explored and utilised to provide more discriminative characteristics for view feature learning. Subsequently, a novel relation-aware view aggregation module is designed to aggregate the multiview features for shape descriptor generation, considering the view-to-view relationships. Extensive experiments were conducted on three public databases: ModelNet40, ModelNet10, and ShapeNetCore55. RJAN achieves state-of-the-art performance in the tasks of 3D shape classification and 3D shape retrieval, which demonstrates the effectiveness of RJAN. The code has been released on https://github.com/slurrpp/RJAN.
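The contrast the abstract draws, between rough aggregation (adding or concatenating view features) and relation-aware aggregation, can be sketched as view-level attention. The weighting rule below (agreement with the mean view) is an invented stand-in for RJAN's learnt relation-aware module, and the view features are toy values:

```python
import math

def aggregate_views(views):
    """Pool per-view feature vectors into one shape descriptor, weighting each
    view by the softmax of its dot product with the mean view -- an
    illustrative relation-aware alternative to plain averaging."""
    n, d = len(views), len(views[0])
    mean = [sum(v[c] for v in views) / n for c in range(d)]
    scores = [sum(a * b for a, b in zip(v, mean)) for v in views]
    m = max(scores)                      # stabilise the softmax
    w = [math.exp(s - m) for s in scores]
    z = sum(w)
    w = [x / z for x in w]
    return [sum(w[i] * views[i][c] for i in range(n)) for c in range(d)]

views = [[1.0, 0.0], [0.8, 0.2], [0.0, 1.0]]   # toy 2-D features from 3 views
desc = aggregate_views(views)
```

The resulting descriptor is a convex combination of the views, so views that agree with the consensus contribute more, whereas a plain mean would weight all views equally regardless of their relationships.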

Citations: 0
Artificial intelligence assisted prediction of optimum operating conditions of shell and tube heat exchangers: A grey-box approach
IF 7.3 | CAS Zone 2, Computer Science | Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE | Pub Date: 2024-10-24 | DOI: 10.1049/cit2.12393 | Vol. 10(2), pp. 349-358
Zahid Ullah, Iftikhar Ahmad, Abdul Samad, Husnain Saghir, Farooq Ahmad, Manabu Kano, Hakan Caliskan, Nesrin Caliskan, Hiki Hong

In this study, a grey-box (GB) model was developed to predict the optimum mass flow rates of the inlet streams of a shell and tube heat exchanger (STHE) under varying process conditions. Aspen Exchanger Design and Rating (Aspen-EDR) was initially used to construct a first-principles (FP) model of the STHE from industrial data. A genetic algorithm (GA) was incorporated into the FP model to attain the minimum exit temperature for the hot kerosene process stream under varying process conditions. A dataset comprising optimum process conditions was generated through this FP-GA integration and was used to develop an artificial neural network (ANN) model. Subsequently, the ANN model was merged with the FP model by substituting it for the GA, forming the GB model. The developed GB model, that is, the ANN and FP integration, achieved higher effectiveness and a lower outlet temperature than the standalone FP model. Performance of the GB framework was also comparable to the FP-GA approach, but it significantly reduced the computation time required to estimate the optimum process conditions. The proposed GB-based method improved the STHE's ability to extract energy from the process stream and strengthened its resilience to diverse process conditions.
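The grey-box substitution, replacing a repeated optimisation with a model trained on (condition, optimum) pairs, can be caricatured in a few lines. Everything below is invented for illustration: the toy exit-temperature function stands in for the Aspen-EDR first-principles model, exhaustive grid search stands in for the GA, and linear interpolation over a small table stands in for the ANN surrogate.

```python
def exit_temp(m, inlet):
    """Toy exchanger model: cooling gain saturates with coolant flow m while a
    penalty term grows linearly (invented, not the authors' FP model)."""
    return inlet - (inlet / 6.0) * m / (m + 1.0) + 5.0 * m

def optimum_by_search(inlet, grid):
    """Stand-in for the FP + GA step: pick the flow minimising exit temperature."""
    return min(grid, key=lambda m: exit_temp(m, inlet))

grid = [i / 100.0 for i in range(1, 501)]                 # candidate flow rates
table = {t: optimum_by_search(t, grid) for t in (240.0, 300.0, 360.0)}

def optimum_by_surrogate(inlet):
    """Grey-box stand-in: interpolate precomputed optima instead of re-searching."""
    ts = sorted(table)
    lo = max(t for t in ts if t <= inlet)
    hi = min(t for t in ts if t >= inlet)
    if lo == hi:
        return table[lo]
    w = (inlet - lo) / (hi - lo)
    return (1 - w) * table[lo] + w * table[hi]
```

The surrogate answers in constant time where the search re-evaluates the model over the whole grid, which is the computation-time saving the abstract reports, at the cost of whatever approximation error the surrogate introduces between its training points.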

Citations: 0
Improving long-tail classification via decoupling and regularisation
IF 7.3 | CAS Zone 2, Computer Science | Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE | Pub Date: 2024-10-24 | DOI: 10.1049/cit2.12374 | Vol. 10(1), pp. 62-71
Shuzheng Gao, Chaozheng Wang, Cuiyun Gao, Wenjian Luo, Peiyi Han, Qing Liao, Guandong Xu

Real-world data always exhibit an imbalanced and long-tailed distribution, which leads to poor performance for neural network-based classification. Existing methods mainly tackle this problem by reweighting the loss function or rebalancing the classifier. However, one crucial aspect overlooked by previous research studies is the imbalanced feature space problem caused by the imbalanced angle distribution. In this paper, the authors shed light on the significance of the angle distribution in achieving a balanced feature space, which is essential for improving model performance under long-tailed distributions. Nevertheless, it is challenging to effectively balance both the classifier norms and angle distribution due to problems such as the low feature norm. To tackle these challenges, the authors first thoroughly analyse the classifier and feature space by decoupling the classification logits into three key components: classifier norm (i.e. the magnitude of the classifier vector), feature norm (i.e. the magnitude of the feature vector), and cosine similarity between the classifier vector and feature vector. In this way, the authors analyse the change of each component in the training process and reveal three critical problems that should be solved, that is, the imbalanced angle distribution, the lack of feature discrimination, and the low feature norm. Drawing from this analysis, the authors propose a novel loss function that incorporates hyperspherical uniformity, additive angular margin, and feature norm regularisation. Each component of the loss function addresses a specific problem and synergistically contributes to achieving a balanced classifier and feature space. The authors conduct extensive experiments on three popular benchmark datasets including CIFAR-10/100-LT, ImageNet-LT, and iNaturalist 2018. 
The experimental results demonstrate that the authors’ loss function outperforms several previous state-of-the-art methods in addressing the challenges posed by imbalanced and long-tailed datasets, that is, by improving upon the best-performing baselines on CIFAR-100-LT by 1.34, 1.41, 1.41 and 1.33, respectively.
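The decoupling the authors describe rests on the standard identity for a linear classifier's logit: the dot product w·f factors into classifier norm, feature norm, and the cosine of the angle between them. A minimal check in plain Python (the vectors are toy values):

```python
import math

def decompose_logit(w, f):
    """Split the dot-product logit into classifier norm, feature norm, and
    cosine similarity -- the three components the paper analyses separately."""
    wn = math.sqrt(sum(x * x for x in w))
    fn = math.sqrt(sum(x * x for x in f))
    cos = sum(a * b for a, b in zip(w, f)) / (wn * fn)
    return wn, fn, cos

w, f = [3.0, 4.0], [2.0, 1.0]
wn, fn, cos = decompose_logit(w, f)
logit = sum(a * b for a, b in zip(w, f))   # the ordinary linear-classifier logit
```

Reassembling the three factors recovers the logit exactly, which is what lets a loss regularise each component (angle spread, margin, feature norm) on its own.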

Citations: 0
Learning-based tracking control of AUV: Mixed policy improvement and game-based disturbance rejection
IF 7.3 | CAS Zone 2, Computer Science | Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE | Pub Date: 2024-10-21 | DOI: 10.1049/cit2.12372 | Vol. 10(2), pp. 510-528
Jun Ye, Hongbo Gao, Manjiang Hu, Yougang Bian, Qingjia Cui, Xiaohui Qin, Rongjun Ding

A mixed adaptive dynamic programming (ADP) scheme based on zero-sum game theory is developed to address optimal control problems of autonomous underwater vehicle (AUV) systems subject to disturbances and safe constraints. By combining prior dynamic knowledge and actual sampled data, the proposed approach effectively mitigates the defect caused by the inaccurate dynamic model and significantly improves the training speed of the ADP algorithm. Initially, the dataset is enriched with sufficient reference data collected based on a nominal model without considering modelling bias. Also, the control object interacts with the real environment and continuously gathers adequate sampled data in the dataset. To comprehensively leverage the advantages of model-based and model-free methods during training, an adaptive tuning factor is introduced based on the dataset that possesses model-referenced information and conforms to the distribution of the real-world environment, which balances the influence of model-based control law and data-driven policy gradient on the direction of policy improvement. As a result, the proposed approach accelerates the learning speed compared to data-driven methods, concurrently also enhancing the tracking performance in comparison to model-based control methods. Moreover, the optimal control problem under disturbances is formulated as a zero-sum game, and the actor-critic-disturbance framework is introduced to approximate the optimal control input, cost function, and disturbance policy, respectively. Furthermore, the convergence property of the proposed algorithm based on the value iteration method is analysed. Finally, an example of AUV path following based on the improved line-of-sight guidance is presented to demonstrate the effectiveness of the proposed method.
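The adaptive tuning factor described above reduces, in its simplest form, to a convex combination of a model-based update direction and a data-driven policy gradient. The sketch below is schematic only: the gradients, learning rate, and factor schedule are invented, not the paper's ADP update.

```python
def mixed_update(theta, g_model, g_data, alpha, lr=0.1):
    """One policy-improvement step blending a model-based gradient with a
    data-driven one: alpha = 1 trusts the nominal model, alpha = 0 the data."""
    return [p - lr * (alpha * gm + (1.0 - alpha) * gd)
            for p, gm, gd in zip(theta, g_model, g_data)]

def alpha_schedule(n_samples, n_ref=100):
    """Invented schedule: lean on the prior model while sampled data are
    scarce, shift towards the data-driven gradient as the dataset grows."""
    return n_ref / (n_ref + n_samples)

# With no sampled data yet, the update follows the model-based gradient alone.
theta = mixed_update([1.0, -0.5], [0.2, 0.1], [0.4, -0.1], alpha_schedule(0))
```

This captures the trade-off the abstract claims: early on the model-referenced direction accelerates learning, and as real samples accumulate the data-driven gradient corrects the model's bias.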

Citations: 0
Multi-sensor missile-borne LiDAR point cloud data augmentation based on Monte Carlo distortion simulation
IF 7.3 CAS Zone 2 (Computer Science) Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE Pub Date : 2024-10-17 DOI: 10.1049/cit2.12389
Luda Zhao, Yihua Hu, Fei Han, Zhenglei Dou, Shanshan Li, Yan Zhang, Qilong Wu

Large-scale point cloud datasets form the basis for training various deep learning networks and achieving high-quality network processing tasks. Due to the diversity and robustness constraints of the data, data augmentation (DA) methods are utilised to expand dataset diversity and scale. However, due to the complex and distinct characteristics of LiDAR point cloud data from different platforms (such as missile-borne and vehicular LiDAR data), directly applying traditional 2D visual domain DA methods to 3D data can lead to networks trained with this approach failing to robustly achieve the corresponding tasks. To address this issue, the present study explores DA for missile-borne LiDAR point clouds using a Monte Carlo (MC) simulation method that closely resembles practical application. First, a model of the multi-sensor imaging system is established, taking into account the joint errors arising from the platform itself and from relative motion during the imaging process. A distortion simulation method based on MC simulation for augmenting missile-borne LiDAR point cloud data is then proposed, underpinned by an analysis of the combined errors between sensors of different modalities, achieving high-quality augmentation of point cloud data. The effectiveness of the proposed method in addressing imaging system errors and distortion simulation is validated using the imaging scene dataset constructed in this paper.
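The general shape of MC-based distortion augmentation, sampling a rigid-body perturbation plus per-point range noise and applying it to a cloud, can be sketched as below. The error model (yaw jitter, boresight translation, Gaussian range noise) and all magnitudes are illustrative assumptions standing in for the paper's joint-error model of the multi-sensor imaging system.

```python
import numpy as np

def mc_distort(points, n_aug=4, sigma_rot_deg=0.5, sigma_trans=0.05,
               sigma_range=0.02, rng=None):
    """Monte Carlo distortion augmentation for a LiDAR point cloud.

    For each of n_aug samples, draws a small rigid-body error (yaw jitter
    plus a translation offset, a crude stand-in for platform and
    relative-motion error) and per-point range noise, and applies them.
    points: (N, 3) array; returns a list of n_aug distorted (N, 3) clouds.
    """
    rng = np.random.default_rng(rng)
    out = []
    for _ in range(n_aug):
        # Small-angle rotation about z (platform yaw jitter).
        yaw = np.deg2rad(rng.normal(0.0, sigma_rot_deg))
        c, s = np.cos(yaw), np.sin(yaw)
        R = np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])
        t = rng.normal(0.0, sigma_trans, size=3)                 # offset error
        noise = rng.normal(0.0, sigma_range, size=points.shape)  # range noise
        out.append(points @ R.T + t + noise)
    return out

cloud = np.random.default_rng(0).uniform(-1.0, 1.0, size=(128, 3))
augmented = mc_distort(cloud, n_aug=4, rng=1)
```

Each draw yields a plausibly distorted copy of the cloud, so the augmented set covers the error distribution rather than a single deterministic warp.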

{"title":"Multi-sensor missile-borne LiDAR point cloud data augmentation based on Monte Carlo distortion simulation","authors":"Luda Zhao,&nbsp;Yihua Hu,&nbsp;Fei Han,&nbsp;Zhenglei Dou,&nbsp;Shanshan Li,&nbsp;Yan Zhang,&nbsp;Qilong Wu","doi":"10.1049/cit2.12389","DOIUrl":"10.1049/cit2.12389","url":null,"abstract":"<div>\u0000 \u0000 \u0000 <section>\u0000 \u0000 <p>Large-scale point cloud datasets form the basis for training various deep learning networks and achieving high-quality network processing tasks. Due to the diversity and robustness constraints of the data, data augmentation (DA) methods are utilised to expand dataset diversity and scale. However, due to the complex and distinct characteristics of LiDAR point cloud data from different platforms (such as missile-borne and vehicular LiDAR data), directly applying traditional 2D visual domain DA methods to 3D data can lead to networks trained using this approach not robustly achieving the corresponding tasks. To address this issue, the present study explores DA for missile-borne LiDAR point cloud using a Monte Carlo (MC) simulation method that closely resembles practical application. Firstly, the model of multi-sensor imaging system is established, taking into account the joint errors arising from the platform itself and the relative motion during the imaging process. A distortion simulation method based on MC simulation for augmenting missile-borne LiDAR point cloud data is proposed, underpinned by an analysis of combined errors between different modal sensors, achieving high-quality augmentation of point cloud data. The effectiveness of the proposed method in addressing imaging system errors and distortion simulation is validated using the imaging scene dataset constructed in this paper. 
Comparative experiments between the proposed point cloud DA algorithm and the current state-of-the-art algorithms in point cloud detection and single object tracking tasks demonstrate that the proposed method can improve the network performance obtained from unaugmented datasets by over 17.3% and 17.9%, surpassing SOTA performance of current point cloud DA algorithms.</p>\u0000 </section>\u0000 </div>","PeriodicalId":46211,"journal":{"name":"CAAI Transactions on Intelligence Technology","volume":"10 1","pages":"300-316"},"PeriodicalIF":7.3,"publicationDate":"2024-10-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://onlinelibrary.wiley.com/doi/epdf/10.1049/cit2.12389","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143533353","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Resource-adaptive and OOD-robust inference of deep neural networks on IoT devices
IF 7.3 CAS Zone 2 (Computer Science) Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE Pub Date : 2024-10-09 DOI: 10.1049/cit2.12384
Cailen Robertson, Ngoc Anh Tong, Thanh Toan Nguyen, Quoc Viet Hung Nguyen, Jun Jo

Efficiently executing inference tasks of deep neural networks on devices with limited resources places a significant load on IoT systems. To alleviate this load, one innovative method is branching, which adds extra layers with classification exits to a pre-trained model, enabling inputs with high-confidence predictions to exit early and thus reducing inference cost. However, branching networks, not originally tailored for IoT environments, are susceptible to noisy and out-of-distribution (OOD) data, and they demand additional training for optimal performance. The authors introduce BrevisNet, a novel branching methodology for creating on-device branching models that are both resource-adaptive and noise-robust for IoT applications. The method leverages the refined uncertainty-estimation capabilities of Dirichlet distributions for classification predictions, combined with the superior OOD detection of energy-based models. The authors propose a unique training approach and thresholding technique that enhance the precision of branch predictions, offering robustness against noise and OOD inputs.
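An early-exit gate combining Dirichlet-style uncertainty with an energy score might look like the sketch below. This is a generic illustration of the two ingredients the abstract names, not BrevisNet's actual architecture or trained thresholds; the exponential-evidence mapping and both threshold values are assumptions.

```python
import numpy as np

def exit_decision(logits, u_max=0.2, energy_max=-2.0):
    """Early-exit gate at one branch of a branching network.

    Treats exp(logits) as Dirichlet evidence, so vacuity u = K / sum(alpha)
    measures predictive uncertainty, and uses the energy score
    E = -logsumexp(logits) for OOD screening (lower energy = more
    in-distribution). Exit only if the prediction is confident AND the
    input looks in-distribution. Thresholds are illustrative.
    """
    logits = np.asarray(logits, dtype=float)
    k = logits.size
    alpha = np.exp(logits) + 1.0              # Dirichlet concentration
    uncertainty = k / alpha.sum()             # vacuity in (0, 1]
    energy = -np.log(np.exp(logits).sum())    # energy-based OOD score
    should_exit = bool(uncertainty < u_max and energy < energy_max)
    return should_exit, float(uncertainty), float(energy)

confident = exit_decision([8.0, 0.1, -0.3])   # peaked logits: exit early
ambiguous = exit_decision([0.2, 0.1, 0.0])    # flat logits: defer deeper
```

Inputs that pass the gate leave at the branch, saving the cost of the remaining layers; flat or high-energy (likely OOD) inputs are deferred to later exits.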

{"title":"Resource-adaptive and OOD-robust inference of deep neural networks on IoT devices","authors":"Cailen Robertson,&nbsp;Ngoc Anh Tong,&nbsp;Thanh Toan Nguyen,&nbsp;Quoc Viet Hung Nguyen,&nbsp;Jun Jo","doi":"10.1049/cit2.12384","DOIUrl":"10.1049/cit2.12384","url":null,"abstract":"<p>Efficiently executing inference tasks of deep neural networks on devices with limited resources poses a significant load in IoT systems. To alleviate the load, one innovative method is branching that adds extra layers with classification exits to a pre-trained model, enabling inputs with high-confidence predictions to exit early, thus reducing inference cost. However, branching networks, not originally tailored for IoT environments, are susceptible to noisy and out-of-distribution (OOD) data, and they demand additional training for optimal performance. The authors introduce BrevisNet, a novel branching methodology designed for creating on-device branching models that are both resource-adaptive and noise-robust for IoT applications. The method leverages the refined uncertainty estimation capabilities of Dirichlet distributions for classification predictions, combined with the superior OOD detection of energy-based models. The authors propose a unique training approach and thresholding technique that enhances the precision of branch predictions, offering robustness against noise and OOD inputs. 
The findings demonstrate that BrevisNet surpasses existing branching techniques in training efficiency, accuracy, overall performance, and robustness.</p>","PeriodicalId":46211,"journal":{"name":"CAAI Transactions on Intelligence Technology","volume":"10 1","pages":"115-133"},"PeriodicalIF":7.3,"publicationDate":"2024-10-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://onlinelibrary.wiley.com/doi/epdf/10.1049/cit2.12384","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143535724","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
A criterion for selecting the appropriate one from the trained models for model-based offline policy evaluation
IF 7.3 CAS Zone 2 (Computer Science) Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE Pub Date : 2024-10-09 DOI: 10.1049/cit2.12376
Chongchong Li, Yue Wang, Zhi-Ming Ma, Yuting Liu

Offline policy evaluation, which evaluates and selects complex policies for decision-making using only offline datasets, is important in reinforcement learning. At present, model-based offline policy evaluation (MBOPE) is widely welcomed because it is easy to implement and performs well. MBOPE directly approximates the unknown value of a given policy using the Monte Carlo method, given estimated transition and reward functions of the environment. Usually, multiple models are trained and one of them is then selected for use. However, selecting an appropriate model from those trained remains a challenge. The authors first analyse the upper bound of the difference between the approximated value and the unknown true value. Theoretical results show that this difference is related to the trajectories generated by the given policy on the learnt model and to the prediction error of the transition and reward functions at the generated data points. Based on these theoretical results, a new criterion is proposed to tell which trained model is better suited to evaluating the given policy.
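The two pieces described above, a Monte Carlo value estimate under a learnt model and an error-based model-selection score, can be sketched as follows. The function signatures, the deterministic-model interface, and the use of a plain one-step prediction error as the score are simplifying assumptions; the paper's criterion is derived from its upper-bound analysis, not this exact formula.

```python
import numpy as np

def mc_policy_value(model, reward, policy, s0, horizon=50,
                    n_rollouts=100, gamma=0.99, rng=None):
    """Model-based OPE: Monte Carlo estimate of a policy's discounted
    return, rolling out the learnt transition model and reward function
    from a start state. (Deterministic-model interface for brevity.)"""
    rng = np.random.default_rng(rng)
    returns = []
    for _ in range(n_rollouts):
        s, g, disc = np.array(s0, dtype=float), 0.0, 1.0
        for _ in range(horizon):
            a = policy(s)
            g += disc * reward(s, a)
            disc *= gamma
            s = model(s, a)
        returns.append(g)
    return float(np.mean(returns))

def model_selection_score(model, transitions):
    """Score in the spirit of the paper's criterion: prefer the trained
    model with the smallest prediction error on transitions the evaluated
    policy actually visits. transitions: list of (s, a, s_next) tuples."""
    errs = [np.linalg.norm(model(s, a) - s_next)
            for s, a, s_next in transitions]
    return float(np.mean(errs))

# Toy check: the true dynamics should score better than a biased model.
true_dyn = lambda s, a: 0.9 * s + a
biased_dyn = lambda s, a: 0.9 * s + a + 0.3
policy = lambda s: -0.1 * s
data, s = [], np.array([1.0])
for _ in range(20):
    a = policy(s)
    s_next = true_dyn(s, a)
    data.append((s, a, s_next))
    s = s_next
v = mc_policy_value(true_dyn, lambda s, a: -float(s @ s),
                    policy, [1.0], horizon=10, n_rollouts=2, rng=0)
```

A lower score on policy-generated transitions indicates a model whose Monte Carlo value estimate is more trustworthy for that particular policy.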

{"title":"A criterion for selecting the appropriate one from the trained models for model-based offline policy evaluation","authors":"Chongchong Li,&nbsp;Yue Wang,&nbsp;Zhi-Ming Ma,&nbsp;Yuting Liu","doi":"10.1049/cit2.12376","DOIUrl":"10.1049/cit2.12376","url":null,"abstract":"<p>Offline policy evaluation, evaluating and selecting complex policies for decision-making by only using offline datasets is important in reinforcement learning. At present, the model-based offline policy evaluation (MBOPE) is widely welcomed because of its easy to implement and good performance. MBOPE directly approximates the unknown value of a given policy using the Monte Carlo method given the estimated transition and reward functions of the environment. Usually, multiple models are trained, and then one of them is selected to be used. However, a challenge remains in selecting an appropriate model from those trained for further use. The authors first analyse the upper bound of the difference between the approximated value and the unknown true value. Theoretical results show that this difference is related to the trajectories generated by the given policy on the learnt model and the prediction error of the transition and reward functions at these generated data points. Based on the theoretical results, a new criterion is proposed to tell which trained model is better suited for evaluating the given policy. 
At last, the effectiveness of the proposed criterion is demonstrated on both benchmark and synthetic offline datasets.</p>","PeriodicalId":46211,"journal":{"name":"CAAI Transactions on Intelligence Technology","volume":"10 1","pages":"223-234"},"PeriodicalIF":7.3,"publicationDate":"2024-10-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://onlinelibrary.wiley.com/doi/epdf/10.1049/cit2.12376","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143535725","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Journal: CAAI Transactions on Intelligence Technology