Exploiting the capacity of deep networks only at the training stage for nonlinear black-box system identification

Impact Factor: 6.8 | Zone 1, Computer Science | JCR category: COMPUTER SCIENCE, INFORMATION SYSTEMS | Information Sciences | Pub Date: 2025-01-01 | Epub Date: 2024-08-15 | DOI: 10.1016/j.ins.2024.121351

Vahid MohammadZadeh Eivaghi, Mahdi Aliyari-Shoorehdeli

Information Sciences, volume 686, Article 121351. Publication type: Journal Article. Source: https://www.sciencedirect.com/science/article/pii/S0020025524012659

Citations: 0

Abstract

To exploit the modeling capacity of deep models in system identification without paying for it at inference time, this study presents a novel training strategy that uses deep models only during the training stage. Two separate models with different structures and goals are employed. The first, called the teacher model, is a deep generative model that models the distribution of the system output(s). The second, called the student model, is a shallow basis function model fed by the system input(s) to predict the system output(s). These two isolated paths must therefore reach the same ultimate target. Because deep models perform well on highly nonlinear systems, aligning the representation spaces learned by the two models lets the student model inherit the teacher model's approximation power. The proposed objective function is the sum of the student's and the teacher's individual objectives plus a distance penalty between their learned latent representations. Simulation results on three nonlinear benchmarks show performance comparable to that of the deep architectures examined on the same benchmarks. Algorithmic transparency and structural efficiency are also achieved as byproducts.
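The three-part objective described in the abstract can be illustrated with a minimal sketch. The abstract does not specify the distance measure, the teacher's generative objective, or how the terms are weighted, so everything below is an assumption made for illustration: a Gaussian basis for the student's latent representation, mean squared error as a stand-in for both model objectives, a mean squared distance between latents, and a hypothetical weight `lam`. All function names are hypothetical as well.

```python
import math


def gaussian_basis(u, centers, width):
    """Student latent (assumed form): one Gaussian activation per center for a scalar input u."""
    return [math.exp(-((u - c) ** 2) / (2.0 * width ** 2)) for c in centers]


def mean_squared(a, b):
    """Mean squared difference between two equal-length sequences."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def joint_objective(y, y_student, y_teacher, z_student, z_teacher, lam=1.0):
    """Student objective + teacher objective + lam * latent distance penalty."""
    student_loss = mean_squared(y, y_student)
    # Stand-in only: the paper's teacher is a generative model with its own objective.
    teacher_loss = mean_squared(y, y_teacher)
    # Assumed distance: mean squared difference between the two latent vectors.
    alignment = mean_squared(z_student, z_teacher)
    return student_loss + teacher_loss + lam * alignment


# Toy usage with made-up numbers.
z_s = gaussian_basis(0.0, centers=[-1.0, 0.0, 1.0], width=1.0)
z_t = [0.5, 1.0, 0.5]  # hypothetical teacher latent of the same dimension
print(joint_objective([1.0], [0.9], [1.0], z_s, z_t, lam=0.1))
```

Under these assumptions, only the student (`gaussian_basis` followed by a readout) would be needed at inference time; the teacher terms matter only while the objective is being minimized.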

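Source journal

Information Sciences (Engineering & Technology, Computer Science: Information Systems)

CiteScore: 14.00

Self-citation rate: 17.30%

Articles published: 1322

Review time: 10.4 months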

Journal description: Informatics and Computer Science Intelligent Systems Applications is an esteemed international journal that focuses on publishing original and creative research findings in the field of information sciences. We also feature a limited number of timely tutorial and surveying contributions. Our journal aims to cater to a diverse audience, including researchers, developers, managers, strategic planners, graduate students, and anyone interested in staying up-to-date with cutting-edge research in information science, knowledge engineering, and intelligent systems. While readers are expected to share a common interest in information science, they come from varying backgrounds such as engineering, mathematics, statistics, physics, computer science, cell biology, molecular biology, management science, cognitive science, neurobiology, behavioral sciences, and biochemistry.