Time-Series Forecasting, Knowledge Distillation, and Refinement within a Multimodal PDE Foundation Model

arXiv - CS - Machine Learning, Pub Date: 2024-09-17, DOI: arxiv-2409.11609

Derek Jollie, Jingmin Sun, Zecheng Zhang, Hayden Schaeffer



Abstract

Symbolic encoding has been used in multi-operator learning as a way to embed additional information for distinct time-series data. For spatiotemporal systems described by time-dependent partial differential equations, the equation itself provides an additional modality to identify the system. Using symbolic expressions alongside time-series samples enables the development of multimodal predictive neural networks. A key challenge with current approaches is that the symbolic information, i.e., the equations, must be manually preprocessed (simplified, rearranged, etc.) to match the existing token library, which increases costs and reduces flexibility, especially when dealing with new differential equations. We propose a new token library based on SymPy to encode differential equations as an additional modality for time-series models. The proposed approach incurs minimal cost, is automated, and maintains high prediction accuracy for forecasting tasks. Additionally, we include a Bayesian filtering module that connects the different modalities to refine the learned equation. This improves the accuracy of both the learned symbolic representation and the predicted time series.
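To make the idea of a SymPy-based token library concrete, the following is a minimal sketch of how a differential equation might be flattened into a token sequence. It is not the authors' token library: the function name `prefix_tokens`, the prefix-order traversal with arity tokens, and the choice of the viscous Burgers equation as input are all assumptions made for illustration.

```python
import sympy as sp
from sympy.core.function import AppliedUndef


def prefix_tokens(expr):
    """Flatten a SymPy expression tree into prefix-order string tokens.

    Hypothetical helper for illustration only; the paper's actual
    token library may differ.
    """
    if isinstance(expr, sp.Derivative):
        # Emit the differentiated expression, then each (variable, order) pair.
        tokens = ["Derivative"] + prefix_tokens(expr.expr)
        for var, order in expr.variable_count:
            tokens += [str(var), str(order)]
        return tokens
    if isinstance(expr, AppliedUndef):
        # Unknown function such as u(t, x): keep only its name.
        return [expr.func.__name__]
    if not expr.args:
        # Leaf node: symbol or number.
        return [str(expr)]
    # Operator node: emit its name and arity, then recurse into the children.
    tokens = [type(expr).__name__, str(len(expr.args))]
    for arg in expr.args:
        tokens += prefix_tokens(arg)
    return tokens


t, x, nu = sp.symbols("t x nu")
u = sp.Function("u")(t, x)

# Left-hand side of the viscous Burgers equation: u_t + u u_x - nu u_xx = 0
burgers = sp.Derivative(u, t) + u * sp.Derivative(u, x) - nu * sp.Derivative(u, x, 2)

tokens = prefix_tokens(burgers)
print(tokens)
```

Because SymPy parses and canonicalizes the expression itself, no manual rearrangement is needed before tokenization. The order of terms in the printed sequence follows SymPy's internal argument ordering.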


