Giacomo Camposampiero, Michael Hersche, Aleksandar Terzić, Roger Wattenhofer, Abu Sebastian, Abbas Rahimi
We introduce the Abductive Rule Learner with Context-awareness (ARLC), a model that solves abstract reasoning tasks based on Learn-VRF. ARLC features a novel and more broadly applicable training objective for abductive reasoning, resulting in better interpretability and higher accuracy when solving Raven's progressive matrices (RPM). ARLC allows both programming domain knowledge and learning the rules underlying a data distribution. We evaluate ARLC on the I-RAVEN dataset, showcasing state-of-the-art accuracy across both in-distribution and out-of-distribution (unseen attribute-rule pairs) tests. ARLC surpasses neuro-symbolic and connectionist baselines, including large language models, despite having orders of magnitude fewer parameters. We show ARLC's robustness to post-programming training by incrementally learning from examples on top of programmed knowledge, which only improves its performance and does not result in catastrophic forgetting of the programmed solution. We validate ARLC's seamless transfer learning from a 2x2 RPM constellation to unseen constellations. Our code is available at https://github.com/IBM/abductive-rule-learner-with-context-awareness.
Title: Towards Learning Abductive Reasoning using VSA Distributed Representations. DOI: https://doi.org/arxiv-2406.19121. Published in arXiv - CS - Symbolic Computation, 2024-06-27.
Ruochen Wang, Si Si, Felix Yu, Dorothea Wiesmann, Cho-Jui Hsieh, Inderjit Dhillon
The trade-off between expressiveness and interpretability remains a core challenge when building human-centric predictive models for classification and decision-making. While symbolic rules offer interpretability, they often lack expressiveness, whereas neural networks excel in performance but are known for being black boxes. In this paper, we show a combination of Large Language Models (LLMs) and symbolic programs can bridge this gap. In the proposed LLM-based Symbolic Programs (LSPs), the pretrained LLM with natural language prompts provides a massive set of interpretable modules that can transform raw input into natural language concepts. Symbolic programs then integrate these modules into an interpretable decision rule. To train LSPs, we develop a divide-and-conquer approach to incrementally build the program from scratch, where the learning process of each step is guided by LLMs. To evaluate the effectiveness of LSPs in extracting interpretable and accurate knowledge from data, we introduce IL-Bench, a collection of diverse tasks, including both synthetic and real-world scenarios across different modalities. Empirical results demonstrate LSP's superior performance compared to traditional neurosymbolic programs and vanilla automatic prompt tuning methods. Moreover, as the knowledge learned by LSP is a combination of natural language descriptions and symbolic rules, it is easily transferable to humans (interpretable), and other LLMs, and generalizes well to out-of-distribution samples.
Title: Large Language Models are Interpretable Learners. DOI: https://doi.org/arxiv-2406.17224. Published in arXiv - CS - Symbolic Computation, 2024-06-25.
In this work, we prove over 3000 previously ATP-unproved Mizar/MPTP problems by using several ATP and AI methods, raising the share of ATP-solved Mizar problems from 75% to above 80%. First, we experiment with the cvc5 SMT solver, which uses several instantiation-based heuristics that differ from the superposition-based systems previously applied to Mizar, and this adds many new solutions. Then we use automated strategy invention to develop cvc5 strategies that substantially improve cvc5's performance on the hard problems. In particular, the best invented strategy solves over 14% more problems than the best previously available cvc5 strategy. We also show that different clausification methods have a high impact on such instantiation-based methods, again producing many new solutions. In total, the methods solve 3021 (21.3%) of the 14163 previously unsolved hard Mizar problems. This is a new milestone on the Mizar large-theory benchmark and a large strengthening of the hammer methods for Mizar.
Title: Solving Hard Mizar Problems with Instantiation and Strategy Invention. Authors: Jan Jakubův, Mikoláš Janota, Josef Urban. DOI: https://doi.org/arxiv-2406.17762. Published in arXiv - CS - Symbolic Computation, 2024-06-25.
MaxSAT modulo theories (MaxSMT) is an important generalization of satisfiability modulo theories (SMT) with various applications. In this paper, we focus on MaxSMT with the background theory of Linear Integer Arithmetic, denoted as MaxSMT(LIA). We design the first local search algorithm for MaxSMT(LIA), called PairLS, based on the following novel ideas. A novel operator, called the pairwise operator, is proposed for integer variables. It extends the original local search operator by operating on two variables simultaneously, which enriches the search space. Moreover, a compensation-based picking heuristic is proposed to determine and distinguish the pairwise operations. Experiments are conducted to evaluate our algorithm on massive benchmarks. The results show that our solver is competitive with state-of-the-art MaxSMT solvers. Furthermore, we also apply the pairwise operation to enhance the local search algorithm of SMT, which shows its extensibility.
Title: A Local Search Algorithm for MaxSMT(LIA). Authors: Xiang He, Bohan Li, Mengyu Zhao, Shaowei Cai. DOI: https://doi.org/arxiv-2406.15782. Published in arXiv - CS - Symbolic Computation, 2024-06-22.
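The idea of a move that changes two integer variables at once, as described in the MaxSMT(LIA) abstract above, can be illustrated with a toy example. The sketch below is not PairLS: the constraint encoding, the scoring function (number of satisfied soft constraints) and the names `SOFT`, `single_moves`, `pairwise_moves` and `best_neighbor` are all assumptions made for illustration. It only shows why a two-variable move can escape an assignment where every single-variable move makes things worse.

```python
from itertools import combinations

# Each soft constraint is (coeffs, bound), meaning sum(coeffs[v] * x[v]) <= bound.
# This toy instance is an illustrative assumption, not taken from the paper.
SOFT = [
    ({"x": 1, "y": 1}, 4),      # x + y <= 4
    ({"x": -1, "y": -1}, -4),   # x + y >= 4
    ({"x": 1, "y": -1}, 0),     # x - y <= 0
    ({"x": -1}, -1),            # x >= 1
]

def satisfied(assign, constraints=SOFT):
    """Count the soft constraints satisfied by an integer assignment."""
    return sum(
        sum(c * assign[v] for v, c in coeffs.items()) <= bound
        for coeffs, bound in constraints
    )

def single_moves(assign, deltas=(-1, 1)):
    """Neighbours obtained by shifting one variable."""
    for v in assign:
        for d in deltas:
            yield {**assign, v: assign[v] + d}

def pairwise_moves(assign, deltas=(-1, 1)):
    """Neighbours obtained by shifting two variables in the same step."""
    for u, v in combinations(assign, 2):
        for du in deltas:
            for dv in deltas:
                yield {**assign, u: assign[u] + du, v: assign[v] + dv}

def best_neighbor(assign, moves):
    """Pick the neighbour that satisfies the most soft constraints."""
    return max(moves(assign), key=satisfied)

start = {"x": 3, "y": 1}
print(satisfied(start))                                  # 3
print(satisfied(best_neighbor(start, single_moves)))     # 2
print(best_neighbor(start, pairwise_moves))              # {'x': 2, 'y': 2}
```

From `x = 3, y = 1` every single-variable step breaks one of the two constraints that pin `x + y` to 4, while the paired step `x - 1, y + 1` keeps the sum fixed and satisfies all four constraints.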
Formal verification has emerged as a promising method to ensure the safety and reliability of neural networks. Naively verifying a safety property amounts to ensuring the safety of a neural network for the whole input space irrespective of any training or test set. However, this also implies that the safety of the neural network is checked even for inputs that do not occur in the real world and have no meaning at all, often resulting in spurious errors. To tackle this shortcoming, we propose the VeriFlow architecture as a flow-based density model tailored to allow any verification approach to restrict its search to some data distribution of interest. We argue that our architecture is particularly well suited for this purpose because of two major properties. First, we show that the transformation and log-density function that are defined by our model are piece-wise affine. Therefore, the model allows the usage of verifiers based on SMT with linear arithmetic. Second, upper density level sets (UDL) of the data distribution take the shape of an $L^p$-ball in the latent space. As a consequence, representations of UDLs specified by a given probability are effectively computable in latent space. This allows for SMT and abstract interpretation approaches with fine-grained, probabilistically interpretable control over how (a)typical the inputs subject to verification are.
Title: VeriFlow: Modeling Distributions for Neural Network Verification. Authors: Faried Abu Zaid, Daniel Neider, Mustafa Yalçıner. DOI: https://doi.org/arxiv-2406.14265. Published in arXiv - CS - Symbolic Computation, 2024-06-20.
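To make the statement about upper density level sets in the VeriFlow abstract above more concrete, here is a minimal latent-space sketch. It assumes, purely for illustration, a latent distribution with i.i.d. standard Laplace coordinates, whose density is proportional to exp(-||z||_1), so its upper density level sets are exactly L^1 balls. The abstract does not say which base distribution or which p the paper uses, and the function names are hypothetical. The radius for a target probability mass is estimated by Monte Carlo rather than in closed form.

```python
import random

def sample_laplace(dim, rng):
    """Draw one point with i.i.d. standard Laplace coordinates."""
    # A standard Laplace variate is an Exp(1) magnitude with a random sign.
    return [rng.expovariate(1.0) * rng.choice((-1.0, 1.0)) for _ in range(dim)]

def l1_norm(z):
    """L^1 norm of a latent point."""
    return sum(abs(c) for c in z)

def udl_radius(dim, mass, n_samples=20000, seed=0):
    """Estimate r so that the ball {z : ||z||_1 <= r} holds about `mass` probability."""
    rng = random.Random(seed)
    norms = sorted(l1_norm(sample_laplace(dim, rng)) for _ in range(n_samples))
    index = min(n_samples - 1, int(mass * n_samples))
    return norms[index]

def in_udl(z, radius):
    """Membership test for the estimated level set (assumed L^1 ball)."""
    return l1_norm(z) <= radius

radius = udl_radius(dim=2, mass=0.9)
print(radius, in_udl([0.5, -0.5], radius))
```

Under these assumptions a verifier would only need the single linear-arithmetic constraint `||z||_1 <= radius` in latent space to restrict its search to the 90% most typical inputs.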
There has been a growing need to devise processes that can create comprehensive datasets in the world of Computer Algebra, both for accurate benchmarking and for new intersections with machine learning technology. We present here a method to generate integrands that are guaranteed to be integrable, dubbed the LIOUVILLE method. It is based on Liouville's theorem and the Parallel Risch Algorithm for symbolic integration. We show that this data generation method retains the best qualities of previous data generation methods, while overcoming some of the issues built into that prior work. The LIOUVILLE generator is able to generate sufficiently complex and realistic integrands, and could be used for benchmarking or machine learning training tasks related to symbolic integration.
Title: The Liouville Generator for Producing Integrable Expressions. Authors: Rashid Barket, Matthew England, Jürgen Gerhard. DOI: https://doi.org/arxiv-2406.11631. Published in arXiv - CS - Symbolic Computation, 2024-06-17.
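For orientation, the form of Liouville's theorem that underlies the abstract above can be written out as follows. This is the standard textbook statement in our own notation, not the paper's. The worked instance simply picks the parts on the right-hand side and differentiates, which is one plausible reading of "guaranteed to be integrable"; the actual generation procedure, including its use of the Parallel Risch Algorithm, may differ.

```latex
% Liouville's theorem (standard statement; the notation is an assumption).
% F is a differential field of characteristic 0 and f is an element of F.
% If f has an integral in an elementary extension of F with the same
% field of constants, then there are v_0, ..., v_n in F and constants
% c_1, ..., c_n such that
\[
  f \;=\; v_0' + \sum_{i=1}^{n} c_i \,\frac{v_i'}{v_i},
  \qquad\text{so that}\qquad
  \int f \;=\; v_0 + \sum_{i=1}^{n} c_i \log v_i .
\]
% Worked instance over F = \mathbb{Q}(x):
% choose v_0 = x^2, c_1 = 3, v_1 = x^2 + 1. Then
\[
  f \;=\; 2x + \frac{6x}{x^2+1},
  \qquad
  \int f \,dx \;=\; x^2 + 3\log(x^2+1).
\]
```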
Wolfgang Stammer, Antonia Wüst, David Steinmann, Kristian Kersting
The challenge in object-based visual reasoning lies in generating descriptive yet distinct concept representations. Moreover, doing this in an unsupervised fashion requires human users to understand a model's learned concepts and potentially revise false concepts. In addressing this challenge, we introduce the Neural Concept Binder, a new framework for deriving discrete concept representations resulting in what we term "concept-slot encodings". These encodings leverage both "soft binding" via object-centric block-slot encodings and "hard binding" via retrieval-based inference. The Neural Concept Binder facilitates straightforward concept inspection and direct integration of external knowledge, such as human input or insights from other AI models like GPT-4. Additionally, we demonstrate that incorporating the hard binding mechanism does not compromise performance; instead, it enables seamless integration into both neural and symbolic modules for intricate reasoning tasks, as evidenced by evaluations on our newly introduced CLEVR-Sudoku dataset.
Title: Neural Concept Binder. DOI: https://doi.org/arxiv-2406.09949. Published in arXiv - CS - Symbolic Computation, 2024-06-14.
Symbolic mathematical computing systems have served as a canary in the coal mine of software systems for more than sixty years. They have introduced or have been early adopters of programming language ideas such as dynamic memory management, arbitrary precision arithmetic and dependent types. These systems have the feature of being highly complex while at the same time operating in a domain where results are well-defined and clearly verifiable. These software systems span multiple layers of abstraction with concerns ranging from instruction scheduling and cache pressure up to algorithmic complexity of constructions in algebraic geometry. All of the major symbolic mathematical computing systems include low-level code for arithmetic, memory management and other primitives, a compiler or interpreter for a bespoke programming language, a library of high level mathematical algorithms, and some form of user interface. Each of these parts invokes multiple deep issues. We present some lessons learned from this environment and free flowing opinions on topics including:
* Portability of software across architectures and decades;
* Infrastructure to embrace and infrastructure to avoid;
* Choosing base abstractions upon which to build;
* How to get the most out of a small code base;
* How developments in compilers both to optimise and to validate code have always been and remain of critical importance, with plenty of remaining challenges;
* The way in which individuals who can span from hand-crafting Z80 machine code up to the most abstruse high level code analysis techniques, in particular Alan Mycroft, are needed; and
* Why it is important to teach full-stack thinking to the next generation.
Title: A Symbolic Computing Perspective on Software Systems. Authors: Arthur C. Norman, Stephen M. Watt. DOI: https://doi.org/arxiv-2406.09085. Published in arXiv - CS - Symbolic Computation, 2024-06-13.
Yongtuo Liu, Sara Magliacane, Miltiadis Kofinas, Efstratios Gavves
Hybrid dynamical systems are prevalent in science and engineering to express complex systems with continuous and discrete states. To learn the laws of systems, all previous methods for equation discovery in hybrid systems follow a two-stage paradigm, i.e. they first group time series into small cluster fragments and then discover equations in each fragment separately through methods in non-hybrid systems. Although effective, these methods do not fully take advantage of the commonalities in the shared dynamics of multiple fragments that are driven by the same equations. Besides, the two-stage paradigm breaks the interdependence between categorizing and representing dynamics that jointly form hybrid systems. In this paper, we reformulate the problem and propose an end-to-end learning framework, Amortized Equation Discovery (AMORE), to jointly categorize modes and discover the equations that characterize the dynamics of each mode using all segments of that mode. Experiments on four hybrid and six non-hybrid systems show that our method outperforms previous methods on equation discovery, segmentation, and forecasting.
Title: Amortized Equation Discovery in Hybrid Dynamical Systems. DOI: https://doi.org/arxiv-2406.03818. Published in arXiv - CS - Symbolic Computation, 2024-06-06.
Interpretable mathematical expressions defining discrete-time dynamical systems (iterated maps) can model many phenomena of scientific interest, enabling a deeper understanding of system behaviors. Since formulating governing expressions from first principles can be difficult, it is of particular interest to identify expressions for iterated maps given only their data streams. In this work, we consider a modified Symbolic Artificial Neural Network-Trained Expressions (SymANNTEx) architecture for this task, an architecture more expressive than others in the literature. We make a modification to the model pipeline to optimize the regression, then characterize the behavior of the adjusted model in identifying several classical chaotic maps. With the goal of parsimony, sparsity-inducing weight regularization and information theory-informed simplification are implemented. We show that our modified SymANNTEx model properly identifies single-state maps and achieves moderate success in approximating a dual-state attractor. These results offer significant promise for data-driven scientific discovery and interpretation.
Title: Expressive Symbolic Regression for Interpretable Models of Discrete-Time Dynamical Systems. Authors: Adarsh Iyer, Nibodh Boddupalli, Jeff Moehlis. DOI: https://doi.org/arxiv-2406.06585. Published in arXiv - CS - Symbolic Computation, 2024-06-05.
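The task in the abstract above, recovering the expression of an iterated map from its data stream alone, can be illustrated on the logistic map. The sketch below is not SymANNTEx: it swaps in a much simpler technique, ordinary least squares over a fixed library `[1, x, x^2]` followed by pruning of small coefficients, as a stand-in for the paper's network and its sparsity-inducing regularization. The map parameter, the library, the threshold and all function names are assumptions.

```python
def logistic_orbit(r=3.9, x0=0.2, n=200):
    """Generate a data stream from the logistic map x[k+1] = r * x[k] * (1 - x[k])."""
    xs = [x0]
    for _ in range(n):
        xs.append(r * xs[-1] * (1.0 - xs[-1]))
    return xs

def solve_linear(a, b):
    """Solve a small dense system by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda i: abs(m[i][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for row in range(col + 1, n):
            factor = m[row][col] / m[col][col]
            for k in range(col, n + 1):
                m[row][k] -= factor * m[col][k]
    x = [0.0] * n
    for row in reversed(range(n)):
        tail = sum(m[row][k] * x[k] for k in range(row + 1, n))
        x[row] = (m[row][n] - tail) / m[row][row]
    return x

def fit_map(xs, threshold=1e-3):
    """Least-squares fit of x[k+1] on the library [1, x, x^2], then prune small terms."""
    feats = [[1.0, x, x * x] for x in xs[:-1]]
    targets = xs[1:]
    # Normal equations: (F^T F) c = F^T y.
    gram = [[sum(f[i] * f[j] for f in feats) for j in range(3)] for i in range(3)]
    rhs = [sum(f[i] * t for f, t in zip(feats, targets)) for i in range(3)]
    coeffs = solve_linear(gram, rhs)
    return [c if abs(c) > threshold else 0.0 for c in coeffs]

coeffs = fit_map(logistic_orbit())
print(coeffs)  # approximately [0.0, 3.9, -3.9], i.e. x[k+1] = 3.9*x - 3.9*x^2
```

Because the true map is linear in these library terms, plain regression is enough here. Maps whose expressions are not linear in a fixed library are where a more expressive architecture, such as the one the abstract describes, would be needed.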