Recently, AI systems have made remarkable progress in various tasks. Deep Reinforcement Learning (DRL) is an effective tool for agents to learn policies in low-level state spaces and solve highly complex tasks. Researchers have introduced Intrinsic Motivation (IM) into the RL mechanism, which simulates the agent's curiosity and encourages agents to explore interesting areas of the environment. This feature has proved vital in enabling agents to learn policies without being given specific goals. However, even though DRL intelligence emerges through a sub-symbolic model, some form of abstraction is still needed to understand the knowledge collected by the agent. To this end, recent research has used the classical planning formalism to explicitly represent the knowledge an autonomous agent acquires and to effectively reach extrinsic goals. Although classical planning usually offers limited expressive capabilities, PPDDL has proved useful for reviewing the knowledge gathered by an autonomous system and making causal correlations explicit, and it can be exploited to find a plan to reach any state the agent encounters during its experience. This work presents a new architecture implementing an open-ended learning system able to synthesize its experience into a PPDDL representation from scratch and update it over time. Without a predefined set of goals and tasks, the system integrates intrinsic motivations to explore the environment in a self-directed way, exploiting the high-level knowledge acquired during its experience. The system explores the environment and iteratively (a) discovers options, (b) explores the environment using options, (c) abstracts the knowledge collected, and (d) plans. This paper thus proposes an alternative approach to implementing open-ended learning architectures that exploit low-level and high-level representations to extend their knowledge in a virtuous loop.
{"title":"Synthesizing Evolving Symbolic Representations for Autonomous Systems","authors":"Gabriele Sartor, Angelo Oddi, Riccardo Rasconi, Vieri Giuliano Santucci, Rosa Meo","doi":"arxiv-2409.11756","DOIUrl":"https://doi.org/arxiv-2409.11756","url":null,"abstract":"Recently, AI systems have made remarkable progress in various tasks. Deep\u0000Reinforcement Learning(DRL) is an effective tool for agents to learn policies\u0000in low-level state spaces to solve highly complex tasks. Researchers have\u0000introduced Intrinsic Motivation(IM) to the RL mechanism, which simulates the\u0000agent's curiosity, encouraging agents to explore interesting areas of the\u0000environment. This new feature has proved vital in enabling agents to learn\u0000policies without being given specific goals. However, even though DRL\u0000intelligence emerges through a sub-symbolic model, there is still a need for a\u0000sort of abstraction to understand the knowledge collected by the agent. To this\u0000end, the classical planning formalism has been used in recent research to\u0000explicitly represent the knowledge an autonomous agent acquires and effectively\u0000reach extrinsic goals. Despite classical planning usually presents limited\u0000expressive capabilities, PPDDL demonstrated usefulness in reviewing the\u0000knowledge gathered by an autonomous system, making explicit causal\u0000correlations, and can be exploited to find a plan to reach any state the agent\u0000faces during its experience. This work presents a new architecture implementing\u0000an open-ended learning system able to synthesize from scratch its experience\u0000into a PPDDL representation and update it over time. Without a predefined set\u0000of goals and tasks, the system integrates intrinsic motivations to explore the\u0000environment in a self-directed way, exploiting the high-level knowledge\u0000acquired during its experience. The system explores the environment and\u0000iteratively: (a) discover options, (b) explore the environment using options,\u0000(c) abstract the knowledge collected and (d) plan. This paper proposes an\u0000alternative approach to implementing open-ended learning architectures\u0000exploiting low-level and high-level representations to extend its knowledge in\u0000a virtuous loop.","PeriodicalId":501033,"journal":{"name":"arXiv - CS - Symbolic Computation","volume":"18 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-09-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142253478","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
LMNtal is a programming and modeling language based on hierarchical graph rewriting that uses logical variables to represent connectivity and membranes to represent hierarchy. On the theoretical side, it allows logical interpretation based on intuitionistic linear logic; on the practical side, its full-fledged implementation supports a graph-based parallel model checker and has been used to model diverse applications, including various computational models. This paper discusses how we extend LMNtal to QLMNtal (LMNtal with Quantification) to further enhance the usefulness of hierarchical graph rewriting for high-level modeling by introducing quantifiers into rewriting as well as matching. These quantifiers allow us to express universal quantification, cardinality, and non-existence in an integrated manner. Unlike other attempts to introduce quantifiers into graph rewriting, QLMNtal has term-based syntax, whose semantics is smoothly integrated into the small-step semantics of the base language LMNtal. The proposed constructs allow combined and nested use of quantifiers within individual rewrite rules.
{"title":"Introducing Quantification into a Hierarchical Graph Rewriting Language","authors":"Haruto Mishina, Kazunori Ueda","doi":"arxiv-2409.11015","DOIUrl":"https://doi.org/arxiv-2409.11015","url":null,"abstract":"LMNtal is a programming and modeling language based on hierarchical graph\u0000rewriting that uses logical variables to represent connectivity and membranes\u0000to represent hierarchy. On the theoretical side, it allows logical\u0000interpretation based on intuitionistic linear logic; on the practical side, its\u0000full-fledged implementation supports a graph-based parallel model checker and\u0000has been used to model diverse applications including various computational\u0000models. This paper discuss how we extend LMNtal to QLMNtal (LMNtal with\u0000Quantification) to further enhance the usefulness of hierarchical graph\u0000rewriting for high-level modeling by introducing quantifiers into rewriting as\u0000well as matching. Those quantifiers allows us to express universal\u0000quantification, cardinality and non-existence in an integrated manner. Unlike\u0000other attempts to introduce quantifiers into graph rewriting, QLMNtal has\u0000term-based syntax, whose semantics is smoothly integrated into the small-step\u0000semantics of the base language LMNtal. The proposed constructs allow combined\u0000and nested use of quantifiers within individual rewrite rules.","PeriodicalId":501033,"journal":{"name":"arXiv - CS - Symbolic Computation","volume":"65 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-09-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142253476","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
We present a novel method for symbolic regression (SR), the task of searching for compact programmatic hypotheses that best explain a dataset. The problem is commonly solved using genetic algorithms; we show that we can enhance such methods by inducing a library of abstract textual concepts. Our algorithm, called LaSR, uses zero-shot queries to a large language model (LLM) to discover and evolve concepts occurring in known high-performing hypotheses. We discover new hypotheses using a mix of standard evolutionary steps and LLM-guided steps (obtained through zero-shot LLM queries) conditioned on discovered concepts. Once discovered, hypotheses are used in a new round of concept abstraction and evolution. We validate LaSR on the Feynman equations, a popular SR benchmark, as well as a set of synthetic tasks. On these benchmarks, LaSR substantially outperforms a variety of state-of-the-art SR approaches based on deep learning and evolutionary algorithms. Moreover, we show that LaSR can be used to discover a novel and powerful scaling law for LLMs.
{"title":"Symbolic Regression with a Learned Concept Library","authors":"Arya Grayeli, Atharva Sehgal, Omar Costilla-Reyes, Miles Cranmer, Swarat Chaudhuri","doi":"arxiv-2409.09359","DOIUrl":"https://doi.org/arxiv-2409.09359","url":null,"abstract":"We present a novel method for symbolic regression (SR), the task of searching\u0000for compact programmatic hypotheses that best explain a dataset. The problem is\u0000commonly solved using genetic algorithms; we show that we can enhance such\u0000methods by inducing a library of abstract textual concepts. Our algorithm,\u0000called LaSR, uses zero-shot queries to a large language model (LLM) to discover\u0000and evolve concepts occurring in known high-performing hypotheses. We discover\u0000new hypotheses using a mix of standard evolutionary steps and LLM-guided steps\u0000(obtained through zero-shot LLM queries) conditioned on discovered concepts.\u0000Once discovered, hypotheses are used in a new round of concept abstraction and\u0000evolution. We validate LaSR on the Feynman equations, a popular SR benchmark,\u0000as well as a set of synthetic tasks. On these benchmarks, LaSR substantially\u0000outperforms a variety of state-of-the-art SR approaches based on deep learning\u0000and evolutionary algorithms. Moreover, we show that LaSR can be used to\u0000discover a novel and powerful scaling law for LLMs.","PeriodicalId":501033,"journal":{"name":"arXiv - CS - Symbolic Computation","volume":"100 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-09-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142253480","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Computer algebra systems are really good at factoring polynomials, i.e., writing f as a product of irreducible factors. It is relatively easy to verify that we have a factorisation, but verifying that these factors are irreducible is a much harder problem. This paper reports work in progress on performing such verification in Lean.
{"title":"Towards Verified Polynomial Factorisation","authors":"James H. Davenport","doi":"arxiv-2409.09533","DOIUrl":"https://doi.org/arxiv-2409.09533","url":null,"abstract":"Computer algebra systems are really good at factoring polynomials, i.e.\u0000writing f as a product of irreducible factors. It is relatively easy to verify\u0000that we have a factorisation, but verifying that these factors are irreducible\u0000is a much harder problem. This paper reports work-in-progress to do such\u0000verification in Lean.","PeriodicalId":501033,"journal":{"name":"arXiv - CS - Symbolic Computation","volume":"66 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-09-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142253477","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Discovering Ordinary Differential Equations (ODEs) from trajectory data is a crucial task in AI-driven scientific discovery. Recent methods for symbolic discovery of ODEs primarily rely on fixed training datasets collected a priori, often leading to suboptimal performance, as observed in our experiments in Figure 1. Inspired by active learning, we explore methods for querying informative trajectory data with which to evaluate predicted ODEs, where data are obtained by specifying the initial conditions of the trajectory. Chaos theory indicates that small changes in the initial conditions of a dynamical system can result in vastly different trajectories, necessitating the maintenance of a large set of initial conditions. To address this challenge, we introduce Active Symbolic Discovery of Ordinary Differential Equations via Phase Portrait Sketching (APPS). Instead of directly selecting individual initial conditions, APPS first identifies an informative region and samples a batch of initial conditions within that region. Compared to traditional active learning methods, APPS eliminates the need to maintain a large amount of data. Extensive experiments demonstrate that APPS consistently discovers more accurate ODE expressions than baseline methods using passively collected datasets.
{"title":"Active Symbolic Discovery of Ordinary Differential Equations via Phase Portrait Sketching","authors":"Nan Jiang, Md Nasim, Yexiang Xue","doi":"arxiv-2409.01416","DOIUrl":"https://doi.org/arxiv-2409.01416","url":null,"abstract":"Discovering Ordinary Differential Equations (ODEs) from trajectory data is a\u0000crucial task in AI-driven scientific discovery. Recent methods for symbolic\u0000discovery of ODEs primarily rely on fixed training datasets collected a-priori,\u0000often leading to suboptimal performance, as observed in our experiments in\u0000Figure 1. Inspired by active learning, we explore methods for querying\u0000informative trajectory data to evaluate predicted ODEs, where data are obtained\u0000by the specified initial conditions of the trajectory. Chaos theory indicates\u0000that small changes in the initial conditions of a dynamical system can result\u0000in vastly different trajectories, necessitating the maintenance of a large set\u0000of initial conditions of the trajectory. To address this challenge, we\u0000introduce Active Symbolic Discovery of Ordinary Differential Equations via\u0000Phase Portrait Sketching (APPS). Instead of directly selecting individual\u0000initial conditions, APPS first identifies an informative region and samples a\u0000batch of initial conditions within that region. Compared to traditional active\u0000learning methods, APPS eliminates the need for maintaining a large amount of\u0000data. Extensive experiments demonstrate that APPS consistently discovers more\u0000accurate ODE expressions than baseline methods using passively collected\u0000datasets.","PeriodicalId":501033,"journal":{"name":"arXiv - CS - Symbolic Computation","volume":"96 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-09-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142211406","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
We describe a datalog query evaluation approach based on efficient and composable Boolean matrix manipulation modules. We first define an overarching problem, Boolean Matrix Logic Programming (BMLP), which uses Boolean matrices as an alternative computational representation for evaluating datalog programs. We develop two novel BMLP modules for bottom-up inference on linear dyadic recursive datalog programs, and show how additional modules can extend this capability to compute both linear and non-linear recursive datalog programs of arity two. Our empirical results demonstrate that these modules outperform general-purpose and specialised systems by factors of 30x and 9x, respectively, when evaluating large programs with millions of facts. This Boolean matrix approach significantly enhances the efficiency of datalog querying to support logic programming techniques.
{"title":"Boolean Matrix Logic Programming","authors":"Lun Ai, Stephen H. Muggleton","doi":"arxiv-2408.10369","DOIUrl":"https://doi.org/arxiv-2408.10369","url":null,"abstract":"We describe a datalog query evaluation approach based on efficient and\u0000composable boolean matrix manipulation modules. We first define an overarching\u0000problem, Boolean Matrix Logic Programming (BMLP), which uses boolean matrices\u0000as an alternative computation to evaluate datalog programs. We develop two\u0000novel BMLP modules for bottom-up inferences on linear dyadic recursive datalog\u0000programs, and show how additional modules can extend this capability to compute\u0000both linear and non-linear recursive datalog programs of arity two. Our\u0000empirical results demonstrate that these modules outperform general-purpose and\u0000specialised systems by factors of 30x and 9x, respectively, when evaluating\u0000large programs with millions of facts. This boolean matrix approach\u0000significantly enhances the efficiency of datalog querying to support logic\u0000programming techniques.","PeriodicalId":501033,"journal":{"name":"arXiv - CS - Symbolic Computation","volume":"7 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-08-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142211409","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Resolving the dichotomy between the human-like yet constrained reasoning processes of Cognitive Architectures and the broad but often noisy inference behavior of Large Language Models (LLMs) remains a challenging but exciting pursuit in enabling reliable machine reasoning capabilities in production systems. Because Cognitive Architectures are famously developed for the purpose of modeling the internal mechanisms of human cognitive decision-making at a computational level, new investigations consider the goal of informing LLMs with the knowledge necessary for replicating such processes, e.g., guided perception, memory, goal-setting, and action. Previous approaches that use LLMs for grounded decision-making struggle with complex reasoning tasks that require slower, deliberate cognition over fast and intuitive inference, reporting issues related to insufficient grounding, such as hallucination. To resolve these challenges, we introduce LLM-ACTR, a novel neuro-symbolic architecture that provides human-aligned and versatile decision-making by integrating the ACT-R Cognitive Architecture with LLMs. Our framework extracts and embeds knowledge of ACT-R's internal decision-making process as latent neural representations, injects this information into trainable LLM adapter layers, and fine-tunes the LLMs for downstream prediction. Our experiments on novel Design for Manufacturing tasks show both improved task performance and improved grounded decision-making capability of our approach, compared to LLM-only baselines that leverage chain-of-thought reasoning strategies.
{"title":"Cognitive LLMs: Towards Integrating Cognitive Architectures and Large Language Models for Manufacturing Decision-making","authors":"Siyu Wu, Alessandro Oltramari, Jonathan Francis, C. Lee Giles, Frank E. Ritter","doi":"arxiv-2408.09176","DOIUrl":"https://doi.org/arxiv-2408.09176","url":null,"abstract":"Resolving the dichotomy between the human-like yet constrained reasoning\u0000processes of Cognitive Architectures and the broad but often noisy inference\u0000behavior of Large Language Models (LLMs) remains a challenging but exciting\u0000pursuit, for enabling reliable machine reasoning capabilities in production\u0000systems. Because Cognitive Architectures are famously developed for the purpose\u0000of modeling the internal mechanisms of human cognitive decision-making at a\u0000computational level, new investigations consider the goal of informing LLMs\u0000with the knowledge necessary for replicating such processes, e.g., guided\u0000perception, memory, goal-setting, and action. Previous approaches that use LLMs\u0000for grounded decision-making struggle with complex reasoning tasks that require\u0000slower, deliberate cognition over fast and intuitive inference -- reporting\u0000issues related to the lack of sufficient grounding, as in hallucination. To\u0000resolve these challenges, we introduce LLM-ACTR, a novel neuro-symbolic\u0000architecture that provides human-aligned and versatile decision-making by\u0000integrating the ACT-R Cognitive Architecture with LLMs. Our framework extracts\u0000and embeds knowledge of ACT-R's internal decision-making process as latent\u0000neural representations, injects this information into trainable LLM adapter\u0000layers, and fine-tunes the LLMs for downstream prediction. Our experiments on\u0000novel Design for Manufacturing tasks show both improved task performance as\u0000well as improved grounded decision-making capability of our approach, compared\u0000to LLM-only baselines that leverage chain-of-thought reasoning strategies.","PeriodicalId":501033,"journal":{"name":"arXiv - CS - Symbolic Computation","volume":"322 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-08-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142227915","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
In the realm of event prediction, temporal knowledge graph forecasting (TKGF) stands as a pivotal technique. Previous approaches face the challenges of not utilizing experience during testing and relying on a single short-term history, which limits adaptation to evolving data. In this paper, we introduce the Online Neural-Symbolic Event Prediction (ONSEP) framework, which innovates by integrating dynamic causal rule mining (DCRM) and dual history augmented generation (DHAG). DCRM dynamically constructs causal rules from real-time data, allowing for swift adaptation to new causal relationships. In parallel, DHAG merges short-term and long-term historical contexts, leveraging a bi-branch approach to enrich event prediction. Our framework demonstrates notable performance enhancements across diverse datasets, with significant Hit@k (k=1,3,10) improvements, showcasing its ability to augment large language models (LLMs) for event prediction without necessitating extensive retraining. The ONSEP framework not only advances the field of TKGF but also underscores the potential of neural-symbolic approaches in adapting to dynamic data environments.
{"title":"ONSEP: A Novel Online Neural-Symbolic Framework for Event Prediction Based on Large Language Model","authors":"Xuanqing Yu, Wangtao Sun, Jingwei Li, Kang Liu, Chengbao Liu, Jie Tan","doi":"arxiv-2408.07840","DOIUrl":"https://doi.org/arxiv-2408.07840","url":null,"abstract":"In the realm of event prediction, temporal knowledge graph forecasting (TKGF)\u0000stands as a pivotal technique. Previous approaches face the challenges of not\u0000utilizing experience during testing and relying on a single short-term history,\u0000which limits adaptation to evolving data. In this paper, we introduce the\u0000Online Neural-Symbolic Event Prediction (ONSEP) framework, which innovates by\u0000integrating dynamic causal rule mining (DCRM) and dual history augmented\u0000generation (DHAG). DCRM dynamically constructs causal rules from real-time\u0000data, allowing for swift adaptation to new causal relationships. In parallel,\u0000DHAG merges short-term and long-term historical contexts, leveraging a\u0000bi-branch approach to enrich event prediction. Our framework demonstrates\u0000notable performance enhancements across diverse datasets, with significant\u0000Hit@k (k=1,3,10) improvements, showcasing its ability to augment large language\u0000models (LLMs) for event prediction without necessitating extensive retraining.\u0000The ONSEP framework not only advances the field of TKGF but also underscores\u0000the potential of neural-symbolic approaches in adapting to dynamic data\u0000environments.","PeriodicalId":501033,"journal":{"name":"arXiv - CS - Symbolic Computation","volume":"29 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-08-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142211410","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Convolutional neural networks (CNNs) are a popular choice of model for tasks in computer vision. When CNNs are made with many layers, resulting in a deep neural network, skip connections may be added to create an easier gradient optimization problem while retaining model expressiveness. In this paper, we show that arbitrarily complex, trained, linear CNNs with skip connections can be simplified into a single-layer model, greatly reducing computational requirements at prediction time. We also present a method for training nonlinear models with skip connections that are gradually removed throughout training, giving the benefits of skip connections without requiring computational overhead at prediction time. These results are demonstrated with practical examples on the Residual Network (ResNet) architecture.
{"title":"Algebraic Representations for Faster Predictions in Convolutional Neural Networks","authors":"Johnny Joyce, Jan Verschelde","doi":"arxiv-2408.07815","DOIUrl":"https://doi.org/arxiv-2408.07815","url":null,"abstract":"Convolutional neural networks (CNNs) are a popular choice of model for tasks\u0000in computer vision. When CNNs are made with many layers, resulting in a deep\u0000neural network, skip connections may be added to create an easier gradient\u0000optimization problem while retaining model expressiveness. In this paper, we\u0000show that arbitrarily complex, trained, linear CNNs with skip connections can\u0000be simplified into a single-layer model, resulting in greatly reduced\u0000computational requirements during prediction time. We also present a method for\u0000training nonlinear models with skip connections that are gradually removed\u0000throughout training, giving the benefits of skip connections without requiring\u0000computational overhead during during prediction time. These results are\u0000demonstrated with practical examples on Residual Networks (ResNet)\u0000architecture.","PeriodicalId":501033,"journal":{"name":"arXiv - CS - Symbolic Computation","volume":"7 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-08-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142211408","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
In a broad sense, artificial intelligence is a service for finding solutions to complex intellectual problems. In this sense, the MathPartner service provides artificial intelligence that allows us to formulate questions in a mathematical language and receive answers to them. For mathematicians and physicists today, such a language is LaTeX. The MathPartner service uses a dialect of LaTeX called Mathpar. The service is a cloud-based computer algebra system and gives users the opportunity to solve many mathematical problems. In this publication, we focus only on a small class of extremum problems, which are widely applied in economics, management, logistics, and many engineering fields. In particular, we consider the shortest path problem and discuss an algorithm based on tropical mathematics. The ability to work with many types of classical and tropical algebras, which are freely available to users, is an important distinguishing feature of this intelligent tool for symbolic-numerical calculations. We also consider the use of the simplex algorithm for solving optimization problems.
{"title":"MathPartner: An Artificial Intelligence Cloud Service","authors":"Gennadi Malaschonok, Alexandr Seliverstov","doi":"arxiv-2408.04999","DOIUrl":"https://doi.org/arxiv-2408.04999","url":null,"abstract":"In a broad sense, artificial intelligence is a service to find a solution to\u0000complex intellectual problems. In this sense, the MathPartner service provides\u0000artificial intelligence that allows us to formulate questions and receive\u0000answers to questions formulated in a mathematical language. For mathematicians\u0000and physicists today, such a language is LaTeX. The MathPartner service uses a\u0000dialect of LaTeX, which is called Mathpar. The service is a cloud-based\u0000computer algebra system and provides users with the opportunity to solve many\u0000mathematical problems. In this publication, we focus only on a small class of\u0000extremum problems, which are widely applied in economics, management,\u0000logistics, and in many engineering fields. In particular, we consider the\u0000shortest path problem and discuss an algorithm that is based on the tropical\u0000mathematics. The ability to work with many types of classical and tropical\u0000algebras, which are freely available to users, is an important distinguishing\u0000feature of this intelligent tool for symbolic-numerical calculations. We also\u0000consider the use of the simplex algorithm for solving optimization problems.","PeriodicalId":501033,"journal":{"name":"arXiv - CS - Symbolic Computation","volume":"32 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-08-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141938851","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}