Overparameterization: A Connection Between Software 1.0 and Software 2.0

Michael Carbin
{"title":"Overparameterization: A Connection Between Software 1.0 and Software 2.0","authors":"Michael Carbin","doi":"10.4230/LIPIcs.SNAPL.2019.1","DOIUrl":null,"url":null,"abstract":"A new ecosystem of machine-learning driven applications, titled Software 2.0, has arisen that integrates neural networks into a variety of computational tasks. Such applications include image recognition, natural language processing, and other traditional machine learning tasks. However, these techniques have also grown to include other structured domains, such as program analysis and program optimization for which novel, domain-specific insights mate with model design. In this paper, we connect the world of Software 2.0 with that of traditional software - Software 1.0 - through overparameterization: a program may provide more computational capacity and precision than is necessary for the task at hand. \nIn Software 2.0, overparamterization - when a machine learning model has more parameters than datapoints in the dataset - arises as a contemporary understanding of the ability for modern, gradient-based learning methods to learn models over complex datasets with high-accuracy. Specifically, the more parameters a model has, the better it learns. \nIn Software 1.0, the results of the approximate computing community show that traditional software is also overparameterized in that software often simply computes results that are more precise than is required by the user. Approximate computing exploits this overparameterization to improve performance by eliminating unnecessary, excess computation. For example, one - of many techniques - is to reduce the precision of arithmetic in the application. \nIn this paper, we argue that the gap between available precision and that that is required for either Software 1.0 or Software 2.0 is a fundamental aspect of software design that illustrates the balance between software designed for general-purposes and domain-adapted solutions. A general-purpose solution is easier to develop and maintain versus a domain-adapted solution. However, that ease comes at the expense of performance. \nWe show that the approximate computing community and the machine learning community have developed overlapping techniques to improve performance by reducing overparameterization. We also show that because of these shared techniques, questions, concerns, and answers on how to construct software can translate from one software variant to the other.","PeriodicalId":231548,"journal":{"name":"Summit on Advances in Programming Languages","volume":null,"pages":null},"PeriodicalIF":0.0000,"publicationDate":"1900-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"5","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Summit on Advances in Programming Languages","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.4230/LIPIcs.SNAPL.2019.1","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 5

Abstract

A new ecosystem of machine-learning-driven applications, titled Software 2.0, has arisen that integrates neural networks into a variety of computational tasks. Such applications include image recognition, natural language processing, and other traditional machine learning tasks. However, these techniques have also grown to include other structured domains, such as program analysis and program optimization, for which novel, domain-specific insights mate with model design. In this paper, we connect the world of Software 2.0 with that of traditional software - Software 1.0 - through overparameterization: a program may provide more computational capacity and precision than is necessary for the task at hand.

In Software 2.0, overparameterization - when a machine learning model has more parameters than there are datapoints in the dataset - has arisen as the contemporary understanding of the ability of modern, gradient-based learning methods to learn high-accuracy models over complex datasets. Specifically, the more parameters a model has, the better it learns.

In Software 1.0, results from the approximate computing community show that traditional software is also overparameterized, in that software often computes results that are more precise than the user requires. Approximate computing exploits this overparameterization to improve performance by eliminating unnecessary, excess computation. For example, one technique - of many - is to reduce the precision of arithmetic in the application (see the first sketch below).

In this paper, we argue that the gap between the precision that is available and the precision that is required, for either Software 1.0 or Software 2.0, is a fundamental aspect of software design that illustrates the balance between general-purpose software and domain-adapted solutions. A general-purpose solution is easier to develop and maintain than a domain-adapted solution, but that ease comes at the expense of performance.

We show that the approximate computing community and the machine learning community have developed overlapping techniques to improve performance by reducing overparameterization (the second sketch below illustrates one such technique). We also show that, because of these shared techniques, questions, concerns, and answers about how to construct software can translate from one software variant to the other.
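As a concrete illustration of the reduced-precision technique the abstract mentions, the following Python sketch compares a dot product computed at full float64 precision against the same kernel at float16. This is a minimal sketch under assumed inputs, not code from the paper; the kernel choice and vector sizes are illustrative.

```python
import numpy as np

# Illustrative inputs; the kernel (a dot product) and sizes are assumptions.
rng = np.random.default_rng(0)
x = rng.standard_normal(10_000)
y = rng.standard_normal(10_000)

# Baseline: full float64 precision.
exact = np.dot(x, y)

# Approximate variant: the same computation at reduced (float16) precision.
approx = np.dot(x.astype(np.float16), y.astype(np.float16))

print(f"exact  = {exact:.6f}")
print(f"approx = {float(approx):.6f}")
print(f"relative error = {abs(exact - float(approx)) / abs(exact):.2e}")
```

On hardware with native half-precision support, the reduced-precision variant also halves memory traffic and can use cheaper arithmetic units; here only the accuracy cost of discarding the excess precision is visible.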
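One technique shared by both communities is pruning: removing the excess parameters of an overparameterized model, much as approximate computing removes excess computation. The sketch below illustrates magnitude pruning with numpy; the layer shape and the 90% sparsity level are hypothetical choices for illustration, not values from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-in for a trained layer's weight matrix (shape is hypothetical).
weights = rng.standard_normal((256, 256))

# Magnitude pruning: zero out the smallest-magnitude weights.
sparsity = 0.90  # fraction of weights to remove; an illustrative choice
threshold = np.quantile(np.abs(weights), sparsity)
mask = np.abs(weights) >= threshold
pruned = weights * mask

print(f"nonzero weights kept: {mask.mean():.1%}")
# Sparse storage and kernels can then skip the zeroed entries entirely,
# trading excess parameters for performance.
```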