{"title":"Partitioned least squares","authors":"Roberto Esposito, Mattia Cerrato, Marco Locatelli","doi":"10.1007/s10994-024-06582-3","DOIUrl":null,"url":null,"abstract":"<p>Linear least squares is one of the most widely used regression methods in many fields. The simplicity of the model allows this method to be used when data is scarce and allows practitioners to gather some insight into the problem by inspecting the values of the learnt parameters. In this paper we propose a variant of the linear least squares model allowing practitioners to partition the input features into groups of variables that they require to contribute similarly to the final result. We show that the new formulation is not convex and provide two alternative methods to deal with the problem: one non-exact method based on an alternating least squares approach; and one exact method based on a reformulation of the problem. We show the correctness of the exact method and compare the two solutions showing that the exact solution provides better results in a fraction of the time required by the alternating least squares solution (when the number of partitions is small). We also provide a branch and bound algorithm that can be used in place of the exact method when the number of partitions is too large as well as a proof of NP-completeness of the optimization problem.</p>","PeriodicalId":49900,"journal":{"name":"Machine Learning","volume":"73 1","pages":""},"PeriodicalIF":4.3000,"publicationDate":"2024-07-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Machine Learning","FirstCategoryId":"94","ListUrlMain":"https://doi.org/10.1007/s10994-024-06582-3","RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q2","JCRName":"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE","Score":null,"Total":0}
引用次数: 0
Abstract
Linear least squares is one of the most widely used regression methods in many fields. The simplicity of the model allows this method to be used when data is scarce and allows practitioners to gather some insight into the problem by inspecting the values of the learnt parameters. In this paper we propose a variant of the linear least squares model allowing practitioners to partition the input features into groups of variables that they require to contribute similarly to the final result. We show that the new formulation is not convex and provide two alternative methods to deal with the problem: one non-exact method based on an alternating least squares approach; and one exact method based on a reformulation of the problem. We show the correctness of the exact method and compare the two solutions showing that the exact solution provides better results in a fraction of the time required by the alternating least squares solution (when the number of partitions is small). We also provide a branch and bound algorithm that can be used in place of the exact method when the number of partitions is too large as well as a proof of NP-completeness of the optimization problem.
期刊介绍:
Machine Learning serves as a global platform dedicated to computational approaches in learning. The journal reports substantial findings on diverse learning methods applied to various problems, offering support through empirical studies, theoretical analysis, or connections to psychological phenomena. It demonstrates the application of learning methods to solve significant problems and aims to enhance the conduct of machine learning research with a focus on verifiable and replicable evidence in published papers.