The approach to data-driven optimization described in this paper was developed when the authors were part of an IBM project team working with the U.S. Department of Energy, Pacific Northwest National Laboratory, and various energy utility partners on an initiative to develop a smart energy distribution infrastructure. Within this broader scope, and based on data collected in initial controlled experiments, the paper specifically addresses the design and optimization of real-time price incentives that manage consumers' electricity demand and determine the energy capacity to be provisioned by the utility. The latter problem fits into the well-known price-setting newsvendor framework, and our goal was to replace the simplistic methods in the literature with more realistic data-driven methods that account for the data-collection capabilities and modeling complexity of real-world applications. Our aspirations for the paper are (1) to introduce data-driven, distribution-free approaches to decision-making problems and (2) to motivate scalable conditional value-at-risk (CVaR) regression-based approaches for these problems.
{"title":"A Prescriptive Machine-Learning Framework to the Price-Setting Newsvendor Problem","authors":"P. Harsha, R. Natarajan, D. Subramanian","doi":"10.1287/IJOO.2019.0046","DOIUrl":"https://doi.org/10.1287/IJOO.2019.0046","url":null,"abstract":"The approach to data-driven optimization described in this paper was developed when the authors were part of an IBM project team working with the U.S. Department of Energy, Pacific National Laboratory, and various energy utility partners on an initiative to develop a smart energy distribution infrastructure. Within this broader scope and based on the data collected in some initial controlled experiments, the paper specifically addresses the design and optimization of real-time price incentives to consumers to manage their electricity demand and determine the energy capacity to be provisioned by the utility. This latter problem fits into the well-known price-setting newsvendor problem framework, and our goal was to replace the simplistic methods in the literature by more realistic data-driven methods to take into account the data-collection capabilities and the modeling complexity of real-world applications. Our aspirations for the paper are (1) to introduce data-driven, distribution-free approaches to decision-making problems and (2) to motivate scalable conditional value-at-risk regression-based approaches for these problems.","PeriodicalId":73382,"journal":{"name":"INFORMS journal on optimization","volume":" ","pages":""},"PeriodicalIF":0.0,"publicationDate":"2021-04-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"42932227","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
In this paper, we consider minimization of a difference-of-convex (DC) function with and without linear equality constraints. We first study a smooth approximation of a generic DC function, termed difference-of-Moreau-envelopes (DME) smoothing, in which both components of the DC function are replaced by their respective Moreau envelopes. The resulting smooth approximation is shown to be Lipschitz differentiable, to capture the stationary points and the local and global minima of the original DC function, and to enjoy growth conditions, such as level-boundedness and coercivity, for broad classes of DC functions. For a smoothed DC program without linear constraints, it is shown that the classic gradient descent method and an inexact variant converge to a stationary solution of the original DC function in the limit with a rate of [Formula: see text], where K is the number of proximal evaluations of both components. Furthermore, when the DC program is explicitly constrained to an affine subspace, we combine the smoothing technique with the augmented Lagrangian function and derive two variants of the augmented Lagrangian method (ALM), named linearly constrained DC (LCDC)-ALM and composite LCDC-ALM, targeting different structures of the DC objective function. We show that both algorithms find an ϵ-approximate stationary solution of the original DC program in [Formula: see text] iterations. Compared with existing methods designed for linearly constrained weakly convex minimization, the proposed ALM-based algorithms can be applied to a broader class of problems in which the objective contains a nonsmooth concave component. Finally, numerical experiments are presented to demonstrate the performance of the proposed algorithms. Funding: This work was partially supported by the NSF [Grant ECCS1751747]. Supplemental Material: The e-companion is available at https://doi.org/10.1287/ijoo.2022.0087.
{"title":"Algorithms for Difference-of-Convex Programs Based on Difference-of-Moreau-Envelopes Smoothing","authors":"Kaizhao Sun, X. Sun","doi":"10.1287/ijoo.2022.0087","DOIUrl":"https://doi.org/10.1287/ijoo.2022.0087","url":null,"abstract":"In this paper, we consider minimization of a difference-of-convex (DC) function with and without linear equality constraints. We first study a smooth approximation of a generic DC function, termed difference-of-Moreau-envelopes (DME) smoothing, where both components of the DC function are replaced by their respective Moreau envelopes. The resulting smooth approximation is shown to be Lipschitz differentiable, capture stationary points, local, and global minima of the original DC function, and enjoy some growth conditions, such as level-boundedness and coercivity, for broad classes of DC functions. For a smoothed DC program without linear constraints, it is shown that the classic gradient descent method and an inexact variant converge to a stationary solution of the original DC function in the limit with a rate of [Formula: see text], where K is the number of proximal evaluations of both components. Furthermore, when the DC program is explicitly constrained in an affine subspace, we combine the smoothing technique with the augmented Lagrangian function and derive two variants of the augmented Lagrangian method (ALM), named linearly constrained DC (LCDC)-ALM and composite LCDC-ALM, targeting on different structures of the DC objective function. We show that both algorithms find an ϵ-approximate stationary solution of the original DC program in [Formula: see text] iterations. Comparing to existing methods designed for linearly constrained weakly convex minimization, the proposed ALM-based algorithms can be applied to a broader class of problems, where the objective contains a nonsmooth concave component. Finally, numerical experiments are presented to demonstrate the performance of the proposed algorithms. Funding: This work was partially supported by the NSF [Grant ECCS1751747]. Supplemental Material: The e-companion is available at https://doi.org/10.1287/ijoo.2022.0087 .","PeriodicalId":73382,"journal":{"name":"INFORMS journal on optimization","volume":" ","pages":""},"PeriodicalIF":0.0,"publicationDate":"2021-04-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"42942001","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Recently, several classes of cutting planes have been introduced for binary polynomial optimization. In this paper, we present the first results connecting the combinatorial structure of these inequalities with their Chvátal rank. We determine the Chvátal rank of all known cutting planes and show that almost all of them have Chvátal rank 1. We observe that these inequalities have an associated hypergraph that is β-acyclic. Our second goal is to derive deeper cutting planes; to do so, we consider hypergraphs that admit β-cycles. We introduce a novel class of valid inequalities arising from odd β-cycles, which generally have Chvátal rank 2. These inequalities allow us to obtain the first characterization of the multilinear polytope for hypergraphs that contain β-cycles. Namely, we show that the multilinear polytope for cycle hypergraphs is given by the standard linearization inequalities, the flower inequalities, and the odd β-cycle inequalities. We also prove that odd β-cycle inequalities can be separated in linear time when the hypergraph is a cycle hypergraph. This shows that instances represented by cycle hypergraphs can be solved in polynomial time. Finally, to test the strength of the odd β-cycle inequalities, we perform numerical experiments showing that they close a significant percentage of the integrality gap.
{"title":"Chvátal Rank in Binary Polynomial Optimization","authors":"Alberto Del Pia, S. Di Gregorio","doi":"10.1287/IJOO.2019.0049","DOIUrl":"https://doi.org/10.1287/IJOO.2019.0049","url":null,"abstract":"Recently, several classes of cutting planes have been introduced for binary polynomial optimization. In this paper, we present the first results connecting the combinatorial structure of these inequalities with their Chvátal rank. We determine the Chvátal rank of all known cutting planes and show that almost all of them have Chvátal rank 1. We observe that these inequalities have an associated hypergraph that is β-acyclic. Our second goal is to derive deeper cutting planes; to do so, we consider hypergraphs that admit β-cycles. We introduce a novel class of valid inequalities arising from odd β-cycles, that generally have Chvátal rank 2. These inequalities allow us to obtain the first characterization of the multilinear polytope for hypergraphs that contain β-cycles. Namely, we show that the multilinear polytope for cycle hypergraphs is given by the standard linearization inequalities, flower inequalities, and odd β-cycle inequalities. We also prove that odd β-cycle inequalities can be separated in linear time when the hypergraph is a cycle hypergraph. This shows that instances represented by cycle hypergraphs can be solved in polynomial time. Last, to test the strength of odd β-cycle inequalities, we perform numerical experiments that imply that they close a significant percentage of the integrality gap.","PeriodicalId":73382,"journal":{"name":"INFORMS journal on optimization","volume":" ","pages":""},"PeriodicalIF":0.0,"publicationDate":"2021-03-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"48197054","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Summary of Contribution: This article was inspired by price formation changes recently proposed and implemented in several U.S. wholesale electricity markets. The analysis draws from and contributes to three lines of literature. First, the paper specifies two mechanisms that lead to inefficient and inconsistent prices in real-world markets. Second, the article illustrates the importance of considering uncertainty when evaluating policies for pricing in nonconvex markets and observes that convex hull pricing, sometimes described as an "ideal" because of its uplift-minimizing property in deterministic analyses, can perform poorly in settings with uncertainty. Lastly, the paper strengthens the theoretical basis for operating reserve demand curves by connecting their parameterization to outcomes expected in efficient stochastic markets.
{"title":"Quasi-Stochastic Electricity Markets","authors":"J. Mays","doi":"10.1287/ijoo.2021.0051","DOIUrl":"https://doi.org/10.1287/ijoo.2021.0051","url":null,"abstract":"Summary of Contribution This article was inspired by price formation changes recently proposed and implemented in several U.S. wholesale electricity markets. The analysis draws from and contributes to three lines of literature. First, the paper specifies two mechanisms that lead to inefficient and inconsistent prices in real-world markets. Second, the article illustrates the importance of considering uncertainty in evaluating policies for pricing in nonconvex markets and observes that convex hull pricing, sometimes described as an ?ideal? due to its uplift-minimizing property in deterministic analyses, can perform poorly in settings with uncertainty. Lastly, the paper strengthens the theoretical basis for operating reserve demand curves by connecting their parameterization to outcomes expected in efficient stochastic markets.","PeriodicalId":73382,"journal":{"name":"INFORMS journal on optimization","volume":"1 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2021-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"66363413","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2021-01-01. Epub Date: 2020-11-03. DOI: 10.1287/ijoo.2019.0040
Hasan Manzour, Simge Küçükyavuz, Hao-Hsiang Wu, Ali Shojaie
Learning directed acyclic graphs (DAGs) from data is a challenging task both in theory and in practice, because the number of possible DAGs scales superexponentially with the number of nodes. In this paper, we study the problem of learning an optimal DAG from continuous observational data. We cast this problem in the form of a mathematical programming model that can naturally incorporate a superstructure to reduce the set of possible candidate DAGs. We use a negative log-likelihood score function with both ℓ0 and ℓ1 penalties and propose a new mixed-integer quadratic program, referred to as a layered network (LN) formulation. The LN formulation is a compact model that enjoys as tight an optimal continuous relaxation value as the stronger but larger formulations under a mild condition. Computational results indicate that the proposed formulation outperforms existing mathematical formulations and scales better than available algorithms that can solve the same problem with only ℓ1 regularization. In particular, the LN formulation clearly outperforms existing methods in terms of the computational time needed to find an optimal DAG in the presence of a sparse superstructure.
{"title":"Integer Programming for Learning Directed Acyclic Graphs from Continuous Data.","authors":"Hasan Manzour, Simge Küçükyavuz, Hao-Hsiang Wu, Ali Shojaie","doi":"10.1287/ijoo.2019.0040","DOIUrl":"10.1287/ijoo.2019.0040","url":null,"abstract":"<p><p>Learning directed acyclic graphs (DAGs) from data is a challenging task both in theory and in practice, because the number of possible DAGs scales superexponentially with the number of nodes. In this paper, we study the problem of learning an optimal DAG from continuous observational data. We cast this problem in the form of a mathematical programming model that can naturally incorporate a superstructure to reduce the set of possible candidate DAGs. We use a negative log-likelihood score function with both <math> <mrow><msub><mi>ℓ</mi> <mn>0</mn></msub> </mrow> </math> and <math> <mrow><msub><mi>ℓ</mi> <mn>1</mn></msub> </mrow> </math> penalties and propose a new mixed-integer quadratic program, referred to as a layered network (LN) formulation. The LN formulation is a compact model that enjoys as tight an optimal continuous relaxation value as the stronger but larger formulations under a mild condition. Computational results indicate that the proposed formulation outperforms existing mathematical formulations and scales better than available algorithms that can solve the same problem with only <math> <mrow><msub><mi>ℓ</mi> <mn>1</mn></msub> </mrow> </math> regularization. In particular, the LN formulation clearly outperforms existing methods in terms of computational time needed to find an optimal DAG in the presence of a sparse superstructure.</p>","PeriodicalId":73382,"journal":{"name":"INFORMS journal on optimization","volume":"3 1","pages":"46-73"},"PeriodicalIF":0.0,"publicationDate":"2021-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10088505/pdf/nihms-1885648.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"9314757","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
The separability of clusters is one of the most desired properties in clustering. There is a wide range of settings in which different clusterings of the same data set appear. We are interested in applications in which there is a need for an explicit, gradual transition of one separable clustering into another. This transition should be a sequence of simple, natural steps that uphold separability of the clusters throughout. We design an algorithm for such a transition. We exploit the intimate connection between separability and linear programming over bounded-shape partition and transportation polytopes: separable clusterings lie on the boundary of partition polytopes and form a subset of the vertices of the corresponding transportation polytopes, and circuits of both polytopes are readily interpreted as sequential or cyclical exchanges of items between clusters. This allows for a natural approach to achieving the desired transition through a combination of two walks: an edge walk between two so-called radial clusterings in a transportation polytope, computed through an adaptation of classical tools of sensitivity analysis and parametric programming, and a walk from a separable clustering to a corresponding radial clustering, computed through a tailored, iterative routine that updates cluster sizes and reoptimizes the cluster assignment of items.
{"title":"An Algorithm for the Separation-Preserving Transition of Clusterings","authors":"S. Borgwardt, Felix Happach, Stetson Zirkelbach","doi":"10.1287/ijoo.2022.0074","DOIUrl":"https://doi.org/10.1287/ijoo.2022.0074","url":null,"abstract":"The separability of clusters is one of the most desired properties in clustering. There is a wide range of settings in which different clusterings of the same data set appear. We are interested in applications for which there is a need for an explicit, gradual transition of one separable clustering into another one. This transition should be a sequence of simple, natural steps that upholds separability of the clusters throughout. We design an algorithm for such a transition. We exploit the intimate connection of separability and linear programming over bounded-shape partition and transportation polytopes: separable clusterings lie on the boundary of partition polytopes and form a subset of the vertices of the corresponding transportation polytopes, and circuits of both polytopes are readily interpreted as sequential or cyclical exchanges of items between clusters. This allows for a natural approach to achieve the desired transition through a combination of two walks: an edge walk between two so-called radial clusterings in a transportation polytope, computed through an adaptation of classical tools of sensitivity analysis and parametric programming, and a walk from a separable clustering to a corresponding radial clustering, computed through a tailored, iterative routine updating cluster sizes and reoptimizing the cluster assignment of items.","PeriodicalId":73382,"journal":{"name":"INFORMS journal on optimization","volume":" ","pages":""},"PeriodicalIF":0.0,"publicationDate":"2020-12-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"44211183","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}