Traffic assignment is a core component of many urban transport planning tools. It is used to determine how traffic is distributed over a transportation network. We study the task of computing traffic assignments for public transport: given a public transit network, a timetable, vehicle capacities, and a demand (i.e., a list of passengers, each with an associated origin, destination, and departure time), the goal is to predict the resulting passenger flow and the corresponding load of each vehicle. Microscopic stochastic simulation of individual passengers is a standard but computationally expensive approach. Briem et al. (2017) have shown that a clever adaptation of the Connection Scan Algorithm (CSA) can lead to highly efficient traffic assignment algorithms, but their approach ignores vehicle capacities, resulting in overcrowded vehicles. Taking their work as a starting point, we propose a new and extended model that guarantees capacity-feasible assignments and incorporates dynamic network congestion effects such as crowded vehicles, denied boarding, and dwell time delays. Moreover, we incorporate learning and adaptation of individual passengers based on their experience with the network. Applications include studying the evolution of perceived travel times as a result of adaptation, the impact of an increase in capacity, or network effects due to changes in the timetable such as the addition or removal of a service or a whole line. The proposed framework has been experimentally evaluated with the public transport networks of Göttingen and Stuttgart (Germany). The simulation proves to be highly efficient: on a standard PC, the computation of a traffic assignment takes just a few seconds per simulation day.
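For readers unfamiliar with the Connection Scan Algorithm mentioned above, the following is a minimal sketch of a basic earliest-arrival query, not the capacity-aware assignment model of the paper. The `Connection` type, the function name, and the toy timetable are illustrative assumptions; transfer times, footpaths, and vehicle capacities are deliberately left out.

```python
from dataclasses import dataclass
from math import inf


@dataclass(frozen=True)
class Connection:
    dep_stop: str
    arr_stop: str
    dep_time: int  # minutes after midnight
    arr_time: int
    trip: str


def earliest_arrival(connections, source, target, start_time):
    """Basic connection scan: one pass over connections sorted by departure."""
    arrival = {source: start_time}
    boarded = set()  # trips the passenger could already be sitting in
    for c in sorted(connections, key=lambda c: c.dep_time):
        reachable = arrival.get(c.dep_stop, inf) <= c.dep_time
        if c.trip in boarded or reachable:
            boarded.add(c.trip)
            if c.arr_time < arrival.get(c.arr_stop, inf):
                arrival[c.arr_stop] = c.arr_time
    return arrival.get(target, inf)


# Hypothetical toy timetable (assumption, not data from the paper).
timetable = [
    Connection("A", "B", 480, 490, "t1"),
    Connection("B", "C", 492, 505, "t1"),
    Connection("A", "C", 485, 520, "t2"),
    Connection("C", "D", 510, 530, "t3"),
]
print(earliest_arrival(timetable, "A", "D", 475))
```

A passenger starting at 475 can ride trip `t1` to C and change to `t3`; one starting at 481 misses `t1`, reaches C too late for `t3`, and never arrives at D in this toy timetable.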
{"title":"Dynamic Traffic Assignment for Public Transport with Vehicle Capacities","authors":"Julian Patzner, Matthias Müller-Hannemann","doi":"arxiv-2408.06308","DOIUrl":"https://doi.org/arxiv-2408.06308","journal":{"name":"arXiv - CS - Discrete Mathematics"},"publicationDate":"2024-08-12"}
Sun-Yuan Hsieh, Hoang-Oanh Le, Van Bang Le, Sheng-Lung Peng
We study a new variant of graph coloring by adding a connectivity constraint. A path in a vertex-colored graph is called conflict-free if there is a color that appears exactly once on its vertices. A connected graph $G$ is said to be strongly conflict-free vertex-connection $k$-colorable if $G$ admits a vertex $k$-coloring such that any two distinct vertices of $G$ are connected by a conflict-free \emph{shortest} path. Among other results, we show that deciding whether a given graph is strongly conflict-free vertex-connection $3$-colorable is NP-complete even when restricted to $3$-colorable graphs with diameter $3$, radius $2$ and domination number $3$, and, assuming the Exponential Time Hypothesis (ETH), cannot be solved in $2^{o(n)}$ time on such restricted input graphs with $n$ vertices. This hardness result is quite strong when compared to the ordinary $3$-COLORING problem: it is known that $3$-COLORING is solvable in polynomial time in graphs with bounded domination number, and, assuming ETH, cannot be solved in $2^{o(\sqrt{n})}$ time in $n$-vertex graphs with diameter $3$ and radius $2$. On the positive side, we point out that a strong conflict-free vertex-connection coloring with minimum color number of a given split graph or a co-bipartite graph can be computed in polynomial time.
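The definitions above can be checked by brute force on small graphs. The sketch below is only an illustration of the definitions, not an algorithm from the paper; the function names and the adjacency-dictionary representation are assumptions, and the running time is exponential in general because all shortest paths are enumerated.

```python
from collections import Counter, deque
from itertools import combinations


def is_conflict_free(path, color):
    """A path is conflict-free if some color occurs exactly once on it."""
    counts = Counter(color[v] for v in path)
    return any(n == 1 for n in counts.values())


def shortest_paths(adj, s, t):
    """Yield every shortest s-t path, using BFS distances from s."""
    dist = {s: 0}
    queue = deque([s])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                queue.append(w)

    def walk_back(path):
        u = path[-1]
        if u == s:
            yield path[::-1]
            return
        for w in adj[u]:
            if dist.get(w) == dist[u] - 1:
                yield from walk_back(path + [w])

    if t in dist:
        yield from walk_back([t])


def is_strong_cfvc_coloring(adj, color):
    """Every pair of distinct vertices needs a conflict-free shortest path."""
    return all(
        any(is_conflict_free(p, color) for p in shortest_paths(adj, s, t))
        for s, t in combinations(adj, 2)
    )


# Path on four vertices 0-1-2-3 (illustrative example).
p4 = {0: {1}, 1: {0, 2}, 2: {1, 3}, 3: {2}}
print(is_strong_cfvc_coloring(p4, {0: 1, 1: 2, 2: 1, 3: 2}))
print(is_strong_cfvc_coloring(p4, {0: 1, 1: 2, 2: 3, 3: 1}))
```

On this path, the proper 2-coloring fails because the only shortest path from 0 to 3 uses each color twice, whereas the 3-coloring 1, 2, 3, 1 satisfies the definition.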
{"title":"The complexity of strong conflict-free vertex-connection $k$-colorability","authors":"Sun-Yuan Hsieh, Hoang-Oanh Le, Van Bang Le, Sheng-Lung Peng","doi":"arxiv-2408.05865","DOIUrl":"https://doi.org/arxiv-2408.05865","journal":{"name":"arXiv - CS - Discrete Mathematics"},"publicationDate":"2024-08-11"}
A string $s$ is called a parameterized square when $s = xy$ for strings $x$ and $y$ that are parameterized equivalent. Kociumaka et al. showed that the number of parameterized squares that are pairwise non-equivalent under parameterized equivalence, in a string of length $n$ that contains $\sigma$ distinct characters, is at most $2 \sigma! n$ [TCS 2016]. In this paper, we show that the maximum number of non-equivalent parameterized squares is less than $\sigma n$, which significantly improves the best-known upper bound by Kociumaka et al.
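To make the definition concrete, here is a brute-force sketch that tests parameterized equivalence and counts non-equivalent parameterized squares in a short string. It assumes that every character is a parameter character (no static characters) and uses the standard previous-occurrence encoding, under which two strings are parameterized equivalent exactly when their encodings coincide. The function names are illustrative, and the counting routine is a naive enumeration, not a method from the paper.

```python
def prev_encoding(s):
    """Replace each character by the distance to its previous occurrence (0 if none)."""
    last = {}
    code = []
    for i, ch in enumerate(s):
        code.append(i - last[ch] if ch in last else 0)
        last[ch] = i
    return tuple(code)


def is_parameterized_square(s):
    """s = xy with |x| = |y| and x, y parameterized equivalent."""
    half, odd = divmod(len(s), 2)
    if not s or odd:
        return False
    return prev_encoding(s[:half]) == prev_encoding(s[half:])


def count_nonequivalent_parameterized_squares(s):
    """Count equivalence classes of substrings that are parameterized squares."""
    classes = set()
    n = len(s)
    for i in range(n):
        for j in range(i + 2, n + 1, 2):
            sub = s[i:j]
            if is_parameterized_square(sub):
                classes.add(prev_encoding(sub))
    return len(classes)


print(is_parameterized_square("abba"))
print(count_nonequivalent_parameterized_squares("abba"))
```

For `"abba"` the squares fall into three classes (represented by `ab`, `bb`, and `abba`), which is below $\sigma n = 2 \cdot 4 = 8$.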
{"title":"On the Number of Non-equivalent Parameterized Squares in a String","authors":"Rikuya Hamai, Kazushi Taketsugu, Yuto Nakashima, Shunsuke Inenaga, Hideo Bannai","doi":"arxiv-2408.04920","DOIUrl":"https://doi.org/arxiv-2408.04920","journal":{"name":"arXiv - CS - Discrete Mathematics"},"publicationDate":"2024-08-09"}
Frequency estimation, a.k.a. histograms, is a workhorse of data analysis, and as such has been thoroughly studied under differential privacy. In particular, computing histograms in the local model of privacy has been the focus of a fruitful recent line of work, and various algorithms have been proposed, achieving the order-optimal $\ell_\infty$ error in the high-privacy (small $\varepsilon$) regime while balancing other considerations such as time- and communication-efficiency. However, to the best of our knowledge, the picture is much less clear when it comes to the medium- or low-privacy regime (large $\varepsilon$), despite its increased relevance in practice. In this paper, we investigate locally private histograms, and the closely related distribution learning task, in this medium-to-low privacy regime, and establish near-tight (and somewhat unexpected) bounds on the $\ell_\infty$ error achievable. Our theoretical findings emerge from a novel analysis, which appears to improve bounds across the board for the locally private histogram problem. We back our theoretical findings by an empirical comparison of existing algorithms in all privacy regimes, to assess their typical performance and behaviour beyond the worst-case setting.
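As a point of reference for the local model, the sketch below implements $k$-ary randomized response, one well-known locally private frequency estimator. It is not claimed to be among the algorithms analysed in the paper; the function names and parameters are assumptions. Each user keeps their value with probability $p = e^{\varepsilon}/(e^{\varepsilon} + k - 1)$ and otherwise reports a uniformly random other value, and the server debiases the observed frequencies.

```python
import math
import random


def randomize(value, k, eps, rng):
    """k-ary randomized response on values 0..k-1 (eps-locally private)."""
    p_keep = math.exp(eps) / (math.exp(eps) + k - 1)
    if rng.random() < p_keep:
        return value
    other = rng.randrange(k - 1)  # uniform over the k-1 other values
    return other if other < value else other + 1


def estimate_frequencies(reports, k, eps):
    """Unbiased frequency estimates from randomized reports."""
    n = len(reports)
    p = math.exp(eps) / (math.exp(eps) + k - 1)  # P(report = true value)
    q = 1.0 / (math.exp(eps) + k - 1)            # P(report = a given other value)
    counts = [0] * k
    for r in reports:
        counts[r] += 1
    return [(c / n - q) / (p - q) for c in counts]


rng = random.Random(0)
true_values = [0] * 600 + [1] * 300 + [2] * 100
reports = [randomize(v, 3, 2.0, rng) for v in true_values]
estimates = estimate_frequencies(reports, 3, 2.0)
print([round(e, 3) for e in estimates])
```

The estimates always sum to 1 and are unbiased, but their variance grows as $\varepsilon$ shrinks, which is the trade-off the $\ell_\infty$ error bounds quantify.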
{"title":"Locally Private Histograms in All Privacy Regimes","authors":"Clément L. Canonne, Abigail Gentle","doi":"arxiv-2408.04888","DOIUrl":"https://doi.org/arxiv-2408.04888","journal":{"name":"arXiv - CS - Discrete Mathematics"},"publicationDate":"2024-08-09"}
Daniel W. Cranston, Moritz Mühlenthaler, Benjamin Peyrille
The problem \textsc{Token Jumping} asks whether, given a graph $G$ and two independent sets of \emph{tokens} $I$ and $J$ of $G$, we can transform $I$ into $J$ by changing the position of a single token in each step and having an independent set of tokens throughout. We show that there is a polynomial-time algorithm that, given an instance of \textsc{Token Jumping}, computes an equivalent instance of size $O(g^2 + gk + k^2)$, where $g$ is the genus of the input graph and $k$ is the size of the independent sets.
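The problem statement can be illustrated with a brute-force search over token configurations. This is only a sketch of the problem definition on tiny graphs, with exponential running time; it is unrelated to the kernelization described in the abstract, and the function names are assumptions.

```python
from collections import deque
from itertools import combinations


def is_independent(adj, vertices):
    """No two chosen vertices are adjacent."""
    return all(v not in adj[u] for u, v in combinations(vertices, 2))


def token_jumping(adj, start, goal):
    """BFS over independent sets; one token jumps to any free vertex per step."""
    start, goal = frozenset(start), frozenset(goal)
    if len(start) != len(goal):
        return False
    if not (is_independent(adj, start) and is_independent(adj, goal)):
        return False
    seen = {start}
    queue = deque([start])
    while queue:
        cur = queue.popleft()
        if cur == goal:
            return True
        for u in cur:
            for v in adj:
                if v in cur:
                    continue
                nxt = (cur - {u}) | {v}
                if nxt not in seen and is_independent(adj, nxt):
                    seen.add(nxt)
                    queue.append(nxt)
    return False


path4 = {0: {1}, 1: {0, 2}, 2: {1, 3}, 3: {2}}
cycle4 = {0: {1, 3}, 1: {0, 2}, 2: {1, 3}, 3: {0, 2}}
print(token_jumping(path4, {0, 2}, {1, 3}))
print(token_jumping(cycle4, {0, 2}, {1, 3}))
```

On the 4-vertex path the tokens can be moved via the intermediate set {0, 3}; on the 4-cycle both tokens are frozen, so the instance is a no-instance.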
{"title":"A simple quadratic kernel for Token Jumping on surfaces","authors":"Daniel W. Cranston, Moritz Mühlenthaler, Benjamin Peyrille","doi":"arxiv-2408.04743","DOIUrl":"https://doi.org/arxiv-2408.04743","journal":{"name":"arXiv - CS - Discrete Mathematics"},"publicationDate":"2024-08-08"}
Oksana Firman, Grzegorz Gutowski, Myroslav Kryven, Yuto Okada, Alexander Wolff
The treewidth is a structural parameter that measures the tree-likeness of a graph. Many algorithmic and combinatorial results are expressed in terms of the treewidth. In this paper, we study the treewidth of outer $k$-planar graphs, that is, graphs that admit a straight-line drawing where all the vertices lie on a circle, and every edge is crossed by at most $k$ other edges. Wood and Telle [New York J. Math., 2007] showed that every outer $k$-planar graph has treewidth at most $3k + 11$ using so-called planar decompositions, and later, Auer et al. [Algorithmica, 2016] proved that the treewidth of outer $1$-planar graphs is at most $3$, which is tight. In this paper, we improve the general upper bound to $1.5k + 2$ and give a tight bound of $4$ for $k = 2$. We also establish a lower bound: we show that, for every even $k$, there is an outer $k$-planar graph with treewidth $k+2$. Our new bound immediately implies a better bound on the cop number, which answers an open question of Durocher et al. [GD 2023] in the affirmative. Our treewidth bound relies on a new and simple triangulation method for outer $k$-planar graphs that yields few crossings with graph edges per edge of the triangulation. Our method also enables us to obtain a tight upper bound of $k + 2$ for the separation number of outer $k$-planar graphs, improving an upper bound of $2k + 3$ by Chaplick et al. [GD 2017]. We also consider outer min-$k$-planar graphs, a generalization of outer $k$-planar graphs, where we achieve smaller improvements.
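The definition of outer $k$-planarity is easy to test for a fixed circular drawing. The sketch below assumes the vertices are labelled $0, \dots, n-1$ in their order around the circle, so two edges cross exactly when their endpoints interleave; it reports the maximum number of crossings on any edge. The function names are illustrative, and the code checks one given drawing rather than searching for the best vertex order.

```python
from itertools import combinations


def crosses(e, f):
    """Chords of a circle cross iff they share no endpoint and interleave."""
    a, b = sorted(e)
    c, d = sorted(f)
    if len({a, b, c, d}) < 4:
        return False
    return (a < c < b) != (a < d < b)


def max_crossings_per_edge(edges):
    """Smallest k for which this circular drawing is outer k-planar."""
    return max((sum(crosses(e, f) for f in edges) for e in edges), default=0)


k4 = list(combinations(range(4), 2))
k5 = list(combinations(range(5), 2))
print(max_crossings_per_edge(k4), max_crossings_per_edge(k5))
```

Drawn this way, $K_4$ is outer $1$-planar and $K_5$ is outer $2$-planar; their treewidths are $3$ and $4$, which match the tight bounds for $k = 1$ and $k = 2$ quoted in the abstract.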
{"title":"Bounding the Treewidth of Outer $k$-Planar Graphs via Triangulations","authors":"Oksana Firman, Grzegorz Gutowski, Myroslav Kryven, Yuto Okada, Alexander Wolff","doi":"arxiv-2408.04264","DOIUrl":"https://doi.org/arxiv-2408.04264","journal":{"name":"arXiv - CS - Discrete Mathematics"},"publicationDate":"2024-08-08"}
We study the problem of maximizing a monotone submodular function subject to a matroid constraint, and present for it a deterministic non-oblivious local search algorithm that has an approximation guarantee of $1 - 1/e - \varepsilon$ (for any $\varepsilon > 0$) and query complexity of $\tilde{O}_\varepsilon(nr)$, where $n$ is the size of the ground set and $r$ is the rank of the matroid. Our algorithm vastly improves over the previous state-of-the-art $0.5008$-approximation deterministic algorithm, and in fact, shows that there is no separation between the approximation guarantees that can be obtained by deterministic and randomized algorithms for the problem considered. The query complexity of our algorithm can be improved to $\tilde{O}_\varepsilon(n + r\sqrt{n})$ using randomization, which is nearly-linear for $r = O(\sqrt{n})$, and is always at least as good as the previous state-of-the-art algorithms.
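To illustrate the problem setting, the sketch below runs the classical greedy algorithm on a coverage function (monotone and submodular) under a partition matroid. This is the textbook baseline with a $1/2$ approximation guarantee for matroid constraints, not the non-oblivious local search of the paper; the instance and all names are hypothetical.

```python
def coverage(sets, chosen):
    """Number of ground elements covered by the chosen sets."""
    covered = set()
    for i in chosen:
        covered |= sets[i]
    return len(covered)


def greedy_partition_matroid(sets, part_of, limits):
    """Repeatedly add the feasible set with the largest positive marginal gain."""
    chosen = []
    used = {p: 0 for p in limits}
    while True:
        base = coverage(sets, chosen)
        best, best_gain = None, 0
        for i in sets:
            if i in chosen or used[part_of[i]] >= limits[part_of[i]]:
                continue  # already taken, or its part is full
            gain = coverage(sets, chosen + [i]) - base
            if gain > best_gain:
                best, best_gain = i, gain
        if best is None:
            return chosen
        chosen.append(best)
        used[part_of[best]] += 1


# Hypothetical instance: at most one set from part X and one from part Y.
sets = {"a": {1, 2, 3}, "b": {3, 4}, "c": {4, 5, 6, 7}, "d": {1}}
part_of = {"a": "X", "b": "X", "c": "Y", "d": "Y"}
limits = {"X": 1, "Y": 1}
solution = greedy_partition_matroid(sets, part_of, limits)
print(solution, coverage(sets, solution))
```

Each pass evaluates the function once per remaining element, so this greedy makes $O(nr)$ value queries, the same order as the deterministic query complexity stated in the abstract up to the hidden factors.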
{"title":"Deterministic Algorithm and Faster Algorithm for Submodular Maximization subject to a Matroid Constraint","authors":"Niv Buchbinder, Moran Feldman","doi":"arxiv-2408.03583","DOIUrl":"https://doi.org/arxiv-2408.03583","journal":{"name":"arXiv - CS - Discrete Mathematics"},"publicationDate":"2024-08-07"}
The coloring problem is a well-researched topic and its complexity is known for several classes of graphs. However, the question of its complexity remains open for the class of antiprismatic graphs, which are the complements of prismatic graphs and one of the four remaining cases highlighted by Lozin and Malishev. In this article we focus on the equivalent question of the complexity of the clique cover problem in prismatic graphs. A graph $G$ is prismatic if for every triangle $T$ of $G$, every vertex of $G$ not in $T$ has a unique neighbor in $T$. A graph is co-bridge-free if it has no $C_4+2K_1$ as an induced subgraph. We give a polynomial-time algorithm that solves the clique cover problem in co-bridge-free prismatic graphs. It relies on the structural description given by Chudnovsky and Seymour, and on later work of Preissmann, Robin and Trotignon. We show that co-bridge-free prismatic graphs have a bounded number of disjoint triangles, which implies that the algorithm presented by Preissmann et al. applies.
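The definition of a prismatic graph translates directly into a brute-force test. The sketch below is an illustration of the definition only (cubic-time triangle enumeration, hypothetical function name), not part of the clique cover algorithm.

```python
from itertools import combinations


def is_prismatic(adj):
    """Every vertex outside a triangle has exactly one neighbor in that triangle."""
    for a, b, c in combinations(adj, 3):
        if b in adj[a] and c in adj[a] and c in adj[b]:
            triangle = {a, b, c}
            for v in adj:
                if v not in triangle and len(adj[v] & triangle) != 1:
                    return False
    return True


# The prism: triangles {0,1,2} and {3,4,5} joined by a perfect matching.
prism = {
    0: {1, 2, 3}, 1: {0, 2, 4}, 2: {0, 1, 5},
    3: {4, 5, 0}, 4: {3, 5, 1}, 5: {3, 4, 2},
}
k4 = {v: {u for u in range(4) if u != v} for v in range(4)}
print(is_prismatic(prism), is_prismatic(k4))
```

The prism satisfies the condition, while $K_4$ does not, since the vertex outside any triangle is adjacent to all three of its vertices. Triangle-free graphs are prismatic vacuously.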
{"title":"Coloring bridge-free antiprismatic graphs","authors":"Cléophée Robin, Eileen Robinson","doi":"arxiv-2408.01328","DOIUrl":"https://doi.org/arxiv-2408.01328","journal":{"name":"arXiv - CS - Discrete Mathematics"},"publicationDate":"2024-08-02"}
Multivariate decision trees are powerful machine learning tools for classification and regression that attract many researchers and industry professionals. An optimal binary tree has two types of vertices, (i) branching vertices which have exactly two children and where datapoints are assessed on a set of discrete features and (ii) leaf vertices at which datapoints are given a prediction, and can be obtained by solving a biobjective optimization problem that seeks to (i) maximize the number of correctly classified datapoints and (ii) minimize the number of branching vertices. Branching vertices are linear combinations of training features and therefore can be thought of as hyperplanes. In this paper, we propose two cut-based mixed integer linear optimization (MILO) formulations for designing optimal binary classification trees (leaf vertices assign discrete classes). Our models leverage on-the-fly identification of minimal infeasible subsystems (MISs) from which we derive cutting planes that hold the form of packing constraints. We show theoretical improvements on the strongest flow-based MILO formulation currently in the literature and conduct experiments on publicly available datasets to show our models' ability to scale, strength against traditional branch and bound approaches, and robustness in out-of-sample test performance. Our code and data are available on GitHub.
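As a small illustration of how a multivariate tree classifies a datapoint, the sketch below routes a point through hyperplane tests of the form $w \cdot x \le b$ until it reaches a leaf. The class names, the left/right convention, and the example tree are assumptions made for illustration; the sketch covers prediction only and says nothing about how the MILO formulations train the tree.

```python
from dataclasses import dataclass
from typing import Sequence, Union


@dataclass
class Leaf:
    label: int


@dataclass
class Branch:
    weights: Sequence[float]  # hyperplane normal w
    threshold: float          # offset b
    left: "Node"              # taken when w . x <= b
    right: "Node"             # taken otherwise


Node = Union[Leaf, Branch]


def predict(node, x):
    """Follow hyperplane tests from the root down to a leaf."""
    while isinstance(node, Branch):
        score = sum(w * xi for w, xi in zip(node.weights, x))
        node = node.left if score <= node.threshold else node.right
    return node.label


# Hypothetical tree with two branching vertices and three leaves.
tree = Branch([1.0, 1.0], 1.0, Leaf(0),
              Branch([1.0, -1.0], 0.0, Leaf(1), Leaf(2)))
print(predict(tree, [0.2, 0.3]), predict(tree, [2.0, 3.0]), predict(tree, [3.0, 1.0]))
```

A univariate tree is the special case in which each weight vector has a single non-zero entry.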
{"title":"Optimal Mixed Integer Linear Optimization Trained Multivariate Classification Trees","authors":"Brandon Alston, Illya V. Hicks","doi":"arxiv-2408.01297","DOIUrl":"https://doi.org/arxiv-2408.01297","journal":{"name":"arXiv - CS - Discrete Mathematics"},"publicationDate":"2024-08-02"}
Bachtiar Herdianto, Romain Billot, Flavien Lucas, Marc Sevaux
We propose a metaheuristic algorithm enhanced with feature-based guidance that is designed to solve the Capacitated Vehicle Routing Problem (CVRP). To formulate the proposed guidance, we developed and explained a supervised Machine Learning (ML) model that is used to control the diversity of the solution during the optimization process. To implement the guidance, the metaheuristic combines neighborhood search with a novel mechanism of hybrid split and path relinking. The proposed guidance gives a statistically significant improvement to the proposed metaheuristic algorithm when solving CVRP. Moreover, the guided metaheuristic is capable of producing solutions that are competitive with state-of-the-art metaheuristic algorithms.
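For readers new to the CVRP, the sketch below evaluates a candidate solution: every customer must be visited exactly once, each route starts and ends at the depot, and the total demand on a route may not exceed the vehicle capacity. It only illustrates the problem definition with Euclidean distances; the instance and function names are hypothetical, and nothing here reflects the guidance, split, or path relinking mechanisms of the paper.

```python
import math


def route_length(route, coords, depot=0):
    """Length of depot -> customers in order -> depot."""
    stops = [depot, *route, depot]
    return sum(math.dist(coords[a], coords[b]) for a, b in zip(stops, stops[1:]))


def evaluate(routes, coords, demand, capacity, depot=0):
    """Return (feasible, total_length) for a list of routes."""
    visited = sorted(c for r in routes for c in r)
    customers = sorted(set(coords) - {depot})
    feasible = visited == customers and all(
        sum(demand[c] for c in r) <= capacity for r in routes
    )
    return feasible, sum(route_length(r, coords, depot) for r in routes)


# Hypothetical instance: depot 0 and three customers with demand 2 each.
coords = {0: (0, 0), 1: (0, 3), 2: (4, 3), 3: (4, 0)}
demand = {1: 2, 2: 2, 3: 2}
print(evaluate([[1, 2], [3]], coords, demand, capacity=4))
print(evaluate([[1, 2, 3]], coords, demand, capacity=4))
```

The single-route solution is shorter but overloads the vehicle, which is the tension between distance and capacity that CVRP heuristics have to manage.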
{"title":"Metaheuristic Enhanced with Feature-Based Guidance and Diversity Management for Solving the Capacitated Vehicle Routing Problem","authors":"Bachtiar Herdianto, Romain Billot, Flavien Lucas, Marc Sevaux","doi":"arxiv-2407.20777","DOIUrl":"https://doi.org/arxiv-2407.20777","journal":{"name":"arXiv - CS - Discrete Mathematics"},"publicationDate":"2024-07-30"}