Multiserver jobs, which occupy multiple servers simultaneously during service, are prevalent in today’s computing clusters, yet little is known about the delay performance of systems that serve them. We consider queueing models for multiserver jobs in scaling regimes where the system load becomes heavy while the total number of servers in the system and the number of servers that a job needs become large. Prior work has derived upper bounds on the queueing probability in this scaling regime. However, without proper lower bounds, the existing results cannot be used to differentiate between policies. In this paper, we study the delay performance by establishing sharp bounds on the steady-state mean waiting time of multiserver jobs, where the waiting time of a job is the time spent queueing rather than in service. We first characterize the exact order of the mean waiting time under the first come, first serve (FCFS) policy. Then, we prove a lower bound on the mean waiting time that holds for all policies and that has an order gap with the mean waiting time under FCFS. We show that the lower bound is achievable by a priority policy that we call smallest need first (SNF). Funding: This research was supported in part by the National Science Foundation [Grant ECCS-2145713]. Supplemental Material: The online appendix is available at https://doi.org/10.1287/stsy.2023.0006.
{"title":"Sharp Waiting-Time Bounds for Multiserver Jobs","authors":"Yige Hong, Weina Wang","doi":"10.1287/stsy.2023.0006","DOIUrl":"https://doi.org/10.1287/stsy.2023.0006","journal":{"name":"Stochastic Systems","volume":"81 1","pages":""},"publicationDate":"2024-08-26","publicationTypes":"Journal Article"}
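The contrast between FCFS and a smallest-need-first priority rule in the multiserver-job abstract above can be illustrated with a small sketch. It is not the authors' model: the abstract does not say whether SNF is preemptive or how ties are broken, so the one-shot, non-preemptive admission step below, the `Job` tuple layout, and the function names `fcfs_start` and `snf_start` are all assumptions made for illustration.

```python
from typing import List, Tuple

# Hypothetical job representation: (arrival_index, servers_needed).
Job = Tuple[int, int]


def fcfs_start(queue: List[Job], free_servers: int) -> List[Job]:
    """Admit jobs in arrival order; stop at the first job that does not fit."""
    started = []
    for job in queue:
        if job[1] > free_servers:
            break  # head-of-line blocking: later jobs must wait too
        started.append(job)
        free_servers -= job[1]
    return started


def snf_start(queue: List[Job], free_servers: int) -> List[Job]:
    """Admit jobs in increasing order of server need (ties by arrival order)."""
    started = []
    for job in sorted(queue, key=lambda j: (j[1], j[0])):
        if job[1] > free_servers:
            break  # every remaining job needs at least as many servers
        started.append(job)
        free_servers -= job[1]
    return started


if __name__ == "__main__":
    waiting = [(0, 6), (1, 2), (2, 3), (3, 1)]
    print("FCFS starts:", fcfs_start(waiting, free_servers=5))
    print("SNF starts:", snf_start(waiting, free_servers=5))
```

In this toy instance the job at the head of the FCFS queue needs more servers than are free, so FCFS starts nothing, while the smallest-need ordering fills some of the idle servers. The sketch only shows the ordering rule; it says nothing about the waiting-time bounds proved in the paper.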
This paper studies a two-class, two-server parallel server system under the recently introduced extended heavy traffic condition, which states that the underlying “static allocation” linear program (LP) is critical, but does not require that it has a unique solution. The main result is the construction of policies that asymptotically achieve a previously proved lower bound on an expected discounted linear combination of diffusion-scaled queue lengths and are therefore asymptotically optimal (AO). Each extreme point solution to the LP determines a control mode, that is, a set of activities (class-server pairs) that are operational. When there are multiple solutions, these modes can be selected dynamically. It is shown that the number of modes required for AO is either one or two. In the latter case, there is a switching point in the (normalized) workload domain, characterized in terms of a free boundary problem. Our policies are defined by identifying pairs of elementary policies and switching between them at this switching point. They provide the first example in the heavy traffic literature where weak limits under an AO policy are given by a diffusion process where both the drift and diffusion coefficients are discontinuous. Funding: R. Atar is supported by the Israel Science Foundation [Grant 1035/20].
{"title":"Asymptotic Optimality of Switched Control Policies in a Simple Parallel Server System Under an Extended Heavy Traffic Condition","authors":"Rami Atar, Eyal Castiel, Martin I. Reiman","doi":"10.1287/stsy.2022.0022","DOIUrl":"https://doi.org/10.1287/stsy.2022.0022","journal":{"name":"Stochastic Systems","volume":"37 1","pages":""},"publicationDate":"2024-07-26","publicationTypes":"Journal Article"}
Yijie Wang, Madhushini Narayana Prasad, Grani A. Hanasusanto, John J. Hasenbein
This paper presents an extension of Naor’s analysis of the join-or-balk problem in observable M/M/1 queues. Although all other Markovian assumptions still hold, we explore this problem assuming uncertain arrival rates in a distributionally robust setting. We first study the problem with the classical moment ambiguity set, where the support, mean, and mean-absolute deviation of the underlying distribution are known. Next, we extend the model to the data-driven setting, where decision makers only have access to a finite set of samples. We develop three optimal joining threshold strategies from the perspectives of an individual customer, a social optimizer, and a revenue maximizer such that their respective worst-case expected benefit rates are maximized. Finally, we compare our findings with Naor’s original results and the traditional sample average approximation scheme. Funding: This research was supported by the National Science Foundation [Grants 2342505 and 2343869].
{"title":"Distributionally Robust Observable Strategic Queues","authors":"Yijie Wang, Madhushini Narayana Prasad, Grani A. Hanasusanto, John J. Hasenbein","doi":"10.1287/stsy.2022.0009","DOIUrl":"https://doi.org/10.1287/stsy.2022.0009","journal":{"name":"Stochastic Systems","volume":"48 1","pages":""},"publicationDate":"2024-05-16","publicationTypes":"Journal Article"}
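For orientation on the join-or-balk abstract above, the classical individual threshold from Naor's original analysis (known rates, no ambiguity) can be sketched in a few lines. This is the textbook baseline that the paper extends, not the distributionally robust thresholds it derives. The parameter names `reward`, `waiting_cost_rate`, and `service_rate` are illustrative assumptions: an arriving customer who sees `num_in_system` customers expects a sojourn time of `(num_in_system + 1) / service_rate` and joins when the reward covers the expected waiting cost.

```python
import math


def naor_individual_threshold(reward: float, waiting_cost_rate: float,
                              service_rate: float) -> int:
    """Largest system size at which an individual customer stops joining.

    A customer joins iff the number already in the system is strictly
    below this threshold (classical observable M/M/1 setting).
    """
    return math.floor(reward * service_rate / waiting_cost_rate)


def joins(num_in_system: int, reward: float, waiting_cost_rate: float,
          service_rate: float) -> bool:
    """Individual decision: join iff the expected net benefit is nonnegative."""
    expected_sojourn = (num_in_system + 1) / service_rate
    return reward - waiting_cost_rate * expected_sojourn >= 0


if __name__ == "__main__":
    R, C, mu = 10.0, 2.0, 1.0  # illustrative values only
    n_e = naor_individual_threshold(R, C, mu)
    print("individual threshold:", n_e)
    for n in range(n_e + 2):
        print(n, joins(n, R, C, mu))
```

The individual rule does not depend on the arrival rate, which is why it is the simplest of the three perspectives. The social and revenue-maximizing thresholds in Naor's analysis do depend on the arrival rate and are not reproduced here.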
The basic adjoint relationship (BAR) approach is an analysis technique based on the stationary equation of a Markov process. This approach was introduced to study heavy-traffic, steady-state convergence of generalized Jackson networks in which each service station has a single job class. We extend it to multiclass queueing networks operating under static-buffer-priority (SBP) service disciplines. Our extension makes a connection with Palm distributions that allows one to attack a difficulty arising from queue-length truncation, which appears to be unavoidable in the multiclass setting. For multiclass queueing networks operating under SBP service disciplines, our BAR approach provides an alternative to the “interchange of limits” approach that has dominated the literature in the last twenty years. The BAR approach can produce sharp results and allows one to establish steady-state convergence under three additional conditions: stability, state space collapse (SSC) and a certain matrix being “tight.” These three conditions do not appear to depend on the interarrival and service-time distributions beyond their means, and their verification can be studied as three separate modules. In particular, they can be studied in a simpler, continuous-time Markov chain setting when all distributions are exponential. As an example, these three conditions are shown to hold in reentrant lines operating under last-buffer-first-serve discipline. In a two-station, five-class reentrant line, under the heavy-traffic condition, the tight-matrix condition implies both the stability condition and the SSC condition. Whether such a relationship holds generally is an open problem.
{"title":"The BAR Approach for Multiclass Queueing Networks with SBP Service Policies","authors":"Anton Braverman, J. G. Dai, Masakiyo Miyazawa","doi":"10.1287/stsy.2023.0011","DOIUrl":"https://doi.org/10.1287/stsy.2023.0011","journal":{"name":"Stochastic Systems","volume":"6 1","pages":""},"publicationDate":"2024-05-02","publicationTypes":"Journal Article"}
Motivated by transplant applications, we study a bipartite matching queue with multiclass customers and multitype resources. Customers may change their classes or abandon the system while waiting in queue, and they may decline the offered resource units, which results in matching failure. We are interested in designing efficient instantaneous matching policies that allocate resources upon arrival to waiting customers. Our objective is bicriteria and is formulated as a cost functional that linearly combines the long-run average expected reward due to successful matches and the long-run average expected cost from customer waiting and abandonment. We first develop a stability condition on the class change and abandonment rates, which requires that at least one customer queue has abandonment and that any queue without abandonment has a class transition path to a queue with abandonment. Under this condition, we construct a simple linear program, referred to as the fluid control problem (FCP), which serves as a lower bound for the original stochastic control problem under any admissible policy. We then propose a randomized matching policy based on the solution of the FCP and show that the proposed policy is asymptotically optimal under both the long-run average and ergodic cost criteria. In addition, we apply our method to study two X matching models with two customer classes and two resource types to provide insights into how class change and matching failure impact the optimal policies.
{"title":"Ergodic Control of Bipartite Matching Queues with Class Change and Matching Failure","authors":"Amin Khademi, Xin Liu","doi":"10.1287/stsy.2022.0008","DOIUrl":"https://doi.org/10.1287/stsy.2022.0008","journal":{"name":"Stochastic Systems","volume":"112 39","pages":""},"publicationDate":"2024-03-26","publicationTypes":"Journal Article"}
M. Meijer, Dennis Schol, Willem van Jaarsveld, M. Vlasiou, Bert Zwart
High-tech systems are typically produced in two stages: (1) production of components using specialized equipment and staff and (2) system assembly/integration. Component production capacity is subject to fluctuations, causing a high risk of shortages of at least one component, which results in costly delays. Companies hedge this risk by strategic investments in excess production capacity and in buffer inventories of components. To optimize these, it is crucial to characterize the relation between component shortage risk and capacity and inventory investments. We suppose that component production capacity and product demand are normally distributed over finite time intervals, and we accordingly model the production system as a symmetric fork-join queueing network with N statistically identical queues with a common arrival process and independent service processes. Assuming a symmetric cost structure, we subsequently apply extreme value theory to gain analytic insights into this optimization problem. We derive several new results for this queueing network, notably that the scaled maximum of N steady-state queue lengths converges in distribution to a Gaussian random variable. These results translate into asymptotically optimal methods to dimension the system. Tests on a range of problems reveal that these methods typically work well for systems of moderate size. Funding: This work is part of the research program Complexity in High-Tech Manufacturing (partly) financed by the Dutch Research Council (NWO) [Grant 438.16.121]. The research is also supported by the NWO programs MEERVOUD to M. Vlasiou [Grant 632.003.002] and Talent VICI to B. Zwart [Grant 639.033.413].
{"title":"Optimization of Inventory and Capacity in Large-Scale Assembly Systems Using Extreme-Value Theory","authors":"M. Meijer, Dennis Schol, Willem van Jaarsveld, M. Vlasiou, Bert Zwart","doi":"10.1287/stsy.2022.0014","DOIUrl":"https://doi.org/10.1287/stsy.2022.0014","journal":{"name":"Stochastic Systems","volume":"119 23","pages":""},"publicationDate":"2024-03-26","publicationTypes":"Journal Article"}
In this paper, we consider the widely studied problem of empirical risk minimization (ERM) of strongly convex and smooth loss functions using iterative gradient-based methods. A major goal of the existing literature has been to compare different prototypical algorithms, such as batch gradient descent (GD) or stochastic gradient descent (SGD), by analyzing their rates of convergence to ϵ-approximate solutions with respect to the number of gradient computations, which is also known as the oracle complexity. For example, the oracle complexity of GD is [Formula: see text], where n is the number of training samples and p is the parameter space dimension. When n is large, this can be prohibitively expensive in practice, and SGD is preferred due to its oracle complexity of [Formula: see text]. Such standard analyses only utilize the smoothness of the loss function in the parameter being optimized. In contrast, we demonstrate that when the loss function is smooth in the data, we can learn the oracle at every iteration and beat the oracle complexities of GD, SGD, and their variants in important regimes. Specifically, at every iteration, our proposed algorithm, Local Polynomial Interpolation-based Gradient Descent (LPI-GD), first performs local polynomial regression with a virtual batch of data points to learn the gradient of the loss function and then estimates the true gradient of the ERM objective function. We establish that the oracle complexity of LPI-GD is [Formula: see text], where d is the data space dimension, and the gradient of the loss function is assumed to belong to an η-Hölder class with respect to the data. Our proof extends the analysis of local polynomial regression in nonparametric statistics to provide supremum norm guarantees for interpolation in multivariate settings and also exploits tools from the inexact GD literature. Unlike the complexities of GD and SGD, the complexity of our method depends on d. 
However, our algorithm outperforms GD, SGD, and their variants in oracle complexity for a broad range of settings where d is small relative to n. For example, with typical loss functions (such as squared or cross-entropy loss), when [Formula: see text] for any [Formula: see text] and [Formula: see text] is at the statistical limit, our method can be made to require [Formula: see text] oracle calls for any [Formula: see text], while SGD and GD require [Formula: see text] and [Formula: see text] oracle calls, respectively. Funding: This work was supported in part by the Office of Naval Research [Grant N000142012394], in part by the Army Research Office [Multidisciplinary University Research Initiative Grant W911NF-19-1-0217], and in part by the National Science Foundation [Transdisciplinary Research In Principles Of Data Science, Foundations of Data Science].
{"title":"Gradient-Based Empirical Risk Minimization Using Local Polynomial Regression","authors":"Ali Jadbabaie, Anuran Makur, Devavrat Shah","doi":"10.1287/stsy.2022.0003","DOIUrl":"https://doi.org/10.1287/stsy.2022.0003","journal":{"name":"Stochastic Systems","volume":"14 1","pages":""},"publicationDate":"2024-03-26","publicationTypes":"Journal Article"}
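The notion of oracle complexity in the ERM abstract above, meaning the number of per-sample gradient computations, can be made concrete with a toy counter. The sketch below is not LPI-GD and does not reproduce any bound from the paper. It assumes a one-dimensional loss 0.5 * (theta - x_i)^2 chosen purely for illustration, and the class and function names (`GradientOracle`, `batch_gd`, `sgd`) are hypothetical. It only shows that one batch GD iteration costs n oracle calls while one SGD iteration costs one.

```python
import random


class GradientOracle:
    """Counts calls to the per-sample gradient of 0.5 * (theta - x_i) ** 2."""

    def __init__(self, data):
        self.data = list(data)
        self.calls = 0

    def grad(self, theta: float, i: int) -> float:
        self.calls += 1
        return theta - self.data[i]


def batch_gd(oracle: GradientOracle, theta: float, step: float,
             iterations: int) -> float:
    """Batch gradient descent: every iteration queries all n samples."""
    n = len(oracle.data)
    for _ in range(iterations):
        full_grad = sum(oracle.grad(theta, i) for i in range(n)) / n
        theta -= step * full_grad
    return theta


def sgd(oracle: GradientOracle, theta: float, iterations: int,
        seed: int = 0) -> float:
    """SGD with step size 1 / (t + 1): every iteration queries one sample."""
    rng = random.Random(seed)
    n = len(oracle.data)
    for t in range(iterations):
        i = rng.randrange(n)
        theta -= oracle.grad(theta, i) / (t + 1)
    return theta


if __name__ == "__main__":
    data = [1.0, 2.0, 3.0, 4.0, 5.0]
    gd_oracle, sgd_oracle = GradientOracle(data), GradientOracle(data)
    print("GD :", batch_gd(gd_oracle, 0.0, 0.5, 20), "calls:", gd_oracle.calls)
    print("SGD:", sgd(sgd_oracle, 0.0, 200), "calls:", sgd_oracle.calls)
```

Counting calls inside the oracle, rather than counting iterations, is the design choice that matches the abstract's cost measure: any method that estimates the gradient from fewer queried samples is charged only for the queries it actually makes.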
We consider appointment scheduling for a physician in a healthcare facility. Patients of two types, differentiated by their revenues and day preferences, contact the facility either through a call center, to be scheduled immediately, or through a website, to be scheduled the following morning. The facility aims to maximize the long-run average revenue while ensuring that a certain service level is satisfied for patients generating lower revenue. The facility makes two decisions: which set of appointment days to offer and which patient type to prioritize when contacting the website patients. Model 1 is a periodic Markov Decision Process (MDP) model without the service-level constraint. We establish certain structural properties of Model 1 and provide sufficient conditions for the existence of a preferred patient type and for the nonoptimality of the commonly used offer-all policy. We also demonstrate the importance of patient preference in determining the preferred type. Model 2 is the constrained MDP model that accommodates the service-level constraint and has an optimal randomized policy with a special structure. This structure allows us to develop an efficient method to identify a well-performing policy. We illustrate the performance of this policy through numerical experiments for systems with and without no-shows. Supplemental Material: The online appendix is available at https://doi.org/10.1287/stsy.2022.0029.
{"title":"Appointment Requests from Multiple Channels: Characterizing Optimal Set of Appointment Days to Offer with Patient Preferences","authors":"Feray Tunçalp, Lerzan Örmeci","doi":"10.1287/stsy.2022.0029","DOIUrl":"https://doi.org/10.1287/stsy.2022.0029","PeriodicalId":36337,"journal":{"name":"Stochastic Systems","volume":"9 1","pages":""},"publicationDate":"2024-03-18","publicationTypes":"Journal Article","isOpenAccess":false,"platform":"Semanticscholar","paperid":"140168240"}
We consider load balancing in large-scale heterogeneous server systems in the presence of data locality that imposes constraints on which tasks can be assigned to which servers. The constraints are naturally captured by a bipartite graph between the servers and the dispatchers handling assignments of various arrival flows. When a task arrives, the corresponding dispatcher assigns it to a server with the shortest queue among [Formula: see text] randomly selected servers obeying these constraints. Server processing speeds are heterogeneous, and they depend on the server type. For a broad class of bipartite graphs, we characterize the limit of the appropriately scaled occupancy process, both on the process level and in steady state, as the system size becomes large. Using such a characterization, we show that imposing data locality constraints can significantly improve the performance of heterogeneous systems. This is in stark contrast to either heterogeneous servers in a fully flexible system or data locality constraints in systems with homogeneous servers, both of which have been observed to degrade the system performance. Extensive numerical experiments corroborate the theoretical results. Funding: This work was partially supported by the National Science Foundation [CCF. 07/2021–06/2024].
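The dispatching rule described in this abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The names `dispatch`, `compat`, and `queues` are hypothetical. The sample size, written `d` here, stands in for the quantity elided as "[Formula: see text]". Sampling without replacement and breaking ties by sample order are assumptions, and server speeds and departures are not modeled.

```python
import random


def dispatch(dispatcher, queues, compat, d, rng):
    """Assign one task arriving at `dispatcher` and return the chosen server.

    compat maps each dispatcher to the servers it may use (the bipartite
    compatibility graph); queues[s] is the current queue length at server s.
    """
    candidates = compat[dispatcher]
    # Assumption: sample d compatible servers without replacement.
    sampled = rng.sample(candidates, min(d, len(candidates)))
    # Join the shortest queue among the sampled servers (ties: first sampled).
    chosen = min(sampled, key=lambda s: queues[s])
    queues[chosen] += 1
    return chosen


# Hypothetical instance: two dispatchers sharing only server 2.
compat = {0: [0, 1, 2], 1: [2, 3]}
queues = [0, 0, 0, 0]
rng = random.Random(0)
for _ in range(10):
    dispatch(0, queues, compat, d=2, rng=rng)
    dispatch(1, queues, compat, d=2, rng=rng)
```

In this toy instance, servers 0 and 1 can only receive tasks from dispatcher 0 and server 3 only from dispatcher 1, which is the kind of locality constraint the bipartite graph encodes.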
{"title":"Exploiting Data Locality to Improve Performance of Heterogeneous Server Clusters","authors":"Zhisheng Zhao, Debankur Mukherjee, Ruoyu Wu","doi":"10.1287/stsy.2022.0040","DOIUrl":"https://doi.org/10.1287/stsy.2022.0040","PeriodicalId":36337,"journal":{"name":"Stochastic Systems","volume":"18 1","pages":""},"publicationDate":"2024-02-06","publicationTypes":"Journal Article","isOpenAccess":false,"platform":"Semanticscholar","paperid":"139902558"}
This paper studies the sensitivity (or insensitivity) of a class of load balancing algorithms that achieve asymptotic zero-waiting in the sub-Halfin-Whitt regime, named LB-zero. Most existing results on zero-waiting load balancing algorithms assume the service time distribution is exponential. This paper establishes the large-system insensitivity of LB-zero for jobs whose service time follows a Coxian distribution with a finite number of phases. This result shows that LB-zero achieves asymptotic zero-waiting for a large class of service time distributions, because the Coxian family is dense in the class of positive-valued distributions. To prove this result, this paper develops a new technique, called “iterative state-space peeling” (ISSP). ISSP first identifies an iterative relation between the upper and lower bounds on the queue states and then proves that the system lives near the fixed point of the iterative bounds with high probability. Based on ISSP, the steady-state distribution of the queue length is further analyzed by applying Stein’s method in the neighborhood of the fixed point. ISSP, like state-space collapse in heavy-traffic analysis, is a general approach that may be used to study other complex stochastic systems.
{"title":"Large-System Insensitivity of Zero-Waiting Load Balancing Algorithms","authors":"Xin Liu, Kang Gong, Lei Ying","doi":"10.1287/stsy.2022.0023","DOIUrl":"https://doi.org/10.1287/stsy.2022.0023","PeriodicalId":36337,"journal":{"name":"Stochastic Systems","volume":"27 1","pages":""},"publicationDate":"2024-01-22","publicationTypes":"Journal Article","isOpenAccess":false,"platform":"Semanticscholar","paperid":"139583130"}