Selling Joint Ads: A Regret Minimization Perspective
Gagan Aggarwal, Ashwinkumar Badanidiyuru, Paul Dütting, Federico Fusco
arXiv - CS - Computer Science and Game Theory · 2024-09-12 · https://doi.org/arxiv-2409.07819
Citations: 0
Abstract
Motivated by online retail, we consider the problem of selling one item
(e.g., an ad slot) to two non-excludable buyers (say, a merchant and a brand).
This problem captures, for example, situations where a merchant and a brand
cooperatively bid in an auction to advertise a product, and both benefit from
the ad being shown. A mechanism collects bids from the two and decides whether
to allocate and which payments the two parties should make. This gives rise to
intricate incentive compatibility constraints, e.g., on how to split payments
between the two parties. We approach the problem of finding a
revenue-maximizing incentive-compatible mechanism from an online learning
perspective; this poses significant technical challenges. First, the action
space (the class of all possible mechanisms) is huge; second, the function that
maps mechanisms to revenue is highly irregular, ruling out standard
discretization-based approaches. In the stochastic setting, we design an efficient learning algorithm
achieving a regret bound of $O(T^{3/4})$. Our approach is based on an adaptive
discretization scheme of the space of mechanisms, as any non-adaptive
discretization fails to achieve sublinear regret. In the adversarial setting,
we exploit the non-Lipschitzness of the problem to prove a strong negative
result, namely that no learning algorithm can achieve more than half of the
revenue of the best fixed mechanism in hindsight. We then consider the
$\sigma$-smooth adversary; we construct an efficient learning algorithm that
achieves a regret bound of $O(T^{2/3})$ and builds on a succinct encoding of
exponentially many experts. Finally, we prove that no learning algorithm can
achieve less than $\Omega(\sqrt T)$ regret in both the stochastic and the
smooth setting, thus narrowing the range where the minimax regret rates for
these two problems lie.
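To give a flavor of the learning-over-mechanisms framing, here is a toy sketch (not the paper's algorithm): in the simplest single-buyer analogue, one runs a standard bandit algorithm such as UCB1 over a fixed grid of posted prices, treating each candidate price as an arm and the realized revenue as the bandit reward. The function name, the price grid, and the buyer model below are illustrative assumptions, not anything from the paper.

```python
import math

def ucb_posted_prices(valuations, prices):
    """UCB1 over a fixed grid of posted prices (toy sketch).

    Each round we post one price p; revenue is p if the buyer's
    valuation v >= p, else 0. A single-buyer stand-in for learning a
    revenue-maximizing mechanism from bandit feedback -- not the
    paper's two-buyer adaptive-discretization algorithm.
    """
    k = len(prices)
    counts = [0] * k        # number of times each price was posted
    revenue = [0.0] * k     # cumulative revenue per price
    total = 0.0
    for t, v in enumerate(valuations, start=1):
        if t <= k:          # post each price once to initialize
            i = t - 1
        else:               # then pick the highest upper confidence bound
            i = max(range(k), key=lambda j: revenue[j] / counts[j]
                    + math.sqrt(2.0 * math.log(t) / counts[j]))
        r = prices[i] if v >= prices[i] else 0.0
        counts[i] += 1
        revenue[i] += r
        total += r
    best = prices[max(range(k), key=counts.__getitem__)]
    return total, best
```

With valuations fixed at 0.5 and the grid [0.3, 0.5, 0.9], the algorithm concentrates its pulls on the price 0.5. The abstract's point is precisely that this fixed-grid approach breaks down for the joint-ads mechanism space: no non-adaptive discretization achieves sublinear regret there, which is why the paper's stochastic-setting algorithm adapts the discretization as it learns.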