安全的在线竞标优化，并在投资回报和预算限制下受到不确定性的限制

论文标题

安全的在线竞标优化，并在投资回报和预算限制下受到不确定性的限制

Safe Online Bid Optimization with Return-On-Investment and Budget Constraints subject to Uncertainty

论文作者

Castiglioni, Matteo, Nuara, Alessandro, Romano, Giulia, Spadaro, Giorgio, Trovò, Francesco, Gatti, Nicola

论文摘要

在在线营销中，广告商的目标通常是实现高量和高盈利能力之间的权衡。这些公司的业务部门通常通过最大化这些量来解决这一权衡，同时保证投资回报率（ROI）的下限。本文研究了组合匪徒算法，以优化广告活动的优化，但预算和投资回报率限制不确定。我们研究优化和学习问题的性质。特别是，当关注不确定性的优化问题时，我们表明，除非p = np，否则它在任何因素内都不适合，并且我们提供了可实现最佳解决方案的伪多项式算法。在考虑不确定性时，我们证明，在学习过程中，没有在线学习算法可以违反（ROI或预算）约束，同时保证sublinear pseudo-regret。因此，我们提供了一种算法，即GCB，保证了统一性的遗憾，以潜在的线性约束违规行为为代价。我们还设计其安全版本，即GCB_ {SAFE}，保证W.H.P.违反违规数量的恒定上限为线性伪regret的成本。更有趣的是，我们提供了一种算法，即gcb_ {safe}（ψ，ϕ），保证sublinear pseudo-regret and Safety W.H.P.分别以收益率和预算约束的满意度接受公差ψ和ϕ。该算法实际上减轻了由于违规的约束而不排除融合到最佳解决方案而导致的风险。最后，我们通过实验性地比较了从现实世界数据产生的设置中的伪regret/Constraint-violagration-rightoff进行比较，这表明在实践中采用安全性约束的重要性以及我们算法的有效性。

In online marketing, the advertisers' goal is usually a tradeoff between achieving high volumes and high profitability. The companies' business units customarily address this tradeoff by maximizing the volumes while guaranteeing a lower bound to the Return On Investment (ROI). This paper investigates combinatorial bandit algorithms for the bid optimization of advertising campaigns subject to uncertain budget and ROI constraints. We study the nature of both the optimization and learning problems. In particular, when focusing on the optimization problem without uncertainty, we show that it is inapproximable within any factor unless P=NP, and we provide a pseudo-polynomial-time algorithm that achieves an optimal solution. When considering uncertainty, we prove that no online learning algorithm can violate the (ROI or budget) constraints during the learning process a sublinear number of times while guaranteeing a sublinear pseudo-regret. Thus, we provide an algorithm, namely GCB, guaranteeing sublinear regret at the cost of a potentially linear number of constraints violations. We also design its safe version, namely GCB_{safe}, guaranteeing w.h.p. a constant upper bound on the number of constraints violations at the cost of a linear pseudo-regret. More interestingly, we provide an algorithm, namely GCB_{safe}(ψ,ϕ), guaranteeing both sublinear pseudo-regret and safety w.h.p. at the cost of accepting tolerances ψand ϕin the satisfaction of the ROI and budget constraints, respectively. This algorithm actually mitigates the risks due to the constraints violations without precluding the convergence to the optimal solution. Finally, we experimentally compare our algorithms in terms of pseudo-regret/constraint-violation tradeoff in settings generated from real-world data, showing the importance of adopting safety constraints in practice and the effectiveness of our algorithms.

下载PDF全文

下载文献需遵守相关版权规定

论文标题