Online DR-Submodular Maximization: Minimizing Regret and Constraint Violation

Authors

  • Prasanna Raut, University of Washington
  • Omid Sadeghi, University of Washington
  • Maryam Fazel, University of Washington

Keywords:

Optimization, Online Learning & Bandits

Abstract

We consider online continuous DR-submodular maximization with linear stochastic long-term constraints. Compared with prior work on online submodular maximization, our setting adds the complication of stochastic linear constraint functions that are generated i.i.d. at each round. Specifically, at each time step a DR-submodular utility function and a constraint vector, generated i.i.d. from an unknown distribution, are revealed only after the learner commits to an action; the goal is to maximize the overall utility while keeping the expected cumulative resource consumption below a fixed budget. Stochastic long-term constraints arise naturally in applications where a limited budget or resource is available and the resource consumption at each step is governed by a stochastically time-varying environment. We propose the Online Lagrangian Frank-Wolfe (OLFW) algorithm to solve this class of online problems. We analyze its performance and obtain sub-linear regret bounds as well as sub-linear cumulative constraint violation bounds, both in expectation and with high probability.
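The interaction protocol described in the abstract can be sketched as a short simulation. This is an illustration only and is not the OLFW algorithm from the paper: it swaps in a plain projected-gradient primal step with a single Lagrange multiplier in place of the Frank-Wolfe machinery. The quadratic utility, the box domain [0, 1]^n, the per-round budget, the step sizes, and all function names (`quad_utility`, `quad_gradient`, `run_protocol`) are assumptions made for this sketch.

```python
import random


def quad_utility(h, H, x):
    """f(x) = <h, x> - 0.5 * x^T H x, DR-submodular when H is entrywise nonnegative."""
    n = len(x)
    lin = sum(h[i] * x[i] for i in range(n))
    quad = sum(x[i] * H[i][j] * x[j] for i in range(n) for j in range(n))
    return lin - 0.5 * quad


def quad_gradient(h, H, x):
    """Gradient of quad_utility for symmetric H: h - H x."""
    n = len(x)
    return [h[i] - sum(H[i][j] * x[j] for j in range(n)) for i in range(n)]


def run_protocol(T=200, n=3, budget_per_round=0.5, eta=0.05, mu=0.05, seed=0):
    """Simulate T rounds; return (total utility, total consumption, violation)."""
    rng = random.Random(seed)
    x = [0.0] * n          # action in the assumed domain [0, 1]^n
    lam = 0.0              # Lagrange multiplier for the budget constraint
    total_utility = 0.0
    total_consumption = 0.0
    for _ in range(T):
        # The utility (h, H) and the constraint vector p are drawn only
        # AFTER the action x for this round has been fixed.
        h = [rng.uniform(0.5, 1.0) for _ in range(n)]
        H = [[0.0] * n for _ in range(n)]
        for i in range(n):
            for j in range(i, n):
                H[i][j] = H[j][i] = rng.uniform(0.0, 0.2)
        p = [rng.uniform(0.0, 1.0) for _ in range(n)]

        # Collect this round's utility and resource consumption <p, x>.
        used = sum(p[i] * x[i] for i in range(n))
        total_utility += quad_utility(h, H, x)
        total_consumption += used

        # Primal step: gradient ascent on f_t(x) - lam * <p_t, x>,
        # projected back onto the box (assumed stand-in for Frank-Wolfe).
        g = quad_gradient(h, H, x)
        x = [min(1.0, max(0.0, x[i] + eta * (g[i] - lam * p[i]))) for i in range(n)]

        # Dual step: raise lam when the round overspent its budget share.
        lam = max(0.0, lam + mu * (used - budget_per_round))

    violation = max(0.0, total_consumption - budget_per_round * T)
    return total_utility, total_consumption, violation
```

Calling `run_protocol()` returns the cumulative utility, the cumulative resource consumption, and the amount by which consumption exceeds the assumed total budget `budget_per_round * T`. The last quantity is the kind of cumulative constraint violation the abstract refers to; the regret would additionally require a benchmark action, which this sketch does not compute.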

Published

2021-05-18

How to Cite

Raut, P., Sadeghi, O., & Fazel, M. (2021). Online DR-Submodular Maximization: Minimizing Regret and Constraint Violation. Proceedings of the AAAI Conference on Artificial Intelligence, 35(11), 9395-9402. Retrieved from https://ojs.aaai.org/index.php/AAAI/article/view/17132

Section

AAAI Technical Track on Machine Learning IV