Near-Optimal MNL Bandits Under Risk Criteria

Guangyu Xi; Chao Tao; Yuan Zhou

doi:10.1609/aaai.v35i12.17245

Near-Optimal MNL Bandits Under Risk Criteria

Authors

Guangyu Xi University of Maryland, College Park
Chao Tao Indiana University Bloomington
Yuan Zhou UIUC

DOI:

https://doi.org/10.1609/aaai.v35i12.17245

Keywords:

Learning Theory, Online Learning & Bandits

Abstract

We study MNL bandits, which is a variant of the traditional multi-armed bandit problem, under risk criteria. Unlike the ordinary expected revenue, risk criteria are more general goals widely used in industries and business. We design algorithms for a broad class of risk criteria, including but not limited to the well-known conditional value-at-risk, Sharpe ratio, and entropy risk, and prove that they suffer a near-optimal regret. As a complement, we also conduct experiments with both synthetic and real data to show the empirical performance of our proposed algorithms.

Downloads

Published

2021-05-18

How to Cite

Xi, G., Tao, C., & Zhou, Y. (2021). Near-Optimal MNL Bandits Under Risk Criteria. Proceedings of the AAAI Conference on Artificial Intelligence, 35(12), 10397–10404. https://doi.org/10.1609/aaai.v35i12.17245

Download Citation

Issue

Vol. 35 No. 12: AAAI-21 Technical Tracks 12

Section

AAAI Technical Track on Machine Learning V

Near-Optimal MNL Bandits Under Risk Criteria

Authors

DOI:

Keywords:

Abstract

Downloads

Published

How to Cite

Issue

Section

Information