A Primal-Dual Online Algorithm for Online Matching Problem in Dynamic Environments

Yu-Hang Zhou; Peng Hu; Chen Liang; Huan Xu; Guangda Huzhang; Yinfu Feng; Qing Da; Xinshang Wang; An-Xiang Zeng

doi:10.1609/aaai.v35i12.17331

Authors

Yu-Hang Zhou Alibaba Group
Peng Hu Alibaba Group
Chen Liang Alibaba Group
Huan Xu Alibaba Group
Guangda Huzhang Alibaba Group
Yinfu Feng Alibaba Group
Qing Da Alibaba Group
Xinshang Wang Alibaba Group
An-Xiang Zeng Alibaba Group

DOI:

https://doi.org/10.1609/aaai.v35i12.17331

Keywords:

Online Learning & Bandits

Abstract

Recently, the online matching problem has attracted much attention due to its wide application on real-world decision-making scenarios. In stationary environments, by adopting the stochastic user arrival model, existing methods are proposed to learn dual optimal prices and are shown to achieve a fast regret bound. However, the stochastic model is no longer a proper assumption when the environment is changing, leading to an optimistic method that may suffer poor performance. In this paper, we study the online matching problem in dynamic environments in which the dual optimal prices are allowed to vary over time. We bound the dynamic regret of online matching problem by the sum of two quantities, including a regret of online max-min problem and a dynamic regret of online convex optimization (OCO) problem. Then we propose a novel online approach named Primal-Dual Online Algorithm (PDOA) to minimize both quantities. In particular, PDOA adopts the primal-dual framework by optimizing dual prices with the online gradient descent (OGD) algorithm to eliminate the online max-min problem's regret. Moreover, it maintains a set of OGD experts and combines them via an expert-tracking algorithm, which gives a sublinear dynamic regret bound for the OCO problem. We show that PDOA achieves an O(K sqrt{T(1+P_T)}) dynamic regret where K is the number of resources, T is the number of iterations and P_T is the path-length of any potential dual price sequence that reflects the dynamic environment. Finally, experiments on real applications exhibit the superiority of our approach.

A Primal-Dual Online Algorithm for Online Matching Problem in Dynamic Environments

Authors

DOI:

Keywords:

Abstract

Downloads

Published

How to Cite

Issue

Section

Information

Developed By

Subscription