IPO: Interior-Point Policy Optimization under Constraints

Authors

  • Yongshuai Liu University of California, Davis
  • Jiaxin Ding University of California, Davis
  • Xin Liu University of California, Davis

DOI:

https://doi.org/10.1609/aaai.v34i04.5932

Abstract

In this paper, we study reinforcement learning (RL) algorithms for solving real-world decision problems with the objective of maximizing the long-term reward while satisfying cumulative constraints. We propose a novel first-order policy optimization method, Interior-Point Policy Optimization (IPO), which augments the objective with logarithmic barrier functions, inspired by the interior-point method. Our proposed method is easy to implement, comes with performance guarantees, and can handle general cumulative multi-constraint settings. We conduct extensive evaluations comparing our approach with state-of-the-art baselines. Our algorithm outperforms the baseline algorithms in terms of both reward maximization and constraint satisfaction.
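The core idea in the abstract, augmenting the reward objective with logarithmic barrier functions on the cumulative constraints, can be sketched numerically. The function below is a minimal illustration, not the authors' implementation: the names `ipo_objective`, the sharpness parameter `t`, and the use of scalar cost/limit estimates are all assumptions made for this example.

```python
import numpy as np

def ipo_objective(reward, constraint_costs, limits, t=10.0):
    """Log-barrier-augmented objective (illustrative sketch).

    reward           -- estimated expected return of the policy.
    constraint_costs -- estimated cumulative costs, one per constraint.
    limits           -- constraint thresholds; we need cost < limit.
    t                -- barrier sharpness (hypothetical parameter here);
                        larger t makes the barrier closer to a hard wall.
    """
    costs = np.asarray(constraint_costs, dtype=float)
    lims = np.asarray(limits, dtype=float)
    slack = lims - costs              # must stay strictly positive
    if np.any(slack <= 0):
        return -np.inf                # infeasible: barrier blows up
    # Reward plus a log-barrier term for each constraint.
    return reward + np.sum(np.log(slack)) / t

# A strictly feasible policy keeps a finite augmented objective;
# violating any constraint drives the objective to -infinity.
feasible = ipo_objective(1.0, [0.5], [1.0])
infeasible = ipo_objective(1.0, [2.0], [1.0])
```

A gradient-based policy optimizer would then ascend this augmented objective instead of the raw reward, so that feasible policies are favored smoothly rather than via a hard penalty.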

Published

2020-04-03

How to Cite

Liu, Y., Ding, J., & Liu, X. (2020). IPO: Interior-Point Policy Optimization under Constraints. Proceedings of the AAAI Conference on Artificial Intelligence, 34(04), 4940-4947. https://doi.org/10.1609/aaai.v34i04.5932

Section

AAAI Technical Track: Machine Learning