Reward Certification for Policy Smoothed Reinforcement Learning

Authors

  • Ronghui Mu, University of Liverpool
  • Leandro Soriano Marcolino, Lancaster University
  • Yanghao Zhang, University of Liverpool
  • Tianle Zhang, University of Liverpool
  • Xiaowei Huang, University of Liverpool
  • Wenjie Ruan, University of Liverpool

DOI:

https://doi.org/10.1609/aaai.v38i19.30139

Keywords:

General

Abstract

Reinforcement Learning (RL) has achieved remarkable success in safety-critical areas, but it can be weakened by adversarial attacks. Recent studies have introduced "smoothed policies" to enhance its robustness, yet it remains challenging to establish a provable guarantee that certifies a bound on the total reward. Prior methods relied primarily on computing bounds via Lipschitz continuity or on calculating the probability of the cumulative reward exceeding specific thresholds. However, these techniques are suited only to continuous perturbations of the RL agent's observations and are restricted to perturbations bounded by the ℓ2-norm. To address these limitations, this paper proposes a general black-box certification method, called ReCePS, which can directly certify the cumulative reward of a smoothed policy under various ℓp-norm bounded perturbations. Furthermore, we extend our methodology to certify perturbations on action spaces. Our approach leverages f-divergence to measure the discrepancy between the original distribution and the perturbed distribution, and then determines the certification bound by solving a convex optimisation problem. We provide a comprehensive theoretical analysis and run experiments in multiple environments. Our results show that our method not only improves the tightness of the certified lower bound on the mean cumulative reward but is also more efficient than state-of-the-art methods.
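To make the certification idea concrete, the sketch below illustrates the kind of convex problem the abstract refers to: given Monte-Carlo estimates of a smoothed policy's cumulative reward, it lower-bounds the expected reward over all distributions within a fixed divergence budget of the rollout distribution. This is a minimal illustration under assumed choices, not the paper's implementation: it specialises the general f-divergence constraint to the KL divergence, whose inner problem min_Q E_Q[R] subject to KL(Q||P) ≤ eps admits the one-dimensional concave dual max_{λ>0} −λ log E_P[exp(−R/λ)] − λ·eps. The function name certified_reward_lower_bound, the budget eps, and the synthetic rollout rewards are all hypothetical.

import numpy as np
from scipy.optimize import minimize_scalar
from scipy.special import logsumexp


def certified_reward_lower_bound(rewards: np.ndarray, eps: float) -> float:
    """Lower-bound E_Q[R] over all Q with KL(Q || P) <= eps, where P is
    approximated by empirical cumulative rewards from smoothed rollouts.

    Illustrative KL special case only; the paper's ReCePS method handles
    general f-divergences via a convex optimisation problem.
    """
    n = len(rewards)

    def neg_dual(lam: float) -> float:
        # Dual objective g(lam) = -lam * (logsumexp(-R/lam) - log n) - lam*eps,
        # computed with logsumexp for numerical stability; negated because
        # minimize_scalar minimises.
        log_mgf = logsumexp(-rewards / lam) - np.log(n)
        return -(-lam * log_mgf - lam * eps)

    # g is concave in lam > 0, so a bounded one-dimensional search suffices.
    res = minimize_scalar(neg_dual, bounds=(1e-6, 1e3), method="bounded")
    return -res.fun


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Hypothetical cumulative rewards from Monte-Carlo rollouts of a
    # smoothed policy (e.g. Gaussian noise injected into observations).
    rewards = rng.normal(loc=200.0, scale=15.0, size=1000)
    for eps in (0.0, 0.05, 0.2):
        lb = certified_reward_lower_bound(rewards, eps)
        print(f"eps={eps:.2f}  certified lower bound = {lb:.2f}")

With eps = 0 the bound recovers the empirical mean reward, and a larger divergence budget (i.e. a stronger admissible perturbation) yields a smaller certified reward; the paper's general f-divergence formulation produces analogous convex problems beyond this KL case.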

Published

2024-03-24

How to Cite

Mu, R., Soriano Marcolino, L., Zhang, Y., Zhang, T., Huang, X., & Ruan, W. (2024). Reward Certification for Policy Smoothed Reinforcement Learning. Proceedings of the AAAI Conference on Artificial Intelligence, 38(19), 21429-21437. https://doi.org/10.1609/aaai.v38i19.30139

Issue

Vol. 38 No. 19

Section

AAAI Technical Track on Safe, Robust and Responsible AI Track