Noisy Derivative-Free Optimization With Value Suppression

Authors

  • Hong Wang, Nanjing University
  • Hong Qian, Nanjing University
  • Yang Yu, Nanjing University

Keywords:

Derivative-free Optimization, Noisy Environment, Direct Policy Search, Value Suppression

Abstract

Derivative-free optimization has shown advantages in solving sophisticated problems such as policy search when the environment is noise-free. Many real-world environments, however, are noisy, so solution evaluations are inaccurate. Noisy evaluation can badly harm derivative-free optimization, as it may make a worse solution look better. Sampling is a straightforward way to reduce noise, while previous studies have shown that delaying the noise handling to the comparison time point (i.e., threshold selection) can be helpful for derivative-free optimization. This work delays the noise handling further and proposes a simple noise handling mechanism, value suppression. With value suppression, we do nothing about noise until the best-so-far solution has not been improved for a period; we then suppress the value of the best-so-far solution and continue the optimization. On synthetic problems as well as reinforcement learning tasks, experiments verify that value suppression can be significantly more effective than the previous methods.
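The mechanism described in the abstract can be illustrated with a minimal sketch. This is not the authors' algorithm: the abstract does not specify the underlying optimizer or the exact suppression rule, so the sketch assumes a plain random local search for minimization, and it assumes that "suppressing" means replacing the stored value of the best-so-far solution with the mean of a few fresh noisy re-evaluations. The names noisy_sphere, patience, resample, and step are hypothetical and chosen only for illustration.

```python
import random


def noisy_sphere(x, rng, sigma=0.5):
    """Hypothetical noisy objective: sphere function plus Gaussian noise."""
    return sum(v * v for v in x) + rng.gauss(0.0, sigma)


def search_with_value_suppression(evaluate, dim, budget,
                                  patience=20, resample=5, step=0.3, seed=0):
    """Random local search (minimization) with an assumed suppression rule.

    Noise is ignored until the best-so-far solution has not improved for
    `patience` consecutive candidates. Then its stored value is replaced by
    the mean of `resample` fresh evaluations (an assumption, not necessarily
    the rule used in the paper), and the search continues.
    """
    rng = random.Random(seed)
    best_x = [rng.uniform(-1.0, 1.0) for _ in range(dim)]
    best_val = evaluate(best_x, rng)
    used = 1            # number of evaluations consumed so far
    stall = 0           # consecutive candidates that failed to improve
    suppressions = 0    # how many times suppression was triggered

    while used < budget:
        # Propose a candidate near the best-so-far solution.
        cand = [v + rng.gauss(0.0, step) for v in best_x]
        val = evaluate(cand, rng)
        used += 1

        if val < best_val:
            best_x, best_val, stall = cand, val, 0
        else:
            stall += 1

        # Delayed noise handling: act only after a stall period.
        if stall >= patience and used + resample <= budget:
            fresh = [evaluate(best_x, rng) for _ in range(resample)]
            best_val = sum(fresh) / resample
            used += resample
            stall = 0
            suppressions += 1

    return best_x, best_val, used, suppressions


if __name__ == "__main__":
    x, val, used, n_sup = search_with_value_suppression(
        noisy_sphere, dim=3, budget=500)
    print("stored value:", val, "evaluations:", used, "suppressions:", n_sup)
```

The point of the sketch is the control flow: no extra evaluations are spent on noise while the search is still making progress, and the extra budget is used only when an over-optimistic stored value may be blocking further improvement.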

Published

2018-04-25

How to Cite

Wang, H., Qian, H., & Yu, Y. (2018). Noisy Derivative-Free Optimization With Value Suppression. Proceedings of the AAAI Conference on Artificial Intelligence, 32(1). Retrieved from https://ojs.aaai.org/index.php/AAAI/article/view/11534

Section

AAAI Technical Track: Heuristic Search and Optimization