Noisy Derivative-Free Optimization With Value Suppression

Hong Wang; Hong Qian; Yang Yu

doi:10.1609/aaai.v32i1.11534

Authors

Hong Wang, Nanjing University
Hong Qian, Nanjing University
Yang Yu, Nanjing University

DOI:

https://doi.org/10.1609/aaai.v32i1.11534

Keywords:

Derivative-free Optimization, Noisy Environment, Direct Policy Search, Value Suppression

Abstract

Derivative-free optimization has shown advantages in solving sophisticated problems such as policy search when the environment is noise-free. Many real-world environments, however, are noisy, so solution evaluations are inaccurate. Noisy evaluation can badly harm derivative-free optimization, as it may make a worse solution look better. Sampling is a straightforward way to reduce noise, while previous studies have shown that delaying the noise handling to the comparison time point (i.e., threshold selection) can be helpful for derivative-free optimization. This work further delays the noise handling and proposes a simple noise handling mechanism, value suppression. With value suppression, we do nothing about noise until the best-so-far solution has not been improved for a period; we then suppress the value of the best-so-far solution and continue the optimization. On synthetic problems as well as reinforcement learning tasks, experiments verify that value suppression can be significantly more effective than the previous methods.
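
The mechanism described in the abstract can be illustrated with a minimal sketch. The abstract does not state how the value is suppressed, so everything specific here is an assumption made for illustration: the problem is treated as minimization, the optimizer is a plain Gaussian random search, and "suppression" is taken to mean replacing the stored best-so-far value with the mean of a few fresh evaluations. The names `noisy_sphere`, `search_with_value_suppression`, `patience`, and `resample` are hypothetical and do not come from the paper.

```python
import random


def noisy_sphere(x, rng, noise_std=0.5):
    """Hypothetical noisy objective: sphere function plus Gaussian noise."""
    return sum(v * v for v in x) + rng.gauss(0.0, noise_std)


def search_with_value_suppression(f, dim, budget, patience=20, resample=10,
                                  step=0.3, seed=0):
    """Random-search minimization with an assumed value-suppression rule.

    Noise is ignored until the best-so-far solution has not improved for
    `patience` consecutive evaluations. Its stored value is then replaced by
    the mean of `resample` fresh evaluations (an assumption, not necessarily
    the paper's rule), and the search continues.
    """
    rng = random.Random(seed)
    best_x = [rng.uniform(-1.0, 1.0) for _ in range(dim)]
    best_val = f(best_x, rng)
    used = 1          # evaluations consumed so far
    stall = 0         # consecutive evaluations without improvement
    suppressions = 0  # how many times the stored value was suppressed

    while used < budget:
        # Propose a neighbour of the best-so-far solution and evaluate it once.
        cand = [v + rng.gauss(0.0, step) for v in best_x]
        val = f(cand, rng)
        used += 1
        if val < best_val:
            best_x, best_val = cand, val
            stall = 0
        else:
            stall += 1

        # Stalled for a period: the stored value may be a lucky noisy draw,
        # so re-estimate it if enough budget remains.
        if stall >= patience and used + resample <= budget:
            fresh = [f(best_x, rng) for _ in range(resample)]
            used += resample
            best_val = sum(fresh) / resample
            stall = 0
            suppressions += 1

    return best_x, best_val, used, suppressions


if __name__ == "__main__":
    x, val, used, n_sup = search_with_value_suppression(noisy_sphere, dim=3, budget=500)
    print("estimated value:", val, "evaluations:", used, "suppressions:", n_sup)
```

In this sketch the extra evaluations are spent only when progress stalls, which reflects the idea of delaying noise handling rather than resampling every candidate up front.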
