MS-PPO: Mean Standard Deviation Proximal Policy Optimization for Reliable Parking Space Search in Structured Environments

Haoming Chen; Hongliang Guo

doi:10.1609/aaai.v40i24.39091

Authors

Haoming Chen, Sichuan University
Hongliang Guo, Sichuan University


Abstract

This paper investigates the reliable parking space search problem in structured environments, where the objective is to minimize a linear combination of the mean and standard deviation (mean-std) of the parking space search time. Canonical parking space search algorithms usually target the minimal expected search time; we argue that risk-averse users would accept a somewhat longer expected search time in exchange for lower variability, which leads to the reliable parking space search problem of minimizing the mean-std search time. However, the standard deviation is non-additive, which makes the problem difficult to solve with canonical search algorithms. To address this challenge, we propose a model-free reinforcement learning algorithm, MS-PPO, which simultaneously estimates the mean and standard deviation of the search time under the current decision-making policy and optimizes the policy by maximizing a clipped mean-std advantage function. We compare MS-PPO with several baseline parking space search algorithms and canonical reinforcement learning algorithms on a range of representative parking lot networks, where it achieves the best overall mean-std parking space search time. We also validate MS-PPO in a real parking garage by deploying it on an autonomous vehicle testbed.
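
The mean-std criterion described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the function name `mean_std_cost`, the `risk_weight` parameter, and all sample search times are hypothetical, and the abstract does not state the weighting used in the paper. The sketch shows two things: a policy with a lower mean search time can still have a higher mean-std cost, and standard deviations do not add across independent segments, which is one way to read the non-additivity mentioned above.

```python
import math
import statistics


def mean_std_cost(samples, risk_weight=1.0):
    """Mean plus risk_weight times the population standard deviation.

    `risk_weight` is an assumed trade-off parameter, not a value from the paper.
    """
    return statistics.fmean(samples) + risk_weight * statistics.pstdev(samples)


# Hypothetical search times in seconds under two made-up policies.
steady = [60.0, 62.0, 58.0, 60.0]  # slightly slower on average, very consistent
risky = [30.0, 90.0, 20.0, 92.0]   # faster on average, highly variable

steady_cost = mean_std_cost(steady)
risky_cost = mean_std_cost(risky)

# Non-additivity: for two independent segments the variances add,
# so the standard deviation of the total is not the sum of the parts.
seg_a_std, seg_b_std = 3.0, 4.0
total_std = math.sqrt(seg_a_std**2 + seg_b_std**2)

print(f"steady: mean={statistics.fmean(steady):.1f}, mean-std cost={steady_cost:.2f}")
print(f"risky:  mean={statistics.fmean(risky):.1f}, mean-std cost={risky_cost:.2f}")
print(f"std of total={total_std:.1f}, sum of stds={seg_a_std + seg_b_std:.1f}")
```

With these made-up numbers, the risky policy has the lower mean but the higher mean-std cost, so a risk-averse user would prefer the steady policy under this criterion.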
