MS-PPO: Mean Standard Deviation Proximal Policy Optimization for Reliable Parking Space Search in Structured Environments
DOI:
https://doi.org/10.1609/aaai.v40i24.39091
Abstract
This paper investigates the reliable parking space search problem in structured environments, with the objective of minimizing a linear combination of the mean and standard deviation (mean-std) of the parking space search time. While canonical parking space search algorithms usually target the minimal expected search time, we argue that risk-averse users would prefer to trade off the expected search time against its variability, which leads to the reliable parking space search problem of minimizing the mean-std search time. However, the non-additive nature of the standard deviation makes this problem difficult to solve with canonical search algorithms. To address this challenge, we propose a model-free reinforcement learning algorithm, MS-PPO, which simultaneously estimates the mean and standard deviation of the search time under the current decision-making policy and optimizes the policy by maximizing a clipped mean-std advantage function. MS-PPO is compared with several baseline parking space search algorithms and canonical reinforcement learning algorithms on a range of representative parking lot networks, and it achieves the best overall performance in terms of the mean-std parking space search time. We also validate the effectiveness of MS-PPO in a real parking garage by deploying it on an autonomous vehicle testbed.
Published
2026-03-14
How to Cite
Chen, H., & Guo, H. (2026). MS-PPO: Mean Standard Deviation Proximal Policy Optimization for Reliable Parking Space Search in Structured Environments. Proceedings of the AAAI Conference on Artificial Intelligence, 40(24), 20059–20066. https://doi.org/10.1609/aaai.v40i24.39091
Section
AAAI Technical Track on Machine Learning I