Maximizing the Probability of Arriving on Time: A Practical Q-Learning Method

Zhiguang Cao; Hongliang Guo; Jie Zhang; Frans Oliehoek; Ulrich Fastenrath

doi:10.1609/aaai.v31i1.11170

Authors

Zhiguang Cao Nanyang Technological University
Hongliang Guo University of Electronic Science and Technology of China
Jie Zhang Nanyang Technological University
Frans Oliehoek University of Liverpool
Ulrich Fastenrath BMW Group

DOI:

https://doi.org/10.1609/aaai.v31i1.11170

Abstract

The stochastic shortest path problem is of crucial importance for the development of sustainable transportation systems. Existing methods based on the probability tail model seek the path that maximizes the probability of arriving at the destination before a deadline. However, they suffer from low accuracy, high computational cost, or both. We design a novel Q-learning method in which the converged Q-values have a practical meaning: they are the actual probabilities of arriving on time, which improves accuracy. By further adopting dynamic neural networks to learn the value function, our method scales well to large road networks with arbitrary deadlines. Experimental results on real road networks demonstrate the significant advantages of our method over its counterparts.
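
To make the idea of Q-values that read as on-time arrival probabilities concrete, here is a minimal tabular sketch in Python. It is not the paper's method. The toy network, the travel-time distributions, the function names, the 0/1 terminal target, and the 1/n step size are all assumptions chosen for illustration, and a lookup table stands in for the dynamic neural networks mentioned in the abstract.

```python
import random
from collections import defaultdict

# Hypothetical toy network (not from the paper):
# edge -> list of (travel_time, probability) pairs.
EDGES = {
    ("A", "D"): [(4, 0.5), (8, 0.5)],
    ("A", "B"): [(2, 1.0)],
    ("B", "D"): [(3, 0.7), (6, 0.3)],
}
NEIGHBORS = {"A": ["D", "B"], "B": ["D"]}


def sample_time(edge, rng):
    """Draw one travel time for an edge from its discrete distribution."""
    times, probs = zip(*EDGES[edge])
    return rng.choices(times, weights=probs)[0]


def learn_on_time_q(origin, dest, deadline, episodes=20000, seed=0):
    """Tabular Q-learning over states (node, remaining time budget).

    Assumption: the only reward is 1 for reaching `dest` with a
    non-negative budget and 0 otherwise, with no discounting, so each
    Q-value estimates a probability of arriving on time.
    """
    rng = random.Random(seed)
    q = defaultdict(float)     # (node, budget, next_node) -> estimate
    visits = defaultdict(int)  # visit counts for the 1/n step size
    for _ in range(episodes):
        node, budget = origin, deadline
        while node != dest and budget >= 0:
            # Q-learning is off-policy, so a uniformly random
            # behaviour policy is enough for this toy example.
            nxt = rng.choice(NEIGHBORS[node])
            remaining = budget - sample_time((node, nxt), rng)
            if remaining < 0:
                target = 0.0   # deadline missed
            elif nxt == dest:
                target = 1.0   # arrived on time
            else:
                # Bootstrap from the best action at the next state.
                target = max(q[(nxt, remaining, a)] for a in NEIGHBORS[nxt])
            key = (node, budget, nxt)
            visits[key] += 1
            q[key] += (target - q[key]) / visits[key]
            node, budget = nxt, remaining
    return q


q = learn_on_time_q("A", "D", deadline=6)
print("direct A->D:", round(q[("A", 6, "D")], 3))
print("via B      :", round(q[("A", 6, "B")], 3))
```

In this toy network the exact on-time probabilities with a deadline of 6 are 0.5 for the direct edge and 0.7 for the route through B, so the learned estimates should approach those values and the greedy choice at A should be B. A table indexed by remaining budget only works for small networks and discrete times, which is the scaling limit that function approximation is meant to address.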

Published

2017-02-12

How to Cite

Cao, Z., Guo, H., Zhang, J., Oliehoek, F., & Fastenrath, U. (2017). Maximizing the Probability of Arriving on Time: A Practical Q-Learning Method. Proceedings of the AAAI Conference on Artificial Intelligence, 31(1). https://doi.org/10.1609/aaai.v31i1.11170

Issue

Vol. 31 No. 1 (2017): Thirty-First AAAI Conference on Artificial Intelligence

Section

Special Track on Computational Sustainability
