Bandit-Based Planning and Learning in Continuous-Action Markov Decision Processes