TY  - JOUR
AU  - Hilleli, Bar
AU  - El-Yaniv, Ran
PY  - 2018/04/25
Y2  - 2024/03/29
TI  - Toward Deep Reinforcement Learning Without a Simulator: An Autonomous Steering Example
JF  - Proceedings of the AAAI Conference on Artificial Intelligence
JA  - AAAI
VL  - 32
IS  - 1
SE  - AAAI Technical Track: Human-AI Collaboration
DO  - 10.1609/aaai.v32i1.11490
UR  - https://ojs.aaai.org/index.php/AAAI/article/view/11490
SP  - 
AB  - We propose a scheme for training a computerized agent to perform complex human tasks such as highway steering. The scheme is designed to follow a natural learning process whereby a human instructor teaches a computerized trainee. It enables leveraging the weak supervision abilities of a (human) instructor, who, while unable to perform well herself at the required task, can provide coherent and learnable instantaneous reward signals to the computerized trainee. The learning process consists of three supervised elements followed by reinforcement learning. The supervised learning stages are: (i) supervised imitation learning; (ii) supervised reward induction; and (iii) supervised safety module construction. We implemented this scheme using deep convolutional networks and applied it to successfully create a computerized agent capable of autonomous highway steering over the well-known racing game Assetto Corsa. We demonstrate that the use of all components is essential to effectively carry out reinforcement learning of the steering task using vision alone, without access to a driving simulator internals, and operating in wall-clock time.
ER  - 