(1) Lipton, Z.; Li, X.; Gao, J.; Li, L.; Ahmed, F.; Deng, L. BBQ-Networks: Efficient Exploration in Deep Reinforcement Learning for Task-Oriented Dialogue Systems. AAAI 2018, 32.