[1] Z. Zhu, K. Lin, B. Dai, and J. Zhou, “Self-Adaptive Imitation Learning: Learning Tasks with Delayed Rewards from Sub-optimal Demonstrations,” AAAI, vol. 36, no. 8, pp. 9269-9277, Jun. 2022.