1. Cheng J, Lai S, Yao S, Xue B. Offline Multi-Objective Bandits: From Logged Data to Pareto-Optimal Policies. AAAI [Internet]. 2026 Mar. 14 [cited 2026 May 15];40(43):36636-44. Available from: https://ojs.aaai.org/index.php/AAAI/article/view/40987