[1] J. Cheng, S. Lai, S. Yao, and B. Xue, “Offline Multi-Objective Bandits: From Logged Data to Pareto-Optimal Policies”, AAAI, vol. 40, no. 43, pp. 36636–36644, Mar. 2026.