ERFSL: An Efficient Reward Function Searcher via Large Language Models for Custom-Environment Multi-Objective Reinforcement Learning (Student Abstract)

Guanwen Xie; Jingzehua Xu; Yiyuan Yang; Yimian Ding; Shuai Zhang

doi:10.1609/aaai.v39i28.35316

Authors

Guanwen Xie, Massachusetts Institute of Technology
Jingzehua Xu, Massachusetts Institute of Technology
Yiyuan Yang, University of Oxford
Yimian Ding, Massachusetts Institute of Technology
Shuai Zhang, New Jersey Institute of Technology

DOI:

https://doi.org/10.1609/aaai.v39i28.35316

Abstract

We propose ERFSL, an efficient reward function searcher that uses large language models (LLMs) for custom-environment, multi-objective reinforcement learning (RL). ERFSL first generates reward components from explicit user requirements and rectifies them, then iteratively optimizes the weights of these components based on textual context. Applied to an underwater data collection RL task, ERFSL corrects reward code with only one feedback iteration per requirement and obtains diverse reward functions within the Pareto set. ERFSL also remains robust to deviated weights and to small LLMs such as GPT-4o mini. The full-text prompts, examples of LLM-generated answers, and source code are available at https://360zmem.github.io/LLMRsearcher/.
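
To make the terms in the abstract concrete, the following is a minimal Python sketch of a reward built as a weighted sum of components, together with a filter that keeps the non-dominated (Pareto) candidates. It is an illustration under stated assumptions, not the authors' implementation: the component functions, state keys, weights, and the names `weighted_reward` and `pareto_front` are all hypothetical, and the LLM-driven generation and weight search steps are not shown.

```python
from typing import Callable, Dict, List, Sequence

State = Dict[str, float]

# Hypothetical reward components; names and formulas are assumptions
# chosen for illustration, not the components used in ERFSL.
def safety_term(state: State) -> float:
    # Penalize being closer than 1.0 unit to an obstacle.
    return -max(0.0, 1.0 - state["obstacle_distance"])

def energy_term(state: State) -> float:
    # Penalize the energy spent in this step.
    return -state["energy_used"]

def weighted_reward(
    state: State,
    components: Dict[str, Callable[[State], float]],
    weights: Dict[str, float],
) -> float:
    """Scalar reward as a weighted sum of the named components."""
    return sum(weights[name] * fn(state) for name, fn in components.items())

def pareto_front(points: Sequence[Sequence[float]]) -> List[int]:
    """Indices of points not dominated by any other point (higher is better)."""
    def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
        return all(x >= y for x, y in zip(a, b)) and any(
            x > y for x, y in zip(a, b)
        )
    return [
        i
        for i, p in enumerate(points)
        if not any(dominates(q, p) for j, q in enumerate(points) if j != i)
    ]

components = {"safety": safety_term, "energy": energy_term}
weights = {"safety": 10.0, "energy": 0.5}  # assumed values
state = {"obstacle_distance": 0.5, "energy_used": 2.0}
reward = weighted_reward(state, components, weights)

# Each tuple holds the per-objective scores of one candidate reward function.
candidates = [(1, 5), (2, 4), (1, 4), (3, 1)]
front = pareto_front(candidates)
```

In this sketch, a weight search would repeatedly adjust `weights`, retrain, and score each resulting policy on the separate objectives; `pareto_front` then identifies which candidates represent distinct trade-offs.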

Published

2025-04-11

How to Cite

Xie, G., Xu, J., Yang, Y., Ding, Y., & Zhang, S. (2025). ERFSL: An Efficient Reward Function Searcher via Large Language Models for Custom-Environment Multi-Objective Reinforcement Learning (Student Abstract). Proceedings of the AAAI Conference on Artificial Intelligence, 39(28), 29535–29537. https://doi.org/10.1609/aaai.v39i28.35316

Issue

Vol. 39 No. 28: IAAI-25, EAAI-25, AAAI-25 Student Abstracts, Undergraduate Consortium and Demonstrations

Section

AAAI Student Abstract and Poster Program
