Reinforcement Learning Without Explicit Rewards: Theory and Practice

Weitong Zhang

doi:10.1609/aaai.v40i47.41364

Authors

Weitong Zhang University of North Carolina at Chapel Hill

DOI:

https://doi.org/10.1609/aaai.v40i47.41364

Abstract

In this New Faculty Highlights contribution, I begin with three lines of my work: reward-free exploration that learns broad state and skill coverage with intrinsic rewards and remains robust under misspecification during efficient finetuning; guided generation methods that preserve the prior policy and mitigate reward hacking; and AI for science and healthcare, including practical RL for autonomous laboratories and automatic diagnosis. Building on impacts evidenced by publications, adoption, and awards, my future work will pursue imitation learning and contextual multi-task RL that connect behavioral cloning with interactive policies without explicit reward design, as well as personalized, multi-task offline-to-online adaptation with in-context demonstrations. In parallel, I am broadening the impact of AI for science and healthcare through existing collaborations. I will close with a talk that surveys these results and outlines an agenda for reinforcement learning without explicit rewards.

Published

2026-03-14

How to Cite

Zhang, W. (2026). Reinforcement Learning Without Explicit Rewards: Theory and Practice. Proceedings of the AAAI Conference on Artificial Intelligence, 40(47), 39847–39847. https://doi.org/10.1609/aaai.v40i47.41364

Issue

Vol. 40 No. 47: AAAI-26 New Faculty Highlights, Journal Track, IAAI-26 and EAAI-26 Main Track

Section

New Faculty Highlights
