Towards Reinforcement Learning from Neural Feedback: Mapping fNIRS Signals to Agent Performance

Authors

  • Julia Santaniello, Tufts University, Medford, MA
  • Matthew Russell, Tufts University, Medford, MA
  • Benson Jiang, Tufts University, Medford, MA
  • Donatello Sassaroli, Tufts University, Medford, MA
  • Robert Jacob, Tufts University, Medford, MA
  • Jivko Sinapov, Tufts University, Medford, MA

DOI:

https://doi.org/10.1609/aaai.v40i21.38823

Abstract

Reinforcement Learning from Human Feedback (RLHF) aligns agent behavior with human preferences by integrating human feedback into the agent's training process. We propose a framework that uses passive Brain-Computer Interfaces (BCI) to guide agent training from implicit neural signals. We present and release a novel dataset of functional near-infrared spectroscopy (fNIRS) recordings collected from 25 human participants across three domains: a Pick-and-Place Robot, Lunar Lander, and Flappy Bird. We train classifiers to predict the level of agent performance (optimal, sub-optimal, or worst-case) from windows of preprocessed fNIRS feature vectors, achieving F1 scores of 67% for binary and 46% for multi-class classification, averaged across conditions and domains. We also train regressors to predict the degree of deviation between an agent's chosen action and a set of near-optimal policies, providing a continuous measure of performance. We evaluate cross-subject generalization and demonstrate that fine-tuning pre-trained models with a small sample of subject-specific data increases average F1 scores by 17% and 41% for binary and multi-class models, respectively. Our work demonstrates that mapping implicit fNIRS signals to agent performance is feasible and that subject-specific fine-tuning improves it, laying the foundation for future brain-driven RLHF systems.
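
The abstract refers to classifying windows of fNIRS feature vectors and reporting F1 scores. As a rough illustration only, the sketch below shows one way such windows could be built from a sequence of per-sample feature vectors and how binary and macro-averaged F1 can be computed. The function names (sliding_windows, f1_score, macro_f1), the window size and step parameters, and the choice of macro averaging for the multi-class case are all assumptions made for this example; none of them is taken from the paper.

```python
from typing import List, Sequence


def sliding_windows(samples: Sequence[Sequence[float]], size: int, step: int) -> List[List[float]]:
    """Flatten each run of `size` consecutive feature vectors into one window vector.

    Hypothetical helper: the paper's actual windowing scheme is not specified here.
    """
    windows = []
    for start in range(0, len(samples) - size + 1, step):
        window = [value for vector in samples[start:start + size] for value in vector]
        windows.append(window)
    return windows


def f1_score(y_true: Sequence[int], y_pred: Sequence[int], positive: int = 1) -> float:
    """F1 for one class: harmonic mean of precision and recall."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        # No true positives: F1 is conventionally reported as 0.
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def macro_f1(y_true: Sequence[int], y_pred: Sequence[int], labels: Sequence[int]) -> float:
    """Unweighted mean of per-class F1 (an assumed averaging choice for multi-class)."""
    return sum(f1_score(y_true, y_pred, positive=label) for label in labels) / len(labels)
```

In this sketch, each window would be paired with a performance label (for example optimal, sub-optimal, or worst-case encoded as integers) before being passed to a classifier of the reader's choice.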

Published

2026-03-14

How to Cite

Santaniello, J., Russell, M., Jiang, B., Sassaroli, D., Jacob, R., & Sinapov, J. (2026). Towards Reinforcement Learning from Neural Feedback: Mapping fNIRS Signals to Agent Performance. Proceedings of the AAAI Conference on Artificial Intelligence, 40(21), 17670–17678. https://doi.org/10.1609/aaai.v40i21.38823

Section

AAAI Technical Track on Humans and AI