CultureRL: Internalizing Cultural Principles in Large Language Models via Norm-Driven Reinforcement Learning
DOI:
https://doi.org/10.1609/aaai.v40i44.41150
Abstract
As large language models (LLMs) are increasingly deployed across culturally diverse regions, ensuring that their responses align with users’ cultural norms has become a critical challenge. Existing approaches to cultural alignment primarily rely on prompting or data-augmentation-based supervised fine-tuning, which teach models to follow norms indirectly through example-based supervision. However, these methods are difficult to scale and often fail to generalize, particularly in low-resource cultural settings. In this work, we propose CultureRL, a culture-norm-driven reinforcement learning framework that directly encodes cultural principles into model behavior. Rather than relying on output imitation, CultureRL provides normative feedback during training, enabling the model to internalize high-level cultural rules. It consists of two key components: (1) Norm Pool Construction (NPC), which clusters data from the World Values Survey into abstract cultural concepts to form a structured and retrievable norm pool; and (2) Norm Cluster-based Reward Mechanism (NCRM), which retrieves the relevant norm for each input and uses an external reward model to assess conformity, guiding model updates toward cultural alignment. We evaluate CultureRL in both one-for-one (per-culture) and one-for-all (multi-culture) settings across nine cultures and three benchmarks. Results show that CultureRL consistently outperforms strong baselines, especially in terms of cultural consistency and adaptability.
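The retrieve-then-score loop described in the abstract can be sketched as follows. This is an illustrative toy only: the paper's NPC component builds the norm pool by clustering World Values Survey data, and NCRM uses an external learned reward model, whereas this sketch mocks both with keyword overlap. All names (`Norm`, `NORM_POOL`, `retrieve_norm`, `conformity_reward`) and the example norms are hypothetical.

```python
# Hedged sketch of CultureRL's retrieve-then-score reward loop.
# Assumptions: the real norm pool comes from clustering WVS data (NPC),
# and the real conformity score comes from an external reward model (NCRM).
# Here both are mocked with simple keyword overlap for illustration.

from dataclasses import dataclass

@dataclass
class Norm:
    concept: str        # abstract cultural concept (a cluster label)
    keywords: set       # surface cues used by this toy retriever
    principle: str      # the high-level cultural rule

# Toy stand-in for the structured, retrievable norm pool.
NORM_POOL = [
    Norm("hospitality", {"guest", "host", "visit"},
         "Guests should be treated generously."),
    Norm("hierarchy", {"elder", "senior", "respect"},
         "Elders are addressed with deference."),
]

def retrieve_norm(prompt: str) -> Norm:
    """Pick the norm whose keywords best overlap the input
    (stand-in for the paper's cluster-based retrieval)."""
    words = set(prompt.lower().split())
    return max(NORM_POOL, key=lambda n: len(n.keywords & words))

def conformity_reward(response: str, norm: Norm) -> float:
    """Toy reward: fraction of norm keywords the response mentions.
    CultureRL instead scores conformity with an external reward model."""
    words = set(response.lower().split())
    return len(norm.keywords & words) / len(norm.keywords)

norm = retrieve_norm("How should I welcome a guest at home?")
reward = conformity_reward("Offer the guest food during the visit", norm)
print(norm.concept, round(reward, 2))  # → hospitality 0.67
```

In the full framework this scalar reward would drive a reinforcement-learning update (e.g. a policy-gradient step) on the language model, so the model internalizes the retrieved principle rather than imitating example outputs.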
Published
2026-03-14
How to Cite
Zhao, W., Li, H., Zhao, Y., Liu, H., Li, B., Liu, T., & Qin, B. (2026). CultureRL: Internalizing Cultural Principles in Large Language Models via Norm-Driven Reinforcement Learning. Proceedings of the AAAI Conference on Artificial Intelligence, 40(44), 38120-38128. https://doi.org/10.1609/aaai.v40i44.41150
Section
AAAI Special Track on AI Alignment