Reasoning Shapes Alignment: Investigating Cultural Alignment in Large Reasoning Models with Cultural Norms

Authors

  • Yuhang Wang, Beijing Jiaotong University
  • Yanxu Zhu, Beijing Jiaotong University
  • Jitao Sang, Beijing Jiaotong University

DOI:

https://doi.org/10.1609/aaai.v40i21.38839

Abstract

The advanced reasoning capabilities of Large Reasoning Models enable them to thoroughly understand and apply safety policies through deliberate thought processes, thereby improving the models' safety. Beyond safety, these models must also be able to reflect the diverse range of human values across various cultures. This paper presents the Cultural Norm-based Cultural Alignment (CNCA) framework, which enables models to leverage their powerful reasoning ability to align with cultural norms. Specifically, we propose three methods to automatically mine cultural norms from limited survey data and explore ways to effectively utilize these norms for improving cultural alignment. Two alignment paradigms are examined: an in-context alignment method, where cultural norms are explicitly integrated into the user context, and a fine-tuning-based method, which internalizes norms through enhanced Chain-of-Thought training data. Comprehensive experiments demonstrate the effectiveness of these methods, highlighting that models with stronger reasoning capabilities benefit more from cultural norm mining and utilization. Our findings emphasize the potential for reasoning models to better reflect diverse human values through culturally informed alignment strategies.

Published

2026-03-14

How to Cite

Wang, Y., Zhu, Y., & Sang, J. (2026). Reasoning Shapes Alignment: Investigating Cultural Alignment in Large Reasoning Models with Cultural Norms. Proceedings of the AAAI Conference on Artificial Intelligence, 40(21), 17814–17822. https://doi.org/10.1609/aaai.v40i21.38839

Section

AAAI Technical Track on Humans and AI