Feng, Duanyu, Bowen Qin, Chen Huang, Youcheng Huang, Zheng Zhang, and Wenqiang Lei. 2025. “LEGEND: Leveraging Representation Engineering to Annotate Safety Margin for Preference Datasets.” Proceedings of the AAAI Conference on Artificial Intelligence 39 (26): 27277-85. https://doi.org/10.1609/aaai.v39i26.34937.