1. Feng D, Qin B, Huang C, Huang Y, Zhang Z, Lei W. LEGEND: Leveraging Representation Engineering to Annotate Safety Margin for Preference Datasets. AAAI [Internet]. 2025 Apr. 11 [cited 2026 May 31];39(26):27277-85. Available from: https://ojs.aaai.org/index.php/AAAI/article/view/34937