[1] Cao, Z. et al. 2025. SCANS: Mitigating the Exaggerated Safety for LLMs via Safety-Conscious Activation Steering. Proceedings of the AAAI Conference on Artificial Intelligence. 39, 22 (Apr. 2025), 23523–23531. DOI:https://doi.org/10.1609/aaai.v39i22.34521.