Zhang, P., & Sun, P. (2026). Differentiated Directional Intervention: A Framework for Evading LLM Safety Alignment. Proceedings of the AAAI Conference on Artificial Intelligence, 40(44), 38102–38110. https://doi.org/10.1609/aaai.v40i44.41148