Sanyal, D., Ray, M., & Mandal, M. (2026). AntiDote: Bi-level Adversarial Training for Tamper-Resistant LLMs. Proceedings of the AAAI Conference on Artificial Intelligence, 40(39), 32893–32901. https://doi.org/10.1609/aaai.v40i39.40570