Wang, Z., Zhang, R., Li, H., Fan, W., Jiang, W., Zhao, Q., & Xu, G. (2026). ConfGuard: A Simple and Effective Backdoor Detection for Large Language Models. Proceedings of the AAAI Conference on Artificial Intelligence, 40(42), 35829–35837. https://doi.org/10.1609/aaai.v40i42.40897