Das Antar, A., Huan, X. and Banovic, N. (2025) “‘Do Your Guardrails Even Guard?’ Method for Evaluating Effectiveness of Moderation Guardrails in Aligning LLM Outputs with Expert User Expectations”, Proceedings of the AAAI/ACM Conference on AI, Ethics, and Society, 8(1), pp. 705–718. doi: 10.1609/aies.v8i1.36583.