BANERJEE, Somnath; LAYEK, Sayan; HAZRA, Rima; MUKHERJEE, Animesh. How (Un)ethical Are Instruction-Centric Responses of LLMs? Unveiling the Vulnerabilities of Safety Guardrails to Harmful Queries. Proceedings of the International AAAI Conference on Web and Social Media, [S. l.], v. 19, n. 1, p. 193–205, 2025. DOI: 10.1609/icwsm.v19i1.35811. Available at: https://ojs.aaai.org/index.php/ICWSM/article/view/35811. Accessed: 29 May 2026.