Xu, C., He, Z., He, Z., & McAuley, J. (2022). Leashing the Inner Demons: Self-Detoxification for Language Models. Proceedings of the AAAI Conference on Artificial Intelligence, 36(10), 11530–11537. https://doi.org/10.1609/aaai.v36i10.21406