[1] L.-S. Puglisi, F. Valdés, and J. J. Metzger, “Neurons to Words: A Novel Method for Automated Neural Network Interpretability and Alignment”, AAAI, vol. 39, no. 26, pp. 27591–27598, Apr. 2025.