Dordevic, D., Bozic, V., Thommes, J., Coppola, D., & Pal Singh, S. (2024). Rethinking Attention: Exploring Shallow Feed-Forward Neural Networks as an Alternative to Attention Layers in Transformers (Student Abstract). Proceedings of the AAAI Conference on Artificial Intelligence, 38(21), 23477–23479. https://doi.org/10.1609/aaai.v38i21.30436