1. Dordevic D, Bozic V, Thommes J, Coppola D, Pal Singh S. Rethinking Attention: Exploring Shallow Feed-Forward Neural Networks as an Alternative to Attention Layers in Transformers (Student Abstract). AAAI [Internet]. 2024 Mar 24 [cited 2026 May 29];38(21):23477-9. Available from: https://ojs.aaai.org/index.php/AAAI/article/view/30436