Wang, P. et al. (2022) “Scaled ReLU Matters for Training Vision Transformers”, Proceedings of the AAAI Conference on Artificial Intelligence, 36(3), pp. 2495–2503. doi: 10.1609/aaai.v36i3.20150.