Wang, Pichao, et al. “Scaled ReLU Matters for Training Vision Transformers”. Proceedings of the AAAI Conference on Artificial Intelligence, vol. 36, no. 3, June 2022, pp. 2495-2503, doi:10.1609/aaai.v36i3.20150.