[1] P. Wang et al., “Scaled ReLU Matters for Training Vision Transformers”, AAAI, vol. 36, no. 3, pp. 2495–2503, Jun. 2022.