Li, Y., Lin, Y., Xiao, T. and Zhu, J. (2021) “An Efficient Transformer Decoder with Compressed Sub-layers”, Proceedings of the AAAI Conference on Artificial Intelligence, 35(15), pp. 13315-13323. doi: 10.1609/aaai.v35i15.17572.