Long, L., Yang, R., Huang, Y., Hui, D., Zhou, A., & Yang, J. (2026). SlimInfer: Accelerating Long-Context LLM Inference via Dynamic Token Pruning. Proceedings of the AAAI Conference on Artificial Intelligence, 40(38), 32284–32292. https://doi.org/10.1609/aaai.v40i38.40502