Cheng, Z., Chen, Q., Zhang, J., Fei, H., Feng, X., Che, W., … Qin, L. (2025). CoMT: A Novel Benchmark for Chain of Multi-modal Thought on Large Vision-Language Models. Proceedings of the AAAI Conference on Artificial Intelligence, 39(22), 23678–23686. https://doi.org/10.1609/aaai.v39i22.34538