Fu, T. (2026) “OmniPT: Unleashing the Potential of Large Vision Language Models for Pedestrian Tracking and Understanding”, Proceedings of the AAAI Conference on Artificial Intelligence, 40(5), pp. 4031–4039. doi: 10.1609/aaai.v40i5.37406.