[1]
Zhang, Z., Rossi, R.A., Yu, T., Dernoncourt, F., Zhang, R., Gu, J., Kim, S., Chen, X., Wang, Z. and Lipka, N. 2026. VipAct: Visual-Perception Enhancement via Specialized VLM Agent Collaboration and Tool-use. Proceedings of the AAAI Conference on Artificial Intelligence. 40, 43 (Mar. 2026), 36536-36546. DOI:https://doi.org/10.1609/aaai.v40i43.40976.