Wang, Wenkai, Hongcan Guo, Zheqi Lv, and Shengyu Zhang. 2026. “A Rolling Stone Gathers No Moss: Adaptive Policy Optimization for Stable Self-Evaluation in Large Multimodal Models.” Proceedings of the AAAI Conference on Artificial Intelligence 40 (40): 33666–74. https://doi.org/10.1609/aaai.v40i40.40656.