From Representation to Reasoning: Toward General-Purpose Visual Intelligence

Chen Wei

doi:10.1609/aaai.v40i47.41357

Authors

Chen Wei, Rice University

DOI:

https://doi.org/10.1609/aaai.v40i47.41357

Abstract

This talk surveys my research agenda on advancing general-purpose visual intelligence, moving AI beyond static recognition toward active reasoning and embodied action. A central challenge is enabling AI systems to generalize reliably in low-data and long-tail regimes. I address this by combining multimodal representation learning with agentic reasoning frameworks such as PyVision, which equips vision models to dynamically generate tools for deliberate problem-solving, and ViGaL, which leverages gameplay to instill transferable cognitive skills for reasoning under scarcity. These efforts chart a trajectory from representation and generation to interactive, embodied agents, re-imagining AI as an active collaborator capable of tool use, imagination, and purposeful engagement across both digital and physical environments.

Published

2026-03-14

How to Cite

Wei, C. (2026). From Representation to Reasoning: Toward General-Purpose Visual Intelligence. Proceedings of the AAAI Conference on Artificial Intelligence, 40(47), 39836–39837. https://doi.org/10.1609/aaai.v40i47.41357

Issue

Vol. 40 No. 47: AAAI-26 New Faculty Highlights, Journal Track, IAAI-26 and EAAI-26 Main Track

Section

New Faculty Highlights
