Efficient and Effective In-context Demonstration Selection with Coreset

Authors

  • Zihua Wang, School of Computer Science and Engineering and the Key Laboratory of New Generation Artificial Intelligence Technology and its Interdisciplinary Applications, Southeast University, Nanjing 210096, China.
  • Jiarui Wang, School of Computer Science and Engineering and the Key Laboratory of New Generation Artificial Intelligence Technology and its Interdisciplinary Applications, Southeast University, Nanjing 210096, China.
  • Haiyang Xu, Tongyi Lab, Alibaba Group.
  • Ming Yan, Tongyi Lab, Alibaba Group.
  • Fei Huang, Tongyi Lab, Alibaba Group.
  • Xu Yang, School of Computer Science and Engineering and the Key Laboratory of New Generation Artificial Intelligence Technology and its Interdisciplinary Applications, Southeast University, Nanjing 210096, China.
  • Xiu-Shen Wei, School of Computer Science and Engineering and the Key Laboratory of New Generation Artificial Intelligence Technology and its Interdisciplinary Applications, Southeast University, Nanjing 210096, China.
  • Siya Mi, School of Cyber Science and Engineering, Southeast University, Nanjing 211189, China; Purple Mountain Laboratories, Nanjing 210000, China.
  • Yu Zhang, School of Computer Science and Engineering and the Key Laboratory of New Generation Artificial Intelligence Technology and its Interdisciplinary Applications, Southeast University, Nanjing 210096, China.

DOI:

https://doi.org/10.1609/aaai.v40i13.38017

Abstract

In-context learning (ICL) has emerged as a powerful paradigm for Large Visual Language Models (LVLMs), enabling them to learn from a few examples supplied directly in the input context. However, its effectiveness depends heavily on which demonstrations are selected, and finding the optimal set is NP-hard. Traditional strategies, including random sampling, similarity-based sampling, and infoscore-based sampling, are often either inefficient or suboptimal, and they struggle to balance efficiency and effectiveness. In this paper, we propose a novel demonstration selection framework named Coreset-based Dual Retrieval (CoDR). We show that samples within a diverse subset achieve a higher expected mutual information. Building on this result, we introduce a cluster-pruning method to construct a diverse coreset that aligns more effectively with the query while maintaining diversity. Additionally, we develop a dual retrieval mechanism that achieves global demonstration selection while preserving efficiency. Experimental results demonstrate that our method significantly improves ICL performance compared with existing strategies, providing a robust solution for effective and efficient demonstration selection.
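
The abstract names two components, a cluster-pruned coreset and a two-stage ("dual") retrieval, without giving their details. The sketch below is only an illustration of that general idea under stated assumptions, not the CoDR algorithm itself: it assumes demonstrations are already embedded as rows of a NumPy array, uses plain k-means as the clustering step, prunes each cluster to the points nearest its centroid, and lets the second retrieval stage search the full pool inside the clusters that the first stage hits. All function names and parameters (`build_coreset`, `dual_retrieve`, `keep_per_cluster`, `n_probe`) are hypothetical.

```python
import numpy as np


def _normalize(v):
    # Scale to unit length so dot products are cosine similarities.
    return v / np.linalg.norm(v, axis=-1, keepdims=True)


def kmeans(x, n_clusters, n_iter=20, seed=0):
    # Plain Lloyd's k-means (an assumed stand-in for the clustering step).
    rng = np.random.default_rng(seed)
    centers = x[rng.choice(len(x), size=n_clusters, replace=False)]

    def assign(c):
        dists = ((x[:, None, :] - c[None, :, :]) ** 2).sum(axis=-1)
        return dists.argmin(axis=1)

    for _ in range(n_iter):
        labels = assign(centers)
        for c in range(n_clusters):
            members = x[labels == c]
            if len(members) > 0:
                centers[c] = members.mean(axis=0)
    return assign(centers), centers


def build_coreset(x, n_clusters, keep_per_cluster, seed=0):
    # Assumed pruning rule: keep the points closest to each centroid,
    # so every cluster contributes a few representatives.
    labels, centers = kmeans(x, n_clusters, seed=seed)
    keep = []
    for c in range(n_clusters):
        idx = np.flatnonzero(labels == c)
        if len(idx) == 0:
            continue
        d = ((x[idx] - centers[c]) ** 2).sum(axis=1)
        keep.extend(idx[np.argsort(d)[:keep_per_cluster]].tolist())
    return np.array(sorted(keep), dtype=int), labels


def dual_retrieve(query, x, coreset_idx, labels, k, n_probe=2):
    q, xn = _normalize(query), _normalize(x)
    # Stage 1: cheap search over the small coreset only.
    core_sims = xn[coreset_idx] @ q
    top_core = coreset_idx[np.argsort(-core_sims)[:n_probe]]
    # Stage 2: search the full pool, restricted to the clusters hit in stage 1.
    cand = np.flatnonzero(np.isin(labels, np.unique(labels[top_core])))
    sims = xn[cand] @ q
    return cand[np.argsort(-sims)[:k]]
```

In this sketch the first stage compares the query against at most `n_clusters * keep_per_cluster` vectors, and the second stage touches only the members of the probed clusters, which is one plausible way to keep retrieval cheap while still drawing from the whole pool.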

Published

2026-03-14

How to Cite

Wang, Z., Wang, J., Xu, H., Yan, M., Huang, F., Yang, X., Wei, X.-S., Mi, S., & Zhang, Y. (2026). Efficient and Effective In-context Demonstration Selection with Coreset. Proceedings of the AAAI Conference on Artificial Intelligence, 40(13), 10458-10466. https://doi.org/10.1609/aaai.v40i13.38017

Section

AAAI Technical Track on Computer Vision X