ShoppingBench: A Real-World Intent-Grounded Shopping Benchmark for LLM-based Agents

Jiangyuan Wang; Kejun Xiao; Qi Sun; Huaipeng Zhao; Tao Luo; Jian Dong Zhang; Xiaoyi Zeng

doi:10.1609/aaai.v40i39.40640

Authors

Jiangyuan Wang, Alibaba International Digital Commercial Group
Kejun Xiao, Alibaba International Digital Commercial Group
Qi Sun, Alibaba International Digital Commercial Group
Huaipeng Zhao, Alibaba International Digital Commercial Group
Tao Luo, Alibaba International Digital Commercial Group
Jian Dong Zhang, Alibaba International Digital Commercial Group
Xiaoyi Zeng, Alibaba International Digital Commercial Group

DOI:

https://doi.org/10.1609/aaai.v40i39.40640

Abstract

Existing benchmarks in e-commerce primarily focus on basic user intents, such as finding or purchasing products. However, real-world users often pursue more complex goals, such as applying vouchers, managing budgets, and finding sellers that offer multiple products. To bridge this gap, we propose ShoppingBench, a novel end-to-end shopping benchmark designed to encompass increasingly challenging levels of grounded intent. Specifically, we propose a scalable framework to simulate user instructions based on various intents derived from sampled real-world products. To facilitate consistent and reliable evaluations, we provide a large-scale shopping sandbox that serves as an interactive simulated environment, incorporating over 2.5 million real-world products. Experimental results demonstrate that even state-of-the-art language agents (such as GPT-4.1) achieve absolute success rates under 50% on our benchmark tasks, highlighting the significant challenges posed by ShoppingBench. In addition, we propose a trajectory distillation strategy that combines supervised fine-tuning with reinforcement learning on synthetic trajectories to distill the capabilities of a large language agent into a smaller one. As a result, our trained agent achieves performance competitive with GPT-4.1.
