Optimizing Discount and Reputation Trade-Offs in E-Commerce Systems: Characterization and Online Learning

Authors

  • Hong Xie, The Chinese University of Hong Kong
  • Yongkun Li, University of Science and Technology of China
  • John C. S. Lui, The Chinese University of Hong Kong

DOI:

https://doi.org/10.1609/aaai.v33i01.33017992

Abstract

Feedback-based reputation systems are widely deployed in E-commerce systems. Evidence shows that earning a reputable label may take a seller a substantial amount of time, which implies a reduction in profit. We propose to enhance sellers’ reputation via price discounts. However, two challenges arise: (1) the demand from buyers depends on both the discount and the reputation; (2) the demand is unknown to the seller. To address these challenges, we first formulate a profit maximization problem as a semi-Markov decision process (SMDP) to explore the optimal trade-offs in selecting price discounts. We prove the monotonicity of the optimal profit and the optimal discount. Based on this monotonicity, we design QLFP (Q-learning with forward projection), an algorithm that infers the optimal discount from historical transaction data. Experiments on a dataset show that QLFP improves the profit by up to 50% over both the classical Q-learning and speedy Q-learning algorithms, and by up to four times over the case of not providing any price discount.

Published

2019-07-17

How to Cite

Xie, H., Li, Y., & Lui, J. C. S. (2019). Optimizing Discount and Reputation Trade-Offs in E-Commerce Systems: Characterization and Online Learning. Proceedings of the AAAI Conference on Artificial Intelligence, 33(01), 7992-7999. https://doi.org/10.1609/aaai.v33i01.33017992

Section

AAAI Technical Track: Reasoning under Uncertainty