Reinforcement Mechanism Design: With Applications to Dynamic Pricing in Sponsored Search Auctions
In many social systems in which individuals and organizations interact with each other, there can be no easy laws to govern the rules of the environment, and agents' payoffs are often influenced by other agents' actions. We examine such a social system in the setting of sponsored search auctions and tackle the search engine's dynamic pricing problem by combining the tools from both mechanism design and the AI domain. In this setting, the environment not only changes over time, but also behaves strategically. Over repeated interactions with bidders, the search engine can dynamically change the reserve prices and determine the optimal strategy that maximizes the profit. We first train a buyer behavior model, with a real bidding data set from a major search engine, that predicts bids given information disclosed by the search engine and the bidders' performance data from previous rounds. We then formulate the dynamic pricing problem as an MDP and apply a reinforcement-based algorithm that optimizes reserve prices over time. Experiments demonstrate that our model outperforms static optimization strategies including the ones that are currently in use as well as several other dynamic ones.