Learned Behaviors of Multiple Autonomous Agents in Smart Grid Markets
One proposed approach to managing a large complex Smart Grid is through Broker Agents who buy electrical power from distributed producers, and also sell power to consumers, via a Tariff Market--a new market mechanism where Broker Agents publish concurrent bid and ask prices. A key challenge is the specification of the market strategy that the Broker Agents should use in order to earn profits while maintaining the market's balance of supply and demand. Interestingly, previous work has shown that a Broker Agent can learn its strategy, using Markov Decision Processes (MDPs) and Q-learning, and outperform other Broker Agents that use predetermined or randomized strategies. In this work, we investigate the more representative scenario in which multiple Broker Agents, instead of a single one, are independently learning their strategies. Using a simulation environment based on real data, we find that Broker Agents who employ periodic increases in exploration achieve higher rewards. We also find that varying levels of market dominance in customer allocation models result in remarkably distinct outcomes in market prices and aggregate Broker Agent rewards. The latter set of results can be explained by established economic principles regarding the emergence of monopolies in market-based competition, further validating our approach.