Revisiting Consistent Hashing with Bounded Loads
Keywords:Scalability, Parallel & Distributed Systems
AbstractDynamic load balancing lies at the heart of distributed caching. Here, the goal is to assign objects (load) to servers (computing nodes) in a way that provides load balancing while at the same time dynamically adjusts to the addition or removal of servers. Load balancing is a critical topic in many areas including cloud systems, distributed databases, and distributed and data-parallel machine learning. A popular and widely adopted solution to dynamic load balancing is the two-decade-old Consistent Hashing (CH). Recently, an elegant extension was provided to account for server bounds. In this paper, we identify that existing methodologies for CH and its variants suffer from cascaded overflow, leading to poor load balancing. This cascading effect leads to decreasing performance of the hashing procedure with increasing load. To overcome the cascading effect, we propose a simple solution to CH based on recent advances in fast minwise hashing. We show, both theoretically and empirically, that our proposed solution is significantly superior for load balancing and is optimal in many senses. On the AOL search dataset and Indiana University Clicks dataset with real user activity, our proposed solution reduces cache misses by several magnitudes.
How to Cite
Chen, J., Coleman, B., & Shrivastava, A. (2021). Revisiting Consistent Hashing with Bounded Loads. Proceedings of the AAAI Conference on Artificial Intelligence, 35(5), 3976-3983. https://doi.org/10.1609/aaai.v35i5.16517
AAAI Technical Track on Data Mining and Knowledge Management