Revisiting Consistent Hashing with Bounded Loads

John Chen; Benjamin Coleman; Anshumali Shrivastava

doi:10.1609/aaai.v35i5.16517

Authors

John Chen Rice University
Benjamin Coleman Rice University
Anshumali Shrivastava Rice University

DOI:

https://doi.org/10.1609/aaai.v35i5.16517

Keywords:

Scalability, Parallel & Distributed Systems

Abstract

Dynamic load balancing lies at the heart of distributed caching. Here, the goal is to assign objects (load) to servers (computing nodes) in a way that provides load balancing while at the same time dynamically adjusts to the addition or removal of servers. Load balancing is a critical topic in many areas including cloud systems, distributed databases, and distributed and data-parallel machine learning. A popular and widely adopted solution to dynamic load balancing is the two-decade-old Consistent Hashing (CH). Recently, an elegant extension was provided to account for server bounds. In this paper, we identify that existing methodologies for CH and its variants suffer from cascaded overflow, leading to poor load balancing. This cascading effect leads to decreasing performance of the hashing procedure with increasing load. To overcome the cascading effect, we propose a simple solution to CH based on recent advances in fast minwise hashing. We show, both theoretically and empirically, that our proposed solution is significantly superior for load balancing and is optimal in many senses. On the AOL search dataset and Indiana University Clicks dataset with real user activity, our proposed solution reduces cache misses by several magnitudes.

Revisiting Consistent Hashing with Bounded Loads

Authors

DOI:

Keywords:

Abstract

Downloads

Published

How to Cite

Issue

Section

Information

Subscription