Consistent Hashing

I wrote about consistent hashing in 2013 when I worked at Basho, when I had started a series called “Learning About Distributed Databases” and today I’m kicking that back off after a few years (ok, after 5 or so years!) with this post on consistent hashing.

As with Riak, which I wrote about in 2013, Cassandra remains one of the core active distributed database projects alive today that provides an effective and reliable consistent hash ring for the clustered distributed database system. This hash function, is an algorithm that maps data to variable length to data that’s fixed. This consistent hash is a kind of hashing that provides this pattern for mapping keys to particular nodes around the ring in Cassandra. One can think of this as a kind of Dewey Decimal Classification system where the cluster nodes are the various bookshelves in the library.

Ok, so maybe the Dewey Decimal system isn’t the best analogy. Does anybody even learn about that any more? If you don’t know what it is, please read up and support your local library.

Consistent hashing allows data distributed across a cluster to minimize reorganization when nodes are added or removed. These partitions are based on a particular partition key. The partition key shouldn’t be confused with a primary key either, it’s more like a unique identifier controlled by the system that would make up part of a primary key of a primary key that is made up of multiple candidate keys in a composite key.

For an example, let’s take a look at sample data from the DataStax docs on consistent hashing.

For example, if you have the following data:

name age car gender
jim 36 camaro M
carol 37 345s F
johnny 12 supra M
suzy 10 mustang F

The database assigns a hash value to each partition key:

Partition key Murmur3 hash value
jim -2245462676723223822
carol 7723358927203680754
johnny -6723372854036780875
suzy 1168604627387940318

Each node in the cluster is responsible for a range of data based on the hash value.

Hash values in a four node cluster

DataStax Enterprise places the data on each node according to the value of the partition key and the range that the node is responsible for. For example, in a four node cluster, the data in this example is distributed as follows:

Node Start range End range Partition key Hash value
1 -9223372036854775808 -4611686018427387904 johnny -6723372854036780875
2 -4611686018427387903 -1 jim -2245462676723223822
3 0 4611686018427387903 suzy 1168604627387940318
4 4611686018427387904 9223372036854775807 carol 7723358927203680754

So there ya go, that’s consistent hashing and how it works in a distributed database like Apache Cassandra, the derived distributed database DataStax Enterprise, or the mostly defunct RIP Riak. If you’d like to dig in further, I’ve also found Distributed Hash Tables interesting and also a host of other articles that delve into coding up a consistent has table, respective ring, and the whole enchilada. Check out these articles for more information and details:

    • Simple Magic Consistent by Mathias Meyer @roidrage CTO of Travis CI. Mathias’s post is well written and drives home some good points.
    • Consistent Hashing: Algorithmic Tradeoffs by Damien Gryski @dgryski. This post from Damien is pretty intense, and if you want code, he’s got code for ya.
    • How Ably Efficiently Implemented Consistent Hashing by Srushtika Neelakantam. Srushtika does a great job not only of describing what consistent hashing is but also has drawn up diagrams, charts, and more to visualize what is going on. But that isn’t all, she also wrote up some code to show nodes coming and going. A really great post.

For more on distributed database things to know subscribe to the blog, of course the ole’ RSS feed works great too, and follow @CompositeCode on Twitter for blog updates.

Distributed Database Things to Know Series

  1. Consistent Hashing (this post)
  2. Apache Cassandra Datacenter & Racks


Consistent Hashing – Learning About Distributed Databases :: Issue 002

One of the core tools in the belt of the distributed database is consistent hashing. In Riak this is especially true, as it stands at the core of a Riak Cluster. Hashing, using a hash function, is an algorithm that maps data to variable length to data that’s fixed. In other words, odd things like the name of things mapped to integers. Consistent hashing is a special kind of hashing that provides the pattern for mapping keys and all related functionality around a cluster ring in Riak.

Consistent hashing was originally devised by David Karger, a professor of computer science at MIT (Massachusetts Institute of Technology). He’s also known for Karger’s Algorithm, a Monte Carlo method that computes the minimum cut in a connected graph (graph theory related stuff). Along with these developments he’s been part of many other efforts and contributed to computer science in many ways.

Remapping, Mapping and Keeping Distributed (& Available)

One key property of a consistent hash is that it minimizes the number of keys that must be remapped. With a regular hash changes, the entire key hash must be remapped.

Consistent hashing is based around mapping each object to a point of a circle. The system maps each storage bucket to pseudo-randomly distributed points on the edge of this circle.

The system finds where to place the object based on the key on the edge of the circle. It then walks the circle falling into the first bucket it finds. This results in the buckets containing the resources between its point and the next bucket point.

When a bucket disappears for any reason, the pseudo randomly mapped objects will now get re-mapped to different buckets. When a bucket appears, such as becoming available again or being added, a similar process occurs.

The Basho Docs describe in brief that,

Consistent hashing is a technique used to limit the reshuffling of keys when a hash-table data structure is rebalanced (when slots are added or removed). Riak uses consistent hashing to organize its data storage and replication. Specifically, the vnodes in the Riak Ring responsible for storing each object are determined using the consistent hashing technique.

NOTES: This is not a single blog entry topic by any means. This is merely a cursory look at consistent hashing. This entry I aimed to provide a basic description and coverage of the actions around consistent hashing. For more information and to dive even deeper into consistent hashing I’ve included a few links that have extensive information on the topic:

Write the Docs, Proper Portland Brew, Hack n’ Bike and Polyglot Conference 2013

Blog Entry Index:

I just wrapped up a long weekend of staycation. Monday kicked off Write the Docs this week and today, Tuesday, I’m getting back into the saddle.

Write the Docs

The Write the Docs Conference this week, a two day affair, has kicked off an expanding community around document creation. This conference is about what documentation is, how we create documentation as technical writers, writers, coders and others in the field.

Not only is it about those things it is about how people interact and why documentation is needed in projects. This is one of the things I find interesting, as it seems obvious, but is entirely not obvious because of the battle between good documentation, bad documentation or a complete lack of documentation. The later being the worse situation.

The Bloody War of Documentation!

At this conference it has been identified that the ideal documentation scenario is that building it starts before any software is even built. I do and don’t agree with this, because I know we must avoid BDUF (Big Design Up Front). But we must take this idea, of documentation first, in the appropriate context of how we’re speaking about documentation at the conference. Just as tests & behaviors identified up front, before the creation of the actual implementation is vital to solid, reliable, consistent, testable & high quality production software, good documentation is absolutely necessary.

There are some situations, the exceptions, such as with agencies that create software, in which the software is throwaway. I’m not and don’t think much of the conference is about those types of systems. What we’ve been speaking about at the conference is the systems, or ecosystems, in which software is built, maintained and used for many years. We’re talking about the APIs that are built and then used by dozens, hundreds or thousands of people. Think of Facebook, Github and Twitter. All of these have APIs that thousands upon thousands use everyday. They’re successful in large part, extremely so, because of stellar documentation. In the case of Facebook, there’s some love and hate to go around because they’ve gone between good documentation and bad documentation. However whenever it has been reliable, developers move forward with these APIs and have built billion dollar empires that employ hundreds of people and benefit thousands of people beyond that.

As developers that have been speaking at the conference, and developers in the audience, and this developer too all tend to agree, build that README file before you build a single other thing within the project. Keep that README updated, keep it marked up and easy to read, and make sure people know what your intent is as best you can. Simply put, document!

You might also have snarkily asked, does Write the Docs have docs,why yes, it does: <- Give em’ a read, they’re solid docs.

Portland Proper Brew

Today while using my iPhone, catching up on news & events over the time I had my staycation I took a photo. On that photo I used Stitch to put together some arrows. Kind of a Portland Proper Brew (PPB) with documentation. (see what I did there!) It exemplifies a great way to start the day.

Everyday I bike (or ride the train or bus) in to downtown Porltand anywhere from 5-9 kilometers and swing into Barista on 3rd. Barista is one of the finest coffee shops, in Portland & the world. If you don’t believe me, drag your butt up here and check it out. Absolutely stellar baristas, the best coffee (Coava, Ritual, Sightglass, Stumptown & others), and pretty sweet digs to get going in the morning.

I’ll have more information on a new project I’ve kicked off. Right now it’s called Bike n’ Hack, which will be a scavenger style code hacking & bicycle riding urban awesome game. If you’re interested in hearing more about this, the project, the game & how everything will work be sure to contact me via twitter @adron or jump into the bike n’ hack github organization and the team will be adding more information about who, what, where, when and why this project is going to be a blast!

Polyglot Conference & the Zombie Apocalypse

I’ll be teaching a tutorial, “Introduction to Distributed Databases” at Polyglot Conference in Vancouver in May!  So it has begun & I’m here for you! Come and check out how to get a Riak deployment running in your survival bunker’s data center. Zombies or just your pointy hair boss scenarios of apocalypse we’ll discuss how consistent hashing, hinted handoff and gossipping can help your systems survive infestations! Here’s a basic outline of what I’ll cover…

Introducing Riak, a database designed to survive the Zombie Plague. Riak Architecture & 5 Minute History of Riak & Zombies.

Architecture deep dive:

  • Consistent Hashing, managing to track changes when your kill zone is littered with Zombies.
  • Intelligent Replication, managing your data against each of your bunkers.
  • Data Re-distribution, sometimes they overtake a bunker, how your data is re-distributed.
  • Short Erlang Introduction, a language fit for managing post-civil society.
  • Getting Erlang

Installing Riak on…

  • Ubuntu, RHEL & the Linux Variety.
  • OS-X, the only user centered computers to survive the apocolypse.
  • From source, maintained and modernized for humanities survival.
  • Upgrading Riak, when a bunker is retaken from the zomibes, it’s time to update your Riak.
  • Setting up

Devrel – A developer’s machine w/ Riak – how to manage without zombie bunkers.

  • 5 nodes, a basic cluster
  • Operating Riak
  • Starting, stopping, and restarting
  • Scaling up & out
  • Managing uptime & data integrity
  • Accessing & writing data

Polyglot client libraries

  • JavaScript/Node.js & Erlang for the zombie curing mad scientists.
  • C#/.NET & Java for the zombie creating corporations.
  • Others, for those trying to just survive the zombie apocolypse.

If you haven’t registered for the Polyglot Conference yet, get registered ASAP as it could sell out!

Some of the other tutorials that are happening, that I wish I could clone myself for…

That’s it for updates right now, more code & news later. Cheers!