Riak 1.4 – A Few Notes, Notices & Thoughts…

The release notes for Riak 1.4 can be found via github

Two things have worked together that made me want to write up the new Riak 1.4 features. With Riak 1.4 hitting the streets and the work I’ve been doing with CorrugatedIron there are a few features that are going to add icing the cake. If you want to dive more into the release, check out the release notes. If you’re interested in the .NET Client CorrugatedIron, check it out here or check out the code on github. Now on to the client APIs.

riak-attach changed to not nuke a node

So when issuing the attach command like…

[sourcecode language=”bash”]
riak attach
[/sourcecode]

…the command attaches to the named pipe to communicate with the running erlang nodes. Now when you hit Ctrl-C it kills just the pipe versus killing the pipe and riak node that you’re on. This is something that has bit me in the keister more than a few times. Bringing down a node or two while working on viewing what is going on with a node. This leads me to the next enhancement.

riak-admin transfers

If you’re using riak_kv_bitcask_backend, riak_kv_eleveldb_backend or riak_kv_memory_backend the riak-admin transfers command now shows per-transfer progress and displays long node names better. Giving you a better idea of what and where things are going. The way this is reported depends slightly on the specific back end. For bitcask or in memory back end the progress is calculated by the keys already transferred out of the total keys, where as the level DB back end calculates based on bytes transferred. Based on this the level DB calculation can get slightly off over time.

Protocol Buffers & Multiple Interface Binding

Protocol Buffers can now bind to multiple ports and interfaces, so clients such as CorrugatedIron for .NET (http://corrugatediron.org/), Riakjs (http://riakjs.com/) can now bind to the Protocol Buffers outside of the set configuration. For more on Riak configuration around the binding, check out the Basho Docs (http://docs.basho.com/riak/latest/references/Configuration-Files/). This also brings feature parity around interface binding equal to that of the HTTP interfaces. This changes the pb_port and pb_ip to a single pb setting which is now a list of IP and port pairs.

Total radness in paging, of 2i

Secondary indexes now have results available via pagination. Check out this PR for bunches more info.

Client-specified Timeouts

Milliseconds can now be assigned to a timeout value for clients. This can be used for object manipulation around fetch, store and delete, listing buckets or keys. This takes care of some time out issues that may have been occurring during certain types of requests. This will come in handy for asynchronous and pivotal if anyone goes the synchronous route.

Bucket Properties for Protocol Buffers

If you’re needing to reset a bucket to it’s defaults, this is now possible. Besides a reset to defaults all bucket properties are now usable for protocol buffer usage. This can definitely help client usage of protocol buffers in a dramatic way.

List-buckets Streaming – Realtime

Listing keys or buckets via a streaming request will send bucket names to the client as received. This prevents any need to wait for a request from all nodes to respond. This can help with response time and time outs from the client point of view. This gives the ability to use the streaming features with Node.js, C#, Java and other languages and frameworks that support realtime streaming data feeds.

…these are the features that have jumped out at me, so until next release.

Back in the Bosh Bunker

In the last post on the topic of Bosh I put together a simple Cloud Foundry environment using the tools & repos of Stark & Wayne. Even though the bootstrap is a great way to get an environment up and started, it doesn’t explain a lot of things about Bosh. So let’s take a look at what we’re dealing with here.

Bosh – What is it?

Bosh handles deployment and upgrades of Cloud Foundry environments. However, it isn’t particularly limited to just Cloud Foundry. It’s been used to launch Riak Clusters, setup Redis, Cassandra, CouchDB and other services that don’t just fit neatly in the Cloud Foundry services design.

It is a very important tool in regards to keeping a Cloud Foundry environment up to date with the latest bits, security fixes, bugs and related elements. Bosh is broken down into several key components that work together to handle these deployment and maintenance tasks.

To put it another way, Bosh aims to give ops or devops the ability to throw together an entire stack to deploy. Bosh starts with stemcells, packages and jobs as the core concepts of how it works.

Bosh is used, within Cloud Foundry and prospectively for whatever anyone would want to use it for, to launch instances, change out the instances, change networking values, IPs and other configuration information. Overall it kind of rolls a lot of other tooling (chef, puppet) together into one tool. How well it does this is up for debate, but I’m not arguing what it is here, just going to get some definitions here.

The Pieces of BOSH

Stemcells

A stem cell or stemcell is something that is a bit hard to track down a definition for. I’m taking a stab at it with what I know a stem cell is, so if you have any corrections please comment below – I’ll be more than happy to add a correction or three. Overall I understand a stem cell to be a complete framework stack built on some sort of virtual image. It can be thought of as the recipe for building an operating systems that will act as an active member of a Cloud Foundry environment. In some situations, such as with a distributed database like Riak, it becomes not so much a member of the Cloud Foundry environment itself but an active node available to a distributed database cluster. This can then be used as a distributed database that is managed by Bosh and accessible within the Cloud Foundry ecosystem.

Packages

A package is sourec with the appropriate scripts for building it into usable binaries. Think of this as a package in the Node.js NPM, Gems (Ruby/Rails), or Nugets (.NET) worlds. It’s something that Bosh will pull in and compile on demand.

There are a few key parts to a package, referred to as package specs. These are: name, dependencies and files. Of the specs, the name and files are really the only required parts. The dependencies are an optional list of other packages this package would depend on.

Jobs

This is pretty self-descriptive. The jobs within Bosh spool up, start servers and services and other miscellaneous responsibilities as needed.

Relavent Sites, Documentation & Key Content
  • The Cloud Foundry Bosh Repo => This is the actual code repository on Github. If you’re in need of really diving into what it does, there’s always the possibility of reading the code!

  • Cloud Foundry Documentation => This has links to documentation related to Bosh that is pivotal (no pun intended).

  • Bosh Documentation => This is the Bosh documentation. It’s almost a good idea to start on the “Running Cloud Foundry” part of the documentation. This documentation can use your help (it’s super sparse at the moment), so if you get going and using Bosh, please contribute with examples and other material.

  • Stark & Wayne Repositories => I already mentioned them, but they’re likely some of the best material out there.

  • Bosh DB => This is a site & repository that Brian McClain @brianmmcclain put together to keep track of bosh stem cells and other repositories related to launching certain tools, services, servers and other things in Cloud Foundry environments via Bosh.

  • Dr Nic’s intro to Bosh => This page serves as an into and description of what’s going on in Bosh. I read this a while back for my own kick off with the Bosh Tool.

Summary

This is what I’ve found and put together as a good starting point. I still think there’s a bit of confusion around what Bosh is, how it works, how to get started with it and having it clearly defined on the web. Documentation is getting better, but still needs a lot of work (remember, you too can contribute). For systems outside of Cloud Foundry it also is a bit difficult and sometimes sketchy to use Bosh as the primary means of deployment, maintenance and upgrading. But just like the documentation that is also getting better. I’ll have more coming in the near future regarding what Bosh is, how it works, and things you can do with it – until then check out Dr Nic’s material for the most up to date how-to and related documentations and videos. He’s done some great work with the tooling and continues to knock it out of the park.

Keep reading and I’ll have more definitions, outlines of what is what, and the entire inception that Bosh is.

Consistent Hashing – Learning About Distributed Databases :: Issue 002

One of the core tools in the belt of the distributed database is consistent hashing. In Riak this is especially true, as it stands at the core of a Riak Cluster. Hashing, using a hash function, is an algorithm that maps data to variable length to data that’s fixed. In other words, odd things like the name of things mapped to integers. Consistent hashing is a special kind of hashing that provides the pattern for mapping keys and all related functionality around a cluster ring in Riak.

Consistent hashing was originally devised by David Karger, a professor of computer science at MIT (Massachusetts Institute of Technology). He’s also known for Karger’s Algorithm, a Monte Carlo method that computes the minimum cut in a connected graph (graph theory related stuff). Along with these developments he’s been part of many other efforts and contributed to computer science in many ways.

Remapping, Mapping and Keeping Distributed (& Available)

One key property of a consistent hash is that it minimizes the number of keys that must be remapped. With a regular hash changes, the entire key hash must be remapped.

Consistent hashing is based around mapping each object to a point of a circle. The system maps each storage bucket to pseudo-randomly distributed points on the edge of this circle.

The system finds where to place the object based on the key on the edge of the circle. It then walks the circle falling into the first bucket it finds. This results in the buckets containing the resources between its point and the next bucket point.

When a bucket disappears for any reason, the pseudo randomly mapped objects will now get re-mapped to different buckets. When a bucket appears, such as becoming available again or being added, a similar process occurs.

The Basho Docs describe in brief that,

Consistent hashing is a technique used to limit the reshuffling of keys when a hash-table data structure is rebalanced (when slots are added or removed). Riak uses consistent hashing to organize its data storage and replication. Specifically, the vnodes in the Riak Ring responsible for storing each object are determined using the consistent hashing technique.

NOTES: This is not a single blog entry topic by any means. This is merely a cursory look at consistent hashing. This entry I aimed to provide a basic description and coverage of the actions around consistent hashing. For more information and to dive even deeper into consistent hashing I’ve included a few links that have extensive information on the topic:

An Ubuntu Riak #devrel

Setting up Riak to test out, prototype against, develop and use in a general way is extremely easy. Just setup a devrel on your local development machine. This is however limited to certain *nix based operating systems, so Windows as a dev platform is out – but not completely. Get a virtual machine running on Ubuntu, RHEL or some other Linux instance and you’re ready to go. What I’ve put together here is an example of getting a devrel up and running with an Ubuntu Virtual Machine.

Step 1: Get the basic reqs installed.

Step 2: With each of the nodes, now join and build the Riak cluster.

  • Start each node.[sourcecode language=”bash”]
    dev1/bin/riak start
    dev2/bin/riak start
    dev3/bin/riak start
    dev4/bin/riak start[/sourcecode]
  • Check to determine that the riak services are running.[sourcecode language=”bash”]
    ps aux | grep beam[/sourcecode]
  • Add each node to a single node.[sourcecode language=”bash”]
    dev2/bin/riak-admin cluster join dev1@127.0.0.1
    dev3/bin/riak-admin cluster join dev1@127.0.0.1
    dev4/bin/riak-admin cluster join dev1@127.0.0.1[/sourcecode]
  • Set and get the cluster plan.[sourcecode language=”bash”]
    dev2/bin/riak-admin cluster plan[/sourcecode]

    NOTE: This plan can be run from any of the instances.

  • Last, commit the cluster plan.[sourcecode language=”bash”]
    dev2/bin/riak-admin cluster commit[/sourcecode]

    NOTE: The commit can also be run from any of the instances.

Backup Riak – Learning About Distributed Databases :: Issue 001

I’ve got more than a few series in the queue, so why not another one eh! The intent is, I’ll grab a specific topic to break down and add details to related to distributed systems, primarily around Riak. I will however diverge into other distributed databases too, but I’ll primarily be sticking to Riak. Without more introduction, the first topic is…

Backing Up and Recovery of Riak (Nodes)

I’ve been asked approximately 423,983,321.7 zillion times how this is done. So here’s a quick summary and respective links to the best ways to backup Riak, how to recover nodes.

When backing up Riak there are two key things that need copied to the backup storage; the ring and data directories. Each of these things are specific based on the backend used with Riak. In addition to the core backup containing the ring and data, another good thing to backup is the configuration directory. When recovering this comes in useful.

For the locations of data, it depends slightly based on the operating system being used. The two big variances are OS-X and Linux Distros. On OS-X the data path, ring data and configuration are located at the locations listed below:

  • Bitcask data: ./data/bitcask
  • LevelDB data: ./data/leveldb
  • Ring data: ./data/riak/ring
  • Configuration: ./etc

For each specific distro, there are slight variations on where the locations are, for a full list check out the Basho Riak docs on backups. But on Linux distros the paths are as follows:

Debian and Ubuntu

  • Bitcask data: /var/lib/riak/bitcask
  • LevelDB data: /var/lib/riak/leveldb
  • Ring data: /var/lib/riak/ring
  • Configuration: /etc/riak

Fedora and RHEL

  • Bitcask data: /var/lib/riak/bitcask
  • LevelDB data: /var/lib/riak/leveldb
  • Ring data: /var/lib/riak/ring
  • Configuration: /etc/riak

Other Operating System Paths

Freebsd

  • Bitcask data: /var/db/riak/bitcask
  • LevelDB data: /var/db/riak/leveldb
  • Ring data: /var/db/riak/ring
  • Configuration: /usr/local/etc/riak

SmartOS

  • Bitcask data: /var/db/riak/bitcask
  • LevelDB data: /var/db/riak/leveldb
  • Ring data: /var/db/riak/ring
  • Configuration: /opt/local/etc/riak

Solaris

  • Bitcask data: /opt/riak/data/bitcask
  • LevelDB data: /opt/riak/data/leveldb
  • Ring data: /opt/riak/ring
  • Configuration: /opt/riak/etc

When backing things up, it’s important to note that each node could have slightly inconsistent data. The data however is rebuilt by the Riak read-repair system once it is recovered and brought into use.

Backup Jobs

One of the easiest ways to backup Riak is to setup a cron job with your choice of cp, rsync or tar. Then just get those files onto whatever your choice of backup medium. An example tar cron job to backup a Bitcask backend is shown below (snagged from the documentation) just to give you an idea of where to start.

[sourcecode language=”bash”]tar -czf /mnt/riak_backups/riak_data_`date +%Y%m%d_%H%M`.tar.gz /var/lib/riak/bitcask /var/lib/riak/ring /etc/riak
[/sourcecode]

For a leveldb back end the most important thing to note is that the node must be stopped. The basic workflow of backing up a node in this manner is to stop the node, backup the data, ring and configuration and then start the node back up.

Backup Recovery / Restoring

When recovering data on a node that is replacing an existing node that has the same name (fully qualified or IP) then follow the steps below:

  1. Install Riak
  2. Restore the old node’s configuration, data & ring.
  3. Start the node

Once you’ve got the node started back up it’s a good idea to do a ping or status against the node to verify it is in a good state.

If node names have been changed there are additional steps.

  1. Mark the original instance down[sourcecode language=”bash”]riak-admin down [/sourcecode]
  2. Join the restored cluster  [sourcecode language=”bash”]riak-admin join [/sourcecode]
  3. Replace the original with [sourcecode language=”bash”]riak-admin cluster force-replcae  [/sourcecode]
  4. Get the cluster plan built [sourcecode language=”bash”]riak-admin cluster plan[/sourcecode]
  5. Commit the changes [sourcecode language=”bash”]riak-admin cluster commit[/sourcecode]
  6. Change the -name setting in the vm.args configuration file to match the new name.
  7. Change & verify that the IP reflects the instances IP in the app.config for http and protocol buffer interfaces.

Cluster Backups via Riak Enterprise Multi-Data Center (MDC)

In the above sections I wrote about the traditional backup approaches. This is very similar to the way RDBMS are backed up. However, with a distributed system like Riak there is another great alternative if you’re utilizing multiple datacenters and Enterprise Riak. In this version of Riak, which is basically Riak with additional features and capabilities, one of the possible backup scenarios is to use the Multi-Data Center, or MDC, to replicate a duplicate cluster and use it as an active, real-time and always ready backup.

One workflow that is an exceptionally effective way to provide backups is to setup the “backup” cluster beside the current operative cluster. As an example, if your cluster is operational in AWS and it is running in X region and Y zone then you’d want to put the backup cluster in that same region and zone. Once you’ve setup Riak Enterprise and MDC, then just setup a full sync. Once the full sync is done you can then remove the backup cluster and it provides a point in time backup of the data.

[sourcecode language=”bash”]riak-repl start-fullsync[/sourcecode]

It’s easy to schedule full sync operations to low usage periods and it is also possible to pause and resume full sync operations.

[sourcecode language=”bash”]riak-repl resume-fullsync<br />riak-repl pause-fullsync[/sourcecode]

The variations on backing up data with Riak Enterprise and MDC are pretty expansive. Doing a point in time, maintaining a secondary live copy of the data, using the replication as a data dump to another cluster or even just using the MDC replication to dump all of the data to a single instance.

File System Snapshots

One other technique that is extremely efficient, fast and thorough is snapshotting the file system. The backup workflow for snapshots is extremely easy. First stop Riak, then snapshot, then start Riak again. Of all the methods, snapshotting is one of the easiest of the options. Just like setting up a cron job, automating snapshots based on some pre-defined schedule and meshing that with automated start and stop of Riak provides a very thorough backup.

With these options, have fun strategizing your stratagems into strategies for backups.

Diskettes

One of the oldest, tried and true backups is the old diskette. The bestest way to backup with diskettes is to backup each node on three diskettes each. The send one of each diskettes to a geographically dispersed to a bank lock box or other secure facility. Do this for each node, and if need be use as many diskettes for each node as needed. A particularly useful method is to use the sharded zip strategy to stripe a backup across many diskettes. Once each lock box has a copy of the node for each node in the cluster, you’ll have one of the most secure backups in existence. Nothing compares to the diskette backup!

References:

  1. Basho Docs – Backups
  2. Basho Docs – MDC Full Sync