Riak Developer Guidance

The “Client Round Robin Anti-Pattern”

One feature often available in Riak client software (including the CorrugatedIron .NET client, the riak-js client, and others) is the ability to send requests to the Riak cluster in a round-robin fashion. This means the IP address of each node in the Riak cluster is entered into a config file for the client. The client then cycles through that list as it sends requests to read, write, or delete data in the database.
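To make the pattern concrete, here is a minimal sketch of client-side round robin. It is not taken from CorrugatedIron or riak-js; the `RoundRobinClient` class and the node addresses are hypothetical and only illustrate the idea of a client walking its own node list.

```python
from itertools import cycle

# Hypothetical node addresses; a real client would read these from its config file.
NODES = ["10.0.0.1:8098", "10.0.0.2:8098", "10.0.0.3:8098"]


class RoundRobinClient:
    """Illustrative client that rotates through every node it was configured with."""

    def __init__(self, nodes):
        self._nodes = cycle(nodes)

    def next_node(self):
        # Each request goes to the next address in the list, wrapping around at the end.
        return next(self._nodes)


client = RoundRobinClient(NODES)
targets = [client.next_node() for _ in range(4)]
```

Note that the client has to hold the full node list to do this, which is exactly the knowledge of the data tier that the rest of this post argues against.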

A client that is responsible for, and knowledgeable about, the data tier of an application’s architecture is an immediate red flag! The principle of SoC (Separation of Concerns) states:

“SoC is a principle for separating a computer program into distinct sections, such that each section addresses a separate concern.”

Having the client provide a network-tier layer to round robin communication with the database mixes concerns that should be separated. Below is some basic guidance on eliminating this SoC issue.

  • Client ONLY sends and receives communication: The client, especially with a distributed system like Riak, should only deal with sending and receiving information from the cluster, or from a facade that provides an interface to that cluster.
  • Another layer should deal with network communication, the division of nodes, and node communication. Ideally, in the case of Riak and most distributed systems, this should be handled at the network device layer (router).
  • The network device (router) layer would ideally have a way (likely through software) to automate the handling of node failure, inclusion, or exclusion within the cluster. If a node goes down, the network device should immediately stop all client communication with that node and route it to an active node instead.
  • The node itself needs to keep its state continually available to the network. Ideally the network state would identify any addition or removal of a node and, if possible, the immediate failure of a node. Of course it isn’t always possible to be informed of a failure, but the first line of defense should start within the cluster itself, among the nodes.
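The separation described in the list above can be sketched roughly as follows. This is an in-process simulation under stated assumptions, not a real network setup: `Router` stands in for the network device layer, `Client` for the application client, and both names (along with the node names) are hypothetical.

```python
class Router:
    """Stand-in for the network device layer: it alone knows the node list."""

    def __init__(self, nodes):
        self._healthy = list(nodes)
        self._next = 0

    def mark_down(self, node):
        # Driven by health checks or cluster state, never by the client.
        if node in self._healthy:
            self._healthy.remove(node)

    def mark_up(self, node):
        if node not in self._healthy:
            self._healthy.append(node)

    def route(self):
        if not self._healthy:
            raise RuntimeError("no healthy nodes available")
        node = self._healthy[self._next % len(self._healthy)]
        self._next += 1
        return node


class Client:
    """The client only sends and receives; it holds one endpoint, not a node list."""

    def __init__(self, endpoint):
        self._endpoint = endpoint

    def get(self, key):
        # A real client would send a network request to the single endpoint here.
        node = self._endpoint.route()
        return f"GET {key} via {node}"


router = Router(["node-a", "node-b", "node-c"])
client = Client(router)
router.mark_down("node-b")  # the failure is handled outside the client
results = [client.get("user:1") for _ in range(3)]
```

The client code does not change when a node fails or rejoins; only the routing layer's state does.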

The Anti-Pattern

Having the client handle all of these parts of the functional architecture leads to a number of problems beyond simply breaking the SoC guidance. A client that attempts to track and be aware of the individual nodes in the cluster takes on a huge responsibility.

Take, for instance, the riak-js client. If a node goes down, the client needs to know which node it was. For a few seconds (yes, you have to wait entire seconds at this level) the node will be gone and the client won’t know it is down; it just has to wait. When the communication times out, the client then has the responsibility of marking that particular node as down. At that point the client must record which node it is in some type of data repository local to the client. The client must also set a time, or find some other way, to identify when the node comes back up. Several questions start to come up, such as:

  • Does the client do an arbitrary test to determine when the node comes back up?
  • When the node comes back up is it considered alive or damaged?
  • How would the client manage the IP (or identifier) of the node that has gone down?
  • How long would the client store that the node is down?

The list of questions can get long pretty quickly, hence the bad karma of not following good practice around separating your concerns appropriately! One has to be careful, or a god class might be right around the corner! That’s it for this quick journey into some distributed database usage guidelines. Until next time, happy data sciencing.  😉

Aggregated Web Services Pt I

I’ve been working through an architecture scenario recently.  This is what I have so far: multiple external web services, some SOAP and some REST, plus data sources in a SQL Server database, Azure Table Storage, and flat files of some sort.  All of these sources need to be accessed by a web site for read-only display.  In the diagram below I’ve drawn out the three primary points of reference.

  1. The services that are external: Contract, Table Store, Document, Search, and Help Desk Services.
  2. The Website Web Services Facade, which would be an aggregated layer that then provides the various services via an internally controlled services layer.
  3. On top of that will be the web site, accessing the services from the aggregated layer with jQuery.
Basic Three Tiers

After creating this to get some basic idea of how these things should fit together, I moved on to elaborate on the web services aggregation layer.  What I’ve sketched in this diagram is the correlation to architectural elements and the physical environments they would prospectively be deployed to.  Again, broken out by the three tiers as shown above.

  1. Website and the respective jQuery, AJAX, and Markup/CSS for display.
  2. Web Services, which include the actual architecture breakout:  Facade Interface, Facade Aggregation Component, Cache & Non-cached DTOs (Data Transfer Objects), Cache Database/Storage, Caching Process, Lower Layer Aggregation Component, and the Poller Process for polling the external services.
  3. The cache is intended to use SQL Server, thus the red call out to the physical SQL Server cluster.
  4. The last tier, which isn’t being developed but just provides data, is the External Services, shown primarily to provide a full picture of all the layers.
Aggregate Web Services
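As a rough sketch of how the facade, cache, and poller pieces described above might fit together: the `AggregationFacade` class, the service names, and the returned fields are hypothetical, and an in-memory dict stands in for the SQL Server cache purely to keep the example self-contained.

```python
from typing import Callable, Dict


class AggregationFacade:
    """Sketch of a facade: the web site reads cached DTOs, a poller refreshes them."""

    def __init__(self, sources: Dict[str, Callable[[], dict]]):
        # name -> callable standing in for a call to an external service
        self._sources = sources
        # In-memory stand-in for the cache database (an assumption for this sketch).
        self._cache: Dict[str, dict] = {}

    def poll(self) -> None:
        """Poller step: fetch from each external service and refresh the cache."""
        for name, fetch in self._sources.items():
            try:
                self._cache[name] = fetch()
            except Exception:
                # Keep the last good value (if any) when a service is unavailable.
                pass

    def get(self, name: str) -> dict:
        """Facade call used by the web site: served from the cache only."""
        return self._cache.get(name, {})


def failing_service() -> dict:
    raise ConnectionError("service unavailable")


facade = AggregationFacade({
    "contract": lambda: {"id": 1, "status": "active"},
    "help_desk": failing_service,
})
facade.poll()
contract = facade.get("contract")
help_desk = facade.get("help_desk")
```

Because the web site only ever reads from the cache, a slow or failing external service affects how fresh the data is, not whether a page can render.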

I primarily drew up these diagrams to discuss the architecture, poke holes in it, and so on. Speaking of which, if any readers have input or questions, or are curious, please type up a comment and I’ll answer it ASAP.

As the effort continues, there are some other great how-to write-ups I will be putting together, covering everything from unit testing and mocking (with Moq) to setting up test services and other elements of the project.  I’ll have all this coming, so keep reading and let me know what you think of the design so far, subscribe via e-mail (look to the metadata section below), or grab the RSS for the blog (see below also).
