Creating Distributed Database Application Starter Kits

I’ve boarded a bus, and as always, when I board a bus I almost always code. Unless of course there are people I’m hanging out with then I chit chat, but right now this is the 212 and I don’t know anybody on this chariot anyway. So into the code I go.

I’ve been re-reviewing the Docker and related collateral we offer at DataStax. In that review it seems like it would be worth having some starter kit applications along with these “default” Docker options. This post I’ve created to provide the first language & tech stack of several starter kits I’m going to create.

Starter Kit – The Todo List Template

This first set of starter kits will be based upon a todo list application. It’s really simple, minimal in features, and offers a complete top to bottom implementation of a service, and an application on top of that service all built on Apache Cassandra. In some places, and I’ll clearly mark these places, I might add a few DataStax Enterprise features around search, analytics, or graph.

The Todo List

Features: The following detail the features, from the users perspective, that this application will provide. Each implementation will provide all of these features.

  • A user wants to create a user account to create todo lists with.
  • A user wants to be able to store a username, full name, email, and some simple notes with their account.
  • A user wants to be able to create a todo list that is identified by a user defined name. (i.e. “Grocery List”, “Guitar List”, or “Stuff to do List”)
  • A user want to be able to logout and return, then retrieve a list from a list of their lists.
  • A user wants to be able to delete a todo list.
  • A user wants to be able to update a todo list name.
  • A user wants to be able to add items to a todo list.
  • A user wants to be able to update items in the todo list.
  • A user wants to be able to delete items in a todo list.

Architecture: The following is the architecture of the todo list starter kit application.

  • Database: Apache Cassandra.
  • Service: A small service to manage the data tier of the application.
  • User Interface: A web interface using React/Vuejs ??

As you can see, some of the items are incomplete, but I’ll decide on them soon. My next review is to check out what I really want to use for the user interface, and also to get a user account system figured out. I don’t really want to create the entire user interface, but instead would like to use something like Auth0 or Okta.

May I Ask?

There are numerous things I’d love help with. Are there any user stories you think are missing? Should I add something? What would make these helpful to you? Leave a comment, or tweet at me @Adron. I’d be happy to get some feedback and other’s thoughts on the matter so that I can ensure that these are simple, to the point, usable, and helpful to people. Cheers!

Let’s Really Discuss Lock In

For to long lock-in has been referred to with an almost entirely negative connotation even though it can be inferred in positive and negative situations. The fact is that there’s a much more nuanced and balanced range to benefits and disadvantages of lock-in. Often this may even be referred to as this or that dependency, but either way a dependency often is just another form of lock in. Weighing those and finding the right balance for your projects can actually lead to lock-in being a positive game changer or something that simply provides one a basis in which to work and operate. Sometimes lock-in actually will provide a way to remove lock-in by providing more choices to other things, that in turn may provide another variance of lock-in.

Concrete Lock-in Examples

The JavaScript Lock-In

IT Security icons. Simplus seriesTake the language we choose to build an application in. JavaScript is a great example. It has become the singular language of the web, at least on the client side. This was long ago, a form of lock-in that browser makers (and standards bodies) chose that dictated how and in which direction the web – at least web pages – would progress.

JavaScript has now become a prominent language on the server side now too thanks to Node.js. It has even moved in as a first class language in serverless technology like AWS’s Lambda. JavaScript is a perfect example of a language, initially being a source of specific lock-in, but required for the client, that eventually expanded to allow programming in a number of other environments – reducing JavaScript’s lock in – but displacing lock in through abstractions to other spaces such as the server side and and serverless functions.

The .NET Windows SQL Server Lock In

IT Security icons. Simplus seriesJavaScript is merely one example, and a relatively positive one that expands one’s options in more ways than limits one’s efforts. But let’s say the decision is made to build a high speed trading platform and choose SQL Server, .NET C#, and Windows Server. Immediately this is a technology combination that has notoriously illuminated in the past * how lock-in can be extremely dangerous.

This application, say it was built out with this set of technology platforms and used stored procedures in SQL Server, locking the application into the specific database, used proprietary Windows specific libraries in .NET with the C# code, and on Windows used IIS specific advances to make the application faster. When it was first built it seemed plenty fast and scaled just right according to the demand at the time.

Fast forward to today. The application now has a sharded database when it hit a mere 8 Terabytes, loaded on two super pumped up – at least for today – servers that have many cores, many CPUs, GPUs, and all that jazz. They came in around $240k each! The application is tightly coupled to a middle tier, that is then sort of tightly coupled to those famous stored procedures, and the application of course has a turbo capability per those IIS Servers.

But today it’s slow. Looking at benchmarks and query times the database is having a hard time dealing with things as is, and the application has outages on a routine basis for a whole variation of reasons. Sometimes tracing and debugging solves the problems quickly, other times the servers just oversubscribe resources and sit thrashing.

Where does this application go? How does one resolve the database loading issues? They’ve already sunk a half million on servers, they’re pegged out already, horizontally scaling isn’t an option, they’re tightly coupled to Window Servers running IIS removing the possibility of effectively scaling out the application servers via container technologies, and other issues. Without recourse, this is the type of lock in that will kill the company if something is changed in a massive way very soon.

To add, this is the description of an actual company that is now defunct. I phrased it as existing today only to make the point. The hard reality is the company went under, almost entirely because of the costs of maintaining and unsustainable architecture that caused an exorbitant lock in to very specific tools – largely because the company drank the cool aid to use the tools as suggested. They developed the product into a corner. That mistake was so expensive that it decimated the finances of the company. Not a good scenario, not a happy outcome, and something to be avoided in every way! This is truly the epitomy of negative lock in.

Of course there’s this distinctive lock in we have to steer clear from, but there’s the lock in associated with languages and other technology capabilities that will help your company move forward faster, easier, and with increasing capabilities. Those are the choices, the ties to technology and capabilities that decision makers can really leverage with fewer negative consequences.

The “Lock In” That Enables

IT Security icons. Simplus seriesOne common statement is, “the right tool for the job”. This is of course for the ideal world where ideal decisions can be made all the time. This doesn’t exist and we have to strive for balance between decisions that will wreck the ship or decisions that will give us clear waters ahead.

For databases we need to choose the right databases for where we want to go versus where we are today. Not to gold plate the solution, but to have intent and a clear focus on what we want our future technology to hold for us. If we intend to expand our data and want to maintain the ability to effectively query – let’s take the massive SQL Server for example – what could we have done to prevent it from becoming a debilitating decision?

A solution that could have effectively come into play would have been not to shard the relational database, but instead to either export or split the data in a more horizontal way and put it into a distributed database store. Start building the application so that this system could be used instead of being limited by the relational database. As the queries are built out and the tight coupling to SQL Server removed, the new distributed database could easily add nodes to compensate for the ever growing size of the data stored. The options are numerous, that all are a form of lock-in, but not the kind that eventually killed this company that had limited and detrimentally locked itself into use of a relational database.

At the application tier, another solution could have been made to remove the ties to IIS and start figuring out a way to containerize the application. One way years ago would have been to move away from .NET, but let’s say that wasn’t really an option for other reasons. The idea to mimic containerization could have been done through shifting to a self-contained web server on Windows that would allow the .NET application to run under a singular service and then have those services spin off the application as needed. This would decouple from IIS, and enable spreading the load more quickly across a set number of machines and eventually when .NET Core was released offer the ability to actually containerize and shift entirely off of Windows Server to a more cost efficient solution under Linux.

These are just some ideas. The solutions of course would vary and obviously provide different results. Above all there are pathways away from negative lock in and a direction toward positive lock in that enables. Realize there’s the balance, and find those that leverage lock in positively.

Nuanced Pedantic Notes:

  • Note I didn’t say all examples, but just that this combo has left more than a few companies out on a limb over the years. There are of course other technologies that have put companies (people actually) in awkward situations too. I’m just using this combo here as an example. For instance, probably some of the most notorious lock in comes from the legal ramifications of using Oracle products and being tied into their sales agreements. On the opposite end of the spectrum, Stack Overflow is a great example of how choosing .NET and scaling with it, SQL Server, and related technologies can work just fine.

Riak Developer Guidance

The “Client Round Robin Anti-Pattern”

One of the features that is often available in Riak Client software (including the CorrguatedIron .NET Client, the riak-js client and others) is the ability to send requests to the Riak Cluster through a round robin style approach. What this means is each IP, of each node within the Riak Cluster is entered into a config file for the client. The client then goes through that list to send off requests to read, write or delete data in the database.

The client being responsible and knowledgeable about the data tier of the application in an architecture is an immediate red flag! The concept around SoC (Separation of Concerns) dictates that

“SoC is a principle for separating a computer program into distinct sections, such that each section addresses a separate concern.

Having the client provide a network tier layer to round robin communication with the database leaves us in a scenario that should be separated into individual concerns. Below is some basic guidance on eliminating this SoC issue.

  • Client ONLY sends and receives communication: The client, especially in the situation with a distributed system like Riak should only be dealing with sending and receiving information from the cluster or a facade that provides an interface for that cluster.
  • Another layer should deal with the network communication and division of nodes and node communication. Ideally, in the case or Riak, and most distributed systems this should be dealt with at the network device layer (router).
  • The network device (router) layer would ideally be able to have (through software likely) a way to automate the failure, inclusion or exclusion of nodes with the cluster system. If a node goes down, the network device should handle the immediate cessation of communication with that node from all clients, routing the communication accordingly to an active node.
  • The node itself needs to maintain a continual information state available to the network. Ideally the network state would identify any addition or removal of a node and if possible the immediate failure of a node. Of course it isn’t always possible to be informed of a failure, but the first line of defense should start within the cluster itself among the nodes.

The Anti-Pattern

Having the client handle all of these parts of the functional architecture leads to a number of problems, not merely that the guidance of the SoC concept is broken. With the client attempting to track and be aware of the individual nodes in the cluster, it sets the client with a huge responsibility.

Take for instance the riak-js client. If a node goes down the client will need to be aware of which node has gone down. For a few seconds (yes, you have to wait entire seconds at this level) the node will be gone and the client won’t know it is down. The client would just have to reasonably wait. When the communication times out, the client would then have to have the responsibility of marking that particular node as down. At this point the client must track which node it is in some type of data repository local to the client. The client must also set a time or some way to identify when the node comes back up. Several questions start to come up such as;

  • Does the client do an arbitrary test to determine when the node comes back up?
  • When the node comes back up is it considered alive or damaged?
  • How would the client manage the IP (or identifier) of the node that has gone down?
  • How long would the client store that the node is down?

The list of questions can get long pretty quick, thus the bad karma of not following a good practice around separating your concerns appropriately! One has to be careful, a god class might be right around the corner otherwise! That’s it for this quick journey into some distributed database usage guidelines. Until next, happy data sciencing.  😉

Back in the Bosh Bunker

In the last post on the topic of Bosh I put together a simple Cloud Foundry environment using the tools & repos of Stark & Wayne. Even though the bootstrap is a great way to get an environment up and started, it doesn’t explain a lot of things about Bosh. So let’s take a look at what we’re dealing with here.

Bosh – What is it?

Bosh handles deployment and upgrades of Cloud Foundry environments. However, it isn’t particularly limited to just Cloud Foundry. It’s been used to launch Riak Clusters, setup Redis, Cassandra, CouchDB and other services that don’t just fit neatly in the Cloud Foundry services design.

It is a very important tool in regards to keeping a Cloud Foundry environment up to date with the latest bits, security fixes, bugs and related elements. Bosh is broken down into several key components that work together to handle these deployment and maintenance tasks.

To put it another way, Bosh aims to give ops or devops the ability to throw together an entire stack to deploy. Bosh starts with stemcells, packages and jobs as the core concepts of how it works.

Bosh is used, within Cloud Foundry and prospectively for whatever anyone would want to use it for, to launch instances, change out the instances, change networking values, IPs and other configuration information. Overall it kind of rolls a lot of other tooling (chef, puppet) together into one tool. How well it does this is up for debate, but I’m not arguing what it is here, just going to get some definitions here.

The Pieces of BOSH

Stemcells

A stem cell or stemcell is something that is a bit hard to track down a definition for. I’m taking a stab at it with what I know a stem cell is, so if you have any corrections please comment below – I’ll be more than happy to add a correction or three. Overall I understand a stem cell to be a complete framework stack built on some sort of virtual image. It can be thought of as the recipe for building an operating systems that will act as an active member of a Cloud Foundry environment. In some situations, such as with a distributed database like Riak, it becomes not so much a member of the Cloud Foundry environment itself but an active node available to a distributed database cluster. This can then be used as a distributed database that is managed by Bosh and accessible within the Cloud Foundry ecosystem.

Packages

A package is sourec with the appropriate scripts for building it into usable binaries. Think of this as a package in the Node.js NPM, Gems (Ruby/Rails), or Nugets (.NET) worlds. It’s something that Bosh will pull in and compile on demand.

There are a few key parts to a package, referred to as package specs. These are: name, dependencies and files. Of the specs, the name and files are really the only required parts. The dependencies are an optional list of other packages this package would depend on.

Jobs

This is pretty self-descriptive. The jobs within Bosh spool up, start servers and services and other miscellaneous responsibilities as needed.

Relavent Sites, Documentation & Key Content
  • The Cloud Foundry Bosh Repo => This is the actual code repository on Github. If you’re in need of really diving into what it does, there’s always the possibility of reading the code!

  • Cloud Foundry Documentation => This has links to documentation related to Bosh that is pivotal (no pun intended).

  • Bosh Documentation => This is the Bosh documentation. It’s almost a good idea to start on the “Running Cloud Foundry” part of the documentation. This documentation can use your help (it’s super sparse at the moment), so if you get going and using Bosh, please contribute with examples and other material.

  • Stark & Wayne Repositories => I already mentioned them, but they’re likely some of the best material out there.

  • Bosh DB => This is a site & repository that Brian McClain @brianmmcclain put together to keep track of bosh stem cells and other repositories related to launching certain tools, services, servers and other things in Cloud Foundry environments via Bosh.

  • Dr Nic’s intro to Bosh => This page serves as an into and description of what’s going on in Bosh. I read this a while back for my own kick off with the Bosh Tool.

Summary

This is what I’ve found and put together as a good starting point. I still think there’s a bit of confusion around what Bosh is, how it works, how to get started with it and having it clearly defined on the web. Documentation is getting better, but still needs a lot of work (remember, you too can contribute). For systems outside of Cloud Foundry it also is a bit difficult and sometimes sketchy to use Bosh as the primary means of deployment, maintenance and upgrading. But just like the documentation that is also getting better. I’ll have more coming in the near future regarding what Bosh is, how it works, and things you can do with it – until then check out Dr Nic’s material for the most up to date how-to and related documentations and videos. He’s done some great work with the tooling and continues to knock it out of the park.

Keep reading and I’ll have more definitions, outlines of what is what, and the entire inception that Bosh is.

Free Windows Azure Images…

I created these, prospectively for use in presentations or whatever may call for something of this sort.  Since others may be talking about Windows Azure and the innards of the system also, I thought I’d contribute these for free use on the web.  I don’t even really care if you attribute me (even though it would be cool if you did).  I’ll be adding a few more over the next couple of weeks as I prepare for some pending presentations.  (click image to see full size)

Windows Azure
Windows Azure
Spline Architecture
Spline Architecture