Architectural PaaS Cracks or Crack PaaS

Over the last couple of years two prominent open source PaaS solutions have come onto the market: Cloud Foundry and OpenShift. There’s been a lot of talk about these plays, and the talk has slowly but steadily turned into traction. Large enterprises are picking these up and giving their developers and operations staff a real chance to make changes, sometimes disruptive in a very good way.

However, with all the grandeur I’m going to hit on the negatives. These are the missing parts, the serious pain points beyond just some little deployment nuisance. Then a last note on why, even amidst the pain points, you still need to make real movement with PaaS tooling and technologies.

Negative: The Data Story is Lacking

Both Cloud Foundry and OpenShift have a way to plug into databases easily.

Cloud Foundry lets you build a Cloud Foundry Service that binds SQL Server, MySQL, PostgreSQL, Redis or whatever data storage service you need to your application. For more details on building a service, check out the echo example in the vcap sample GitHub project.

OpenShift has what are called Cartridges which provide the ability to add databases and other services into the system. For more information about the cartridges check out Red Hat’s OpenShift Documentation and also the forums.

Cloud Foundry and OpenShift however have distinctive weak spots when it comes to services that go beyond a mere single instance database. In the case of a true distributed database such as Cassandra, HBase or Riak, it is inordinately difficult to integrate a system that any PaaS inter-operates with well. In some cases it’s irrelevant to even try.

The key problem is that both of the PaaS systems assume the mantle of master while subjugating the distributed database to a lower tier of coordination. The way to resolve this at the moment is to do an autonomous installation of Riak, Cassandra, Neo4j or another database that may be distributed, hot swappable, or otherwise spread across multiple machines or instances, and then create a bound connection between it and the hosted PaaS application. This is the big negative in PaaS systems and tooling right now: the data story just doesn’t expand well to the latest in data and database technologies. I’ll elaborate more about this below.
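
As a rough sketch of that “install it on its own, then bind it” approach: assuming a newer Cloud Foundry CLI where `cf create-user-provided-service` and `cf bind-service` are available, the hand-off could look something like the following. The host names, port, service name and app name are all made up for illustration, and the script only prints the commands rather than running them.

```shell
#!/usr/bin/env bash
# Dry-run sketch: hand an externally managed Riak cluster to a PaaS app.
# All names below (hosts, port, service and app names) are hypothetical.

# Build a small JSON credentials blob from a comma-separated host list and a port.
riak_credentials() {
  local hosts="$1" port="$2"
  printf '{"hosts":"%s","port":"%s"}' "$hosts" "$port"
}

# Print (not execute) the cf commands that would register and bind the cluster.
print_bind_commands() {
  local app="$1" service="$2" creds="$3"
  echo "cf create-user-provided-service $service -p '$creds'"
  echo "cf bind-service $app $service"
}

creds="$(riak_credentials "riak1.example.com,riak2.example.com,riak3.example.com" 8087)"
print_bind_commands my-app riak-external "$creds"
```

The point of the dry run is that the cluster itself stays outside the PaaS; the platform only ever learns where to find it.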

Negative: Deployment is Sometimes Easy, Maintenance is Sometimes Hard

Cloud Foundry is extremely rough to deploy, unless you use Bosh to deploy to either VMware virtualized instances or AWS. Now, if resources were available, you could get Bosh to deploy your Cloud Foundry environment anywhere you wanted. However, that’s not easy to do. Bosh is still a bit of a black box. I, along with others in the community, am working to document Bosh, but it is slow going.

OpenShift is dramatically easier to deploy, but is missing a few key pieces once deployed that draw some additional operational overhead. One of those is that OpenShift requires more networking management to handle routing between various parts of the PaaS Ecosystem.

Overall, this boils down to what you need between the two PaaS tool chains. If you want automatic routing and management between nodes, Cloud Foundry is a viable route, but if your team wants to manage the networking tier more autonomously from the PaaS environment then maybe OpenShift is the way to go. In the end, it’s bumpy territory to determine which you may or may not want based on that.

Negative: Full Spectrum Polyglot, Missing Some

Cloud Foundry has a wider selection of languages and frameworks with community involvement around those with groups like Iron Foundry. OpenShift I’m sure will be getting to parity in the coming months. I have no doubt between both of these PaaS Ecosystems that they’ll expand to new languages and frameworks over time. Being polyglot after all is a no brainer these days!

Why PaaS Is, IMHO, Still Vitally Important

First toss out the idea that huge, web scale, Facebooks and Googles need to be built. Think about what the majority of developers out there in the world work on. Tons and tons and tons of legacy or greenfield enterprise applications. Sometimes the developer is lucky enough to work on a full vertical mix of things for a small business, but generally, the standard developer in the world is working on an enterprise app.

PaaS tooling takes the vast majority of that enterprise app maintenance from an operational side and tosses it out. Instead of managing a bunch of servers with a bunch of different apps the operations team manages an ecosystem that has a bunch of apps. This, for the enterprises that have enough foresight and have managed their IT assets well enough to be able to implement and use PaaS tooling, is HUGE!

For companies working to stay relevant in the enterprise, for companies looking to make inroads into the enterprise, and especially for enterprises that are looking to maintain, grow, or just keep ahead of the curve – PaaS tooling is a must have.

Just ask a dev: do they want to spend a few hours configuring and testing a server? Or do they want to deploy their application and focus on building more value into that application?

…having spent a few years being the developer, I’ll bet on the side of adding value.

What’s Next?

So what’s next? Two major things in my opinion.

1. Fill the data gap. Most of the PaaS tooling needs to bridge the gap with the data story. I’m doing my part with testing, development and efforts to get real options built into these environments, but this often leads back to the data story of PaaS being weak. What’s the solution here? I’m in ongoing talks and planning sessions, and we’ll eventually get a solid solution around the data side.

2. Fix deployments & deployment management. Bosh isn’t straightforward or obvious in what it does, and Cloud Foundry, with its many dependencies, is easily the hardest thing to deploy. OpenShift is easier to deploy, but neither of them actually has a solid management story over time. Bosh does some impressive updates of Cloud Foundry, and OpenShift has some upgrade methods, but over time and during day to day operations there still haven’t been any clear cut wins with viewing, monitoring and managing nodes and data within these environments.

OSCON : Day 1, Windows Just Doesn’t Do Cloud Foundry… but, there’s a fix for that…

The day before yesterday was day one, for me, of OSCON. I’d been out of town on business on Monday, so I skipped out on the intro day. However the second day, my first, was a good time. There was already a good dose of the “oh dear, I can’t attend ALL of the sessions I want to – BLASTED CONCURRENCY ISSUES!” problem. I was pondering the Intro to Erlang, then the backbone.js session, but in the end settled on Dr Nic’s @drnic session on how to deploy Cloud Foundry with BOSH.

Windows Just Doesn’t Do It

The first issue we ran into was actually the issue of prerequisites. About 30% of the audience was running Windows. To clarify the Windows question, there is no PaaS Solution that meets the following requirements:

– All Services Running on Windows
– Open Source Software
– Free of Cost

For those of you running Windows, the closest thing you can get – and I might add it’s a damn good solution – is Iron Foundry. But you’ll have to accept that there will still be some Linux involved for the Cloud Foundry parts that don’t run on Windows.

OSCON Ongoing

After the session I footed it over to the booths where a food & beer crawl of sorts was occurring, which I think might have been the first of two booth crawls. This was a good time, as the booth crawls usually are. It’s also fun seeing and learning about all the companies that are participating. Since everybody involved is ideally open sourced 100%, and most are at least a large percent open sourced, I always like hearing about the business models that are being used around the various products and services.

With that, this is day 1 coverage, I’ll leave you with a few photos of my first day:

The Chalk Art Wall o’ Companies & Messages
ESRI hanging out below the Samsung Sign… or is that perception?
Riot Games just before the deluge!

…and with that, I’ll have a follow up post covering the remaining days. Cheers!

Back in the Bosh Bunker

In the last post on the topic of Bosh I put together a simple Cloud Foundry environment using the tools & repos of Stark & Wayne. Even though the bootstrap is a great way to get an environment up and started, it doesn’t explain a lot of things about Bosh. So let’s take a look at what we’re dealing with here.

Bosh – What is it?

Bosh handles deployment and upgrades of Cloud Foundry environments. However, it isn’t limited to just Cloud Foundry. It’s been used to launch Riak clusters and to set up Redis, Cassandra, CouchDB and other services that don’t fit neatly into the Cloud Foundry services design.

It is a very important tool in regards to keeping a Cloud Foundry environment up to date with the latest bits, security fixes, bugs and related elements. Bosh is broken down into several key components that work together to handle these deployment and maintenance tasks.

To put it another way, Bosh aims to give ops or devops the ability to throw together an entire stack to deploy. Bosh starts with stemcells, packages and jobs as the core concepts of how it works.

Bosh is used, within Cloud Foundry and prospectively for whatever anyone would want to use it for, to launch instances, change out the instances, and change networking values, IPs and other configuration information. Overall it kind of rolls a lot of other tooling (chef, puppet) together into one tool. How well it does this is up for debate, but I’m not arguing that here, just getting some definitions down.

The Pieces of BOSH

Stemcells

A stem cell, or stemcell, is something that is a bit hard to track down a definition for. I’m taking a stab at it with what I know a stem cell is, so if you have any corrections please comment below – I’ll be more than happy to add a correction or three. Overall I understand a stem cell to be a complete framework stack built on some sort of virtual image. It can be thought of as the recipe for building an operating system that will act as an active member of a Cloud Foundry environment. In some situations, such as with a distributed database like Riak, it becomes not so much a member of the Cloud Foundry environment itself but an active node available to a distributed database cluster. This can then be used as a distributed database that is managed by Bosh and accessible within the Cloud Foundry ecosystem.

Packages

A package is source with the appropriate scripts for building it into usable binaries. Think of this as a package in the Node.js NPM, Gems (Ruby/Rails), or NuGet (.NET) worlds. It’s something that Bosh will pull in and compile on demand.

There are a few key parts to a package, referred to as package specs. These are: name, dependencies and files. Of the specs, the name and files are really the only required parts. The dependencies are an optional list of other packages this package would depend on.
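
To make the three parts concrete, here’s a minimal sketch that writes a package spec as a small YAML file and checks that the two required keys are there. The package name, the dependency and the file path are hypothetical examples, not taken from any real release.

```shell
#!/usr/bin/env bash
# Sketch: write a minimal Bosh package spec and check its required keys.
# The package name, dependency and file path are hypothetical examples.

write_package_spec() {
  local dir="$1"
  mkdir -p "$dir"
  cat > "$dir/spec" <<'EOF'
---
name: example-db
dependencies:
- example-runtime
files:
- example-db/example-db-1.0.tar.gz
EOF
}

# Succeeds only if both required keys (name and files) are present.
spec_has_required_keys() {
  local spec="$1"
  grep -q '^name:' "$spec" && grep -q '^files:' "$spec"
}

workdir="$(mktemp -d)"
write_package_spec "$workdir/packages/example-db"
if spec_has_required_keys "$workdir/packages/example-db/spec"; then
  echo "spec looks complete"
else
  echo "spec is missing name or files"
fi
```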

Jobs

This is pretty self-descriptive. The jobs within Bosh spool up, start servers and services and other miscellaneous responsibilities as needed.
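
For a feel of what a job looks like on disk, here’s a small sketch that lays out a skeleton job directory the way I understand Bosh releases to be organized: a spec, a monit file and a templates folder. The job and package names are hypothetical, and the control script is only a placeholder.

```shell
#!/usr/bin/env bash
# Sketch: lay out a skeleton Bosh job directory inside a release.
# Job name, package name and the ctl script contents are hypothetical.

make_job_skeleton() {
  local release_dir="$1" job="$2" package="$3"
  local job_dir="$release_dir/jobs/$job"
  mkdir -p "$job_dir/templates"

  # The spec names the job, its templates and the packages it needs.
  cat > "$job_dir/spec" <<EOF
---
name: $job
templates:
  ctl.erb: bin/ctl
packages:
- $package
EOF

  # The monit file describes how the process is watched; left empty here.
  : > "$job_dir/monit"

  # Placeholder control script that would start and stop the service.
  printf '#!/bin/bash\n# start/stop logic for %s would go here\n' "$job" \
    > "$job_dir/templates/ctl.erb"
}

release="$(mktemp -d)"
make_job_skeleton "$release" example-db-node example-db
find "$release/jobs" -type f | sort
```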

Relevant Sites, Documentation & Key Content
  • The Cloud Foundry Bosh Repo => This is the actual code repository on Github. If you’re in need of really diving into what it does, there’s always the possibility of reading the code!

  • Cloud Foundry Documentation => This has links to documentation related to Bosh that is pivotal (no pun intended).

  • Bosh Documentation => This is the Bosh documentation. It’s probably a good idea to start on the “Running Cloud Foundry” part of the documentation. This documentation can use your help (it’s super sparse at the moment), so if you get going and using Bosh, please contribute with examples and other material.

  • Stark & Wayne Repositories => I already mentioned them, but they’re likely some of the best material out there.

  • Bosh DB => This is a site & repository that Brian McClain @brianmmcclain put together to keep track of bosh stem cells and other repositories related to launching certain tools, services, servers and other things in Cloud Foundry environments via Bosh.

  • Dr Nic’s intro to Bosh => This page serves as an intro and description of what’s going on in Bosh. I read this a while back for my own kick off with the Bosh tool.

Summary

This is what I’ve found and put together as a good starting point. I still think there’s a bit of confusion around what Bosh is, how it works and how to get started with it, and it still isn’t clearly defined on the web. Documentation is getting better, but still needs a lot of work (remember, you too can contribute). For systems outside of Cloud Foundry it is also a bit difficult and sometimes sketchy to use Bosh as the primary means of deployment, maintenance and upgrading. But just like the documentation, that is also getting better. I’ll have more coming in the near future regarding what Bosh is, how it works, and things you can do with it – until then check out Dr Nic’s material for the most up to date how-to and related documentation and videos. He’s done some great work with the tooling and continues to knock it out of the park.

Keep reading and I’ll have more definitions, outlines of what is what, and the entire inception that Bosh is.

Using Bosh to Bootstrap Cloud Foundry via Stark & Wayne Consulting

I finally sat down and really started to take a stab at Cloud Foundry Bosh. Here’s the quick lowdown on installing the necessary bits and getting an initial environment built. Big thanks out to Dr Nic @drnic, Luke Bakken & Brian McClain @brianmmcclain for initial pointers to where the good content is. With their guidance and help I’ve put together this how-to. Enjoy… boshing.

Prerequisites

Step: Get an instance/machine up and running.

To make sure I had a totally clean starting point I started out with an AWS EC2 Instance to work from. I chose a micro instance loaded with Ubuntu. You can use your local workstation if you want to or whatever, it really doesn’t matter. The one catch, of course is you’ll have to have a supported *nix based operating system.

Step: Get things updated for Ubuntu.

[sourcecode language=”bash”]
sudo apt-get update
[/sourcecode]

Step: Get cURL to make life easy.

[sourcecode language=”bash”]
sudo apt-get install curl
[/sourcecode]

Step: Get Ruby, in a proper way.

[sourcecode language=”bash”]
\curl -L https://get.rvm.io | bash -s stable
source ~/.rvm/scripts/rvm
rvm autolibs enable
rvm requirements
[/sourcecode]

Enabling autolibs sets things up so that rvm will install all the requirements with the ‘rvm requirements’ command. It used to just show you what you needed, then you’d have to go through and install them. This requirements phase includes some specifics, such as git, gcc, sqlite, and other tools needed to build, execute and work with Ruby via rvm. Really helpful things overall, which will come in handy later when using this instance for whatever purposes.

Finish up the Ruby install and set it as our default ruby to use.

[sourcecode language=”bash”]
rvm install 1.9.3
rvm use 1.9.3 --default
rvm rubygems current
[/sourcecode]

Step: Get bosh-bootstrap.

bosh-bootstrap is the easiest way to get started with a sample bosh deployment. For more information check out Dr Nic’s Stark and Wayne repo on Github. (also check out the Cloud Foundry Bosh repo.)

[sourcecode language=”bash”]
gem install bosh-bootstrap
gem update --system
[/sourcecode]

Git was installed a little earlier in the process, so now set the default user name and email so that when we use bosh it will know what to use for cloning repositories it uses.

[sourcecode language=”bash”]
git config --global user.name "Adron Hall"
git config --global user.email plzdont@spamme.bro
[/sourcecode]
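
Before kicking off the deploy, a quick sanity check that the pieces from the steps above actually landed on the PATH can save some head scratching. This is just a sketch; the list of tool names is my assumption of what the earlier steps install, so adjust it to taste.

```shell
#!/usr/bin/env bash
# Sketch: report which of the expected prerequisite tools are missing.

# Print the name of every listed tool that is not on the PATH.
missing_tools() {
  local tool
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || echo "$tool"
  done
}

missing="$(missing_tools curl git ruby gem bosh-bootstrap)"
if [ -z "$missing" ]; then
  echo "all prerequisites found"
else
  # Unquoted on purpose so the names collapse onto one line.
  echo "missing:" $missing
fi
```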

Step: Launch a bosh deploy with the bootstrap.

[sourcecode language=”bash”]
bosh-bootstrap deploy
[/sourcecode]

You’ll receive a prompt, and here’s what to hit to get a good first deploy.

Stage 1: I select AWS, simply as I’ve no OpenStack environment. One day maybe I can try out the other option. Until then I went with the tried and true AWS. Here you’ll need to enter your access & secret key from the AWS security settings for your AWS account.

For the region, I selected #7, which is west 2. That translates to the data center in Oregon. Why did I select Oregon? Because I live in Portland and that data center is about 50 miles away. Otherwise it doesn’t matter which region you select, any region can spool up almost any type of bosh environment.

Stage 2: In this stage, select default by hitting enter. This will choose the default bosh settings. The default uses a medium instance to spool up a good default Cloud Foundry environment. It also sets up a security group specifically for Cloud Foundry.

Stage 3: At this point you’ll be prompted to select what to do, choose to create an inception virtual machine. After a while, sometimes a few minutes, sometimes an hour or two – depending on internal and external connections – you should receive the “Stage 6: Setup bosh” results.

Stage 6: Setup bosh

setup bosh user
uploading /tmp/remote_script_setup_bosh_user to Inception VM
Initially targeting micro-bosh…
Target set to `microbosh-aws-us-west-2'
Creating initial user adron…
Logged in as `admin'
User `adron' has been created
Login as adron…
Logged in as `adron'
Successfully setup bosh user
cleanup permissions
uploading /tmp/remote_script_cleanup_permissions to Inception VM
Successfully cleanup permissions
Locally targeting and login to new BOSH…
bosh -u adron -p cheesewhiz target 54.214.0.15
Target set to `microbosh-aws-us-west-2'
bosh login adron cheesewhiz
Logged in as `adron'
Confirming: You are now targeting and logged in to your BOSH

ubuntu@ip-yz-xyz-xx-yy:~$

If you look in your AWS Console you should also see a box with a key pair named “inception” and one that is under the “microbosh-aws-us-west-2” name. The inception instance is an m1.small while the microbosh instance is an m1.medium.

That should get you going with bosh. In my next entry around bosh I’ll dive into some of Dr Nic & Brian McClain’s work before diving into what exactly Bosh actually is. As one may expect, from Stark & Wayne we can expect some pretty cool stuff, so keep an eye over there on Stark & Wayne.

Deploycon, PaaS & the pending data tier gravity fallout…

For a quick recap of last year’s Deploycon & related talks, check out my “Day #3 => DeployCon && Enterprise && Data Gravity” entry from last year.

PaaS Systems aren’t always effectively distributed. Heroku has fallen over every time east-1 has gone down at AWS. Not that I’m saying they’ve done badly, just pointing that out. With Cloud Foundry, there are several key SPOFs (Single Points of Failure), and with all PaaS Systems the data tier is often the neglected pairing of the system. I’ve been wanting to write about this for a few months now and Deploycon has lit a fire for me to do just that.

Deploycon – “Platform Services and Developer Expectations” **

I’m on a panel at Deploycon titled “Platform Services and Developer Expectations” and this leads right back around to that. This SPOF issue is concerning to me as PaaS Providers talk up the offerings more and more with little light actually shone on this issue. In some ways each is moving away from their respective SPOFs, but overall they’re all pretty prevalent throughout. For security, each has a non-distributed database, which technically still needs to be backed up, with no clear replication or other mechanisms set up to ensure data integrity in a failure situation. Of course, the huge saving grace with a PaaS is that if the overall system goes down or a SPOF blows up, all the existing deployed applications will generally continue to run. Unless of course the routing and networking are also SPOFs. This is the largest glaring concern with PaaS Systems that I see today.

One of the other things about PaaS that has always led to a ton of questions is “what about my PostgreSQL/MySQL/Riak/MongoDB/database thing and how do I do X, Y, Z with it to ensure scalability in my PaaS.” In almost every case it ends with a simple and unfortunate answer, “…when it comes to data, a PaaS doesn’t really do a damn thing for ya…” This is obviously not very helpful. The entire reason to put a PaaS into place is to simplify life, so the sad fact that it barely does a thing for the data tier is a real letdown.

Now, hold on a second before you start screaming at me about “but a PaaS does X, Y and Z and isn’t even supposed to touch that aspect of things…” let me elaborate a bit more. The panel at Deploycon states “…Developer Expectations” and when things are getting simplified in the way a PaaS does, developers assume that if it does all this fancy magic for an application it ought to simplify the data side of things too! Right? Well no, and it isn’t going to for the foreseeable future. But no matter what, it doesn’t change the fact that developers often have that expectation.

Now, I could write at length about all the reasons that PaaS doesn’t really do anything for the data tier. I could wax poetic about how a distributed database (re: Riak, Cassandra, etc) just doesn’t lend itself to a cookie cutter approach to deployment under a PaaS, or how an RDBMS has umpteen different configurations for stability, scaling, hot swappable services, and other such complexities around the data tier. But instead I’m going to skip all that, maybe cover some of those things another day, and jump right into some of the things that are actually moving forward to fill this gap.

BOSH, Cloud Foundry, OpenShift & fixing the data tier…

The most obvious reason there isn’t a simple turn key solution to the data side of things with a PaaS ecosystem is that data is complex and extremely diverse. There are distributed key/value stores (Riak, Cassandra), sort of kind of distributed databases (Mongo), graph databases (Neo4j), the age old RDBMS (DB2, SQL Server, Oracle’s stuff, etc) and the million solutions around that, and in-memory key/value styled databases that are insanely fast, like Redis. Expanding just slightly you have software that works around these systems such as Hadoop & Riak CS, & the list goes on. All of it is focused on the data tier and maintaining one, two or some form of the three points around CAP Theorem (http://en.wikipedia.org/wiki/CAP_theorem), atomicity and other key capabilities.

All of the PaaS Systems, public and private, often have some sort of plug-in style architecture for data. Whether it is Apprenda, which is closed source and closed to the community, or an ongoing open-to-community PaaS like OpenShift or Cloud Foundry, things still fall almost entirely to the developers or database team to build an architecture around the data. When looking at solutions to simplify data in PaaS Systems, with the closed source solutions we have no idea what they’re up to in this regard. With the ones that are open source or in large part public and involved in the community, like EngineYard, Heroku, Cloudbees and others, we can really see the directions and efforts around creating real PaaS style solutions to the data tier problem.

BOSH, Vagrant, etc…  One of the best solutions I’ve seen so far is the ability of Bosh, which was created by the Cloud Foundry team while at VMware, to spool up an environment that includes such things as a Riak Cluster (or other cluster). Currently Brian McClain & Dr Nic have worked to put together such Bosh + Vagrant scripts & get things rolling. I myself will be spending some considerable time on just that. But beyond that this is a good start in enabling data tier back ends.

How do we close the gap between absurdly simple application deployment and still arduous and difficult data tier deployment? For the next several years I think we’ll have cumbersome deployment practices around the data tier. There won’t be anything as elegantly simple as Cloud Foundry’s single line deployment or AppFog’s one click deployment of a web application. The best we can do at this time is to streamline around pieces and architectures, and at least get them into a kind of simple 3 step deployment.

Please drop a comment or two on how you think we might simplify the data side of the PaaS toolchain. Also drop a few tweets in the twitterverse too, I’m sure that’ll be exploding as usual. I’m @adron, ping me.

Cheers, happy data architecting.

** The Deploycon panel will be at 4:30pm in Santa Clara on April 2nd. Come check it out.