So now that I’ve provided the links, here’s a quick intro to each of the application sections, what this application is for, where the workflow for contributions will be and what the next steps are. Trust me, I roll easy. I’ll be working as hard as I can to make pull requests easy peasy, keep the issues down to workable contributions and generally make this a good OSS project.
The juncture application should be split into several key components, or application divisions of functionality. I’ve broken each out with a basic description. If you just want to watch a video where I outline each division, play the video below for a quick 5 minute intro to the application and the idea behind it all.
A quick run through of the first sample UI.
Call the Doctor! (Administration & Maintenance)
This part of the application would provide an interface for all the general administration and maintenance needs around individual nodes and around the overall cluster of nodes. That includes the ability to add, remove and generally administer everything that is available via the riak-admin command line interface.
Time Travel That Data (Performance Benchmarking)
This section of the application will provide the ability to benchmark the timing of data in and out of a cluster. In addition it should show standard benchmarking similar to that which is offered with the basho_bench project.
Love of the Data (Reporting)
This division of the application would be focused on reporting. I’m not sure what exactly that would entail, but something with charts, graphs and pulling together trending points of some sort. If you have ideas and want to work on this part of the application, weigh in!
Golfing With Your Data (Query, Put, Deletes, Etc. Handling the CRUD)
The application will have an interface to provide access to add and remove data, as well as viewing the data that is available within a cluster. The primary means for implementing this part of the application will be with the CorrugatedIron Project. It’s a library available via Nuget that @peschkaj and @TheColonial have put together.
News! News! News! (News… RSS Feed Reader)
The idea is that this will provide a quick and easy way to get familiar with Windows 8 dev and the project overall. I’m aiming to eat the Basho blog feed and provide it as key highlights for the application with future abilities around mining other RSS feeds or such and having those fed into a ?? Riak Cluster? Again, everything is open to change, addition or removal! So jump into the project and let me know your thoughts.
Whew, it’s been a total blast working at Basho. I’ve accomplished a ton of things. Riak is a solid distributed database system and I’m glad to have worked with the team on advocating its use, teaching distributed systems ideas and concepts and generally spreading the knowledge. I’ve seen some truly great things that people are hacking together, setting up for projects and redesigning old systems to utilize newer, better, faster and more capable distributed systems concepts and ideas. Here are some of the things I’m happy to have contributed to in my time at Basho.
I helped negotiate and get an effort started that came to fruition with Tier 3 releasing a Riak CS backed object store for their customers. A very cool feature added to their already formidable enterprise cloud offerings. Read more about that here “Tier 3 Object Storage” and here “Tier 3 Launches Global Object Storage“. Their implementation is really nice, including many geographic regions of accessibility, S3 API compatibility and high end storage capabilities that offer a bigger punch of performance than your average object storage in the cloud!
I got both the Seattle Riak (up to 52 members now!) and Portland Riak (up to 74 members now!!) groups started. You should join; they’re a good time, with good conversation and great information.
I partnered with Troy Howard @thoward37 to run the second year of Node PDX. Basho was excellent enough to contribute not just a few bucks but also sent Chris Meiklejohn @cmeik out to speak at the conference.
I got to work directly with a number of people at Windows Azure, AWS and EngineYard in deploying Riak, testing out how the respective images (Azure VM Depot & AWS AMI) and deployments (Riak EDS) would work. In the end, this has been a great opportunity to learn more about the latest and greatest of each of these services. I’ve been impressed as they’ve each been doing a seriously kick ass job lately!
…and there has been a whole lot more. Suffice it to say, Basho has provided me with some sweet opportunities to work on some extremely interesting data projects from a very data sciency point of view (yeah I know sciency aint a word). There may be more Riak work and Riak meetups and Riak hacks and Riak who knows what coming from me, but the meetups & such are now at the hands of the core Riak crew and…
Where Am I Headed?
Right now, I’m moving 20 blocks away from where I currently live, setting up a couch to hack on and grabbing a beer. I’ve got a few personal projects I’ve been wanting to work on. Then I’m taking a few weeks to do some side projects that have been on the burner. Keep an eye out, I’ll be kicking off one, maybe two of these open source projects in the next few days. As @tsantero tweeted…
i wish i had the time to work on even 5% of the ideas in my notebook :
…I’m going to attack my own notebook of ideas. Maybe I’ll even work on that Riak CS Video object store that Tom and I spoke about 10 months ago? Either way, whatever the projects are, I’ll have them posted right here. Until then…
I wrote about my first day of OSCON “OSCON : Day 1, Windows Just Doesn’t Do Cloud Foundry… but, there’s a fix for that…“. The rest of the week was most excellent. I caught up with friends and past coworkers. I heard about people working on some amazing new projects. Some things I will try to write up in the coming days, as I’m sure some of it will be making the tech news (if not the regular people news too).
Had some great conversations about the direction of enterprise and PaaS uptake. It’s great to hear that there is some movement in that space finally. As one would expect however, there is still a lot of distance for the enterprise to catch up on, but they’ll get there, or fall apart in the meantime.
There was also tons of conversation about the Indiegogo Ubuntu Edge mobile device. This is a great looking device and it sounds like a solid idea. The questions arise from the fact that they’re working to make this a purely crowd funded project. This wouldn’t be a concern if they were trying to just get a few million in capital, but they’re aiming for $32 million! Overall though, with 128 GB, dual LTE antennas for Europe and the US, a top tier screen in quality and design, a metal body and multiple other features, this phone would be ahead of anything out there. I hope it’s successful, but I must admit my own hesitance. What’s your take on the device?
Over the course of the conference I talked to and worked with a number of other individuals playing around with Cloud Foundry and also OpenShift. The primary aspect that we worked on was strategies around deployment of these PaaS Technologies.
We also worked with Iron Foundry to extend Cloud Foundry to support .NET. Whether you love .NET or hate .NET, or land anywhere in that spectrum, it still has an absolutely huge user base. That is primarily because .NET spent the last decade and a few years going head to head against Java in the enterprise, and we all know the enterprise is slow to shift anything. So for now and the foreseeable future .NET is an extremely large part of the development world. Having it work in your PaaS is fundamental to gaining significant enterprise share. Cloud Foundry is the only open source, internally usable PaaS on the market today. There are closed source options available, but those obviously don’t come up at OSCON.
While at OSCON, I also got to discuss architecture and deployment of Riak with a number of people. The usage of Riak continues to grow and the environments, use cases and tooling that people are using Riak with and for is always an interesting space for me. I also got to discuss deployment of Cassandra and even some Neo4j, Redis and Riak side by side deployments. People have used an interesting mix of NoSQL solutions out there to pull their respective data together for their needs.
Among all these deployments, conversations regularly returned to a known topic of mine: cloud computing and who is capable of what, where and when. AWS is still an easy leader in cloud computing, not just in customers but in technology. This also brought up the concerns and apathy that some have around OpenStack (hat tip to Ben Kepes for the write up) working more homogeneously with AWS. Whatever the case might be, the path for OpenStack needs to be clarified regularly. I imagine the next movement is going to be away from being too concerned with infrastructure and toward increased concern with portability of applications and development of applications.
Another growing topic of discussion was around building applications for, on and with Windows Azure. Microsoft has actually become dramatically more involved in open source in a more honest way, with more integrity. I’m honestly amazed at how far they’ve come from the declaration years ago that “open source is a cancer” and the all too famous, “linux is communism“. Whatever that was supposed to mean, they didn’t seem to get it back then. Now however, they regularly contribute to open source projects, not just on CodePlex but also on GitHub and other places. Microsoft even contributed to the Linux kernel a few months ago.
That leads me to the next topic that came up a number of times…
There’s been a lot of discussion about architecture around PaaS, containers (more on that in a moment), distributed systems in general and distributed databases. As I wrote about recently, “Architectural PaaS Cracks or Crack PaaS” the world of distributed systems and distributed databases has more than a few issues when working together in a PaaS environment. This brought up the discussion about what solutions exist today, solutions I look forward to writing and building in the coming months.
The most immediate solution to scalable data sources is still to run your operational data sources such as Neo4j, Redis, Riak or other database autonomously but residing close to your PaaS System. The current public PaaS Providers do exactly this and in some cases extend that to offer the databases and data sources as services through add-ons. These are currently great solutions, but require time, effort and custom development work when setting up internally.
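To make the add-on idea a bit more concrete, here’s a minimal sketch of how an application might look up a database that the platform has bound to it. Cloud Foundry style platforms hand bound services to the app as JSON in the VCAP_SERVICES environment variable. The service label, the instance name and the credential keys below are assumptions for illustration, since each provider names those differently.

```python
import json
import os

# Hypothetical sample of what the platform might inject for one bound service.
# The label ("redis"), the name and the credential keys are all assumptions.
SAMPLE_VCAP_SERVICES = json.dumps({
    "redis": [
        {
            "name": "session-cache",
            "credentials": {"hostname": "10.0.0.12", "port": 6379},
        }
    ]
})


def find_service(name, environ=os.environ):
    """Return the credentials dict for the bound service called `name`."""
    raw = environ.get("VCAP_SERVICES", "{}")
    # VCAP_SERVICES maps a service label to a list of bound instances.
    for instances in json.loads(raw).values():
        for instance in instances:
            if instance.get("name") == name:
                return instance.get("credentials", {})
    raise LookupError(f"no bound service named {name!r}")


# Pass the sample in explicitly so the sketch runs outside of a PaaS.
creds = find_service("session-cache", {"VCAP_SERVICES": SAMPLE_VCAP_SERVICES})
print(creds["hostname"], creds["port"])
```

The application never hard codes where the database lives, which is what lets the data source run on its own machines next to the PaaS.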
This leads me to the last topic…
The Story of a Container – Docker
Well, not just Docker, but containers in general and Docker specifically. First some context about what a container is.
Container – In this particular context I’m writing about a container, or more specifically a runtime-container, that isolates resources for applications or services. Containers are common in PaaS technologies to help isolate the specific services or applications when they’re on a single physical machine or instance. Of the PaaS systems that came up at OSCON, dotCloud comes from the same team that created Docker, Cloud Foundry has Warden and OpenShift has gears and Red Hat Enterprise Linux specific containers.
I’ve studied Warden a little in the past while I was working with AppFog and Tier 3 around Cloud Foundry. Warden is a great piece of technology. However the star at OSCON was clearly Docker. I jumped into a number of conversations around Docker. These conversations would then turn to containers becoming the key to PaaS tooling and systems growing and increasing their capabilities. That leads me back to my previous blog entry “Architectural PaaS Cracks or Crack PaaS” and one of the key solutions to the data tier issue.
Containers, A Solution for Scaling the Data Tier
One of the issues that comes up when trying to scale any distributed database in a PaaS Environment is how to provide multi-tenancy without spooling up new instances for each and every single installation of a node within that distributed database. Here’s an example diagram of the requirements behind a scalable distributed database.
In a default configuration you’d want each node to be running on a physical machine or dedicated virtual instance. This is for performance reasons as well as reasons for load balancing, security, data integrity and a host of others. This is the natural beginning state of a highly available distributed database or distributed system.
Trying to deploy something like this into a PaaS environment is tricky. Take into account that there is no such thing in application or service speak as an instance, and especially not anything such as a physical server. The real division between processes and resources is the container. Containers are what actually need to run the distributed system node. This becomes possible if a distributed system node can be deployed to and executed from within a container.
After reviewing Docker, the capabilities around it and the requirements of a distributed database, it looks like an ideal marriage of the two technologies. Already Docker has Redis and other database technologies running on it. The Container technology around Docker looks like an ideal fit to extend distributed systems to run autonomously of a single physical machine or single instance per node. This would enable nodes to be deployed as resources are available to provide a more seamless and PaaS style deployment for systems like Cassandra, Riak and related distributed systems. Could this be the next evolution of affordable distributed systems, containers to the rescue?
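As a rough illustration of deploying nodes as resources are available, here’s a small sketch of a placement routine. It is not how Docker or any PaaS actually schedules containers. The Host class, the memory numbers and the node names are all hypothetical, and the rule of spreading nodes across hosts before doubling up is my own assumption, based on wanting each node on separate hardware where possible.

```python
from dataclasses import dataclass, field


@dataclass
class Host:
    name: str
    free_mb: int
    containers: list = field(default_factory=list)


def place_nodes(hosts, node_names, node_mb):
    """Place each database node in a container on a host that has room.

    Hosts running the fewest nodes are preferred, so nodes spread out
    across machines before any host carries a second one.
    """
    placement = {}
    for node in node_names:
        candidates = [h for h in hosts if h.free_mb >= node_mb]
        if not candidates:
            raise RuntimeError(f"no capacity left for {node}")
        # min() keeps the earliest host when several are equally loaded.
        target = min(candidates, key=lambda h: len(h.containers))
        target.containers.append(node)
        target.free_mb -= node_mb
        placement[node] = target.name
    return placement


hosts = [Host("host-a", 4096), Host("host-b", 2048), Host("host-c", 8192)]
placement = place_nodes(hosts, ["riak1", "riak2", "riak3", "riak4"], 2048)
print(placement)
```

With three hosts and four nodes, the first three nodes each land on their own host and the fourth only doubles up once every host already has one.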
I’ll be reporting back on my progress, this could be cool!
Two things have worked together that made me want to write up the new Riak 1.4 features. With Riak 1.4 hitting the streets and the work I’ve been doing with CorrugatedIron there are a few features that are going to add icing to the cake. If you want to dive more into the release, check out the release notes. If you’re interested in the .NET Client CorrugatedIron, check it out here or check out the code on github. Now on to the client APIs.
…the command attaches to the named pipe to communicate with the running erlang nodes. Now when you hit Ctrl-C it kills just the pipe versus killing the pipe and riak node that you’re on. This is something that has bit me in the keister more than a few times, bringing down a node or two while I was just trying to view what was going on with a node. This leads me to the next enhancement.
If you’re using riak_kv_bitcask_backend, riak_kv_eleveldb_backend or riak_kv_memory_backend, the riak-admin transfers command now shows per-transfer progress and displays long node names better, giving you a better idea of what is going where. The way this is reported depends slightly on the specific back end. For the bitcask or in memory back end the progress is calculated from the keys already transferred out of the total keys, whereas the LevelDB back end calculates it based on bytes transferred. Because of this the LevelDB calculation can get slightly off over time.
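To show the difference between the two ways of reporting progress, here’s a quick sketch. This is not Riak’s actual code. The function names are made up, and treating the byte total as an estimate is my assumption for why the byte based number can drift.

```python
def progress_by_keys(keys_sent, total_keys):
    """Percent complete when the back end can count keys (bitcask, memory)."""
    if total_keys == 0:
        return 100.0
    return 100.0 * keys_sent / total_keys


def progress_by_bytes(bytes_sent, estimated_total_bytes):
    """Percent complete from bytes sent against an estimated total size.

    The result is capped at 100 because an estimated total can be
    smaller than what actually ends up being transferred.
    """
    if estimated_total_bytes == 0:
        return 100.0
    return min(100.0, 100.0 * bytes_sent / estimated_total_bytes)


print(progress_by_keys(250, 1000))
print(progress_by_bytes(1_200_000, 1_000_000))
```

The key based number is exact as long as the key count is known up front, while the byte based number is only as good as the size estimate behind it.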
Protocol Buffers & Multiple Interface Binding
Protocol Buffers can now bind to multiple ports and interfaces, so clients such as CorrugatedIron for .NET (http://corrugatediron.org/) and Riakjs (http://riakjs.com/) can now connect over Protocol Buffers on more than the one configured interface. For more on Riak configuration around the binding, check out the Basho Docs (http://docs.basho.com/riak/latest/references/Configuration-Files/). This also brings interface binding to feature parity with the HTTP interfaces. The change replaces the pb_port and pb_ip settings with a single pb setting, which is now a list of IP and port pairs.
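Here’s a small sketch of the shape of that change, going from one IP and one port to a list of pairs. The helper names and the second address are hypothetical, and the rendered text only approximates how the setting looks as an Erlang term, so check the release notes and docs for the exact syntax before touching a real config.

```python
def merge_pb_settings(settings):
    """Turn old-style pb_ip/pb_port keys into a list of (ip, port) pairs."""
    listeners = list(settings.get("pb", []))
    if "pb_ip" in settings and "pb_port" in settings:
        pair = (settings["pb_ip"], settings["pb_port"])
        if pair not in listeners:
            listeners.append(pair)
    return listeners


def render_pb_term(listeners):
    """Render the pairs roughly the way they would look as an Erlang term."""
    pairs = ", ".join(f'{{"{ip}", {port}}}' for ip, port in listeners)
    return f"{{pb, [ {pairs} ]}}"


old = {"pb_ip": "127.0.0.1", "pb_port": 8087}
# The second listener is a made-up internal address for illustration.
listeners = merge_pb_settings(old) + [("10.0.0.5", 8087)]
print(render_pb_term(listeners))
```

This prints `{pb, [ {"127.0.0.1", 8087}, {"10.0.0.5", 8087} ]}`, one pair per interface the node should listen on.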
Clients can now specify a timeout value in milliseconds. This can be used for object manipulation (fetch, store and delete) and for listing buckets or keys. This takes care of some timeout issues that may have been occurring during certain types of requests. It will come in handy for asynchronous clients and will be pivotal for anyone going the synchronous route.
Bucket Properties for Protocol Buffers
If you need to reset a bucket to its defaults, this is now possible. Besides a reset to defaults, all bucket properties are now usable over protocol buffers. This can definitely help client usage of protocol buffers in a dramatic way.
List-buckets Streaming – Realtime
Listing keys or buckets via a streaming request will send bucket names to the client as they are received. This removes the need to wait for all nodes to respond before anything is returned. This can help with response time and timeouts from the client point of view. This gives the ability to use the streaming features with Node.js, C#, Java and other languages and frameworks that support realtime streaming data feeds.
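To show why streaming helps from the client side, here’s a sketch that simulates three nodes answering at different times. Nothing here talks to Riak. The reply times and bucket names are invented, and the two functions just contrast getting names as each node answers with waiting for the slowest node.

```python
def node_replies():
    """Hypothetical per-node replies: (seconds until reply, bucket names)."""
    return [
        (0.05, ["users", "sessions"]),
        (0.20, ["orders"]),
        (1.50, ["logs", "metrics"]),
    ]


def stream_buckets(replies):
    """Yield (elapsed, name) as each node answers, fastest node first."""
    for elapsed, names in sorted(replies):
        for name in names:
            yield elapsed, name


def list_buckets(replies):
    """Non-streaming: nothing is returned until the slowest node answers."""
    done_at = max(elapsed for elapsed, _ in replies)
    names = [name for _, node_names in sorted(replies) for name in node_names]
    return done_at, names


first_elapsed, first_name = next(stream_buckets(node_replies()))
done_at, all_names = list_buckets(node_replies())
print(f"streaming: first bucket {first_name!r} after {first_elapsed}s")
print(f"buffered: {len(all_names)} buckets after {done_at}s")
```

In this made-up run the streaming client sees its first bucket name after 0.05 seconds, while the buffered call returns nothing until the slowest node answers at 1.5 seconds.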
…these are the features that have jumped out at me, so until next release.
The Rails 2013 Conference kicked off for me with a short bike ride through town to the conference center. The Portland conference center is one of the most connected conference centers I’ve seen; light rail, streetcar, bus, bicycle boulevards, trails & of course pedestrian access are all available. I personally have no idea if you can drive to it, but I hear there is parking & such for drivers.
Rails Conf however clearly places itself in the category of a conference of people that give a shit! This is evident in so many things among the community, from the inclusive nature creating one of the most diverse groups of developers to the fact they handed out 7 day transit passes upon picking up your Rails Conf Pass!
The keynote was by DHH (obviously right?). He laid out where the Rails stack is, some roadmap topics & drew out how much the community had grown. Overall, Rails is now in a state of maintaining and growing the ideal. Considering its inclusive nature I hope to see it continue to grow and to increase options out there for people getting into software development.
I also met a number of people while at the conference. One person I ran into again was Travis, who lives out yonder in Jacksonville, Florida and works with Hashrocket. Travis & I, besides the pure metal, have Jacksonville as common stomping ground. Last year I’d met him while the Hashrocket crew were in town. We discussed Portland, where to go and how to get there, plus what Hashrocket has been up to with Mongo and other databases and how Ruby on Rails was treating them. The conclusion: all good on the dev front!
One of these days though, the Hashrocket crew is just gonna have to move to Portland. Sorry Jacksonville, we’ll visit one day. 😉
For the latter half of the conference I actually dove out and headed down for some client discussions in the country of Southern California. Nathan Aschbacher headed up Basho attendance at the conference from this point on. Which reminds me, I’ve gotta get a sitrep from Nathan…
RICON East (May 13th & 14th)
Ok, so I didn’t actually attend RICON East (sad face). I had far too many things to handle over here in Portlandia, but I watched over a third of the talks via the 1080p live stream. The basic idea of the RICON conferences is a conference series focused on distributed systems. Riak is of course a distributed database, falling into that category, but RICON is by no means merely about Riak. At RICON the talks range from competing products to heavy hitting academic talks about how, where and why distributed systems are the future of computing. They may touch on things you may be familiar with, such as:
PaaS (Platform as a Service)
Existing databases and how they may fit into the fabric of distributed systems (such as Postgresql)
How to scale distributed across AWS Cloud Services, Azure or other cloud providers
As the videos are posted online I’ll be providing some blog entries around the talks. It will however be extremely difficult to choose the first to review; just as at RICON back in October of 2012, every single talk was far above the median!
Two talks immediately stand out. The first was Christopher Meiklejohn’s (@cmeik) talk, doing a bit o’ proofs in realtime, off the cuff and all. It was merely a 5 minute lightning talk, but holy shit this guy can roll through and hand off intelligence via a talk so fast it blew my mind!
The other talk was Kyle’s, AKA @aphyr, who went through network partitions with databases, basically destroying any comfort you might have with your database being effective at getting reads in a partition event. Kyle knows his stuff, that is without doubt.
There are many others, so subscribe, keep reading and I’ll be posting them in the coming weeks.
Node PDX 2013 (May 16th & 17th)
Holy moley we did it, again! Thanks to EVERYBODY out there in the community for helping us pull together another kick ass Node PDX event! That’s two years in a row now! My cohorts Troy Howard @thoward37 and Luc Perkins @lucperkins hustled like some crazed worker bees to get everything together and ready. As always, a lot comes together at the last minute and we don’t get a wink of sleep until it’s all done and everybody has had a good time!
Polyglot Conference was held in Vancouver again this year, with clear intent to expand to Portland and Seattle in the coming year or two. I’m super stoked about this and will definitely be looking to help out – if you’re interested in helping let me know and I’ll get you in contact with the entire crew that’s been handling things so far!
The biggest problem with this conference is that it’s technically only one day. I hope that we can extend it to two days for next year, and hopefully even have the Seattle and Portland branches go with an extended two day itinerary.
This year the break out sessions that I attended included “Dev Tools”, “How to Be a Better Programmer”, “Go (Language) Noises” and other great sessions, and I threw down a session of my own on “Distributed Systems”. Overall, great time and great sessions! I had a blast and am looking forward to next year.
By the way, I’m not sure if I’ve mentioned this at the beginning of this blog entry, but this is only THE BEGINNING OF SUMMER IN CASCADIA! I’ll have more coverage of these events and others coming up, the roadmap includes OS Bridge (where I’m also speaking) and Portland’s notorious OSCON.
Until the next conference, keep hacking on that next bad ass piece of software, cheers!