Lots of Events & Topical Tech Discussions

This week Ryan Zhang presented at the Seattle Scalability Meetup. I also gave a short presentation showing some tools that I’ve been using lately: DataGrip, related schema migrations, and the Docker containers I use as I work through those migrations. It was a solid meetup with excellent conversation afterwards. Big thanks to everybody who came out and joined us for a round of drinks, amazing cheese curds, and hummus at Collin’s! I’m looking forward to getting together again on May 28th with Guinevere (@guincodes) presenting “The PR That Wouldn’t Merge”!

Here are the upcoming events that I’ll be at, either presenting or attending. At the events I’m attending, let’s talk! I’m always interested in meeting new people and learning about what you’re working on, what you’re learning, and what efforts are of interest to you. For the events I’m presenting at the same applies, plus I’ll be standing up in front of everyone presenting whatever tidbit of knowledge I’ve come to present. Hopefully it’ll be useful and informative for you, and we can continue the conversation after the presentation so we all gain more insight, ideas, and ways to move forward more productively with our respective efforts. Here’s a list of the next big meetups and conferences I’m either speaking at or attending, and I hope to see and meet many of you dear \m/ readers there!

Attending ML4ALL @ Bossanova Ballroom on April 28th thru 30th in Portland, Oregon.
Presenting “Schema Migrations & Distributed Database Management Software Projects” in Portland on May 14th.
Attending Accelerate on May 21st-23rd at the Gaylord National Resort & Convention Center in Maryland.
Presenting “Go Trigonous Hacks for Database Reliability Engineering” in San Jose on either June 11th, 12th, or 13th.
Attending Velocity Conference and presenting at Software Architecture Conference @ San Jose Convention Center on June 10th-13th in San Jose, California.

DataStax Developer Days

Over the last week I had the privilege and adventure of coming out to Chicago and Dallas to teach about the operations and security capabilities of DataStax Enterprise. More about that later in this post. First I’ll elaborate on and answer the following:

What is DataStax Developer Day? Why would you want to attend?
Where are the current DataStax Developer Day events that have been held, and where are future events going to be held?
Possibilities for future events near a city you live in.

What is DataStax Developer Day?

We’ve organized this developer day event at DataStax around DataStax Enterprise, the product built on Apache Cassandra. However, I have to add the very important note that this isn’t merely a product pitch; you can and will learn about distributed databases and systems in a general sense too. We talk about a number of the core principles behind distributed systems, such as the pivotally important consistent hash ring, datacenters and racks, gossip, replication, snitches, and more. We feel it’s important that enough theory comes along with the configuration and features covered to understand the who, what, where, why, and how behind them.
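For a taste of the consistent hash ring idea, here is a minimal Python sketch. It is a toy under stated assumptions, not how Apache Cassandra or DataStax Enterprise implement it: the HashRing class, the node names, the use of MD5 for tokens, and the vnodes and replication_factor parameters are all made up for illustration.

```python
import bisect
import hashlib


class HashRing:
    """Toy consistent hash ring with virtual nodes (illustrative only)."""

    def __init__(self, nodes, vnodes=8):
        # Each physical node gets several tokens so load spreads more evenly.
        self._ring = []  # sorted list of (token, node) pairs
        for node in nodes:
            for i in range(vnodes):
                self._ring.append((self._token(f"{node}#{i}"), node))
        self._ring.sort()
        self._tokens = [token for token, _ in self._ring]

    @staticmethod
    def _token(value):
        # MD5 is used here only to spread keys evenly, not for security.
        return int(hashlib.md5(value.encode("utf-8")).hexdigest(), 16)

    def replicas(self, key, replication_factor=3):
        """Walk clockwise from the key's token and collect distinct nodes."""
        start = bisect.bisect_right(self._tokens, self._token(key))
        owners = []
        for offset in range(len(self._ring)):
            node = self._ring[(start + offset) % len(self._ring)][1]
            if node not in owners:
                owners.append(node)
            if len(owners) == replication_factor:
                break
        return owners


# Hypothetical four-node cluster; the names are placeholders.
ring = HashRing(["node-a", "node-b", "node-c", "node-d"])
owners = ring.replicas("user:42")
print(owners)
```

The point of the ring is that adding a node only moves the keys that the new node takes over; every other key keeps its owner, which is what makes growing a cluster cheap.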

The starting point of the day’s course material assumes that one has not worked with or played with Apache Cassandra or DataStax Enterprise. However, we have a number of courses throughout the day that delve into more specific details and advanced topics. There are three specific tracks:

Cassandra Track – this track consists of three workshops: Core Cassandra, Cassandra Data Modeling, and Cassandra Application Development. [more details]
DSE Track – this track consists of three workshops: DataStax Enterprise Search, DataStax Enterprise Analytics, and DataStax Enterprise Graph. [more details]
Bonus Content – This track has two workshops: DataStax Enterprise Overview and DataStax Enterprise Operations and Security. [more details]

Why would you want to attend?

One huge rad awesome reason is that the developer day events are FREE. But really, nothing is ever free right? You’d want to take a day away from the office to join us, so there’s that.
You also might want to even stay a little later after the event as we always have a solidly enjoyable happy hour so we can all extend conversations into the evening and talk shop. After all, working with distributed databases, managing data, and all that jazz is honestly pretty enjoyable when you’ve got awesome systems like this to work with, so an extended conversation into the evening is more than worth it!
You’ll get a firm basis of knowledge and skills around the use and management of Apache Cassandra and DataStax Enterprise, and more than a few ideas about how they can extend your system’s capabilities.
You’ll get a chance to go beyond merely the distributed database system of Apache Cassandra itself and delve into graph, what it is and how it works, analytics, and search too. All workshops take a look at the architecture, uses, and what these capabilities will provide your systems.
You’ll also have one on one time with DataStax engineers, and other technical members of the team to ask questions, talk about architecture and solutions that you may be working on, or generally discuss any number of graph, analytics, search, or distributed systems related questions.

Where are the current DataStax Developer Day events that have been held, and where are future events going to be held? So far we’ve held events in New York City, Washington DC, Chicago, and Dallas. We’ve got two more events scheduled, with one in London, England and one in Paris, France.

Future events? With a number of events completed and a few on the calendar, we’re interested in hearing about future possible locations for events. Where are you located and where might an event of this sort be useful for the community? I can think of a number of cities, but putting them in order to know where to get something scheduled next is difficult, which is why the team is looking for input. So ping me via @Adron, email, or just send me a quick message from here.

PDX Cloud – A Question Posed.

I attended the PDX Cloud meeting to present, but more to ask a question. Here’s how I posed that question (slide deck at the bottom of this blog entry). I frame the scenario of the distributed development world of cloud computing, dive into the vertical world of enterprise dev and then throw down the big question…

This is a situational report on the current state of the somewhat bi-polar condition that exists in software development right now. It reflects my train of thought around a number of aspects of the industry and the questions that have come up time and time again while working with fellow coders and technologists.

The first segment of the industry is the one that we often hear about. It’s the hip and cool thing to do, as well as the obvious path into the future right now. It’s not that this segment, building things as distributed systems, is new; it’s just that it has become more important and more capable now than it ever has been in the past.

A lot of this has to do with the advent of key technologies around virtualization, cloud computing and large scale object storage and network capabilities. We can spool up enough compute to rival a supercomputer while sitting alone at home, and store more data than we can imagine with no theoretical limit to that storage. All of this is networked together behind load balancers, switches and programmable devices that a mere half dozen years ago would have taken more resources than any reasonably sized small business could afford. All of these capabilities are literally at our fingertips now.

I’ve spooled up 1000 EC2 instances for a demo before. That was 2 years ago even! Now I, as well as many others, host applications and databases entirely in memory. SSDs as a cloud back end option at AWS and other providers offer another avenue that brings these devices into a world where they can be utilized immediately. Blink an eye, you’ll have the resources.

In the storage realm, options ranging from Glacier, with costs falling through the floor, to operationally effective choices like S3, EBS, Table Store, object storage and others make our junk trunks limitless. The option to throw away any data at all seems less and less appealing.

Many developers, but definitely not all, have seized opportunities to alter the way they work and what they’re able to accomplish by using these new capabilities. From the now common asynchronous approach to development and shifting languages and stacks, to the invention of new paradigms around development and operations in a DevOps practice, leadership has stepped up to this changing game.

Vertical systems have in the past twenty years held the main position in the enterprise as the go to architectures. Client server or three tier or whatever one may call it. With a synchronous mindset the vertical implementation of systems produced several benefits.

We gained the ability, through diligent documentation and widget style architecture, to build CRUD (Create Read Update Delete) and LOB (Line of Business) applications at a rapid rate. With a simplified approach like this, businesses spent a lot of time focusing on their business, not particularly on efficient utilization of resources, processing or reliability. But who could blame them? With Moore’s Law it seemed the only real ways to scale vertical systems were by writing faster code or buying a bigger computer, and for a while that seemed to work fine.

Most of what I’ll call the “vertical revolution” happened with the GSD mindset. GSD means Get Shit Done. Again, another idea that sort of worked pretty well as long as Moore’s Law was in effect. But things have started to change, with Moore’s Law faltering.

Management practices also became a complete TLA soup during this time. The last 20 years continued the standard “let’s cookie cut people into widget producers”. It never works as well as it could or should, but the industry, and really all humanity, keeps trying to do this anyway. This is fine, we’ve got to try. The vertical stack however brought this to the extreme forefront as the industry tried to shoehorn all sorts of development into singular types of management practices.

Overall though, as long as things stayed simple and we stuck to our KISS principles as software craftspeople, the architecture stayed straightforward enough and the stack stayed easy. However there are voluminous limitations. There are massive management and project issues with all of this.

Many parts of the industry are screaming for the future. As we have it, some agree on certain aspects of what the future should be and others agree on other aspects of the future.

We have some bright spots amid the confusion that are making the distributed world much easier, and the technology continues to do this.

Some want convergence. That may work well in some ways, but in others it is converging into a clustered mess. As with the roadways of the 50s and the effervescent ideas of 50s planners, we’re finding the idea of the superhighways isn’t working either. The same is starting to appear for some types of device convergence. So where does this really leave us? Where are our weak spots as an industry? It seems like right now we’re stuck in that traffic jam getting to the next step.

Things are looking a little like this freakingnews.com MAV. Multifunction and not functional at all.

So to gain clarity on direction I pose the question…

How do we change the latter world to work as well as the new world of distributed systems?

…and a few follow ups.

What do developers in the industry need to make true distributed computing advances while drawing on the known elements of the vertical computing realm?
What do we need as developers and leaders to more reliably advance the industry without setbacks?
What do we need as leaders to move the industry forward to the next steps, stages and developments in converging technology?
Are these even valid questions? What would you propose to ask?

It’s Happening Again, Seattle Code Camp!

I’ve got two presentations happening this year at Seattle Code Camp! Are you signed up? If not, hit this and get signed up ASAP: https://seattlecodecamp2013.eventbrite.com/

My two presentations are:

Distributed Databases – An Introduction to Riak

Presenter: Adron Hall

I’ll dive in with a quick definition and context of what distributed databases are. From there we’ll quickly move into what Riak is and how its architecture lends it toward being one of the premier distributed database solutions on the market today. We’ll take a walk from vector clocks to consistent hashes, and the clusters and rings managing the world of distributed systems. Then we’ll dive into a use case with a put and pull of data from a walkthrough implementation of Riak.
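To give a flavor of the vector clocks mentioned in that abstract, here is a minimal Python sketch. It is a generic illustration of the idea, not Riak’s implementation; the function names and the actor names (client-x and so on) are hypothetical.

```python
def increment(clock, actor):
    """Return a copy of the clock with the actor's counter advanced by one."""
    updated = dict(clock)
    updated[actor] = updated.get(actor, 0) + 1
    return updated


def descends(a, b):
    """True if clock a has seen every update that clock b has seen."""
    return all(a.get(actor, 0) >= count for actor, count in b.items())


def compare(a, b):
    """Classify how two clocks relate to each other."""
    if descends(a, b) and descends(b, a):
        return "equal"
    if descends(a, b):
        return "a is newer"
    if descends(b, a):
        return "b is newer"
    return "concurrent"  # neither saw the other's update: a conflict


def merge(a, b):
    """Combine two clocks by taking the highest counter for each actor."""
    return {actor: max(a.get(actor, 0), b.get(actor, 0))
            for actor in a.keys() | b.keys()}


# Two clients update the same starting value without seeing each other.
base = increment({}, "client-x")
left = increment(base, "client-y")
right = increment(base, "client-z")

print(compare(left, base))
print(compare(left, right))
print(merge(left, right))
```

The useful part is the last case: when neither clock descends from the other, the writes were concurrent, and the database (or the application) has to reconcile the siblings rather than silently keep one.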

…and…

Developer Workflow: From Angular.js, Riak, Testing and Vagrant Dev Environments

Presenter: Adron Hall

Each developer has to come up with a workflow that works well for them. Sometimes a lot of the workflow is dictated but there is still a lot that’s left up to the individual. With many modern tools you have a selection of everything from text editor, to IDE to actual operating system distribution. In this presentation I’m going to walk through some of the tooling to help keep all of these things under control during the course of programming efforts. …and yes, this will go beyond just the IDE (or text editor, etc)

…and others to check out!

Much Ado About Hadoop

Presenter: Jeremiah Peschka

By now you’ve heard the words “Big Data” and “Hadoop”, but you’re not sure what they mean, much less how to get started. You’re struggling with storing a lot of data, rapidly processing a huge volume of data, or maybe you’re just curious. There are a bewildering array of options and use cases within the Hadoop ecosystem. Every day I help customers understand their data problems, understand where Hadoop fits into their environment, and determine how they can use Hadoop to solve their problem. This session provides an introduction to what Hadoop is, when it’s appropriate to use Hadoop, and guidance on how to get started.

Unit Testing Web Development

Presenter: Mark Michaelis

When it comes to testing, Web Development is fraught with challenges whether it be from variations in browser behavior, the lack of compilation on JavaScript, or the traditional coupling between the UI and the code. In this session we walk through the complexities surrounding the testing of web projects and cover how to overcome these. This includes leveraging everything from source code analysis and JavaScript unit testing to UI and performance testing. Don’t miss this session to learn a multitude of ways to significantly improve the quality of your web development.

Riak in a .NET World

Presenter: Jeremiah Peschka

Developers have a lot of choices when it comes to storing data. In this session, we’ll introduce .NET developers to Riak, a distributed key-value database. Through a combination of concepts and practical examples, attendees will learn when Riak might be appropriate, how to get started with Riak using CorrugatedIron (a full-featured .NET client for Riak), and how to solve data modeling problems they’re likely to encounter. This talk is for developers who are interested in backing their applications with a fault-tolerant, distributed database.

Introduction to Ember.js

Presenter: Jon Cortez

Ember.js is an open-source client-side JavaScript web application framework based on the Model-View-Controller (MVC) software architectural pattern. It is designed to help developers build scalable Single Page Applications (SPAs) by incorporating common idioms and best practices into a framework that provides a rich object model, declarative two-way data binding, computed properties, automatically-updating templates, and a router for managing application state. In this session, you will learn the key concepts of Ember.js and how to use it to create a simple Single Page Application.

Think Like a Dev: Cognitive Pitfalls in Software Development

Presenter: Michael Ibarra

Our own minds are often working against us. What makes estimating so hard? Is there real value in planning poker? How effective are weekly retrospectives, really? Let’s explore how our minds may be working against us in ways we might not realize. We’ll examine the sources of some common cognitive biases, how they apply to our work efforts, and discuss some “strategery” for overcoming them.

Building a Server Appliance in Node.js

Presenter: Eugenio Pace

Auth0 is a server/service to drastically simplify authentication, identity federation & SSO scenarios; for web & mobile apps. It’s our first big project on node. One of the reasons we decided to build it entirely on node, is the ability to package it and deploy it anywhere: as a service in the public cloud, as a virtual appliance on private cloud, or as an appliance on-premises. In this session we’ll show how we built it. How we use JS for extensibility and easy customization. What worked well, what didn’t. Tools we used, etc.

Hope to see you there. Cheers!

Write the Docs, Proper Portland Brew, Hack n’ Bike and Polyglot Conference 2013

Blog Entry Index:

Write the Docs
Portland Proper Brew
How to Survive the Zombie Apocalypse with Riak @ Polyglot Conference 2013

I just wrapped up a long weekend of staycation. Monday kicked off Write the Docs this week and today, Tuesday, I’m getting back into the saddle.

Write the Docs

The Write the Docs Conference this week, a two day affair, has kicked off an expanding community around document creation. This conference is about what documentation is, how we create documentation as technical writers, writers, coders and others in the field.

Not only is it about those things, it is about how people interact and why documentation is needed in projects. This is one of the things I find interesting, as it seems obvious, but is entirely not obvious because of the battle between good documentation, bad documentation or a complete lack of documentation. The latter being the worst situation.

The Bloody War of Documentation!

At this conference it has been identified that the ideal documentation scenario is one where building it starts before any software is even built. I do and don’t agree with this, because I know we must avoid BDUF (Big Design Up Front). But we must take this idea, of documentation first, in the appropriate context of how we’re speaking about documentation at the conference. Just as identifying tests & behaviors up front, before the creation of the actual implementation, is vital to solid, reliable, consistent, testable & high quality production software, good documentation is absolutely necessary.

There are some situations, the exceptions, such as with agencies that create software, in which the software is throwaway. I’m not talking about those, and I don’t think much of the conference is about those types of systems. What we’ve been speaking about at the conference is the systems, or ecosystems, in which software is built, maintained and used for many years. We’re talking about the APIs that are built and then used by dozens, hundreds or thousands of people. Think of Facebook, GitHub and Twitter. All of these have APIs that thousands upon thousands use every day. They’re successful in large part, extremely so, because of stellar documentation. In the case of Facebook, there’s some love and hate to go around because they’ve gone between good documentation and bad documentation. However, whenever it has been reliable, developers have moved forward with these APIs and built billion dollar empires that employ hundreds of people and benefit thousands of people beyond that.

The developers speaking at the conference, the developers in the audience, and this developer too all tend to agree: build that README file before you build a single other thing within the project. Keep that README updated, keep it marked up and easy to read, and make sure people know what your intent is as best you can. Simply put, document!

You might also have snarkily asked, does Write the Docs have docs? Why yes, it does:

http://docs.writethedocs.org/ <- Give em’ a read, they’re solid docs.

Portland Proper Brew

Today, while using my iPhone to catch up on news & events from the time I had my staycation, I took a photo. On that photo I used Stitch to put together some arrows. Kind of a Portland Proper Brew (PPB) with documentation. (see what I did there!) It exemplifies a great way to start the day.

Every day I bike (or ride the train or bus) in to downtown Portland, anywhere from 5-9 kilometers, and swing into Barista on 3rd. Barista is one of the finest coffee shops in Portland & the world. If you don’t believe me, drag your butt up here and check it out. Absolutely stellar baristas, the best coffee (Coava, Ritual, Sightglass, Stumptown & others), and pretty sweet digs to get going in the morning.

I’ll have more information on a new project I’ve kicked off. Right now it’s called Bike n’ Hack, which will be a scavenger style code hacking & bicycle riding urban awesome game. If you’re interested in hearing more about the project, the game & how everything will work, be sure to contact me via Twitter @adron or jump into the bike n’ hack GitHub organization, where the team will be adding more information about who, what, where, when and why this project is going to be a blast!

Polyglot Conference & the Zombie Apocalypse

I’ll be teaching a tutorial, “Introduction to Distributed Databases”, at Polyglot Conference in Vancouver in May! So it has begun & I’m here for you! Come and check out how to get a Riak deployment running in your survival bunker’s data center. Whether the apocalypse scenario is zombies or just your pointy-haired boss, we’ll discuss how consistent hashing, hinted handoff and gossiping can help your systems survive infestations! Here’s a basic outline of what I’ll cover…

Introducing Riak, a database designed to survive the Zombie Plague. Riak Architecture & 5 Minute History of Riak & Zombies.

Architecture deep dive:

Consistent Hashing, managing to track changes when your kill zone is littered with Zombies.
Intelligent Replication, managing your data against each of your bunkers.
Data Re-distribution, sometimes they overtake a bunker, how your data is re-distributed.
Short Erlang Introduction, a language fit for managing post-civil society.
Getting Erlang

Installing Riak on…

Ubuntu, RHEL & the Linux Variety.
OS-X, the only user centered computers to survive the apocalypse.
From source, maintained and modernized for humanity’s survival.
Upgrading Riak, when a bunker is retaken from the zombies, it’s time to update your Riak.
Setting up

Devrel – A developer’s machine w/ Riak – how to manage without zombie bunkers.

5 nodes, a basic cluster
Operating Riak
Starting, stopping, and restarting
Scaling up & out
Managing uptime & data integrity
Accessing & writing data

Polyglot client libraries

JavaScript/Node.js & Erlang for the zombie curing mad scientists.
C#/.NET & Java for the zombie creating corporations.
Others, for those trying to just survive the zombie apocalypse.
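As a small preview of the hinted handoff idea from the outline above, here is a simplified Python sketch. It is an assumption-laden toy, not Riak’s actual mechanism: the Node and Cluster classes, the bunker names, and passing the owner in by name (instead of finding it through consistent hashing) are all invented for illustration.

```python
class Node:
    """One bunker: holds its own data plus hints kept for fallen neighbors."""

    def __init__(self, name):
        self.name = name
        self.up = True
        self.data = {}
        self.hints = []  # (intended owner name, key, value)


class Cluster:
    def __init__(self, names):
        self.nodes = {name: Node(name) for name in names}
        self.order = list(names)  # ring order used to pick a fallback

    def write(self, key, value, owner_name):
        """Write to the owner, or leave a hint on the next live node."""
        owner = self.nodes[owner_name]
        if owner.up:
            owner.data[key] = value
            return owner.name
        start = self.order.index(owner_name)
        for step in range(1, len(self.order)):
            fallback = self.nodes[self.order[(start + step) % len(self.order)]]
            if fallback.up:
                fallback.hints.append((owner_name, key, value))
                return fallback.name
        raise RuntimeError("no live node available")

    def recover(self, name):
        """Bring a node back and hand it every hint stored on its behalf."""
        node = self.nodes[name]
        node.up = True
        for other in self.nodes.values():
            remaining = []
            for target, key, value in other.hints:
                if target == name:
                    node.data[key] = value
                else:
                    remaining.append((target, key, value))
            other.hints = remaining


cluster = Cluster(["bunker-1", "bunker-2", "bunker-3"])
cluster.nodes["bunker-2"].up = False  # overrun by zombies
holder = cluster.write("survivors", 12, owner_name="bunker-2")
cluster.recover("bunker-2")  # bunker retaken
print(holder, cluster.nodes["bunker-2"].data)
```

The write succeeds even while the owning bunker is down, and the data finds its way home once the bunker is retaken.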

If you haven’t registered for the Polyglot Conference yet, get registered ASAP as it could sell out!

Some of the other tutorials that are happening, that I wish I could clone myself for…

Angular js and HTML 6! w/ Chris Nicola @lucisferre & Saem @saemg
Intro to Erlang w/ Yurii Rashkovskii @yrashk

That’s it for updates right now, more code & news later. Cheers!