Category Archives: Architecture

Survey of Go Libraries for Database Work

Over the past few months I’ve picked up a number of libraries in the Go ecosystem to help me get work done around database engineering. These libraries are ones that I have used to do a range of work primarily around Apache Cassandra, DataStax Enterprise, PostgreSQL, and to a lesser degree MS SQL Server, MySQL, and others. The following is a survey of libraries that I’ve found to be pretty solid for getting the job done.

DevOps Days Vancouver - Architecture Guidance - Venomous Database Reliability Engineering (5)

I’ve broken the following tooling libraries out into these categories:

  • Observability, Monitoring, & Insight – I created this section, and added libraries to it, based on the specific and somewhat pedantic distinction between observability and monitoring, both of which provide insight into the applications one is responsible for. For additional information about observability, the Wikipedia article on the topic is a great starting point. Monitoring gets more specific, with a breakdown into types: application performance monitoring, network monitoring, system monitoring, and business transaction monitoring. The libraries in this section apply to some or all of the criteria in these definitions.
  • Data Schema Migration – Managing the data schema for a database. Even if you really, truly, honestly have a schema-less system, you still need to manage the underlying schema at some level.
  • Flow, Pipelines, Extraction, Transformation, and Loading – This section is a mixed bag: it includes many types of libraries that cover a very wide range of work and offer a plethora of ways to do it, from creating pipelines and flow sequences, to extraction and transformation, to standard bulk loading. These libraries provide ways to get the data where you need it, when you need it there, in effective and reliable ways.
  • Database Backup Libraries – There are a zillion different things that go into maintaining effective and useful database backups: onsite storage, offsite storage, rotation periods, transmission & security control, scheduling, full or differential backups, and other topics of concern. One of the most important and often overlooked aspects of database backups is actually restoring the database from backup! These libraries can be used to take those backups, automate them, and implement restoration of data in a more seamless way.
  • Database Drivers – At the core of any programmable automation of databases, one needs some way to connect to and work with the databases being automated. That’s where database drivers come into play. For Go, there’s a ton of support for every relatively well-known database in existence: MS SQL, Apache Cassandra, PostgreSQL, and dozens more!

Observability, Monitoring, & Insight

Veneur – Largely used by and originating from Stripe. This library works as a distributed, fault tolerant pipeline for data emitted at run time from systems and services throughout your environment. It has server implementations of the DogStatsD protocol and SSF (Sensor Sensibility Format) for aggregating metrics and sending them to storage or, via sinks, to various other systems. It can also work up histograms, sets, and counters as a global aggregator.

TLDR;

Veneur is a convenient sink for various observability primitives with lots of outputs!
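To make the DogStatsD part a little more concrete, here’s a minimal sketch of what a client does: format a counter as a single text datagram and fire it over UDP at whatever is listening. The metric name, the tags, and the 127.0.0.1:8126 address are all assumptions for illustration; point it at wherever your own Veneur (or other DogStatsD-compatible) listener actually runs.

```go
package main

import (
	"fmt"
	"net"
	"strings"
)

// formatCounter renders one DogStatsD counter datagram in the form
// <name>:<value>|c|#<tag1>,<tag2>. The tag section is omitted when
// there are no tags.
func formatCounter(name string, value int64, tags []string) string {
	line := fmt.Sprintf("%s:%d|c", name, value)
	if len(tags) > 0 {
		line += "|#" + strings.Join(tags, ",")
	}
	return line
}

func main() {
	// Assumed address: replace with your own statsd listener.
	conn, err := net.Dial("udp", "127.0.0.1:8126")
	if err != nil {
		fmt.Println("dial failed:", err)
		return
	}
	defer conn.Close()

	// Hypothetical metric name and tags, purely for illustration.
	msg := formatCounter("db.backup.completed", 1, []string{"env:dev", "db:cassandra"})

	// UDP is fire-and-forget, so a successful write does not prove
	// that anything received the metric.
	if _, err := conn.Write([]byte(msg)); err != nil {
		fmt.Println("send failed:", err)
	}
}
```

In practice you’d reach for an existing statsd client library rather than hand-rolling the datagrams, but it helps to see how little is on the wire.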

Honeycomb.io – I did some work for Honeycomb back in February of 2018 and gotta say I loved the team. Charity @mipsytipsy, Christine @cyen, Ben @maplebed and crew are tops! Friendly, wildly smart, and humble to boot. With that said, I’m also a fan of the product. It’s a solid, high cardinality query and event intake system for observability. There are libraries for Go as well as other languages, and it’s pretty easy to use the library to set up ingest for appropriately instrumented applications.

TLDR;

Honeycomb.io is a SaaS tool with available libraries for Go to provide observability insight and data collection for your applications!

OpenCensus – This framework and toolset provides ways to get telemetry out of your services. Currently there are libraries for a number of languages that allow you to capture, manipulate, and export metrics and distributed traces to your data store of choice. The key idea is that OpenCensus works by tracing through the course of events in an application, and that data is logged for awareness, insight, and thus observability of your systems.

TLDR;

OpenCensus is a library that provides ways to gather telemetry for your services and store it in your choice of a location.

RxGo – This library is a set of reactive extensions built for Go. It is as much a programming concept as it is a way to enhance and specifically focus on observability, so let’s take a look at the intro example they’ve got on the actual repo README.md itself.

ReactiveX, or Rx for short, is an API for programming with observable streams. This is a ReactiveX API for the Go language.

ReactiveX is a new, alternative way of asynchronous programming to callbacks, promises and deferred. It is about processing streams of events or items, with events being any occurrences or changes within the system.

In Go, it is simpler to think of an observable stream as a channel which can Subscribe to a set of handler or callback functions.

The pattern is that you Subscribe to an Observable using an Observer:

subscription := observable.Subscribe(observer)

An Observer is a type that consists of three EventHandler fields: the NextHandler, ErrHandler, and DoneHandler, respectively. These handlers can be evoked with the OnNext, OnError, and OnDone methods, respectively.

The Observer itself is also an EventHandler. This means all types mentioned can be subscribed to an Observable.

// nums is assumed to be a []int declared in the enclosing scope.
nextHandler := func(item interface{}) {
    if num, ok := item.(int); ok {
        nums = append(nums, num)
    }
}

// Only next item will be handled.
sub := observable.Subscribe(handlers.NextFunc(nextHandler))

TLDR;

RxGo is the set of reactive extensions for Go that makes it easier to go full scale on observability, with significantly greater insight into your applications over time and the events they execute.
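If the “observable stream as a channel” idea is still a bit abstract, here’s a minimal sketch of the pattern using nothing but the standard library. To be clear, this is not RxGo’s API: the Observer struct and Subscribe function below are stand-ins I made up to show how next, error, and done handlers map onto draining a channel.

```go
package main

import (
	"errors"
	"fmt"
)

// Observer bundles the three handlers: one for each item, one for an
// error, and one for normal completion. All three must be non-nil.
type Observer struct {
	OnNext  func(item interface{})
	OnError func(err error)
	OnDone  func()
}

// Subscribe drains the stream in a goroutine, routing each value to the
// matching handler. The returned channel is closed once the stream ends.
func Subscribe(stream <-chan interface{}, o Observer) <-chan struct{} {
	finished := make(chan struct{})
	go func() {
		defer close(finished)
		for item := range stream {
			if err, ok := item.(error); ok {
				o.OnError(err)
				return // an error terminates the stream; OnDone is not called
			}
			o.OnNext(item)
		}
		o.OnDone()
	}()
	return finished
}

func main() {
	stream := make(chan interface{}, 4)
	stream <- 1
	stream <- 2
	stream <- errors.New("connection lost")
	stream <- 3 // never delivered, the error above ends the stream
	close(stream)

	var nums []int
	<-Subscribe(stream, Observer{
		OnNext: func(item interface{}) {
			if num, ok := item.(int); ok {
				nums = append(nums, num)
			}
		},
		OnError: func(err error) { fmt.Println("error:", err) },
		OnDone:  func() { fmt.Println("done") },
	})
	fmt.Println(nums)
}
```

The real library layers operators (map, filter, and friends) on top of this basic shape, which is where most of the value comes from.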

Data Schema Migration

Go-Migrate – This library is written in Go and handles data schema migrations for a significant number of databases: PostgreSQL, MySQL, SQLite, Redshift, Neo4j, CockroachDB, and that’s just a few.

Example:

migrate -source file://path/to/migrations -database postgres://localhost:5432/database up 2

TLDR;

Go-Migrate is an open source library that can be used via CLI or in code to manage all your schema migration needs.
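For anybody new to versioned migrations, the “up 2” in that command means “apply the next two pending migrations, in version order”. Here’s a rough sketch of just that selection logic. It is not how Go-Migrate is implemented internally; the Migration type, the pendingUp function, and the file names are all hypothetical, made up to illustrate the idea.

```go
package main

import (
	"fmt"
	"sort"
)

// Migration is a hypothetical in-memory stand-in for one versioned "up" file.
type Migration struct {
	Version int
	Name    string
}

// pendingUp returns the next `steps` migrations above the current schema
// version, in ascending version order. steps <= 0 means "all pending".
func pendingUp(all []Migration, current, steps int) []Migration {
	// Sort a copy so the caller's slice is left untouched.
	sorted := append([]Migration(nil), all...)
	sort.Slice(sorted, func(i, j int) bool { return sorted[i].Version < sorted[j].Version })

	var out []Migration
	for _, m := range sorted {
		if m.Version <= current {
			continue // already applied
		}
		if steps > 0 && len(out) == steps {
			break
		}
		out = append(out, m)
	}
	return out
}

func main() {
	all := []Migration{
		{3, "add_email_index"},
		{1, "create_users"},
		{2, "create_todo_lists"},
		{4, "add_notes_column"},
	}
	// Pretend the database is at version 1 and we run "up 2".
	for _, m := range pendingUp(all, 1, 2) {
		fmt.Printf("apply %d_%s.up.sql\n", m.Version, m.Name)
	}
}
```

The real tool also records the current version in the database and handles down migrations and dirty states, which is exactly the bookkeeping you don’t want to write yourself.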

Gocqlx Migrate – This library primarily provides extensions to the Go CQL driver library, and one of those extensions is data schema migration functionality.

Example:

package main

import (
    "context"

    "github.com/scylladb/gocqlx/migrate"
)

const dir = "./cql" 

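func main() {
    // CreateSession is a placeholder for your own helper that returns a
    // connected gocql session; it is not part of gocqlx.
    session := CreateSession()
    defer session.Close()

    // Apply the migration files found in dir that have not been run yet.
    ctx := context.Background()
    if err := migrate.Migrate(ctx, session, dir); err != nil {
        panic(err)
    }
}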

TLDR;

Gocqlx Migrate is a feature of the Gocqlx extensions library that can be used for schema migrations from within code.

Flow, Pipelines, Extraction, Transformation, and Loading

Pachyderm – (Open Source Repo) A pachyderm is

a very large mammal with thick skin, especially an elephant, rhinoceros, or hippopotamus.

So it is kind of a fitting name for this library. The library, the project itself, has found funding and bills itself as “Scalable, Reproducible Data Science”. I’ve used it minimally myself, but find it continually popping up on my “use this tool because you’ll need a ton of the features” list.

TLDR;

Pachyderm is an open source library, paired with a funded company behind it, that does indeed provide scalable, reproducible data science in addition to being a great library for your ETL and related data management needs.

Reflow – This library provides incremental data processing in the cloud. This gives scientists and engineers the ability to put tools together, packaged in Docker images, using programming constructs. The library then evaluates the programs, transparently parallelizing the work and memoizing results, i.e. running tasks concurrently and caching data appropriately to speed things up. The library was created at GRAIL to manage their NGS (next generation sequencing) bioinformatics workloads on AWS, but has also been used for many other applications, including model training and ad-hoc data analyses. Several of Reflow’s key features include:

  • functional, lazy, type-safe Domain Specific Language (DSL) for writing workflow programs.
  • the runtime for the DSL evaluates incrementally, coordinating cluster execution, and memoization.
  • a cluster scheduler to dynamically provision and tear down resources in the cloud (currently AWS is supported).
  • with containers the same processing workloads can also be executed locally.

TLDR;

Reflow provides a way for data scientists, and by proxy database administrators, data programmers, and anybody else who needs to work through ETL or related work, to write programs against that data in the cloud or locally.
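The “parallelize and memoize” idea is worth a tiny illustration. The sketch below is not Reflow’s DSL or runtime, just the underlying concept in plain Go: each step is keyed, run at most once, and repeated requests for the same key reuse the cached result even when they arrive concurrently. The Memo type, the sample names, and the fake “expensive” computation are all assumptions for the example.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// entry holds the result for one key; once guarantees a single execution.
type entry struct {
	once sync.Once
	val  int
}

// Memo caches step results by key and is safe for concurrent use.
type Memo struct {
	mu      sync.Mutex
	entries map[string]*entry
}

func NewMemo() *Memo { return &Memo{entries: make(map[string]*entry)} }

// Do returns the cached value for key, computing it on first use.
// The map lock is released before compute runs, so different keys
// can be computed in parallel.
func (m *Memo) Do(key string, compute func() int) int {
	m.mu.Lock()
	e, ok := m.entries[key]
	if !ok {
		e = &entry{}
		m.entries[key] = e
	}
	m.mu.Unlock()

	e.once.Do(func() { e.val = compute() })
	return e.val
}

func main() {
	memo := NewMemo()
	var executions int64

	// Hypothetical inputs; note the duplicates.
	inputs := []string{"sample-a", "sample-bb", "sample-a", "sample-bb", "sample-ccc"}
	results := make([]int, len(inputs))

	var wg sync.WaitGroup
	for i, in := range inputs {
		wg.Add(1)
		go func(i int, in string) {
			defer wg.Done()
			results[i] = memo.Do(in, func() int {
				atomic.AddInt64(&executions, 1)
				return len(in) * 10 // stand-in for an expensive step
			})
		}(i, in)
	}
	wg.Wait()

	fmt.Println("results:", results)
	fmt.Println("steps actually executed:", atomic.LoadInt64(&executions))
}
```

Five requests come in, but only three distinct steps ever run. Scale that up to hours-long pipeline stages and it’s easy to see why incremental evaluation matters.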

Database Backup Libraries

Restic (Github) – Restic is a backup CLI and Go library that will back up to a number of destinations, including: a local directory, sftp, http REST, S3, Google Cloud Storage, Azure Blob Storage, and others.

Restic follows several objectives:

  • The tool aims to be easy, with minimal singular steps to execute a backup.
  • The tool aims to be fast, using appropriate mechanisms to ensure speedy backups.
  • The tool aims to provide verifiable backups that can easily be restored.
  • The tool aims to incorporate cryptographic guarantees of confidentiality to make sure the backups are secure.
  • The tool aims to be efficient with additional snapshots only taking the storage of the actual increment and de-duplicated to save space in the storage back end.
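That last objective, snapshots that only cost the storage of the actual increment, usually comes down to content-addressed storage: split the data into chunks, name each chunk by its hash, and never store the same chunk twice. Here’s a minimal sketch of the idea. It is my own illustration and not restic’s code; in particular, the fixed-size chunks and the in-memory map are simplifying assumptions, and encryption is left out entirely.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// Store keeps each unique chunk once, keyed by the SHA-256 of its contents.
type Store struct {
	chunks map[string][]byte
}

func NewStore() *Store { return &Store{chunks: make(map[string][]byte)} }

// Snapshot splits data into fixed-size chunks, stores only chunks not
// already present, and returns the ordered chunk IDs plus the number of
// new bytes written. chunkSize must be greater than zero.
func (s *Store) Snapshot(data []byte, chunkSize int) (ids []string, newBytes int) {
	for start := 0; start < len(data); start += chunkSize {
		end := start + chunkSize
		if end > len(data) {
			end = len(data)
		}
		chunk := data[start:end]
		sum := sha256.Sum256(chunk)
		id := hex.EncodeToString(sum[:])
		if _, seen := s.chunks[id]; !seen {
			s.chunks[id] = append([]byte(nil), chunk...) // store a copy
			newBytes += len(chunk)
		}
		ids = append(ids, id)
	}
	return ids, newBytes
}

// Restore reassembles the data from chunk IDs, re-hashing every chunk so
// that missing or corrupted data is detected rather than silently returned.
func (s *Store) Restore(ids []string) ([]byte, error) {
	var out []byte
	for _, id := range ids {
		chunk, ok := s.chunks[id]
		if !ok {
			return nil, fmt.Errorf("missing chunk %s", id)
		}
		sum := sha256.Sum256(chunk)
		if hex.EncodeToString(sum[:]) != id {
			return nil, fmt.Errorf("corrupt chunk %s", id)
		}
		out = append(out, chunk...)
	}
	return out, nil
}

func main() {
	store := NewStore()
	first := []byte("AAAABBBBCCCC")
	second := []byte("AAAABBBBDDDD") // only the last chunk differs

	_, n1 := store.Snapshot(first, 4)
	ids2, n2 := store.Snapshot(second, 4)
	restored, err := store.Restore(ids2)

	fmt.Println("first snapshot wrote", n1, "bytes; second wrote", n2)
	fmt.Println("restored:", string(restored), "error:", err)
}
```

The restore path is the part to pay attention to: verifying hashes on the way back out is what makes a backup “verifiable” rather than merely “taken”.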

Database Drivers

For each of these databases there’s a single driver that I use, except in the case of Apache Cassandra and DataStax Enterprise, where I have also picked up gocqlx to add to my gocql usage.

PostgreSQL (pq) – Features:

  • SSL
  • Handles bad connections for database/sql
  • Scan time.Time correctly (i.e. timestamp[tz], time[tz], date)
  • Scan binary blobs correctly (i.e. bytea)
  • Package for hstore support
  • COPY FROM support
  • pq.ParseURL for converting urls to connection strings for sql.Open.
  • Many libpq compatible environment variables
  • Unix socket support
  • Notifications: LISTEN/NOTIFY
  • pgpass support

Gocql & Gocqlx

Gocql Features:

  • Modern Cassandra client using the native transport
  • Automatic type conversions between Cassandra and Go
    • Support for all common types including sets, lists and maps
    • Custom types can implement a Marshaler and Unmarshaler interface
    • Strict type conversions without any loss of precision
    • Built-In support for UUIDs (version 1 and 4)
  • Support for logged, unlogged and counter batches
  • Cluster management
    • Automatic reconnect on connection failures with exponential falloff
    • Round robin distribution of queries to different hosts
    • Round robin distribution of queries to different connections on a host
    • Each connection can execute up to n concurrent queries (whereby n is the limit set by the protocol version the client chooses to use)
    • Optional automatic discovery of nodes
    • Policy based connection pool with token aware and round-robin policy implementations
  • Support for password authentication
  • Iteration over paged results with configurable page size
  • Support for TLS/SSL
  • Optional frame compression (using snappy)
  • Automatic query preparation
  • Support for query tracing
  • Support for Cassandra 2.1+ binary protocol version 3
    • Support for up to 32768 streams
    • Support for tuple types
    • Support for client side timestamps by default
    • Support for UDTs via a custom marshaller or struct tags
  • Support for Cassandra 3.0+ binary protocol version 4
  • An API to access the schema metadata of a given keyspace

Gocqlx Features:

  • Binding query parameters from struct or map
  • Scanning results directly into struct or slice
  • CQL query builder (package qb)
  • Super simple CRUD operations based on table model (package table)
  • Database migrations (package migrate)

Go-MSSQLDB – Features:

  • Can be used with SQL Server 2005 or newer
  • Can be used with Microsoft Azure SQL Database
  • Can be used on all Go supported platforms (e.g. Linux, Mac OS X and Windows)
  • Supports new date/time types: date, time, datetime2, datetimeoffset
  • Supports string parameters longer than 8000 characters
  • Supports encryption using SSL/TLS
  • Supports SQL Server and Windows Authentication
  • Supports Single-Sign-On on Windows
  • Supports connections to AlwaysOn Availability Group listeners, including re-direction to read-only replicas.
  • Supports query notifications

These are just a few of the libraries I use, have worked with, and suggest checking out if you’re delving into database work, especially building systems around databases for reliability and related efforts.

If you’ve got other libraries that you’ve used, or really like, definitely leave a comment and let me know and I’ll update the post to include new libraries for Go. Subscribe to the blog too as I’ve got more posts in the cooker for database work, Go libraries and usage with databases, and a lot more. Happy thrashing code!

Creating Distributed Database Application Starter Kits

I’ve boarded a bus, and as always, when I board a bus I almost always code. Unless of course there are people I’m hanging out with then I chit chat, but right now this is the 212 and I don’t know anybody on this chariot anyway. So into the code I go.

I’ve been re-reviewing the Docker and related collateral we offer at DataStax. In that review it seems like it would be worth having some starter kit applications along with these “default” Docker options. I’ve created this post to lay out the first language & tech stack of several starter kits I’m going to create.

Starter Kit – The Todo List Template

This first set of starter kits will be based upon a todo list application. It’s really simple, minimal in features, and offers a complete top to bottom implementation of a service, and an application on top of that service all built on Apache Cassandra. In some places, and I’ll clearly mark these places, I might add a few DataStax Enterprise features around search, analytics, or graph.

The Todo List

Features: The following details the features, from the user’s perspective, that this application will provide. Each implementation will provide all of these features.

  • A user wants to create a user account to create todo lists with.
  • A user wants to be able to store a username, full name, email, and some simple notes with their account.
  • A user wants to be able to create a todo list that is identified by a user defined name. (i.e. “Grocery List”, “Guitar List”, or “Stuff to do List”)
  • A user wants to be able to log out and return, then retrieve a list from a list of their lists.
  • A user wants to be able to delete a todo list.
  • A user wants to be able to update a todo list name.
  • A user wants to be able to add items to a todo list.
  • A user wants to be able to update items in the todo list.
  • A user wants to be able to delete items in a todo list.
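To get a feel for what those user stories imply for the service, here’s a rough sketch of the data model and a few of the operations in Go. Everything in it is an assumption on my part: the type names, the method names, and especially the in-memory maps, which are just a stand-in for the eventual Apache Cassandra tables so the sketch can run on its own.

```go
package main

import (
	"errors"
	"fmt"
)

// User carries the account fields called out in the feature list.
type User struct {
	Username string
	FullName string
	Email    string
	Notes    string
}

// TodoList is identified by a user defined name.
type TodoList struct {
	Name  string
	Items []string
}

// Store is an in-memory stand-in for the Cassandra-backed service.
type Store struct {
	users map[string]User
	lists map[string]map[string]*TodoList // username -> list name -> list
}

func NewStore() *Store {
	return &Store{users: map[string]User{}, lists: map[string]map[string]*TodoList{}}
}

func (s *Store) CreateUser(u User) error {
	if _, exists := s.users[u.Username]; exists {
		return errors.New("username already taken")
	}
	s.users[u.Username] = u
	s.lists[u.Username] = map[string]*TodoList{}
	return nil
}

func (s *Store) CreateList(username, name string) error {
	lists, ok := s.lists[username]
	if !ok {
		return errors.New("unknown user")
	}
	if _, exists := lists[name]; exists {
		return errors.New("list already exists")
	}
	lists[name] = &TodoList{Name: name}
	return nil
}

func (s *Store) AddItem(username, list, item string) error {
	l, ok := s.lists[username][list]
	if !ok {
		return errors.New("unknown list")
	}
	l.Items = append(l.Items, item)
	return nil
}

// RenameList updates a list name while keeping its items.
func (s *Store) RenameList(username, oldName, newName string) error {
	lists := s.lists[username]
	l, ok := lists[oldName]
	if !ok {
		return errors.New("unknown list")
	}
	if _, exists := lists[newName]; exists {
		return errors.New("list already exists")
	}
	delete(lists, oldName)
	l.Name = newName
	lists[newName] = l
	return nil
}

func main() {
	s := NewStore()
	fmt.Println(s.CreateUser(User{Username: "adron", FullName: "Adron", Email: "adron@example.com"}))
	fmt.Println(s.CreateList("adron", "Grocery List"))
	fmt.Println(s.AddItem("adron", "Grocery List", "coffee"))
	fmt.Println(s.RenameList("adron", "Grocery List", "Guitar List"))
	fmt.Println(s.AddItem("adron", "Grocery List", "strings")) // old name no longer exists
}
```

Note that a rename here is a delete plus an insert under the new key. That’s worth keeping in mind for the real thing, since a list name that is part of a Cassandra primary key can’t simply be updated in place.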

Architecture: The following is the architecture of the todo list starter kit application.

  • Database: Apache Cassandra.
  • Service: A small service to manage the data tier of the application.
  • User Interface: A web interface using React/Vuejs ??

As you can see, some of the items are incomplete, but I’ll decide on them soon. My next step is to check out what I really want to use for the user interface, and also to get a user account system figured out. I don’t really want to create the entire user account system myself, but instead would like to use something like Auth0 or Okta.

May I Ask?

There are numerous things I’d love help with. Are there any user stories you think are missing? Should I add something? What would make these helpful to you? Leave a comment, or tweet at me @Adron. I’d be happy to get some feedback and others’ thoughts on the matter so that I can ensure that these are simple, to the point, usable, and helpful to people. Cheers!

Let’s Really Discuss Lock In

For too long lock-in has been referred to with an almost entirely negative connotation, even though it can apply in positive as well as negative situations. The fact is that there’s a much more nuanced and balanced range of benefits and disadvantages to lock-in. Often it may be referred to as this or that dependency, but either way a dependency is often just another form of lock-in. Weighing those and finding the right balance for your projects can actually lead to lock-in being a positive game changer, or something that simply provides a basis on which to work and operate. Sometimes lock-in will actually provide a way to remove lock-in by opening up more choices elsewhere, which in turn may bring another variant of lock-in.

Concrete Lock-in Examples

The JavaScript Lock-In

Take the language we choose to build an application in. JavaScript is a great example. It has become the singular language of the web, at least on the client side. This was, long ago, a form of lock-in that browser makers (and standards bodies) chose, and it dictated how and in which direction the web – at least web pages – would progress.

JavaScript has now become a prominent language on the server side too, thanks to Node.js. It has even moved in as a first class language in serverless technology like AWS’s Lambda. JavaScript is a perfect example of a language that was initially a source of specific lock-in, being required for the client, and that eventually expanded to allow programming in a number of other environments – reducing JavaScript’s lock-in – but displacing lock-in through abstractions to other spaces such as the server side and serverless functions.

The .NET Windows SQL Server Lock In

JavaScript is merely one example, and a relatively positive one that expands one’s options more than it limits one’s efforts. But let’s say the decision is made to build a high speed trading platform on SQL Server, .NET C#, and Windows Server. This is a technology combination that has notoriously illustrated in the past * how lock-in can be extremely dangerous.

Say this application was built out with this set of technology platforms. It used stored procedures in SQL Server, locking the application into that specific database. It used proprietary Windows specific libraries in .NET with the C# code, and on Windows it used IIS specific features to make the application faster. When it was first built it seemed plenty fast and scaled just right according to the demand at the time.

Fast forward to today. The application now has a database that was sharded when it hit a mere 8 terabytes, loaded on two super pumped up – at least for today – servers that have many cores, many CPUs, GPUs, and all that jazz. They came in around $240k each! The application is tightly coupled to a middle tier, which is then sort of tightly coupled to those famous stored procedures, and the application of course has a turbo capability courtesy of those IIS servers.

But today it’s slow. Looking at benchmarks and query times, the database is having a hard time dealing with things as is, and the application has outages on a routine basis for a whole variety of reasons. Sometimes tracing and debugging solves the problems quickly; other times the servers just oversubscribe resources and sit thrashing.

Where does this application go? How does one resolve the database loading issues? They’ve already sunk a half million on servers, they’re pegged out already, horizontal scaling isn’t an option, and they’re tightly coupled to Windows Servers running IIS, removing the possibility of effectively scaling out the application servers via container technologies, among other issues. Without recourse, this is the type of lock-in that will kill the company unless something is changed in a massive way very soon.

To add, this is the description of an actual company that is now defunct. I phrased it as existing today only to make the point. The hard reality is the company went under, almost entirely because of the costs of maintaining an unsustainable architecture that caused an exorbitant lock-in to very specific tools – largely because the company drank the Kool-Aid and used the tools as suggested. They developed the product into a corner. That mistake was so expensive that it decimated the finances of the company. Not a good scenario, not a happy outcome, and something to be avoided in every way! This is truly the epitome of negative lock-in.

Of course there’s this distinctive lock-in we have to steer clear of, but there’s also the lock-in associated with languages and other technology capabilities that will help your company move forward faster, easier, and with increasing capabilities. Those are the choices, the ties to technology and capabilities, that decision makers can really leverage with fewer negative consequences.

The “Lock In” That Enables

One common statement is, “the right tool for the job”. This is of course for the ideal world where ideal decisions can be made all the time. That world doesn’t exist, and we have to strive for balance between decisions that will wreck the ship and decisions that will give us clear waters ahead.

For databases we need to choose the right databases for where we want to go versus where we are today. Not to gold plate the solution, but to have intent and a clear focus on what we want our future technology to hold for us. If we intend to expand our data and want to maintain the ability to effectively query – let’s take the massive SQL Server for example – what could we have done to prevent it from becoming a debilitating decision?

A solution that could have effectively come into play would have been not to shard the relational database, but instead to either export or split the data in a more horizontal way and put it into a distributed database store. Start building the application so that this system could be used instead of being limited by the relational database. As the queries are built out and the tight coupling to SQL Server removed, the new distributed database could easily add nodes to compensate for the ever growing size of the data stored. The options are numerous, and all of them are a form of lock-in, but not the kind that eventually killed this company, which had detrimentally locked itself into the use of a relational database.
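What does “removing the tight coupling” look like in code? Usually it’s as unglamorous as putting an interface between the business logic and the database, so the backend can be swapped without rewriting the application. Here’s a minimal sketch in Go (not the .NET stack from the story, and not anything that company actually built); the Trade type, the TradeStore interface, and the in-memory backend are all hypothetical.

```go
package main

import (
	"errors"
	"fmt"
)

// Trade is a hypothetical record type for the trading platform example.
type Trade struct {
	ID     string
	Symbol string
	Qty    int
}

// ErrNotFound is returned when no trade exists for an ID.
var ErrNotFound = errors.New("trade not found")

// TradeStore is the only thing the application code depends on. A SQL
// Server backend, a distributed database backend, or a test fake can
// each satisfy it.
type TradeStore interface {
	Save(t Trade) error
	Get(id string) (Trade, error)
}

// memoryStore is a stand-in backend so the sketch runs on its own.
type memoryStore struct {
	rows map[string]Trade
}

func newMemoryStore() *memoryStore { return &memoryStore{rows: map[string]Trade{}} }

func (m *memoryStore) Save(t Trade) error {
	m.rows[t.ID] = t
	return nil
}

func (m *memoryStore) Get(id string) (Trade, error) {
	t, ok := m.rows[id]
	if !ok {
		return Trade{}, ErrNotFound
	}
	return t, nil
}

// recordTrade is business logic. The validation lives here rather than
// in a stored procedure, so it moves with the application when the
// database changes.
func recordTrade(store TradeStore, t Trade) error {
	if t.Qty <= 0 {
		return errors.New("quantity must be positive")
	}
	return store.Save(t)
}

func main() {
	var store TradeStore = newMemoryStore()
	fmt.Println(recordTrade(store, Trade{ID: "t1", Symbol: "ABC", Qty: 100}))
	fmt.Println(store.Get("t1"))
	fmt.Println(recordTrade(store, Trade{ID: "t2", Symbol: "ABC", Qty: 0}))
}
```

The interface is itself a form of lock-in, just one you control. That’s the trade being argued for here: pick the dependencies that keep your options open.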

At the application tier, another step could have been taken to remove the ties to IIS and start figuring out a way to containerize the application. One way years ago would have been to move away from .NET, but let’s say that wasn’t really an option for other reasons. Containerization could have been mimicked by shifting to a self-contained web server on Windows that would allow the .NET application to run under a singular service, and then having those services spin up the application as needed. This would decouple from IIS and enable spreading the load more quickly across a set number of machines. Eventually, when .NET Core was released, it would offer the ability to actually containerize and shift entirely off of Windows Server to a more cost efficient solution under Linux.

These are just some ideas. The solutions of course would vary and obviously provide different results. Above all there are pathways away from negative lock in and a direction toward positive lock in that enables. Realize there’s the balance, and find those that leverage lock in positively.

Nuanced Pedantic Notes:

  • Note I didn’t say all examples, but just that this combo has left more than a few companies out on a limb over the years. There are of course other technologies that have put companies (people actually) in awkward situations too. I’m just using this combo here as an example. For instance, probably some of the most notorious lock in comes from the legal ramifications of using Oracle products and being tied into their sales agreements. On the opposite end of the spectrum, Stack Overflow is a great example of how choosing .NET and scaling with it, SQL Server, and related technologies can work just fine.

The Question of Docker, The Future of OS Virtualization

In this article I’m going to take a look at Docker and OS virtualization independently of each other. There’s a reason, which will unfold as I dig through some data and provide this look into what is and isn’t happening in the virtualization space.

It’s important to also note what methods were used to attain the information provided in this article. I have obtained information through speaking with Docker employees and key executives including Ben Golub and founder Solomon Hykes over the years since the founding of Docker (and its previous incarnation dotCloud, before the pivot and name change to Docker).

Beyond communicating directly with the Docker team and gaining insight from them I have also done a number of interviews over the course of 4 days. These interviews have followed a fairly standard set of questions and conversation about the Docker technology, including but not limited to the following questions.

  • What is your current use of Docker virtualization technologies?
  • What is your future intended use of Docker technologies?
  • What is the general current configuration and setup of your development team(s) and the tooling that they use (i.e. stack: .NET, Java, Python, Node.js, etc.)?
  • Do you find it helps you to move forward faster than without?

The History of OS-Level Virtualization

First, let’s take a look at where virtualization has been, then I’ll dive into where it is now, and then I’ll take a look at where it appears to be going in the future and derive some information from the interviews and discussions that I’ve had with various teams over the last 4 days.

The Short of It

OS-level virtualization is a form of virtualization that allows the installation of software in a complete file system, just like a hypervisor based virtualization server, but with dramatically faster installation and prospectively better speed overall because it uses the host OS. This cuts down on excess redundancies within the core system and the respective virtual clients on the host.

Virtualization in concept has been around since the 1960s, with IBM being heavily involved at the Cambridge Scientific Center. Over time developments continued, but the real breakthrough in pushing virtualization into the market was VMware in 1999 with their virtual platform. This hypervisor level virtualization grew into a huge industry with the help of VMware.

However OS-level virtualization, which is what Docker is based on, didn’t take off immediately when introduced. There were many product options that came out over time around OS-level virtualization, but nothing made a huge splash in the industry similar to what Docker has. Fast forward to 2013, when Docker was released to ever increasing developer demand and usage.

Timeline of Virtualization Development

Docker really brought OS-level virtualization to the developer community at the right time in regards to demands around web development and new ways to implement effective continuous delivery of applications. Docker has been one of the most extensively used OS-level virtualization tools to implement immutable infrastructure, continuous build, integration, and deployment environments, and to use as a general virtual environment to spool up resources as needed for development.

Where we Are With Virtualization

Currently Docker holds a pretty dominant position in the OS-level virtualization market space. Let’s take a quick review of their community statistics and involvement from just a few days ago.

The Stats: Docker on Github -> https://github.com/docker/docker

Watchers: 2017
Starred: 22941
Forks: 5617

16,472 Commits
3 Branches
102 Releases
983 Contributors

Just from that data we can ascertain that the Docker Community is active. We can also take a deep look into the forks and determine pull requests, acceptance of and related data to find out that the overall codebase is healthy with involvement. This is good to know since at one point there were questions if Docker had the capability to manage the open source legions pushing the product forward while maintaining the integrity, reputation, and quality of the product.

Now let’s take a look at what that position is based on considering the interviews I’ve had in the last 4 days. Out of the 17 people I spoke with all knew what Docker is. That’s a great position to be in compared to just a few years ago.

Out of the 17 people I spoke with, 15 of the individuals are working on teams that have, are implementing or are in some state between having and implementing Docker into their respective environments.

Of the 17, only 13 said they were concerned in some significant way about Docker security. All of these individuals were working on teams attempting to figure out a way to use Docker in production, instead of only in development or related uses.

The list of uses that the 17 want to use or are using Docker for varies as much as the individual work that each is currently doing. There are however some core similarities in what they’re working on where Docker comes into play.

The most common similarity among Docker uses is simply as a platform to build out development testing environments or test servers. This is routinely a database server or simple distributed database like Cassandra or Riak that can be built immutably, then destroyed and recreated whenever it is needed again for test and development. Some of the build outs are done with Docker specifically to work up a mock distributed database environment for testing. Mind you, I’m probably hearing about and seeing this because of my past work with Basho and other distributed systems programmers, companies, and efforts around this type of technology. It’s still interesting and very telling nonetheless.

The second most common usage is for Docker to be used somewhere in the continuous delivery chain. The push to move the continuous integration and delivery process to a more immutable, repeatable, and reliable process has been a perfect marriage between Docker and these needs. The ability to spin up entire environments in a matter of seconds and destroy them on a whim, creating them again a matter of moments later, has made continuous delivery more powerful and more possible than it has ever been.

Some of the less common, yet still key, uses of Docker that came up during the interviews included: in memory cache servers, network virtualization, and distributed systems.

Virtualization’s Future

Pathing

With the history covered, the core uses of Docker discussed, let’s put those on the table with the acquisitions. The acquisitions by Docker have provided some insight into the future direction of the company. The acquisitions so far include: Kitematic, SocketPlane, Koality, and Orchard.

From a high level strategic play, the path Docker is pushing forward into is a future of continued virtualization around, as the hipsters might say, “all the things”. Their purchases of Kitematic and SocketPlane will both help Docker expand past only OS virtualization and push more toward systemic virtualization of network environments with programmatic capabilities and more. These are capabilities that are needed to move past the legacy IT environments of yesteryear, which will open up more enterprise possibilities too.

To further their core use that exists today, Docker has purchased Koality. Koality provides parallelizable continuous integration, deployment, and related services. This enables Docker to provide more built out services around this very important use.

The other acquisition was Orchard (orchardup.com). This is a startup that provides a Docker host in the cloud, instantly. This is a similar purchase to the Koality one. It bulks up capabilities that Docker had some level of already. It also pushes them forward with two branches of capabilities: SaaS based on the web and prospectively offering something behind the firewall, in which the Koality acquisition might also have some part to play.

Threat Vectors

Even though the pathways toward the future seem clear for Docker in many ways, in other ways they seem dramatically less clear. For one, there are a number of competitive options that are in play now, gaining momentum, or on the horizon. One big threat is Google, whose lack of interest in Docker has led them to build competing tooling. If they push hard into the OS level virtualization space they could become a substantial threat.

The other threat vector is the simple unknown of what could become a threat. Something like Mesos might explode in popularity, determine it doesn’t want to use Docker, and focus on another virtualization path. In the same sense, Mesos could commoditize Docker to a point that the value add at that level of virtualization doesn’t retain a business market value that would sustain Docker.

The invisible threat around this area right now is fairly large. There’s no better way to determine this than to just get into a conversation with some developers about Docker. In one sense they love what it allows them to do, but the laundry list of things they’d like would allow for a disruptor to come in and steal the Docker thunder pretty easily. To put it simply, there isn’t a magical allegiance to Docker; developers will pick what helps them move the ball forward the fastest and easiest.

Another prospective threat is a massive purchase by a legacy software company like Oracle, Microsoft, or someone else. This could effectively destabilize the OSS aspects of the product and slow down development and progress, yet it could increase corporate adoption many times over what it is now. So this possibility is something that shouldn’t be ruled out.

Summary

Docker has two major threats: a direct competitor and the prospect of being leapfrogged by another level of virtualization. The other prospective threat to part of the company is an acquisition of Docker itself, though it could also mean a huge increase in enterprise penetration. Along the path the company and technology are moving forward on, there will be continued growth in usage and capabilities. That growth will hold among the leading technology startups and companies of this kind, while the mid-size and larger corporate environments will continue to adopt and deploy at a slower pace.

A Question for You

I’ve put together what I’ve noticed, and I’d love to see things that you, dear reader, might notice about the Docker momentum machine. Do you see networking as a strength, other levels of virtualization, deployment of machines, integration or delivery, or some other part of this space as the way forward into the future? Let me know what your thoughts are on Twitter or whatever medium you feel like reaching out on. Of course, I’d also love to know if you think I’m wrong about anything I’ve written here.

Truly Excellent People and Coding Inspiration…

.NET Fringe took place this last week. It’s been a rather long time since the last conference that I actually got to really attend, meet people, and talk to people about all the different projects, aspirations, goals, and ideas about what’s next for the future. This conference was perfect to jump into. First and foremost, I knew it was an effort in being inclusive of the existing community and newcomers. We’d reached out to many brave souls to come and attend this conference about pushing technology into the future.

I met some truly excellent people. Smart, focused, intent, and a whole lot of great conversations followed meeting these people. Here are a few people you’ll want to keep an eye on based on the technology they’re working on. I got to sit down and talk to every one of these coders and they’re in top form: smart, inventive, witty, and full of great humor to boot!

Maria Naggaga @Twitter

I met Maria and one of the first things I saw was her crafty and most excellent art sketches around lifestyles, heroes, and more. I love art like this, and was really impressed with what Maria had done with hers.

Maria giving us the info.

I was able to hang out with Maria a bit more and had some good conversation time talking about evangelism, tech fun, and nonsense all around. I was also able to attend her talk “Legacy… What?”, which was excellent. The question she posed in the description is a common one: “When students think about .Net they think: legacy, enterprise, retired, and what is that?” I too find that to be a valid thought. Is .NET purely legacy these days? For many getting into the field it generally isn’t the landscape of greenfield applications and is far more commonly associated with legacy applications. Hearing her vantage point on this as an evangelist was eye opening. I gained more ideas and thoughts, and was pushed to really get that question answered for students in a different way… which I’ll add to sometime in the future in another blog entry.

Kathleen Dollard @Twitter && @Github

I spoke to Kathleen while we took a break across the street from the conference at Grendal’s Coffee Shop. We talked a lot about education and what is effective training, diving heavily into what works around video, samples, and related things. You see, we’re both authors at Pluralsight too and spend a lot of time thinking about these things. It was great to be able to sit down and really discuss these topics face to face.

We also dived into a discussion about city livability and how Portland’s transit system works, what is and isn’t working in the city and what it’s like to live here. I was, of course, more than happy to provide as much information as I could.

We also discussed her interest in taking legacy shops (i.e. pre-C# even, maybe Delphi or whatever might exist) and helping them modernize their shop. I found this interesting, as it could be a lot of fun figuring out large gaps in technology like that and helping a company to step forward into the future.

Kathleen gave two presentations at the conference – excellent presentations. One was the “Your Code, Your Brain” presentation, talking about exactly the topic of legacy shops moving forward without disruption.

If you’re interested in Kathleen’s courses, give a look here.

Amy Palamountain @Twitter && @Github

Amy had wicked great slides and samples that were probably the most flawless I’ve seen in a while. Matter of fact, a short while after the conference Amy put together a blog entry about those great slides and samples, “Super Smooth Technical Demoes“.

An intent and listening audience.

Amy’s talk at the conference was titled “Space, Time, and State“. It almost sounds like we could just turn that into an acronym. The talk was great and touched on the aspects of reactiveness and the battle of state that we developers fight every day while building solutions.

We also got to talk a little after the presentation about the horror of time zones, along with a slew of other good conversation.

Tomasz Janczuk @Twitter && @Github

AAAAAaggghhhhhh! I missed half of Tomasz’s talk! It always happens at every conference, right? You get to talking to people, excited about this topic or that topic and BOOM, you’ve missed half of a talk that you fully intended to attend. But hey, the good part is I still got to see half the talk!

If you’re not familiar with Tomasz’s work and you do anything with Node.js, you should pay close attention. Tomasz has been largely responsible for the great work behind Edge.js and influencing the effort to get Node.js running (and running damn well might I add) on Windows. For more on Edge.js check out Act I and Act II and the Github repository.

The Big Hit for Me, Distributed Systems

First some context. About 4 years ago I left the .NET Community almost entirely. Even though I was still doing a little work with C#, I primarily switched stacks to other things to push forward with Riak, distributed systems usage, devops deployment of client apps, and a whole host of other things. At the time I basically had gotten real burned out on where the .NET Community had ended up worldwide. While some pushed onward with the technologies I loved to work with, I was tired of waiting, so I dived into some esoteric stuff, learned strange programming techniques in JavaScript, Ruby, and Erlang, and dived deeper into distributed technologies for use in application construction.

However, some in the community didn’t stop moving the ball forward, and at this conference I got a great view into some of that progress! I’m stoked to see this technology and where it is now, because there is a LOT of potential for a number of things. Here are the two talks and two more great people I got to see speak. One I knew already (great to see you again and hang out Aaron!) and one I had the privilege & honor to meet (it was most excellent hanging out and seeing your presentation Lena).

Aaron Stonnard @Twitter && @Github

Aaron I’d met back when Troy & I put together the first Node PDX. Aaron had swung into Portland to present on “Building Node.js Applications on Windows Azure“. At .NET Fringe however Aaron was diving into a topic that was super exciting to me. The first line of the description from the topic really says it all “Distributed computing in .NET isn’t something you often hear about, but it’s becoming an increasingly important area for growing .NET businesses around the globe. And frankly it’s an area where .NET has lagged behind other runtimes and platforms for years – but this is changing!“. Yup, that’s my exact pain point, it’s awesome to know Aaron & Petabridge are kicking ass in this space now.

Aaron’s presentation was solid, as to be expected. We also had some good conversations after and before the presentation about the state of distributed compute and systems within the Microsoft and Windows ecosystem. To check out more about Akka .NET that Aaron & Andrew Skotzko …  follow @AkkaDotNet, @aaronontheweb, @petabridge, and @askotzko.

Akka .NET

Alena Dzenisenka @Twitter && @Github

...

…Lena traveled all the way from Kiev in Ukraine to provide the .NET Fringe crowd with some serious F# distributed and parallel compute knowledge in “Embracing the Cloud“! (Slides here)

Here’s a short dive into F# if you’re unfamiliar, which you can install on OS-X, Windows, or whatever. So don’t use the “well, I don’t use Windows” excuse to not give it a try! Here’s info about MBrace, which Lena also used in her demo. Also dive into brisk from elastacloud…

In addition to the excellent talk that Lena gave, I also got to hang out with her, Phil Haack, Ryan Riley, and others over food at Biwa on the last day of the conference. After speaking with Lena about Ukraine, computing, coding, and other topics around hacking and the OSS Community, she really inspired me to take a dive into these tools for some of the work that I’m doing now and what I’ll be doing in the near future.

All The Things

Now of course, there were a ton of other people I got to meet, people I got to catch up with I haven’t seen in ages and others I didn’t get to write about. It was a really great conference with great content. I’m looking forward to round 2 and spending more time with everybody in the future!

The whole bunch of us at the end of the conference!

Cheers everybody!   \m/

An Aside of Blog Entries on .NET Fringe

Here are some additional blog entries that others wrote about the event. In addition to these blog entries I’ll be updating this entry with any additional entries that I see pop up – so if you post one let me know, and I’ll also update these talks above that I’ve discussed with videos when they’re posted live.

__2 “Starting a Basic Loopback API & Continuous Integration”

In this article Keartida is going to dive into setting up a basic Loopback API project and get a build of that project running on a continuous integration service. In this example she’s going to get the project set up with Codeship.

Prerequisites:

  • Be sure, whichever system you are using, to have a C++ compiler installed. For Windows that usually means installing Visual Studio or something similar; on OS-X, install Xcode and the Developer Tools. On Ubuntu the GCC compiler and other options exist. For instructions on OS-X and Linux check out installing compiler tools.
  • Ubuntu
  • OS-X
  • For Windows, I’d highly suggest setting up a VM of Ubuntu to do any work with Loopback or Node.js, or to follow along with this material. It’s possible on Windows, but there are a number of things that are lacking. If you still want to make a go of using Windows, some initial setup steps are here.

Nice to Haves:

  • git-flow – works on any bash, handles the branching and merging. Very nice scripts to have.
  • bashit – Adding more information to the bash prompt (works on OS-X, not Ubuntu or Windows Bash)

__1 “Getting Started, Kanban & First Steps for a Sharing App”

This is the first (of course the precursor to this entry was the zero day team introduction article) of an ongoing series I’m going to put together. I’m going to write this series from the context of a team building a product. I’ll have code samples and more as I work along through the material.

The first step included Oi Elffaw having a discussion with the team to set up the first week’s working effort. Oi decided to call it a sprint and the rest of the team decided that was cool too. This was week one after all and there wasn’t going to be much else besides testing, research, and setup that took place.

Prerequisites

Before starting everything I went ahead and created a project repository on github for Oi to use waffle.io with. Waffle.io is an online service that works with github issues to provide a kanban style interface to the issues. This provides an easier view, especially for leads and management, to get insight into where things are and what’s on the plate for the team for the week. I included the default node.js .gitignore file and an Apache 2.0 license when I created the repository. Github then seeds the project with a .gitignore, README.md and the license files.

After setting up the repository in github I pinged Oi and he set to work after the team’s initial meet to discuss what week one would include.