Starting a New Project – Let’s Choose a Tech Stack!

It’s time to start a new project. Because one can never have enough side projects! /s

This particular project I’ll be writing about in this post is derived from the multi-tenant music collector’s database I’ve already started working on. I’ve finally gotten back to it, during a slight break in collecting and music listening, to write up some of my thinking about this particular project.

Stated Objectives For This Application

  1. Personal Reasons: I always like to have side projects that I can make use of myself. Since I’ve recently started collecting music again, and am a new collector of vinyl albums, I wanted a better way to organize all that music and the extensive history, members, songs, lyrics, and related information about the music and artists.
  2. For Everybody: Beyond wanting a well-built application that provides the capabilities I’ve described above, I also want to offer it to others. To that end, I’ll be designing this as a multi-tenant application so that you too, dear reader, can use it for your own music collection once I get it built.
  3. Choose The Tech Stack: I’ll need to write this application in something, obviously, so this post is going to cover my reasoning for the tech stack I’m going to use. The application will be built in three core pieces: the database, the services and middle tier layer, and the user interface. I’ll detail each and cover the reasoning for the stack I’ll choose for each section.

A Story of an Ivory Tower Architect Clusterfuck

Astronaut Architect or Ivory Tower Architect – Both are generally terms of derision toward software architects who have become disconnected from actual implementation, coding, and the other hands-on work of building software applications and services.

Astronaut Architecturalism or Ivory Tower Architecturalism – Topics that an architect might bring up which could be a sign they have reached the Ivory Tower or Astronaut level of architecture work with software. This ism is something to be very concerned about, as sometimes these individuals can completely derail and wreck a project with lofty – yet extremely disconnected ideas – about how or in what way the software should be built for a project.

Chief Emotive Product Technology Architect

Circa 2004 – In the city of Portland, Oregon.

The team, a reasonably sized startup team of ~12 people, sat around the table. It was an oval-shaped table, chosen specifically so that all members could be involved while seated at it, in a conference room called “King Arthur’s Round Table”. Oh, the “LOLz”. The 12 individuals had been working on a B-round funded venture that had to do with musical things. So far these 12 had self-organized and been working at a reasonable clip to get the software built and realize the ideas behind the venture.

The 12 had been gathered here by their manager so that a new member of the team could be introduced. Nobody knew who this person was or what the role of this person would be, except of course the hiring manager. Because, I suppose, that was how things were going to be done in this year of 2004 with this startup with this manager. Yes, all sorts of indicators were already being fired off for the experienced of the 12. As these indicators fired off in their minds, concerns were growing, but the experienced had kept their mouths shut so far and offered the manager the benefit of the doubt.

The manager, who I’ll call Thomasen, stood before everyone, in conflict with the ideals of Arthur’s Round Table. Thomasen mentioned those ideals, which seemed ironic since he stood there before everyone, removed from equal voice by the mere act of standing, and spoke to them. But it also seemed fitting, for he then announced that he had hired a new member for the team. This unilateral decision of Thomasen’s seemed as if done in spite of the ideals of the Round Table, in spite of his having just stated them. What kind of fresh hell was being birthed at this moment? Well, that’s where things get lit, if not outright explosive!

Ericson, one of the experienced developers, asked, “Ok, you just made a unilateral decision to hire someone, but what even is this person’s role? What are they gonna do here?” Thomasen replied, “I’ve hired Jacobson Tirelorn to join us as the Chief Emotive Product Technology Architect!” To which the metal head of the team whipped to attention in his seat and inquired, a bit forcefully as a metal head might be imagined to do, “What the Holy Hell fuck is a Chief Emotive Product Technology Architect?”

Thomasen replied, “Kirk, jeez man, you don’t have to be so aggressive and vulgar, could you tone it back about 10 notches?” Kirk, not one to take bullshit at even volume level 1, let alone at 11, responded kindly with a simple statement, “Nope.” Thomasen breathed heavily, the sigh clearly evident to all in the room. The newer folks of the 12 watched this with apprehension, while those that were familiar with Kirk’s refusal to accept bullshit smirked as he threw down the gauntlet for Thomasen. Thomasen responded finally after this long sigh, “Ok Kirk.”

Thomasen continued, “I’ve hired Jacobson Tirelorn in this role to help us build our product and really get connected on an emotional level to our users.” Kirk again sniped in, “We don’t have users yet, we’re a startup in stealth mode.” Thomasen, growing frustrated, “I know, but we will.” Kirk, “Sure. But what exactly is the emotional level of our users?” Grunting almost, “I’m getting to that point, Kirk.”

Thomasen elaborated, “Jacobson, the architect will also help us significantly figure out what technology we’ll use for the product and how all the parts should fit appropriately to make a highly scalable solution for our users!”

Lasinia raised her hand, “Can I ask a question?” to which Thomasen responded calmly, “Sure Lasinia, what’s your question?” Lasinia started off, “Per what you said, I decided to raise my hand since I’m confused about the Round Table ideals. If we all have an equal seat, why are you standing, and why do I have to raise my hand to speak?” Another of the 12 responded, “You don’t have to raise your hand…” “Yeah, but I felt like it, because we’re not really following the ideals very well if Thomasen is standing up. Thomasen, could you sit down and continue telling us about this architect you hired?”

Thomasen pulled up a chair and sat down. Muttering somewhat under his breath, but still clear to the 12, “God dammit everybody.” He sat and continued finally.

Eventually he ended. The era of the Chief Emotive Product Technology Architect was upon the team, regardless of what that meant!

The Chief Emotive Product Technology Architect Era Begins

Two weeks after the first Round Table, Jacobson arrived and called a meeting for Wednesday. It being Monday, everybody easily accepted the meeting for 8am Wednesday. On Tuesday morning Jacobson sent out a document detailing all sorts of great and lofty goals for his architect role. He was, after all, exuberantly ready to get things started! Maximum emotional user support and all!

Around mid-day Tuesday, after the document with the details had arrived that morning, Jacobson sent out another calendar invite for 9am Wednesday. Again, everybody went along with this and now had an 8am and 9am meeting scheduled for Wednesday! However, the foreshadowing got amped up another level when Jacobson, in his zest and zeal, sent out another meeting for 1pm that Wednesday. This meeting invite, however, was sent out at 4:51pm on Tuesday, which placed the meetings at 8am, 9am, and 1pm. The scheduling, and titles, went something like this.

8am – Software Release Schedule
9am – Emotional User Stories
1pm – Design Studies

This was going to be one helluva kick off of the Chief Emotive Product Technology Architect Era!

The day rolled around. Because that 1pm invite had gone out at 4:51pm, multiple people had not seen it before the 8am meeting kicked off. This is when the horror of the era truly began.

8am everybody stood at the door to the conference room, again King Arthur’s Round Table conference room. The time ticked past 8am. This seemed normal, it was after all Portland and rarely are people precisely on time on the west coast. 8:03am ticked by. Still, none of the 12 were concerned yet as Thomasen arrived to open the door. Again, this whole irony here, being the door to the Round Table conference room was locked, as if someone was going to sneak in and steal the massive multi-hundred pound table?

Thomasen and the 12 all sat down at the table. Being 2004, Thomasen was the only one with a laptop, another important piece of context here. In 2004 most people still programmed at desktop machines, thus, barely any of the 12 knew of the 1pm meeting, but were still ready to get things started and give this thing a shot!

8:31am arrived and Jacobson walks in as if not one moment tardy. He then pulls out a chair and sits down at the table and announces, “Hello everybody, it’s great to be here. I’m Jacobson Tirelorn and I’m going to help you get this software solution, your processes, and your emotional well being figured out!” Kirk, yes Kirk again, spearheads the next immediate assertion, “Confirming, you’re going to help with the software solution architecture, the processes for the project, and our emotional well being? Why would I want you to help me with my emotional well being? Is there an assumption my emotional well being is fucked up?”

Now I need to paint some context here. Jacobson had entered the room casually, as if not late, but also wearing the garb of a 1970s era hippy. Multicolored attire and jeans, which wasn’t too crazy for a startup in Portland, but was still slightly dated and odd for this era. To add to this off-date situation, Jacobson looked like he was approximately 18 or so years old.

Thomasen chimed in, “Kirk, we’ll talk about your emotional well being later, let’s go ahead and let Jacobson get into the architecture and planning first.” Kirk “alright.” shrugs.

Jacobson starts in on some wording, “Alright, what we’re building is going to need to scale, massively, at a moment’s notice and currently we’ve only got the capacity for about 20% of this capability. So we need to really amp up our systems and our architecture to handle 150% of our expected capacity! With that in mind I’ve drawn this architecture diagram to help us get there.”

He then commences to put this on display, after messing with a projector for approximately 5 minutes. What he then shows looks like a soup bowl was spilled out onto paper, with bright colors and odd shapes representing the Sun Servers, which were drawn not with the icons of the era that represented a Sun Server but with actual suns showing bright yellow. The network connections and backplane were shown with fuzzy ropes and other tangled bushes. It almost looked more like a wildlife sketch about the magic of the birds and bees than anything to do with software.

Jacobson then made a statement that immediately shaped the project, “What I’ve done here is use a little artistic license to draw up the server and network diagram that will get us to the needed scale!”

All of the 12, their eyes darting amongst each other, thought in horror and disbelief. They all imagined that an oil train had just derailed and fallen from a cliff into the ocean. As Jacobson continued, the idea spread and so did the oil spill, as if it were lit on fire to burn uncontrollably.

Jacobson continued. He talked for the remainder of the meeting and then stated, “Everybody take a bio break now and grab a snack for the next meeting, we’ll all meet back here at 9:05am to get started on the emotional user studies.”

At 9:05 everyone except Jacobson and Kirk entered the room and sat down. Nobody spoke, but just sat and waited patiently, some snacking on food and others just pondering what was going on. At 9:11 Jacobson walks back in and shows another chart that resembled Maslow’s Hierarchy. He then asks, “Where is Kirk?”

Kirk walks back in at 9:12, just a mere minute later, and Jacobson says, “Good, you’re here, a little late but we can get started now.” Kirk states, “You walked back in here a minute ago, it’s 9:12 now. You were late, I’m just following your lead.” Jacobson seems to not even acknowledge that Kirk has spoken and begins a spiel about emotional well being.

Just a few minutes into this Sarah, one of the 12 and new to the team, asks, “Which key aspects of everybody’s well being should we primarily be focused on?” Jacobson answers, “We should be focused 100% on the zen of people’s well being.”

Kirk stands up. Walks toward the door and leaves. Thomasen responds, “go ahead and continue Jacobson I’ll see what the problem is.” Jacobson continues.

Meanwhile Thomasen and Kirk have a chat. Kirk states simply that this guy is bullshit: Thomasen can keep Kirk, who will just go work on the project, or he can toss Jacobson, but Kirk is not going to work with this guy while he spouts this nonsense. Kirk, having so far been the key lead of the development teams and practically a founder, leaves Thomasen little choice but to agree to the terms. Kirk informs him he’ll give Jacobson a chance but isn’t going to listen to the emotional zen nonsense, so he’ll be in other meetings. Thomasen seems relieved, and life continues for the project.

Thomasen rejoins the meeting, and the meeting runs on and on and on and on. At 12:07 Jacobson says, “Everybody take lunch now and we’ll get back to things later! Thanks all!” To which everybody, relieved, heads off to lunch. But do note, not a single person of the 12, except for Kirk, has been back to their desks to work or read emails on their desktops. Not a soul, except Kirk, knows now that there is a 1pm meeting. Not even Thomasen.

1pm rolls around and Kirk walks into the conference room ready for design details. Jacobson enters and immediately states, “Well, I guess since you’re the lead nobody else really needs to join us.” But Kirk, knowing the anti-pattern in that, states, “Well, others should join us, as everybody who is going to be working on the system needs to know its design.” “Well, you, Kirk, can tell them later, right?” Kirk smiles wryly and, laughing, says, “Alright, this is turning into a train wreck already so yeah, let’s go with that idea.”

Jacobson informs Kirk of his ideas.

Kirk finishes the 1pm meeting with the conclusion that Jacobson is an Ivory Tower Architect of no use to the team that needs to implement the product.

Three months pass. A status meeting is called by the CEO of the company. It starts off plainly.

“So how are things going team? I’d like to get a preview look at the product before our coming release in two weeks.”

Kirk looks up and asks, “In two weeks?”

Jacobson responds, “Yeah, we’re releasing in two weeks.”

Kirk and Sarah both look toward each other and simultaneously ask, “We’re releasing what exactly?”

Jacobson starts to respond and says, “It’ll be version one of the prod…”

Thomasen cuts in, “Wait a second, just to clarify we’re releasing a beta of the version 1 of the product.”

Sarah, “So like, an alpha product?”

Jacobson, “You could say that, but it’ll be more like a beta, using the design patterns we’ve implemented and architecture I’ve designed.”

Larry, one of the 12, who is soft-spoken and rarely speaks unless specifically asked a question, asks, “What architecture and design patterns?”

Kirk says, “The ones Jacobson dreamed up, but don’t worry, it’s those that I’ve shown you but we don’t really call them design patterns. We’ve just been calling it our architecture.”

Larry, almost assured “Oh, that, ok. But, per Sarah’s question, what are we releasing exactly?”

Jacobson says, “The v1 product.”

Thomasen corrects, “The beta v1 product.”

The CEO asks, “What about the alpha product?”

Thomasen “There is no alpha product.”

Jacobson emotes “Well, we could just rename this the alpha v1 product.”

CEO stands, causing consternation among everybody, “What do you mean rename, we don’t have an alpha product? How can we not have an alpha product and we’re about to release a beta product? Why do you keep skipping the beta part and saying v1 and Thomasen keeps interjecting beta? Who the fuck is in charge of this?”

“I am, and Jacobson is building the architecture design.”

“No he’s not,” interrupts Kirk, “I’m building the design based on Jacobson’s architecture.”

As you can imagine, none of this is getting resolved quickly, so let’s fast forward to the results of this CEO scheduled meeting.

Post Trash Fire Meeting SITREP

The SITREP, or situational report, went something like this. The meeting went on for another twenty plus minutes of the CEO, Jacobson, and Thomasen being confused about alpha, beta, and v1 variations and Kirk eventually sat back and just let it happen unabated. Sarah got up about 15 minutes into the meeting and nobody noticed she left. In addition to leaving the meeting though she went to her desk, got her personal things and left never to return again.

The others among the 12 listened and eventually faded out and started ignoring the clusterfuck that unfolded before them that day. Two others besides Sarah left within the next three days. A week after that meeting, Kirk decided he was done too and resigned. Jacobson, being the root of many of these problems, was fired by Thomasen. Then the CEO, in a fit of rage merely 2 weeks after this meeting as the startup fell apart, fired Thomasen.

4 weeks after that meeting the CEO was fired in turn, even though a founder, and the board, having all the power to do this, started the process of hiring to fill the roles of those that had left. First they managed to get Kirk to come in for a conversation about what exactly went wrong.

Board inquired “So Kirk, thanks for coming to speak with us. Could you tell us about what exactly happened? We’re not really sure ourselves.”

Kirk happily, ya know happily for a metal head, smiled and simply said, “Not really, I know, but it’s not worth going into.” and then turned just a bit and opened up a laptop he’d bought. “However, here’s some design and ideas about what should be built to achieve what you want and how much it’ll cost to have me come in and get it done.” The board started looking at the architecture and then asked, “Could you elaborate on what all this architecture shows us?”

Kirk went on to detail, in depth, the technical challenges and what the design would require to meet or exceed the needs of the prospective userbase. Then the board flipped the page after they felt they understood enough. On this next page of details Kirk had laid out a SOW, or Statement of Work, to detail what he’d cost and what he would do to make this happen. The board was aghast at Kirk’s hefty 2004 hourly price tag of $100 per hour. They stated they would have to think about it.

A week later they decided, after talking to Jacobson, to hire Jacobson at 80 bucks an hour to lead the effort around the architecture that Kirk had shown them. At least, to the best of their memory.

The project kicked back off again, now many weeks behind. At this point you might know what happened next: Jacobson cratered. He left a blast radius of unhired roles, unfinished design that didn’t make sense, and a massively unfinished project. It took about 4 months before the board wised up to their poor hiring decision and, having hired this astronaut architect stuck in his lofty ivory tower, opted to fold and reposition the company, eating the entirety of the losses.

TLDR Summary

2004 was a whopper of a year, considering the economy hadn’t even improved much after the 2000 apocalyptic tech crash! It was a time when many startups had survived with just enough of their hubris and naivety intact that they were trying some pretty crazy things to get products and services delivered. In spite of that, many continued failing.

At this point I want to add an important caveat: the names are changed, but this is the recollection of events from multiple people involved in this particular startup. Wildly, this kind of thing happened more than a few times during this period when many startups went under. However, many others started to rise from the ashes during this time too.

Moral of the story, beware the astronaut or ivory tower architects!

Survey of Go Libraries for Database Work

Over the past few months I’ve picked up a number of libraries in the Go ecosystem to help me get work done around database engineering. These libraries are ones that I have used to do a range of work primarily around Apache Cassandra, DataStax Enterprise, PostgreSQL, and to a lesser degree MS SQL Server, MySQL, and others. The following is a survey of libraries that I’ve found to be pretty solid for getting the job done.

DevOps Days Vancouver - Architecture Guidance - Venomous Database Reliability Engineering (5)

I’ve broken the tooling libraries out into the following categories:

  • Observability, Monitoring, & Insight – I created this section, and added libraries to it, based on the peculiarly pedantic distinction between observability and monitoring, both of which work to provide insight into the applications one is responsible for. For additional information about observability, the Wikipedia article on the topic is a great starting point. Monitoring gets more specific, with a breakdown of monitoring types: application performance monitoring, network monitoring, system monitoring, and business transaction monitoring. The libraries in this section apply to some or all of the criteria in these definitions.
  • Data Schema Migration – Managing one’s data schema for a database. Even, really, truly, honestly, if you have a schema-less system, you still need to manage the underlying schema at some level.
  • Flow, Pipelines, Extraction, Transformation, and Loading – This section is broad in the sense that it includes many types of libraries that have a very wide range of work to do, and they offer a plethora of ways to do it: creating pipelines, flow sequences, extraction and transformation, and standard bulk loading. These libraries provide ways to get the data where you need it, when you need it there, in effective and reliable ways.
  • Database Backup Libraries – There are a zillion different concerns in maintaining effective and useful database backups: onsite storage, offsite storage, rotation periods, transmission & security control, scheduling, full or differential, and other topics. One of the most important and often overlooked aspects of database backups is actually restoring the database from backup! These libraries can be used to take those backups, automate them, and implement restoration of data in a more seamless way.
  • Database Drivers – At the core of any programmable automation of databases, one needs some way to connect to and work with the databases being automated. That’s where database drivers come into play. For Go, there’s a ton of support for just about every reasonably well-known database in existence: MS SQL, Apache Cassandra, PostgreSQL, and dozens more!

Observability, Monitoring, & Insight

Veneur – Largely used by and originating from Stripe, this library works as a distributed, fault tolerant pipeline for data emitted at run time from systems and services throughout your environment. It has server implementations of the DogStatsD protocol and SSF (Sensor Sensibility Format) for aggregating metrics and sending them on for storage, or via sinks to various other systems. It can also work up histograms, sets, and counters as a global aggregator.

TLDR;

Veneur is a convenient sink for various observability primitives with lots of outputs!
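
To make the DogStatsD piece a little more concrete, here is a minimal sketch of what a client puts on the wire. It uses only the Go standard library rather than any of Veneur’s own packages, and the formatMetric helper, the metric name, the tags, and the 127.0.0.1:8126 address are all assumptions for illustration, so point it at wherever your Veneur (or other DogStatsD-compatible) listener actually runs.

```go
package main

import (
	"fmt"
	"net"
	"strings"
)

// formatMetric renders one metric in the DogStatsD line format:
//
//	name:value|type|#tag1,tag2
//
// The tag section is left off when there are no tags.
// (Hypothetical helper for this sketch, not part of Veneur.)
func formatMetric(name string, value float64, kind string, tags []string) string {
	line := fmt.Sprintf("%s:%g|%s", name, value, kind)
	if len(tags) > 0 {
		line += "|#" + strings.Join(tags, ",")
	}
	return line
}

func main() {
	// Assumed address: change it to match your own listener.
	conn, err := net.Dial("udp", "127.0.0.1:8126")
	if err != nil {
		fmt.Println("dial failed:", err)
		return
	}
	defer conn.Close()

	// "c" marks a counter; "g" would be a gauge and "ms" a timer.
	msg := formatMetric("db.queries", 1, "c", []string{"db:postgres", "env:dev"})
	if _, err := conn.Write([]byte(msg)); err != nil {
		fmt.Println("write failed:", err)
	}
}
```

Since this is UDP, the write is fire-and-forget: nothing here confirms that a server received the metric.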

Honeycomb.io – I did some work for Honeycomb back in February of 2018 and gotta say I loved the team. Charity @mipsytipsy, Christine @cyen, Ben @maplebed and crew are tops! Friendly, wildly smart, and humble, thrown in for good measure. With that said, I’m also a fan of the product. It’s a solid high-cardinality query and event intake system for observability. There are libraries for Go as well as others, and it’s pretty easy to use the library to set up ingest for appropriately instrumented applications.

TLDR;

Honeycomb.io is a SaaS tool with available libraries for Go to provide observability insight and data collection for your applications!

OpenCensus – This framework and toolset provides ways to get telemetry out of your services. Currently there are libraries for a number of languages that allow you to capture, manipulate, and export metrics and distributed traces to your data store of choice. The key idea is that OpenCensus works via tracing through the course of events in an application, and that data is logged for awareness, insight, and thus observability of your systems.

TLDR;

OpenCensus is a library that provides ways to gather telemetry for your services and store it in your choice of a location.

RxGo – This library is an implementation of reactive extensions for Go. This one is as much a programming concept as it is a way to enhance and specifically focus on observability, so let’s take a look at the intro example they’ve got on the actual repo README.md itself.

ReactiveX, or Rx for short, is an API for programming with observable streams. This is a ReactiveX API for the Go language.

ReactiveX is a new, alternative way of asynchronous programming to callbacks, promises and deferred. It is about processing streams of events or items, with events being any occurrences or changes within the system.

In Go, it is simpler to think of an observable stream as a channel which can Subscribe to a set of handler or callback functions.

The pattern is that you Subscribe to an Observable using an Observer:

subscription := observable.Subscribe(observer)

An Observer is a type that consists of three EventHandler fields: the NextHandler, ErrHandler, and DoneHandler, respectively. These handlers can be invoked with the OnNext, OnError, and OnDone methods, respectively.

The Observer itself is also an EventHandler. This means all types mentioned can be subscribed to an Observable.

// nums collects every int the stream emits.
var nums []int

nextHandler := func(item interface{}) {
    if num, ok := item.(int); ok {
        nums = append(nums, num)
    }
}

// Only next item will be handled.
sub := observable.Subscribe(handlers.NextFunc(nextHandler))

TLDR;

RxGo provides reactive extensions for Go that make it easier to build full-spectrum observability, with significantly greater insight into your applications and the events they execute over time.

Data Schema Migration

Go-Migrate – This library is written in Go and handles data schema migrations for a significant number of databases: PostgreSQL, MySQL, SQLite, Redshift, Neo4j, CockroachDB, and that’s just a few.

Example:

migrate -source file://path/to/migrations -database postgres://localhost:5432/database up 2

TLDR;

Go-Migrate is an open source library that can be used via CLI or in code to manage all your schema migration needs.
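
For a feel of what “up 2” means in that command, here is a rough stdlib-only sketch of the idea: find the “up” files newer than the current version and take the next two in order. This is not Go-Migrate’s actual code. It assumes migration files named {version}_{title}.up.sql, and the upVersion and nextUp helpers along with the file names are hypothetical.

```go
package main

import (
	"fmt"
	"sort"
	"strconv"
	"strings"
)

// upVersion extracts the numeric version from a file named
// {version}_{title}.up.sql and reports whether the name matched.
func upVersion(name string) (uint64, bool) {
	if !strings.HasSuffix(name, ".up.sql") {
		return 0, false
	}
	parts := strings.SplitN(name, "_", 2)
	if len(parts) != 2 {
		return 0, false
	}
	v, err := strconv.ParseUint(parts[0], 10, 64)
	return v, err == nil
}

// nextUp returns at most n "up" files whose version is above current,
// in ascending version order.
func nextUp(files []string, current uint64, n int) []string {
	type mig struct {
		version uint64
		name    string
	}
	var pending []mig
	for _, f := range files {
		if v, ok := upVersion(f); ok && v > current {
			pending = append(pending, mig{v, f})
		}
	}
	sort.Slice(pending, func(i, j int) bool { return pending[i].version < pending[j].version })
	if len(pending) > n {
		pending = pending[:n]
	}
	names := make([]string, 0, len(pending))
	for _, m := range pending {
		names = append(names, m.name)
	}
	return names
}

func main() {
	files := []string{
		"3_add_index.up.sql",
		"1_create_artists.up.sql",
		"1_create_artists.down.sql",
		"2_create_albums.up.sql",
		"README.md",
	}
	// Starting from an empty database (version 0), apply the next two.
	fmt.Println(nextUp(files, 0, 2))
}
```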

Gocqlx Migrate – This library primarily provides extensions to the Go CQL driver library, and one of those extensions specifically is a data-schema migration functionality.

Example:

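package main

import (
    "context"

    "github.com/gocql/gocql"
    "github.com/scylladb/gocqlx/migrate"
)

// dir is the directory holding the .cql migration files.
const dir = "./cql"

// createSession stands in for however your application connects.
// The host and keyspace are placeholders, not defaults of the library.
func createSession() *gocql.Session {
    cluster := gocql.NewCluster("127.0.0.1")
    cluster.Keyspace = "example"
    session, err := cluster.CreateSession()
    if err != nil {
        panic(err)
    }
    return session
}

func main() {
    session := createSession()
    defer session.Close()

    // Apply any migrations found in dir that have not been run yet.
    ctx := context.Background()
    if err := migrate.Migrate(ctx, session, dir); err != nil {
        panic(err)
    }
}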

TLDR;

Gocqlx Migrate is a feature of the Gocqlx extensions library that can be used for schema migrations from within code.

Flow, Pipelines, Extraction, Transformation, and Loading

Pachyderm – (Open Source Repo) A pachyderm is

a very large mammal with thick skin, especially an elephant, rhinoceros, or hippopotamus.

So it is kind of a fitting name for this library. The library, the project itself, has found funding and bills itself as “Scalable, Reproducible Data Science“. I’ve used it minimally myself, but find it continually popping up on my “use this tool because you’ll need a ton of the features” list.

TLDR;

Pachyderm is an open source library, and paired capital funded company, that does indeed provide scalable, reproducible data science in addition to being a great library for your ETL and related data management needs.

Reflow – This library provides incremental data processing in the cloud. Providing this ability gives scientists and engineers the ability to put tools together, packaged in Docker images, using programming constructs. The library then evaluates the programs, transparently parallelizing the work and memoizing results – i.e. using goroutines and caching data appropriately to speed up tasks. The library was created at GRAIL to manage its NGS (next generation sequencing) bioinformatics workloads on AWS, but has also been used for many other applications, including model training and ad-hoc data analyses. Several of Reflow’s key features include:

  • functional, lazy, type-safe Domain Specific Language (DSL) for writing workflow programs.
  • the runtime for the DSL evaluates incrementally, coordinating cluster execution, and memoization.
  • a cluster scheduler to dynamically provision and tear down resources in the cloud (currently AWS is supported).
  • with containers the same processing workloads can also be executed locally.

TLDR;

Reflow provides a way for data scientists, and by proxy database administrators, data programmers, programmers, and anybody that needs to work through ETL or related work to write programs against that data in the cloud or locally.

Database Backup Libraries

Restic (Github) – Restic is a backup CLI and Go library that will back up to a number of destinations, including a local directory, sftp, http REST, S3, Google Cloud Storage, Azure Blob Storage, and others.

Restic follows several objectives:

  • The tool aims to be easy, with minimal singular steps to execute a backup.
  • The tool aims to be fast, using appropriate mechanisms to ensure speedy backups.
  • The tool aims to provide verifiable backups that can easily be restored.
  • The tool aims to incorporate cryptographic guarantees of confidentiality to make sure the backups are secure.
  • The tool aims to be efficient with additional snapshots only taking the storage of the actual increment and de-duplicated to save space in the storage back end.
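
The last objective, de-duplicated incremental snapshots, is easier to picture with a toy. Below is a minimal sketch of content-addressed storage in plain Go: each chunk is keyed by its SHA-256 digest, so a second snapshot only pays for chunks the store has not seen. This is not Restic’s implementation. Fixed-size chunks are a simplification I’m assuming here (Restic’s real chunking is more sophisticated), and the Store type and its methods are hypothetical.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// Store keeps each distinct chunk once, keyed by its SHA-256 digest.
type Store struct {
	blobs map[string][]byte
}

func NewStore() *Store { return &Store{blobs: make(map[string][]byte)} }

// Snapshot splits data into fixed-size chunks (chunkSize must be > 0),
// stores any chunk not seen before, and returns the ordered chunk IDs
// plus the number of bytes newly added to the store.
func (s *Store) Snapshot(data []byte, chunkSize int) (ids []string, added int) {
	for start := 0; start < len(data); start += chunkSize {
		end := start + chunkSize
		if end > len(data) {
			end = len(data)
		}
		chunk := data[start:end]
		sum := sha256.Sum256(chunk)
		id := hex.EncodeToString(sum[:])
		if _, seen := s.blobs[id]; !seen {
			s.blobs[id] = append([]byte(nil), chunk...) // copy, don't alias
			added += len(chunk)
		}
		ids = append(ids, id)
	}
	return ids, added
}

// Restore reassembles the original bytes from a snapshot's chunk IDs.
func (s *Store) Restore(ids []string) []byte {
	var out []byte
	for _, id := range ids {
		out = append(out, s.blobs[id]...)
	}
	return out
}

func main() {
	store := NewStore()

	_, first := store.Snapshot([]byte("aaaabbbbcccc"), 4)
	ids, second := store.Snapshot([]byte("aaaabbbbdddd"), 4)

	// Only the changed chunk costs storage the second time around.
	fmt.Println(first, second)              // 12 4
	fmt.Println(string(store.Restore(ids))) // aaaabbbbdddd
}
```

Restoring is part of the sketch on purpose: a backup you have never restored is only a hope.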

Database Drivers

For each of these databases there’s a single driver that I use, except in the case of Apache Cassandra and DataStax Enterprise, where I have also picked up gocqlx to add to my gocql usage.

PostgreSQL – Features:

  • SSL
  • Handles bad connections for database/sql
  • Scan time.Time correctly (i.e. timestamp[tz], time[tz], date)
  • Scan binary blobs correctly (i.e. bytea)
  • Package for hstore support
  • COPY FROM support
  • pq.ParseURL for converting urls to connection strings for sql.Open.
  • Many libpq compatible environment variables
  • Unix socket support
  • Notifications: LISTEN/NOTIFY
  • pgpass support

Gocql & Gocqlx

Gocql Features:

  • Modern Cassandra client using the native transport
  • Automatic type conversions between Cassandra and Go
    • Support for all common types including sets, lists and maps
    • Custom types can implement a Marshaler and Unmarshaler interface
    • Strict type conversions without any loss of precision
    • Built-In support for UUIDs (version 1 and 4)
  • Support for logged, unlogged and counter batches
  • Cluster management
    • Automatic reconnect on connection failures with exponential falloff
    • Round robin distribution of queries to different hosts
    • Round robin distribution of queries to different connections on a host
    • Each connection can execute up to n concurrent queries (whereby n is the limit set by the protocol version the client chooses to use)
    • Optional automatic discovery of nodes
    • Policy based connection pool with token aware and round-robin policy implementations
  • Support for password authentication
  • Iteration over paged results with configurable page size
  • Support for TLS/SSL
  • Optional frame compression (using snappy)
  • Automatic query preparation
  • Support for query tracing
  • Support for Cassandra 2.1+ binary protocol version 3
    • Support for up to 32768 streams
    • Support for tuple types
    • Support for client side timestamps by default
    • Support for UDTs via a custom marshaller or struct tags
  • Support for Cassandra 3.0+ binary protocol version 4
  • An API to access the schema metadata of a given keyspace
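
Two of the cluster management items above, round robin distribution and reconnecting with an exponential falloff, are simple enough to sketch in a few lines. The roundRobin type and backoff function below are hypothetical names and are not gocql’s code; the sketch also assumes the host list is never empty.

```go
package main

import (
	"fmt"
	"sync/atomic"
	"time"
)

// roundRobin hands out hosts in rotation and is safe for concurrent use.
// Assumption: hosts is non-empty.
type roundRobin struct {
	hosts []string
	next  uint64
}

// pick returns the next host in the rotation.
func (r *roundRobin) pick() string {
	n := atomic.AddUint64(&r.next, 1) - 1
	return r.hosts[n%uint64(len(r.hosts))]
}

// backoff returns the wait before reconnect attempt n (0-based): the base
// delay doubled once per earlier attempt, capped at max.
func backoff(attempt int, base, max time.Duration) time.Duration {
	d := base
	for i := 0; i < attempt; i++ {
		d *= 2
		if d >= max {
			return max
		}
	}
	if d > max {
		return max
	}
	return d
}

func main() {
	rr := &roundRobin{hosts: []string{"10.0.0.1", "10.0.0.2", "10.0.0.3"}}
	for i := 0; i < 4; i++ {
		fmt.Println(rr.pick())
	}
	for attempt := 0; attempt < 5; attempt++ {
		fmt.Println(backoff(attempt, 100*time.Millisecond, time.Second))
	}
}
```

With a 100ms base and a one second cap, the delays run 100ms, 200ms, 400ms, 800ms, and then stay at 1s.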

Gocqlx Features:

  • Binding query parameters from struct or map
  • Scanning results directly into struct or slice
  • CQL query builder (package qb)
  • Super simple CRUD operations based on table model (package table)
  • Database migrations (package migrate)
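
To illustrate the idea behind binding query parameters from a struct, here is a rough standard library sketch that builds a CQL INSERT statement and its ordered arguments from tagged fields. The insertStmt helper, the TodoItem type, and the use of a `db` struct tag are assumptions made for this example; it does not use or reproduce gocqlx’s API.

```go
package main

import (
	"fmt"
	"reflect"
	"strings"
)

// TodoItem is a hypothetical row type. The `db` tag naming the column is
// an assumption of this sketch.
type TodoItem struct {
	ListName string `db:"list_name"`
	Position int    `db:"position"`
	Text     string `db:"text"`
	cached   bool   // untagged and unexported, so it is skipped
}

// insertStmt builds an INSERT with one "?" placeholder per tagged field
// and returns the field values in the same order as the columns.
func insertStmt(table string, row interface{}) (string, []interface{}, error) {
	v := reflect.ValueOf(row)
	if v.Kind() == reflect.Ptr {
		v = v.Elem()
	}
	if v.Kind() != reflect.Struct {
		return "", nil, fmt.Errorf("row must be a struct or pointer to struct")
	}
	var cols []string
	var args []interface{}
	for i := 0; i < v.NumField(); i++ {
		f := v.Type().Field(i)
		col := f.Tag.Get("db")
		if col == "" || f.PkgPath != "" {
			continue // skip untagged and unexported fields
		}
		cols = append(cols, col)
		args = append(args, v.Field(i).Interface())
	}
	if len(cols) == 0 {
		return "", nil, fmt.Errorf("no tagged fields found")
	}
	marks := strings.TrimSuffix(strings.Repeat("?,", len(cols)), ",")
	stmt := fmt.Sprintf("INSERT INTO %s (%s) VALUES (%s)",
		table, strings.Join(cols, ","), marks)
	return stmt, args, nil
}

func main() {
	item := TodoItem{ListName: "Grocery List", Position: 0, Text: "coffee"}
	stmt, args, err := insertStmt("todo_item", item)
	if err != nil {
		panic(err)
	}
	fmt.Println(stmt)
	fmt.Println(args...)
}
```

The appeal of this style is that the struct becomes the single place where column names live, instead of being repeated in every hand-written query.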

Go-MSSQLDB – Features:

  • Can be used with SQL Server 2005 or newer
  • Can be used with Microsoft Azure SQL Database
  • Can be used on all Go supported platforms (e.g. Linux, Mac OS X and Windows)
  • Supports new date/time types: date, time, datetime2, datetimeoffset
  • Supports string parameters longer than 8000 characters
  • Supports encryption using SSL/TLS
  • Supports SQL Server and Windows Authentication
  • Supports Single-Sign-On on Windows
  • Supports connections to AlwaysOn Availability Group listeners, including re-direction to read-only replicas.
  • Supports query notifications

These are just a few of the libraries I use, have worked with, and suggest checking out if you’re delving into database work, and especially if you’re building systems around databases for reliability and related efforts.

If you’ve got other libraries that you’ve used, or really like, definitely leave a comment and let me know and I’ll update the post to include new libraries for Go. Subscribe to the blog too as I’ve got more posts in the cooker for database work, Go libraries and usage with databases, and a lot more. Happy thrashing code!

Creating Distributed Database Application Starter Kits

I’ve boarded a bus, and as always, when I board a bus I almost always code. Unless, of course, there are people I’m hanging out with, in which case I chit chat. But right now this is the 212 and I don’t know anybody on this chariot anyway. So into the code I go.

I’ve been re-reviewing the Docker and related collateral we offer at DataStax. In that review it seems like it would be worth having some starter kit applications along with these “default” Docker options. I’ve created this post to lay out the first language and tech stack of several starter kits I’m going to create.

Starter Kit – The Todo List Template

This first set of starter kits will be based upon a todo list application. It’s really simple, minimal in features, and offers a complete top to bottom implementation of a service, and an application on top of that service all built on Apache Cassandra. In some places, and I’ll clearly mark these places, I might add a few DataStax Enterprise features around search, analytics, or graph.

The Todo List

Features: The following details the features, from the user’s perspective, that this application will provide. Each implementation will provide all of these features.

  • A user wants to create a user account to create todo lists with.
  • A user wants to be able to store a username, full name, email, and some simple notes with their account.
  • A user wants to be able to create a todo list that is identified by a user defined name. (i.e. “Grocery List”, “Guitar List”, or “Stuff to do List”)
  • A user wants to be able to log out and return, then retrieve a list from a list of their lists.
  • A user wants to be able to delete a todo list.
  • A user wants to be able to update a todo list name.
  • A user wants to be able to add items to a todo list.
  • A user wants to be able to update items in the todo list.
  • A user wants to be able to delete items in a todo list.
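
As a way to sanity check the list and item stories above, here is a minimal in-memory sketch of the operations in Go. The TodoStore type and its method names are hypothetical, and a map stands in for Apache Cassandra; user accounts and authentication are left out entirely.

```go
package main

import (
	"errors"
	"fmt"
	"sort"
)

var ErrNotFound = errors.New("not found")

// key identifies one todo list: the owning username plus the list name.
type key struct{ user, list string }

// TodoStore is a hypothetical in-memory stand-in for the data tier.
type TodoStore struct{ lists map[key][]string }

func NewTodoStore() *TodoStore { return &TodoStore{lists: map[key][]string{}} }

// CreateList adds an empty list unless the user already has one by that name.
func (s *TodoStore) CreateList(user, list string) error {
	k := key{user, list}
	if _, ok := s.lists[k]; ok {
		return errors.New("list already exists")
	}
	s.lists[k] = []string{}
	return nil
}

// ListNames returns the names of a user's lists in sorted order.
func (s *TodoStore) ListNames(user string) []string {
	var names []string
	for k := range s.lists {
		if k.user == user {
			names = append(names, k.list)
		}
	}
	sort.Strings(names)
	return names
}

// RenameList moves a list's items under a new name.
func (s *TodoStore) RenameList(user, oldName, newName string) error {
	from, to := key{user, oldName}, key{user, newName}
	items, ok := s.lists[from]
	if !ok {
		return ErrNotFound
	}
	if _, taken := s.lists[to]; taken {
		return errors.New("list already exists")
	}
	s.lists[to] = items
	delete(s.lists, from)
	return nil
}

// DeleteList removes a list; deleting a missing list is a no-op.
func (s *TodoStore) DeleteList(user, list string) { delete(s.lists, key{user, list}) }

// AddItem appends an item to an existing list.
func (s *TodoStore) AddItem(user, list, item string) error {
	k := key{user, list}
	if _, ok := s.lists[k]; !ok {
		return ErrNotFound
	}
	s.lists[k] = append(s.lists[k], item)
	return nil
}

// UpdateItem replaces the item at index i.
func (s *TodoStore) UpdateItem(user, list string, i int, item string) error {
	items := s.lists[key{user, list}]
	if i < 0 || i >= len(items) {
		return ErrNotFound
	}
	items[i] = item
	return nil
}

// DeleteItem removes the item at index i.
func (s *TodoStore) DeleteItem(user, list string, i int) error {
	k := key{user, list}
	items := s.lists[k]
	if i < 0 || i >= len(items) {
		return ErrNotFound
	}
	s.lists[k] = append(items[:i], items[i+1:]...)
	return nil
}

func main() {
	s := NewTodoStore()
	_ = s.CreateList("demo", "Grocery List")
	_ = s.AddItem("demo", "Grocery List", "coffee")
	_ = s.RenameList("demo", "Grocery List", "Guitar List")
	fmt.Println(s.ListNames("demo"))
}
```

Keying every list by username first is a deliberate choice in this sketch, since it mirrors how the data would likely be partitioned per user in a distributed database.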

Architecture: The following is the architecture of the todo list starter kit application.

  • Database: Apache Cassandra.
  • Service: A small service to manage the data tier of the application.
  • User Interface: A web interface using React/Vuejs ??

As you can see, some of the items are incomplete, but I’ll decide on them soon. My next review is to check out what I really want to use for the user interface, and also to get a user account system figured out. I don’t really want to create the entire user interface, but instead would like to use something like Auth0 or Okta.

May I Ask?

There are numerous things I’d love help with. Are there any user stories you think are missing? Should I add something? What would make these helpful to you? Leave a comment, or tweet at me @Adron. I’d be happy to get some feedback and others’ thoughts on the matter so that I can ensure that these are simple, to the point, usable, and helpful to people. Cheers!

Let’s Really Discuss Lock In

For too long lock-in has been referred to with an almost entirely negative connotation, even though it can occur in both positive and negative situations. The fact is that there’s a much more nuanced and balanced range of benefits and disadvantages to lock-in. Often it may simply be referred to as this or that dependency, but either way a dependency is often just another form of lock-in. Weighing those and finding the right balance for your projects can actually lead to lock-in being a positive game changer, or something that simply provides a basis on which to work and operate. Sometimes lock-in will actually provide a way to remove lock-in by opening up more choices elsewhere, which in turn may bring another variant of lock-in.

Concrete Lock-in Examples

The JavaScript Lock-In

Take the language we choose to build an application in. JavaScript is a great example. It has become the singular language of the web, at least on the client side. This was, long ago, a form of lock-in that browser makers (and standards bodies) chose, and it dictated how and in which direction the web, or at least web pages, would progress.

JavaScript has now become a prominent language on the server side too, thanks to Node.js. It has even moved in as a first class language in serverless technology like AWS’s Lambda. JavaScript is a perfect example of a language that was initially a source of specific lock-in, being required for the client, and that eventually expanded to allow programming in a number of other environments. That reduced JavaScript’s lock-in, but displaced it through abstractions to other spaces such as the server side and serverless functions.

The .NET Windows SQL Server Lock In

JavaScript is merely one example, and a relatively positive one that expands one’s options more than it limits one’s efforts. But let’s say the decision is made to build a high speed trading platform and to choose SQL Server, .NET C#, and Windows Server. Immediately this is a technology combination that has notoriously illuminated in the past * how lock-in can be extremely dangerous.

Say this application was built out with this set of technology platforms. It used stored procedures in SQL Server, locking the application into that specific database. It used proprietary Windows-specific libraries in .NET with the C# code, and on Windows it used IIS-specific advances to make the application faster. When it was first built it seemed plenty fast and scaled just right according to the demand at the time.

Fast forward to today. The application now has a database that was sharded when it hit a mere 8 terabytes, loaded on two super pumped up (at least for today) servers that have many cores, many CPUs, GPUs, and all that jazz. They came in around $240k each! The application is tightly coupled to a middle tier, which is in turn sort of tightly coupled to those famous stored procedures, and the application of course has a turbo capability per those IIS servers.

But today it’s slow. Looking at benchmarks and query times, the database is having a hard time dealing with things as is, and the application has outages on a routine basis for a whole variety of reasons. Sometimes tracing and debugging solves the problems quickly; other times the servers just oversubscribe resources and sit thrashing.

Where does this application go? How does one resolve the database loading issues? They’ve already sunk a half million on servers, they’re pegged out already, horizontally scaling isn’t an option, and they’re tightly coupled to Windows Servers running IIS, which removes the possibility of effectively scaling out the application servers via container technologies, among other issues. Without recourse, this is the type of lock-in that will kill the company unless something is changed in a massive way very soon.

To add, this is the description of an actual company that is now defunct. I phrased it as existing today only to make the point. The hard reality is the company went under, almost entirely because of the costs of maintaining an unsustainable architecture that caused an exorbitant lock-in to very specific tools, largely because the company drank the Kool-Aid to use the tools as suggested. They developed the product into a corner. That mistake was so expensive that it decimated the finances of the company. Not a good scenario, not a happy outcome, and something to be avoided in every way! This is truly the epitome of negative lock-in.

Of course there’s this distinctive lock-in we have to steer clear of, but there’s also the lock-in associated with languages and other technology capabilities that will help your company move forward faster, easier, and with increasing capabilities. Those are the choices, the ties to technology and capabilities, that decision makers can really leverage with fewer negative consequences.

The “Lock In” That Enables

One common statement is, “the right tool for the job”. This is of course for the ideal world where ideal decisions can be made all the time. That world doesn’t exist, and we have to strive for balance between decisions that will wreck the ship and decisions that will give us clear waters ahead.

For databases we need to choose the right databases for where we want to go versus where we are today. Not to gold plate the solution, but to have intent and a clear focus on what we want our future technology to hold for us. If we intend to expand our data and want to maintain the ability to effectively query – let’s take the massive SQL Server for example – what could we have done to prevent it from becoming a debilitating decision?

A solution that could have effectively come into play would have been not to shard the relational database, but instead to either export or split the data in a more horizontal way and put it into a distributed database store. Start building the application so that this system could be used instead of being limited by the relational database. As the queries are built out and the tight coupling to SQL Server is removed, the new distributed database could easily add nodes to compensate for the ever growing size of the data stored. The options are numerous, and all of them are a form of lock-in, but not the kind that eventually killed this company, which had limited and detrimentally locked itself into use of a relational database.
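
One way to picture removing that tight coupling is to have the application depend on a small storage interface rather than on a specific database. The sketch below is in Go, not the .NET stack from the story, and the Trade type, TradeStore interface, and memoryStore backend are all hypothetical names. It is a minimal illustration of the idea under those assumptions, not a description of what that company built.

```go
package main

import "fmt"

// Trade is a hypothetical record for the trading platform example.
type Trade struct {
	ID       string
	Symbol   string
	Quantity int
}

// TradeStore is the only storage contract the application code sees.
type TradeStore interface {
	Save(t Trade) error
	BySymbol(symbol string) ([]Trade, error)
}

// memoryStore is a stand-in backend. A relational implementation and a
// distributed database implementation would satisfy the same interface.
type memoryStore struct {
	bySymbol map[string][]Trade
}

func newMemoryStore() *memoryStore {
	return &memoryStore{bySymbol: make(map[string][]Trade)}
}

func (m *memoryStore) Save(t Trade) error {
	m.bySymbol[t.Symbol] = append(m.bySymbol[t.Symbol], t)
	return nil
}

func (m *memoryStore) BySymbol(symbol string) ([]Trade, error) {
	return m.bySymbol[symbol], nil
}

// totalQuantity is application logic written against the interface, so
// swapping the backend does not require touching it.
func totalQuantity(store TradeStore, symbol string) (int, error) {
	trades, err := store.BySymbol(symbol)
	if err != nil {
		return 0, err
	}
	total := 0
	for _, t := range trades {
		total += t.Quantity
	}
	return total, nil
}

func main() {
	var store TradeStore = newMemoryStore()
	_ = store.Save(Trade{ID: "1", Symbol: "ACME", Quantity: 100})
	_ = store.Save(Trade{ID: "2", Symbol: "ACME", Quantity: 50})
	total, _ := totalQuantity(store, "ACME")
	fmt.Println(total) // prints 150
}
```

The interface is itself a form of lock-in, to its own shape, but it is the kind that keeps the choice of database open.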

At the application tier, another solution could have been to remove the ties to IIS and start figuring out a way to containerize the application. One way years ago would have been to move away from .NET, but let’s say that wasn’t really an option for other reasons. Containerization could have been mimicked by shifting to a self-contained web server on Windows that would allow the .NET application to run under a singular service, and then having those services spin off the application as needed. This would decouple from IIS and enable spreading the load more quickly across a set number of machines. Eventually, when .NET Core was released, it would offer the ability to actually containerize and shift entirely off of Windows Server to a more cost efficient solution under Linux.

These are just some ideas. The solutions of course would vary and obviously provide different results. Above all there are pathways away from negative lock in and a direction toward positive lock in that enables. Realize there’s the balance, and find those that leverage lock in positively.

Nuanced Pedantic Notes:

  • Note I didn’t say all examples, but just that this combo has left more than a few companies out on a limb over the years. There are of course other technologies that have put companies (people actually) in awkward situations too. I’m just using this combo here as an example. For instance, probably some of the most notorious lock in comes from the legal ramifications of using Oracle products and being tied into their sales agreements. On the opposite end of the spectrum, Stack Overflow is a great example of how choosing .NET and scaling with it, SQL Server, and related technologies can work just fine.