Survey of Go Libraries for Database Work

Over the past few months I’ve picked up a number of libraries in the Go ecosystem to help me get work done around database engineering. These libraries are ones that I have used to do a range of work primarily around Apache Cassandra, DataStax Enterprise, PostgreSQL, and to a lesser degree MS SQL Server, MySQL, and others. The following is a survey of libraries that I’ve found to be pretty solid for getting the job done.


I’ve broken the following tooling libraries out into these categories:

  • Observability, Monitoring, & Insight – I created this section, and added libraries to it, based on the particular nature of observability as distinct from monitoring: both work together to provide insight into the applications one is responsible for. For additional information about observability, the Wikipedia article on the topic is a great starting point. Monitoring, however, gets more specific, with a breakdown into monitoring types: application performance monitoring, network monitoring, system monitoring, and business transaction monitoring. The libraries in this section apply to some or all of these criteria.
  • Data Schema Migration – Managing one’s data schema for a database. Even if you really, truly, honestly have a schema-less system, you still need to manage the underlying schema at some level.
  • Flow, Pipelines, Extraction, Transformation, and Loading – This is a catch-all section covering a wide range of libraries with a very wide range of work to do, and they offer a plethora of ways to do it: creating pipelines, flow sequences, extraction and transformation, and standard bulk loading. These libraries provide ways to get the data where you need it, when you need it there, in effective and reliable ways.
  • Database Backup Libraries – There are a zillion different things that go into maintaining effective and useful database backups: onsite storage, offsite storage, rotation periods, transmission & security control, scheduling, full or differential backups, and other concerns. One of the most important and often overlooked aspects of database backups is actually restoring the database from backup! These libraries can be used to take those backups, automate them, and implement restoration of data in a more seamless way.
  • Database Drivers – At the core of any programmable automation of databases, one needs some way to connect to and work with the databases being automated; that’s where database drivers come into play. For Go, there’s support for nearly every well-known database in existence: MS SQL, Apache Cassandra, PostgreSQL, and dozens more!

Observability, Monitoring, & Insight

Veneur – Largely used by, and originating from, Stripe. This library works as a distributed, fault-tolerant pipeline for runtime data emitted from systems and services throughout your environment. It has server implementations of the DogStatsD protocol and SSF (Sensor Sensibility Format) for aggregating metrics and sending them for storage or, via sinks, to various other systems. The system can also work up histograms, sets, and counters as a global aggregator.
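
Since Veneur speaks the DogStatsD protocol, any DogStatsD-compatible client can emit metrics to it. Here’s a minimal sketch using the DataDog statsd client for Go; the listen address and port, the metric names, and the tags are placeholders you’d match to your own veneur.yaml and services.

package main

import (
    "log"
    "time"

    "github.com/DataDog/datadog-go/statsd"
)

func main() {
    // The address is a placeholder; point it at wherever your Veneur
    // instance listens for DogStatsD traffic.
    client, err := statsd.New("127.0.0.1:8126")
    if err != nil {
        log.Fatal(err)
    }
    defer client.Close()

    // Emit a counter, a gauge, and a timing; Veneur aggregates these and
    // forwards them along to its configured sinks.
    client.Incr("app.requests", []string{"service:example"}, 1)
    client.Gauge("app.queue_depth", 12, []string{"service:example"}, 1)
    client.Timing("app.request_duration", 153*time.Millisecond, []string{"service:example"}, 1)
}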

TLDR;

Veneur is a convenient sink for various observability primitives with lots of outputs!

Honeycomb.io – I did some work for Honeycomb back in February of 2018 and gotta say I loved the team. Charity @mipsytipsy, Christine @cyen, Ben @maplebed and crew are tops! Friendly, wildly smart, and humble thrown in for good measure. With that said, I’m also a fan of the product. It’s a solid high-cardinality query and event intake system for observability. There are libraries for Go as well as other languages, and it’s pretty easy to use the library to set up ingest for appropriately instrumented applications.
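
To give a sense of what that instrumentation looks like, here’s a minimal sketch using the libhoney-go library; the write key, dataset name, and event fields are placeholders rather than anything prescriptive.

package main

import (
    "log"

    libhoney "github.com/honeycombio/libhoney-go"
)

func main() {
    // WriteKey and Dataset are placeholders; use your own Honeycomb team
    // key and dataset name.
    if err := libhoney.Init(libhoney.Config{
        WriteKey: "YOUR_WRITE_KEY",
        Dataset:  "example-dataset",
    }); err != nil {
        log.Fatal(err)
    }
    defer libhoney.Close()

    // An event is just a bag of fields describing a unit of work; send it
    // off and query it later in Honeycomb.
    ev := libhoney.NewEvent()
    ev.AddField("endpoint", "/api/v1/things")
    ev.AddField("duration_ms", 153.12)
    ev.AddField("db.rows_returned", 42)
    if err := ev.Send(); err != nil {
        log.Println("failed to enqueue event:", err)
    }
}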

TLDR;

Honeycomb.io is a SaaS tool with available libraries for Go to provide observability insight and data collection for your applications!

OpenCensus – This framework and toolset provides ways to get telemetry out of your services. Currently there are libraries for a number of languages that allow you to capture, manipulate, and export metrics and distributed traces to the data store of your choice. The key idea is that OpenCensus works by tracing through the course of events in an application, and that data is logged for awareness, insight, and thus observability of your systems.
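
A rough sketch of that tracing flow in Go, leaving out exporter registration (which you’d wire up to Zipkin, Jaeger, Stackdriver, or similar); the span names and the stand-in work are placeholders.

package main

import (
    "context"
    "time"

    "go.opencensus.io/trace"
)

func main() {
    // An exporter would normally be registered here so the spans have
    // somewhere to go; that wiring is omitted in this sketch.
    ctx, span := trace.StartSpan(context.Background(), "example/handle-request")
    defer span.End()

    queryDatabase(ctx)
}

func queryDatabase(ctx context.Context) {
    // Child spans pick up the parent from the context, building out the
    // trace of events described above.
    _, span := trace.StartSpan(ctx, "example/handle-request/query")
    defer span.End()

    time.Sleep(25 * time.Millisecond) // stand-in for a real database call
}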

TLDR;

OpenCensus is a library that provides ways to gather telemetry for your services and store it in your choice of a location.

RxGo – This library provides reactive extensions built for Go. This one is as much a programming concept as it is a way to enhance and specifically focus on observability, so let’s take a look at the intro example from the repo’s README.md itself.

ReactiveX, or Rx for short, is an API for programming with observable streams. This is a ReactiveX API for the Go language.

ReactiveX is a new, alternative way of asynchronous programming to callbacks, promises and deferred. It is about processing streams of events or items, with events being any occurrences or changes within the system.

In Go, it is simpler to think of an observable stream as a channel which can Subscribe to a set of handler or callback functions.

The pattern is that you Subscribe to an Observable using an Observer:

subscription := observable.Subscribe(observer)

An Observer is a type that consists of three EventHandler fields: NextHandler, ErrHandler, and DoneHandler. These handlers can be invoked with the OnNext, OnError, and OnDone methods, respectively.

The Observer itself is also an EventHandler. This means all types mentioned can be subscribed to an Observable.

// Collect any integer items emitted by the observable.
var nums []int
nextHandler := func(item interface{}) {
    if num, ok := item.(int); ok {
        nums = append(nums, num)
    }
}

// Only next items will be handled.
sub := observable.Subscribe(handlers.NextFunc(nextHandler))

TLDR;

RxGo provides the reactive extensions that make it easier to build full-spectrum observability, with significantly greater insight into your applications over time and the events they execute.

Data Schema Migration

Go-Migrate – This library is written in Go and handles data schema migrations for a significant number of databases: PostgreSQL, MySQL, SQLite, Redshift, Neo4j, CockroachDB, and that’s just a few.

Example:

migrate -source file://path/to/migrations -database postgres://localhost:5432/database up 2
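
The same migrations can also be driven from code. Here’s a sketch along those lines, assuming the v4 module path with the file source and postgres database drivers; the migration path and connection string are placeholders.

package main

import (
    "log"

    "github.com/golang-migrate/migrate/v4"
    // Blank imports register the source and database drivers.
    _ "github.com/golang-migrate/migrate/v4/database/postgres"
    _ "github.com/golang-migrate/migrate/v4/source/file"
)

func main() {
    // The migration path and connection string are placeholders.
    m, err := migrate.New(
        "file://path/to/migrations",
        "postgres://localhost:5432/database?sslmode=disable")
    if err != nil {
        log.Fatal(err)
    }

    // Apply the next two migrations, mirroring the `up 2` CLI example above.
    if err := m.Steps(2); err != nil {
        log.Fatal(err)
    }
}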

TLDR;

Go-Migrate is an open source library that can be used via CLI or in code to manage all your schema migration needs.

Gocqlx Migrate – This library primarily provides extensions to the gocql driver library, and one of those extensions is data schema migration functionality.

Example:

package main

import (
    "context"

    "github.com/scylladb/gocqlx/migrate"
)

// Directory containing the *.cql migration files.
const dir = "./cql"

func main() {
    // CreateSession is a helper (defined elsewhere) that builds and returns
    // a gocql session for the target cluster.
    session := CreateSession()
    defer session.Close()

    ctx := context.Background()
    if err := migrate.Migrate(ctx, session, dir); err != nil {
        panic(err)
    }
}

TLDR;

Gocqlx Migrate is a feature of the Gocqlx extensions library that can be used for schema migrations from within code.

Flow, Pipelines, Extraction, Transformation, and Loading

Pachyderm – (Open Source Repo) A pachyderm is

a very large mammal with thick skin, especially an elephant, rhinoceros, or hippopotamus.

So it is kind of a fitting name for this library. The library, and the project itself, has found funding and bills itself as “Scalable, Reproducible Data Science”. I’ve used it minimally myself, but it keeps popping up on my “use this tool because you’ll need a ton of the features” list.

TLDR;

Pachyderm is an open source library, paired with a capital-funded company, that does indeed provide scalable, reproducible data science, in addition to being a great library for your ETL and related data management needs.

Reflow – This library provides incremental data processing in the cloud. Providing this ability gives scientists and engineers the ability to put tools together, packaged in Docker images, using programming constructs. The library then evaluates the programs, transparently parallelizing the work and memoizing results – i.e. using goroutines and caching data appropriately to speed up tasks. The library was created at GRAIL to manage their NGS (next generation sequencing) bioinformatics workloads on AWS, but has also been used for many other applications, including model training and ad-hoc data analyses. Several of Reflow’s key features include:

  • functional, lazy, type-safe Domain Specific Language (DSL) for writing workflow programs.
  • the runtime for the DSL evaluates incrementally, coordinating cluster execution, and memoization.
  • a cluster scheduler to dynamically provision and tear down resources in the cloud (currently AWS is supported).
  • with containers the same processing workloads can also be executed locally.

TLDR;

Reflow provides a way for data scientists, and by proxy database administrators, data programmers, and anybody who needs to work through ETL or related work, to write programs against that data in the cloud or locally.

Database Backup Libraries

Restic (Github) – Restic is a backup CLI and Go library that will back up to a number of backends, including: local directories, sftp, HTTP REST servers, S3, Google Cloud Storage, Azure Blob Storage, and others.

Restic follows several objectives:

  • The tool aims to be easy, with minimal singular steps to execute a backup.
  • The tool aims to be fast, using appropriate mechanisms to ensure speedy backups.
  • The tool aims to provide verifiable backups that can easily be restored.
  • The tool aims to incorporate cryptographic guarantees of confidentiality to make sure the backups are secure.
  • The tool aims to be efficient, with additional snapshots taking only the storage of the actual increment, de-duplicated to save space in the storage back end.

Database Drivers

For each of the databases I work with there’s a particular driver that I use. The exception is Apache Cassandra and DataStax Enterprise, where I have also picked up gocqlx to add to my gocql usage.

PostgreSQL – pq (github.com/lib/pq) Features:

  • SSL
  • Handles bad connections for database/sql
  • Scan time.Time correctly (i.e. timestamp[tz], time[tz], date)
  • Scan binary blobs correctly (i.e. bytea)
  • Package for hstore support
  • COPY FROM support
  • pq.ParseURL for converting urls to connection strings for sql.Open.
  • Many libpq compatible environment variables
  • Unix socket support
  • Notifications: LISTEN/NOTIFY
  • pgpass support
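
Those features come by way of the pq driver plugged into the standard database/sql package. A minimal connect-and-query sketch, with placeholder connection details:

package main

import (
    "database/sql"
    "log"

    _ "github.com/lib/pq" // registers the "postgres" driver with database/sql
)

func main() {
    // Connection details are placeholders; pq also honors the usual libpq
    // environment variables (PGHOST, PGUSER, and so on).
    db, err := sql.Open("postgres", "postgres://user:password@localhost:5432/example?sslmode=disable")
    if err != nil {
        log.Fatal(err)
    }
    defer db.Close()

    var version string
    if err := db.QueryRow("SELECT version()").Scan(&version); err != nil {
        log.Fatal(err)
    }
    log.Println(version)
}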

Gocql & Gocqlx

Gocql Features:

  • Modern Cassandra client using the native transport
  • Automatic type conversions between Cassandra and Go
    • Support for all common types including sets, lists and maps
    • Custom types can implement a Marshaler and Unmarshaler interface
    • Strict type conversions without any loss of precision
    • Built-In support for UUIDs (version 1 and 4)
  • Support for logged, unlogged and counter batches
  • Cluster management
    • Automatic reconnect on connection failures with exponential falloff
    • Round robin distribution of queries to different hosts
    • Round robin distribution of queries to different connections on a host
    • Each connection can execute up to n concurrent queries (whereby n is the limit set by the protocol version the client chooses to use)
    • Optional automatic discovery of nodes
    • Policy based connection pool with token aware and round-robin policy implementations
  • Support for password authentication
  • Iteration over paged results with configurable page size
  • Support for TLS/SSL
  • Optional frame compression (using snappy)
  • Automatic query preparation
  • Support for query tracing
  • Support for Cassandra 2.1+ binary protocol version 3
    • Support for up to 32768 streams
    • Support for tuple types
    • Support for client side timestamps by default
    • Support for UDTs via a custom marshaller or struct tags
  • Support for Cassandra 3.0+ binary protocol version 4
  • An API to access the schema metadata of a given keyspace
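
To ground that feature list, here’s a minimal gocql sketch that connects to a cluster and reads a value back; the contact point and keyspace are placeholders for your own cluster.

package main

import (
    "log"

    "github.com/gocql/gocql"
)

func main() {
    // Contact point and keyspace are placeholders.
    cluster := gocql.NewCluster("127.0.0.1")
    cluster.Keyspace = "example"
    cluster.Consistency = gocql.Quorum

    session, err := cluster.CreateSession()
    if err != nil {
        log.Fatal(err)
    }
    defer session.Close()

    var releaseVersion string
    if err := session.Query(`SELECT release_version FROM system.local`).Scan(&releaseVersion); err != nil {
        log.Fatal(err)
    }
    log.Println("Cassandra release version:", releaseVersion)
}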

Gocqlx Features:

  • Binding query parameters from struct or map (see the sketch after this list)
  • Scanning results directly into struct or slice
  • CQL query builder (package qb)
  • Super simple CRUD operations based on table model (package table)
  • Database migrations (package migrate)
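
As referenced in the binding bullet above, here’s a rough sketch of the query builder plus struct binding, assuming a hypothetical example.users table with id and name columns and a gocql session created as in the previous example; package paths follow the github.com/scylladb/gocqlx v1 layout.

package example

import (
    "github.com/gocql/gocql"
    "github.com/scylladb/gocqlx"
    "github.com/scylladb/gocqlx/qb"
)

// User is a hypothetical model for the example.users table.
type User struct {
    ID   gocql.UUID
    Name string
}

// InsertUser builds an INSERT with the qb package and binds the struct
// fields to the named query parameters.
func InsertUser(session *gocql.Session, u User) error {
    stmt, names := qb.Insert("example.users").
        Columns("id", "name").
        ToCql()

    return gocqlx.Query(session.Query(stmt), names).BindStruct(u).ExecRelease()
}

// GetUser selects a single row back into the struct.
func GetUser(session *gocql.Session, id gocql.UUID) (User, error) {
    stmt, names := qb.Select("example.users").
        Where(qb.Eq("id")).
        ToCql()

    var u User
    q := gocqlx.Query(session.Query(stmt), names).BindMap(map[string]interface{}{"id": id})
    if err := q.GetRelease(&u); err != nil {
        return User{}, err
    }
    return u, nil
}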

Go-MSSQLDB – Features:

  • Can be used with SQL Server 2005 or newer
  • Can be used with Microsoft Azure SQL Database
  • Can be used on all Go-supported platforms (e.g. Linux, Mac OS X and Windows)
  • Supports new date/time types: date, time, datetime2, datetimeoffset
  • Supports string parameters longer than 8000 characters
  • Supports encryption using SSL/TLS
  • Supports SQL Server and Windows Authentication
  • Supports Single-Sign-On on Windows
  • Supports connections to AlwaysOn Availability Group listeners, including re-direction to read-only replicas.
  • Supports query notifications

So these are just a few of the libraries I use, have worked with, and suggest checking out if you’re delving into database work, and especially if you’re building systems around databases for reliability and related efforts.

If you’ve got other libraries that you’ve used or really like, definitely leave a comment and let me know, and I’ll update the post to include new libraries for Go. Subscribe to the blog too, as I’ve got more posts in the cooker for database work, Go libraries and their usage with databases, and a lot more. Happy thrashing code!

Distributed Database Things to Know: Gossip

Some of the names used can seem to conflate the actual purpose of a feature’s functionality in distributed databases. However, gossip is pretty spot on. Within a group of people gossiping, the purpose is to find out each other’s business: what’s going on with Frank, who’s he seeing, and Sally started a business, say what! In the end, all the gossipers are in on the business and understand what Frank, Sally, and the whole crew are up to. This is a good analogy for what gossip does in a distributed database, or in distributed systems in general.

The way gossip works between nodes is on a peer-to-peer basis. It’s a communication protocol with the purpose of minding the other nodes’ business so the singular node gossiping can go about its own business. The process runs every second and exchanges state messages between the nodes, which can then update their respective state and keep all nodes informed.

To prevent over-communication and mixed messages, the gossip list for all nodes in the cluster is derived from seed nodes. When a node boots up it initiates its gossip from a seed node, of which we usually have a few, and then continues with that gossip list. Note that seed nodes aren’t a single point of failure, as other nodes in the cluster will take their place if need be; they’re just designated as the lead to initiate a gossip list from.

In Apache Cassandra it is also important to designate a seed node per replication group (i.e. datacenter) in the seed list. This is recommended for fault tolerance; otherwise gossip has to communicate across higher latency to hit each datacenter, which can eat at the response time and performance of the gossip. Think of sending a snail mail USPS letter to a friend to get gossip news! That would take months just to find out what’s going on; it’s kind of the same for computer nodes going across datacenters to talk to the seed node.

Distributed Database Things to Know: Snitches

Snitches. What a great name for a feature right? I’d bring up the Harry Potter thing, but I’m gonna let that one fly. (get it, it flies!)

A snitch determines which datacenters and racks nodes belong to. These are the Cassandra-specific racks and datacenters, however, so check out my previous post on datacenters and racks for more detail on what they are in relation to Cassandra and DataStax Enterprise (DSE). Snitches tell the database about the network topology of the system so that requests can be routed efficiently, and they enable Cassandra and DSE to distribute replicas by grouping machines accordingly. All nodes within a cluster must use the same snitch in the logic of distribution among the system.

Snitch Options

The following are the options we have for how a snitch determines node placement.

  • DseSimpleSnitch – This is the default snitch and is intended only for development deployments. It doesn’t recognize datacenter or rack information, and simply needs a keyspace defined to use SimpleStrategy with a replication factor set. Its use makes it a bit easier to set up a cluster for development.
  • GossipingPropertyFileSnitch – This snitch is usable for production. Rack and datacenter information for the local node is defined in the cassandra-rackdc.properties file (see the sketch after this list), which is then propagated to other nodes via gossip.
  • Ec2Snitch – This is a great snitch for simple cluster deployments that reside in a single region. For this snitch, the region name is used as the datacenter name and availability zones are set up as racks. That gives us a setup that matches datacenters and racks to regions and zones, making it pretty easy to remember which is where. Since this mapping follows the way EC2 itself works, this snitch isn’t usable for multi-region clusters.
  • Ec2MultiRegionSnitch – This snitch can be used for multi-region deployments. To use it, settings need to be made in both the cassandra.yaml file and the cassandra-rackdc.properties file. The snitch works by using the public IP designated in broadcast_address to allow the multi-region connection.
  • GoogleCloudSnitch – This snitch, as is somewhat obvious by the name, is for DSE deployments on Google Cloud Platform (GCP). This snitch uses datacenters and racks mapped similarly to the Ec2Snitch, with datacenters mapped to regions and racks mapped to zones.
  • CloudstackSnitch – This snitch is for Apache Cloudstack. Zone naming is free-form in Cloudstack so this snitch uses <country> <location> <az> notation.
  • PropertyFileSnitch – This snitch works by proximity, determined by rack and datacenter. It uses network details configured in the cassandra-topology.properties file, with the datacenter names defined using standard convention. These need to correlate to the names of the actual datacenters in the keyspace definition. The nodes in the cluster are then described in the cassandra-topology.properties file, which must be exactly the same on every node in the cluster.
  • RackInferringSnitch – This snitch is kind of funny, because it’s a usable snitch, but it’s also an example snitch. It determines the proximity of nodes by datacenter and rack too. However, it assumes these correspond to the second and third octets of the node’s IP address, respectively. It is best used as an example for writing custom snitch classes, unless of course this matches your actual deployment conventions.
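
As referenced in the GossipingPropertyFileSnitch item above, the cassandra-rackdc.properties file on each node is tiny. A sketch with placeholder datacenter and rack names you’d match to your own topology:

# cassandra-rackdc.properties (names are placeholders)
dc=dc-east
rack=rack-1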

That’s the basics on snitches. I recently wrote about another important distributed database architectural concept called consistent hashing; it’s an important concept to understand about distributed databases like Cassandra and DataStax Enterprise.


Cassandra Datacenter & Racks

The last post in this series is Distributed Database Things to Know: Consistent Hashing.

Let’s talk about the analogy of Apache Cassandra Datacenters & Racks to actual datacenters and racks. I kind of enjoy the use of the terms datacenter and rack to describe architectural elements of Cassandra. However, as time moves on, the relationship between these terms and why they’re called datacenters and racks can become obscured.

Take, for instance, a datacenter: it could just be a cloud provider, an actual physical datacenter location, a zone in Azure, or a region in some other provider. What a datacenter in Cassandra parlance actually is can vary, but the origin of why it’s called a datacenter remains the same. The elements of racks can also vary, but their origins likewise remain the same.

Origins: Racks & Datacenters?

Let’s first cover the actual things in this industry we call datacenters and racks, unrelated to the Apache Cassandra terms.

Racks: The easiest way to describe a physical rack is to show pictures of datacenter racks via the ole’ Google images.

(Image: rows of datacenter racks.)

A rack is something that is located in a datacenter, or even just someone’s garage in some odd scenarios. Ya know, if somebody wants serious hardware to work with. The rack then holds a number of servers, often of various kinds, within it. As you can see from the images above, there’s a wide range of these racks.

Datacenter: Again, the easiest way to describe a datacenter is to just look at a bunch of pictures of datacenters, albeit you’ll see lots of racks again. But really, that’s what a datacenter is: a building that has lots and lots of racks.

(Image: a datacenter floor filled with racks.)

However, in Apache Cassandra (and respectively DataStax Enterprise products) a datacenter and a rack do not directly correlate to a physical rack or datacenter. The idea is more of an abstraction than a hard mapping to the physical realm. In turn, it is better to think of datacenters and racks as a way to structure and organize your DataStax Enterprise or Apache Cassandra architecture. From a tree perspective of organizing your cluster, think of things in this hierarchy:

  • Cluster
    • Datacenter(s)
      • Rack(s)
        • Server(s)
          • Node (vnode)

Apache Cassandra Datacenter

An Apache Cassandra Datacenter is a group of nodes, related and configured within a cluster for replication purposes. Setting up a specific set of related nodes into a datacenter helps to reduce latency, prevent transactions from being impacted by other workloads, and related effects. The replication factor can also be set up to write to multiple datacenters (see the CQL sketch after the node type list below), providing additional flexibility in architectural design and organization. One specific element of datacenters to note is that they must contain only one node type:

Depending on the replication factor, data can be written to multiple datacenters. Datacenters must never span physical locations. Each datacenter usually contains only one node type. The node types are:

  • Transactional: Previously referred to as a Cassandra node.
  • DSE Graph: A graph database for managing, analyzing, and searching highly-connected data.
  • DSE Analytics: Integration with Apache Spark.
  • DSE Search: Integration with Apache Solr. Previously referred to as a Solr node.
  • DSE SearchAnalytics: DSE Search queries within DSE Analytics jobs.
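
As mentioned above, replication can be set per datacenter. A CQL sketch using NetworkTopologyStrategy; the keyspace and datacenter names here are placeholders and must match the names your snitch reports for the cluster:

CREATE KEYSPACE example
  WITH replication = {
    'class': 'NetworkTopologyStrategy',
    'dc-east': 3,
    'dc-west': 2
  };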

Apache Cassandra Racks

An Apache Cassandra Rack is a grouped set of servers. The architecture of Cassandra uses racks so that no replica is stored redundantly inside a single rack, ensuring that replicas are spread across different racks in case one rack goes down. Within a datacenter there could be multiple racks with multiple servers, as the hierarchy shown above would dictate.

To determine where data goes within a rack or set of racks, Apache Cassandra uses what is referred to as a snitch. A snitch determines which racks and datacenter a particular node belongs to, and, by virtue of that, determines where the replicas of data will end up. The snitch that informs this replication strategy can take numerous forms; some examples include:

  • SimpleSnitch – this snitch treats order as proximity. It is primarily used only in single-datacenter deployments.
  • Dynamic Snitching – the dynamic snitch monitors read latencies to avoid reading from hosts that have slowed down.
  • RackInferringSnitch – Proximity is determined by rack and datacenter, assumed corresponding to 3rd and 2nd octet of each node’s IP address. This particular snitch is often used as an example for writing a custom snitch class since it isn’t particularly useful unless it happens to match one’s deployment conventions.

In the future I’ll outline a few more snitches, how some of them work in more specific detail, and I’ll get into a whole selection of other topics. Be sure to subscribe to the blog (the ole’ RSS feed works great too) and follow @CompositeCode for blog updates. For discourse and hot takes follow me @Adron.

Distributed Database Things to Know Series

  1. Consistent Hashing
  2. Apache Cassandra Datacenter & Racks (this post)

 

DSE6 + .NET v?

Project Repo: Interoperability Black Box

First steps. Let’s get .NET installed and set up. I’m running Ubuntu 18.04 for this setup and the start of the project. To install .NET on Ubuntu one needs to go through a multi-command process of keys and some other stuff; fortunately Microsoft’s teams have made this almost easy by providing the commands for the various Linux distributions here. The commands I ran are as follows to get all this initial setup done.

[sourcecode language="bash"]
wget -qO- https://packages.microsoft.com/keys/microsoft.asc | gpg --dearmor > microsoft.asc.gpg
sudo mv microsoft.asc.gpg /etc/apt/trusted.gpg.d/
wget -q https://packages.microsoft.com/config/ubuntu/18.04/prod.list
sudo mv prod.list /etc/apt/sources.list.d/microsoft-prod.list
sudo chown root:root /etc/apt/trusted.gpg.d/microsoft.asc.gpg
sudo chown root:root /etc/apt/sources.list.d/microsoft-prod.list
[/sourcecode]

After all this I could then install the .NET SDK. It’s been so long since I actually installed .NET on anything that I wasn’t sure if I just needed the runtime, the SDK, or what I’d actually need. I just assumed it would be safe to install the SDK and then install the runtime too.

[sourcecode language="bash"]
sudo apt-get install apt-transport-https
sudo apt-get update
sudo apt-get install dotnet-sdk-2.1
[/sourcecode]

Then the runtime.

[sourcecode language="bash"]
sudo apt-get install aspnetcore-runtime-2.1
[/sourcecode]

Alright. Now with this installed, I wanted to also see if JetBrains Rider would detect that .NET is now installed, or at least find out what I’d have to do to get the IDE to detect it. So I opened up the IDE to see what the results would be. Over on the left hand side of the new solution dialog, if anything isn’t installed, Rider usually will display a message that X whatever needs to be installed. But it looked like everything was showing up as installed. Yay for things working (at this point)!


Next up is to get a solution started with the pertinent projects for what I want to build.



For the next stage I created three projects.

  1. InteroperationalBlackBox – A basic class library that will be used by a console application or whatever other application or service that may need access to the specific business logic or what not.
  2. InteroperationalBlackBox.Tests – An xunit testing project for testing anything that might need some good ole’ testing.
  3. InteroperationalBlackBox.Cli – A console application (CLI) that I’ll use to interact with the class library and add capabilities going forward.

Alright, now that all the basic projects are set up in the solution, I’ll go see about the .NET DataStax Enterprise driver. Inside JetBrains Rider I can right click on a particular project that I want to add or manage dependencies for. I did that and then put “dse” in the search box. The dialog pops up from the bottom of the IDE and you can add a package by clicking the plus sign in the description box to the right. Once the package is installed, the plus sign becomes a little red x.


Alright. Now it’s almost time to get some code working. We need ourselves a database first however. I’m going to setup a cluster in Google Cloud Platform (GCP), but feel free to use whatever cluster you’ve got. These instructions will basically be reusable across wherever you’ve got your cluster setup. I wrote up a walk through and instructions for the GCP Marketplace a few weeks ago. I used the same offering to get this example cluster up and running to use. So, now back to getting the first snippets of code working.

Let’s write a test first.

[sourcecode language="csharp"]
[Fact]
public void ConfirmDatabase_Connects_False()
{
    var box = new BlackBox();
    Assert.Equal(false, box.ConfirmConnection());
}
[/sourcecode]

In this test, I named the class BlackBox and am planning on a parameterless constructor. But as things go, tests are very fluid, or ought to be, and I may change it in the next iteration. I’m thinking, at least to get started, that I’ll have a method to test and confirm a connection for the CLI. I’ve named it ConfirmConnection for that purpose. Initially I’m going to test for false, but that’s primarily just to get started. Now, time to implement.

[sourcecode language="csharp"]
using System;
using Dse;
using Dse.Auth;

namespace InteroperabilityBlackBox
{
    public class BlackBox
    {
        public BlackBox()
        {}

        public bool ConfirmConnection()
        {
            return false;
        }
    }
}
[/sourcecode]

That gives a passing test and I move forward. For more of the run through of moving from this first step to the finished code session check out this

By the end of the coding session I had a few tests.

[sourcecode language="csharp"]
using Xunit;

namespace InteroperabilityBlackBox.Tests
{
    public class MakingSureItWorksIntegrationTests
    {
        [Fact]
        public void ConfirmDatabase_Connects_False()
        {
            var box = new BlackBox();
            Assert.Equal(false, box.ConfirmConnection());
        }

        [Fact]
        public void ConfirmDatabase_PassedValuesConnects_True()
        {
            var box = new BlackBox("cassandra", "", "");
            Assert.Equal(false, box.ConfirmConnection());
        }

        [Fact]
        public void ConfirmDatabase_PassedValuesConnects_False()
        {
            var box = new BlackBox("cassandra", "notThePassword", "");
            Assert.Equal(false, box.ConfirmConnection());
        }
    }
}
[/sourcecode]

The respective code for connecting to the database cluster, per the walk through I wrote about here, at session end looked like this.

[sourcecode language="csharp"]
using System;
using Dse;
using Dse.Auth;

namespace InteroperabilityBlackBox
{
    public class BlackBox : IBoxConnection
    {
        public BlackBox(string username, string password, string contactPoint)
        {
            UserName = username;
            Password = password;
            ContactPoint = contactPoint;
        }

        public BlackBox()
        {
            UserName = "ConfigValueFromSecretsVault";
            Password = "ConfigValueFromSecretsVault";
            ContactPoint = "ConfigValue";
        }

        public string ContactPoint { get; set; }
        public string UserName { get; set; }
        public string Password { get; set; }

        public bool ConfirmConnection()
        {
            IDseCluster cluster = DseCluster.Builder()
                .AddContactPoint(ContactPoint)
                .WithAuthProvider(new DsePlainTextAuthProvider(UserName, Password))
                .Build();

            try
            {
                cluster.Connect();
                return true;
            }
            catch (Exception e)
            {
                Console.WriteLine(e);
                return false;
            }
        }
    }
}
[/sourcecode]

With my interface providing the contract to meet.

[sourcecode language="csharp"]
namespace InteroperabilityBlackBox
{
    public interface IBoxConnection
    {
        string ContactPoint { get; set; }
        string UserName { get; set; }
        string Password { get; set; }
        bool ConfirmConnection();
    }
}
[/sourcecode]

Conclusions & Next Steps

After I wrapped up the session, two things stood out that needed to be fixed before the next session. I’ll be sure to add these as objectives for the next coding session at 3pm PST on Thursday.

  1. The tests really needed to more resiliently confirm the integrations that I was working to prove out. My plan at this point is to add some Docker images that would give the development integration tests a point to work against. This would alleviate the need for something outside of the actual project repository to exist, removing that fragility.
  2. The application, in its “Black Box”, should do something. For the next session we’ll write up some feature requests we’d want. Or maybe someone has suggestions of functionality they’d like to see implemented in a CLI using .NET Core working against a DataStax Enterprise Cassandra database cluster? Feel free to leave a comment or three about a feature, and I’ll work on adding it during the next session.