Category Archives: Infrastructure

New Live Coding Streams and Episodes!

I’ve been working away in Valhalla on the next episodes of Thrashing Code TV and subsequent content for upcoming Thrashing Code Sessions on Twitch (follow) and Youtube (subscribe). Below I’ve broken out the main streams and shows I’ll be putting together over the next days, weeks, and months, along with links to sessions and shows already recorded. If you’ve got any ideas, questions, or thoughts, just send them my way.

Colligere (Next Session)

Coding has been going a little slow, in light of other priorities and all, but it’ll still be one of the featured projects I’ll be working on. Past episodes are available here; however, join in on Friday and I’ll catch everybody up, so you can skip the past episodes if you aren’t after specific details and just want to join in on future work and sessions.

In this next session, this Friday the 9th at 3:33pm PST, I’m going to be working on reading in JSON, determining what type of structure the JSON should be unmarshalled into, and how best to make that determination through logic and flow.

Since Go needs a specific structure to unmarshal JSON into, I’ll be working on a good way to pre-read information in the schema configuration files (detailed in the issue listed below) so that a logic flow can decide which structure to use before kicking off the standard Go JSON unmarshalling of the object. This will likely end up including some hackery around reading in JSON without the full assistance of the Go JSON library. Join in and check out what solution(s) I come up with.

The specific issue I’ll be working on is located on Github here. I’m going to keep these sessions going, though the schedule will stay a little loose; I’ll primarily be working on the Colligere CLI on Saturdays at 10am. So you can put that on your schedule and join me then for hacks. If you’d like to contribute, as always, reach out via here, @Adron, or via the Github Colligere repository and let’s discuss what you’d like to add.

Getting Started with Go

This set of sessions, which I’ve detailed in “Getting Started with Go“, I’ll be starting on January 12th at 4pm PST. You can get the full outline and further details of what I’ll be covering on my “Getting Started with Go” page, and of course I’ve posted details for the first of these sessions on the Twitch event page here.

  • Packages & the Go Tool – import paths, package declarations, blank imports, naming, and more.
  • Structure – names, declarations, variables, assignments, scope, etc., etc.
  • Basic Types – integers, floats, complex numbers, booleans, strings, and constants.

Infrastructure as Code with Terraform and Apache Cassandra

I’ll be continuing the Terraform, bash, and related configuration work, using infrastructure-as-code practices to build out, maintain, and operate Apache Cassandra distributed database clusters. At some point I’ll likely add Kubernetes and some additional on-the-metal cluster systems, and start looking at Kubernetes Operators and how one can manage distributed systems on Kubernetes in that on-the-metal environment. But for now these sessions will continue real soon, as we’ve got some systems to build!

Existing episodes of this series you can check out here.

Getting Started with Multi-model Databases

This set of sessions I’ve detailed in “Getting Started with a Multi-model Database“, and I’ll be starting this one in the new year as well. Here’s a short rundown of the next several streams. So stay tuned, subscribe or follow my Twitch and Youtube, and of course subscribe to the Composite Code blog (the option should be to the left, or if on mobile, click the little vertical ellipses button).

  1. An introduction to a range of databases: Apache Cassandra, PostgreSQL and SQL Server, Neo4j, and … an in-memory database. Kind of like 7 Databases in 7 Weeks, but a bunch of databases in just a short session!
  2. An Introduction – Apache Cassandra and what it is, how to get a minimal cluster started, options for deploying something quickly to try it out.
  3. Adding to Apache Cassandra with DataStax Enterprise, gaining analytics, graph, and search. In this session I’ll dive into what each of these capabilities within DataStax Enterprise give us and how the architecture all fits together.
  4. Deployment of Apache Cassandra and getting a cluster built. Options around ways to effectively deploy and maintain Apache Cassandra in production.
  5. Moving to DataStax Enterprise (DSE) from Apache Cassandra. Getting a DSE Cluster up and running with OpsCenter, Lifecycle Manager (LCM), and getting some queries tried out with Studio.

Terraform “Invalid JWT Signature.”

I ran into this issue recently: the “Invalid JWT Signature.” error while running some Terraform. It appeared whenever I was setting up a bucket in Google Cloud Platform to use as a backend for storing Terraform’s state. Here’s the exact error in the console.

[Image: terraform-jwt-invalid.png]

My first quick searches uncovered some Github issues that looked curiously familiar. The first, Invalid JWT Token when using Service Account JSON #3100, was closed without any particular resolution, so it didn’t help too much, though I’d be curious what the fix was. The second, Creating GCP project in terraform #13109, sounded much more on point. It appeared closer to my issue, but it looked like I should probably just start from scratch, since this configuration already worked on one machine and just didn’t work on the machine I’d shifted to. (Grumble grumble, what’d I miss?)

The Solution(s)?

In the end the message is this: if you work on multiple machines with multiple cloud accounts, you might get the keys mixed up. In this particular case I reset my NIC (you can also just reboot, which is usually easier, especially on Windows), and then everything just started working. In some cases, however, the JSON file with the gcloud/GCP keys needs to be regenerated because the old key was rolled or otherwise invalidated.
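For reference, the state backend setup where I hit this was just a standard GCS remote state configuration, roughly like the sketch below. The bucket name, prefix, and credentials path here are placeholders of mine, not the exact values from my project; a stale or mixed-up service account key supplied here (or through gcloud auth or the GOOGLE_CREDENTIALS environment variable) is one way to end up with this error.

terraform {
  backend "gcs" {
    bucket      = "my-terraform-state-bucket" # placeholder bucket name
    prefix      = "terraform/state"           # placeholder prefix
    credentials = "account.json"              # placeholder service account key file
  }
}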

 

Terraform Changes, Provisioners, Connections, and Static Nodes

NOTE: There’s a video toward the bottom of the post, with a timeline so you can navigate to the points in the video where I speak to different topics.

Cluster Nodes Currently

First I start out this session with a simple issue I ran into the previous night: a variable in the name of the instance resource being created was causing the instance to be destroyed and recreated on every single execution of terraform. That won’t do, so I needed to just declare each node statically, at least the first three, maybe five (cuz’ it’s best to go with 2n+1 instances in most distributed systems). So I removed the module I created in the last session and created two nodes to work with, for creating and setting up the provisioners that I’d use to set up Cassandra.
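For context, the problematic pattern looked roughly like the sketch below; this is a hypothetical reconstruction, not the exact module code. Since an instance name can’t be updated in place, any interpolated value that shifts between runs forces Terraform to plan a destroy-and-recreate.

variable "node_suffix" {
  default = "0"
}

resource "google_compute_instance" "node" {
  # If this interpolated value changes from one run to the next, the
  # instance gets destroyed and recreated on every `terraform apply`.
  name         = "node-${var.node_suffix}"
  machine_type = "n1-standard-1"
  zone         = "us-west1-a"

  boot_disk {
    initialize_params {
      image = "ubuntu-minimal-1804-bionic-v20180814"
    }
  }

  network_interface {
    subnetwork = "${module.network_development.subnetwork_west}"
  }
}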

In addition to statically declaring each of the initial nodes, I also, at least for now, added public IPs for troubleshooting purposes. Once everything is set up I’ll likely remove those and ensure that the cluster communicates only on the private IPs, or at least use the firewall settings to disallow traffic from hitting the individual nodes of the cluster via the public IPs.

The nodes, public and private IP addresses, now look like this.

resource "google_compute_address" "node_zero_address" {
  name         = "node-0-address"
  subnetwork   = "${module.network_development.subnetwork_west}"
  address_type = "INTERNAL"
  address      = "10.1.0.2"
}

resource "google_compute_address" "node_zero_public_address" {
  name = "node-0-public"
}

resource "google_compute_instance" "node_zero" {
  name         = "node-0"
  machine_type = "n1-standard-1"
  zone         = "us-west1-a"

  boot_disk {
    initialize_params {
      image = "ubuntu-minimal-1804-bionic-v20180814"
    }
  }

  network_interface {
    subnetwork = "${module.network_development.subnetwork_west}"
    address    = "${google_compute_address.node_zero_address.address}"

    access_config {
      nat_ip = "${google_compute_address.node_zero_public_address.address}"
    }
  }

  service_account {
    scopes = [
      "userinfo-email",
      "compute-ro",
      "storage-ro",
    ]
  }
}

Provisioners

When using Terraform provisioners there are a few things to take into account. First, which is kind of obvious but nonetheless worth mentioning, is to know which user, ssh keys, and related authentication details you’re using to actually log in to the instances you create. It can get a little confusing sometimes, at least it’s gotten me more than once, when gcloud generates ssh keys for me while I’ve already got my own keys created. Then when I set up a provisioner I routinely forget that I don’t want to use id_rsa but instead the Google-created keys. I talk more about this in the video, along with ensuring that the user on the machine is the right user, that it’s set up for CentOS or Ubuntu or whatever distro you’re using, and all of these specifics.

Once that’s all figured out and sorted, it’s just a few steps to get a provisioner set up and working. The first thing is to set up whichever provisioner we want to use. There are several options: file, local-exec, remote-exec, and some others for chef, salt-masterless, null_resource, and habitat. Here I’m just going to cover file, local-exec, and remote-exec. First let’s take a look at the file provisioner.

resource "google_compute_instance" "someserver" {
  # ... some other fields for the particular instance being created.

  # Copies the myapp.conf file to /etc/myapp.conf
  provisioner "file" {
    source      = "directory/relative/to/repo/path/thefile.json"
    destination = "/the/path/that/will/be/created/andthefile.json"
  }
}

This example of a provisioner shows, especially with my fancy naming, several key things:

  • The provisioner goes into an instance resource, as a block within the instance resource itself.
  • The provisioner, for which I’ve selected the “file” type here, then has several key fields within it; for the file provisioner the two required fields are source and destination.
  • The source takes the file to be copied to the instance, which lives locally in this repository (though I suppose it could be outside of it), given as a path relative to where terraform executes plus the file name. You could put just the file name if the file sits in the root of where this is being executed.
  • The destination is the path on the remote server, relative to the connecting user’s home folder unless an absolute path is given. So if I set folder/filename.json it’ll create a folder called “folder” and put the file in it, possibly renamed if the destination name isn’t the same, here as filename.json.

But of course, one can’t just simply copy a file to a remote instance without a user and the respective authentication, which is where the connection block comes into play within the provisioner. With that added, the provisioner looks like this.

resource "google_compute_instance" "someserver" {
  # ... some other fields for the particular instance being created.

  # Copies the myapp.conf file to /etc/myapp.conf
  provisioner "file" {
    source      = "directory/relative/to/repo/path/thefile.json"
    destination = "/the/path/that/will/be/created/andthefile.json"
    connection {
      type        = "ssh"
      user        = "useraccount"
      private_key = "${file("~/.ssh/id_rsa")}"
      agent       = false
      timeout     = "30s"
    }
  }
}

Here the connection block adds several new fields that are used to make the connection; this works the same for the file or remote-exec provisioner. As mentioned earlier, and worked through while troubleshooting in the video, in this case I’ve just put in id_rsa since that’s the common ssh private key. However with GCP I needed to be using the google_compute_engine ssh private key.

The other fields are pretty self-explanatory, but let’s discuss them real quick. The type simply states that an ssh connection will be used; one can also use password authentication and some other options, or in the case of Windows, the winrm connection type.

The user field is simply the name of the user which will be used to authenticate to the remote instance upon creation. The private key is the private ssh key of the user that needs to connect to that particular instance and do whatever the provisioner is going to do.

The agent field, if set to false, doesn’t use ssh-agent, but if set to true does. On Windows the only supported ssh authentication agent is Pageant.

Then the timeout field is how long, in seconds as shown, terraform will wait for the connection to become available before an error is thrown in the terraform execution.

After all of those things were set up, I created a simple script to install Cassandra, which I called install-c.sh.

#!/usr/bin/env bash

# Installing Cassandra 3.11.3

echo "deb http://www.apache.org/dist/cassandra/debian 311x main" | sudo tee -a /etc/apt/sources.list.d/cassandra.sources.list
curl https://www.apache.org/dist/cassandra/KEYS | sudo apt-key add -
sudo apt-get update
sudo apt-key adv --keyserver pool.sks-keyservers.net --recv-key A278B781FE4B2BDA
sudo apt-get install -y cassandra

Now with the script done I can use the file provisioner to move it over to the server, which I did in the video, and then started putting together the remote-exec provisioner. I messed around with the local-exec provisioner too, but then realized I’d gotten them turned around in my head: local-exec actually runs a script here on the local machine that the terraform execution starts from, whereas I needed terraform to execute the script that now sits out on the remote instance that was just created. That provisioner looks like this.

provisioner "remote-exec" {
  inline = [
    "chmod +x install-c.sh",
    "install-c.sh",
  ]

  connection {
    type        = "ssh"
    user        = "adron"
    private_key = "${file("~/.ssh/google_compute_engine")}"
    agent       = false
    timeout     = "30s"
  }
}

In this provisioner the inline field takes an array of strings which will be executed on the remote server. I could have just entered the installation steps for installing Cassandra here, but considering I’ll actually want to take additional steps and actions, that would make the provisioner exceedingly messy. So what I do here is set up the connection, and then in the inline field I make the script executable and then execute it on the server. The script then goes through the steps of adding the apt-get repo, taking the respective key verification steps, running an update, and then installing Cassandra.
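For contrast, since I got the two mixed up during the session, here’s roughly what a local-exec provisioner would look like sitting inside the same instance resource. It runs on the machine that terraform is executed from, not on the new instance; the command here is just an illustrative placeholder of mine, not something from the session.

provisioner "local-exec" {
  # Runs locally where terraform executes, e.g. to log the new node's public IP.
  command = "echo 'node-0 created at ${google_compute_address.node_zero_public_address.address}' >> nodes.log"
}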

After that I wrap up for this session: static instances created, provisioners defined, and connections made and verified for the provisioners, with Cassandra installed. So for the next session I’m all set to start connecting the nodes together into a cluster and taking whatever next steps I need.

  • 0:57 Thrashing Code Intro
  • 3:43 Episode Intro, stepping into Terraform for the session.
  • 4:25 Starting to take notes during session!
  • 6:12 Describing the situation where variables in key values of resources cause the resources to be recreated on every `terraform apply`, so I broke out each individual node.
  • 9:36 I start working with the provisioner to get a script file on machine and also execute that script file. It’s tricky, because one has to figure out what the user for the particular image is. Also some oddball mess with my mic starts. Troubleshooting this, but if you know what’s up with OBS streaming software and why this is happening, lemme know I’m all ears! 🙂
  • 31:25 Upon getting the file provisioner to finally work, setting extra fields for time out and such, I moved on to take next steps with IntelliJ cuz the IDE is kind of awesome. In addition, the plugins for HCL (HashiCorp Configuration Language) and related elements including features built into the actual IDE provide some extra autocomplete and such for Terraform work. At this time I go through actually setting up the IDE (installed it earlier while using Visual Studio Code).
  • 34:15 Setup of IntelliJ plugins specific to Terraform and the work at hand.
  • 35:50 I show some of the additional fields that the autocomplete for Terraform actually surfaces, which helps a lot during troubleshooting or just building out various resources.
  • 37:05 Checking out the Apache Cassandra (http://cassandra.apache.org/) download and install commands and adding those to the install script for the Terraform provisioner file copy to copy over to the instance.
  • 43:18 I add some configurations for terraform builds/execution via IntelliJ. There are terraform specific build options and also bash file execution options which I cover in video.
  • 45:32 I take a look at IntelliJ settings to determine where to designate the Terraform executable file. This is important for getting the other terraform build configuration options to work.
  • 48:00 Start of debugging and final steps between now and…
  • 1:07:16 …finish of successful Cassandra install on nodes using provisioners.
  • 1:07:18 Finishing up by committing the latest changes to repository.
  • 1:09:04 Hacker outtro!

Resources:

 

Setting Up Nodes, Firewall, & Instances in Google Cloud Platform

Here’s the rundown of what I covered in the latest Thrashing Code Session (go subscribe to the channel here for future sessions, or follow on Twitch). The core focus of this session was making further progress on my Terraform project around getting a basic Apache Cassandra and DataStax Enterprise Apache Cassandra cluster up and running in Google Cloud Platform.

The code and configuration from the work are available on Github at terraform-fields, and a summary of code changes and other actions taken during the session is further along in this blog entry.

Streaming Session Video

In this session I worked toward completing a few key tasks for the Terraform project around creating a Cassandra cluster. Here’s a rundown of the time points where I tackle specific topics.

  • 3:03 – Welcome & objectives list: Working toward DataStax Enterprise Apache Cassandra Cluster and standard Apache Cassandra Cluster.
  • 3:40 – Review of what infrastructure exists from where we left off in the previous episode.
  • 5:00 – Found music to play that is copyright safe! \m/
  • 5:50 – Recap on where the project lives on Github in the terraformed-fields repo.
  • 8:52 – Adding a google_compute_address for use with the instances. Leads into determining static public and private google_compute_address resources. The idea being to know the IP for our cluster to make joining them together easier.
  • 11:44 – Working to get the access_config and related properties set on the instance to assign the google_compute_address resources that I’ve created. I run into a few issues but work through those.
  • 22:28 – Bastion server is set with the IP.
  • 37:05 – I set up some files, following a kind of “bad process” as I note, which I’ll refactor and clean up in a subsequent episode. But the bad process also limits the number of resources I have in one file, so it’s a little easier to follow along.
  • 54:27 – Starting to look at provisioners to execute script files and commands before or after the instance creation. Super helpful, with the aim to use this feature to download and install the DataStax Enterprise Apache Cassandra or standard Apache Cassandra software.
  • 1:16:18 – Ah, a need for a firewall rule for ssh on port 22. I work through adding that and then end up with an issue that we’ll be resolving next episode!

Session Content

Starting Point: I started this episode from where I left off last session.

Work Done: In this session I added a number of resources to the project and worked through a number of troubleshooting scenarios, as one does.

Added firewall resources to open up port 22 and icmp (ping, etc).

resource "google_compute_firewall" "bastion-ssh" {
  name    = "gimme-bastion-ssh"
  network = "${google_compute_network.dev-network.name}"

  allow {
    protocol = "tcp"
    ports    = ["22"]
  }
}

resource "google_compute_firewall" "bastion-icmp" {
  name    = "gimme-bastion-icmp"
  network = "${google_compute_network.dev-network.name}"

  allow {
    protocol = "icmp"
  }
}

I also broke out the files so that each instance has its own IP addresses defined in the file specific to that instance. Later I’ll add context for why I let the project bloat with files like this, when I refactor to use modules.

[Image: terraform-files.png]

I added each node resource as follows, incrementing the node number by one for each subsequent node, for example making this node1_internal google_compute_address become node2_internal. Everything is also statically defined, adding to my file and configuration bloat.

resource "google_compute_address" "node1_internal" {
  name         = "node-1-internal"
  subnetwork   = "${google_compute_subnetwork.dev-sub-west1.self_link}"
  address_type = "INTERNAL"
  address      = "10.1.0.5"
}

resource "google_compute_instance" "node_frank" {
  name         = "frank"
  machine_type = "n1-standard-1"
  zone         = "us-west1-a"

  boot_disk {
    initialize_params {
      image = "ubuntu-minimal-1804-bionic-v20180814"
    }
  }

  network_interface {
    subnetwork = "${google_compute_subnetwork.dev-sub-west1.self_link}"
    address    = "${google_compute_address.node1_internal.address}"
  }

  service_account {
    scopes = [
      "userinfo-email",
      "compute-ro",
      "storage-ro",
    ]
  }
}
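This per-node duplication is exactly what the module refactor I mentioned will eventually collapse. As a rough sketch of what that might look like, with the module name, path, and variables being hypothetical placeholders rather than anything in the repo yet:

module "node_1" {
  source     = "./modules/cassandra_node" # hypothetical module path
  name       = "node-1"
  private_ip = "10.1.0.5"
  subnetwork = "${google_compute_subnetwork.dev-sub-west1.self_link}"
}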

I also set up the bastion server so it looks like this, specifically designating a public IP so that I can connect via SSH.

resource "google_compute_address" "bastion_a" {
  name = "bastion-a"
}

resource "google_compute_instance" "a" {
  name         = "a"
  machine_type = "n1-standard-1"
  zone         = "us-west1-a"

  provisioner "file" {
  source      = "install-c.sh"
  destination = "install-c.sh"

  connection {
    type     = "ssh"
    user     = "root"
    password = "${var.root_password}"
    }
  }

  boot_disk {
    initialize_params {
      image = "ubuntu-minimal-1804-bionic-v20180814"
    }
  }

  network_interface {
    subnetwork = "${google_compute_subnetwork.dev-sub-west1.self_link}"
    access_config {
      nat_ip = "${google_compute_address.bastion_a.address}"
    }
  }

  service_account {
    scopes = [
      "userinfo-email",
      "compute-ro",
      "storage-ro",
    ]
  }
}

Plans for the next session include getting the nodes set up so that the bastion server can work with them and deploy or execute commands against them without the nodes being exposed publicly to the internet. We’ll talk more about that then. For now, happy thrashing code!

Thrashing Code Twitch Schedule September 19th-October 2nd

I’ve got everything queued back up with some extra Thrashing Code Sessions, and will have some on-the-rails travel streams too. Here’s what the schedule looks like so far.

Today at 3pm PST (UPDATED: Sep 19th 2018)

UPDATED: Video available at https://youtu.be/NmlzGKUnln4

I’m going to get back into the roll of things this session after the travels last week. In this session I’m aiming to do several things:

  1. Complete next steps toward getting a DataStax Enterprise Apache Cassandra cluster up and running via Terraform in Google Cloud Platform. My estimate is I’ll get to the point where three instances launch and the installation of Cassandra on them is automated. Later I’ll aim to expand this, but for now I’m just going to deploy 3 nodes and take it from there. Another future option is to bake the installation into a Packer-built image and use that for the Terraform execution; a quick sketch of the Terraform side of that option follows this list. Tune in to find out the steps and what I decide to go with.
  2. I’m going to pull up the InteroperabilityBlackBox and start to flesh out some objects for our model. The idea is based on something I stumbled into last week during travels; the thread on that is here.
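On the Terraform side, consuming a Packer-baked image would mostly just mean pointing the instance’s boot disk at the custom image instead of the stock Ubuntu one. Here’s a hedged sketch; the image and subnetwork names are hypothetical placeholders for whatever a Packer build with Cassandra pre-installed would produce.

resource "google_compute_instance" "node" {
  name         = "node-0"
  machine_type = "n1-standard-1"
  zone         = "us-west1-a"

  boot_disk {
    initialize_params {
      # Hypothetical custom image baked by Packer with Cassandra pre-installed.
      image = "cassandra-node-base"
    }
  }

  network_interface {
    subnetwork = "dev-sub-west1" # placeholder subnetwork name
  }
}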

Friday (Today) @ 3pm PST

This Friday I’m aiming to cover some Go basics before moving further into the Colligere CLI app. Here are the highlights of the plan.

  1. I’m going to cover some of the topics around program structure including: type declarations, tuple assignment, variable lifetime, pointers, and other variable details.
  2.  I’m going to cover some basics on packages, initialization of packages, imports, and scope. This is an important aspect of ongoing development with Colligere since we’ll be pulling in a number of packages for generation of the data.
  3. Setting up configuration and schema for the Colligere application using Viper and related tooling.

Tuesday, October 2nd @ 3pm PST

This session I’m aiming to get some more Terraform work done around the spin up and shutdown of the cluster. I’ll dig into some more specific points depending on where I progress to in sessions previous to this one. But it’s on the schedule, so I’ll update this one in the coming days.

 

Collecting Terraform Resources

I just finished a LinkedIn Learning (AKA Lynda.com) course, Learning Terraform, which was published last month (LinkedIn Learning Course & Lynda.com Course). Immediately after posting that, I spoke with my editor at LinkedIn Learning and agreed on the next two courses I’ll record: Terraform Essentials and Go for Site Reliability Engineers. Consider me stoked to be putting this material together and recording more video courses. This is a solid win; as the internet doge would say, “much excite, very wow”!

The following are some recent materials I’ve dug up regarding Terraform, Go, and Site Reliability work, some of which will very likely find their way into my courses. There’s good material here if you’re looking for some solid, and arguably more advanced, approaches to your Terraform work.

Advanced Terraform Materials

The HashiCorp Documentation Material

Writing custom providers

Running Terraform in Automation

Gaining a Systemic View of Immutable Infrastructure Tooling

I put together a few starter notes on things you should delve into and understand before working with infrastructure-related tooling like Ansible, Terraform, or similar tools. If you think I’ve missed any, do ping me @Adron and let me know your thoughts on other additions. The first starter items I’d list are the following. Continue reading