Setup Postgres and a GraphQL API with Hasura on Azure

Key Technologies: Hasura, Postgres, Terraform, Docker, and Azure.

I created a data model to store railroad systems, services, schedules, time points, and related information, the schema I detailed in “Beyond CRUD n’ Cruft Data-Modeling”, with a few tweaks. The original I’d created for Apache Cassandra; I’ve since switched to Postgres, which gives me the option of primary and foreign keys, relations, and the related connections for the model.

In this post I’ll use that schema to build out an infrastructure as code solution with Terraform, utilizing Postgres and Hasura (OSS).

Prerequisites

To follow along you’ll need Docker and Docker Compose for the development environment section, plus Terraform and an Azure subscription for the production deployment later in the post.

Docker Compose Development Environment

For the Docker Compose file, I just placed it in the root of the repository. I added a docker-compose.yaml file and then added services to it. The first service I set up was the Postgres/PostgreSQL database. This uses the standard Postgres image on Docker Hub. I opted for version 12, I want it to always restart if it gets shut down or crashes, and the last of the obvious settings is the port, which maps 5432 to 5432.

For the volume, since I might want to back up or tinker with the data, I mount a db_data volume at the Postgres data directory. (If you’d rather have the files directly on the host, you can bind-mount a local path instead; I tend to point mine at my own Codez directory in case I need to debug things locally.)

The POSTGRES_PASSWORD value comes from an environment variable, thus the ${PPASSWORD} syntax. This way no passwords go into the repo. Then I can load the environment variable via a standard export PPASSWORD="theSecretPasswordHere!" line in my system startup script or via other means.
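
If exporting from a startup script isn’t your thing, note that Docker Compose also automatically reads a .env file sitting next to the docker-compose.yaml. A minimal sketch, with an obviously made-up password (just be sure the file is in .gitignore):

# .env (git-ignored, read automatically by docker-compose)
PPASSWORD=theSecretPasswordHere!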

services:
  postgres:
    image: postgres:12
    restart: always
    volumes:
      # Mount the named volume at the Postgres data directory so the data
      # survives container restarts. To keep the files directly on the host
      # instead, bind-mount a local path here, e.g.
      # /Users/adron/Codez/databases:/var/lib/postgresql/data
      - db_data:/var/lib/postgresql/data
    environment:
      POSTGRES_PASSWORD: ${PPASSWORD}
    ports:
      - "5432:5432"

For the db_data volume, toward the bottom of the file I add the key to declare it.

volumes:
  db_data:

Next I added the GraphQL solution, Hasura. The v1.1.0 image probably needs to be updated (I believe we’re on version 1.3.x now), so I’ll do that soon, but I got the example working with v1.1.0. The ports are mapped from 8080 to 8080, the service depends on the postgres service already detailed, and restart is set to always, just as with the postgres service. Finally, there are two environment variables for the container:

  • HASURA_GRAPHQL_DATABASE_URL – this variable is the base Postgres URL connection string.
  • HASURA_GRAPHQL_ENABLE_CONSOLE – this variable enables the console user interface. We’ll definitely want this for the development environment; in production, however, I’d likely want it turned off.

  graphql-engine:
    image: hasura/graphql-engine:v1.1.0
    ports:
      - "8080:8080"
    depends_on:
      - "postgres"
    restart: always
    environment:
      # Reuse the same password the postgres service was initialized with,
      # keeping credentials out of the repository.
      HASURA_GRAPHQL_DATABASE_URL: postgres://postgres:${PPASSWORD}@postgres:5432/postgres
      HASURA_GRAPHQL_ENABLE_CONSOLE: "true"

At this point the commands to start this are relatively minimal, but in spite of that I like to create a start and stop shell script. My start script and stop script simply look like this:

Starting the services.

docker-compose up -d

For the first execution of the services you may want to skip the -d and instead watch the startup just to become familiar with the events and connections as they start.

Stopping the services.

docker-compose down
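
As actual files, the pair of scripts might look like this minimal sketch, assuming bash and the PPASSWORD export handled elsewhere:

#!/usr/bin/env bash
# start.sh - bring Postgres and Hasura up, detached.
docker-compose up -d

#!/usr/bin/env bash
# stop.sh - tear the services back down.
docker-compose down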

🚀 That’s it for the basic development environment; we’re launched and ready for development. With the services started, navigate to http://localhost:8080/console to start working with the user interface. I’ll have more details on swapping the “Beyond CRUD n’ Cruft Data-Modeling” model over to Hasura and Postgres in an upcoming blog post.
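
If you’d rather sanity-check from the terminal before opening a browser, Hasura exposes a simple health endpoint:

curl http://localhost:8080/healthz
# Prints OK once the engine is up and can reach Postgres.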

For full syntax of the docker-compose.yaml check out this gist: https://gist.github.com/Adron/0b2ea637b5e00681f4d62404805c3a00

Terraform Production Environment

For the production deployment of this stack I want to deploy to Azure, use Terraform for infrastructure as code, and use the Azure Database for PostgreSQL service, while running Hasura for my GraphQL API tier.

For the Terraform files I created a folder and added a main.tf file. I always create a folder to work in, generally, to keep the state files and initial prototyping of the infrastructure in a singular place. Eventually I’ll set up a location to store the state and fully automate the process through a continuous integration (CI) and continuous delivery (CD) process. For now though, just a singular folder to keep it all in.

For this I know I’ll need a few variables and add those to the file. These are variables that I’ll use to provide values to multiple resources in the Terraform templating.

variable "database" {
  type = string
}

variable "server" {
  type = string
}

variable "username" {
  type = string
}

variable "password" {
  type = string
}
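
A quick note on feeding these variables: later I’ll pass them in with -var flags from a script, but Terraform will also read them straight from the environment if they’re exported with the TF_VAR_ prefix. A sketch with placeholder values:

export TF_VAR_server="logisticscoresystemsdb"
export TF_VAR_database="logistics"
export TF_VAR_username=$PUSERNAME
export TF_VAR_password=$PPASSWORD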

One other item I’ll want, so that it’s a little easier to verify my Hasura connection information, is an output that looks like this.

output "hasura_url" {
  value = "postgres://${var.username}%40${azurerm_postgresql_server.logisticsserver.name}:${var.password}@${azurerm_postgresql_server.logisticsserver.fqdn}:5432/${var.database}"
}

Let’s take this one apart a bit. There are a lot of concatenated and interpolated variables being wedged together here. This is basically the Postgres connection string that Hasura will need to make a connection. It includes the username and password and all of the pertinent parsed and string-escaped values. Note specifically the %40 between the ${var.username} and ${azurerm_postgresql_server.logisticsserver.name} variables: Azure expects the login in the form username@servername, and inside a connection string that @ has to be URL-encoded as %40, while the later @ sign is left alone because it’s the literal separator between the credentials and the host. When constructing this connection string, it is very important to be mindful of all these specific values being connected together. But I did the work for you, so it’s a pretty easy copy and paste now!
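
As a purely illustrative example, with the username postgres, a server named logisticsserver1, and the database logistics (all placeholder values), that output renders to something like:

postgres://postgres%40logisticsserver1:theSecretPassword@logisticsserver1.postgres.database.azure.com:5432/logistics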

Next I’ll need the Azure provider information.

provider "azurerm" {
  version = "=2.20.0"
  features {}
}

Note the features block that is just empty; as of the 2.x provider it is required to declare this block even when it’s empty.

Next up is the resource group that everything will be deployed to.

resource "azurerm_resource_group" "adronsrg" {
  name     = "adrons-rg"
  location = "westus2"
}

Now the Postgres Server itself. Note the location and resource_group_name simply map back to the resource group. Another thing I found a little confusing, as I wasn’t sure if it was a Terraform name, a resource name tag, or the server name itself, is the “name” key-value pair in this resource. It is, however, the server name, which I’ve assigned var.server. The next value assigned, “B_Gen5_2”, is the Azure SKU designator, which is a bit cryptic. More on that in a future post, but a quick decoder follows below.
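
The SKU string follows a tier_hardwareGeneration_vCores pattern, as I understand it. A few illustrative examples, not an exhaustive list:

# B_Gen5_2  = Basic tier, Gen5 hardware, 2 vCores
# GP_Gen5_4 = General Purpose tier, Gen5 hardware, 4 vCores
# MO_Gen5_8 = Memory Optimized tier, Gen5 hardware, 8 vCores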

After that, the storage is set, if I RTFM’ed correctly, to 5 gigs (5120 MB). For what I’m doing this will be fine. The backup is set up for 7 days of retention. This means I’ll be able to fall back to a backup from any of the last seven days, but after 7 days the backups are rolled and the oldest day is deleted to make space for the newest backup. The geo_redundant_backup_enabled setting is set to false because, with Postgres’ excellent reliability and my desire not to pay for that extra reliability insurance, I don’t need geographic redundancy. Last, I set auto_grow_enabled to true, though I still need to determine the exact logic this follows for this particular implementation and deployment of Postgres.

The last chunk of details for this resource are simply the username and password, which are derived from variables, which are in turn derived from environment variables to keep the actual username and passwords out of the repository. The final two bits set the version of Postgres to v9.5 and turn SSL enforcement on.

resource "azurerm_postgresql_server" "logisticsserver" {
  name = var.server
  location = azurerm_resource_group.adronsrg.location
  resource_group_name = azurerm_resource_group.adronsrg.name
  sku_name = "B_Gen5_2"

  storage_mb                   = 5120
  backup_retention_days        = 7
  geo_redundant_backup_enabled = false
  auto_grow_enabled            = true

  administrator_login          = var.username
  administrator_login_password = var.password
  version                      = "9.5"
  ssl_enforcement_enabled      = true
}

Since the database server is all set up, now I can confidently add an actual database to that server. Here the resource_group_name pulls from the resource group resource and the server_name pulls from the server resource. The name, being the database name itself, I derive from a variable too. Then the character set is UTF8 and the collation is set to US English; note that Azure expects the Windows-style collation name English_United States.1252 rather than the Linux-style en_US.UTF8 you might see elsewhere.

resource "azurerm_postgresql_database" "logisticsdb" {
  name                = var.database
  resource_group_name = azurerm_resource_group.adronsrg.name
  server_name         = azurerm_postgresql_server.logisticsserver.name
  charset             = "UTF8"
  collation           = "English_United States.1252"
}

The next thing I discovered, after some trial and error and a good bit of searching, is the Postgres-specific firewall rule. It appears this is related to the Postgres service in Azure specifically; through a number of trials and many errors I attempted to use the standard firewalls and firewall rules that are available in virtual networks. My understanding now is that the Postgres Servers exist outside of that paradigm and consequently have their own firewall rules.

This firewall rule basically attaches the firewall to the resource group, then the server itself, and allows internal access between the Postgres Server and the Hasura instance. The 0.0.0.0 start and end addresses are Azure’s special case meaning “allow connections from services running in Azure” (like the container instance that will run Hasura), without opening the server up to the public internet at large.

resource "azurerm_postgresql_firewall_rule" "pgfirewallrule" {
  name                = "allow-azure-internal"
  resource_group_name = azurerm_resource_group.adronsrg.name
  server_name         = azurerm_postgresql_server.logisticsserver.name
  start_ip_address    = "0.0.0.0"
  end_ip_address      = "0.0.0.0"
}
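
If you also want to reach the database from your own workstation with psql or a GUI client, a second rule scoped to your public IP does the trick. A sketch, with a documentation-range placeholder address you’d swap for your own:

resource "azurerm_postgresql_firewall_rule" "allowworkstation" {
  name                = "allow-my-workstation"
  resource_group_name = azurerm_resource_group.adronsrg.name
  server_name         = azurerm_postgresql_server.logisticsserver.name
  start_ip_address    = "203.0.113.10"
  end_ip_address      = "203.0.113.10"
}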

The last and final step is setting up the Hasura instance to work with the Postgres Server and the designated database now available.

To set up the Hasura instance I decided to go with Azure Container Instances (the azurerm_container_group resource). It provides a relatively inexpensive, easier, and more concise way to run the server than standing up an entire VM or a full Kubernetes environment just to run a single instance.

The first section sets up a public IP address, which of course I’ll need to change as the application is developed, since I’ll need to provide an actual secured front end. But for now, to prove out the deployment, I’ve left it public, set up the DNS label, and set the OS type.

In the next section of this resource I outline the container details. The name of the container can be pretty much whatever you want; it’s your designator. The image, however, is specifically hasura/graphql-engine. I’ve set the CPU and memory pretty low, at 0.5 cores and 1.5 GB respectively, as I don’t suspect I’ll need a ton of horsepower just to test things out.

Next I open port 80 on the container. The environment variable HASURA_GRAPHQL_SERVER_PORT tells Hasura to listen on that port, and HASURA_GRAPHQL_ENABLE_CONSOLE displays the console there. Then finally there’s that wild concatenated, interpolated connection string that I also have set up as an output variable – again specifically for testing – HASURA_GRAPHQL_DATABASE_URL, tucked into secure_environment_variables so it isn’t exposed as plain container metadata.

resource "azurerm_container_group" "adronshasure" {
  name                = "adrons-hasura-logistics-data-layer"
  location            = azurerm_resource_group.adronsrg.location
  resource_group_name = azurerm_resource_group.adronsrg.name
  ip_address_type     = "public"
  dns_name_label      = "logisticsdatalayer"
  os_type             = "Linux"


  container {
    name   = "hasura-data-layer"
    image  = "hasura/graphql-engine"
    cpu    = "0.5"
    memory = "1.5"

    ports {
      port     = 80
      protocol = "TCP"
    }

    environment_variables = {
      HASURA_GRAPHQL_SERVER_PORT = 80
      HASURA_GRAPHQL_ENABLE_CONSOLE = true
    }
    secure_environment_variables = {
      HASURA_GRAPHQL_DATABASE_URL = "postgres://${var.username}%40${azurerm_postgresql_server.logisticsserver.name}:${var.password}@${azurerm_postgresql_server.logisticsserver.fqdn}:5432/${var.database}"
    }
  }

  tags = {
    environment = "datalayer"
  }
}

With all that setup it’s time to test. But first, just for clarity here’s the entire Terraform file contents.

provider "azurerm" {
  version = "=2.20.0"
  features {}
}

resource "azurerm_resource_group" "adronsrg" {
  name     = "adrons-rg"
  location = "westus2"
}

resource "azurerm_postgresql_server" "logisticsserver" {
  name = var.server
  location = azurerm_resource_group.adronsrg.location
  resource_group_name = azurerm_resource_group.adronsrg.name
  sku_name = "B_Gen5_2"

  storage_mb                   = 5120
  backup_retention_days        = 7
  geo_redundant_backup_enabled = false
  auto_grow_enabled            = true

  administrator_login          = var.username
  administrator_login_password = var.password
  version                      = "9.5"
  ssl_enforcement_enabled      = true
}

resource "azurerm_postgresql_database" "logisticsdb" {
  name                = var.database
  resource_group_name = azurerm_resource_group.adronsrg.name
  server_name         = azurerm_postgresql_server.logisticsserver.name
  charset             = "UTF8"
  collation           = "English_United States.1252"
}

resource "azurerm_postgresql_firewall_rule" "pgfirewallrule" {
  name                = "allow-azure-internal"
  resource_group_name = azurerm_resource_group.adronsrg.name
  server_name         = azurerm_postgresql_server.logisticsserver.name
  start_ip_address    = "0.0.0.0"
  end_ip_address      = "0.0.0.0"
}

resource "azurerm_container_group" "adronshasure" {
  name                = "adrons-hasura-logistics-data-layer"
  location            = azurerm_resource_group.adronsrg.location
  resource_group_name = azurerm_resource_group.adronsrg.name
  ip_address_type     = "public"
  dns_name_label      = "logisticsdatalayer"
  os_type             = "Linux"


  container {
    name   = "hasura-data-layer"
    image  = "hasura/graphql-engine"
    cpu    = "0.5"
    memory = "1.5"

    ports {
      port     = 80
      protocol = "TCP"
    }

    environment_variables = {
      HASURA_GRAPHQL_SERVER_PORT = 80
      HASURA_GRAPHQL_ENABLE_CONSOLE = true
    }
    secure_environment_variables = {
      HASURA_GRAPHQL_DATABASE_URL = "postgres://${var.username}%40${azurerm_postgresql_server.logisticsserver.name}:${var.password}@${azurerm_postgresql_server.logisticsserver.fqdn}:5432/${var.database}"
    }
  }

  tags = {
    environment = "datalayer"
  }
}

variable "database" {
  type = string
}

variable "server" {
  type = string
}

variable "username" {
  type = string
}

variable "password" {
  type = string
}

output "hasura_url" {
  value = "postgres://${var.username}%40${azurerm_postgresql_server.logisticsserver.name}:${var.password}@${azurerm_postgresql_server.logisticsserver.fqdn}:5432/${var.database}"
}

To run this, similarly to how I set up the dev environment, I’ve set up startup and shutdown scripts. The startup script, named prod-start.sh, has the following commands. Note that $PUSERNAME and $PPASSWORD are derived from environment variables, whereas the other two values are just inline. (On the very first run, you’ll also need a one-time terraform init in that folder before apply will work.)

cd terraform

terraform apply -auto-approve \
    -var "server=logisticscoresystemsdb" \
    -var "username=$PUSERNAME" \
    -var "password=$PPASSWORD" \
    -var "database=logistics"
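
The shutdown script, prod-stop.sh, is essentially the mirror image: the same variables handed to terraform destroy instead. A sketch of what mine looks like:

cd terraform

terraform destroy -auto-approve \
    -var "server=logisticscoresystemsdb" \
    -var "username=$PUSERNAME" \
    -var "password=$PPASSWORD" \
    -var "database=logistics"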

For the full Terraform file check out this gist: https://gist.github.com/Adron/6d7cb4be3a22429d0ff8c8bd360f3ce2

Executing that script gives me results that, if everything goes right, look similar to this.

./prod-start.sh 
azurerm_resource_group.adronsrg: Creating...
azurerm_resource_group.adronsrg: Creation complete after 1s [id=/subscriptions/77ad15ff-226a-4aa9-bef3-648597374f9c/resourceGroups/adrons-rg]
azurerm_postgresql_server.logisticsserver: Creating...
azurerm_postgresql_server.logisticsserver: Still creating... [10s elapsed]
azurerm_postgresql_server.logisticsserver: Still creating... [20s elapsed]


...and it continues.

Do note that the timing of this process will vary; it’s completely normal for it to take ~3 or more minutes. Once the server is done with its build process, a lot of the other activities start to take place very quickly. Once it’s all done, toward the end of the output I get my hasura_url output variable, so I can confirm that it is indeed put together correctly! Now that this works, I can take the next steps: remove that output variable, start to tighten security, and so on, which I’ll detail in a future blog post once more of the application is built.

... other output here ...


azurerm_container_group.adronshasure: Still creating... [40s elapsed]
azurerm_postgresql_database.logisticsdb: Still creating... [40s elapsed]
azurerm_postgresql_database.logisticsdb: Still creating... [50s elapsed]
azurerm_container_group.adronshasure: Still creating... [50s elapsed]
azurerm_postgresql_database.logisticsdb: Creation complete after 51s [id=/subscriptions/77ad15ff-226a-4aa9-bef3-648597374f9c/resourceGroups/adrons-rg/providers/Microsoft.DBforPostgreSQL/servers/logisticscoresystemsdb/databases/logistics]
azurerm_container_group.adronshasure: Still creating... [1m0s elapsed]
azurerm_container_group.adronshasure: Creation complete after 1m4s [id=/subscriptions/77ad15ff-226a-4aa9-bef3-648597374f9c/resourceGroups/adrons-rg/providers/Microsoft.ContainerInstance/containerGroups/adrons-hasura-logistics-data-layer]

Apply complete! Resources: 5 added, 0 changed, 0 destroyed.

Outputs:

hasura_url = postgres://postgres%40logisticscoresystemsdb:theSecretPassword!@logisticscoresystemsdb.postgres.database.azure.com:5432/logistics

Now if I navigate over to logisticsdatalayer.westus2.azurecontainer.io I can view the Hasura console! But where in the world did this fully qualified domain name (FQDN) come from? Well, the quickest way to find it is to navigate to the Azure portal and take a look at the details page of the container itself. In the upper right, the FQDN is shown along with the IP that has been assigned to the container!
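
If you’d rather skip the portal, the Azure CLI can also hand over the FQDN directly, assuming the resource names from the Terraform above:

az container show \
  --resource-group adrons-rg \
  --name adrons-hasura-logistics-data-layer \
  --query ipAddress.fqdn --output tsv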

Navigating to that FQDN URI will bring up the Hasura console!

Next Steps

From here I’ll take up next steps in a subsequent post. I’ll get the container secured, line up the user interface or CLI or whatever application I build with the API endpoints, and more!

Sign Up for Thrashing Code

For JavaScript, Go, Python, Terraform, and more infrastructure, web dev, and coding in general, I stream regularly on Twitch at https://twitch.tv/adronhall and post the VODs to YouTube, along with entirely new tech and metal content, at https://youtube.com/c/ThrashingCode.

Beyond CRUD n’ Cruft Data-Modeling

I dig through a lot of internet results and blog entries that show CRUD data modeling all the time. A lot of these blog entries and documentation are pretty solid. Unfortunately, rarely do we end up with data that is accurately or precisely modeled the way it ought to be or the way we would ideally use it. In this post I’m going to take some sample elements of data and model them out for various uses, then reconstitute that data into different structures for various uses within microservices, loading, and reading, in both normalized and denormalized form.

The Domain: Railroad Systems & Services

The domain I chose for this particular example is the entire global spectrum of rail services. Imagine, if you would, a system that can track all the trains in the world, or even just the trains in a particular area of the world, like the United States. In the United States the trains can be broken down into logical structures of data for various things like freight trains and passenger trains: trains operated by a particular operator like Amtrak, Union Pacific, or Norfolk Southern, and the respective consists that each train is made up of. Let’s get into some particular word definitions to fully detail this domain.

TRIP REPORT: 5 Items for Your Presentation Checklist and My Trip Log for DevOps Days YVR

A Few Trip Details

Recently I spoke (video below, plus others) at DevOps Days YVR. YVR is the airport code for Vancouver BC, thus the use of YVR in the DevOps Days name and Twitter handle (@DevOpsDaysYvr). I always love to travel to Vancouver BC for a whole multitude of reasons. The city is beautiful, clean, and has everything from shopping to foodie options all over the place. It’s a truly modern and, by US standards, futuristic city with a number of very effective transport options beyond the myopic use of cars that America is overly dependent on. If you like biking, this is the preeminent city to bike in of all cities in North America. Nothing even comes close. Not Portland, definitely not San Francisco, and don’t even get me started on the trash fire that is Los Angeles and its heroin-like addiction to sitting in car traffic. If hanging out in the city is a bit much, one can always get out to the mountains or just take a stroll into one of the many parks to get away from it. Overall, Vancouver is an amazing place, and any excuse I have for crossing the border and getting a great dose of Canadian camaraderie and jolliness I’ll take in a heartbeat!

Of course, all the wonderfulness is great, but getting into the nitty gritty of the tech scene is also fun. Vancouver has a great tech scene, albeit small compared to Seattle’s, but that’s a disingenuous comparison anyway. It’d be like comparing Seattle’s aeronautics scene with any other city’s except maybe where Airbus is located; it’d just be nuts. But compared to most other cities, Vancouver has a pretty solid standing among the coding community.

Pacific Central Station in Vancouver

I took the train up, as I always do, for a number of reasons. I can roll my bike on, and upon arrival the staff just hands it back to me and off I go. Considering my origination and departure points are both bike friendly for my needs, this is my default. In the end, it actually ends up faster than driving and needing to pack or rack the bike; I couldn’t fly and enjoy these amenities, and the intercity Bolt Buses are just yucky anyway. It’s like flying except you’re stuck on the ground, often in a smaller seat than an airplane’s. I’m not sure why I’d ever want to torture myself like that. The other huge benefit is that it’s extremely easy to get a lot done on the train, and one also can’t beat the views en route!

A Small Rant

Once I arrived I checked into my hotel. I realized this trip I must have checked into the singularly clueless hotel in the whole city, with strange, myopic, draconian, and stupefying bicycle policies. The Hyatt Regency, which I’m clearly not going to be staying at again, wanted to “valet” my bike with a tag, mark it, and put it into some basement garage “room”. I wouldn’t easily have access to it and would have to wait for them to retrieve it if I needed it for any reason. Anyway, this was an extremely odd scenario, and a lot more work on their part, that just struck me as blatantly behind the times – especially for Vancouver. People and businesses in this city should, and do, know better than to have such nonsense in place. I guess the Hyatt Regency places itself above and removed from the people in some way. Oh well, lesson learned; I’ll stay at one of the other zillion super awesome hotels (or with friends) instead.

TL;DR: Avoid the Hyatt Regency for future stays; they have strange policies around independence of movement and storage of personal items.

DevOps Days YVR

Alright, on to the conference. The conference was great; my only issue was my own fault, in that I managed to miss the first day. I should have gotten there earlier and also planned to stay a little longer. Next time I’m going to make a more official schedule of brunch, dinner, drinks, and maybe a meetup or two. On day two of the event, I was up to speak first thing in the day, a 9:15 am slot! This was the first time I’d ever had a speaking slot this early in the day. At that point, my preparations complete and my checklist checked twice, I was ready to present. In doing so, I decided a list was in order, which I’ve put together below.

The Presentation Checklist

  1. Laptop(s) – These days I tend to bring two laptops when I’m presenting. One is my main workstation running Linux and the other is an older MacBook Pro. The reasoning is simple: depending on the projector and connection options, the MacBook Pro is easily – with its HDMI connection – the most standard setup for presenting. It works more often than any other machine I’ve ever had and is far more consistent in getting resolutions correct for presentations and projectors. It is, in essence, the ultimate backup. I use the Linux machine when I can, as it’s more than capable, but some projectors aren’t up to it.
  2. Connectors – I bring the regular assortment of connectors to ensure I can get a feed out from the XPS 15 running Linux or the MacBook to HDMI and VGA. This basically covers every modern projector and everything I’ve seen built for the last decade or so. That equates to 2 dongles: one for the Mac (Thunderbolt to VGA) and one for the XPS 15 (USB-C/USB to VGA).
  3. Slide Deck – I aim to have several formats of the presentation deck available outside of some online format like Google Slides, such as a PDF I can flip through or a local presentation app I can use. This way, regardless of the connection, I’ll have the slide deck ready to go.
  4. Presentation Page – This is a page I set up for slides, video, and whatever other collateral comes together from my efforts and from the conference organizers. For the particular DevOps Days YVR talk I set up the page “Go for Venomous Database Reliability“.
  5. Be Present – Be sure to be rested up for the day of presenting and a day of interactions. But don’t just come in like a military insertion assault and then leave. That sucks for attendees; stay for the day. Talk to people. Learn about what they’re working on. Chat about solutions in both directions. Be part of the community.

With the checklist done, here’s my talk from the event: “Architecture Guidance for Venomous Database Reliability Engineering”, a kind of library checklist for development and database reliability in Go.

After the conference I spent the day catching up with some friends. Included in that was the chance to hang out with Alexandra of the Advanced Tech Podcast. We got some food near the office and plotted out a podcast too, which you can give a listen to at “Adron Hall – Coder, Engineer, Architect“. We tackled a very wide range of topics, tech related and otherwise; toward the end we even got into discussions around livability, urban planning, city council meetings, and the whole life of an advocate in the urban realm in America.

It was a great weekend of talking tech and enjoying the beauty, good grub, and company in Vancouver BC. Over the next week or three I intend to post videos from the conference with some succinct write-ups on the various talks – available via the DevOpsDays Vancouver 2019 Playlist. For now though, time for a little disconnect and the train ride home. Enjoy the scenery, cheers! \m/

Conflicts of Building a Real World Example Application Starter Kit

Lately I’ve dug through a number of JavaScript user interface frameworks, reading a number of posts and building a more informed opinion, all to decide which one I should use for a sample application for some starter kits. One post I read hit home that the sample does need to be a bit more complex than a todo app.

However, I’m still starting with a todo app anyway, but it’s going to turn into something much more than a mere todo app. In this post I’m going to write up some of those larger plans and what complexities lie in wait – dragons are indeed there – for this more extensive real world app.

Modernizing Real World US Passenger Rail Ticket Sales!

Ok, I picked this topic since it is one of the things I find frustrating in the United States. The passenger rail systems, pretty much all of them, are barely better than those of many developing countries, let alone other developed nations. One of the elements the United States falls far behind on is an effective, efficient, accurate, and useful ticketing and seat assignment system. Let’s talk about this particular problem for a moment and you’ll start to visualize the problems that exist with the current system.

The Problem(s): Train Seating Options

Siemens Charger engine waiting with Talgo train.

Getting people on and off of a transport system like a train, airplane, ferry, or other mode of transport isn’t a simple process. However, many times it doesn’t have to be as complex, wrought with error, confusing, or disarrayed as it often is in the United States. Let’s step back and focus on one particular set of trains: the four that leave from King Street Station in Seattle, Washington on an almost daily basis.

  1. Sound Transit Sounder – [Stations] [Fares] [Wikipedia] This is a commuter route that has two lines:
    1. North Line – Seattle to Everett.
    2. South Line – Seattle to Tacoma, then onward to Lakewood.
  2. Amtrak Cascades – [Wikipedia] Seattle is one of the major stops on the Cascades route, which starts in Eugene down in Oregon and traverses all the way into Canada to Vancouver.
  3. Amtrak Empire Builder – [Wikipedia] This is one of the two Superliner cross-country overnight trains that leave Seattle. It connects with a sister train from Portland in Spokane every day, then the two combine and travel all the way to Chicago!
  4. Amtrak Coast Starlight – [Wikipedia] This is the other Superliner cross-country overnight train. It departs from Seattle, travels south with a number of stops, and eventually ends in Los Angeles.

These four trains use specific train equipment with particular accommodations for ticket sales.

One of the Amtrak Superliner Coach Car’s seating layout. (Images found here)

The Sounder provides tickets via the Sound Transit system in the area; it’s a relatively cheap, non-reserved-seat, heavily used train. Often there’s standing room only. It’s one of those things where, if one could purchase a ticket and know whether they’re getting a seat, or whether the train is full, it would encourage or discourage use accordingly. Currently, you buy a ticket and just get on. Rarely are tickets even checked; there is no gated entry. It’s basically a free-for-all.

The Amtrak Cascades is a reserved-seat system. You purchase a ticket with the contract agreement that you will be provided a seat – either business class or regular – upon boarding. Emphasis on upon boarding, as this can cause great confusion when entering the station and attempting to determine how to pick up a seat assignment even though you’ve already purchased a ticket. It adds time to boarding, requires the train to sit waiting longer, and means passengers have to arrive much earlier than the train departure. Albeit, just for context, this earlier arrival (~20-30 minutes before) is nothing compared to the horrors of airports (2-hour suggested arrival before departure); it’s still unnecessary if modern systems were used to provide a streamlined and more efficient boarding process.

Amtrak Empire Builder

The Amtrak Empire Builder and Coast Starlight are currently an interesting mix. Both trains have sleeping accommodations that come with a reserved room number before boarding. A very efficient process indeed, something to aim for: since one knows the car number and room number, one could theoretically just board without even being guided. The rest of the seats however, some 200-300 or more of them depending on the train, are reserved seats, albeit ones where the seat assignment isn’t received until arrival at the station. Again, causing unnecessary chaos.

The Problem(s): Technology Deeper Dive

Problem: Passenger Navigation to Seat Reservation

Amtrak Cascades Bistro

Every single one of the trains listed above – the Amtrak Empire Builder, Amtrak Coast Starlight, Amtrak Cascades, and Sound Transit Sounder – has some similar characteristics that would make it cheap and relatively easy to implement a ticketing and seat reservation system. In all of the train equipment, whether the Sounder’s Bombardier cars, the Superliners, or the Amtrak Cascades’ Talgo sets, there are seat numbers and car numbers. This provides us a core basis from which to work, to make all of this processing much easier.

At each station where these trains stop, each car of each train stops at a particular point – or could be made to stop at a particular point. The Sounder trains, for example, all have floor mats at the station that read “Welcome Aboard”! This is another element we could use to guide a particular seat reservation: automating the process of not just assigning a seat, but providing the information on each ticket for where and exactly when each passenger should arrive at a particular point at the station.

Since the cars and stations all have known characteristics about where to be, where the train will arrive and depart from, and what car number and door position is where, this can all be automated per train. This is a repeatable process, something that easily meets the exact definition of why we build computer systems and automate things with computer systems!

Problem: Equipment Changes, Modifiable Trains

Sometimes I’ve had conversations about what might change within the system. Almost all changes within a rail system are well known, from a disaster all the way to a simple everyday equipment change. For example, the arriving train may have an extra coach car or sleeper car on the Coast Starlight for some reason. Since we can build a system modeled around the specific vehicles, and the vehicle numbers on a train can easily be set, these changes can extrapolate out to tickets so they can be accurately reassigned by a computer the day of. Changing equipment may take multiple minutes in the rail yard, but in the computer it’s a few keystrokes and it’s done. All tickets reassigned, everything rebalanced; it’s almost as magical as a distributed database.

Problem: Common Concurrency, Purchasing, and Related Issues

There are also a number of issues a proper ticketing and reservation system would have to cover, such as managing multiple people attempting to buy the same seat at relatively the same time. A locking and concurrency mechanism will be needed; this is something that’s been solved before, so appropriate planning around it will solve the issue.

There are of course timing issues too: once a ticket is locked, eventing within the system should unlock it appropriately. These event-based timers will be an interesting challenge too. Solved already, but fun that they’ll need solving again specifically for this system! A rough sketch of the idea follows below.
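
To make that shape concrete, here’s a rough Postgres sketch of a seat hold with an expiry; the table and column names are entirely hypothetical and this glosses over plenty of real-world detail:

BEGIN;

-- Lock the seat row so concurrent buyers queue up behind this transaction.
SELECT status, hold_expires_at
FROM seat_inventory
WHERE train_id = 501 AND car_number = 3 AND seat_number = '14A'
FOR UPDATE;

-- Place a hold only if the seat is available or a previous hold expired.
UPDATE seat_inventory
SET status = 'held', hold_expires_at = now() + interval '10 minutes'
WHERE train_id = 501 AND car_number = 3 AND seat_number = '14A'
  AND (status = 'available' OR hold_expires_at < now());

COMMIT;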

Problem, or Feature: “See a Mountain”?

Aerial view of Mount Rainier

Some other things I’ve pondered include selling some seats as choice preferences. For example, the Empire Builder, Coast Starlight, and Cascades trains each have specific views that are easier or harder to see depending on the side of the train the accommodations are on. For instance, if you’re facing west on the Coast Starlight you get all of the ocean views in southern California; if you’re on the east side, you get views of all the mountains like Rainier (see above picture!) and even Shasta if there is a full moon. Depending on these views and related characteristics, I’d happily pay a few bucks more to ensure I get a specific assignment or get to pick a specific assignment, so why not offer the ability to choose a seat for a specific fare?

The Puget Sound, traveling north out of Seattle on the Amtrak Cascades or Sound Transit Sounder north line.

Summary & Next Steps

Summary – This is post one of many about the very distributed nature of purchasing tickets for the trains into and out of the city. Compared with my todo app, this will definitely provide a very real-world application option indeed! As soon as I wrap up the initial todo app samples, just to get started and provide details on how to get going, I’m going to move on to building a real, real-world application sample; so real that it could be implemented by Sound Transit, Brightline, Virgin Rail, SNCF’s TGV, Germany’s ICE, or even good ole’ Amtrak here in the United States.

Next Steps – Next up I’m going to finish up the todo applications, with the notion that they provide some starting points for people but also for this more complex real world application. I’ll also add some more details and thoughts, and would love to converse, discuss, contributions, or co-hack on this project. Maybe you’ll join me, onward, and may you enjoy this flanged wheel ride and code slinging adventure!