Top 3 Refactors for My Hasura GraphQL API Terraform Deploy on Azure

On the 9th of September I posted "Setup Postgres, and GraphQL API with Hasura on Azure". In that post I noted a few refactorings that I wanted to make. The following are the top 3 refactorings that make the project in that repo easier to use!

1 Changed the Port Used to a Variable

In the docker-compose and the Terraform automation, the port used was the default for each particular type of deployment. This led to the production and development ports being different. It's much easier, and more logical, for the port to be the same on both dev and production, at least while we have the console available on the production server (it really should be disabled there, more on that in a subsequent post). Here are the details of that change.

In the docker-compose file, under graphql-engine, I ensured the ports were set to the specific port mapping I'd want. For this, the local dev version, I wanted to stick with port 8080, so I left this as 8080:8080.

version: '3.6'
services:
  postgres:
    image: library/postgres:12
    restart: always
    environment:
      POSTGRES_PASSWORD: ${PPASSWORD}
    ports:
      - 5432:5432
  graphql-engine:
    image: hasura/graphql-engine:v1.3.3
    ports:
      - "8080:8080"
    depends_on:
      - "postgres"
    restart: always
    environment:
      HASURA_GRAPHQL_DATABASE_URL: postgres://postgres:${PPASSWORD}@postgres:5432/logistics
      HASURA_GRAPHQL_ENABLE_CONSOLE: "true"
volumes:
  db_data:

For the production version, or whichever version this may be in your build, I added a Terraform variable called apiport. This variable is passed in via the script files I use to execute the Terraform.
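
The variable declaration itself isn't shown in the snippets below, but a minimal declaration would look something like this (a sketch; whether you use a number or string type, add a default, or add a description is up to you).

variable "apiport" {
  type = number
}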

The script file for launching the environment now looks like this.

cd terraform
terraform apply -auto-approve \
-var 'server=logisticscoresystemsdb' \
-var 'username='$PUSERNAME'' \
-var 'password='$PPASSWORD'' \
-var 'database=logistics' \
-var 'apiport=8080'

The destroy script now looks like this.

cd terraform
terraform destroy \
-var 'server="logisticscoresystemsdb"' \
-var 'username='$PUSERNAME'' \
-var 'password='$PPASSWORD'' \
-var 'database="logistics"' \
-var 'apiport=8080'

There are then three additional sections in the Terraform file; the first is here, and the next I'll talk about in refactor 2 below. The changes in the resource are shown below: in the container ports section and the environment_variables section the port is now simply referenced as var.apiport.

resource "azurerm_container_group" "adronshasure" {
name = "adrons-hasura-logistics-data-layer"
location = azurerm_resource_group.adronsrg.location
resource_group_name = azurerm_resource_group.adronsrg.name
ip_address_type = "public"
dns_name_label = "logisticsdatalayer"
os_type = "Linux"
  container {
name = "hasura-data-layer"
image = "hasura/graphql-engine:v1.3.2"
cpu = "0.5"
memory = "1.5"
    ports {
port = var.apiport
protocol = "TCP"
}
    environment_variables = {
HASURA_GRAPHQL_SERVER_PORT = var.apiport
HASURA_GRAPHQL_ENABLE_CONSOLE = true
}
secure_environment_variables = {
HASURA_GRAPHQL_DATABASE_URL = "postgres://${var.username}%40${azurerm_postgresql_server.logisticsserver.name}:${var.password}@${azurerm_postgresql_server.logisticsserver.fqdn}:5432/${var.database}"
}
}
  tags = {
environment = "datalayer"
}
}

With that I now have the port standardized across dev and prod as 8080. Of course, it could be another port; that's just the one I decided to go with.

2 Get the Fully Qualified Domain Name (FQDN) via a Terraform Output Variable

One thing I kept needing to do every time Terraform got production up and going was navigate over to Azure and find the FQDN to open the console at (or make API calls, etc.). To make this easier, since I'm obviously running the script, I added an output variable that concatenates the port onto the interpolated FQDN from the results of execution. The output variable looks like this.

output "hasura_uri_path" {
value = "${azurerm_container_group.adronshasure.fqdn}:${var.apiport}"
}

Again, you'll notice I have var.apiport concatenated there at the end of the value. With that, at the end of execution it returns the exact FQDN that I need to navigate to for the Hasura Console!
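
If I need that value again later, without re-running an apply, the Terraform CLI can read it back out of the state from the same directory.

terraform output hasura_uri_path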

3 Have Terraform Create the Local “Dev” Database on the Postgres Server

I started working with what I had from the previous post "Setup Postgres, and GraphQL API with Hasura on Azure", and realized I had made a mistake. I wasn't using a database on the database server that actually had the same name in both environments. Dev was using the default database and prod was using a newly created, named database! Egads, this could cause problems down the road, so I added some Terraform just for creating a new Postgres database for the local deployment. Everything basically stays the same; a new part was just added to the local script to execute this Terraform along with the docker-compose command.

First, the Terraform for creating a default logistics database.

terraform {
  required_providers {
    postgresql = {
      source = "cyrilgdn/postgresql"
    }
  }
  required_version = ">= 0.13"
}

provider "postgresql" {
  host            = "localhost"
  port            = 5432
  username        = var.username
  password        = var.password
  sslmode         = "disable"
  connect_timeout = 15
}

resource "postgresql_database" "db" {
  name              = var.database
  owner             = "postgres"
  lc_collate        = "C"
  connection_limit  = -1
  allow_connections = true
}

variable "database" {
  type = string
}

variable "server" {
  type = string
}

variable "username" {
  type = string
}

variable "password" {
  type = string
}

Now here's the script as I set it up to call it.

docker-compose up -d
terraform init
sleep 1
terraform apply -auto-approve \
-var 'server=logisticscoresystemsdb' \
-var 'username='$PUSERNAME'' \
-var 'password='$PPASSWORD'' \
-var 'database=logistics'

There are more refactorings that I made, but these were the top 3 I did right away! Now my infrastructure as code is easier to use, the scripts are a little more seamless, and everything wraps into a good development workflow a bit better.

For JavaScript, Go, Python, Terraform, and more infrastructure, web dev, and coding in general, I stream regularly on Twitch at https://twitch.tv/thrashingcode and post the VODs to YouTube along with entirely new tech and metal content at https://youtube.com/ThrashingCode.

Pragmatic Database Schema Naming Conventions, Practices, and Patterns

In this post I've put together some of the naming conventions, rules, and ideas that I tend to follow when creating database schemas to work with. This also applies to schema-less databases, distributed systems databases, graph, time series, or whatever else I am working with. It's always helpful to have some conventions to work with, and the descriptions and ideas in this post are a solid starting point.

What We’re Working With, Some of The Database Rules

Let's start with a SQL Server rule. Table names can be at most 128 characters. To force SQL Server to accept non-standard table names one can use brackets, and then in scripts those names have to be single quoted.
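
As a hedged sketch of what that looks like in practice (the table name here is made up):

-- Brackets let a non-standard name through the parser...
CREATE TABLE [Order Details] (
  [Id] int PRIMARY KEY
);
-- ...and utility procedures then take that name as a single quoted string.
EXEC sp_help 'Order Details';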

There are many other rules about naming things in SQL Server. But let’s talk about some other database specific rules for other databases.

Postgres names fold to lowercase, versus uppercase in many other databases. Throw in some double quotes, however, and you can use names like MyTable, MYTABLE, and mytable. Without double quotes these would all be the same; add the double quotes around those names and "MYTABLE" becomes different than "MyTable" and different than "mytable".
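
A quick sketch of that folding behavior (the table name is made up):

CREATE TABLE MyTable (Id int);   -- unquoted, so Postgres folds the name to mytable
SELECT * FROM MYTABLE;           -- works, this also folds to mytable
SELECT * FROM "MyTable";         -- fails, the relation "MyTable" was never created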

SQL identifiers and key words in Postgres must begin with a letter (a-z, which includes letters with diacritical marks and non-Latin letters) or an underscore. After the first character an identifier can contain letters, underscores, digits, or dollar signs. If an identifier is double quoted you can also use key words as identifiers, albeit I would very strongly recommend against this practice.

As these examples show, the rules are just different enough from one database to another that it is often very helpful to use a naming convention that works across databases. I'm often working with a variety of databases, so this post covers naming convention ideas, and the respective patterns and practices around them, that would work with every conceivable database I know to exist!

Table, Column, Tuple, or Related Naming Conventions

  • Table, column, and related object names should contain only letters and numbers, with numbers used in the body of the name and not as the leading character, and no underscores, spaces, or special characters. In summary, use a case scheme like Camel or Pascal Case, but do not use Snake or Kebab Case.
  • Use meaningful names from the business or organizational domain being modeled, such as "BankingUsers", "Transactions", "railroads", or "railroad_Systems".
  • Use single-word names if at all possible, only moving to compound word naming if absolutely necessary. Ideally names would be single words like "User", "Transactions", "railroad", or "system", and exclude compound names like "railroadSystem" until one is needed to prevent confusion or naming collisions.
  • Columns that are primary or foreign keys should be prefaced with PK_ and FK_ respectively, or, in my moderately humble opinion, just PK or FK using Camel or Pascal Case. For other metadata, indexes, and related names, use a respective prefix or postfix convention.
  • It's also a good idea to decide between plural and singular table names, and be sure to pick one or the other so that frameworks, Object-Relational Mappers (ORMs), and other tools can effectively name things when used. For example, when table names are singular, many ORM frameworks generating code would take a singular table name like Customer, name the collection Customers, and name the objects Customer.

Schema/Domain Naming Conventions

In many databases there are additional organizational and related structures that help us to setup tables, functions, stored procedures, compiled SQL/queries, and other objects in groupings. Naming these objects accordingly is easiest by following the same convention as the table naming convention.

For example, if the table naming convention follows Camel Case then continue that:

Table Names for Returns in an E-commerce Domain:

  • Purchase
  • Return
  • Shipped

Table Names for Pricing in an E-commerce Domain:

  • Price
  • Cost

Table Names for Core Tables, for multiple schemas, within the E-commerce Domain:

  • Item

This could be split out to three schemas:

  • Core
  • Prices
  • Returns

The names would then look like this:

Core.Item
Prices.Purchase
Prices.Return
Prices.Shipped
Returns.Purchase
Returns.Return
Returns.ShippedItem
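
As a minimal sketch of what one of those looks like in Postgres (the Item columns here are assumptions, not part of the example above):

CREATE SCHEMA "Core";
CREATE TABLE "Core"."Item" (
  "Id" uuid PRIMARY KEY,
  "Stamp" timestamp,
  "Name" text
);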

An Example

Here's an example I put together with some naming conventions I've found useful, which reduce confusion and manage to tell a reasonable amount of information about the domain space and schema of the database without conflicts.

The Database Schema

Throughout this schema I've used Camel Casing, with mostly single word column and single word table names. Keeping it simple, Id and Stamp are some of the recurring columns that are useful for retrieval, relationships, and determining origins of data over time. In production settings there are default columns that are often needed, and one can rest assured a time stamp is most likely one of those that is needed everywhere for audits!

For my foreign key columns, one can determine the relationship by the name of the column itself. For example, in the Connection table there are two foreign keys, one to the Action table and one to the Source table, pointing to the respective Id columns in each of those tables. The one exception is NoteJot, which I named that way because Note tends to conflict in certain systems. In that table I've added a relationship, for recursive data, back to itself with the use of the NoteId foreign key pointing back to the table's primary key Id.

The diagram above, without any further details, can be used to create the schema with SQL code that would look like the following.

CREATE TABLE "Source" (
  "Id" uuid PRIMARY KEY,
  "Stamp" timestamp,
  "Name" text,
  "Uri" text,
  "Details" text
);

CREATE TABLE "SourceNotes" (
  "SourceId" uuid,
  "NotesId" uuid,
  "Details" text,
  "Stamp" timestamp
);

CREATE TABLE "NoteJot" (
  "Id" uuid PRIMARY KEY,
  "Stamp" timestamp,
  "NoteId" uuid,
  "Details" text
);

CREATE TABLE "Action" (
  "Id" uuid PRIMARY KEY,
  "Stamp" timestamp,
  "Action" json
);

CREATE TABLE "Connection" (
  "Id" uuid PRIMARY KEY,
  "Stamp" timestamp,
  "ActionId" uuid,
  "SourceId" uuid
);

CREATE TABLE "Formatter" (
  "Id" uuid PRIMARY KEY,
  "Stamp" timestamp,
  "ConnectionId" uuid,
  "FormatterMap" json
);

CREATE TABLE "Schema" (
  "Id" uuid PRIMARY KEY,
  "Stamp" timestamp,
  "ConnectionId" uuid,
  "SchemaMap" json
);

ALTER TABLE "SourceNotes" ADD FOREIGN KEY ("SourceId") REFERENCES "Source" ("Id");

ALTER TABLE "SourceNotes" ADD FOREIGN KEY ("NotesId") REFERENCES "NoteJot" ("Id");

ALTER TABLE "NoteJot" ADD FOREIGN KEY ("NoteId") REFERENCES "NoteJot" ("Id");

ALTER TABLE "Connection" ADD FOREIGN KEY ("ActionId") REFERENCES "Action" ("Id");

ALTER TABLE "Connection" ADD FOREIGN KEY ("SourceId") REFERENCES "Source" ("Id");

ALTER TABLE "Formatter" ADD FOREIGN KEY ("ConnectionId") REFERENCES "Connection" ("Id");

ALTER TABLE "Schema" ADD FOREIGN KEY ("ConnectionId") REFERENCES "Connection" ("Id");

We have the tables and keys, all following the Camel Case standard. Much of this could, if you preferred, be switched to Pascal Case; switching to Snake or Kebab Case, however, would cause a number of issues depending on the database.

Do NOT Use These

Unless you want to spend tons of time with errors, debugging, and related issues skip these practices.

  • Generally, throughout databases it is best to skip Snake or Kebab Case. In various situations they're fine, but overall they're likely to run into conflicts, naming limitations, or other concerns. It's best to just skip them and remove the concern.
  • When you're using data that has variance in how it is represented, do not use a multitude of formats. For dates, location, geographic, or related data it is best to stick to a particular format that is repeated throughout the database. If mixed representations are used, for example dates in a MMDDYYYY format in one place and a DD-MM-YYYY format in another, the methods, functions, or other elements that consume or process this data will need to account for it, creating more time consuming and error prone code. A small sketch of the alternative follows this list.
  • Once you pick a naming scheme for any particular database object type, stick to it. For example, if you go with Camel Casing for your tables, use Camel Casing for all of the tables and don't switch to Pascal for some of them. It is fine, however, to switch conventions across object types, for example naming the tables Pascal Cased but using Camel Case for indexes. Then, when reviewing a list of objects irrespective of kind, one can differentiate merely by the convention used.
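
As a hedged sketch of the date point above (the table and column names are made up), store the value in a proper timestamp column and stick to one textual format, such as ISO 8601, wherever text is unavoidable:

CREATE TABLE "Event" (
  "Id" uuid PRIMARY KEY,
  "Stamp" timestamp,
  "OccurredAt" timestamp
);
-- One consistent textual representation everywhere input or output happens.
INSERT INTO "Event" VALUES ('1b671a64-40d5-491e-99b0-da01ff1f3341', now(), '2021-02-21 10:30:00');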

Summary & Caveat

Working up a set of patterns, practices, rules, and general conventions to work with on the database side of things is immensely useful. It helps the Database Administrators, Data Scientists, Software Developers, and others that need to utilize the database, and it helps everyone communicate with each other in reference to the database and data.


That's it for the database schema topic for now. However, if you're interested in joining me for more database and data oriented things, language stack setup, software development, patterns, practices, and more, in addition to writing some JavaScript, Go, Python, Terraform, and infrastructure, web dev, and all sorts of coding, I stream regularly on Twitch at https://twitch.tv/thrashingcode and post the VODs to YouTube along with entirely new tech and metal content at https://youtube.com/ThrashingCode. Feel free to check out a coding session, ask questions, interject, or just come and enjoy the tunes!

Hasura CLI Installation & Notes

This post just elaborates on the existing documentation here with a few extra notes and details about installation. This can be helpful when determining how to install, deploy, or use the Hasura CLI for development, continuous integration, or continuous deployment purposes.

Installation

There are three primary operating system binary installations: Windows, MacOS, and Linux.

Linux

In the shell run the following curl command to download and install the binary.

curl -L https://github.com/hasura/graphql-engine/raw/stable/cli/get.sh | bash

This command installs the binary to /usr/local/bin, but if you'd like to install it to another location you can add the variable INSTALL_PATH and set it to the path that you want.

curl -L https://github.com/hasura/graphql-engine/raw/stable/cli/get.sh | INSTALL_PATH=$HOME/bin bash

What this script does, in short, follows these steps: the latest version is determined, then downloaded with curl, and then execution permissions are assigned to the binary. It also does some checks to determine OS version, type, and distro, whether the CLI already exists, and other validations. To download the binary, and the source if you'd want it, for a particular version of the CLI, check out the manual steps listed later in this post.

MacOS

For MacOS the same steps are followed as Linux, as the installation script steps through the same procedures.

curl -L https://github.com/hasura/graphql-engine/raw/stable/cli/get.sh | bash

As with the previous Linux installation, the INSTALL_PATH variable can also be set if you want to install the binary to another path.

Windows

Windows is a different installation, as one might imagine. The executable (cli-hasura-windows-amd64.exe) is available on the releases page. This leaves it up to you to determine how exactly you want this executable to be called. It's ideal, in my opinion, to download it and then put it into a directory that you'll map into the path. You'll also want to rename the executable from cli-hasura-windows-amd64.exe to hasura.exe, if only because it's easier to type and it'll then match the general examples provided in the docs.

To set up a path on Windows to point to the directory where you have the executable, you'll want to open up the environment variables dialog. That would be following Start > System > System Settings > Environment Variables. Scroll down until the PATH is viewable. Click the edit button to edit that path. Set it, and be sure to set it up like c:\pathWhatevsAlreadyHere;c:\newPath\directory\where\hasura\executable\is\. Save that, launch a new console, and that new console should have the executable available.

Manual Download & Installation

You can also navigate directly to the releases page and get the CLI at https://github.com/hasura/graphql-engine/releases. All of the binaries are compiled and ready for download, along with source code zip files for the particular builds of those binaries.
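
As a rough sketch of doing that from a Linux shell (the version tag and asset name here follow the release naming at the time of writing, so treat them as assumptions and grab whichever release you actually need):

curl -L -o hasura https://github.com/hasura/graphql-engine/releases/download/v1.3.3/cli-hasura-linux-amd64
chmod +x hasura
sudo mv hasura /usr/local/bin/
hasura version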

Installation via npm

The CLI is available via an npm package too. The package that wraps the executable is independently maintained by members of the community. If you want to pin a set Hasura CLI version to a project, using npm is a great way to do so.

For example, if you want to install the Hasura CLI in your project as a development dependency, use the following command to get version 1.3.0.

npm install --save-dev hasura-cli@1.3.0

For version 1.3.1 it would be npm install --save-dev hasura-cli@1.3.1 for example.

The dev dependencies in the package.json file of your project would then look like the following.

"devDependencies": {
  "hasura-cli": "^1.3.0"
}
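
When it's installed as a dev dependency like this, one common way to invoke it, rather than installing anything globally, is through npx (a usage sketch):

npx hasura version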

Another way you can install the CLI with npm is to install it globally, using the same format but swapping --save-dev for --global. The following installs the latest version. You can also add @version to the end to get a specific version installed globally, just as shown with the dev install previously.

npm install --global hasura-cli

Notes

Using the npm option is great if you're installing, using, writing, or otherwise working with JavaScript, have Node.js installed on the dev and other machines, and need to have the CLI available on those particular machine instances. If not, I'd suggest installing via one of the binary options, especially if you're creating something like a slimmed down Alpine Linux container to automate some Hasura CLI executions during a build process or something. There is a lot of variance in how you'd want to install and use the CLI, beyond just installing it to run the commands manually.

If you're curious about any particular installation scenarios, ask me @Adron and I'll answer there and elaborate here on this post!

Happy Hasura CLI Hacking! 🤘


My Top 2 Reading List for Go & Data

Right now I’ve got a couple books in queue.

“Black Hat Go” Go Programming for Hackers and Pentesters by Tom Steele, Chris Patten, and Dan Kottmann.


I picked this book up after a good look at the index. There is a basic intro to Go at the beginning of the book to cover language fundamentals, but immediately after that it dives into the meat of the topic: understanding the TCP handshake, TCP itself, writing a scanner, and a proxy. There are then some basics about HTTP servers, routers, and middleware in chapters 3 and 4, before returning to topics of interest in chapter 5 around exploiting DNS and writing DNS clients. I perused the index a bit further to note it covers SMB and NTLM, a chapter on "Abusing Databases and Filesystems", then raw packet processing, writing and porting exploit code, and a host of other topics. This is going to be an interesting book to dig into and write about in the coming weeks. I'm pretty excited about it and hope to write a thorough review upon completion.

“Designing Data-Intensive Applications” by Martin Kleppmann


This book is familiar territory, as I’ve spent much of my career working on similar topics that this book covers. What I suspect is that I’ll enjoy reading the material presented in an orderly and concise way versus the chaos and disruptive way I’ve acquired the knowledge on these topics.

From the index, the book starts off with foundations of data systems and the ideas around building reliable, scalable, and maintainable applications. This provides a good basis from which to dive into the other topics. From there it looks like we'll get a run into the birth of NoSQL, object-relational mismatches, and the related insanity that this has bred in the industry! Then a solid dive into graph-like, traditional, and multi-model data modeling. The beginning quarter of the book then covers everything from hash indexes, SSTables (another familiar topic), LSM-Trees, B-Trees, and related indexing structures, before wrapping up this first 25% with star and snowflake schemas for analytics, column-oriented storage, compression, sort orders in column storage, and related material on aggregation in data cubes and materialized views.

That's just the first 25%! From there Martin covers a wide range of topics that, if you're in the industry and plan to deal with large scale data-intensive applications, you need to be intimately familiar with!

Reading

Over the next few months while I read through these books I hope to provide summaries and related notes on the material. Who knows, maybe you’ll want to dive into the material yourself! Until then happy thrashing code and may you have high retention and comprehension in reading!

Schedule Updates: Databases, Coding, Meetups, & More

Here is my updated schedule for the two weeks starting the 23rd for my Twitch streams, the basic topics, meetups (a little beyond two weeks), and related events coming up.

Series: Thrashing Code General Calamity

This is going to be a mix of tech this week. Probably some database hacking, code hacking, samples, and setup of even more examples for use throughout your coding week!

Series: Bunches of Databases in Bunches of Weeks

Focusing on DataStax Enterprise setup and configuration over the next few weeks.

Series: Meetups

This is a little further out than the next few weeks, but in August we’re having another meetup!

Series: Building the Geo App Trux (w/ Vue.js, Go, and DataStax Enterprise (Apache Cassandra)). Navigate through to the DataStax Developers Channel to check out the event times.

Series: DataStax’s Apache Cassandra & C# Hour (C# Schema Migration builder, C# driver, and additional content). Navigate through to the DataStax Developers Channel to check out the event times.