Cassie Schema Migrator >> CaSMa

A few weeks back I started working on a schema migration tool for Apache Cassandra and DataStax Enterprise. Just for context, here are short definitions of each of the elements behind CaSMa.

  • Apache Cassandra
    • Definition: Apache Cassandra is a free and open-source, distributed, wide column store, NoSQL database management system designed to handle large amounts of data across many commodity servers, providing high availability with no single point of failure.
    • History: Avinash Lakshman, one of the authors of Amazon’s Dynamo, and Prashant Malik initially developed Cassandra at Facebook to power the Facebook inbox search feature. Facebook released Cassandra as an open-source project on Google code in July 2008. In March 2009 it became an Apache Incubator project. On February 17, 2010 it graduated to a top-level project. Facebook developers named their database after the Trojan mythological prophet Cassandra, with classical allusions to a curse on an oracle.
  • DataStax Enterprise
    • Definition: DataStax Enterprise, routinely referred to as just DSE, is an extended version of Apache Cassandra with multi-model capabilities around graph, search, and analytics, plus other features such as security capabilities and a core data engine with a 2x speed improvement.
    • History: DataStax was formed in 2009 by Jonathan Ellis and Matt Pfeil and originally named Riptano. In 2011 Riptano changed its name to DataStax. For more history check out the Wikipedia page or company page for a timeline of events.
  • Schema Migration
    • Definition: In software engineering, schema migration (also database migration, database change management) refers to the management of incremental, reversible changes to relational database schemas. A schema migration is performed on a database whenever it is necessary to update or revert that database’s schema to some newer or older version. Migrations are generally performed programmatically by using a schema migration tool. When invoked with a specified desired schema version, the tool automates the successive application or reversal of an appropriate sequence of schema changes until it is brought to the desired state.
    • Additional references and related materials:

Over the next dozen weeks or so, as I work on this application via the DataStax Devs Twitch stream (next coding session events list), I’ll also be posting blog posts in parallel about schema migration and my intent to expand the notion of schema migration specifically for multi-model databases and larger scale NoSQL systems, namely Apache Cassandra and DataStax Enterprise. Here’s a shortlist for the next three episodes:

The other important pieces include the current code base on Github, the continuous integration build, and the tasks and issues.

Alright, now that all the collateral and context is listed, let’s get into what this is all about at a high level.

CaSMa’s Mission

Schema migration is a powerful tool for getting a project on track, deploying it consistently, and keeping development working against the core database(s). However, it’s largely entrenched in the relational database realm. This means it’s almost entirely focused on a schema with the notions of primary and foreign keys, the complexities around many-to-many relationships, indexes, and the other details that need to be built consistently for a relational database. Many of those things need to be built for a distributed columnar store, key value, graph, time series, or a million other possibilities too. However, in our current data schema world, that tooling isn’t always readily available.

The mission of CaSMa is to close this gap around schema migration, first and foremost for Apache Cassandra, prospectively in turn for DataStax Enterprise, and then onward for other database systems. The mission will then continue around multi-model systems that can and ought to take advantage of schema migration for graph and related schema modeling. At some point the mission will expand to include other schema, data, and state management focused on software development and the data needs within that state.
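To make the "successive application or reversal" idea from the schema migration definition a little more concrete, here is a minimal sketch in Erlang. This is not CaSMa's code or design. The module name `migrate_sketch`, the `plan/2` function, and the `users` table with its columns are all hypothetical, made up purely for illustration. It only computes which CQL statements would run, in which order, and never talks to a database.

```erlang
-module(migrate_sketch).
-export([migrations/0, plan/2]).

%% Hypothetical migrations, each {Version, UpCql, DownCql}.
%% The table and column names are invented for this example.
migrations() ->
    [{1, "CREATE TABLE users (id uuid PRIMARY KEY, name text)",
         "DROP TABLE users"},
     {2, "ALTER TABLE users ADD email text",
         "ALTER TABLE users DROP email"}].

%% Return the CQL statements, in order, that would move a schema
%% from version Current to version Target.
plan(Current, Target) when Target >= Current ->
    %% Moving forward: every "up" step above Current, oldest first.
    [Up || {V, Up, _Down} <- migrations(), V > Current, V =< Target];
plan(Current, Target) ->
    %% Moving back: every "down" step above Target, newest first.
    lists:reverse([Down || {V, _Up, Down} <- migrations(),
                           V > Target, V =< Current]).
```

After compiling with `c(migrate_sketch).` in the Erlang shell, `migrate_sketch:plan(0, 2).` returns both "up" statements in version order, and `migrate_sketch:plan(2, 1).` returns only the statement that drops the `email` column.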

As progress continues I’ll publish additional posts here on the different data model concepts and nature behind various multi-model database options. These modeling options will put us in a position to work consistently, context based, and seamlessly with ongoing development efforts. In addition to all this, there will be the weekly Twitch sessions where I’ll get into coding and reviewing what coding I’ve done off camera too. Check those out on the DataStax Devs Channel.

If you’d like to get into the project and help out just ping me via Twitter @Adron or message me here.

Top 10 West Coast Confs for 2019

I’ve been putting together a list of conferences that I want to aim to attend this coming year. I made it, then thought, “somebody else could use this list probably” so here it is. If you think of any other specific conferences I ought to add and attend please leave a comment. Enjoy!

March 7-10 is SCALE Southern California Linux Expo in Pasadena, California

March 25-28 is O’Reilly Strata in San Francisco, California

April 26-28 is LinuxFest Northwest in Bellingham, Washington

June 3-5 is Monitorama in Portland, Oregon

June 10-13 is O’Reilly Velocity in San Jose, California

June 10-13 is O’Reilly Software Architecture Conference SACON in San Jose, California

July 15-18 is O’Reilly OSCON in Portland, Oregon

August 21-23 is the Open Source Summit in San Diego, California

September 9-12 is the O’Reilly Artificial Intelligence Conference in San Jose, California

November 18-21 is KubeCon 2019 in San Diego, California

Without Dates – Conferences that are really great but don’t have a date just yet.

Polyglot Conf in Vancouver BC

Seattle Code Camp

Microsoft Build

GDG DevFest

What other awesome conferences in Seattle or the immediate surrounding area should I add?

 

Behind the Scenes @ DevRel Week in Santa Clara & San Francisco

The following are some lagniappe, a little extra, about the behind the scenes adventures I’m off on when I travel or am in between coding. Ya know, coding being life and all.  😉

I posted the first episode in this series a while back, on the gear I use to record the Twitch sessions and pretty much everything else. These videos tell the story of the first half of the trip to Santa Clara and San Francisco. The rest are still in post-production and will be out real soon, along with videos on a host of other adventures that will offer you good information on where the good food is, the best coding places, best meetups, and all that stuff. So subscribe on my Youtube Channel and on Twitch. The shows are coming to Youtube, and now and again I’ll pre-watch one with my Twitch audience. Cheers!

Getting in Some Code Stylings, Looking Good for the Code Dance

In every language there are opinions about how to format code. With JavaScript, the community abounds with opinions about how the code should look, how variables are declared, whether there should be semicolons to end each statement, spaces before or after parentheses, and more than I care to list in a simple paragraph like this. Recently the team at Deconstructed sat down to determine what our ongoing code style would be and how we could enforce it.

The first thing we did was figure out what we could use for enforcement of the coding style. Milan (@milanloveless) quickly discovered node-jscs per suggestion from Adam Ulvi (@s5fs). He implemented that code as follows.

Continue reading “Getting in Some Code Stylings, Looking Good for the Code Dance”

Distributed Coding Prefunc: Some Basic Erlang & Erlang Shell Basics

The first few parts of this series include:

This continues the series, with the intent of getting to “Distributed Coding”, ramping up around Erlang and distributed programming in general. In this entry we’ll jump into some of the basics of Erlang and the shell that we can use to test and write some code.

Erlang Basics : The Shell & Some Integers

Before getting too deep into things, here are some basic shell commands and integers to get a feel for the Erlang shell. To start the shell just type erl, which will present the following in the terminal.

[sourcecode language=”erlang”]
$ erl
Erlang R15B01 (erts-5.9.1) [source] [smp:4:4] [async-threads:0] [hipe] [kernel-poll:false]

Eshell V5.9.1 (abort with ^G)
1>
[/sourcecode]

To exit the shell just type “q().” and hit enter like this.

[sourcecode language=”erlang”]
$ erl
Erlang R15B01 (erts-5.9.1) [source] [smp:4:4] [async-threads:0] [hipe] [kernel-poll:false]

Eshell V5.9.1 (abort with ^G)
1> q().
ok
2> $
[/sourcecode]

The shell moves to a second prompt and then dumps you back out to the terminal. One thing to note is that after you execute “q().” the prompt is numbered two, as shown above. If we start the shell back up again and type in some actual code we’ll see that the numbers increment. The idea is similar to numbered lines of code in a code file.

Start up the shell again and type the following.

[sourcecode language=”erlang”]
$ erl
Erlang R15B01 (erts-5.9.1) [source] [smp:4:4] [async-threads:0] [hipe] [kernel-poll:false]

Eshell V5.9.1 (abort with ^G)
1> -69.
-69
2> 2#1212
2> 2#1212.
* 1: syntax error before: 212
2> 2#1010.
10
3> $A.
65
4> $B.
66
5> $a.
97
6> 2*2+100.
104
7>
[/sourcecode]

What I’ve done here is enter some values, all of them integers. The first line is a negative integer. Entering a “.” at the end executes the line and returns the value, in this case “-69”.

On the second line I entered “2#1212” without a period to finish the line of code, so nothing executes.

On the third line I entered the same thing again, this time with a period. “2#1212” is not valid, because 2 is not a digit in base 2, so the shell reports a syntax error. On the next line I tweak my value just a bit, entering a valid binary (base 2) number “2#1010” with the appropriate period at the end. The line executes and displays the decimal integer “10”.

So far at this point, this shows us several things:

  • Every expression must end with a period to execute.
  • We’ve seen a negative integer, a positive binary integer, and an invalid integer along with the error it produces: “* 1: syntax error before: 212”.
  • When a line executes successfully, the prompt number increments. “-69.” ran at prompt 1, the first successful entry at prompt 2 was “2#1010.”, and the number keeps increasing with each successful line after that.
  • A valid integer includes -69, but also a base-prefixed representation of an integer such as “2#1010”.

Moving beyond these values, “$A”, “$B”, and “$a” are character literals that return the ASCII code of each character.

On line six I actually perform a basic calculator type multiplication and addition of integer values, which returns the integer value 104.
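Here are a few more integer literals to try, as a small sketch of the Base#Digits and $Char notations from the session above. The specific values (base 16, base 8, and the `$0` character) are my own picks for illustration and are not from that session.

```erlang
%% Paste these into the erl shell one at a time; each ends with a period.
2#1010.      % base 2 (binary) literal, evaluates to 10
16#FF.       % base 16 (hexadecimal) literal, evaluates to 255
8#17.        % base 8 (octal) literal, evaluates to 15
$0.          % character literal for the digit zero, evaluates to 48
$A + 1.      % character literals are plain integers, so this evaluates to 66
```

The number before the # is the base, and every digit after it has to be valid in that base, which is why “2#1212” was rejected.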

Now that we have the shell start and stop figured out, that gives us a platform in which to start stepping through some of the basics of Erlang.

Erlang Floats

Above we’ve seen how regular integers are represented, such as 2, 100 or -69, along with binary and ASCII character representations. For floats, the usage is similar. Spool up the shell again and try out these examples.

[sourcecode language=”erlang”]1> 21.21.
21.21
2> -1234.1234.
-1234.1234
3> 3042.1.
3042.1
4> 650.21
4> 650.21.
* 2: syntax error before: 650.21
4> 650.21
4> 650.21.
* 2: syntax error before: 650.21
4> 650.21.
650.21
[/sourcecode]

Lines 1-3 show some standard floats. At the first prompt four I left off the terminating period to show that the shell won’t execute the line even though it has a period in it, because that period is followed by digits and so reads as a decimal point. At the second prompt four I’ve added the period, but it throws a syntax error.

Then note I did the exact same thing again, to show that entering a float and then adding the period on a new entry still fails, because the shell is holding the earlier unterminated input. It sees two floats in a row, “650.21 650.21.”, which of course doesn’t mean anything. After the error the shell discards the pending input, so at the fifth prompt four I finally get a good execution and it displays the float I’ve entered, “650.21”. When you’re fixing values like this, remember that the shell may still be waiting on earlier input: until a period is the last character on the line, nothing executes.
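As a short sketch of how the shell reads periods, here are a few expressions that are my own illustration rather than part of the session above. They use the built-in type checks `is_float/1` and `is_integer/1`.

```erlang
%% A period followed by a digit is a decimal point; a period followed by
%% whitespace or a newline ends the expression.
is_float(650.21).      % true
is_integer(650).       % true

%% An expression can span lines. The shell keeps reading until the
%% terminating period, so this evaluates as the single expression 1 + 2.
1 +
  2.                   % 3
```

The last example is the friendly side of the same behavior that caused the syntax errors: the shell treats everything up to the terminating period as one expression.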

Mathematical Operators, You Know Those Things That Help Math Explain the Universe!

Erlang has the standard operators that you’re used to, especially if you’ve done anything with a language based around C syntax such as JavaScript, Java, C#, C++, Objective-C… well, pretty much every regularly used C-family language and more. In functional languages these operators are pretty common too. If any don’t look familiar, be sure to get intimate with them. You’ll find them all over Erlang code.

The + and - characters act as addition and subtraction, but also as unary operators. Among the arithmetic operators, the unary + and - have the highest precedence and binary addition and subtraction have the lowest.

The * and / signify multiplication and division. Their precedence is just below that of the unary + and -.

Two that are a bit unusual are “div” and “rem”, for division and remainder on integers. They share the precedence of * and /. This brings up the point that whenever an integer is combined with a float in multiplication, division, addition, subtraction or other operations, it is coerced into a float first. For example, in 2 + 3.47 the 2 is turned into a float before it is added to 3.47.
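The precedence and coercion rules above can be checked with a few quick expressions. This is a minimal sketch with values I picked for illustration, not output from the original session.

```erlang
%% Precedence: unary minus first, then * and /, then + and -.
-2 * 3 + 4.            % (-2 * 3) + 4, evaluates to -2
2 + 3 * 4.             % 14, not 20
(2 + 3) * 4.           % parentheses override precedence: 20

%% Integer division versus float division.
7 div 2.               % integer division: 3
7 / 2.                 % / always returns a float: 3.5
6 / 2.                 % 3.0, still a float even though it divides evenly

%% Mixing an integer with a float yields a float.
is_float(2 + 3.47).    % true
```

Note that `div` and `rem` only accept integers, while `/` accepts integers or floats and always produces a float.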

Here’s some examples of operations occurring between numbers via the shell.

[sourcecode language=”erlang”]
$ erl
Erlang R15B01 (erts-5.9.1) [source] [smp:4:4] [async-threads:0] [hipe] [kernel-poll:false]

Eshell V5.9.1 (abort with ^G)
1> +15.
15
2> +2.
2
3> -11.
-11
4> 11 div 5.
2
5> 12 div 5
5> .
2
6> 15 div 5.
3
7> 14 div 5.
2
8> (12 + 345) div 12.
29
9> (3+2)/5.
1.0
10> 2*3*4.12.
24.72
11> 1+1+1+1+13+-2.
15
12> 2/3 + 3/4 - (1/2 + 2/3) - 1.
-0.75
13> 14 rem 5.
4
14>
[/sourcecode]

These examples show a number of effects of how the operators work. Notable is how everything looks very much like simple math being done. This is one of the more readable aspects of Erlang. Note also that div drops the fractional part rather than rounding, so 11 div 5, 12 div 5, and 14 div 5 all return 2, while / returns a float such as 1.0.
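To be a bit more precise about the rounding, here is a small sketch with negative operands, which the session above doesn’t cover. The values are my own examples. `trunc/1` and `round/1` are Erlang built-in functions for converting a float to an integer.

```erlang
14 div 5.        % 2 (14 / 5 is 2.8; the fractional part is dropped)
-14 div 5.       % -2 (truncates toward zero, not down to -3)
14 rem 5.        % 4
-14 rem 5.       % -4 (the remainder takes the sign of the left operand)
trunc(2.8).      % 2 (same truncation, applied to a float)
round(2.8).      % 3 (rounds to the nearest integer instead)
```

So for positive numbers `div` looks like rounding down, but with negative numbers it moves toward zero.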