Cassandra / DataStax Enterprise 6 Clusters: Marketplace Options

As I stepped full speed into work and research at DataStax, there were a few things I needed as soon as I could possibly get them put together. Before even diving into development, use case examples, or reference application development, I needed to have some clusters built up. The Docker image is great for some simple local development, but beyond that I wanted to have some live 3+ node clusters to work with. The specific deployed and configured use cases I had in mind included:

  1. I wanted to have a DataStax Enterprise 6 Cassandra cluster up and running ASAP. A cluster that would be long lived, that I could develop sample applications against, use for testing purposes, and generally develop against from a Cassandra and DSE perspective.
  2. I wanted to have an easy to use cluster setup for Cassandra – just the OSS deployment – possibly coded and configured for deployment with Terraform and related scripts necessary to get a 3 node cluster up and running in Google Cloud Platform, Azure, or AWS.
  3. I wanted a DataStax Enterprise 6 enabled deployment that would showcase some of the excellent tooling DataStax has built around the database itself.

I immediately set out to build solutions for these three requirements.

The first cluster system I decided to aim for was figuring out a way to get some reasonably priced hardware to actually build a physical cluster. Something that would make it absurdly easy to just have something to work with anytime I want without incurring additional expenses. Kind of the ultimate local development environment. With that I began scouring the interwebs and checking out where or how I could get some boxes to build this cluster with. I also reached out to a few people to see if I could be gifted some boxes from Dell or another manufacturer.

I lucked out and found some cheap boxes someone was willing to send my way for almost nothing. But in the meantime, since shipping would take a week or two, I began scouring the easy to get started options on AWS, Google Cloud Platform, and Azure.

First Cluster Launch on GCP

Ok, so even though I aim to have the hardware systems launched for a local dev cluster, I was on this search to find the most immediate way to get up and running without waiting. I wanted a cluster now, with a baseline setup that I could work with.

I opened up my Google Cloud Platform account to check out what options there were. The first option I found that is wildly easy to get started with involved navigating to the Cloud Launcher, typing in Cassandra, choosing the DataStax Enterprise option, and kicking off the launch from there. Note that if you do go this route it can get into the $300 or higher range very quickly depending on the selections chosen.

[Image: Building the DataStax Enterprise cluster in the GCP Cloud Launcher]

After clicking on create, the process begins building the cluster.

[Animation: GCP deploying the DataStax Enterprise cluster]

Then, generally after a few minutes, you have the cluster up and running. If you've left the check boxes for the available regions selected, it's a multi-region cluster, and if you've selected multiple nodes per cluster, that gives you a multi-cluster, multi-region deployment. Pretty sweet initial setup for just a few minutes and a few clicks.

Second Cluster Launch on Azure

I wanted to check out what the marketplace offered in Azure too, so I opened up the interface and did a quick search for the marketplace. It's up yonder at the top of the interface … not in the console itself, but here.

[Image: The Azure Marketplace]

Next we have to fill out a bunch of info for our business. In this case I just used the ole' Thrashing Code for this particular deployment. This is the first create of the process.

[Image: DataStax Enterprise deployment in Azure, the first create step]

Then roll into the next step once it does the cross-auth of identity. Click create (this is the second create), fill in some additional information around the C* auth password and the admin username and password, and then choose the subscription model, the name of the cluster, and the region.

[Image: DataStax Enterprise deployment in Azure, the second create step]

After all that the cluster will be up and running. Now, I could get into adding another cloud option with AWS but I’ll leave that for another day. I’ve already got two cloud providers up and running so I’ll just stick to working with those for now.
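Once a cluster reports as up, it's worth a quick sanity check from code before building anything against it. Here's a minimal sketch using the open source gocql driver – the contact points and credentials below are placeholders, so swap in the node addresses and the username/password from your own deployment:

package main

import (
	"fmt"
	"log"

	"github.com/gocql/gocql"
)

func main() {
	// Placeholder contact points – use the addresses of your own nodes.
	cluster := gocql.NewCluster("10.0.0.2", "10.0.0.3", "10.0.0.4")
	cluster.Consistency = gocql.Quorum
	// Placeholder credentials – use whatever you set during the marketplace deployment.
	cluster.Authenticator = gocql.PasswordAuthenticator{
		Username: "cassandra",
		Password: "replace-me",
	}

	session, err := cluster.CreateSession()
	if err != nil {
		log.Fatal("could not connect to the cluster: ", err)
	}
	defer session.Close()

	var releaseVersion string
	if err := session.Query("SELECT release_version FROM system.local").Scan(&releaseVersion); err != nil {
		log.Fatal("query failed: ", err)
	}
	fmt.Println("Connected! Cassandra release version:", releaseVersion)
}

If that prints a version, the nodes are reachable and authentication is squared away, and it's on to keyspaces, tables, and actual application code.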

So that is one of the quickest ways to get a full fledged cluster running in Azure or GCP. In the next post I'll show you how to get up and running locally with Docker, some of the options, and the limitations of running distributed systems locally. Beyond that, I'll also be diving deep into detailed installation and configuration of Cassandra and DSE clusters in the coming days, including building one out with actual hardware – yup, on the metal – so subscribe for all the details. Cheers!

How Long do You Code Per Coding Session?

I was working on getting the latest DataStax Enterprise 6 up and running via the Docker image offerings today and I stumbled across a site called hashnode.com. On that site was a harmless little question, but one I realized I ponder a lot and even find myself in conversation about on a regular basis. The question (link) is posed,

“How many minutes/hours do you really sit to write code at a particular moment?

I’m not saying the total summation of hours you code a day. When you really sit down to write a code for a particular task at a moment, how many minutes/hours (at worst case) do you normally sit down before you get tired? I know some take break, some say it depends on the task or the individual, I would love to hear them all, and what you do to keep your brain refreshed before getting back to coding. Thanks…”

This question, in normal coder fashion, has one simple answer that belies the complexity of the individual answers: "it depends". So that's the first answer, but here are some of the other answers for me. As with many of these types of questions, a lot of individual characteristics come into play for each of us, so these are indeed anecdotal scenarios for myself and very specifically YMMV for yourself!

Answer 1 – Ideal Long Coding Session

When I really need to get something done, I try to break a problem apart so that the largest segment of time I'll need to sit and work on a piece of code is about 4 hours. Not that I'll only be there working on the code for 4 hours – it could be longer, or much shorter – but that's about the longest segment of time I've found that works most effectively. If I go much longer than 4 hours, several negatives hit me:

  • My mind starts to wander significantly and I start to lose track of what I’m writing, thus – not efficient use of the time.
  • The lethargy of sitting, or standing, or sitting and then standing and then sitting and then standing, as I do with the sit stand desk that I have, gets to a point that I just need to go for a walk. Better yet, it’s a good time to go for a bike ride and just crank hard. That gets my blood flowing and then I can dive right back into another coding session in another ~20-45 minutes.
  • I realize at about 4 hours that I've mismanaged the chunk of work. If I'm still working on it this long after the fact, I start to worry as much about the mistake, or about not having understood the problem well enough to break it apart, that I lose focus and must shift back to the larger problem to see how it really should be broken apart.

Answer 425 – Ideal City Bus Coding Sessions

Sometimes I'm in the situation where I'm about to jump aboard a bus for the commute, or to travel somewhere within the city. I absolutely abhor wasting time "driving" or even "ubering" to the next location when I can instead board a bus in 5-10 minutes and actually code in between my starting location and my destination! In this situation I routinely whip out the laptop and sling some code. I try to keep a lot of little busy tasks that can be completed in 5-10 minute chunks specifically for this scenario – things that may require Internet access but are still workable at questionable tethering speeds and connectivity.

Answer 6 – At The Office

Offices with lots of mixed groups of functional teams and such usually just abhorrently suck for actually getting good, effective, high quality, and SOLID code written. The interruptions are many, the distractions are everywhere, it's just super rough. But alas, we coders often exist in these damnable office environments where some boss person thinks it's more important for a warm body to be in a chair pretending to code than to actually productively get things done and be smart. In those situations I like to find ways to break away to conference rooms, toss on the headphones, or maybe even find a good partner to pair program with. In these scenarios I've routinely been able to luck out and get solid 30-120 minute coding sessions in. These scenarios aren't ideal, but they're absolutely necessary if one actually wants to get something done!

Answer 2 – Ideal Long Session on a Hard Problem

In answer 1, I described the scenario where I have some idea of the problem and can break it apart. But what happens when I need to code a little, then review the problem, then test, review, code, and so on? Well, those coding sessions are technically dozens and gazillions of shorter coding sessions, but overall this type of session could be many hours. It's ideal to get into the proverbial "zone" and then just hack at it for hours while everything is in one's mind. This can be lots of fun, but progress on the problem has to be kept in mind too. In these scenarios the individual coding sessions are probably 5-75 minutes long at the most, but the overall session might be 4, 5, 8, or even more hours of just sitting and coding. One needs to make sure to have appropriate drinks, food, and other survival gear ready for this type of coding session, as it can be hard on the body!

Answer 42 – Answer to It All

You’re done, you’ve found the answer, it’s 42. So moving along.

Answer 9 – Sometimes Fear is the Appropriate Response

[Image: the animated film 9]

Best to just go watch the movie.

Answer 5 – Those Paired Programming Sessions

Another programming technique I like to use, ideally, is pair programming. Pair programming with someone else forces one to break down the work pretty effectively and then divide it into chunks that can often be built in 15-30 minute segments, sometimes even less. When it's a known problem realm, the problem has been described well enough to break apart, and I have a solid cohort to pair with, this is easily one of my favorite ways to build software – or as I might say sometimes, "sling some code".

Those are some of my answers, and I’m already thinking about a part II. But throw some of your own thoughts and suggestions at me on the Twitters @Adron. I’ll bundle up suggestions and add them in round two of this series. Cheers!


DevXcon San Francisco

I attended DevXcon in San Francisco at the beginning of the week before last. It's the way the DataStax crew welcomed me into the family. It was a solidly awesome time, a great way to get started, and I've rated it "would do again!" Tamao (@mewzherder), Matthew (@matthewrevell), and fellow organizers did a great job putting things together!

DevXcon is kind of a sibling, or parallel of sorts, to the DevRelCon presented by Github. These events are organized by Hoopy, Matthew's consultancy, which specializes in helping companies with developer relations and marketing. Both conferences focus on exactly that: the developer relations of software companies and how to improve the relationship those companies have with their prospective developers.

UX Practices for Developers

A big focus of this DevXcon was on and around user experience (UX) practices. I found it interesting that this was brought up in this context. That user experience is a key practice of any good application – CLI or API endpoint, web interface, or the physical interaction of using a device, even a weed wacker for that matter – is really unquestionable. If a company designs poor interfaces, its products will suffer in the market or, in the case of an ongoing user experience failure, injure the person using the device.

Other Topics Around Relations

The conference covered a range of topics, including working with email newsletters that developers actually want to receive, DevRel's role in product feedback, and others. I'm going to skip the details and point you instead to Adam Duvander's write up, "DevXcon SF 2018: where UX, DX, and product came together with dev rel".

DevRel – the practice of effectively coordinating, creating, and building content, coding, and communicating the benefits of the products and services of a company, an open source project, or another organization – has grown into something a bit more than mere marketing. Even though much of the work of DevRel could fall into the marketing category, it by no means fits entirely into that realm; when effective, it sits closer to engineering, support, and the technical side of the spectrum.

The biggest issues I see are the age old problems of maintaining integrity and reputation in the industry once moving into DevRel from engineering. It's a difficult step when one's reputation is built on engineering or related work experience and one then delves into DevRel. The myth goes that we aren't developing anymore, that we lose our coding chops. But seriously, good DevRel teams build products and services and expand on what they're advocating and showing to their respective communities, just as engineering builds that product or service. Good – emphasis on good – DevRel teams still advocate, code, and build, which in my opinion is fundamental to building a solid community base around any project, product, or service offering.

Advances for DevRel, Advocacy Not Evangelism

It does appear, as seen with many groups, that we're at least starting to drop the misnomer title of Evangelist and push toward titles like Advocate, as Microsoft has done with their Cloud Developer Advocates (or CDAs as they call them for short, because Microsoft gotta TLA like they've always TLA'ed). Their focus has historically been on their own ecosystem, but they've refocused around a wide spectrum of tooling, much of it tooling that isn't even from Microsoft. This refocus around an advocacy approach versus an evangelism approach is a pretty big deal, especially for the end user. It's a positive reflection that DevRel as a working group in companies is moving in a direction that benefits the community using the products and services of an organization.

Anyway, that’s just a quick summary, more on many of these topics in the future. For now, happy coding, conferencing, and cheers!


Let’s Really Discuss Lock In

For too long, lock-in has been referred to with an almost entirely negative connotation, even though it shows up in both positive and negative situations. The fact is that there's a much more nuanced and balanced range of benefits and disadvantages to lock-in. Often it may even be referred to as this or that dependency, but either way a dependency is often just another form of lock-in. Weighing those and finding the right balance for your projects can actually lead to lock-in being a positive game changer, or simply something that provides a basis on which to work and operate. Sometimes lock-in will actually provide a way to remove lock-in by opening up more choices of other things, which in turn may provide another variance of lock-in.

Concrete Lock-in Examples

The JavaScript Lock-In

Take the language we choose to build an application in. JavaScript is a great example. It has become the singular language of the web, at least on the client side. Long ago, this was a form of lock-in that browser makers (and standards bodies) chose, and it dictated how and in which direction the web – at least web pages – would progress.

JavaScript has now become a prominent language on the server side too, thanks to Node.js. It has even moved in as a first class language in serverless technology like AWS's Lambda. JavaScript is a perfect example of a language that started as a source of specific lock-in – it was required on the client – but eventually expanded to allow programming in a number of other environments, reducing JavaScript's lock-in while displacing lock-in through abstractions into other spaces such as the server side and serverless functions.

The .NET Windows SQL Server Lock In

JavaScript is merely one example, and a relatively positive one that expands one's options more than it limits one's efforts. But let's say the decision is made to build a high speed trading platform and to choose SQL Server, .NET C#, and Windows Server. This is a technology combination that has notoriously illustrated in the past* how lock-in can be extremely dangerous.

Say this application was built out with that set of technology platforms: it used stored procedures in SQL Server, locking the application into that specific database; it used proprietary Windows specific libraries in the .NET C# code; and on Windows it used IIS specific features to make the application faster. When it was first built it seemed plenty fast and scaled just right according to the demand at the time.

Fast forward to today. The application now has a sharded database, which it needed when it hit a mere 8 terabytes, loaded on two super pumped up – at least for today – servers that have many cores, many CPUs, GPUs, and all that jazz. They came in around $240k each! The application is tightly coupled to a middle tier, which is in turn somewhat tightly coupled to those famous stored procedures, and the application of course gets its turbo capability from those IIS servers.

But today it's slow. Looking at benchmarks and query times, the database is having a hard time dealing with things as is, and the application has outages on a routine basis for a whole variety of reasons. Sometimes tracing and debugging solve the problems quickly; other times the servers just oversubscribe resources and sit thrashing.

Where does this application go? How does one resolve the database loading issues? They've already sunk a half million into servers, which are pegged out already; horizontal scaling isn't an option; they're tightly coupled to Windows Servers running IIS, which removes the possibility of effectively scaling out the application servers via container technologies; and there are other issues besides. Without recourse, this is the type of lock-in that will kill the company unless something is changed in a massive way very soon.

To add to that, this is the description of an actual company that is now defunct. I phrased it as existing today only to make the point. The hard reality is the company went under, almost entirely because of the costs of maintaining an unsustainable architecture that caused an exorbitant lock-in to very specific tools – largely because the company drank the Kool-Aid and used the tools exactly as suggested. They developed the product into a corner. That mistake was so expensive that it decimated the finances of the company. Not a good scenario, not a happy outcome, and something to be avoided in every way! This is truly the epitome of negative lock-in.

Of course there's this kind of distinctive lock-in we have to steer clear of, but there's also the lock-in associated with languages and other technology capabilities that will help your company move forward faster, easier, and with increasing capabilities. Those are the choices – the ties to technology and capabilities – that decision makers can really leverage with fewer negative consequences.

The “Lock In” That Enables

One common statement is "the right tool for the job". That is of course for the ideal world where ideal decisions can be made all the time. That world doesn't exist, and we have to strive for balance between decisions that will wreck the ship and decisions that will give us clear waters ahead.

For databases, we need to choose the right databases for where we want to go versus where we are today. Not to gold plate the solution, but to have intent and a clear focus on what we want our future technology to hold for us. If we intend to expand our data and want to maintain the ability to query it effectively – let's take the massive SQL Server example – what could we have done to prevent it from becoming a debilitating decision?

A solution that could have effectively come into play would have been not to shard the relational database, but instead to either export or split the data in a more horizontal way and put it into a distributed database store, and then start building the application so that this system could be used instead of being limited by the relational database. As the queries were built out and the tight coupling to SQL Server removed, the new distributed database could easily add nodes to compensate for the ever growing size of the data stored. The options are numerous, and all of them are a form of lock-in, but not the kind that eventually killed this company, which had detrimentally locked itself into the use of a relational database.
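To make that decoupling a bit more concrete, here's a minimal sketch in Go; the names and types are hypothetical and only for illustration. The idea is that the application codes against a small storage interface, so the relational implementation can later be swapped for a distributed one without rewriting the business logic.

package trading

import "context"

// Trade is a stand-in for whatever domain record the platform persists.
type Trade struct {
	ID     string
	Symbol string
	Price  float64
}

// TradeStore is the seam the rest of the application codes against.
// Nothing outside the storage layer knows which database sits behind it.
type TradeStore interface {
	Save(ctx context.Context, t Trade) error
	FindBySymbol(ctx context.Context, symbol string) ([]Trade, error)
}

// relationalTradeStore would wrap the existing SQL Server access,
// stored procedures and all, behind the interface.
type relationalTradeStore struct {
	// e.g. a *sql.DB and prepared statements
}

// distributedTradeStore would wrap a horizontally scalable store,
// for example a Cassandra cluster session.
type distributedTradeStore struct {
	// e.g. a gocql session
}

The lock-in doesn't disappear – it moves into the implementations behind the interface – but it becomes the kind of lock-in you can change your mind about one implementation at a time.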

At the application tier, another decision could have been made to remove the ties to IIS and start figuring out a way to containerize the application. One way, years ago, would have been to move away from .NET, but let's say that wasn't really an option for other reasons. Containerization could have been mimicked by shifting to a self-contained web server on Windows that would allow the .NET application to run under a single service, and then having those services spin off the application as needed. This would decouple it from IIS and enable spreading the load more quickly across a set number of machines, and eventually, when .NET Core was released, offer the ability to actually containerize and shift entirely off of Windows Server to a more cost efficient solution under Linux.

These are just some ideas. The solutions of course would vary and obviously provide different results. Above all, there are pathways away from negative lock-in and toward the positive lock-in that enables. Realize there's a balance, and find the options that leverage lock-in positively.

Nuanced Pedantic Notes:

  • Note I didn't say all examples, just that this combo has left more than a few companies out on a limb over the years. There are of course other technologies that have put companies (people, actually) in awkward situations too; I'm just using this combo here as an example. For instance, probably some of the most notorious lock-in comes from the legal ramifications of using Oracle products and being tied into their sales agreements. On the opposite end of the spectrum, Stack Overflow is a great example of how choosing .NET and scaling with it, SQL Server, and related technologies can work just fine.

A New Adventure of Multi-model Distributed Graph Time Series […etc…] Database(s) Explorations Begins!

I arrived at the airport, sending a few tweets of this or that nature with all of this Github and Microsoft news. I have a great view out the window from the Alaska Lounge just before heading to the D gates. For you aeronautics fans like myself, here's a picture of that view with a few of those Alaska planes and one of the newly acquired Virgin America planes!

[Photo: the view from the Alaska Lounge, with a few Alaska planes and one of the newly acquired Virgin America planes]

All this news with Github and Microsoft was easily eclipsing WWDC18, and in the meanwhile little ole' me is on my way to a new adventure in my career. So, priorities being what they are and the news being exciting, I'm even more excited to announce today that I'm joining a most excellent team at DataStax, to bring forth investigation, research, knowledge, ideas, and whatever else I can as a Developer Evangelist with the crew there! I'm unbelievably stoked, as I've been searching for a company that would check all of my "will this work" check boxes for some months now. DataStax won out among the other prospective candidate companies, and I'm starting today!

[Image: the DataStax logo]

To kick off this adventure, I'm heading to San Francisco to join in the fun at DevXcon. I'll be there a little later today, hopefully in time for the kick off (ya know, pending flights and BART being timely and such)! Then a full day of the conf, and later I'll join the team for a visit to DataStax HQ and maybe a few surprises. I'm super excited and ready to bring awesome content your way, while inventing, building, and experimenting my way through some awesome technologies!

ML4ALL LiveStream, Talks & More

If you're attending, or if you're at the office or at home, you can check out the talks as they go live on the ML4ALL YouTube channel! Right now, during the conference, we also have the live feed on the channel, so if you're feeling a little FOMO this might help a bit. Enjoy!

Here are a few gems that are live already!

  • Manuel Muro – "Barriers To Accelerating The Training Of Artificial Neural Networks" (plus the introduction of Manuel)
  • Jon Oropeza – "ML Spends A Year In Burgundy" (plus the introduction of Jon)
  • Igor Dziuba – "Teach Machine To Teach: Personal Tutor For Language Learners" (plus the introduction of Igor)

IDE Launcher via Amtrak Cascades to Portland for ML4ALL

Got fidgety on the train on the way down to Portland for ML4ALL and just wanted to write code, so I wrote up some decision tree code for determining which IDEs I want opened up. Ya know, if you do something more than twice it needs to be automated, so I've started the process of automating all startup and shutdown tasks for a day's coding. Simplistic geeky train geek code fun code is fun geeky train code. Cheers!

package main

import (
	"time"
	"fmt"
)

var sessionMinimal, sessionMedium, sessionLong, sessionZone time.Duration
var language string

func main() {
	// Session lengths the decision ladder below keys off of.
	sessionMinimal = 15 * time.Minute
	sessionMedium = 45 * time.Minute
	sessionLong = 90 * time.Minute
	sessionZone = 180 * time.Minute

	language = "golang"

	openIde(language, 200*time.Minute)
}

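// openIde maps the expected length of a coding session to an editor or IDE
// choice for the given language stack, then announces which one to launch.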
func openIde(languageStack string, expectedCodingTime time.Duration) {
	var ide string

	// Decision ladder: shorter sessions get editors that launch fast, longer
	// sessions get the heavier, more feature rich and introspective IDEs.
	switch {
	case expectedCodingTime <= sessionMinimal:
		ide = stackSpecific(languageStack, true, false, false)
	case expectedCodingTime <= sessionMedium:
		ide = stackSpecific(languageStack, true, true, false)
	case expectedCodingTime <= sessionLong:
		ide = stackSpecific(languageStack, false, true, false)
	case expectedCodingTime <= sessionZone:
		ide = stackSpecific(languageStack, false, false, true)
	case expectedCodingTime > sessionZone:
		ide = stackSpecific(languageStack, false, true, true)
	}

	fmt.Printf("Launching: %s", ide)
}

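// stackSpecific picks an editor or IDE for a language stack based on which
// traits matter for the session: fast launch, feature richness, or deep
// introspection. Ask for all three and you get told off – at best it's two
// out of three.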
func stackSpecific(language string, fastLaunch bool, featureRich bool, introspective bool) string {
	if fastLaunch == true && featureRich == true && introspective == true {
		return "\n\nCome on, you know better. You get at best two out of three.\n\n"
	}

	if fastLaunch == true && featureRich == true {
		return "Visual Studio Code"
	}

	if featureRich == true && introspective == true {
		switch language {
		case "SQL":
			return "DataGrip"
		case "C":
			return "CLion"
		case "Python":
			return "PyCharm"
		case "golang":
			return "Goland"
		case "java":
			return "IntelliJ"
		case "scala":
			return "IntelliJ"
		case "kotlin":
			return "IntelliJ"
		case "dotnet":
			return "Rider"
		case "csharp":
			return "Rider"
		case "fsharp":
			return "Rider"
		case "vbnet":
			return "Rider"
		case "javascript":
			return "Webstorm"
		case "hcl":
			return "IntelliJ"
		case "ruby":
			return "RubyMine"
		case "swift":
			return "AppCode"
		case "obj-c":
			return "AppCode"
		default:
			return "IntelliJ"
		}
	}

	if featureRich == true {
		switch language {
		case "swift":
			return "AppCode"
		case "obj-c":
			return "AppCode"
		default:
			return "Visual Studio Code"
		}
	}

	if introspective == true {
		switch language {
		case "swift":
			return "AppCode"
		case "obj-c":
			return "AppCode"
		default:
			return "Visual Studio Code"
		}
	}

	if fastLaunch == true  {
		return "Sublime"
	}

	return "No IDE for you."
}