‘bash’ A.K.A. The Solution for Everything – Bourne Shell as per v7 Unix to Today’s

In 1979 Unix v7 started shipping with the original Bourne Shell. Simply put, it’s a program that sits at /bin/sh and runs in the terminal. You may ask, “what’s the difference between a shell and the terminal?” Let’s cover that right now, because it routinely isn’t common knowledge, but it really ought to be, as it sets the basis for understanding a lot of what is going on in Unix-based systems. That includes almost every practical system on a PC, server, in the cloud, on your phone, and more. It’s probably easiest to think of it as everything that isn’t the Microsoft Windows OS.

A Shell and the Terminal

Terminal – A terminal is the text input and output environment on the system.

Shell – This is the command line interpreter that is run at the terminal.
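
If you’d like to see the two as separate things on your own machine, here’s a minimal sketch. It assumes a Unix-like system where the standard tty and ps utilities are available, and the variable names are just mine for illustration.

```shell
# Which terminal device is this session attached to?
# tty prints a device path such as /dev/pts/0, and exits non-zero when
# standard input is not a terminal (for example inside a pipe or cron job).
terminal=$(tty) || terminal="none (not attached to a terminal)"

# Which shell program is interpreting these lines? $$ is the shell's own PID.
shell_name=$(ps -p "$$" -o comm= 2>/dev/null || true)
shell_name=${shell_name:-unknown}

echo "terminal: $terminal"
echo "shell:    $shell_name"
```

Run it from two different terminal windows and the terminal line should change while the shell line stays the same, which is the distinction in a nutshell.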

Another point of context is that terminal, shell, and console are all used in various ways, sometimes interchangeably. However, these words do not mean the same thing at all. They are distinct, individual parts of the system. For example the console, a word used in a strangely disingenuous way all over Microsoft phrasing, is the physical terminal of the system. It is where the system terminal, i.e. the thing I’ve described above, actually runs so that we can type and interface with it as humans.

Albeit, as happens in English, these definitions aren’t always used in their exact, appropriate, and pedantically correct sense today. For example, many at Microsoft argue that the console is just the terminal and the terminal is the console. Sure, ok, that’s fine, I can still follow along in the conversation. Knowing the distinction adds context for when someone steps out of line and uses the more historically specific definition in a conversation.

Alright, that’s all groovy, so now we can get back to just talking about the shell, all the power it gives the Unix/Linux/POSIX System user, and touch on the terminal or console as we need to with full context of what these things actually are!

Introducing Bash!

Alright, with that little bit of context around the Bourne Shell, let’s talk about what we’ve actually got running today as our shell in our terminal on our console on the computers we work with! Years later, Brian Fox wrote a replacement for the Bourne Shell. He released it in 1989 and over the years it became a kind of de facto replacement for the Bourne Shell. The name ‘BASH’ stands for Bourne Again SHell.

The Bash command syntax is a superset of the Bourne Shell syntax. It provides a wide range of features, including ideas drawn from the Korn shell (ksh) and the C shell (csh), such as command line editing, command history, the directory stack, the $RANDOM and $PPID variables, and POSIX command substitution syntax. If many of those things make you think, “WTF are these variables and such?” have no worries, I’ll get to ’em soon enough in this series!
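
For a quick taste before those future entries, here’s a small sketch touching a few of the features just listed. It assumes you’re running it with bash itself (a plain /bin/sh may not have $RANDOM or the directory stack), and /tmp is only used as a convenient directory that should exist on most systems.

```shell
#!/usr/bin/env bash
# $RANDOM: bash hands back a pseudo-random integer from 0 to 32767
# every time the variable is expanded.
roll=$(( RANDOM % 6 + 1 ))
echo "dice roll: $roll"

# $PPID: the process ID of whatever started this shell.
echo "parent process: $PPID"

# POSIX command substitution: $(...) captures a command's output.
today=$(date +%Y-%m-%d)
echo "today: $today"

# Directory stack: pushd remembers where you were, popd takes you back.
start=$PWD
pushd /tmp > /dev/null
echo "now in: $PWD"
popd > /dev/null
echo "back in: $PWD"
```

Command line editing and command history are interactive features, so those are easier to try by hand at the prompt than in a script.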

But with that, this is the beginning of many short entries on tips, tricks, tutorials, syntax, history, and context of bash so until next time, cheers!


PIP Install Trash Fire – AWS CLI Issues & Fixes for Installation

I sat down again, for about the eight billionth time, to install AWS’s CLI and get to work against some infrastructure. For at least the gazillionth time I got a stupid error that, at this point, I really feel the installation process ought to ignore, mitigate, or otherwise handle. But anyway, here’s one very common issue that keeps popping up over and over if you’ve already got a version of the six package installed that isn’t the version awscli wants (or doesn’t want).


Found existing installation: six 1.4.1
Cannot uninstall ‘six’. It is a distutils installed project and thus we cannot accurately determine which files belong to it which would lead to only a partial uninstall.

The quick solution to this is to just tell pip to ignore the existing six installation.

pip install awscli --ignore-installed six --user

I did a quick search too, just to see what others had found, and it looked like this wasn’t an entirely uncommon issue. On Stack Overflow this same issue comes up with other installations too.

I’d love to see this, and about a dozen other odd issues that always come up specifically with the awscli, get fixed. The simple fix is for AWS to just go ahead and dig into how GCP builds gcloud, because that CLI is easily the slickest option of the big three. But I digress, back to work on getting some infrastructure resources out there.

What do you want?

I’m in the process (with the team I work with) of trying to figure out what would be most useful to you, the community in which we all work and its members. Whether you’re a coder, working toward being a coder, programmer, engineer, or whatever it is that you aim for, we want to know what would help you out. I produce a ton of material that I personally find entertaining and fun to produce, and I hope it’s useful for people. So – if you would, take a moment and answer these few questions. Thanks and cheers!

Strata, Ninjas, Distributed Data Day, and Graph Day Trip Recap

This last week was a helluva set of trips, conferences to attend, topics to discuss, and projects to move forward on. In this post I’ll attempt to run through the gamut of events and the graph of things traversing from the conference nodes onward! (See what I did there, yeah, transpiling that graph verbiage onto events and related efforts!)

Monday Flight(s)

Monday involved some flying around the country for me via United. It was supposed to be a singular flight, but hey, why not some adventures around the country for shits and giggles right! Two TIL’s (Things I Learned) that I might have known already, but repetition reinforces one’s memory.

  1. If you think you’ve bought a nonstop ticket, be sure to verify that there isn’t a stopover halfway through the trip. If there are any delays or related changes your plane might be taken away, you’ll get shuffled off to who knows what other flight, and then you end up spending the whole day flying around instead of taking the 6 hour flight you were supposed to have.
  2. Twitter sentiment tends to be right: it’s good policy to avoid United. They schedule their planes, logistical positions, and crews in ways that generally become problematic quickly when there’s a mere minor delay or two.

Tuesday Strata Day Zero (Train & Workshop Day)

Tuesday rolled in and Strata kicked off with a host of activities. I rolled in to scope out our booth but overall, Tuesday was a low yield activity day. Eventually met up with the team and we rolled out for an impromptu team dinner, drinks, and further discussions. We headed off to Ninja, which if you haven’t been there it’s a worthy adventure for those brave enough. I had enough fun that I felt I should relay this info and provide a link or three so you too could go check it out.

Wednesday Strata Day One

Day one of Strata kicked off and my day involved mostly discussions with speakers, meetings, a few analyst discussions, and going around to booths to check out which technology I needed to add to my “check it out soon” list. Here are a few of the things I noted that are now on the list.

I also worked with the video team and cut some video introductions for Strata and upcoming DataStax Developer Days Announcements. DataStax Developer Days are free events coming to a range of cities. Check them out here and sign up for whichever you’re up for attending. I’m looking forward to teaching those sessions and learning from attendees about their use cases and domains in which they’re working.

The cities you’ll find us coming to soon:

I wish I could come and teach in every city but I narrowed it down to Chicago and Dallas, so if you’re in those cities, I look forward to meeting you there! Otherwise you’ll get to meet other excellent members of the team!

This evening we went to Death Ave. The food was great, drinks solid, and the name was simply straight up metal, albeit it’s a rather upper crust dining experience and no brutal metal was to be seen or heard. However, I’d definitely recommend the joint, especially for groups, as they have a whole room you can get if you’ve got enough people, and that improves the experience over standard dining.

Thursday Strata Day Two

I scheduled my flights oddly for this day, which in turn left me without any time to spend at Strata. But those are the issues one runs into when things are booked back to back on opposite coasts of the country! Thus, this day involved me returning to Newark via Penn Station and flying back out to San Francisco. As some of you may know, I’m a bit of a train geek, so I took a New Jersey NEC (Northeast Corridor) train headed for Trenton out of Penn back to the airport.

The train, whether you’re taking the Acela, Metroliner, NJ Transit, or whatever is rolling along to Newark that day, is the way to go in my opinion. I’ve taken the bus, which is slightly cheaper, but meh, it’s an icky east coast intercity bus. The difference in price is a buck or three or something, nothing significant, and of course you can jump in an Uber, taxi, or other transport also. Even when they can make it faster I tend to prefer the train. It’s just more comfortable, I don’t have to deal with a driver, and it’s more reliable. The turnpikes and roadways into NYC from Newark aren’t always 100%, and during rush hour don’t even expect to get to the city in a timely manner. To each their own, but for those that might not know, beware the taxi price range of $55 base plus tolls, which often will put your trip into Manhattan into the $99 or above price range. If you’re going to any of the other boroughs you’d better go ahead and take a loan out of the bank.

The trip from Newark to San Francisco was aboard United on a Boeing 757. I kid you not, regardless of airline, if you get to fly on a 757 versus a 737 or Airbus 319 or 320, it’s preferable. Especially for flights in the 2+ hour range. There is just a bit more space, the engines make less noise, the overall plane flies smoother, and the list of comforts is just a smidgen better all around. The 757 is the way to go for cross continent flights!

In San Francisco I took the standard BART route straight into the city and over to the Airbnb I was staying at in Potrero Hill. Right by Farley’s on Texas Street if you know the area. I often pick the area because it’s cheap (relatively), super chill, has good food nearby, isn’t really noisy, and is super close to where the Distributed Data Summit and Graph Day conference venue is located.

The rest of Thursday included some pizza and a short bout of hacking some Go. Then a moderately early turn in around midnight to get rested for the next day.

Friday Distributed Data Summit

I took the short stroll down Texas Street. While walking I watched a few Caltrain commuter trains roll by heading into downtown San Francisco. Eventually I got to 16th, crossed the rail line, and found the walkway through campus to the conference venue. I walked toward the building entrance and there was my fellow DataStaxian Amanda. We chatted a bit and then I headed over to check out the schedule and our DataStax booth.

We had a plethora of our rather interesting and fun new DataStax t-shirts. I’ll be picking some up week after next during our DevRel week get together. I’ll be hauling these back up to Seattle and could prospectively get some sent out to others in the US if you’re interested. Here are a few pictures of the t-shirts.

After that I joined the audience for Nate McCall’s keynote. It was good: he drew a good parallel between life and finding and starting to work with and on Cassandra. Good kick off, and after that I delved into a few other talks. Overall, all were solid, and some will even have videos posted on the DataStax Academy YouTube account. Follow me @Adron or the @DataStaxAcademy account to get the tweets when they’re live, or alternatively just subscribe to the YouTube channel (honestly, that’s probably the easiest way)!

After the conference wrapped up we rolled through some pretty standard awesome hanging out DevRel DataStax style. It involved the following ordered events:

  1. Happy hour at Hawthorne in San Francisco with drink tickets, some tasty light snacks, and most excellent conversation about anything and everything on the horizon for Cassandra and also a fair bit of chatter about what we’re lining up for upcoming DataStax releases!
  2. BEER over yonder at the world famous Mikkeller Bar. This place is always pretty kick ass. Rock n’ Roll, seriously stout beer, more good convo and plotting to take over the universe, and an all around good time.
  3. Chinese Food in CHINA TOWN! So good! Some chow mein, curry, and a host of things. I’m a big fan of always taking a walk into Chinatown in San Francisco and getting some eats. It’s worth it!

Alright, after that, unlike everybody else that then walked a mere two blocks to their hotel or had taken a Lyft back, I took a solid walk all the way down to the Embarcadero. Walked along for a bit until I decided I’d walked enough and boarded a T-third line train out to Dogpatch. Then walked that last 6 or so blocks up the hill to Texas Street. Twas an excellent night and a great time with everybody!

Saturday Graph Day

Do you do graph stuff? Lately I’ve started looking into graph database tech again, since I’ll be working on and putting together some reference material and code around the DataStax Graph Database that has been built onto the Cassandra distro. I’m still, honestly, kind of a newb at a lot of this but getting it figured out quickly. I do after all have a ton of things I’d like to put into and be able to query against from a graph database perspective. Lots of graph problems of course don’t directly correlate to a graph database being the solution, but it’s indeed part of the solution!

Overall, it was an easy day, the video team got a few more talks and I attended several myself. Again, same thing as previously mentioned: subscribe to the channel on YouTube or follow me on Twitter @Adron or the crew @DataStaxAcademy to get notified when the videos are released.

Summary

It has been a whirlwind week! Exhausting but worth it. New connections made, my own network of contacts and graph of understanding on many topics has expanded. I even got a short little time in New York among all the activity to do some studying, something I always love to break away and do. I do say though, I’m looking forward to getting back to the coding, Twitch streams, and the day to day in Seattle again. Got some solid material coming together and looking forward to blogging that too, and it only gets put together when I’m on the ground at home in Seattle.

Cheers, happy thrashing code!

Let’s Talk, Even If You’re Not Qualified!

Please help me out if you would and spread the word on this post via the Twitters, LinkedIn, or whatever method you might have to pass on the word. Thanks!

Alright, I don’t usually do this. But I’m going to delve into the topic of a job post we’re working to fill here where I work at DataStax. Specifically it’s the DataStax Developer Advocate Role, which is important to me for a multitude of reasons, but specifically because we’d get to work together. Under most circumstances, I’d probably just let the company look and look and look for somebody, but I’m invested in this team as I really enjoy the work we do and the camaraderie that we have. With that, the position.

We’re looking for someone with a multitude of skills, but more specifically an acumen and interest in learning, exploring data and systems, and helping developers build solid systems with DataStax Enterprise. But that’s just the surface. I wrote about what I specifically get to do as a developer advocate earlier in a post I titled “Evangelism, Advocacy, and Activism in the Technology Industry”.

If you’re thinking you might be disqualified from the position because you don’t have a specific part of the criteria list, just get that out of your mind. If you see parts that you have, interests that match, let’s talk and see if things would work.

If any of these criteria describe what you’re up to, ping me.

  • You’ve just got out of college and haven’t touched a distributed database before, but would really like to get into distributed databases and distributed systems and tell people about what you’ve learned. Let’s talk. Ping me @Adron.
  • You were in college, but thought “meh, I’m good” and want to join the workforce and have technical chops but have been looking for a good fit for a self-starter, self-organizing type of role. Let’s talk. Ping me @Adron.
  • You’d like to work with a team to make a product better, learn about, teach people what you’ve learned, write about your experiences, and even speak about your experiences. Let’s talk. Ping me @Adron.
  • You’ve built an application in C#, or Java, or Go, or JavaScript but not really done a lot of database work but are interested in going deeper. Let’s talk. Ping me @Adron.
  • You’d like to work with a team remotely, in a position you’d get to learn a lot, experiment, and build applications to test out ideas you have about application development. Let’s talk. Ping me @Adron.
  • You’d like to work on a team that isn’t toxic, has a healthy working practice, communicates regularly and effectively, enjoys learning, teaching, and helping each other get things learned, built, and deployed! Let’s talk. Ping me @Adron.

Hopefully that portrays the idea well. Emphasis on, get in touch with me. I’d love to chat about the role if you’re interested and see if you’d like to move forward. If dev advocate isn’t what you want we’ve got a number of awesome, remote, seriously cool jobs open right now. From site reliability to engineering to sales or what not. I’m happy to get you connected to the right people, so let’s talk. Ping me @Adron.

Evangelism, Advocacy, and Activism in The Technology Industry

Usually I just head to my local office in downtown Ballard, a neighborhood in Seattle that was and still is largely its own city. Today however I’ve boarded the 17x Express Bus into downtown Seattle. While in transit, as I always do, I just sat back introspecting on the day to be and the days past, while reading Jeff’s post “From Evangelist to Developer Advocates” on our occupation title changes. Recently we went from the somewhat inappropriately named Developer Evangelist to the more accurate title of Developer Advocate.

People have written about these titles in DevRel (Developer Relations) in the past, as have I. I wanted to add a few thoughts about these titles in this particular situation, and draw out some recent events where others seem to incorrectly, albeit with reason, conflate actual evangelism with advocacy. I’ll wrap up with another specific word that is important, activism, and how that comes to play in the tech industry also.

Spread the Word of God! Eh… ?!

Alright let’s get down to the real meat of the definition of the word evangelism.

Evangelism – 1 “the winning or revival of personal commitments to Christ” and 2 “militant or crusading zeal” so yeah, wow. Not the actual intention.

Most uses of the word center around spreading the gospel, specifically the gospel of the Christian God in the Bible, often in militant fashion and prospectively through genocidal eradication of peoples. Somewhere in the late 80’s or 90’s some tech company (I think Microsoft, if memory serves), partly in tongue in cheek jest, dubbed an occupation “evangelist”: someone who would go out and spread the good word of the technology. I only know parts of the myth and origin story, but suffice it to say, it was kind of a joke that stuck and at this point has been a title for well over a decade or two now. It’s one that sincerely should probably not be used anymore, as I hope nobody is militantly pushing technology on others.

Another note: many of us who hold this title, officially or unofficially, have gotten hit with this association, often in negative ways. For example, follow that thread for the trash fire it becomes and the horror of the iWill Estate troll account. But I do digress.

This is one of the dangers of tech appropriating titles and such (as it all too often does): it tends to create societal blowback that is more than unwelcome. But on toward a better future and a better title, right?

Advocacy – “the act or process of supporting a cause or proposal” or “the act or process of advocating”. Alright, now we’re on to something!

But seriously, evangelists in technology aren’t preachers, and according to statistics are dramatically more likely to just be atheist, so being an advocate is a far better fit. The occupation really is about advancing, working with, and showing others certain tools, languages, or related technologies within the technology industry. In this way, using the word advocate in the title is simply a more accurate and effective choice. It isn’t a word derived from jest, its definition itself aligns with the occupation, and in my not so humble opinion it sounds a lot better. I am, and always will be, an advocate for many different things.

Delving Further into Advocacy

Over the years I’ve done far more than merely advocacy work. I’ve worked in everything from labor, cooking, software development for startups and enterprises, security, teaching, enterprise desk jockey (I mean software engineer, but the difference is sometimes minimal), and a host of other work. Each had various ranges of activities that needed to be done that went far beyond the actual occupation title. The title is merely a poorly designed window that one can look into to see what an occupation entails. The real details need to be written down, and really thought about in detail outside of the title itself. The following is a list of the top key things I do as a developer advocate.

  1. I write code for reference AND production. I write lots of code in a number of languages (very polyglot, much wow, very confusion). I work through the problems and plights of different technology stacks. I work with systems, operations, and all the intertwining characteristics in between. Sometimes only at a very high level architecturally and other times at the deeper level of shifting bits and fighting pointer errors in C. The idea however, is the technology situation isn’t just the mythical nonsensical full stack as the code schools say, but the real life from hardware to software, top to bottom full stack of intricate and often frustrating detail! In summary, it’s a blast if you’re a curious type that likes to bounce around in the various domain problem spaces.
  2. I extensively get to and know how to travel, well. This one gets a little personal. I don’t just book flights and stay in hotels. Often I wouldn’t even need to do this but I like to make a point that I will handle my own travel, and expense it as needed. The stress of traveling inefficiently can end up being the death knell of being an advocate. It can lead to burn out, sickness (yes, actually being sick), and other health related issues. Matter of fact this topic will be another entire blog entry, or entries, that I’ll write on the matter. But let’s just say I travel on a semi-frequent basis at this point. A nice cool 1-2 weeks out of every 2-3 months. Which in many ways is minimal for a lot of advocacy and related positions. More on this topic in a future post.
  3. SSO and Cartesian password nightmare management. I’ve never in my life had to manage as many usernames and passwords as when working as a developer advocate. The reasoning is simple: as with consulting, I often end up helping out with a lot of different systems. But also, in doing development for reference applications, I end up having access to so many machines that need to be recreated, keys that need to be rolled, and related things that it is almost overwhelming. Password keepers are a life saver. Automation keeps me sane.
  4. I don’t not code asshole. As an advocate I routinely have to deal with that one asshole at a conference or a talk who wants to try to “call me out” or complain that I don’t really “have responsibilities” or related rude, crass, asshole behavior. At this point in my life, I simply disregard such comments but I still need to manage these comments and the individuals making them so they don’t detract from what I’m trying to provide and help people with. I will also admit, [TRIGGER WARNING-start] as an advocate that is a cis-gendered white male I get the privilege of not having to also defend myself for my gender identity, sex, or related identity, but even then it’s still a pain and I can only imagine what others that aren’t cis-gendered white males deal with. [TRIGGER WARNING-end] The tech industry has a lot of assholes, and as an advocate I get to learn how to manage them on an almost daily basis. I’d rather not have to do it, but I’m out here to learn as much as I am out here to teach others about application development, databases, and related technology. To all the other 98% of people that are friendly to me, thanks, I appreciate it, keep up the good work! Beers (or your beverage of choice) on me next round!
  5. I advocate for the developer. This can mean a number of things; from organizing developer focused conferences to getting bugs reported to meeting with and discussing future product paths with developers and product teams. In many ways I am a matchmaker of minds, connecting those that can take action to those that seek action, that look to better the tools we use. One could say, in this effort I’m the bridge point. I actually have a pretty obscenely huge contact list because of this. I’m always thinking, “who could I connect this person with that also wants X to get built”? This is honestly one of the most mentally exhausting parts of my job, but also one that has huge rewards. What I can learn from those I connect often exceeds any wild expectation.

One More Word: Activism or Activist

I added this word as often, when one advocates, one also gets to work with people who are and will be activists. Before I continue, the definition.

Activism – “a doctrine or practice that emphasizes direct vigorous action especially in support of or opposition to one side of a controversial issue”.

There’s a specific reason I bring up activism. It isn’t specifically because of the current political climate in the world, but I’d be lying if I said that wasn’t part of it. Activism is something that is also interwoven into the software industry, from open source software itself to the free software movement. Activism is a very important and distinctive activity in the software industry. I bring this up because of the important parallels and some of the call outs I wanted to make. Get involved – anybody can – here’s the details.

Beyond this, I’ve been involved in a number of activist efforts that often converge with advocacy, albeit they often involve parking, bicycle advocacy, safe streets, urban city design, and related transportation and urban planning work.

References: AKA More reading on the topic!

An interesting post that simply asks the question and looks at some recent conversations on the topic.

This is a comment thread on Hacker News that is pretty insightful of the various perspectives of the various titles. But also with some interesting anecdotal information about what people have seen among Apple, Google, and other companies and how they orient these positions to work with the community.

This is a post that popped out at me. What does this actually even mean? Based on the words this actually sounds super creepy.

Another post I dug up on the topic reminds me of why we have so many hard issues with words meaning their written definition and then what we infer from their general meaning in actual daily use. For example this post seems to just skip over defining the words from the dictionary as a point of reference and just run with the assumed, or the writer’s assumed definition of what they’ve observed of the occupations using the word.

Here are a few posts from some other developer advocates, on the topic of what developer advocacy is.

A few of my past posts.

Finally my posts about watching the awesome team being built at Microsoft here and my fortune in finding and joining the awesome team at DataStax here.

Summary

It’s complicated, there’s no TLDR so just read and keep learning.

Distributed Systems: Cassandra, DataStax, a Short SITREP

SITREP = Situation Report. It’s military speak. 💂🏻‍♂️

Apache Cassandra is one of the most popular databases in use today. It has many characteristics and distinctive architectural details. In this post I’ll provide a description and some details for a number of these features and characteristics. Then, after that (i.e. toward the end, so skip there if you just want the differences) I’m going to summarize key differences with the latest release, DataStax Enterprise 6.

Cassandra Characteristics

Cassandra is a linearly scalable, highly available, fault tolerant, distributed database. That is, just to name a few of the most important characteristics. The Cassandra database is also cross-platform (runs on any operating system), multi-cloud (runs on and across multiple clouds), and can survive regional data center outages or even, in multi-cloud scenarios, entire cloud provider outages!

Columnar Store, Column Based, or Column Family? What? Ok, so you might have read a number of things about what Cassandra actually is. Let’s break this down. First off, a columnar, column store, or column oriented database guarantees data location for a single column in a node on disk. The column may span a bunch of or all of the rows, depending on where or how you specify partitions. However, this isn’t what the Cassandra Database uses. Cassandra is a column-family database.

A column-family storage architecture makes sure the data is stored based on locality of the data at the partition level, not the column level. Cassandra partitions group rows by a partition key, and within each partition the rows are clustered together by a specified clustering column or columns. Because of this, to query Cassandra you must know the partition key in order to avoid full data scans!

Cassandra aims to keep each partition on the same node and in the same location within a sorted strings table file (most commonly referred to as an SSTable). However, depending on the compaction strategy, a partition can end up split across multiple files on disk. So really, data locality isn’t guaranteed.

Column-family stores are great for high throughput writes and the ability to linearly scale horizontally (ya know, getting lots and lots of nodes in the cloud!). Reads using the partition key are extremely fast since this key points to exactly where the data resides. However, the same design often, at least last I knew, leads to a full scan of the data for any type of ad-hoc query that doesn’t use the partition key.
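
To make the locality idea a bit more concrete, here’s a toy model in shell. This is emphatically not how Cassandra stores data: I’m assuming one plain file per partition as a stand-in, and the insert, read_partition, and scan_all functions are hypothetical names I made up for the sketch.

```shell
# Toy model only: one file per partition stands in for partition-level locality.
store=$(mktemp -d)
trap 'rm -rf "$store"' EXIT

# insert <partition_key> <clustering_value> <payload>
insert() {
  printf '%s %s\n' "$2" "$3" >> "$store/$1"
  # Keep the rows of this partition ordered by the clustering value.
  sort -o "$store/$1" "$store/$1"
}

# Read by partition key: opens exactly one file.
read_partition() {
  cat "$store/$1"
}

# Ad-hoc query on a non-key value: has to look inside every partition.
scan_all() {
  grep -hF -- "$1" "$store"/*
}

insert sensor-a 2018-06-02 21.5
insert sensor-a 2018-06-01 20.9
insert sensor-b 2018-06-01 18.2

read_partition sensor-a   # touches only the sensor-a file
scan_all 18.2             # walks every file looking for a match
```

The point of the sketch is just the difference in work: a read by partition key knows where to go, while the ad-hoc search has no choice but to visit everything.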

A sort of historically trivial but important point is that the column-family term comes from the storage engine originally used, which was based on a key value store. The value was a set of column value tuples, which were often referred to as a family. Later this family was abstracted into partitions, and then the storage engine was matched to that abstraction. Whew, ok, so that’s a lot of knowledge being coagulated into a solid eh! [scuse’ my odd artful language use if you visualized that!]

With all of this described, and that little history sprinkled in, the description of Cassandra in the README.asc file of the actual Cassandra GitHub repo makes just a little more sense. The file starts off with a description:

Apache Cassandra is a highly-scalable partitioned row store. Rows are organized into tables with a required primary key.

Partitioning means that Cassandra can distribute your data across multiple machines in an application-transparent matter. Cassandra will automatically repartition as machines are added and removed from the cluster.

Row store means that like relational databases, Cassandra organizes data by rows and columns. The Cassandra Query Language (CQL) is a close relative of SQL.
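
The partitioning idea from that description can be sketched in a few lines of shell. Big assumption here: I’m using cksum and a simple modulo as a stand-in hash, and node_for_key and node_count are made-up names. Cassandra itself uses its own partitioner and a token ring, so treat this purely as an illustration of “the key decides the machine”.

```shell
# Toy partitioner: hash the partition key, then map the hash onto a node.
node_count=3

node_for_key() {
  # cksum prints "<crc> <byte count>"; keep only the CRC.
  hash=$(printf '%s' "$1" | cksum | cut -d ' ' -f 1)
  echo "node-$(( hash % node_count + 1 ))"
}

for key in user-1 user-2 user-3 user-42; do
  echo "$key -> $(node_for_key "$key")"
done
```

The same key always lands on the same node, which is what lets a read by partition key go straight to the right machine. A plain modulo like this would reshuffle most keys whenever node_count changes, which is one reason real systems reach for something smarter.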

Now that I’ve covered the 101 level of what Cassandra is I’ll give a look at DataStax and their respective offering.

DataStax

DataStax Enterprise at first glance might be a bit confusing since immediate questions pop up like, “Doesn’t DataStax make Cassandra?”, “Isn’t DataStax just selling support for Cassandra?”, or “Eh, wha, who is DataStax and what does this have to do with Cassandra?”. Well, I’m gonna tell ya all about where we are today regarding how all of these things fit.

Performance

DataStax provides a whole selection of amenities around a database, which is derived from the Cassandra Distributed Database System. The core product and these amenities are built into what we refer to as “DataStax Enterprise 6”. One of the specific differences is that the database engine itself has been modified out of band and now delivers 2x the performance of the standard Cassandra database engine. I was somewhat dubious when I joined, but after the third party benchmarks were completed that showed the difference I grew more confident. My confidence in this speed increase has grown as I’ve gotten to work with the latest version, and I can tell in more than a few situations that it’s faster.

Read Repair & NodeSync

If you already use Cassandra, read repair works a certain way and that still works just fine in DataStax Enterprise 6. But one also has the option of using NodeSync which can help eliminate scripting, manual intervention, and other repair operations.

Spark SQL Connectivity

There’s also an always on SQL Engine for automated uptime for apps using DataStax Enterprise Analytics. This provides a better level of service for analytics requests and end-user analytics. On a related note, DataStax Studio also has notebook support for Spark SQL now. Writing one’s Spark SQL gets a little easier with this option.

Multi-Cloud / Hybrid-Cloud

Another huge advantage of DataStax Enterprise is going multi-cloud or hybrid-cloud with DataStax Enterprise Cassandra. Between the Lifecycle Manager (LCM), OpsCenter, and related tooling, getting up and running with a cluster across a varying range of data centers, wherever they may be, is quick and easy.

Summary

I’ll be providing deeper dives into the particular technology, the specific differences, and more in the future. For now I’ll wrap up this post as I’ve got a few others coming distinctively related to distributed database systems themselves ranging from specific principles (like CAP Theorem) to operational (how to and best ways to manage) and development (patterns and practices of developing against) related topics.

Overall the solutions that DataStax offers are solid advantages if you’re stepping into any large scale data (big data or whatever one would call their plethora of data) needs. Over the coming months I’ve got a lot of material – from architectural research and guidance to tactical coding implementation work – that I’ll be blogging about and providing. I’m really looking forward to exploring these capabilities, being the developer advocate to DataStax for the community of users, and learning a thing or three million.