In Flight to Apache Cassandra Days

Another flight down to the Bay Area. Today it was Alaska Air Flight 330 from Seattle to San Jose. It was mostly a clear day at the start, with a solid layer of bright cloud cover as we exited Washington on the way down to Oregon. As we crossed over that arbitrary human-defined line between Oregon and California, nature presented us with even more perfectly glowing bright cloud cover. This is Cascadia after all, and it’s basically covered in clouds the majority of the time. On departure I also noted Bremerton has three aircraft carriers in dock along with the normal plethora of other naval vessels. The amount of naval power in the area is always pretty awe-inspiring.

Why was I in flight once again? I am heading down to teach with Jeff Carpenter (@jscarp) at the South Bay Cassandra User Group’s Cassandra Day events. These are single-day events, where we cover an introduction to Apache Cassandra, concepts of data modeling for Apache Cassandra, and then a wrap-up on application development with the respective drivers. Now if you aren’t in Santa Clara – or ya know, Menlo Park, San Jose, Oakland, San Francisco, or well, the surrounding area – there are other days scheduled! We also have days scheduled that aren’t even located in the Bay, so check out the full list of events:

https://www.datastax.com/company/events

NOTE: If you’re interested in Seattle, Portland, or Vancouver BC area events, scroll all the way down to the end of this blog entry. I’ve got more details for you!

Introduction to Apache Cassandra

In the introduction to Apache Cassandra we cover an overview of the architecture and features of the distributed database. We start off with a definition of a distributed hash ring and how Apache Cassandra uses it to store data across the nodes that make up an Apache Cassandra database. Moving on, we’ll get into the other capabilities, the trade-offs of data replication between nodes, configuration settings, and a lot more.
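
To make the hash ring idea a little more concrete, here’s a minimal sketch in Python. It’s a toy, not how Cassandra is actually implemented: the `HashRing` class, the node names, and the use of MD5 are all my own assumptions for illustration (Cassandra’s default partitioner uses Murmur3, and real clusters typically give each node many tokens). Each node gets one token on the ring, a partition key is hashed to a token, and the replicas are the next nodes clockwise from that point.

```python
import bisect
import hashlib


class HashRing:
    """Toy consistent-hash ring: each node owns exactly one token."""

    def __init__(self, nodes, replication_factor=3):
        self.replication_factor = replication_factor
        # Keep (token, node) pairs sorted so the ring can be searched by token.
        self._ring = sorted((self._token(node), node) for node in nodes)
        self._tokens = [token for token, _ in self._ring]

    @staticmethod
    def _token(value):
        # MD5 is a stand-in hash for this sketch, not Cassandra's partitioner.
        return int(hashlib.md5(value.encode("utf-8")).hexdigest(), 16)

    def replicas(self, partition_key):
        """Walk clockwise from the key's token and collect replica nodes."""
        start = bisect.bisect_left(self._tokens, self._token(partition_key))
        count = min(self.replication_factor, len(self._ring))
        # The modulo wraps around the end of the sorted list, closing the ring.
        return [self._ring[(start + i) % len(self._ring)][1] for i in range(count)]


# Hypothetical four-node cluster with three copies of every partition.
ring = HashRing(["node-a", "node-b", "node-c", "node-d"], replication_factor=3)
print(ring.replicas("user:42"))
```

The same key always lands on the same replicas, and adding a node only changes ownership for the slice of the ring next to its token, which is a big part of why this approach scales out so nicely.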

Data Modeling

For data modeling we start off with a short review of relational database data modeling, to provide something that is more familiar for many people. From there we build on concepts around denormalization and breaking apart the various normal forms, then get into the thinking and approach behind modeling an application in a distributed database, and go deeper with details specific to Apache Cassandra.
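
As a rough illustration of that shift from normalized to denormalized thinking, here’s a small Python sketch. Everything in it is made up for the example (the `users` and `videos` data, the `build_videos_by_user` helper, the table name), and plain dictionaries stand in for tables. The idea is simply that instead of joining at read time, we shape a table around the query “give me all videos for a user” and duplicate the user’s name into each row.

```python
from collections import defaultdict

# Normalized, relational-style data: two "tables" related by user_id.
users = {1: {"name": "Ada"}, 2: {"name": "Grace"}}
videos = [
    {"video_id": 10, "user_id": 1, "title": "Intro to rings"},
    {"video_id": 11, "user_id": 2, "title": "Tunable consistency"},
    {"video_id": 12, "user_id": 1, "title": "Partition keys"},
]


def build_videos_by_user(users, videos):
    """Denormalize into one partition per user_id, rows ordered by video_id."""
    table = defaultdict(list)
    for video in videos:
        table[video["user_id"]].append({
            "video_id": video["video_id"],
            "title": video["title"],
            # Duplicated on purpose so reads never need a join.
            "user_name": users[video["user_id"]]["name"],
        })
    for rows in table.values():
        # Sorting within a partition plays the role of a clustering column.
        rows.sort(key=lambda row: row["video_id"])
    return dict(table)


videos_by_user = build_videos_by_user(users, videos)
# A single lookup by partition key answers the query.
print(videos_by_user[1])
```

The trade-off is more storage and more work on writes in exchange for reads that touch a single partition.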

Application Development

For application development, focused on the Java language and technology stack, we’ll start with some concepts around how the drivers connect to and work with Apache Cassandra. We’ll open up some code too and get into some code changes and additions, to get more familiar with how the driver works and some of the capabilities of the driver itself.

Most of the code, concepts, and related material in use around Java and the tech stack are directly usable with C#, JavaScript, and even the community open source Go CQL Library.

Coming soon…

In the coming weeks (ok, maybe a month or two) we’ll be updating this material for Apache Cassandra v4. Additionally, I’m aiming to line up some half-day and probably some full-day workshops in the Cascadian region: Portland, Seattle, and Vancouver BC. They’ll be almost identical except for a few tweaks, but you’ll have to RSVP to find out the details!

Also, if you’re in between any of those cities and have a stop on the Amtrak Cascades, let me know and we’ll get an RSVP list started for your city and see if we can get the required attendee count to make it official!

Meetup Video: “Does the Cloud Kill Open Source?”

🆕 Had a great time at the last Seattle Scalability Meetup, and I’ve just finished processing and fixing up the talk video from it. I feel like I’ve finally gotten the process of streaming and putting things together post-stream down, so that I can make the videos available almost immediately afterwards.

Here @rseroter gives us a full review of various business models, open source licenses, and a solid situational report on cloud providers and open source.

Join the meetup group here: https://www.meetup.com/Seattle-Scalability-Meetup/

At the next meetup, on April 23rd, we’ve got Dr. Ryan Zhang coming in to talk about serverless options. More details and additional topic content will be coming soon.

Then in May, on the 28th, Guinevere (@guincodes) is going to present “The Pull Request That Wouldn’t Merge”. More details and additional topic content will be coming soon.

Here are some of the talks I streamed recently. Note, I didn’t have the gear set up all that well just yet, but the content is there!

Machine Learning, Protocols, Classification, and Clustering

Today Suz Hinton @noopkat and Amanda Moran @AmandaDataStax are presenting “Alternative Protocols – how offline machines can still talk to each other” and “Classification and Clustering Algorithms paired with Wine and Chocolate” respectively. The aim is to stream these talks tonight too on my Thrashing Code Twitch Channel. If you can attend in person, we’re almost at capacity, so make sure you snag one of the remaining RSVPs.

Here are some more details on the speakers for tonight.

Meetups Last Week: Serverless Identity and Security, Advanced XSS, RAFT Algorithms & Events, and Event Modeling.

Tuesday: Matthew Henderson, Serverless Identity and Security, then Naomi Bornemann, Advanced XSS Techniques.

Wednesday: James Nugent, RAFT Algorithm and Events, then Adam Dymitruk on Event Modeling.

Lena presents “So You Want to Run Data-Intensive Systems on Kubernetes”

If you’re interested in running data-intensive systems (think Apache Cassandra, DataStax Enterprise, Kafka, Spark, TensorFlow, Elasticsearch, Redis, etc.) in Kubernetes, this is a great talk. @Lenadroid covers what options are available in Kubernetes and how architectural features around pods, jobs, stateful sets, and replica sets work together to provide distributed systems capabilities. She also delves into custom resource definitions (CRDs), operators, and Helm charts, including future and peripheral capabilities that can help you host various complex distributed systems. I’ve included references below the video here, enjoy.

References: