Thrashing Code Metal Monday for Week of April 22nd

This week’s selection is one that I’ll leave here to speak for itself.

 

 

Monday Metal Sliding in From The Side

This Monday included some metal of various sorts, from metal tubes flying through the air to some riffy madness to boot. I’ve needed an extra dose today! This first band is Parasite Inc. for a little German melodic death metal to end your Monday.

Next up, check out Orden Ogan for a wild mix on a dystopian future!

Last up, get your German ready because this is a twister in the court!

In other news I’m going to be laying down tracks for some of my own music in the coming days. Just popped the $$ for Superior Drummer 3 and am ramping up on that in the evenings. Join me on Twitch (@adronhall) to learn with me as I work through it all. Cheers, and happy thrashing code!

In Flight to Apache Cassandra Days

Another flight down to the Bay Area. Today it was Alaska Air Flight 330 from Seattle to San Jose. It was mostly a clear day at the start, with a solid layer of bright cloud cover as we exited Washington on the way down to Oregon. As we crossed over that arbitrary human-defined line between Oregon and California, nature presented us with even more perfectly glowing bright cloud cover. This is Cascadia after all, and it’s basically covered in clouds the majority of the time. On departure I also noted Bremerton has three aircraft carriers in dock along with the normal plethora of other naval vessels. The amount of naval power in the area is always pretty awe-inspiring.

Why was I in flight once again? I am heading down to teach with Jeff Carpenter (@jscarp) at the South Bay Cassandra User Group’s Cassandra Day events. These are single-day events where we cover an introduction to Apache Cassandra, concepts of data modeling for Apache Cassandra, and then a wrap-up of application development with the respective drivers. Now if you aren’t in Santa Clara – or ya know Menlo Park, San Jose, Oakland, San Francisco, or well, the surrounding area – there are other days scheduled! We also have days scheduled that aren’t even located in the Bay, so check out the full list of events:

https://www.datastax.com/company/events

NOTE: If you’re interested in Seattle, Portland, or Vancouver BC area events, scroll all the way down to the end of this blog entry, where I’ve got more details for you!

Introduction to Apache Cassandra

In the introduction to Apache Cassandra we cover an overview of the architecture and features of the distributed database. We start off with a definition of a distributed hash ring and how it is used in Apache Cassandra to spread data storage across the nodes that make up the Apache Cassandra database. Moving on, we’ll get into the other capabilities, the trade-offs of data replication between nodes, configuration settings, and a lot more.
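To make the hash ring idea a bit more concrete, here is a minimal sketch in Python. It is not how Cassandra is implemented: Cassandra's default partitioner uses Murmur3 tokens and virtual nodes, while this toy assumes one token per node and uses MD5 from the standard library as a stand-in hash. The names `token_for`, `HashRing`, and `replicas_for` are made up for illustration.

```python
import bisect
import hashlib


def token_for(key: str) -> int:
    """Map a key to a position on the ring (MD5 is only a stand-in hash)."""
    digest = hashlib.md5(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


class HashRing:
    """Toy ring: one token per node, replicas placed clockwise."""

    def __init__(self, nodes, replication_factor=3):
        if not nodes:
            raise ValueError("the ring needs at least one node")
        # Never ask for more replicas than there are nodes.
        self.replication_factor = min(replication_factor, len(nodes))
        # Sort nodes by token so the list reads like a walk around the ring.
        self._ring = sorted((token_for(node), node) for node in nodes)
        self._tokens = [token for token, _ in self._ring]

    def replicas_for(self, partition_key: str):
        """Return the nodes that would hold a copy of this partition key."""
        # The first node whose token is >= the key's token owns the key;
        # the modulo wraps around to the start of the ring.
        start = bisect.bisect_left(self._tokens, token_for(partition_key))
        return [
            self._ring[(start + offset) % len(self._ring)][1]
            for offset in range(self.replication_factor)
        ]


ring = HashRing(["node-a", "node-b", "node-c", "node-d"], replication_factor=3)
replicas = ring.replicas_for("user:42")
print(replicas)
```

The same key always lands on the same set of nodes, and adding a node only changes ownership for the slice of the ring next to its token, which is the property that makes this layout attractive for a distributed database.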

Data Modeling

For data modeling we start off with a short review of relational database data modeling to provide something that is more familiar for many people. From this, we build on concepts around denormalization and breaking apart the various normal forms, and then get into the thinking and approach behind modeling an application in a distributed database, going deeper with details specific to Apache Cassandra.
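As a rough illustration of the denormalization idea, here is a small Python sketch using plain dictionaries in place of tables. The users, videos, and the `build_videos_by_user` helper are all invented for this example; the point is only that the data gets reshaped around the query ("videos for a given user name") so a read needs no join.

```python
from collections import defaultdict

# Normalized, relational-style data: each fact is stored exactly once.
users = {1: "ada", 2: "grace"}
videos = [
    {"video_id": 10, "user_id": 1, "title": "Intro to rings"},
    {"video_id": 11, "user_id": 1, "title": "Replication"},
    {"video_id": 12, "user_id": 2, "title": "Compaction"},
]


def build_videos_by_user(users, videos):
    """Denormalize into a lookup keyed by user name, one per query."""
    table = defaultdict(list)
    for video in videos:
        user_name = users[video["user_id"]]
        # The user name is copied into every row, trading extra storage
        # for a read that touches a single key.
        table[user_name].append(
            {
                "user_name": user_name,
                "video_id": video["video_id"],
                "title": video["title"],
            }
        )
    return dict(table)


videos_by_user = build_videos_by_user(users, videos)
print(videos_by_user["ada"])
```

In a Cassandra data model the key of that dictionary would play the role of the partition key, and a second query (say, videos by title) would get its own, separately written table.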

Application Development

For application development, focusing on the Java language and technology stack, we’ll start with some concepts around how the drivers connect to and work with Apache Cassandra. We’ll open up some code too and make some changes and additions to get more familiar with how the driver works and some of its capabilities.

Most of the code, concepts, and related material we use around Java and its tech stack are directly applicable to C#, JavaScript, and even the community open source Go CQL library.

Coming soon…

In the coming weeks (ok, maybe a month or two) we’ll be updating this material for Apache Cassandra v4. Additionally, I’m aiming to line up some half-day and probably some full-day workshops in the Cascadian region: Portland, Seattle, and Vancouver BC. They’ll be almost identical except for a few tweaks, but you’ll have to RSVP to find out the details!

Also, if you’re in between any of those cities and have a stop on the Amtrak Cascades, let me know and we’ll get an RSVP list started for your city and see if we can get the required attendee count to make it official!

Zhi Yang Presenting “Hierarchical Topic Modeling in Cancer Patients’ Mutational Profiles”

Introducing Zhi Yang > @zhiiiyang < presenting “Hierarchical Topic Modeling in Cancer Research”.

Topic models have been widely applied to extract topics from a wide range of documents or collections of texts, e.g., online customer reviews, medical records, scientific journals, legal documents, books, etc. Applying them lets us quickly understand the most prominent and commonly shared information embedded in texts without actually reading through the entire collection. In addition, topic models also allow us to assess the contribution of each topic and its representation across different documents. Human genomes have been exposed to an assortment of mutational processes, each contributing unique patterns of somatic mutations. What would happen if we apply the same concept to the somatic mutations obtained from cancer patients and look for “topics” of mutations? What would these “topics” tell us about the most important information for our health, genetics, risk factors for cancer, and something more that slips under the radar?

Shiraishi et al. have proposed a topic model targeted at somatic mutations to capture the characteristics and burdens contributed by mutational processes. By closely examining the burdens, we’d like to compare them across different categories, say, for example, time, cancer subtype, ethnicity, smoking history, etc. Then, we’d like to develop the statistical machinery to infer the difference between the mutational profiles across different categories and associate the variations with the known exposures. This tool is potentially useful for identifying novel and existing mutational processes and correlating them with risk factors, which can later be used to monitor any treatment effects in personalized medicine and targeted therapy.
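For readers new to the "topics of mutations" framing, here is a toy Python sketch of the mixing step only. It is not the model from the talk or the paper: the two signatures, their probabilities, and the patient's burdens are invented numbers, and `expected_profile` is a hypothetical helper. It just shows how a patient's expected mutation profile can be read as a weighted mix of per-process distributions.

```python
import math

# The six base-substitution classes, used here as the "vocabulary".
MUTATION_TYPES = ["C>A", "C>G", "C>T", "T>A", "T>C", "T>G"]

# Hypothetical "topics": each is a probability distribution over the
# mutation types above. All numbers are invented for illustration.
signatures = {
    "signature_1": [0.05, 0.05, 0.70, 0.05, 0.10, 0.05],
    "signature_2": [0.60, 0.10, 0.10, 0.10, 0.05, 0.05],
}


def expected_profile(burdens, signatures):
    """Mix signatures by a patient's burdens (weights that sum to 1)."""
    if not math.isclose(sum(burdens.values()), 1.0):
        raise ValueError("burdens must sum to 1")
    profile = [0.0] * len(MUTATION_TYPES)
    for name, weight in burdens.items():
        for index, probability in enumerate(signatures[name]):
            profile[index] += weight * probability
    return dict(zip(MUTATION_TYPES, profile))


# A made-up patient whose mutations come mostly from signature_1.
patient_burdens = {"signature_1": 0.75, "signature_2": 0.25}
profile = expected_profile(patient_burdens, signatures)
print(profile)
```

Fitting a topic model runs this in the other direction: starting from observed mutation counts, it estimates both the signatures and each patient's burdens.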

Read the publication here at bioRxiv and come check out Zhi Yang’s talk at ML4ALL happening April 28th-30th in amazing Portland, Oregon! Get your tickets to attend here. For the schedule, our excellent sponsors, and docs for the conference, check out the ML4ALL Conference Site!

Sachi Parikh Presenting “My Journey Learning ML and AI through Self Study as a High School Student”

Introducing Sachi Parikh > @parikhsachi < presenting “My Journey Learning ML and AI through Self Study as a High School Student”.

Sachi is a high school student in the Bay Area who is interested in AI and Machine Learning and loves to code, read, and learn. In the talk she’s put together for us she’s delved into the path she’s taken to get into this topic. I’ve seen an outline of this path and I’ll admit, I’m impressed, but you’ll have to come and attend the talk to see the outline!

Come check out Sachi Parikh’s talk and learn about this learning path at ML4ALL happening April 28th-30th in amazing Portland, Oregon! Get your tickets to attend here. For the schedule, our excellent sponsors, and docs for the conference, check out the ML4ALL Conference Site!

Karl Weinmeister Presenting “Build, train, and serve your ML models on Kubernetes with Kubeflow”

Introducing Karl Weinmeister > @kweinmeister < presenting “Build, train, and serve your ML models on Kubernetes with Kubeflow”.

Karl is a Developer Advocacy Manager from Google’s Developer Relations Artificial Intelligence and Machine Learning team.  Karl has worked extensively in cloud and mobile, and was a contributor to one of the first AI-based crossword puzzle solvers that is still referenced today.

Distributing ML workloads across multiple nodes has become common. To achieve higher and higher levels of accuracy, data scientists are using more data and more complex models than ever before.

Kubeflow is an open-source platform for model building, serving, and training. It is built on industry standard Kubernetes infrastructure and runs in multiple clouds and on-premises.

In this session, we’ll discuss the problems that Kubeflow solves, and how you can use it to create reproducible ML workflows.

Come check out Karl Weinmeister’s talk at ML4ALL happening April 28th-30th in amazing Portland, Oregon! Get your tickets to attend here. For the schedule, our excellent sponsors, and docs for the conference, check out the ML4ALL Conference Site!

Aeva van der Veen Presenting “Gaming Rigs and ML Pipelines: how to get started with the tools you already have”

Introducing Aeva van der Veen > @aevavoom < presenting “Gaming Rigs and ML Pipelines: how to get started with the tools you already have”.

Aeva is an outspoken open source advocate with over a decade of experience contributing to F/OSS software and communities. They have been building distributed systems on Linux since ’99, and are best known for their work in the OpenStack community, where they founded Ironic, the Bare-Metal-as-a-Service project. Aeva lives in rainy Seattle and enjoys staying home when not travelling for work.

If you think that only big tech companies or PhD scientists can use ML & AI, I’d like to show you that an individual open-source enthusiast can build and train a model on commodity hardware using Open Data – and then scale that up on a public cloud.

And if you’re a PC gamer, you probably already have all the tools you need!

  • fast.ai, an easy-to-learn Python ML framework
  • nvidia-docker on an Ubuntu Gaming PC
  • public-domain GIS imagery
  • a couple terabytes of storage space and a fast internet connection

This talk grew out of a startup competition last year: we tried to use public-domain satellite imagery to help predict and prevent forest fires. Even though we chose not to pursue this as a business, it’s an excellent example of how to combine open source software, public data, and a gaming PC to build an ML pipeline.

Come check out Aeva van der Veen’s talk at ML4ALL happening April 28th-30th in amazing Portland, Oregon! Get your tickets to attend here. For the schedule, our excellent sponsors, and docs for the conference, check out the ML4ALL Conference Site!