KubeCon 2018 Mission Objectives: Is the developer story any better?

This is my second year at KubeCon. This year I had a very different mission than I did last year. Last year I wanted to learn more about services meshes, what the status of various features around Kubernetes like stateful sets, and overall get a better idea of what the overall shape of the product was and where the project was headed.

This year, I wanted to find out two specific things.

  1. Developer Story: Has the developer story gotten any better?
  2. Database Story: Do any databases, and their respective need for storage, have a good story on Kubernetes yet?

Well, I took my trusty GoPro Camera, mounted it up like the canon the Predator uses on my should and departed. I was going to attend this conference with a slightly different plan of attack. I wanted to have video and not only take a few notes, attend some sessions, and generally try to grok the information I collected that way. My thinking went along the lines, with additional resources, I’d be able to recall and use even more of the information available. With that, here’s the video I shot while perusing the showroom floor and some general observations. Below the fold and video I’ll have additional commentary, links, and updates along with more debunking the cruft from the garbage from the good news!

Gloo-01Gloo – Ok, this looks kind of interesting. I stopped to look just because it had interesting characters. When I strolled up to the booth I listened for a few minutes, but eventually realized I needed to just dig into what docs and collateral existed on the web. Here’s what’s out there, with some quick summaries.

Gloo exists as an application gateway. Kind of a mesh of meshes or something, it wasn’t immediately clear. But I RTFMed the Github Repo here and snagged this high level architecture diagram. Makes it interesting and prospectively offers insight to its use.

high_level_architecture

Gloo has some, I suppose they’re “sub” projects too. Here’s a screenshot of the set of em’. Click it to navigate to solo.io which appears to be the parent organization. Some pretty cool software there, lots of open source goodness. It also leads me to think that maybe this is part of that first point above that I’m looking for, which is where is the improved developer story?

gloo.png

More on that later, for now, I want to touch on one more thing before moving on to next blog posts about the KubeCon details I’m keen to tell you about.

ballerina-logo

Ballerina – Ok, when I approached the booth I wasn’t 100% sure this was going to be what I was wanting it to be. Upon getting a demo (in video too) and then returning to the web – as ya do – then digging into the details and RTFMing a bit I have become hopeful. This stack of technology looks good. Let’s review!

The description on the website describes Ballerina as,

A compiled, transactional, statically and strongly typed programming language with textual and graphical syntaxes. Ballerina incorporates fundamental concepts of distributed system integration and offers a type safe, concurrent environment to implement microservices with distributed transactions, reliable messaging, stream processing, and workflows.

which sounds like something pretty solid, that could really help developers build on – let’s say Kubernetes – in a very meaningful way. However it could also expand far beyond just Kubernetes, which is something I’ve wanted to see, and help developers expand and expedite their processes and development around line of business applications. Which currently, is still the same old schtick with the now ancient RAD tools all the way to today’s React and web tools without a good way to develop those with understanding or integrations with modern tooling like Kubernetes. It’s always, jam it on top, config a bunch of yaml, and toss it over the wall still.

A few more key points of how Ballerina is described on the website. Here’s the stated philosophy goal,

Ballerina makes it easy to write cloud native applications while maintaining reliability, scalability, observability, and security.

Observability and security eh? Ok, I’ll be checking into this further, along with finally diving into Rust in the coming weeks. It looks like 2019 is going to be the year I delve into more than one new language! Yikes, that’s gonna be intense!

TiDB – Clearly the team at PingCap didn’t listen to the repeated advice of “don’t write your own database, it’ll…” from Charity Majors and a zillion other people who have written their own databases! But hey, this one, regardless of the advice being unheeded, looks kind of interesting. Right in the TiDB repo they’ve got an architecture diagram which is…  well, check out the diagram first.

tidb-architecture

So it has a mySQL app protocol, that connects with the TiDB cluster, which then has a DistSQL API (??) and KV API connecting to the TiKV (which stands for Ti Key Value) which is a cluster, that then uses a DistSQL API to connect the other direction to a Spark Cluster. The Spark SQL can then be used. It appears the running theme is SQL all the things.

Above this, to manage the clusters and such there’s a “PD Cluster” which I also need to read about. If you watched the video above, you’ll notice the reference to it being the ZooKeeper of the system. This “PD Cluster ZooKeeper” thing manages the metadata, TSO Data Location and data location pertinent to the Spark Cluster. Overall, 4 clusters to manage the system.

Just for good measure (also in the video) the TiDB is built in Go, while the TiKV is built in Rust, and some of the data location or part of the Spark comms are handled with some Java Virtual Machine – I think – I might have misunderstood some of the explanation. So how does all this work? I’ve no idea at this point, but I’m curious to find out. But before that in the next few weeks and months I’m going to be delving into building applications in Node.js, Java, and C# against Cassandra and DataStax Enterprise, so I might add some cross-comparisons against TiDB.

Also, even though I didn’t get to have a conversation with anybody from Foundation DB I’m interested in how it’s working these days too, especially considering it’s somewhat storied history. But hey, what project doesn’t have a storied history these days right! Stay tuned, subscribe here on the blog, and I’ll have updates when that work and other videos, twitch streams, and the like are published.

After all those conversations and running around the floor, at this point I had to take a coffee break. So with that, enjoy this video on how to appropriately grab good coffee real quick and an amazing cookie treat. Cheers!

Yes, I mispelled “dummy” it’s ok I don’t want to re-render it. I also know that the cookie name is kind of vulgar LOLz but you know what, welcome to Seattle we love ya even when you’re mind is in the gutter!

Top 10 West Coast Confs for 2019

I’ve been putting together a list of conferences that I want to aim to attend this coming year. I made it, then thought, “somebody else could use this list probably” so here it is. If you think of any other specific conferences I ought to add and attend please leave a comment. Enjoy!

March 7-10 is SCALE Southern California Linux Expo in Pasadena, California

March 25-28 is O’Reilly Strata in San Francisco, California

April 26-28 is LinuxFest Northwest in Bellingham, Washington

June 3-5 is Monitorama in Portland, Oregon

June 10-13 is O’Reilly Velocity in San Jose, California

June 10-13 is O’Reilly Software Architecture Conference SACON in San Jose, California

July 15-18 is O’Reilly OSCON in Portland, Oregon

August 21-23 is the Open Source Summit in San Diego, California

September 9-12 is the O’Reilly Artificial Intelligence in San Jose, California

November 18-21 KubeCon 2019 in San Diego, California

Without Dates – Conferences that are really great that don’t currently have a date just yet.

Polyglot Conf in Vancouver BC

Seattle Code Camp

Microsoft Build

GDG DevFest

What others should I add that are awesome Seattle or immediate surrounding area conferences?

 

DataStax Developer Days

Over the last week I had the privilege and adventure of coming out to Chicago and Dallas to teach about operations and security capabilities of DataStax Enterprise. More about that later in this post, first I’ll elaborate on and answer the following:

  • What is DataStax Developer Day? Why would you want to attend?
  • Where are the current DataStax Developer Day events that have been held, and were future events are going to be held?
  • Possibilities for future events near a city you live in.

What is DataStax Developer Day?

The way we’ve organized this developer day event at DataStax, is focused around the DataStax Enterprise built on Apache Cassandra product, however I have to add the very important note that this isn’t merely just a product pitch type of thing, you can and will learn about distributed databases and systems in a general sense too. We talk about a number of the core principles behind distributed systems such as the pivotally important consistent hash ring, datacenter and racks, gossip, replication, snitches, and more. We feel it’s important that there’s enough theory that comes along with the configuration and features covered to understand who, what, where, why, and how behind the configuration and features too.

The starting point of the day’s course material is based on the idea that one has not worked with or played with a Apache Cassandra or DataStax Enterprise. However we have a number of courses throughout the day that delve into more specific details and advanced topics. There are three specific tracks:

  1. Cassandra Track – this track consists of three workshops: Core Cassandra, Cassandra Data Modeling, and Cassandra Application Development. [more details]
  2. DSE Track – this track consists of three workshops: DataStax Enterprise Search, DataStax Enterprise Analytics, and DataStax Enterprise Graph. [more details]
  3. Bonus Content – This track has two workshops: DataStax Enterprise Overview and DataStax Enterprise Operations and Security.  [more details]

Why would you want to attend?

  • One huge rad awesome reason is that the developer day events are FREE. But really, nothing is ever free right? You’d want to take a day away from the office to join us, so there’s that.
  • You also might want to even stay a little later after the event as we always have a solidly enjoyable happy hour so we can all extend conversations into the evening and talk shop. After all, working with distributed databases, managing data, and all that jazz is honestly pretty enjoyable when you’ve got awesome systems like this to work with, so an extended conversation into the evening is more than worth it!
  • You’ll get a firm basis of knowledge and skillset around the use, management, and more than a few ideas about how Apache Cassandra and DataStax Enterprise can extend your system’s systemic capabilities.
  • You’ll get a chance to go beyond merely the distributed database system of Apache Cassandra itself and delve into graph, what it is and how it works, analytics, and search too. All workshops take a look at the architecture, uses, and what these capabilities will provide your systems.
  • You’ll also have one on one time with DataStax engineers, and other technical members of the team to ask questions, talk about architecture and solutions that you may be working on, or generally discuss any number of graph, analytics, search, or distributed systems related questions.

Where are the current DataStax Developer Day events that have been held, and were future events are going to be held? So far we’ve held events in New York City, Washington DC, Chicago, and Dallas. We’ve got two more events scheduled with one in London, England and one in Paris, France.

Future events? With a number of events completed and a few on the calendar, we’re interested in hearing about future possible locations for events. Where are you located and where might an event of this sort be useful for the community? I can think of a number of cities, but organizing them into order to know where to get something scheduled next is difficult, which is why the team is looking for input. So ping me via @Adron, email, or just send me a quick message from here.

Strata, Ninjas, Distributed Data Day, and Graph Day Trip Recap

This last week was a helluva set of trips, conferences to attend, topics to discuss, and projects to move forward on. This post I’ll attempt to run through the gamut of events and the graph of things that are transversing from the conference nodes onward! (See what I did there, yeah, transpiling that graph verbiage onto events and related efforts!)

Monday Flight(s)

Monday involved some flying around the country for me via United. It was supposed to be a singular flight, but hey, why not some adventures around the country for shits and giggles right! Two TIL’s (Things I Learned) that I might have known already, but repetition reinforces one’s memory.

  1. If you think you’ve bought a nonstop ticket be sure to verify that there isn’t a stopover half way through the trip. If there’s any delays or related changes your plane might be taken away, you’ll get shuffled off to who know’s what other flight, and then you end up spending the whole day flying around instead of the 6 hour flight you’re supposed to have.
  2. Twitter sentiment tends to be right, it’s good policy to avoid United, they schedule their planes and the logistical positions and crews in ways that generally become problematic quickly when there’s a mere minor delay or two.

Tuesday Strata Day Zero (Train & Workshop Day)

Tuesday rolled in and Strata kicked off with a host of activities. I rolled in to scope out our booth but overall, Tuesday was a low yield activity day. Eventually met up with the team and we rolled out for an impromptu team dinner, drinks, and further discussions. We headed off to Ninja, which if you haven’t been there it’s a worthy adventure for those brave enough. I had enough fun that I felt I should relay this info and provide a link or three so you too could go check it out.

Wednesday Strata Day One

Day two of Strata kicked off and my day involved mostly discussions with speakers, meetings, a few analyst discussions, and going around to booths to check out which technology I needed to add to my “check it out soon” list. Here are a few of the things I noted and are now on the list.

I also worked with the video team and cut some video introductions for Strata and upcoming DataStax Developer Days Announcements. DataStax Developer Days are free events coming to a range of cities. Check them out here and sign up for whichever you’re up for attending. I’m looking forward to teaching those sessions and learning from attendees about their use cases and domains in which they’re working.

The cities you’ll find us coming to soon:

I wish I could come and teach in every city but I narrowed it down to Chicago and Dallas, so if you’re in those cities, I look forward to meeting you there! Otherwise you’ll get to meet other excellent members of the team!

This evening we went to Death Ave. The food was great, drinks solid, and the name was simply straight up metal. Albeit it be a rather upper crust dining experience and no brutal metal was in sight to be seen or heard. However, I’d definitely recommend the joint, especially for groups as they have a whole room you can get if you’ve got enough people and that improves the experience over standard dining.

Thursday Strata Day Two

I scheduled my flights oddly for this day. Which in turn left me without any time to spend at Strata. But that’s the issues one runs into when things are booked back to back on opposite coasts of the country! Thus, this day involved me returning to Newark via Penn Station and flying back out to San Francisco. As some of you may know, I’m a bit of a train geek, so I took a New Jersey NEC (Northeast Corridor) train headed for Trenton out of Penn back to the airport.

The train, whether you’re taking the Acela, Metroliner, NJ Transit, or whatever is rolling along to Newark that day is the way to go in my opinion. I’ve taken the bus, which is slightly cheaper, but meh it’s an icky east coast intercity bus. The difference in price in a buck or three or something, nothing significant, and of course you can jump in an Uber, Taxi, or other transport also. Even when they can make it faster I tend to prefer the train. It’s just more comfortable, I don’t have to deal with a driver, and they’re more reliable. The turnpikes and roadways into NYC from Newark aren’t always 100%, and during rush hour don’t even expect to get to the city in a timely manner. But to each their own, but for those that might not know, beware the taxi price range of $55 base plus tolls which often will put your trip into Manhattan into the $99 or above price range. If you’re going to any other boroughs you better go ahead and take a loan out of the bank.

The trip from Newark to San Francisco was aboard United on a Boeing 757. I kid you not, regardless of airline, if you get to fly on a 757 versus a 737 or Airbus 319 or 320, it’s preferable. Especially for flights in the 2+ hour range. There is just a bit more space, the engines make less noise, the overall plane flies smoother, and the list of comforts is just a smidgen better all around. The 757 is the way to go for cross continent flights!

In San Francisco I took the standard BART route straight into the city and over to the airbnb I was staying at in Protrero Hill. Right by Farley’s on Texas Street if you know the area. I often pick the area because it’s cheap (relatively), super chill, good food nearby, not really noisy, and super close to where the Distributed Data Summit and Graph Day Conferences Venue is located.

The rest of Thursday included some pizza and a short bout of hacking some Go. Then a moderately early turn in around midnight to get rested for the next day.

Friday Distributed Data Summit

I took the short stroll down Texas Street. While walking I watched a few Caltrain Commuter Trains roll by heading into downtown San Francisco. Eventually I got to 16th and cross the rail line and found the walkway through campus to the conference venue. Walked toward the building entrance and there was my fellow DataStaxian Amanda. We chatted a bit and then I headed over to check out the schedule and our DataStax Booth.

We had a plethora of our rather interesting and fun new DataStax tshirts. I’ll be picking some up week after next during our DevRel week get together. I’ll be hauling these back up to Seattle and could prospectively get some sent out to others in the US if you’re interested. Here’s a few pictures of the tshirts.

After that joined the audience for Nate McCall’s Keynote. It was good, he put together a good parallel of life and finding and starting to work with and on Cassandra. Good kick off, and after I delved into a few other talks. Overall, all were solid, and some will even have videos posted on the DataStax Academy Youtube Account. Follow me @Adron or the @DataStaxAcademy account to get the tweets when they’re live, or alternatively just subscribe to the YouTube Channel (honestly, that’s probably the easiest way)!

After the conference wrapped up we rolled through some pretty standard awesome hanging out DevRel DataStax style. It involved the following ordered events:

  1. Happy hour at Hawthorne in San Francisco with drink tickets, some tasty light snacks, and most excellent conversation about anything and everything on the horizon for Cassandra and also a fair bit of chatter about what we’re lining up for upcoming DataStax releases!
  2. BEER over yonder at the world famous Mikeller Bar. This place is always pretty kick ass. Rock n’ Roll, seriously stout beer, more good convo and plotting to take over the universe, and an all around good time.
  3. Chinese Food in CHINA TOWN! So good! Some chow mein, curry, and a host of things. I’m a big fan of always taking a walk into Chinatown in San Francicsco and getting some eats. It’s worth it!

Alright, after that, unlike everybody else that then walked a mere two blocks to their hotel or had taken a Lyft back, I took a solid walk all the way down to the Embarcadero. Walked along for a bit until I decided I’d walked enough and boarded a T-third line train out to Dogpatch. Then walked that last 6 or so blocks up the hill to Texas Street. Twas an excellent night and a great time with everybody!

Saturday Graph Day

Do you do graph stuff? Lately I’ve started looking into Graph Database tech again since I’ll be working on and putting together some reference material and code around the DataStax Graph Database that has been built onto the Cassandra distro. I’m still, honestly kind of a newb at a lot of this but getting it figured out quickly. I do after all have a ton of things I’d like to put into and be able to query against from a graph database perspective. Lot’s of graph problems of course don’t directly correlate to a graph database being a solution, but it’s indeed part of the solution!

Overall, it was an easy day, the video team got a few more talks and I attended several myself. Again, same thing as previously mentioned subscribe to the channel on Youtube or follow me on Twitter @Adron or the crew @DataStaxAcademy to get notified when the videos are released.

Summary

It has been a whirlwind week! Exhausting but worth it. New connections made, my own network of contacts and graph of understanding on many topics has expanded. I even got a short little time in New York among all the activity to do some studying, something I always love to break away and do. I do say though, I’m looking forward to getting back to the coding, Twitch streams, and the day to day in Seattle again. Got some solid material coming together and looking forward to blogging that too, and it only gets put together when I’m on the ground at home in Seattle.

Cheers, happy thrashing code!

ML4ALL LiveStream, Talks & More

If you’re attending, or if you’re at the office or at home, you can check out the talks as they go live on the ML4ALL Youtube Channel! Right now during the conference we also have the live feed on the channel, so if you’re feeling a little FOMO this might help a little. Enjoy!

Here are a few gems that are live already!

Manuel Muro “Barriers To Accelerating The Training Of Artificial Neural Networks”

-> Introduction of Manuel

Jon Oropeza “ML Spends A Year In Burgundy”

-> Introduction of Jon

Igor Dziuba “Teach Machine To Teach: Personal Tutor For Language Learners”

-> Introduction to Igor

Barriers To Accelerating The Training Of Artificial Neural Networks – A Systemic Perspective – Meet Manuel Muro

manuel-muroThe real breakthrough for the modern Artificial Intelligence (AI) and Machine Learning (ML) technology explosions started back in 1943 when researchers McCulloch & Pitts came up with a mathematical model to represent that function of the biological neuron; nature’s gift that allows all life to operate and learn over time. Eventually this research would then give birth to the Artificial Neural Network (ANN).

Continue reading “Barriers To Accelerating The Training Of Artificial Neural Networks – A Systemic Perspective – Meet Manuel Muro”

Conducting a Data Science Contest in Your Organization w/ Ashutosh Sanzgiri

ashutosh-sanzgiriAshutosh Sanzgiri (@sanzgiri) is a Data Scientist at AppNexus, the world’s largest independent Online Advertising (Ad Tech) company. I develop algorithms for machine learning products and services that help digital publishers optimize the monetization of their inventory on the AppNexus platform.

Ashutosh has a diverse educational and career background. He’s attained a Bachelor’s degree in Engineering Physics from the Indian Institute of Technology, Mumbai, a Ph.D. in Particle Physics from Texas A&M University and he’s conducted Post-Doctoral research in Nuclear Physics at Yale University. In addition to these achievements Ashutosh also has a certificate in Computational Finance and an MBA from the Oregon Health & Sciences University.

Prior to joining AppNexus, Ashutosh has held positions in Embedded Software Development, Agile Project Management, Program Management and Technical Leadership at Tektronix, Xerox, Grass Valley and Nike.

Scaling Machine Learning (ML) at your organization means increasing ML knowledge beyond the Data Science (DS) department. Several companies have Data / ML literacy strategies in place, usually through an internal data science university or a formal training program. At AppNexus, we’ve been experimenting with different ways to expand the use of ML in our products and services and share responsibility for its evaluation. An internal contest adds a competitive element, and makes the learning process more fun. It can engage people to work on a problem that’s important to the company instead of working on generic examples (e.g. “cat vs dog” classification), and gives contestants familiarity with the tools used by the DS team.

In this talk, Ashutosh will present the experience of conducting a “Kaggle-style” internal DS contest at AppNexus. he’ll discuss our motivations for doing it and how we went about it. Then he’ll share the tools we developed to host the contest. The hope being you too will find inspiration to try something in your organization!