Category Archives: Video

Twitz Coding Session in Go – Cobra + Viper CLI Wrap Up + Twitter Data Retrieval

Part 3 of 3 – Coding Session in Go – Cobra + Viper CLI for Parsing Text Files, Retrieval of Twitter Data, and Exports to various file formats.

UPDATED PARTS:

  1. Twitz Coding Session in Go – Cobra + Viper CLI for Parsing Text Files
  2. Twitz Coding Session in Go – Cobra + Viper CLI with Initial Twitter + Cassandra Installation
  3. Twitz Coding Session in Go – Cobra + Viper CLI Wrap Up + Twitter Data Retrieval (this post)

Updated links to each part will be posted at the bottom of this post when I publish them. For code, written walkthrough, and the like, scroll down below the video and timestamps.

0:54 The thrashing introduction.
3:40 Getting started, with a recap of the previous sessions, but I haven’t got the sound on, so ignore this until 5:20.
5:20 I notice, and turn on the volume. Now I manage to get the recap, talking about some of the issues with the Twitter API. I step through setup of the app and getting the appropriate IDs and such for the Twitter API keys and secrets.
9:12 I open up the code base and review where the previous sessions got us to, using Cobra with Go, along with the parsing and refactoring that was previously done.
10:30 Here I talk about configuration again and the specifics of getting it set up for running the application.
12:50 Talking about the Go fatal panic I was getting. The dependency reference to Github for the application was different than what was in the application, so it didn’t show the code that was actually executing. I show a quick fix and move on.
17:12 Back to the Twitter API use via the go-twitter library. Here I review the issue, and what the fix was, for another issue I was having in the previous session with getting the active token! I thought the library handled it, but that wasn’t the case!
19:26 Now I step through creating a function to get the active OAuth bearer token to use.
28:30 After deleting much of the code that didn’t work from the last session, I go about writing the code around handling the retrieval of Twitter results for the various passed-in Twitter accounts.

The bulk of the next section is where I work through a number of functions, a little refactoring, and answering some questions from the audience/Twitch Chat (working on a way to get it into the video!), fighting with some dependency tree issues, and a whole slew of silliness. Once that wraps up I get some things committed into the Github repo and wrap up the core functionality of the Twitz Application.

58:00 Reviewing some of the other examples in the go-twitter library repo. I also do a quick review of the other function calls from the library that take action against the Twitter API.
59:40 I review one of the PR’s I submitted to the project itself and merge it into the repo; it adds documentation and a build badge to the README.md.
1:02:48 Here I add some more information about the configuration settings to the README.md file.

1:05:48 The Twitz page is now updated: https://adron.github.io/twitz/
1:06:48 Setup of the continuous integration for the project on Travis CI itself: https://travis-ci.org/Adron/twitz
1:08:58 Setup of the actual travis.yml file for Go. After this I go through a few stages of troubleshooting getting the build going, with some white space issues in the ole’ yaml file and such. Including, also, the famous casing issue! Ugh!
1:26:20 Here I start a wrap up of what is accomplished in this session.

NOTE: Yes, I realize I spaced and forgot the feature where I export it out to Apache Cassandra. Yes, I will indeed have a future stream where I build out the part that exports the responses to Apache Cassandra! So subscribe, stay tuned, and I’ll get that one done ASAP!!!

1:31:10 Further CI troubleshooting as one build is green and one build is yellow. More CI troubleshooting! Learn about the travis yaml here.
1:34:32 Finished, just the badass outro now!

The Codez

In the previous posts I outlined two specific functions that were built out:

  • Part 1 – The config function for the twitz config command.
  • Part 2 – The parse function for the twitz parse command.

In this post I focused on updating both of these and adding additional functions for the bearer token retrieval for auth and ident against the Twitter API and other functionality. Let’s take a look at what the functions looked like and read like after this last session wrap up.

The config command basically ended up being 5 lines of fmt.Printf functions to print out pertinent configuration values and environment variables that are needed for the CLI to be used.
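
Something along these lines; a sketch, where the first three values match the config command code from Part 1 below, and the consumer key and secret names are my assumption of the remaining environment-variable-backed values:

	Run: func(cmd *cobra.Command, args []string) {
		fmt.Printf("Twitterers File: %s\n", viper.GetString("file"))
		fmt.Printf("Export File: %s\n", viper.GetString("fileExport"))
		fmt.Printf("Export Format: %s\n", viper.GetString("fileFormat"))
		// Assumed key names - the Twitter API credentials the CLI needs.
		fmt.Printf("Consumer Key: %s\n", viper.GetString("consumer_key"))
		fmt.Printf("Consumer Secret: %s\n", viper.GetString("consumer_secret"))
	},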

The parse command changed a small bit. A fair amount of the functionality I refactored out into the buildTwitterList(), exportFile, and rebuildForExport functions. The buildTwitterList() function I put in the helpers.go file, which I’ll cover a little later. But in this file, which could still use some refactoring (which I’ll get to), I have several pieces of functionality: the export-to-format functions, and the if/else-if logic of the exportParsedTwitterList function, sketched below.
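
That dispatch logic reads roughly like this; a sketch, where the format-specific function names (exportTxt and friends) are placeholders of mine rather than confirmed names from the repo:

	package cmd

	import "fmt"

	// exportParsedTwitterList writes the parsed list out in the configured
	// format, dispatching to a format-specific export function.
	func exportParsedTwitterList(fileName string, fileFormat string, twitterList []string) {
		if fileFormat == "txt" {
			exportTxt(fileName, twitterList)
		} else if fileFormat == "json" {
			exportJSON(fileName, twitterList)
		} else if fileFormat == "xml" {
			exportXML(fileName, twitterList)
		} else if fileFormat == "csv" {
			exportCSV(fileName, twitterList)
		} else {
			fmt.Printf("Export format %s not supported.\n", fileFormat)
		}
	}

	// Placeholder writers - the real implementations live in the repo.
	func exportTxt(fileName string, list []string)  {}
	func exportJSON(fileName string, list []string) {}
	func exportXML(fileName string, list []string)  {}
	func exportCSV(fileName string, list []string)  {}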

Next up after parse, it seems fitting to cover the helpers.go file code. First I have the check function, which simply wraps the routinely copied error handling code snippet. Check out the file directly for that. Then below that I have the buildTwitterList() function, which gets the config setting for the name of the file to open and parse for Twitter accounts. The code reads the file, splits the results of the text file into fields, then steps through and parses out the Twitter accounts. This is done with a REGEX (I know, I know, now I have two problems, but hey, this is super simple!). It basically finds fields that start with an @, verifies they’re alphanumeric (with a possible underscore), and then removes unnecessary characters from those fields. All that wraps up by putting the fields into a string slice and returning it to the calling code.
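
Here’s a minimal sketch of those two helpers, assuming they sit in the same cmd package, ioutil-era file reading (this was pre-Go 1.16), and a handle pattern along the lines described above; the exact regex and cleanup in the repo may differ:

	package cmd

	import (
		"io/ioutil"
		"regexp"
		"strings"

		"github.com/spf13/viper"
	)

	// check wraps the routinely copy-pasted error handling snippet.
	func check(e error) {
		if e != nil {
			panic(e)
		}
	}

	// buildTwitterList reads the configured file, splits it into fields,
	// and parses out anything that looks like a Twitter handle.
	func buildTwitterList() []string {
		content, err := ioutil.ReadFile(viper.GetString("file"))
		check(err)

		fields := strings.Fields(string(content))

		// Fields starting with @, followed by alphanumerics or underscores.
		re := regexp.MustCompile(`^@[A-Za-z0-9_]+`)

		var twitterList []string
		for _, field := range fields {
			if handle := re.FindString(field); handle != "" {
				// Trim the @ and drop whatever stray characters the match excluded.
				twitterList = append(twitterList, strings.TrimPrefix(handle, "@"))
			}
		}
		return twitterList
	}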

The next function in the helpers.go file is the getBearerToken function. This was a tricky bit of code. This function takes in the consumer key and secret from the Twitter app (check out the video at 5:20 for where to set it up). It returns a string and an error: an empty string if there’s an error, as shown below.

The code starts out by establishing a POST request against the Twitter API, asking for a token and passing the client credentials. It catches an error if that doesn’t work out, but if it can, the code then sets up the b64Token variable with the standard base64 encoding functionality once it has the key and secret as a byte array. After that the request gets its headers built with the needed authorization and content-type names and values, then the request is made with http.DefaultClient.Do(req). The response is returned, or a nil response along with the error. Next up is the defer to ensure the response body is closed when everything is done.

Next up, the JSON result is parsed (unmarshalled) into the v struct, which I now realize, as I write this, I probably ought to rename to something that isn’t a single letter. But it works for now, and v has the pertinent AccessToken field, which is then returned.
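
Pulled together, here’s a sketch of the whole function; the endpoint, headers, and JSON shape follow Twitter’s documented application-only auth flow, while the exact variable handling in the repo may differ slightly:

	package cmd

	import (
		"encoding/base64"
		"encoding/json"
		"io/ioutil"
		"net/http"
		"strings"
	)

	// getBearerToken exchanges the Twitter app's consumer key and secret
	// for an application-only bearer token. On error it returns an empty
	// string along with the error.
	func getBearerToken(consumerKey, consumerSecret string) (string, error) {
		// Establish a POST request against the Twitter API, asking for a
		// token via the client credentials grant.
		req, err := http.NewRequest("POST", "https://api.twitter.com/oauth2/token",
			strings.NewReader("grant_type=client_credentials"))
		if err != nil {
			return "", err
		}

		// Base64 encode the key:secret pair for the basic auth header.
		b64Token := base64.StdEncoding.EncodeToString(
			[]byte(consumerKey + ":" + consumerSecret))

		// Build the headers with the needed authorization and content-type.
		req.Header.Add("Authorization", "Basic "+b64Token)
		req.Header.Add("Content-Type", "application/x-www-form-urlencoded;charset=UTF-8")

		resp, err := http.DefaultClient.Do(req)
		if err != nil {
			return "", err
		}
		// Defer to ensure the response body is closed when everything is done.
		defer resp.Body.Close()

		body, err := ioutil.ReadAll(resp.Body)
		if err != nil {
			return "", err
		}

		// Unmarshal the JSON result into the (regrettably named) v struct.
		var v struct {
			AccessToken string `json:"access_token"`
		}
		if err := json.Unmarshal(body, &v); err != nil {
			return "", err
		}
		return v.AccessToken, nil
	}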

Wow, ok, that’s a fair bit of work. Up next, the findem.go file and the related function for twitz. Here I start off with a few informative prints to the console, just to know where the CLI has gotten to at certain points. The Twitter list is put together, reusing that same function – yay code reuse, right! Then the access token is retrieved. Next up, the http client is built, the twitter client is passed that and initialized, and the user lookup request is sent. Finally the users are printed out, followed by a count of how many users were retrieved.
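
Sketched out with the go-twitter and oauth2 libraries, and building on the helpers above, the flow reads roughly like this; the config key names and print formatting are my assumptions, while Users.Lookup and the oauth2 client setup follow the libraries’ documented usage:

	package cmd

	import (
		"fmt"

		"github.com/dghubble/go-twitter/twitter"
		"github.com/spf13/viper"
		"golang.org/x/oauth2"
	)

	// findem looks up the parsed Twitter accounts against the Twitter API.
	func findem() {
		fmt.Println("Starting Twitter account lookup.")

		// Reuse the same helper the parse command uses - yay code reuse!
		twitterList := buildTwitterList()

		// Retrieve the application-only bearer token.
		token, err := getBearerToken(
			viper.GetString("consumer_key"), viper.GetString("consumer_secret"))
		check(err)

		// Build the http client from the token, pass it to the twitter
		// client, and initialize it.
		config := &oauth2.Config{}
		httpClient := config.Client(oauth2.NoContext, &oauth2.Token{AccessToken: token})
		client := twitter.NewClient(httpClient)

		// Send the user lookup request for the whole list of accounts.
		users, _, err := client.Users.Lookup(&twitter.UserLookupParams{
			ScreenName: twitterList,
		})
		check(err)

		// Print the users, then a count of how many came back.
		for _, user := range users {
			fmt.Printf("%s (@%s): %s\n", user.Name, user.ScreenName, user.Description)
		}
		fmt.Printf("Count: %d users found.\n", len(users))
	}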

I realized, just as I wrapped this up, that I completely spaced on the Apache Cassandra export. I’ll have that post coming soon and will likely do another refactor to get the output into a more usable state before I call this one done. But the core functionality, setup of the systemic environment needed for the tool, the pertinent data and API access, and other elements are done. For now, that’s a wrap. If you’re curious about the final refactor and the Apache Cassandra export, then subscribe to my Twitch @adronhall and/or my YouTube channel ThrashingCode.

UPDATED SERIES PARTS

    1. Twitz Coding Session in Go – Cobra + Viper CLI for Parsing Text Files
    2. Twitz Coding Session in Go – Cobra + Viper CLI with Initial Twitter + Cassandra Installation
    3. Twitz Coding Session in Go – Cobra + Viper CLI Wrap Up + Twitter Data Retrieval (this post)


Cedrick Lunven on Creating an API for your database with Rest, GraphQL, gRPC

Here’s a talk by Cedrick Lunven (who I have the fortune of working with!) about creating APIs for your database – your distributed database. He starts out with a few objectives for the talk:

  1. Provide you a working API implementing Rest, gRPC, and GraphQL.
  2. Give implementation details through Demo.
  3. Reveal hints on which to choose, and WHY (specifically for working with databases).

Other topics include specific criteria around conceptual data models and shifting from a relational to a distributed columnar store, with differentiation between entities, relationships, queries, and their respective behaviors. All of this is pertinent to our KillrVideo reference application too.

Enjoy!

KubeCon 2018 Mission Objectives: Is the developer story any better?

This is my second year at KubeCon. This year I had a very different mission than I did last year. Last year I wanted to learn more about service meshes and the status of various features around Kubernetes, like stateful sets, and overall get a better idea of the shape of the product and where the project was headed.

This year, I wanted to find out two specific things.

  1. Developer Story: Has the developer story gotten any better?
  2. Database Story: Do any databases, and their respective need for storage, have a good story on Kubernetes yet?

Well, I took my trusty GoPro camera, mounted it up like the cannon the Predator uses on my shoulder, and departed. I was going to attend this conference with a slightly different plan of attack. I wanted to have video, and not only take a few notes, attend some sessions, and generally try to grok the information I collected that way. My thinking went along the lines of: with additional resources I’d be able to recall and use even more of the information available. With that, here’s the video I shot while perusing the showroom floor, along with some general observations. Below the fold and video I’ll have additional commentary, links, and updates, along with more sifting of the cruft and garbage from the good news!

Gloo – Ok, this looks kind of interesting. I stopped to look just because it had interesting characters. When I strolled up to the booth I listened for a few minutes, but eventually realized I needed to just dig into what docs and collateral existed on the web. Here’s what’s out there, with some quick summaries.

Gloo exists as an application gateway. Kind of a mesh of meshes or something; it wasn’t immediately clear. But I RTFMed the Github repo here and snagged this high level architecture diagram. It makes things interesting and prospectively offers insight into its use.

(Gloo high level architecture diagram)

Gloo has some, I suppose they’re “sub” projects too. Here’s a screenshot of the set of ’em. Click it to navigate to solo.io, which appears to be the parent organization. Some pretty cool software there, lots of open source goodness. It also leads me to think that maybe this is part of that first point above that I’m looking for: where is the improved developer story?

(Screenshot of the solo.io projects)

More on that later, for now, I want to touch on one more thing before moving on to next blog posts about the KubeCon details I’m keen to tell you about.


Ballerina – Ok, when I approached the booth I wasn’t 100% sure this was going to be what I wanted it to be. Upon getting a demo (in video too), then returning to the web – as ya do – and digging into the details and RTFMing a bit, I have become hopeful. This stack of technology looks good. Let’s review!

The description on the website describes Ballerina as,

A compiled, transactional, statically and strongly typed programming language with textual and graphical syntaxes. Ballerina incorporates fundamental concepts of distributed system integration and offers a type safe, concurrent environment to implement microservices with distributed transactions, reliable messaging, stream processing, and workflows.

which sounds like something pretty solid that could really help developers build on – let’s say – Kubernetes in a very meaningful way. However, it could also expand far beyond just Kubernetes, which is something I’ve wanted to see: something to help developers expand and expedite their processes and development around line-of-business applications. That space is currently still the same old schtick, from the now-ancient RAD tools all the way to today’s React and web tools, without a good way to develop with understanding of, or integration with, modern tooling like Kubernetes. It’s always: jam it on top, config a bunch of yaml, and toss it over the wall, still.

A few more key points of how Ballerina is described on the website. Here’s the stated philosophy goal,

Ballerina makes it easy to write cloud native applications while maintaining reliability, scalability, observability, and security.

Observability and security eh? Ok, I’ll be checking into this further, along with finally diving into Rust in the coming weeks. It looks like 2019 is going to be the year I delve into more than one new language! Yikes, that’s gonna be intense!

TiDB – Clearly the team at PingCAP didn’t listen to the repeated advice of “don’t write your own database, it’ll…” from Charity Majors and a zillion other people who have written their own databases! But hey, this one, regardless of the advice being unheeded, looks kind of interesting. Right in the TiDB repo they’ve got an architecture diagram, which is… well, check out the diagram first.

(TiDB architecture diagram)

So it has a MySQL app protocol that connects with the TiDB cluster, which then has a DistSQL API (??) and a KV API connecting to TiKV (which stands for Ti Key Value), a cluster that then uses the DistSQL API to connect in the other direction to a Spark cluster. Spark SQL can then be used. The running theme appears to be: SQL all the things.

Above this, to manage the clusters and such, there’s a “PD Cluster” which I also need to read about. If you watched the video above, you’ll notice the reference to it being the ZooKeeper of the system. This “PD Cluster ZooKeeper” thing manages the metadata, the TSO, and the data location, including what’s pertinent to the Spark cluster. Overall, four clusters to manage the system.

Just for good measure (also in the video), TiDB is built in Go, while TiKV is built in Rust, and some of the data location or part of the Spark comms are handled with a Java virtual machine – I think – I might have misunderstood some of the explanation. So how does all this work? I’ve no idea at this point, but I’m curious to find out. Before that, though, in the next few weeks and months I’m going to be delving into building applications in Node.js, Java, and C# against Cassandra and DataStax Enterprise, so I might add some cross-comparisons against TiDB.

Also, even though I didn’t get to have a conversation with anybody from FoundationDB, I’m interested in how it’s working these days too, especially considering its somewhat storied history. But hey, what project doesn’t have a storied history these days, right? Stay tuned, subscribe here on the blog, and I’ll have updates when that work and other videos, Twitch streams, and the like are published.

After all those conversations and running around the floor, at this point I had to take a coffee break. So with that, enjoy this video on how to appropriately grab good coffee real quick and an amazing cookie treat. Cheers!

Yes, I misspelled “dummy”; it’s ok, I don’t want to re-render it. I also know that the cookie name is kind of vulgar, LOLz, but you know what, welcome to Seattle, we love ya even when your mind is in the gutter!

Property Graph Modeling with an FU Towards Supernodes – Jonathan Lacefield

Some notes along with this talk, which is about ways to mitigate super nodes, partitioning strategies, and related efforts. Jonathan’s talk is vendor neutral, even though he works at DataStax. Albeit that’s not odd to me, since that’s how we roll at DataStax anyway. We take pride in working with DSE but also in knowing the various products out there; as things are, we’re all database nerds after all. (more below the video)

In the video, I found the definition slide for super nodes to be perfect.

(Super node definition slide)

See that super node? Wow, Florida is just covered up by the explosive nature of that super node! YIKES!

In the talk Jonathan also delves deeper into vertices, adjacent vertices, and their respective neighbors, with definitions along the way, so it’s a great talk to watch even if you’re not up to speed on graph databases, graph math, and all that related knowledge.

(Slides: super node impacts and traversal)

The super node problem he continues on to describe has two specific facets that are detailed: query performance during traversals, and storage retrieval. For example, a Gremlin traversal (one’s query) moves along creating traversers until it hits a super node, where a computational explosion occurs.

Whatever your experience, this talk has some great knowledge to expand your ideas on how to query, design, and set up the data in your graph databases to work against. Along with that, more than a few elements of knowledge about what not to do when designing a schema for your graph data. Give it a listen, it’s worth your time.


Twitz Coding Session in Go – Cobra + Viper CLI for Parsing Text Files

Part 1 of 3 – Coding Session in Go – Cobra + Viper CLI for Parsing Text Files, Retrieval of Twitter Data, Exports to various file formats, and export to Apache Cassandra.

UPDATED PARTS:

  1. Twitz Coding Session in Go – Cobra + Viper CLI for Parsing Text Files (this post)
  2. Twitz Coding Session in Go – Cobra + Viper CLI with Initial Twitter + Cassandra Installation
  3. Twitz Coding Session in Go – Cobra + Viper CLI Wrap Up + Twitter Data Retrieval

Updated links to each part will be posted at the bottom of this post when I publish them. For code, written walkthrough, and the like, scroll down; I’ll have code samples toward the bottom, under the video and timestamps, so scroll, scroll, scroll.

3:40 Stated goals of application. I go through a number of features that I intend to build into this application, typing them up in a text doc. The following are the items I added to the list.

  1. The ability to import a text file of Twitter handles that are mixed in among a bunch of other text.
  2. The ability to clean up that list.
  3. The ability to export the cleaned up list of handles to a text, csv, JSON, xml, or plain text file.
  4. The ability, using the Twitter API, to pull data from the bio, latest tweets, and other related information.

8:26 Creating a new Github repo. A few tips and tricks for creating a new repo on Github.

9:46 Twitz is brought into existence! Woo
~~ I have to re-set up my SSH key real quick. This is a quick tutorial if you aren’t sure how to do it; otherwise skip forward to the part where I continue with project setup and initial coding.

12:40 Call out for @justForFunc and Francesc’s episode on Cobra!

Check out @Francesc’s “Just for Func”. It’s a great video/podcast series of lots and lots of Go!

13:02 Back to project setup. Cloning the repo, getting the initial README.md and .gitignore file setup, and related collateral for the basic project. I also add Github’s “issues” files via the Github interface and rebase those changes in later.
14:20 Adding some options to the .gitignore file.
15:20 Set dates and copyright in license file.
16:00 Further setup of project, removing WIKIs, projects, and reasoning for keeping issues.
16:53 Opening Goland up after an update. Here I start going through the specific details of how I setup a Go Project in Goland, setting configuration and related collateral.
17:14 Setup of Goland’s golang Command Line Launcher.
25:45 Introduction to Cobra (and first mention of Viper by association).
26:43 Installation of Cobra with go get github.com/spf13/cobra/cobra/ and the gotchas (remember to have your paths set, etc).
29:50 Using Cobra CLI to build out the command line interface app’s various commands.
35:03 Looking over the generated code and commenting on the comments and their usefulness.
36:00 Wiring up the last bits of Cobra with some code, via the main func, for CLI execution (a sketch of that wiring follows the timestamps below).
48:07 I start wiring up the Viper configuration at this point. Onward from here it’s coding, coding, and configuration, and more coding.
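
That main func wiring is the standard output of the Cobra generator; for this repo it looks roughly like this, assuming the generated cmd package lives at the repo path:

	package main

	import "github.com/Adron/twitz/cmd"

	func main() {
		// Hand execution off to Cobra's root command.
		cmd.Execute()
	}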

Implementing the `twitz config` Command

1:07:20 Confirming config and working up the twitz config command implementation.
1:10:40 First execution of twitz config. I describe where and how Cobra’s documentation strings work through the --help flag and are put into other help output based on the need for the short or long description.

The basic config code, when implemented, looked something like this at the end of the session. This snippet of code doesn’t do much beyond display the configuration variables from the configuration file. Eventually I’ll add more to this file so the CLI can be easier to debug when setting it up.

package cmd

import (
	"fmt"
	"github.com/spf13/cobra"
	"github.com/spf13/viper"
)

// configCmd represents the config command
var configCmd = &cobra.Command{
	Use:   "config",
	Short: "A brief description of your command",
	Long: `A longer description that spans multiple lines and likely contains examples
and usage of using your command. For the custom example:
Cobra is a CLI library for Go that empowers applications.
This application is a tool to generate the needed files
to quickly create a Cobra application.`,
	Run: func(cmd *cobra.Command, args []string) {
		fmt.Printf("Twitterers File: %s\n", viper.GetString("file"))
		fmt.Printf("Export File: %s\n", viper.GetString("fileExport"))
	fmt.Printf("Export Format: %s\n", viper.GetString("fileFormat"))
	},
}

func init() {
	rootCmd.AddCommand(configCmd)
}

Implementing the `twitz parse` Command

1:14:10 Starting on the implementation of the twitz parse command.
1:16:24 Inception of my “helpers.go” file to consolidate and clean up some of the code.
1:26:22 REGEX implementation time!
1:32:12 Trying out the REGEX101.com site. (A small sketch of the kind of pattern involved follows below.)
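
For anyone following along, here’s a minimal, self-contained sketch of the kind of handle-matching pattern that REGEX101 session worked toward; the exact expression in the repo may differ:

	package main

	import (
		"fmt"
		"regexp"
	)

	func main() {
		// Matches an @handle of alphanumerics and underscores mixed into text.
		re := regexp.MustCompile(`@[A-Za-z0-9_]+`)
		fmt.Println(re.FindAllString("contact @adron or @Thrashing_Code today", -1))
		// Output: [@adron @Thrashing_Code]
	}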

I’ll post the finished code and write up some details on the thinking behind it when I post video two of this application development series.

That’s really it for this last session. Hope the summary helps for anybody working through building CLI apps with Go and Cobra. Until next session, happy thrashing code.

UPDATED SERIES PARTS

  1. Twitz Coding Session in Go – Cobra + Viper CLI for Parsing Text Files (this post)
  2. Twitz Coding Session in Go – Cobra + Viper CLI with Initial Twitter + Cassandra Installation
  3. Twitz Coding Session in Go – Cobra + Viper CLI Wrap Up + Twitter Data Retrieval

How to Become a Data Scientist in 6 Months: a Hacker’s Approach to Career Planning

I’ve been digging around for some good presentations, giving a listen to some videos from PyData London 2016. Out of the videos I watched, I chose this talk by Tetiana Ivanova. This is a good talk for those looking to get into working as a data scientist. Here’s a few notes I made while watching the video.

Why did Tetiana hack her career? Well, first off, she was a mathematician who wanted to get out of the world of academia and make a change.

…more, including references and links, below the video…

Why does our society support higher education? Tetiana emphasizes a number of things that intertwine in a person’s life, such as social pressure, historical changes, prestige, and status signaling, to dictate career options and provide choices in direction. I found this fascinating, as I’ve read about, and routinely notice, that higher education at an established and known school is largely about prestige and status signaling more than the actual education itself. As we see all the time in society, people will gain a position of status and power all too often based on the prestige and status arbitrarily associated with them because they went to this school or that school or some Ivy League fancy pants school.

But wrap all that up, and Tetiana outlines an important detail, “Prestige is exploitable!”

There are also some hard realities Tetiana bullet-points in a slide, which I found worthy of note.

  • Set a realistic time frame.
  • Don’t trust yourself with sticking to deadlines.
  • Make a study plan.
  • Prepare for uncertainty.

I can’t put enough emphasis on this list. There are also two points that I’d like to point out in even more detail.

First is “don’t trust yourself with sticking to deadlines“. The second is “prepare for uncertainty“. If you expect to meet all your deadlines, you are going to dramatically increase your need to prepare for uncertainty. Because you will simply not meet all of your deadlines. For many of us we won’t meet most of our deadlines, let alone all of the deadlines.

Tetiana continues to cover a number of details around willpower, self management and self organization, and an insightful take on the topic of nerd networking. Watch the talk, it’s a good one and will provide a lot of insight, from a clearly introspective and intelligent individual, into charting and stepping into a data science – or possibly other – career path!


Jonathan Ellis talks about Five Lessons in Distributed Databases

Notes on the talk…

  1. If it’s not SQL it’s not a database. Watch, you’ll get to hear why… ha!

Then Jonathan covers the recent history (sort of recent, the last ~20ish years) of the industry and how we’ve gotten to this point in database technology.

  2. It takes 5+ years to build a database.

Also, the tens of millions of dollars that go with that period of time. Both are needed, in droves: time and money.

…more below the video.

  3. The customer is always right.

Even when they’re clearly wrong, they’re largely right.

For numbers 4 and 5 you’ll have to watch the video. Lots of good stuff in this video, including comparisons of Cosmos DB, DynamoDB, Apache Cassandra, and DataStax Enterprise, how these distributed databases work, their performance (3rd party metrics are shown), and more details!