How to Build an NPM Package, Beginning the Symphonize Project

NPM has built on the massive popularity of Node.js and helped drive JavaScript from a simple scripting language in the web browser to a powerful and capable back-end server language. As a quick refresher, NPM stands for Node.js Package Manager, and a package is any of the following:

  1. a folder containing a program described by a package.json file
  2. a gzipped tarball containing (1)
  3. a url that resolves to (2)
  4. a <name>@<version> that is published on the registry with (3)
  5. a <name>@<tag> that points to (4)
  6. a <name> that has a “latest” tag satisfying (5)
  7. a git url that, when cloned, results in (1)
Path structure view in Jetbrains Webstorm IDE.

With that basic understanding of what an NPM package is, let’s walk through the steps to build a module that provides some basic functionality. I won’t cover many parts in detail yet, just the happy path to getting an NPM library running.

First let’s create an appropriate folder and file structure. Here are the commands I ran to get started.

[sourcecode language="bash"]
mkdir bin
mkdir lib
[/sourcecode]

With these two directories in place, I created a file in each: symphonize.js in bin and main.js in lib.

Now, I added the following code to the symphonize.js file.

[sourcecode language="javascript"]
// Returns 'yes' when forThis occurs anywhere in searchThis, otherwise 'no'.
exports.Coupling = function (searchThis, forThis) {
  var returnValue = 'no';
  if (searchThis.indexOf(forThis) > -1) {
    returnValue = 'yes';
  }
  return returnValue;
};
[/sourcecode]
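
A quick way to sanity-check this logic is to call it with a match and a non-match. The sketch below repeats the function locally, under the illustrative name `coupling`, so that it runs on its own without the package layout. Note that `indexOf` is case-sensitive, so the third call does not match.

```javascript
// Local copy of the Coupling logic so this snippet runs standalone.
// The name "coupling" is chosen for this example only.
function coupling(searchThis, forThis) {
  return searchThis.indexOf(forThis) > -1 ? 'yes' : 'no';
}

console.log(coupling('Sample text', 'Sample'));  // yes
console.log(coupling('Sample text', 'missing')); // no
console.log(coupling('Sample text', 'sample'));  // no
```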

In the main.js file I added the following.

[sourcecode language="javascript"]
(function () {
  // Load the module from bin and call it once with sample input.
  var couple = require('../bin/symphonize');
  couple.Coupling("Sample text", "Sample");
}).call(this);
[/sourcecode]

I know there are a number of issues with this code, but it’s just a sample of the minimal code, folder structure and package.json that I need to get this package installed and ready for iteration as I move forward with the actual code base and its functionality. Speaking of the package.json file, I created one and added the following configuration settings to it.

[sourcecode language="javascript"]
{
  "author": "Adron Hall",
  "name": "symphonize",
  "description": "Prints out data to the console! Will be iterating soon for real functionality!",
  "version": "0.1.0",
  "repository": {
    "url": "git@github.com:Adron/symphonize.git"
  },
  "main": "./lib/main",
  "bin": {
    "replaceme": "./bin/symphonize"
  },
  "dependencies": {},
  "devDependencies": {},
  "optionalDependencies": {},
  "engines": {
    "node": "*"
  }
}
[/sourcecode]
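
Of the fields above, `name` and `version` are the two that NPM needs before it will publish anything. A minimal sketch of a pre-publish sanity check follows, assuming a hypothetical helper named `checkManifest` (it is not part of NPM) that takes the parsed manifest and returns a list of problems. The version test is deliberately simplified.

```javascript
// Hypothetical pre-publish check; checkManifest is not an NPM API.
function checkManifest(manifest) {
  const problems = [];
  // NPM needs both a name and a version to publish.
  if (!manifest.name) problems.push('missing name');
  if (!manifest.version) {
    problems.push('missing version');
  } else if (!/^\d+\.\d+\.\d+/.test(manifest.version)) {
    // Simplified: only checks that the version starts with major.minor.patch digits.
    problems.push('version does not look like major.minor.patch');
  }
  return problems;
}

console.log(checkManifest({ name: 'symphonize', version: '0.1.0' })); // []
console.log(checkManifest({ version: '1' }));
```

An empty array means the two required fields are present and the version has the expected shape.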

That is now enough for me to at least get the module added to the global NPM registry, get things pointed back to Github appropriately and move forward with actual coding. I might even set up some continuous builds and delivery at some point, since I’ve now got the end point where the libraries will be going. The command to upload a module to the NPM registry is shown below. It assumes, of course, that I’ve already added a user with npm adduser or created one via the web site interface at https://npmjs.org/.

[sourcecode language="bash"]
npm publish
[/sourcecode]

I’ve now got everything prepared and uploaded to NPM, and there is now a symphonize module ready for use.

My NPM Page for Symphonize. Click to go to the actual NPM page.

Here are a few quick references to where everything is:

Sorry Database Nerds, Nobody Actually Gives a Shit…

So I’ve been in more than a few conversations about data structures, various academic conversations and other notions about where and how data should be stored. I’ve been on projects and managed projects that involve teams of people determining how to manage data so that other people don’t have to. Those people want to focus on business use and not the data mechanisms underneath. The root of everything around databases really boils down to a single thing: how can we store X and retrieve X? Nobody actually trying to get business done or change the world is going to dig into the data storage mechanisms if they don’t have to. To summarize,

nobody actually gives a shit…

At least nobody does until the database breaks, or somebody has to be hired to manage or tune queries, or some other problem comes up. In the ideal world we could just put data into the ether and have it come back when we ask for it. Unfortunately we have to keep caring about where the data is, how it’s stored, the schema (even in schema-less, you still need to know the schema of the data at some point, it’s just another abstraction to push off dealing with the database), how to back up and recover, data gravity, proximity and a host of other concerns. Wouldn’t it be cool if we could just work on our app or business? Wouldn’t it be nice to just, well, focus on things we actually give a shit about?

Managed Data Systems!

The whole *aaS and PaaS world has been pushing to simplify operations to the point that the primary, if not the only, concern is the business itself. This is a pretty big step in many ways, and it holds a lot of hope and promise around fixing the data gravity, proximity, management and related concerns. One provider of services that has an interesting start in the NoSQL realm is Orchestrate.io. I’ll have more about them in the future, as I’ll be hacking on some code against their platform. They’re currently solving a number of the issues mentioned, which is great: a solid starting point that takes us past the draconian nature of the old approach to NoSQL and relational databases in general.

There have been others, such as MongoLab, that have created a sort of DBaaS. This, however, doesn’t fill the gap that Orchestrate.io is filling. So far almost every *aaS database or other solution has merely been a single type of database that a developer can throw data at in a single kind of way. That isn’t really flexible; it abstracts away some manual work but doesn’t provide much additional value around using the actual data. Orchestrate.io is bridging these together with search, replication and other features to provide a platform on which multiple options are available via the API. Key value, geo, time series and others are all coming together for them nicely. Having all the options creates real added value, versus just providing one single way to do one thing.

Intelligent Data Systems?

After checking out and interviewing Orchestrate.io recently I’ve stumbled into a few other ideas that would be perfect for them to implement or for the open source community to take a stab at. What would happen if the systems storing the data knew where to put things? What would be the case for providing an intelligent indexing policy or architecture at the schema design decision layer, the area where a person usually must intervene? Could it be done?

Picture a decision tier that scans the data and decides how to rework the way it is stored: as key value, geo, time series or some other method. Could it be done in real time? Would it have to go through some type of processing system? The options for implementing something like this are numerous, and that leaves a lot of space for adding value around the data by reducing the complexity of this decision making.

Imagine you have key value data that needs to be associative based on graph principles, and that you must store it in a highly available system that serves pertinent real-time data based on those graph relations. A decision layer, to create an intelligent data system, could monitor the data and determine the frequent query paths against it. If the data is growing old, it could move that data from real-time to archival storage via the key value. Other decisions could push data segments up into a cache tier or some other mechanism to provide real-time graph connections to client queries. These are all decisions that would need to be made by somebody working on the data, but they could be put into a set of rules that allow automated mechanisms to re-allocate the data into better storage options. Why keep old data that isn’t queried in the active in-memory graph store when it could be pushed to the distributed key store? Why keep the graph data on drive when it can be in memory, with correlated keys in an in-memory key value store backed by an on-drive key value store? All valid decisions, all becoming better understood day by day. It’s about time some of this decision process started to be automated.
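
The kind of rule set described above can be sketched in a few lines. This is only an illustration: the tier names, the thresholds and the `chooseTier` function are assumptions made up for the example, not something Orchestrate.io or any particular database provides.

```javascript
// Hypothetical thresholds; real values would come from observed workloads.
const DAY_MS = 24 * 60 * 60 * 1000;
const RULES = {
  archiveAfterMs: 30 * DAY_MS, // unread for longer than this: move to archival key value storage
  hotReadsPerHour: 100         // read at least this often: promote to the in-memory graph
};

// Decide where a record should live based on its access stats.
// stats: { lastReadAt: epoch milliseconds, readsPerHour: number }
function chooseTier(stats, now) {
  if (now - stats.lastReadAt > RULES.archiveAfterMs) return 'archive-key-value';
  if (stats.readsPerHour >= RULES.hotReadsPerHour) return 'in-memory-graph';
  return 'on-disk-key-value';
}

const currentTime = Date.now();
console.log(chooseTier({ lastReadAt: currentTime - 90 * DAY_MS, readsPerHour: 0 }, currentTime));
console.log(chooseTier({ lastReadAt: currentTime, readsPerHour: 250 }, currentTime));
console.log(chooseTier({ lastReadAt: currentTime - DAY_MS, readsPerHour: 3 }, currentTime));
```

The age rule is checked first on purpose, so a record that was once hot but has not been read in a long time still gets archived.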

What are your thoughts? Pro-intelligent data systems or anti-intelligent data systems? Think it’ll work or is it the wrong approach? Maybe the system should approach some other zenith or axiom point to become truly abstracted and transparent?

Orchestrate.io JavaScript Client Library

Today I’m starting a project working with Orchestrate.io’s API & open source software collaborations. More about the project in a moment; first, let’s get up to speed on what I’ll be including in it. My main focus is to build a client library to access Orchestrate.io. While building it I’ll dive into the key value, graph and other storage mechanisms that the client library will provide. Beyond that, I’ll take a stroll through building an NPM library and the pertinent JavaScript for the library. So buckle up, we’re going on a code slinging, hash writing hacking session.

Over the course of putting together this material, I’ll be posting most of the core material on Orchestrate.io’s blog, so subscribe for updates as they come out. Feedly is a good option, connect via searching for “orchestrate.io” or navigate over to the Orchestrate.io blog itself. 😉

Project Effort Context

While building the client I’ll take a dive into the who, what, where, when, why and how of interacting with the various data structures. I’ll aim for the client to follow the model of the existing Go Client Library that is available at Orchestrate Go Client on Github. It follows a basic model, shown below in Go.

[sourcecode language="cpp"]
c := client.NewClient("Your API Key")
// Get a value
value, _ := c.Get("collection", "key")
// Put a value
c.Put("collection", "key", strings.NewReader("Some JSON"))
// Search
results, _ := c.Search("collection", "A Lucene Query")
// Get Events
events, _ := c.GetEvents("collection", "key", "kind")
// Put Event
c.PutEvent("collection", "key", "kind", strings.NewReader("Some JSON"))
// Get Relations
relations, _ := c.GetRelations("collection", "key", []string{"kind", "kind"})
// Put Relation
c.PutRelation("sourceCollection", "sourceKey", "kind", "sinkCollection", "sinkKey")
[/sourcecode]
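
For comparison, here is a rough sketch of how part of the same surface might look in JavaScript. Everything in it is an assumption: the `SketchClient` name, the method names and the in-memory `Map` standing in for the HTTP calls a real client would make against the Orchestrate.io API.

```javascript
// Hypothetical shape only; this is not the published Orchestrate.io client.
class SketchClient {
  constructor(apiKey) {
    this.apiKey = apiKey;
    this.store = new Map(); // in-memory stand-in for the remote service
  }

  // Put a value under collection/key.
  put(collection, key, value) {
    this.store.set(collection + '/' + key, value);
    return Promise.resolve();
  }

  // Get a value; resolves to undefined when nothing is stored.
  get(collection, key) {
    return Promise.resolve(this.store.get(collection + '/' + key));
  }
}

const c = new SketchClient('Your API Key');
c.put('collection', 'key', { some: 'JSON' })
  .then(function () { return c.get('collection', 'key'); })
  .then(function (value) { console.log(value); });
```

Returning promises from `put` and `get` keeps the calling code the same shape it would have once the `Map` is swapped for real network requests.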

I’ll be working on this client, but don’t hold back on me: feel free to jump in with some of your own code, or tell me I wrote some code wrong, or whatever. I’d gladly accept any committers jumping in to help out. The more we all work together, the more useful information I can provide during this project.

Once this project has produced a workable client, and pending interest from the community, I’ll put together some material about where, how and some best uses of the client in your Node.js application. I might even build a client side JavaScript library for use with Angular or other popular client side libraries.


Junction Two Weeks Bi-weekly Review : Issue #005

First the bad news, then the good news! That’s the appropriate way to present it, right?

Schedule Break on Junction (The bad news)

I’m taking a break on Junction for a few weeks to get some other projects off the ground. In a few weeks the plan is to swing back around to Junction and make some changes to the project, which might be pretty big changes, but I’ll leave those as a surprise for now. So right now there isn’t a whole lot of functional code base that is working, partly because Windows 8 and all has left me a little devoid of urgency. If somebody out there really wants to see Windows 8 have a Riak user interface and management tool let me know, maybe we can work out some new urgency on the project! 😉

JavaScript, Go, Training and Orchestrations (The good news)

Over the next few months I’m working on putting together a lot of content for several great companies. One you might have guessed if you’ve read the last few blog entries, “PIE’s Third Class, You Better Keep an Eye on These Companies…” and “Orchestrate.io, Stop Dealing With the Database Infrastructure!” specifically, is some content around what Orchestrate.io is doing. That’ll be coming up real soon, but more about that later.

I’ve just wrapped up my first course, on Riak, which will be available on Pluralsight. I’ll be diving back into working on a course around Docker & Vagrant in the coming days. I’ll be posting some of the work as I go along, of course not the whole thing, but an idea of what the material will be.

There are also a few more companies, undisclosed so far, that I’ll be putting together some content for. Prospectively some content teams even, so if you’re interested in contributing (or working as a paid consultant) ping me. I might just have some interesting work for you.

So with all that, I’ll have more updates, more coding mischievousness and content coming up in the days and months ahead. Cheers! -Adron

Orchestrate.io, Stop Dealing With the Database Infrastructure!

In this interview I talk to Matt Heitzenroder, co-founder of Orchestrate.io, previous general manager of Basho Europe, data nerd and lover of data types. In this video he talks about the data types, data structures, schema or schema-less options, graph, stores and other ideas behind Orchestrate.io. He also jumps into what exactly Orchestrate’s mission is.

We also dive into some mentions of plans for geo and time series, and what Orchestrate is doing with these data options. After a bit of high level discussion, Matt gives us some strategy and tactics around the plans for their involvement in the community, business domains and open source.

Close and important to my passions, we discuss some of the plans around what is coming down the pipe for open source involvement, how Orchestrate will fit into that and what code you’ll be seeing from the team.

https://vimeo.com/77274021

For a sneak peek at some of the open source coming your way, check out and maybe even help out with Salter, now with more Go language oomph!  https://github.com/dizzyd/salter