Got some excellent coding and systems setup coming up in the next few days. Also a meetup on the 28th with Tim Kellogg and Alena Hall presenting on some interesting topics around distributed database data working on Kubernetes and WebAssembly of the hot temperament type. There’s also a new surprise guest addition on my Twitch channel, scheduled to swing into Valhalla and help build out a cluster and the DHCP, DNS, and related configuration needed for a setup on the metal!
August 23rd 10:00am DataStax Academy Twitch Stream – This session will include David and me as we dig into the Killrvideo Reference Application and discuss the pull request with the SSL additions to the code base. We’ll talk through the changes, why they’ve been made, and what advantage they provide for deployment into a multi-cloud environment.
August 24th 1:30pm~ish – Systems Configuration Setup and Cluster Setup on the metal! – This session will include a guest who’s going to step in and help in setup and configuration of a 5 node cluster of systems plus bastion server. Join for a how-to of setup, configuration, and clustering details.
August 24th, 3:33pm – Go Coding on Colligere – In this session I’ve finally gotten to the point where I can start filling out some of the CLI functionality. Over the next few days I’ll be adding items to the issues list on Github too. Feel free to add content or related items to the issue log as well!
August 31st, 3:33pm – More Go, More Feature Additions – This session isn’t 100% clarified and ready just yet. But I’ll be there, ready to sling some more Go code and get more features done, more code refactored, and progress made.
There are more than a few ways to configure Node.js applications. I’ll discuss a few of them in this blog entry, so without mincing words, on to configuring apps!
[sourcecode language=”javascript”]
var config = require('./config');
[/sourcecode]
The disadvantage is that when the application gets a little bigger, the configuration can become unwieldy without very specific, strictly enforced guidelines.
Solution #2: Use a Library/Framework Like Convict.js
Using a library provides a baseline for structuring configuration. Convict.js uses a baseline schema that can then be extended or overridden based on the configuration needed for alternate environments. The first steps in setting up convict.js for the fueler project look like this.
Set up a convict.js file:
[sourcecode language=”javascript”]
var convict = require('convict');

// Schema: each setting declares its documentation, format, default value
// and the environment variable that can override it.
var conf = convict({
  env: {
    doc: "The App Environment.",
    format: ["production", "development", "test"],
    default: "development",
    env: "NODE_ENV"
  },
  port: {
    doc: "The port to bind.",
    format: "port",
    default: 3000,
    env: "PORT"
  },
  database: {
    host: {
      default: "someplace:cool",
      env: "DB_HOST"
    }
  }
});

// Perform validation against the schema above.
conf.validate();

module.exports = conf;
[/sourcecode]
The two main configuration values are the environment and port values. Others will be added as more of the application is put together, but for now I just wanted something to put in the project to ensure it works.
The save option adds it to the package.json file as a dependency. Once this was installed I opened up the app.js file of the project and added a require at the top of the file, after the path require and before the express() call.
[sourcecode language=”javascript”]
var path = require('path');
var config = require('./config');
var app = express();
[/sourcecode]
In the app.set line for the port I changed the setting of the port to be the configuration parameter.
Now when I run the application, the port will be derived from the config.js file setting.
Now What Did I Do?
I’ll write more about this in the near future, but for now I’ve run into something not being set up right. I’m still working through various parts of customizing my setup. The instructions for convict.js, which aren’t very thorough beyond the most basic use, describe how to ensure that the other environments are set up with *.json files. What I mean by this is…
I’ve set up a directory with three json files. It looks like this.
My Config Directory
Each of these files (or at least one of them) should, I would think based on the instructions, get loaded and merged into the configuration based on the code in my app.js as shown below.
The order of override for the configuration values starts with the base config.js. Any *.json files then override those config.js settings, and any environment variables override the values set in the *.json files. Based on that, unless of course I’ve missed something in this snippet of code, I should be getting the configuration settings from the *.json files.
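To make that order of override concrete, here is a minimal sketch in plain Node.js. It is not how convict.js is implemented. The resolveConfig helper is hypothetical, it only handles flat keys, and it assumes the JSON file holds plain values keyed by setting name.

```javascript
// Hypothetical helper (not part of convict.js): resolves each setting in the
// order described above. Later sources win:
// schema default < value from a JSON file < environment variable.
function resolveConfig(schema, fileValues, env) {
  var resolved = {};
  Object.keys(schema).forEach(function (key) {
    var entry = schema[key];
    var value = entry.default;
    if (Object.prototype.hasOwnProperty.call(fileValues, key)) {
      value = fileValues[key];
    }
    if (entry.env && env[entry.env] !== undefined) {
      value = env[entry.env];
    }
    resolved[key] = value;
  });
  return resolved;
}

var schema = {
  env: { default: 'development', env: 'NODE_ENV' },
  port: { default: 3000, env: 'PORT' }
};

// Stand-in for the parsed contents of an environment *.json file
// (assumed shape: plain values, not schema entries).
var testFile = { port: 1337 };

var fromFile = resolveConfig(schema, testFile, {});
var fromEnv = resolveConfig(schema, testFile, { PORT: '8080' });

console.log(fromFile.port); // value from the JSON file
console.log(fromEnv.port);  // value from the environment variable
```

Passing the environment in as a plain object instead of reading process.env directly keeps the sketch easy to try with different combinations of sources.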
My config file data looks like this. Since it is using cjson I went ahead and stuck comments in there too.
[sourcecode language=”javascript”]
/**
 * Created by adron on 3/14/14.
 * Description: Adding test configuration for the project.
 */
{
  "port": {
    "doc": "The port to bind.",
    "format": "port",
    "default": 1337,
    "env": "PORT"
  }
}
[/sourcecode]
Until later, happy coding, I’m going to dive into this and figure out what my issue is. In my next blog entry I’ll be sure to post an update to what the problem is.
Oh, and that fueler project. Feel free to ping me and jump into it.
NOTE: If you just want to check out the code bits, scroll down to the sub-title #symphonize #hacking. Also important to note I’m putting the library through a fairly big refactor at the moment so that everything aligns with the documentation that I’ve recently created. So many things may not be implemented, but we’re moving toward v0.1.0, which will be a functional implementation of the library available via npm based entirely on the documentation and specs that I outline after the history.
There are two main reasons why I chose Orchestrate.io and a data generation library as the two things I wanted to combine. The first is that I knew the orchestrate.io team and really dug what they were building. I wanted to work with it and check out how well it would work for my use cases in the future. The ability to sit down and discuss with them what they were building was great. (I also interviewed Matt Heitzenroder, @roder, which you can watch in Orchestrate.io, Stop Dealing With the Database Infrastructure!) The second reason is that my own startup, which I’m co-founding with Aaron Gray (@agray), needed to use key value and graph data storage of some type, somewhere. Orchestrate.io looked like a perfect fit. After some research and giving it a go, it fit very well into what we are building.
December then rolled into the standard holiday doldrums and slowdowns. So fast forward to January post a few rounds of beer and good tidings and I got the 3rd in the series published titled Getting Serious With Symphony.js – JavaScript TDD/BDD Coding Practices (3/3). The post doesn’t speak too much to symphony.js usage but instead my efforts to use TDD or BDD practices in trying to write the library.
Slowly I made progress in building the library and finally it’s in a mostly releasable state now. I use this library daily in working with the code base for Deconstructed and imagine I’ll use it ongoing for many other projects. I hope others might be able to find uses for it too and maybe even add capabilities or ideas. Just ping me via Twitter @adron or Github @adron, add an issue on Github and I’ll be happy to accept pull requests for new features, code refactoring, add you to the project or whatever else you’re interested in.
#symphonize #hacking
Now for the nitty gritty. If you’re up for using or contributing to the project check out the symphonize.js github pages site first. It’s got all the information to help get you kick started. However, you can keep reading as I’ve included much of the information there along with the examples from the README.md below.
NOTE: As I mentioned at the top of this blog entry, the functional implementation of the code isn’t available via npm just yet. Some others and I are ripping through a good refactor to align the implementation of the library with the rewritten and newly available documentation, which is included below and at the github pages.
[sourcecode language=”bash”]
git clone git@github.com:YourUserName/symphonize.git
cd symphonize
npm install
[/sourcecode]
Using The Library
The intended usage is to instantiate the JavaScript object and then call generate. That’s it, a super simple process. The code would look like this:
[sourcecode language=”javascript”]
var Symphonize = require('../bin/symphonize');
var symphonize = new Symphonize();
[/sourcecode]
Calling the constructor with no arguments like this uses the generate.json file to generate data from. To supply the JSON configuration programmatically, pass it to the constructor.
[sourcecode language=”javascript”]
var configJson = {"schema":"keyvalue"};
var Symphonize = require('../bin/symphonize');
// Pass the configuration into the constructor instead of relying on generate.json.
var symphonize = new Symphonize(configJson);
[/sourcecode]
Once the Symphonize data generator has been created, call its generate() method.
That’s basically it. But you say, it’s supposed to do X, Y or Z. Well that’s where the json configuration data comes into play. In the configuration data you can set the data fields and what they’ll generate, what type of data will be generated, the specific schema, how many records to create and more.
generate.json
The library comes with the generate.json file already setup with a working example. Currently the generation file looks like this:
[sourcecode language=”javascript”]
{
  "schema": "keyvalue", /* keyvalue, graph, event, geo */
  "count": 20, /* X values to generate. */
  "write_source": "console", /* console, orchestrateio and whatever other data sources that might come up. */
  "fields": {
    /* generates a random name. */
    "fieldName": "name",
    /* generates a random dice roll of a d20. */
    "fieldTwo": "d20",
    /* A single lorem ipsum random statement is generated. */
    "fieldSentence": "sentence",
    /* A random guid is generated. */
    "fieldGuid": "guid"
  }
}
[/sourcecode]
Configuration File Definitions
Each of the configuration options that are available have a default in the configuration file. The default is listed in italics with each definition of the configuration option listed below.
“schema” : This is used to select what type of data structure is going to be generated. The default is keyvalue for this option.
“count” : This provides the total records that are to be generated by the library. The default is 1 for this option.
“write_source” : This provides the location to output the generated data to. The default is console for this option.
“fields” : This is a JSON field within the JSON configuration file that provides configuration options around the fields, number of fields and their respective data to generate. The default is one field, with a default data type of guid. Each of the respective entries in this JSON option is a self contained JSON name and value pair. This then looks simply like this (which is also shown above in part):
[sourcecode language=”javascript”]
{
  "someBoolean": "boolean",
  "someChar": "character",
  "aFloat": "float",
  "GetAnInt": "integer",
  "fieldTwo": "d20",
  "diceRollD10": "d10",
  "_string": {
    "fieldName": "NameOfFieldForString",
    "length": 5,
    "pool": "abcdefgh"
  },
  "_sentence": {
    "fieldName": "NameOfFiledOfSentences",
    "sentence": "5"
  },
  "fieldGuid": "guid"
}
[/sourcecode]
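As a rough illustration of how the documented defaults could be applied to a partial configuration, here is a small sketch. It is not the library’s code. The withDefaults helper is hypothetical, and so is the name of the single default field, since the documentation only says the default is one field of type guid.

```javascript
// Defaults as described in the definitions above. The field name "field"
// is a made-up placeholder; only its type (guid) comes from the docs.
var DEFAULTS = {
  schema: 'keyvalue',
  count: 1,
  write_source: 'console',
  fields: { field: 'guid' }
};

// Hypothetical helper: any top-level option missing from the supplied
// configuration falls back to its documented default.
function withDefaults(config) {
  var merged = {};
  Object.keys(DEFAULTS).forEach(function (key) {
    var provided = config && config[key] !== undefined;
    merged[key] = provided ? config[key] : DEFAULTS[key];
  });
  return merged;
}

console.log(withDefaults({ count: 20 }));
```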
Fields Configuration: For each of the fields you can either set the field to a particular data type or leave it empty. If the field name and value pair is left empty then the field defaults to guid. The types of data to generate for fields are listed here. These are all simple field and data generation types. More complex nested generation types are listed under Complex Field Configuration, after the simple section.
“boolean”: This generates a boolean value of true or false.
“character”: This generates a single character, such as ‘1’, ‘g’ or ‘N’.
“float”: This generates a float value, similar to something like -211920142886.5024.
“integer”: This generates an integer value, similar to something like 1, 14 or 24032.
“d4”: This generates a random integer value based on a dice roll of one four-sided die. The integer range is 1-4.
“d6”: This generates a random integer value based on a dice roll of one six-sided die. The integer range is 1-6.
“d8”: This generates a random integer value based on a dice roll of one eight-sided die. The integer range is 1-8.
“d10”: This generates a random integer value based on a dice roll of one ten-sided die. The integer range is 1-10.
“d12”: This generates a random integer value based on a dice roll of one twelve-sided die. The integer range is 1-12.
“d20”: This generates a random integer value based on a dice roll of one twenty-sided die. The integer range is 1-20.
“d30”: This generates a random integer value based on a dice roll of one thirty-sided die. The integer range is 1-30.
“d100”: This generates a random integer value based on a dice roll of one hundred-sided die. The integer range is 1-100.
“guid”: This generates a random globally unique identifier. This value would be similar to ‘F0D8368D-85E2-54FB-73C4-2D60374295E3’, ‘e0aa6c0d-0af3-485d-b31a-21db00922517’ or ‘1627f683-efeb-4db8-8174-a5f2e3378c87’.
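To show how a few of these simple types could map onto generator functions, here is a hedged sketch. It is not the symphonize implementation. The names rollDie, generators, generateValue and generateRecord are all hypothetical, only three of the listed types plus the dice are covered, and the integer range is an arbitrary choice.

```javascript
var crypto = require('crypto');

// Roll one die with the given number of sides: an integer from 1 to sides.
function rollDie(sides) {
  return Math.floor(Math.random() * sides) + 1;
}

// Hypothetical lookup from simple field type to a generator function.
var generators = {
  boolean: function () { return Math.random() < 0.5; },
  integer: function () { return Math.floor(Math.random() * 100000); },
  guid: function () { return crypto.randomUUID(); }
};

function generateValue(type) {
  // An empty type defaults to guid, as described above.
  if (!type) {
    return generators.guid();
  }
  // "d4", "d20", "d100" and so on. This pattern accepts any side count,
  // which is looser than the fixed list of dice documented above.
  var dice = /^d(\d+)$/.exec(type);
  if (dice) {
    return rollDie(parseInt(dice[1], 10));
  }
  if (!Object.prototype.hasOwnProperty.call(generators, type)) {
    throw new Error('Unknown field type: ' + type);
  }
  return generators[type]();
}

// Build one record from a "fields" style object of name and type pairs.
function generateRecord(fields) {
  var record = {};
  Object.keys(fields).forEach(function (name) {
    record[name] = generateValue(fields[name]);
  });
  return record;
}

console.log(generateRecord({ someBoolean: 'boolean', fieldTwo: 'd20', fieldGuid: 'guid', unset: '' }));
```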
Complex Field Configuration: Some fields require more complex configuration for data generation, simply because the data needs some baseline of what the range or length of the values need to be. The following list details each of these. It is also important to note that these complex field configurations do not have defaults, each value must be set in the JSON configuration or an error will be thrown detailing that a complex field type wasn’t designated. Each of these complex field types is a JSON name and value parameter. The name is the passed in data type with a preceding underscore ‘_’ to generate with the value having the configuration parameters for that particular data type.
“_string”: This generates string data based on the length and pool parameters. Required fields for this include fieldName, length and pool. The JSON would look like this:
[sourcecode language=”javascript”]
"_string": {
  "fieldName": "NameOfFieldForString",
  "length": 5,
  "pool": "abcdefgh"
}
[/sourcecode]
Samples of the result would look like this for the field; ‘abdef’, ‘hgcde’ or ‘ahdfg’.
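A minimal sketch of how a string could be built from the length and pool parameters follows. This is an illustration rather than the library’s code: generateString is a hypothetical name, and drawing characters with replacement (so a character can repeat) is an assumption.

```javascript
// Hypothetical generator for the "_string" complex type. All three
// parameters are required, mirroring the rule that complex types
// have no defaults.
function generateString(options) {
  if (!options || !options.fieldName || !options.length || !options.pool) {
    throw new Error('_string requires fieldName, length and pool');
  }
  var result = '';
  for (var i = 0; i < options.length; i++) {
    // Pick one character from the pool at random (with replacement).
    var index = Math.floor(Math.random() * options.pool.length);
    result += options.pool.charAt(index);
  }
  return result;
}

var sample = generateString({
  fieldName: 'NameOfFieldForString',
  length: 5,
  pool: 'abcdefgh'
});

console.log(sample);
```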
“_hash”: This generates a hash based on the length and casing parameters. Required fields for this include fieldName, length and casing. The JSON would look like this:
[sourcecode language=”javascript”]
"_hash": {
  "fieldName": "HashFieldName",
  "length": 25,
  "casing": "upper"
}
[/sourcecode]
Samples of the result would look like this for the field: ‘e5162f27da96ed8e1ae51def1ba643b91d2581d8’ or ‘3F2EB3FB85D88984C1EC4F46A3DBE740B5E0E56E’.
“_name”: This generates a name based on the middle, middle_initial and prefix parameters. Required fields for this include fieldName, middle, middle_initial and prefix. The JSON would look like this:
[sourcecode language=”javascript”]
"_name": {
  "fieldName": "nameFieldName",
  "middle": true,
  "middle_initial": true,
  "prefix": true
}
[/sourcecode]
Samples of the result would look like this for the field: ‘Dafi Vatemi’, ‘Nelgatwu Powuku Heup’, ‘Ezme I Iza’, ‘Doctor Suosat Am’, ‘Mrs. Suosat Am’ or ‘Mr. Suosat Am’.
So that covers the kick start of how eventually you’ll be able to set up, use and generate data. Until then, jump into the project and give us a hand.
This week I’ve traveled to Philadelphia to meet with a number of the Basho team to work together and receive train-the-trainers training on the best ways to approach content on Riak and, more generally, to brainstorm the best ways to approach specific topics. Some of those topics include things like:
Access Patterns around Log Storage & Analysis
Bloom Filters
CRDTs
Consensus Protocols
Erasure Coding
LevelDB and Bitcask Backends
MDC Repl
Out of the options we discussed in training today I ran with benchmarking. It is always near and dear to many of the customers, clients and curious folks that I talk to. I dove in to see what exactly we offer with basho_bench (docs info, github repo) in detail and functionality, but also dove into what other benchmarks are out there that others may have run in the past.
basho_bench
What exactly is basho_bench? The basho_bench project is a code repo on Github that offers a set of benchmarking tests that are run against a Riak cluster. There are a few prerequisites to the quick steps below. The prerequisites are:
[sourcecode language=”bash”]
git clone git://github.com/basho/basho_bench.git
cd basho_bench
make all
[/sourcecode]
Once that is done building, review the directory structure that is in the basho_bench directory. The following should be available in the directory.
[sourcecode language=”bash”]
$ ls
FAQ deps rebar
LICENSE ebin rebar.config
Makefile examples src
README.org include tests
basho_bench priv
[/sourcecode]
The examples directory has several default config files available to run with basho_bench for testing. If there is a devrel setup with the default 127.0.0.1 IP usage, just run the following command to begin generating stats. If the cluster being tested is not a devrel with 127.0.0.1 then give the configuration section of the docs a read for information on how to point basho_bench at an alternative cluster.
The reason I post both is that ‘make results‘ doesn’t seem to always work to build the results and the manual execution will actually get the results built. With the results built, check the tests directory in the basho_bench directory for the summary.png file. If you open the file it should look something like this.
Default empty http.config results from basho_bench.
From here you can now run basho_bench and get the results that are specific to basho_bench. However, this now leads me to a higher abstract topic of why do benchmarking in the first place.
Why Benchmark? How to Benchmark!
The definition of benchmark:
bench·mark
[bench-mahrk]
noun
1. a standard of excellence, achievement, etc., against which similar things must be measured or judged: The new hotel is a benchmark in opulence and comfort.
2. any standard or reference by which others can be measured or judged: The current price for crude oil may become the benchmark.
3. Computers. an established point of reference against which computers or programs can be measured in tests comparing their performance, reliability, etc.
4. Surveying. Usually, bench mark. a marked point of known or assumed elevation from which other elevations may be established. Abbreviation: BM
adjective
5. of, pertaining to, or resulting in a benchmark: benchmark test, benchmark study.
While basho_bench provides an interesting baseline test that shows various pieces of data to work with, it shows nothing by default that is specific to YOUR use case. The basho_bench defaults are not ideal for your production environment, and they are not your dev, user acceptance testing or test criteria. They are an example. To truly get effective numbers that really encompass your needs for your project you will need to provide custom configuration for basho_bench or write your own specific benchmark.
The reason behind this is that with Riak, as with other NoSQL solutions, you’re working toward a goal that is very data specific and unknown. That goal has domain logic and criteria that are specific to the use case, and a custom benchmark can provide real data related to that domain logic and criteria.
In the end, even though basho_bench is a great tool to get started and do basic tests, and a great project to get ideas from, it is not the panacea benchmark. You’ll need to create the specific benchmark for your use case yourself.
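As a starting point for that kind of use-case-specific benchmark, here is a minimal sketch of the general shape: run one operation many times, record each latency, and summarize. It has nothing to do with how basho_bench works internally. The benchmark function is hypothetical, and the operation is an in-memory placeholder where a real request against your cluster would go.

```javascript
// Time a synchronous operation repeatedly and summarize the latencies
// in milliseconds. Percentiles use the nearest-rank method.
function benchmark(operation, iterations) {
  var latencies = [];
  for (var i = 0; i < iterations; i++) {
    var start = process.hrtime.bigint();
    operation(i);
    var end = process.hrtime.bigint();
    latencies.push(Number(end - start) / 1e6); // nanoseconds to milliseconds
  }
  latencies.sort(function (a, b) { return a - b; });

  function percentile(p) {
    var rank = Math.ceil((p / 100) * latencies.length) - 1;
    return latencies[Math.max(0, Math.min(latencies.length - 1, rank))];
  }

  var total = latencies.reduce(function (sum, value) { return sum + value; }, 0);
  return {
    iterations: iterations,
    meanMs: total / iterations,
    medianMs: percentile(50),
    p95Ms: percentile(95),
    maxMs: latencies[latencies.length - 1]
  };
}

// Placeholder workload: an in-memory put followed by a get. Swap this for
// the reads and writes your own application actually performs.
var store = new Map();
function putThenGet(i) {
  var key = 'user:' + (i % 100);
  store.set(key, { visits: i });
  return store.get(key);
}

var result = benchmark(putThenGet, 1000);
console.log(result);
```

The important part is the workload function: key distribution, value sizes and the read and write mix should come from your own domain, not from an example.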