Getting The Bits Swinging in the Business

Read Part 1 & Part 2 of this series if you haven't already. 

In the last entry I worked out the basic origins and destinations for the data we have. Now we have to start turning this into something real, something concrete. The following processes involve a whole soup of acronyms and other cryptic vocabulary. The one I will probably use most is ETL, which stands for Extract, Transform, and Load. This is the process of connecting the various originations to the destination and moving the data between them appropriately.
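To make the acronym a little more concrete, here is a minimal sketch of the three ETL steps in Python. It is only an illustration, not the tooling used for this project: the sample CSV text, the column names (`sale_id`, `amount`, `sold_on`), the `sales` table, and the in-memory SQLite database standing in for the real destination are all assumptions made up for the example.

```python
import csv
import io
import sqlite3

# Hypothetical sample data standing in for a CSV export (note the stray spaces).
SAMPLE_CSV = """sale_id,amount,sold_on
1, 19.99 ,2009-01-05
2,5.00,2009-01-05
"""


def extract(text):
    """Extract: read the raw rows out of the CSV text as dictionaries."""
    return list(csv.DictReader(io.StringIO(text)))


def transform(rows):
    """Transform: trim whitespace and convert each field to a proper type."""
    return [
        (int(r["sale_id"]), float(r["amount"].strip()), r["sold_on"].strip())
        for r in rows
    ]


def load(conn, records):
    """Load: write the cleaned records into the destination table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS sales "
        "(sale_id INTEGER PRIMARY KEY, amount REAL, sold_on TEXT)"
    )
    conn.executemany("INSERT INTO sales VALUES (?, ?, ?)", records)
    conn.commit()


def run_etl(text, conn):
    """Run all three steps and return (row count, total amount) as a sanity check."""
    load(conn, transform(extract(text)))
    return conn.execute("SELECT COUNT(*), SUM(amount) FROM sales").fetchone()


if __name__ == "__main__":
    print(run_etl(SAMPLE_CSV, sqlite3.connect(":memory:")))
```

Keeping the three steps as separate functions mirrors how the acronym reads: each origination gets its own extract and transform, while the load step stays the same.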

In the last entry I finalized the originations and destinations as shown below:

Originations
Excel & Access (Office 2007): *.mdb, *.xlsx, and *.csv/*.txt data stores.
Internal Account Software (IAS): This one is a prospective can of worms.  Proprietary layouts, de-normalized & normalized data, and all sorts of redundant, non-atomic data.  This sounds like an accounting package, right?  :p
Webtrends Analytics Data Exchange Web Services (DX): Webtrends web services provide a REST-style architecture, with the ability for data to be retrieved in XML, JSON, HTML, or other formats (we can add more if need be, just let us know).
Point of Sale System (POS): This system provides two daily exports for processing, one at 6:00am and one at noon.  The export format is *.csv.
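As a small aside on the POS schedule, here is a hedged sketch of how a job might work out which of the two daily exports is the most recent one at any given moment. The function name `latest_export` and the idea that a job would poll this way are my own assumptions for illustration; only the 6:00am and noon times come from the list above.

```python
from datetime import datetime, time, timedelta

# The two daily POS export times described above.
EXPORT_TIMES = (time(6, 0), time(12, 0))


def latest_export(now):
    """Return the datetime of the most recent scheduled export at or before `now`."""
    candidates = [datetime.combine(now.date(), t) for t in EXPORT_TIMES]
    past = [c for c in candidates if c <= now]
    if past:
        return max(past)
    # Before 6:00am, the most recent export is yesterday's noon run.
    return datetime.combine(now.date() - timedelta(days=1), EXPORT_TIMES[-1])


if __name__ == "__main__":
    print(latest_export(datetime.now()))
```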

Destination
SSRS: SQL Server Reporting Services, with the core underlying data stored in SQL Server.

In my previous entry you may have noticed that I had listed SSIS with the destinations.  Being one who corrects mistakes when I make them, I took it out, as it does not belong there.  SSIS is the tool that will perform the ETL functionality for this project.

At this point we are finally going to get into the dirty bits of these pieces of technology, and how we need to tie them together.  I am going to attack them over the next few entries in the order of the lists above.  The first item is the Excel & Access 2007 customer relations listings from sales.  Here's a description and a few shots of what this thing looks like.

The Access database is set up with a very simple relational data schema.  This is shown below.

You can see there are pretty standard pieces of data, in a generally normalized (third normal form for the most part) structure.  This is fine.

Next is a shot of the data entry screen for adding opportunities.  There are corresponding screens for customers and employees.  Everything needed for a basic customer list and some basic tracking is here.  Nothing too extravagant here either.  Again, all is fine.

Below is a simple report that shows the available opportunities that are open.

Another report showing the forecasts.

Below is another forecast sliced grid.

So all that seems normal enough.  But the processes are what make things tricky.  If everyone just managed sales from the database, all would be right in the world.  The first thing that breaks this is that each salesperson enters their sales and other information during the day in a spreadsheet that is not linked to the underlying database.  Each morning, someone enters the previous day's sales information into the database.  This, of course, breaks down data integrity.  Below is a sample sheet that is used each day.
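To show the kind of drift this process invites, here is a rough sketch of a reconciliation check between the spreadsheet rows and the database rows. Everything specific in it is hypothetical: the column names (`salesperson`, `customer`, `sale_date`, `amount`), the choice of composite key, and the function name `reconcile` are assumptions for illustration, not the real layout of the sheet or the database.

```python
# Hypothetical composite key used to match a spreadsheet row to a database row.
KEY_FIELDS = ("salesperson", "customer", "sale_date")


def row_key(row):
    """Build a case- and whitespace-insensitive key for matching rows."""
    return tuple(str(row[f]).strip().lower() for f in KEY_FIELDS)


def reconcile(sheet_rows, db_rows):
    """Return (rows missing from the database, rows whose amounts disagree)."""
    db = {row_key(r): r for r in db_rows}
    missing, mismatched = [], []
    for row in sheet_rows:
        match = db.get(row_key(row))
        if match is None:
            # The sale was typed into the sheet but never reached the database.
            missing.append(row)
        elif float(row["amount"]) != float(match["amount"]):
            # The sale exists in both places, but the amounts differ.
            mismatched.append((row, match))
    return missing, mismatched


if __name__ == "__main__":
    sheet = [
        {"salesperson": "Ann", "customer": "Acme", "sale_date": "2009-01-05", "amount": "100.00"},
        {"salesperson": "Bob", "customer": "Zed Co", "sale_date": "2009-01-05", "amount": "50"},
    ]
    database = [
        {"salesperson": "ann", "customer": "ACME", "sale_date": "2009-01-05", "amount": 100.0},
    ]
    print(reconcile(sheet, database))
```

A check like this does not fix the process, but it does make the gaps between the sheet and the database visible before they reach a report.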

One thing Excel is used for that doesn't break reports is the list of prospective customers to call, as shown below.

Now that we have a breakdown of the Excel & Access customer relations management software, I will move on to the other pieces of technology in the next entry.  This is the data point with the most prospective data risk, so I put it at the top of the list to cover first.  After I cover each of the systems, we will move into the architecture of the system overall.  So keep reading, more juicy bits to come.