Big Data Infrastructure

Definitions of Business Intelligence and Big Data are a lot like a Picasso or a Matisse: abstract, and defined by the subjective impressions of the viewer.  However, making Business Intelligence (or Analytics) happen with raw, unstructured, or streaming data pushes us out of the theoretical and abstract and into the tangible, delivery-focused realm.

What is Big Data?

Big Data is less about the physical size of your data and more about how that data is processed and structured.  As such, no thresholds clearly separate small, medium, and big data volumes.  For our purposes, we will use the term Big Data to refer to the practice of storing raw, transaction-level data that arrives in either structured or unstructured form.

What is Business Intelligence?

Like Big Data, Business Intelligence has become something of a buzz phrase. It covers Reporting/Analytics, Database Programming, Extract-Transform-Load (ETL) Programming, Master Data Management, and many more topics.  However, each of these sub-classifications is increasingly becoming a category all its own.  Master Data Management, as an example, can take on many shapes, from data cleansing scripts to full Business Reviews that define Revenue, Profit, Customer, etc.  Similarly, Reporting/Analytics includes Dashboards and Scorecards, but what is the difference between a Dashboard and a Scorecard?

How is BI on Big Data even possible?

Large data volumes have existed for many years, but mainly in transactional systems like ERPs.  This data was optimized for transaction processing, not reporting.  The SQL needed to report on it often required anywhere from several to dozens of joins, so performance was sub-optimal.  In came Mr. Kimball and Mr. Inmon, who defined methodologies (or religions, for some) to stage, normalize, and structure data for optimal analysis and reporting.  Regardless of your allegiance, this required aggregating data. The aggregation removed a portion of the detail but still provided sufficient insight into the trajectory of the business.  Unfortunately, it also meant having to create multiple cubes or rollups to answer the questions necessary to keep the lights on.

However, times are changing, or at least technology and the cost of hardware are changing.  As hardware prices drop, specifically for disk, and the technology to distribute load across all that hardware makes leaps and bounds forward, the ability to report on, mine, and draw analytics-based conclusions from raw and unstructured data grows dramatically.  The time has come for a trip back to the future.  By that I mean that not too long ago we were using file-based databases such as FoxPro or JD Edwards, which distributed data across multiple file-based “tables”.  Now we are coming full circle to something similar, but with some extra advantages: the Hadoop Distributed File System (HDFS) and MapReduce allow us to use these “files” or “tables” to their fullest extent with the pooled resources of multiple machines/servers.  No longer are we limited by the size of a single mainframe or blade system.

From a Business Intelligence perspective, specifically focused on Reporting/Analytics and ETL, these advantages mean throughput is far less of a constraint, so we can focus on asking the most effective questions.  We can identify trends based on individual customer experiences rather than taking a one-size-fits-all approach.  We can make targeted recommendations or send marketing messages to each customer based on purchase history and on predictive modeling drawn from other customers with a similar arc.
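For readers who have not seen the MapReduce pattern, here is a minimal sketch of the idea in plain Python.  It is an illustration only and does not use the Hadoop API: the transaction records, the field layout, and the function names (map_phase, shuffle, reduce_phase, spend_per_customer) are all hypothetical, and everything runs in a single process.  On a real cluster the map and reduce steps would be spread across many machines reading from HDFS.

```python
from collections import defaultdict

# Hypothetical raw, transaction-level records: (customer_id, product, amount).
TRANSACTIONS = [
    ("c1", "widget", 20.0),
    ("c2", "gadget", 35.5),
    ("c1", "gadget", 35.5),
    ("c3", "widget", 20.0),
    ("c2", "widget", 20.0),
]


def map_phase(records):
    """Map: emit one (key, value) pair per raw record, keyed by customer."""
    for customer_id, _product, amount in records:
        yield customer_id, amount


def shuffle(pairs):
    """Shuffle: group all emitted values by key (the framework does this on a cluster)."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return dict(groups)


def reduce_phase(groups):
    """Reduce: collapse each key's list of values into one result, here total spend."""
    return {key: sum(values) for key, values in groups.items()}


def spend_per_customer(records):
    """Run the three phases in sequence over the raw records."""
    return reduce_phase(shuffle(map_phase(records)))


if __name__ == "__main__":
    print(spend_per_customer(TRANSACTIONS))
```

Because the raw records are kept rather than pre-aggregated into a cube, the same data can be re-keyed by product, by day, or by any other field simply by changing what the map step emits.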


Business Intelligence is just a general term for the process of driving efficiencies from your everyday activities and learning from your customers’ reactions to those activities.  Similarly, Big Data is a buzzword meant to pique your interest.  In the end, both terms are extremely ambiguous and neither is actionable on its own.  However, measuring your business and drawing efficiencies from those measurements is just good business.  Doing so while customizing the experience for each of your customers is even better, so don’t let the buzzwords get in the way of good business.

What next?

Watch for a more detailed whitepaper coming in March.

Let Axian come to the rescue and help define your BI strategy, develop a roadmap, work with your business community to identify the next project, and provide clarity and direction to a daunting task. For more details about Axian, Inc. and the Business Intelligence practice, click here to view our portfolio or email us directly to set up a meeting.