It’s Not the Size of Your Data But How You Use It
I often hear people brag about the size of their data. Don’t get me wrong. I work for EMC, so you know I love this. But you don’t need huge databases to have a significant business impact. As sexy as unstructured data is, there’s a wealth of information sitting in your structured data too. The biggest issue I see is that people don’t know how to use their data to make good business decisions. Sometimes a simple solution with a little scale gets the desired result.
As I’ve said in the past, I’m a business guy first. Value, not coolness, is the goal. Coolness is a byproduct of your success! Excel models are a good example of something the business created to add value. But because the creator didn’t have the right skills or environment, the models have limited value. They are often very manual and do not scale. Many times the models only work because they cut corners or use summarizations of the data. These flawed models stay in use only because your people don’t have the skills to build them differently. That provides the perfect opening for your first Big Data project. And if you’re already well on your way in Advanced Analytics, don’t overlook these small gems as opportunities. Excel models can often be broken down into these parts:
- Data Sources: How many are you using?
  - Many times they are raw data tabs in Excel.
- Joins: Do you need to merge these data sources at some level?
  - Many times you need multiple tabs to come together, or you have to summarize many sources before they can be joined.
- Indexes: Do you need to add to or manipulate those sources?
  - Often the native data does not have the view or rollup your business wants, and you need to add it.
- Formulas: Do you need to do any calculations off these data sets?
  - This can get very complex depending on the modeler’s skill set.
- Summarizations: Do you need to summarize some or all of the data in order to join or use it?
  - Many times pivots are built off raw tabs and then joined with other tabs or pivots because the data only joins cleanly at summarized levels.
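To make that breakdown concrete, here is a minimal sketch of how the five parts might map onto a small database, using Python’s built-in `sqlite3` module. Everything in it is an assumption for illustration: the table names (`orders`, `costs`, `product_rollup`), the columns, and the sample rows are all hypothetical, and a real model would load its raw tabs from files rather than hard-coded values.

```python
import sqlite3


def build_model(conn):
    """Rebuild a tiny Excel-style model as tables, a view, and a rollup query.

    All table names, columns, and rows are hypothetical examples.
    """
    cur = conn.cursor()

    # Data sources: each raw Excel tab becomes its own table.
    cur.execute(
        "CREATE TABLE orders (order_id INTEGER, product_id INTEGER, "
        "units INTEGER, unit_price REAL)"
    )
    cur.execute("CREATE TABLE costs (product_id INTEGER, unit_cost REAL)")

    # Indexes (in the sense used above): a rollup the business wants
    # that the native data does not carry.
    cur.execute("CREATE TABLE product_rollup (product_id INTEGER, business_line TEXT)")

    cur.executemany(
        "INSERT INTO orders VALUES (?, ?, ?, ?)",
        [(1, 10, 5, 20.0), (2, 11, 2, 50.0), (3, 10, 1, 20.0)],
    )
    cur.executemany("INSERT INTO costs VALUES (?, ?)", [(10, 12.0), (11, 30.0)])
    cur.executemany(
        "INSERT INTO product_rollup VALUES (?, ?)",
        [(10, "widgets"), (11, "gadgets")],
    )

    # Joins and formulas: merge the sources at the order level and
    # compute revenue and margin per row, with no pre-summarizing.
    cur.execute(
        """
        CREATE VIEW order_margin AS
        SELECT o.order_id,
               r.business_line,
               o.units * o.unit_price AS revenue,
               o.units * (o.unit_price - c.unit_cost) AS margin
        FROM orders AS o
        JOIN costs AS c ON c.product_id = o.product_id
        JOIN product_rollup AS r ON r.product_id = o.product_id
        """
    )

    # Summarizations: the pivot becomes a GROUP BY on top of the detail.
    cur.execute(
        """
        SELECT business_line, SUM(revenue), SUM(margin)
        FROM order_margin
        GROUP BY business_line
        ORDER BY business_line
        """
    )
    return cur.fetchall()


if __name__ == "__main__":
    with sqlite3.connect(":memory:") as conn:
        for row in build_model(conn):
            print(row)
```

The point of the sketch is the order of operations: the join happens at the lowest level first, and the summary is derived from it, which is the reverse of the pivot-then-join workaround that Excel often forces.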
Building a database solution rather than an Excel model can open the door to:
- Automation: Drastically reduce the manual effort of maintaining the model.
- Scale: You no longer have to cut corners or rely on summarizations. You can go to the lowest level of detail needed.
- New Insight: Many times the improved scale exposes information that cannot be seen in summary-level data.
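Here is a small, hedged illustration of that last point, again using Python’s built-in `sqlite3` module. The `sales` table and its three rows are made up for the example; the only claim is that a summary can look healthy while a detail row underneath it is not.

```python
import sqlite3


def summary_vs_detail():
    """Compare a region-level summary with the row-level detail behind it.

    The sales table and its rows are hypothetical sample data.
    """
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE sales (region TEXT, product TEXT, revenue REAL, cost REAL)"
    )
    conn.executemany(
        "INSERT INTO sales VALUES (?, ?, ?, ?)",
        [
            ("east", "A", 100.0, 60.0),
            ("east", "B", 50.0, 70.0),
            ("west", "A", 80.0, 50.0),
        ],
    )

    # Summary level: margin per region, as a pivot would show it.
    summary = conn.execute(
        "SELECT region, SUM(revenue - cost) FROM sales "
        "GROUP BY region ORDER BY region"
    ).fetchall()

    # Lowest level: individual rows that lose money.
    losers = conn.execute(
        "SELECT region, product, revenue - cost FROM sales "
        "WHERE revenue < cost ORDER BY region, product"
    ).fetchall()

    conn.close()
    return summary, losers


if __name__ == "__main__":
    summary, losers = summary_vs_detail()
    print("By region:", summary)
    print("Losing rows:", losers)
```

In this made-up data, both regions show a positive margin in the summary, yet product B in the east loses money. That row only becomes visible once the model can work at the detail level.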
Multimillion-dollar decisions are made off these models. Now you can experience the full power of the Dark Side…I mean of your Big Data Solution…even if it’s structured data and not gigantic. Once you automate and scale this up, you have a database environment that can be leveraged for more than the original model. Now bring in the Data Scientist for even more insights that were not possible before!