New Challenges on the Big Data Journey

By Frank Coleman, Senior Director, DELL EMC Services, September 13, 2016

No matter where you are on your journey to leveraging Big Data, you have challenges to overcome. My team and I have been doing this awhile now. While I’m happy to report we’ve solved many traditional data problems, new ones are always popping up.

If you’ve just decided to build a Data Lake / go big data / whatever you want to call it, or you are just a little further along in your journey, here are some of the challenges we faced early on and how leveraging Big Data solved them.

Problem: Can’t get access to the data

The data doesn’t exist in a form that IT can easily hand over, because IT is used to doing a ton of work on the data and then pushing it to you via BI tools. Get in line for an IT budget request.

Resolution: I still have to request access to a data set, but it’s not limited to what IT has put into BI tools. I can self-serve on the complete data set.

Problem: Can’t get the data at the level I need

The data is just too big. Seriously, did someone just say that?

Resolution: Once the data is available you can view it at all levels and create new levels if you want.
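
To make “all levels” concrete, here is a minimal sketch in plain Python. The records, field names, and the `roll_up` helper are all made up for illustration; they are not our actual data or tooling. It rolls the same detailed rows up to two existing levels, then invents a new level on the fly and rolls up on that too.

```python
from collections import defaultdict

# Hypothetical service records at the most detailed level available.
records = [
    {"region": "Americas", "product": "Storage", "hours": 4.0},
    {"region": "Americas", "product": "Backup", "hours": 2.5},
    {"region": "EMEA", "product": "Storage", "hours": 3.0},
    {"region": "EMEA", "product": "Storage", "hours": 1.5},
]


def roll_up(rows, keys, measure):
    """Sum `measure` over rows, grouped by the tuple of values in `keys`."""
    totals = defaultdict(float)
    for row in rows:
        totals[tuple(row[k] for k in keys)] += row[measure]
    return dict(totals)


# Two existing levels of the same data.
by_region = roll_up(records, ["region"], "hours")
by_region_product = roll_up(records, ["region", "product"], "hours")

# A new level that was never in the source: tag each row, then roll up on it.
# The 3-hour threshold is an arbitrary example value.
for row in records:
    row["effort"] = "long" if row["hours"] >= 3.0 else "short"
by_effort = roll_up(records, ["effort"], "hours")

print(by_region)
print(by_region_product)
print(by_effort)
```

The point is that the detail rows stay available, so any grouping you can describe becomes a level you can look at.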

Problem: Data can’t process because it’s too large

Processing the data would take days, or you would have to cut it into smaller chunks and then merge it back together.

Resolution: Processing runs in seconds, sometimes a minute or two, which is game-changing for decision support.

Problem: Can’t merge the data with other IT-maintained data

We used Shadow IT since IT didn’t have a way for us to do this. If you’ve been around a while, you’re lying if you say you haven’t done some Shadow IT work, using a server in a lab or, even worse, some desktop.

Resolution: All IT sourced data is available to me. Of course I have to ask for access first.

Problem: Can’t merge the data with outside data

Similar to the last one, but with a different twist: we often need to blend our data with 3rd party data or industry data.

Resolution: I can dump / feed my 3rd party data into my Analytical Sandbox to merge with the IT-sourced data instead of using a Shadow IT solution.
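
As a rough sketch of what that merge looks like, the example below uses an in-memory SQLite database as a stand-in for the Analytical Sandbox. The table names, columns, and rows are all hypothetical; your platform and schema will differ.

```python
import sqlite3

# An in-memory SQLite database stands in for the Analytical Sandbox here.
conn = sqlite3.connect(":memory:")

# Hypothetical IT-sourced table: one row per customer account.
conn.execute(
    "CREATE TABLE it_accounts (account_id TEXT PRIMARY KEY, open_cases INTEGER)"
)
conn.executemany(
    "INSERT INTO it_accounts VALUES (?, ?)",
    [("A100", 3), ("A200", 0), ("A300", 7)],
)

# Hypothetical 3rd party feed loaded into the same sandbox.
conn.execute(
    "CREATE TABLE vendor_industry (account_id TEXT PRIMARY KEY, industry TEXT)"
)
conn.executemany(
    "INSERT INTO vendor_industry VALUES (?, ?)",
    [("A100", "Healthcare"), ("A300", "Finance")],
)

# LEFT JOIN keeps every IT account, even when the outside feed has no match.
merged = conn.execute(
    """
    SELECT a.account_id,
           a.open_cases,
           COALESCE(v.industry, 'Unknown') AS industry
    FROM it_accounts AS a
    LEFT JOIN vendor_industry AS v ON v.account_id = a.account_id
    ORDER BY a.account_id
    """
).fetchall()

print(merged)
```

A left join is used on purpose: outside data rarely covers every account, and you usually want to see the gaps rather than silently lose rows.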

Problem: Can’t add in business rules to the data

Getting to the data via BI tools was the only way, and the tools were very limited. Many times we just used the BI tool for Extract, Transform, Load (ETL) and dumped that output into a Shadow IT table for the next step.

Resolution: At a database level it’s just a few lines of code and a new column with our data shows up.
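
Here is a minimal sketch of “a few lines of code and a new column”, again using in-memory SQLite as a stand-in for the real database. The `service_requests` table, the `sla_breached` column, and the SLA thresholds are invented for the example and are not our actual business rules.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical table of service requests.
conn.execute(
    "CREATE TABLE service_requests "
    "(sr_id INTEGER PRIMARY KEY, severity INTEGER, hours_open REAL)"
)
conn.executemany(
    "INSERT INTO service_requests VALUES (?, ?, ?)",
    [(1, 1, 5.0), (2, 1, 2.0), (3, 3, 30.0), (4, 3, 80.0)],
)

# Assumed business rule: severity 1 breaches after 4 hours open,
# everything else after 72 hours. The new column holds the result.
conn.execute("ALTER TABLE service_requests ADD COLUMN sla_breached INTEGER")
conn.execute(
    """
    UPDATE service_requests
    SET sla_breached = CASE
        WHEN severity = 1 AND hours_open > 4 THEN 1
        WHEN severity <> 1 AND hours_open > 72 THEN 1
        ELSE 0
    END
    """
)

rows = conn.execute(
    "SELECT sr_id, sla_breached FROM service_requests ORDER BY sr_id"
).fetchall()
print(rows)
```

Because the rule lives next to the data, changing a threshold is a one-line edit rather than another trip through a BI tool and a Shadow IT table.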

Problem: Can’t process many months of data, never mind years

If you haven’t already pulled and aggregated all of history, it will take days just to put the data together.

Resolution: We can literally say, “Sit down, let’s take a look”. To be honest, we can’t make it look pretty that quickly but sometimes quick and dirty is good enough. Even on bigger projects getting the data is not the problem anymore.

Everything sounds great, right? But don’t get ahead of yourself. With these traditional data problems solved, “new” problems/questions come up:

Where is my return on investment?

Let’s face it, it wasn’t cheap to get here so you had better have some ideas on how you’re going to create value. And let me say this: you saving days on pulling data is not the return they’re looking for. The return is the new insights and the changes in business workflow you are about to make.

How do I ask the right questions before attacking a problem so I don’t create cool reports with no impact?

This was one of the hardest ones for me. We made a ton of mistakes turning on the new system as we didn’t know what was possible. Don’t boil the ocean on your first projects or they’ll fail. But don’t just focus on the BI benefit of speed because no one cares. Well, they’ll care if you plan to cut your BI staff. Otherwise no one cares. Most importantly, don’t give up!

How do I get Shadow IT teams or other BI users on board with a new way of thinking?

I’ve said this before: they are your best friends and can be your worst foe. Get them engaged and feeling like part of the team. They usually have tons of business knowledge and are often data SMEs in many areas. Initially they will view you as a threat looking to put the IT handcuffs on them. Change is hard and not everyone will get there. Embrace the ones that are willing to try.

How do I share this information, securely?

You now have some serious data out there and it needs serious rules about access to it. You must be able to grant access to the right people and make sure they don’t share it with unauthorized people.

How do I feed this insight into applications to impact workflow?

If you have access to the data and are able to model the data well you’re almost there. Data visualization is cool but not good enough. I want to be able to feed this insight into our workflow applications to impact the way we work in real time. Sorry folks, I don’t have a silver bullet for this one. But making sure your projects are clearly outlined with who is going to use your data and how they’ll use it can help. For example, you plan to feed your Data Science model data into an application which enables Sales and/or Service. This is when you are truly killing it, impacting workflow via Data Science! You don’t need a fancy chart, just the right information in front of the person making a decision when they need it.

Having these new problems is a sign of progress and that makes me happy. Many of our historical problems are gone, and as we break new ground we will always run into new obstacles. I’m curious if anyone else has had a similar experience or other challenges I may not have listed.

Let me know!


About Frank Coleman

Senior Director, DELL EMC Services

Frank is a Senior Director of Business Operations for Dell EMC Services. He is living the world of Big Data in this role, as he is responsible for using advanced data analytics to improve the customer experience with Dell EMC’s services organization.

This role keeps Frank immersed in Big Data, and he is at the cutting edge of using Big Data to solve real business problems. Frank has a strong blend of technical knowledge and business understanding, and has spent the last nine years focused on the business of service.

Under his leadership, EMC was honored in mid-2012 for the third consecutive year with the Technology Services Industry Association (TSIA) STAR Award for “Excellence in the Use of Metrics and Business Intelligence.” Prior to joining EMC, Frank worked in various fields and remote technical support roles.


2 thoughts on “New Challenges on the Big Data Journey”

  1. Frank, very informative approach when tackling big data challenges. With the explosion of big data, companies are faced with data challenges in three different areas. First, you know the type of results you want from your data but it’s computationally difficult to obtain. Second, you know the questions to ask but struggle with the answers and need to do data mining to help find those answers. And third is in the area of data exploration where you need to reveal the unknowns and look through the data for patterns and hidden relationships. The open source HPCC Systems big data processing platform can help companies with these challenges by deriving insights from massive data sets quickly and simply. Designed by data scientists, it is a complete integrated solution from data ingestion and data processing to data delivery. Their built-in Machine Learning Library and Matrix processing algorithms can assist with business intelligence and predictive analytics. More at

  2. Hey Frank, an excellent article I must say! The interface layer of the web has brought businesses much closer to their audience, but at the same time it has also increased competition and customer expectations! Data is undeniably a valuable resource in this digital age, but without the right data you can’t convert customer service into customer experience, and that’s why first party data must be the first consideration for businesses rather than third party data.