Lost in the Lake? 5 Keys to Data Lake Success

By Frank Coleman, Senior Director, DELL EMC Services, June 2, 2016

I had a cup of coffee with EMC Chief Data Governance Officer Barbara Latulippe recently. We talked about how more and more people tell us they have access to analytical sandboxes attached to a Data Lake but still can’t find the information they need.

Is this a Data Governance problem? A Skill problem? A Technology problem? A Tools problem?

The answer is yes, it’s all of that!

When you build a Data Lake, you most likely have both structured and unstructured data in it. For this post I’m only going to talk about structured data, because it’s the fastest and easiest to get value from and a larger audience will benefit.

Structured Data

Biggest Complaint: I can’t find my data!

Reply: “You have everything you need. Why are you complaining?”

So what’s the problem?

Many of us are used to working with reporting tools and having nice, clean, flat tables fed from an EDW/GDW database. Now I have thousands of tables, or more, with very little connecting them. I blogged about this problem before, likening it to dumping a bag of Legos on your desk and saying “Here you go.”

Keys to Success

  1. Light Data Governance
    • You need some form of Data Governance or you create chaos. Please read Rachel Haines’ blog on “just enough” Data Governance. You don’t want the lake buried in red tape, but you can’t just dump all the data in one place and expect value to magically jump out of it.
  2. Data SMEs
    • You need the help of your data SMEs to bring some order to this chaos and then document, record, and explain what they did. These data SMEs are the wizards who can make the magic jump out of the lake. Capturing what they do and making it available to the masses is where the value starts piling up.
  3. Leverage your Reporting Tools for help – See if the Reporting tools can show you the SQL or get IT to help
    • When you first start out, many people don’t know which columns to grab or what they are called, because they are used to working with reporting tools. Many reporting tools can show you the XML or SQL they generate when you pull the data.
  4. Focus on Team Skills
    • When we first got the Data Lake we had some skills issues. Most of my team were BI people and needed to skill up on SQL and then Hadoop. Being totally honest, not everyone was able to make that transition, and we targeted new hires who already had those skills.
    • It’s important to partner with your IT teams and have regular knowledge sharing events. Both sides can benefit as you probably have the Data SME knowledge and they have more technical knowledge. The more you collaborate the better you understand each other’s needs and how to work more effectively.
  5. It’s hard work. Wishful thinking and complaining don’t make it better.
    • Sorry, I had to throw that in : ). Regular meetings with your IT teams on what is and isn’t working are key. These are not complaint sessions bashing IT. We show real use cases that we’re struggling to get going. Early on it may be access to data, just finding the data, or query restrictions on your roles.
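
To make the “I can’t find my data” complaint a little more concrete, here is a minimal sketch of one way to search a pile of tables for columns by name. It is an illustration only, not how our Data Lake works: it uses Python’s built-in sqlite3 module as a stand-in for the lake, and the table names, column names, and the find_columns helper are all hypothetical.

```python
import sqlite3


def find_columns(conn, keyword):
    """Return sorted (table, column) pairs whose column name contains keyword."""
    keyword = keyword.lower()
    matches = []
    # sqlite_master lists every table in the database.
    tables = conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    ).fetchall()
    for (table,) in tables:
        # PRAGMA table_info returns one row per column; index 1 is the column name.
        for row in conn.execute(f'PRAGMA table_info("{table}")'):
            if keyword in row[1].lower():
                matches.append((table, row[1]))
    return sorted(matches)


# Hypothetical stand-in for a lake: three loosely connected tables.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE service_requests (sr_id INTEGER, customer_id INTEGER, opened_on TEXT);
    CREATE TABLE install_base (asset_id INTEGER, customer_id INTEGER, product TEXT);
    CREATE TABLE parts_orders (order_id INTEGER, asset_id INTEGER);
""")

# Which tables carry a customer identifier, and could therefore be joined?
for table, column in find_columns(conn, "customer"):
    print(f"{table}.{column}")
```

A real lake would expose its catalog differently, so treat this as the shape of the idea: once a data SME works out which columns link which tables, a small searchable index like this is one way to share that knowledge with everyone else.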

If you are on the journey or just thinking about getting a Data Lake, I hope you found this useful.  Please let me know if you found any other lessons that enabled your success leveraging a Data Lake.

Frank Coleman

About Frank Coleman

Senior Director, DELL EMC Services

Frank is a Senior Director of Business Operations for Dell EMC Services. He is living the world of Big Data in this role, as he is responsible for using advanced data analytics to improve the customer experience with Dell EMC’s services organization.

This role keeps Frank immersed in Big Data, and he is at the cutting edge of using Big Data to solve real business problems. Frank has a strong blend of technical knowledge and business understanding, and has spent the last nine years focused on the business of service.

Under his leadership, EMC was honored in mid-2012 for the third consecutive year with the Technology Services Industry Association (TSIA) STAR Award for “Excellence in the Use of Metrics and Business Intelligence.” Prior to joining EMC, Frank worked in various fields and remote technical support roles.
