Master Data in Big Data Management

By April Reeve January 23, 2013

Currently, most data management activities are segregated by data type: documents are kept in one type of file repository, emails in another, structured data in databases, and so on. One of the goals of big data management, and much of its value, is the ability to analyze data across these repositories. But how do we link the data together? A big part of the answer, I believe, is master data. Master data is the data describing the important things in the organization: customers, products, employees, organizational structure, financial reporting structure, etc.

For a given customer, for example, people with appropriate access in the organization should be able to view not only the customer’s name, addresses, and other demographic information, but also emails to, from, and concerning the customer, documents related to them, audio recordings of any calls to customer service, and video of the customer visiting the organization’s offices. All of the organization’s information about a customer can be made appropriately available to customer service to support a customer inquiry, to identify additional products in which the customer might be interested, or to predict likely future behavior.

Standard business intelligence tools can be used to find and connect information about a customer located in databases. Tools that search text can be used to find information related to a customer in document and email repositories, either because these items contain text with the customer’s identifying information or because someone has tagged the documentation with the customer’s ID. Similarly, audio and video files can be searched for the customer’s likeness or tagged manually with customer information. Links to a customer can be made at the time the information is stored or dynamically when a query is made about the customer. Tagging files and documents with customer identifiers can be performed automatically or manually. The ability to attach the customer information automatically is critical to big data management, since the volume of data is usually beyond what people can manage by hand and we need to move away from the concept of manually crafted metadata.
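To make the idea of automatic tagging concrete, here is a minimal sketch in Python of tagging text documents with customer identifiers at storage time. Everything in it is an assumption made for illustration: the `Customer` and `Document` classes, the `build_patterns` and `tag_document` functions, and the choice of exact name and email matching. A real system would likely need fuzzier matching and some way to handle customers who share a name.

```python
import re
from dataclasses import dataclass, field


@dataclass
class Customer:
    """Hypothetical master data record for one customer."""
    customer_id: str
    name: str
    emails: list[str] = field(default_factory=list)


@dataclass
class Document:
    """Hypothetical unstructured item (email body, memo, transcript)."""
    doc_id: str
    text: str
    customer_ids: set[str] = field(default_factory=set)


def build_patterns(customers):
    """Compile one case-insensitive pattern per customer from name and emails."""
    patterns = {}
    for c in customers:
        terms = [c.name, *c.emails]
        alternation = "|".join(re.escape(t) for t in terms)
        # Lookarounds keep "John Smith" from matching inside "John Smithson".
        patterns[c.customer_id] = re.compile(
            rf"(?<!\w)(?:{alternation})(?!\w)", re.IGNORECASE
        )
    return patterns


def tag_document(doc, patterns):
    """Attach the ID of every customer whose identifying text appears in doc."""
    for customer_id, pattern in patterns.items():
        if pattern.search(doc.text):
            doc.customer_ids.add(customer_id)
    return doc


if __name__ == "__main__":
    master = [
        Customer("C001", "Jane Doe", ["jane.doe@example.com"]),
        Customer("C002", "John Smith"),
    ]
    patterns = build_patterns(master)
    memo = Document("d1", "Call notes: Jane Doe asked about billing.")
    print(tag_document(memo, patterns).customer_ids)
```

The same `tag_document` call could just as well run when a query is made instead of when the item is stored; the trade-off is between storage-time cost and query-time latency.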

And so, if the data in databases is “structured” data whose keys are associated with the organization’s master data, then we can integrate it with the “unstructured” data in files, documents, and emails by tagging the unstructured data with master data information, whether automatically or manually, at storage time or at query time.
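As a rough illustration of that integration, the following Python sketch assembles a single customer view from structured rows keyed by a customer ID and from unstructured items that have already been tagged with the same ID. The record layouts, the field names (`customer_id`, `customer_ids`, `ref`), and the functions `build_tag_index` and `customer_view` are hypothetical and stand in for whatever databases and repositories an organization actually uses.

```python
from collections import defaultdict


def build_tag_index(items):
    """Map each customer ID to the references of the items tagged with it."""
    index = defaultdict(list)
    for item in items:
        for customer_id in item["customer_ids"]:
            index[customer_id].append(item["ref"])
    return index


def customer_view(customer_id, master, orders, tag_index):
    """Combine master data, structured rows, and tagged items for one customer."""
    return {
        "master": master.get(customer_id),
        "orders": [o for o in orders if o["customer_id"] == customer_id],
        "related_items": sorted(tag_index.get(customer_id, [])),
    }


if __name__ == "__main__":
    # Master data: the shared key that both kinds of data refer to.
    master = {"C001": {"name": "Jane Doe"}, "C002": {"name": "John Smith"}}
    # Structured data: rows that already carry the master data key.
    orders = [
        {"order_id": 1, "customer_id": "C001"},
        {"order_id": 2, "customer_id": "C002"},
    ]
    # Unstructured data: items tagged with customer IDs.
    items = [
        {"ref": "email/123", "customer_ids": {"C001"}},
        {"ref": "audio/call-9", "customer_ids": {"C001", "C002"}},
    ]
    print(customer_view("C001", master, orders, build_tag_index(items)))
```

The master data key is the only thing the two sides need to share; the structured and unstructured stores can otherwise remain separate.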

About April Reeve

With 25 years of experience as an enterprise architect and program manager, April fully deserves her Twitter handle: @Datagrrl.

She knows data extremely well, having spent more than a decade in the financial services industry where she managed implementations of very large application systems.

April is a Data Management Specialist as part of EMC Global Services, with expertise in Data Governance, Master Data Management, Business Intelligence, Data Warehousing Conversion, Data Integration and Data Quality. All of these add up to one simple statement: April is very good at helping large companies organize their data and capture value from it. April works for EMC Consulting as a Business Consultant in the Enterprise Information Management practice.
