Is the Data Governance Value Message Getting Lost?
I recently had a client who decided to set aside an implementation of Data Governance (DG) in favor of a Data Quality (DQ) program. To me, this says that they didn't really understand the methods and value that Data Governance would bring to their organization.
But let’s set that aside for a moment…
Working backward, if an organization is to be successful at managing data quality, it will have to:
- Score the data quality (pass/fail) against desired fit-for-use quality targets
  - This is not a one-time thing: the organization will need to continuously monitor and report the quality of the data. Depending on the importance of the individual data elements, this may happen daily, weekly, monthly, etc.
- Use a data profiling tool to measure the quality of the individual data elements in the physical database(s)
- Configure the profiling tool to measure the known business rules for creation of the data (including cross-field/table dependencies and constraints on the data)
- Document the physical models where the data resides
- Document the business rules for creation of the data
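The scoring step above can be sketched in code. This is a minimal, hypothetical illustration: the sample records, the two business rules, and the 95% fit-for-use target are all invented for the example, not drawn from any particular profiling tool.

```python
# Hypothetical sketch of rule-based data quality scoring.
# Records, rules, and the 95% target are illustrative examples only.

records = [
    {"customer_id": "C001", "email": "a@example.com", "age": 34},
    {"customer_id": "C002", "email": "", "age": 29},
    {"customer_id": "C003", "email": "c@example.com", "age": -5},
]

# Business rules for creation of the data, expressed as predicates.
rules = {
    "email_present": lambda r: bool(r["email"]),
    "age_plausible": lambda r: 0 <= r["age"] <= 120,
}

FIT_FOR_USE_TARGET = 0.95  # hypothetical quality target

def score(records, rules, target):
    """Measure each rule's pass ratio and score it pass/fail against the target."""
    results = {}
    for name, rule in rules.items():
        passed = sum(1 for r in records if rule(r))
        ratio = passed / len(records)
        results[name] = {
            "ratio": ratio,
            "status": "pass" if ratio >= target else "fail",
        }
    return results

for name, result in score(records, rules, FIT_FOR_USE_TARGET).items():
    print(f"{name}: {result['ratio']:.0%} -> {result['status']}")
```

In a real deployment the predicates would be configuration inside a profiling tool and the scoring run would be scheduled, but the logic is the same: measure each rule, compare against the target, report pass/fail.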
By comparison, here is the method for “data onboarding” that EMC recommends as part of our Data Governance reference model:
- Establish who has decision rights over, and is accountable for the quality of the data
- Understand who is a stakeholder in the data by virtue of creating, using, or updating the data
- Document the business term definition of each data element
- Document the business rules for the creation of the data (including dependencies and constraints)
- Document the business rules for the usage of the data
- Prioritize the data with respect to its criticality to the business
  - Not all data is created equal
  - Data of lower importance should not be subject to the same level of rigor as KDEs (Key Data Elements)
- Document the metrics to be used to determine the quality of the data
- In context, document the fit-for-use quality target(s) for the data (Note: each data use context may have a different acceptable quality target)
- Document the lineage of the data from source systems, through ETL, to storage in the physical database(s)
- Configure the data profiling tools to test the data against the documented business rules
- Continuously measure the quality of the data using the configured data profiling tool
- Score the data quality and report back to the business on the pass/fail status for each data use context
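The note above that each data use context may have a different acceptable quality target can be made concrete with a small sketch. The element name, measured ratio, and per-context targets below are hypothetical examples:

```python
# Hypothetical sketch: the same measured quality ratio can pass in one
# use context and fail in another, because each context defines its own
# fit-for-use target. All names and numbers are illustrative.

measured_quality = {"customer_email": 0.92}  # e.g. from a profiling run

# Per-context fit-for-use targets for the same data element.
context_targets = {
    "marketing_campaign": 0.90,    # bounce-tolerant: 90% is acceptable
    "regulatory_reporting": 0.99,  # strict: near-complete data required
}

def score_by_context(ratio, targets):
    """Score one measured quality ratio against each context's target."""
    return {ctx: ("pass" if ratio >= t else "fail") for ctx, t in targets.items()}

report = score_by_context(measured_quality["customer_email"], context_targets)
for ctx, status in report.items():
    print(f"customer_email / {ctx}: {status}")
```

The same 92% measurement passes for the campaign context and fails for the reporting context, which is why the onboarding method documents quality targets per use context rather than one global threshold.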
Obviously, when it comes to properly measuring data quality, there is a need to embed Data Governance in the process (whether the organization chooses to call it Data Governance or not!).
What the organization gains by building a Data Governance program in the service of a Data Quality program is:
- A clear definition of the roles and responsibilities around the management and measurement of the data
  - Accountability is key to understanding, maintaining, and improving the quality of the data over time (continuous quality improvement)
- A well-defined set of policies for:
  - Onboarding data (establishing the definitions, business rules, quality targets)
  - Scoring (scoring models) and publishing the results of the data quality measurement
  - Documenting discovered data quality defects
  - Triage, root cause analysis, impact analysis, and prioritization of data defects
  - Remediation of data defects and, where necessary, the broken processes that created the data
  - Change Management around the deployment of data / process defect fixes
- An operational framework for integrating the people and processes with the data
One of the key concepts that you should take away from this case study is the idea that your approach to Data Governance implementation should be one of building "just enough Data Governance" to get the job done. If a DG program is initially built to support a DQ initiative, then that DG program must be appropriately scaled to fit, while still leaving open the possibility of future federation of the DG methods, policies, and collateral across multiple data domains, or of a future scaling of the DG model up to an enterprise-class (COE) program.
In any case, IMO, any attempt to do DQ without at least some measure of DG is doomed to failure.