By David Loshin
1) An approach to specifying data validity rules that can be
used to determine whether a data instance or record has an error. This is more
of a discipline that can be guided by formal representations of business or data
rules. Often metadata management tools and data profiling tools have
repositories for capturing defined rules, leading to our next technique…
2) A method for applying those rules to data. This often will take
advantage of the operational aspects of a data profiling or monitoring tool to
validate a data instance against a set of rules. It may also incorporate parsing
and standardization rules to identify known error patterns.
3) A means for reporting errors to a data analyst or steward. Some data
analysis and profiling tools can be configured to automatically notify a data steward
when a data validity rule is violated. In other situations, the results of
applying the validation rule can be accumulated in a repository and a front-end
reporting tool is used to provide visualization and notification of errors.
4) An inventory of actions to take when specific errors occur. As your
team becomes more knowledgeable about the types of errors that can occur, you
will also become accustomed to the methods employed for analysis and remediation.
In time, the repeated use of tools and the corresponding actions for remediation
can be evolved into standardized methods, which can be documented, published,
and used as the basis for training data quality analysts.
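
To make the four techniques concrete, here is a minimal sketch in Python of how they might fit together. It is an illustration only, not a description of any particular profiling or monitoring tool: the `Rule` class, the two example rules, the field names `zip` and `email`, and the remediation text are all hypothetical.

```python
import re
from dataclasses import dataclass
from typing import Callable, Dict, List

Record = Dict[str, str]


@dataclass(frozen=True)
class Rule:
    """A data validity rule paired with its documented remediation action."""
    name: str
    check: Callable[[Record], bool]  # returns True when the record is valid
    action: str                      # entry from the inventory of actions


# 1) Specify the rules (hypothetical examples).
RULES: List[Rule] = [
    Rule(
        name="zip_is_5_digits",
        check=lambda r: re.fullmatch(r"\d{5}", r.get("zip", "")) is not None,
        action="Look up the ZIP code from the street address",
    ),
    Rule(
        name="email_present",
        check=lambda r: bool(r.get("email", "").strip()),
        action="Route to the data steward to contact the customer",
    ),
]


# 2) Apply the rules to a data instance.
def validate(record: Record, rules: List[Rule]) -> List[Rule]:
    """Return the rules that the record violates, in rule order."""
    return [rule for rule in rules if not rule.check(record)]


# 3) Report violations, and 4) attach the action to take for each one.
def report(records: List[Record], rules: List[Rule]) -> List[str]:
    """Build one report line per violation, including the suggested action."""
    lines = []
    for index, record in enumerate(records):
        for rule in validate(record, rules):
            lines.append(
                f"record {index}: violated {rule.name}; "
                f"suggested action: {rule.action}"
            )
    return lines


records = [
    {"zip": "20910", "email": "a@example.com"},  # passes both rules
    {"zip": "2091", "email": ""},                # violates both rules
]

for line in report(records, RULES):
    print(line)
```

Keeping the remediation action alongside each rule is one simple way to let the inventory of actions grow with the rule set. In practice the report lines would more likely be written to a repository or sent to a steward than printed.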