Use Data Profiling to Shine a Light on Bad Data
Data discovery is a crucial first step in the data quality journey. It lets you collect metadata on your existing records so that you can find and fix the places where bad data is entering your systems. It also lets you monitor data over time to make sure quality stays consistent. When you can see what’s in your data, you can make more informed business decisions and strategies.
By looking at specific metadata, such as the values and patterns that actually occur in a field, you can see where bad data is slipping into your systems. Then you can set business rules for incoming records, which will help you standardize your data so that it is easier to analyze and understand.
For example, you can look at the state abbreviations that have been collected in your customer data. If there are any invalid abbreviations, such as XX or YY, you can set business rules to ensure that only valid state abbreviations are accepted. The same approach works for many other fields, including ZIP Codes, email syntax, phone patterns, and more.
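To make the state abbreviation example concrete, here is a minimal Python sketch of that kind of check. It is only an illustration, not how Melissa’s Data Profiler works: the function name `profile_states` and the sample records are assumptions made up for this example.

```python
from collections import Counter

# USPS two-letter abbreviations for the 50 states plus DC.
VALID_STATES = frozenset("""
AL AK AZ AR CA CO CT DE DC FL GA HI ID IL IN IA KS KY LA ME MD MA MI MN MS MO
MT NE NV NH NJ NM NY NC ND OH OK OR PA RI SC SD TN TX UT VT VA WA WV WI WY
""".split())

def profile_states(values):
    """Count each distinct value and pick out the ones that fail the rule.

    Values are trimmed and upper-cased first; missing values count as "".
    """
    counts = Counter((v or "").strip().upper() for v in values)
    invalid = {v: n for v, n in counts.items() if v not in VALID_STATES}
    return counts, invalid

# Hypothetical sample column from a customer table.
states = ["CA", "ny", "XX", "TX", "YY", "CA", None, "XX"]
counts, invalid = profile_states(states)
print(invalid)  # the invalid values and how often each one occurs
```

The frequency counts are the profiling metadata; the `VALID_STATES` lookup is the business rule. A similar pair (a count of observed values plus a validity test) could be written for ZIP Codes, email syntax, or phone patterns.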
Once you have discovered what’s in your data, you can continue to use data profiling to monitor it over time. This helps you make sure that your data stays up to date and that your data governance processes work correctly. Ongoing profiling also shows you how well your data quality initiatives are performing and helps you find any processes that may still need attention.
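One simple way to picture monitoring over time is to profile the same field at regular intervals and compare the results. The sketch below is an assumption-laden illustration rather than a description of any particular product: the helper names `invalid_rate` and `flag_regressions`, the 5-digit ZIP Code rule, the monthly snapshots, and the 0.05 threshold are all hypothetical.

```python
def invalid_rate(values, is_valid):
    """Share of records that fail a validity rule (0.0 for an empty batch)."""
    values = list(values)
    if not values:
        return 0.0
    return sum(1 for v in values if not is_valid(v)) / len(values)

def flag_regressions(history, threshold=0.05):
    """Return the labels of snapshots whose invalid rate rose by more
    than `threshold` compared with the snapshot just before them."""
    flagged = []
    for (_, prev), (label, curr) in zip(history, history[1:]):
        if curr - prev > threshold:
            flagged.append(label)
    return flagged

def is_zip(value):
    """Simplified rule: exactly five digits (ZIP+4 is not handled here)."""
    return isinstance(value, str) and len(value) == 5 and value.isdigit()

# Hypothetical monthly snapshots of a ZIP Code column.
snapshots = {
    "2024-01": ["92688", "10001", "abcde", "60601"],
    "2024-02": ["92688", "10001", "73301", "60601"],
    "2024-03": ["9268", "", "73301", "60601"],
}
history = [(month, invalid_rate(rows, is_zip)) for month, rows in snapshots.items()]
print(flag_regressions(history))  # months where quality got noticeably worse
```

A month that gets flagged is a cue to look at whichever process feeds that field, which is the kind of follow-up that profiling over time is meant to prompt.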
Melissa’s Data Profiler has over 100 types of detailed metadata to help you shine a light on the bad data that could be lurking in your database, and is the perfect first step on any data quality journey. Be sure to download our free ebook to learn more about data profiling and how to overcome other common data obstacles!