Data Profiling: Pushing Metadata Boundaries

Blog Administrator | Analyzing Data, Analyzing Data Quality, Data Management, Data Profiling, Data Quality, Golden Record

By Joseph Vertido, Data Quality Analyst/MVP Channel Manager

Two truths about data: Data is always changing. Data will always have problems. The two truths become one reality: bad data. Elusive by nature, bad data manifests itself in ways we wouldn’t consider and conceals itself where we least expect it. Compromised data integrity can be restored with a comprehensive understanding of the structure and contents of the data.

Record Linkage and Data Enhancement

Blog Administrator | Data Enhancement, Data Enrichment, Data Management, Data Quality, Duplicate Elimination, Record Linkage

By David Loshin

In my last two posts, we looked at the distribution of information about entities and the use of record linkage to find corresponding data records in different data sets that can be linked together. Record linkage can be used for a number of processes that we bundle under the concept of “data enhancement,” which we’ll use to describe any methods for improving the value and usefulness of information.…

Context is Key to Measuring Data Quality

Blog Administrator | Address Correction, Analyzing Data Quality, Data Management, Data Quality

By Elliot King

Beauty is in the eye of the beholder, but that is not the case when it comes to data quality or, at least, it is not the whole story. Data quality can be measured along several different dimensions. But in the final analysis, data quality depends on the context within which the data is used.

Perhaps the most obvious criterion by which to measure data quality is accuracy.…

Are You A Dupe Detective?

Blog Administrator | Data Management, Data Quality, Data Quality Components for SSIS, ETL, Fuzzy Matching

By Joseph Vertido

The process of finding approximate matching records in your data to get rid of
duplicates is precisely that – fuzzy. It raises as many questions as answers. Am
I using a good matching algorithm? Am I matching on the right fields? Is it a
true match or a false one?
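
To make those questions concrete, here is a minimal sketch of approximate matching in Python. It is not the method used by any particular product: the similarity measure (the standard library's difflib.SequenceMatcher), the helper names, the sample records, and the 0.85 threshold are all assumptions chosen for illustration. Which fields you compare and where you set the threshold are exactly the choices the questions above are asking about.

```python
from difflib import SequenceMatcher


def similarity(a: str, b: str) -> float:
    """Return a similarity ratio in [0, 1] after normalizing case and whitespace."""
    norm_a = " ".join(a.lower().split())
    norm_b = " ".join(b.lower().split())
    return SequenceMatcher(None, norm_a, norm_b).ratio()


def find_candidate_duplicates(records, fields, threshold=0.85):
    """Compare every pair of records on the chosen fields.

    Returns (i, j, score) for pairs whose average field similarity
    meets the threshold. The threshold is an illustrative assumption.
    """
    pairs = []
    for i in range(len(records)):
        for j in range(i + 1, len(records)):
            scores = [similarity(records[i][f], records[j][f]) for f in fields]
            score = sum(scores) / len(scores)
            if score >= threshold:
                pairs.append((i, j, round(score, 3)))
    return pairs


# Hypothetical sample data: the first two rows are likely the same person.
records = [
    {"name": "Jon Smith", "city": "Boston"},
    {"name": "John Smith", "city": "Boston"},
    {"name": "Mary Jones", "city": "Denver"},
]

for i, j, score in find_candidate_duplicates(records, ["name", "city"]):
    print(records[i], "<->", records[j], "score:", score)
```

Comparing every pair is only practical for small data sets, and a pair that clears the threshold is a candidate for review, not proof of a true match.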

The problem begins when inconsistent data enters from multiple sources.…