More About Data Quality Assessment

Melissa Team | Address Quality, Analyzing Data, Analyzing Data Quality, Data Management, Data Profiling, Data Quality, Data Quality Assessment

By David Loshin

In our last series of blog entries, I shared some thoughts about data quality assessment and the use of data profiling techniques for analyzing how column value distribution and population corresponded to expectations for data quality. Reviewing the frequency distribution allowed an analyst to draw conclusions about column value completeness, the validity of data values, and compliance with defined constraints on a column-by-column basis.
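As a rough illustration of the column-by-column review described above, the following sketch computes a frequency distribution for a single column and derives completeness and validity figures from it. The function name `profile_column`, the sample values, and the rule that `None` and blank strings count as unpopulated are assumptions made for this example, not part of the original series.

```python
from collections import Counter

def profile_column(values, valid_values=None):
    """Summarize one column: frequency distribution, completeness, validity."""
    total = len(values)
    # Assumption of this sketch: None and blank strings count as unpopulated.
    populated = [v for v in values if v is not None and str(v).strip() != ""]
    frequencies = Counter(populated)
    invalid = {}
    if valid_values is not None:
        # Any observed value outside the expected domain is flagged as invalid.
        invalid = {v: n for v, n in frequencies.items() if v not in valid_values}
    return {
        "total": total,
        "completeness": len(populated) / total if total else 0.0,
        "frequencies": frequencies,
        "invalid": invalid,
    }

# Hypothetical sample: a state-code column with a blank, a null, and a bad code.
states = ["NY", "CA", "NY", "", None, "ZZ", "CA", "NY"]
report = profile_column(states, valid_values={"NY", "CA"})
print(report["completeness"], report["frequencies"].most_common(), report["invalid"])
```

Reviewing the most frequent and least frequent values in such a report is usually where unexpected defaults and outliers first show up.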

Managing Customer Connectivity

Blog Administrator | Analyzing Data, Analyzing Data Quality, Customer Centricity, Data Management, Data Quality

By David Loshin

At the end of our last entry, we had come to the conclusion that standardization of potentially variant data values was a key activator for evaluating record similarity when looking to group customer records together based on any set of characteristic attributes.

The Format of Nothing

Blog Administrator | Analyzing Data, Analyzing Data Quality, Data Governance, Data Management, Data Profiling, Data Quality

By David Loshin

The first question I always ask about missing data concerns its format, especially in systems that predate the concept of the “system null” value. For example, early systems maintained files storing tables with fixed-width columns.
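To make the fixed-width case concrete, here is a minimal sketch of reading such records and mapping placeholder values to a true null. The field layout, the sample rows, and the placeholder conventions (all spaces, text such as N/A, or a field filled with repeated 0s or 9s) are assumptions for illustration only; the conventions a given legacy system actually used have to be discovered by profiling its data.

```python
# Hypothetical fixed-width layout: (field name, start offset, width).
LAYOUT = [("name", 0, 10), ("phone", 10, 10), ("zip", 20, 5)]

# Assumed textual stand-ins for "no value"; real systems vary.
PLACEHOLDER_WORDS = {"N/A", "NA", "NONE", "NULL", "UNKNOWN"}

def looks_missing(raw):
    """Guess whether a fixed-width field holds a placeholder instead of data."""
    text = raw.strip()
    if text == "":
        return True  # field padded entirely with spaces
    if text.upper() in PLACEHOLDER_WORDS:
        return True
    # A field filled with one repeated filler digit, e.g. 00000 or 99999.
    return len(text) > 1 and len(set(text)) == 1 and text[0] in "09"

def parse_record(line):
    """Slice one fixed-width line into fields, using None for missing values."""
    record = {}
    for name, start, width in LAYOUT:
        raw = line[start:start + width]
        record[name] = None if looks_missing(raw) else raw.strip()
    return record

# Sample rows are built with ljust so the column widths are explicit.
rows = [
    "Ann Smith".ljust(10) + "5551234567" + "10001",
    "Bob Jones".ljust(10) + " " * 10 + "99999",
    "N/A".ljust(10) + "0" * 10 + " " * 5,
]
parsed = [parse_record(row) for row in rows]
```

Note that a heuristic like the repeated-digit check can misfire on legitimate values, so it is better treated as a flag for review than as a definitive rule.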

All or Nothing?

Blog Administrator | Address Quality, Analyzing Data, Analyzing Data Quality, Data Quality

By David Loshin

One of the most frequently referenced dimensions of data quality is completeness. At a formal level, completeness implies rules specifying mandatory assignment of values to particular data elements. In layman’s terms, these are rules that make sure critical attributes are populated with values.
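A completeness rule of this kind can be sketched as a check that every mandatory attribute of a record carries a value. The attribute names, the sample records, and the definition of "populated" (not `None` and not blank) are hypothetical choices for this example.

```python
# Hypothetical list of attributes that must always be populated.
MANDATORY = ["customer_id", "last_name", "postal_code"]

def is_populated(value):
    """Assumed definition: a value is populated if it is not None or blank."""
    return value is not None and str(value).strip() != ""

def completeness_violations(records, mandatory=MANDATORY):
    """Return (record index, missing attribute names) for each failing record."""
    violations = []
    for index, record in enumerate(records):
        missing = [name for name in mandatory if not is_populated(record.get(name))]
        if missing:
            violations.append((index, missing))
    return violations

# Invented sample records: one complete, one blank value, one absent attribute.
customers = [
    {"customer_id": 1, "last_name": "Rivera", "postal_code": "20901"},
    {"customer_id": 2, "last_name": "", "postal_code": "92688"},
    {"customer_id": 3, "last_name": "Nguyen"},
]
violations = completeness_violations(customers)
```

Reporting which attributes are missing, rather than a single pass or fail flag, makes it easier to see whether the gaps cluster in one attribute or one source.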

Record Linkage and Fuzzy Matching Part 1

Blog Administrator | Uncategorized

This will be the first in a series of posts on identifying similar records between two different sources, or grouping records from a single source, based on existing string column values. We will define an approach and review actual implementations with various tools and vendors’ products.