More About Data Quality Assessment

Melissa Team | Address Quality, Analyzing Data, Analyzing Data Quality, Data Management, Data Profiling, Data Quality, Data Quality Assessment

By David Loshin

In our last series of blog entries, I shared some thoughts about data quality assessment and the use of data profiling techniques for analyzing how column value distribution and population corresponded to expectations for data quality. Reviewing the frequency distribution allowed an analyst to draw conclusions about column value completeness, the validity of data values, and compliance with defined constraints on a column-by-column basis.
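
The column-by-column review described above can be sketched in a few lines. This is a minimal illustration, not the approach used in the series: the sample column, the `VALID_STATES` domain, and the `profile_column` helper are all assumptions made up for the example.

```python
from collections import Counter

# Hypothetical sample column; None and "" stand in for missing values.
states = ["NY", "CA", "ny", "", "TX", None, "CA", "ZZ", "CA"]
VALID_STATES = {"NY", "CA", "TX"}  # assumed value domain, for illustration only

def profile_column(values, valid_domain):
    """Return the frequency distribution, completeness ratio, and invalid values."""
    freq = Counter(values)
    populated = [v for v in values if v not in (None, "")]
    completeness = len(populated) / len(values) if values else 0.0
    invalid = sorted({v for v in populated if v not in valid_domain})
    return freq, completeness, invalid

freq, completeness, invalid = profile_column(states, VALID_STATES)
print(freq.most_common(3))
print(f"completeness: {completeness:.2f}")
print("invalid values:", invalid)
```

Reading the frequency table alongside the completeness ratio and the list of out-of-domain values gives an analyst the three signals the excerpt mentions: population, validity, and constraint compliance.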

Managing Customer Connectivity

Blog Administrator | Analyzing Data, Analyzing Data Quality, Customer Centricity, Data Management, Data Quality

By David Loshin

At the end of our last entry, we concluded that standardizing potentially variant data values is a key enabler for evaluating record similarity when grouping customer records together based on any set of characteristic attributes. From an operational standpoint, this activity is supported by data quality tools that can parse and standardize data.
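
As a rough sketch of what parsing and standardizing buys you, the snippet below normalizes two variant address strings so they compare as equal. The `ABBREVIATIONS` table and the `standardize` function are hypothetical; commercial data quality tools rely on much larger reference tables and real parsers.

```python
import re

# Assumed abbreviation map, for illustration only.
ABBREVIATIONS = {"street": "st", "avenue": "ave", "road": "rd", "suite": "ste"}

def standardize(value):
    """Lowercase, replace punctuation with spaces, and normalize known variants."""
    tokens = re.sub(r"[^\w\s]", " ", value.lower()).split()
    return " ".join(ABBREVIATIONS.get(token, token) for token in tokens)

a = standardize("123 Main Street, Suite 4")
b = standardize("123 MAIN ST.  STE 4")
print(a)
print(a == b)
```

Once both values reduce to the same standardized form, a simple equality or similarity test can group the records.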

The Format of Nothing

Blog Administrator | Analyzing Data, Analyzing Data Quality, Data Governance, Data Management, Data Profiling, Data Quality

By David Loshin

The first question I always ask about missing data concerns its format, especially in systems that predate the concept of the “system null” value. For example, early systems maintained files storing tables with fixed-width columns. When one of a record’s fields was missing a value, something had to be fitted into that field to ensure that the rest of the columns lined up correctly.
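
To make the fixed-width case concrete, here is a small sketch that slices a record by column offsets and treats common filler patterns as missing. The `LAYOUT` offsets, the `PLACEHOLDERS` set, and the repeated-character heuristic are assumptions for illustration; an actual legacy file would need its own documented conventions.

```python
# Hypothetical fixed-width layout: (field name, start, end) offsets are assumptions.
LAYOUT = [("name", 0, 10), ("phone", 10, 20), ("zip", 20, 25)]
# Assumed filler conventions that may stand in for "no value".
PLACEHOLDERS = {"", "N/A", "NONE", "UNKNOWN"}

def is_missing(raw):
    """Heuristically decide whether a fixed-width field holds only filler."""
    text = raw.strip()
    if text.upper() in PLACEHOLDERS:
        return True
    # A field made of one repeated filler character, e.g. 0000000000 or 99999.
    return len(set(text)) == 1 and text[0] in "09X-"

def parse_record(line):
    """Slice a line by the layout, mapping filler values to None."""
    record = {}
    for name, start, end in LAYOUT:
        raw = line[start:end]
        record[name] = None if is_missing(raw) else raw.strip()
    return record

# Build a 25-character record: padded name, zero-filled phone, real ZIP code.
line = "Smith".ljust(10) + "0" * 10 + "10001"
print(parse_record(line))
```

The heuristic is deliberately crude: a legitimate value such as a single `0` would also be flagged, which is exactly why the format of "nothing" has to be established per system before the data is profiled.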

All or Nothing?

Blog Administrator | Address Quality, Analyzing Data, Analyzing Data Quality, Data Quality

By David Loshin

One of the most frequently referenced dimensions of data quality is completeness. At a formal level, completeness implies rules specifying mandatory assignment of values to particular data elements. In layman’s terms, that means rules that make sure critical attributes are populated with values.

Now there are a few things to think about here regarding the critical nature of completeness rules for data validity, from the data creation side and from the data consumption side.
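
A completeness rule of this kind can be expressed as a simple check over mandatory fields. The sketch below is illustrative only: the `MANDATORY` field names and the sample records are invented, and it treats blank strings as unpopulated, which is a design choice rather than a given.

```python
# Assumed critical attributes, for illustration only.
MANDATORY = ["customer_id", "last_name", "postal_code"]

def completeness_violations(record, mandatory=MANDATORY):
    """Return the mandatory fields that are absent, None, or blank."""
    return [
        field for field in mandatory
        if record.get(field) is None or str(record.get(field)).strip() == ""
    ]

records = [
    {"customer_id": 1, "last_name": "Smith", "postal_code": "20001"},
    {"customer_id": 2, "last_name": "  ", "postal_code": None},
]
for record in records:
    print(record["customer_id"], completeness_violations(record))
```

The same function can run at data creation time, to reject or flag a record, or at consumption time, to report how many records a downstream process can safely use.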

Record Linkage and Fuzzy Matching Part 1

Blog Administrator | Uncategorized

This will be the first in a series of posts on identifying similar records between two different sources, or grouping records from a single source, based on existing string column values. We will define an approach and review actual implementations with various tools and vendors’ products.

There are many facets to review. I would like to start by drawing on several articles by experienced practitioners, from either academia or the commercial market.
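
For readers who want a feel for the problem before the series gets into specific tools, here is a minimal sketch of fuzzy matching between two sources. It uses `difflib.SequenceMatcher` from the Python standard library as a stand-in similarity measure; the post does not name an algorithm, and the `similarity` and `match_records` helpers, the sample names, and the 0.85 threshold are all assumptions.

```python
from difflib import SequenceMatcher

def similarity(a, b):
    """Similarity ratio in [0, 1] between two strings, ignoring case and outer spaces."""
    return SequenceMatcher(None, a.lower().strip(), b.lower().strip()).ratio()

def match_records(source_a, source_b, threshold=0.85):
    """Compare every pair across the two sources and keep those above the threshold."""
    pairs = []
    for a in source_a:
        for b in source_b:
            score = similarity(a, b)
            if score >= threshold:
                pairs.append((a, b, round(score, 2)))
    return pairs

source_a = ["John Smith", "Mary Jones"]
source_b = ["Jon Smith", "Acme Corp"]
pairs = match_records(source_a, source_b)
print(pairs)
```

Comparing every pair is quadratic in the number of records, so this only suits small inputs; larger data sets generally need some way to limit which pairs are compared.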