Data Quality Assessment: Value and Pattern Frequency

Blog Administrator | Analyzing Data, Analyzing Data Quality, Data Profiling, Data Quality, Data Quality Assessment

By David Loshin

Once we have started our data quality assessment process by performing column value analysis, we can reach beyond the types of null value analysis we discussed in the previous blog post. Since our column analysis effectively tallies the number of occurrences of each value that appears in the column, we can use this frequency distribution of values to identify additional potential data flaws by considering a number of different aspects of value frequency (as well as lexicographic ordering), including:

  • Range Analysis, which looks at the values and allows the analyst to consider whether they can be ordered, and if so, whether they are constrained within a well-defined range.
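
As a rough illustration of the frequency tally and range check described above, here is a minimal Python sketch. The column contents, the function names, and the 0 to 120 bounds are all hypothetical and chosen only for the example; a real assessment would take its bounds from the business rules for the column.

```python
from collections import Counter

def value_frequencies(values):
    """Tally how often each distinct value appears in a column."""
    return Counter(values)

def out_of_range(freq, low, high):
    """Return the values (with their counts) outside the inclusive range [low, high]."""
    return {v: n for v, n in freq.items() if not (low <= v <= high)}

# Hypothetical column of ages; 212 and -1 are planted as suspect values.
ages = [34, 27, 34, 212, 45, 27, 34, -1]

freq = value_frequencies(ages)

# The values can be ordered, so the observed range is simply (min, max).
observed_range = (min(freq), max(freq))

# Assumed rule for this example only: ages should fall between 0 and 120.
suspects = out_of_range(freq, 0, 120)

print(freq.most_common(3))
print(observed_range)
print(suspects)
```

Values that fall outside the expected range are only candidates for review; whether they are actual errors depends on the rules that govern the column.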

Data Quality Assessment: Sparsity and Nullness

Blog Administrator | Analyzing Data, Analyzing Data Quality, Data Quality, Data Quality Assessment

By David Loshin

The first set of data quality assessment techniques that use column value frequency analysis focuses on the relationship of the population of values to the business processes that consume the data. The intent is to understand how the relative population of the column is associated with defined (or implicit) business rules, and then isolate and validate those rules.
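
One simple way to look at the relative population of a column is to measure what fraction of its entries actually carry a value. The sketch below assumes a hypothetical set of null markers (`None`, the empty string, and "N/A") and a made-up fax number column; the markers that count as "null" in a given data set would need to be confirmed with the people who own the data.

```python
def population_ratio(values, null_markers=(None, "", "N/A")):
    """Return the fraction of entries that carry a real (non-null) value."""
    if not values:
        return 0.0
    populated = sum(1 for v in values if v not in null_markers)
    return populated / len(values)

# Hypothetical fax number column in which most entries are missing.
fax_numbers = ["555-0100", None, "", "N/A", None, "555-0199", None, None]

ratio = population_ratio(fax_numbers)
print(f"populated: {ratio:.0%}")
```

A sparsely populated column is not necessarily a flaw. The ratio is a prompt to ask which business rule, defined or implicit, explains it.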

Ask First, Fix Later

Blog Administrator | Address Quality, Analyzing Data, Analyzing Data Quality, Data Audit, Data Quality, Data Quality Assessment

By Elliot King

Like the Boston Red Sox breaking their fans’ hearts, almost inevitably (stress on the almost) you will discover that some percentage of your data is wrong. The realization that you have data quality problems may come about for one of two reasons: 1) you’ve looked under the hood of your data systems by conducting a data assessment or 2) a data audit revealed that the data you have is not what you think you have.

Data Quality Assessment: Column Value Analysis

Blog Administrator | Analyzing Data, Analyzing Data Quality, Data Cleansing, Data Enrichment, Data Profiling, Data Quality, Data Quality Assessment

By David Loshin

In recent blog series, I have shared some thoughts about methods used for data quality and data correction/cleansing. This month, I’d like to share some thoughts about data quality assessment and the techniques that analysts use to review potential anomalies that present themselves.

The place to start, though, is not with the assessment task per se, but with the context in which the data quality analyst will find him/herself when asked to identify potential data quality flaws.…

Customer Centricity and Birds of a Feather

Blog Administrator | Address Quality, Analyzing Data, Analyzing Data Quality, Data Cleansing, Data Management, Data Quality, Data Quality Assessment

By David Loshin

Why do we care to establish physical locations for individuals? One reason should be patently obvious: in every interaction between a staff member from your company and a customer, both parties are always physically located somewhere, and many business performance indicators are pinned to a location dimension, such as “sales,” “customer complaints,” or “product distribution” by region.

Location is meaningful when it comes to analyzing customer behavior.…