Data Quality Assessment: Column Value Analysis

By David Loshin
In a recent blog series, I shared some thoughts about methods used for data quality and data correction/cleansing. This month, I'd like to share some thoughts about data quality assessment and the techniques that analysts use to review potential anomalies that present themselves. The place to start, though, is not with the assessment task per se, but…
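As a rough sketch of what column value analysis can surface (not taken from the full post; the column data, surrogate-null list, and summary statistics here are purely illustrative), a profile like the one below tallies missing entries, distinct values, frequency spikes, and singleton values for a single column — the kinds of counts an analyst scans for potential anomalies:

```python
from collections import Counter

def profile_column(values):
    """Summarize a single column with counts that help flag potential anomalies.
    (Illustrative sketch only; real assessments rely on profiling tools.)"""
    non_null = [v for v in values if v not in (None, "", "N/A")]
    freq = Counter(non_null)
    return {
        "total": len(values),
        "missing": len(values) - len(non_null),
        "distinct": len(freq),
        "most_common": freq.most_common(5),  # spikes may indicate default/dummy values
        "singletons": sum(1 for c in freq.values() if c == 1),  # possible typos/outliers
    }

# Example: a state-code column with a few suspicious values worth reviewing
print(profile_column(["NY", "NY", "ny", "XX", None, "CA", "CA", ""]))
```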

Continue Reading

Using Data Quality Tools for Classification

By David Loshin
Hierarchical classification schemes are great for scanning through unstructured text to identify critical pieces of information that can be mapped to an organized analytical profile. To enable this scanning capability, you will need two pieces of technology. The first involves a text analysis methodology for scanning text and determining which character strings and phrases are meaningful and…
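As a loose illustration of that first piece (the taxonomy, trigger phrases, and matching approach below are assumptions for the example, not the methodology the post describes), a scanner can walk a small hierarchical scheme and report which meaningful phrases appear in free text:

```python
import re

# Hypothetical hierarchical classification scheme: category -> subcategory -> trigger phrases
TAXONOMY = {
    "sports": {"golf": ["tee time", "golf club"], "running": ["marathon", "5k race"]},
    "travel": {"air": ["boarding pass", "frequent flyer"], "hotel": ["room upgrade"]},
}

def classify(text):
    """Return (category, subcategory, phrase) hits found in the text."""
    hits = []
    lowered = text.lower()
    for category, subcats in TAXONOMY.items():
        for subcat, phrases in subcats.items():
            for phrase in phrases:
                if re.search(r"\b" + re.escape(phrase) + r"\b", lowered):
                    hits.append((category, subcat, phrase))
    return hits

print(classify("Booked a tee time after picking up my boarding pass."))
# -> [('sports', 'golf', 'tee time'), ('travel', 'air', 'boarding pass')]
```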

Continue Reading

What is Meant by “Flocking Together” Virtually?

By David Loshin
In my last two posts, we have been reviewing the contact methods that have traditionally been used to identify a customer's location, and the ramifications of an increasing trend in which the contact mechanism is less reliable for establishing a location. In particular, the traditional use of telephone numbers to isolate a customer's…

Continue Reading

The Format of Nothing

By David Loshin
The first thing I always wonder about missing data is its format, especially in systems that predate the concept of the "system null" value. For example, early systems maintained files storing tables with fixed-width columns. When one of a record's fields was missing a value, something had to be fitted into that…
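To make the problem concrete (the filler values and field widths below are invented for illustration; actual conventions vary by system), a parser for such fixed-width records has to decide which stand-in values really mean "nothing":

```python
# Common stand-ins that legacy fixed-width files used when a value was absent
# (illustrative list only; each system had its own conventions)
NULL_SURROGATES = {"", "N/A", "NA", "NULL", "UNKNOWN", "9999", "99999", "00000000"}

def is_effectively_missing(field):
    """Treat blanks and conventional filler values as missing."""
    return field.strip().upper() in NULL_SURROGATES

def parse_fixed_width(line, widths):
    """Slice a fixed-width record into fields, mapping filler values to None."""
    fields, pos = [], 0
    for w in widths:
        raw = line[pos:pos + w]
        fields.append(None if is_effectively_missing(raw) else raw.strip())
        pos += w
    return fields

# A record whose middle field was "filled" with nines and whose last field is blank
print(parse_fixed_width("SMITH     9999      ", [10, 4, 6]))
# -> ['SMITH', None, None]
```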

Continue Reading

Improving Identity Resolution and Matching via Structure, Standards, and Content

By David Loshin
One of the most frequently performed activities associated with customer data is searching: given a customer's name (and perhaps some other information), looking up that customer's records in databases. And this leads to an enduring challenge for data quality management, which supports finding the right data through record matching, especially when you don't have all the data…
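As a toy sketch of that matching challenge (the normalization, scoring weights, and fields below are assumptions, not how any particular matching product works), a search can blend approximate name similarity with whichever supporting attributes happen to be present:

```python
import re
from difflib import SequenceMatcher

def normalize(name):
    """Uppercase, drop punctuation, and collapse whitespace before comparing."""
    return re.sub(r"\s+", " ", re.sub(r"[^A-Z ]", "", name.upper())).strip()

def match_score(query, record):
    """Blend approximate name similarity with any supporting fields present.
    (A toy scoring scheme; production matching uses richer parsing and weights.)"""
    score = SequenceMatcher(None, normalize(query["name"]), normalize(record["name"])).ratio()
    for field in ("zip", "phone"):
        if query.get(field) and record.get(field):
            score += 0.2 if query[field] == record[field] else -0.2
    return score

candidates = [
    {"name": "Jon Smythe", "zip": "11223"},
    {"name": "John Smith", "zip": "11223"},
]
query = {"name": "Jhon Smith", "zip": "11223"}  # misspelled name, partial data
print(max(candidates, key=lambda rec: match_score(query, rec)))
# -> {'name': 'John Smith', 'zip': '11223'}
```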

Continue Reading

Enter the Contact Zone: Where Data Integration and Data Quality Are Simplified

By Joseph Vertido
For many, the concepts of data integration and data quality are separate and have no commonality. But in reality, when you combine them, they create a partnership that excels. Where data quality leaves off, data integration begins, and vice versa. A new product, Contact Zone, fuses these two concepts into one revolutionary solution…

Continue Reading

Validation, Standardization, and Correction: Tool or Process?

By David Loshin
There are all sorts of tools associated with address standardization, cleansing, and validation. As an example, the USPS has a certification program for software vendors, referred to as CASS (Coding Accuracy Support System)™ certification. According to its website, CASS enables the Postal Service™ to evaluate the accuracy of address-matching software programs in the following areas: (1)…
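For a flavor of what "standardization" means here (a minimal sketch with an invented suffix table; it is nowhere near what CASS-certified software does, which covers full address parsing, matching, and delivery-point validation), consider normalizing street-suffix variants to standard abbreviations:

```python
# A few street-suffix variants mapped to standard abbreviations
# (tiny illustrative subset; certified tools cover far more than this)
SUFFIXES = {"STREET": "ST", "STR": "ST", "AVENUE": "AVE", "AV": "AVE",
            "ROAD": "RD", "BOULEVARD": "BLVD", "LANE": "LN"}

def standardize_street(line):
    """Uppercase the address line, strip punctuation, and normalize the trailing suffix."""
    tokens = line.upper().replace(".", "").replace(",", "").split()
    if tokens and tokens[-1] in SUFFIXES:
        tokens[-1] = SUFFIXES[tokens[-1]]
    return " ".join(tokens)

print(standardize_street("123 north main Street"))  # -> 123 NORTH MAIN ST
```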

Continue Reading

Achieving “Proactivity?”

By David Loshin
Standardizing the approaches and methods used for reviewing data errors, performing root cause analysis, and designing and applying corrective or remedial measures all help ratchet an organization's data quality maturity up a notch or two. This is particularly effective when fixing the processes that allow data errors to be introduced in the first place totally eliminates the…

Continue Reading

Standardizing Your Approach to Monitoring the Quality of Data

By David Loshin
In my last post, I suggested three techniques for maturing your organizational approach to data quality management. The first recommendation was defining processes for evaluating errors when they are identified. These types of processes involve a few key techniques: 1) An approach to specifying data validity rules that can be used to determine whether a data…
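As a small sketch of that first technique (the rule names, fields, and thresholds below are hypothetical), data validity rules can be expressed as named predicates and evaluated against each record to report which rules it violates:

```python
# Hypothetical validity rules expressed as (rule name, predicate) pairs
RULES = [
    ("zip_is_5_digits", lambda r: r.get("zip", "").isdigit() and len(r.get("zip", "")) == 5),
    ("age_in_range",    lambda r: 0 <= r.get("age", -1) <= 120),
    ("email_has_at",    lambda r: "@" in r.get("email", "")),
]

def validate(record):
    """Return the names of the validity rules this record violates."""
    return [name for name, check in RULES if not check(record)]

record = {"zip": "1234", "age": 45, "email": "pat.example.com"}
print(validate(record))  # -> ['zip_is_5_digits', 'email_has_at']
```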

Continue Reading

Four Pillars of Data Quality Improvement

By Elliot King
Almost all data quality management programs have four key elements that serve as the foundations for success: data profiling, data improvement, data integration, and data augmentation. Put another way, data quality programs must determine what is broken, fix what can be fixed, consolidate what can be consolidated, and enhance what needs to be enhanced. Sounds easy, right? If…

Continue Reading