Structural Differences and Data Matching

By David Loshin

Data matching is easy when the values are exact, but there are different types of variation that complicate matters. Let's start at the foundation: structural differences in the ways that two data sets represent the same concepts. For example, early application systems used data files that were relatively "wide," capturing a lot of information in each record,…

Improving Identity Resolution and Matching via Structure, Standards, and Content

By David Loshin

One of the most frequently performed activities associated with customer data is searching: given a customer's name (and perhaps some other information), looking up that customer's records in databases. This leads to an enduring challenge for data quality management, which supports finding the right data through record matching, especially when you don't have all the data…

Clean Data is Good Data

By Elliot King

The cliché is as old as computing itself: garbage in, garbage out. And that cliché is as true now as ever, if not more so. Unfortunately, with information flowing into companies from so many sources, including the Web and third-party providers, mistakes should not just be expected; they are basically inevitable. Garbage data is going to get in…

Enter the Contact Zone: Where Data Integration and Data Quality Are Simplified

By Joseph Vertido

For many, the concepts of data integration and data quality are separate and have nothing in common. But in reality, when you combine them, they create a partnership that excels. Where data quality leaves off, data integration begins, and vice versa. A new product, Contact Zone, fuses these two concepts together into one revolutionary solution…

Validation, Standardization, and Correction: Tool or Process?

By David Loshin

There are all sorts of tools associated with address standardization, cleansing, and validation. As an example, the USPS has a certification program for software vendors, referred to as CASS (Coding Accuracy Support System)™ certification. According to their website, CASS enables the Postal Service™ to evaluate the accuracy of address matching software programs in the following areas: (1)…

Achieving “Proactivity?”

By David Loshin

Standardizing the approaches and methods used for reviewing data errors, performing root cause analysis, and designing and applying corrective or remedial measures all help ratchet an organization's data quality maturity up a notch or two. This is particularly effective when fixing the processes that allow data errors to be introduced in the first place totally eliminates the…

Record Linkage and Data Enhancement

By David Loshin

In my last two posts we looked at the distribution of information about entities and the use of record linkage to find corresponding data records in different data sets that can be linked together. Record linkage can be used for a number of processes that we bundle under the concept of "data enhancement," which we'll use to…
