Reflections: The Challenges of Master Data Resolution

Blog Administrator | Address Quality, Analyzing Data, Analyzing Data Quality, Data Management, Data Quality, MDM

By David Loshin

I have worked for almost fifteen years on what would today be called master data management. I recall the first significant project involved unique identification of individuals based on records pulled from about five different sources, and there were three specific challenges:
  1. Determination of identifying attributes – specifying the data elements that, when composed together, provide enough information to differentiate between records representing different entities;
  2. Identity resolution in the presence of variation – having the right algorithms, tools, and techniques for using the identifying attribute values to search for and find matching records among a collection of source data sets; and
  3. Performance management – tuning the algorithms and tools properly to ensure (as close to) linear scalability as the volumes of data grow.
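
As a rough illustration of how these three challenges fit together, here is a minimal Python sketch. It is not the approach used on the project described above. The field names (`first_name`, `last_name`, `birth_date`), the similarity measure (the standard library's `difflib.SequenceMatcher`), the blocking key, and the 0.85 threshold are all assumptions chosen for the example.

```python
from collections import defaultdict
from difflib import SequenceMatcher

# Challenge 1: hypothetical identifying attributes (the post does not name specific fields).
IDENTIFYING_ATTRIBUTES = ("first_name", "last_name", "birth_date")


def normalize(value):
    """Lowercase and collapse whitespace so trivial variation does not hide a match."""
    return " ".join(str(value).lower().split())


def similarity(rec_a, rec_b):
    """Challenge 2: average string similarity (0.0 to 1.0) across the identifying attributes."""
    scores = [
        SequenceMatcher(None, normalize(rec_a[attr]), normalize(rec_b[attr])).ratio()
        for attr in IDENTIFYING_ATTRIBUTES
    ]
    return sum(scores) / len(scores)


def blocking_key(record):
    """Challenge 3: a cheap key (assumed here) so only plausible pairs are compared."""
    return normalize(record["last_name"])[:1] + normalize(record["birth_date"])[:4]


def find_matches(records, threshold=0.85):
    """Return index pairs of records whose similarity meets the threshold."""
    blocks = defaultdict(list)
    for index, record in enumerate(records):
        blocks[blocking_key(record)].append(index)

    matches = []
    for indexes in blocks.values():
        # Pairwise comparison happens only inside a block, not across the whole data set.
        for pos, i in enumerate(indexes):
            for j in indexes[pos + 1:]:
                if similarity(records[i], records[j]) >= threshold:
                    matches.append((i, j))
    return matches
```

Blocking does not guarantee linear scaling; it only keeps the number of comparisons manageable as long as the blocks stay small, and a poorly chosen key can separate records that should have matched.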

Business Rules Rule

Blog Administrator | Analyzing Data, Analyzing Data Quality, Data Management, Data Quality

By Elliot King

Back in the day when television sets were still built in America, the Zenith Corp. ran an ad that proclaimed that the quality went in before the name went on. Okay, at some point Zenith was trying to gloss over the fact that the company had fallen behind in automation and a lot of their manufacturing process was still conducted by hand.

Approximate Matching

Blog Administrator | Analyzing Data, Data Management, Data Quality, Duplicate Elimination, Record Linkage

By David Loshin

Actually, my first name is not David – that is really my middle name, but it is the given name my parents used when talking to me. This has led to a lot of confusion over the years, especially when confronted with a form asking for my “first name” and my “last name.” For official forms (like my driver’s license) I use my real first name as my “first name,” but for non-official forms I often just use David.

Record Linkage and Data Enhancement

Blog Administrator | Data Enhancement, Data Enrichment, Data Management, Data Quality, Duplicate Elimination, Record Linkage

By David Loshin

In my last two posts we looked at the distribution of information about entities and the use of record linkage to find corresponding data records in different data sets that can be linked together. Record linkage can be used for a number of processes that we bundle under the concept of “data enhancement,” which we’ll use to describe any methods for improving the value and usefulness of information.…
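
To make the idea of enhancement through linkage concrete, here is a small Python sketch under stated assumptions. The linkage key (normalized name plus postal code) and the field names are hypothetical, and exact-key linkage stands in for whatever matching technique a real project would use.

```python
def link_key(record):
    """Hypothetical linkage key: normalized name plus postal code."""
    name = " ".join(record["name"].lower().split())
    return (name, record["postal_code"].strip())


def enhance(records, reference):
    """Return copies of records with missing fields filled from linked reference rows."""
    reference_by_key = {link_key(row): row for row in reference}
    enhanced = []
    for record in records:
        merged = dict(record)
        linked = reference_by_key.get(link_key(record))
        if linked is not None:
            for field, value in linked.items():
                # Only add what the source record lacks; never overwrite existing values.
                if merged.get(field) in (None, ""):
                    merged[field] = value
        enhanced.append(merged)
    return enhanced
```

Records with no linked reference row pass through unchanged, so the enhancement step never removes or alters information that was already present.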