Performance Scalability

Blog Administrator | Address Quality, Analyzing Data, Analyzing Data Quality, Data Integration, Data Management, Data Matching, Data Quality, Duplicate Elimination, MDM

By David Loshin

In my last post, I noted that most businesses increasingly need continuous entity identification and identity resolution as part of their information architecture, and that this need grows in proportion to the types and volumes of data that are absorbed from different sources and analyzed.

Reflections: The Challenges of Master Data Resolution

Blog Administrator | Address Quality, Analyzing Data, Analyzing Data Quality, Data Management, Data Quality, MDM

By David Loshin

I have worked for almost fifteen years on what would today be called master data management. I recall that the first significant project involved the unique identification of individuals based on records pulled from about five different sources, and it posed three specific challenges:
  1. Determination of identifying attributes – specifying the data elements that, when composed together, provide enough information to differentiate between records representing different entities;
  2. Identity resolution in the presence of variation – having the right algorithms, tools, and techniques for using the identifying attribute values to search for and find matching records among a collection of source data sets; and
  3. Performance management – tuning the algorithms and tools properly to ensure (as close to) linear scalability as the volumes of data grow.
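
To make the three challenges concrete, here is a minimal sketch in Python. It is not the approach used on that project; the attribute names, the blocking rule, and the 0.85 threshold are all assumptions chosen for illustration. It composes a few identifying attributes, scores record pairs with approximate string similarity to tolerate variation, and uses blocking (one common way to limit comparisons) so that records are only compared within small groups rather than against every other record.

```python
from collections import defaultdict
from difflib import SequenceMatcher

# Hypothetical identifying attributes; the post does not name specific ones.
IDENTIFYING_ATTRIBUTES = ("first_name", "last_name", "birth_date")


def normalize(value):
    """Lowercase and collapse whitespace so trivial variation is ignored."""
    return " ".join(str(value).lower().split())


def similarity(rec_a, rec_b):
    """Average approximate string similarity across the identifying attributes."""
    scores = [
        SequenceMatcher(None, normalize(rec_a[attr]), normalize(rec_b[attr])).ratio()
        for attr in IDENTIFYING_ATTRIBUTES
    ]
    return sum(scores) / len(scores)


def blocking_key(record):
    """Assumed blocking rule: first letter of last name plus birth year."""
    return (normalize(record["last_name"])[:1], normalize(record["birth_date"])[:4])


def find_matches(records, threshold=0.85):
    """Return index pairs of records that likely represent the same entity."""
    # Group record indices by blocking key so pairs are compared only
    # within a block, not across the whole data set.
    blocks = defaultdict(list)
    for index, record in enumerate(records):
        blocks[blocking_key(record)].append(index)

    matches = []
    for indices in blocks.values():
        for pos, i in enumerate(indices):
            for j in indices[pos + 1:]:
                if similarity(records[i], records[j]) >= threshold:
                    matches.append((i, j))
    return sorted(matches)


# Made-up example records, for illustration only.
records = [
    {"first_name": "Jonathan", "last_name": "Smith", "birth_date": "1970-03-14"},
    {"first_name": "Jonathon", "last_name": "Smith", "birth_date": "1970-03-14"},
    {"first_name": "Maria", "last_name": "Lopez", "birth_date": "1982-11-02"},
]

matches = find_matches(records)
```

The trade-off in a sketch like this is that blocking keeps the comparison count roughly proportional to the data volume only when blocks stay small, and a variation in the blocking attributes themselves (a misspelled last-name initial, say) will cause a true match to be missed.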