Contact Data and Identifying Information

Blog Administrator | Address Quality, Analyzing Data, Analyzing Data Quality, Data Quality

By David Loshin

When inspecting two records for similarity (or for differentiation), the values in the identifying attributes from each corresponding record are compared to determine whether the two records can be presumed to represent the same entity or distinct entities.

For people, there are some obvious attributes used for comparison – they are ones that are inherently associated with the individual, such as first name, last name, birth date, eye color, or birth location.… Read More
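As a rough illustration of attribute-by-attribute comparison, the sketch below treats first name, last name, and birth date as the identifying attributes. The `Person` class, the `normalize` helper, and the sample values are hypothetical and chosen only for this example; a real matcher would likely use more attributes and tolerate more variation.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Person:
    # Hypothetical record layout: three identifying attributes only.
    first_name: str
    last_name: str
    birth_date: date


def normalize(text: str) -> str:
    """Lowercase and collapse whitespace so trivial differences do not block a match."""
    return " ".join(text.lower().split())


def same_entity(a: Person, b: Person) -> bool:
    """Presume two records match only if every identifying attribute agrees."""
    return (
        normalize(a.first_name) == normalize(b.first_name)
        and normalize(a.last_name) == normalize(b.last_name)
        and a.birth_date == b.birth_date
    )


r1 = Person("Ana", "Lopez", date(1970, 1, 15))
r2 = Person(" ana ", "LOPEZ", date(1970, 1, 15))  # same person, different formatting
r3 = Person("Ana", "Lopez", date(1982, 6, 3))     # same name, different birth date

print(same_entity(r1, r2))
print(same_entity(r1, r3))
```

Requiring every attribute to agree is the strictest possible policy; it keeps distinct people apart but misses records with typos or missing values.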

Reflections: The Challenges of Master Data Resolution

Blog Administrator | Address Quality, Analyzing Data, Analyzing Data Quality, Data Management, Data Quality, MDM

By David Loshin

I have worked for almost fifteen years on what would today be called master data management. I recall the first significant project involved unique identification of individuals based on records pulled from about five different sources, and there were three specific challenges:
  1. Determination of identifying attributes – specifying the data elements that, when composed together, provide enough information to differentiate between records representing different entities;
  2. Identity resolution in the presence of variation – having the right algorithms, tools, and techniques for using the identifying attribute values to search for and find matching records among a collection of source data sets; and
  3. Performance management – tuning the algorithms and tools properly to ensure (as close to) linear scalability as the volumes of data grow.
Read More
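The second challenge, matching in the presence of variation, can be sketched with approximate string comparison. This is not the approach used in the project described; it is a minimal example built on Python's standard `difflib.SequenceMatcher`, and the field names, the sample records, and the 0.85 threshold are all assumptions.

```python
from difflib import SequenceMatcher


def similarity(a: str, b: str) -> float:
    """Return a 0..1 similarity ratio between two trimmed, lowercased strings."""
    return SequenceMatcher(None, a.strip().lower(), b.strip().lower()).ratio()


def find_matches(query, records, threshold=0.85):
    """Return records whose averaged name similarity meets the assumed threshold."""
    matches = []
    for rec in records:
        score = (
            similarity(query["first_name"], rec["first_name"])
            + similarity(query["last_name"], rec["last_name"])
        ) / 2
        if score >= threshold:
            matches.append(rec)
    return matches


# Hypothetical source records.
records = [
    {"first_name": "Jonathon", "last_name": "Meyers"},
    {"first_name": "Maria", "last_name": "Gonzalez"},
]
query = {"first_name": "Jonathan", "last_name": "Meyers"}

matches = find_matches(query, records)
print(matches)
```

This scan compares the query against every record, so matching a whole data set against itself grows quadratically. That is exactly the third challenge: approaching linear scalability would need something extra, such as grouping candidate records by a cheap key before comparing them.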

Validation of Data Rules

Blog Administrator | Address Quality, Address Validation, Analyzing Data, Analyzing Data Quality, Data Profiling, Data Quality

By David Loshin

Over the past few blog posts, we have looked at the ability to define data quality rules asserting consistency constraints between two or more data attributes within a single data instance, as well as cross-table consistency constraints to ensure referential integrity. Data profiling tools provide the ability to both capture these kinds of rules within a rule repository and then apply those rules against data sets as a method for validation.
Read More
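To make the two kinds of rules concrete, here is a small sketch of one cross-attribute rule and one cross-table rule applied to in-memory rows. The table layouts, column names, and rule functions are hypothetical; a data profiling tool would store such rules in its own repository rather than as Python functions.

```python
from datetime import date

# Hypothetical parent table.
customers = [
    {"customer_id": 1, "start_date": date(2020, 1, 1), "end_date": date(2021, 1, 1)},
    {"customer_id": 2, "start_date": date(2022, 5, 1), "end_date": date(2021, 5, 1)},
]

# Hypothetical child table that references customers by customer_id.
orders = [
    {"order_id": 10, "customer_id": 1},
    {"order_id": 11, "customer_id": 3},
]


def check_date_order(rows):
    """Cross-attribute rule: start_date must not come after end_date."""
    return [r["customer_id"] for r in rows if r["start_date"] > r["end_date"]]


def check_referential_integrity(child_rows, parent_rows):
    """Cross-table rule: every order must reference an existing customer."""
    known_ids = {p["customer_id"] for p in parent_rows}
    return [c["order_id"] for c in child_rows if c["customer_id"] not in known_ids]


date_violations = check_date_order(customers)
orphan_orders = check_referential_integrity(orders, customers)

print("customers violating the date rule:", date_violations)
print("orders without a matching customer:", orphan_orders)
```

Each rule returns the keys of the violating rows rather than a single pass/fail flag, which makes it easier to trace a failure back to the offending records.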

Fundamentals of SQL Server Contact Data Cleansing

Blog Administrator | Address Quality, Analyzing Data, Analyzing Data Quality, Data Quality, Data Quality Components for SSIS, SQL Server Integration Services

By Joseph Vertido

New Webinar!
The roadblocks involved in implementing a solid routine for normalizing data can be extensive and time-consuming, especially when dealing with specific types of domains. In this new webcast, our data quality analyst Joseph Vertido will help you understand the major roadblocks in normalizing contact data (address, name, phone, and email) and experience a new approach from Melissa Data, using plugins for Microsoft SQL Server Integration Services 2012 to effectively normalize contact data, all at no cost.… Read More

More About Data Quality Assessment

Melissa Team | Address Quality, Analyzing Data, Analyzing Data Quality, Data Management, Data Profiling, Data Quality, Data Quality Assessment

By David Loshin

In our last series of blog entries, I shared some thoughts about data quality assessment and the use of data profiling techniques for analyzing how column value distribution and population corresponded to expectations for data quality. Reviewing the frequency distribution allowed an analyst to draw conclusions about column value completeness, the validity of data values, and compliance with defined constraints on a column-by-column basis.
Read More
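A column-level frequency distribution of this kind can be sketched in a few lines. The `profile_column` function, the sample state codes, and the allowed-value set below are assumptions made for illustration, not part of any particular profiling tool.

```python
from collections import Counter


def profile_column(values, allowed=None):
    """Summarize one column: value frequencies, completeness, and invalid count."""
    counts = Counter(values)
    total = len(values)
    # Treat None and the empty string as missing values.
    missing = sum(n for v, n in counts.items() if v is None or v == "")
    invalid = 0
    if allowed is not None:
        # Count populated values that fall outside the allowed domain.
        invalid = sum(
            n for v, n in counts.items()
            if v not in (None, "") and v not in allowed
        )
    return {
        "frequencies": counts,
        "completeness": (total - missing) / total if total else 0.0,
        "invalid_count": invalid,
    }


# Hypothetical column of state codes with two missing and one invalid value.
states = ["CA", "NY", "CA", "", "ZZ", None, "NY", "CA"]
profile = profile_column(states, allowed={"CA", "NY"})

print(profile["frequencies"].most_common())
print(profile["completeness"], profile["invalid_count"])
```

An analyst would compare the reported completeness and invalid count against the expectations set for the column, and inspect the rarest values in the distribution for likely errors.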