Contact Data and Identifying Information

Blog Administrator | Address Quality, Analyzing Data, Analyzing Data Quality, Data Quality

By David Loshin

When inspecting two records for similarity (or for differentiation), the values in the identifying attributes from each corresponding record are compared to determine whether the two records can be presumed to represent the same entity or distinct entities.

For people, some attributes are obvious candidates for comparison: those inherently associated with the individual, such as first name, last name, birth date, eye color, or birth location.
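The attribute-by-attribute comparison described above might be sketched as follows. This is a minimal illustration, not the author's method: the `Person` fields are taken from the attributes listed in the excerpt, while the normalization step, the `min_matches` threshold, and all function names are assumptions made for the example.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Person:
    # Identifying attributes named in the excerpt (eye color omitted for brevity).
    first_name: str
    last_name: str
    birth_date: date
    birth_location: str


def normalize(value: str) -> str:
    # Assumption: comparison ignores case and extra whitespace.
    return " ".join(value.split()).casefold()


def matching_attributes(a: Person, b: Person) -> dict:
    # Compare each identifying attribute of one record with the
    # corresponding attribute of the other.
    return {
        "first_name": normalize(a.first_name) == normalize(b.first_name),
        "last_name": normalize(a.last_name) == normalize(b.last_name),
        "birth_date": a.birth_date == b.birth_date,
        "birth_location": normalize(a.birth_location) == normalize(b.birth_location),
    }


def presumed_same_entity(a: Person, b: Person, min_matches: int = 4) -> bool:
    # Hypothetical decision rule: presume the same entity when at least
    # `min_matches` identifying attributes agree.
    return sum(matching_attributes(a, b).values()) >= min_matches
```

Requiring every attribute to agree is the strictest possible setting; lowering `min_matches` trades fewer missed matches for more false ones.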

Validation of Data Rules

Blog Administrator | Address Quality, Address Validation, Analyzing Data, Analyzing Data Quality, Data Profiling, Data Quality

By David Loshin

Over the past few blog posts, we have looked at the ability to define data quality rules asserting consistency constraints between two or more data attributes within a single data instance, as well as cross-table consistency constraints to ensure referential integrity. Data profiling tools provide the ability to both capture these kinds of rules within a rule repository and then apply those rules against data sets as a method for validation.
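As a rough sketch of the idea, the snippet below keeps a small rule repository and applies it to a data set, then runs a separate referential integrity check. It does not reflect any particular profiling tool: the rule names, the record fields (`order_date`, `ship_date`, `country`, `state`, `customer_id`), and both functions are hypothetical.

```python
from typing import Callable, Dict, List, Tuple

Record = dict
Rule = Callable[[Record], bool]

# Hypothetical rule repository: rule name -> predicate over a single record.
# Dates are assumed to be ISO 8601 strings, so string comparison orders them.
RULES: Dict[str, Rule] = {
    "ship_date_not_before_order_date": lambda r: r["ship_date"] >= r["order_date"],
    "us_address_has_state": lambda r: r["country"] != "US" or bool(r.get("state")),
}


def validate(records: List[Record], rules: Dict[str, Rule]) -> List[Tuple[int, str]]:
    """Return (record index, rule name) for every rule a record violates."""
    return [
        (i, name)
        for i, record in enumerate(records)
        for name, rule in rules.items()
        if not rule(record)
    ]


def orphaned_keys(child_rows, foreign_key, parent_rows, primary_key):
    """Cross-table check: foreign-key values with no matching parent row."""
    parent_keys = {row[primary_key] for row in parent_rows}
    return [row[foreign_key] for row in child_rows if row[foreign_key] not in parent_keys]
```

Keeping the rules as data, separate from the code that applies them, is what lets the same repository be run against any data set that carries the expected attributes.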

Get Used to It: Inconsistent Data is the New Normal

Melissa Team | Address Quality, Analyzing Data, Analyzing Data Quality, Data Cleansing, Data Management, Data Quality

By Elliot King

Nobody is perfect, and neither is corporate data. Indeed, data errors are intrinsic to IT’s DNA. Data inevitably decays. Errors can be introduced when data from outside sources are merged into a system. And then, of course, the humans who interact with the system are, well, human.

Unfortunately, despite the best efforts of data quality professionals, three major IT trends (analytics, big data, and unstructured data) promise great payoffs generally but also threaten to exacerbate data quality issues.

Understanding Hierarchies

Blog Administrator | Address Quality, Analyzing Data, Analyzing Data Quality, Data Management, Data Quality

By David Loshin

Defining standards for group classification helps reduce the confusion caused by inconsistencies across generated reports and analyses. In the automobile classification example we have been using for the past few posts, we might pick the NHTSA values (mini passenger cars, light passenger cars, compact passenger cars, medium passenger cars, heavy passenger cars, sport utility vehicles, pickup trucks, and vans) as the standard.
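One way to apply such a standard is to map each source system's labels onto it before reporting. The sketch below uses the NHTSA values listed above; the local labels in `LOCAL_TO_STANDARD`, the `"unclassified"` bucket, and the function names are purely illustrative assumptions.

```python
from collections import Counter

# Standard classification values, as listed in the post.
NHTSA_CLASSES = (
    "mini passenger cars",
    "light passenger cars",
    "compact passenger cars",
    "medium passenger cars",
    "heavy passenger cars",
    "sport utility vehicles",
    "pickup trucks",
    "vans",
)

# Hypothetical labels from a local system, mapped onto the standard.
LOCAL_TO_STANDARD = {
    "suv": "sport utility vehicles",
    "crossover": "sport utility vehicles",
    "minivan": "vans",
    "pickup": "pickup trucks",
    "compact": "compact passenger cars",
}


def standardize(label):
    """Return the standard class for a label, or None if it is unmapped."""
    key = label.strip().lower()
    if key in NHTSA_CLASSES:
        return key
    return LOCAL_TO_STANDARD.get(key)


def count_by_standard(labels):
    """Roll up raw labels into counts per standard class."""
    return Counter(standardize(label) or "unclassified" for label in labels)
```

Reports built from `count_by_standard` then use the same group names regardless of which system supplied the raw labels, and unmapped labels surface explicitly instead of silently forming their own groups.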