The Challenge of Identifying Information

Blog Administrator | Analyzing Data, Data Integration, Data Management, Data Quality, Record Linkage

By David Loshin

In my last post, I introduced the question of determining which characteristics are used to uniquely differentiate between any pair of records within a data set. The same question is relevant when attempting to match a pair of records as well, once they are determined to represent the same entity.

Entities and their Characteristics

Blog Administrator | Analyzing Data Quality, Data Quality, Fuzzy Matching, Record Linkage

By David Loshin

How can you tell if two records refer to the same person (or company, or other type of organization)? In our recent posts, we have looked at how data quality techniques such as parsing and standardization help in normalizing the data values within different records so that the records can be compared.

Inferred Knowledge and Customer Intelligence through Matching and Linkage

Blog Administrator | Data Enhancement, Data Management, Data Quality, Fuzzy Matching, Record Linkage

By David Loshin

What I have found to be the most interesting byproduct of record linkage is the ability to infer explicit facts about individuals that are obscured because the data is distributed across different data sets. As an example, consider these records, each taken from a different data set:

A:
David
Loshin
301-754-6350
1163 Kersey Rd
Silver Spring
MD
20902

B:
Knowledge Integrity, Inc
1163 Kersey Rd
Silver Spring
MD
20902

C:
H David
Lotion
1163 Kersey Rd
Silver Spring
MD
20902

D:
Knowledge Integrity, Inc.
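
To make the idea concrete, here is a minimal sketch in Python of how records like A through D might be linked. It is not the method used in the post: the field names (first, last, phone, org, street, city, state, zip), the helper functions, and the rule "link two records when they share a normalized address or a normalized organization name" are all assumptions chosen for illustration. Record D carries only the organization name, since that is all the excerpt shows.

```python
import re
from collections import defaultdict

# The four example records, keyed by their labels. Field names are hypothetical.
RECORDS = {
    "A": {"first": "David", "last": "Loshin", "phone": "301-754-6350",
          "street": "1163 Kersey Rd", "city": "Silver Spring",
          "state": "MD", "zip": "20902"},
    "B": {"org": "Knowledge Integrity, Inc",
          "street": "1163 Kersey Rd", "city": "Silver Spring",
          "state": "MD", "zip": "20902"},
    "C": {"first": "H David", "last": "Lotion",
          "street": "1163 Kersey Rd", "city": "Silver Spring",
          "state": "MD", "zip": "20902"},
    "D": {"org": "Knowledge Integrity, Inc."},
}


def normalize(value):
    """Lowercase, drop punctuation, and collapse whitespace."""
    return " ".join(re.sub(r"[^a-z0-9 ]", "", value.lower()).split())


def linkage_keys(rec):
    """Return the normalized keys a record can be linked on."""
    keys = []
    address = [rec.get(f) for f in ("street", "city", "state", "zip")]
    if all(address):
        keys.append(("address", normalize(" ".join(address))))
    if rec.get("org"):
        keys.append(("org", normalize(rec["org"])))
    return keys


def link_records(records):
    """Cluster record ids that share any linkage key, directly or transitively."""
    parent = {rid: rid for rid in records}

    def find(x):
        # Follow parent pointers to the cluster root, compressing the path.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    first_seen = {}
    for rid, rec in records.items():
        for key in linkage_keys(rec):
            if key in first_seen:
                parent[find(rid)] = find(first_seen[key])
            else:
                first_seen[key] = rid

    clusters = defaultdict(list)
    for rid in records:
        clusters[find(rid)].append(rid)
    return sorted(sorted(ids) for ids in clusters.values())


if __name__ == "__main__":
    print(link_records(RECORDS))
```

Under these assumptions, A, B, and C link through the shared address, and D joins the same cluster through the organization name it shares with B once the trailing period is normalized away. The names "Loshin" and "Lotion" are never compared directly here; a fuzzy name comparison would be a separate step.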


Record Linkage and Data Enhancement

Blog Administrator | Data Enhancement, Data Enrichment, Data Management, Data Quality, Duplicate Elimination, Record Linkage

By David Loshin

In my last two posts we looked at the distribution of information about entities and the use of record linkage to find corresponding data records in different data sets that can be linked together. Record linkage can be used for a number of processes that we bundle under the concept of “data enhancement,” which we’ll use to describe any methods for improving the value and usefulness of information.…