Managing Unique Customer Identities with Master Entity Indexes

Blog Administrator | Address Quality, Analyzing Data, Analyzing Data Quality, Customer Identities, Data Integration, Data Management, Data Matching, Data Quality, MDM

By David Loshin

In the past few entries in this series, we have been looking at an approach to understanding customer behavior at particular contextual interactions that are informed by information pulled from customer profiles.

But if the focal point is the knowledge from the profile that influences behavior, you must be able to recognize the individual, rapidly access that individual’s profile, and then feed the data from the profile into the right analytical models that can help increase value.… Read More

Performance Scalability

Blog Administrator | Address Quality, Analyzing Data, Analyzing Data Quality, Data Integration, Data Management, Data Matching, Data Quality, Duplicate Elimination, MDM

By David Loshin

In my last post I noted that there is a growing need for continuous entity identification and identity resolution as part of the information architecture for most businesses, and that the need for these tools is only growing in proportion to the types and volumes of data that are absorbed from different sources and analyzed.

While I have discussed the methods used for parsing, standardization, and matching in past blog series, one thing I alluded to a few notes back was the need for increased performance of these methods as the data volumes grow.… Read More

Content Standards for Data Matching and Record Linkage

Blog Administrator | Address Standardization, Analyzing Data, Data Management, Data Matching, Data Quality, Record Linkage

By David Loshin

As I suggested in my last post, applying parsing and standardization to normalize data value structure will reduce complexity for exact matching. But what happens if there are errors in the values themselves?

Fortunately, the same methods of parsing and standardization can be used for the content itself. This can address the types of issues I noted in the first post of this series, in which someone entering data about me would have used a nickname such as “Dave” instead of “David.”
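As a rough illustration of content standardization, the sketch below maps nicknames to a standard given name before comparing values. The NICKNAMES table and both function names are hypothetical and chosen only for this example; a real data quality tool would rely on a much larger, curated reference list.

```python
# Hypothetical nickname table, for illustration only.
NICKNAMES = {
    "dave": "david",
    "bob": "robert",
    "liz": "elizabeth",
}


def standardize_given_name(name: str) -> str:
    """Return the standard form of a given name, resolving known nicknames."""
    key = name.strip().lower()
    # Fall back to the cleaned-up input when no nickname rule applies.
    return NICKNAMES.get(key, key).capitalize()


def same_given_name(a: str, b: str) -> bool:
    """Exact match on the standardized forms of two given names."""
    return standardize_given_name(a) == standardize_given_name(b)
```

With this table, "Dave" and "David" standardize to the same value, so an exact comparison of the standardized forms treats them as a match.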

Read More

Normalizing Structure Using Data Standardization for Improved Matching

Blog Administrator | Address Quality, Address Standardization, Analyzing Data, Data Matching, Data Quality, Record Linkage

By David Loshin

In my last few posts, I discussed how structural differences impact the ability to search and match records across different data sets. Fortunately, most data quality tool suites use integrated parsing and standardization algorithms to map structures together.

As long as there is some standard representation, we should be able to come up with a set of rules that can help to rearrange the words in a data value to match that standard.
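One such rule can be sketched in a few lines. The example below assumes, purely for illustration, that the standard representation of a person name is "First Last" and that some sources supply "Last, First" instead. The function name standardize_name is hypothetical, and the rule is far simpler than what a commercial parsing engine would apply.

```python
def standardize_name(value: str) -> str:
    """Rearrange a person name into an assumed 'First Last' standard form."""
    # Collapse runs of whitespace so spacing differences do not block a match.
    value = " ".join(value.split())
    if "," in value:
        # Rule: 'Last, First' becomes 'First Last'.
        last, _, first = value.partition(",")
        value = f"{first.strip()} {last.strip()}"
    return value
```

After this step, "Loshin, David" and "David   Loshin" both become "David Loshin", so the two values can be compared with an exact match.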

Read More

Structural Differences and Data Matching

Blog Administrator | Address Quality, Address Standardization, Data Cleansing, Data Enhancement, Data Enrichment, Data Governance, Data Integration, Data Management, Data Matching, Data Quality, Duplicate Elimination, Fuzzy Matching

By David Loshin

Data matching is easy when the values are exact, but there are different types of variation that complicate matters. Let’s start at the foundation: structural differences in the ways that two data sets represent the same concepts. For example, early application systems used data files that were relatively “wide,” capturing a lot of information in each record, but with a lot of duplication.
Read More