Structural Differences and Data Matching
By David Loshin
More modern systems use a relational structure that segregates unique
attributes associated with each data concept – attributes about an individual
are stored in one data table, and those records are linked to other tables
containing telephone numbers, street addresses, and other contact data.
Transaction records refer back to the individual records, which reduces the
duplication in the transaction log tables.
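As a rough illustration of that relational layout, the sketch below uses Python's built-in sqlite3 module. The table names, column names, and sample values are all assumptions made up for this example, not taken from any particular system. It shows that a transaction row carries only a key, while the name is stored once and retrieved through a join.

```python
import sqlite3

# Hypothetical schema: every table and column name here is illustrative only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE individual (
        individual_id INTEGER PRIMARY KEY,
        first_name TEXT, middle_name TEXT, last_name TEXT
    );
    -- Foreign keys are declared for documentation; SQLite enforces them
    -- only when PRAGMA foreign_keys = ON is set.
    CREATE TABLE telephone (
        individual_id INTEGER REFERENCES individual(individual_id),
        area_code TEXT, exchange_line TEXT
    );
    CREATE TABLE txn (
        txn_id INTEGER PRIMARY KEY,
        individual_id INTEGER REFERENCES individual(individual_id),
        amount REAL
    );
""")

# Made-up sample data: one individual, one phone number, two transactions.
conn.execute("INSERT INTO individual VALUES (1, 'John', 'Q', 'Smith')")
conn.execute("INSERT INTO telephone VALUES (1, '217', '555-0100')")
conn.executemany("INSERT INTO txn VALUES (?, ?, ?)",
                 [(10, 1, 25.0), (11, 1, 40.0)])

# Each transaction stores only the key; the name is looked up through a join.
rows = conn.execute("""
    SELECT t.txn_id, i.first_name, i.last_name
    FROM txn t JOIN individual i ON i.individual_id = t.individual_id
    ORDER BY t.txn_id
""").fetchall()

# The name is held in exactly one row, however many transactions refer to it.
name_rows = conn.execute("SELECT COUNT(*) FROM individual").fetchone()[0]
```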
The differences are largely in the representation. The older system might have
a single field for a name, a field for an address, and perhaps a field for a
telephone number. The newer system might break up the name field into a first
name, middle name, and last name; the address into fields for street, city,
state, and ZIP code; and the telephone number into fields for area code and
exchange/line.
These structural differences become a barrier when performing record searches
and matching. The record structures are incompatible: they have different
numbers of fields, different field names, and different precision in what is
stored.
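To make the mismatch concrete, here is a minimal sketch in Python. The two records, their field names, and their values are invented for illustration. It shows that the layouts share no field names, so a field-by-field comparison has nothing to line up, and that projecting the newer record onto the older three-field layout is one simple way to make the pair comparable.

```python
# Hypothetical records for the same person; all names and values are made up.
legacy = {
    "name": "John Q Smith",
    "address": "123 Main St, Springfield, IL 62701",
    "phone": "217-555-0100",
}
modern = {
    "first_name": "John", "middle_name": "Q", "last_name": "Smith",
    "street": "123 Main St", "city": "Springfield",
    "state": "IL", "zip": "62701",
    "area_code": "217", "exchange_line": "555-0100",
}

# The two layouts have no field names in common, so there is nothing
# to compare directly.
shared_fields = set(legacy) & set(modern)


def flatten(rec):
    """Project a structured record onto the legacy three-field layout."""
    name_parts = [rec["first_name"], rec["middle_name"], rec["last_name"]]
    return {
        # Skip empty name parts so a missing middle name leaves no double space.
        "name": " ".join(p for p in name_parts if p),
        "address": f'{rec["street"]}, {rec["city"]}, {rec["state"]} {rec["zip"]}',
        "phone": f'{rec["area_code"]}-{rec["exchange_line"]}',
    }


# Once both records use the same layout, an exact comparison is possible.
comparable = flatten(modern) == legacy
```

The exact comparison succeeds here only because the sample values are already formatted consistently. Real data would also need its values cleaned up, for example abbreviations and punctuation, before the records could be expected to match.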
This is the first opportunity to consider standardization: if structural
differences affect the ability to compare a record in one data set to records in
another data set, then applying some standards to normalize the data across the
data sets will remove that barrier. More on structural standardization in my