Garbage In …

Melissa Team | Data Cleansing, Data Quality

By Elliot King

We all know that dirty data is not really dirty; it is just incorrect. Data
cleansing consists of correcting mistakes in the data.

Mistakes make their way into contact data in several different ways. The data
may simply be wrong or incomplete; it may be out of date; and it may be
duplicated if small variations are entered into the contact information each
time a customer gets in touch with your organization. I know that my name shows
up in at least half a dozen variations in more than one company database.
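
To make the duplication problem concrete, here is a minimal Python sketch of how small variations of one name can be grouped together. It is only an illustration, not a description of any particular product: the function names, the honorific list, and the sample names are all hypothetical, and the approach assumes plain ASCII names. Real matching needs more care, since sorting the name tokens would also merge two different people such as "King Elliot" and "Elliot King".

```python
import re
from collections import defaultdict

# Hypothetical list of honorifics to ignore when comparing names.
HONORIFICS = {"mr", "mrs", "ms", "dr", "prof"}


def normalize_name(name: str) -> str:
    """Lowercase, drop punctuation and honorifics, and sort the remaining tokens.

    Assumes ASCII names: any character outside a-z is treated as a separator.
    """
    tokens = re.sub(r"[^a-z\s]", " ", name.lower()).split()
    return " ".join(sorted(t for t in tokens if t not in HONORIFICS))


def group_variants(names):
    """Group raw name strings that share the same normalized form."""
    groups = defaultdict(list)
    for raw in names:
        groups[normalize_name(raw)].append(raw)
    return dict(groups)


# Hypothetical sample data: four variants of one name plus one misspelling.
names = [
    "Elliot King",
    "Dr. Elliot King",
    "KING, Elliot",
    "elliot  king",
    "Elliott King",
]
groups = group_variants(names)
for key, variants in groups.items():
    print(key, "<-", variants)
```

Note that the misspelled "Elliott King" lands in its own group. Simple normalization catches differences in case, punctuation, word order, and honorifics, but not typing errors, which is one reason some mistakes are much cheaper to prevent at entry than to find afterward.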

There are several strategies for ensuring that a high percentage of your
customer contact data is correct (some errors will inevitably creep in), but one
of the most important steps you can take comes at the very beginning. Before
you even start collecting data, you should ask yourself two questions: how much
information do you actually need to capture about each customer, and what field
or fields define a unique record?

Do you really need to capture somebody’s fax number? Do you need honorifics
like Mr. or Dr.? (Honorifics were on a form I recently had to complete
to buy an airline ticket. In fact, they were a required field.) Are there other
pieces of data that can be eliminated from your contact record?

And while it is obvious that the name field should not be used to determine a
unique record, what should be? With Web-based forms, for example, many people
enter incorrect email addresses to avoid getting spam, so an email address alone
is not a reliable choice either.

The fact is that the more information a contact form requires, the more
mistakes it will contain. It is much more efficient to collect data
correctly at the beginning of the process than to locate and fix incorrect data