By Elliot King

As Donald Rumsfeld, the former secretary of defense, once famously said, “there are known unknowns and there are unknown unknowns.” In other words, some things we know we don’t know, and consequently we can do the research to learn what we need to know. But other times, we don’t even know what we don’t know. Unknown unknowns present real risks, as Rumsfeld sadly learned.

Data quality can fall into both camps. Sometimes companies understand that the
quality of their data varies and they have to assess its quality regularly. But
in many cases, an organization is completely unaware of data problems. So how
can you mitigate the risks of developing unknown data quality issues? The best
solution is prevention.

The most visible cause of data corruption is poor data entry. If there are no
rules defining how data is entered into your information systems,
inconsistencies will inevitably fester.

For example, should name records be required to include Mr., Ms., Mrs., Miss,
Dr. and so on? Who is a Ms., who is a Mrs., and who is a Miss? Should the
records include those titles at all? Can the United States be entered in an
address record as U.S., USA, U.S.A. or the United States? Those choices need to
be defined and those definitions need to be enforced. The choices should also be
rational. If you capture more information than you really need, you become more
vulnerable to data quality issues.
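
As a rough illustration of what enforcing such definitions might look like, here is a minimal Python sketch. The alias table, the list of allowed titles, and the function names are all hypothetical choices made for this example; your own rules would depend on what your organization decides to capture.

```python
# Hypothetical entry rules: one canonical country spelling and a fixed set of titles.
COUNTRY_ALIASES = {
    "us": "United States",
    "usa": "United States",
    "united states": "United States",
    "the united states": "United States",
}
ALLOWED_TITLES = {"Mr.", "Ms.", "Mrs.", "Miss", "Dr."}


def normalize_country(raw: str) -> str:
    """Map a free-text country entry to its canonical form, or raise."""
    # Drop periods and case so "U.S.", "USA" and "U.S.A." all compare equal.
    key = raw.replace(".", "").strip().lower()
    if key not in COUNTRY_ALIASES:
        raise ValueError(f"Unrecognized country entry: {raw!r}")
    return COUNTRY_ALIASES[key]


def validate_title(raw: str) -> str:
    """Accept only titles from the agreed list, or raise."""
    title = raw.strip()
    if title not in ALLOWED_TITLES:
        raise ValueError(f"Title not allowed: {raw!r}")
    return title
```

Rejecting an unrecognized entry at the point of entry, rather than storing it and cleaning it up later, is one way to keep a known rule from turning into an unknown problem.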

Data decay is also a serious driver of data corruption. People move and they
change their names. According to some estimates, name and address data can decay
at a rate of about two percent a month, or roughly 25 percent a year. That kind of decay
occurs whether you track it or not.
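
To see how a monthly decay rate adds up, here is a small Python sketch. It assumes, purely for illustration, that every still-accurate record has the same constant chance of going stale each month; real decay is unlikely to be that uniform, and the function name is made up for this example.

```python
def stale_fraction(monthly_rate: float, months: int) -> float:
    """Fraction of records expected to be out of date after `months`,
    assuming a constant monthly decay rate applied to still-valid records."""
    return 1 - (1 - monthly_rate) ** months


# Two percent a month, compounded over a year.
print(f"{stale_fraction(0.02, 12):.1%}")  # about 21.5%
```

Under that assumption, two percent a month works out to a little over a fifth of your records in a year, the same ballpark as the annual estimate quoted above.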

Other common sources of data corruption are data migration, when information
from one system or application is moved to another; data mergers, when data is
combined from different sources into a master file; and data consolidation, when
companies attempt to eliminate redundant data.

Companies that are alert to the threats those kinds of processes pose to data
quality can put safeguards into place to mitigate the problems that might arise.
But those that are blind to the risks risk being blindsided.

