Data Enhancement for Analytical Purposes

Blog Administrator | Data Enhancement, Data Quality

By David Loshin

Last time we looked at two example operational uses of data enhancement. But the
value is not limited to insertion in specific operational workflows, because
enhancement is often done to provide additional detail for reporting and
analysis purposes.

In these cases, enhancement goes beyond data standardization and correction:
the enhancement process can add more information by linking one data set to
another. The appended data can augment an analytical process to include extra
information in generated reports and interactive visualizations.

As an example, recall that in a previous post, I talked about collecting ZIP
code values at a point of sale. A retail company can take sales data that
includes this geographic data element and then enhance the data with
demographic profiles provided by the US Census Bureau to look for
correlation between purchasing patterns and documented demographics about the
specific locations (including sex, age, race, Hispanic or Latino origin,
household relationship, household type, group quarters population, housing
occupancy, and housing tenure).
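To make the ZIP code example concrete, here is a minimal sketch of appending
demographic attributes to sales records. The field names, the ZIP codes, and
the demographic values are all made up for illustration; a real profile would
come from a published Census data product, and the lookup here is only a
stand-in for whatever linkage your tools provide.

```python
from collections import defaultdict

# Hypothetical point-of-sale records; each carries the ZIP code collected at checkout.
sales = [
    {"zip": "20901", "amount": 40.0},
    {"zip": "20901", "amount": 60.0},
    {"zip": "10001", "amount": 25.0},
    {"zip": "99999", "amount": 10.0},  # no demographic profile available
]

# Hypothetical demographic reference data keyed by ZIP code (values are invented).
demographics = {
    "20901": {"median_age": 38.5, "owner_occupied_pct": 55.0},
    "10001": {"median_age": 35.0, "owner_occupied_pct": 20.0},
}


def enhance_sales(sales, demographics):
    """Append demographic attributes to each sale by matching on ZIP code."""
    enhanced = []
    for sale in sales:
        profile = demographics.get(sale["zip"], {})
        # Keep unmatched sales, but flag them so analysts can see the coverage gap.
        enhanced.append({**sale, **profile, "matched": sale["zip"] in demographics})
    return enhanced


def total_sales_by_zip(enhanced):
    """Sum sale amounts per ZIP code, ready to compare against the profiles."""
    totals = defaultdict(float)
    for row in enhanced:
        totals[row["zip"]] += row["amount"]
    return dict(totals)


enhanced = enhance_sales(sales, demographics)
```

Keeping the unmatched records (rather than dropping them) is a deliberate
choice in this sketch: the share of sales that cannot be enhanced is itself a
useful data quality measure.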

Geographic data enhancement also adds value for analysis. Given a pair of
addresses, an enhancement process can evaluate different types of distances
(direct distance and driving distance are two examples) between those two
points. This can be useful in a number of analytical applications, such as site
location planning, which compares properties based on a variety of criteria
(possibly including the median driving distance for local customers for a bank
branch, or average driving time for delivering pizza to frequent customers).
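As a rough illustration of the direct-distance case, the sketch below uses
the haversine formula to compute great-circle distance and then takes the
median distance from a candidate site to a set of customers. It assumes the
addresses have already been geocoded to latitude and longitude, and the
function names are my own. Driving distance and driving time are not covered,
since they need a road network or a routing service.

```python
import math
from statistics import median

EARTH_RADIUS_KM = 6371.0  # mean Earth radius; the haversine formula assumes a sphere


def direct_distance_km(lat1, lon1, lat2, lon2):
    """Great-circle ("as the crow flies") distance between two points, in km."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def median_customer_distance_km(site, customers):
    """Median direct distance from a candidate site to each customer location.

    `site` and each customer are (latitude, longitude) tuples in degrees.
    """
    return median(direct_distance_km(*site, *customer) for customer in customers)
```

A site planner could call `median_customer_distance_km` once per candidate
property and rank the properties by the result, alongside the other criteria.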

There are many data aggregators that can supply demographic and behavioral
data to enhance your customer data sets. You can also use your own company’s
data for enhancement, such as product sales by region used to develop your own
customer segmentation data.

Standardizing names and addresses is the first step. Linking the standardized
records to reference data collections then allows matching on specific
criteria, ranging from gross-level linkage (say, at the county level) down to
specific enhancement at the individual level (such as the names of the
magazines to which a customer subscribes).
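The standardize-then-link sequence can be sketched as follows. This is a toy
illustration, not a production matching routine: the abbreviation table, the
reference data, and the field names are all hypothetical, and real name and
address standardization handles far more variation than this.

```python
import re

# Hypothetical abbreviation table; real address standards are much larger.
ABBREVIATIONS = {"street": "st", "avenue": "ave", "road": "rd", "suite": "ste"}


def standardize(text):
    """Lowercase, strip punctuation, and normalize common address words."""
    tokens = re.sub(r"[^a-z0-9 ]", " ", text.lower()).split()
    return " ".join(ABBREVIATIONS.get(token, token) for token in tokens)


def match_key(name, address):
    """Build the key used to link a record to individual-level reference data."""
    return (standardize(name), standardize(address))


# Hypothetical reference data at two levels of granularity.
individual_ref = {
    match_key("Jane Doe", "12 Main St."): {"magazines": ["Garden Monthly"]},
}
county_ref = {"Montgomery": {"segment": "suburban"}}


def enhance(customer):
    """Link at the individual level when possible, else fall back to county."""
    key = match_key(customer["name"], customer["address"])
    if key in individual_ref:
        return {**customer, **individual_ref[key], "link_level": "individual"}
    if customer.get("county") in county_ref:
        return {**customer, **county_ref[customer["county"]], "link_level": "county"}
    return {**customer, "link_level": None}
```

Recording the `link_level` alongside the appended attributes lets downstream
reports distinguish individual-level detail from coarser county-level
estimates.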

These qualitative enhancements augment the business intelligence and analytics
processes to help companies make more sales, increase revenues, and improve