Techniques for Address Standardization

By David Loshin

In my recent set of posts, we looked at the value and importance of address standardization as an integral component of both transactional and analytical applications, especially when seeking levels of accuracy associated with the concept of location, which in some cases goes beyond the concept of “address.”

But knowing that we can, with some degree of precision, map a location to its nearest geocoded point, let’s think about a more general challenge: resolving a provided descriptive address to an actual known address.

Let me clarify this a little. When I talk about a “provided descriptive address” I am referring to what an individual has presented as an address. And while another individual might be able to infer enough meaning from a presented address to make a delivery, the address may contain misspellings, errors, or other variations that prevent it from being adequately mapped to a specific geocoded location.

Aside from the other benefits we have already considered, transforming the address into a standard form will simplify the geocoding process. That transformation process leverages a few straightforward ideas, namely:

  1. There is a representative model for “standardized” addresses with its accompanying formats, syntax, acceptable value lists, and rules.
  2. An application is able to scan a non-standardized (or what I called a “provided descriptive”) address and differentiate between the parts that meet the standard and the ones that do not.
  3. There is a way to map the non-standard parts into standard ones.
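
To make these three ideas a little more concrete, here is a minimal Python sketch. It is only an illustration under simplifying assumptions: the lookup tables, the function name standardize_street_line, and the rule that the street suffix is the last token are all hypothetical, and a real standard (such as one published by a postal authority) covers far more address parts and values.

```python
# Idea 1: a (tiny, hypothetical) model of what counts as "standard".
STANDARD_SUFFIXES = {"ST", "AVE", "RD", "BLVD"}

# Idea 3: a (hypothetical) mapping from non-standard variants to standard values.
SUFFIX_VARIANTS = {
    "STREET": "ST",
    "STR": "ST",
    "AVENUE": "AVE",
    "AV": "AVE",
    "ROAD": "RD",
    "BOULEVARD": "BLVD",
}


def standardize_street_line(line):
    """Return (standardized_line, unresolved_tokens) for one street line.

    Assumption for this sketch: the street suffix is the last token.
    """
    # Normalize case and drop punctuation that commonly trails abbreviations.
    tokens = line.upper().replace(".", "").replace(",", "").split()
    if not tokens:
        return "", []

    unresolved = []
    suffix = tokens[-1]

    # Idea 2: decide whether the suffix already meets the standard.
    if suffix in STANDARD_SUFFIXES:
        pass  # already standard, keep as-is
    elif suffix in SUFFIX_VARIANTS:
        tokens[-1] = SUFFIX_VARIANTS[suffix]  # map variant to standard form
    else:
        unresolved.append(suffix)  # flag for review instead of guessing

    return " ".join(tokens), unresolved
```

With these tables, standardize_street_line("123 Main Street") returns ("123 MAIN ST", []), while standardize_street_line("9 Oak Lane") returns ("9 OAK LANE", ["LANE"]), since “LANE” is not in either table. Flagging the unrecognized part rather than guessing keeps the sketch honest about what it could not resolve.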

In fact, all three of these ideas are doable, and over the next set of postings we will look at each one in greater detail.