Storing International Postal Data

Melissa Team | Analyzing Data, Analyzing Data Quality, Data Quality, International Address Verification | , , , ,

By Charles Gaddy

I’m sitting in my favorite coffee shop, scanning the patrons in the room. One
thing came to mind – this restaurant reflects a good cross-section of America. I
took an impromptu poll among some of the people I met over the course of a month to ask two

1. Do you think you could make an international phone call?
2. Do you think you could address a letter to Germany?

These may seem like simple activities, but for the average American, I feel they
are quite challenging. Of my “back of the napkin” survey about seven out of 10
felt they could make the call, and two out of 10, the address. However, when
asked how to address a letter they mostly replied – “just like in the U.S., but
with a Country at the end.”

This drives home the idea that in the U.S., businesses are just now starting to
focus on international markets. I know that if I asked someone from the EU or
from Great Britain, they would be knowledgeable in telephone country codes,
shipping to mainland Europe from the UK, etc.

What I am hoping to address in the next several blog postings are the necessary
guidelines for the configuration and storage of postal address data in a
database table or tables, which is flexible around address format, structure,
postal code indicators, and alphabets/languages.

How to Account for International Address Formats in a Database Table

In the past, I have done database work with domestic and international
companies. One very specific example, which is relevant to this blog post, is a
major international airline. This airline started as a U.S. domestic carrier and
has expanded over time into Central and South America as well as the PAC RIM and
Europe. The legacy systems they had in place stored address data in a classify
U.S. format: Address1, Address2, Address3, City, State, ZIP Code.

My company was hired to do postal address verification, correction, and
standardization on their international data and put it into this legacy system.
Immediately, you can see the challenges. ZIP Code in the U.S. is a number; in
the United Kingdom, Canada, and others – it’s alphanumeric. In Mexico there can
be as many as eight address lines as well as three administrative – a village,
district, province model which does not fit with the classic U.S. city and state
model. Even the best of address verification and correction cannot account for
“putting a square peg in a round hole,” which is these legacy structures.

In the next blog post, I will suggest an optimized format for storing address
data within a single database table.