Tokenization and Parsing
By David Loshin
As we have discussed in previous posts, the data values stored within data elements carry specific meaning in the context of the business uses of the modeled concepts. Therefore, the first step in standardizing an address is identifying the chunks of information embedded in its values.
This means breaking out each chunk of a data value that carries meaning, and in the standardization business, each of those chunks is called a token.
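As a rough illustration of what breaking an address into tokens might look like, here is a minimal sketch in Python. The splitting rule (whitespace and commas) and the function name `tokenize_address` are assumptions for demonstration, not the author's actual method:

```python
import re

def tokenize_address(value: str) -> list[str]:
    # Split a raw address string into candidate tokens on runs of
    # whitespace and commas, keeping each meaningful chunk intact.
    # A real standardizer would apply richer rules (abbreviations,
    # unit designators, directionals), but the idea is the same.
    return [t for t in re.split(r"[,\s]+", value.strip()) if t]

print(tokenize_address("123 N. Main St., Suite 400"))
# → ['123', 'N.', 'Main', 'St.', 'Suite', '400']
```

Each resulting token ('123', 'N.', 'Main', and so on) can then be classified by its role in the address, which is where the parsing step picks up.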