Using Data Quality Tools for Classification

Blog Administrator | Address Quality, Analyzing Data, Analyzing Data Quality, Data Management, Data Profiling, Data Quality

By David Loshin

Hierarchical classification schemes are great for scanning through unstructured text to identify critical pieces of information that can be mapped to an organized analytical profile. To enable this scanning capability, you will need two pieces of technology.

The first involves a text analysis methodology for scanning text and determining which character strings and phrases are meaningful and which ones are largely noise.…

Approximate Matching

Blog Administrator | Analyzing Data, Data Management, Data Quality, Duplicate Elimination, Record Linkage

By David Loshin

Actually, my first name is not David – that is really my middle name, but it is the given name my parents used when talking to me. This has led to a lot of confusion over the years, especially when I am confronted with a form asking for my “first name” and my “last name.”

Entities and their Characteristics

Blog Administrator | Analyzing Data Quality, Data Quality, Fuzzy Matching, Record Linkage

By David Loshin

How can you tell if two records refer to the same person (or company, or other type of organization)? In our recent posts, we have looked at how data quality techniques such as parsing and standardization help in normalizing the data values within different records so that the records can be compared.
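As a rough illustration of the idea that standardization makes records comparable, here is a minimal Python sketch. It is not the method from the posts: the abbreviation table, the function names (standardize, similarity, likely_same), and the 0.85 threshold are all assumptions made for this example, and the similarity measure is the generic one from Python's standard difflib module.

```python
import re
from difflib import SequenceMatcher

# Hypothetical abbreviation table; a real standardizer would use a far larger one.
ABBREVIATIONS = {"st": "street", "ave": "avenue", "rd": "road"}


def standardize(value: str) -> str:
    """Lowercase, replace punctuation with spaces, expand known abbreviations."""
    tokens = re.sub(r"[^\w\s]", " ", value.lower()).split()
    return " ".join(ABBREVIATIONS.get(token, token) for token in tokens)


def similarity(a: str, b: str) -> float:
    """Similarity ratio in [0, 1] between two values after standardization."""
    return SequenceMatcher(None, standardize(a), standardize(b)).ratio()


def likely_same(rec_a: dict, rec_b: dict, threshold: float = 0.85) -> bool:
    """Average the similarity of shared string fields and compare to a threshold.

    The threshold is an arbitrary illustrative value, not a recommended setting.
    """
    fields = rec_a.keys() & rec_b.keys()
    if not fields:
        return False
    score = sum(similarity(rec_a[f], rec_b[f]) for f in fields) / len(fields)
    return score >= threshold


record_1 = {"name": "Jane Q. Public", "address": "123 Main St."}
record_2 = {"name": "jane q public", "address": "123 Main Street"}
print(likely_same(record_1, record_2))
```

Without the standardization step, differences in case, punctuation, and abbreviations would lower the similarity score even though both records describe the same entity.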