Record Matching Made Easy with MatchUp Web Service

Blog Administrator | Data Governance, Data Integration, Data Management, Data Matching, Data Quality, Data Quality Components for SSIS, Data Steward, Data Warehouse, Duplicate Elimination, Fuzzy Matching, Golden Record, Householding, Identity Resolution, Record Linkage, SQL Server Integration Services, SSIS, Survivorship

MatchUp®, Melissa’s solution to identify and eliminate duplicate records, is now
available as a web service for batch processes, fulfilling one of the most
frequent requests from our customers: accurate database matching without
maintaining and linking to libraries, or shelling out to the necessary
locally-hosted data files.

Now you can integrate MatchUp into any part of your network that can communicate
with our secure servers using common protocols and formats such as REST, SOAP,
XML, or JSON.

Select a predefined matching strategy, map the table input columns necessary to
identify matches to the respective request elements, and submit the records for
processing. Duplicate rows can be identified by a combination of NAME, ADDRESS,
COMPANY, PHONE and/or EMAIL.
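
The column-mapping step can be sketched in a few lines of Python. This is only an illustration: the element names (`RecordID`, `FullName`, and so on), the `MatchingStrategy` and `LicenseKey` keys, and the strategy value are placeholders invented for this example, not the service's actual request schema, which is defined in the MatchUp Web Service documentation.

```python
import json

# Hypothetical mapping: local table column -> request element name.
# These element names are placeholders, not the real MatchUp schema.
COLUMN_MAP = {
    "cust_id": "RecordID",
    "full_name": "FullName",
    "street": "AddressLine1",
    "company": "CompanyName",
    "phone": "PhoneNumber",
    "email": "EmailAddress",
}


def build_request(rows, strategy, license_key):
    """Rename each row's columns to request elements and wrap them in a payload."""
    records = [
        # Missing columns become empty strings so every record has the same shape.
        {element: row.get(column, "") for column, element in COLUMN_MAP.items()}
        for row in rows
    ]
    return {
        "LicenseKey": license_key,        # placeholder key name
        "MatchingStrategy": strategy,     # placeholder key name
        "Records": records,
    }


rows = [{"cust_id": 1, "full_name": "Ann Lee", "email": "ann@example.com"}]
payload = build_request(rows, "EXAMPLE_STRATEGY", "YOUR_LICENSE_KEY")
body = json.dumps(payload)  # JSON text ready to submit to the chosen endpoint
```

Keeping the mapping in one dictionary means a change to the source table only touches `COLUMN_MAP`, not the code that submits the batch.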

Our select list of matching strategies removes the complexity of configuring
rules while still applying our fast and versatile fuzzy matching algorithms and
extensive datatype-specific knowledge base, ensuring that the tough-to-identify
duplicates are flagged by MatchUp.

The output response returned by the service can be used to update a database or
create a unique marketing list by evaluating each record’s result codes, group
identifier and group count, and using the record’s unique identifier to link
back to the original database record.
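
As a rough sketch of that post-processing, the Python below groups response records by their group identifier and keeps one record per group to form a unique list. The field names `RecordID` and `GroupID` are assumptions made for this example, and "keep the first record seen" is a stand-in survivorship rule; a real integration would also inspect the result codes described in the service documentation.

```python
from collections import defaultdict


def group_records(response_records):
    """Collect record IDs under their group identifier, in arrival order."""
    groups = defaultdict(list)
    for rec in response_records:
        # "GroupID" and "RecordID" are hypothetical field names.
        groups[rec["GroupID"]].append(rec["RecordID"])
    return dict(groups)


def unique_list(response_records):
    """Keep the first record ID in each group; the rest are treated as duplicates."""
    return [ids[0] for ids in group_records(response_records).values()]


response = [
    {"RecordID": 101, "GroupID": 1},
    {"RecordID": 102, "GroupID": 1},  # same group as 101, so a duplicate
    {"RecordID": 103, "GroupID": 2},
]
survivors = unique_list(response)  # IDs to link back to the original table
```

The surviving IDs are the keys used to link back to the source rows, for example to flag the remaining rows as duplicates in the database.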

Since Melissa’s servers do the processing, there are no key files (the temporary
sorting files) to manage, freeing up valuable hardware resources on your local
server.

Customers can access the MatchUp Web Service by obtaining a valid license from
our sales team and selecting the endpoint compatible with their development
platform and the necessary request structures here.

Groupings and Hierarchies

Blog Administrator | Analyzing Data, Analyzing Data Quality, Data Quality, Householding

By David Loshin

In our last set of posts, we looked at householding – inferring relationships for grouping individuals together based on shared characteristics.
In this series, we look at how we manage the quality of the data representing those shared characteristics. First let’s look at an example: organizing individuals based on their preferences for types of cars. There are a number of different classifications of cars, mostly focusing on car size, and these can be used for grouping individuals by reference.

And that is the problem: there are a number of different classifications of
cars, and without a defined standard, there’s bound to be confusion. Here are
three examples (I got them from a page at Wikipedia):

• The Highway Loss Data Institute (HLDI) classifies cars into five groups: Sports, Luxury, Large, Midsize, and Small.
• The National Highway Traffic Safety Administration (NHTSA) has eight classifications (based on curb weight of the car): mini passenger cars, light passenger cars, compact passenger cars, medium passenger cars, heavy passenger cars, sport utility vehicles, pickup trucks, and vans.
• The EPA has a car classification based on interior and cargo space: two-seaters, minicompacts, subcompacts, compacts, mid-size, large, small station wagons, mid-size station wagons, and large station wagons.

While one application might assign a demographic classification based on the
HLDI groupings, another application might use the NHTSA classification, but
aspects of those classifications don’t match: the set of small HLDI cars might
include the NHTSA sets of mini passenger cars, light passenger cars, and compact
passenger cars.
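
One way to picture the mismatch is as a crosswalk table between the two schemes. The Python sketch below encodes only the overlap suggested above (three NHTSA classes folding into the HLDI "Small" group); the mapping is illustrative, not an official correspondence, and every other NHTSA class is deliberately left unmapped.

```python
# Illustrative crosswalk: NHTSA class -> HLDI group.
# Only the overlap described in the text is encoded; it is an assumption,
# not an official correspondence between the two schemes.
NHTSA_TO_HLDI = {
    "mini passenger cars": "Small",
    "light passenger cars": "Small",
    "compact passenger cars": "Small",
}


def to_hldi(nhtsa_class):
    """Return the HLDI group for an NHTSA class, or None if no mapping is agreed."""
    return NHTSA_TO_HLDI.get(nhtsa_class)


def same_group(hldi_group, nhtsa_class):
    """Compare a record classified under HLDI with one classified under NHTSA."""
    mapped = to_hldi(nhtsa_class)
    return mapped is not None and mapped == hldi_group
```

Note that the mapping is many-to-one: an application holding only the HLDI value "Small" cannot recover which NHTSA class a car belonged to, which is exactly the kind of confusion a shared standard avoids.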

The absence of a standard within the enterprise for choices of classification
may seem irrelevant within siloed functions, but as more business processes are
monitored across multiple functions, variant dimensions for classification and
analysis will create confusion somewhere down the line.