Customer Data Platform (CDP) vs Master Data Management (MDM)

Melissa AU Team | Customer Data Platform, Master Data Management

There’s no doubt about it – the key to retaining customers isn’t fighting price wars, it’s offering them the best experience possible. Customers like being wooed, and to do this the right way, businesses need to understand them. This is why customer data is so important. You need to know what is in your customer’s shopping cart to send them reminder emails, you need to understand their demographic profile to personalize their landing page, and so on.

This is where Master Data Management (MDM) and Customer Data Platform (CDP) come in. While both handle data, there are a few intrinsic differences between them. Let’s take a look at what these applications are and the difference between them.

What are Master Data Management (MDM) applications?

MDM is a data-led, technology-enabled discipline aimed at creating a single, reliable data source for all operational and analytical systems. It is typically driven by IT to ensure accuracy, uniformity and semantic consistency of the enterprise’s master data assets. To do this, it pools the data into master data files and tracks relevant data points throughout the organization.

MDMs merge data by comparing multiple data points such as names, addresses, phone numbers, etc. Many MDM products are a part of larger data handling solutions.
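
A rough illustration of that idea: the sketch below (Python, with made-up records and a naive similarity measure rather than any particular vendor’s matching logic) scores a candidate pair across several attributes and treats the pair as the same master record only when the combined score clears a threshold.

```python
from difflib import SequenceMatcher

def similarity(a: str, b: str) -> float:
    """Rough string similarity between two values, from 0.0 to 1.0."""
    return SequenceMatcher(None, a.lower().strip(), b.lower().strip()).ratio()

def match_score(rec_a: dict, rec_b: dict, fields=("name", "address", "phone")) -> float:
    """Average the similarity across several identifying attributes."""
    scores = [similarity(rec_a.get(f, ""), rec_b.get(f, "")) for f in fields]
    return sum(scores) / len(scores)

a = {"name": "Jane Doe", "address": "12 High St", "phone": "555-0101"}
b = {"name": "Jane  Doe", "address": "12 High Street", "phone": "5550101"}

# Treat the two records as one master record only if the overall score is high enough.
print(match_score(a, b) > 0.8)  # True
```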

What is a Customer Data Platform (CDP)?

As the name suggests, a CDP caters exclusively to customer data. It can be described as packaged software that connects to all customer-related systems and allows business users to manage and manipulate the collected data. A CDP empowers marketing teams to provide personalized customer experiences. It links data by comparing a single data point, such as an IP address or email account, to identify prospects.
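
Single-key stitching of this kind is simple to picture. The snippet below (Python, with hypothetical event records; not a description of any specific CDP product) groups touchpoints from different systems into one profile purely by matching on an email address.

```python
from collections import defaultdict

# Hypothetical events captured by different customer-facing systems.
events = [
    {"source": "web",   "email": "ana@example.com", "page": "/pricing"},
    {"source": "email", "email": "ana@example.com", "campaign": "spring-sale"},
    {"source": "web",   "email": "bo@example.com",  "page": "/docs"},
]

# Stitch events into per-customer profiles keyed on the single identifier.
profiles = defaultdict(list)
for event in events:
    profiles[event["email"]].append(event)

print(len(profiles["ana@example.com"]))  # 2 touchpoints linked to one prospect
```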

Master Data Management vs Customer Data Platform

Let’s look at some of the critical differences in approach between MDM and CDP.

Data governance

Both CDPs and MDMs can handle vast amounts of data. However, as all businesses know, when it comes to data, quality matters more than quantity. A CDP cannot differentiate between good and bad data. Once data enters the system, there are no rules for dealing with poor-quality records, so the value of the data held is significantly lowered.

On the other hand, MDMs are designed to create unified data views. The system collects data from multiple sources and validates it to create a reliable data source for the organization. Often, the data from MDMs is used for CDPs.
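
To make the contrast concrete, here is a minimal sketch (Python, with an invented quality rule; real MDM validation is far richer) of the kind of gate an MDM applies before a record is allowed into the unified view.

```python
import re

def is_valid(record: dict) -> bool:
    """Minimal quality gate: require a non-empty name and a plausibly formatted email."""
    has_name = bool(record.get("name", "").strip())
    has_email = bool(re.fullmatch(r"[^@\s]+@[^@\s]+\.[^@\s]+", record.get("email", "")))
    return has_name and has_email

incoming = [
    {"name": "Ana Ruiz", "email": "ana@example.com"},
    {"name": "",         "email": "not-an-email"},
]

clean = [r for r in incoming if is_valid(r)]
print(len(clean))  # 1 - the bad record is rejected before it pollutes the master view
```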

Data integration

CDPs are designed with the marketing team in mind. A CDP pulls structured and unstructured information from various marketing applications and helps the marketing team assess what a customer is doing and why, and thus what the next step should be to woo that customer. The system is not capable of further integrations.

On the other hand, MDMs are designed to create a single source of truth. They can collect and send information to various enterprise applications and are not limited to marketing. They can be used for business analytics, to drive data-based decisions, and so on.

Adding context to data

For any kind of data to be valuable, it must come with context. CDPs collect and compare data, but they have limited abilities when it comes to the hierarchical management of customer data. MDMs are much better suited to this.

MDMs create links that help businesses understand aspects such as which two customers are related, which customers are also suppliers, and so on. Understanding how customers interact with each other and with the different parts of a business provides valuable operational intelligence.
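
A toy example of such links (Python, with invented party records; real MDM hierarchies are considerably more elaborate): once roles and relationships are stored against each master record, a question like “which customers are also suppliers?” becomes a trivial lookup.

```python
# Invented master records carrying cross-domain roles and relationship links.
parties = {
    "P001": {"name": "Acme Ltd", "roles": ["customer", "supplier"]},
    "P002": {"name": "Jane Doe", "roles": ["customer"], "employer": "P001"},
}

# Which parties act as both customer and supplier?
dual_role = [p["name"] for p in parties.values()
             if {"customer", "supplier"} <= set(p["roles"])]
print(dual_role)  # ['Acme Ltd']
```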

Single vs Multi-Domain

Data is complex, and the various domains constantly interact with each other. A CDP offers a great perspective on customer data within its domain, but this is an isolated view of the customer. Users can apply varied rules to match data for different purposes, and each application can work with its own IDs to unify the data within the CDP.

If you’re looking for a multi-domain view, MDMs offer better functionality. By organizing data with cross-domain relationships in mind, an MDM allows a business to see all the different factors affecting it. For example, you can see the type of products being sold to a segment of customers and their modes of purchase to improve promotions and target customers with higher accuracy. In terms of matching data sources, MDMs follow more rigid rules.

What do You Need?

While MDMs are well established, CDPs are relatively new players in the field. You can work with both or choose one based on your end goals. If it’s just about customer data for your marketing team, a CDP will suffice. But if you need a system that helps derive context from all the data collected and gives you a holistic view of the business, an MDM is the better solution.

MDM – Secure, Fast and Hassle-Free with Unison

Author | Data Quality, MDM, Unison

Automated data quality routines, lightning-fast processing (50 million records per hour), and no programming expertise required for master data management? Unison has you covered. It unifies all of Melissa’s data cleansing technologies through a straightforward, modern and powerful user interface without sacrificing speed or security. Explore what sets Unison apart from other platforms and how it was designed with data stewards in mind. Turn to page 34 in Big Data Quarterly for a Melissa exclusive on Unison.

Performance Scalability

Blog Administrator | Address Quality, Analyzing Data, Analyzing Data Quality, Data Integration, Data Management, Data Matching, Data Quality, Duplicate Elimination, MDM

By David Loshin

In my last post I noted that there is a growing need for continuous entity identification and identity resolution as part of the information architecture for most businesses, and that the need for these tools is only growing in proportion to the types and volumes of data that are absorbed from different sources and analyzed.

While I have discussed the methods used for parsing, standardization, and matching in past blog series, one thing I alluded to a few notes back was the need for increased performance of these methods as data volumes grow.

Let’s think about this for a second. Assume we have 1,000 records, each with a set of data attributes that are selected to be compared for similarity and matching. In the worst case, if we were looking to determine duplicates in that data set, we would need to compare each record against the remaining records. That means doing 999 comparisons 1,000 times, for a total of 999,000 comparisons.

Now assume that we have 1,000,000 records. Again, in the worst case we compare each record against all the others, and that means 999,999 comparisons performed 1,000,000 times, for a total of 999,999,000,000 potential comparisons. So if we scale up the number of records by a factor of 1,000, the number of total comparisons increases by a factor of roughly 1,000,000!
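
The arithmetic is easy to verify (a quick Python sanity check of the worst case described above, nothing more):

```python
def worst_case_comparisons(n: int) -> int:
    """Every record compared against every other record."""
    return n * (n - 1)

print(worst_case_comparisons(1_000))      # 999,000
print(worst_case_comparisons(1_000_000))  # 999,999,000,000

# Scaling the record count by 1,000 scales the comparisons by roughly a million.
print(worst_case_comparisons(1_000_000) // worst_case_comparisons(1_000))  # 1,001,000
```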

Of course, our algorithms are going to be smart enough to figure out ways to reduce the computational complexity, but you get the idea – the number of comparisons grows quadratically with the number of records. And even with algorithmic optimizations, the need for computational performance remains, especially when you realize that 1,000,000 records is no longer considered a large number of records – more often we look at data sets with tens or hundreds of millions of records, if not billions.
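
One classic way to cut the work down is blocking: only compare records that share some cheap key, such as a postal code. The sketch below is a minimal illustration of that idea (Python, with invented records; real products use far more sophisticated candidate selection).

```python
from collections import defaultdict
from itertools import combinations

# Invented records; block on a cheap key so expensive comparisons stay within blocks.
records = [
    {"id": 1, "name": "Jane Doe",  "zip": "10001"},
    {"id": 2, "name": "Jane  Doe", "zip": "10001"},
    {"id": 3, "name": "Bob Roe",   "zip": "94105"},
]

blocks = defaultdict(list)
for rec in records:
    blocks[rec["zip"]].append(rec)

candidate_pairs = [pair for block in blocks.values()
                   for pair in combinations(block, 2)]
print(len(candidate_pairs))  # 1 candidate pair instead of 3 for the full cross-comparison
```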

In the best scenario, performance scales with the size of the input. New technologies enable the use of high performance platforms, through hardware appliances, software that exploits massive parallelism and data distribution, and innovative methods for data layouts and exchanges.

In my early projects on large-scale entity recognition and master data management, we designed algorithms that would operate in parallel on a network of workstations. Today, these methods have been absorbed into the operational fabric, in which software layers adapt in an elastic manner to existing computing resources.

Either way, the demand is real, and the need for performance will only grow more acute as more data with greater variety and diversity is subjected to analysis. You can’t always just throw more hardware at a problem – you need to understand its complexity and adapt the solutions accordingly. In future blog series, we will look at some of these issues and ways that new tools can be adopted to address the growing performance need.

Identity Resolution and Variation

Blog Administrator | Uncategorized

By David Loshin

In my last post, I shared some thoughts on the first of three specific challenges (specifically focusing on the use of contact information as identifying attributes) associated with entity identification and resolution as part of a master data management activity. This week, we look at the second challenge: the increased difficulty of resolving identities for matching records in large data sets as the degree of variation increases.

It should not be surprising that many entity names are subject to variation as a byproduct of human interaction with business applications.

Consider the name of the mega big-box retailer Wal-Mart, whose corporate name I have seen spelled as “Walmart,” “Walmarts,” “Wall-Mart,” “Wall-Marts,” “Wall-Mart’s,” “Wallmart,” “Wallmarts,” “Wal-Mart,” as well as “Wal*Mart” – and this is just the beginning, since when you add in variation between upper- and lower-casing of different letters as well as individual store identification numbers and named locations, you can end up with an incredible number of different representations.
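
Standardization is what tames this. As a crude illustration (Python, with a deliberately simplistic rule set and a handful of invented spellings; production name standardization is far more careful), a few normalization steps collapse several of those variants onto a single key before matching:

```python
import re

def normalize_name(raw: str) -> str:
    """Crude standardization: lowercase, drop possessives and punctuation, trim a plural 's'."""
    name = raw.lower()
    name = re.sub(r"'s\b", "", name)        # drop possessives such as "Wal-Mart's"
    name = re.sub(r"[^a-z0-9 ]", "", name)  # drop punctuation such as '-' and '*'
    name = re.sub(r"s\b", "", name)         # crude plural trim, good enough for this example
    return re.sub(r"\s+", " ", name).strip()

variants = ["Walmart", "Wal-Mart's", "Wal*Mart", "WALMARTS"]
print({normalize_name(v) for v in variants})  # {'walmart'} - four spellings, one key
```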

The variations only multiply as data keeps coming in from a variety of sources, especially in social media contexts that have no controls. It is unlikely that Twitter users will subject their tweets to any kind of name validation, and despite the fact that some people are marvelously adept at typing on a smartphone keypad, the majority of us are subject to fumbling fingers when using such a small palette for communication.

In other words, as the breadth of inputs increases, so do the opportunities for misspelling names, and the number of variations will be proportional to the massive data volumes.

This means two things. First, businesses will need to increasingly rely on existing tools and techniques for parsing through text, standardizing terms in context, entity recognition and extraction, and identity resolution as part of a general capability for absorbing data from a variety of text-based sources. Second, these techniques must perform well and scale up as the data volumes grow.

These conclusions should also not come as a surprise. But an interesting corollary is that entity identification and identity resolution are becoming necessary parts of the organizational information architecture, even in small and medium-sized businesses.