We’re living in a data-driven world, but as businesses everywhere are learning, the impact of data-driven decisions depends on data quality. Poor data quality can cause expensive mistakes, expose the company to fines and damage the brand’s reputation.
A Harvard Business Review study found that only 3% of business leaders considered the data their department was working with to meet acceptable quality standards. The need to understand and improve data quality is evident, yet the subject is surrounded by myths and misconceptions. Let’s debunk a few.
Myth #1: Quantity is the first priority
It’s easy to assume that the more data you hold, the more informed you can be. However, collecting data indiscriminately, without heeding its quality, makes the exercise worthless and, worse, puts the business at risk of bad decisions.
A smaller but verified database is more valuable than a large, unverified database. Simply holding on to unverified data not only impacts decision-making but can also increase complexity and cause compliance issues with data privacy regulations.
Myth #2: The IT department is solely responsible for data quality
When incorrect, incomplete or duplicate data is found, the blame is, more often than not, directed instantly at the IT department. IT does play an important role in implementing tools to improve data quality, but this does not make data quality solely an IT responsibility.
Maintaining a sustainable high-quality database is a responsibility that must be shared by data owners, users and the IT department. Businesses must establish clear roles and responsibilities for everyone who interacts with or uses data to run effective data quality management systems.
Myth #3: Onboarding verification is sufficient to improve data quality
Managing a database to ensure it meets high-quality standards cannot be seen as a one-time exercise. Checking data before it enters the database is critical and keeps bad data out. However, even data that was once good can decay with time and thus lower the overall database quality. For example, email addresses change at the rate of 31% annually.
Similarly, people change home addresses and phone numbers while city administration may rename streets. Data decay can also be triggered by data migration, human error and system changes. Thus, maintaining high quality standards requires continuous monitoring and regular assessments.
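As a rough illustration of what continuous monitoring can look like, the sketch below flags records whose contact details have not been re-verified within a chosen window. The record layout, the `last_verified` field and the 365-day window are assumptions made for the example, not a prescribed standard.

```python
from datetime import date, timedelta

# Hypothetical records: each carries the date its details were last verified.
records = [
    {"email": "a@example.com", "last_verified": date(2024, 1, 10)},
    {"email": "b@example.com", "last_verified": date(2022, 6, 1)},
]

def stale_records(rows, today, max_age_days=365):
    """Return rows whose last verification is older than max_age_days."""
    cutoff = today - timedelta(days=max_age_days)
    return [row for row in rows if row["last_verified"] < cutoff]

# Records returned here are candidates for re-verification.
due_for_recheck = stale_records(records, today=date(2024, 6, 1))
```

Running a check like this on a schedule, rather than only at onboarding, is one way to catch decay before it spreads into reports.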
Myth #4: Perfection is essential
While aiming for perfection is good, in reality, 100% quality data is not required for analytics. Insisting on perfection can hold a company back. When required for analytics or decision making, data is often viewed from hypothetical standpoints.
Questions often take a ‘what if’ format, whereas data always refers to historical events. This doesn’t mean that it cannot be used to draw inferences for decision making. Businesses must set their own standards of accuracy and completeness. Any data that meets these standards may be used for decision making.
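One way to make such standards concrete is to express them as minimum thresholds and test the data against them. The sketch below does this for completeness only. The field names and threshold values are hypothetical, and each business would choose its own.

```python
def completeness(rows, field):
    """Share of rows where `field` is present and non-empty (0.0 to 1.0)."""
    if not rows:
        return 0.0
    filled = sum(1 for row in rows if row.get(field) not in (None, ""))
    return filled / len(rows)

def meets_standard(rows, thresholds):
    """Map each field to True if its completeness reaches the agreed minimum."""
    return {
        field: completeness(rows, field) >= minimum
        for field, minimum in thresholds.items()
    }

# Hypothetical customer rows and thresholds, for illustration only.
customers = [
    {"email": "ana@example.com", "phone": "555-0100"},
    {"email": "bo@example.com", "phone": ""},
    {"email": "", "phone": "555-0101"},
    {"email": "di@example.com", "phone": None},
]
result = meets_standard(customers, {"email": 0.7, "phone": 0.9})
```

Here the email field is 75% complete and passes a 70% minimum, while the phone field is 50% complete and fails a 90% minimum. The data is imperfect, yet the decision about whether it is usable becomes explicit.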
Myth #5: Improving data quality is too expensive
Many businesses cite high costs and the time factor as reasons for being unable to maintain high-quality data. Working on data quality does take time and money but this should be looked at as an investment rather than an expense.
Working with unverified data increases the risk of making bad decisions. Advertisements targeting the wrong customers, mismanaged inventory and stores opened in the wrong locations can all have a significant impact on business profits. There’s also the cost of storing bad data to consider. Worse, bad data influences other data in the database, and the magnitude of this problem can quickly snowball.
Cutting corners to save money is never recommended when it comes to data quality. That said, many practices, such as data validation and data profiling, can be incorporated within modest budgets.
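To show how little is needed to get started, here is a minimal sketch of validation and profiling using only Python's standard library. The field names are hypothetical, and the email pattern is deliberately loose: it only checks the general shape of an address and does not confirm that the address exists.

```python
import re
from collections import Counter

# Loose, illustrative pattern: something@something.something, no spaces.
EMAIL_PATTERN = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")

def validate(row):
    """Validation: list the problems found in a single record."""
    problems = []
    if not row.get("name"):
        problems.append("missing name")
    if not EMAIL_PATTERN.match(row.get("email") or ""):
        problems.append("invalid email")
    return problems

def profile(rows, field):
    """Profiling: summarize missing, distinct and duplicated values of a field."""
    values = [row.get(field) for row in rows]
    present = [v for v in values if v not in (None, "")]
    counts = Counter(present)
    return {
        "rows": len(values),
        "missing": len(values) - len(present),
        "distinct": len(counts),
        "duplicated": sorted(v for v, n in counts.items() if n > 1),
    }

# Hypothetical sample data.
rows = [
    {"name": "Ana", "email": "ana@example.com"},
    {"name": "", "email": "bob@example"},
    {"name": "Cy", "email": "ana@example.com"},
    {"name": "Di", "email": None},
]
email_profile = profile(rows, "email")
```

Validation looks at one record at a time, while profiling describes the dataset as a whole. Used together, they give an inexpensive first picture of where the problems are.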
Myth #6: All you need are the right tools
Technology can help clean data and improve the overall quality of a database. That said, these tools alone cannot guarantee success. As with a hammer, tools for data quality are only as useful as the skills of the people using them. To maintain high quality levels, businesses must create data governance teams and policies based on the organization’s requirements and the data quality dimensions most relevant to it.
User roles and responsibilities must be thoughtfully assigned to minimize the risk of human error and maximize the potential of the data verification and cleaning tools being used.
Myth #7: It’s all about addressing data
Verifying data, cleaning it up, standardizing formats and other forms of data cleaning are important steps towards improving data quality. However, limiting the effort to these steps may not be sufficient. Businesses wanting to maintain high data quality standards must move from a reactive approach to a proactive one. This means going beyond addressing data to looking into data collection and generation processes.
The root cause of data quality issues must be identified and addressed. This may imply looking at a different source for data collection, data profiling according to business logic as well as data creation and consumption, clear data ownership policies, regular data quality audits and so on.
In conclusion
Above all, one of the most dangerous myths regarding data quality is the assumption that all data contained in an organization’s database is good. To build a data-driven culture, organizations must recognize that data quality dimensions are relative and depend on context.
To meet high data quality standards, you must pay attention to aspects such as accuracy, timeliness, completeness, formats, consistency and relevancy. It is important to remember that good standards in one dimension may not be reflected in others. For example, simply because data is accurate does not mean it is formatted correctly.
Gaining a better understanding of data quality and debunking myths is critical for businesses wanting to leverage the full potential of their data assets. Data quality must always be addressed proactively as early as possible. With the right data governance policy and data quality tools, businesses can leverage trustworthy data to gain a competitive edge.