The Evolution Of Data Quality
Melissa AU Team
How do brands decide where to open a store? How does an entrepreneur decide where to advertise his new company? What makes a company decide to promote one product and not the others? Data – all of these decisions are based on data.
Today, finding data is easy, but that alone is not enough. To support sound short-term and long-term decision making, businesses must rely on good-quality data, meaning data that accurately represents reality. It must be:
These criteria developed over time as businesses grew more and more reliant on data. Let’s take a look at this evolution.
Establishing The Importance Of Data Quality
The term ‘Business Intelligence’ (BI) was established by Professor Richard Millar Devens in 1865. It was first published in his Cyclopædia of Commercial and Business Anecdotes. The term was used to describe how Sir Henry Furnese gathered data and acted on it in a bid to boost profits.
In 1958, Hans Peter Luhn published an article about the potential of using technology to collect and analyze business information and make it useful. The article noted that being the first to act on this information could offer significant advantages.
Until the late 1960s, the only people who could translate data into useful information were specialists. At the time, data was typically stored in silos and remained fragmented and disjointed. Thus, the results were often questionable. The problem was recognized by Edgar Codd in 1970. He presented a solution in the form of a “relational database model” that was soon adopted worldwide.
Early Data Management
The earliest form of a database management system can be described as the decision support system (DSS). Modern business intelligence has its roots in this system. By the 1980s, the number of BI vendors had grown considerably, and many businesses had come to recognize the value of data-driven decisions. Several tools were also developed to make data more accessible and organized. These included data warehouses, OLAP and Executive Information Systems. This in turn spurred the further development of relational databases.
Data Quality-As-A-Service (DQaaS)
At this time, data was stored on large mainframe computers that held name and address data for data delivery. These systems were designed to track customer accounts that had become invalid because the person had moved, married, divorced or died, or for any similar reason. They could also correct common errors in spellings, names and addresses.
Around 1986, governmental agencies allowed companies to reference postal data by cross-checking their data with the NCOA (National Change of Address) registry. This move dramatically improved data quality and accuracy and was initially sold as a service.
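To make the idea of cross-checking concrete, here is a minimal sketch in Python. It does not reflect the real NCOA file format or any official interface: the `Customer` class, the `CHANGE_OF_ADDRESS` table and the `apply_changes` function are hypothetical names invented for illustration, and the matching rule (lowercase, collapsed whitespace) is a simplifying assumption.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Customer:
    name: str
    address: str


def normalize(text: str) -> str:
    """Lowercase and collapse whitespace so trivially different spellings match."""
    return " ".join(text.lower().split())


# Hypothetical change-of-address entries: (name, old address) -> new address.
CHANGE_OF_ADDRESS = {
    (normalize("Jane Doe"), normalize("12 Old Road, Springfield")):
        "48 New Street, Shelbyville",
}


def apply_changes(customers, registry):
    """Return a new list in which customers found in the registry get their new address."""
    updated = []
    for customer in customers:
        key = (normalize(customer.name), normalize(customer.address))
        new_address = registry.get(key)
        updated.append(replace(customer, address=new_address) if new_address else customer)
    return updated


customers = [
    Customer("Jane  Doe", "12 old road, springfield"),   # moved; spelling differs slightly
    Customer("John Smith", "7 High Street, Springfield"),  # not in the registry
]
result = apply_changes(customers, CHANGE_OF_ADDRESS)
for customer in result:
    print(customer.name, "->", customer.address)
```

Records that match a registry entry receive the new address, while all others pass through unchanged. A production service would need far more careful name and address matching than this sketch assumes.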
The Internet And Data Availability
By the early 1990s, the internet was beginning to make its presence felt and it brought with it a flood of data. At the same time, reliance on data analysis for business decisions was also increasing. The relational databases in use could no longer keep up with the data available to them. This was compounded by the different data types that were being developed.
A solution emerged in the form of non-relational, or NoSQL, databases. These databases ran across multiple computer systems and could handle many different types of data, which made data management more flexible. The large unstructured data sets handled by NoSQL began to be referred to as big data, and the term was in established use by 2005.
By 2010, businesses had developed means to store, combine, manipulate and present information in different ways to cater to their different needs. This marked the beginning of data governance. Pioneering companies in the field set up governance organizations to find the best way to maintain data and develop collaborative processes.
They also brought about a “policy-centric approach” to data models, data quality standards and data security. They designed processes that allowed data to be stored in multiple databases as long as it adhered to a common policy standard. As a result, data came to be treated as a corporate asset.
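One way to picture a common policy standard applied across several databases is sketched below in Python. Everything here is an assumption made for illustration: the `POLICY` rules, the field names and the two sample "databases" are hypothetical and do not come from any particular governance framework.

```python
import re

# Hypothetical shared policy: each field maps to a rule every database must satisfy.
POLICY = {
    "customer_id": lambda v: isinstance(v, int) and v > 0,
    "email": lambda v: isinstance(v, str)
    and re.fullmatch(r"[^@\s]+@[^@\s]+\.[^@\s]+", v) is not None,
    "country": lambda v: isinstance(v, str) and len(v) == 2 and v.isupper(),
}


def violations(record):
    """Return the sorted names of fields that are missing or break the policy."""
    return sorted(
        field
        for field, rule in POLICY.items()
        if field not in record or not rule(record[field])
    )


# Two separate stores (stand-ins for real databases) checked against the same policy.
databases = {
    "crm": [{"customer_id": 1, "email": "a@example.com", "country": "AU"}],
    "billing": [{"customer_id": 2, "email": "not-an-email", "country": "Australia"}],
}

report = {
    name: [violations(row) for row in rows] for name, rows in databases.items()
}
print(report)
```

The point of the sketch is that the rules live in one place while the data can live anywhere, so each store is judged against the same standard.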
Data Quality In The Present Day
Data governance is still evolving. A number of associations, such as the Data Governance Institute and the Data Governance Professionals Organization, are committed to promoting data governance practices.
Because it reduces the risk of human error and offers faster processing, machine learning is becoming a popular way to implement data governance. Every company that uses data should have its own data governance team and policy. Such teams comprise business managers and data managers, as well as clients who use the business's services.
The policy should ideally cover overall data management, how data is to be used, to whom it should be available, and data security. There are countless tools available to enhance data quality, but finding the tools best suited to a particular situation can be a challenge. If you’re looking for ways to enhance your data quality, choose workflow-driven data quality tools that can be scaled easily while maintaining a system of trust.