
“A more strategic approach is required to address today’s Data Quality concerns,” said Nigel Turner, Principal Consultant at Global Data Strategy. Speaking at the DATAVERSITY® Enterprise Data Governance Online (EDGO) Conference, Turner discussed Data Quality Management, what it comprises, and how to succeed at it by aligning it closely with two other disciplines.
What Is Data Quality Management?
Data Quality Management is defined in the DAMA-DMBoK2 as the planning, implementation, and control of activities that apply quality management techniques to data in order to ensure it is fit for consumption and meets the needs of data consumers. Turner explained that high-quality data is, by definition, fit for purpose: it can be demonstrated to meet the needs of its users.
The degree of precision required for data to be considered fit for purpose varies according to the data’s intended use. One hundred percent accuracy is not always achievable, nor is it necessary in every circumstance. When it comes to billing clients, for example, he added that 100 percent should always be the goal, because billing directly affects revenue.
For a marketing prospect database, on the other hand, 85 percent may be perfectly acceptable. The effort and expense of achieving perfect accuracy should be weighed against the expected rewards. And to demonstrate the success of Data Quality measures, baselines must be established first, so that the results of any improvements can be quantified.
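As a concrete illustration of baselining against purpose-specific thresholds, here is a minimal Python sketch. The customer records, the validation rule, and the exact checks are invented for illustration, though the 100 percent and 85 percent targets echo Turner’s examples:

```python
# Minimal sketch: measure a baseline accuracy rate, then test it against
# purpose-specific thresholds. All records and rules here are hypothetical.

def accuracy_rate(records, is_valid):
    """Fraction of records that pass a validation rule."""
    return sum(1 for r in records if is_valid(r)) / len(records)

# Ten hypothetical customer records; one is missing a postcode.
customers = [{"name": f"Customer {i}", "postcode": "CF10 1AA"} for i in range(9)]
customers.append({"name": "Customer 9", "postcode": None})

baseline = accuracy_rate(customers, lambda r: r["postcode"] is not None)

# Fitness for purpose depends on the intended use: 100% for billing,
# 85% for a marketing prospect list (Turner's example thresholds).
thresholds = {"billing": 1.00, "marketing": 0.85}
for purpose, required in thresholds.items():
    verdict = "fit" if baseline >= required else "not fit"
    print(f"{purpose}: measured {baseline:.0%} vs required {required:.0%} -> {verdict}")
```

Here the 90 percent baseline is already fit for marketing but not for billing; capturing it before any cleansing work begins is what makes later improvements quantifiable.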
Criteria for Data Quality
Turner stated that the following five critical parameters are used to assess data quality (a short code sketch after the list shows how a few of them might be checked in practice):
Accuracy: Does the data accurately reflect reality? “Do I genuinely live at that address if an organization knows my name and address?”
Completeness: Are all of the address’s components present? Is any necessary information missing?
Reliability: Is data consistent across all sources when it is duplicated? “Are all six data sources correct if my name and address are stored in six distinct data sources?”
Accessibility: Are the appropriate individuals able to obtain the appropriate data?
Timeliness: Can they obtain the data in a timely manner, rather than waiting until it is too late to be of any value or use to them?
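To make the dimensions concrete, here is a minimal sketch of how the first three might be profiled. The records, field names, and scoring rules are invented for illustration; accessibility and timeliness are operational properties, better measured at the access layer than in a profiling script:

```python
# Minimal sketch profiling three of the five dimensions on invented data:
# accuracy (match against a trusted reference), completeness (required
# fields present), and reliability (consistency across duplicated sources).

REQUIRED_FIELDS = ("name", "address")

def completeness(record):
    """Share of required fields that are present and non-empty."""
    return sum(1 for f in REQUIRED_FIELDS if record.get(f)) / len(REQUIRED_FIELDS)

def accuracy(record, reference):
    """Share of required fields that match a verified reference record."""
    return sum(1 for f in REQUIRED_FIELDS if record.get(f) == reference.get(f)) / len(REQUIRED_FIELDS)

def consistent(sources, key, field):
    """True if every source holding this key agrees on the field's value."""
    values = {s[key][field] for s in sources if key in s}
    return len(values) <= 1

# Hypothetical: one person stored in two systems, plus a verified reference.
crm       = {"42": {"name": "N. Turner", "address": "1 High St"}}
billing   = {"42": {"name": "N. Turner", "address": "1 High Street"}}
reference = {"name": "N. Turner", "address": "1 High St"}

print("completeness:", completeness(crm["42"]))                      # 1.0
print("accuracy:    ", accuracy(crm["42"], reference))               # 1.0
print("consistent:  ", consistent([crm, billing], "42", "address"))  # False
```

The mismatch between "1 High St" and "1 High Street" is exactly the kind of cross-source inconsistency Turner’s reliability question targets.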
How Low-Quality Data Affects Businesses and Individuals
Regulatory: Data quality is becoming increasingly crucial as a result of new rules and regulations governing data use and the associated penalties.
Decision-Making: “If your data is unfit for purpose, data-driven decision-making might result in some really undesirable results,” he explained. For example, if performance data is incomplete or erroneous, poor decisions can occur.
Revenue Loss and Cost Increases: Failing to bill clients on time, or for all services and goods delivered, because of poor data quality results in lost revenue. Delivering the wrong product to a customer increases costs: the item has to be returned and the correct product re-sent.
Reputational Costs: Negative stories about data failures can undermine a company’s brand and reputation and, in certain situations, cause physical harm.
Two Cautionary Tales About Data Quality Failures
During its Prime Day sale event, Amazon advertised a Canon telephoto lens, which regularly retails for roughly $13,000, at a sale price of $9,498. A decimal-point error in the listing, however, showed the price as $94.98. Although Amazon detected and fixed the error within a few hours, hundreds of customers had already purchased the lens at the $94.98 price.
Despite losing $7,738 on each lens sold, Amazon elected to honor the transactions at that price rather than face legal action from customers who had purchased the goods in good faith at the advertised price. Had Amazon not handled the situation that way, it might have damaged customer loyalty to the brand, he explained.
A man was admitted to a National Health Service hospital in the United Kingdom for a cystoscopy, a diagnostic procedure in which a camera is inserted into the bladder to look for problems. His name was confusingly similar to that of another patient who was scheduled for surgery, and because no one checked the patient’s identity, he was given a circumcision rather than a cystoscopy. Beyond leaving one understandably irate patient pursuing legal action, the incident illustrates, Turner stated, that poor data quality can cause personal as well as economic damage.
The State of Data Quality Is Worse Than Most Managers Realize
Turner explained that in a 2017 study, Harvard Business Review invited 75 top executives from diverse firms to review 100 randomly selected records from a critical data source. They discovered that only three out of every hundred records, or 3 percent of all records evaluated, passed expected Data Quality requirements. In other words, 97 percent of the records reviewed had major Data Quality issues. While a large portion of that 97 percent can be attributed to outdated data, over half of the more recent records also suffered Data Quality problems.
Four Studies Illustrate the Impact Across Industries
More than half of businesses say that at least 26% of their data is erroneous (BARC, 2019). According to a 2017 analysis in MIT Sloan Management Review, poor data quality costs businesses between 15% and 25% of revenue.
A 2016 IBM study estimated that the US economy loses $3.1 trillion annually owing to poor data quality, and Royal Mail Data Services determined in 2017 that poor data quality costs UK businesses an average of 6% of annual sales.
Why Does Low-Quality Data Persist?
Data quality is hard to manage because businesses and organizations are complex. Twenty years ago, the standard setup was a mainframe with a single database, accessed by end users via dumb terminals. Managing Data Quality was far simpler in that setting.
Since then, the rate of data growth has accelerated substantially, making it extremely difficult to keep up, Turner explained. Data collection points have multiplied, and data variety has expanded to include more structured, semi-structured, and unstructured data. Now that people carry numerous devices and have multiple contact channels, duplication and errors are significantly more likely to occur.
Because the business environment changes constantly through mergers, acquisitions, launches, and closures, a business-to-business contact list now degrades at a rate of 2% per month, rendering roughly 75% of the database obsolete within three years.
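As a rough sanity check on that decay figure (my arithmetic, not part of the talk): a simple additive reading of 2% per month comes close to the quoted 75% over three years, while a compounding model decays more slowly.

```python
# Illustrative arithmetic for a 2%-per-month contact decay rate over
# three years. These two readings of the claim are assumptions, not
# calculations presented in the talk.

MONTHS = 36
MONTHLY_DECAY = 0.02

# Additive reading: a flat 2% of the original list is lost each month.
additive = min(1.0, MONTHLY_DECAY * MONTHS)    # 0.72 -> 72% obsolete

# Compounding reading: 2% of the *remaining* records decay each month.
compound = 1 - (1 - MONTHLY_DECAY) ** MONTHS   # ~0.52 -> 52% obsolete

print(f"additive:  {additive:.0%} of contacts obsolete after {MONTHS} months")
print(f"compound:  {compound:.0%} of contacts obsolete after {MONTHS} months")
```

Either way, the order of magnitude supports Turner’s point: without active upkeep, a contact database loses most of its value within a few years.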
Additionally, a lack of understanding of the role data plays can result in a lack of accountability. Poor data quality is a business issue, not an IT issue, he explained, and if no one is formally accountable for improving the data, it will never be addressed.