Improving data quality in an organization can be challenging. In this interview, digital analytics expert Wendy Zhang shares her winning tactics.
Two familiar data quality challenges
D.D.: Wendy, you have spent many years leading and driving analytics initiatives at scale. From your experience, what common data quality challenges have you identified?
W.Z.: I think there are two important data quality challenges that people experience, and they are definitely related:
1. People don't realize that they need better data quality
What I hear a lot in the industry is buzzwords being thrown around, and some people are tempted to think that adopting a new piece of technology or AI/ML will solve all their problems, without realizing that there are a lot of data quality steps you have to take beforehand to get yourself set up. This is one of the main reasons why many digital transformation initiatives don’t work.
Everybody knows the phrase garbage in, garbage out. I would actually argue that it’s even worse than that: garbage can lead to a disaster. It’s similar to building a house: if the foundation is wobbly and unstable, it's not going to last. Another analogy I use: everybody wants the penthouse, but no one wants to build the groundwork.
2. People don't realize how difficult it is to clean up data
Getting the data foundation right can take a lot of time. It is a detailed and oftentimes tedious or unglamorous task.
More importantly, it’s not a one-off project, it’s a process. It doesn't have a certain end date - as your business grows, you have more data, and your company is forever evolving. As we collect more and more data, it will have a trickle-down effect on how to manage it and keep your business rules up to date.
So, the question arises: how do you monitor data when you're supposed to do something new? Is it going to be a complete revamp, or is it going to be periodic monitoring that adapts to the changing world?
At some point, the changes in data quantity are going to lead to a qualitative change of sizeable substance.
D.D.: Why do we need to address data quality challenges now, not later? What can be the price of waiting?
W.Z.: Well, it's not a situation that will go away or improve on its own. And no technology is going to solve that problem alone for you. We have to do it now because the more data gets piled on, the greater the challenge becomes. If you’re not solving the root cause problem, all data quality challenges are only going to be amplified down the road, and it will be harder and harder to address them later. The problem multiplies, especially as the scale and complexity of data grow exponentially. This will impact your business processes, your people, and your bottom line. The longer you wait, the longer and harder it will be to address those issues, and the more profound the business impact will be.
The importance of data quality for digital transformation
D.D.: In the light of the pandemic, many organizations have accelerated digital transformation. However, new technologies like AI and machine learning hinge on the ability to harness data of the right quality. Why is data quality important for successfully leveraging these new technologies?
W.Z.: Data quality is definitely a very important factor for accelerating digital transformation.
During the pandemic, we all found new ways of working and had to adapt to keep the business running. At the same time, most of us witnessed an awakening moment: the old ways that worked well before are not working so well anymore.
We have experienced digitalization in the digital age first-hand. In turn, this development brought some data deficiencies and inefficient systems into the spotlight.
As a result, we are now seeing a wave of companies trying to do digital transformation. And this involves:
- a significant investment in people who have the corresponding experience
- a lot of spending upfront on the technology
But very few companies actually consider this: is our data ready?
Data quality unfortunately still comes as an afterthought. It’s not part of the digital transformation process from the onset. We're always surprised by how messy our data is and how data of questionable quality slows down processes and leads to wasted financial resources time and time again.
Imagine you're well on your way from a technology standpoint, from the system and architecture standpoint, but your data is not ready. When you’re migrating data, for example, you cannot use the same old processes and simply lift and shift data into the cloud. If you’re using the same processes, you’re going to deal with the same challenges.
D.D.: What about people? What role do they play in this digital transformation process?
W.Z.: People definitely play a key role in the digital transformation investment. You need the right people with institutional knowledge to help you actually make that digital transition. Given that there is a talent war going on, we need to start building our talent pool to really do digital transformation. We need to find people with the right skills and experience, and that can be a challenge.
In the light of the ongoing pandemic, many marketing and analytics professionals have been reassessing their priorities, careers, and lives. As a result, the eligible talent pool in the market tightens and it becomes really hard to find the right talent.
Once you find the right people, the company needs to help them grow and continue learning, so that they are engaged and motivated to drive the digital transformation process successfully.
D.D.: Indeed. So, for a digital transformation process to be successful, you need the following 3 elements: the right people, the right technology, the right data foundation.
W.Z.: Precisely. Technology is the enabler. It needs to work as intended: efficiently, effectively, and economically. People are the ones driving everything: how you design and redesign your processes, how you treat your data, and how you implement your technology. And data is the foundation holding everything together.
What layers of data quality to focus on
D.D.: Reaching 100% data quality is not a realistic goal. From your experience, what layers of data quality (e.g. accuracy, completeness, relevance, timeliness) would you recommend focusing on and why?
W.Z.: Of course, in a perfect world, they're all important. If we could achieve all the dimensions of data quality, that would be ideal.
I started my data career in the Air Force, and we were always trained to make sure that data was current, complete and accurate. In that context, those three aspects are very important.
When I moved to the financial services industry, our mantra was that data should be fit for purpose. This was a combination of having data that is current, complete and accurate.
In my mind, relevance is the most important aspect of data quality. You can have accurate and timely data, but if it’s not relevant for what you're doing or fit for purpose, then it’s not really usable. You cannot really leverage it as it doesn’t serve your goals. So, I would say that relevance and fitness for purpose are probably the most important aspects.