There are many challenges that must be faced when working with data, including:
- Data doesn’t explain itself. Someone must provide an interpretation of the data, including what it means, how to use it properly, and how to evaluate whether the data is of good quality or not.
- Data is shared and used by many, for many different purposes. So who owns it? Who makes decisions about it and is responsible when the data goes “wrong”?
- Many processes that use data depend on people upstream of the process to “get it right.” But who says what “right” is? And who determines when it goes “wrong”?
- Software development life cycles require many handoffs between requirements, analysis, design, construction, and data use. Each handoff is a place where the data can be corrupted and its quality endangered.
- Technical people tasked with data implementation are often unfamiliar with the data’s meaning or how it is used.
- Those of us in the data community have a long history of tolerating ambiguity in both data meaning and data content.
All of these factors lead to a poor understanding of the data and a perception of poor quality, with little ability to tell the two apart.