This is the second module of Data Discovery.
At the end of Module 2, you should be able to:
- Highlight challenges in Data Discovery
- Briefly explain the five (5) steps of Data Discovery
- Respond to all review questions on Module 2
What are the Challenges of Data Discovery?
One major obstacle to good outcomes in data analysis is a set of issues around data processing steps and general data management. Understanding the challenges of data discovery can improve overall business outcomes. Below are some data discovery challenges that many organizations face.
Data Management
Without a standardized data governance policy and consistent enforcement of its rules, data discovery processes will continue to be hampered. Data issues cover a wide range of problems related to accuracy and consistency. However, if data is managed under the right data governance framework, the long-term risk of business failure caused by misrepresenting the organization's true state can be drastically reduced.
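As a rough illustration of what enforcing governance rules can look like in practice, the sketch below checks records against a small set of field-level rules for accuracy and consistency. The field names (`customer_id`, `amount`, `currency`) and the rules themselves are hypothetical and are not taken from this module.

```python
# Hypothetical governance rules: each field name maps to a check
# that returns True when the value is acceptable.
RULES = {
    "customer_id": lambda v: isinstance(v, str) and v.strip() != "",
    "amount": lambda v: isinstance(v, (int, float)) and v >= 0,
    "currency": lambda v: v in {"USD", "EUR", "GBP"},
}


def validate(record):
    """Return the names of fields that are missing or break a rule."""
    return [
        field
        for field, check in RULES.items()
        if field not in record or not check(record[field])
    ]


# Made-up sample records: the second one breaks every rule.
records = [
    {"customer_id": "C001", "amount": 120.0, "currency": "USD"},
    {"customer_id": "", "amount": -5, "currency": "usd"},
]

for record in records:
    problems = validate(record)
    status = "ok" if not problems else "rejected: " + ", ".join(problems)
    print(record.get("customer_id") or "<blank>", status)
```

Running checks like these at the point where data is ingested is one way to keep inaccurate or inconsistent records from reaching the analysis stage.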
Data Size
Data size, also referred to as data volume, refers to the large quantities of data created and stored. Because of the size of these data sets, implementing a data discovery process can become a challenge and can skew the outcomes of analysis. However, a strong data governance policy can help avert this.
Data Inconsistencies
When data comes from multiple sources, different data analysts are very likely to arrive at different results. For example, an analyst on the sales team presents figures at a senior management meeting that differ from those of another team that used the same data set but obtained it from a different source. A single source of truth should be implemented to solve the problems caused by data held in silos.
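The scenario above can be sketched in a few lines: two teams total the "same" sales data from different sources and get different answers. The source names (`crm_rows`, `finance_rows`) and all figures are made up for illustration.

```python
def total_by_region(rows):
    """Sum amounts per region from (region, amount) pairs."""
    totals = {}
    for region, amount in rows:
        totals[region] = totals.get(region, 0) + amount
    return totals


def find_mismatches(a, b):
    """Return regions whose totals differ between two sources.

    A region missing from one source shows up with None on that side.
    """
    regions = sorted(set(a) | set(b))
    return {r: (a.get(r), b.get(r)) for r in regions if a.get(r) != b.get(r)}


# Hypothetical extracts of the same sales data from two silos.
crm_rows = [("North", 100), ("South", 250), ("North", 50)]
finance_rows = [("North", 150), ("South", 200), ("East", 75)]

mismatches = find_mismatches(
    total_by_region(crm_rows), total_by_region(finance_rows)
)
for region, (crm_total, finance_total) in mismatches.items():
    print(f"{region}: CRM={crm_total} Finance={finance_total}")
```

A reconciliation report like this only exposes the disagreement. Resolving it still requires agreeing on a single source of truth.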
Data Variety Issues
Data is never static: as businesses strive to remain competitive, they introduce new data points to achieve their goals. It is therefore important to preserve the consistency and accuracy of the growing variety of data sets, ingested or captured in different formats, against the old ones for an optimized data discovery process.
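One common way to keep old and new formats consistent is to normalize every incoming record into a single agreed shape before analysis. The sketch below assumes two hypothetical record layouts (an older one with `id` and a day/month/year `date`, a newer one with `order_id` and an ISO `order_date`). Neither layout comes from this module.

```python
from datetime import datetime

# Date formats this sketch assumes may appear in the data.
DATE_FORMATS = ("%Y-%m-%d", "%d/%m/%Y")


def parse_date(text):
    """Try each known format in turn and return a date object."""
    for fmt in DATE_FORMATS:
        try:
            return datetime.strptime(text, fmt).date()
        except ValueError:
            continue
    raise ValueError(f"unrecognized date format: {text!r}")


def normalize(record):
    """Map an old-style or new-style record onto one common shape."""
    order_id = record["order_id"] if "order_id" in record else record["id"]
    raw_date = record["order_date"] if "order_date" in record else record["date"]
    return {
        "order_id": str(order_id),
        "order_date": parse_date(raw_date).isoformat(),
    }


# Hypothetical records in the old and the new layout.
old_record = {"id": 17, "date": "03/02/2021"}
new_record = {"order_id": "A-18", "order_date": "2021-02-04"}

for record in (old_record, new_record):
    print(normalize(record))
```

Raising an error on an unknown date format, rather than guessing, keeps a newly introduced format from silently corrupting the data set.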
5 Steps of Data Discovery
Now that we have a deeper understanding of the data discovery process and the value we can unearth from it, it is important to understand the five (5) data discovery steps that help structure and leverage the full value of any data set.
The five (5) steps of data discovery are:
- ACQUIRE
- PREPARE
- VISUALIZE
- EXPLORE
- COLLABORATE
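This module names the five steps without describing what each one involves, so the sketch below is only one possible reading of them, applied in the listed order to a tiny made-up CSV. The function bodies, the sample data, and the choice of a text bar chart and a JSON summary are all assumptions made for illustration.

```python
import csv
import io
import json

# Made-up raw data with inconsistent casing and a missing amount.
RAW_CSV = """region,amount
North,100
South,250
north,50
East,
"""


def acquire(text):
    """ACQUIRE: read raw rows from a source (here, an in-memory CSV)."""
    return list(csv.DictReader(io.StringIO(text)))


def prepare(rows):
    """PREPARE: drop incomplete rows and standardize names and types."""
    cleaned = []
    for row in rows:
        if not row["amount"]:
            continue  # skip rows with a missing amount
        cleaned.append(
            {"region": row["region"].strip().title(), "amount": float(row["amount"])}
        )
    return cleaned


def summarize(rows):
    """Helper: total amount per region."""
    totals = {}
    for row in rows:
        totals[row["region"]] = totals.get(row["region"], 0.0) + row["amount"]
    return totals


def visualize(rows, width=20):
    """VISUALIZE: render totals per region as a simple text bar chart."""
    totals = summarize(rows)
    peak = max(totals.values())
    lines = []
    for region, total in sorted(totals.items()):
        bar = "#" * round(width * total / peak)
        lines.append(f"{region:<6} {bar} {total:.0f}")
    return "\n".join(lines)


def explore(rows):
    """EXPLORE: answer a question of the data (which region leads?)."""
    totals = summarize(rows)
    top_region = max(totals, key=totals.get)
    return {"top_region": top_region, "totals": totals}


def collaborate(findings):
    """COLLABORATE: serialize the findings so others can reuse them."""
    return json.dumps(findings, indent=2, sort_keys=True)


rows = prepare(acquire(RAW_CSV))
print(visualize(rows))
print(collaborate(explore(rows)))
```

In a real project each step would typically be handled by dedicated tools, but keeping the steps as separate functions makes it clear where data enters, where it is cleaned, and what is finally shared.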
The table below highlights data tools and exercises that could be used to carry out the data discovery steps.

Review Questions
Describe, in your own words, the challenges of Data Discovery.
List and explain any steps in the Data Discovery process beyond those highlighted above.
What data tools can be used for data discovery?
Hands-On Exercise:
What insights can you derive from the sample data set below? Present your answers on the presentation slides below.
