We have two (2) modules on Data Exploration. At the end of Module 1, you should be able to:
- Understand Data Exploration
- Differentiate between Data Discovery and Data Exploration
- Explain in detail the three (3) general steps in data exploration
Exploration is synonymous with the words “investigate”, “examine”, “review” and “analyze”.
Data exploration is the second step in the four-phased approach of advanced analytics. It focuses on reviewing the data collected and doing a deep dive to unearth insights from the unknown.
This might sound confusing, as data discovery also has to do with unearthing insights.
True! However, in data discovery, the analyst collects specific data sets used to solve targeted business problems, while data exploration deals with identifying something new and unknown in the data sets already collected. This requires analysts to place no limits on the size of their data sets or on the outcomes they expect to see in them.
An example is the Sample Supermarket Store dataset on our Data Discovery page, whose owner is interested in expanding his business, increasing profits, reducing cost and staying competitive with an absolute advantage in the short, medium and long term. Implementing a data discovery analysis will need the analyst to pick specific data points to answer the store owner's questions. An example could be comparing the cost incurred on sales to the profit received for the period in view.
In the above example, data exploration will need the analyst to investigate further and drill down to more specific issues connected to the cost data, such as where cost patterns sit at higher levels. This is more like doing a root cause analysis on cost.
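The difference between the two approaches can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the records, the field names (`month`, `department`, `cost`, `profit`) and the figures are hypothetical and are not taken from the Sample Supermarket Store dataset. The first function answers a discovery-style question (cost against profit for the period), while the second drills down into cost along a chosen field, which is closer to exploration.

```python
from collections import defaultdict

# Hypothetical records: every field name and figure here is invented for illustration.
records = [
    {"month": "Jan", "department": "Groceries", "cost": 500.0, "profit": 200.0},
    {"month": "Jan", "department": "Electronics", "cost": 900.0, "profit": 150.0},
    {"month": "Feb", "department": "Groceries", "cost": 520.0, "profit": 210.0},
    {"month": "Feb", "department": "Electronics", "cost": 1400.0, "profit": 160.0},
]


def cost_vs_profit(rows):
    """Discovery-style question: total cost against total profit for the period."""
    total_cost = sum(row["cost"] for row in rows)
    total_profit = sum(row["profit"] for row in rows)
    return total_cost, total_profit


def cost_breakdown(rows, field):
    """Exploration-style drill-down: total cost per value of `field`, largest first."""
    totals = defaultdict(float)
    for row in rows:
        totals[row[field]] += row["cost"]
    return dict(sorted(totals.items(), key=lambda item: item[1], reverse=True))


if __name__ == "__main__":
    print(cost_vs_profit(records))
    print(cost_breakdown(records, "department"))
    print(cost_breakdown(records, "month"))
```

Calling `cost_breakdown` with a different field each time is one simple way to look for where cost concentrates before deciding which area deserves a closer root cause analysis.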
Data exploration is critical in the analysis cycle, especially for big data, where we have enormous data sets, both structured and unstructured, because using the wrong data could lead to misleading and non-optimal conclusions.
THE THREE (3) GENERAL STEPS IN DATA EXPLORATION
Below are the general steps used in data exploration.

1. Goal Definition: Using the Supermarket Store example, one of the owner’s goals is to reduce the cost incurred in his store. As a data analyst, you have collected the relevant data sets, done some form of data discovery and noticed that the cost incurred on expenses is on the high side. However, you intend to drill down and explore other data sets connected to the increase in cost to unravel more insights.
2. Method Selection and Analysis: To select the best method of analysis for a given data set, it is advisable for analysts to understand the type of data they are working with. Questions around the following could be very useful:
a. Are the variables quantitative or qualitative? (See more on variables.)
b. Are there relationships between the various variables? Using the supermarket store example on cost, can we identify a relationship between the marital status of the store's employees and the cost incurred in the store for that period?
c. What is the best statistical method for solving the various data problems, whether they concern relationships, rankings or time-related issues, depending on the insights uncovered from “b”?
3. Visualization and Evaluation: The last step in data exploration is visualizing the insights uncovered, with in-depth recommendations.
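Questions “a” and “b” under Method Selection and Analysis can be sketched as follows. This is a minimal illustration, not a prescribed method: the helper names `variable_type` and `group_means` and the sample values are hypothetical. It treats a variable as quantitative when all of its values are numbers, and it compares the average cost across the groups of a qualitative variable as a first, rough look at a possible relationship.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical sample: marital status of employees and a cost figure per employee.
marital_status = ["single", "married", "married", "single", "married"]
cost = [120.0, 180.0, 200.0, 100.0, 160.0]


def variable_type(values):
    """Return 'quantitative' if every value is a number, otherwise 'qualitative'."""
    numeric = all(
        isinstance(v, (int, float)) and not isinstance(v, bool) for v in values
    )
    return "quantitative" if numeric else "qualitative"


def group_means(categories, values):
    """Average of `values` within each category of a qualitative variable."""
    groups = defaultdict(list)
    for category, value in zip(categories, values):
        groups[category].append(value)
    return {category: mean(items) for category, items in groups.items()}


if __name__ == "__main__":
    print(variable_type(marital_status))
    print(variable_type(cost))
    print(group_means(marital_status, cost))
```

A gap between the group averages only suggests that a relationship may exist. Choosing a suitable statistical method, as in question “c”, is still needed before drawing conclusions.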
Review Questions:
- Discuss Data Exploration
- How does Data Exploration differ from Data Discovery?
- Give a review of the three (3) general steps in Data Exploration.
Kindly click Data Exploration: Module 2 to read more on Data Exploration.
