#1 Most Important Skill for Data Analysis (+ 8 Good Ways to Level Up)
As a data analytics teacher, one of the questions I am asked most often is this: what skills should I develop to be a great data analyst?
My students seem to expect the typical reply. If you’ve asked this question before, you probably know that the standard reply goes something like this: a great analyst will know how to program in Python or R, query databases using SQL, have an extensive math and statistics background, understand and execute data mining algorithms, and so on.
My reply is a bit different – read on for more.
The #1 Skill That a Data Analyst Should Have
While I would agree that the tools and techniques listed above are good for furnishing a data analyst’s toolbox, I would argue that the most important skill is missing from that list.
To me, the most important skill necessary for data analysis is simple. A great data analyst is a master of critical thinking. They then use this fundamental skill to solve problems with data.
You will notice that my response is free from tools, and free from techniques and tactics. The magical skill is critical thinking… but it’s not always easy to develop.
Here are eight ways to help you develop your critical thinking skills.
Understand The Problem
1. Learn how to dissect a problem.
A good place to start is to ask this question: Why do we analyze data?
In general, we analyze data to spot trends and patterns in data (called insights) that in turn help us to make better, more informed decisions.
Typically, we go into an analysis with some understanding of what we want to find: a better marketing approach to target the customers who are most likely to make a purchase, a more efficient allocation of resources to produce a product, a family shopping list that minimizes waste while making sure that everyone’s needs are met, a way to reduce the number of customers who cancel our service, and so on. These are all goals that we might bring to the data in hopes of finding some insight.
2. Understand the questions that you can ask of the data presented to you.
As with any problem that we may have, we are often faced with a goal and often a set of constraints. When it comes to data, we are often constrained by availability of data, accessibility of data, data quality, and data reliability.
These factors can impact the questions that we can ask. For example, I may want to optimize a process, but I might not have data at a level that is granular enough for me to key in on my subject.
Being creative and following your curiosities may enable you to assess the problem and reframe the way that you can provide insight based on the data available to you. Alternatively, you may be able to explore and find additional data that will help you.
Know Your Data
3. Take the time to understand your data.
In my experience, I frequently see inexperienced data practitioners who are handed some data and do not take the time to understand it. This puzzles me exceedingly!
Very rarely (if ever) are we presented with data that are clean, structured in the way that we need them to be, and complete with all of the elements that we need for a successful analysis.
A few questions that you might ask yourself in order to slow down and understand your data are as follows:
- Do I understand the contents of the data set, its attributes and sources? Do I trust the source of the data?
- Can I identify my unit of analysis in the data set? Will I need to aggregate the data in order to make meaning at the level of granularity that I need?
- Is the data complete?
- Are there data quality issues?
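A few of these questions can be turned into quick checks. The following is only a minimal sketch, assuming the data has already been loaded as a list of dictionaries; the records, field names, and helper functions are all hypothetical and exist purely for illustration.

```python
from collections import Counter, defaultdict

# Hypothetical order-line records; every field name and value is invented.
rows = [
    {"order_id": 1, "customer": "A", "amount": 20.0},
    {"order_id": 1, "customer": "A", "amount": 5.0},
    {"order_id": 2, "customer": "B", "amount": None},
    {"order_id": 3, "customer": "A", "amount": 12.5},
]

def completeness(records):
    """Share of non-missing values in each column (1.0 means fully populated)."""
    columns = records[0].keys()
    return {c: sum(r[c] is not None for r in records) / len(records) for c in columns}

def repeated_keys(records, key):
    """Key values that appear on more than one row.

    If the intended unit of analysis is one row per key, anything returned
    here means the data sits at a finer grain than expected.
    """
    counts = Counter(r[key] for r in records)
    return {k: n for k, n in counts.items() if n > 1}

def total_by(records, key, value):
    """Aggregate `value` up to one total per `key`, skipping missing values.

    Keys whose values are all missing do not appear in the result.
    """
    totals = defaultdict(float)
    for r in records:
        if r[value] is not None:
            totals[r[key]] += r[value]
    return dict(totals)

print(completeness(rows))
print(repeated_keys(rows, "order_id"))
print(total_by(rows, "customer", "amount"))
```

In this made-up example, the repeated `order_id` suggests the rows are order lines rather than orders, and customer B drops out of the totals entirely because their only amount is missing. Both are the kind of detail worth noticing before any analysis begins.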
4. Be curious about missing data.
Missing data is important, and will be the topic of a future blog post (sign up for our emails if you’re interested in getting new post alerts). For the purposes of this discussion, missing data refers to values that are absent from your data set, either randomly or systematically.
Missing data can bias your analysis or make any results that you find unreliable. There are, however, ways that you can manage missing data and/or improve your data set just by being a bit more curious about the data.
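One simple way to be curious is to check whether values are missing at about the same rate everywhere or mostly within one group. Here is a rough sketch, assuming records stored as dictionaries with `None` marking a missing value; the survey data and the `missing_rate_by_group` helper are hypothetical.

```python
from collections import defaultdict

# Hypothetical survey responses; field names and values are invented.
responses = [
    {"channel": "web", "income": 52000},
    {"channel": "web", "income": 61000},
    {"channel": "web", "income": None},
    {"channel": "web", "income": 48000},
    {"channel": "phone", "income": None},
    {"channel": "phone", "income": None},
    {"channel": "phone", "income": 39000},
    {"channel": "phone", "income": None},
]

def missing_rate_by_group(records, group, field):
    """Fraction of records in each group where `field` is missing (None)."""
    seen = defaultdict(int)
    missing = defaultdict(int)
    for r in records:
        seen[r[group]] += 1
        if r[field] is None:
            missing[r[group]] += 1
    return {g: missing[g] / seen[g] for g in seen}

rates = missing_rate_by_group(responses, "channel", "income")
print(rates)
```

A large gap between groups, as in this contrived data, hints that the missingness may be systematic rather than random. It does not prove it, but it is a reason to ask how the data were collected before dropping or filling in those rows.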
5. Be curious about the shape of your data.
Similar to missing data, the shape of your data is important. The shape of data refers to the distribution of your data. Think of common statistics like the mean, median, mode, standard deviation and variance. If your data are skewed, assumptions and rules that hold for a normally distributed variable might not apply.
There are techniques, like data transformations, that may enable you to shape your data into something that is more useful for the type of analysis that you want to do. This can help you to avoid biasing your results or drawing inappropriate conclusions.
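As a small illustration of shape, comparing the mean with the median is a quick first look at skew, and a log transform is one common transformation for right-skewed, strictly positive values. The sketch below uses deliberately contrived numbers, and the log transform is just one option chosen as an example, not a recommendation for every data set.

```python
import math
import statistics

# Hypothetical right-skewed values (for example, purchase amounts); invented data.
values = [1, 10, 100, 1000, 10000]

raw_mean = statistics.mean(values)
raw_median = statistics.median(values)

# A log transform only works for strictly positive values.
logged = [math.log10(v) for v in values]
log_mean = statistics.mean(logged)
log_median = statistics.median(logged)

# A mean far above the median suggests a long right tail.
print("raw:   ", raw_mean, raw_median)
# After the transform, the mean and median sit much closer together.
print("logged:", log_mean, log_median)
```

On the raw scale the mean is pulled far above the median by the largest value, while on the log scale the two agree. Keep in mind that any conclusion drawn from transformed data is a statement about the transformed scale, which is worth writing down alongside the result.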
Identify and Question Bias
6. Know the types of bias that exist.
As a matter of ethics, it is important to know not only your data but also how it is transformed and used. Through conducting these processes, we often introduce our own biases into our analysis – sometimes without even knowing it. For example, we decide what is measured, how it is measured, how we aggregate the data, etc.
We must be in the practice of identifying and questioning bias in our data sources, in our construction of measures, and in our understanding of how our analyses will be used.
Knowing and reflecting on the types of cognitive bias that exist is the first step in doing this.
Master Data Cleansing and Manipulation
7. Understand that data analysis is part art, part science.
Part of the fun of being in the analytics / data science space is that you get to don both hats of creativity and technical excellence. There is no one absolute answer. In fact, there are infinitely many ways that you can develop your analysis.
Know the basics of what can be done and the impact that it will have on your data. This will enable you to take your data in directions that others may not think of. It will make it possible for you to structure your thinking about a topic and create meaning that other analysts may not see when they rush to find insights.
8. Document your actions and understanding.
This is perhaps one of the most difficult things for a technical / analytical person to do, but it is critically important. In order to recall your thinking and disclaim your constraints and assumptions at any given point in your process, you will want to take detailed notes and document your understanding.
This also becomes important if you attempt to replicate your work or hand it off to someone else. When reconstructing your thought process, it can be difficult to recall details that might have caused you to alter your approach.
Concluding Thoughts
Before teaching, I spent over a decade working in industry. I can assure you that I am not alone in saying that I am less impressed with someone who can run some complex automated process, and more impressed with someone who can reason with me about the outputs that were generated.
The path to effective reasoning and delivery of insights is solid critical thinking. This is the most important skill that you can develop as an analyst.
What other ways might you develop critical thinking skills for data analysis? Let us know in the comments section.