This checklist will guide you through your data analysis projects by reminding you which step to take next.
Our Data Cleaning Checklist consists of 3 main sections:
Extraction, Cleaning and Profiling
We have also added a Reminders section to remind you of certain things throughout your analysis.
Click here to download the checklist as an Excel file.
Let's break the checklist down.
Before starting an analysis, it is always recommended to check the personal data guidelines to avoid any potential data privacy problems.
The next thing you should do is discuss the business needs with your stakeholders and break the problem down as much as possible before you start to code. Otherwise, you will spend hours on data profiling while trying to come up with a hypothesis.
After breaking down the problem and identifying the business needs, you can proceed to the Extraction section.
In the Extraction section, you have to find the best way to get the data you need from the right sources. Query only the columns you need, so that you avoid pulling in messy, irrelevant data.
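As a minimal sketch of querying only the columns you need, the example below uses Python's built-in sqlite3 module. The `orders` table, its columns and the date filter are hypothetical stand-ins for whatever your real source looks like.

```python
import sqlite3

# Hypothetical in-memory table standing in for a real source system.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (order_id INTEGER, customer_id INTEGER, "
    "amount REAL, internal_notes TEXT, created_at TEXT)"
)
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?, ?, ?)",
    [
        (1, 10, 25.0, "call back", "2023-01-05"),
        (2, 11, 40.5, None, "2023-01-06"),
        (3, 10, 12.0, "vip", "2023-02-01"),
    ],
)

# Name the columns explicitly instead of using SELECT *,
# and filter in the query rather than after loading everything.
query = """
    SELECT order_id, customer_id, amount
    FROM orders
    WHERE created_at >= ?
    ORDER BY order_id
"""
cursor = conn.execute(query, ("2023-01-06",))
columns = [description[0] for description in cursor.description]
rows = cursor.fetchall()
conn.close()

print(columns)
print(rows)
```

Leaving out columns such as the free-text `internal_notes` field means there is less to clean later and less personal data to worry about.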
After you get your data ready in your preferred platform, whether it is a BI tool or a language like Python, you should focus on:
- Data integrity
- Handling null and duplicate values
- Formatting columns
- Handling outliers
- Naming conventions
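
The points above can be sketched in plain Python. This is only an illustration under stated assumptions: the sample rows, the column names, the choice to drop rows with a missing amount, and the 1.5 * IQR outlier rule are all hypothetical, and your own data may call for different policies.

```python
import statistics

# Hypothetical raw extract with the usual problems.
raw_rows = [
    {"Customer ID": 1, "Order Amount": " 25.50 ", "Country": "de"},
    {"Customer ID": 1, "Order Amount": " 25.50 ", "Country": "de"},  # duplicate
    {"Customer ID": 2, "Order Amount": None, "Country": "FR"},       # null
    {"Customer ID": 3, "Order Amount": "30", "Country": " fr"},
    {"Customer ID": 4, "Order Amount": "28.00", "Country": "DE"},
    {"Customer ID": 5, "Order Amount": "27", "Country": "it"},
    {"Customer ID": 6, "Order Amount": "5000", "Country": "IT"},     # outlier?
]


def to_snake_case(name):
    """Naming conventions: 'Order Amount' -> 'order_amount'."""
    return name.strip().lower().replace(" ", "_")


def clean(rows):
    # Naming conventions: consistent snake_case column names.
    rows = [{to_snake_case(k): v for k, v in row.items()} for row in rows]

    # Duplicates: keep the first occurrence of each identical row.
    seen, unique = set(), []
    for row in rows:
        key = tuple(sorted(row.items()))
        if key not in seen:
            seen.add(key)
            unique.append(row)

    # Nulls: drop rows with no amount (imputing is another option).
    complete = [row for row in unique if row["order_amount"] is not None]

    # Formatting: numeric amounts, trimmed upper-case country codes.
    for row in complete:
        row["order_amount"] = float(row["order_amount"].strip())
        row["country"] = row["country"].strip().upper()

    # Data integrity: each customer_id should now appear exactly once.
    ids = [row["customer_id"] for row in complete]
    if len(ids) != len(set(ids)):
        raise ValueError("duplicate customer_id after cleaning")

    # Outliers: flag values outside 1.5 * IQR rather than deleting them.
    q1, _, q3 = statistics.quantiles(
        [row["order_amount"] for row in complete], n=4, method="inclusive"
    )
    low, high = q1 - 1.5 * (q3 - q1), q3 + 1.5 * (q3 - q1)
    for row in complete:
        row["is_outlier"] = not (low <= row["order_amount"] <= high)
    return complete


cleaned = clean(raw_rows)
for row in cleaned:
    print(row)
```

Flagging outliers instead of deleting them keeps the decision visible, so you can discuss unusual values with your stakeholders before removing anything.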