Clean and prepare a raw dataset for analysis, flagging data quality issues, handling nulls and outliers, and documenting every transformation made.
Data Analyst · Claude · Co-Pilot · ChatGPT · Gemini · Low · Updated Mar-26
Prompt
I have a raw dataset I need to prepare for analysis. Here are the details:
Dataset description:
Source:
Intended use:
Known issues:
Please:
1. Identify all data quality issues: nulls, duplicates, inconsistent formatting, type mismatches, and outliers
2. Recommend a handling strategy for each issue with clear reasoning
3. Suggest any derived columns or transformations that would add analytical value
4. Produce a data cleaning checklist I can follow step by step
5. Document all transformations in a format suitable for a data dictionary
If any clarification would help you give better recommendations, please ask before starting.
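Outside the prompt itself, a quick local profile of the data can help fill in the "Known issues" field before sending it. The sketch below is one possible way to do that in plain Python, assuming the rows are already loaded as a list of dictionaries (for example via `csv.DictReader`). The names `profile`, `is_null`, `to_float`, and `NULL_MARKERS` are illustrative. The 1.5×IQR outlier rule and the "mostly numeric" column heuristic are assumptions, not the only reasonable choices.

```python
from collections import Counter
from statistics import quantiles

# Hypothetical markers treated as missing; adjust to the dataset at hand.
NULL_MARKERS = {"", "na", "n/a", "null", "none", "nan"}


def is_null(value):
    """Return True for None or a string that matches a null marker."""
    return value is None or str(value).strip().lower() in NULL_MARKERS


def to_float(value):
    """Parse a value as float, returning None if it is not numeric."""
    try:
        return float(value)
    except (TypeError, ValueError):
        return None


def profile(rows):
    """Summarise nulls, exact duplicate rows, type mismatches, and IQR outliers."""
    columns = list(rows[0].keys()) if rows else []
    report = {"nulls": {}, "type_mismatches": {}, "outliers": {}}

    # Exact duplicates: count every repeat of a row beyond its first occurrence.
    counts = Counter(tuple(str(row.get(c)) for c in columns) for row in rows)
    report["duplicate_rows"] = sum(n - 1 for n in counts.values())

    for col in columns:
        present = [row.get(col) for row in rows if not is_null(row.get(col))]
        report["nulls"][col] = len(rows) - len(present)

        parsed = [to_float(v) for v in present]
        numeric = [x for x in parsed if x is not None]
        # Assumption: a column is numeric when most non-null values parse as numbers.
        if not present or len(numeric) * 2 <= len(present):
            continue
        report["type_mismatches"][col] = len(present) - len(numeric)

        # Flag values outside 1.5 * IQR of the quartiles (needs a few data points).
        if len(numeric) >= 4:
            q1, _, q3 = quantiles(numeric, n=4)
            low, high = q1 - 1.5 * (q3 - q1), q3 + 1.5 * (q3 - q1)
            report["outliers"][col] = [x for x in numeric if x < low or x > high]
    return report
```

Calling `profile(rows)` returns a dictionary with per-column null counts, a count of exact duplicate rows, per-column counts of non-numeric values in otherwise numeric columns, and the values flagged as outliers. It only surfaces candidates; deciding how to handle each one is still the job of steps 2 to 5 in the prompt.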