Data & Analysis
Data Cleaning Plan
A reproducible cleaning/transform pipeline.
01 Shape your prompt (5 fields)
02 Your prompt (716 characters)
The raw prompt, unchanged.
Still needed: Dataset description, Desired output. The preview updates as you type.
Output (22 lines, 716 chars)
You are a meticulous data engineer. Produce a reproducible data-cleaning pipeline.

## Dataset
- Source: CSV / flat file
- Tooling: Python / pandas

## Known issues
None stated — profile the data first and report what you find.

## Desired output

## Requirements
- Start by profiling: row counts, types, null rates, duplicates, outliers.
- For every transformation, state the rule and the rationale; never silently drop data.
- Make it idempotent and re-runnable; parameterize file paths/inputs.
- Add validation/assertions on the final output.

## Deliverables
1. A short profiling summary.
2. The full, runnable Python / pandas pipeline with comments.
3. A list of assumptions and anything a human should review.
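As a rough illustration of the kind of pipeline the requirements above ask for, here is a minimal pandas sketch. It is not output from the tool. The column names (`id`, `name`, `amount`), the inline `SAMPLE_CSV`, and the helpers `profile`, `clean`, `validate`, and `run` are all hypothetical, chosen only to show profiling, logged transformations, idempotence, and final assertions in one place. Outlier checks are left out to keep it short.

```python
import io

import pandas as pd

# Hypothetical sample data; a real run would pass a file path instead.
SAMPLE_CSV = "id,name,amount\n1,Ann,10.5\n2, Bob ,20.0\n2, Bob ,20.0\n3,Cy,\n"


def profile(df: pd.DataFrame) -> dict:
    """Collect basic profiling facts before any transformation."""
    return {
        "rows": len(df),
        "dtypes": df.dtypes.astype(str).to_dict(),
        "null_rate": df.isna().mean().round(3).to_dict(),
        "duplicate_rows": int(df.duplicated().sum()),
    }


def clean(df: pd.DataFrame) -> tuple[pd.DataFrame, list[str]]:
    """Apply each rule on a copy and log what it did, so nothing is dropped silently."""
    log: list[str] = []
    out = df.copy()

    # Rule: trim surrounding whitespace in `name` (assumed to be a text column).
    # Rationale: " Bob " and "Bob" should compare equal.
    out["name"] = out["name"].str.strip()
    log.append("stripped whitespace in 'name'")

    # Rule: drop exact duplicate rows, keeping the first occurrence.
    # Rationale: repeated rows would double count; the number dropped is recorded.
    n_dupes = int(out.duplicated().sum())
    out = out.drop_duplicates().reset_index(drop=True)
    log.append(f"dropped {n_dupes} exact duplicate row(s)")

    return out, log


def validate(df: pd.DataFrame) -> None:
    """Assertions on the final output."""
    assert not df.duplicated().any(), "duplicate rows remain"
    assert df["id"].notna().all(), "null ids remain"
    assert df["id"].is_unique, "ids are not unique"


def run(source) -> tuple[pd.DataFrame, dict, list[str]]:
    """Parameterized entry point: `source` is a path or a file-like object."""
    raw = pd.read_csv(source)
    summary = profile(raw)
    cleaned, log = clean(raw)
    validate(cleaned)
    return cleaned, summary, log


if __name__ == "__main__":
    cleaned, summary, log = run(io.StringIO(SAMPLE_CSV))
    print(summary)
    print(log)
    print(cleaned)
```

Because `clean` works on a copy and each rule is a no-op on data that is already clean, running it a second time on its own output should return the same frame. Rows with missing `amount` are kept rather than dropped; deciding how to handle them is the sort of item that belongs in the list of assumptions for human review.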