This content originally appeared on DEV Community and was authored by Sujitha Selvaraj
Data Cleaning Challenge with Pandas (Google Colab)
Data cleaning is one of the most crucial steps in any data science or analytics project. In this challenge, I worked on a real-world dataset from Kaggle with over 100,000 rows, using Pandas to clean and preprocess it for further analysis.
Dataset Details
For this challenge, I selected the E-commerce Sales Dataset from Kaggle containing around 120,000 rows and 12 columns.
It includes data such as:
Order ID
Customer Name
Product & Quantity
Sales & Discount
Region
Order Date
Before Cleaning:
Rows: 120,000
Columns: 12
File format: .csv
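A minimal sketch of how this "before cleaning" snapshot could be captured with Pandas. The column names, the helper `summarize_before_cleaning`, and the tiny inline sample are assumptions standing in for the Kaggle CSV; in Colab you would pass the uploaded file's path to `pd.read_csv` instead of the in-memory string.

```python
import io

import pandas as pd

# Hypothetical stand-in for the Kaggle CSV (the real file has about
# 120,000 rows and 12 columns; these columns and values are made up).
SAMPLE_CSV = """Order ID,Customer Name,Product,Quantity,Sales,Discount,Region,Order Date
1001,Asha,Laptop,1,55000.0,0.10,South,2023-01-05
1002,,Mouse,2,800.0,,North,2023-01-06
1002,,Mouse,2,800.0,,North,2023-01-06
1003,Ravi,Keyboard,,1500.0,0.05,East,not a date
"""


def summarize_before_cleaning(df: pd.DataFrame) -> dict:
    """Collect the basic facts worth recording before any cleaning step."""
    return {
        "rows": df.shape[0],
        "columns": df.shape[1],
        # Count of missing values in each column
        "missing_per_column": df.isna().sum().to_dict(),
        # Rows that exactly repeat an earlier row
        "duplicate_rows": int(df.duplicated().sum()),
    }


df = pd.read_csv(io.StringIO(SAMPLE_CSV))
summary = summarize_before_cleaning(df)
print(summary)
```

Recording the row count, column count, missing values, and duplicates up front gives a baseline to compare against once cleaning is done.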
Tools & Environment
Python 3
Google Colab
Libraries: Pandas, NumPy, Matplotlib
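As a quick sanity check of the environment, the sketch below reports the Python version and the installed versions of the three libraries. The helper name `installed_versions` is hypothetical, and a package that is not installed is reported as `None` rather than raising an error.

```python
import sys
from importlib import metadata


def installed_versions(packages):
    """Map each package name to its installed version, or None if absent."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


versions = installed_versions(["pandas", "numpy", "matplotlib"])
print("Python", sys.version.split()[0])
for name, version in versions.items():
    print(name, version or "not installed")
```

Noting the versions alongside the notebook makes it easier to reproduce the cleaning steps later, since Colab updates its preinstalled libraries over time.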
