Note: rename the downloaded CSV file first so it is easier to reference in the later steps.
In this project, I performed an end-to-end data analytics workflow using Python and SQL. I downloaded a dataset using the Kaggle API, processed and cleaned the data using Python (Pandas), and loaded the cleaned data into SQL Server for further analysis. I then designed and executed SQL queries to extract meaningful insights. This project showcases my ability to manage data pipelines and apply analytical techniques to answer business-related questions.
Step-by-Step Explanation:
1. Download Dataset using Kaggle API:
• The dataset, named "Retail Orders," was downloaded using the Kaggle API.
• Authentication was set up beforehand with the Kaggle API token (kaggle.json).
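A minimal sketch of this step, under stated assumptions: the Kaggle CLI is installed, the token is saved as ~/.kaggle/kaggle.json, and the dataset slug passed in is a placeholder because the exact slug is not given here. The helper names (download_dataset, extract_csv) and the target file name orders.csv are hypothetical. The second helper also handles renaming the CSV to a short name.

```python
import subprocess
import tempfile
import zipfile
from pathlib import Path


def download_dataset(slug: str, dest: Path) -> None:
    """Download a dataset archive with the Kaggle CLI.

    `slug` has the form "<owner>/<dataset-name>" and is a placeholder here.
    Requires the API token at ~/.kaggle/kaggle.json.
    """
    subprocess.run(
        ["kaggle", "datasets", "download", "-d", slug, "-p", str(dest)],
        check=True,
    )


def extract_csv(zip_path: Path, dest: Path, new_name: str = "orders.csv") -> Path:
    """Unzip the archive and rename its first CSV to a short, stable name."""
    with zipfile.ZipFile(zip_path) as zf:
        csv_members = [m for m in zf.namelist() if m.lower().endswith(".csv")]
        if not csv_members:
            raise FileNotFoundError(f"no CSV file found in {zip_path}")
        extracted = Path(zf.extract(csv_members[0], dest))
    target = dest / new_name
    extracted.replace(target)  # rename so later steps can use a simple path
    return target


# Offline demonstration with a throwaway archive (no Kaggle call is made).
work_dir = Path(tempfile.mkdtemp())
demo_zip = work_dir / "demo.zip"
with zipfile.ZipFile(demo_zip, "w") as zf:
    zf.writestr("Retail Orders Raw.csv", "order_id,region\n1,East\n")
csv_path = extract_csv(demo_zip, work_dir)
```

In a real run, download_dataset would be called first and extract_csv would be pointed at the archive it produced.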
2. Data Cleaning and Processing in Python (Pandas):
• Loaded the dataset into a Jupyter notebook using Pandas.
• Performed data cleaning: handling missing values, renaming columns, and correcting data types.
• Created new columns and performed necessary transformations for better analysis.
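The cleaning operations listed above could look roughly like the sketch below. The column names (Order Date, Ship Mode, Cost Price, List Price, Discount Percent), the derived columns, and the clean_orders helper are all assumptions made for illustration; the actual dataset schema and transformations may differ.

```python
import pandas as pd


def clean_orders(df: pd.DataFrame) -> pd.DataFrame:
    """Apply illustrative cleaning steps to a raw orders table."""
    out = df.copy()
    # Rename columns: "Order Date" -> "order_date"
    out.columns = (
        out.columns.str.strip().str.lower().str.replace(" ", "_", regex=False)
    )
    # Correct data types: parse the date strings into datetimes
    out["order_date"] = pd.to_datetime(out["order_date"], format="%Y-%m-%d")
    # Handle missing values with an explicit placeholder
    out["ship_mode"] = out["ship_mode"].fillna("Unknown")
    # Create new columns (hypothetical derived metrics)
    out["sale_price"] = out["list_price"] * (1 - out["discount_percent"] / 100)
    out["profit"] = out["sale_price"] - out["cost_price"]
    return out


# Tiny made-up sample, used only to show the function running.
raw = pd.DataFrame(
    {
        "Order Id": [1, 2],
        "Order Date": ["2023-03-01", "2023-08-15"],
        "Ship Mode": ["Second Class", None],
        "Cost Price": [240.0, 600.0],
        "List Price": [260.0, 730.0],
        "Discount Percent": [2.0, 3.0],
    }
)
cleaned = clean_orders(raw)
```

Working on a copy keeps the raw DataFrame intact, which makes it easier to rerun individual cleaning steps in a notebook.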
3. Load Data into SQL Server:
• Connected to SQL Server and loaded the cleaned dataset into SQL tables for further analysis.
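One common way to do this from Pandas is DataFrame.to_sql. The sketch below uses an in-memory SQLite database as a stand-in so that it runs anywhere; for SQL Server, the connection would typically be a SQLAlchemy engine built from an mssql+pyodbc connection URL instead. The table name orders, the sample rows, and the load_table helper are assumptions, not the project's actual code.

```python
import sqlite3

import pandas as pd


def load_table(df: pd.DataFrame, conn, table_name: str = "orders") -> int:
    """Write the DataFrame to a SQL table and return the stored row count."""
    # if_exists="replace" drops and recreates the table on every run;
    # "append" would be the choice if the table was created beforehand.
    df.to_sql(table_name, conn, if_exists="replace", index=False)
    row = conn.execute(f'SELECT COUNT(*) FROM "{table_name}"').fetchone()
    return row[0]


# Made-up cleaned data standing in for the real dataset.
cleaned = pd.DataFrame(
    {"order_id": [1, 2], "region": ["East", "West"], "sale_price": [254.8, 708.1]}
)

conn = sqlite3.connect(":memory:")  # stand-in for the SQL Server connection
loaded_rows = load_table(cleaned, conn)
```

Checking the row count right after loading is a cheap way to confirm that nothing was dropped on the way into the database.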
4. SQL Analysis:
• Designed and executed five to six SQL queries to answer specific business questions.
• These queries focused on key retail metrics such as sales performance, order trends, and customer behavior.
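Two illustrative queries of this kind are sketched below: top products by revenue (sales performance) and revenue per month (order trends). They are examples only, not the project's actual queries. The schema and rows are invented, and the SQL runs on SQLite so the example is self-contained; on SQL Server the syntax differs (for example SELECT TOP (2) instead of LIMIT 2, and a different date-formatting function than strftime).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE orders (
        order_id   INTEGER,
        order_date TEXT,
        product_id TEXT,
        sale_price REAL,
        quantity   INTEGER
    )
    """
)
# Invented sample rows, only to make the queries runnable.
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?, ?, ?)",
    [
        (1, "2023-01-05", "P1", 100.0, 2),
        (2, "2023-01-20", "P2", 50.0, 1),
        (3, "2023-02-11", "P1", 100.0, 1),
        (4, "2023-02-14", "P3", 400.0, 1),
    ],
)

# Sales performance: the two highest-revenue products.
TOP_PRODUCTS = """
SELECT product_id, SUM(sale_price * quantity) AS revenue
FROM orders
GROUP BY product_id
ORDER BY revenue DESC
LIMIT 2
"""

# Order trends: total revenue per calendar month.
MONTHLY_REVENUE = """
SELECT strftime('%Y-%m', order_date) AS month,
       SUM(sale_price * quantity) AS revenue
FROM orders
GROUP BY month
ORDER BY month
"""

top_products = conn.execute(TOP_PRODUCTS).fetchall()
monthly_revenue = conn.execute(MONTHLY_REVENUE).fetchall()
```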