VSaicholik/Retail-Data-Wrangling-Profit-Analysis-R-

📊 Retail Data Wrangling & Profit Analysis using R

A retail sales data analysis project built with R and the tidyverse. It covers data cleaning, wrangling, exploratory data analysis (EDA), and profit analysis across retail order datasets from three regions.

🚀 Project Overview

This project combines and analyzes retail order datasets from:

- Central Region
- Western Region
- Eastern Region

The datasets contained inconsistencies in:

- Column names
- Date formats
- Numeric formats
- Duplicate fields
- Missing values

Using R, the project standardizes and cleans the data to create a unified dataset for analysis and reporting.
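A minimal sketch of what this standardization could look like. The column names (`Order ID`, `order_date`, `sales`, and so on) and the sample values are hypothetical stand-ins, since the actual schemas of the regional files are not shown in this README; only the library functions (`rename`, `mutate`, `bind_rows`, `parse_number`, `ymd`, `mdy`) are real.

```r
library(dplyr)
library(readr)
library(lubridate)

# Hypothetical stand-ins for two regional files (illustrative columns and values).
central_raw <- tibble(
  `Order ID`   = c("C-001", "C-002"),
  `Order Date` = c("2019-03-14", "2019-07-02"),   # year-month-day text
  Sales        = c("120.50", "89.99")             # numbers stored as text
)

west_raw <- tibble(
  order_id   = c("W-001", "W-002"),
  order_date = c("03/21/2019", "11/05/2019"),     # month/day/year text
  sales      = c("USD 1,250.00", "USD 75.25")     # currency prefix, thousands separator
)

# Bring each region onto the same column names and types.
central <- central_raw %>%
  rename(order_id = `Order ID`, order_date = `Order Date`, sales = Sales) %>%
  mutate(
    order_date = ymd(order_date),
    sales      = parse_number(sales),
    region     = "Central"
  )

west <- west_raw %>%
  mutate(
    order_date = mdy(order_date),
    sales      = parse_number(sales),
    region     = "West"
  )

# Stack the standardized regions into one unified table.
orders <- bind_rows(central, west)
```

Parsing each region's dates with the parser that matches its own format before stacking keeps ambiguous values such as `11/05/2019` from being misread after the merge.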

🛠️ Technologies & Libraries Used

🔹 Programming Language

- R

🔹 Libraries

- tidyverse
- dplyr
- tidyr
- ggplot2
- lubridate
- readr
- stringr

📂 Project Features

✅ Data Wrangling

- Cleaned inconsistent datasets
- Standardized column formats
- Converted date formats
- Removed duplicate columns
- Parsed numeric values
- Combined multiple regional datasets

✅ Exploratory Data Analysis (EDA)

- Sales Analysis
- Profit Analysis
- Shipping Time Analysis
- Regional Performance Comparison
- Product Category Insights

✅ Data Transformation

- Created calculated fields
- Generated shipping duration metrics
- Unified schema across datasets
- Built analysis-ready dataset

✅ Visualization

- Profit Trends
- Sales Comparisons
- Regional Distribution Analysis
- Product Category Charts

📁 Project Structure

    Retail-Data-Wrangling-Profit-Analysis-R/
    │
    ├── mid+project.pdf       # Project Report & R Code
    ├── Orders_Central.csv    # Central Region Dataset
    ├── Orders_West.csv       # Western Region Dataset
    ├── Orders_East.txt       # Eastern Region Dataset
    └── README.md             # Project Documentation

⚙️ Data Processing Workflow

📌 Step 1: Import Datasets

- Loaded CSV and TXT datasets
- Imported regional retail order data

📌 Step 2: Data Cleaning

- Removed unnecessary columns
- Renamed inconsistent attributes
- Converted data types

📌 Step 3: Data Wrangling

- Standardized datasets
- Merged datasets into one master dataset

📌 Step 4: Feature Engineering

Created:

- Shipping Duration
- Year-Based Metrics
- Profit Calculations

📌 Step 5: Analysis & Insights

Performed:

- Regional Analysis
- Profitability Analysis
- Sales Trend Analysis
- Customer Insights

📊 Key Analysis Areas

💰 Profit Analysis

- Identified high-profit regions
- Compared category-level profits
- Evaluated discount impact on profits

🚚 Shipping Analysis

- Calculated time-to-ship metrics
- Compared regional shipping performance

🛒 Sales Insights

- Product category performance
- Regional sales comparison
- Customer purchasing behavior

▶️ How to Run the Project

1️⃣ Install Required Libraries

    install.packages("tidyverse")
    install.packages("lubridate")

2️⃣ Open the Project in RStudio

Open mid+project.pdf, which contains the project report and R code, or run the R scripts directly.

3️⃣ Execute the Code

Run the code sections step-by-step in:

- RStudio
- Posit Cloud
- Jupyter with R Kernel

📈 Skills Demonstrated

🔹 Data Cleaning
🔹 Data Wrangling
🔹 Data Transformation
🔹 Exploratory Data Analysis
🔹 Statistical Thinking
🔹 Data Visualization
🔹 Business Insights Generation
🔹 Retail Data Analytics

🎯 Learning Outcomes

This project helped in gaining hands-on experience with:

- Real-world messy datasets
- Retail business analytics
- Data preprocessing techniques
- Data integration across multiple sources
- Analytical storytelling using R

👨‍💻 Developed By

Sai Cholik Vempati

Master's in Information Systems

🌐 GitHub Profile
💼 LinkedIn Profile

⭐ Project Highlights

✅ Multi-region retail dataset integration
✅ Advanced data wrangling using Tidyverse
✅ Business-focused profit analysis
✅ Shipping performance analytics
✅ Clean and structured analytical workflow
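As an illustration of the shipping and profit analysis listed above, here is a minimal sketch with dplyr. The `orders` table, its column names, and every value in it are hypothetical; the project's real unified dataset is not reproduced here, so the numbers carry no meaning beyond showing the calculation.

```r
library(dplyr)

# Hypothetical unified orders table (illustrative columns and values only).
orders <- tibble(
  region     = c("Central", "Central", "West", "East"),
  category   = c("Furniture", "Technology", "Technology", "Office Supplies"),
  order_date = as.Date(c("2019-01-05", "2019-02-10", "2019-02-11", "2019-03-01")),
  ship_date  = as.Date(c("2019-01-08", "2019-02-12", "2019-02-16", "2019-03-02")),
  sales      = c(200, 500, 300, 100),
  profit     = c(20, 150, -30, 25)
)

# Feature engineering: shipping duration, order year, and profit margin.
orders <- orders %>%
  mutate(
    days_to_ship  = as.numeric(difftime(ship_date, order_date, units = "days")),
    order_year    = as.integer(format(order_date, "%Y")),
    profit_margin = profit / sales
  )

# Regional comparison: totals and average time to ship, most profitable first.
region_summary <- orders %>%
  group_by(region) %>%
  summarise(
    total_sales      = sum(sales),
    total_profit     = sum(profit),
    avg_days_to_ship = mean(days_to_ship),
    .groups = "drop"
  ) %>%
  arrange(desc(total_profit))
```

Swapping `group_by(region)` for `group_by(category)` would give the category-level profit comparison from the same table.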

About

Retail sales analysis using R and tidyverse to clean, merge, and analyze multi-region datasets. Explores shipping performance, product profitability, customer segments, and yearly sales trends with data wrangling and visualization (ggplot2) to generate business insights.
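A yearly sales trend chart of the kind described above might be sketched with ggplot2 as follows. The `yearly_sales` table and its values are invented for illustration, and the chart type (grouped columns) is an assumption rather than the chart used in the project report.

```r
library(dplyr)
library(ggplot2)

# Hypothetical yearly sales totals per region (illustrative values only).
yearly_sales <- tibble(
  order_year = c(2018, 2018, 2019, 2019),
  region     = c("Central", "West", "Central", "West"),
  sales      = c(1000, 1500, 1200, 1400)
)

# Grouped column chart: one bar per region within each year.
sales_plot <- ggplot(
  yearly_sales,
  aes(x = factor(order_year), y = sales, fill = region)
) +
  geom_col(position = "dodge") +
  labs(
    title = "Yearly sales by region (illustrative data)",
    x     = "Order year",
    y     = "Sales",
    fill  = "Region"
  )

# print(sales_plot) renders the chart in RStudio or a notebook.
```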
