Comprehensive Retail Data Analysis using SQL

Introduction

In this article we will be learning various SQL  basics through examples by doing a comprehensive EDA on a Retail Sales Dataset

The dataset that has been used for the illustrations in this article can be downloaded from this link.

Let us first have a look at the dataset that we will be analyzing.

Data_Snapshot

Data_Snapshot

Based on the above output the features in the dataset can be broadly classified into the following categories:

Dataset_Description

Dataset_Description

Before diving deeper into the demographic and transactional analysis, let us have a look at how many distinct customers, shop categories and states are in the dataset.

Distinct Customers states and shop categories

Distinct Customers states and shop categories

Demographic Analysis

We have three demographic features that are associated with every customer which are Age, Gender and State.

Let’s deep-dive into each of these features to understand the dataset better.

First let us see what is the minimum, maximum and average age of the customers in the dataset.

Age statistics

Age statistics

 

 

 

Let us find out how many males and females are in our customer base

Gender Distribution

Gender Distribution

 

 

 

 

Let us now have a look at which gender-state segment contributes the most to our dataset

State Gender Segments

State Gender Segments

 

 

 

 

 

 

As can be seen from the above output the females from State DP contribute the most to our dataset

Transactional Analysis

Average Customer Spend

One key finding that would help us understand our dataset better is that customers from which state spend more.

A way to measure this is to calculate the Average customer spend metric that can be done in the following way:

Average Spend across states

Average Spend across states

 

 

 

 

 

As observed from the above output, the customers that belong to the state DP spend the most.

Sales Across Categories

Category wise Sales

Category wise Sales

 

 

 

 

 

Purchase patterns across Gender

Let us have a look at which category do females spend the most on.

Sales by Female Customers

 

 

 

 

 

 

Let us calculate the same for male customers and analyze whether there is any difference between the purchase pattern of males and females

Sales by Male Customers

Sales by Male Customers

 

 

 

 

 

 

Going by the above results we can conclude that males and females have similar purchase patterns.

Sales Across States

Let us find out which state contributes to the maximum amount of sales.

Sales Contribution Across States

Sales Contribution Across States

 

 

 

 

 

 

Average Basket Size across Gender

Let us find out which gender purchases more number of items per transaction.

Items_per_transaction

Items_per_transaction

 

 

 

 

Therefore from the above output we can conclude that males purchase more number of items per transaction.

Results

Based on the above analysis, we can summaries the results as follows:

  • The average age of the customers in the dataset is 48 years
  • The state DP is the largest in terms of customer base as well as Sales
  • Customers from the state AP have the lowest average customer spending and customers from DP have the highest average customer spending
  • Both males and females spend the most on clothes which is followed by electronics
  • Our customer base consists of 101 females and 74 males
  • Males purchase more number of items per transaction

 

You may also like...