Selim Ellieh

Supermarket Sales Analytics

Supermarket Sales Analytics uses machine learning to predict retail sales and uncover key business insights from supermarket data.

Project Overview

This project analyzes supermarket store sales data. The goals are to uncover insights through exploratory data analysis, predict sales with several machine learning techniques, and compare how well the different models perform. The analysis is particularly useful for identifying key drivers of sales and optimizing business strategies in retail.

Key Features:

  • Exploratory data analysis (EDA): Conducted thorough EDA to understand correlations between features, identify gaps and outliers in the data, and visualize the density and distribution of the data using Kernel Density Estimation (KDE).
  • Outlier Removal: Implemented techniques to identify and remove outliers from the dataset, enhancing the accuracy and reliability of the predictive models.
  • Clustering with KMeans: Utilized KMeans clustering to segment the data, revealing distinct customer groups and patterns for targeted analysis.
  • KFold Cross-Validation: Applied KFold cross-validation to evaluate model robustness, helping to detect overfitting and to estimate how well each model generalizes to unseen data.
  • Predictive Modeling: Developed and compared multiple regression models, including RandomForestRegressor, LinearRegression, ElasticNet, KNeighborsRegressor, and XGBRegressor, to determine the most effective approach for sales prediction.
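
As a rough illustration of how the outlier removal, KFold cross-validation, and model comparison steps above can fit together, here is a minimal sketch. It is not the project's actual notebook code: the data is synthetic, the IQR rule is an assumed outlier technique (the project does not state which one it used), the helper name `iqr_mask` is hypothetical, and XGBRegressor is left out so the example depends only on NumPy and scikit-learn.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import ElasticNet, LinearRegression
from sklearn.model_selection import KFold, cross_val_score
from sklearn.neighbors import KNeighborsRegressor

# Synthetic stand-in data (assumption): the real supermarket dataset is not used here.
rng = np.random.default_rng(0)
X = rng.normal(size=(300, 4))
y = 3.0 * X[:, 0] - 2.0 * X[:, 1] + rng.normal(scale=0.5, size=300)
y[:5] += 50.0  # inject a few extreme targets to play the role of outliers


def iqr_mask(values, k=1.5):
    """Hypothetical helper: True for values inside [Q1 - k*IQR, Q3 + k*IQR]."""
    q1, q3 = np.percentile(values, [25, 75])
    iqr = q3 - q1
    return (values >= q1 - k * iqr) & (values <= q3 + k * iqr)


# Outlier removal: keep only rows whose target passes the IQR rule.
keep = iqr_mask(y)
X_clean, y_clean = X[keep], y[keep]

# Candidate regressors named in the feature list (XGBRegressor omitted here).
models = {
    "LinearRegression": LinearRegression(),
    "ElasticNet": ElasticNet(),
    "KNeighborsRegressor": KNeighborsRegressor(),
    "RandomForestRegressor": RandomForestRegressor(n_estimators=50, random_state=0),
}

# KFold cross-validation: every model is scored on the same 5 shuffled folds.
cv = KFold(n_splits=5, shuffle=True, random_state=42)
scores = {}
for name, model in models.items():
    r2 = cross_val_score(model, X_clean, y_clean, cv=cv, scoring="r2")
    scores[name] = (r2.mean(), r2.std())

# Report models from best to worst mean R^2.
for name, (mean, std) in sorted(scores.items(), key=lambda kv: -kv[1][0]):
    print(f"{name}: R^2 = {mean:.3f} +/- {std:.3f}")
```

Reusing one `KFold` object across models keeps the comparison fair, since each model sees identical train and validation splits. With XGBoost installed, `XGBRegressor` could be added to the `models` dictionary in the same way.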

Tools Used

Python
NumPy
Pandas
Matplotlib
Seaborn
Scikit-Learn
XGBoost
MissingNo
Jupyter Notebook