Time-Series-Analysis-on-Groceries-Data

Problem Statement:

The main objective of this project is to understand how time series models are implemented and to apply them to groceries data. The dataset we chose records purchases of a variety of grocery items over time, and we aim to use it to predict future demand for these products. Such forecasts can help distributors manage inventory and operate more efficiently.

The data comes from the Kaggle Groceries dataset: https://www.kaggle.com/datasets/heeraldedhia/groceries-dataset
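
As a rough sketch of how raw purchase records could be turned into a time series, the snippet below counts purchases per day and per month using only the Python standard library. The column names (`Member_number`, `Date`, `itemDescription`) and the `DD-MM-YYYY` date format are assumptions about the Kaggle file, and the inline sample rows and the `purchase_counts` helper are hypothetical, included for illustration only.

```python
import csv
import io
from collections import Counter
from datetime import datetime

# Hypothetical sample rows; column names and the DD-MM-YYYY format are assumed.
SAMPLE_CSV = """Member_number,Date,itemDescription
1808,21-07-2015,tropical fruit
2552,05-01-2015,whole milk
2300,19-09-2015,pip fruit
1187,12-12-2015,other vegetables
3037,01-02-2015,whole milk
4941,14-02-2015,rolls/buns
4501,08-05-2015,other vegetables
3803,23-12-2015,pot plants
"""


def purchase_counts(csv_file, date_format="%d-%m-%Y"):
    """Return (daily, monthly) Counters of purchased items."""
    daily, monthly = Counter(), Counter()
    for row in csv.DictReader(csv_file):
        day = datetime.strptime(row["Date"], date_format).date()
        daily[day] += 1                        # one row = one purchased item
        monthly[(day.year, day.month)] += 1    # aggregate to calendar month
    return daily, monthly


daily, monthly = purchase_counts(io.StringIO(SAMPLE_CSV))
for (year, month), count in sorted(monthly.items()):
    print(f"{year}-{month:02d}: {count}")
```

To run this against the real file, the `io.StringIO(SAMPLE_CSV)` argument could be swapped for an opened CSV file, provided the header row matches the assumed column names.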

Research and Design Questions:

  • Are there seasonal or trend-based patterns in customers' purchasing behavior that can be identified and accounted for in a forecasting model?
  • What is the most appropriate forecasting method for this type of data (e.g. time series analysis using ARIMA, SARIMA, or SARIMAX), and how can its accuracy be evaluated?
  • How can data preprocessing techniques such as feature scaling, outlier detection, or missing value imputation improve the accuracy of a forecasting model?
  • What are the trade-offs between a simpler, more interpretable forecasting model and a more complex, potentially more accurate model, and how can these trade-offs be optimized for the specific business problem at hand?
  • Does the choice of timeframe (daily versus monthly data) affect the accuracy of the predictions, and can automated models/libraries and alternative methods produce more accurate predictions?
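
One way to approach the accuracy questions above is to hold out the last few periods and score every candidate model against a simple baseline. The sketch below is not the project's method: it uses a seasonal-naive forecast (repeat the value observed one season earlier) on made-up monthly counts and scores it with MAE and MAPE. The series values, the season length of 12, and the function names are all illustrative assumptions.

```python
def seasonal_naive(train, horizon, season=12):
    """Forecast each future step with the value observed one season earlier."""
    if len(train) < season:
        raise ValueError("need at least one full season of history")
    history = list(train)
    forecast = []
    for _ in range(horizon):
        value = history[-season]   # same month, one season back
        forecast.append(value)
        history.append(value)      # lets the forecast extend past one season
    return forecast


def mae(actual, predicted):
    """Mean absolute error, in the units of the series."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def mape(actual, predicted):
    """Mean absolute percentage error, in percent; assumes no zero actuals."""
    total = sum(abs(a - p) / abs(a) for a, p in zip(actual, predicted))
    return 100 * total / len(actual)


# Made-up monthly purchase counts for two years (illustration only).
series = [120, 110, 130, 125, 140, 150, 160, 155, 135, 128, 145, 170,
          126, 114, 138, 130, 148, 156, 168, 160, 142, 131, 150, 181]

# Hold out the last six months for evaluation.
train, test = series[:-6], series[-6:]
predicted = seasonal_naive(train, horizon=len(test))

print("forecast:", predicted)
print("MAE:", round(mae(test, predicted), 2))
print("MAPE %:", round(mape(test, predicted), 2))
```

The same hold-out split and error metrics can then be applied to ARIMA, SARIMA, or SARIMAX forecasts, at daily or monthly granularity, so that the models are compared on equal terms.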

The following are a few insights from the data:

image

image

Future Scope:

  • Expand the analysis done for the ‘Beverages’ category to all the other categories.
  • Make predictions for all the other categories.
  • Build a real-time system that ingests data and makes predictions periodically.

References: