With the vast number of products available on eCommerce platforms, personalizing content for users has become crucial to enhance engagement and customer satisfaction. Recommendation systems play a vital role in achieving this personalization. While traditional recommendation systems like Content-Based Filtering and Collaborative Filtering are widely used, graph-based approaches have gained significant traction in recent years due to their ability to model complex relationships between users and items effectively.
This project focuses on building a graph-based recommendation system by generating graph embeddings and utilizing similarity search with FAISS (Facebook AI Similarity Search) to deliver personalized product recommendations for an eCommerce platform.
The aim is to build a graph-based recommender system that recommends the most relevant products to users on eCommerce platforms based on their purchase and search history.
The dataset contains user information with nine attributes collected from an eCommerce website. The attributes include:
- User ID
- Product ID
- Purchase history
- Search queries
- Product categories
- Time of interaction
- Purchase frequency
Note: Embedding data is not pushed to the repository due to upload size limitations.
- Programming Language: Python
- Libraries and Frameworks:
- File Management: Parquet format for efficient storage and processing.
- Database Management: SQL querying for data manipulation.
- Define the scope of the recommendation system.
- Analyze the data to identify user-item interaction patterns.
- Use interaction data to construct user-product graphs.
- Save the graphs in .edg format for compatibility with embedding tools.
- Generate random walks on the graph using techniques like DeepWalk and Node2Vec.
- Train embedding models using Word2Vec and Node2Vec algorithms.
- Save the embeddings in Parquet format for further use.
- Use UMAP (Uniform Manifold Approximation and Projection) for reducing embedding dimensions.
- Visualize user-product clusters to identify patterns and similarities.
- Use FAISS (Facebook AI Similarity Search) to build a scalable recommendation engine.
- Perform similarity searches on the embedding vectors to recommend products.
- Evaluate the quality of recommendations.
- Visualize clusters and analyze user preferences.
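The graph construction and random-walk steps above can be sketched with the standard library alone. This is a minimal illustration, not the project's code: the interaction records, the file name `user_product.edg`, and the helper names are hypothetical, and it assumes the .edg files are tab-separated edge lists (source, target, weight), which is the layout PecanPy reads by default. The walks are uniform, DeepWalk-style walks; Node2Vec would additionally bias each step using its return and in-out parameters and the edge weights.

```python
import os
import random
import tempfile
from collections import defaultdict

# Hypothetical interaction records: (user_id, product_id, interaction_count).
interactions = [
    ("u1", "p1", 3),
    ("u1", "p2", 1),
    ("u2", "p2", 2),
    ("u2", "p3", 1),
    ("u3", "p1", 1),
]


def write_edge_list(records, path):
    """Write one tab-separated 'source, target, weight' line per edge."""
    with open(path, "w", encoding="utf-8") as handle:
        for user, product, weight in records:
            handle.write(f"{user}\t{product}\t{weight}\n")


def read_adjacency(path):
    """Load the edge list back as an undirected adjacency map (weights ignored)."""
    adjacency = defaultdict(list)
    with open(path, encoding="utf-8") as handle:
        for line in handle:
            source, target, _weight = line.rstrip("\n").split("\t")
            adjacency[source].append(target)
            adjacency[target].append(source)
    return adjacency


def uniform_random_walks(adjacency, walks_per_node, walk_length, seed=0):
    """DeepWalk-style walks: each step moves to a uniformly chosen neighbour."""
    rng = random.Random(seed)
    walks = []
    for _ in range(walks_per_node):
        for start in sorted(adjacency):
            walk = [start]
            while len(walk) < walk_length:
                walk.append(rng.choice(adjacency[walk[-1]]))
            walks.append(walk)
    return walks


# Assumed output location; the project stores its graphs under data/.
edge_path = os.path.join(tempfile.gettempdir(), "user_product.edg")
write_edge_list(interactions, edge_path)
graph = read_adjacency(edge_path)
walks = uniform_random_walks(graph, walks_per_node=2, walk_length=5)
print(len(walks), walks[0])
```

Because the graph is bipartite, every walk alternates between user and product nodes. Each walk is a list of node IDs, so the list of walks can be treated as a corpus of sentences when training a Word2Vec model on it.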
.
├── data/ # Contains data files.
│ ├── ConstructedGraph/ # Generated graphs from user-item interactions.
│ ├── Edg_Graphs_DataFile/ # Graphs saved in .edg format.
│ ├── Embedding_Data/ # Generated graph embeddings.
├── documentation/ # Learning resources and reference materials.
├── models/ # Saved models for embeddings and recommendations.
├── notebooks/ # Jupyter notebooks for experimentation.
│ ├── Data_Exploration_and_Analysis.ipynb
│ ├── Graph_Construct.ipynb
│ ├── Deepwalk_and_Node2Vec_Training.ipynb
│ ├── Result_Analysis.ipynb
│ ├── Embedding_Vector_Search_with_FAISS.ipynb
├── scripts/ # Python scripts for modular tasks.
├── requirements.txt # List of dependencies with versions.
└── README.md # Project documentation.
git clone <repository_url>
cd <repository_folder>
Install the required libraries using the following command:
pip install -r requirements.txt
- Train embedding models and generate recommendations:
python scripts/train_embeddings.py
- Perform similarity search using FAISS:
python scripts/faiss_search.py
- Visualize user-item clusters and analyze results using the Jupyter notebooks in the notebooks folder.
- Check the Embedding_Data folder for saved embeddings and recommendations.
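To make the similarity search step concrete, the sketch below shows what the search computes, using a plain exhaustive search in place of FAISS. The vectors, IDs, and function names are hypothetical, and the source does not say which FAISS index type the project uses. On unit-normalized vectors the inner product equals cosine similarity, so this is in spirit what an exact inner-product index such as FAISS's `IndexFlatIP` returns, only without the speed.

```python
import math

# Hypothetical 3-dimensional embeddings; real ones would come from Embedding_Data.
product_vectors = {
    "p1": [0.9, 0.1, 0.0],
    "p2": [0.7, 0.6, 0.1],
    "p3": [0.0, 0.2, 0.9],
}
user_vector = [1.0, 0.2, 0.0]


def normalize(vector):
    """Scale a vector to unit length so inner product equals cosine similarity."""
    norm = math.sqrt(sum(x * x for x in vector))
    return [x / norm for x in vector]


def top_k_products(query, candidates, k):
    """Exhaustive cosine-similarity search, highest score first."""
    unit_query = normalize(query)
    scored = []
    for product_id, vector in candidates.items():
        unit_vector = normalize(vector)
        score = sum(a * b for a, b in zip(unit_query, unit_vector))
        scored.append((score, product_id))
    scored.sort(reverse=True)
    return [(product_id, round(score, 3)) for score, product_id in scored[:k]]


recommendations = top_k_products(user_vector, product_vectors, k=2)
print(recommendations)
```

The exhaustive loop scales linearly with the number of products, which is the cost a dedicated similarity-search library is meant to reduce on large catalogs.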
- Personalized Recommendations:
- Generated product recommendations based on user purchase and search history.
- Graph Insights:
- Visualized clusters to understand user-product relationships.
- Efficient Similarity Search:
- FAISS enabled fast and scalable recommendations.
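The source does not state how recommendation quality was measured. One common option, shown here purely as an assumption, is hit rate at k: the share of users whose top-k recommendations contain at least one item they interacted with in a held-out period. The data and the function name are hypothetical.

```python
def hit_rate_at_k(recommended, held_out, k):
    """Fraction of users whose top-k list contains at least one held-out item."""
    users = [user for user in held_out if held_out[user]]
    if not users:
        return 0.0
    hits = sum(
        1 for user in users
        if set(recommended.get(user, [])[:k]) & held_out[user]
    )
    return hits / len(users)


# Hypothetical ranked recommendations and held-out purchases per user.
recommended = {
    "u1": ["p3", "p4", "p5"],
    "u2": ["p1", "p6", "p2"],
    "u3": ["p7", "p8", "p9"],
}
held_out = {"u1": {"p4"}, "u2": {"p2"}, "u3": {"p1"}}

print(hit_rate_at_k(recommended, held_out, k=2))
print(hit_rate_at_k(recommended, held_out, k=3))
```

Raising k can only keep the hit rate equal or increase it, so results are only comparable between models at the same k.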
Contributions are welcome! To contribute:
- Fork the repository.
- Create a feature branch:
git checkout -b feature-name
- Commit your changes:
git commit -m "Add feature"
- Push your branch:
git push origin feature-name
- Open a pull request.
This project is licensed under the MIT License. See the LICENSE file for details.
For any questions or suggestions, please reach out to:
- Name: Abhinav Navneet
- GitHub: AjNavneet
Special thanks to:
- FAISS for enabling fast and efficient similarity searches.
- pecanpy and gensim for graph embeddings.
- umap for dimensionality reduction.
- DuckDB for its fast and efficient database capabilities.