Here, I explore Collaborative Filtering, a technique used in recommender systems.
I focus on two types of collaborative filtering, user-based and item-based, and provide a memory-based implementation of each.
The MovieLens 100K dataset is used for building the recommender systems.
Download the dataset and unzip it into a folder.
The collaborative filtering algorithms are implemented with Pandas.
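As a rough illustration of the starting point, the sketch below reads ratings in the MovieLens 100K `u.data` layout (tab-separated, no header: user id, item id, rating, timestamp) and pivots them into a user-item matrix. The function names `load_ratings` and `to_user_item_matrix`, the column names, and the in-memory sample rows are assumptions made for this example; they are not taken from the notebooks.

```python
import io

import pandas as pd

# Column layout of a u.data-style ratings file: tab-separated, no header row.
# The column names themselves are chosen for this sketch.
RATING_COLUMNS = ["user_id", "item_id", "rating", "timestamp"]


def load_ratings(source):
    """Read a u.data-style file (path or file-like object) into a DataFrame."""
    return pd.read_csv(source, sep="\t", names=RATING_COLUMNS, header=None)


def to_user_item_matrix(ratings):
    """Pivot long-format ratings into a users x items matrix (NaN = not rated)."""
    return ratings.pivot_table(index="user_id", columns="item_id", values="rating")


# Tiny made-up stand-in for u.data so the sketch runs without the download.
sample = io.StringIO(
    "1\t10\t5\t881250949\n"
    "1\t20\t3\t881250950\n"
    "2\t10\t4\t881250951\n"
)
ratings = load_ratings(sample)
matrix = to_user_item_matrix(ratings)
print(matrix)
```

With the real dataset, the same function would be called with the path to `u.data` inside the folder where the archive was unzipped.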
For the item-based collaborative filtering algorithm, I based my implementation on the excellent Udemy course Taming Big Data with Apache Spark and Python - Hands On! by Frank Kane. My motivation for implementing the algorithm with Pandas is to compare it with an implementation built on a distributed computing library like Spark. For the user-based approach, I did not follow a specific recipe.
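To make the difference between the two approaches concrete, here is a minimal memory-based sketch in Pandas. It is not the code from the notebooks or from the course: it uses plain cosine similarity with missing ratings treated as 0, which is a simplifying assumption, and the toy ratings and the helper names `cosine_similarity_matrix` and `most_similar` are hypothetical. The same function serves both variants; only the orientation of the matrix changes.

```python
import numpy as np
import pandas as pd

# Toy ratings in long format (made-up data, not taken from MovieLens).
ratings = pd.DataFrame({
    "user_id": [1, 1, 1, 2, 2, 3, 3, 3],
    "item_id": ["A", "B", "C", "A", "B", "A", "B", "C"],
    "rating":  [5, 4, 1, 4, 5, 1, 2, 5],
})


def cosine_similarity_matrix(matrix):
    """Cosine similarity between the rows of `matrix` (missing ratings count as 0)."""
    filled = matrix.fillna(0.0)
    norms = np.sqrt((filled ** 2).sum(axis=1))
    dots = filled.dot(filled.T)
    denom = pd.DataFrame(
        np.outer(norms, norms), index=matrix.index, columns=matrix.index
    )
    # Rows with no ratings have a zero norm; mark their similarity as NaN.
    return dots / denom.replace(0.0, np.nan)


def most_similar(similarity, key, n=2):
    """Return the n entries most similar to `key`, excluding `key` itself."""
    return similarity[key].drop(key).sort_values(ascending=False).head(n)


user_item = ratings.pivot_table(index="user_id", columns="item_id", values="rating")

item_similarity = cosine_similarity_matrix(user_item.T)  # item-based: rows are items
user_similarity = cosine_similarity_matrix(user_item)    # user-based: rows are users

print(most_similar(item_similarity, "A"))
print(most_similar(user_similarity, 1))
```

Treating unrated items as 0 keeps the example short but pulls similarities down for sparsely rated pairs; restricting the computation to co-rated entries is a common alternative.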
Jupyter notebooks (here and here) explain the methodology and the steps followed. I also develop a strategy to measure the quality of the recommendations (here and here).