XJTLUmedia/movie-recommender-demo

Overview

This project walks through how you can create recommendations using Apache Spark machine learning. There are a number of Jupyter notebooks that you can run on IBM Data Science Experience, and there is a live demo of a movie recommendation web application you can interact with. The demo also uses IBM Message Hub (Kafka) to push application events to a topic, where they are consumed by a Spark Streaming job running on IBM BigInsights (Hadoop).

Quick start

There is an overview video on YouTube.

This project is a demo movie recommender application. The demo has been loaded with approximately 4,000 movies and 500,000 ratings. The ratings were generated randomly. The web application lets users search for movies, rate movies, and receive movie recommendations based on their ratings.
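To illustrate what randomly generated ratings could look like, here is a minimal sketch. It is not the script used to populate the demo: the function name `generate_random_ratings`, the 1 to 5 rating scale, and the user and movie counts are all assumptions chosen for illustration.

```python
import random


def generate_random_ratings(num_users, movie_ids, ratings_per_user, seed=None):
    """Return a list of (user_id, movie_id, rating) tuples with random ratings.

    Hypothetical helper: the 1-5 integer scale is an assumption, not
    something the demo documents.
    """
    rng = random.Random(seed)
    ratings = []
    for user_id in range(1, num_users + 1):
        # Sample without replacement so a user rates each movie at most once.
        for movie_id in rng.sample(movie_ids, ratings_per_user):
            ratings.append((user_id, movie_id, rng.randint(1, 5)))
    return ratings


# Illustrative sizes only; the demo itself holds far more ratings.
movie_ids = list(range(1, 4001))
ratings = generate_random_ratings(
    num_users=100, movie_ids=movie_ids, ratings_per_user=50, seed=42
)
print(len(ratings))  # 5000
```

Because such ratings are random, any recommendations trained on them will not reflect real taste; they only exercise the pipeline end to end.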

Notebooks

Start with Introduction to read more about this project.

You can import these notebooks into IBM Data Science Experience. I have occasionally experienced issues when trying to load from a URL. If that happens to you, try cloning or downloading this repo and importing the notebooks as files.

Technologies

The overall architecture looks like this:

[Image: Overall Architecture]

The technologies used in this demo are:

Core components (Web Application)

  • Python Flask application
  • IBM Bluemix for hosting the web application and services
  • IBM Cloudant NoSQL for storing movies, ratings, user accounts and recommendations
  • IBM Data Science Experience (DSX) and Spark as a Service

Optional components (Hadoop Warehouse)

The core demo can run without these components.

  • IBM Compose Redis for maintaining an atomic increment counter for the ID fields of user accounts. Use this if you want integer user account IDs rather than the GUIDs generated by Cloudant.
  • IBM Message Hub for the web application to send a stream of ratings as they are entered by the user.
  • IBM BigInsights on Cloud, using Spark Streaming to ingest data from Message Hub and expose it via Hive.
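The first two optional components can be sketched as follows. This is a hedged illustration, not the application's actual code: `next_user_id`, `build_rating_event`, the key name `user_id_counter`, and the event fields are hypothetical. The `InMemoryCounter` class is a stand-in that only mimics the `incr(name, amount=1)` method of a redis-py client, so the sketch runs without a Redis server; the resulting bytes are the kind of payload a Kafka producer could send to a Message Hub topic.

```python
import json
import threading
import time


class InMemoryCounter:
    """Stand-in for a Redis client; it only mimics incr()."""

    def __init__(self):
        self._values = {}
        self._lock = threading.Lock()

    def incr(self, name, amount=1):
        # Redis INCR is atomic on the server; a lock imitates that here.
        with self._lock:
            self._values[name] = self._values.get(name, 0) + amount
            return self._values[name]


def next_user_id(client, key="user_id_counter"):
    """Allocate the next integer user ID (hypothetical key name)."""
    return client.incr(key)


def build_rating_event(user_id, movie_id, rating, timestamp=None):
    """Serialize one rating as JSON bytes (hypothetical event schema)."""
    event = {
        "user_id": user_id,
        "movie_id": movie_id,
        "rating": rating,
        "timestamp": int(time.time()) if timestamp is None else timestamp,
    }
    return json.dumps(event, sort_keys=True).encode("utf-8")


client = InMemoryCounter()
first_id = next_user_id(client)   # 1
second_id = next_user_id(client)  # 2
event = build_rating_event(first_id, movie_id=1234, rating=4, timestamp=0)
print(event.decode("utf-8"))
```

With a real Redis instance, a redis-py client object could be passed to `next_user_id` in place of `InMemoryCounter`, since both expose `incr`.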

Setting up your own demo web application instance on Bluemix

Quick deploy

Click on the link below, then follow the instructions. Note that this step can take a long time, possibly around 30 minutes.

Deploy to Bluemix

  • CAUTION: a Python Flask application instance with 128MB memory and an instance of Cloudant 'Lite' will be deployed, and you may be charged for these services. Please check the charges before deploying. Note that Redis, Message Hub and BigInsights are not deployed by default. If you wish to deploy the solution with these optional components, follow the instructions here

After deploying to Bluemix, you will need to create a new DSX project and import the notebooks. The notebook Step 07 is responsible for creating recommendations and saving them to Cloudant. You will not get recommendations until you have set up this notebook with your Cloudant credentials and run it from DSX.
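As a rough picture of the last step of that flow, the sketch below turns a set of predicted ratings for one user into a JSON-style document of the kind that could be stored in Cloudant. It does not reproduce the Step 07 notebook: the function `build_recommendation_doc`, the document fields other than Cloudant's standard `_id`, and the sample scores are assumptions made for illustration.

```python
import heapq


def build_recommendation_doc(user_id, predicted_ratings, top_n=10):
    """Keep the top_n highest predicted ratings in one document.

    predicted_ratings maps movie_id -> predicted score. The document
    layout is hypothetical; only the `_id` field is a Cloudant convention.
    """
    top = heapq.nlargest(
        top_n, predicted_ratings.items(), key=lambda item: item[1]
    )
    return {
        "_id": f"recommendation_{user_id}",
        "type": "recommendation",
        "user_id": user_id,
        "recommendations": [
            {"movie_id": movie_id, "predicted_rating": score}
            for movie_id, score in top
        ],
    }


# Made-up scores standing in for the output of a trained model.
predictions = {101: 4.7, 102: 3.1, 103: 4.9, 104: 2.0}
doc = build_recommendation_doc(user_id=1, predicted_ratings=predictions, top_n=2)
print([r["movie_id"] for r in doc["recommendations"]])  # [103, 101]
```

Storing one document per user keeps the web application's lookup to a single read by `_id`, which is one plausible reason to shape the output this way.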

Web application screenshots

Rating a movie

The screenshot below shows some movies being rated by a user.

[Image: Screenshot of rating a movie]

Movie recommendations

The screenshot below shows movie recommendations provided by Spark machine learning.

[Image: Screenshot of movie recommendations]
