In the last lab, you learned how to visualize and manipulate time series data and how to use ARIMA modeling to produce forecasts. You also learned how to choose suitable parameters for ARIMA models. This can be a complicated process, and while statistical programming languages such as R provide automated ways to solve this issue, those have yet to be officially ported over to Python.
Fortunately, the Data Science team at Facebook recently published a new library called `fbprophet`, which enables data analysts and developers alike to perform forecasting at scale in Python. We encourage you to read this article by Facebook explaining how `fbprophet` simplifies the forecasting process and provides improved predictive ability.
- Model a time series using Facebook's Prophet
- Describe the difference between ARIMA and additive modeling for time series forecasting
- Use the methods in the `fbprophet` library to plot predicted values
Facebook's Prophet uses an elegant yet simple method for analyzing and predicting periodic data known as additive modeling. The idea is straightforward: represent a time series as a combination of patterns at different scales, such as daily, weekly, seasonal, and yearly, along with an overall trend. Your energy use might rise in the summer and decrease in the winter, but have an overall decreasing trend as you increase the energy efficiency of your home. An additive model can show us both the patterns and the trend, and make predictions based on these observations.
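To make the idea concrete, here is a minimal sketch of an additive series built from the home energy example: a slowly declining trend plus a yearly cycle that peaks in summer. The function name, the parameter values, and the cosine shape of the seasonality are all illustrative assumptions. This is a toy construction, not Prophet's fitting procedure.

```python
import math

def additive_series(n_months=36, start_level=100.0, slope=-0.5, amplitude=10.0):
    """Build a toy monthly series as trend + yearly seasonality (illustrative values)."""
    rows = []
    for t in range(n_months):
        # Overall trend: energy use declines a little every month.
        trend = start_level + slope * t
        # Yearly pattern: if t = 0 is January, the peak falls at t = 6 (July).
        seasonal = amplitude * math.cos(2 * math.pi * (t - 6) / 12)
        rows.append({"t": t, "trend": trend, "seasonal": seasonal, "y": trend + seasonal})
    return rows

series = additive_series()
```

Each observed value `y` is simply the sum of its components, so subtracting the trend from `y` recovers the seasonal pattern and vice versa. A fitted additive model works in the opposite direction: it starts from `y` and estimates the components.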
The following image shows an additive model decomposition of a time series into an overall trend, yearly trend, and weekly trend.
“Prophet has been a key piece to improving Facebook’s ability to create a large number of trustworthy forecasts used for decision-making and even in product features.”
In order to compute its forecasts, the `fbprophet` library relies on the Stan programming language. Before installing `fbprophet`, you need to make sure that `pystan`, the Python wrapper for Stan, is installed. We shall first install `pystan` and `fbprophet` using `pip install`.
# If installing from terminal
# pip install pystan
# pip install fbprophet
# If installing from a jupyter notebook
# !pip install pystan
# !pip install fbprophet
Let's start by reading in our time series data. We will cover some data manipulation using `pandas`, accessing financial data using the `Quandl` library, and plotting with `matplotlib`.
# Import necessary libraries
import warnings
warnings.filterwarnings('ignore')
import pandas as pd
import matplotlib.pyplot as plt
import matplotlib
import seaborn as sns
%matplotlib inline
from matplotlib.pylab import rcParams
plt.style.use('fivethirtyeight')
from fbprophet import Prophet
# Import passengers.csv and set it as a time series
The `fbprophet` library also imposes the strict condition that the input columns be named `ds` (the time column) and `y` (the metric column), so let's rename the columns in our `ts` DataFrame.
# Rename the columns [Month, AirPassengers] to [ds, y]
# Plot the timeseries
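One possible way to approach the loading and renaming steps is sketched below. Because `passengers.csv` is not reproduced here, the sketch reads a small in-memory stand-in holding the first six months of the classic AirPassengers data. The column names follow the comment above, and the date format is an assumption, so the real file may differ.

```python
import io
import pandas as pd

# Stand-in for passengers.csv (assumed layout): first six rows of AirPassengers.
csv_text = """Month,AirPassengers
1949-01-01,112
1949-02-01,118
1949-03-01,132
1949-04-01,129
1949-05-01,121
1949-06-01,135
"""

# With the real file this would be pd.read_csv('passengers.csv').
ts = pd.read_csv(io.StringIO(csv_text))

# Prophet expects real datetimes in the time column.
ts["Month"] = pd.to_datetime(ts["Month"])

# Rename [Month, AirPassengers] to the names Prophet requires.
ts = ts.rename(columns={"Month": "ds", "AirPassengers": "y"})
```

To plot the series, something like `ts.set_index('ds')['y'].plot()` should work with the libraries imported earlier.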
We will now learn how to use the `fbprophet` library to predict future values of our time series. The Facebook team has abstracted away many of the inherent complexities of time series forecasting and made it more intuitive for analysts and developers alike to work with time series data.
To begin, we will create a new Prophet object with `Prophet()` and provide a number of arguments. For example, we can specify the desired range of our uncertainty interval by setting the `interval_width` parameter.
# Set the uncertainty interval to 95% (the Prophet default is 80%)
Model = Prophet(interval_width=0.95)
Now that our model has been initialized, we can call its `.fit()` method with our DataFrame `ts` as input. The model fitting should take no longer than a few seconds.
# Fit the timeseries to Model
In order to obtain forecasts of our time series, we must provide the model with a new DataFrame containing a `ds` column that holds the dates for which we want predictions. Conveniently, we do not have to create this DataFrame manually because Prophet provides the `.make_future_dataframe()` helper method. We will call this method to generate 36 datestamps in the future. The documentation for this method is available here.
It is also important to consider the frequency of our time series. Because we are working with monthly data, we explicitly specify the desired frequency of the timestamps (in this case, `MS` stands for month start). Therefore, `.make_future_dataframe()` will generate 36 monthly timestamps for us. In other words, we are looking to predict values of our time series 3 years into the future.
# Use make_future_dataframe() with a monthly frequency and periods = 36 for 3 years
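To see what kind of dates the helper produces, the sketch below approximates its result with plain `pandas`. It assumes the series ends in December 1960, as the AirPassengers data does. The short `history` frame is a hypothetical stand-in for the `ds` column of `ts`. By default the helper also keeps the historical dates, which the final step imitates.

```python
import pandas as pd

# Hypothetical stand-in for the last six observed months of ts['ds'].
history = pd.DataFrame({"ds": pd.date_range("1960-07-01", periods=6, freq="MS")})
last_date = history["ds"].max()

# Ask for one extra period, then keep only dates strictly after the last observation.
future_dates = pd.date_range(start=last_date, periods=36 + 1, freq="MS")
future_dates = future_dates[future_dates > last_date][:36]

# Historical dates followed by the 36 new month-start dates.
future = pd.concat(
    [history[["ds"]], pd.DataFrame({"ds": future_dates})],
    ignore_index=True,
)
```

With a fitted model, the equivalent call would be `Model.make_future_dataframe(periods=36, freq='MS')`.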
This DataFrame of future dates can now be used as input to the `.predict()` method of the fitted model.
# Predict the values for future dates and take the head of forecast
We can see that Prophet returns a large table with many interesting columns, but we subset our output to the columns most relevant to forecasting, which are:
- `ds`: the datestamp of the forecasted value
- `yhat`: the forecasted value of our metric (in statistics, yhat is a notation traditionally used to represent the predicted values of a value y)
- `yhat_lower`: the lower bound of our forecasts
- `yhat_upper`: the upper bound of our forecasts
# Subset above mentioned columns and view the tail
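The fit, future-dates, and predict steps described so far can be collected into one small helper, sketched below. The function name `fit_and_forecast` and its defaults are hypothetical. The Prophet calls are the ones named in this lab. The import is done inside the function so the definition loads even where Prophet is not installed, and it tries the newer package name `prophet` (used from release 1.0 onward) before falling back to `fbprophet`.

```python
def fit_and_forecast(ts, periods=36, freq="MS", interval_width=0.95):
    """Fit Prophet on a DataFrame with `ds` and `y` columns and forecast ahead.

    Hypothetical helper: returns the fitted model and the full forecast table.
    """
    # Lazy import: newer releases ship as `prophet`, older ones as `fbprophet`.
    try:
        from prophet import Prophet
    except ImportError:
        from fbprophet import Prophet

    model = Prophet(interval_width=interval_width)   # 95% uncertainty interval
    model.fit(ts)                                    # ts must have columns ds and y
    future = model.make_future_dataframe(periods=periods, freq=freq)
    forecast = model.predict(future)                 # one row per date in `future`
    return model, forecast
```

A typical use would be `model, forecast = fit_and_forecast(ts)`, followed by `forecast[['ds', 'yhat', 'yhat_lower', 'yhat_upper']].tail()` to view the last few forecasted rows.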
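Small differences from the values shown above are to be expected. By default, Prophet estimates its uncertainty intervals by simulating possible future trends, which is a random process, so `yhat_lower` and `yhat_upper` will be slightly different each time. If you enable full Bayesian sampling with Markov chain Monte Carlo (MCMC) through the `mcmc_samples` argument, the forecasted values may vary slightly as well.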
Prophet also provides a convenient method to quickly plot the results of our forecasts.
# Use Prophet's plot method to plot the predictions
Prophet plots the observed values of the time series (the black dots), the forecasted values (blue line) and the uncertainty intervals of our forecasts (the blue shaded regions).
One other particularly strong feature of Prophet is its ability to return the components of our forecasts. This can help reveal how daily, weekly, and yearly patterns of the time series contribute to the overall forecasted values. We can use the `.plot_components()` method to view the individual components.
# Plot model components
Since we are working with monthly data, Prophet will plot the trend and the yearly seasonality. If you were working with daily data, you would also see a weekly seasonality plot.
From the trend and seasonality, we can see that the trend is playing a large part in the underlying time series and seasonality comes into play mostly toward the beginning and the end of the year. With this information, we've been able to quickly model and forecast some data to get a feel for what might be coming our way in the future from this particular dataset.
In this lab, you learned how to use the `fbprophet` library to perform time series forecasting in Python. We have been using out-of-the-box parameters, but Prophet enables us to specify many more arguments. In particular, Prophet lets you bring your own knowledge about the time series to the table.