Author: Bethany Lusch ([email protected]), combining and adapting materials evolved over time by Marieme Ngom, Asad Khan, Prasanna Balaprakash, Taylor Childers, Corey Adams, Kyle Felker, and Tanwi Mallick
This tutorial covers the basics of neural networks (aka "deep learning"), a technique within machine learning that tends to outperform other techniques when a large amount of data is available.
This is a quick overview, but the goals are:
- to introduce the fundamental concepts of deep learning through hands-on activities
- to give you the necessary background for the more advanced topics in the coming weeks.
Some rough definitions:
- Artificial intelligence (AI) is a set of approaches to solving complex problems by imitating the brain's ability to learn.
- Machine learning (ML) is the field of study that gives computers the ability to learn without being explicitly programmed (i.e., learning patterns instead of writing down rules). Arguably, machine learning is now a subfield of AI.
Last week, we learned about using linear regression to predict the sale price of a house. We fit a function to the dataset:
- Input: above ground square feet
- Output: sale price
- Function type: linear
- Loss function: mean squared error
- Optimization algorithm: stochastic gradient descent
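As a reminder of how those pieces fit together, here is a minimal sketch in plain Python. It is not last week's notebook code: the function names, the learning rate, the number of epochs, and the toy data (square footage in thousands, price in units of $100k) are all assumptions made up for illustration.

```python
import random

def fit_linear_sgd(xs, ys, lr=0.05, epochs=2000, seed=0):
    """Fit y ~ w * x + b by stochastic gradient descent on the squared error."""
    rng = random.Random(seed)
    w, b = 0.0, 0.0
    indices = list(range(len(xs)))
    for _ in range(epochs):
        rng.shuffle(indices)  # visit the data points in a new random order each epoch
        for i in indices:
            error = (w * xs[i] + b) - ys[i]
            # Gradient of error**2 is 2*error*x with respect to w and 2*error with respect to b
            w -= lr * 2.0 * error * xs[i]
            b -= lr * 2.0 * error
    return w, b

def mean_squared_error(xs, ys, w, b):
    """Loss function: average squared difference between prediction and target."""
    return sum((w * x + b - y) ** 2 for x, y in zip(xs, ys)) / len(xs)

# Hypothetical toy data, generated exactly from price = 1.0 * square_feet + 0.5
square_feet = [0.8, 1.0, 1.2, 1.5, 1.8, 2.0, 2.4]
sale_price = [1.3, 1.5, 1.7, 2.0, 2.3, 2.5, 2.9]

w, b = fit_linear_sgd(square_feet, sale_price)
mse = mean_squared_error(square_feet, sale_price, w, b)
print(f"w = {w:.3f}, b = {b:.3f}, MSE = {mse:.2e}")
```

Because the toy data has no noise, the fitted slope and intercept should end up very close to 1.0 and 0.5. Real housing data is noisy, so the loss would level off above zero instead.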
This week, we'll work on a "classification" problem, which means that we have a category label for each data point, and we fit a function that can categorize inputs.
The MNIST dataset contains 70,000 images of handwritten digits, each labeled with the digit it shows (0-9).
We'll start with the MNIST problem in this notebook: Fitting MNIST with a multi-layer perceptron (MLP)
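To make "a function that can categorize inputs" concrete, here is a rough sketch of the forward pass of a small MLP in plain Python. It is not the notebook's model: the single hidden layer of 32 units, the random untrained weights, and the fake input image are assumptions for illustration only. The input size of 784 comes from MNIST images being 28 x 28 pixels.

```python
import math
import random

def init_layer(n_in, n_out, rng):
    """Random weights (n_out rows of n_in values) and zero biases."""
    scale = 1.0 / math.sqrt(n_in)
    weights = [[rng.uniform(-scale, scale) for _ in range(n_in)] for _ in range(n_out)]
    biases = [0.0] * n_out
    return weights, biases

def dense(inputs, weights, biases):
    """Fully connected layer: one weighted sum plus bias per output unit."""
    return [sum(w * x for w, x in zip(row, inputs)) + b for row, b in zip(weights, biases)]

def relu(values):
    """Nonlinearity applied between layers."""
    return [max(0.0, v) for v in values]

def softmax(values):
    """Turn raw scores into probabilities that sum to 1."""
    m = max(values)  # subtract the max for numerical stability
    exps = [math.exp(v - m) for v in values]
    total = sum(exps)
    return [e / total for e in exps]

def mlp_forward(pixels, layer1, layer2):
    hidden = relu(dense(pixels, *layer1))
    return softmax(dense(hidden, *layer2))

rng = random.Random(0)
layer1 = init_layer(784, 32, rng)  # 784 input pixels -> 32 hidden units (hypothetical size)
layer2 = init_layer(32, 10, rng)   # 32 hidden units -> 10 digit classes

fake_image = [rng.random() for _ in range(784)]  # stand-in for a flattened MNIST image
probs = mlp_forward(fake_image, layer1, layer2)
predicted_digit = probs.index(max(probs))
print("Predicted digit:", predicted_digit)
```

With untrained weights the prediction is meaningless. Training adjusts the weights and biases so that the highest probability lands on the correct label, using a loss function and an optimizer much as in the linear regression example.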
Next week, we'll learn about other types of neural networks.
- If you are using ALCF, first log in. From a terminal run the following command:
- Although we already cloned the repo before, you'll want the updated version. To be reminded of the instructions for syncing your fork, click here.
- We will be downloading data in our Jupyter notebook, which runs on hardware that by default has no Internet access. From the terminal on Polaris, edit the ~/.bash_profile file to have these proxy settings:
export HTTP_PROXY="http://proxy-01.pub.alcf.anl.gov:3128"
export HTTPS_PROXY="http://proxy-01.pub.alcf.anl.gov:3128"
export http_proxy="http://proxy-01.pub.alcf.anl.gov:3128"
export https_proxy="http://proxy-01.pub.alcf.anl.gov:3128"
export ftp_proxy="http://proxy-01.pub.alcf.anl.gov:3128"
export no_proxy="admin,polaris-adminvm-01,localhost,*.cm.polaris.alcf.anl.gov,polaris-*,*.polaris.alcf.anl.gov,*.alcf.anl.gov"
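If a download later fails inside the notebook, one possible quick check is whether these variables are visible to the Python process. The sketch below is only a suggestion: the helper name `missing_proxy_vars` is made up for illustration, and it checks that the variables are set, not that the proxy is reachable.

```python
import os

# The variable names exported in ~/.bash_profile above
PROXY_VARS = ["HTTP_PROXY", "HTTPS_PROXY", "http_proxy", "https_proxy", "ftp_proxy", "no_proxy"]

def missing_proxy_vars(env=None):
    """Return the proxy variable names that are unset or empty in env (hypothetical helper)."""
    env = os.environ if env is None else env
    return [name for name in PROXY_VARS if not env.get(name)]

missing = missing_proxy_vars()
if missing:
    print("Proxy variables not set:", ", ".join(missing))
else:
    print("All proxy variables are set.")
```

If any variables are reported as missing, starting a fresh session after editing ~/.bash_profile may be needed so the new settings are picked up.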
- Now that we have the updated notebooks, we can open them. If you are using ALCF JupyterHub or Google Colab, you can be reminded of the steps here.
- Reminder: Change the notebook's kernel to datascience/conda-2023-01-10 (you may need to change the kernel each time you open a notebook for the first time):
  - select Kernel in the menu bar
  - select Change kernel...
  - select datascience/conda-2023-01-10 from the drop-down menu
Here are Asad Khan's recommendations for further reading:
- tensorflow.org tutorials
- keras.io tutorials
- CS231n: Convolutional Neural Networks for Visual Recognition
- Deep Learning Specialization, Andrew Ng
- PyTorch Challenge, Udacity
- Deep Learning with Python
- Keras Blog
And Bethany's personal favorite, a thorough hands-on textbook: book with notebooks.