Merge pull request #136 from zshiqiang/add_documentations

add documentations
cog-imperial · Dec 5, 2023 · a3d128d
2 parents b43ac5e + 540b63c
commit a3d128d

Showing 13 changed files with 411 additions and 80 deletions.
diff --git a/docs/api_doc/omlt.neuralnet.layers_activations.rst b/docs/api_doc/omlt.neuralnet.layers_activations.rst
@@ -6,6 +6,8 @@ Layer and Activation Functions
 Layer Functions
 ----------------
 
+.. automodule:: omlt.neuralnet.layers.__init__
+
 .. automodule:: omlt.neuralnet.layers.full_space
    :members:
    :undoc-members:
@@ -26,6 +28,8 @@ Layer Functions
 Activation Functions
 ---------------------
 
+.. automodule:: omlt.neuralnet.activations.__init__
+
 .. automodule:: omlt.neuralnet.activations.linear
    :members:
    :undoc-members:

diff --git a/docs/notebooks.rst b/docs/notebooks.rst
@@ -2,22 +2,29 @@ Jupyter Notebooks
 ===================
 
 OMLT provides Jupyter notebooks to demonstrate its core capabilities. All notebooks can be found on the OMLT 
-github `page <https://github.com/cog-imperial/OMLT/tree/main/docs/notebooks/>`_. The notebooks are summarized as follows:
+github `page <https://github.com/cog-imperial/OMLT/tree/main/docs/notebooks/>`_.
+
+The first set of notebooks demonstrates the basic mechanics of OMLT and shows how to use it:
 
 * `build_network.ipynb <https://github.com/cog-imperial/OMLT/blob/main/docs/notebooks/neuralnet/build_network.ipynb/>`_ shows how to manually create a `NetworkDefinition` object. This notebook is helpful for understanding the details of the internal layer structure that OMLT uses to represent neural networks. 
 
 * `import_network.ipynb <https://github.com/cog-imperial/OMLT/blob/main/docs/notebooks/neuralnet/import_network.ipynb/>`_ shows how to import neural networks from Keras and PyTorch using ONNX interoperability. The notebook also shows how to import variable bounds from data.
 
 * `neural_network_formulations.ipynb <https://github.com/cog-imperial/OMLT/blob/main/docs/notebooks/neuralnet/neural_network_formulations.ipynb>`_ showcases the different neural network formulations available in OMLT.
 
+* `index_handling.ipynb <https://github.com/cog-imperial/OMLT/blob/main/docs/notebooks/neuralnet/index_handling.ipynb>`_ shows how to use `IndexMapper` to handle the mappings between indexes.
+
+* `bo_with_trees.ipynb <https://github.com/cog-imperial/OMLT/blob/main/docs/notebooks/bo_with_trees.ipynb>`_ incorporates gradient-boosted trees into a Bayesian optimization loop to optimize the Rosenbrock function.
+
+* `linear_tree_formulations.ipynb <https://github.com/cog-imperial/OMLT/blob/main/docs/notebooks/trees/linear_tree_formulations.ipynb>`_ showcases the different linear model decision tree formulations available in OMLT.
+
+The second set of notebooks gives application-specific examples:
+
 * `mnist_example_dense.ipynb <https://github.com/cog-imperial/OMLT/blob/main/docs/notebooks/neuralnet/mnist_example_dense.ipynb>`_ trains a fully dense neural network on MNIST and uses OMLT to find adversarial examples.
 
 * `mnist_example_convolutional.ipynb <https://github.com/cog-imperial/OMLT/blob/main/docs/notebooks/neuralnet/mnist_example_convolutional.ipynb>`_ trains a convolutional neural network on MNIST and uses OMLT to find adversarial examples.
 
 * `auto-thermal-reformer.ipynb <https://github.com/cog-imperial/OMLT/blob/main/docs/notebooks/neuralnet/auto-thermal-reformer.ipynb>`_ develops a neural network surrogate (using sigmoid activations) with data from a process model built using `IDAES-PSE <https://github.com/IDAES/idaes-pse>`_.
 
 * `auto-thermal-reformer-relu.ipynb <https://github.com/cog-imperial/OMLT/blob/main/docs/notebooks/neuralnet/auto-thermal-reformer-relu.ipynb>`_ develops a neural network surrogate (using ReLU activations) with data from a process model built using `IDAES-PSE <https://github.com/IDAES/idaes-pse>`_.
-
-* `bo_with_trees.ipynb <https://github.com/cog-imperial/OMLT/blob/main/docs/notebooks/trees/bo_with_trees.ipynb>`_ incorporates gradient-boosted-trees into a Bayesian optimization loop to optimize the Rosenbrock function.
-
-* `linear_tree_formulations.ipynb <https://github.com/cog-imperial/OMLT/blob/main/docs/notebooks/trees/linear_tree_formulations.ipynb>`_ showcases the different linear model decision tree formulations available in OMLT.
+* 
diff --git a/docs/notebooks/neuralnet/index_handling.ipynb b/docs/notebooks/neuralnet/index_handling.ipynb
@@ -0,0 +1,257 @@
+{
+ "cells": [
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "# Index handling\n",
+    "\n",
+    "Sometimes moving from layer to layer in a neural network involves reshaping the output of the previous layer to match the input of the next layer. This notebook demonstrates how to use `IndexMapper` in OMLT to handle this reshaping.\n",
+    "\n",
+    "## Library Setup\n",
+    "\n",
+    "Start by importing the libraries used in this project:\n",
+    "\n",
+    " - `numpy`: a general-purpose numerical library\n",
+    " - `omlt`: the package this notebook demonstrates.\n",
+    " \n",
+    "We import the following classes from `omlt`:\n",
+    "\n",
+    " - `NetworkDefinition`: class that contains the nodes in a Neural Network\n",
+    " - `InputLayer`, `DenseLayer`, `PoolingLayer2D`: the three types of layers used in this example\n",
+    " - `IndexMapper`: used to reshape the data between layers"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 1,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "import numpy as np\n",
+    "from omlt.neuralnet import NetworkDefinition\n",
+    "from omlt.neuralnet.layer import IndexMapper, InputLayer, DenseLayer, PoolingLayer2D"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "Then we define a simple network that consists of a max pooling layer and a dense layer:"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 2,
+   "metadata": {},
+   "outputs": [
+    {
+     "name": "stdout",
+     "output_type": "stream",
+     "text": [
+      "0\tInputLayer(input_size=[9], output_size=[9])\n",
+      "1\tPoolingLayer(input_size=[1, 3, 3], output_size=[1, 2, 2], strides=[2, 2], kernel_shape=[2, 2]), pool_func_name=max\n",
+      "2\tDenseLayer(input_size=[4], output_size=[1])\n"
+     ]
+    }
+   ],
+   "source": [
+    "# define bounds for inputs\n",
+    "input_size = [9]\n",
+    "input_bounds = {}\n",
+    "for i in range(input_size[0]):\n",
+    "    input_bounds[(i)] = (-10.0, 10.0)\n",
+    "\n",
+    "net = NetworkDefinition(scaled_input_bounds=input_bounds)\n",
+    "\n",
+    "# define the input layer\n",
+    "input_layer = InputLayer(input_size)\n",
+    "\n",
+    "net.add_layer(input_layer)\n",
+    "\n",
+    "# define the pooling layer\n",
+    "input_index_mapper_1 = IndexMapper([9], [1, 3, 3])\n",
+    "maxpooling_layer = PoolingLayer2D(\n",
+    "    [1, 3, 3],\n",
+    "    [1, 2, 2],\n",
+    "    [2, 2],\n",
+    "    \"max\",\n",
+    "    [2, 2],\n",
+    "    1,\n",
+    "    input_index_mapper=input_index_mapper_1,\n",
+    ")\n",
+    "\n",
+    "net.add_layer(maxpooling_layer)\n",
+    "net.add_edge(input_layer, maxpooling_layer)\n",
+    "\n",
+    "# define the dense layer\n",
+    "input_index_mapper_2 = IndexMapper([1, 2, 2], [4])\n",
+    "dense_layer = DenseLayer(\n",
+    "    [4],\n",
+    "    [1],\n",
+    "    activation=\"linear\",\n",
+    "    weights=np.ones([4, 1]),\n",
+    "    biases=np.zeros(1),\n",
+    "    input_index_mapper=input_index_mapper_2,\n",
+    ")\n",
+    "\n",
+    "net.add_layer(dense_layer)\n",
+    "net.add_edge(maxpooling_layer, dense_layer)\n",
+    "\n",
+    "for layer_id, layer in enumerate(net.layers):\n",
+    "    print(f\"{layer_id}\\t{layer}\")"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "In this example, `input_index_mapper_1` maps the outputs of `input_layer` (with size [9]) to the inputs of `maxpooling_layer` (with size [1, 3, 3]), and `input_index_mapper_2` maps the outputs of `maxpooling_layer` (with size [1, 2, 2]) to the inputs of `dense_layer` (with size [4]). Given an input, we can evaluate each layer to see how `IndexMapper` works:"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 3,
+   "metadata": {},
+   "outputs": [
+    {
+     "name": "stdout",
+     "output_type": "stream",
+     "text": [
+      "outputs of maxpooling_layer:\n",
+      " [[[5. 6.]\n",
+      "  [8. 9.]]]\n",
+      "outputs of dense_layer:\n",
+      " [28.]\n"
+     ]
+    }
+   ],
+   "source": [
+    "x = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9])\n",
+    "y1 = maxpooling_layer.eval_single_layer(x)\n",
+    "print(\"outputs of maxpooling_layer:\\n\", y1)\n",
+    "y2 = dense_layer.eval_single_layer(y1)\n",
+    "print(\"outputs of dense_layer:\\n\", y2)"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "Without `IndexMapper`, the output of `maxpooling_layer` is identical to the input of `dense_layer`. With `IndexMapper`, the attribute `input_indexes_with_input_layer_indexes` provides the mapping between indexes, so there is no need to define variables for the inputs of each layer (except for `input_layer`). The following cells print the input and output indexes of each layer, together with the mapping between the indexes of two adjacent layers, where `local_index` is an input index of the current layer and `input_index` is the corresponding output index of the previous layer."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 4,
+   "metadata": {},
+   "outputs": [
+    {
+     "name": "stdout",
+     "output_type": "stream",
+     "text": [
+      "input indexes of input_layer:\n",
+      "[(0,), (1,), (2,), (3,), (4,), (5,), (6,), (7,), (8,)]\n",
+      "output indexes of input_layer:\n",
+      "[(0,), (1,), (2,), (3,), (4,), (5,), (6,), (7,), (8,)]\n"
+     ]
+    }
+   ],
+   "source": [
+    "print(\"input indexes of input_layer:\")\n",
+    "print(input_layer.input_indexes)\n",
+    "print(\"output indexes of input_layer:\")\n",
+    "print(input_layer.output_indexes)"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 5,
+   "metadata": {},
+   "outputs": [
+    {
+     "name": "stdout",
+     "output_type": "stream",
+     "text": [
+      "input indexes of maxpooling_layer:\n",
+      "[(0, 0, 0), (0, 0, 1), (0, 0, 2), (0, 1, 0), (0, 1, 1), (0, 1, 2), (0, 2, 0), (0, 2, 1), (0, 2, 2)]\n",
+      "output indexes of maxpooling_layer:\n",
+      "[(0, 0, 0), (0, 0, 1), (0, 1, 0), (0, 1, 1)]\n",
+      "input_index_mapping_1:\n",
+      "local_index: (0, 0, 0) input_index: (0,)\n",
+      "local_index: (0, 0, 1) input_index: (1,)\n",
+      "local_index: (0, 0, 2) input_index: (2,)\n",
+      "local_index: (0, 1, 0) input_index: (3,)\n",
+      "local_index: (0, 1, 1) input_index: (4,)\n",
+      "local_index: (0, 1, 2) input_index: (5,)\n",
+      "local_index: (0, 2, 0) input_index: (6,)\n",
+      "local_index: (0, 2, 1) input_index: (7,)\n",
+      "local_index: (0, 2, 2) input_index: (8,)\n"
+     ]
+    }
+   ],
+   "source": [
+    "print(\"input indexes of maxpooling_layer:\")\n",
+    "print(maxpooling_layer.input_indexes)\n",
+    "print(\"output indexes of maxpooling_layer:\")\n",
+    "print(maxpooling_layer.output_indexes)\n",
+    "print(\"input_index_mapping_1:\")\n",
+    "for local_index, input_index in maxpooling_layer.input_indexes_with_input_layer_indexes:\n",
+    "    print(\"local_index:\", local_index, \"input_index:\", input_index)"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 6,
+   "metadata": {},
+   "outputs": [
+    {
+     "name": "stdout",
+     "output_type": "stream",
+     "text": [
+      "input indexes of dense_layer:\n",
+      "[(0,), (1,), (2,), (3,)]\n",
+      "output indexes of dense_layer:\n",
+      "[(0,)]\n",
+      "input_index_mapping_2:\n",
+      "local_index: (0,) input_index: (0, 0, 0)\n",
+      "local_index: (1,) input_index: (0, 0, 1)\n",
+      "local_index: (2,) input_index: (0, 1, 0)\n",
+      "local_index: (3,) input_index: (0, 1, 1)\n"
+     ]
+    }
+   ],
+   "source": [
+    "print(\"input indexes of dense_layer:\")\n",
+    "print(dense_layer.input_indexes)\n",
+    "print(\"output indexes of dense_layer:\")\n",
+    "print(dense_layer.output_indexes)\n",
+    "print(\"input_index_mapping_2:\")\n",
+    "for local_index, input_index in dense_layer.input_indexes_with_input_layer_indexes:\n",
+    "    print(\"local_index:\", local_index, \"input_index:\", input_index)"
+   ]
+  }
+ ],
+ "metadata": {
+  "kernelspec": {
+   "display_name": "OMLT",
+   "language": "python",
+   "name": "python3"
+  },
+  "language_info": {
+   "codemirror_mode": {
+    "name": "ipython",
+    "version": 3
+   },
+   "file_extension": ".py",
+   "mimetype": "text/x-python",
+   "name": "python",
+   "nbconvert_exporter": "python",
+   "pygments_lexer": "ipython3",
+   "version": "3.8.16"
+  },
+  "orig_nbformat": 4
+ },
+ "nbformat": 4,
+ "nbformat_minor": 2
+}
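The index pairs printed in the notebook above can be reproduced with a minimal NumPy-only sketch. It assumes the mapping follows row-major (C-order) flattening, which is consistent with the printed output but is not taken from the OMLT source; `index_mapping` is a hypothetical helper written for illustration and is not part of OMLT.

```python
import numpy as np


def index_mapping(input_size, output_size):
    """Pair each reshaped (local) index with the index it came from.

    Hypothetical helper, not an OMLT function. Assumes row-major ordering.
    """
    n_in = int(np.prod(input_size))
    n_out = int(np.prod(output_size))
    if n_in != n_out:
        raise ValueError("sizes must contain the same number of elements")
    pairs = []
    for flat in range(n_out):
        # Same flat position, expressed in the two different shapes.
        local_index = tuple(int(i) for i in np.unravel_index(flat, tuple(output_size)))
        input_index = tuple(int(i) for i in np.unravel_index(flat, tuple(input_size)))
        pairs.append((local_index, input_index))
    return pairs


# Mirrors IndexMapper([9], [1, 3, 3]) and IndexMapper([1, 2, 2], [4]) from the notebook.
mapping_1 = index_mapping([9], [1, 3, 3])
mapping_2 = index_mapping([1, 2, 2], [4])

# The dense layer with unit weights and zero bias sums the flattened pooling output.
y1 = np.array([[[5.0, 6.0], [8.0, 9.0]]])
y2 = y1.reshape(4) @ np.ones((4, 1)) + np.zeros(1)

for local_index, input_index in mapping_2:
    print("local_index:", local_index, "input_index:", input_index)
```

Under the row-major assumption, `mapping_1` and `mapping_2` contain the same pairs as the notebook's `input_index_mapping_1` and `input_index_mapping_2` outputs, and `y2` equals the recorded dense-layer output.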
diff --git a/src/omlt/neuralnet/__init__.py b/src/omlt/neuralnet/__init__.py
@@ -1,16 +1,19 @@
 r"""
-We use the following notation to describe layer and activation functions:
+The basic pipeline in the OMLT source code is:
 
 .. math::
 
     \begin{align*}
-    N &:= \text{Set of nodes (i.e. neurons in the neural network)}\\
-    M_i &:= \text{Number of inputs to node $i$}\\
-    \hat z_i &:= \text{pre-activation value on node $i$}\\
-    z_i &:= \text{post-activation value on node $i$}\\
-    w_{ij} &:= \text{weight from input $j$ to node $i$}\\
-    b_i &:= \text{bias value for node $i$}
+        \mathbf z^{(0)}
+        \xrightarrow[\text{Constraints}]{\text{Layer 1}} \hat{\mathbf z}^{(1)}
+        \xrightarrow[\text{Activations}]{\text{Layer 1}} \mathbf z^{(1)}
+        \xrightarrow[\text{Constraints}]{\text{Layer 2}} \hat{\mathbf z}^{(2)}
+        \xrightarrow[\text{Activations}]{\text{Layer 2}} \mathbf z^{(2)}
+        \xrightarrow[\text{Constraints}]{\text{Layer 3}}\cdots
     \end{align*}
+
+where :math:`\mathbf z^{(0)}` is the output of `InputLayer`, :math:`\hat{\mathbf z}^{(l)}` is the pre-activation output of the :math:`l`-th layer, and :math:`\mathbf z^{(l)}` is the post-activation output of the :math:`l`-th layer.
+
 """
 from omlt.neuralnet.network_definition import NetworkDefinition
 from omlt.neuralnet.nn_formulation import (
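
The pipeline described in the docstring above can be illustrated with a small NumPy forward pass. This is only a numerical sketch of the alternation between pre-activation and post-activation values for dense layers; it does not build any optimization constraints, and `forward`, `relu`, and `identity` are hypothetical names, not OMLT functions.

```python
import numpy as np


def relu(v):
    return np.maximum(v, 0.0)


def identity(v):
    return v


def forward(z0, layers):
    """Return [(z_hat, z), ...] for each layer, starting from the input z0.

    Each layer is a (weights, biases, activation) triple; weights have
    shape (n_inputs, n_outputs), as in the notebook's dense layer.
    """
    z = np.asarray(z0, dtype=float)
    trace = []
    for weights, biases, activation in layers:
        z_hat = z @ weights + biases  # "layer constraints" step: pre-activation
        z = activation(z_hat)         # "activations" step: post-activation
        trace.append((z_hat, z))
    return trace


layers = [
    (np.array([[1.0, -1.0], [1.0, -1.0]]), np.zeros(2), relu),
    (np.ones((2, 1)), np.zeros(1), identity),
]
trace = forward([1.0, 2.0], layers)
for l, (z_hat, z) in enumerate(trace, start=1):
    print(f"layer {l}: pre-activation {z_hat}, post-activation {z}")
```

Here `trace[l - 1]` holds the pair corresponding to :math:`\hat{\mathbf z}^{(l)}` and :math:`\mathbf z^{(l)}` in the notation above.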

diff --git a/src/omlt/neuralnet/activations/__init__.py b/src/omlt/neuralnet/activations/__init__.py
@@ -1,3 +1,7 @@
+r"""
+Since all activation functions are element-wise, we only consider how to formulate activation functions for a single neuron, where :math:`x` denotes the pre-activation variable and :math:`y` denotes the post-activation variable.
+
+"""
 from .linear import linear_activation_constraint, linear_activation_function
 from .relu import ComplementarityReLUActivation, bigm_relu_activation_constraint
 from .smooth import (

diff --git a/src/omlt/neuralnet/activations/linear.py b/src/omlt/neuralnet/activations/linear.py
@@ -8,12 +8,12 @@ def linear_activation_constraint(
     r"""
     Linear activation constraint generator
 
-    Generates the constraints for the linear activation function.
+    Generates the constraints for the linear activation function:
 
     .. math::
 
         \begin{align*}
-        z_i &= \hat{z_i} && \forall i \in N
+            y=x
         \end{align*}
 
     """