Structured neural networks (SNNs) are a new neural network concept. These networks base their structure on the laws of mechanics and control theory.
The goal of the framework is to let users quickly model and control a mechanical system such as an autonomous vehicle, an industrial robot, a walking robot, or a flying drone.
Below is the workflow that the framework follows.
Starting from a conceptual representation of your mechanical system, the framework generates the structured neural network model of the mechanical device under consideration. Given suitable experimental data, the framework will train the neural models effectively by appropriately choosing all the hyper-parameters. The framework will then let the user synthesize and train, in a few simple steps and without performing new experiments, a structured neural network to be used as a control system. The resulting neural controller will be exported in C or ONNX, ready to use.
You can install the dependencies of the neu4mes framework from PyPI via:
pip install -r requirements.txt
The structured neural model is defined by a list of inputs, a list of outputs, and a list of relationships that link the inputs to the outputs.
Let's assume we want to model one of the best-known linear mechanical systems, the mass-spring-damper system.
The system is defined as the following equation:
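For reference, a mass-spring-damper is commonly written in the following form. The symbols m (mass), c (damping coefficient) and k (spring stiffness) are notation assumed here, not names taken from the framework:

```latex
m\,\ddot{x}(t) + c\,\dot{x}(t) + k\,x(t) = F(t)
```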
Suppose we want to estimate the value of the future position of the mass given the initial position and the external force.
In the neu4mes framework we can build an estimator in this form:
x = Input('x')
F = Input('F')
x_z_est = Output('x_z_est', Fir(x.tw(1))+Fir(F.last()))
First, we define the input variables of the system.
Input variables can be created using the Input function.
In our system we have two inputs: the position of the mass, x, and the external force, F, exerted on the mass.
The Output function is used to define an output of our model.
Output takes two arguments: the first is the name of the output and the second is the structure of the estimator.
Let's explain some of the functions used:
- The tw(...) function is used to extract a time window from a signal. In particular, we extract a time window of 1 second.
- The last() function is used to get the last force applied to the mass.
- The Fir(...) function builds an FIR filter with tunable parameters on our input variable.
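To make the role of these three functions concrete, here is a minimal plain-Python sketch of what the estimator computes for scalar signals. It is not the neu4mes implementation: the helper names, the window length of five samples, and the weight values are all assumptions chosen for illustration (in the framework the weights are learned during training).

```python
# Illustrative sketch only: plain Python, not the neu4mes implementation.

def fir(window, weights):
    """Weighted sum of the samples in a window (FIR filter without bias)."""
    if len(window) != len(weights):
        raise ValueError("window and weights must have the same length")
    return sum(w * s for w, s in zip(weights, window))


def estimate_next_position(x_window, f_last, h_x, h_f):
    """Mirror of Fir(x.tw(1)) + Fir(F.last()) for scalar signals."""
    return fir(x_window, h_x) + fir([f_last], [h_f])


# Hypothetical weights: linear extrapolation from the last two positions,
# with no contribution from the force.
h_x = [0.0, 0.0, 0.0, -1.0, 2.0]
h_f = 0.0

x_window = [1.0, 2.0, 3.0, 4.0, 5.0]  # oldest sample first
x_next = estimate_next_position(x_window, 0.5, h_x, h_f)
print(x_next)
```

With these hand-picked weights the estimate simply continues the straight line through the last two positions; training replaces them with values fitted to the data.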
So we are creating an estimator for the variable x at the instant following the observation (the future position of the mass) by building an observer whose mathematical structure is the one shown below:
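A plausible way to write this structure, assuming a window of N_x samples of x and using h_x and h_F as hypothetical names for the tunable FIR coefficients, is:

```latex
\hat{x}[k+1] = \sum_{i=0}^{N_x-1} h_x[i]\, x[k-i] + h_F\, F[k]
```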
Where the variables
Let's now try to train our observer using the data we have. We perform:
mass_spring_damper = Neu4mes()
mass_spring_damper.addModel('x_z_est', x_z_est)
mass_spring_damper.addMinimize('next-pos', x.z(-1), x_z_est, 'mse')
mass_spring_damper.neuralizeModel(0.2)
First we create a Neu4mes object and add one output to the network using the addModel function.
This function is needed to create an output on the model. In this example it is not mandatory, because the same output is also passed to the addMinimize function.
To train our model/estimator, the addMinimize function is used to add a loss function to the list of losses.
This function takes:
- The name of the error, which is shown in the results and during training.
- The two variables whose difference will be minimized; their order is not important.
- The minimization function to use, in this case 'mse'.
The addMinimize call uses the z(-1) function. This function gets from the dataset the future value of a variable (in our case the position of the mass) at the next instant. It uses Z-transform notation; z(-1) is equivalent to the next() function. The z(...) method can be used on an Input variable to get a time-shifted value.
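As a rough illustration of what a one-step time shift means for training, the sketch below pairs each window of past positions with the sample that immediately follows it. This is plain Python written for this explanation; the helper name and the data are made up, and the way neu4mes builds its training samples internally may differ.

```python
# Illustrative sketch only: not the neu4mes data pipeline.

def make_training_pairs(x, window):
    """Pair each window of past samples with the sample that follows it."""
    pairs = []
    for end in range(window, len(x)):
        pairs.append((x[end - window:end], x[end]))
    return pairs


# Made-up position samples.
x = [0.25, 0.26, 0.27, 0.28, 0.29, 0.30, 0.31]
pairs = make_training_pairs(x, window=5)
for past, target in pairs:
    print(past, "->", target)
```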
The objective of the minimization is to reduce the error between x_z, which represents one sample of the next position of the mass taken from the dataset, and x_z_est, which is one sample of the output of our estimator.
The mathematical formulation is as follows:
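Since the loss selected in the code is 'mse', this is presumably the standard mean squared error, written here with a sample index i as assumed notation:

```latex
\mathrm{MSE} = \frac{1}{n} \sum_{i=1}^{n} \left( x_{z}^{(i)} - x_{z\_est}^{(i)} \right)^{2}
```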
where n represents the number of samples in the dataset.
Next, the neuralizeModel function is used to perform the discretization. Its parameter is the sampling time, which should be chosen based on the available data.
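The effect of the sampling time on a time window can be sketched as follows. The helper name is hypothetical, and rounding to the nearest integer is an assumption about how a window in seconds maps to a number of samples.

```python
# Illustrative sketch only: the rounding rule is an assumption.

def window_to_samples(time_window, sample_time):
    """Number of samples covering `time_window` seconds at `sample_time`."""
    return round(time_window / sample_time)


# A 1-second window sampled every 0.2 seconds.
n_x = window_to_samples(1.0, 0.2)
print(n_x)
```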
data_struct = ['time','x','dx','F']
data_folder = './tutorials/datasets/mass-spring-damper/data/'
mass_spring_damper.loadData(name='mass_spring_dataset', source=data_folder, format=data_struct, delimiter=';')
Then the dataset is loaded. neu4mes loads all the files found in the source folder.
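To give an idea of how a column list and a delimiter map onto the data, here is a standard-library sketch that parses semicolon-separated text into named columns. The file layout (no header row, one numeric value per column) and the sample values are assumptions made for this example; the loader inside neu4mes may behave differently.

```python
# Illustrative sketch only: not the neu4mes loader.
import csv
import io


def read_columns(text, columns, delimiter=';'):
    """Parse delimited numeric text into a dict of column name -> floats."""
    data = {name: [] for name in columns}
    for row in csv.reader(io.StringIO(text), delimiter=delimiter):
        if not row:
            continue  # skip blank lines
        for name, value in zip(columns, row):
            data[name].append(float(value))
    return data


# Made-up file content, for illustration only.
sample_text = "0.0;0.25;0.05;0.5\n0.2;0.26;0.05;0.5\n"
data = read_columns(sample_text, ['time', 'x', 'dx', 'F'])
print(data['x'], data['F'])
```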
The model is then trained on this dataset.
mass_spring_damper.trainModel()
To test the results we need to create an input sample. In this case it is defined by:
- x with 5 samples, because the sample time is 0.2 and the window of x is 1 second.
- F with one sample, because only the last sample is needed.
sample = {'F':[0.5], 'x':[0.25, 0.26, 0.27, 0.28, 0.29]}
results = mass_spring_damper(sample)
print(results)
The results variable is structured as follows:
>> {'x_z_est':[0.4]}
The value is the output of our estimator (that is, the next position of the mass) and should be as close as possible to x.next() taken from the dataset.
The network can also be tested using a longer time window:
sample = {'F':[0.5, 0.6], 'x':[0.25, 0.26, 0.27, 0.28, 0.29, 0.30]}
results = mass_spring_damper(sample)
print(results)
The values of x are built using a moving time window.
The results variable is structured as follows:
>> {'x_z_est':[0.4, 0.42]}
The same output can be generated by calling the network with the flag sampled=True, in this way:
sample = {'F':[[0.5],[0.6]], 'x':[[0.25, 0.26, 0.27, 0.28, 0.29],[0.26, 0.27, 0.28, 0.29, 0.30]]}
results = mass_spring_damper(sample,sampled=True)
print(results)
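The relation between the two input forms can be sketched in plain Python. The helper below is hypothetical and only mimics the moving-window behaviour described above; it assumes that the windows of all signals are aligned from the first sample.

```python
# Illustrative sketch only: not the neu4mes implementation.

def to_sampled(inputs, window_sizes):
    """Split each signal into overlapping windows, one per prediction step."""
    n_steps = min(len(inputs[k]) - window_sizes[k] + 1 for k in inputs)
    return {
        k: [inputs[k][i:i + window_sizes[k]] for i in range(n_steps)]
        for k in inputs
    }


sample = {'F': [0.5, 0.6], 'x': [0.25, 0.26, 0.27, 0.28, 0.29, 0.30]}
sampled = to_sampled(sample, {'F': 1, 'x': 5})
print(sampled)
```

Each entry of the result is one window per prediction step, which matches the nested lists passed with sampled=True.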
This folder contains all the neu4mes library files. The main files are the following:
- activation.py contains all the activation functions.
- arithmetic.py contains the arithmetic functions such as +, -, /, *, and ^2.
- fir.py contains the finite impulse response filter function. It is a linear operation without bias on the second dimension.
- fuzzify.py contains the operation for the fuzzification of a variable, commonly used in the local model as an activation function.
- input.py contains the Input class used to create an input for the network.
- linear.py contains the linear function, the typical linear operation W*x+b applied on the third dimension.
- localmodel.py contains the logic for building a local model.
- output.py contains the Output class used to create an output for the network.
- parameter.py contains the logic for creating generic parameters.
- parametricfunction.py contains the user-defined custom functions. These functions can use the PyTorch syntax.
- part.py contains the operations used for selecting parts of the data.
- trigonometric.py contains all the trigonometric functions.
- neu4mes.py is the main file for creating the structured network.
- model.py contains the PyTorch template model for the structured network.
This folder contains some complex examples of the use of the neu4mes framework. The objective of this folder is to demonstrate the effectiveness of the framework in solving real problems. The proposed examples, some of which relate to the accompanying article, are as follows:
- Modeling a linear mass-spring-damper. The objective is to estimate the future position and velocity of the mass. We consider the system equipped with a position sensor and a force sensor.
- ...
This folder contains the unit tests of the library; each file tests a specific functionality.
This folder contains functionality under development. These files present the new functionalities and the syntax chosen for them.
The files in the examples folder are a collection of the functionalities of the library. Each file presents in depth a specific functionality or function of the framework. This folder is useful for understanding the flexibility and capability of the framework.
This section explains the shape of the inputs and outputs of the network.
The structured network can be called in two ways:
- With non-sampled inputs, whose shape is [total time window size, dim]. Sampled inputs are reconstructed as soon as the maximum size of the time window is known. 'dim' represents the size of the input; if it is not 1, the input is a vector.
- With sampled inputs, whose shape is [number of samples = batch, size of time window for a sample, dim].
In the example presented before, in the first call the shape of x is [1,5,1] and the shape of F is [1,1,1]; in the second call the shape of x is [2,5,1] and the shape of F is [2,1,1]. In both cases the last dimension is ignored because the inputs are scalar.
The outputs of the structured neural model are defined in this way for the different cases:
- if the shape is [batch, 1, 1], the final two dimensions are collapsed and the result is [batch]
- if the shape is [batch, window, 1], the last dimension is collapsed and the result is [batch, window]
- if the shape is [batch, window, dim] the output is equal to [batch, window, dim]
- if the shape is [batch, 1, dim] the output is equal to [batch, 1, dim]
In the example, x_z_est has the shape [1] in the first call and [2] in the second, because both the window and the dim are equal to 1.
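The collapsing rules listed above can be sketched as a small helper. The function name is hypothetical and the code only restates the four cases; it is not taken from the library.

```python
# Illustrative sketch only: restates the output-shape rules in plain Python.

def collapse_output_shape(shape):
    """Apply the output collapsing rules to a [batch, window, dim] shape."""
    batch, window, dim = shape
    if dim == 1 and window == 1:
        return [batch]                # [batch, 1, 1] -> [batch]
    if dim == 1:
        return [batch, window]        # [batch, window, 1] -> [batch, window]
    return [batch, window, dim]       # vector outputs keep all three axes


print(collapse_output_shape([1, 1, 1]))
print(collapse_output_shape([2, 1, 1]))
print(collapse_output_shape([4, 5, 1]))
print(collapse_output_shape([4, 1, 3]))
```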
The shape and time window remain unchanged. For binary operators, the shapes of the operands must be equal.
input shape = [batch, window, dim] -> output shape = [batch, window, dim]
The input must be scalar; a vector input is not allowed. The Fir compresses the time dimension (window), which becomes 1. The output dimension of the Fir is placed on the last dimension to create a vector output.
input shape = [batch, window, 1] -> output shape = [batch, 1, output dimension of Fir = output_dimension]
The window remains unchanged and the output dimension is user defined.
input shape = [batch, window, dimension] -> output shape = [batch, window, output dimension of Linear = output_dimension]
The function fuzzifies the input and creates a vector as output. The window remains unchanged. The input must be scalar; vector inputs are not allowed.
input shape = [batch, window, 1] -> output shape = [batch, window, number of centers of Fuzzy = len(centers)]
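The shape rules stated above for Fir, Linear, and the fuzzification can be written down as plain functions. These helpers are hypothetical and only compute shapes; they assume the [batch, window, dim] convention and do not call the library.

```python
# Illustrative sketch only: shape bookkeeping, not the neu4mes operators.

def fir_shape(shape, output_dimension=1):
    """Fir: scalar input, window compressed to 1, outputs on the last axis."""
    batch, window, dim = shape
    if dim != 1:
        raise ValueError("Fir expects a scalar input (dim == 1)")
    return [batch, 1, output_dimension]


def linear_shape(shape, output_dimension):
    """Linear: window unchanged, last axis becomes output_dimension."""
    batch, window, _ = shape
    return [batch, window, output_dimension]


def fuzzify_shape(shape, centers):
    """Fuzzify: scalar input, window unchanged, one output per center."""
    batch, window, dim = shape
    if dim != 1:
        raise ValueError("Fuzzify expects a scalar input (dim == 1)")
    return [batch, window, len(centers)]


print(fir_shape([2, 5, 1]))
print(linear_shape([2, 5, 3], output_dimension=4))
print(fuzzify_shape([2, 5, 1], centers=[0.0, 0.5, 1.0]))
```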
Part selects a slice of the vector input. Select picks a single element, so the dimension becomes 1. In both cases the input must be a vector, and if there is a time component it remains unchanged.
Part input shape = [batch, window, dimension] -> output shape = [batch, window, selected dimension = [j-i]]
Select input shape = [batch, window, dimension] -> output shape = [batch, window, 1]
TimePart selects a time window from the signal (it works like the time window tw([i,j]), but here i and j are absolute).
SamplePart selects a list of samples from the signal (it works like the sample window sw([i,j]), but here i and j are absolute).
SampleSelect selects a specific index from the signal (it works like the zeta operation z(index), but here the index is absolute).
For all these operations the dimension remains unchanged; only the window changes.
SamplePart input shape = [batch, window, dimension] -> output shape = [batch, selected sample window = [j-i], dimension]
SampleSelect input shape = [batch, window, dimension] -> output shape = [batch, 1, dimension]
TimePart input shape = [batch, window, dimension] -> output shape = [batch, selected time window = [j-i]/sample_time, dimension]
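The selection rules above can be summarised with a few shape helpers. As before, the function names are hypothetical, the code only tracks shapes, and rounding the time window to a whole number of samples is an assumption.

```python
# Illustrative sketch only: shape bookkeeping for the selection operations.

def part_shape(shape, i, j):
    """Part: slice [i, j) of the last axis, window unchanged."""
    batch, window, _ = shape
    return [batch, window, j - i]


def select_shape(shape):
    """Select: one element of the last axis, window unchanged."""
    batch, window, _ = shape
    return [batch, window, 1]


def sample_part_shape(shape, i, j):
    """SamplePart: samples [i, j) of the window, dimension unchanged."""
    batch, _, dim = shape
    return [batch, j - i, dim]


def sample_select_shape(shape):
    """SampleSelect: a single sample of the window, dimension unchanged."""
    batch, _, dim = shape
    return [batch, 1, dim]


def time_part_shape(shape, i, j, sample_time):
    """TimePart: window of (j - i) seconds converted to samples."""
    batch, _, dim = shape
    return [batch, round((j - i) / sample_time), dim]


print(part_shape([2, 10, 4], 1, 3))
print(sample_part_shape([2, 10, 4], 2, 6))
print(time_part_shape([2, 10, 4], 0.0, 1.0, 0.2))
```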
The local model has two main inputs: activation functions and inputs. The activation functions have the shape of the fuzzy output:
input shape = [batch, window, 1] -> output shape = [batch, window, number of centers of Fuzzy = len(centers)]
The inputs go through an input function and an output function.
The input shape of the input function can be anything, as long as its output shape is [batch, window, 1]. So, for example, the input function cannot be a Fir with output_dimension different from 1.
The input shape of the output function is [batch, window, 1], while the output shape of the output function can be anything.
Parameter shapes are defined as [window = sw or tw/sample_time, dim].
The dimensions can be defined as a tuple and are appended to the window.
When the time dimension is not defined, it is set to 1.
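These rules for the parameter shape can be sketched as follows. The function and its argument names are hypothetical and do not reproduce the signature of the library's parameter class; the rounding of tw/sample_time is also an assumption.

```python
# Illustrative sketch only: not the signature of the neu4mes parameter class.

def parameter_shape(dimensions=1, sw=None, tw=None, sample_time=None):
    """Return [window, *dimensions] following the rules described above."""
    if sw is not None:
        window = sw                        # window given in samples
    elif tw is not None:
        window = round(tw / sample_time)   # window given in seconds
    else:
        window = 1                         # no time dimension defined
    dims = (dimensions,) if isinstance(dimensions, int) else tuple(dimensions)
    return [window, *dims]


print(parameter_shape())
print(parameter_shape(dimensions=3, sw=4))
print(parameter_shape(dimensions=(3, 2), tw=1.0, sample_time=0.2))
```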
Parametric functions take both inputs and parameters as arguments.
Parameter dimensions are those defined by the parameters; if they are not defined, they default to [window = 1, dim = 1].
The dimensions of the inputs inside the parametric function are the same as those handled by the PyTorch framework, namely [batch, window, dim].
Output dimensions must follow the same convention: [batch, window, dim].
This project is released under the GNU Affero General Public License v3.0.