
Commit

workshop final (ThummeTo#137)
ThummeTo authored Jul 8, 2024
1 parent d2c92ea commit 3d70f9c
Showing 1 changed file with 16 additions and 39 deletions.
55 changes: 16 additions & 39 deletions examples/pluto-src/SciMLUsingFMUs/SciMLUsingFMUs.jl
@@ -508,13 +508,11 @@ md"""
When connecting an FMU with an ANN, technically different signals could be used: states, state derivatives, inputs, outputs, parameters, time itself or other observable variables. Depending on the use case, some signals are a smarter choice than others. In general, every additional signal costs a little computational performance, as you will see, so picking the right subset is key!
![](https://github.com/ThummeTo/FMIFlux.jl/blob/main/examples/pluto-src/SciMLUsingFMUs/src/plan_e1.png?raw=true)
Choose additional FMU variables to pass in together with the state derivatives:
"""

# ╔═╡ 5d688c3d-b5e3-4a3a-9d91-0896cc001000
md"""
We start building our deep model as a `Chain` of layers. For now, there is only a single layer in it: the FMU `fmu` itself. The layer input `x` is interpreted as the system state (compare the figure above) and passed to the FMU call via `x=x`. The current solver time `t` is set implicitly. Further, we want all state derivatives as layer outputs by setting `dx_refs=:all`, plus some additional outputs specified via `y_refs=CHOOSE_y_refs` (you can pick them using the checkboxes).
"""

# ╔═╡ 68719de3-e11e-4909-99a3-5e05734cc8b1
@@ -539,19 +537,13 @@ x1 = FMIZoo.getState(data_train, tStart+1.0)
# ╔═╡ f4e66f76-76ff-4e21-b4b5-c1ecfd846329
begin
    using FMIFlux.FMISensitivity.ReverseDiff
    using FMIFlux.FMISensitivity.ForwardDiff

    # prepare the FMU for evaluation with the given parameters
    prepareSolveFMU(fmu, parameters)

    # Jacobian of the FMU layer with respect to the state, via reverse-mode AD
    jac_rwd = ReverseDiff.jacobian(x -> model(x), x1);

    # the first length(x1) rows form the system matrix A = ∂ẋ/∂x
    A_rwd = jac_rwd[1:length(x1), :]
end

# ╔═╡ ea655baa-b4d8-4fce-b699-6a732dc06051
begin
    using FMIFlux.FMISensitivity.ForwardDiff

    prepareSolveFMU(fmu, parameters)

    # the same Jacobian, this time via forward-mode AD
    jac_fwd = ForwardDiff.jacobian(x -> model(x), x1);
    A_fwd = jac_fwd[1:length(x1), :]
end
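As a quick sanity check (not part of the original notebook), both AD modes should give numerically the same Jacobian:

```julia
# reverse- and forward-mode results should agree up to numerical tolerance
@assert isapprox(A_rwd, A_fwd; rtol=1e-4)
```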

# ╔═╡ 0a7955e7-7c1a-4396-9613-f8583195c0a8
md"""
Depending on how many signals you select, the output of the FMU layer is extended. The first six outputs are the state derivatives; the remaining ones are the $(length(CHOOSE_y_refs)) additional output(s) selected above.
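For illustration, a single evaluation and a manual split of the output vector could look like this (a sketch, assuming the layer is bound to `model` and evaluated at the state `x1` from above):

```julia
out = model(x1)
dx  = out[1:length(x1)]        # the state derivatives
y   = out[length(x1)+1:end]    # the additionally selected outputs (y_refs)
```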
@@ -590,11 +582,8 @@ If we use reverse-mode automatic differentiation via `ReverseDiff.jl`, the deter

# ╔═╡ b163115b-393d-4589-842d-03859f05be9a
md"""
Forward-mode automatic differentiation (using *ForwardDiff.jl*) is available, too.
"""
# ╔═╡ cae2e094-b6a2-45e4-9afd-a6b78e912ab7
md"""
We can determine further Jacobians for FMUs, for example the Jacobian $C = \frac{\partial y}{\partial x}$ of the outputs with respect to the states (using *ReverseDiff.jl*):
"""

@@ -603,16 +592,6 @@ begin
    C_rwd = jac_rwd[length(x1)+1:end, :]
end
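Just as a plausibility check (illustrative, not part of the notebook), the dimensions of $C$ are (number of selected outputs) × (number of states):

```julia
size(C_rwd)   # == (length(CHOOSE_y_refs), length(x1))
```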

# ╔═╡ fe85179e-36ba-45d4-b7a1-a893a382ade4
md"""
And the same, this time using *ForwardDiff.jl*:
"""

# ╔═╡ 5b8084b1-a8be-4bf3-b86d-e2603ae36c5b
begin
    C_fwd = jac_fwd[length(x1)+1:end, :]
end

# ╔═╡ 5e9cb956-d5ea-4462-a649-b133a77929b0
md"""
Let's check the performance of these calls, because they will have a significant influence on the training performance later on!
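The notebook's own timing cells follow; as a standalone sketch, one could also benchmark the two Jacobian calls with *BenchmarkTools.jl* (assumed to be available):

```julia
using BenchmarkTools

@btime ReverseDiff.jacobian(x -> model(x), $x1);
@btime ForwardDiff.jacobian(x -> model(x), $x1);
```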
@@ -688,15 +667,15 @@ end

# ╔═╡ eaf37128-0377-42b6-aa81-58f0a815276b
md"""
> 💡 Keep in mind that the choice of interface might have a significant impact on your inference and training performance! However, some signals are simply required to be part of the interface, because the effect we want to train for depends on them.
"""

# ╔═╡ c030d85e-af69-49c9-a7c8-e490d4831324
md"""
## Online Data Pre- and Postprocessing
**is required for hybrid models**
Now that we have defined the signals that come *from* the FMU and go *into* the ANN, we need to think about data pre- and post-processing. In ML, this is often done before the actual training starts. In hybrid modeling, we need to do this *online*, because the FMU constantly generates signals that might not be suitable for ANNs, and the signals generated by ANNs might not suit the expected FMU input. What *suitable* means becomes clearer if we have a look at the activation functions used, e.g. *tanh*.
![](https://github.com/ThummeTo/FMIFlux.jl/blob/main/examples/pluto-src/SciMLUsingFMUs/src/plan_e2.png?raw=true)
@@ -718,7 +697,7 @@ end

# ╔═╡ 0dadd112-3132-4491-9f02-f43cf00aa1f9
md"""
In general, it looks like the velocity isn't saturated too much by `tanh`. This is a good thing and not always the case! However, the very beginning of the trajectory is saturated too much (the peak value of $\approx -3$ is saturated to $\approx -1$). This is bad, because the hybrid model velocity is *slower* in this time interval and the hybrid system won't reach the same angle over time as the original FMU.
We can add shift (=addition) and scale (=multiplication) operations before and after the ANN to bypass this issue. See how you can influence the output *after* the `tanh` (and the ANN respectively) to match the ranges. The goal is to choose pre- and post-processing parameters so that the signal ranges needed by the FMU are preserved by the hybrid model.
"""
@@ -770,7 +749,7 @@ md"""
# ╔═╡ 0fb90681-5d04-471a-a7a8-4d0f3ded7bcf
md"""
## Introducing Gates
**to control how physical model and machine learning model contribute and interact**
![](https://github.com/ThummeTo/FMIFlux.jl/blob/main/examples/pluto-src/SciMLUsingFMUs/src/plan_e3.png?raw=true)
"""
@@ -879,7 +858,7 @@ md"""
# ╔═╡ 4454c8d2-68ed-44b4-adfa-432297cdc957
md"""
## FMU inputs
In general, you can use arbitrary values as input for the FMU layer, like system inputs, states or parameters. In this example - to keep it easy - we want to use only system states as inputs for the FMU layer, which are:
- currents of both motors
- angles of both joints
- angular velocities of both joints
@@ -900,7 +879,7 @@ Pick additional ANN layer inputs:

# ╔═╡ 06937575-9ab1-41cd-960c-7eef3e8cae7f
md"""
It might be clever to pick additional inputs, because the effect being learned (slip-stick of the pen) might depend on these additional inputs. However, every additional signal has a small negative impact on the computational performance and carries the risk of learning from wrong correlations.
"""

# ╔═╡ 356b6029-de66-418f-8273-6db6464f9fbf
@@ -929,7 +908,7 @@ ANN gates shall be initialized with $(GATES_INIT), meaning the ANN contributes $
# ╔═╡ c0ac7902-0716-4f18-9447-d18ce9081ba5
md"""
## Resulting neural FMU
Our final neural FMU topology looks like this:
"""

# ╔═╡ 84215a73-1ab0-416d-a9db-6b29cd4f5d2a
@@ -982,7 +961,7 @@ end

# ╔═╡ bc09bd09-2874-431a-bbbb-3d53c632be39
md"""
We find a `Chain` consisting of multiple layers and the corresponding parameter counts. We can evaluate it by putting in our start state `x0`. The model computes the resulting state derivative:
"""

# ╔═╡ f02b9118-3fb5-4846-8c08-7e9bbca9d208
@@ -992,6 +971,8 @@ On basis of this `Chain`, we can build a neural FMU very easily:
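A sketch of how that construction could look, assuming FMIFlux's `ME_NeuralFMU` constructor, an ODE solver such as `Tsit5`, and a training time span `(tStart, tStop)` with save points `tSave` (these names are assumptions, not the notebook's exact cell):

```julia
using OrdinaryDiffEq: Tsit5   # assumed solver choice

# wrap the Chain `model` and the FMU into a neural FMU over the training time span
neuralFMU = ME_NeuralFMU(fmu, model, (tStart, tStop), Tsit5(); saveat=tSave)
```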

# ╔═╡ d347d51b-743f-4fec-bed7-6cca2b17bacb
md"""
So let's get that thing trained!
# Training
After setting everything up, we can give it a try and train the neural FMU we created. Depending on the chosen optimization hyperparameters, this will be more or less successful. Feel free to play around a bit, but keep in mind that for real application design, you should do a proper hyperparameter optimization instead of tuning by hand.
@@ -1000,7 +981,7 @@ After setting everything up, we can give it a try and train our created neural F
# ╔═╡ d60d2561-51a4-4f8a-9819-898d70596e0c
md"""
## Hyperparameters
Besides the already introduced hyperparameters - the depth, width and initial gate opening of the hybrid model - further parameters might have a significant impact on the training success.
### Optimizer
For this example, we use the well-known `Adam` optimizer with a step size `eta` of $(@bind ETA Select([1e-4 => "1e-4", 1e-3 => "1e-3", 1e-2 => "1e-2"])).
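As a sketch, setting up the optimizer with the chosen step size could look like this (assuming Flux's `Adam`):

```julia
optim = Adam(ETA)   # ETA as selected above, e.g. 1e-3
```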
@@ -4500,11 +4481,7 @@ version = "1.4.1+1"
# ╟─f7c119dd-c123-4c43-812e-d0625817d77e
# ╟─f4e66f76-76ff-4e21-b4b5-c1ecfd846329
# ╟─b163115b-393d-4589-842d-03859f05be9a
# ╟─ea655baa-b4d8-4fce-b699-6a732dc06051
# ╟─cae2e094-b6a2-45e4-9afd-a6b78e912ab7
# ╟─ac0afa6c-b6ec-4577-aeb6-10d1ec63fa41
# ╟─fe85179e-36ba-45d4-b7a1-a893a382ade4
# ╟─5b8084b1-a8be-4bf3-b86d-e2603ae36c5b
# ╟─5e9cb956-d5ea-4462-a649-b133a77929b0
# ╟─9dc93971-85b6-463b-bd17-43068d57de94
# ╟─476a1ed7-c865-4878-a948-da73d3c81070
@@ -4565,7 +4542,7 @@ version = "1.4.1+1"
# ╟─abc57328-4de8-42d8-9e79-dd4020769dd9
# ╟─e8bae97d-9f90-47d2-9263-dc8fc065c3d0
# ╟─2dce68a7-27ec-4ffc-afba-87af4f1cb630
# ╠═c3f5704b-8e98-4c46-be7a-18ab4f139458
# ╟─1a608bc8-7264-4dd3-a4e7-0e39128a8375
# ╟─ff106912-d18c-487f-bcdd-7b7af2112cab
# ╟─51eeb67f-a984-486a-ab8a-a2541966fa72
