diff --git a/examples/pluto-src/SciMLUsingFMUs/SciMLUsingFMUs.jl b/examples/pluto-src/SciMLUsingFMUs/SciMLUsingFMUs.jl
index f644e5a6..3a228387 100644
--- a/examples/pluto-src/SciMLUsingFMUs/SciMLUsingFMUs.jl
+++ b/examples/pluto-src/SciMLUsingFMUs/SciMLUsingFMUs.jl
@@ -508,13 +508,11 @@ md"""
 When connecting an FMU with an ANN, technically different signals could be used: States, state derivatives, inputs, outputs, parameters, time itself or other observable variables. Depending on the use case, some signals are more clever to choose than others. In general, every additional signal costs a little bit of computational performance, as you will see. So picking the right subset is the key!
 
 ![](https://github.com/ThummeTo/FMIFlux.jl/blob/main/examples/pluto-src/SciMLUsingFMUs/src/plan_e1.png?raw=true)
-
-Choose additional FMU variables to put in together with the state derivatives:
 """
 
 # ╔═╡ 5d688c3d-b5e3-4a3a-9d91-0896cc001000
 md"""
-We start building our deep model as a `Chain` of layers. For now, there is only a single layer in it: The FMU `fmu` itself. The layer input `x` is interpreted as system state (compare to the figure above) and set in the fmu call via `x=x`. Further, we want all state derivatives as layer outputs `dx_refs=:all` and some additional outputs specified via `y_refs=CHOOSE_y_refs`.
+We start building our deep model as a `Chain` of layers. For now, there is only a single layer in it: The FMU `fmu` itself. The layer input `x` is interpreted as system state (compare to the figure above) and set in the fmu call via `x=x`. The current solver time `t` is set implicitly. Further, we want all state derivatives as layer outputs by setting `dx_refs=:all` and some additional outputs specified via `y_refs=CHOOSE_y_refs` (you can pick them using the checkboxes).
 """
 
 # ╔═╡ 68719de3-e11e-4909-99a3-5e05734cc8b1
@@ -539,19 +537,13 @@ x1 = FMIZoo.getState(data_train, tStart+1.0)
 # ╔═╡ f4e66f76-76ff-4e21-b4b5-c1ecfd846329
 begin
 	using FMIFlux.FMISensitivity.ReverseDiff
+	using FMIFlux.FMISensitivity.ForwardDiff
+
 	prepareSolveFMU(fmu, parameters)
 	jac_rwd = ReverseDiff.jacobian(x -> model(x), x1);
 	A_rwd = jac_rwd[1:length(x1), :]
 end
 
-# ╔═╡ ea655baa-b4d8-4fce-b699-6a732dc06051
-begin
-	using FMIFlux.FMISensitivity.ForwardDiff
-	prepareSolveFMU(fmu, parameters)
-	jac_fwd = ForwardDiff.jacobian(x -> model(x), x1);
-	A_fwd = jac_fwd[1:length(x1), :]
-end
-
 # ╔═╡ 0a7955e7-7c1a-4396-9613-f8583195c0a8
 md"""
 Depending on how many signals you select, the output of the FMU-layer is extended. The first six outputs are the state derivatives, the remaining are the $(length(CHOOSE_y_refs)) additional output(s) selected above.
@@ -590,11 +582,8 @@ If we use reverse-mode automatic differentiation via `ReverseDiff.jl`, the deter
 
 # ╔═╡ b163115b-393d-4589-842d-03859f05be9a
 md"""
-For forward-mode automatic differentiation (using *ForwardDiff.jl*), it's the same of course:
-"""
+Forward-mode automatic differentiation (using *ForwardDiff.jl*) is available, too.
 
-# ╔═╡ cae2e094-b6a2-45e4-9afd-a6b78e912ab7
-md"""
 We can determine further Jacobians for FMUs, for example the Jacobian $C = \frac{\partial y}{\partial x}$ states (using *ReverseDiff.jl*):
 """
 
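If you want to mirror the reverse-mode call above with forward-mode AD, a minimal sketch could look like this; it assumes `fmu`, `parameters`, `model` and `x1` are defined exactly as in the notebook cells above, and the first `length(x1)` rows again form the system matrix `A`:

```julia
# Minimal forward-mode sketch, analogous to the ReverseDiff call above
# (assumes `fmu`, `parameters`, `model` and `x1` from the notebook are in scope).
using FMIFlux.FMISensitivity.ForwardDiff

prepareSolveFMU(fmu, parameters)                   # re-initialize the FMU before evaluating
jac_fwd = ForwardDiff.jacobian(x -> model(x), x1)  # full Jacobian of the FMU layer
A_fwd = jac_fwd[1:length(x1), :]                   # ∂ẋ/∂x block, the system matrix A
```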
@@ -603,16 +592,6 @@ begin
 	C_rwd = jac_rwd[length(x1)+1:end, :]
 end
 
-# ╔═╡ fe85179e-36ba-45d4-b7a1-a893a382ade4
-md"""
-And the same for using *ForwardDiff.jl*:
-"""
-
-# ╔═╡ 5b8084b1-a8be-4bf3-b86d-e2603ae36c5b
-begin
-	C_fwd = jac_fwd[length(x1)+1:end, :]
-end
-
 # ╔═╡ 5e9cb956-d5ea-4462-a649-b133a77929b0
 md"""
 Let's check the performance of these calls, because they will have significant influence on the later training performance!
 """
@@ -688,7 +667,7 @@ end
 
 # ╔═╡ eaf37128-0377-42b6-aa81-58f0a815276b
 md"""
-> 💡 Keep in mind that the choice of interface might have a significant impact on your inference and training performance! However, some signals are simply required to be part of the interface, because the effect we want to train for depends on them.
+> 💡 Keep in mind that the choice of interface might have a significant impact on your inference and training performance! However, some signals are simply required to be part of the interface, because the effect we want to train for depends on them.
 """
 
 # ╔═╡ c030d85e-af69-49c9-a7c8-e490d4831324
@@ -696,7 +675,7 @@ md"""
 ## Online Data Pre- and Postprocessing
 **is required for hybrid models**
 
-Now that we have defined the signals that come *from* the FMU and go *into* the ANN, we need to think about data pre- and post-processing. In ML, this is often done before the actual training starts. In hybrid modeling, we need to do this *online*, because the FMU constantly generates signals that might not be suitable for ANNs. On the other hand, the signals generated by ANNs might not suit the expected FMU input. This gets more clear if we have a look on the used activation functions, like e.g. the *tanh*.
+Now that we have defined the signals that come *from* the FMU and go *into* the ANN, we need to think about data pre- and post-processing. In ML, this is often done before the actual training starts. In hybrid modeling, we need to do this *online*, because the FMU constantly generates signals that might not be suitable for ANNs. On the other hand, the signals generated by ANNs might not suit the expected FMU input. What *suitable* means becomes clearer if we have a look at the used activation functions, e.g. the *tanh*.
 
 ![](https://github.com/ThummeTo/FMIFlux.jl/blob/main/examples/pluto-src/SciMLUsingFMUs/src/plan_e2.png?raw=true)
 
@@ -718,7 +697,7 @@ end
 
 # ╔═╡ 0dadd112-3132-4491-9f02-f43cf00aa1f9
 md"""
-In general, it looks like the velocity isn't saturated too much by `tanh`. This is a good thing and not always the case! However, the very beginning of the trajectory is saturated too much (the peak value of $\approx -3$ is saturated to $\approx -1$). This is bad, because the hybrid model velocity is *slower* at this point in time and it won't reach the same angle over time as the original FMU.
+In general, it looks like the velocity isn't saturated too much by `tanh`. This is a good thing and not always the case! However, the very beginning of the trajectory is saturated too much (the peak value of $\approx -3$ is saturated to $\approx -1$). This is bad, because the hybrid model velocity is *slower* in this time interval and the hybrid system won't reach the same angle over time as the original FMU.
 
 We can add shift (=addition) and scale (=multiplication) operations before and after the ANN to bypass this issue.
 See how you can influence the output *after* the `tanh` (and the ANN respectively) to match the ranges. The goal is to choose pre- and post-processing parameters so that the signal ranges needed by the FMU are preserved by the hybrid model.
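To make the saturation issue tangible, here is a tiny plain-Julia illustration (not part of the notebook) of the scale-before / re-scale-after idea for the peak value of about -3:

```julia
# Illustrative only: how a scale before tanh and a matching re-scale after it
# preserve a peak of about -3 that plain tanh would squash to about -1.
v_peak = -3.0
tanh(v_peak)                            # ≈ -0.995, the peak is almost fully saturated

pre_scale  = 1/3                        # scale applied before the activation
post_scale = 3.0                        # inverse scale applied after the activation
post_scale * tanh(pre_scale * v_peak)   # ≈ -2.28, much closer to the raw peak
```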
@@ -770,7 +749,7 @@ md"""
 # ╔═╡ 0fb90681-5d04-471a-a7a8-4d0f3ded7bcf
 md"""
 ## Introducing Gates
-**to control how physical and machine learning model interact**
+**to control how the physical and the machine learning model contribute and interact**
 
 ![](https://github.com/ThummeTo/FMIFlux.jl/blob/main/examples/pluto-src/SciMLUsingFMUs/src/plan_e3.png?raw=true)
 """
@@ -879,7 +858,7 @@ md"""
 # ╔═╡ 4454c8d2-68ed-44b4-adfa-432297cdc957
 md"""
 ## FMU inputs
-In general, you can use arbitrary values as input for the FMU layer, like system inputs, states or parameters. In this example, we want to use only system states as inputs for the FMU layer - to keep it easy, named:
+In general, you can use arbitrary values as input for the FMU layer, like system inputs, states or parameters. In this example, we want to use only system states as inputs for the FMU layer - to keep it simple - which are:
 - currents of both motors
 - angles of both joints
 - angular velocities of both joints
@@ -900,7 +879,7 @@ Pick additional ANN layer inputs:
 
 # ╔═╡ 06937575-9ab1-41cd-960c-7eef3e8cae7f
 md"""
-It might be clever to pick additional inputs, because the effect being learned (slip-stick of the pen) might depend on this additional input. However, every additional signal has a little negative impact on the computational performance.
+It might be clever to pick additional inputs, because the effect being learned (slip-stick of the pen) might depend on these additional inputs. However, every additional signal has a small negative impact on the computational performance and adds a risk of learning from spurious correlations.
 """
 
 # ╔═╡ 356b6029-de66-418f-8273-6db6464f9fbf
@@ -929,7 +908,7 @@ ANN gates shall be initialized with $(GATES_INIT), meaning the ANN contributes $
 # ╔═╡ c0ac7902-0716-4f18-9447-d18ce9081ba5
 md"""
 ## Resulting neural FMU
-Even if this looks a little confusing at first glance, our final neural FMU topology looks like this:
+Our final neural FMU topology looks like this:
 """
 
 # ╔═╡ 84215a73-1ab0-416d-a9db-6b29cd4f5d2a
@@ -982,7 +961,7 @@ end
 
 # ╔═╡ bc09bd09-2874-431a-bbbb-3d53c632be39
 md"""
-We can evaluate it, by putting in our start state `x0`. The model computes the resulting state derivative:
+We find a `Chain` consisting of multiple layers and the corresponding parameter counts. We can evaluate it by putting in our start state `x0`. The model computes the resulting state derivative:
 """
 
 # ╔═╡ f02b9118-3fb5-4846-8c08-7e9bbca9d208
@@ -992,6 +971,8 @@ On basis of this `Chain`, we can build a neural FMU very easy:
 
 # ╔═╡ d347d51b-743f-4fec-bed7-6cca2b17bacb
 md"""
+So let's get that thing trained!
+
 # Training
 After setting everything up, we can give it a try and train our created neural FMU. Depending on the chosen optimization hyperparameters, this will be more or less successful. Feel free to play around a bit, but keep in mind that for real application design, you should do hyper parameter optimization instead of playing around by yourself.
 
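As a rough mental model of the gates introduced further above: the hybrid state derivative is a weighted sum of the FMU output and the ANN output. The toy sketch below uses made-up names (it is not the FMIFlux API) and assumes an initial ANN gate opening of 0.0:

```julia
# Toy sketch of the gating idea with hypothetical names (not FMIFlux API).
gate_fmu = 1.0                          # FMU gate: fully open
gate_ann = 0.0                          # ANN gate: closed, assuming GATES_INIT = 0.0
blend(dx_fmu, dx_ann) = gate_fmu .* dx_fmu .+ gate_ann .* dx_ann

blend([1.0, -2.0], [0.3, 0.1])          # == [1.0, -2.0]: the neural FMU starts as the pure FMU
```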
@@ -1000,7 +981,7 @@ After setting everything up, we can give it a try and train our created neural F
 # ╔═╡ d60d2561-51a4-4f8a-9819-898d70596e0c
 md"""
 ## Hyperparameters
-Besides the already introduced hyperparameters - the depth, width and initial gate opening off the hybrid model - further parameters might have significant impact on the training success.
+Besides the already introduced hyperparameters - the depth, width and initial gate opening of the hybrid model - further parameters might have significant impact on the training success.
 
 ### Optimizer
 For this example, we use the well-known `Adam`-Optimizer with a step size `eta` of $(@bind ETA Select([1e-4 => "1e-4", 1e-3 => "1e-3", 1e-2 => "1e-2"])).
@@ -4500,11 +4481,7 @@ version = "1.4.1+1"
 # ╟─f7c119dd-c123-4c43-812e-d0625817d77e
 # ╟─f4e66f76-76ff-4e21-b4b5-c1ecfd846329
 # ╟─b163115b-393d-4589-842d-03859f05be9a
-# ╟─ea655baa-b4d8-4fce-b699-6a732dc06051
-# ╟─cae2e094-b6a2-45e4-9afd-a6b78e912ab7
 # ╟─ac0afa6c-b6ec-4577-aeb6-10d1ec63fa41
-# ╟─fe85179e-36ba-45d4-b7a1-a893a382ade4
-# ╟─5b8084b1-a8be-4bf3-b86d-e2603ae36c5b
 # ╟─5e9cb956-d5ea-4462-a649-b133a77929b0
 # ╟─9dc93971-85b6-463b-bd17-43068d57de94
 # ╟─476a1ed7-c865-4878-a948-da73d3c81070
@@ -4565,7 +4542,7 @@ version = "1.4.1+1"
 # ╟─abc57328-4de8-42d8-9e79-dd4020769dd9
 # ╟─e8bae97d-9f90-47d2-9263-dc8fc065c3d0
 # ╟─2dce68a7-27ec-4ffc-afba-87af4f1cb630
-# ╟─c3f5704b-8e98-4c46-be7a-18ab4f139458
+# ╠═c3f5704b-8e98-4c46-be7a-18ab4f139458
 # ╟─1a608bc8-7264-4dd3-a4e7-0e39128a8375
 # ╟─ff106912-d18c-487f-bcdd-7b7af2112cab
 # ╟─51eeb67f-a984-486a-ab8a-a2541966fa72
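To connect the optimizer choice in the hyperparameter section with what happens numerically, here is a self-contained plain-Julia sketch of a single `Adam` update (illustrative only, not the notebook's training loop); it shows how the step size `eta` scales the parameter update:

```julia
# Illustrative single Adam step (not the notebook's training code).
function adam_step!(θ, g, m, v, t; eta=1e-3, β1=0.9, β2=0.999, ϵ=1e-8)
    @. m = β1 * m + (1 - β1) * g          # first-moment (mean) estimate
    @. v = β2 * v + (1 - β2) * g^2        # second-moment (variance) estimate
    m̂ = m ./ (1 - β1^t)                   # bias correction
    v̂ = v ./ (1 - β2^t)
    @. θ -= eta * m̂ / (sqrt(v̂) + ϵ)      # larger eta gives larger parameter updates
    return θ
end

θ = [0.5, -0.3]; g = [0.1, -0.2]
adam_step!(θ, g, zeros(2), zeros(2), 1; eta = 1e-3)
```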