Hierarchical optimization in D2D #147
- Improve the docstring in arCalcHierarchical
- Remove a redundant line in arInitHierarchical
Previously, this argument was accepted but unused by arCalcHierarchical. Now it is used to avoid computing scale gradients and sensitivities of observables when not necessary.
Previously, arInitHierarchical was using isSymType, a function introduced in R2019a. This commit removes this dependency and instead implements a small custom function which does the trick.
The scale parameters suitable for hierarchical optimization are those which appear _only_ in observables of form `s*xz`, where `s` is the scale parameter in question and `xz` is a model species or a derived variable. Until now, arInitHierarchical has been detecting parameters which appear in _at least one_ observable of form `s*xz` (meaning that they were also allowed to appear in observables of other forms). Consequently, parameters not suitable for hierarchical optimization could be detected by arInitHierarchical, and the results of the hierarchical optimization could be incorrect. This commit fixes this by adding extra checks in arInitHierarchical, so that it detects only the right parameters.
The implementation of hierarchical optimization depends on the specific form of the merit function. We currently assume the following one: J(s,p) = \sum_j (data_j - s*x(p,t_j))^2/std_j^2. Each feature that modifies this expression also breaks the method of hierarchical optimization, making its results incorrect. To avoid this, arCalcHierarchical tries to check whether D2D features affecting the form of the merit function are enabled. This commit adds extra checks of this kind in arCalcHierarchical, hopefully making the checklist complete (or close to complete).
The removed settings changes are redundant since these settings are tested anyway in arCalcHierarchical.
Currently arCalcHierarchical requires no error fitting, meaning that there can be no dependence of the errors on any fittable parameters. Until now, arCalcHierarchical has detected error fitting incorrectly, excluding some settings which actually involved no error fitting. In other words, the test for error fitting has been too strict. This commit fixes the issue by implementing a better test for error fitting in arCalcHierarchical.
This commit updates the help text of arInitHierarchical to reflect the changes in its functionality introduced in commit c3bbe5a. This should have been done in that commit. Additionally, the present commit fixes a typo in that help text.
Hierarchical optimization is not feasible if the solution of the model is zero for all measurement times. This commit adds the respective assertion in arCalcHierarchical. This extra assertion is not strictly necessary: if the solution degenerates to zero, the scales become NaN, subsequently resulting in NaN residuals, which would trigger an error anyway. Nevertheless, with the extra assertion we get a more specific error message, which may make debugging easier.
An update. As long as this implementation relies on the assumed form of the merit function, the errors must not depend on the fitted parameters. So I need to write another commit to fix this. I will enforce the use of experimental errors if the hierarchical optimization is enabled. Such a solution seems restrictive, but:
I will commit in a moment. (Edit: Done, cf. commit 1b7fb04)
Dear gdudziuk,
Currently arCalcHierarchical requires no dependence of errors on any fittable parameters, neither explicitly nor implicitly. Until now, arCalcHierarchical has been asserting that there is no fitting of error parameters, but at the same time has been allowing implicit dependence of errors on fittable parameters of the model via observables. This is a bug since with such implicit dependence of errors on fittable parameters the results of arCalcHierarchical are incorrect. This commit fixes it by asserting the use of experimental errors, which are independent of the fittable parameters by definition.
An example: lines 24-25 in arInitHierarchical.m. Could you please update the code accordingly?
Another question: I did the following:
There is a warning that I don't understand, and obviously no scale parameters were found. Why?

```matlab
arInitHierarchical
arDisableHierarchical % Switch back to the "normal mode"
```
At the moment, arCalcHierarchical allows experimental errors only, so the lines which handle the case of model-based errors are redundant and obscure the code. This commit removes such lines. See also commit 1b7fb04.
In arInitHierarchical, the cell arrays xSyms and zSyms have been created assuming that there is only one model loaded (i.e. that the length of the ar.model structure is 1). This commit corrects the support for multiple models in arInitHierarchical.
Right. Fixed in commit a35790f. Thank you for pointing this out. In addition, I have found another small issue - I will fix it this week. I will also take a look at the model of Swameye and let you know what's going on there.
For certain purposes, in arCalcHierarchical we have been testing whether all elements of ar.model(im).data(id).useHierarchical are NaNs. But this cannot happen, since the array useHierarchical is constructed such that its elements may only be false or true (not NaN). So we should rather test whether the elements of useHierarchical are all false. This commit fixes this issue.
Until now, arCalcHierarchical has been working correctly only for data series with no duplicate time points (or, more generally, predictor values), even though D2D allows such duplicates in the data. This commit fixes this issue.
There has been a code snippet in a loop over observables which has been independent of the observables. This resulted in unnecessary repeated assignments inside the snippet. This commit moves the snippet to a more suitable location in the body of arInitHierarchical.
Function arInitHierarchical calls the function isAlways, which by default raises a warning when testing a condition like `sym('a')==0`. Such conditions appear during normal operation of arInitHierarchical, so the warning in question is unnecessary. This commit switches it off.
Some of the assertion tests in arCalcHierarchical should be required only for those observables which have parameters suitable for hierarchical optimization. Until now, arCalcHierarchical has been testing these assertions for all observables, thus being overly restrictive. This commit fixes this issue.
Until now, arCalcHierarchical has been working correctly only for data series with no missing observations, even though D2D allows such data. This commit fixes this issue.
I have played a bit more with the example models shipped with D2D and discovered some issues in my code. They are already fixed. From the model "Becker_Science2010" I have learned that D2D allows repeated measurements in the data sheets (which I should have figured out earlier; see commit 7feb387), and from the model "Swameye_PNAS2003" I have learned that there may also be missing measurements in the data sheets (see commit 14319ec). I hope that now hierarchical optimization works correctly with all possible data.

Concerning the warning "Unable to prove 'some_parameter == 0'.", this is a warning raised by the symbolic function isAlways. Concerning the warning "No scale parameters found for hierarchical optimization.", this is correct, since there are no such parameters in the model "Swameye_PNAS2003". At the moment, my implementation supports hierarchical optimization only for observables of form `y = s*x`.

I suspect that this may be a problem for you. I can see that many D2D examples involve not only scales but also offsets, so you probably would like to have the hierarchical optimization generalized to cover setups involving offsets. But if it is acceptable for you to have the simple implementation as it is now, I can provide a toy model which demonstrates the use of hierarchical optimization in the present state of development. Of course, for test purposes you can get hierarchical optimization to work with the model of Swameye by manually removing the offset parameters from that model.
So, what do you think, @clemenskreutz?
I have implemented a simple form of hierarchical optimization in D2D and I would like to propose to pull this feature upstream.
The method of hierarchical optimization is described in detail in (Weber et al., 2011) and (Loos et al., 2018). Let me however illustrate this method for the following simplified parameter estimation problem:

$$\min_{p} \; J_1(p) = \sum_j \frac{(Y_j - y(p,t_j))^2}{\mathrm{std}_j^2} \qquad \text{(Eq. 1)}$$

where `Y_j` and `std_j` represent experimental measurements and associated standard deviations, given on an arbitrary scale, and `y(p,t_j)` is the model observable corresponding to the measurements `Y_j`. Assume that the observable `y` is of form `y(p,t) = s*x(r,t)`, where `x(r,t)` denotes the model species (or some derived variable) corresponding to the measurements `Y_j`, `s` is a scale parameter to be estimated along with the dynamic parameters `r`, and `p = (r,s)`. Then, the above optimization problem becomes

$$\min_{r,s} \; J_2(r,s) = \sum_j \frac{(Y_j - s\,x(r,t_j))^2}{\mathrm{std}_j^2} \qquad \text{(Eq. 2)}$$

The core observation is that the function `J_2` is quadratic and positive definite w.r.t. the scale `s`, assuming that at least some of the `x(r,t_j)` are nonzero. Thus, given parameters `r`, the optimal scale `s = s(r)` is uniquely determined and the optimization problem (Eq. 2) is equivalent to

$$\min_{r} \; J_3(r) = J_2(r, s(r)) \qquad \text{(Eq. 3)}$$

The optimal scale `s(r)` may be expressed explicitly. To do so, it is enough to resolve the condition $\nabla_{s} J_2(r,s) = 0$ w.r.t. `s`, which gives:

$$s(r) = \frac{\sum_j Y_j \, x(r,t_j) / \mathrm{std}_j^2}{\sum_j x(r,t_j)^2 / \mathrm{std}_j^2} \qquad \text{(Eq. 4)}$$

Having this, we may also explicitly compute the gradient of `J_3` w.r.t. `r` (assuming that we can compute the sensitivities of `x` w.r.t. `r`) and use it to feed any gradient-based optimization algorithm.

To sum up, thanks to the specific form of the observable `y`, namely `y = s*x`, we can replace the optimization problem (Eq. 2) with the equivalent problem (Eq. 3), which has fewer parameters to optimize. This is the method of hierarchical optimization. From my experience (and as one could expect), this can significantly improve the optimization speed. The trade-off is that we need to assume a specific form of the observable `y`. Nevertheless, I think that the case of `y = s*x` is common enough to implement hierarchical optimization for it.

Theoretically, the method outlined above may be generalized to more complex observables, e.g. `y = s*x + b`, where `b` denotes an offset parameter. Such observables perhaps cover the vast majority of practical applications. Having `y = s*x + b`, we proceed analogously as above, namely instead of (Eq. 2) we have

$$\min_{r,s,b} \; \sum_j \frac{(Y_j - s\,x(r,t_j) - b)^2}{\mathrm{std}_j^2}$$

and we need to compute both the optimal scale `s = s(r)` and the optimal offset `b = b(r)`. The problem with such a generalization is very pragmatic: the formulas for `s(r)` and `b(r)` become much more complex than in the simple case of `y = s*x` discussed above. For this reason I have implemented the hierarchical optimization just for the latter simple case and, to be honest, I think I will not implement the generalization to `y = s*x + b` soon.

Usage
There is a function `arInitHierarchical` which detects all observables of form `y = s*x` and enables the "hierarchical mode" of optimization. To get it to work, just put `arInitHierarchical` before `arFit` (or `arFitLHS`, or anything like that). There is also a function `arDisableHierarchical` which switches back to the "normal mode". So an example setup with hierarchical optimization may look like this:

The optimal scales `s(r)` are computed by the function `arCalcHierarchical`, which is called from `arCalcRes` and `arCalcSimu`. As a result, any D2D function that internally calls `arSimu` can benefit from the "hierarchical mode". For example, in the following snippet, the scale parameters (if any) will be automatically fine-tuned prior to generating the plots.
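The snippets referred to above were not preserved in this excerpt, so here is a minimal sketch of the intended workflow. The functions `arInitHierarchical`, `arDisableHierarchical`, and `arCalcHierarchical` come from this pull request; `arInit`, `arLoadModel`, `arLoadData`, `arCompileAll`, `arFit`, and `arPlot` are the usual D2D session calls, and the model and data names are placeholders:

```matlab
% Standard D2D session setup (model and data names are placeholders)
arInit;
arLoadModel('my_model');
arLoadData('my_data');
arCompileAll;

% Enable the "hierarchical mode": detect scale parameters in observables y = s*x
arInitHierarchical;

% Fit: only the dynamic parameters r are optimized numerically;
% the optimal scales s(r) are computed analytically by arCalcHierarchical
arFit;

% Plotting benefits too, since arPlot simulates the model internally:
% the scales are fine-tuned prior to generating the plots
arPlot;

% Switch back to the "normal mode" when done
arDisableHierarchical;
```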
Limitations
The implementation in this pull request strongly relies on the specific form (Eq. 2) of the optimization problem. Consequently, any D2D feature that modifies `J_2` as in (Eq. 2) into something else cannot be used with the hierarchical optimization, at least until suitable generalizations are implemented. As far as I know, these interfering features are:

- Error fitting (the errors must depend neither on `r` nor on `s`, even implicitly; cf. the comment below)
- Some features applied in `arCollectRes`

With these features enabled, the current implementation of the hierarchical optimization works incorrectly, since the optimal scales `s(r)` are in such a case different than in (Eq. 4). The function `arCalcHierarchical` raises an error if any of these features is in use. I have done my best to make the above list complete. However, I'm a newcomer to D2D, so there may be some other features that I don't know yet and that may interfere with the hierarchical optimization by modifying the merit function `J_2`. So if anybody finds that there are more features like that in D2D, please let me know.

Further development
The possible directions of generalization of the method are:

- Generalization to observables of form `y = s*x + b`, possibly with some restricting assumptions.
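Finally, the closed-form scale (Eq. 4) is easy to sanity-check numerically. Below is a small self-contained sketch (plain MATLAB, no D2D required; all data are synthetic, not part of this pull request) comparing the closed-form optimum with a brute-force minimization of `J_2` over `s`:

```matlab
% Synthetic stand-ins for the model output x(r,t_j) and measurements Y_j
t  = (0:9)';
x  = exp(-0.3*t);          % model observable without the scale
s0 = 2.5;                  % "true" scale used to generate the data
sd = 0.1*ones(size(t));    % standard deviations std_j
rng(0);
Y  = s0*x + sd.*randn(size(t));

% Closed-form optimal scale, cf. (Eq. 4)
s_opt = sum(Y.*x./sd.^2) / sum(x.^2./sd.^2);

% Brute-force check: minimize J_2 over s numerically
J2    = @(s) sum((Y - s*x).^2 ./ sd.^2);
s_num = fminsearch(J2, 1);

% s_opt and s_num agree up to the fminsearch tolerance,
% and both are close to the true scale s0
fprintf('closed-form s = %.6f, numerical s = %.6f\n', s_opt, s_num);
```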