Replies: 7 comments
-
I think this should work (@saitcakmak knows more about the ins and outs of dimension-altering transforms though). An alternative would be a custom kernel that internally computes the predictions B and C and then evaluates a kernel on the augmented feature set, though the transform approach seems less invasive.

How expensive are the predictions of B and C? If they aren't super cheap, one thing to consider is whether you want to pay the cost of repeatedly predicting the same things during model training. An alternative could be to bulk-predict B and C for all the training data, fit a standard model on that full augmented feature set, and then use a simple adapter for evaluating that model, which augments the features outside the model right before calling it.

One question I have is whether there is uncertainty associated with the predictions of B and C, or whether these are deterministic. In the former setting, one may want to think a bit more about how to properly propagate that uncertainty, since the predictions now enter as features (this is related to robust BO, something also in @saitcakmak's territory).
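The bulk-predict-plus-adapter idea could look roughly like this (a minimal pure-`torch` sketch; `predict_b`, `predict_c`, and `model_a` are hypothetical placeholders for the auxiliary predictors and the fitted downstream model):

```python
import torch

# Hypothetical auxiliary predictors for properties B and C.
predict_b = lambda X: X.sum(dim=-1, keepdim=True)
predict_c = lambda X: X.mean(dim=-1, keepdim=True)

def augment(X):
    """Append the B/C predictions as extra feature columns."""
    return torch.cat([X, predict_b(X), predict_c(X)], dim=-1)

# 1) Bulk-predict once for the training data and fit a standard model
#    on the augmented feature set (fitting itself elided here).
train_X = torch.rand(20, 3)
train_X_aug = augment(train_X)  # 20 x 5

# 2) A thin adapter that augments the features right before calling the
#    fitted model, so everything outside it only sees the 3 original
#    features.
def adapted(model, X):
    return model(augment(X))

model_a = lambda Z: Z.sum(dim=-1)  # placeholder for the fitted model
test_X = torch.rand(4, 3)
print(adapted(model_a, test_X).shape)  # torch.Size([4])
```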
-
Hi Max, thanks for your quick response.

Concerning your idea of bulk prediction: this should definitely be possible for pure predictions. But when performing an actual optimization, the proposed adapter also has to be available inside the acquisition function, right? Because the optimization is performed only on the original features, B and C are not known there and would have to be computed at every step of the acquisition function optimization.

Concerning uncertainties: for our use case both are possible, deterministic models or models with uncertainty like GPs. For the original implementation, we wanted to neglect the uncertainty of the latent-space models, since we did not know how to incorporate it. But if you have an idea, we are very open to it.

Best, Johannes
-
Correct, you'd still have to do this for acquisition function optimization. So if that's the bulk of the compute then there isn't really a big benefit to doing the bulk predictions.
We've done similar things in the context of robust optimization. A reasonably straightforward way would be to do this via MC sampling: basically, rather than using the mean prediction, draw a number of samples from the B/C model posteriors, use those as inputs, and then marginalize across these samples after computing the prediction/acquisition value on the sample level. See this tutorial for a related example where the perturbations come from a known noise level of the inputs. Unless the uncertainty of these predictions is quite large, it's probably best to start with the simpler setup and go from there if propagating this uncertainty turns out to be important.
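In code, the MC marginalization might look like this (a hypothetical sketch: `sample_bc` stands in for drawing joint samples from the B/C model posteriors, and the objective is averaged over the sample dimension):

```python
import torch

def sample_bc(X, n_samples):
    """Stand-in for drawing n_samples from the B/C model posteriors.
    Returns a tensor of shape n_samples x n x 2 (one column per latent
    property); here just deterministic predictions plus small noise."""
    mean = torch.stack([X.sum(-1), X.mean(-1)], dim=-1)  # n x 2
    return mean + 0.01 * torch.randn(n_samples, *mean.shape)

def mc_objective(X, f, n_samples=64):
    """Augment X with sampled B/C features, evaluate f per sample,
    then marginalize (average) over the sample dimension."""
    bc = sample_bc(X, n_samples)                            # s x n x 2
    X_aug = torch.cat([X.expand(n_samples, *X.shape), bc], dim=-1)
    vals = f(X_aug)                                         # s x n
    return vals.mean(dim=0)                                 # n

X = torch.rand(8, 3)
f = lambda Z: Z.sum(dim=-1)
print(mc_objective(X, f).shape)  # torch.Size([8])
```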
-
Hi @jduerholt. Sorry about the late response here! I think extending the `InputTransform` class as you propose should work. For handling uncertainty in Model B/C, you can probably leverage the `InputPerturbation` transform.
-
Hi @saitcakmak, thanks for your response. I will give it a try in the next month and let you know how it goes ;)
-
Hi @saitcakmak, I just looked at the `InputPerturbation` transform. One question: does it always expand the q-batch dimension?
-
The q-batch dimension is only expanded when the transform is actually applied, which by default is in eval mode (i.e., during posterior evaluation), not during training.
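The shape mechanics can be illustrated without `botorch` (a pure-`torch` sketch of how a perturbation-style transform with `n_w` samples per point expands the q-batch dimension):

```python
import torch

def expand_q_batch(X, n_w):
    """Repeat each q-batch point n_w times so that per-sample features
    can be attached: batch_shape x q x d -> batch_shape x (q * n_w) x d."""
    return X.unsqueeze(-2).expand(*X.shape[:-1], n_w, X.shape[-1]).reshape(
        *X.shape[:-2], X.shape[-2] * n_w, X.shape[-1]
    )

X = torch.rand(2, 4, 3)  # batch of 2, q = 4, d = 3
X_exp = expand_q_batch(X, n_w=5)
print(X_exp.shape)  # torch.Size([2, 20, 3])
```

Downstream, the objective values computed on the expanded q-batch would be averaged per group of `n_w` samples to marginalize out the perturbations.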
-
Hi botorch developers,

I was wondering what the best way is to realize the transfer learning scenario, depicted in the example below, in `botorch` (a more detailed discussion can be found in https://arxiv.org/abs/1711.05099). In this case, the final target is to optimize property A in a BO scenario. In addition, predictive models for latent properties B and C are available, and properties B and C are helpful features for inferring target property A. The workflow would be to first predict B and C, append them to the original features, and feed everything into a GP to predict property A. What would be the best way to do this in `botorch`?

My idea would be to implement a new `InputTransform` called `PredictiveInputTransform`, which receives an instantiated `torch`-based model in its `__init__` method. This model is then executed in the `transform` method, and its predicted property is appended to the initial `X` matrix. Using the `ChainedInputTransform`, several of these `PredictiveInputTransform`s can be chained after each other to realize the depicted scenario.

What do you think?
Best,
Johannes
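Independent of `botorch`'s transform machinery, the proposed chaining could be sketched as two augmentation steps followed by the final predictor (all model names here are hypothetical placeholders):

```python
import torch

def append_prediction(X, model):
    """One 'PredictiveInputTransform'-style step: append the model's
    prediction as an extra feature column."""
    return torch.cat([X, model(X)], dim=-1)

# Hypothetical latent-property models: B sees the original features,
# C additionally sees the predicted B because the steps are chained.
model_b = lambda X: X.sum(dim=-1, keepdim=True)
model_c = lambda X: X.mean(dim=-1, keepdim=True)

X_orig = torch.rand(10, 3)
X_b = append_prediction(X_orig, model_b)  # 10 x 4: features + B
X_bc = append_prediction(X_b, model_c)    # 10 x 5: features + B + C
# X_bc would then be fed to the GP that predicts target property A.
print(X_bc.shape)  # torch.Size([10, 5])
```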