SingleTaskGP versus MultiTaskGP #1771
Replies: 11 comments 6 replies
-
By that we mean that the observations are not of what is sometimes called a "block design", where each output is observed at the same location - i.e. if you evaluate
Yes, that makes sense! Are your observations noisy? Or are they deterministic? If they are deterministic AND your data has the block design described above, there isn't actually any benefit from modeling this with a multi-task model (this is a property called autokrigeability).
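To make the autokrigeability point concrete, here is a minimal NumPy sketch (not BoTorch code). It assumes a separable multi-task covariance of the form B ⊗ K, noise-free observations, and a block design; the kernel, lengthscale, toy data, and the inter-task matrix `B` are all illustrative assumptions. Under those assumptions the joint multi-task posterior mean coincides with the one from independent per-output GPs, whatever `B` is.

```python
import numpy as np

def rbf(a, b, lengthscale=0.2):
    """Squared-exponential kernel between two 1-D arrays of inputs."""
    d = a[:, None] - b[None, :]
    return np.exp(-0.5 * (d / lengthscale) ** 2)

# Block design: both outputs are observed at the same 5 inputs, without noise.
X = np.linspace(0.0, 1.0, 5)
Y = np.column_stack([np.sin(3 * X), np.cos(3 * X) + 0.5 * X])  # shape (5, 2)
x_star = np.array([0.4])

K = rbf(X, X)            # (5, 5) covariance between training inputs
k_star = rbf(x_star, X)  # (1, 5) cross-covariance to the test input
B = np.array([[1.0, 0.9], [0.9, 1.0]])  # assumed inter-task covariance

# Joint multi-task posterior mean; observations are stacked task by task.
y_vec = Y.T.reshape(-1)
joint_mean = np.kron(B, k_star) @ np.linalg.solve(np.kron(B, K), y_vec)

# Independent GPs, one per output, sharing the same input kernel.
indep_mean = (k_star @ np.linalg.solve(K, Y)).ravel()

print(joint_mean, indep_mean)
```

The two printed vectors agree up to numerical error, because the `B` factor cancels when the covariance is inverted.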
-
Thanks a lot for your reply. In my case, the observations are deterministic. And they have the block design, i.e., for each So far, I have been using
-
Hmm, if the observations are deterministic, why infer a noise level (that you know is zero)? You can instead use a Again, if you have deterministic observations AND your observations are of a block design, then there is no need to use a multi-task model (which is computationally a lot more expensive).
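As a rough illustration of why fixing the noise near zero suits deterministic data, the following NumPy sketch (again not BoTorch code; the kernel, lengthscale, toy function, and noise values are assumptions chosen for illustration) compares the GP posterior mean at the training inputs under a tiny fixed noise and under a larger noise level of the kind an inferred-noise model might settle on.

```python
import numpy as np

def rbf(a, b, lengthscale=0.2):
    """Squared-exponential kernel between two 1-D arrays of inputs."""
    d = a[:, None] - b[None, :]
    return np.exp(-0.5 * (d / lengthscale) ** 2)

X = np.linspace(0.0, 1.0, 5)
y = np.sin(3 * X)  # deterministic (noise-free) observations

def posterior_mean_at_train(noise_var):
    """GP posterior mean evaluated at the training inputs themselves."""
    K = rbf(X, X)
    return K @ np.linalg.solve(K + noise_var * np.eye(len(X)), y)

exact = posterior_mean_at_train(1e-6)    # noise fixed to (almost) zero
smoothed = posterior_mean_at_train(0.1)  # a non-zero noise level

# Largest deviation from the observed values in each case.
err_exact = np.abs(exact - y).max()
err_smoothed = np.abs(smoothed - y).max()
print(err_exact, err_smoothed)
```

With the noise fixed near zero the model reproduces the observations almost exactly, whereas a non-zero noise level shrinks the fit away from data that is known to be exact.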
-
Thanks a lot for your inputs. But how do you suggest we capture the correlation between our output quantities in this case, without using a
-
You won't get any benefit from capturing the correlation in this case, because of autokrigeability.
-
Thank you. I also wanted to clarify one more thing with respect to the observations being deterministic, as I believe I misunderstood the context of the term. We actually have noise in the
-
Yes, if you have multiple results on the same
-
And if there is indeed noise, then it may be beneficial to use a multi-task GP model (how large that benefit is will depend on the level of the noise / signal-to-noise ratio).
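A small NumPy sketch of this effect, under the same kind of assumptions as before (separable covariance B ⊗ K, block design, illustrative kernel, toy data, and inter-task matrix `B`; none of it is BoTorch code): once a noise variance is added, the joint multi-task posterior mean no longer matches the independent per-output one, and the gap vanishes again as the noise goes to zero.

```python
import numpy as np

def rbf(a, b, lengthscale=0.2):
    """Squared-exponential kernel between two 1-D arrays of inputs."""
    d = a[:, None] - b[None, :]
    return np.exp(-0.5 * (d / lengthscale) ** 2)

X = np.linspace(0.0, 1.0, 5)
Y = np.column_stack([np.sin(3 * X), np.cos(3 * X) + 0.5 * X])  # treated as noisy data
x_star = np.array([0.4])

K = rbf(X, X)
k_star = rbf(x_star, X)
B = np.array([[1.0, 0.9], [0.9, 1.0]])  # assumed inter-task covariance (unit diagonal)

def joint_mean(noise_var):
    """Multi-task posterior mean at x_star for both outputs."""
    cov = np.kron(B, K) + noise_var * np.eye(B.shape[0] * len(X))
    return np.kron(B, k_star) @ np.linalg.solve(cov, Y.T.reshape(-1))

def indep_mean(noise_var):
    """Posterior mean at x_star from one independent GP per output."""
    cov = K + noise_var * np.eye(len(X))
    return (k_star @ np.linalg.solve(cov, Y)).ravel()

gap_noisy = np.abs(joint_mean(0.1) - indep_mean(0.1)).max()
gap_tiny = np.abs(joint_mean(1e-10) - indep_mean(1e-10)).max()
print(gap_noisy, gap_tiny)
```

The size of `gap_noisy` depends on the noise variance relative to the signal, which is the signal-to-noise dependence mentioned above.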
-
Hello @Balandat , In continuation of the same problem, I was wondering if In the earlier query, I wanted to model each of the 4 outputs as a separate task. But currently, I am exploring an option where I group 3 outputs into 1 task and have the other output as another task. I have defined the task indices using I then define the data and pass it to the model as:
This gives me an error message which says: Any leads would be appreciated!
-
Thank you! So, I see that you have now defined
and try to see the predictions using
I only see two outputs, which is one per task. But I would ideally like to see Does multi-task prevent us from doing so?
-
The data typically looks like this: I've just shown a small but fair representation of what the dataset looks like, and marked the columns containing the inputs ( Here's the relevant portion of the code:
I hope this is clear enough. Please let me know if there's any other information or code I should provide for you to be able to take a closer look. Thanks!
-
Issue description
Hello!
We are currently using botorch to train a multi-output GP model on our data. Let's say, the GP model is trying to fit the function f on our dataset [Y=f(X)], where Y is a 4-dimensional vector of the output, i.e., Y = [Y1, Y2, Y3, Y4]. Similarly, X is a 3-dimensional vector of the input, i.e., X = [X1, X2, X3].
We also note that the output data is correlated.
In this regard, we have two questions:
We would like to know what "different training data" here actually means.
SingleTaskGP makes more sense (going by the documentation). However, SingleTaskGP does not capture the correlations between outputs. As a result, we would like to check whether the following approach is feasible: we would like to use MultiTaskGP by predicting each output in a different task (hence 4 tasks). By doing this, we would be able to leverage the ability of MultiTaskGP to capture correlation across tasks, thereby capturing correlation between outputs.
We would like to hear the thoughts of the botorch community on these questions. Thanks!
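For reference, one way the "one task per output" idea could be laid out in data terms is sketched below with NumPy. The array names, sizes, and random placeholder values are hypothetical and not taken from the actual dataset; the sketch only shows how a wide table with 3 inputs and 4 outputs can be reshaped into a long format where each row carries a task index.

```python
import numpy as np

n, d, num_tasks = 6, 3, 4  # hypothetical sizes: 6 points, 3 inputs, 4 outputs
rng = np.random.default_rng(0)
X = rng.random((n, d))          # inputs [X1, X2, X3] (placeholder values)
Y = rng.random((n, num_tasks))  # outputs [Y1, Y2, Y3, Y4] (placeholder values)

# Task index for every long-format row: n zeros, then n ones, and so on.
task_idx = np.repeat(np.arange(num_tasks), n)

# Repeat the input rows once per task and append the task index as last column.
X_long = np.column_stack([np.tile(X, (num_tasks, 1)), task_idx])  # (n * 4, d + 1)

# Stack the output columns task by task into a single column.
Y_long = Y.T.reshape(-1, 1)  # (n * 4, 1)

print(X_long.shape, Y_long.shape)
```

Converted to torch tensors, `X_long` and `Y_long` follow the layout that MultiTaskGP takes together with `task_feature=-1`, i.e. the task index stored in the last input column.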