Does BoTorch automatically do optimizations with prohibitively large data dimensions? #1921
-
Hi, everyone! I just came into contact with BoTorch. First, I'd like to sincerely thank the BoTorch contributors for supporting the hyperparameter optimization community - it is really helpful for researchers in this area. I have some confusion about high-dimensional optimization. When running the most basic modules to conduct BO, I find that the overhead does not necessarily grow with more dimensions. I tried three dimensionalities (5, 20, and 197), and the overall execution time for 200 iterations was 402 s, 294 s, and 85 s, respectively. Does this mean that BoTorch adopts automatic optimization techniques (e.g., random embeddings) when handling high dimensions, even if I don't explicitly enable them? If so, how can I turn this behavior on or off? Below is my usage of BoTorch. I run everything on CPUs, and all the X data is normalized to [0, 1].

```python
import torch
from botorch.models import SingleTaskGP
from botorch.acquisition import ExpectedImprovement
from botorch.optim import optimize_acqf
from botorch.fit import fit_gpytorch_model
from gpytorch.mlls import ExactMarginalLogLikelihood
from utils import *
class VanillaBO:
    def __init__(self, env) -> None:
        self.env = env
        self.knob_num = len(env.names)
        # Search space: the unit hypercube [0, 1]^d
        self.bounds = torch.tensor([[0.0] * self.knob_num, [1.0] * self.knob_num])

    def init_sample(self, sample_num=10):
        # Latin hypercube design for the initial points
        X_init = LHS_sample(dimension=self.knob_num, num_points=sample_num)
        Y_init = self.env.get_state(X_init).reshape(-1, 1)
        self.X_init, self.Y_init = torch.tensor(X_init), torch.tensor(Y_init)
        self.model = SingleTaskGP(self.X_init, self.Y_init)
        self.mll = ExactMarginalLogLikelihood(self.model.likelihood, self.model)

    def step(self):
        # Fit the GP hyperparameters, then maximize EI for the next candidate
        fit_gpytorch_model(self.mll)
        EI = ExpectedImprovement(self.model, best_f=self.Y_init.max())
        candidate, _ = optimize_acqf(EI, bounds=self.bounds, q=1, num_restarts=10, raw_samples=100)
        new_x = candidate.detach()
        new_y = self.env.get_state(new_x.numpy()).reshape(-1, 1)
        self.X_init = torch.cat([self.X_init, new_x])
        self.Y_init = torch.cat([self.Y_init, torch.tensor(new_y)])
        # print(self.X_init[-1, :].numpy().tolist(), self.Y_init[-1, :].numpy().tolist())
        # Rebuild the model on the augmented data
        self.model = SingleTaskGP(self.X_init, self.Y_init)
        self.mll = ExactMarginalLogLikelihood(self.model.likelihood, self.model)

    def best_knob(self):
        best_x = self.X_init[self.Y_init.argmax()]
        best_y = self.Y_init.max()
        return best_x, best_y
```

Thank you for your help.
Replies: 1 comment 1 reply
-
Since GPs are kernel methods, the main computational bottleneck for them is computing the solution to an `N x N` linear system, where `N` is the number of data points - note that this is independent of the dimension `d` (increasing `d` will make the computation of the kernel matrix more expensive, but for larger `N` the overhead of that is usually negligible).

However, what I think is going on in your setting is simply that in higher dimensions the acquisition function becomes really hard to optimize, and so … The other problem with your setup is that the fit of the …
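To see where the time actually goes, one can time the model fitting and the acquisition optimization separately. Below is a minimal sketch of such a check; the 30-point toy training set and the synthetic objective are illustrative assumptions, not from the original post:

```python
import time

import torch
from botorch.acquisition import ExpectedImprovement
from botorch.fit import fit_gpytorch_model
from botorch.models import SingleTaskGP
from botorch.optim import optimize_acqf
from gpytorch.mlls import ExactMarginalLogLikelihood

for d in (5, 20, 197):  # the three dimensionalities from the question
    # Toy data: 30 random points with a simple synthetic objective (assumption)
    train_X = torch.rand(30, d, dtype=torch.double)
    train_Y = train_X.sum(dim=-1, keepdim=True)
    model = SingleTaskGP(train_X, train_Y)
    mll = ExactMarginalLogLikelihood(model.likelihood, model)

    t0 = time.monotonic()
    fit_gpytorch_model(mll)  # GP hyperparameter fitting
    t1 = time.monotonic()

    EI = ExpectedImprovement(model, best_f=train_Y.max())
    bounds = torch.stack([torch.zeros(d), torch.ones(d)]).to(torch.double)
    optimize_acqf(EI, bounds=bounds, q=1, num_restarts=10, raw_samples=100)
    t2 = time.monotonic()

    print(f"d={d}: fit {t1 - t0:.2f}s, acqf optimization {t2 - t1:.2f}s")
```

If the high-dimensional runs report much shorter acquisition-optimization times, that is consistent with the optimizer giving up early rather than with any automatic dimensionality reduction.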
No, BoTorch does not automatically apply random embeddings or the like in high dimensions - when using BoTorch you have both the flexibility and the responsibility to specify which approaches to use.
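For instance, if one does want a model better suited to high dimensions, it has to be set up explicitly. A minimal sketch using BoTorch's fully Bayesian SAAS model is shown below; the dimension, the toy data, and the MCMC settings are all illustrative assumptions, and this is just one of several possible approaches rather than a drop-in replacement for the code above:

```python
import torch
from botorch.acquisition import qExpectedImprovement
from botorch.fit import fit_fully_bayesian_model_nuts
from botorch.models.fully_bayesian import SaasFullyBayesianSingleTaskGP
from botorch.optim import optimize_acqf

d = 197  # illustrative high-dimensional problem
# Toy data with only a few truly relevant dimensions (assumption)
train_X = torch.rand(30, d, dtype=torch.double)
train_Y = train_X[:, :3].sum(dim=-1, keepdim=True)

# The SAAS prior shrinks most inverse lengthscales toward zero, so the model
# effectively focuses on a small number of active dimensions.
model = SaasFullyBayesianSingleTaskGP(train_X, train_Y)
fit_fully_bayesian_model_nuts(  # NUTS sampling; requires the pyro-ppl package
    model, warmup_steps=256, num_samples=128, thinning=16
)

qEI = qExpectedImprovement(model, best_f=train_Y.max())
bounds = torch.stack([torch.zeros(d), torch.ones(d)]).to(torch.double)
candidate, _ = optimize_acqf(qEI, bounds=bounds, q=1, num_restarts=10, raw_samples=256)
```

Other techniques, such as explicit random embeddings, would likewise need to be wired up by the user.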