
Draft: improve hotstart by not using the auxiliary variable #156

Draft · jacopok wants to merge 4 commits into master

Conversation

@jacopok (Contributor) commented Nov 27, 2024

This is the first draft of a way to improve the hotstart transformation, solving #155.

Instead of an auxiliary variable, it implements a single transformation, computed from the marginal distributions of the posterior samples: each marginal CDF is represented as a cubic "PCHIP" spline, which is guaranteed to be monotonic.

This is what the new transform looks like when computed for a posterior guess with width $10^{-2}$ and centered around 0.3.

[figure: new_transform]

The auxiliary logweight is computed from the derivative of the spline, which is analytic.
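
For concreteness, here is a minimal sketch of the two ingredients (with made-up sample values and names, not the actual PR code):

    # A minimal illustration: a monotonic PCHIP spline through the empirical
    # marginal CDF of the posterior guess samples, and its analytic derivative,
    # which gives the density used for the auxiliary logweight.
    import numpy as np
    from scipy.interpolate import PchipInterpolator
    from scipy.stats import norm

    # hypothetical 1-d posterior guess samples in the unit cube
    upoints = norm(0.3, 1e-2).rvs(size=4000)

    x = np.concatenate(([0.0], np.sort(upoints), [1.0]))  # support padded with the unit-cube edges
    cdf_levels = np.linspace(0.0, 1.0, x.size)             # empirical CDF levels

    cdf = PchipInterpolator(x, cdf_levels)   # monotonic cubic interpolant of the marginal CDF
    pdf = cdf.derivative()                   # analytic derivative of the spline: the marginal density

    theta = 0.305
    logweight = -np.log(pdf(theta))          # corrects for sampling from the guess instead of the flat prior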

The for loop in the get_aux_splines function is there to ensure that the compression factor is always at least one: if the tails of the posterior guess PDF approach zero, we do not need to capture that, and can restrict ourselves to the "interior" part of the PDF, where its value is greater than the effective PDF outside of the compression region.
Essentially, the idea is to avoid very low-probability regions at the edges of the posterior guess.
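
A rough, hypothetical sketch of that loop (the actual get_aux_splines logic differs in the details):

    # Iteratively shrink the "interior" region until the guess density inside it
    # everywhere exceeds the effective (uniform) density assigned outside of it,
    # so the transform never expands (compression >= 1).
    import numpy as np

    def trim_interior(grid, pdf_values, n_iterations=20):
        """grid, pdf_values: the posterior-guess density evaluated on a grid covering [0, 1]."""
        level = 0.0
        for _ in range(n_iterations):
            inside = pdf_values >= level
            inside_mass = np.trapz(np.where(inside, pdf_values, 0.0), grid)
            outside_length = np.trapz(np.where(inside, 0.0, 1.0), grid)
            # leftover probability spread uniformly over the excluded tails:
            # the "effective PDF outside of the compression region"
            new_level = (1.0 - inside_mass) / max(outside_length, 1e-12)
            new_level = min(max(new_level, 0.0), pdf_values.max())  # keep at least the peak
            if np.isclose(new_level, level):
                break
            level = new_level
        return grid[inside][0], grid[inside][-1]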

I think there is still some tuning to do, and certainly some documentation to write; I wanted to start the PR early to make it clear what I'm doing.

@JohannesBuchner (Owner) commented

Thanks for looking into this.

We should have a few benchmarks to check that no biases are introduced, and to measure the speed-up.

I would suggest using the ones from https://github.com/JohannesBuchner/space-of-inference-spaces/blob/main/image/problems.py

  • Asymmetric Gaussian (get_asymgauss(4), get_asymgauss(16))
  • Funnel (get_stdfunnel4(1, gamma=0.95), get_stdfunnel4(9, gamma=0.95))
  • Beta distribution (get_beta(2), get_beta(10))
  • Rosenbrock (get_rosenbrock(2), get_rosenbrock(20))

@jacopok (Contributor, Author) commented Nov 27, 2024

Definitely! There is a large degree of freedom in choosing which posterior guess to use; I think we should compare

  • regular sampling
  • hotstart sampling with a "good" guess
  • hotstart sampling with a "bad" guess

but it's tricky to define what those mean in general. For the spike-slab case, I'd use Gaussians with variances comparable to the spike ("bad") and the slab ("good").
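
For instance, something along these lines (all numbers made up):

    # "Good" vs "bad" posterior guesses for a spike-and-slab target: Gaussian
    # guess samples whose width is comparable to the broad slab or the narrow spike.
    import numpy as np

    rng = np.random.default_rng(42)
    center, spike_width, slab_width = 0.5, 1e-3, 1e-1
    good_guess = np.clip(rng.normal(center, slab_width, size=(1000, 4)), 0, 1)   # width comparable to the slab
    bad_guess = np.clip(rng.normal(center, spike_width, size=(1000, 4)), 0, 1)   # width comparable to the spike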

@JohannesBuchner (Owner) commented

Yeah, I think we should aim for much-improved performance for a Gaussian-ish posterior shape, and no bias (with whatever performance) for all other cases.

The former should be quite easy in high-information-gain analyses, such as asymgauss.

An "optimistic" guess may be based on 3-sigma quantiles of the marginals, while a "cautious" guess could be based on 10-sigma-equivalent quantiles (or on the full interval, if the zoom interval is not much smaller than the full interval).
I guess one could also try taking the posterior from a Metropolis-Hastings MCMC run or via an optimizer run:

from snowline import ReactiveImportanceSampler
sampler = ReactiveImportanceSampler(paramnames, loglike, transform)
sampler.laplace_approximate(num_global_samples=1000, verbose=True)

which stores the Laplace Approximation into: sampler.invcov, sampler.cov, sampler.optu, sampler.optp, sampler.optL. But yeah, such heuristics will not always give good results.
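
A hypothetical sketch of the sigma-equivalent quantile heuristic (the function name and the fallback threshold are made up):

    # "Optimistic" zoom regions from 3-sigma quantiles of the marginals, "cautious"
    # ones from 10-sigma-equivalent quantiles, falling back to the full unit
    # interval when the zoom region would not be much smaller than it.
    import numpy as np
    from scipy.stats import norm

    def zoom_region(upoints, nsigma=3.0, min_shrinkage=0.5):
        """upoints: (nsamples, ndim) posterior guess samples in the unit cube."""
        lo, hi = np.quantile(upoints, [norm.cdf(-nsigma), norm.cdf(nsigma)], axis=0)
        keep_full = (hi - lo) > min_shrinkage    # zoom would not gain much: keep the full interval
        lo = np.where(keep_full, 0.0, np.clip(lo, 0.0, 1.0))
        hi = np.where(keep_full, 1.0, np.clip(hi, 0.0, 1.0))
        return lo, hi

    # optimistic_lo, optimistic_hi = zoom_region(samples, nsigma=3.0)
    # cautious_lo, cautious_hi = zoom_region(samples, nsigma=10.0)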

@jacopok (Contributor, Author) commented Dec 10, 2024

Quick progress update: after some trial and error I agree that a KDE is better.
I have an implementation that constructs splines for the cumulative distribution and the density, also parameterized by the fraction of prior mass one wants the guess to "take up", ranging from 0 to 1; here is what it looks like:

[figure: density_cumulative_by_original_fraction]

Now I need to find a way to have two interpolants for the inverse cumulative distribution and the probability density which correspond exactly (since the inverse of a generic cubic spline is not a cubic spline anymore). It needs to be fast since it will be evaluated at every iteration; maybe a root solver is OK but I'd prefer something more direct.

Edit: Or, just do piecewise linear for the cumulative and piecewise constant for the derivative - we don't need more precision than that!
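
A minimal sketch of that piecewise-linear idea (hypothetical knot values, not the PR code):

    # The inverse of a piecewise-linear CDF is again piecewise linear, so np.interp
    # works in both directions, and the matching density is piecewise constant.
    import numpy as np

    # knots of the interpolant: positions in [0, 1] and the CDF values there
    x_knots = np.array([0.0, 0.25, 0.29, 0.31, 0.35, 1.0])
    cdf_knots = np.array([0.0, 0.05, 0.30, 0.70, 0.95, 1.0])

    def cdf(x):
        return np.interp(x, x_knots, cdf_knots)

    def inverse_cdf(u):
        # exact inverse of the piecewise-linear CDF, again piecewise linear
        return np.interp(u, cdf_knots, x_knots)

    def pdf(x):
        # piecewise-constant slope, exactly consistent with the CDF above
        slopes = np.diff(cdf_knots) / np.diff(x_knots)
        segment = np.clip(np.searchsorted(x_knots, x, side="right") - 1, 0, slopes.size - 1)
        return slopes[segment]

    u = np.linspace(0.01, 0.99, 5)
    theta = inverse_cdf(u)
    logweight = -np.log(pdf(theta))   # analytic correction, cheap to evaluate at every iteration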

@JohannesBuchner (Owner) commented

I guess another parameter one could consider is a "temperature", as is common for simulated annealing, for flattening the posterior distribution $\pi' = \pi^\beta$ with $\beta \in [0, 1]$, or alternatively flattening by convolution (e.g. broadening the Gaussians of the KDEs).
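
A hypothetical sketch of what that knob could look like on a gridded density:

    # Flatten a density evaluated on a grid via pi' proportional to pi**beta with
    # beta in [0, 1] and renormalize; beta = 0 recovers the flat prior, beta = 1
    # keeps the guess unchanged.
    import numpy as np

    def flatten_density(grid, pdf_values, beta):
        flattened = pdf_values ** beta
        return flattened / np.trapz(flattened, grid)

    # the convolution alternative would instead broaden the kernels, e.g. by passing
    # a larger bandwidth factor to scipy.stats.gaussian_kde(samples, bw_method=...)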

@JohannesBuchner (Owner) commented

I guess it is the user's responsibility to prepare the auxiliary probability distribution, though we can give some suggestions in examples.

@jacopok (Contributor, Author) commented Dec 12, 2024

Yes, but I think it's nice to give them easy "knobs to tweak": temperature and original fraction seem good to me.
Temperature alone is a bit dangerous, though, since it can still assign zero probability to some regions if they lie outside the KDE bounds.

I'm still checking things but the performance on those test problems seems very promising - as far as I can tell, no biases and sometimes huge speedups.

@jacopok (Contributor, Author) commented Dec 13, 2024

These are the results so far; the rest are still taking a long time with ultranest-safe, and I'll update once they're done.

The posteriors should also be checked, but I haven't opened that can of worms yet.

The sampler configuration, in autosampler.py from the space-of-inference-spaces, is:

    elif samplername == 'ultranest-snowline':
        from ultranest import ReactiveNestedSampler, hotstart
        from snowline import ReactiveImportanceSampler
        
        # make snowline sample the unscaled posterior, so that we can 
        # use the u-points (in [0, 1]) it outputs directly
        if vectorized:
            # snowline doesn't take vectorized likelihoods
            def log_post_unscaled(x):
                return loglike(transform(x[np.newaxis, :]))[0]
        else:
            def log_post_unscaled(x):
                return loglike(transform(x))
        
        snowline_sampler = ReactiveImportanceSampler(param_names, log_post_unscaled, transform=lambda x: np.copy(x))
        
        laplace_approx = snowline_sampler.laplace_approximate(num_global_samples=1000, verbose=True)

        aux_param_names, aux_loglike, aux_transform, vectorized = \
            hotstart.get_auxiliary_contbox_parameterization(
                param_names, loglike, transform, upoints=laplace_approx['samples'],
                vectorized=vectorized, volume_contribution=0.5)

        sampler = ReactiveNestedSampler(
            aux_param_names, 
            aux_loglike, 
            transform=aux_transform, 
            log_dir=log_dir, 
            resume=True, 
            vectorized=vectorized, 
            ndraw_max=10000000
        )
        ultranest_run(sampler, log_dir, max_ncalls)

The speedups are computed by timing the evaluation of everything, snowline included.

[figure: ultranest_results]

@JohannesBuchner (Owner) commented

That's awesome!

@jacopok (Contributor, Author) commented Dec 16, 2024

Things are running very slowly on the high-dimensional problems with ultranest-safe (on the order of days, and some runs were truncated due to too many likelihood evaluations); do you think using a run with a step sampler would make sense?
I'd use the same sampler for the aux and regular case, of course.

There are many options in autosampler.py - do you have a suggestion as to which one is a good "standard" setting?

@JohannesBuchner (Owner) commented

Yes, continuing with a step sampler makes sense. This test is about compression anyway.

Default args should be fine.

Previously, I used ultranest-safe where feasible and ultranest-fast-fixed4d otherwise; see the problems*.txt files in https://github.com/JohannesBuchner/space-of-inference-spaces/tree/main
