Skip to content

Latest commit

 

History

History
171 lines (124 loc) · 6.71 KB

output.rst

File metadata and controls

171 lines (124 loc) · 6.71 KB

Output format

By default the output of the code is an HDF5 file, with filename <output>_<timestamp>_mcmc.h5

Optionally several pickle files (pickle is Python's internal object serialization module), roughly equivalent to IDL SAVE files, can be output. These may be convenient, but are not very portable.

HDF5 output

The output HDF5 file contains datasets for the input observational data and the MCMC sampling chains. A significant amount of metadata is stored as JSON in dataset attributes. Anything that could not be JSON serialized during writing will have been pickled instead, with the pickle stored as string data in place of the JSON.

The HDF5 files can read back into python using

import prospect.io.read_results as reader
filename = "<outfilestring>_<timestamp>_mcmc.h5"
results, obs, model = reader.results_from(filename)

which gives a results dictionary, the obs dictionary containing the data to which the model was fit, and the model object used in the fitting. The results dictionary contains the production MCMC chains from emcee or the chains and weights from dynesty, basic descriptions of the model parameters, and the run_params dictionary. Some additional ancillary information is stored, such as code versions, runtimes, MCMC acceptance fractions, and model parameter positions at various phases of of the code. There is also a string version of the parameter file used. The results dictionary contains the information needed to regenerate the sps object used in generating SEDs.

sps = reader.get_sps(res)

It can sometimes be difficult to reconstitute the model object if it is complicated, for example if it was built by using or referencing files or data that are no longer available. For this reason it is suggested that references to filenames in parameter files be made command-line arguments

Pickles

It is possible to output prospector results as python "pickle" files. The results pickle is a serialization of the results dictionary, and has <timestamp>_mcmc appended onto the output file string specified when the code was run, where timestamp is in UT seconds. It uses only basic scientific python types (e.g. dictionaries, lists, and numpy arrays). It should therefore be readable on any system with Python and Numpy installed. This can be accomplished with

import pickle
filename = "<outfilestring>_<timestamp>_mcmc"
with open(filename, "rb") as f:
    result = pickle.load(f)
print(result.keys())

A second pickle file containing the model object has the extension <timestamp>_model. It is a direct serialization of the model object used during fitting, and is thus useful for regenerating posterior samples of the SED, or otherwise exploring properties of the model.

However, this requires Python and a working Prospector installation of a version compatible with the one used to generate the model pickle. If that is possible, then the following code will read the model pickle:

import pickle
model_file = "<outfilestring>_<timestamp>_model"
with open(model_file, 'rb') as mf:
    mod = pickle.load(mf)
print(type(mod))

If Powell optimization was performed, this pickle also contains the optimization results (as a list of Scipy OptimizerResult objects).

Basic diagnostic plots

Several methods for visualization of the results are included in the Prospector.io.read_results module.

First, the results file can be read into useful dictionaries and objects using

import prospect.io.read_results as reader
filename = "<outfilestring>_<timestamp>_mcmc"
results, obs, model = reader.results_from(filename)

See the help for prospect.io.read_results.results_from() for a description of the returned objects.

It is often desirable to plot the parameter traces for the MCMC chains. That is, one wants to see the evolution of the parameter values as a function of MCMC iteration. This is useful to check for convergence. It can be done easily for both emcee and dynesty results by

tracefig = reader.traceplot(results)

Another useful thing is to look at the "corner plot" of the parmeters. If one has the corner.py package, then

cornerfig = reader.subcorner(results, showpars=model.theta_labels()[:5])

will return a corner plot of the first 5 free parameters of the model. If showpars is omitted then all free parameters will be plotted. There are numerous other options to the subcorner method, which is a thin wrapper on corner.py, but they are documented (help(reader.subcorner))

Finally, one often wants to look at posterior samples in the space of the data, or perhaps the maximum a posteriori parameter values. Taking the MAP as an example, this would be accomplished by

        import np

# Find the index of the maximum a posteriori sample
ind_max = results["lnprobability"].argmax()
if res["chain"].ndim > 2:
    # emcee
    walker, iteration = np.unravel_index(ind_max, results["lnprobability"].shape)
            theta_max = results["chain"][walker, iteration, :]
elif res["chain"].ndim == 2:
    # dynesty
    theta_max = results["chain"][indmax, :]

# We need the SPS object to generate a model
sps = reader.get_sps(results)
# now generate the SED for the max. a post. parameters
spec, phot, x = model.predict(theta_max, obs=obs, sps=sps)

# Plot the data and the MAP model on top of each other
import matplotlib.pyplot as pl
if obs['wave'] is None:
            wave = sps.wavelengths
else:
    wave = obs['wavelength']
pl.plot(wave, obs['spectrum'], label="Spec Data")
pl.plot(wave, spec, label="MAP model spectrum")
if obs['filters'] is not None:
    pwave = [f.wave_effective for f in obs["filters"]]
    pl.plot(pwave, obs['maggies'], label="Phot Data")
    pl.plot(pwave, phot, label="MAP model photometry")

However, if all you want is the MAP model this may be stored for you, without the need to regenerate the sps object

import matplotlib.pyplot as pl

        best = res["bestfit"]
a = model.params["zred"] + 1
pl.plot(a * best["restframe_wavelengths"], best['spectrum'], label="MAP spectrum")
if obs['filters'] is not None:
    pwave = [f.wave_effective for f in obs["filters"]]
    pl.plot(pwave, best['photometry'], label="MAP photometry")