Config Documentation
This documentation is for the GalSim 2.x release line. (Features that are new in some version will be indicated as such.)
The basic configuration method is to use a dictionary which can be parsed in python. Within that structure, each field can either be a value, another dictionary which is then further parsed, or occasionally a list of items (which can be either values or dictionaries). The hierarchy can go as deep as necessary.
Our example config files are all yaml files, which are read using the executable galsim. This is a nice format for config files, but it is not required. Anything that can represent a dictionary will do. For example, the executable galsim also reads in and processes json-style config files if you prefer.
If you would like a tutorial that goes through typical uses of the config files, there is a series of demo config files in the GalSim examples directory. See Tutorials for more information.
This documentation is meant to be more of a reference once you already have the basic idea of how the config files generally work.
For a concrete example of what a config file looks like, here is demo1.yaml (the first file in the aforementioned tutorial) stripped of most of the comments to make it easier to see the essence of the structure:
gal :
    type : Gaussian
    sigma : 2  # arcsec
    flux : 1.e5  # total counts in all pixels
psf :
    type : Gaussian
    sigma : 1  # arcsec
image :
    pixel_scale : 0.2  # arcsec / pixel
    noise :
        type : Gaussian
        sigma : 30  # standard deviation of the counts in each pixel
output :
    dir : output_yaml
    file_name : demo1.fits
This file defines a dictionary, which in python would look like
config = {
    'gal' : {
        'type' : 'Gaussian',
        'sigma' : 2.,
        'flux' : 1.e5
    },
    'psf' : {
        'type' : 'Gaussian',
        'sigma' : 1.
    },
    'image' : {
        'pixel_scale' : 0.2,
        'noise' : {
            'type' : 'Gaussian',
            'sigma' : 30.
        }
    },
    'output' : {
        'dir' : 'output_yaml',
        'file_name' : 'demo1.fits'
    }
}
As you can see, there are several top level fields (gal, psf, image, and output) that define various aspects of the simulation. There are others as well that we will describe below, but most simulations will want to include at least these four.
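Since any format that can represent a dictionary will do, the same structure can also be written as JSON. The following is a minimal sketch using only Python's standard json module; the JSON text is a hand-written transcription of the demo1 fields shown above, not a file shipped with GalSim.

```python
import json

# demo1 transcribed by hand into JSON; the fields mirror the YAML version above.
demo1_json = """
{
  "gal":    {"type": "Gaussian", "sigma": 2, "flux": 1.0e5},
  "psf":    {"type": "Gaussian", "sigma": 1},
  "image":  {"pixel_scale": 0.2,
             "noise": {"type": "Gaussian", "sigma": 30}},
  "output": {"dir": "output_yaml", "file_name": "demo1.fits"}
}
"""

# Parsing yields an ordinary nested Python dictionary.
config = json.loads(demo1_json)

# Nested fields are reached by plain dictionary indexing.
top_level_fields = sorted(config)
noise_sigma = config["image"]["noise"]["sigma"]
print(top_level_fields, noise_sigma)
```

Whichever file format is used, the processing works on the resulting dictionary, so the choice between yaml and json is a matter of preference.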
Most fields have a type item that defines what the other items in the field mean. (The image and output fields here have implicit types Single and Fits, which are the default, so may be omitted.) For instance, a Gaussian surface brightness profile is defined by the parameters sigma and flux.
Most types have some optional items that take reasonable defaults if you omit them. E.g. the flux is not relevant for a PSF, so it may be omitted in the psf field, in which case the default of flux=1 is used.
At the top level, there are 6 basic fields:
- psf defines what kind of PSF profile to use.
- gal defines what kind of galaxy profile to use.
- stamp defines parameters related to building the postage stamp image of each object.
- image defines parameters related to the full images to be drawn.
- input defines any necessary input files or things that need some kind of initialization.
- output defines the names and format of the output files.
None of these are technically required, although it is an error to have neither psf nor gal. (If you don't want to draw anything but noise, you need to let GalSim know that this is intentional by using type: None for one of these.) But the most common usage would be to use psf, gal, image and output. It is not uncommon for there to be no input files, so you will often omit the input field. And sometimes you will omit the gal field to draw an image with just stars.
Most simulations will use the default stamp type (called 'Basic'), which involves drawing
a galaxy convolved by a PSF (or just a PSF image if gal is omitted) on each postage stamp,
so this field will very often be omitted as well.
We will go through each one in turn. As we do, some values will be called float_value, int_value, etc. These can either be a value directly (e.g. float_value could just be 1.5), or they can be a dict that describes how the value should be generated each time (e.g. a random number or a value read from an input catalog). See Config Values for more information about how to specify these values.
In addition each value will have one of (required) or (optional) or (default = something) to indicate whether the item is required or if there is some sensible default value. The (optional) tag usually means that the action in question will not be done at all, rather than done using some default value. Also, sometimes no item is individually required, but one of several is.
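To illustrate the two ways of giving a value, here is a small sketch in Python dictionary form. The field contents reuse types that appear elsewhere on this page (Gaussian, Sersic, Random); the helper describes_generation is a hypothetical name introduced only for this illustration and is not part of GalSim.

```python
# Every item here is a direct value.
gal_fixed = {"type": "Gaussian", "sigma": 2.0, "flux": 1.0e5}

# Here n and half_light_radius are dicts that describe how to generate
# a new value each time, while flux is still a direct value.
gal_random = {
    "type": "Sersic",
    "n": {"type": "Random", "min": 1, "max": 4},
    "half_light_radius": {"type": "Random", "min": 0.2, "max": 0.8},
    "flux": 1000,
}

def describes_generation(item):
    """Hypothetical helper: True if the item is a dict rather than a direct value."""
    return isinstance(item, dict)

for name in ("n", "half_light_radius", "flux"):
    print(name, describes_generation(gal_random[name]))
```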
The psf field defines the profile of the point-spread function (PSF). Any object type is allowed for the psf type, although some types are obviously more appropriate to use as a PSF than others. See Config Objects for a list of all the available object types.
If this field is omitted, the PSF will effectively be a delta function. I.e. the ideal galaxy surface brightness profiles will be drawn directly on the image without any convolution.
The gal field defines the profile of the galaxy. As for the psf field, any object type is allowed for the gal type, although some types are obviously more appropriate to use as a galaxy than others. See Config Objects for a list of all the available object types.
Technically, the gal field is not fundamental; its usage is defined by the stamp type. One could for instance define a stamp type that looked for a gal_set field instead that might give a list of galaxies to draw onto a single stamp. However, all of the stamp types defined natively in GalSim use the gal field, so it will be used by most users of the code.
If this field is omitted, the default stamp type = 'Basic' will draw the PSF surface brightness profiles directly according to the psf field. Other stamp types may require this field or may require some other field instead.
The stamp field defines the relevant properties and parameters of the stamp-building process. See Config Stamp for a list of all the available stamp types.
This field is often omitted, in which case the 'Basic' stamp type will be assumed.
The image field defines the relevant properties and parameters of the full image-building process. See Config Image for a list of all the available image types.
If this field is omitted, the 'Single' image type will be assumed.
The input field indicates where to find any files that you want to use in building the images or how to set up any objects that require initialization. See Config Input for a list of all the available input types.
This field is only required if you use object types or value types that use an input object. Such types will indicate this requirement in their descriptions.
The output field indicates where to write the output files and what kind of output format they should have. See Config Output for a list of all the available output types.
There are a couple of other top level fields that act more in a support role, rather than being part of the main processing.
Almost all aspects of the file building can be customized by the user if the existing GalSim types do not do precisely what you need. How to do this is described in the pages about each of the different top-level fields. In all cases, you need to tell GalSim what Python modules to load at the start of processing to get the implementations of your custom types. That is what the modules field is for.
The modules field should contain a list of modules that GalSim should import before processing the rest of the config file. These modules can be either in the current directory where you are running the code or installed in your Python distro. (Or technically, they need to be located in a directory in sys.path.)
See examples/des/meds.yaml, examples/des/blendset.yaml, and examples/great3/cgc.yaml for some examples of this field.
Sometimes it is useful to define some configuration parameters right at the top of the config file, where they are easy to see, even though they are only used somewhere farther down in the file. Or sometimes there are calculations that are needed by several different values in the config file, which you only want to calculate once.
You can put such values in a top-level eval_variables field. They work just like variables that you define for 'Eval' items, but they can be placed separately from those evaluations.
See examples/demo11.yaml, examples/des/draw_psf.yaml, and examples/great3/cgc.yaml for examples of this field.
The template field directs the config processing to first load in some other file (or a specific field within that file) and then possibly modify some components of that dict.
To load in some other config file named config.yaml, you would write
template: config.yaml
If you only want to load a particular field from that file, say the image field, you could write
template: config.yaml:image
The template field may appear anywhere in the config file. Wherever it appears, the contents of the other file will be a starting point for that part of the current config dict, but you can replace or add values by specifying new values for some of the fields. Fields that are not at the top level are specified using a . to proceed down the levels of the config hierarchy. E.g. image.noise.sky_level refers to config['image']['noise']['sky_level'].
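The dotted notation can be pictured as a walk through the nested dictionary, one level per component. The function below is a minimal sketch of that idea, not GalSim's own implementation; the name get_by_path and the numerical values are made up for illustration. As an assumption, it also treats an integer component as an index when the current level is a list.

```python
def get_by_path(config, path):
    """Follow a dotted path such as 'image.noise.sky_level' down a nested config."""
    node = config
    for part in path.split("."):
        if isinstance(node, list):
            node = node[int(part)]   # assumed: integer components index into lists
        else:
            node = node[part]        # otherwise descend one dictionary level
    return node

# Illustrative values only.
config = {
    "image": {"pixel_scale": 0.2, "noise": {"sky_level": 1.0e4}},
    "gal": {"type": "Sum", "items": [{"type": "DeVaucouleurs"}, {"type": "Exponential"}]},
}

sky = get_by_path(config, "image.noise.sky_level")
second_type = get_by_path(config, "gal.items.1.type")
print(sky, second_type)
```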
For example, if you have a simulation defined in my_sim.yaml, and you want to make another simulation that is identical, except you want Sersic galaxies instead of Exponential galaxies say, you could write a new file that looks something like this
template : my_sim.yaml
gal:
    type : Sersic
    n : { type : Random, min : 1, max: 4 }
    half_light_radius :
        template : my_sim.yaml:gal.half_light_radius
    flux : 1000
output.dir : sersic_sim
This will load in the file my_sim.yaml first, then replace the whole config['gal'] field as well as config['output']['dir'] (leaving the rest of config['output'] unchanged). The new config['gal'] field will use the same half_light_radius specification from the other file (which might be some complicated random variate that you did not want to duplicate here).
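The replace-or-add behavior described here can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions, not how GalSim implements templates: the base config is given as an in-memory dict rather than loaded from my_sim.yaml, the function name apply_overrides is hypothetical, nested template items are resolved by hand, and list indices in paths are not handled.

```python
import copy

def apply_overrides(template, overrides):
    """Return a copy of template with each (possibly dotted) key replaced or added."""
    result = copy.deepcopy(template)      # never modify the template itself
    for key, value in overrides.items():
        *parents, last = key.split(".")
        node = result
        for part in parents:
            node = node.setdefault(part, {})   # descend, creating levels if needed
        node[last] = copy.deepcopy(value)      # replace the whole value at that path
    return result

# Stand-in for the contents of my_sim.yaml (values are made up).
my_sim = {
    "gal": {"type": "Exponential", "half_light_radius": 1.2, "flux": 500},
    "image": {"pixel_scale": 0.2},
    "output": {"dir": "my_sim", "file_name": "sim.fits"},
}

overrides = {
    "gal": {
        "type": "Sersic",
        "n": {"type": "Random", "min": 1, "max": 4},
        # Reuse the template's specification, resolved by hand in this sketch.
        "half_light_radius": my_sim["gal"]["half_light_radius"],
        "flux": 1000,
    },
    "output.dir": "sersic_sim",
}

new_config = apply_overrides(my_sim, overrides)
print(new_config["gal"]["type"], new_config["output"])
```

Note that the whole gal field is replaced, while output.dir changes only one item and leaves output.file_name as it was.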
If the template field is not at the top level of the config dict, the adjustments should be made relative to that level of the dictionary:
psf :
    template: cgc.yaml:psf
    index_key : obj_num
    items.0.ellip.e.max : 0.05
    items.1.nstruts : 1
    items.1.strut_angle : { type : Random }
Note that the modifications do not start with psf., since the template processing is being done within the psf field.
Finally, if you want to use a different field from the current config dict as a template, you can use the colon notation without the file. For example, to have a bulge plus disk that have the same kinds of parameters, except that the overall types are DeVaucouleurs and Exponential respectively, you could do:
gal:
    type: Sum
    items:
        -
            type: DeVaucouleurs
            half_light_radius: { type: Random, min: 0.2, max: 0.8 }
            flux: { type: Random, min: 100, max: 1000 }
            ellip:
                type: Eta1Eta2
                eta1: { type: RandomGaussian, sigma: 0.2 }
                eta2: { type: RandomGaussian, sigma: 0.2 }
        -
            template: :gal.items.0
            type: Exponential
This would generate different values for the size, flux, and shape of each component. But the way those numbers are drawn would be the same for each.
See examples/great3/rgc.yaml and examples/great3/cgc_psf.yaml for examples of this feature.
The normal way to run a GalSim simulation using a config file is galsim config.yaml, where config.yaml is the name of the config file to be parsed. For instance, to run demo1 (given above), you would type
galsim demo1.yaml
Sometimes it is convenient to be able to change some of the configuration parameters from the command line, rather than edit the config file. For instance, you might want to make a number of simulations that are nearly identical but differ in one or two specific attributes.
To enable this, you can provide the changed (or new) parameters on the command line after the name of the config file. E.g. to make several simulations that are identical except for the flux of the galaxy and the output file, one could do:
galsim demo1.yaml gal.flux=1.e4 output.file_name=demo1_1e4.fits
galsim demo1.yaml gal.flux=2.e4 output.file_name=demo1_2e4.fits
galsim demo1.yaml gal.flux=3.e4 output.file_name=demo1_3e4.fits
galsim demo1.yaml gal.flux=4.e4 output.file_name=demo1_4e4.fits
Notice that the . is used to separate levels within the config hierarchy. So gal.flux represents config['gal']['flux'].
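When the list of variations grows, it can be easier to generate such command lines from a short script than to type them out. The following sketch only builds the four commands shown above as strings and does not run anything; passing each argument list to subprocess.run would be one way to execute them.

```python
import shlex

# Flux values written exactly as they should appear on the command line.
flux_values = ["1.e4", "2.e4", "3.e4", "4.e4"]

commands = []
for flux in flux_values:
    tag = flux.replace(".", "")          # "1.e4" -> "1e4" for the file name
    args = [
        "galsim",
        "demo1.yaml",
        f"gal.flux={flux}",
        f"output.file_name=demo1_{tag}.fits",
    ]
    commands.append(shlex.join(args))    # quote safely for a shell (Python 3.8+)

for command in commands:
    print(command)
```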
For large simulations, one will typically want to split the job up into multiple smaller jobs, each of which can be run on a single node or core. The natural way to split this up is by parceling some number of output files into each sub-job. We make this splitting very easy using the command line options -n and -j. The total number of jobs you want should be given with -n, and each separate job should be given a different -j. So to divide a run across 5 machines, you would run one of the following commands on each of the 5 different machines (or more typically send these 5 commands as jobs in a queue system).
galsim config.yaml -n 5 -j 1
galsim config.yaml -n 5 -j 2
galsim config.yaml -n 5 -j 3
galsim config.yaml -n 5 -j 4
galsim config.yaml -n 5 -j 5
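To give a feel for what parceling output files into sub-jobs means, here is a sketch of one simple partitioning rule: contiguous blocks of file indices whose sizes differ by at most one. The rule and the name files_for_job are assumptions made for this illustration; the exact assignment GalSim makes may differ.

```python
def files_for_job(nfiles, njobs, job):
    """Return the (0-based) output file indices handled by job number `job`.

    Assumed rule for illustration: contiguous, nearly equal blocks.
    Job numbers run from 1 to njobs inclusive, as with -j.
    """
    if not 1 <= job <= njobs:
        raise ValueError("job must be in [1, njobs]")
    start = nfiles * (job - 1) // njobs
    end = nfiles * job // njobs
    return list(range(start, end))

# Example: 12 output files split across 5 jobs.
assignments = {job: files_for_job(12, 5, job) for job in range(1, 6)}
for job, files in assignments.items():
    print(f"-n 5 -j {job}: files {files}")
```

With a rule like this, every output file belongs to exactly one job, so the jobs can run independently and in any order.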
There are a few other command line options that we describe here for completeness.
- -h or --help gives the help message. This is really the definitive information about the galsim executable, so if that message disagrees with anything here, you should trust that message over what is written here.
- -v {0,1,2,3} or --verbosity {0,1,2,3} sets how verbose the logging output should be. The default is -v 1, which provides some modest amount of output about each file being built. -v 2 gives more information about the progress within each output file, including one line of information about each object that is drawn. -v 3 (debug mode) gives a lot of output and should be reserved for diagnosing runtime problems. -v 0 turns off all logging output except for error messages.
- -l LOG_FILE or --log_file LOG_FILE gives a file name for writing the logging output. If omitted, the default is to write to stdout.
- -f {yaml,json} or --file_type {yaml,json} defines what type of configuration file to parse. The default is to determine this from the file name extension, so it is not normally needed, but if you have non-standard file names, you might need to set this.
- -m MODULE or --module MODULE gives a python module to import before parsing the config file. This has been superseded by the modules top level field, which is normally more convenient. However, this option is still allowed for backwards compatibility.
- -p or --profile turns on profiling information that gets output at the end of the run (or when multi-processing, at the end of execution of a process). This can be useful for diagnosing where a simulation is spending most of its computation time.
- -n NJOBS or --njobs NJOBS sets the total number of jobs that this run is a part of. Used in conjunction with -j (--job).
- -j JOB or --job JOB sets the job number for this particular run. Must be in [1,njobs]. Used in conjunction with -n (--njobs).
- -x or --except_abort aborts the whole job whenever any file raises an exception rather than continuing on. (new in version 1.5)
- --version shows the version of GalSim.