forked from diegojco/CMIP5_ESGF_downloads
-
Notifications
You must be signed in to change notification settings - Fork 0
/
Copy pathREADME
80 lines (80 loc) · 3.73 KB
/
README
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
# Automatized CMIP5 ESGF downloading.
#
# Author: Diego Jiménez de la Cuesta Otero (Automatization)
#
# This script processes a query to download a subset of the CMIP5 database.
#
# By default, it uses the following facets of ESGF:
#
# project, product, experiment, time_frequency, variable, ensemble and model.
#
# The script sends to ESGF server in bsurl (base search url) variable a query
# to obtain a wget download scripts, based on the parameters given in cfg, ds,
# projs, prods, exps, freqs, vars, enses and mods variables. The system is
# called RESTful protocol.
#
# The download will be done inside the basedir directory.
#
# Description of variables and subvalues:
#
# cfg:
# distrib: if the search is done inside a ESGF node or federation wide.
# latest: if it is requested the latest version of data.
# limit: the limit of files to be downloaded, 10000 is the maximum.
#
# ds: determines the directory tree that will be created and in which data will
# be downloaded.
#
# projs: In principle, it is fixed to CMIP5. You can switch it to other project
# or to a list of projects, however maybe you will need to tune
# something else in the script. However, thanks to RESTful, this is a
# simple task. Read the FAQ at one of ESGF node sites.
#
# prods: By default it is set to CMIP5 output1 product. For other products of
# CMIP5 or of other projects, visit a ESGF node site.
#
# exps: By default this are experiments within the project and product. Exam-
# ples are piControl or historical for CMIP5.
#
# freqs: A list of frequencies at which the output is wanted. Example is mon
# for monthly data.
#
# vars: These are the short names of the variables in the context of CMIP5.
# An example is ts, for the surface temperature.
#
# enses: These are a chain of the letters r, i and p with numbers between them.
# r stands for realization, i for initialization and p for physics.
# The numbers represent which realization, with which initialization and
# which physics is used. This is the case for CMIP5, for other projects,
# this may or may not exist. For CMIP5, modeling groups use p1 mostly,
# with the exception of GISS group, which has several physics settings.
# This is quite frustrating, since different physics for me counts as
# another different model. You need to trace this clearly since you can
# not average models with different physics and expect a model average.
#
# mods: This is a list of model names.
#
# From what I said before, it is highly recommended that first you visit ESGF
# website and make the search without downloading and copy the parameters in
# this script.
#
# The final result will be your data tree without any download scripts remai-
# ning behind, since once a download script has ended its use, it is erased.
#
# The reason behind the multiple for cycles is that some models tend to have a
# large number of files for a variable in a single experiment realization,
# small chunks of time. Therefore, rapidly the limit of 10000 files could be
# surpassed if downloaded too broadly.
#
# Take into account that sometimes, some nodes are slow. For example, DKRZ
# node is slow, because data is stored in tape.
#
# As a final note: you will need an openID from an account of ESGF. First,
# register in a ESGF node. The first time you run a wget download script (not
# necessarily with this script) and if you have installed java, your creden
# tials will be stored in your machine. Otherwise you will need to repeat
# your credentials any time the download scripts be run.
#
# If you modified the code, please make clear that you modified it, what is the
# modification and include references to me... and, of course, add you as an
# author.