beta version: documentation improving continuously, ready to be used.
Tested on python 3 and 2.7
Python wrapper of ZTF IRSA web API
You need to have an IRSA account that has access to ZTF Data to be able to get data using ztfquery
M. Rigault (corresponding author, [email protected], CNRS/IN2P3), M. Giomi (Humboldt-Universität zu Berlin) and U. Feindt (Oskar Klein Center, Stockholm University)
Using pip (favored):
pip install ztfquery
Or, for the latest version, go to the directory where you want to save the folder and then run:
git clone https://github.com/MickaelRigault/ztfquery.git
cd ztfquery
python setup.py install
Then you will need to set up your login and password information:
ipython
> import ztfquery
# This will ask you for your login information.
The login and password will be stored encrypted under ~/.ztfquery. Remove this file to be asked for them again.
You should also create the environment variable $ZTFDATA (usually in your ~/.bash_profile or ~/.cshrc). Data you download from IRSA will be saved in the directory indicated by $ZTFDATA, following the IRSA data structure.
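As a quick sanity check before downloading anything, the sketch below verifies from Python that $ZTFDATA is set and points to an existing directory. The helper name check_ztfdata is hypothetical; it is not part of ztfquery.

```python
import os


def check_ztfdata(environ=None):
    """Return the $ZTFDATA path if it is set and is an existing directory.

    `check_ztfdata` is an illustrative helper, not a ztfquery function.
    """
    environ = os.environ if environ is None else environ
    path = environ.get("ZTFDATA")
    if not path:
        raise RuntimeError("ZTFDATA is not set; export it in your shell profile.")
    if not os.path.isdir(path):
        raise RuntimeError("ZTFDATA points to a missing directory: %s" % path)
    return path
```

Calling check_ztfdata() with no argument inspects the real environment; passing a dictionary is only meant for testing.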
new since version 1.1.1
You can also directly provide your IRSA account settings when running load_metadata and download_data using the auth=[your_username, your_password] parameter. Similarly, you can directly provide the username and password of the ZTF ops page when loading NightSummary using the ztfops_auth parameter.
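If you would rather not type credentials into your scripts, one option is to read them from environment variables and pass the result as auth. This is only a sketch: the variable names IRSA_USERNAME and IRSA_PASSWORD and the helper auth_from_env are assumptions, not ztfquery conventions.

```python
import os


def auth_from_env(environ=None, user_key="IRSA_USERNAME", pass_key="IRSA_PASSWORD"):
    """Build the [username, password] list expected by the `auth` parameter.

    The environment variable names are hypothetical; pick your own.
    """
    environ = os.environ if environ is None else environ
    try:
        return [environ[user_key], environ[pass_key]]
    except KeyError as missing:
        raise RuntimeError("Missing environment variable: %s" % missing)


# Usage (requires ztfquery and an IRSA account), for instance:
#   zquery.load_metadata(sql_query="seeing<2", auth=auth_from_env())
```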
You want to see what ZTF has observed during a given night (say 10th of May 2018, i.e. 20180510):
from ztfquery import query
may1018 = query.NightSummary('20180510')
# The information concerning the science targets is saved in the attribute `data`
print(may1018.data)
# The full information, including the calibration exposures, is in `data_all`
data and data_all are pandas DataFrames.
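NightSummary takes the night as a YYYYMMDD string. If your dates live in datetime objects, a small helper (hypothetical name night_string, not part of ztfquery) can produce that format:

```python
import datetime


def night_string(date):
    """Format a date as the YYYYMMDD string used to name a night."""
    return date.strftime("%Y%m%d")


print(night_string(datetime.date(2018, 5, 10)))  # 20180510
```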
If you now want to visualize which fields have been observed:
fig = may1018.show_gri_fields(title="Observed Fields \n 2018-05-10")
fig.show()
"""
Number of g (upper left), r (upper right), I (lower) observations for night 20180510.
The grey tiles show the primary ZTF grid for dec>-30deg.
Note that on that particular night, no I-band observations were made.
"""
The first time you use NightSummary, it will ask for the username and password of the ztfops webpage. These are not your IRSA account settings.
The username and password of the ztfops webpage can be found on ZTF's twiki page (ZTFOps).
As of v0.6, you can directly download ZTF data. For details, see the Downloading the Data section below.
As a short example, if you want to download the science images from "quadrant 1" of "ccd 6" simply do:
may1018.set_metadata("sci", paddedccdid="06", qid="01")
# Let's only try to download target observations for program ID 2 (partnership)
mask = (may1018.data["type"]=="targ") * (may1018.data["pid"]=="2")
may1018.download_data("sciimg.fits", show_progress=False, mask=mask)
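The example above selects a single quadrant of a single CCD. Each ZTF exposure is made of 16 CCDs, each divided into 4 quadrants, so a full exposure corresponds to 64 (CCD, quadrant) pairs. The plain-Python sketch below enumerates them; it does not use ztfquery, and the numbering starting from 1 is an assumption.

```python
N_CCD = 16       # CCDs per exposure
N_QUADRANT = 4   # quadrants per CCD

# Assumption: CCDs and quadrants are numbered starting from 1.
quadrants = [(ccdid, qid)
             for ccdid in range(1, N_CCD + 1)
             for qid in range(1, N_QUADRANT + 1)]
print(len(quadrants))  # 64

# Zero-padded CCD id, in the two-digit form used for paddedccdid above.
paddedccdid = "{:02d}".format(6)
print(paddedccdid)  # 06
```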
The generic ZTF data access query uses the ZTFQuery object.
Being able to download data requires two steps:
- do the query to know which data are accessible (this uses the load_metadata() method).
- do the actual download of the accessible data (this uses the download_data() method).
Other methods enable you to further see what is going on (like the plotting method show_gri_fields()) or to check what has already been downloaded and where it is on your computer (get_local_data()).
In this example, we are going to query anything that has been observed with a seeing lower than 2 arcsec between the 1st of May 2018 and the 1st of June 2018.
from ztfquery import query
zquery = query.ZTFQuery()
# Check what are the Julian Dates of 1st of May 2018 and 1st of June 2018
from astropy import time
jd_1may18 = time.Time("2018-05-01").jd # 2458239.5
jd_1june18 = time.Time("2018-06-01").jd # 2458270.5
# Do the Query to see what exists
zquery.load_metadata(sql_query="seeing<2 and obsjd BETWEEN 2458239.5 AND 2458270.5") # this will take about 1min
# The information is saved as a pandas DataFrame under `metatable`
zquery.metatable # it contains about 50 000 entries...
# Show the observed fields, limiting it to the main (or primary) grid for visibility (say grid="secondary" to see the rest):
zquery.show_gri_fields(title="1stMay2018< time <1stJune2018 \n seeing<2", grid="main")
"""
In this figure, the colorbar shows the number of times a given field is in metatable.
Note that each field is made of 16 CCDs, each divided into 4 quadrants,
so each single exposure will represent 64 field entries.
"""
In this second example, we will want to access the I-band filter (filter #3) observations within 0.01 degree around RA=276.107960 Dec=+44.130398 since the 14th of May 2018.
from ztfquery import query
zquery = query.ZTFQuery()
# Print the Julian Date of the 14th of May 2018
from astropy import time
print(time.Time("2018-05-14").jd) # 2458252.5
# Do the Query to see what exists
zquery.load_metadata(radec=[276.107960,+44.130398], size=0.01, sql_query="fid=3 and obsjd>2458252.5") # takes a few seconds
# As of when the README was written, this had 8 entries:
zquery.metatable
"""
obsjd ccdid filtercode
0 2.458268e+06 1 zi
1 2.458268e+06 15 zi
2 2.458256e+06 15 zi
3 2.458255e+06 1 zi
4 2.458262e+06 1 zi
5 2.458273e+06 1 zi
6 2.458262e+06 15 zi
7 2.458273e+06 15 zi
"""
Let's imagine you have a target at a given coordinate RA=276.107960 Dec=+44.130398 and you want the reference image associated with it.
To get the reference image metadata simply do:
from ztfquery import query
zquery = query.ZTFQuery()
zquery.load_metadata(kind="ref",radec=[276.107960, +44.130398], size=0.0001)
zquery.metatable[["field","filtercode", "ccdid","qid"]]
"""
field filtercode ccdid qid
0 764 zg 1 3
1 726 zr 15 2
2 726 zg 15 2
3 726 zi 15 2
4 764 zr 1 3
5 764 zi 1 3
"""
If you only want reference images for the "g" filter:
zquery.load_metadata(kind="ref",radec=[276.107960, +44.130398], size=0.0001, sql_query="fid=1")
or, instead of sql_query="fid=1", you could use sql_query="filtercode='zg'", but be careful with the quotes around zg.
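Both spellings of the filter constraint can be generated from a small lookup table, which avoids quoting mistakes. FID_TO_FILTERCODE and the two helpers are illustrative names. This README only states fid=1 for zg and filter #3 for the I band, so the fid=2 entry for zr is an assumption based on the usual ZTF numbering.

```python
# Filter id -> filter code (the entry for 2 is an assumption, see above).
FID_TO_FILTERCODE = {1: "zg", 2: "zr", 3: "zi"}


def fid_clause(fid):
    """Numeric form, e.g. fid=1 (no quotes needed)."""
    if fid not in FID_TO_FILTERCODE:
        raise ValueError("Unknown filter id: %r" % (fid,))
    return "fid={}".format(fid)


def filtercode_clause(fid):
    """String form, e.g. filtercode='zg' (single quotes inside the SQL)."""
    return "filtercode='{}'".format(FID_TO_FILTERCODE[fid])


print(fid_clause(1))         # fid=1
print(filtercode_clause(1))  # filtercode='zg'
```

Either string can then be passed as the sql_query argument of load_metadata().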
You can then simply download the reference image by doing zquery.download_data(), as detailed below.
The actual data download is possible once you have run load_metadata() (see above).
Downloading data from a NightSummary object: if you want to download data from NightSummary (available for version>v0.6) you need to run the set_metadata() method (and not load_metadata(), which NightSummary objects do not have). In set_metadata() you need to specify which kind of data you want ("sci", "raw", "cal" or "ref") and you need to provide the mandatory arguments associated with this kind (e.g. for kind="sci", you need to provide the ccdid "paddedccdid" and the quadrant id "qid"; see the documentation of set_metadata() for details). Otherwise, the same download_data() method is used for both NightSummary and ZTFQuery objects.
Remember to set the environment variable $ZTFDATA (see the top of this document).
In this example, we want to access all observations within 0.01 degree around RA=276.107960 Dec=+44.130398 since the 14th of May 2018 with a seeing lower than 2 arcsec.
from ztfquery import query
zquery = query.ZTFQuery()
# Step 1, load the metadata (NB: the Julian Date of the 14th of May 2018 is 2458252.5)
zquery.load_metadata(radec=[276.107960,+44.130398], size=0.01, sql_query="seeing<2 and obsjd>2458252.5")
# As of when the README was written, this had 42 entries (only partnership data)
zquery.metatable[["obsjd", "seeing", "filtercode"]]
"""
obsjd seeing filtercode
0 2.458277e+06 1.83882 zr
1 2.458277e+06 1.84859 zr
2 2.458268e+06 1.74317 zi
3 2.458269e+06 1.65564 zr
4 2.458267e+06 1.90791 zr
...
35 2.458253e+06 1.84952 zr
36 2.458273e+06 1.77137 zr
37 2.458270e+06 1.71865 zr
38 2.458274e+06 1.84936 zg
39 2.458270e+06 1.60568 zr
40 2.458253e+06 1.98775 zr
41 2.458275e+06 1.99942 zg
"""
# Downloading the Data
zquery.download_data("psfcat.fits", show_progress=False)
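The sql_query strings used in the examples above contain Julian Dates typed in by hand. A possible way to build the obsjd clause programmatically is sketched below using only the standard library. julian_date and obsjd_between are hypothetical helpers, and the conversion assumes dates taken at 00:00 UTC (Julian Date 2440587.5 corresponds to 1970-01-01 00:00 UTC).

```python
import datetime

UNIX_EPOCH = datetime.date(1970, 1, 1)
JD_UNIX_EPOCH = 2440587.5  # Julian Date of 1970-01-01 00:00 UTC


def julian_date(date):
    """Julian Date at 00:00 UTC of a `datetime.date`."""
    return (date - UNIX_EPOCH).days + JD_UNIX_EPOCH


def obsjd_between(start, end):
    """SQL clause selecting observations between two dates (hypothetical helper)."""
    return "obsjd BETWEEN {} AND {}".format(julian_date(start), julian_date(end))


sql_query = "seeing<2 and " + obsjd_between(datetime.date(2018, 5, 1),
                                             datetime.date(2018, 6, 1))
print(sql_query)
# seeing<2 and obsjd BETWEEN 2458239.5 AND 2458270.5
```

The values agree with the ones obtained from astropy in the examples above (2458239.5, 2458270.5, and 2458252.5 for the 14th of May 2018).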
You can download in multiprocessing simply by adding the keyword nprocess=X, where X is the number of parallel processes you want. The show_progress option will then show the overall progress (do not forget to add the notebook=True option if this is run from a notebook). For example:
zquery.download_data("psfcat.fits", show_progress=True, notebook=True,
nprocess=4, verbose=True, overwrite=True)
In the above example, overwrite=True forces existing files to be downloaded again.
By default overwrite is False, which means that the code checks whether the file you want to download already exists at its destination and, if so, skips it. verbose prints additional information like the names of the files being downloaded.
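The skip-or-download decision described above can be pictured with the following sketch. It is not ztfquery's actual implementation; should_download is a hypothetical helper that only mirrors the documented behaviour of overwrite.

```python
import os


def should_download(local_path, overwrite=False):
    """Return True if the file must be fetched.

    A file already present locally is skipped unless overwrite=True.
    """
    return overwrite or not os.path.isfile(local_path)
```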
What is happening inside download_data()?
For each observation made with ZTF (that you have queried using load_metadata()) there are plenty of data products made available. Here is the list for the science exposures (default of load_metadata(), details here):
- sciimg.fits (primary science image)
- mskimg.fits (bit-mask image)
- psfcat.fits (PSF-fit photometry catalog)
- sexcat.fits (nested-aperture photometry catalog)
- sciimgdao.psf (spatially varying PSF estimate in DAOPhot's lookup table format)
- sciimgdaopsfcent.fits (PSF estimate at science image center as a FITS image)
- sciimlog.txt (log output from instrumental calibration pipeline)
- scimrefdiffimg.fits.fz (difference image: science minus reference; fpack-compressed)
- diffimgpsf.fits (PSF estimate for difference image as a FITS image)
- diffimlog.txt (log output from image subtraction and extraction pipeline)
- log.txt (overall system summary log from realtime pipeline)
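Since download_data() takes the product suffix as a string, a typo is easy to make. The sketch below checks a suffix against the list above before requesting anything. SCIENCE_SUFFIXES and check_suffix are hypothetical names, not part of ztfquery.

```python
# Science-exposure products, copied from the list above.
SCIENCE_SUFFIXES = (
    "sciimg.fits", "mskimg.fits", "psfcat.fits", "sexcat.fits",
    "sciimgdao.psf", "sciimgdaopsfcent.fits", "sciimlog.txt",
    "scimrefdiffimg.fits.fz", "diffimgpsf.fits", "diffimlog.txt", "log.txt",
)


def check_suffix(suffix):
    """Return the suffix unchanged if it is a known science product."""
    if suffix not in SCIENCE_SUFFIXES:
        raise ValueError("Unknown suffix %r; expected one of: %s"
                         % (suffix, ", ".join(SCIENCE_SUFFIXES)))
    return suffix
```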
In the above example, we have selected the catalog generated by PSF-fit photometry, "psfcat.fits"; the download of all 42 catalogs took a couple of minutes; images would be much slower.
Where are the downloaded data?
The data are saved following the IRSA structure (default; see the download_data options if you do not want that).
To retrieve them simply do:
zquery.get_local_data("psfcat.fits")
Important: Retrieving data. If you need to get them again later on, after you have closed the session, you will need to redo the load_metadata() query to recover the structure of the database; otherwise get_local_data() will not know what to look for.
If you need to work offline, I suggest you override the download location within download_data using the download_dir option. If provided, all the data will be dumped inside this directory without following the IRSA structure.
starting with version 1.2.3
See here for an example of how to get the reference image(s) associated with given coordinates.
If you want to know whether a given field (say 400) already has its reference images, use:
from ztfquery import fields
fields.has_field_reference(400)
"""
{'zg': True, 'zi': False, 'zr': True}
"""
If you want the list of all fields that have, say, an I-band reference image:
from ztfquery import fields
fields.get_fields_with_band_reference("zi")
"""
441, 442, 516, 517, 518, 519, 520, 522, 523, 524, 525,
526, 527, 528, 530, 531, 532, 534, 544, 547, 549, 550,
564, 565, 566, 567, 568, 569, 570, 571, 572, 573, 574,
575, 576, 577, 578, 579, 580, 581, 582, 583, 584, 585,
586, 596, 597, 613, 615, 616, 617, 618, 619, 620, 621,
622, 623, 624, 625, 626, 627, 628, 629, 630, 631, 632,
633, 634, 635, 645, 646, 660,...
"""
starting version 1.2.0 | see also https://github.com/ufeindt/marshaltools
You can now query data stored in the Marshal directly from ztfquery.
There are three main utilities:
- Getting target datatable (coordinates, classification, redshift etc)
- Getting target spectra
- Getting target lightcurves
All of these can be called from MarshalAccess, but the last two (spectra and lightcurves) can be downloaded directly with native ztfquery.marshal functions (i.e. with no need to instantiate a MarshalAccess object). Below are some examples.
Password protection: your Marshal credentials can be passed directly into functions and methods (using the auth argument) or, as usual and as favored, stored encrypted in ~/.ztfquery. The first time you query for marshal information without explicitly providing authentication using auth, ztfquery will prompt for your marshal username and password and will save them. Then, any time auth is not given, the stored username and password will be used.
from ztfquery import marshal
# This instantiates a MarshalAccess object
m = marshal.MarshalAccess()
# Then downloads all targets you have access to.
m.load_target_sources()
# Target data are stored as a pandas DataFrame into `target_sources`
print(m.target_sources)
"""
a long table containing:
candid name ra dec classification field redshift creationdate iauname id lastmodified rcid release_auth release_status
"""
You can also download target sources for only one of your programs using the program argument, for instance:
from ztfquery import marshal
m = marshal.MarshalAccess()
m.load_target_sources(program="Cosmology")
If you only want a subgroup of targets, you can use the get_target_data()
method:
m.get_target_data(["SN2018zd","ZTF18aahflrr","at2018akx"])
"""
a table containing:
'candid name ra dec classification field redshift creationdate iauname id lastmodified rcid release_auth release_status'
only for the given targets
"""
You can also directly get their coordinates, redshift or classification (get_target_{coordinates,redshift,classification}
) e.g.:
m.get_target_coordinates(["SN2018zd","ZTF18aahflrr","at2018akx"])
"""
ra dec
0 94.513250 94.513250
2 150.846667 -26.182181
3 153.923187 14.119114
"""
Remark: getting information (coordinates, redshift etc) for 1 or 1000 targets takes roughly the same amount of time, so it is better to query all your targets at once.
You can download target spectra stored in the marshal using the download_spectra function.
For instance:
from ztfquery import marshal
marshal.download_spectra("ZTF18abcdef")
By default, spectra will be stored in $ZTFDATA/marshal/spectra/TARGET_NAME/.
If you want to provide another directory, simply fill the dirout argument, for instance:
from ztfquery import marshal
marshal.download_spectra("ZTF18abcdef", dirout="ANY_DIRECTORY_PATH")
You may also want to get the data directly (i.e. without storing them anywhere); in that case set dirout=None:
from ztfquery import marshal
spectra = marshal.download_spectra("ZTF18abcdef", dirout=None)
Here, spectra is a dictionary with the following structure: {filename_: readlines_array_of_ascii_spectraldata}
If you have downloaded spectra using the default dirout output (dirout='default'), you can load the spectra using get_local_spectra(TARGET_NAME), which returns the same dict as defined just above ({filename_: readlines_array_of_ascii_spectraldata}).
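Because each dictionary value is a list of raw text lines, you still have to parse them yourself. The sketch below assumes whitespace-separated columns with wavelength first and flux second, and # at the start of header lines. The actual layout depends on the instrument that produced the file, so treat parse_spectrum_lines (a hypothetical helper) and the made-up sample lines as an illustration only.

```python
def parse_spectrum_lines(lines):
    """Parse ASCII spectrum lines into (wavelength, flux) tuples.

    Assumes '#' comment lines and whitespace-separated numeric columns,
    with wavelength in the first column and flux in the second.
    """
    rows = []
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        columns = line.split()
        rows.append((float(columns[0]), float(columns[1])))
    return rows


# Made-up lines standing in for one value of the `spectra` dictionary.
example_lines = ["# WAVELENGTH FLUX\n", "4000.0 1.5\n", "4001.0 1.25\n"]
print(parse_spectrum_lines(example_lines))
# [(4000.0, 1.5), (4001.0, 1.25)]
```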
Similarly to the spectra, you can download the marshal lightcurve using the download_lightcurve
function.
from ztfquery import marshal
marshal.download_lightcurve("ZTF18abcdef")
download_lightcurve has the same dirout option as download_spectra, except that it saves lightcurves by default in $ZTFDATA/marshal/lightcurves/TARGET_NAME/. Use the get_local_lightcurves() function to retrieve lightcurves downloaded using dirout="default".
Lightcurves are stored as .csv files and returned as pandas DataFrames. You can directly visualize a lightcurve using plot_lightcurve(lc_dataframe), providing the pandas DataFrame.
from ztfquery import marshal
# Download the lightcurve of ZTF18abcdef
marshal.download_lightcurve("ZTF18abcdef")
# Load it; this returns a dict with the format {filename: DataFrame} because one could have saved several .csv files.
lc_dict = marshal.get_local_lightcurves("ZTF18abcdef")
# Plot it
marshal.plot_lightcurve(lc_dict["marshal_lightcurve_ZTF18abcdef.csv"])
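In the example above the dictionary key follows the pattern marshal_lightcurve_TARGET_NAME.csv. Assuming this pattern holds in general (it is inferred from that single example, not from the ztfquery documentation), the key can be built from the target name. lightcurve_key is a hypothetical helper.

```python
def lightcurve_key(target_name):
    """Key expected in the dict returned by get_local_lightcurves().

    The naming pattern is inferred from the example above (assumption).
    """
    return "marshal_lightcurve_{}.csv".format(target_name)


print(lightcurve_key("ZTF18abcdef"))  # marshal_lightcurve_ZTF18abcdef.csv
```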
available starting version 0.5
There is a simple library inside ztfquery to load, access and display ZTF alerts.
Assuming you have a .avro alert stored on your computer at full_path_to_avro, then:
from ztfquery import alert
ztfalert = alert.AlertReader.load(full_path_to_avro)
In there, the alert itself is stored as ztfalert.alert.
Now, if you want to display the alert, for instance, simply use the show() method.
You can also quickly display the alert by using the display_alert function:
from ztfquery import alert
fig = alert.display_alert(full_path_to_avro, show_ps_stamp=True)
The metadata structure is detailed here: ztf_api