Quasar photometric redshifts for S-PLUS

This repository contains the source code to estimate photometric redshifts for quasars using S-PLUS photometry. We provide single-point photometric redshift estimates from three different methods: Random Forest, Bayesian Mixture Density Model, and FlexCoDE (as well as the average of the three). Probability density functions (PDFs) are also provided, but only for the Bayesian Mixture Density Model and FlexCoDE.

The accompanying paper is Nakazono & Valença et al. (submitted); a link will be provided as soon as it is published. It describes the data, the methods, and the expected performance.

The repository is organized as follows:

  • data/: contains the dustmaps used for the extinction correction.
  • src/: contains the source code to estimate the photometric redshifts.
  • final_model/: contains the trained models for the Random Forest, Bayesian Mixture Density Model, and FlexCoDE. The code used to train the models is provided in a separate repository.
  • logs/: contains the logs of the photometric redshift estimation.

Versions and dependencies

The code was written in the following versions (an example of setting up the Python environments follows the list):

  • Python 3.7.16 for Random Forest (environment.yml)
  • Python 3.8.18 for Bayesian Mixture Density Model (bmdn_env.yml)
  • R 4.2.2 for FlexCoDE
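Assuming conda is installed, the two Python environments can be created from the provided files (the environment names are defined inside the .yml files themselves):

conda env create -f environment.yml   # Random Forest environment (Python 3.7)
conda env create -f bmdn_env.yml      # Bayesian Mixture Density Model environment (Python 3.8)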

Steps of the pipeline

At the moment, the pipeline is designed to run each method separately (although all of them can be called at once) for debugging purposes; therefore, each step saves its own output file. This should be improved in the future.

The pipeline is divided into four main steps that should be run in the following order for each S-PLUS field:

  1. get_crossmatch.py
  2. preprocess.py
  3. get_predictions.py
  4. merge_catalogs.py

Step 1: get_crossmatch.py

This code crossmatches the S-PLUS catalog with the unWISE and GALEX catalogs. For this step, stilts.jar is needed. The output is a .fits file with the crossmatched catalogs. This step can be skipped if the crossmatched catalog is already available. It is implemented to run in a pre-defined folder on the S-PLUS server that contains the *VAC_features.fits files for each S-PLUS field.
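For reference, a minimal sketch of how the STILTS call might be issued from Python is shown below; the file paths, coordinate column names, match radius, and join mode are assumptions, not the repository's actual settings.

import subprocess

def crossmatch(splus_cat, external_cat, out_cat, radius_arcsec=2.0):
    """Sky-match an S-PLUS field catalog against an external catalog with STILTS."""
    cmd = [
        "java", "-jar", "stilts.jar", "tmatch2",
        f"in1={splus_cat}", f"in2={external_cat}",
        "matcher=sky",
        "values1=RA DEC",           # assumed S-PLUS coordinate columns
        "values2=ra dec",           # assumed external-catalog coordinate columns
        f"params={radius_arcsec}",  # match radius in arcsec
        "join=all1",                # assumed: keep all S-PLUS sources
        f"out={out_cat}", "ofmt=fits",
    ]
    subprocess.run(cmd, check=True)

crossmatch("FIELD_VAC_features.fits", "unwise.fits", "FIELD_crossmatched.fits")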

Step 2: preprocess.py

This code preprocesses the data, using the output of Step 1 as input. It includes the following steps (a sketch of this flow follows the list):

  • Load the crossmatched catalog
  • Handle missing-band values (fill with 99)
  • Apply extinction correction
  • Calculate features (colors) that are used to estimate the photometric redshifts
  • Save the preprocessed catalog
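A minimal sketch of the preprocessing flow is given below; the band list, column names, E(B-V) column, and extinction coefficients are assumptions for illustration only.

import numpy as np
from astropy.table import Table

BANDS = ["u", "g", "r", "i", "z"]  # assumed subset of the S-PLUS bands
EXT_COEFF = {"u": 4.89, "g": 3.66, "r": 2.53, "i": 1.89, "z": 1.40}  # illustrative A_lambda / E(B-V)

def preprocess(crossmatched_fits, output_csv):
    df = Table.read(crossmatched_fits).to_pandas()

    # Handle missing-band values: fill with the sentinel value 99
    mag_cols = [f"{b}_auto" for b in BANDS]  # assumed column naming
    df[mag_cols] = df[mag_cols].fillna(99.0)

    # Extinction correction: mag_corr = mag - k * E(B-V), leaving the sentinel untouched
    for b in BANDS:
        col = f"{b}_auto"
        corrected = df[col] - EXT_COEFF[b] * df["EBV"]  # assumed E(B-V) column
        df[col] = np.where(df[col] < 99.0, corrected, 99.0)

    # Features: colors between adjacent bands
    for b1, b2 in zip(BANDS[:-1], BANDS[1:]):
        df[f"{b1}-{b2}"] = df[f"{b1}_auto"] - df[f"{b2}_auto"]

    # Save the preprocessed catalog
    df.to_csv(output_csv, index=False)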

Step 3: get_predictions.py

This code estimates the photometric redshifts, using the output of Step 2 as input. It includes the following steps (a sketch of the Random Forest branch follows the list):

  • Load the preprocessed catalog
  • Load the trained models
  • Estimate the photometric redshifts
  • Save the photometric redshifts and the probability density functions (PDFs)
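A sketch of what the Random Forest branch of this step might look like is shown below; the model file name, feature columns, and ID column are assumptions (the Bayesian Mixture Density Model and FlexCoDE branches additionally write out PDFs).

import joblib
import pandas as pd

def predict_rf(preprocessed_csv, model_path="final_model/rf_model.joblib",
               output_csv="zphot_rf.csv"):
    df = pd.read_csv(preprocessed_csv)
    model = joblib.load(model_path)  # assumed: a trained scikit-learn regressor

    feature_cols = [c for c in df.columns if "-" in c]  # assumed: the color features
    df["zphot_rf"] = model.predict(df[feature_cols])

    # Save the single-point estimates
    df[["ID", "zphot_rf"]].to_csv(output_csv, index=False)  # assumed ID column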

Step 4: merge_catalogs.py

This code merges the outputs of the individual methods obtained in Step 3 into a single .csv file.
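A minimal sketch of the merging step is given below; the per-method file names, the ID column, and the computation of the average estimate are assumptions.

import os
from functools import reduce
import pandas as pd

def merge_catalogs(field, remove=False):
    files = {m: f"{field}_{m}.csv" for m in ("rf", "bmdn", "flexcode")}
    parts = [pd.read_csv(path) for path in files.values()]
    merged = reduce(lambda a, b: a.merge(b, on="ID", how="outer"), parts)

    # Average of the three single-point estimates
    merged["zphot_avg"] = merged[["zphot_rf", "zphot_bmdn", "zphot_flexcode"]].mean(axis=1)
    merged.to_csv(f"{field}_zphot.csv", index=False)

    if remove:  # mirrors the --remove flag: delete the per-method files after merging
        for path in files.values():
            os.remove(path)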

Examples of usage

To run the pipeline, one should use the following commands:

python get_crossmatch.py 
python preprocess.py
python get_predictions.py --rf --bmdn --flexcode
python merge_catalogs.py 

The parsed arguments are listed below (a sketch of how get_predictions.py might parse them follows the list):

  • --verbose: to print the logs
  • --replace: to overwrite output files that already exist (if not set, existing files are skipped)
  • --rf: to run the Random Forest method [only for get_predictions.py]
  • --bmdn: to run the Bayesian Mixture Density Model [only for get_predictions.py]
  • --flexcode: to run the FlexCoDE method [only for get_predictions.py]
  • --diff: to run the code only for files that do not have output files yet [only for get_crossmatch.py, but could be implemented for the other codes as well]
  • --remove: to remove the per-method output files after merging them into a single file [only for merge_catalogs.py]
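As an illustration, the flags of get_predictions.py could be parsed with argparse along these lines (the defaults and help strings are assumptions):

import argparse

parser = argparse.ArgumentParser(description="Estimate quasar photometric redshifts")
parser.add_argument("--verbose", action="store_true", help="print the logs")
parser.add_argument("--replace", action="store_true",
                    help="overwrite output files that already exist")
parser.add_argument("--rf", action="store_true", help="run the Random Forest method")
parser.add_argument("--bmdn", action="store_true",
                    help="run the Bayesian Mixture Density Model")
parser.add_argument("--flexcode", action="store_true", help="run the FlexCoDE method")
args = parser.parse_args()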

Important Notes

  • This code is intended to work only with S-PLUS DR4 data. For any other data release, the trained models would likely need to be retrained.
  • Large files (such as some of the trained models) are not included in the repository. Please contact the authors if you need access to them.
  • This pipeline is designed to work only on the S-PLUS server, with its current data release directory structure. To use it locally, you will need to adapt the code (e.g. paths) to your own data structure.

The code was written by

  • Lilianne Nakazono
  • Raquel Valença
  • Rafael Izbicki
  • Gabriela Soares

Feel free to open a pull request or an issue. This repository is under development and we are open to suggestions and improvements.

How to cite

If you use this code, please cite Nakazono & Valença et al. (submitted). The full reference will be provided as soon as the paper is published.

Contact

If you have any questions, please contact Lilianne Nakazono or Raquel Valença.
