qsoabsfind is a Python module designed to detect absorbers with doublet properties in SDSS/DESI quasar spectra. This tool identifies potential absorbers using a convolution-based adaptive S/N approach, applies Gaussian fitting and extensive checks to reject false positives, and computes equivalent widths (EWs) of the lines using a simple double Gaussian.
Currently, the package only works for MgII 2796,2803 and CIV 1548,1550 doublets.
- Convolution-based adaptive S/N approach for detecting absorbers in QSO spectra.
- Gaussian fitting for accurate measurement of absorber properties (such as EW, line widths, and centers).
- Parallel processing using multiprocessing for efficient computation on a large number of spectra.
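To illustrate the double-Gaussian EW measurement mentioned above, here is a minimal sketch, not the package's actual fitting code. The function names `double_gaussian` and `gaussian_ew`, the redshift, and the amplitudes and widths are all hypothetical values chosen for illustration. It models a continuum-normalized MgII doublet and compares the analytic EW of the two Gaussians with a direct numerical integration.

```python
import numpy as np


def double_gaussian(wave, amp1, amp2, center1, center2, sigma1, sigma2):
    """Continuum-normalized flux with two Gaussian absorption lines (hypothetical helper)."""
    line1 = amp1 * np.exp(-0.5 * ((wave - center1) / sigma1) ** 2)
    line2 = amp2 * np.exp(-0.5 * ((wave - center2) / sigma2) ** 2)
    return 1.0 - line1 - line2


def gaussian_ew(amp, sigma):
    """Analytic area under one Gaussian absorption line, in the units of sigma."""
    return amp * sigma * np.sqrt(2.0 * np.pi)


# Hypothetical MgII absorber at z = 1.0; rest wavelengths in Angstrom.
z = 1.0
center1 = 2796.35 * (1.0 + z)
center2 = 2803.53 * (1.0 + z)

# Illustrative line depths and widths (observed frame, Angstrom).
amp1, amp2, sigma1, sigma2 = 0.5, 0.4, 2.0, 2.0

wave = np.linspace(5500.0, 5700.0, 20001)
flux = double_gaussian(wave, amp1, amp2, center1, center2, sigma1, sigma2)

# Observed-frame EW of both lines by direct integration of (1 - flux).
ew_total_numeric = np.sum(1.0 - flux) * (wave[1] - wave[0])
ew_total_analytic = gaussian_ew(amp1, sigma1) + gaussian_ew(amp2, sigma2)

# Rest-frame EW of the first line: divide the observed EW by (1 + z).
ew1_rest = gaussian_ew(amp1, sigma1) / (1.0 + z)

print(ew_total_numeric, ew_total_analytic, ew1_rest)
```

In a real fit the six parameters would come from a least-squares fit to the observed spectrum rather than being set by hand.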
The full documentation is available at https://qsoabsfind.readthedocs.io.
- Python 3.6 or higher
- numpy
- scipy
- astropy
- numba
- matplotlib
- pytest (for running tests)
First, clone the repository to your local machine:
git clone https://github.com/abhi0395/qsoabsfind.git
cd qsoabsfind
pip install .
python -m unittest discover -s tests
Before running the program, please read the data/datamodel.rst file. The instructions for the input and output files are provided there. I have also provided an example QSO spectra FITS file, data/qso_test.fits, which contains 500 continuum-normalized SDSS QSO spectra. You can use this file to test an example run as described below.
qsoabsfind --input-fits-file data/qso_test.fits \
--n-qso 500 \
--absorber MgII \
--output test_MgII.fits \
--headers SURVEY=SDSS AUTHOR=YOUR_NAME \
--n-tasks 16 \
--ncpus 4
Parallel mode can be memory-intensive when the input FITS file is large. Because the code reads QSO spectra from the FITS file while running in parallel, memory can become a bottleneck and the code may fail. Currently, I suggest the following:
- Divide your file into smaller chunks: Split the FITS file into several smaller files, each containing approximately N spectra. Then run the code on these smaller files.
- Use a rule of thumb for file size: Ensure that each individual file is no larger than total_memory/ncpu of your node or system. Based on this you can decide your N. I would suggest N = 1000.
- Merge results at the end: After processing, you can merge your results.
To decide the right size for each FITS file, consider the total available memory and the number of CPUs in your system.
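The chunking advice above can be sketched as a small planning script. This is only an illustration under stated assumptions: the helper names (`max_chunk_size_gb`, `chunk_ranges`, `chunk_commands`) and the chunk file names (`chunk_000.fits`, ...) are hypothetical, and writing the chunk files themselves is left out because it depends on the data model in data/datamodel.rst. The command-line flags are copied from the example run shown earlier.

```python
def max_chunk_size_gb(total_memory_gb, ncpu):
    """Rule of thumb from the text: each file should be <= total_memory / ncpu."""
    return total_memory_gb / ncpu


def chunk_ranges(n_spectra, chunk_size=1000):
    """Return half-open (start, stop) row ranges that cover all spectra."""
    return [(start, min(start + chunk_size, n_spectra))
            for start in range(0, n_spectra, chunk_size)]


def chunk_commands(n_spectra, chunk_size=1000, absorber="MgII",
                   n_tasks=16, ncpus=4):
    """Build one qsoabsfind command per chunk (chunk file names are hypothetical)."""
    commands = []
    for i, (start, stop) in enumerate(chunk_ranges(n_spectra, chunk_size)):
        commands.append(
            f"qsoabsfind --input-fits-file chunk_{i:03d}.fits "
            f"--n-qso {stop - start} --absorber {absorber} "
            f"--output chunk_{i:03d}_{absorber}.fits "
            f"--headers SURVEY=SDSS AUTHOR=YOUR_NAME "
            f"--n-tasks {n_tasks} --ncpus {ncpus}"
        )
    return commands


# Example: a node with 64 GB of memory and 4 CPUs, and 2500 spectra in total.
print(max_chunk_size_gb(64, 4))
for cmd in chunk_commands(2500):
    print(cmd)
```

Each printed command can then be run on its own chunk file, and the per-chunk outputs merged afterwards.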
An example Jupyter notebook is also available.
Contributions are welcome! Please submit a pull request or open an issue to discuss your ideas. If you have any questions/suggestions, please feel free to write to [email protected] or, preferably, open a GitHub issue.
Please cite Anand, Nelson & Kauffmann 2021 if you find this code useful in your research. The BibTeX entry for the paper can be found here.
Copyright (c) 2021-2025 Abhijeet Anand.
qsoabsfind is free software made available under the MIT License. For details, see the LICENSE file.
Thanks,
Abhijeet Anand
Lawrence Berkeley National Lab