Merge pull request #112 from mwalmsley/narval-migration
Prepare for v2 release
Showing 15 changed files with 1,173 additions and 336 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -30,19 +30,16 @@ Download the code using git: | |
|
||
git clone git@github.com:mwalmsley/zoobot.git | ||
|
||
And then pick one of the three commands below to install Zoobot and either PyTorch (recommended) or TensorFlow: | ||
And then pick one of the three commands below to install Zoobot and PyTorch: | ||
|
||
# Zoobot with PyTorch and a GPU. Requires CUDA 11.3. | ||
pip install -e "zoobot[pytorch_cu113]" --extra-index-url https://download.pytorch.org/whl/cu113 | ||
# Zoobot with PyTorch and a GPU. Requires CUDA 12.1 (or CUDA 11.8, if you use `_cu118` instead) | ||
pip install -e "zoobot[pytorch-cu121]" --extra-index-url https://download.pytorch.org/whl/cu121 | ||
|
||
# OR Zoobot with PyTorch and no GPU | ||
pip install -e "zoobot[pytorch_cpu]" --extra-index-url https://download.pytorch.org/whl/cpu | ||
pip install -e "zoobot[pytorch-cpu]" --extra-index-url https://download.pytorch.org/whl/cpu | ||
|
||
# OR Zoobot with PyTorch on Mac with M1 chip | ||
pip install -e "zoobot[pytorch_m1]" | ||
|
||
# OR Zoobot with TensorFlow. Works with and without a GPU, but if you have a GPU, you need CUDA 11.2. | ||
pip install -e "zoobot[tensorflow] | ||
pip install -e "zoobot[pytorch-m1]" | ||
|
||
This installs the downloaded Zoobot code using pip [editable mode](https://pip.pypa.io/en/stable/topics/local-project-installs/#editable-installs) so you can easily change the code locally. Zoobot is also available directly from pip (`pip install zoobot[option]`). Only use this if you are sure you won't be making changes to Zoobot itself. For Google Colab, use `pip install zoobot[pytorch_colab]` | ||
|
||
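As a rough illustration of how the install options in this hunk pair an extra with a PyTorch wheel index, here is a minimal sketch. The helper `build_install_command` and the table `EXTRA_INDEX_URLS` are hypothetical names, not part of Zoobot. The `cu118` index URL is assumed by analogy with the `cu121` one shown above.

```python
# Hypothetical helper (not part of Zoobot): pairs each extra named in the
# README with the wheel index that the README passes to pip.
EXTRA_INDEX_URLS = {
    "pytorch-cu121": "https://download.pytorch.org/whl/cu121",
    # Assumption: the CUDA 11.8 index follows the same URL pattern.
    "pytorch-cu118": "https://download.pytorch.org/whl/cu118",
    "pytorch-cpu": "https://download.pytorch.org/whl/cpu",
    # The README gives no extra index for the M1 option.
    "pytorch-m1": None,
}


def build_install_command(extra, editable=True):
    """Return the pip command as an argument list (usable with subprocess.run)."""
    if extra not in EXTRA_INDEX_URLS:
        raise ValueError(f"Unknown extra: {extra!r}")
    args = ["pip", "install"]
    if editable:
        # Editable mode installs the local clone in the ./zoobot directory.
        args.append("-e")
    args.append(f"zoobot[{extra}]")
    index_url = EXTRA_INDEX_URLS[extra]
    if index_url is not None:
        args += ["--extra-index-url", index_url]
    return args


if __name__ == "__main__":
    print(" ".join(build_install_command("pytorch-cu121")))
```

Passing `editable=False` gives the plain `pip install zoobot[option]` form that the README suggests for users who will not modify Zoobot itself.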
|
@@ -115,12 +112,6 @@ PyTorch (recommended): | |
- [pytorch/examples/representations/get_representations.py](https://github.com/mwalmsley/zoobot/blob/main/zoobot/pytorch/examples/representations/get_representations.py) | ||
- [pytorch/examples/train_model_on_catalog.py](https://github.com/mwalmsley/zoobot/blob/main/zoobot/pytorch/examples/train_model_on_catalog.py) (only necessary to train from scratch) | ||
|
||
TensorFlow: | ||
- [tensorflow/examples/train_model_on_catalog.py](https://github.com/mwalmsley/zoobot/blob/main/zoobot/tensorflow/examples/train_model_on_catalog.py) (only necessary to train from scratch) | ||
- [tensorflow/examples/make_predictions.py](https://github.com/mwalmsley/zoobot/blob/main/zoobot/tensorflow/examples/make_predictions.py) | ||
- [tensorflow/examples/finetune_minimal.py](https://github.com/mwalmsley/zoobot/blob/main/zoobot/tensorflow/examples/finetune_minimal.py) | ||
- [tensorflow/examples/finetune_advanced.py](https://github.com/mwalmsley/zoobot/blob/main/zoobot/tensorflow/examples/finetune_advanced.py) | ||
|
||
There is more explanation and an API reference on the [docs](https://zoobot.readthedocs.io/). | ||
|
||
I also [include](https://github.com/mwalmsley/zoobot/blob/main/benchmarks) the scripts used to create and benchmark our pretrained models. Many pretrained models are available [already](https://zoobot.readthedocs.io/en/latest/data_notes.html), but if you need one trained on e.g. different input image sizes or with a specific architecture, I can probably make it for you. | ||
|
@@ -129,44 +120,33 @@ When trained with a decision tree head (ZoobotTree, FinetuneableZoobotTree), Zoo | |
|
||
|
||
|
||
### (Optional) Install PyTorch or TensorFlow, with CUDA | ||
### (Optional) Install PyTorch with CUDA | ||
<a name="install_cuda"></a> | ||
|
||
*If you're not using a GPU, skip this step. Use the pytorch_cpu or tensorflow_cpu options in the section below.* | ||
|
||
Install PyTorch 1.12.1 or Tensorflow 2.10.0 and compatible CUDA drivers. I highly recommend using [conda](https://docs.conda.io/en/latest/miniconda.html) to do this. Conda will handle both creating a new virtual environment (`conda create`) and installing CUDA (`cudatoolkit`, `cudnn`) | ||
|
||
CUDA 11.3 for PyTorch: | ||
|
||
conda create --name zoobot38_torch python==3.8 | ||
conda activate zoobot38_torch | ||
conda install -c conda-forge cudatoolkit=11.3 | ||
*If you're not using a GPU, skip this step. Use the pytorch-cpu option in the section below.* | ||
|
||
CUDA 11.2 and CUDNN 8.1 for TensorFlow 2.10.0: | ||
Install PyTorch 2.1.0 or Tensorflow 2.10.0 and compatible CUDA drivers. I highly recommend using [conda](https://docs.conda.io/en/latest/miniconda.html) to do this. Conda will handle both creating a new virtual environment (`conda create`) and installing CUDA (`cudatoolkit`, `cudnn`) | ||
|
||
conda create --name zoobot38_tf python==3.8 | ||
conda activate zoobot38_tf | ||
conda install -c conda-forge cudatoolkit=11.2 cudnn=8.1.0 | ||
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:$CONDA_PREFIX/lib/ # add this environment variable | ||
CUDA 12.1 for PyTorch 2.1.0: | ||
|
||
### Latest minor features (v1.0.4) | ||
conda create --name zoobot39_torch python==3.9 | ||
conda activate zoobot39_torch | ||
conda install -c conda-forge cudatoolkit=12.1 | ||
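After creating the environment and installing Zoobot, it can be useful to confirm which PyTorch build was picked up. The sketch below is not from the Zoobot repository. It uses only `torch.__version__`, `torch.version.cuda`, and `torch.cuda.is_available()`, and falls back to a message if PyTorch is not installed. The function name `describe_torch_install` is hypothetical.

```python
def describe_torch_install():
    """Summarise the PyTorch build visible to the current environment."""
    try:
        import torch
    except ImportError:
        return "PyTorch is not installed in this environment"
    # torch.version.cuda is None for CPU-only builds.
    cuda_build = torch.version.cuda
    gpu_available = torch.cuda.is_available()
    return (
        f"torch {torch.__version__}, built for CUDA {cuda_build}, "
        f"GPU available: {gpu_available}"
    )


if __name__ == "__main__":
    print(describe_torch_install())
```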
|
||
- Now supports multi-class finetuning. See `pytorch/examples/finetuning/finetune_multiclass_classification.py` | ||
- Removed `simplejpeg` dependency due to M1 install issue. | ||
- Pinned `timm` version to ensure MaX-ViT models load correctly. Models supporting the latest `timm` will follow. | ||
- (internal until published) GZ Evo v2 now includes Cosmic Dawn (HSC). Significant performance improvement on HSC finetuning. | ||
### Recent release features (v2.0.0) | ||
|
||
### Latest major features (v1.0.0) | ||
|
||
v1.0.0 recognises that most of the complexity in this repo is training Zoobot from scratch, but most non-GZ users will probably simply want to load the pretrained Zoobot and finetune it on their data. | ||
|
||
- Adds new finetuning interface (`finetune.run_finetuning()`), examples. | ||
- Refocuses docs on finetuning rather than training from scratch. | ||
- Rework installation process to separate CUDA from Zoobot (simpler, easier) | ||
- Better wandb logging throughout, to monitor training | ||
- Remove need to make TFRecords. Now TF directly uses images. | ||
- Refactor out augmentations and datasets to `galaxy-datasets` repo. TF and Torch now use identical augmentations (via albumentations). | ||
- Many small quality-of-life improvements | ||
- New pretrained architectures: ConvNeXT, EfficientNetV2, MaxViT, and more. Each in several sizes. | ||
- Reworked finetuning procedure. All these architectures are finetuneable through a common method. | ||
- Reworked finetuning options. Batch norm finetuning removed. Cosine schedule option added. | ||
- Reworked finetuning saving/loading. Auto-downloads encoder from HuggingFace. | ||
- Now supports regression finetuning (as well as multi-class and binary). See `pytorch/examples/finetuning` | ||
- Updated `timm` to 0.9.10, allowing latest model architectures. Previously downloaded checkpoints may not load correctly! | ||
- (internal until published) GZ Evo v2 now includes Cosmic Dawn (HSC H2O). Significant performance improvement on HSC finetuning. Also now includes GZ UKIDSS (dragged from our archives). | ||
- Updated `pytorch` to `2.1.0` | ||
- Added support for webdatasets (only recommended for large-scale distributed training) | ||
- Improved per-question logging when training from scratch | ||
- Added option to compile encoder for max speed (not recommended for finetuning, only for pretraining). | ||
- Deprecates TensorFlow. The CS research community focuses on PyTorch and new frameworks like JAX. | ||
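Because the notes above warn that checkpoints downloaded before the `timm` update may not load correctly, a quick check of the installed versions can help when debugging. This is a minimal sketch using only the standard library. The names `EXPECTED`, `installed_version`, and `compare_with_release_notes` are hypothetical. The exact-match comparison is a simplification, since some extras in `setup.py` pin `timm >= 0.9.10` rather than `==`.

```python
from importlib import metadata

# Versions named in the v2.0.0 release notes.
EXPECTED = {"torch": "2.1.0", "timm": "0.9.10"}


def installed_version(name):
    """Return the installed version string, or None if the package is absent."""
    try:
        return metadata.version(name)
    except metadata.PackageNotFoundError:
        return None


def compare_with_release_notes(expected=EXPECTED, lookup=installed_version):
    """Map each package to (expected, installed, matches)."""
    report = {}
    for name, wanted in expected.items():
        found = lookup(name)
        # Drop a local build tag such as "+cu121" before comparing.
        public = found.split("+")[0] if found else None
        report[name] = (wanted, found, public == wanted)
    return report


if __name__ == "__main__":
    for name, (wanted, found, ok) in compare_with_release_notes().items():
        print(f"{name}: expected {wanted}, installed {found}, match={ok}")
```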
|
||
Contributions are very welcome and will be credited in any future work. Please get in touch! See [CONTRIBUTING.md](https://github.com/mwalmsley/zoobot/blob/main/benchmarks) for more. | ||
|
||
|
@@ -176,6 +156,8 @@ The [benchmarks](https://github.com/mwalmsley/zoobot/blob/main/benchmarks) folde | |
|
||
Training Zoobot using the GZ DECaLS dataset option will create models very similar to those used for the GZ DECaLS catalogue and shared with the early versions of this repo. The GZ DESI Zoobot model is trained on additional data (GZD-1, GZD-2), as the GZ Evo Zoobot model (GZD-1/2/5, Hubble, Candels, GZ2). | ||
|
||
**Pretraining is becoming increasingly complex and is now partially refactored out to a separate repository. We are gradually migrating this `zoobot` repository to focus on finetuning.** | ||
|
||
### Citing | ||
|
||
If you use this software, or otherwise wish to cite Zoobot as a software package, please use the [JOSS paper](https://doi.org/10.21105/joss.05312): | ||
|
@@ -184,10 +166,14 @@ If you use this software, or otherwise wish to cite Zoobot as a software package | |
|
||
You might be interested in reading papers using Zoobot: | ||
|
||
- [Galaxy Zoo DECaLS](https://arxiv.org/abs/2102.08414) (first use at Galaxy Zoo) | ||
- [A Comparison of Deep Learning Architectures for Optical Galaxy Morphology Classification](https://arxiv.org/abs/2111.04353) | ||
- [Practical Galaxy Morphology Tools from Deep Supervised Representation Learning](https://arxiv.org/abs/2110.12735) | ||
- [Towards Foundation Models for Galaxy Morphology](https://arxiv.org/abs/2206.11927) (adding contrastive learning) | ||
- [Harnessing the Hubble Space Telescope Archives: A Catalogue of 21,926 Interacting Galaxies](https://arxiv.org/abs/2303.00366) | ||
|
||
Many other works use Zoobot indirectly via the [Galaxy Zoo DECaLS](https://arxiv.org/abs/2102.08414) catalog. | ||
- [Galaxy Zoo DECaLS: Detailed visual morphology measurements from volunteers and deep learning for 314,000 galaxies](https://arxiv.org/abs/2102.08414) (2022) | ||
- [A Comparison of Deep Learning Architectures for Optical Galaxy Morphology Classification](https://arxiv.org/abs/2111.04353) (2022) | ||
- [Practical Galaxy Morphology Tools from Deep Supervised Representation Learning](https://arxiv.org/abs/2110.12735) (2022) | ||
- [Towards Foundation Models for Galaxy Morphology](https://arxiv.org/abs/2206.11927) (2022) | ||
- [Harnessing the Hubble Space Telescope Archives: A Catalogue of 21,926 Interacting Galaxies](https://arxiv.org/abs/2303.00366) (2023) | ||
- [Galaxy Zoo DESI: Detailed morphology measurements for 8.7M galaxies in the DESI Legacy Imaging Surveys](https://academic.oup.com/mnras/advance-article/doi/10.1093/mnras/stad2919/7283169?login=false) (2023) | ||
- [Galaxy mergers in Subaru HSC-SSP: A deep representation learning approach for identification, and the role of environment on merger incidence](https://doi.org/10.1051/0004-6361/202346743) (2023) | ||
- [Astronomaly at Scale: Searching for Anomalies Amongst 4 Million Galaxies](https://arxiv.org/abs/2309.08660) (2023, submitted) | ||
- [Transfer learning for galaxy feature detection: Finding Giant Star-forming Clumps in low redshift galaxies using Faster R-CNN](https://arxiv.org/abs/2312.03503) (2023, submitted) | ||
|
||
Many other works use Zoobot indirectly via the [Galaxy Zoo DECaLS](https://arxiv.org/abs/2102.08414) catalog (and now via the new [Galaxy Zoo DESI](https://academic.oup.com/mnras/advance-article/doi/10.1093/mnras/stad2919/7283169?login=false) catalog). |
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -5,7 +5,7 @@ | |
|
||
setuptools.setup( | ||
name="zoobot", | ||
version="1.0.5", | ||
version="2.0.0", | ||
author="Mike Walmsley", | ||
author_email="[email protected]", | ||
description="Galaxy morphology classifiers", | ||
|
@@ -22,51 +22,61 @@ | |
packages=setuptools.find_packages(), | ||
python_requires=">=3.8", # recommend 3.9 for new users. TF needs >=3.7.2, torchvision>=3.8 | ||
extras_require={ | ||
'pytorch_cpu': [ | ||
'pytorch-cpu': [ | ||
# A100 GPU currently only seems to support cuda 11.3 on manchester cluster, let's stick with this version for now | ||
# very latest version wants cuda 11.6 | ||
'torch == 1.12.1+cpu', | ||
'torchvision == 0.13.1+cpu', | ||
'torchaudio == 0.12.1', | ||
'torch == 2.1.0+cpu', | ||
'torchvision == 0.16.0+cpu', | ||
'torchaudio >= 2.1.0', | ||
'pytorch-lightning >= 2.0.0', | ||
# 'simplejpeg', | ||
'albumentations', | ||
'pyro-ppl == 1.8.0', | ||
'pyro-ppl >= 1.8.6', | ||
'torchmetrics == 0.11.0', | ||
'timm == 0.6.12' | ||
'timm == 0.9.10' | ||
], | ||
'pytorch_m1': [ | ||
'pytorch-m1': [ | ||
# as above but without the +cpu (and the extra-index-url in readme has no effect) | ||
# all matching pytorch versions for an m1 system will be cpu | ||
'torch == 1.12.1', | ||
'torchvision == 0.13.1', | ||
'torchaudio == 0.12.1', | ||
'torch == 2.1.0', | ||
'torchvision == 0.16.0', | ||
'torchaudio >= 2.1.0', | ||
'pytorch-lightning >= 2.0.0', | ||
'albumentations', | ||
'pyro-ppl == 1.8.0', | ||
'pyro-ppl >= 1.8.6', | ||
'torchmetrics == 0.11.0', | ||
'timm == 0.6.12' | ||
'timm >= 0.9.10' | ||
], | ||
# as above but without pytorch itself | ||
# for GPU, you will also need e.g. cudatoolkit=11.3, 11.6 | ||
# https://pytorch.org/get-started/previous-versions/#v1121 | ||
'pytorch_cu113': [ | ||
'torch == 1.12.1+cu113', | ||
'torchvision == 0.13.1+cu113', | ||
'torchaudio == 0.12.1', | ||
'pytorch-cu118': [ | ||
'torch == 2.1.0+cu118', | ||
'torchvision == 0.16.0+cu118', | ||
'torchaudio >= 2.1.0', | ||
'pytorch-lightning >= 2.0.0', | ||
'albumentations', | ||
'pyro-ppl == 1.8.0', | ||
'pyro-ppl >= 1.8.6', | ||
'torchmetrics == 0.11.0', | ||
'timm == 0.6.12' | ||
], | ||
'pytorch_colab': [ | ||
'timm >= 0.9.10' | ||
], # exactly as above, but _cu121 for cuda 12.1 (the current default) | ||
'pytorch-cu121': [ | ||
'torch == 2.1.0+cu121', | ||
'torchvision == 0.16.0+cu121', | ||
'torchaudio >= 2.1.0', | ||
'pytorch-lightning >= 2.0.0', | ||
'albumentations', | ||
'pyro-ppl >= 1.8.6', | ||
'torchmetrics == 0.11.0', | ||
'timm >= 0.9.10' | ||
], | ||
'pytorch-colab': [ | ||
# colab includes pytorch already | ||
'pytorch-lightning >= 2.0.0', | ||
'albumentations', | ||
'pyro-ppl>=1.8.0', | ||
'torchmetrics==0.11.0', | ||
'timm == 0.6.12' | ||
'timm == 0.9.10' | ||
], | ||
# TODO may add narval/Digital Research Canada config | ||
'tensorflow': [ | ||
|
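This diff renames the extras from underscore to hyphen form (for example `pytorch_cpu` to `pytorch-cpu`), while the README still mentions `pytorch_colab` and `_cu118`. Tools that implement PEP 685 normalise extra names before comparing them, so both spellings should resolve to the same extra there. Older tooling may not. The sketch below reproduces that normalisation rule. The function name `normalize_extra` is hypothetical.

```python
import re


def normalize_extra(name):
    """Normalise an extra name as described in PEP 685 (same rule as PEP 503)."""
    # Runs of "-", "_" and "." collapse to a single "-", then lowercase.
    return re.sub(r"[-_.]+", "-", name).lower()


if __name__ == "__main__":
    for old in ["pytorch_cpu", "pytorch_m1", "pytorch_colab"]:
        print(old, "->", normalize_extra(old))
```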
Empty file.