Merge pull request #1 from uhh-pd-ml/import-fix
Update README and fix import error
joschkabirk authored May 22, 2024
2 parents b7f2fc4 + 5108669 · commit c73ef04
Showing 4 changed files with 18 additions and 8 deletions.
README.md — 18 changes: 17 additions & 1 deletion

@@ -47,6 +47,22 @@ repository [jet-universe/particle_transformer](https://github.com/jet-universe/particle_transformer)
 The recommended (and by us tested) way of running the code is to use the
 provided docker image at
 [`jobirk/omnijet` on DockerHub](https://hub.docker.com/repository/docker/jobirk/omnijet/general).
+The requirements listed in `docker/requirements.txt` are installed in the
+`base` `conda` environment of the base image (the official PyTorch image).
+You therefore have to make sure that this `conda` environment is activated
+when running the code, which can be done with `source /opt/conda/bin/activate`.
+
+An interactive session inside a container can be started with the following commands:
+
+```shell
+# on a machine with Singularity
+singularity shell docker://jobirk/omnijet:latest  # start a shell in the container
+source /opt/conda/bin/activate                    # activate the conda environment in the container
+#
+# on a machine with Docker
+docker run -it --rm jobirk/omnijet:latest bash    # start a shell in the container
+source /opt/conda/bin/activate                    # activate the conda environment in the container
+```

 Alternatively, you can install the requirements from the `docker/requirements.txt` file, but
 you'll have to add `pytorch` to the list of requirements, since this is not
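
A minimal sketch of that Docker-free route, assuming a fresh virtual environment at the repository root; the exact `torch` install command depends on your platform and CUDA setup, so treat these lines as an illustration rather than the project's documented procedure:

```shell
# sketch only: pick the torch install command that matches your CUDA setup
python -m venv .venv && source .venv/bin/activate
pip install torch                        # torch is not listed in docker/requirements.txt
pip install -r docker/requirements.txt   # the remaining dependencies
```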
@@ -96,7 +112,7 @@ python scripts/create_tokenized_dataset.py --ckpt_path=<path to the checkpoint>
 Make sure to adjust the `--n_files_*` arguments to your needs, and set the env variables
 `JETCLASS_DIR` and `JETCLASS_DIR_TOKENIZED` in the `.env` file.

-Afterwards, the tokenized dataset will be saved in if a subdirectory of the
+Afterwards, the tokenized dataset will be saved in a subdirectory of the
 `JETCLASS_DIR_TOKENIZED` directory and can be used to train the backbone model.

 ### Generative training
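
The `.env` file referenced in the hunk above is not part of this diff; a minimal sketch, with placeholder paths that are not taken from the repository:

```shell
# hypothetical .env file; both paths are placeholders
JETCLASS_DIR="/path/to/JetClass"
JETCLASS_DIR_TOKENIZED="/path/to/JetClass_tokenized"
```

As the README text implies, the first variable points at the raw JetClass dataset and the second at the directory where the tokenized dataset is written.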
configs/experiment/example_experiment_generative.yaml — 1 change: 0 additions & 1 deletion

@@ -79,7 +79,6 @@ model:
 n_GPT_blocks: 3
 n_heads: 8
 verbosity: false
-use_parallel_heads: true
 # --- optimizer configuration ---
 optimizer:
   _target_: torch.optim.AdamW
configs/experiment/experiment_backbone_generative.yaml — 1 change: 0 additions & 1 deletion

@@ -90,7 +90,6 @@ model:
 n_GPT_blocks: 3
 n_heads: 8
 verbosity: false
-use_parallel_heads: true
 # --- optimizer configuration ---
 optimizer:
   _target_: torch.optim.AdamW
gabbro/models/backbone.py — 6 changes: 1 addition & 5 deletions

@@ -16,7 +16,6 @@

 from gabbro.metrics.utils import calc_accuracy
 from gabbro.models.gpt_model import BackboneModel
-from gabbro.models.gpt_model_sequential import FullModel  # noqa: E402

 vector.register_awkward()

@@ -54,10 +53,7 @@ def __init__(
         self.save_hyperparameters(logger=False)

         # initialize the backbone
-        if model_kwargs.get("use_parallel_heads", False):
-            self.module = BackboneModel(**model_kwargs)
-        else:
-            self.module = FullModel(**model_kwargs)
+        self.module = BackboneModel(**model_kwargs)

         # initialize the model head
         self.head = NextTokenPredictionHead(
