Add nanopet architecture (#388)
frostedoyster authored Dec 4, 2024
1 parent 99ae60c commit 09829ba
Showing 28 changed files with 2,628 additions and 28 deletions.
1 change: 1 addition & 0 deletions .github/workflows/architecture-tests.yml
@@ -15,6 +15,7 @@ jobs:
- architecture-name: gap
- architecture-name: soap-bpnn
- architecture-name: pet
- architecture-name: nanopet

runs-on: ubuntu-22.04

1 change: 1 addition & 0 deletions CODEOWNERS
@@ -6,3 +6,4 @@
**/alchemical_model @abmazitov
**/pet @abmazitov
**/gap @DavideTisi
**/nanopet @frostedoyster
5 changes: 5 additions & 0 deletions docs/src/advanced-concepts/fitting-generic-targets.rst
@@ -38,6 +38,11 @@ capabilities of the architectures in metatrain.
- No
- No
- No
* - NanoPET
- Energy, forces, stress/virial
- Yes
- No
- Only with ``rank=1`` (vectors)


Preparing generic targets for reading by metatrain
110 changes: 110 additions & 0 deletions docs/src/architectures/nanopet.rst
@@ -0,0 +1,110 @@
.. _architecture-nanopet:

NanoPET
=======

.. warning::

This is an **experimental model**. You should not use it for anything important.

This is a more user-friendly re-implementation of the original PET (which lives in
https://github.com/spozdn/pet), with slightly improved training and evaluation speed.

Installation
------------
To install the package, you can run the following command in the root
directory of the repository:

.. code-block:: bash

    pip install .[nanopet]

This will install the package with the nanoPET dependencies.


Default Hyperparameters
-----------------------
The default hyperparameters for the nanoPET model are:

.. literalinclude:: ../../../src/metatrain/experimental/nanopet/default-hypers.yaml
    :language: yaml


Tuning Hyperparameters
----------------------
The default hyperparameters above will work well in most cases, but they
may not be optimal for your specific dataset. In general, the most important
hyperparameters to tune are (in decreasing order of importance):

- ``cutoff``: This should be set to a value after which most of the interactions between
  atoms are expected to be negligible. A lower cutoff will lead to faster models.
- ``learning_rate``: The learning rate for the neural network. This hyperparameter
  controls how much the weights of the network are updated at each step of the
  optimization. A larger learning rate will lead to faster training, but might cause
  instability and/or divergence.
- ``batch_size``: The number of samples to use in each batch of training. This
  hyperparameter controls the tradeoff between training speed and memory usage. In
  general, larger batch sizes will lead to faster training, but might require more
  memory.
- ``d_pet``: This hyperparameter controls the width of the neural network. In general,
  increasing it might lead to better accuracy, especially on larger datasets, at the
  cost of increased training and evaluation time.
- ``num_gnn_layers``: The number of graph neural network layers. In general, decreasing
  this hyperparameter to 1 will lead to much faster models, at the expense of accuracy.
  Increasing it may or may not lead to better accuracy, depending on the dataset, at the
  cost of increased training and evaluation time.
- ``num_attention_layers``: The number of attention layers in each layer of the graph
  neural network. Depending on the dataset, increasing this hyperparameter might lead to
  better accuracy, at the cost of increased training and evaluation time.
- ``loss``: This section describes the loss function to be used. It has three
  subsections (a combined example of overriding these hyperparameters is given right
  after this list):

  1. ``weights``: controls the weighting of the different contributions to the loss
     (e.g., energy, forces, virial, etc.). The default values of 1.0 for all targets
     work well for most datasets, but they might need to be adjusted. For example, to
     set a weight of 1.0 for the energy and 0.1 for the forces, you can set the
     following in the ``options.yaml`` file under ``loss``:
     ``weights: {"energy": 1.0, "forces": 0.1}``.
  2. ``type``: controls the type of loss to be used. The default value is ``mse``;
     the other options are ``mae`` and ``huber``. ``huber`` is a subsection of its
     own, and it requires the user to specify the ``deltas`` parameter in a similar
     way to how the ``weights`` are specified (e.g.,
     ``deltas: {"energy": 0.1, "forces": 0.01}``).
  3. ``reduction``: controls how the loss is reduced over batches. The default value
     is ``sum``; the other allowed option is ``mean``.
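
As an illustration, a minimal ``options.yaml`` override touching the hyperparameters
above might look like the following sketch. It assumes the same nesting as the default
hyperparameters shown earlier; the numerical values are placeholders, not
recommendations:

.. code-block:: yaml

    architecture:
      name: experimental.nanopet
      model:
        cutoff: 4.5
      training:
        batch_size: 32
        learning_rate: 1e-4
        loss:
          weights: {"energy": 1.0, "forces": 0.1}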


All Hyperparameters
-------------------
:param name: ``experimental.nanopet``

model
#####

The model-related hyperparameters are

:param cutoff: Spherical cutoff to use for atomic environments
:param cutoff_width: Width of the shifted cosine cutoff function
:param d_pet: Width of the neural network
:param num_heads: Number of attention heads
:param num_attention_layers: Number of attention layers in each GNN layer
:param num_gnn_layers: Number of GNN layers
:param zbl: Whether to use the ZBL short-range repulsion as the baseline for the model

training
########

The hyperparameters for training are

:param distributed: Whether to use distributed training
:param distributed_port: Port to use for distributed training
:param batch_size: Batch size for training
:param num_epochs: Number of epochs to train for
:param learning_rate: Learning rate for the optimizer
:param scheduler_patience: Patience for the learning rate scheduler
:param scheduler_factor: Factor to reduce the learning rate by
:param log_interval: Interval at which to log training metrics
:param checkpoint_interval: Interval at which to save model checkpoints
:param fixed_composition_weights: Weights for fixed atomic contributions to scalar
  targets
:param per_structure_targets: Targets to calculate per-structure losses for
:param log_mae: Whether to log the MAE (mean absolute error) of the model in addition
  to the RMSE
:param loss: The loss function to use, with the subfields described in the previous
  section
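
To make the field list above concrete, here is an illustrative sketch of a ``training``
override in ``options.yaml``, again following the nesting of the default
hyperparameters; the values and target names are placeholders only:

.. code-block:: yaml

    architecture:
      name: experimental.nanopet
      training:
        num_epochs: 500
        batch_size: 8
        log_mae: True
        per_structure_targets: ["energy"]
        loss:
          type: mae
          reduction: mean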
1 change: 1 addition & 0 deletions pyproject.toml
@@ -67,6 +67,7 @@ alchemical-model = [
pet = [
    "pet @ git+https://github.com/lab-cosmo/pet@5d40710",
]
nanopet = []
gap = [
    "rascaline-torch @ git+https://github.com/luthaf/rascaline@5326b6e#subdirectory=python/rascaline-torch",
    "skmatter",
13 changes: 13 additions & 0 deletions src/metatrain/experimental/nanopet/__init__.py
@@ -0,0 +1,13 @@
from .model import NanoPET
from .trainer import Trainer

__model__ = NanoPET
__trainer__ = Trainer

__authors__ = [
    ("Filippo Bigi <[email protected]>", "@frostedoyster"),
]

__maintainers__ = [
    ("Filippo Bigi <[email protected]>", "@frostedoyster"),
]
30 changes: 30 additions & 0 deletions src/metatrain/experimental/nanopet/default-hypers.yaml
@@ -0,0 +1,30 @@
architecture:

  name: experimental.nanopet

  model:
    cutoff: 5.0
    cutoff_width: 0.5
    d_pet: 128
    num_heads: 4
    num_attention_layers: 2
    num_gnn_layers: 2
    zbl: False

  training:
    distributed: False
    distributed_port: 39591
    batch_size: 16
    num_epochs: 10000
    learning_rate: 3e-4
    scheduler_patience: 100
    scheduler_factor: 0.8
    log_interval: 10
    checkpoint_interval: 100
    fixed_composition_weights: {}
    per_structure_targets: []
    log_mae: False
    loss:
      type: mse
      weights: {}
      reduction: sum
