
PyTorch wrapper #5

Merged: 33 commits, Oct 28, 2020

Conversation

raimis
Contributor

@raimis raimis commented Sep 25, 2020

  • Autograd
  • PBC
    • Non-periodic
    • Periodic
  • Devices
    • CPU
    • GPU
  • Integration with TorchANI
    • Model serialization
    • Tests
  • Documentation
    • Installation
    • Usage
    • Docstrings

More discussions in #3

@peastman
Member

Looks great! How about the backward pass? That will be essential for MD.

I'm not familiar with how custom PyTorch ops handle types. You've written it so all the arguments are either double or int64. Much more typically they'll be float and int32. Can you make it accept those and process them directly without needing type conversions?

@raimis
Contributor Author

raimis commented Sep 28, 2020

I'm working on the backward pass.

Regarding the types: https://pytorch.org/tutorials/advanced/torch_script_custom_ops.html

> The TorchScript compiler understands a fixed number of types. Only these types can be used as arguments to your custom operator. Currently these types are: torch::Tensor, torch::Scalar, double, int64_t and std::vectors of these types. Note that only double and not float, and only int64_t and not other integral types such as int, short or long are supported.

@peastman
Member

If we make the arguments have type torch::Tensor I believe that should do what we want. You can then cast the tensor if necessary and pull out a pointer to the data. Something like this.

tensor = tensor.to(torch::kFloat32);     // cast to float32 (a no-op if it already is)
float* data = tensor.data_ptr<float>();  // raw pointer to the tensor's storage

This will be important for the CUDA version, since it will give us a pointer to the data on the GPU, which saves having to copy it to the host and then back again.

@raimis
Contributor Author

raimis commented Sep 29, 2020

The atomic positions and gradients are now passed as torch::Tensor.

@raimis
Contributor Author

raimis commented Sep 30, 2020

I have a working ANISymmetryFunction operation in PyTorch.

Overall, the wrapper has three components:

  • ANISymmetryFunction function, which is exposed in PyTorch.
  • GradANISymmetryFunction class, which implements the autograd interface.
  • CustomANISymmetryFunctions class, which wraps the ANISymmetryFunctions class and passes it between the forward and backward passes.

The PyTorch API favours functional programming, which is at odds with the object-oriented design of ANISymmetryFunctions. At the moment, the ANISymmetryFunctions constructor is called each time PyTorch executes the operation, which isn't optimal.
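
To make the structure concrete, here is a minimal Python sketch of that pattern (the real code is a C++ op; the class name GradANISymmetryFunction matches the description above, but the stub object and its toy feature are purely illustrative). The object is constructed in forward and handed to backward through the autograd context.

```python
import torch

class SymmetryFunctionsStub:
    """Illustrative stand-in for the C++ ANISymmetryFunctions object."""

    def forward(self, positions):
        self.positions = positions.detach()
        return (self.positions ** 2).sum(dim=-1)  # toy per-atom "feature"

    def backward(self, grad_features):
        # Analytic gradient of the toy feature with respect to the positions.
        return 2.0 * self.positions * grad_features.unsqueeze(-1)

class GradANISymmetryFunction(torch.autograd.Function):
    @staticmethod
    def forward(ctx, species, positions):
        # The real wrapper constructs a (Cuda)ANISymmetryFunctions object here on
        # every call, which is the overhead discussed later in this thread.
        ctx.holder = SymmetryFunctionsStub()
        return ctx.holder.forward(positions)

    @staticmethod
    def backward(ctx, grad_features):
        # The object created in forward is reused for the backward pass.
        return None, ctx.holder.backward(grad_features)  # no gradient w.r.t. species

# Usage: features = GradANISymmetryFunction.apply(species, positions)
```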

@raimis
Contributor Author

raimis commented Sep 30, 2020

Quick performance benchmark

Molecule: 46 atoms
GPU: GTX 1080 Ti
Execution time is averaged over 10000 consecutive calls

  • TorchANI 2.2 featurizer (pure PyTorch implementation): ~6.5 ms
  • ANISymmetryFunction operation (called via PyTorch): 2 ms
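
A timing loop of roughly this shape gives such an average (a hypothetical reconstruction, not the actual benchmark script; run is whatever callable evaluates the featurizer):

```python
import time
import torch

def average_time(run, n_calls=10000, n_warmup=10):
    """Average the wall-clock time of run() over consecutive calls."""
    for _ in range(n_warmup):   # warm-up so one-off CUDA setup doesn't skew the average
        run()
    torch.cuda.synchronize()    # make sure all queued GPU work is finished
    start = time.time()
    for _ in range(n_calls):
        run()
    torch.cuda.synchronize()    # flush the GPU queue before stopping the clock
    return (time.time() - start) / n_calls

# e.g. average_time(lambda: featurizer((species, positions)))  # TorchANI-style call
```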

@raimis
Contributor Author

raimis commented Sep 30, 2020

I hacked the wrapper to reuse the CudaANISymmetryFunctions object. It improves the performance, but breaks the serialization of PyTorch models.

  • ANISymmetryFunction operation (reusing CudaANISymmetryFunctions object): 1.5 ms
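
A sketch of that kind of hack (hypothetical Python; the actual change was inside the C++ wrapper): cache the expensive object outside the operation and reuse it on later calls. State hidden in a global like this is invisible to TorchScript, which is why model serialization breaks.

```python
# Hypothetical illustration of the caching hack: the expensive object (standing in
# for CudaANISymmetryFunctions) is built once and reused by every forward() call.
_cached_holder = None

def get_holder(factory):
    global _cached_holder
    if _cached_holder is None:
        _cached_holder = factory()   # expensive construction happens only once
    return _cached_holder
```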

@raimis
Contributor Author

raimis commented Sep 30, 2020

In addition, I have made a mock-up of the wrapper, which doesn't do any calculations.

  • ANISymmetryFunction operation (mock up): 1.3 ms

So out of the 2 ms, 1.3 ms is PyTorch overhead, 0.5 ms goes to the CudaANISymmetryFunctions constructor, and 0.2 ms is the actual calculation.

@peastman
Member

peastman commented Oct 6, 2020

The way this is implemented has very high overhead. You construct a new ANISymmetryFunctions object when forward() is called, delete it when backward() is called, and then need to create a new one from scratch on the next time step. We really want a Python ANISymmetryFunctions class that extends torch.nn.Module and does all memory allocation and initialization in its constructor. You should then be able to use it as many times as you want with no further overhead.

That class should also be independent of TorchANI. Having a separate class that provides easy integration with TorchANI is useful, but that necessarily adds overhead and long term I think our goal is to completely replace TorchANI. So we first want a minimum overhead wrapper that directly exposes this code to PyTorch in the simplest way, and TorchANI integration can be built on top of that.
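
A sketch of the proposed structure (hypothetical names; the buffer stands in for whatever the C++ object allocates): the expensive setup happens once in the constructor, and forward() only runs the kernels.

```python
import torch

class ANISymmetryFunctionsModule(torch.nn.Module):
    """Hypothetical sketch: allocate once in __init__, reuse on every call."""

    def __init__(self, num_features):
        super().__init__()
        # Stand-in for constructing the C++ ANISymmetryFunctions object and
        # allocating its buffers, done once when the module is created.
        self.register_buffer("workspace", torch.zeros(num_features))

    def forward(self, positions):
        # Stand-in for launching the symmetry-function kernels; no construction
        # or allocation happens here, so repeated calls add no extra overhead.
        return self.workspace + positions.pow(2).sum()
```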

@raimis
Contributor Author

raimis commented Oct 7, 2020

> The way this is implemented has very high overhead. You construct a new ANISymmetryFunctions object when forward() is called, delete it when backward() is called, and then need to create a new one from scratch on the next time step. We really want a Python ANISymmetryFunctions class that extends torch.nn.Module and does all memory allocation and initialization in its constructor. You should then be able to use it as many times as you want with no further overhead.

  • torch.nn.Module is just a high-level abstraction. The computational graph is implemented and executed in terms of PyTorch operations.
  • For a PyTorch operation to work with autograd, the implementation is limited to two functions (for the forward and backward passes) with a limited mechanism to pass data (forward -> backward).
  • There is no way to initialise a PyTorch operation in advance or to reuse objects, at least none that I have found. As mentioned in PyTorch wrapper #5 (comment), PyTorch favours functional programming. The computational graph is constructed and executed dynamically on the fly and discarded afterwards.

> That class should also be independent of TorchANI. Having a separate class that provides easy integration with TorchANI is useful, but that necessarily adds overhead and long term I think our goal is to completely replace TorchANI. So we first want a minimum overhead wrapper that directly exposes this code to PyTorch in the simplest way, and TorchANI integration can be built on top of that.

The functionality is already directly exposed in PyTorch via torch.ops.NNPOps.ANISymmetryFunctions (with as little overhead as PyTorch allows), and the TorchANI integration is built on top of that.
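
For illustration, reaching a registered custom op from Python looks roughly like this (the library filename and the op's argument list are assumptions, not the actual signature):

```python
import torch

# Load the shared library that registers the custom op with TorchScript
# (the filename here is an assumption).
torch.ops.load_library("libNNPOpsPyTorch.so")

# The op then appears under torch.ops.<namespace>.<name> and can be called from
# eager code or inside a scripted model; the arguments below are illustrative only.
# features = torch.ops.NNPOps.ANISymmetryFunctions(species, positions)
```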

@raimis
Contributor Author

raimis commented Oct 7, 2020

I have renamed the PyTorch module to TorchANISymmetryFunctions to make it clear that it is TorchANI specific. The rest of the components are general.

@peastman
Member

peastman commented Oct 8, 2020

Could you elaborate? Certainly you can implement new calculations using torch.autograd.Function objects, but that doesn't rule out implementing them with torch.nn.Module objects. Both classes define forward() and backward() methods. The documentation provides examples of writing both in C++. One is a functional API and the other is an object-oriented API, depending on which suits your needs better.

@raimis
Contributor Author

raimis commented Oct 9, 2020

> Both classes define forward() and backward() methods.

Where have you seen torch.nn.Module.backward? It isn't mentioned in either the Python API or the C++ API.

@proteneer

Modules in PyTorch are really just functors that allow you to perform RAII.

The links you provided are for the base class. It is up to you to implement the .backward() and .forward() calls. The backward() call is really just a vector-Jacobian product. Its signature is identical to the one you'd use to implement Function.backward().
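
A common way to combine the two, as a hedged sketch with toy math: keep the long-lived state in a torch.nn.Module and route the computation through a torch.autograd.Function, whose backward() is exactly the vector-Jacobian product described above.

```python
import torch

class _ScaledSquare(torch.autograd.Function):
    """Toy custom function: autograd only ever sees this forward/backward pair."""

    @staticmethod
    def forward(ctx, x, scale):
        ctx.save_for_backward(x)
        ctx.scale = scale
        return scale * x * x

    @staticmethod
    def backward(ctx, grad_out):
        (x,) = ctx.saved_tensors
        # Vector-Jacobian product: d(scale * x^2)/dx = 2 * scale * x.
        return grad_out * 2.0 * ctx.scale * x, None

class ScaledSquare(torch.nn.Module):
    """The module owns the reusable state; forward() defers to the Function."""

    def __init__(self, scale):
        super().__init__()
        self.scale = scale  # stand-in for an expensive, reusable C++ object

    def forward(self, x):
        return _ScaledSquare.apply(x, self.scale)
```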

@raimis
Contributor Author

raimis commented Oct 9, 2020

@proteneer In the autograd documentation, there is only an example with torch::autograd::Function (https://pytorch.org/tutorials/advanced/cpp_autograd.html#using-custom-autograd-function-in-c). Do you know an equivalent example with torch::nn::Module?

@raimis
Contributor Author

raimis commented Oct 9, 2020

End-to-end performance benchmarks of ANI-2x

Molecule: 46 atoms (pytorch/molecules/2iuz_ligand.mol2)
GPU: GTX 1080 Ti

Forward & backward passes with complete ANI-2x:

  • TorchANI with original featurizer: 90 ms
  • TorchANI with our featurizer: 81 ms

Just forward pass with complete ANI-2x:

  • TorchANI with original featurizer: 25 ms
  • TorchANI with our featurizer: 23 ms

Forward & backward passes with ANI-2x using just one set of the atomic NNs, not 8:

  • TorchANI with original featurizer: 11 ms
  • TorchANI with our featurizer: 6.8 ms

Just forward pass with ANI-2x using just one set of the atomic NNs, not 8:

  • TorchANI with original featurizer: 6.3 ms
  • TorchANI with our featurizer: 3.7 ms

@peastman
Member

peastman commented Oct 9, 2020

Looks like the neural net part is now the bottleneck. From the benchmarks in #6, doing both forward and backward passes through the features takes only 0.115 ms for a system of 60 atoms, and 1.04 ms for a system of 2269 atoms.

Do you have a sense of what makes the neural net part slow? Can we make it faster from within PyTorch, or do we need a custom kernel for that part too?

Also, in the above numbers, how much of the time is spent constructing and destructing CudaANISymmetryFunction objects, and how much is spent in the kernels?

@raimis
Contributor Author

raimis commented Oct 13, 2020

Let's move the discussion about the NN part to #11.

@raimis raimis mentioned this pull request Oct 15, 2020
@raimis raimis marked this pull request as ready for review October 22, 2020 10:15
@raimis
Contributor Author

raimis commented Oct 22, 2020

@peastman the first iteration of the PyTorch wrapper of NNPOps is done!

At the moment, it exposes just NNPOps.SymmetryFunctions.TorchANISymmetryFunctions, but it demonstrates how to make custom PyTorch operations that work with automatic differentiation and model serialisation.
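
For completeness, usage is expected to look roughly like the sketch below (the import path follows the module name above, but the constructor arguments and exact API should be checked against the code in this PR):

```python
import torch
import torchani
from NNPOps.SymmetryFunctions import TorchANISymmetryFunctions

device = torch.device("cuda")
model = torchani.models.ANI2x(periodic_table_index=True).to(device)

# Swap TorchANI's AEV computer for the accelerated symmetry functions
# (assumed here to be a drop-in replacement wrapping the original module).
model.aev_computer = TorchANISymmetryFunctions(model.aev_computer)

species = torch.tensor([[8, 1, 1]], device=device)  # e.g. a water molecule
positions = torch.tensor([[[0.00, 0.00, 0.00],
                           [0.00, 0.00, 0.96],
                           [0.93, 0.00, -0.24]]],
                         device=device, requires_grad=True)

energies = model((species, positions)).energies
forces = -torch.autograd.grad(energies.sum(), positions)[0]  # forces via autograd
```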

Remaining problems:

  • The wrapper isn't as efficient as it could be.
  • Only one molecule can be computed at a time, i.e. batched computation isn't supported.
  • Only 0D (non-periodic) or 3D periodic boundary conditions are supported.

@peastman
Member

Nice!

@proteneer do you have any ideas about how we could make the wrapping more efficient? Is there a way we could create the C++ object just once and use it repeatedly, instead of having to create a new one on every evaluation? This will be even more important for SchNet, since there we'll want to build a neighbor list structure once and use it repeatedly for all the layers within a single evaluation.

@raimis
Contributor Author

raimis commented Oct 28, 2020

I have tried a few ideas to make the wrapping more efficient, but nothing better came out of it, except that I found and fixed a bug in the device management.

This is ready for merging. Or do you have more comments, @peastman and @proteneer?

@peastman
Member

Great! I'll go ahead and merge it. It would still be good to try to restructure it following the pattern in #10 (comment), since I think that will substantially improve performance.

@peastman peastman merged commit 667a282 into openmm:master Oct 28, 2020