
Non-Equivariant BatchNorm #16

Open
oriondollar opened this issue Feb 10, 2023 · 2 comments

@oriondollar

Hi there,

I've been playing with your codebase to see how equivariant features propagate through the different layer types, and I think there might be an error in your code. The `AttentionInteractionBlockVN` normalizes the vector representation with a standard `nn.LayerNorm` layer, which breaks the equivariance of the vector representations inside the encoder. Was this intended? I'm not sure how much of an effect it has on the rest of the model, since the ligand and pocket are jointly encoded. Similarly, a standard `nn.Linear` layer is used to embed the initial atomic vector representation, which also breaks the equivariance between the atomic coordinates and the learned embeddings at the input.
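
For reference, here is a minimal sketch of the kind of check that shows the issue (the sizes, the random rotation, and the perturbed affine parameters are placeholders standing in for a trained layer, not the repo's actual setup):

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Placeholder sizes; the real model takes its channel counts from hidden_channels.
num_atoms, vec_channels = 8, 16
x = torch.randn(num_atoms, vec_channels, 3)      # per-atom vector features

# Build a random proper rotation.
rot, _ = torch.linalg.qr(torch.randn(3, 3))
if torch.det(rot) < 0:
    rot = -rot                                   # flip to determinant +1

layernorm_vec = nn.LayerNorm([vec_channels, 3])  # default: elementwise_affine=True
with torch.no_grad():                            # move gamma/beta away from their identity
    layernorm_vec.weight.uniform_(0.5, 1.5)      # initialization, as a trained layer would
    layernorm_vec.bias.normal_(0.0, 0.1)

rotate_then_norm = layernorm_vec(x @ rot.T)      # f(R x)
norm_then_rotate = layernorm_vec(x) @ rot.T      # R f(x)

# For a rotation-equivariant layer these two would agree; here they do not.
print((rotate_then_norm - norm_then_rotate).abs().max())
```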

@pengxingang (Owner)

Hi, thanks for your interest in our work.

First, the layer normalization is defined as
$$y=\frac{x - \mathbf{E}[x]}{\sqrt{\mathbf{Var}[x] + \epsilon}} * \gamma + \beta.$$
The layer norm for the vector features was defined as `self.layernorm_vec = LayerNorm([hidden_channels[1], 3])`, so the normalization operates over the feature dimensions `[hidden_channels[1], 3]`. The operation
$$\frac{x - \mathbf{E}[x]}{\sqrt{\mathbf{Var}[x] + \epsilon}}$$
does not violate equivariance, because $\mathbf{Var}[R\circ x]=\mathbf{Var}[x]$ and $R\circ(x-\mathbf{E}[x]) = R\circ x-\mathbf{E}[R\circ x]$, where $R$ is an operation in the E(3) group. So if $\gamma$ and $\beta$ are scalar values, this layer normalization keeps equivariance. Unfortunately, the default settings of `torch.nn.LayerNorm` define $\gamma$ and $\beta$ as tensors with the same shape as the feature dimensions, so the original code does violate equivariance. This can be fixed by setting `elementwise_affine=False` in `torch.nn.LayerNorm`.
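
Concretely, the change would look something like this (the `hidden_channels` values below are placeholders, not the actual config):

```python
import torch.nn as nn

hidden_channels = (256, 64)   # placeholder values; the real ones come from the model config

# Original definition (default elementwise_affine=True learns per-element gamma/beta):
# layernorm_vec = nn.LayerNorm([hidden_channels[1], 3])

# Suggested fix: drop the learnable per-element affine parameters.
layernorm_vec = nn.LayerNorm([hidden_channels[1], 3], elementwise_affine=False)
```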

Actually, the layernorm layers were added to the model to stabilize the training process; we did not notice that their $\gamma$ and $\beta$ were not scalars. The impact on the equivariance of the whole model can possibly be gauged from the values of $\gamma$ and $\beta$ in the trained model.
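
For example, one could roughly check this from a released checkpoint like so (the file name and state-dict layout are assumptions; only the `layernorm_vec` attribute name comes from the code):

```python
import torch

# Hypothetical checkpoint path; the file name and state-dict layout depend on the repo.
ckpt = torch.load("pretrained_model.pt", map_location="cpu")
state_dict = ckpt["model"] if "model" in ckpt else ckpt   # some checkpoints nest the weights

# gamma far from 1 or beta far from 0 would mean the affine part of layernorm_vec
# noticeably distorts the vector features (and hence the equivariance).
for name, tensor in state_dict.items():
    if "layernorm_vec" in name and name.endswith("weight"):
        print(name, "mean |gamma - 1| =", (tensor - 1).abs().mean().item())
    elif "layernorm_vec" in name and name.endswith("bias"):
        print(name, "mean |beta| =", tensor.abs().mean().item())
```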

Second, the linear layer for the initial vector embedding may not be equivariant to the translation operation. This is easy to fix by translating the coordinates so that the center of mass of the pocket sits at the origin (a sketch of this centering step is given below). Besides, we speculate that this non-equivariant layer may have only a minor effect, because the subtraction of two vectors remains unaffected and equivariant; the model has a chance to learn from the equivariant parts, for example by adapting its weights to focus on the difference of two vector features.
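
A minimal sketch of that centering step (function and argument names are illustrative, not the actual API of this repo):

```python
import torch

def center_on_pocket(pocket_pos: torch.Tensor, ligand_pos: torch.Tensor):
    """Translate both coordinate sets so the pocket's center of mass sits at the origin.

    Both inputs are assumed to have shape (num_atoms, 3); a mass-weighted mean could be
    substituted for the plain mean if atomic masses are available.
    """
    com = pocket_pos.mean(dim=0, keepdim=True)
    return pocket_pos - com, ligand_pos - com
```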

Overall, these two layers do affect the equivariance of the model. The fixes are easy, as discussed above, and the influence on model performance is probably not large. Thanks again for your careful reading and for pointing out this problem.

@yangxiufengsia commented May 15, 2023

Hi, nice paper. Regarding your E(3) model, I think it might not be translation-equivariant. I have similar questions to @oriondollar's.
(1) In your model, absolute coordinates are used directly as input vector features, so I am wondering whether the model can work on other test data with different absolute coordinates. As you mentioned in the solution above, we could translate the test coordinates to the center of mass of the training data (absolute coordinates), but in real practice this is often impossible.
(2) You also mentioned that the embedding layer has only a minor effect on translational equivariance and that the model still learns it, since the subtraction of vectors remains unchanged. But then what is the purpose of the embedding layer? From my tests, it does not work when I use protein–ligand data outside the coordinate space of the training data.

Do you have any ideas to solve this issue? Looking forward to your reply. Thank you.
