Comparison with direct voxel lookups - Chapter 5.4 #156
-
Hi, congrats on this awesome project and TCNN. From Chapter 5.4, "Comparison with direct voxel lookups" (the ablation study), I quote: "we replace the entire neural network with a single linear matrix multiplication". I'm interested in exploring this tradeoff further. Could you clarify what you did? In particular, what were the grid parameters? Since the first MLP normally predicts direction-independent color information, that information now needs to be represented directly as trainable features in the grid. (For Plenoxels, I guess that would be like having a 1-level dense grid with 28-feature vectors, which is not possible here.) Also: do you still have this code around, maybe in a branch? Thank you!
-
Hi there, what we did amounts to simply setting the number of hidden layers of the neural networks to zero. (All other parameters---input/output dims, output activations, optimizers, F, T, etc.---being equal.)

Note that despite there being two neural networks (one for density and one for color), the composition of two linear transforms is still a linear transform. Therefore, it's equivalent to having just a single transformation matrix. (Although the computation structure is such that the predicted density can't depend on view direction---so it's fairer to say that there is one linear transform from Hashgrid->Density and another from Hashgrid+SphericalHarmonics->RGB, i.e. the single conceptual large linear transform has some of its matrix entries forced to zero.)

We've bundled the corresponding config. Cheers!
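To make the structural-zero point concrete, here is a minimal NumPy sketch of what the zero-hidden-layer setup computes. The dimensions and weight matrices below are purely illustrative (not the exact padded sizes or trained weights); the point is only that the two separate linear maps are equivalent to one larger linear map whose density row has zeros over the view-direction columns:

```python
import numpy as np

# Illustrative dimensions (not necessarily the exact padded sizes used in practice):
D_GRID = 32  # e.g. L=16 hash-grid levels with F=2 features each
D_SH = 16    # degree-4 spherical-harmonics encoding of the view direction

rng = np.random.default_rng(0)
grid_feat = rng.standard_normal(D_GRID)
sh_feat = rng.standard_normal(D_SH)

# Zero hidden layers: each "network" degenerates to a single matrix.
W_density = rng.standard_normal(D_GRID)            # Hashgrid -> density
W_rgb = rng.standard_normal((3, D_GRID + D_SH))    # Hashgrid + SH -> RGB

density = W_density @ grid_feat
rgb = W_rgb @ np.concatenate([grid_feat, sh_feat])

# Equivalent single linear transform: the density row has its entries over the
# SH (view-direction) columns forced to zero, so density stays view-independent.
W_single = np.zeros((4, D_GRID + D_SH))
W_single[0, :D_GRID] = W_density
W_single[1:, :] = W_rgb
out = W_single @ np.concatenate([grid_feat, sh_feat])

assert np.isclose(out[0], density)
assert np.allclose(out[1:], rgb)
```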
-
Thanks, I see. From the phrasing I initially thought it was completely sidestepping the training of network weights.
-
Yep! Although in a specialized implementation that doesn't need padding, the matrices would be 32x1 and 48x3, which amounts to "just" 176 FMA instructions. 48 of these (16x3) are inherent in the SH coefficient representation, and the remaining ones could conceivably be halved by lowering the output dimensionality of the hash encoding (e.g. by choosing F=1 or using L=8). That'd be the approach I'd attempt if FMA instructions are well and truly at a premium, for a total of 112 mults.
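Spelling that arithmetic out as a quick sanity check (purely illustrative, not code from the repo):

```python
# Back-of-the-envelope FMA count for the unpadded linear layers described above.
density_fmas = 32 * 1   # 32x1 matrix: hash-grid features -> density
rgb_fmas = 48 * 3       # 48x3 matrix: (hash-grid + SH features) -> RGB
assert density_fmas + rgb_fmas == 176

sh_fmas = 16 * 3        # inherent SH part of the RGB matrix (16 coeffs x 3 channels)
grid_fmas = density_fmas + (rgb_fmas - sh_fmas)  # FMAs attributable to hash-grid features

# Halving the hash-encoding output dimensionality (e.g. F=1 or L=8: 32 -> 16 features)
# halves the grid-dependent FMAs while the SH part stays fixed.
assert grid_fmas // 2 + sh_fmas == 112
```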