How should I set num_route_nodes? #25

Open
viking-sudo-rm opened this issue Nov 30, 2018 · 2 comments

viking-sudo-rm commented Nov 30, 2018

According to my understanding, num_route_nodes should be the length of the vote vectors that are used as input to dynamic routing.

In the example code, this value is set according to:

        self.primary_capsules = CapsuleLayer(num_capsules=8, num_route_nodes=-1, in_channels=256, out_channels=32,
                                             kernel_size=9, stride=2)
        self.digit_capsules = CapsuleLayer(num_capsules=NUM_CLASSES, num_route_nodes=32 * 6 * 6, in_channels=8,
                                           out_channels=16)

32 appears to come from the previous out_channels, but I can't tell what 6 * 6 are doing. Is this arbitrary, or is it preventing some kind of dimension mismatch? If I am implementing my own capsule network with new dimensions, do I need to be careful about how I pick this value?

Thanks,
Will

@nnWhisperer

Hi,
The 6 * 6 is the spatial size of the primary capsule output, and it follows from the layer dimensions:
The input (MNIST, which this capsule network was tried on) is 28 x 28.
The first convolution layer has kernel size 9 and stride 1, so its output is ceil((28 - kernel_size + 1) / stride)^2 = 20^2.
The primary capsule layer contains another convolution, this time with kernel size 9 and stride 2. The same formula gives ceil((20 - kernel_size + 1) / stride)^2 = 6^2 = 6 * 6. Multiplying by the 32 output channels gives the 32 * 6 * 6 route nodes. I hope the rest is comprehensible.
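In case it helps anyone adapting this to other input sizes, here is a minimal sketch of the arithmetic above. It assumes unpadded convolutions with dilation 1, as the formula in this comment implies. The helper names `conv_output_size` and `route_nodes_for` are hypothetical and not part of the repository.

```python
import math


def conv_output_size(input_size, kernel_size, stride=1):
    """Side length of an unpadded convolution output (formula from the comment above)."""
    return math.ceil((input_size - kernel_size + 1) / stride)


def route_nodes_for(image_size, primary_out_channels=32, kernel_size=9):
    """Hypothetical helper: route nodes fed to the digit capsules.

    Assumes the two-convolution layout from the example code:
    a stride-1 convolution followed by the stride-2 primary capsule convolution.
    """
    conv1_side = conv_output_size(image_size, kernel_size, stride=1)
    primary_side = conv_output_size(conv1_side, kernel_size, stride=2)
    return primary_out_channels * primary_side * primary_side


print(conv_output_size(28, 9, stride=1))  # 20
print(conv_output_size(20, 9, stride=2))  # 6
print(route_nodes_for(28))                # 1152, i.e. 32 * 6 * 6
```

With a different input size, kernel size, or stride, the spatial side changes, so the value passed as num_route_nodes would presumably need to be recomputed the same way.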

@zlh-source

I think it's arbitrary.
