A Question in Batch Size and torch.Tensor.expand #23

Open
TerenceLiu98 opened this issue May 4, 2023 · 1 comment

TerenceLiu98 commented May 4, 2023

Thank you for providing this reproduction!

I have a question on the grouped convolution: in this line you use the grouped convolution to solve the mini-batch training problem.

Could we use the torch.Tensor.expand to replace the grouped convolution, like:

weight_prime = weight.expand(K, weight.shape[0], weight.shape[1], weight.shape[2], weight.shape[3])
weight = torch.mm(softmax_attention, weight_prime).view(-1, x.shape[1], self.kernel_size, self.kernel_size)

In this way, we might aggregate the attention weight and the convolution weight together. However, this may cause another problem: if the batch size ($\mathcal{B}$) is larger than 1, the attention weight is a matrix of shape $\mathcal{B} \times K$. I think we could reduce it over the batch with torch.mean(attention_weight, dim=0) or torch.max(attention_weight, dim=0).values, since the attention weights are computed within the same batch and their ranges should be very close.

I am not sure whether this calculation is equivalent to the line :)
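
One way to check the equivalence question numerically is to compare the two variants on random data. The sketch below is not the repository's code: the shapes, the function names `per_sample_grouped` and `batch_mean_shared`, and the candidate-kernel layout `(K, C_out, C_in, ks, ks)` are all assumptions made for illustration. It aggregates the kernels once per sample (applied through a grouped convolution) and once with the attention averaged over the batch (applied through a plain convolution).

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)

B, K = 4, 3                  # batch size, number of candidate kernels
C_in, C_out, ks = 8, 16, 3   # illustrative channel counts and kernel size

x = torch.randn(B, C_in, 10, 10)
weight = torch.randn(K, C_out, C_in, ks, ks)          # assumed layout of the K kernels
attention = torch.softmax(torch.randn(B, K), dim=1)   # one row of K weights per sample


def per_sample_grouped(x, weight, attention):
    """Each sample gets its own aggregated kernel, applied with one grouped conv."""
    B, C_in, H, W = x.shape
    K, C_out, _, ks, _ = weight.shape
    # (B, K) @ (K, C_out*C_in*ks*ks) -> one flattened kernel per sample
    agg = torch.mm(attention, weight.view(K, -1)).view(B * C_out, C_in, ks, ks)
    # Fold the batch into the channel axis; groups=B keeps the samples separate
    out = F.conv2d(x.reshape(1, B * C_in, H, W), agg, padding=ks // 2, groups=B)
    return out.view(B, C_out, out.shape[-2], out.shape[-1])


def batch_mean_shared(x, weight, attention):
    """Attention averaged over the batch: one shared kernel, plain convolution."""
    K, C_out, C_in, ks, _ = weight.shape
    mean_att = attention.mean(dim=0, keepdim=True)    # (1, K)
    agg = torch.mm(mean_att, weight.view(K, -1)).view(C_out, C_in, ks, ks)
    return F.conv2d(x, agg, padding=ks // 2)


y_grouped = per_sample_grouped(x, weight, attention)
y_shared = batch_mean_shared(x, weight, attention)

# Reference: convolve each sample separately with its own aggregated kernel
y_loop = torch.cat([
    F.conv2d(
        x[b:b + 1],
        torch.mm(attention[b:b + 1], weight.view(K, -1)).view(C_out, C_in, ks, ks),
        padding=ks // 2,
    )
    for b in range(B)
])

print(torch.allclose(y_grouped, y_loop, atol=1e-4))    # grouped conv vs per-sample loop
print(torch.allclose(y_grouped, y_shared, atol=1e-4))  # grouped conv vs batch-mean kernel
```

Under these assumptions, the grouped convolution reproduces the per-sample loop, while the batch-averaged variant applies the same kernel to every sample. The two should coincide only when every row of the attention matrix is identical (for example, when the batch size is 1), so reducing over the batch trades per-sample adaptivity for a simpler convolution call.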

@kaijieshi7
Owner

Good idea. Thank you.
