Thank you for providing this reproduction!
I have a question about the grouped convolution: in this line you use a grouped convolution to solve the mini-batch training problem.
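For reference, here is a minimal, self-contained sketch of the grouped-convolution trick as I understand it (the shapes, names, and hyper-parameters are made up for illustration, not taken from the linked line):

```python
import torch
import torch.nn.functional as F

B, C_in, C_out, K, H, W, k = 4, 8, 16, 4, 32, 32, 3

x = torch.randn(B, C_in, H, W)
weight = torch.randn(K, C_out, C_in, k, k)            # K candidate kernels shared by all samples
attention = torch.softmax(torch.randn(B, K), dim=1)   # per-sample attention over the K kernels

# aggregate a different kernel for every sample: (B, C_out, C_in, k, k)
agg_weight = torch.einsum('bk,koihw->boihw', attention, weight)

# grouped-convolution trick: fold the batch into the channel dimension and set
# groups=B so each sample is convolved with its own aggregated kernel
out = F.conv2d(x.reshape(1, B * C_in, H, W),
               agg_weight.reshape(B * C_out, C_in, k, k),
               padding=1, groups=B)
out = out.reshape(B, C_out, H, W)                      # back to (B, C_out, H, W)
```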
Could we use `torch.Tensor.expand` to replace the grouped convolution? In this way, we might aggregate the attention weights and the convolution weights together. However, this may cause another problem: if the batch size ($\mathcal{B}$) is larger than 1, the attention weights form a $\mathcal{B} \times K$ matrix. I think we could then use `torch.mean(attention_weight, dim=0)` or `torch.max(attention_weight, dim=0)`, roughly as sketched below, since the weights are computed within the same batch and their ranges are very close.

I am not sure whether this calculation is equivalent to that line :)
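A rough sketch of the alternative I have in mind, with the attention averaged over the batch so a single ordinary convolution can be used (again, all names and shapes are only illustrative):

```python
import torch
import torch.nn.functional as F

B, C_in, C_out, K, H, W, k = 4, 8, 16, 4, 32, 32, 3

x = torch.randn(B, C_in, H, W)
weight = torch.randn(K, C_out, C_in, k, k)            # K candidate kernels
attention = torch.softmax(torch.randn(B, K), dim=1)   # per-sample attention, shape (B, K)

# collapse the per-sample attention into one weighting shared by the whole batch
shared_attention = attention.mean(dim=0)               # (K,); torch.max(attention, dim=0).values also works
shared_weight = (shared_attention.view(K, 1, 1, 1, 1) * weight).sum(dim=0)  # (C_out, C_in, k, k)

# a single plain convolution for the whole batch, no groups needed
out = F.conv2d(x, shared_weight, padding=1)            # (B, C_out, H, W)
```

As far as I can tell, this only matches the grouped-convolution version exactly when the attention rows are identical across the batch (or when $\mathcal{B} = 1$), which is why I am unsure about the equivalence.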