I noticed that `self.in_mlps` uses this expression:

ddsp_pytorch/ddsp/model.py, line 47 in aaaf17d
As a result of how list multiplication works in Python, this actually means that `self.in_mlps[0] is self.in_mlps[1]`, i.e., the two elements of the list are the same object.

In terms of the network structure, I assume this means that the two MLPs share their weights. This is a bit unexpected, because `self.in_mlps[0]` is used on the pitch and `self.in_mlps[1]` is used on the loudness. I haven't looked into the official DDSP implementation, but at least from the paper I would not assume that the MLPs are supposed to share weights. Or are their semantics similar enough to justify sharing?
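To illustrate what I mean (assuming the expression at line 47 is essentially `[mlp(1, config.hidden_size, 3)] * 2`; the `mlp` below is just a stand-in with made-up sizes, not the helper from `ddsp/model.py`):

```python
import torch.nn as nn

def mlp(in_size, hidden_size, n_layers):
    # Stand-in for the repo's mlp() helper -- the exact layers don't matter here.
    sizes = [in_size] + n_layers * [hidden_size]
    layers = []
    for i in range(n_layers):
        layers += [nn.Linear(sizes[i], sizes[i + 1]), nn.ReLU()]
    return nn.Sequential(*layers)

# List multiplication repeats the *reference*, not the object:
shared = nn.ModuleList([mlp(1, 512, 3)] * 2)
print(shared[0] is shared[1])  # True -> pitch and loudness go through the same weights

# A list comprehension calls mlp() once per element, creating independent modules:
separate = nn.ModuleList([mlp(1, 512, 3) for _ in range(2)])
print(separate[0] is separate[1])  # False -> two independent sets of weights
```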
For comparison, here is the torchinfo output as it is implemented now (note that "Sequential: 2-2" is labeled as recursive because torchinfo recognizes the shared weights):

This is how the model changes when the expression is replaced with `[mlp(1, config.hidden_size, 3) for _ in range(2)]` to give the two MLPs individual weights (note the increased model size):

I'm just wondering whether this is an accident that happens to work okay-ish, or whether the weights were shared deliberately as a performance optimization?
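For reference, a rough parameter count over the two variants shows the same size effect without torchinfo (again with a stand-in MLP, so the absolute numbers won't match the real model):

```python
import torch.nn as nn

hidden_size = 512  # stand-in for config.hidden_size

def block():
    # Stand-in for mlp(1, config.hidden_size, 3); the exact layers don't matter for the comparison.
    return nn.Sequential(nn.Linear(1, hidden_size), nn.ReLU(), nn.Linear(hidden_size, hidden_size))

def count_params(module):
    # Module.parameters() de-duplicates shared tensors, so aliased sub-modules are counted once.
    return sum(p.numel() for p in module.parameters())

shared = nn.ModuleList([block()] * 2)                  # current expression (aliased)
separate = nn.ModuleList([block() for _ in range(2)])  # replacement (independent weights)

print(count_params(shared))    # one copy of the MLP's parameters
print(count_params(separate))  # twice as many
```

Since `parameters()` de-duplicates shared tensors, the aliased version only reports one MLP's worth of parameters, which is also why torchinfo labels the second `Sequential` as recursive.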