Densenets supermask #2

Have you ever tried to find a supermask over DenseNets?

Comments
This seems like more of a question for …
I was interested in your specific context 😉, and the comments and FAQ section in https://mitchellnw.github.io/blog/2020/supsup/ were pointing to this repo 😸
P.S. I got this vague idea reading the conclusions of https://arxiv.org/abs/2006.12156. If they wonder about skip connections, why not about dense connections?
Oops! Sorry about that :) We tried skip connections with ResNets here, which worked well. I believe dense connections have not been explored with supermasks, and it seems like a really interesting direction!
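For concreteness, here is a minimal sketch (not code from this repo) of what "finding a supermask over a DenseNet" could look like, following the edge-popup idea of freezing weights at their random initialization and learning only per-weight scores, with the top-k% of weights kept in the forward pass. The names `MaskedConv2d`, `GetSubnet`, and `sparsity` are hypothetical, not an existing API.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class GetSubnet(torch.autograd.Function):
    """Top-k binary mask over the scores with a straight-through gradient."""

    @staticmethod
    def forward(ctx, scores, k):
        out = torch.zeros_like(scores)
        flat = scores.flatten()
        _, idx = flat.topk(max(1, int(k * flat.numel())))
        out.view(-1)[idx] = 1.0
        return out

    @staticmethod
    def backward(ctx, grad_output):
        # Straight-through estimator: gradients flow to the scores unchanged.
        return grad_output, None


class MaskedConv2d(nn.Conv2d):
    """Conv layer whose weights stay frozen; only the mask scores are trained."""

    def __init__(self, *args, sparsity=0.5, **kwargs):
        super().__init__(*args, **kwargs)
        self.sparsity = sparsity
        # Score initialization is arbitrary here; edge-popup uses a Kaiming-style init.
        self.scores = nn.Parameter(torch.randn_like(self.weight) * 0.01)
        self.weight.requires_grad = False  # weights remain at random init
        if self.bias is not None:
            self.bias.requires_grad = False

    def forward(self, x):
        mask = GetSubnet.apply(self.scores, self.sparsity)
        return F.conv2d(x, self.weight * mask, self.bias,
                        self.stride, self.padding, self.dilation, self.groups)


class DenseLayer(nn.Module):
    """One DenseNet-style layer: the output is concatenated with its input."""

    def __init__(self, in_channels, growth_rate, sparsity=0.5):
        super().__init__()
        self.bn = nn.BatchNorm2d(in_channels)
        self.conv = MaskedConv2d(in_channels, growth_rate, kernel_size=3,
                                 padding=1, bias=False, sparsity=sparsity)

    def forward(self, x):
        out = self.conv(F.relu(self.bn(x)))
        return torch.cat([x, out], dim=1)  # dense connection
```

The dense connections themselves are just feature concatenations, so the supermask would live in the convolutions of each dense layer (and in the transition layers), exactly as it does for the convolutions of a ResNet.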
Yes, I know, but I meant that in the mentioned work the conclusion was more related to their strong claim that a subnetwork "only needs a logarithmic factor (in all variables but depth) number of neurons per weight of the target subnetwork". So the open question was more about the impact of convolutional and batch-norm layers, skip connections, and (DenseNet-like connections?).
I also meant that this claim could have an interesting impact on your specific continual-learning setup.
Thanks, that could definitely help!
If you are interested in this, see also "Optimal Lottery Tickets via SubsetSum: Logarithmic Over-Parameterization is Sufficient".
Thank you, we have seen this but haven't taken a close look! Hopefully we can soon; it seems awesome.
Other than DenseNets, another interesting direction is Transformers. Some early exploratory efforts were made in https://arxiv.org/abs/2005.00561.