Separated layers:
During research, we extract features from intermediate layers and also inject other things, such as random features or gradients, between layers. Among the models implemented in this repo, many have bunched layers. For example, in the ResNet models, many Bottleneck blocks are bunched together in the make_layer function. This makes research difficult: we first have to edit the network structure, undo all the 'bunching', and serialize everything before we can use the network for research. Doing so not only takes a lot of time but also takes away the ability to use pretrained weights, since the state dictionaries no longer match, so we would have to write porting code for the weights as well. It would be great if the network structures could be simplified; a hook-based workaround we use today is sketched below.
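To illustrate the friction, here is a rough sketch (assuming the timm resnet50 and its torchvision-style layer3 stage; the hook names are just for illustration) of how we currently have to reach inside a bunched stage with forward hooks just to read per-block features:

```python
import torch
import timm

# Hypothetical workaround: grab the output of every Bottleneck block inside the
# bunched 'layer3' stage via forward hooks, since the blocks are not exposed as
# flat, serialized top-level modules.
model = timm.create_model('resnet50', pretrained=True)
model.eval()

features = {}

def make_hook(name):
    def hook(module, inputs, output):
        features[name] = output.detach()
    return hook

# layer3 is an nn.Sequential of Bottleneck blocks built by make_layer;
# we have to walk into it to reach the individual blocks.
for idx, block in enumerate(model.layer3):
    block.register_forward_hook(make_hook(f'layer3.{idx}'))

with torch.no_grad():
    _ = model(torch.randn(1, 3, 224, 224))

print(sorted(features.keys()))  # e.g. ['layer3.0', 'layer3.1', ...]
```

If the stages were exposed as a flat sequence of blocks instead, the same thing would be a plain indexing operation and injecting modules between blocks would not require rewriting the model or porting weights.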
DeepSpeed / Fairscale integration:
As is rightly pointed out in the latest update, the NF-Nets are memory hungry, and training goes OOM on most modern GPUs even at low batch sizes. This trend is just the beginning: in the coming future, more and more models will go OOM even at very small batch sizes. A transparent integration (e.g. a --zero-optimizer fairscale/deepspeed flag) would be an awesome addition.
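As a rough sketch of what such an integration could do under the hood (not a proposal for the exact flag semantics, just the fairscale ZeRO-style wrapping I have in mind; assumes torch.distributed is already initialized, e.g. via torchrun):

```python
import torch
from fairscale.optim.oss import OSS
from fairscale.nn.data_parallel import ShardedDataParallel as ShardedDDP

def wrap_with_zero(model, lr=0.1, momentum=0.9, weight_decay=1e-4):
    # OSS shards the optimizer state across ranks (ZeRO-style), so each GPU
    # only holds its slice of the SGD buffers.
    optimizer = OSS(
        params=model.parameters(),
        optim=torch.optim.SGD,
        lr=lr,
        momentum=momentum,
        weight_decay=weight_decay,
    )
    # ShardedDDP reduces each gradient only to the rank that owns its shard,
    # instead of all-reducing full gradients everywhere.
    model = ShardedDDP(model, optimizer)
    return model, optimizer
```

The rest of the training loop stays unchanged, which is what makes a "transparent" flag feasible.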
Nvidia DALI loader:
Nvidia DALI is an extremely lightweight and fast loader. It is very useful for researchers who don't own a high-grade CPU, where bottlenecks in the data-loading pipeline are imminent. It also helps in scenarios where we run multiple projects on a single node, splitting it across different devices.
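For reference, a GPU-decoded DALI classification pipeline for an ImageNet-style folder is roughly this small (the path, batch size, and image size below are placeholders):

```python
from nvidia.dali import pipeline_def
import nvidia.dali.fn as fn
import nvidia.dali.types as types
from nvidia.dali.plugin.pytorch import DALIClassificationIterator

@pipeline_def
def train_pipeline(data_dir):
    # Read JPEGs + labels from a class-per-subfolder layout, decode on GPU.
    jpegs, labels = fn.readers.file(file_root=data_dir, random_shuffle=True, name="Reader")
    images = fn.decoders.image(jpegs, device="mixed")
    images = fn.random_resized_crop(images, size=224)
    images = fn.crop_mirror_normalize(
        images,
        dtype=types.FLOAT,
        output_layout="CHW",
        mean=[0.485 * 255, 0.456 * 255, 0.406 * 255],
        std=[0.229 * 255, 0.224 * 255, 0.225 * 255],
        mirror=fn.random.coin_flip(),
    )
    return images, labels

pipe = train_pipeline(batch_size=64, num_threads=4, device_id=0,
                      data_dir="/path/to/imagenet/train")
pipe.build()
loader = DALIClassificationIterator(pipe, reader_name="Reader")

for batch in loader:
    images, labels = batch[0]["data"], batch[0]["label"]
    # ... training step ...
```

Exposing something like this behind an option in the existing loader factory would let CPU-bound users opt in without touching the rest of the training script.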