DenseNetFCN not training to expected performance #63
Update: One obvious problem is I had the wrong learning rate for Adam, which I changed from
@ahundt How was the training with the corrected learning rate? In my case, it seems to perform well enough, but does not beat the benchmark yet. Perhaps deeper and wider DenseNetFCNs will do the trick.
@titu1994 I've done a few things since the last update. In your case, did you use the same scripts I reference here or did you try something else? Here is what I've done:
No, I used a private dataset, which normal networks like UNet have difficulty with. DenseNet seems to perform better than UNet, but then it plateaus and further improvement is negligible after that point. Although in my case there was no single-class problem.
I'm not sure, but it shouldn't label all pixels as background; there must be something wrong in the code or training settings. Could you please give a link to the code you are using?
@titu1994 sorry, I didn't realize I had submitted my most recent post and then edited it; there is now a lot of additional information above. @aurora95 Here is my Keras-FCN version where I'm training with DenseNet, with options for Atrous_DenseNet and DenseNet_FCN as generated after cloning keras-contrib with #46 applied. Most of the modification is in models.py, where I import this keras-contrib repository, and train.py, where I changed the file paths, switched from SGD to Adam, and changed the learning rate appropriately. A few additional minor changes were also needed for compatibility of image and batch dimensions.
Hmm, I don't think there are weights for DenseNet trained on ImageNet. I'll have to do a more thorough search to be sure. Atrous DenseNet seems nice, could you try implementing it? I'll give it a look, but I don't think just adding the atrous rate parameter will translate to better performance.
@titu1994 they are trained on ImageNet with DenseNet-caffe and the original DenseNet repository. See my densenet + imagenet links two posts up for those, plus some Caffe-to-Keras conversion scripts that might help; I haven't had a chance to try it all out yet.
This is great news! I'll be sure to look at it in some time and, if possible, port the weights to Keras. However, if it's in Caffe I won't be able to convert it, since I'm on Windows.
@titu1994 Also, Atrous DenseNet is already implemented in #46; next steps would be converting the pretrained ImageNet weights or training from scratch. If the pretrained weights don't work, I have access to a distributed GPU cluster... but it will take quite some time before I have all of that implemented, integrated, and tested. tf-slim or tensorpack could help there once paired with a script to copy weights between tf and keras models.
Found another bug which came from combining the two scripts: the loss function softmax_sparse_crossentropy_ignoring_last_label applies softmax a second time. The Keras-FCN models never apply softmax themselves, and their loss function applies it instead; the keras-contrib densenet models apply softmax by default, so combining keras-contrib models with the Keras-FCN loss results in softmax being applied twice. I'm now training a new model with the bugfix on Pascal VOC 2012.
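To illustrate why the double softmax matters: applying softmax to an already-normalized distribution compresses it toward uniform, which crushes the predicted probability of the correct class and weakens the gradient signal. A minimal pure-Python sketch (my own illustration, not code from either repo):

```python
import math

def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

# Logits that should clearly favor class 0.
logits = [4.0, 0.0, 0.0]

once = softmax(logits)   # what the loss expects: class probabilities
twice = softmax(once)    # the bug: the model output was already softmaxed

# Applying softmax twice pushes the distribution toward uniform (1/3 each),
# so the loss sees a far less confident prediction than the model made.
print(max(once))   # ~0.96
print(max(twice))  # ~0.56
```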
Other optimizers & hyperparameters tensorflow/tensorflow#9175 may help. It also appears https://github.com/0bserver07/One-Hundred-Layers-Tiramisu has an independent keras implementation but has also run into similar training limitations. |
Update: my fork of Keras-FCN is merged to master with instructions in the README.md |
It seems the original authors explain in SimJeg/FC-DenseNet#10, from their FC-DenseNet repository, that they found DenseNetFCN performance isn't very good on Pascal VOC. @0bserver07 @titu1994 you will be interested in this info.
I have seen similar poor performance on a private dataset. It seems to learn rapidly up to a certain point, then cannot improve at all. UNet performs well on that dataset, far better than DenseNetFCN. Perhaps the implementation is correct but the model is not able to learn properly on all datasets?
@titu1994 that seems likely; CamVid is a much simpler dataset than Pascal VOC 2012. The real test would likely be to try training on CamVid itself.
Another interesting difference is the use of a ceil mode in pooling. I'm a bit doubtful that this is the key cause of the performance difference between the paper and the Keras implementations, however, especially considering the Tiramisu paper didn't use pretrained weights.
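For reference, the ceil/floor distinction is in how the pooled output size is rounded: Caffe-style ceil mode rounds the window count up, while floor mode (the usual Keras/TensorFlow behavior with valid padding) rounds down, so spatial dimensions can drift by one at each pooling stage. A quick sketch of the two formulas (my own illustration, not framework code):

```python
import math

def pooled_size(n, kernel, stride, ceil_mode=False):
    """Output length of 1D pooling with no padding, in floor or ceil mode."""
    rounder = math.ceil if ceil_mode else math.floor
    return rounder((n - kernel) / stride) + 1

# A 7-pixel row with a 2x2 pool, stride 2:
print(pooled_size(7, 2, 2, ceil_mode=False))  # floor mode -> 3
print(pooled_size(7, 2, 2, ceil_mode=True))   # ceil mode  -> 4
```

Repeated over several pooling stages, that off-by-one changes the feature map sizes the upsampling path has to reproduce, which could plausibly matter for a ported model even if it isn't the main performance gap.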
I'm training and testing DenseNetFCN on Pascal VOC2012
Could I get advice on next steps to take to debug and improve the results?
To do so, I'm using the train.py training script in my fork of Keras-FCN in conjunction with DenseNetFCN from keras-contrib with #46 applied. For DenseNetFCN, #46 really mostly changes the formatting for pep8, though the regular densenet is more heavily modified.
I use Keras-FCN because we don't have an FCN training script here in keras-contrib yet, though I plan to adapt and submit one here once things work properly. While the papers don't publish results on Pascal VOC, the original DenseNet has near state-of-the-art results on ImageNet and CIFAR-10/CIFAR-100, and DenseNetFCN performed well on CamVid and the Gatech dataset. Given that, I expected DenseNetFCN might fall short of state of the art here, but I figured that in the worst case it could most likely still reach over 50% mIOU and around 70-80% pixel accuracy, since it has many similarities to ResNet and performed quite well on the much smaller CamVid dataset.
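For anyone comparing these numbers, note that mIOU (per-class intersection-over-union averaged over classes) punishes a model that collapses to predicting only background far more harshly than raw pixel accuracy does. A small sketch of the metric (my own helper, not code from either repo):

```python
def mean_iou(y_true, y_pred, num_classes):
    """Mean per-class intersection-over-union for flat label lists."""
    ious = []
    for c in range(num_classes):
        inter = sum(1 for t, p in zip(y_true, y_pred) if t == c and p == c)
        union = sum(1 for t, p in zip(y_true, y_pred) if t == c or p == c)
        if union > 0:  # skip classes absent from both prediction and truth
            ious.append(inter / union)
    return sum(ious) / len(ious)

# A degenerate model that labels every pixel background (class 0)
# still scores high pixel accuracy when background dominates:
y_true = [0] * 90 + [1] * 10
y_pred = [0] * 100
acc = sum(1 for t, p in zip(y_true, y_pred) if t == p) / len(y_true)
print(acc)                          # 0.9 pixel accuracy
print(mean_iou(y_true, y_pred, 2))  # but only 0.45 mIOU
```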
DenseNet FCN configuration I'm using
This is very close to the configuration FC-DenseNet56 from the 100 layers tiramisu aka DenseNetFCN paper.
Sparse training accuracy
Here is what I'm seeing as I train with the Adam optimizer and a learning rate of 0.1:

Similar networks and training scripts verified for a baseline
- 7x% pixel accuracy on the Pascal VOC2012 test set
- 8x% accuracy
- 9x% accuracy on augmented versions of the training images

Verifying training scripts with AtrousFCNResNet50_16s
For comparison, here are the AtrousFCNResNet50_16s test set training results, which can be brought to 0.661 mIOU with augmented Pascal VOC:
I also trained AtrousFCNResNet50_16s from scratch with the results below:
Download links
Automated VOC download script
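In case it's useful to others, a download helper for the Pascal VOC 2012 trainval archive might look like the sketch below. The URL and archive name come from the official VOC site, but treat the exact paths as assumptions and verify them before relying on this:

```python
import os
import tarfile
import urllib.request

# Official Pascal VOC 2012 trainval archive (assumed still live).
VOC2012_URL = ('http://host.robots.ox.ac.uk/pascal/VOC/voc2012/'
               'VOCtrainval_11-May-2012.tar')

def download_voc2012(dest_dir='datasets'):
    """Download and extract Pascal VOC 2012 if not already present.

    Returns the path to the extracted VOC2012 directory (assumed layout:
    VOCdevkit/VOC2012 inside the tarball).
    """
    os.makedirs(dest_dir, exist_ok=True)
    tar_path = os.path.join(dest_dir, os.path.basename(VOC2012_URL))
    if not os.path.exists(tar_path):
        urllib.request.urlretrieve(VOC2012_URL, tar_path)
    with tarfile.open(tar_path) as tar:
        tar.extractall(dest_dir)
    return os.path.join(dest_dir, 'VOCdevkit', 'VOC2012')
```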
Thanks
@titu1994 and @aurora95, since I'm using your respective DenseNetFCN and Keras-FCN implementations, could I get any comments or advice you might have on this? I'd appreciate your thoughts.
All, thanks for giving this a look, as well as your consideration and advice!