
about seeds #4

Open
ljjcoder opened this issue Apr 16, 2022 · 8 comments

Comments

@ljjcoder

Thanks for your excellent work! Could you tell me which seeds you used on CIFAR100? (1, 2, 3, 4, 5?)

@KaiWU5
Collaborator

KaiWU5 commented Apr 17, 2022

Yes, the seeds are 1,2,3,4,5.
The performance for each seed on CIFAR100:

| labels | seed 1 | seed 2 | seed 3 | seed 4 | seed 5 |
| --- | --- | --- | --- | --- | --- |
| 400 | 59.79 | 59.54 | 62.83 | 59.55 | 62.65 |
| 2500 | 76.33 | 75.07 | 75.7725 | 75.94 | 75.75 |
| 10000 | 80.68 | 80.52 | 80.84 | 80.51 | 80.58 |

To make the code public, we refactored it and verified on one seed that the refactoring does not affect the randomness. If you find anything different, please let us know.

@ljjcoder
Author

@KaiWU5, I trained FixMatch-CCSSL with 400 labels (seed=5). However, I only got 60.22 accuracy, which is lower than yours.
I noticed that in fixmatch_ccssl.py, self.cfg.get("contrast_left_out", False) evaluates to False, which means the code directly runs Lcontrast = self.loss_contrast(features, max_probs, labels). Is this correct?

@KaiWU5
Collaborator

KaiWU5 commented Apr 28, 2022

Thanks for helping us validate our method.

For contrast_left_out:
contrast_left_out is False for in-distribution datasets such as CIFAR100, as in our paper. contrast_left_out controls whether the contrastive loss is applied to noisy out-of-distribution data; a sketch of this branch follows.
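A minimal sketch of the branch, reusing the call you quoted; the threshold-based selection under contrast_left_out (including the contrast_with_thresh key) is an illustrative assumption, not our exact code:

```python
# Sketch of the branch discussed above (illustrative, not verbatim repo code).
if self.cfg.get("contrast_left_out", False):
    # Out-of-distribution setting: leave out samples whose pseudo-label
    # confidence falls below a threshold before applying the contrastive
    # loss. `contrast_with_thresh` is a hypothetical config key.
    keep = max_probs.ge(self.cfg.get("contrast_with_thresh", 0.8))
    Lcontrast = self.loss_contrast(features[keep], max_probs[keep], labels[keep])
else:
    # In-distribution setting (e.g. CIFAR100): contrast all unlabeled
    # samples, which is exactly the path you observed.
    Lcontrast = self.loss_contrast(features, max_probs, labels)
```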

Possible reasons:
1. The code refactoring done to comply with the Tencent YouTu code publication policy affected the randomization process.
2. In our code, setting seed = 5 in the args failed to fully fix the randomness (see the seeding sketch after this list).
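As a side note, fully fixing randomness in PyTorch takes more than passing a seed argument; a typical seeding routine looks like the sketch below (general PyTorch practice, not our repository's exact seeding code):

```python
import random

import numpy as np
import torch


def set_seed(seed: int) -> None:
    """Fix the common sources of randomness in a PyTorch run."""
    random.seed(seed)
    np.random.seed(seed)
    torch.manual_seed(seed)
    torch.cuda.manual_seed_all(seed)
    # Make cuDNN deterministic; note this can slow training down, and
    # DataLoader workers still need their own worker_init_fn seeding.
    torch.backends.cudnn.deterministic = True
    torch.backends.cudnn.benchmark = False


set_seed(5)  # e.g. the seed for the CIFAR100/label400 run discussed above
```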

Timeline for responding to this question:
We will test these two possible causes by running several experiments for CIFAR100/label400/seed5.
Training for 512 epochs takes about one week, and 1st May - 5th May is our holiday, so we will most likely resolve this question within one to two weeks.
If you have other questions in the meantime, feel free to let us know.

@ljjcoder
Author

@KaiWU5 Thanks for your prompt reply; I have one more question. Since your code refactors FixMatch's original code, it changes the randomness to some extent.
For a fair comparison, FixMatch should be reproduced on the refactored code. I reproduced FixMatch by setting self.cfg.lambda_contrast to 0 (in fixmatch_ccssl.py), but when training reaches the 80th epoch, it suddenly diverges.

  1. Does CCSSL become FixMatch after setting self.cfg.lambda_contrast to 0?
  2. Are the training hyperparameters of CCSSL exactly the same as FixMatch's?

@KaiWU5
Collaborator

KaiWU5 commented Apr 28, 2022

Sorry about the refactoring; it was done to remove code duplication, which makes the FixMatch and FixMatchCCSSL code look different.

  1. Yes, with self.cfg.lambda_contrast = 0, FixMatchCCSSL is the same as FixMatch (see the loss sketch after this list). As in the paper, CCSSL is a simple, effective, drop-in module rather than a complicated framework.
  2. Yes, the training parameters of the overlapping part of FixMatch and FixMatchCCSSL are exactly the same.
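Schematically, the total objective looks like the following (a sketch of the relationship using the loss names from the trainer, not the verbatim training step):

```python
# Total FixMatch-CCSSL objective (schematic):
#   Lx        - supervised cross-entropy on labeled data
#   Lu        - FixMatch consistency loss on pseudo-labeled unlabeled data
#   Lcontrast - class-aware contrastive loss (the CCSSL module)
loss = Lx + self.cfg.lambda_u * Lu + self.cfg.lambda_contrast * Lcontrast

# With self.cfg.lambda_contrast = 0 the last term vanishes and the
# objective reduces to plain FixMatch:
#   loss = Lx + self.cfg.lambda_u * Lu
```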

Possible problems with setting self.cfg.lambda_contrast = 0:
1. Unused parameters may affect the optimization process in PyTorch because the computation graph is no longer complete (see the DDP note after this list). This is hard to debug.
2. Divergence in general usually comes from (a) a large learning rate, (b) too deep a network, or (c) an unstable optimization process (gradient explosion).
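On point 1, if you train with DistributedDataParallel, a standard mitigation (general PyTorch usage, not a confirmed fix for the divergence you saw) is to let DDP tolerate parameters that receive no gradient:

```python
from torch.nn.parallel import DistributedDataParallel

# Assumes torch.distributed.init_process_group(...) has been called and
# that `model` and `local_rank` are defined by the training script.
model = DistributedDataParallel(
    model.cuda(local_rank),
    device_ids=[local_rank],
    # With lambda_contrast = 0 the contrastive projection head gets no
    # gradient; without this flag, DDP's gradient synchronization can
    # error out or hang on the unused parameters.
    find_unused_parameters=True,
)
```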

I will add some experiments on FixMatch and release the configs, but I'm sorry I cannot be sure about the root cause of the divergence you mentioned.

@ljjcoder
Author

@KaiWU5 Thank you for answering my doubts. By the way, your method is very performant and concise. Nice work!

@KaiWU5
Collaborator

KaiWU5 commented May 10, 2022

【Code reproducibility】
I have trained FixMatchCCSSL-l400 with the 5 seeds and got: 59.49 | 59.72 | 62.67 | 60.11 | 60.83 (61.08±1.59). This is close to the metric in the paper, 61.19±1.65, so it is reasonable to attribute the discrepancy to the code refactoring's effect on randomness.

【FixMatch accuracy in our code】
I tried two solutions to test the FixMatch divergence problem you mentioned, but found no divergence.

  1. Directly use FixMatch in our code. We get 56.88 accuracy. The config fm_cifar100_wres_x8_b4x16_l400.py is released.
  2. Delete the unused code in trainer/fixmatch_ccssl.py by (1) commenting out lines 131 to 165, (2) changing line 166 to loss = Lx + self.cfg.lambda_u * Lu, and (3) deleting line 186. I get 56.29, which is similar to directly using FixMatch.

If you need more clarifications, please let me know.

@ljjcoder
Author

@KaiWU5 Thank you for your reply! I will try it.
