Higher accuracies: BERT trained on the ori1.7k dataset achieves high accuracy on the revised test dataset. #12

Siki-cloud · 2023-11-05T10:27:08Z

Hi, sorry to bother you, but I'm quite troubled by this issue.

I was trying to reproduce the paper's results on the IMDb dataset. According to the paper, BERT trained on the original training set (ori1.7k) reaches 87.4% on the original validation set and 82.2% on the revised validation set.
I used the same settings reported in the paper (unless I missed something):

bert-base-uncased
adam optimizer lr 5e-5
early stopping on val_loss, 5
no warmup
random seed 42
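
For reference, the settings above could be written down as follows. This is a minimal sketch, not the paper's code: `FineTuneConfig` and `EarlyStopping` are hypothetical names, and reading the "5" as a patience of 5 evaluations without improvement in val_loss is an assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FineTuneConfig:
    """Hypothetical container for the settings listed in this issue."""
    model_name: str = "bert-base-uncased"
    learning_rate: float = 5e-5       # Adam learning rate
    early_stopping_patience: int = 5  # assumption: "5" means patience
    warmup_steps: int = 0             # no warmup
    seed: int = 42


class EarlyStopping:
    """Signal a stop once val_loss has not improved for `patience` checks."""

    def __init__(self, patience: int) -> None:
        self.patience = patience
        self.best = float("inf")
        self.bad_checks = 0

    def step(self, val_loss: float) -> bool:
        """Record one validation loss; return True when training should stop."""
        if val_loss < self.best:
            self.best = val_loss
            self.bad_checks = 0
        else:
            self.bad_checks += 1
        return self.bad_checks >= self.patience
```

Calling `EarlyStopping(FineTuneConfig().early_stopping_patience).step(val_loss)` once per validation pass would stop training after five consecutive passes without a new best loss.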

In my experiments, when training on the original dataset (ori1.7k), I got 88.7% on the original validation set and 89.1% on the revised validation set (82.2% in the paper).
When training on revised/train.tsv, I got 87.3% (80.4% in the paper) on ori/dev.tsv and 96.5% (90.8% in the paper) on revised/dev.tsv.
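
To make the size of the discrepancy explicit, the numbers reported in this issue can be tabulated as below. This is only a small arithmetic sketch: the row labels and the `gaps` helper are illustrative, and the paper figure for the first row (87.4%) is taken from the description earlier in this issue.

```python
# (training set, evaluation set, reproduced accuracy %, paper accuracy %)
results = [
    ("original", "original dev", 88.7, 87.4),
    ("original", "revised dev", 89.1, 82.2),
    ("revised", "original dev", 87.3, 80.4),
    ("revised", "revised dev", 96.5, 90.8),
]


def gaps(rows):
    """Return reproduced-minus-paper accuracy in percentage points."""
    return [round(mine - paper, 1) for _, _, mine, paper in rows]


for (train, dev, mine, paper), gap in zip(results, gaps(results)):
    print(f"train={train:<8} eval={dev:<12} mine={mine} paper={paper} gap=+{gap}")
```

The in-domain result for the original training set is within about one point of the paper, while the other three settings are roughly six to seven points higher.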

Could you please point me to where I should look? I would greatly appreciate your help.

Siki-cloud · 2023-11-05T10:36:08Z

I also tried training bert-base-uncased without pre-trained weights on ori/train.tsv (other settings unchanged) and tested it on ori/test.tsv and revised/test.tsv. This gave much lower accuracies: 50.5% and 49.8% respectively.

So I think the BERT model in the paper was trained starting from the pre-trained weights from Hugging Face. Is my understanding correct?
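
The two initialisations compared above can be sketched with the Hugging Face `transformers` library. This is an assumed setup, not code from the paper or from this repository: `build_bert_classifier` is a hypothetical helper, and `num_labels=2` assumes binary sentiment labels.

```python
def build_bert_classifier(pretrained: bool,
                          model_name: str = "bert-base-uncased",
                          num_labels: int = 2):
    """Build a BERT sequence classifier with or without pre-trained weights."""
    # Imported lazily so the helper can be defined without transformers installed.
    from transformers import AutoConfig, AutoModelForSequenceClassification

    if pretrained:
        # Loads the published encoder weights; only the classifier head is new.
        return AutoModelForSequenceClassification.from_pretrained(
            model_name, num_labels=num_labels
        )

    # Same architecture, but every weight is randomly initialised.
    config = AutoConfig.from_pretrained(model_name, num_labels=num_labels)
    return AutoModelForSequenceClassification.from_config(config)
```

With `pretrained=False` the model has to learn everything from the 1.7k training examples, which would be consistent with the near-chance accuracies reported above for a two-class task.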
