Dear authors, thank you for your great work and for your efforts in sharing the code and data.
Could you let me know which hyperparameters were used for the AllSides data?
With the default hyperparameter settings from the README, accuracy on the ALLSIDES-S test data peaks at about 70%, as shown in the log below, and I could not find all the required parameters in the paper.
Again, thank you for your work and assistance.
==================================================================================
--gpu_index=0
--batch_size=16
--num_epochs=50
--learning_rate=0.001
--max_sentence=20
--embed_size=256
--dropout=0.3
--num_layer=1
--num_head=4
--d_hid=128
--dataset=ALLSIDES-S
--alpha=0.6
--beta=0.2
Count of using GPUs: 2
Count of using GPUs: 2
====================================TRAIN INFO START====================================
TRAINING MODEL = KHAN
Embedding Size = 256
Maximum Length = 20
Number of Transformer Encoder Layers = 1
Number of Multi-head Attentions = 4
Hidden Layer Dimension = 128
Dropout Probability = 0.3
Alpha = 0.6
Beta = 0.2
DATASET = ALLSIDES-S
BATCH SIZE = 16
NUM EPOCHS = 50
LEARNING RATE = 0.001
==================================== Training Start ====================================
Training data size: 13304
Test data size: 1479
[2.233713901947616, 3.180492469519484, 4.20347551342812]
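As an aside, I am guessing the three floats printed above are per-class loss weights for the three stance labels (left / center / right); I could not confirm this from the paper. If so, here is a minimal sketch of how such inverse-frequency weights are typically computed and handed to CrossEntropyLoss (the label list is a toy stand-in, not the real data):

```python
import torch
from collections import Counter

# Toy label list; in practice this would be the ALLSIDES-S training
# labels (three classes: left / center / right).
labels = [0, 0, 1, 2, 0, 1, 0]

counts = Counter(labels)
n_total = len(labels)
n_classes = len(counts)

# Inverse-frequency weighting: rarer classes get larger weights so the
# loss is not dominated by the majority class.
weights = torch.tensor(
    [n_total / (n_classes * counts[c]) for c in range(n_classes)],
    dtype=torch.float,
)

criterion = torch.nn.CrossEntropyLoss(weight=weights)
```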
Also, how can I reproduce the results on the SemEval data? With the parameter settings in the README, the model's performance on SemEval also saturates near 80%, as the 10-fold summary below shows. My environment was PyTorch 1.10.0 with torchtext 0.11.0, slightly different from the versions mentioned in the README, but I also ran the code on torch >= 2.0 and it made little difference.
=============================== 10-Folds Training Result ===============================
=============== Total Accuracy: 0.7906, Training time: 925.95 (sec.) ================
=============== Best Accuracy: 0.8438, Accuracy variance: 0.0013 ================
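To make sure I am reading the summary correctly: I take "Total Accuracy" to be the mean over the 10 fold accuracies, "Best Accuracy" the best single fold, and "Accuracy variance" the variance across folds. A minimal sketch of that aggregation with hypothetical per-fold numbers (I cannot tell from the log whether population or sample variance is used):

```python
import statistics

# Hypothetical per-fold test accuracies from a 10-fold run.
fold_accs = [0.79, 0.81, 0.84, 0.78, 0.80, 0.77, 0.79, 0.80, 0.78, 0.75]

total_acc = statistics.mean(fold_accs)     # "Total Accuracy"
best_acc = max(fold_accs)                  # "Best Accuracy"
acc_var = statistics.pvariance(fold_accs)  # "Accuracy variance"

print(f"Total Accuracy: {total_acc:.4f}")
print(f"Best Accuracy: {best_acc:.4f}")
print(f"Accuracy variance: {acc_var:.4f}")
```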
Below is the full per-epoch training log for the ALLSIDES-S run:
Total params: 122.80M
Fold: 1 | Epoch: 1 | Loss: 1.3780 | TrainAcc: 0.4497 | ValAcc: 0.6105 | Time: 40.98
Fold: 1 | Epoch: 2 | Loss: 0.9669 | TrainAcc: 0.5270 | ValAcc: 0.5916 | Time: 40.51
Fold: 1 | Epoch: 3 | Loss: 0.8768 | TrainAcc: 0.5714 | ValAcc: 0.5625 | Time: 40.43
Fold: 1 | Epoch: 4 | Loss: 0.8125 | TrainAcc: 0.6040 | ValAcc: 0.5991 | Time: 40.43
Fold: 1 | Epoch: 5 | Loss: 0.7928 | TrainAcc: 0.6197 | ValAcc: 0.6045 | Time: 40.58
Fold: 1 | Epoch: 6 | Loss: 0.7621 | TrainAcc: 0.6335 | ValAcc: 0.6241 | Time: 40.37
Fold: 1 | Epoch: 7 | Loss: 0.7471 | TrainAcc: 0.6439 | ValAcc: 0.6302 | Time: 40.31
Fold: 1 | Epoch: 8 | Loss: 0.7436 | TrainAcc: 0.6512 | ValAcc: 0.6504 | Time: 40.32
Fold: 1 | Epoch: 9 | Loss: 0.7038 | TrainAcc: 0.6757 | ValAcc: 0.6234 | Time: 40.48
Fold: 1 | Epoch: 10 | Loss: 0.6852 | TrainAcc: 0.6941 | ValAcc: 0.6376 | Time: 40.37
Fold: 1 | Epoch: 11 | Loss: 0.6664 | TrainAcc: 0.7068 | ValAcc: 0.6092 | Time: 40.42
Fold: 1 | Epoch: 12 | Loss: 0.6353 | TrainAcc: 0.7223 | ValAcc: 0.6795 | Time: 40.41
Fold: 1 | Epoch: 13 | Loss: 0.6127 | TrainAcc: 0.7389 | ValAcc: 0.6160 | Time: 40.42
Fold: 1 | Epoch: 14 | Loss: 0.5859 | TrainAcc: 0.7540 | ValAcc: 0.6599 | Time: 40.39
Fold: 1 | Epoch: 15 | Loss: 0.5598 | TrainAcc: 0.7653 | ValAcc: 0.6633 | Time: 40.53
Fold: 1 | Epoch: 16 | Loss: 0.5337 | TrainAcc: 0.7862 | ValAcc: 0.6849 | Time: 40.34
Fold: 1 | Epoch: 17 | Loss: 0.5187 | TrainAcc: 0.7892 | ValAcc: 0.6545 | Time: 40.52
Fold: 1 | Epoch: 18 | Loss: 0.5091 | TrainAcc: 0.8012 | ValAcc: 0.6910 | Time: 40.27
Fold: 1 | Epoch: 19 | Loss: 0.4569 | TrainAcc: 0.8190 | ValAcc: 0.6728 | Time: 40.29
Fold: 1 | Epoch: 20 | Loss: 0.4471 | TrainAcc: 0.8238 | ValAcc: 0.6842 | Time: 40.32
Fold: 1 | Epoch: 21 | Loss: 0.4305 | TrainAcc: 0.8340 | ValAcc: 0.6795 | Time: 40.32
Fold: 1 | Epoch: 22 | Loss: 0.4135 | TrainAcc: 0.8422 | ValAcc: 0.6802 | Time: 40.24
Fold: 1 | Epoch: 23 | Loss: 0.4013 | TrainAcc: 0.8430 | ValAcc: 0.6883 | Time: 40.26
Fold: 1 | Epoch: 24 | Loss: 0.3912 | TrainAcc: 0.8534 | ValAcc: 0.6890 | Time: 40.31
Fold: 1 | Epoch: 25 | Loss: 0.3343 | TrainAcc: 0.8771 | ValAcc: 0.6910 | Time: 40.36
Fold: 1 | Epoch: 26 | Loss: 0.3321 | TrainAcc: 0.8761 | ValAcc: 0.6687 | Time: 40.28
Fold: 1 | Epoch: 27 | Loss: 0.2972 | TrainAcc: 0.8899 | ValAcc: 0.6795 | Time: 40.23
Fold: 1 | Epoch: 28 | Loss: 0.2997 | TrainAcc: 0.8887 | ValAcc: 0.6728 | Time: 40.29
Fold: 1 | Epoch: 29 | Loss: 0.2795 | TrainAcc: 0.9000 | ValAcc: 0.6782 | Time: 40.30
Fold: 1 | Epoch: 30 | Loss: 0.2816 | TrainAcc: 0.8957 | ValAcc: 0.6660 | Time: 40.23
Fold: 1 | Epoch: 31 | Loss: 0.2515 | TrainAcc: 0.9103 | ValAcc: 0.6795 | Time: 40.37
Fold: 1 | Epoch: 32 | Loss: 0.2407 | TrainAcc: 0.9112 | ValAcc: 0.6863 | Time: 40.35
Fold: 1 | Epoch: 33 | Loss: 0.2225 | TrainAcc: 0.9213 | ValAcc: 0.6802 | Time: 40.33
Fold: 1 | Epoch: 34 | Loss: 0.2215 | TrainAcc: 0.9216 | ValAcc: 0.6957 | Time: 40.29
Fold: 1 | Epoch: 35 | Loss: 0.2202 | TrainAcc: 0.9204 | ValAcc: 0.6788 | Time: 40.44
Fold: 1 | Epoch: 36 | Loss: 0.2118 | TrainAcc: 0.9250 | ValAcc: 0.6775 | Time: 40.34
Fold: 1 | Epoch: 37 | Loss: 0.1921 | TrainAcc: 0.9335 | ValAcc: 0.6768 | Time: 40.32
Fold: 1 | Epoch: 38 | Loss: 0.1914 | TrainAcc: 0.9305 | ValAcc: 0.6707 | Time: 40.18
Fold: 1 | Epoch: 39 | Loss: 0.1904 | TrainAcc: 0.9324 | ValAcc: 0.6694 | Time: 40.15
Fold: 1 | Epoch: 40 | Loss: 0.1721 | TrainAcc: 0.9393 | ValAcc: 0.6755 | Time: 40.25
Fold: 1 | Epoch: 41 | Loss: 0.1718 | TrainAcc: 0.9390 | ValAcc: 0.6897 | Time: 40.23
Fold: 1 | Epoch: 42 | Loss: 0.1819 | TrainAcc: 0.9363 | ValAcc: 0.6829 | Time: 40.25
Fold: 1 | Epoch: 43 | Loss: 0.1778 | TrainAcc: 0.9379 | ValAcc: 0.6700 | Time: 40.27
Fold: 1 | Epoch: 44 | Loss: 0.1685 | TrainAcc: 0.9392 | ValAcc: 0.6897 | Time: 40.25
Fold: 1 | Epoch: 45 | Loss: 0.1669 | TrainAcc: 0.9425 | ValAcc: 0.6714 | Time: 40.21
Fold: 1 | Epoch: 46 | Loss: 0.1521 | TrainAcc: 0.9460 | ValAcc: 0.6667 | Time: 40.26
Fold: 1 | Epoch: 47 | Loss: 0.1477 | TrainAcc: 0.9480 | ValAcc: 0.6755 | Time: 40.08
Fold: 1 | Epoch: 48 | Loss: 0.1444 | TrainAcc: 0.9484 | ValAcc: 0.6761 | Time: 40.00
Fold: 1 | Epoch: 49 | Loss: 0.1621 | TrainAcc: 0.9441 | ValAcc: 0.6856 | Time: 40.00
Fold: 1 | Epoch: 50 | Loss: 0.1584 | TrainAcc: 0.9445 | ValAcc: 0.6640 | Time: 39.97
FOLD - 1
Test Accuracy: 0.6957, Training time: 2096.58 (sec.)
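For what it is worth, the log shows training accuracy climbing past 0.94 while validation accuracy plateaus around 0.66-0.70, and the reported test accuracy (0.6957) matches the best validation epoch (epoch 34), so I assume the best-epoch model is what gets evaluated. A minimal, self-contained sketch of that kind of best-epoch tracking (the train/eval functions are toy stand-ins, not the repo's code):

```python
import random

def train_one_epoch():
    # Stand-in for the real training step; returns a fake train accuracy.
    return random.uniform(0.5, 0.95)

def evaluate():
    # Stand-in for validation; returns a fake validation accuracy.
    return random.uniform(0.55, 0.70)

best_acc, best_epoch = 0.0, -1
for epoch in range(1, 51):
    train_acc = train_one_epoch()
    val_acc = evaluate()
    if val_acc > best_acc:
        # Remember the best validation epoch; with a real model you would
        # also snapshot model.state_dict() here so the final evaluation
        # does not use an overfit last-epoch checkpoint.
        best_acc, best_epoch = val_acc, epoch

print(f"Best ValAcc: {best_acc:.4f} at epoch {best_epoch}")
```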