cha-no edited this page May 3, 2021 · 2 revisions

Experiment Hyperparameters

  • Common arguments such as data_dir and model_dir are omitted.
  • Settings that differ from the baseline are marked in bold and italics.
  • Scores are the joint accuracy obtained on submission.
  • Hyperparameters are split into common ones and architecture-specific ones.
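The joint accuracy mentioned above is presumably the joint goal accuracy commonly used for dialogue state tracking, where a turn counts as correct only if the whole predicted state matches the reference. A minimal sketch under that assumption follows; the function name and the representation of a state as a collection of "slot-value" strings are illustrative and not taken from the project's evaluation code.

```python
from typing import Iterable, List


def joint_accuracy(predictions: List[Iterable[str]],
                   references: List[Iterable[str]]) -> float:
    """Return the fraction of turns whose predicted state equals the reference exactly."""
    if len(predictions) != len(references):
        raise ValueError("predictions and references must have the same length")
    if not predictions:
        return 0.0
    # Compare as sets so that the order of slot-value pairs does not matter.
    correct = sum(set(pred) == set(ref) for pred, ref in zip(predictions, references))
    return correct / len(predictions)


# Hypothetical two-turn example: the first turn matches, the second does not.
example_predictions = [["hotel-area-north", "hotel-type-guesthouse"], ["hotel-area-north"]]
example_references = [["hotel-type-guesthouse", "hotel-area-north"], ["hotel-area-south"]]
example_score = joint_accuracy(example_predictions, example_references)
```

Because a single wrong slot makes the whole turn count as incorrect, small changes in settings can move this metric noticeably.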

Experiment - SUMBT

  • Score: 0.6213
  • This is the performance at epoch 13.

Common Hyperparameters

| random_seed | architecture | group_decay | weight_decay | max_seq_length | max_label_length | train_batch_size | eval_batch_size | learning_rate | adam_epsilon | max_grad_norm | num_train_epochs | warmup_ratio | model_name_or_path |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 2020 | 'SUMBT' | True | 0.01 | 128 | 12 | 4 | 8 | 5e-5 | 1e-8 | 1.0 | 15 | 0.1 | 'dsksd/bert-ko-small-minimal' |

SUMBT Hyperparameters

| hidden_dim | num_rnn_layers | zero_init_rnn | attn_head | fix_utterance_encoder | task_name | distance_metric |
| --- | --- | --- | --- | --- | --- | --- |
| 512 | 1 | False | 4 | False | 'sumbtgru' | 'euclidean' |
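The values listed above for this run can be written out as a single configuration object. The sketch below uses plain dictionaries and `argparse.Namespace`. How the project's training script actually consumes these values is not shown on this page, so the variable names `common_args`, `sumbt_args`, and `args` are assumptions made for illustration.

```python
from argparse import Namespace

# Common hyperparameters of the best-scoring run (score 0.6213).
common_args = {
    "random_seed": 2020,
    "architecture": "SUMBT",
    "group_decay": True,
    "weight_decay": 0.01,
    "max_seq_length": 128,
    "max_label_length": 12,
    "train_batch_size": 4,
    "eval_batch_size": 8,
    "learning_rate": 5e-5,
    "adam_epsilon": 1e-8,
    "max_grad_norm": 1.0,
    "num_train_epochs": 15,
    "warmup_ratio": 0.1,
    "model_name_or_path": "dsksd/bert-ko-small-minimal",
}

# SUMBT-specific hyperparameters of the same run.
sumbt_args = {
    "hidden_dim": 512,
    "num_rnn_layers": 1,
    "zero_init_rnn": False,
    "attn_head": 4,
    "fix_utterance_encoder": False,
    "task_name": "sumbtgru",
    "distance_metric": "euclidean",
}

# Merge both groups into one attribute-style object (hypothetical usage).
args = Namespace(**common_args, **sumbt_args)
```

Keeping the two groups in separate dictionaries mirrors the split between common and architecture-specific settings used on this page.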

Experiment - SUMBT

  • Score: 0.6149
  • This is the performance at epoch 11.

Common Hyperparameters

| random_seed | architecture | group_decay | weight_decay | max_seq_length | max_label_length | train_batch_size | eval_batch_size | learning_rate | adam_epsilon | max_grad_norm | num_train_epochs | warmup_ratio | model_name_or_path |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 2020 | 'SUMBT' | True | 0.01 | 128 | 12 | 4 | 8 | 5e-5 | 1e-8 | 1.0 | 15 | 0.1 | 'dsksd/bert-ko-small-minimal' |

SUMBT Hyperparameters

| hidden_dim | num_rnn_layers | zero_init_rnn | attn_head | fix_utterance_encoder | task_name | distance_metric |
| --- | --- | --- | --- | --- | --- | --- |
| 512 | 2 | False | 4 | False | 'sumbtgru' | 'euclidean' |

Experiment - SUMBT

  • Score: 0.6142
  • This is the performance at epoch 12.

Common Hyperparameters

| random_seed | architecture | group_decay | weight_decay | max_seq_length | max_label_length | train_batch_size | eval_batch_size | learning_rate | adam_epsilon | max_grad_norm | num_train_epochs | warmup_ratio | model_name_or_path |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 2020 | 'SUMBT' | True | 0.01 | 96 | 12 | 8 | 8 | 5e-5 | 1e-8 | 1.0 | 15 | 0.1 | 'dsksd/bert-ko-small-minimal' |

SUMBT Hyperparameters

| hidden_dim | num_rnn_layers | zero_init_rnn | attn_head | fix_utterance_encoder | task_name | distance_metric |
| --- | --- | --- | --- | --- | --- | --- |
| 512 | 1 | False | 4 | False | 'sumbtgru' | 'euclidean' |

Experiment - SUMBT

  • Score: 0.6128
  • This is the performance at epoch 11.

Common Hyperparameters

| random_seed | architecture | group_decay | weight_decay | max_seq_length | max_label_length | train_batch_size | eval_batch_size | learning_rate | adam_epsilon | max_grad_norm | num_train_epochs | warmup_ratio | model_name_or_path |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 2020 | 'SUMBT' | True | 0.01 | 128 | 16 | 4 | 8 | 5e-5 | 1e-8 | 1.0 | 15 | 0.1 | 'dsksd/bert-ko-small-minimal' |

SUMBT Hyperparameters

| hidden_dim | num_rnn_layers | zero_init_rnn | attn_head | fix_utterance_encoder | task_name | distance_metric |
| --- | --- | --- | --- | --- | --- | --- |
| 512 | 1 | False | 4 | False | 'sumbtgru' | 'euclidean' |

Experiment - SUMBT

  • Score: 0.6066
  • This is the performance at epoch 15.

Common Hyperparameters

| random_seed | architecture | group_decay | weight_decay | max_seq_length | max_label_length | train_batch_size | eval_batch_size | learning_rate | adam_epsilon | max_grad_norm | num_train_epochs | warmup_ratio | model_name_or_path |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 2020 | 'SUMBT' | True | 0.01 | 96 | 12 | 8 | 8 | 5e-5 | 1e-8 | 1.0 | 15 | 0.1 | 'dsksd/bert-ko-small-minimal' |

SUMBT Hyperparameters

| hidden_dim | num_rnn_layers | zero_init_rnn | attn_head | fix_utterance_encoder | task_name | distance_metric |
| --- | --- | --- | --- | --- | --- | --- |
| 300 | 1 | False | 4 | False | 'sumbtgru' | 'euclidean' |

Experiment - SUMBT

  • Score: 0.6022
  • This is the performance at epoch 18.

Common Hyperparameters

| random_seed | architecture | group_decay | weight_decay | max_seq_length | max_label_length | train_batch_size | eval_batch_size | learning_rate | adam_epsilon | max_grad_norm | num_train_epochs | warmup_ratio | model_name_or_path |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 2020 | 'SUMBT' | True | 0.01 | 96 | 12 | 8 | 8 | 5e-5 | 1e-8 | 1.0 | 20 | 0.1 | 'dsksd/bert-ko-small-minimal' |

SUMBT Hyperparameters

| hidden_dim | num_rnn_layers | zero_init_rnn | attn_head | fix_utterance_encoder | task_name | distance_metric |
| --- | --- | --- | --- | --- | --- | --- |
| 300 | 2 | False | 4 | False | 'sumbtgru' | 'euclidean' |
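For a quick comparison, the six runs on this page can be collected and diffed against the best-scoring one. The sketch below lists only the settings that vary between runs; all other values are identical across the page. The `runs` structure, the `best_epoch` key, and the `diff_from_best` helper are illustrative names, not part of the project.

```python
# Settings that take more than one value across the six runs on this page.
VARYING_KEYS = (
    "max_seq_length", "max_label_length", "train_batch_size",
    "num_train_epochs", "hidden_dim", "num_rnn_layers",
)

runs = [
    {"score": 0.6213, "best_epoch": 13, "max_seq_length": 128, "max_label_length": 12,
     "train_batch_size": 4, "num_train_epochs": 15, "hidden_dim": 512, "num_rnn_layers": 1},
    {"score": 0.6149, "best_epoch": 11, "max_seq_length": 128, "max_label_length": 12,
     "train_batch_size": 4, "num_train_epochs": 15, "hidden_dim": 512, "num_rnn_layers": 2},
    {"score": 0.6142, "best_epoch": 12, "max_seq_length": 96, "max_label_length": 12,
     "train_batch_size": 8, "num_train_epochs": 15, "hidden_dim": 512, "num_rnn_layers": 1},
    {"score": 0.6128, "best_epoch": 11, "max_seq_length": 128, "max_label_length": 16,
     "train_batch_size": 4, "num_train_epochs": 15, "hidden_dim": 512, "num_rnn_layers": 1},
    {"score": 0.6066, "best_epoch": 15, "max_seq_length": 96, "max_label_length": 12,
     "train_batch_size": 8, "num_train_epochs": 15, "hidden_dim": 300, "num_rnn_layers": 1},
    {"score": 0.6022, "best_epoch": 18, "max_seq_length": 96, "max_label_length": 12,
     "train_batch_size": 8, "num_train_epochs": 20, "hidden_dim": 300, "num_rnn_layers": 2},
]


def diff_from_best(all_runs):
    """Return (score, {key: (best_value, run_value)}) for every run except the best one."""
    best = max(all_runs, key=lambda run: run["score"])
    diffs = []
    for run in all_runs:
        if run is best:
            continue
        changed = {key: (best[key], run[key]) for key in VARYING_KEYS if run[key] != best[key]}
        diffs.append((run["score"], changed))
    return diffs


for score, changed in diff_from_best(runs):
    print(score, changed)
```

Each run was reported once with a single seed, so the differences this prints describe what changed between runs rather than establish which setting caused the change in score.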