This repository has been archived by the owner on Jun 18, 2024. It is now read-only.

torch.nn.modules.module.ModuleAttributeError: 'AlbertEmbeddings' object has no attribute 'bias' #241

Open
dhs29 opened this issue Mar 17, 2021 · 1 comment

dhs29 commented Mar 17, 2021

transformers-cli convert --model_type albert \
  --tf_checkpoint $ALBERT_BASE_DIR/model.ckpt-64000 \
  --config $ALBERT_BASE_DIR/albert_config.json \
  --pytorch_dump_output $ALBERT_BASE_DIR/pytorch_model.bin

I am running the command above. It prints the following output:
AlbertConfig {
  "attention_probs_dropout_prob": 0,
  "bos_token_id": 2,
  "classifier_dropout_prob": 0.1,
  "down_scale_factor": 1,
  "embedding_size": 128,
  "eos_token_id": 3,
  "gap_size": 0,
  "hidden_act": "gelu",
  "hidden_dropout_prob": 0,
  "hidden_size": 768,
  "initializer_range": 0.02,
  "inner_group_num": 1,
  "intermediate_size": 3072,
  "layer_norm_eps": 1e-12,
  "max_position_embeddings": 512,
  "model_type": "albert",
  "net_structure_type": 0,
  "num_attention_heads": 12,
  "num_hidden_groups": 1,
  "num_hidden_layers": 12,
  "num_memory_blocks": 0,
  "pad_token_id": 0,
  "type_vocab_size": 2,
  "vocab_size": 31990
}
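
One way to check that this config matches the checkpoint is to compare the embedding shapes it implies with the shapes reported in the conversion log. The following is a minimal sketch, not part of transformers: the helper name `expected_embedding_shapes` is hypothetical, the variable names are copied from the log, and only the config fields that affect embedding shapes are included.

```python
import json

# Subset of the AlbertConfig printed above; only the fields that
# determine embedding shapes are needed for this check.
CONFIG_JSON = """
{
  "embedding_size": 128,
  "hidden_size": 768,
  "max_position_embeddings": 512,
  "type_vocab_size": 2,
  "vocab_size": 31990
}
"""


def expected_embedding_shapes(config):
    """Map TF variable names (as spelled in the log) to expected shapes.

    Hypothetical helper for illustration; it is not a transformers API.
    """
    emb = config["embedding_size"]
    return {
        "bert/embeddings/word_embeddings": [config["vocab_size"], emb],
        "bert/embeddings/position_embeddings": [config["max_position_embeddings"], emb],
        "bert/embeddings/token_type_embeddings": [config["type_vocab_size"], emb],
        "bert/encoder/embedding_hidden_mapping_in/kernel": [emb, config["hidden_size"]],
    }


shapes = expected_embedding_shapes(json.loads(CONFIG_JSON))
for name, shape in shapes.items():
    print(name, shape)
```

If the shapes printed here agree with the "Loading TF weight ... with shape ..." lines for the same variables, the config and checkpoint are at least dimensionally consistent.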

Converting TensorFlow checkpoint from /data/NLP/ALBERT_Inspird_Train/albert_base/model.ckpt-64000
Loading TF weight bert/embeddings/layer_normalization/beta with shape [128]
Loading TF weight bert/embeddings/layer_normalization/beta/adam_m with shape [128]
Loading TF weight bert/embeddings/layer_normalization/beta/adam_v with shape [128]
Loading TF weight bert/embeddings/layer_normalization/gamma with shape [128]
Loading TF weight bert/embeddings/layer_normalization/gamma/adam_m with shape [128]
Loading TF weight bert/embeddings/layer_normalization/gamma/adam_v with shape [128]
Loading TF weight bert/embeddings/position_embeddings with shape [512, 128]
Loading TF weight bert/embeddings/position_embeddings/adam_m with shape [512, 128]
Loading TF weight bert/embeddings/position_embeddings/adam_v with shape [512, 128]
Loading TF weight bert/embeddings/token_type_embeddings with shape [2, 128]
Loading TF weight bert/embeddings/token_type_embeddings/adam_m with shape [2, 128]
Loading TF weight bert/embeddings/token_type_embeddings/adam_v with shape [2, 128]
Loading TF weight bert/embeddings/word_embeddings with shape [31990, 128]
Loading TF weight bert/embeddings/word_embeddings/adam_m with shape [31990, 128]
Loading TF weight bert/embeddings/word_embeddings/adam_v with shape [31990, 128]
Loading TF weight bert/encoder/embedding_hidden_mapping_in/bias with shape [768]
Loading TF weight bert/encoder/embedding_hidden_mapping_in/bias/adam_m with shape [768]
Loading TF weight bert/encoder/embedding_hidden_mapping_in/bias/adam_v with shape [768]
Loading TF weight bert/encoder/embedding_hidden_mapping_in/kernel with shape [128, 768]
Loading TF weight bert/encoder/embedding_hidden_mapping_in/kernel/adam_m with shape [128, 768]
Loading TF weight bert/encoder/embedding_hidden_mapping_in/kernel/adam_v with shape [128, 768]
Loading TF weight bert/encoder/transformer/group_0/inner_group_0/attention_1/output/dense/bias with shape [768]
Loading TF weight bert/encoder/transformer/group_0/inner_group_0/attention_1/output/dense/bias/adam_m with shape [768]
Loading TF weight bert/encoder/transformer/group_0/inner_group_0/attention_1/output/dense/bias/adam_v with shape [768]
Loading TF weight bert/encoder/transformer/group_0/inner_group_0/attention_1/output/dense/kernel with shape [768, 768]
Loading TF weight bert/encoder/transformer/group_0/inner_group_0/attention_1/output/dense/kernel/adam_m with shape [768, 768]
Loading TF weight bert/encoder/transformer/group_0/inner_group_0/attention_1/output/dense/kernel/adam_v with shape [768, 768]
Loading TF weight bert/encoder/transformer/group_0/inner_group_0/attention_1/self/key/bias with shape [768]
Loading TF weight bert/encoder/transformer/group_0/inner_group_0/attention_1/self/key/bias/adam_m with shape [768]
Loading TF weight bert/encoder/transformer/group_0/inner_group_0/attention_1/self/key/bias/adam_v with shape [768]
Loading TF weight bert/encoder/transformer/group_0/inner_group_0/attention_1/self/key/kernel with shape [768, 768]
Loading TF weight bert/encoder/transformer/group_0/inner_group_0/attention_1/self/key/kernel/adam_m with shape [768, 768]
Loading TF weight bert/encoder/transformer/group_0/inner_group_0/attention_1/self/key/kernel/adam_v with shape [768, 768]
Loading TF weight bert/encoder/transformer/group_0/inner_group_0/attention_1/self/query/bias with shape [768]
Loading TF weight bert/encoder/transformer/group_0/inner_group_0/attention_1/self/query/bias/adam_m with shape [768]
Loading TF weight bert/encoder/transformer/group_0/inner_group_0/attention_1/self/query/bias/adam_v with shape [768]
Loading TF weight bert/encoder/transformer/group_0/inner_group_0/attention_1/self/query/kernel with shape [768, 768]
Loading TF weight bert/encoder/transformer/group_0/inner_group_0/attention_1/self/query/kernel/adam_m with shape [768, 768]
Loading TF weight bert/encoder/transformer/group_0/inner_group_0/attention_1/self/query/kernel/adam_v with shape [768, 768]
Loading TF weight bert/encoder/transformer/group_0/inner_group_0/attention_1/self/value/bias with shape [768]
Loading TF weight bert/encoder/transformer/group_0/inner_group_0/attention_1/self/value/bias/adam_m with shape [768]
Loading TF weight bert/encoder/transformer/group_0/inner_group_0/attention_1/self/value/bias/adam_v with shape [768]
Loading TF weight bert/encoder/transformer/group_0/inner_group_0/attention_1/self/value/kernel with shape [768, 768]
Loading TF weight bert/encoder/transformer/group_0/inner_group_0/attention_1/self/value/kernel/adam_m with shape [768, 768]
Loading TF weight bert/encoder/transformer/group_0/inner_group_0/attention_1/self/value/kernel/adam_v with shape [768, 768]
Loading TF weight bert/encoder/transformer/group_0/inner_group_0/ffn_1/intermediate/dense/bias with shape [3072]
Loading TF weight bert/encoder/transformer/group_0/inner_group_0/ffn_1/intermediate/dense/bias/adam_m with shape [3072]
Loading TF weight bert/encoder/transformer/group_0/inner_group_0/ffn_1/intermediate/dense/bias/adam_v with shape [3072]
Loading TF weight bert/encoder/transformer/group_0/inner_group_0/ffn_1/intermediate/dense/kernel with shape [768, 3072]
Loading TF weight bert/encoder/transformer/group_0/inner_group_0/ffn_1/intermediate/dense/kernel/adam_m with shape [768, 3072]
Loading TF weight bert/encoder/transformer/group_0/inner_group_0/ffn_1/intermediate/dense/kernel/adam_v with shape [768, 3072]
Loading TF weight bert/encoder/transformer/group_0/inner_group_0/ffn_1/intermediate/output/dense/bias with shape [768]
Loading TF weight bert/encoder/transformer/group_0/inner_group_0/ffn_1/intermediate/output/dense/bias/adam_m with shape [768]
Loading TF weight bert/encoder/transformer/group_0/inner_group_0/ffn_1/intermediate/output/dense/bias/adam_v with shape [768]
Loading TF weight bert/encoder/transformer/group_0/inner_group_0/ffn_1/intermediate/output/dense/kernel with shape [3072, 768]
Loading TF weight bert/encoder/transformer/group_0/inner_group_0/ffn_1/intermediate/output/dense/kernel/adam_m with shape [3072, 768]
Loading TF weight bert/encoder/transformer/group_0/inner_group_0/ffn_1/intermediate/output/dense/kernel/adam_v with shape [3072, 768]
Loading TF weight bert/encoder/transformer/group_0/layer_0/inner_group_0/layer_normalization_1/beta with shape [768]
Loading TF weight bert/encoder/transformer/group_0/layer_0/inner_group_0/layer_normalization_1/beta/adam_m with shape [768]
Loading TF weight bert/encoder/transformer/group_0/layer_0/inner_group_0/layer_normalization_1/beta/adam_v with shape [768]
Loading TF weight bert/encoder/transformer/group_0/layer_0/inner_group_0/layer_normalization_1/gamma with shape [768]
Loading TF weight bert/encoder/transformer/group_0/layer_0/inner_group_0/layer_normalization_1/gamma/adam_m with shape [768]
Loading TF weight bert/encoder/transformer/group_0/layer_0/inner_group_0/layer_normalization_1/gamma/adam_v with shape [768]
Loading TF weight bert/encoder/transformer/group_0/layer_0/inner_group_0/layer_normalization_2/beta with shape [768]
Loading TF weight bert/encoder/transformer/group_0/layer_0/inner_group_0/layer_normalization_2/beta/adam_m with shape [768]
Loading TF weight bert/encoder/transformer/group_0/layer_0/inner_group_0/layer_normalization_2/beta/adam_v with shape [768]
Loading TF weight bert/encoder/transformer/group_0/layer_0/inner_group_0/layer_normalization_2/gamma with shape [768]
Loading TF weight bert/encoder/transformer/group_0/layer_0/inner_group_0/layer_normalization_2/gamma/adam_m with shape [768]
Loading TF weight bert/encoder/transformer/group_0/layer_0/inner_group_0/layer_normalization_2/gamma/adam_v with shape [768]
Loading TF weight bert/encoder/transformer/group_0_1/layer_1/inner_group_0/layer_normalization_3/beta with shape [768]
Loading TF weight bert/encoder/transformer/group_0_1/layer_1/inner_group_0/layer_normalization_3/beta/adam_m with shape [768]
Loading TF weight bert/encoder/transformer/group_0_1/layer_1/inner_group_0/layer_normalization_3/beta/adam_v with shape [768]
Loading TF weight bert/encoder/transformer/group_0_1/layer_1/inner_group_0/layer_normalization_3/gamma with shape [768]
Loading TF weight bert/encoder/transformer/group_0_1/layer_1/inner_group_0/layer_normalization_3/gamma/adam_m with shape [768]
Loading TF weight bert/encoder/transformer/group_0_1/layer_1/inner_group_0/layer_normalization_3/gamma/adam_v with shape [768]
Loading TF weight bert/encoder/transformer/group_0_1/layer_1/inner_group_0/layer_normalization_4/beta with shape [768]
Loading TF weight bert/encoder/transformer/group_0_1/layer_1/inner_group_0/layer_normalization_4/beta/adam_m with shape [768]
Loading TF weight bert/encoder/transformer/group_0_1/layer_1/inner_group_0/layer_normalization_4/beta/adam_v with shape [768]
Loading TF weight bert/encoder/transformer/group_0_1/layer_1/inner_group_0/layer_normalization_4/gamma with shape [768]
Loading TF weight bert/encoder/transformer/group_0_1/layer_1/inner_group_0/layer_normalization_4/gamma/adam_m with shape [768]
Loading TF weight bert/encoder/transformer/group_0_1/layer_1/inner_group_0/layer_normalization_4/gamma/adam_v with shape [768]
Loading TF weight bert/encoder/transformer/group_0_10/layer_10/inner_group_0/layer_normalization_21/beta with shape [768]
Loading TF weight bert/encoder/transformer/group_0_10/layer_10/inner_group_0/layer_normalization_21/beta/adam_m with shape [768]
Loading TF weight bert/encoder/transformer/group_0_10/layer_10/inner_group_0/layer_normalization_21/beta/adam_v with shape [768]
Loading TF weight bert/encoder/transformer/group_0_10/layer_10/inner_group_0/layer_normalization_21/gamma with shape [768]
Loading TF weight bert/encoder/transformer/group_0_10/layer_10/inner_group_0/layer_normalization_21/gamma/adam_m with shape [768]
Loading TF weight bert/encoder/transformer/group_0_10/layer_10/inner_group_0/layer_normalization_21/gamma/adam_v with shape [768]
Loading TF weight bert/encoder/transformer/group_0_10/layer_10/inner_group_0/layer_normalization_22/beta with shape [768]
Loading TF weight bert/encoder/transformer/group_0_10/layer_10/inner_group_0/layer_normalization_22/beta/adam_m with shape [768]
Loading TF weight bert/encoder/transformer/group_0_10/layer_10/inner_group_0/layer_normalization_22/beta/adam_v with shape [768]
Loading TF weight bert/encoder/transformer/group_0_10/layer_10/inner_group_0/layer_normalization_22/gamma with shape [768]
Loading TF weight bert/encoder/transformer/group_0_10/layer_10/inner_group_0/layer_normalization_22/gamma/adam_m with shape [768]
Loading TF weight bert/encoder/transformer/group_0_10/layer_10/inner_group_0/layer_normalization_22/gamma/adam_v with shape [768]
Loading TF weight bert/encoder/transformer/group_0_11/layer_11/inner_group_0/layer_normalization_23/beta with shape [768]
Loading TF weight bert/encoder/transformer/group_0_11/layer_11/inner_group_0/layer_normalization_23/beta/adam_m with shape [768]
Loading TF weight bert/encoder/transformer/group_0_11/layer_11/inner_group_0/layer_normalization_23/beta/adam_v with shape [768]
Loading TF weight bert/encoder/transformer/group_0_11/layer_11/inner_group_0/layer_normalization_23/gamma with shape [768]
Loading TF weight bert/encoder/transformer/group_0_11/layer_11/inner_group_0/layer_normalization_23/gamma/adam_m with shape [768]
Loading TF weight bert/encoder/transformer/group_0_11/layer_11/inner_group_0/layer_normalization_23/gamma/adam_v with shape [768]
Loading TF weight bert/encoder/transformer/group_0_11/layer_11/inner_group_0/layer_normalization_24/beta with shape [768]
Loading TF weight bert/encoder/transformer/group_0_11/layer_11/inner_group_0/layer_normalization_24/beta/adam_m with shape [768]
Loading TF weight bert/encoder/transformer/group_0_11/layer_11/inner_group_0/layer_normalization_24/beta/adam_v with shape [768]
Loading TF weight bert/encoder/transformer/group_0_11/layer_11/inner_group_0/layer_normalization_24/gamma with shape [768]
Loading TF weight bert/encoder/transformer/group_0_11/layer_11/inner_group_0/layer_normalization_24/gamma/adam_m with shape [768]
Loading TF weight bert/encoder/transformer/group_0_11/layer_11/inner_group_0/layer_normalization_24/gamma/adam_v with shape [768]
Loading TF weight bert/encoder/transformer/group_0_2/layer_2/inner_group_0/layer_normalization_5/beta with shape [768]
Loading TF weight bert/encoder/transformer/group_0_2/layer_2/inner_group_0/layer_normalization_5/beta/adam_m with shape [768]
Loading TF weight bert/encoder/transformer/group_0_2/layer_2/inner_group_0/layer_normalization_5/beta/adam_v with shape [768]
Loading TF weight bert/encoder/transformer/group_0_2/layer_2/inner_group_0/layer_normalization_5/gamma with shape [768]
Loading TF weight bert/encoder/transformer/group_0_2/layer_2/inner_group_0/layer_normalization_5/gamma/adam_m with shape [768]
Loading TF weight bert/encoder/transformer/group_0_2/layer_2/inner_group_0/layer_normalization_5/gamma/adam_v with shape [768]
Loading TF weight bert/encoder/transformer/group_0_2/layer_2/inner_group_0/layer_normalization_6/beta with shape [768]
Loading TF weight bert/encoder/transformer/group_0_2/layer_2/inner_group_0/layer_normalization_6/beta/adam_m with shape [768]
Loading TF weight bert/encoder/transformer/group_0_2/layer_2/inner_group_0/layer_normalization_6/beta/adam_v with shape [768]
Loading TF weight bert/encoder/transformer/group_0_2/layer_2/inner_group_0/layer_normalization_6/gamma with shape [768]
Loading TF weight bert/encoder/transformer/group_0_2/layer_2/inner_group_0/layer_normalization_6/gamma/adam_m with shape [768]
Loading TF weight bert/encoder/transformer/group_0_2/layer_2/inner_group_0/layer_normalization_6/gamma/adam_v with shape [768]
Loading TF weight bert/encoder/transformer/group_0_3/layer_3/inner_group_0/layer_normalization_7/beta with shape [768]
Loading TF weight bert/encoder/transformer/group_0_3/layer_3/inner_group_0/layer_normalization_7/beta/adam_m with shape [768]
Loading TF weight bert/encoder/transformer/group_0_3/layer_3/inner_group_0/layer_normalization_7/beta/adam_v with shape [768]
Loading TF weight bert/encoder/transformer/group_0_3/layer_3/inner_group_0/layer_normalization_7/gamma with shape [768]
Loading TF weight bert/encoder/transformer/group_0_3/layer_3/inner_group_0/layer_normalization_7/gamma/adam_m with shape [768]
Loading TF weight bert/encoder/transformer/group_0_3/layer_3/inner_group_0/layer_normalization_7/gamma/adam_v with shape [768]
Loading TF weight bert/encoder/transformer/group_0_3/layer_3/inner_group_0/layer_normalization_8/beta with shape [768]
Loading TF weight bert/encoder/transformer/group_0_3/layer_3/inner_group_0/layer_normalization_8/beta/adam_m with shape [768]
Loading TF weight bert/encoder/transformer/group_0_3/layer_3/inner_group_0/layer_normalization_8/beta/adam_v with shape [768]
Loading TF weight bert/encoder/transformer/group_0_3/layer_3/inner_group_0/layer_normalization_8/gamma with shape [768]
Loading TF weight bert/encoder/transformer/group_0_3/layer_3/inner_group_0/layer_normalization_8/gamma/adam_m with shape [768]
Loading TF weight bert/encoder/transformer/group_0_3/layer_3/inner_group_0/layer_normalization_8/gamma/adam_v with shape [768]
Loading TF weight bert/encoder/transformer/group_0_4/layer_4/inner_group_0/layer_normalization_10/beta with shape [768]
Loading TF weight bert/encoder/transformer/group_0_4/layer_4/inner_group_0/layer_normalization_10/beta/adam_m with shape [768]
Loading TF weight bert/encoder/transformer/group_0_4/layer_4/inner_group_0/layer_normalization_10/beta/adam_v with shape [768]
Loading TF weight bert/encoder/transformer/group_0_4/layer_4/inner_group_0/layer_normalization_10/gamma with shape [768]
Loading TF weight bert/encoder/transformer/group_0_4/layer_4/inner_group_0/layer_normalization_10/gamma/adam_m with shape [768]
Loading TF weight bert/encoder/transformer/group_0_4/layer_4/inner_group_0/layer_normalization_10/gamma/adam_v with shape [768]
Loading TF weight bert/encoder/transformer/group_0_4/layer_4/inner_group_0/layer_normalization_9/beta with shape [768]
Loading TF weight bert/encoder/transformer/group_0_4/layer_4/inner_group_0/layer_normalization_9/beta/adam_m with shape [768]
Loading TF weight bert/encoder/transformer/group_0_4/layer_4/inner_group_0/layer_normalization_9/beta/adam_v with shape [768]
Loading TF weight bert/encoder/transformer/group_0_4/layer_4/inner_group_0/layer_normalization_9/gamma with shape [768]
Loading TF weight bert/encoder/transformer/group_0_4/layer_4/inner_group_0/layer_normalization_9/gamma/adam_m with shape [768]
Loading TF weight bert/encoder/transformer/group_0_4/layer_4/inner_group_0/layer_normalization_9/gamma/adam_v with shape [768]
Loading TF weight bert/encoder/transformer/group_0_5/layer_5/inner_group_0/layer_normalization_11/beta with shape [768]
Loading TF weight bert/encoder/transformer/group_0_5/layer_5/inner_group_0/layer_normalization_11/beta/adam_m with shape [768]
Loading TF weight bert/encoder/transformer/group_0_5/layer_5/inner_group_0/layer_normalization_11/beta/adam_v with shape [768]
Loading TF weight bert/encoder/transformer/group_0_5/layer_5/inner_group_0/layer_normalization_11/gamma with shape [768]
Loading TF weight bert/encoder/transformer/group_0_5/layer_5/inner_group_0/layer_normalization_11/gamma/adam_m with shape [768]
Loading TF weight bert/encoder/transformer/group_0_5/layer_5/inner_group_0/layer_normalization_11/gamma/adam_v with shape [768]
Loading TF weight bert/encoder/transformer/group_0_5/layer_5/inner_group_0/layer_normalization_12/beta with shape [768]
Loading TF weight bert/encoder/transformer/group_0_5/layer_5/inner_group_0/layer_normalization_12/beta/adam_m with shape [768]
Loading TF weight bert/encoder/transformer/group_0_5/layer_5/inner_group_0/layer_normalization_12/beta/adam_v with shape [768]
Loading TF weight bert/encoder/transformer/group_0_5/layer_5/inner_group_0/layer_normalization_12/gamma with shape [768]
Loading TF weight bert/encoder/transformer/group_0_5/layer_5/inner_group_0/layer_normalization_12/gamma/adam_m with shape [768]
Loading TF weight bert/encoder/transformer/group_0_5/layer_5/inner_group_0/layer_normalization_12/gamma/adam_v with shape [768]
Loading TF weight bert/encoder/transformer/group_0_6/layer_6/inner_group_0/layer_normalization_13/beta with shape [768]
Loading TF weight bert/encoder/transformer/group_0_6/layer_6/inner_group_0/layer_normalization_13/beta/adam_m with shape [768]
Loading TF weight bert/encoder/transformer/group_0_6/layer_6/inner_group_0/layer_normalization_13/beta/adam_v with shape [768]
Loading TF weight bert/encoder/transformer/group_0_6/layer_6/inner_group_0/layer_normalization_13/gamma with shape [768]
Loading TF weight bert/encoder/transformer/group_0_6/layer_6/inner_group_0/layer_normalization_13/gamma/adam_m with shape [768]
Loading TF weight bert/encoder/transformer/group_0_6/layer_6/inner_group_0/layer_normalization_13/gamma/adam_v with shape [768]
Loading TF weight bert/encoder/transformer/group_0_6/layer_6/inner_group_0/layer_normalization_14/beta with shape [768]
Loading TF weight bert/encoder/transformer/group_0_6/layer_6/inner_group_0/layer_normalization_14/beta/adam_m with shape [768]
Loading TF weight bert/encoder/transformer/group_0_6/layer_6/inner_group_0/layer_normalization_14/beta/adam_v with shape [768]
Loading TF weight bert/encoder/transformer/group_0_6/layer_6/inner_group_0/layer_normalization_14/gamma with shape [768]
Loading TF weight bert/encoder/transformer/group_0_6/layer_6/inner_group_0/layer_normalization_14/gamma/adam_m with shape [768]
Loading TF weight bert/encoder/transformer/group_0_6/layer_6/inner_group_0/layer_normalization_14/gamma/adam_v with shape [768]
Loading TF weight bert/encoder/transformer/group_0_7/layer_7/inner_group_0/layer_normalization_15/beta with shape [768]
Loading TF weight bert/encoder/transformer/group_0_7/layer_7/inner_group_0/layer_normalization_15/beta/adam_m with shape [768]
Loading TF weight bert/encoder/transformer/group_0_7/layer_7/inner_group_0/layer_normalization_15/beta/adam_v with shape [768]
Loading TF weight bert/encoder/transformer/group_0_7/layer_7/inner_group_0/layer_normalization_15/gamma with shape [768]
Loading TF weight bert/encoder/transformer/group_0_7/layer_7/inner_group_0/layer_normalization_15/gamma/adam_m with shape [768]
Loading TF weight bert/encoder/transformer/group_0_7/layer_7/inner_group_0/layer_normalization_15/gamma/adam_v with shape [768]
Loading TF weight bert/encoder/transformer/group_0_7/layer_7/inner_group_0/layer_normalization_16/beta with shape [768]
Loading TF weight bert/encoder/transformer/group_0_7/layer_7/inner_group_0/layer_normalization_16/beta/adam_m with shape [768]
Loading TF weight bert/encoder/transformer/group_0_7/layer_7/inner_group_0/layer_normalization_16/beta/adam_v with shape [768]
Loading TF weight bert/encoder/transformer/group_0_7/layer_7/inner_group_0/layer_normalization_16/gamma with shape [768]
Loading TF weight bert/encoder/transformer/group_0_7/layer_7/inner_group_0/layer_normalization_16/gamma/adam_m with shape [768]
Loading TF weight bert/encoder/transformer/group_0_7/layer_7/inner_group_0/layer_normalization_16/gamma/adam_v with shape [768]
Loading TF weight bert/encoder/transformer/group_0_8/layer_8/inner_group_0/layer_normalization_17/beta with shape [768]
Loading TF weight bert/encoder/transformer/group_0_8/layer_8/inner_group_0/layer_normalization_17/beta/adam_m with shape [768]
Loading TF weight bert/encoder/transformer/group_0_8/layer_8/inner_group_0/layer_normalization_17/beta/adam_v with shape [768]
Loading TF weight bert/encoder/transformer/group_0_8/layer_8/inner_group_0/layer_normalization_17/gamma with shape [768]
Loading TF weight bert/encoder/transformer/group_0_8/layer_8/inner_group_0/layer_normalization_17/gamma/adam_m with shape [768]
Loading TF weight bert/encoder/transformer/group_0_8/layer_8/inner_group_0/layer_normalization_17/gamma/adam_v with shape [768]
Loading TF weight bert/encoder/transformer/group_0_8/layer_8/inner_group_0/layer_normalization_18/beta with shape [768]
Loading TF weight bert/encoder/transformer/group_0_8/layer_8/inner_group_0/layer_normalization_18/beta/adam_m with shape [768]
Loading TF weight bert/encoder/transformer/group_0_8/layer_8/inner_group_0/layer_normalization_18/beta/adam_v with shape [768]
Loading TF weight bert/encoder/transformer/group_0_8/layer_8/inner_group_0/layer_normalization_18/gamma with shape [768]
Loading TF weight bert/encoder/transformer/group_0_8/layer_8/inner_group_0/layer_normalization_18/gamma/adam_m with shape [768]
Loading TF weight bert/encoder/transformer/group_0_8/layer_8/inner_group_0/layer_normalization_18/gamma/adam_v with shape [768]
Loading TF weight bert/encoder/transformer/group_0_9/layer_9/inner_group_0/layer_normalization_19/beta with shape [768]
Loading TF weight bert/encoder/transformer/group_0_9/layer_9/inner_group_0/layer_normalization_19/beta/adam_m with shape [768]
Loading TF weight bert/encoder/transformer/group_0_9/layer_9/inner_group_0/layer_normalization_19/beta/adam_v with shape [768]
Loading TF weight bert/encoder/transformer/group_0_9/layer_9/inner_group_0/layer_normalization_19/gamma with shape [768]
Loading TF weight bert/encoder/transformer/group_0_9/layer_9/inner_group_0/layer_normalization_19/gamma/adam_m with shape [768]
Loading TF weight bert/encoder/transformer/group_0_9/layer_9/inner_group_0/layer_normalization_19/gamma/adam_v with shape [768]
Loading TF weight bert/encoder/transformer/group_0_9/layer_9/inner_group_0/layer_normalization_20/beta with shape [768]
Loading TF weight bert/encoder/transformer/group_0_9/layer_9/inner_group_0/layer_normalization_20/beta/adam_m with shape [768]
Loading TF weight bert/encoder/transformer/group_0_9/layer_9/inner_group_0/layer_normalization_20/beta/adam_v with shape [768]
Loading TF weight bert/encoder/transformer/group_0_9/layer_9/inner_group_0/layer_normalization_20/gamma with shape [768]
Loading TF weight bert/encoder/transformer/group_0_9/layer_9/inner_group_0/layer_normalization_20/gamma/adam_m with shape [768]
Loading TF weight bert/encoder/transformer/group_0_9/layer_9/inner_group_0/layer_normalization_20/gamma/adam_v with shape [768]
Loading TF weight bert/pooler/dense/bias with shape [768]
Loading TF weight bert/pooler/dense/bias/adam_m with shape [768]
Loading TF weight bert/pooler/dense/bias/adam_v with shape [768]
Loading TF weight bert/pooler/dense/kernel with shape [768, 768]
Loading TF weight bert/pooler/dense/kernel/adam_m with shape [768, 768]
Loading TF weight bert/pooler/dense/kernel/adam_v with shape [768, 768]
Loading TF weight cls/predictions/output_bias with shape [31990]
Loading TF weight cls/predictions/output_bias/adam_m with shape [31990]
Loading TF weight cls/predictions/output_bias/adam_v with shape [31990]
Loading TF weight cls/predictions/transform/dense/bias with shape [128]
Loading TF weight cls/predictions/transform/dense/bias/adam_m with shape [128]
Loading TF weight cls/predictions/transform/dense/bias/adam_v with shape [128]
Loading TF weight cls/predictions/transform/dense/kernel with shape [768, 128]
Loading TF weight cls/predictions/transform/dense/kernel/adam_m with shape [768, 128]
Loading TF weight cls/predictions/transform/dense/kernel/adam_v with shape [768, 128]
Loading TF weight cls/predictions/transform/layer_normalization_25/beta with shape [128]
Loading TF weight cls/predictions/transform/layer_normalization_25/beta/adam_m with shape [128]
Loading TF weight cls/predictions/transform/layer_normalization_25/beta/adam_v with shape [128]
Loading TF weight cls/predictions/transform/layer_normalization_25/gamma with shape [128]
Loading TF weight cls/predictions/transform/layer_normalization_25/gamma/adam_m with shape [128]
Loading TF weight cls/predictions/transform/layer_normalization_25/gamma/adam_v with shape [128]
Loading TF weight cls/seq_relationship/output_bias with shape [2]
Loading TF weight cls/seq_relationship/output_bias/adam_m with shape [2]
Loading TF weight cls/seq_relationship/output_bias/adam_v with shape [2]
Loading TF weight cls/seq_relationship/output_weights with shape [2, 768]
Loading TF weight cls/seq_relationship/output_weights/adam_m with shape [2, 768]
Loading TF weight cls/seq_relationship/output_weights/adam_v with shape [2, 768]
Loading TF weight global_step with shape []
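
The log above mixes model weights with Adam optimizer slots (`adam_m`, `adam_v`) and the `global_step` counter. When debugging a conversion, it can help to separate the two groups. The following is a minimal diagnostic sketch that parses log lines in the format shown above. The function names `parse_log` and `split_weights` are hypothetical and not part of transformers, and the sample lines are copied from the log.

```python
import re

# Pattern for the converter's log lines, e.g.
# "Loading TF weight bert/pooler/dense/bias with shape [768]"
LOG_LINE = re.compile(r"^Loading TF weight (\S+) with shape \[([\d, ]*)\]$")

# Name suffixes seen in the log that belong to the Adam optimizer, not the model.
OPTIMIZER_SUFFIXES = ("adam_m", "adam_v")


def parse_log(lines):
    """Return {variable_name: shape_tuple} for every matching log line."""
    weights = {}
    for line in lines:
        match = LOG_LINE.match(line.strip())
        if match:
            name, dims = match.groups()
            # "[]" (a scalar such as global_step) yields an empty tuple.
            shape = tuple(int(d) for d in dims.split(",") if d.strip())
            weights[name] = shape
    return weights


def split_weights(weights):
    """Separate model weights from optimizer slots and the step counter."""
    model, training_state = {}, {}
    for name, shape in weights.items():
        last = name.split("/")[-1]
        if last in OPTIMIZER_SUFFIXES or name == "global_step":
            training_state[name] = shape
        else:
            model[name] = shape
    return model, training_state


sample = [
    "Loading TF weight bert/pooler/dense/bias with shape [768]",
    "Loading TF weight bert/pooler/dense/bias/adam_m with shape [768]",
    "Loading TF weight bert/pooler/dense/bias/adam_v with shape [768]",
    "Loading TF weight bert/pooler/dense/kernel with shape [768, 768]",
    "Loading TF weight global_step with shape []",
]
model, training_state = split_weights(parse_log(sample))
print(sorted(model))
print(sorted(training_state))
```

Running this over the full log leaves only the variables that need a counterpart in the PyTorch model, which makes unexpected names easier to spot.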
bert/embeddings/layer_normalization/beta
bert/embeddings/layer_normalization/beta/adam_m
bert/embeddings/layer_normalization/beta/adam_v
bert/embeddings/layer_normalization/gamma
bert/embeddings/layer_normalization/gamma/adam_m
bert/embeddings/layer_normalization/gamma/adam_v
bert/embeddings/position_embeddings
bert/embeddings/position_embeddings/adam_m
bert/embeddings/position_embeddings/adam_v
bert/embeddings/token_type_embeddings
bert/embeddings/token_type_embeddings/adam_m
bert/embeddings/token_type_embeddings/adam_v
bert/embeddings/word_embeddings
bert/embeddings/word_embeddings/adam_m
bert/embeddings/word_embeddings/adam_v
bert/encoder/embedding_hidden_mapping_in/bias
bert/encoder/embedding_hidden_mapping_in/bias/adam_m
bert/encoder/embedding_hidden_mapping_in/bias/adam_v
bert/encoder/embedding_hidden_mapping_in/kernel
bert/encoder/embedding_hidden_mapping_in/kernel/adam_m
bert/encoder/embedding_hidden_mapping_in/kernel/adam_v
bert/encoder/transformer/group_0/inner_group_0/attention_1/output/dense/bias
bert/encoder/transformer/group_0/inner_group_0/attention_1/output/dense/bias/adam_m
bert/encoder/transformer/group_0/inner_group_0/attention_1/output/dense/bias/adam_v
bert/encoder/transformer/group_0/inner_group_0/attention_1/output/dense/kernel
bert/encoder/transformer/group_0/inner_group_0/attention_1/output/dense/kernel/adam_m
bert/encoder/transformer/group_0/inner_group_0/attention_1/output/dense/kernel/adam_v
bert/encoder/transformer/group_0/inner_group_0/attention_1/self/key/bias
bert/encoder/transformer/group_0/inner_group_0/attention_1/self/key/bias/adam_m
bert/encoder/transformer/group_0/inner_group_0/attention_1/self/key/bias/adam_v
bert/encoder/transformer/group_0/inner_group_0/attention_1/self/key/kernel
bert/encoder/transformer/group_0/inner_group_0/attention_1/self/key/kernel/adam_m
bert/encoder/transformer/group_0/inner_group_0/attention_1/self/key/kernel/adam_v
bert/encoder/transformer/group_0/inner_group_0/attention_1/self/query/bias
bert/encoder/transformer/group_0/inner_group_0/attention_1/self/query/bias/adam_m
bert/encoder/transformer/group_0/inner_group_0/attention_1/self/query/bias/adam_v
bert/encoder/transformer/group_0/inner_group_0/attention_1/self/query/kernel
bert/encoder/transformer/group_0/inner_group_0/attention_1/self/query/kernel/adam_m
bert/encoder/transformer/group_0/inner_group_0/attention_1/self/query/kernel/adam_v
bert/encoder/transformer/group_0/inner_group_0/attention_1/self/value/bias
bert/encoder/transformer/group_0/inner_group_0/attention_1/self/value/bias/adam_m
bert/encoder/transformer/group_0/inner_group_0/attention_1/self/value/bias/adam_v
bert/encoder/transformer/group_0/inner_group_0/attention_1/self/value/kernel
bert/encoder/transformer/group_0/inner_group_0/attention_1/self/value/kernel/adam_m
bert/encoder/transformer/group_0/inner_group_0/attention_1/self/value/kernel/adam_v
bert/encoder/transformer/group_0/inner_group_0/ffn_1/intermediate/dense/bias
bert/encoder/transformer/group_0/inner_group_0/ffn_1/intermediate/dense/bias/adam_m
bert/encoder/transformer/group_0/inner_group_0/ffn_1/intermediate/dense/bias/adam_v
bert/encoder/transformer/group_0/inner_group_0/ffn_1/intermediate/dense/kernel
bert/encoder/transformer/group_0/inner_group_0/ffn_1/intermediate/dense/kernel/adam_m
bert/encoder/transformer/group_0/inner_group_0/ffn_1/intermediate/dense/kernel/adam_v
bert/encoder/transformer/group_0/inner_group_0/ffn_1/intermediate/output/dense/bias
bert/encoder/transformer/group_0/inner_group_0/ffn_1/intermediate/output/dense/bias/adam_m
bert/encoder/transformer/group_0/inner_group_0/ffn_1/intermediate/output/dense/bias/adam_v
bert/encoder/transformer/group_0/inner_group_0/ffn_1/intermediate/output/dense/kernel
bert/encoder/transformer/group_0/inner_group_0/ffn_1/intermediate/output/dense/kernel/adam_m
bert/encoder/transformer/group_0/inner_group_0/ffn_1/intermediate/output/dense/kernel/adam_v
bert/encoder/transformer/group_0/layer_0/inner_group_0/layer_normalization_1/beta
bert/encoder/transformer/group_0/layer_0/inner_group_0/layer_normalization_1/beta/adam_m
bert/encoder/transformer/group_0/layer_0/inner_group_0/layer_normalization_1/beta/adam_v
bert/encoder/transformer/group_0/layer_0/inner_group_0/layer_normalization_1/gamma
bert/encoder/transformer/group_0/layer_0/inner_group_0/layer_normalization_1/gamma/adam_m
bert/encoder/transformer/group_0/layer_0/inner_group_0/layer_normalization_1/gamma/adam_v
bert/encoder/transformer/group_0/layer_0/inner_group_0/layer_normalization_2/beta
bert/encoder/transformer/group_0/layer_0/inner_group_0/layer_normalization_2/beta/adam_m
bert/encoder/transformer/group_0/layer_0/inner_group_0/layer_normalization_2/beta/adam_v
bert/encoder/transformer/group_0/layer_0/inner_group_0/layer_normalization_2/gamma
bert/encoder/transformer/group_0/layer_0/inner_group_0/layer_normalization_2/gamma/adam_m
bert/encoder/transformer/group_0/layer_0/inner_group_0/layer_normalization_2/gamma/adam_v
bert/encoder/transformer/group_0_1/layer_1/inner_group_0/layer_normalization_3/beta
bert/encoder/transformer/group_0_1/layer_1/inner_group_0/layer_normalization_3/beta/adam_m
bert/encoder/transformer/group_0_1/layer_1/inner_group_0/layer_normalization_3/beta/adam_v
bert/encoder/transformer/group_0_1/layer_1/inner_group_0/layer_normalization_3/gamma
bert/encoder/transformer/group_0_1/layer_1/inner_group_0/layer_normalization_3/gamma/adam_m
bert/encoder/transformer/group_0_1/layer_1/inner_group_0/layer_normalization_3/gamma/adam_v
bert/encoder/transformer/group_0_1/layer_1/inner_group_0/layer_normalization_4/beta
bert/encoder/transformer/group_0_1/layer_1/inner_group_0/layer_normalization_4/beta/adam_m
bert/encoder/transformer/group_0_1/layer_1/inner_group_0/layer_normalization_4/beta/adam_v
bert/encoder/transformer/group_0_1/layer_1/inner_group_0/layer_normalization_4/gamma
bert/encoder/transformer/group_0_1/layer_1/inner_group_0/layer_normalization_4/gamma/adam_m
bert/encoder/transformer/group_0_1/layer_1/inner_group_0/layer_normalization_4/gamma/adam_v
bert/encoder/transformer/group_0_10/layer_10/inner_group_0/layer_normalization_21/beta
bert/encoder/transformer/group_0_10/layer_10/inner_group_0/layer_normalization_21/beta/adam_m
bert/encoder/transformer/group_0_10/layer_10/inner_group_0/layer_normalization_21/beta/adam_v
bert/encoder/transformer/group_0_10/layer_10/inner_group_0/layer_normalization_21/gamma
bert/encoder/transformer/group_0_10/layer_10/inner_group_0/layer_normalization_21/gamma/adam_m
bert/encoder/transformer/group_0_10/layer_10/inner_group_0/layer_normalization_21/gamma/adam_v
bert/encoder/transformer/group_0_10/layer_10/inner_group_0/layer_normalization_22/beta
bert/encoder/transformer/group_0_10/layer_10/inner_group_0/layer_normalization_22/beta/adam_m
bert/encoder/transformer/group_0_10/layer_10/inner_group_0/layer_normalization_22/beta/adam_v
bert/encoder/transformer/group_0_10/layer_10/inner_group_0/layer_normalization_22/gamma
bert/encoder/transformer/group_0_10/layer_10/inner_group_0/layer_normalization_22/gamma/adam_m
bert/encoder/transformer/group_0_10/layer_10/inner_group_0/layer_normalization_22/gamma/adam_v
bert/encoder/transformer/group_0_11/layer_11/inner_group_0/layer_normalization_23/beta
bert/encoder/transformer/group_0_11/layer_11/inner_group_0/layer_normalization_23/beta/adam_m
bert/encoder/transformer/group_0_11/layer_11/inner_group_0/layer_normalization_23/beta/adam_v
bert/encoder/transformer/group_0_11/layer_11/inner_group_0/layer_normalization_23/gamma
bert/encoder/transformer/group_0_11/layer_11/inner_group_0/layer_normalization_23/gamma/adam_m
bert/encoder/transformer/group_0_11/layer_11/inner_group_0/layer_normalization_23/gamma/adam_v
bert/encoder/transformer/group_0_11/layer_11/inner_group_0/layer_normalization_24/beta
bert/encoder/transformer/group_0_11/layer_11/inner_group_0/layer_normalization_24/beta/adam_m
bert/encoder/transformer/group_0_11/layer_11/inner_group_0/layer_normalization_24/beta/adam_v
bert/encoder/transformer/group_0_11/layer_11/inner_group_0/layer_normalization_24/gamma
bert/encoder/transformer/group_0_11/layer_11/inner_group_0/layer_normalization_24/gamma/adam_m
bert/encoder/transformer/group_0_11/layer_11/inner_group_0/layer_normalization_24/gamma/adam_v
bert/encoder/transformer/group_0_2/layer_2/inner_group_0/layer_normalization_5/beta
bert/encoder/transformer/group_0_2/layer_2/inner_group_0/layer_normalization_5/beta/adam_m
bert/encoder/transformer/group_0_2/layer_2/inner_group_0/layer_normalization_5/beta/adam_v
bert/encoder/transformer/group_0_2/layer_2/inner_group_0/layer_normalization_5/gamma
bert/encoder/transformer/group_0_2/layer_2/inner_group_0/layer_normalization_5/gamma/adam_m
bert/encoder/transformer/group_0_2/layer_2/inner_group_0/layer_normalization_5/gamma/adam_v
bert/encoder/transformer/group_0_2/layer_2/inner_group_0/layer_normalization_6/beta
bert/encoder/transformer/group_0_2/layer_2/inner_group_0/layer_normalization_6/beta/adam_m
bert/encoder/transformer/group_0_2/layer_2/inner_group_0/layer_normalization_6/beta/adam_v
bert/encoder/transformer/group_0_2/layer_2/inner_group_0/layer_normalization_6/gamma
bert/encoder/transformer/group_0_2/layer_2/inner_group_0/layer_normalization_6/gamma/adam_m
bert/encoder/transformer/group_0_2/layer_2/inner_group_0/layer_normalization_6/gamma/adam_v
bert/encoder/transformer/group_0_3/layer_3/inner_group_0/layer_normalization_7/beta
bert/encoder/transformer/group_0_3/layer_3/inner_group_0/layer_normalization_7/beta/adam_m
bert/encoder/transformer/group_0_3/layer_3/inner_group_0/layer_normalization_7/beta/adam_v
bert/encoder/transformer/group_0_3/layer_3/inner_group_0/layer_normalization_7/gamma
bert/encoder/transformer/group_0_3/layer_3/inner_group_0/layer_normalization_7/gamma/adam_m
bert/encoder/transformer/group_0_3/layer_3/inner_group_0/layer_normalization_7/gamma/adam_v
bert/encoder/transformer/group_0_3/layer_3/inner_group_0/layer_normalization_8/beta
bert/encoder/transformer/group_0_3/layer_3/inner_group_0/layer_normalization_8/beta/adam_m
bert/encoder/transformer/group_0_3/layer_3/inner_group_0/layer_normalization_8/beta/adam_v
bert/encoder/transformer/group_0_3/layer_3/inner_group_0/layer_normalization_8/gamma
bert/encoder/transformer/group_0_3/layer_3/inner_group_0/layer_normalization_8/gamma/adam_m
bert/encoder/transformer/group_0_3/layer_3/inner_group_0/layer_normalization_8/gamma/adam_v
bert/encoder/transformer/group_0_4/layer_4/inner_group_0/layer_normalization_10/beta
bert/encoder/transformer/group_0_4/layer_4/inner_group_0/layer_normalization_10/beta/adam_m
bert/encoder/transformer/group_0_4/layer_4/inner_group_0/layer_normalization_10/beta/adam_v
bert/encoder/transformer/group_0_4/layer_4/inner_group_0/layer_normalization_10/gamma
bert/encoder/transformer/group_0_4/layer_4/inner_group_0/layer_normalization_10/gamma/adam_m
bert/encoder/transformer/group_0_4/layer_4/inner_group_0/layer_normalization_10/gamma/adam_v
bert/encoder/transformer/group_0_4/layer_4/inner_group_0/layer_normalization_9/beta
bert/encoder/transformer/group_0_4/layer_4/inner_group_0/layer_normalization_9/beta/adam_m
bert/encoder/transformer/group_0_4/layer_4/inner_group_0/layer_normalization_9/beta/adam_v
bert/encoder/transformer/group_0_4/layer_4/inner_group_0/layer_normalization_9/gamma
bert/encoder/transformer/group_0_4/layer_4/inner_group_0/layer_normalization_9/gamma/adam_m
bert/encoder/transformer/group_0_4/layer_4/inner_group_0/layer_normalization_9/gamma/adam_v
bert/encoder/transformer/group_0_5/layer_5/inner_group_0/layer_normalization_11/beta
bert/encoder/transformer/group_0_5/layer_5/inner_group_0/layer_normalization_11/beta/adam_m
bert/encoder/transformer/group_0_5/layer_5/inner_group_0/layer_normalization_11/beta/adam_v
bert/encoder/transformer/group_0_5/layer_5/inner_group_0/layer_normalization_11/gamma
bert/encoder/transformer/group_0_5/layer_5/inner_group_0/layer_normalization_11/gamma/adam_m
bert/encoder/transformer/group_0_5/layer_5/inner_group_0/layer_normalization_11/gamma/adam_v
bert/encoder/transformer/group_0_5/layer_5/inner_group_0/layer_normalization_12/beta
bert/encoder/transformer/group_0_5/layer_5/inner_group_0/layer_normalization_12/beta/adam_m
bert/encoder/transformer/group_0_5/layer_5/inner_group_0/layer_normalization_12/beta/adam_v
bert/encoder/transformer/group_0_5/layer_5/inner_group_0/layer_normalization_12/gamma
bert/encoder/transformer/group_0_5/layer_5/inner_group_0/layer_normalization_12/gamma/adam_m
bert/encoder/transformer/group_0_5/layer_5/inner_group_0/layer_normalization_12/gamma/adam_v
bert/encoder/transformer/group_0_6/layer_6/inner_group_0/layer_normalization_13/beta
bert/encoder/transformer/group_0_6/layer_6/inner_group_0/layer_normalization_13/beta/adam_m
bert/encoder/transformer/group_0_6/layer_6/inner_group_0/layer_normalization_13/beta/adam_v
bert/encoder/transformer/group_0_6/layer_6/inner_group_0/layer_normalization_13/gamma
bert/encoder/transformer/group_0_6/layer_6/inner_group_0/layer_normalization_13/gamma/adam_m
bert/encoder/transformer/group_0_6/layer_6/inner_group_0/layer_normalization_13/gamma/adam_v
bert/encoder/transformer/group_0_6/layer_6/inner_group_0/layer_normalization_14/beta
bert/encoder/transformer/group_0_6/layer_6/inner_group_0/layer_normalization_14/beta/adam_m
bert/encoder/transformer/group_0_6/layer_6/inner_group_0/layer_normalization_14/beta/adam_v
bert/encoder/transformer/group_0_6/layer_6/inner_group_0/layer_normalization_14/gamma
bert/encoder/transformer/group_0_6/layer_6/inner_group_0/layer_normalization_14/gamma/adam_m
bert/encoder/transformer/group_0_6/layer_6/inner_group_0/layer_normalization_14/gamma/adam_v
bert/encoder/transformer/group_0_7/layer_7/inner_group_0/layer_normalization_15/beta
bert/encoder/transformer/group_0_7/layer_7/inner_group_0/layer_normalization_15/beta/adam_m
bert/encoder/transformer/group_0_7/layer_7/inner_group_0/layer_normalization_15/beta/adam_v
bert/encoder/transformer/group_0_7/layer_7/inner_group_0/layer_normalization_15/gamma
bert/encoder/transformer/group_0_7/layer_7/inner_group_0/layer_normalization_15/gamma/adam_m
bert/encoder/transformer/group_0_7/layer_7/inner_group_0/layer_normalization_15/gamma/adam_v
bert/encoder/transformer/group_0_7/layer_7/inner_group_0/layer_normalization_16/beta
bert/encoder/transformer/group_0_7/layer_7/inner_group_0/layer_normalization_16/beta/adam_m
bert/encoder/transformer/group_0_7/layer_7/inner_group_0/layer_normalization_16/beta/adam_v
bert/encoder/transformer/group_0_7/layer_7/inner_group_0/layer_normalization_16/gamma
bert/encoder/transformer/group_0_7/layer_7/inner_group_0/layer_normalization_16/gamma/adam_m
bert/encoder/transformer/group_0_7/layer_7/inner_group_0/layer_normalization_16/gamma/adam_v
bert/encoder/transformer/group_0_8/layer_8/inner_group_0/layer_normalization_17/beta
bert/encoder/transformer/group_0_8/layer_8/inner_group_0/layer_normalization_17/beta/adam_m
bert/encoder/transformer/group_0_8/layer_8/inner_group_0/layer_normalization_17/beta/adam_v
bert/encoder/transformer/group_0_8/layer_8/inner_group_0/layer_normalization_17/gamma
bert/encoder/transformer/group_0_8/layer_8/inner_group_0/layer_normalization_17/gamma/adam_m
bert/encoder/transformer/group_0_8/layer_8/inner_group_0/layer_normalization_17/gamma/adam_v
bert/encoder/transformer/group_0_8/layer_8/inner_group_0/layer_normalization_18/beta
bert/encoder/transformer/group_0_8/layer_8/inner_group_0/layer_normalization_18/beta/adam_m
bert/encoder/transformer/group_0_8/layer_8/inner_group_0/layer_normalization_18/beta/adam_v
bert/encoder/transformer/group_0_8/layer_8/inner_group_0/layer_normalization_18/gamma
bert/encoder/transformer/group_0_8/layer_8/inner_group_0/layer_normalization_18/gamma/adam_m
bert/encoder/transformer/group_0_8/layer_8/inner_group_0/layer_normalization_18/gamma/adam_v
bert/encoder/transformer/group_0_9/layer_9/inner_group_0/layer_normalization_19/beta
bert/encoder/transformer/group_0_9/layer_9/inner_group_0/layer_normalization_19/beta/adam_m
bert/encoder/transformer/group_0_9/layer_9/inner_group_0/layer_normalization_19/beta/adam_v
bert/encoder/transformer/group_0_9/layer_9/inner_group_0/layer_normalization_19/gamma
bert/encoder/transformer/group_0_9/layer_9/inner_group_0/layer_normalization_19/gamma/adam_m
bert/encoder/transformer/group_0_9/layer_9/inner_group_0/layer_normalization_19/gamma/adam_v
bert/encoder/transformer/group_0_9/layer_9/inner_group_0/layer_normalization_20/beta
bert/encoder/transformer/group_0_9/layer_9/inner_group_0/layer_normalization_20/beta/adam_m
bert/encoder/transformer/group_0_9/layer_9/inner_group_0/layer_normalization_20/beta/adam_v
bert/encoder/transformer/group_0_9/layer_9/inner_group_0/layer_normalization_20/gamma
bert/encoder/transformer/group_0_9/layer_9/inner_group_0/layer_normalization_20/gamma/adam_m
bert/encoder/transformer/group_0_9/layer_9/inner_group_0/layer_normalization_20/gamma/adam_v
bert/pooler/dense/bias
bert/pooler/dense/bias/adam_m
bert/pooler/dense/bias/adam_v
bert/pooler/dense/kernel
bert/pooler/dense/kernel/adam_m
bert/pooler/dense/kernel/adam_v
cls/predictions/output_bias
cls/predictions/output_bias/adam_m
cls/predictions/output_bias/adam_v
cls/predictions/transform/dense/bias
cls/predictions/transform/dense/bias/adam_m
cls/predictions/transform/dense/bias/adam_v
cls/predictions/transform/dense/kernel
cls/predictions/transform/dense/kernel/adam_m
cls/predictions/transform/dense/kernel/adam_v
cls/predictions/transform/layer_normalization_25/beta
cls/predictions/transform/layer_normalization_25/beta/adam_m
cls/predictions/transform/layer_normalization_25/beta/adam_v
cls/predictions/transform/layer_normalization_25/gamma
cls/predictions/transform/layer_normalization_25/gamma/adam_m
cls/predictions/transform/layer_normalization_25/gamma/adam_v
cls/seq_relationship/output_bias
cls/seq_relationship/output_bias/adam_m
cls/seq_relationship/output_bias/adam_v
cls/seq_relationship/output_weights
cls/seq_relationship/output_weights/adam_m
cls/seq_relationship/output_weights/adam_v
global_step
Skipping albert/embeddings/layer_normalization/beta
Traceback (most recent call last):
  File "/home/dshah/venv/bin/transformers-cli", line 8, in <module>
    sys.exit(main())
  File "/home/dshah/venv/lib64/python3.8/site-packages/transformers/commands/transformers_cli.py", line 33, in main
    service.run()
  File "/home/dshah/venv/lib64/python3.8/site-packages/transformers/commands/convert.py", line 80, in run
    convert_tf_checkpoint_to_pytorch(self._tf_checkpoint, self._config, self._pytorch_dump_output)
  File "/home/dshah/venv/lib64/python3.8/site-packages/transformers/convert_albert_original_tf_checkpoint_to_pytorch.py", line 36, in convert_tf_checkpoint_to_pytorch
    load_tf_weights_in_albert(model, config, tf_checkpoint_path)
  File "/home/dshah/venv/lib64/python3.8/site-packages/transformers/modeling_albert.py", line 163, in load_tf_weights_in_albert
    pointer = getattr(pointer, "bias")
  File "/home/dshah/venv/lib64/python3.8/site-packages/torch/nn/modules/module.py", line 771, in __getattr__
    raise ModuleAttributeError("'{}' object has no attribute '{}'".format(
torch.nn.modules.module.ModuleAttributeError: 'AlbertEmbeddings' object has no attribute 'bias'
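
One possible reading of the log and traceback above: the "Skipping albert/embeddings/layer_normalization/beta" line suggests the loader could not find a `layer_normalization` attribute on the embeddings module, kept its pointer on `AlbertEmbeddings`, and then failed when the next path segment (`beta`) was mapped to `bias`. The sketch below replays that walk in plain Python. It is a simplified illustration, not the actual `load_tf_weights_in_albert` code: `Node`, `walk`, and the toy model tree are hypothetical stand-ins, and the assumption that the PyTorch module exposes the layer norm as `LayerNorm` is mine.

```python
class Node:
    """Hypothetical stand-in for a torch.nn.Module: children are plain attributes."""

    def __init__(self, name, **children):
        self._name = name
        for key, value in children.items():
            setattr(self, key, value)

    def __getattr__(self, attr):
        # Only reached when normal attribute lookup fails.
        raise AttributeError(f"'{self._name}' object has no attribute '{attr}'")


def walk(model, tf_name):
    """Simplified replay of the name walk; returns (final pointer, skipped segments)."""
    pointer, skipped = model, []
    for part in tf_name.split("/"):
        if part in ("kernel", "gamma"):
            pointer = getattr(pointer, "weight")
        elif part in ("output_bias", "beta"):
            pointer = getattr(pointer, "bias")  # no try/except here, so this can raise
        else:
            try:
                pointer = getattr(pointer, part)
            except AttributeError:
                skipped.append(part)  # corresponds to the "Skipping ..." log line
                continue
    return pointer, skipped


# Toy model tree (assumed layout, for illustration only).
layer_norm = Node("LayerNorm", weight="gamma-tensor", bias="beta-tensor")
embeddings = Node("AlbertEmbeddings", LayerNorm=layer_norm)
model = Node("AlbertForPreTraining", albert=Node("AlbertModel", embeddings=embeddings))

# A name with a "LayerNorm" segment resolves to the layer norm's bias.
print(walk(model, "albert/embeddings/LayerNorm/beta"))

# A name with a "layer_normalization" segment skips that segment, then fails on "beta".
try:
    walk(model, "albert/embeddings/layer_normalization/beta")
except AttributeError as exc:
    print(exc)
```

If this reading is right, the checkpoint's variable names (`layer_normalization`, `layer_normalization_1`, ...) differ from what the conversion script expects, which would point at how the checkpoint was trained or exported rather than at the config.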


Ala-Na commented Mar 17, 2023

Hi there!

I'm curious: did you find a solution to this issue?

Thank you
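
No fix is confirmed in this thread. As a starting point for anyone hitting the same error, here is a minimal sketch that triages a checkpoint's variable names: it drops optimizer slots (`adam_m`, `adam_v`) and `global_step`, then flags the remaining names that contain a `layer_normalization` or `layer_normalization_N` segment. The helper name `triage` and the idea that these segments are the ones the converter cannot resolve are assumptions based on the log in the original post, not documented behaviour.

```python
import re

OPTIMIZER_SLOTS = ("adam_m", "adam_v")
# Matches "layer_normalization" and "layer_normalization_7" style path segments.
KERAS_LAYER_NORM = re.compile(r"layer_normalization(_\d+)?")


def triage(names):
    """Return (model weight names, names with Keras-style layer norm segments)."""
    weights, suspicious = [], []
    for name in names:
        parts = name.split("/")
        if parts[-1] in OPTIMIZER_SLOTS or name == "global_step":
            continue  # optimizer state and step counter are not model weights
        weights.append(name)
        if any(KERAS_LAYER_NORM.fullmatch(part) for part in parts):
            suspicious.append(name)
    return weights, suspicious


# A few names copied from the log in the original post.
sample = [
    "bert/pooler/dense/bias",
    "bert/pooler/dense/bias/adam_m",
    "bert/embeddings/layer_normalization/beta",
    "bert/embeddings/layer_normalization/beta/adam_v",
    "cls/predictions/transform/layer_normalization_25/gamma",
    "global_step",
]
weights, suspicious = triage(sample)
print(len(weights), "weights,", len(suspicious), "flagged")
for name in suspicious:
    print("  ", name)
```

For a real checkpoint, the list of names could be taken from `tf.train.list_variables(checkpoint_path)`, which yields `(name, shape)` pairs. Which PyTorch attribute each flagged name should map to still has to be worked out by hand.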
