This repository has been archived by the owner on Jun 18, 2024. It is now read-only.
I am running this command:

transformers-cli convert --model_type albert \
  --tf_checkpoint $ALBERT_BASE_DIR/model.ckpt-64000 \
  --config $ALBERT_BASE_DIR/albert_config.json \
  --pytorch_dump_output $ALBERT_BASE_DIR/pytorch_model.bin

It prints the following config and log:
AlbertConfig {
"attention_probs_dropout_prob": 0,
"bos_token_id": 2,
"classifier_dropout_prob": 0.1,
"down_scale_factor": 1,
"embedding_size": 128,
"eos_token_id": 3,
"gap_size": 0,
"hidden_act": "gelu",
"hidden_dropout_prob": 0,
"hidden_size": 768,
"initializer_range": 0.02,
"inner_group_num": 1,
"intermediate_size": 3072,
"layer_norm_eps": 1e-12,
"max_position_embeddings": 512,
"model_type": "albert",
"net_structure_type": 0,
"num_attention_heads": 12,
"num_hidden_groups": 1,
"num_hidden_layers": 12,
"num_memory_blocks": 0,
"pad_token_id": 0,
"type_vocab_size": 2,
"vocab_size": 31990
}
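
As a quick sanity check on this config, the sketch below computes how many parameters the word-embedding path should hold. It uses only `vocab_size`, `embedding_size`, and `hidden_size` from the config above; the helper name `embedding_param_counts` is made up for illustration and is not part of `transformers`. The two factors correspond to the `[31990, 128]` and `[128, 768]` shapes that appear for `word_embeddings` and `embedding_hidden_mapping_in/kernel` in the log.

```python
# Values copied from the AlbertConfig printed above.
config = {"vocab_size": 31990, "embedding_size": 128, "hidden_size": 768}


def embedding_param_counts(cfg):
    """Return (factorized, unfactorized) word-embedding parameter counts.

    factorized:   vocab x embedding table plus embedding -> hidden projection kernel
    unfactorized: a single vocab x hidden table, for comparison
    """
    v, e, h = cfg["vocab_size"], cfg["embedding_size"], cfg["hidden_size"]
    factorized = v * e + e * h
    unfactorized = v * h
    return factorized, unfactorized


factorized, unfactorized = embedding_param_counts(config)
print(f"factorized:   {factorized:,}")
print(f"unfactorized: {unfactorized:,}")
```

If the shapes reported in the log did not multiply out to these numbers, the config file and the checkpoint would likely come from different training runs.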
Converting TensorFlow checkpoint from /data/NLP/ALBERT_Inspird_Train/albert_base/model.ckpt-64000
Loading TF weight bert/embeddings/layer_normalization/beta with shape [128]
Loading TF weight bert/embeddings/layer_normalization/beta/adam_m with shape [128]
Loading TF weight bert/embeddings/layer_normalization/beta/adam_v with shape [128]
Loading TF weight bert/embeddings/layer_normalization/gamma with shape [128]
Loading TF weight bert/embeddings/layer_normalization/gamma/adam_m with shape [128]
Loading TF weight bert/embeddings/layer_normalization/gamma/adam_v with shape [128]
Loading TF weight bert/embeddings/position_embeddings with shape [512, 128]
Loading TF weight bert/embeddings/position_embeddings/adam_m with shape [512, 128]
Loading TF weight bert/embeddings/position_embeddings/adam_v with shape [512, 128]
Loading TF weight bert/embeddings/token_type_embeddings with shape [2, 128]
Loading TF weight bert/embeddings/token_type_embeddings/adam_m with shape [2, 128]
Loading TF weight bert/embeddings/token_type_embeddings/adam_v with shape [2, 128]
Loading TF weight bert/embeddings/word_embeddings with shape [31990, 128]
Loading TF weight bert/embeddings/word_embeddings/adam_m with shape [31990, 128]
Loading TF weight bert/embeddings/word_embeddings/adam_v with shape [31990, 128]
Loading TF weight bert/encoder/embedding_hidden_mapping_in/bias with shape [768]
Loading TF weight bert/encoder/embedding_hidden_mapping_in/bias/adam_m with shape [768]
Loading TF weight bert/encoder/embedding_hidden_mapping_in/bias/adam_v with shape [768]
Loading TF weight bert/encoder/embedding_hidden_mapping_in/kernel with shape [128, 768]
Loading TF weight bert/encoder/embedding_hidden_mapping_in/kernel/adam_m with shape [128, 768]
Loading TF weight bert/encoder/embedding_hidden_mapping_in/kernel/adam_v with shape [128, 768]
Loading TF weight bert/encoder/transformer/group_0/inner_group_0/attention_1/output/dense/bias with shape [768]
Loading TF weight bert/encoder/transformer/group_0/inner_group_0/attention_1/output/dense/bias/adam_m with shape [768]
Loading TF weight bert/encoder/transformer/group_0/inner_group_0/attention_1/output/dense/bias/adam_v with shape [768]
Loading TF weight bert/encoder/transformer/group_0/inner_group_0/attention_1/output/dense/kernel with shape [768, 768]
Loading TF weight bert/encoder/transformer/group_0/inner_group_0/attention_1/output/dense/kernel/adam_m with shape [768, 768]
Loading TF weight bert/encoder/transformer/group_0/inner_group_0/attention_1/output/dense/kernel/adam_v with shape [768, 768]
Loading TF weight bert/encoder/transformer/group_0/inner_group_0/attention_1/self/key/bias with shape [768]
Loading TF weight bert/encoder/transformer/group_0/inner_group_0/attention_1/self/key/bias/adam_m with shape [768]
Loading TF weight bert/encoder/transformer/group_0/inner_group_0/attention_1/self/key/bias/adam_v with shape [768]
Loading TF weight bert/encoder/transformer/group_0/inner_group_0/attention_1/self/key/kernel with shape [768, 768]
Loading TF weight bert/encoder/transformer/group_0/inner_group_0/attention_1/self/key/kernel/adam_m with shape [768, 768]
Loading TF weight bert/encoder/transformer/group_0/inner_group_0/attention_1/self/key/kernel/adam_v with shape [768, 768]
Loading TF weight bert/encoder/transformer/group_0/inner_group_0/attention_1/self/query/bias with shape [768]
Loading TF weight bert/encoder/transformer/group_0/inner_group_0/attention_1/self/query/bias/adam_m with shape [768]
Loading TF weight bert/encoder/transformer/group_0/inner_group_0/attention_1/self/query/bias/adam_v with shape [768]
Loading TF weight bert/encoder/transformer/group_0/inner_group_0/attention_1/self/query/kernel with shape [768, 768]
Loading TF weight bert/encoder/transformer/group_0/inner_group_0/attention_1/self/query/kernel/adam_m with shape [768, 768]
Loading TF weight bert/encoder/transformer/group_0/inner_group_0/attention_1/self/query/kernel/adam_v with shape [768, 768]
Loading TF weight bert/encoder/transformer/group_0/inner_group_0/attention_1/self/value/bias with shape [768]
Loading TF weight bert/encoder/transformer/group_0/inner_group_0/attention_1/self/value/bias/adam_m with shape [768]
Loading TF weight bert/encoder/transformer/group_0/inner_group_0/attention_1/self/value/bias/adam_v with shape [768]
Loading TF weight bert/encoder/transformer/group_0/inner_group_0/attention_1/self/value/kernel with shape [768, 768]
Loading TF weight bert/encoder/transformer/group_0/inner_group_0/attention_1/self/value/kernel/adam_m with shape [768, 768]
Loading TF weight bert/encoder/transformer/group_0/inner_group_0/attention_1/self/value/kernel/adam_v with shape [768, 768]
Loading TF weight bert/encoder/transformer/group_0/inner_group_0/ffn_1/intermediate/dense/bias with shape [3072]
Loading TF weight bert/encoder/transformer/group_0/inner_group_0/ffn_1/intermediate/dense/bias/adam_m with shape [3072]
Loading TF weight bert/encoder/transformer/group_0/inner_group_0/ffn_1/intermediate/dense/bias/adam_v with shape [3072]
Loading TF weight bert/encoder/transformer/group_0/inner_group_0/ffn_1/intermediate/dense/kernel with shape [768, 3072]
Loading TF weight bert/encoder/transformer/group_0/inner_group_0/ffn_1/intermediate/dense/kernel/adam_m with shape [768, 3072]
Loading TF weight bert/encoder/transformer/group_0/inner_group_0/ffn_1/intermediate/dense/kernel/adam_v with shape [768, 3072]
Loading TF weight bert/encoder/transformer/group_0/inner_group_0/ffn_1/intermediate/output/dense/bias with shape [768]
Loading TF weight bert/encoder/transformer/group_0/inner_group_0/ffn_1/intermediate/output/dense/bias/adam_m with shape [768]
Loading TF weight bert/encoder/transformer/group_0/inner_group_0/ffn_1/intermediate/output/dense/bias/adam_v with shape [768]
Loading TF weight bert/encoder/transformer/group_0/inner_group_0/ffn_1/intermediate/output/dense/kernel with shape [3072, 768]
Loading TF weight bert/encoder/transformer/group_0/inner_group_0/ffn_1/intermediate/output/dense/kernel/adam_m with shape [3072, 768]
Loading TF weight bert/encoder/transformer/group_0/inner_group_0/ffn_1/intermediate/output/dense/kernel/adam_v with shape [3072, 768]
Loading TF weight bert/encoder/transformer/group_0/layer_0/inner_group_0/layer_normalization_1/beta with shape [768]
Loading TF weight bert/encoder/transformer/group_0/layer_0/inner_group_0/layer_normalization_1/beta/adam_m with shape [768]
Loading TF weight bert/encoder/transformer/group_0/layer_0/inner_group_0/layer_normalization_1/beta/adam_v with shape [768]
Loading TF weight bert/encoder/transformer/group_0/layer_0/inner_group_0/layer_normalization_1/gamma with shape [768]
Loading TF weight bert/encoder/transformer/group_0/layer_0/inner_group_0/layer_normalization_1/gamma/adam_m with shape [768]
Loading TF weight bert/encoder/transformer/group_0/layer_0/inner_group_0/layer_normalization_1/gamma/adam_v with shape [768]
Loading TF weight bert/encoder/transformer/group_0/layer_0/inner_group_0/layer_normalization_2/beta with shape [768]
Loading TF weight bert/encoder/transformer/group_0/layer_0/inner_group_0/layer_normalization_2/beta/adam_m with shape [768]
Loading TF weight bert/encoder/transformer/group_0/layer_0/inner_group_0/layer_normalization_2/beta/adam_v with shape [768]
Loading TF weight bert/encoder/transformer/group_0/layer_0/inner_group_0/layer_normalization_2/gamma with shape [768]
Loading TF weight bert/encoder/transformer/group_0/layer_0/inner_group_0/layer_normalization_2/gamma/adam_m with shape [768]
Loading TF weight bert/encoder/transformer/group_0/layer_0/inner_group_0/layer_normalization_2/gamma/adam_v with shape [768]
Loading TF weight bert/encoder/transformer/group_0_1/layer_1/inner_group_0/layer_normalization_3/beta with shape [768]
Loading TF weight bert/encoder/transformer/group_0_1/layer_1/inner_group_0/layer_normalization_3/beta/adam_m with shape [768]
Loading TF weight bert/encoder/transformer/group_0_1/layer_1/inner_group_0/layer_normalization_3/beta/adam_v with shape [768]
Loading TF weight bert/encoder/transformer/group_0_1/layer_1/inner_group_0/layer_normalization_3/gamma with shape [768]
Loading TF weight bert/encoder/transformer/group_0_1/layer_1/inner_group_0/layer_normalization_3/gamma/adam_m with shape [768]
Loading TF weight bert/encoder/transformer/group_0_1/layer_1/inner_group_0/layer_normalization_3/gamma/adam_v with shape [768]
Loading TF weight bert/encoder/transformer/group_0_1/layer_1/inner_group_0/layer_normalization_4/beta with shape [768]
Loading TF weight bert/encoder/transformer/group_0_1/layer_1/inner_group_0/layer_normalization_4/beta/adam_m with shape [768]
Loading TF weight bert/encoder/transformer/group_0_1/layer_1/inner_group_0/layer_normalization_4/beta/adam_v with shape [768]
Loading TF weight bert/encoder/transformer/group_0_1/layer_1/inner_group_0/layer_normalization_4/gamma with shape [768]
Loading TF weight bert/encoder/transformer/group_0_1/layer_1/inner_group_0/layer_normalization_4/gamma/adam_m with shape [768]
Loading TF weight bert/encoder/transformer/group_0_1/layer_1/inner_group_0/layer_normalization_4/gamma/adam_v with shape [768]
Loading TF weight bert/encoder/transformer/group_0_10/layer_10/inner_group_0/layer_normalization_21/beta with shape [768]
Loading TF weight bert/encoder/transformer/group_0_10/layer_10/inner_group_0/layer_normalization_21/beta/adam_m with shape [768]
Loading TF weight bert/encoder/transformer/group_0_10/layer_10/inner_group_0/layer_normalization_21/beta/adam_v with shape [768]
Loading TF weight bert/encoder/transformer/group_0_10/layer_10/inner_group_0/layer_normalization_21/gamma with shape [768]
Loading TF weight bert/encoder/transformer/group_0_10/layer_10/inner_group_0/layer_normalization_21/gamma/adam_m with shape [768]
Loading TF weight bert/encoder/transformer/group_0_10/layer_10/inner_group_0/layer_normalization_21/gamma/adam_v with shape [768]
Loading TF weight bert/encoder/transformer/group_0_10/layer_10/inner_group_0/layer_normalization_22/beta with shape [768]
Loading TF weight bert/encoder/transformer/group_0_10/layer_10/inner_group_0/layer_normalization_22/beta/adam_m with shape [768]
Loading TF weight bert/encoder/transformer/group_0_10/layer_10/inner_group_0/layer_normalization_22/beta/adam_v with shape [768]
Loading TF weight bert/encoder/transformer/group_0_10/layer_10/inner_group_0/layer_normalization_22/gamma with shape [768]
Loading TF weight bert/encoder/transformer/group_0_10/layer_10/inner_group_0/layer_normalization_22/gamma/adam_m with shape [768]
Loading TF weight bert/encoder/transformer/group_0_10/layer_10/inner_group_0/layer_normalization_22/gamma/adam_v with shape [768]
Loading TF weight bert/encoder/transformer/group_0_11/layer_11/inner_group_0/layer_normalization_23/beta with shape [768]
Loading TF weight bert/encoder/transformer/group_0_11/layer_11/inner_group_0/layer_normalization_23/beta/adam_m with shape [768]
Loading TF weight bert/encoder/transformer/group_0_11/layer_11/inner_group_0/layer_normalization_23/beta/adam_v with shape [768]
Loading TF weight bert/encoder/transformer/group_0_11/layer_11/inner_group_0/layer_normalization_23/gamma with shape [768]
Loading TF weight bert/encoder/transformer/group_0_11/layer_11/inner_group_0/layer_normalization_23/gamma/adam_m with shape [768]
Loading TF weight bert/encoder/transformer/group_0_11/layer_11/inner_group_0/layer_normalization_23/gamma/adam_v with shape [768]
Loading TF weight bert/encoder/transformer/group_0_11/layer_11/inner_group_0/layer_normalization_24/beta with shape [768]
Loading TF weight bert/encoder/transformer/group_0_11/layer_11/inner_group_0/layer_normalization_24/beta/adam_m with shape [768]
Loading TF weight bert/encoder/transformer/group_0_11/layer_11/inner_group_0/layer_normalization_24/beta/adam_v with shape [768]
Loading TF weight bert/encoder/transformer/group_0_11/layer_11/inner_group_0/layer_normalization_24/gamma with shape [768]
Loading TF weight bert/encoder/transformer/group_0_11/layer_11/inner_group_0/layer_normalization_24/gamma/adam_m with shape [768]
Loading TF weight bert/encoder/transformer/group_0_11/layer_11/inner_group_0/layer_normalization_24/gamma/adam_v with shape [768]
Loading TF weight bert/encoder/transformer/group_0_2/layer_2/inner_group_0/layer_normalization_5/beta with shape [768]
Loading TF weight bert/encoder/transformer/group_0_2/layer_2/inner_group_0/layer_normalization_5/beta/adam_m with shape [768]
Loading TF weight bert/encoder/transformer/group_0_2/layer_2/inner_group_0/layer_normalization_5/beta/adam_v with shape [768]
Loading TF weight bert/encoder/transformer/group_0_2/layer_2/inner_group_0/layer_normalization_5/gamma with shape [768]
Loading TF weight bert/encoder/transformer/group_0_2/layer_2/inner_group_0/layer_normalization_5/gamma/adam_m with shape [768]
Loading TF weight bert/encoder/transformer/group_0_2/layer_2/inner_group_0/layer_normalization_5/gamma/adam_v with shape [768]
Loading TF weight bert/encoder/transformer/group_0_2/layer_2/inner_group_0/layer_normalization_6/beta with shape [768]
Loading TF weight bert/encoder/transformer/group_0_2/layer_2/inner_group_0/layer_normalization_6/beta/adam_m with shape [768]
Loading TF weight bert/encoder/transformer/group_0_2/layer_2/inner_group_0/layer_normalization_6/beta/adam_v with shape [768]
Loading TF weight bert/encoder/transformer/group_0_2/layer_2/inner_group_0/layer_normalization_6/gamma with shape [768]
Loading TF weight bert/encoder/transformer/group_0_2/layer_2/inner_group_0/layer_normalization_6/gamma/adam_m with shape [768]
Loading TF weight bert/encoder/transformer/group_0_2/layer_2/inner_group_0/layer_normalization_6/gamma/adam_v with shape [768]
Loading TF weight bert/encoder/transformer/group_0_3/layer_3/inner_group_0/layer_normalization_7/beta with shape [768]
Loading TF weight bert/encoder/transformer/group_0_3/layer_3/inner_group_0/layer_normalization_7/beta/adam_m with shape [768]
Loading TF weight bert/encoder/transformer/group_0_3/layer_3/inner_group_0/layer_normalization_7/beta/adam_v with shape [768]
Loading TF weight bert/encoder/transformer/group_0_3/layer_3/inner_group_0/layer_normalization_7/gamma with shape [768]
Loading TF weight bert/encoder/transformer/group_0_3/layer_3/inner_group_0/layer_normalization_7/gamma/adam_m with shape [768]
Loading TF weight bert/encoder/transformer/group_0_3/layer_3/inner_group_0/layer_normalization_7/gamma/adam_v with shape [768]
Loading TF weight bert/encoder/transformer/group_0_3/layer_3/inner_group_0/layer_normalization_8/beta with shape [768]
Loading TF weight bert/encoder/transformer/group_0_3/layer_3/inner_group_0/layer_normalization_8/beta/adam_m with shape [768]
Loading TF weight bert/encoder/transformer/group_0_3/layer_3/inner_group_0/layer_normalization_8/beta/adam_v with shape [768]
Loading TF weight bert/encoder/transformer/group_0_3/layer_3/inner_group_0/layer_normalization_8/gamma with shape [768]
Loading TF weight bert/encoder/transformer/group_0_3/layer_3/inner_group_0/layer_normalization_8/gamma/adam_m with shape [768]
Loading TF weight bert/encoder/transformer/group_0_3/layer_3/inner_group_0/layer_normalization_8/gamma/adam_v with shape [768]
Loading TF weight bert/encoder/transformer/group_0_4/layer_4/inner_group_0/layer_normalization_10/beta with shape [768]
Loading TF weight bert/encoder/transformer/group_0_4/layer_4/inner_group_0/layer_normalization_10/beta/adam_m with shape [768]
Loading TF weight bert/encoder/transformer/group_0_4/layer_4/inner_group_0/layer_normalization_10/beta/adam_v with shape [768]
Loading TF weight bert/encoder/transformer/group_0_4/layer_4/inner_group_0/layer_normalization_10/gamma with shape [768]
Loading TF weight bert/encoder/transformer/group_0_4/layer_4/inner_group_0/layer_normalization_10/gamma/adam_m with shape [768]
Loading TF weight bert/encoder/transformer/group_0_4/layer_4/inner_group_0/layer_normalization_10/gamma/adam_v with shape [768]
Loading TF weight bert/encoder/transformer/group_0_4/layer_4/inner_group_0/layer_normalization_9/beta with shape [768]
Loading TF weight bert/encoder/transformer/group_0_4/layer_4/inner_group_0/layer_normalization_9/beta/adam_m with shape [768]
Loading TF weight bert/encoder/transformer/group_0_4/layer_4/inner_group_0/layer_normalization_9/beta/adam_v with shape [768]
Loading TF weight bert/encoder/transformer/group_0_4/layer_4/inner_group_0/layer_normalization_9/gamma with shape [768]
Loading TF weight bert/encoder/transformer/group_0_4/layer_4/inner_group_0/layer_normalization_9/gamma/adam_m with shape [768]
Loading TF weight bert/encoder/transformer/group_0_4/layer_4/inner_group_0/layer_normalization_9/gamma/adam_v with shape [768]
Loading TF weight bert/encoder/transformer/group_0_5/layer_5/inner_group_0/layer_normalization_11/beta with shape [768]
Loading TF weight bert/encoder/transformer/group_0_5/layer_5/inner_group_0/layer_normalization_11/beta/adam_m with shape [768]
Loading TF weight bert/encoder/transformer/group_0_5/layer_5/inner_group_0/layer_normalization_11/beta/adam_v with shape [768]
Loading TF weight bert/encoder/transformer/group_0_5/layer_5/inner_group_0/layer_normalization_11/gamma with shape [768]
Loading TF weight bert/encoder/transformer/group_0_5/layer_5/inner_group_0/layer_normalization_11/gamma/adam_m with shape [768]
Loading TF weight bert/encoder/transformer/group_0_5/layer_5/inner_group_0/layer_normalization_11/gamma/adam_v with shape [768]
Loading TF weight bert/encoder/transformer/group_0_5/layer_5/inner_group_0/layer_normalization_12/beta with shape [768]
Loading TF weight bert/encoder/transformer/group_0_5/layer_5/inner_group_0/layer_normalization_12/beta/adam_m with shape [768]
Loading TF weight bert/encoder/transformer/group_0_5/layer_5/inner_group_0/layer_normalization_12/beta/adam_v with shape [768]
Loading TF weight bert/encoder/transformer/group_0_5/layer_5/inner_group_0/layer_normalization_12/gamma with shape [768]
Loading TF weight bert/encoder/transformer/group_0_5/layer_5/inner_group_0/layer_normalization_12/gamma/adam_m with shape [768]
Loading TF weight bert/encoder/transformer/group_0_5/layer_5/inner_group_0/layer_normalization_12/gamma/adam_v with shape [768]
Loading TF weight bert/encoder/transformer/group_0_6/layer_6/inner_group_0/layer_normalization_13/beta with shape [768]
Loading TF weight bert/encoder/transformer/group_0_6/layer_6/inner_group_0/layer_normalization_13/beta/adam_m with shape [768]
Loading TF weight bert/encoder/transformer/group_0_6/layer_6/inner_group_0/layer_normalization_13/beta/adam_v with shape [768]
Loading TF weight bert/encoder/transformer/group_0_6/layer_6/inner_group_0/layer_normalization_13/gamma with shape [768]
Loading TF weight bert/encoder/transformer/group_0_6/layer_6/inner_group_0/layer_normalization_13/gamma/adam_m with shape [768]
Loading TF weight bert/encoder/transformer/group_0_6/layer_6/inner_group_0/layer_normalization_13/gamma/adam_v with shape [768]
Loading TF weight bert/encoder/transformer/group_0_6/layer_6/inner_group_0/layer_normalization_14/beta with shape [768]
Loading TF weight bert/encoder/transformer/group_0_6/layer_6/inner_group_0/layer_normalization_14/beta/adam_m with shape [768]
Loading TF weight bert/encoder/transformer/group_0_6/layer_6/inner_group_0/layer_normalization_14/beta/adam_v with shape [768]
Loading TF weight bert/encoder/transformer/group_0_6/layer_6/inner_group_0/layer_normalization_14/gamma with shape [768]
Loading TF weight bert/encoder/transformer/group_0_6/layer_6/inner_group_0/layer_normalization_14/gamma/adam_m with shape [768]
Loading TF weight bert/encoder/transformer/group_0_6/layer_6/inner_group_0/layer_normalization_14/gamma/adam_v with shape [768]
Loading TF weight bert/encoder/transformer/group_0_7/layer_7/inner_group_0/layer_normalization_15/beta with shape [768]
Loading TF weight bert/encoder/transformer/group_0_7/layer_7/inner_group_0/layer_normalization_15/beta/adam_m with shape [768]
Loading TF weight bert/encoder/transformer/group_0_7/layer_7/inner_group_0/layer_normalization_15/beta/adam_v with shape [768]
Loading TF weight bert/encoder/transformer/group_0_7/layer_7/inner_group_0/layer_normalization_15/gamma with shape [768]
Loading TF weight bert/encoder/transformer/group_0_7/layer_7/inner_group_0/layer_normalization_15/gamma/adam_m with shape [768]
Loading TF weight bert/encoder/transformer/group_0_7/layer_7/inner_group_0/layer_normalization_15/gamma/adam_v with shape [768]
Loading TF weight bert/encoder/transformer/group_0_7/layer_7/inner_group_0/layer_normalization_16/beta with shape [768]
Loading TF weight bert/encoder/transformer/group_0_7/layer_7/inner_group_0/layer_normalization_16/beta/adam_m with shape [768]
Loading TF weight bert/encoder/transformer/group_0_7/layer_7/inner_group_0/layer_normalization_16/beta/adam_v with shape [768]
Loading TF weight bert/encoder/transformer/group_0_7/layer_7/inner_group_0/layer_normalization_16/gamma with shape [768]
Loading TF weight bert/encoder/transformer/group_0_7/layer_7/inner_group_0/layer_normalization_16/gamma/adam_m with shape [768]
Loading TF weight bert/encoder/transformer/group_0_7/layer_7/inner_group_0/layer_normalization_16/gamma/adam_v with shape [768]
Loading TF weight bert/encoder/transformer/group_0_8/layer_8/inner_group_0/layer_normalization_17/beta with shape [768]
Loading TF weight bert/encoder/transformer/group_0_8/layer_8/inner_group_0/layer_normalization_17/beta/adam_m with shape [768]
Loading TF weight bert/encoder/transformer/group_0_8/layer_8/inner_group_0/layer_normalization_17/beta/adam_v with shape [768]
Loading TF weight bert/encoder/transformer/group_0_8/layer_8/inner_group_0/layer_normalization_17/gamma with shape [768]
Loading TF weight bert/encoder/transformer/group_0_8/layer_8/inner_group_0/layer_normalization_17/gamma/adam_m with shape [768]
Loading TF weight bert/encoder/transformer/group_0_8/layer_8/inner_group_0/layer_normalization_17/gamma/adam_v with shape [768]
Loading TF weight bert/encoder/transformer/group_0_8/layer_8/inner_group_0/layer_normalization_18/beta with shape [768]
Loading TF weight bert/encoder/transformer/group_0_8/layer_8/inner_group_0/layer_normalization_18/beta/adam_m with shape [768]
Loading TF weight bert/encoder/transformer/group_0_8/layer_8/inner_group_0/layer_normalization_18/beta/adam_v with shape [768]
Loading TF weight bert/encoder/transformer/group_0_8/layer_8/inner_group_0/layer_normalization_18/gamma with shape [768]
Loading TF weight bert/encoder/transformer/group_0_8/layer_8/inner_group_0/layer_normalization_18/gamma/adam_m with shape [768]
Loading TF weight bert/encoder/transformer/group_0_8/layer_8/inner_group_0/layer_normalization_18/gamma/adam_v with shape [768]
Loading TF weight bert/encoder/transformer/group_0_9/layer_9/inner_group_0/layer_normalization_19/beta with shape [768]
Loading TF weight bert/encoder/transformer/group_0_9/layer_9/inner_group_0/layer_normalization_19/beta/adam_m with shape [768]
Loading TF weight bert/encoder/transformer/group_0_9/layer_9/inner_group_0/layer_normalization_19/beta/adam_v with shape [768]
Loading TF weight bert/encoder/transformer/group_0_9/layer_9/inner_group_0/layer_normalization_19/gamma with shape [768]
Loading TF weight bert/encoder/transformer/group_0_9/layer_9/inner_group_0/layer_normalization_19/gamma/adam_m with shape [768]
Loading TF weight bert/encoder/transformer/group_0_9/layer_9/inner_group_0/layer_normalization_19/gamma/adam_v with shape [768]
Loading TF weight bert/encoder/transformer/group_0_9/layer_9/inner_group_0/layer_normalization_20/beta with shape [768]
Loading TF weight bert/encoder/transformer/group_0_9/layer_9/inner_group_0/layer_normalization_20/beta/adam_m with shape [768]
Loading TF weight bert/encoder/transformer/group_0_9/layer_9/inner_group_0/layer_normalization_20/beta/adam_v with shape [768]
Loading TF weight bert/encoder/transformer/group_0_9/layer_9/inner_group_0/layer_normalization_20/gamma with shape [768]
Loading TF weight bert/encoder/transformer/group_0_9/layer_9/inner_group_0/layer_normalization_20/gamma/adam_m with shape [768]
Loading TF weight bert/encoder/transformer/group_0_9/layer_9/inner_group_0/layer_normalization_20/gamma/adam_v with shape [768]
Loading TF weight bert/pooler/dense/bias with shape [768]
Loading TF weight bert/pooler/dense/bias/adam_m with shape [768]
Loading TF weight bert/pooler/dense/bias/adam_v with shape [768]
Loading TF weight bert/pooler/dense/kernel with shape [768, 768]
Loading TF weight bert/pooler/dense/kernel/adam_m with shape [768, 768]
Loading TF weight bert/pooler/dense/kernel/adam_v with shape [768, 768]
Loading TF weight cls/predictions/output_bias with shape [31990]
Loading TF weight cls/predictions/output_bias/adam_m with shape [31990]
Loading TF weight cls/predictions/output_bias/adam_v with shape [31990]
Loading TF weight cls/predictions/transform/dense/bias with shape [128]
Loading TF weight cls/predictions/transform/dense/bias/adam_m with shape [128]
Loading TF weight cls/predictions/transform/dense/bias/adam_v with shape [128]
Loading TF weight cls/predictions/transform/dense/kernel with shape [768, 128]
Loading TF weight cls/predictions/transform/dense/kernel/adam_m with shape [768, 128]
Loading TF weight cls/predictions/transform/dense/kernel/adam_v with shape [768, 128]
Loading TF weight cls/predictions/transform/layer_normalization_25/beta with shape [128]
Loading TF weight cls/predictions/transform/layer_normalization_25/beta/adam_m with shape [128]
Loading TF weight cls/predictions/transform/layer_normalization_25/beta/adam_v with shape [128]
Loading TF weight cls/predictions/transform/layer_normalization_25/gamma with shape [128]
Loading TF weight cls/predictions/transform/layer_normalization_25/gamma/adam_m with shape [128]
Loading TF weight cls/predictions/transform/layer_normalization_25/gamma/adam_v with shape [128]
Loading TF weight cls/seq_relationship/output_bias with shape [2]
Loading TF weight cls/seq_relationship/output_bias/adam_m with shape [2]
Loading TF weight cls/seq_relationship/output_bias/adam_v with shape [2]
Loading TF weight cls/seq_relationship/output_weights with shape [2, 768]
Loading TF weight cls/seq_relationship/output_weights/adam_m with shape [2, 768]
Loading TF weight cls/seq_relationship/output_weights/adam_v with shape [2, 768]
Loading TF weight global_step with shape []
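
Two thirds of the variables listed above are Adam optimizer slots (`adam_m`, `adam_v`), which makes the log hard to read. A minimal sketch for pulling out only the model weights and their shapes from such a log is shown here. It assumes the log lines have exactly the `Loading TF weight <name> with shape [<dims>]` form seen above, and that any name containing `adam_m`, `adam_v`, or `global_step` is training state rather than a model weight. The function name `model_weights` is hypothetical.

```python
import re

# A few lines copied from the log above; in practice, read the full log from a file.
log = """\
Loading TF weight bert/embeddings/word_embeddings with shape [31990, 128]
Loading TF weight bert/embeddings/word_embeddings/adam_m with shape [31990, 128]
Loading TF weight bert/embeddings/word_embeddings/adam_v with shape [31990, 128]
Loading TF weight bert/pooler/dense/bias with shape [768]
Loading TF weight global_step with shape []
"""

LINE = re.compile(r"^Loading TF weight (\S+) with shape \[(.*)\]$")
# Assumption: these name parts mark optimizer or bookkeeping state, not model weights.
NON_MODEL_PARTS = {"adam_m", "adam_v", "global_step"}


def model_weights(log_text):
    """Map each model variable name to its shape, skipping optimizer state."""
    weights = {}
    for line in log_text.splitlines():
        match = LINE.match(line.strip())
        if not match:
            continue  # not a "Loading TF weight" line
        name, dims = match.groups()
        if NON_MODEL_PARTS & set(name.split("/")):
            continue  # optimizer slot or step counter
        weights[name] = tuple(int(d) for d in dims.split(",") if d.strip())
    return weights


weights = model_weights(log)
for name, shape in weights.items():
    print(name, shape)
```

Running this over the full log leaves only the variable names that a PyTorch ALBERT model would need to match, which makes unusual scopes such as `layer_normalization_1` or `group_0_1/layer_1` easier to spot.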
bert/embeddings/layer_normalization/beta
bert/embeddings/layer_normalization/beta/adam_m
bert/embeddings/layer_normalization/beta/adam_v
bert/embeddings/layer_normalization/gamma
bert/embeddings/layer_normalization/gamma/adam_m
bert/embeddings/layer_normalization/gamma/adam_v
bert/embeddings/position_embeddings
bert/embeddings/position_embeddings/adam_m
bert/embeddings/position_embeddings/adam_v
bert/embeddings/token_type_embeddings
bert/embeddings/token_type_embeddings/adam_m
bert/embeddings/token_type_embeddings/adam_v
bert/embeddings/word_embeddings
bert/embeddings/word_embeddings/adam_m
bert/embeddings/word_embeddings/adam_v
bert/encoder/embedding_hidden_mapping_in/bias
bert/encoder/embedding_hidden_mapping_in/bias/adam_m
bert/encoder/embedding_hidden_mapping_in/bias/adam_v
bert/encoder/embedding_hidden_mapping_in/kernel
bert/encoder/embedding_hidden_mapping_in/kernel/adam_m
bert/encoder/embedding_hidden_mapping_in/kernel/adam_v
bert/encoder/transformer/group_0/inner_group_0/attention_1/output/dense/bias
bert/encoder/transformer/group_0/inner_group_0/attention_1/output/dense/bias/adam_m
bert/encoder/transformer/group_0/inner_group_0/attention_1/output/dense/bias/adam_v
bert/encoder/transformer/group_0/inner_group_0/attention_1/output/dense/kernel
bert/encoder/transformer/group_0/inner_group_0/attention_1/output/dense/kernel/adam_m
bert/encoder/transformer/group_0/inner_group_0/attention_1/output/dense/kernel/adam_v
bert/encoder/transformer/group_0/inner_group_0/attention_1/self/key/bias
bert/encoder/transformer/group_0/inner_group_0/attention_1/self/key/bias/adam_m
bert/encoder/transformer/group_0/inner_group_0/attention_1/self/key/bias/adam_v
bert/encoder/transformer/group_0/inner_group_0/attention_1/self/key/kernel
bert/encoder/transformer/group_0/inner_group_0/attention_1/self/key/kernel/adam_m
bert/encoder/transformer/group_0/inner_group_0/attention_1/self/key/kernel/adam_v
bert/encoder/transformer/group_0/inner_group_0/attention_1/self/query/bias
bert/encoder/transformer/group_0/inner_group_0/attention_1/self/query/bias/adam_m
bert/encoder/transformer/group_0/inner_group_0/attention_1/self/query/bias/adam_v
bert/encoder/transformer/group_0/inner_group_0/attention_1/self/query/kernel
bert/encoder/transformer/group_0/inner_group_0/attention_1/self/query/kernel/adam_m
bert/encoder/transformer/group_0/inner_group_0/attention_1/self/query/kernel/adam_v
bert/encoder/transformer/group_0/inner_group_0/attention_1/self/value/bias
bert/encoder/transformer/group_0/inner_group_0/attention_1/self/value/bias/adam_m
bert/encoder/transformer/group_0/inner_group_0/attention_1/self/value/bias/adam_v
bert/encoder/transformer/group_0/inner_group_0/attention_1/self/value/kernel
bert/encoder/transformer/group_0/inner_group_0/attention_1/self/value/kernel/adam_m
bert/encoder/transformer/group_0/inner_group_0/attention_1/self/value/kernel/adam_v
bert/encoder/transformer/group_0/inner_group_0/ffn_1/intermediate/dense/bias
bert/encoder/transformer/group_0/inner_group_0/ffn_1/intermediate/dense/bias/adam_m
bert/encoder/transformer/group_0/inner_group_0/ffn_1/intermediate/dense/bias/adam_v
bert/encoder/transformer/group_0/inner_group_0/ffn_1/intermediate/dense/kernel
bert/encoder/transformer/group_0/inner_group_0/ffn_1/intermediate/dense/kernel/adam_m
bert/encoder/transformer/group_0/inner_group_0/ffn_1/intermediate/dense/kernel/adam_v
bert/encoder/transformer/group_0/inner_group_0/ffn_1/intermediate/output/dense/bias
bert/encoder/transformer/group_0/inner_group_0/ffn_1/intermediate/output/dense/bias/adam_m
bert/encoder/transformer/group_0/inner_group_0/ffn_1/intermediate/output/dense/bias/adam_v
bert/encoder/transformer/group_0/inner_group_0/ffn_1/intermediate/output/dense/kernel
bert/encoder/transformer/group_0/inner_group_0/ffn_1/intermediate/output/dense/kernel/adam_m
bert/encoder/transformer/group_0/inner_group_0/ffn_1/intermediate/output/dense/kernel/adam_v
bert/encoder/transformer/group_0/layer_0/inner_group_0/layer_normalization_1/beta
bert/encoder/transformer/group_0/layer_0/inner_group_0/layer_normalization_1/beta/adam_m
bert/encoder/transformer/group_0/layer_0/inner_group_0/layer_normalization_1/beta/adam_v
bert/encoder/transformer/group_0/layer_0/inner_group_0/layer_normalization_1/gamma
bert/encoder/transformer/group_0/layer_0/inner_group_0/layer_normalization_1/gamma/adam_m
bert/encoder/transformer/group_0/layer_0/inner_group_0/layer_normalization_1/gamma/adam_v
bert/encoder/transformer/group_0/layer_0/inner_group_0/layer_normalization_2/beta
bert/encoder/transformer/group_0/layer_0/inner_group_0/layer_normalization_2/beta/adam_m
bert/encoder/transformer/group_0/layer_0/inner_group_0/layer_normalization_2/beta/adam_v
bert/encoder/transformer/group_0/layer_0/inner_group_0/layer_normalization_2/gamma
bert/encoder/transformer/group_0/layer_0/inner_group_0/layer_normalization_2/gamma/adam_m
bert/encoder/transformer/group_0/layer_0/inner_group_0/layer_normalization_2/gamma/adam_v
bert/encoder/transformer/group_0_1/layer_1/inner_group_0/layer_normalization_3/beta
bert/encoder/transformer/group_0_1/layer_1/inner_group_0/layer_normalization_3/beta/adam_m
bert/encoder/transformer/group_0_1/layer_1/inner_group_0/layer_normalization_3/beta/adam_v
bert/encoder/transformer/group_0_1/layer_1/inner_group_0/layer_normalization_3/gamma
bert/encoder/transformer/group_0_1/layer_1/inner_group_0/layer_normalization_3/gamma/adam_m
bert/encoder/transformer/group_0_1/layer_1/inner_group_0/layer_normalization_3/gamma/adam_v
bert/encoder/transformer/group_0_1/layer_1/inner_group_0/layer_normalization_4/beta
bert/encoder/transformer/group_0_1/layer_1/inner_group_0/layer_normalization_4/beta/adam_m
bert/encoder/transformer/group_0_1/layer_1/inner_group_0/layer_normalization_4/beta/adam_v
bert/encoder/transformer/group_0_1/layer_1/inner_group_0/layer_normalization_4/gamma
bert/encoder/transformer/group_0_1/layer_1/inner_group_0/layer_normalization_4/gamma/adam_m
bert/encoder/transformer/group_0_1/layer_1/inner_group_0/layer_normalization_4/gamma/adam_v
bert/encoder/transformer/group_0_10/layer_10/inner_group_0/layer_normalization_21/beta
bert/encoder/transformer/group_0_10/layer_10/inner_group_0/layer_normalization_21/beta/adam_m
bert/encoder/transformer/group_0_10/layer_10/inner_group_0/layer_normalization_21/beta/adam_v
bert/encoder/transformer/group_0_10/layer_10/inner_group_0/layer_normalization_21/gamma
bert/encoder/transformer/group_0_10/layer_10/inner_group_0/layer_normalization_21/gamma/adam_m
bert/encoder/transformer/group_0_10/layer_10/inner_group_0/layer_normalization_21/gamma/adam_v
bert/encoder/transformer/group_0_10/layer_10/inner_group_0/layer_normalization_22/beta
bert/encoder/transformer/group_0_10/layer_10/inner_group_0/layer_normalization_22/beta/adam_m
bert/encoder/transformer/group_0_10/layer_10/inner_group_0/layer_normalization_22/beta/adam_v
bert/encoder/transformer/group_0_10/layer_10/inner_group_0/layer_normalization_22/gamma
bert/encoder/transformer/group_0_10/layer_10/inner_group_0/layer_normalization_22/gamma/adam_m
bert/encoder/transformer/group_0_10/layer_10/inner_group_0/layer_normalization_22/gamma/adam_v
bert/encoder/transformer/group_0_11/layer_11/inner_group_0/layer_normalization_23/beta
bert/encoder/transformer/group_0_11/layer_11/inner_group_0/layer_normalization_23/beta/adam_m
bert/encoder/transformer/group_0_11/layer_11/inner_group_0/layer_normalization_23/beta/adam_v
bert/encoder/transformer/group_0_11/layer_11/inner_group_0/layer_normalization_23/gamma
bert/encoder/transformer/group_0_11/layer_11/inner_group_0/layer_normalization_23/gamma/adam_m
bert/encoder/transformer/group_0_11/layer_11/inner_group_0/layer_normalization_23/gamma/adam_v
bert/encoder/transformer/group_0_11/layer_11/inner_group_0/layer_normalization_24/beta
bert/encoder/transformer/group_0_11/layer_11/inner_group_0/layer_normalization_24/beta/adam_m
bert/encoder/transformer/group_0_11/layer_11/inner_group_0/layer_normalization_24/beta/adam_v
bert/encoder/transformer/group_0_11/layer_11/inner_group_0/layer_normalization_24/gamma
bert/encoder/transformer/group_0_11/layer_11/inner_group_0/layer_normalization_24/gamma/adam_m
bert/encoder/transformer/group_0_11/layer_11/inner_group_0/layer_normalization_24/gamma/adam_v
bert/encoder/transformer/group_0_2/layer_2/inner_group_0/layer_normalization_5/beta
bert/encoder/transformer/group_0_2/layer_2/inner_group_0/layer_normalization_5/beta/adam_m
bert/encoder/transformer/group_0_2/layer_2/inner_group_0/layer_normalization_5/beta/adam_v
bert/encoder/transformer/group_0_2/layer_2/inner_group_0/layer_normalization_5/gamma
bert/encoder/transformer/group_0_2/layer_2/inner_group_0/layer_normalization_5/gamma/adam_m
bert/encoder/transformer/group_0_2/layer_2/inner_group_0/layer_normalization_5/gamma/adam_v
bert/encoder/transformer/group_0_2/layer_2/inner_group_0/layer_normalization_6/beta
bert/encoder/transformer/group_0_2/layer_2/inner_group_0/layer_normalization_6/beta/adam_m
bert/encoder/transformer/group_0_2/layer_2/inner_group_0/layer_normalization_6/beta/adam_v
bert/encoder/transformer/group_0_2/layer_2/inner_group_0/layer_normalization_6/gamma
bert/encoder/transformer/group_0_2/layer_2/inner_group_0/layer_normalization_6/gamma/adam_m
bert/encoder/transformer/group_0_2/layer_2/inner_group_0/layer_normalization_6/gamma/adam_v
bert/encoder/transformer/group_0_3/layer_3/inner_group_0/layer_normalization_7/beta
bert/encoder/transformer/group_0_3/layer_3/inner_group_0/layer_normalization_7/beta/adam_m
bert/encoder/transformer/group_0_3/layer_3/inner_group_0/layer_normalization_7/beta/adam_v
bert/encoder/transformer/group_0_3/layer_3/inner_group_0/layer_normalization_7/gamma
bert/encoder/transformer/group_0_3/layer_3/inner_group_0/layer_normalization_7/gamma/adam_m
bert/encoder/transformer/group_0_3/layer_3/inner_group_0/layer_normalization_7/gamma/adam_v
bert/encoder/transformer/group_0_3/layer_3/inner_group_0/layer_normalization_8/beta
bert/encoder/transformer/group_0_3/layer_3/inner_group_0/layer_normalization_8/beta/adam_m
bert/encoder/transformer/group_0_3/layer_3/inner_group_0/layer_normalization_8/beta/adam_v
bert/encoder/transformer/group_0_3/layer_3/inner_group_0/layer_normalization_8/gamma
bert/encoder/transformer/group_0_3/layer_3/inner_group_0/layer_normalization_8/gamma/adam_m
bert/encoder/transformer/group_0_3/layer_3/inner_group_0/layer_normalization_8/gamma/adam_v
bert/encoder/transformer/group_0_4/layer_4/inner_group_0/layer_normalization_10/beta
bert/encoder/transformer/group_0_4/layer_4/inner_group_0/layer_normalization_10/beta/adam_m
bert/encoder/transformer/group_0_4/layer_4/inner_group_0/layer_normalization_10/beta/adam_v
bert/encoder/transformer/group_0_4/layer_4/inner_group_0/layer_normalization_10/gamma
bert/encoder/transformer/group_0_4/layer_4/inner_group_0/layer_normalization_10/gamma/adam_m
bert/encoder/transformer/group_0_4/layer_4/inner_group_0/layer_normalization_10/gamma/adam_v
bert/encoder/transformer/group_0_4/layer_4/inner_group_0/layer_normalization_9/beta
bert/encoder/transformer/group_0_4/layer_4/inner_group_0/layer_normalization_9/beta/adam_m
bert/encoder/transformer/group_0_4/layer_4/inner_group_0/layer_normalization_9/beta/adam_v
bert/encoder/transformer/group_0_4/layer_4/inner_group_0/layer_normalization_9/gamma
bert/encoder/transformer/group_0_4/layer_4/inner_group_0/layer_normalization_9/gamma/adam_m
bert/encoder/transformer/group_0_4/layer_4/inner_group_0/layer_normalization_9/gamma/adam_v
bert/encoder/transformer/group_0_5/layer_5/inner_group_0/layer_normalization_11/beta
bert/encoder/transformer/group_0_5/layer_5/inner_group_0/layer_normalization_11/beta/adam_m
bert/encoder/transformer/group_0_5/layer_5/inner_group_0/layer_normalization_11/beta/adam_v
bert/encoder/transformer/group_0_5/layer_5/inner_group_0/layer_normalization_11/gamma
bert/encoder/transformer/group_0_5/layer_5/inner_group_0/layer_normalization_11/gamma/adam_m
bert/encoder/transformer/group_0_5/layer_5/inner_group_0/layer_normalization_11/gamma/adam_v
bert/encoder/transformer/group_0_5/layer_5/inner_group_0/layer_normalization_12/beta
bert/encoder/transformer/group_0_5/layer_5/inner_group_0/layer_normalization_12/beta/adam_m
bert/encoder/transformer/group_0_5/layer_5/inner_group_0/layer_normalization_12/beta/adam_v
bert/encoder/transformer/group_0_5/layer_5/inner_group_0/layer_normalization_12/gamma
bert/encoder/transformer/group_0_5/layer_5/inner_group_0/layer_normalization_12/gamma/adam_m
bert/encoder/transformer/group_0_5/layer_5/inner_group_0/layer_normalization_12/gamma/adam_v
bert/encoder/transformer/group_0_6/layer_6/inner_group_0/layer_normalization_13/beta
bert/encoder/transformer/group_0_6/layer_6/inner_group_0/layer_normalization_13/beta/adam_m
bert/encoder/transformer/group_0_6/layer_6/inner_group_0/layer_normalization_13/beta/adam_v
bert/encoder/transformer/group_0_6/layer_6/inner_group_0/layer_normalization_13/gamma
bert/encoder/transformer/group_0_6/layer_6/inner_group_0/layer_normalization_13/gamma/adam_m
bert/encoder/transformer/group_0_6/layer_6/inner_group_0/layer_normalization_13/gamma/adam_v
bert/encoder/transformer/group_0_6/layer_6/inner_group_0/layer_normalization_14/beta
bert/encoder/transformer/group_0_6/layer_6/inner_group_0/layer_normalization_14/beta/adam_m
bert/encoder/transformer/group_0_6/layer_6/inner_group_0/layer_normalization_14/beta/adam_v
bert/encoder/transformer/group_0_6/layer_6/inner_group_0/layer_normalization_14/gamma
bert/encoder/transformer/group_0_6/layer_6/inner_group_0/layer_normalization_14/gamma/adam_m
bert/encoder/transformer/group_0_6/layer_6/inner_group_0/layer_normalization_14/gamma/adam_v
bert/encoder/transformer/group_0_7/layer_7/inner_group_0/layer_normalization_15/beta
bert/encoder/transformer/group_0_7/layer_7/inner_group_0/layer_normalization_15/beta/adam_m
bert/encoder/transformer/group_0_7/layer_7/inner_group_0/layer_normalization_15/beta/adam_v
bert/encoder/transformer/group_0_7/layer_7/inner_group_0/layer_normalization_15/gamma
bert/encoder/transformer/group_0_7/layer_7/inner_group_0/layer_normalization_15/gamma/adam_m
bert/encoder/transformer/group_0_7/layer_7/inner_group_0/layer_normalization_15/gamma/adam_v
bert/encoder/transformer/group_0_7/layer_7/inner_group_0/layer_normalization_16/beta
bert/encoder/transformer/group_0_7/layer_7/inner_group_0/layer_normalization_16/beta/adam_m
bert/encoder/transformer/group_0_7/layer_7/inner_group_0/layer_normalization_16/beta/adam_v
bert/encoder/transformer/group_0_7/layer_7/inner_group_0/layer_normalization_16/gamma
bert/encoder/transformer/group_0_7/layer_7/inner_group_0/layer_normalization_16/gamma/adam_m
bert/encoder/transformer/group_0_7/layer_7/inner_group_0/layer_normalization_16/gamma/adam_v
bert/encoder/transformer/group_0_8/layer_8/inner_group_0/layer_normalization_17/beta
bert/encoder/transformer/group_0_8/layer_8/inner_group_0/layer_normalization_17/beta/adam_m
bert/encoder/transformer/group_0_8/layer_8/inner_group_0/layer_normalization_17/beta/adam_v
bert/encoder/transformer/group_0_8/layer_8/inner_group_0/layer_normalization_17/gamma
bert/encoder/transformer/group_0_8/layer_8/inner_group_0/layer_normalization_17/gamma/adam_m
bert/encoder/transformer/group_0_8/layer_8/inner_group_0/layer_normalization_17/gamma/adam_v
bert/encoder/transformer/group_0_8/layer_8/inner_group_0/layer_normalization_18/beta
bert/encoder/transformer/group_0_8/layer_8/inner_group_0/layer_normalization_18/beta/adam_m
bert/encoder/transformer/group_0_8/layer_8/inner_group_0/layer_normalization_18/beta/adam_v
bert/encoder/transformer/group_0_8/layer_8/inner_group_0/layer_normalization_18/gamma
bert/encoder/transformer/group_0_8/layer_8/inner_group_0/layer_normalization_18/gamma/adam_m
bert/encoder/transformer/group_0_8/layer_8/inner_group_0/layer_normalization_18/gamma/adam_v
bert/encoder/transformer/group_0_9/layer_9/inner_group_0/layer_normalization_19/beta
bert/encoder/transformer/group_0_9/layer_9/inner_group_0/layer_normalization_19/beta/adam_m
bert/encoder/transformer/group_0_9/layer_9/inner_group_0/layer_normalization_19/beta/adam_v
bert/encoder/transformer/group_0_9/layer_9/inner_group_0/layer_normalization_19/gamma
bert/encoder/transformer/group_0_9/layer_9/inner_group_0/layer_normalization_19/gamma/adam_m
bert/encoder/transformer/group_0_9/layer_9/inner_group_0/layer_normalization_19/gamma/adam_v
bert/encoder/transformer/group_0_9/layer_9/inner_group_0/layer_normalization_20/beta
bert/encoder/transformer/group_0_9/layer_9/inner_group_0/layer_normalization_20/beta/adam_m
bert/encoder/transformer/group_0_9/layer_9/inner_group_0/layer_normalization_20/beta/adam_v
bert/encoder/transformer/group_0_9/layer_9/inner_group_0/layer_normalization_20/gamma
bert/encoder/transformer/group_0_9/layer_9/inner_group_0/layer_normalization_20/gamma/adam_m
bert/encoder/transformer/group_0_9/layer_9/inner_group_0/layer_normalization_20/gamma/adam_v
bert/pooler/dense/bias
bert/pooler/dense/bias/adam_m
bert/pooler/dense/bias/adam_v
bert/pooler/dense/kernel
bert/pooler/dense/kernel/adam_m
bert/pooler/dense/kernel/adam_v
cls/predictions/output_bias
cls/predictions/output_bias/adam_m
cls/predictions/output_bias/adam_v
cls/predictions/transform/dense/bias
cls/predictions/transform/dense/bias/adam_m
cls/predictions/transform/dense/bias/adam_v
cls/predictions/transform/dense/kernel
cls/predictions/transform/dense/kernel/adam_m
cls/predictions/transform/dense/kernel/adam_v
cls/predictions/transform/layer_normalization_25/beta
cls/predictions/transform/layer_normalization_25/beta/adam_m
cls/predictions/transform/layer_normalization_25/beta/adam_v
cls/predictions/transform/layer_normalization_25/gamma
cls/predictions/transform/layer_normalization_25/gamma/adam_m
cls/predictions/transform/layer_normalization_25/gamma/adam_v
cls/seq_relationship/output_bias
cls/seq_relationship/output_bias/adam_m
cls/seq_relationship/output_bias/adam_v
cls/seq_relationship/output_weights
cls/seq_relationship/output_weights/adam_m
cls/seq_relationship/output_weights/adam_v
global_step
Skipping albert/embeddings/layer_normalization/beta
Traceback (most recent call last):
  File "/home/dshah/venv/bin/transformers-cli", line 8, in <module>
    sys.exit(main())
  File "/home/dshah/venv/lib64/python3.8/site-packages/transformers/commands/transformers_cli.py", line 33, in main
    service.run()
  File "/home/dshah/venv/lib64/python3.8/site-packages/transformers/commands/convert.py", line 80, in run
    convert_tf_checkpoint_to_pytorch(self._tf_checkpoint, self._config, self._pytorch_dump_output)
  File "/home/dshah/venv/lib64/python3.8/site-packages/transformers/convert_albert_original_tf_checkpoint_to_pytorch.py", line 36, in convert_tf_checkpoint_to_pytorch
    load_tf_weights_in_albert(model, config, tf_checkpoint_path)
  File "/home/dshah/venv/lib64/python3.8/site-packages/transformers/modeling_albert.py", line 163, in load_tf_weights_in_albert
    pointer = getattr(pointer, "bias")
  File "/home/dshah/venv/lib64/python3.8/site-packages/torch/nn/modules/module.py", line 771, in __getattr__
    raise ModuleAttributeError("'{}' object has no attribute '{}'".format(
torch.nn.modules.module.ModuleAttributeError: 'AlbertEmbeddings' object has no attribute 'bias'
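
A possible reading of the log above, offered as a guess rather than a confirmed diagnosis: the checkpoint names its normalization scope `layer_normalization`, while the PyTorch `AlbertEmbeddings` module exposes that layer as `LayerNorm`. If the loader walks the variable name one scope at a time, skips a scope it cannot find, and then still looks up `bias` for `beta`, it would print the "Skipping" line and then fail exactly as shown. The sketch below reproduces that pattern with plain Python stand-ins. `FakeModel`, `FakeEmbeddings`, `FakeLayerNorm`, `normalize_scope` and `resolve` are hypothetical names for illustration only, not part of transformers, and the rename rule is an assumption.

```python
import re


class FakeLayerNorm:
    """Stand-in for a normalization module that has weight and bias."""

    def __init__(self):
        self.weight = "ln.weight"
        self.bias = "ln.bias"


class FakeEmbeddings:
    """Stand-in for an embeddings module that only exposes `LayerNorm`."""

    def __init__(self):
        self.LayerNorm = FakeLayerNorm()


class FakeModel:
    def __init__(self):
        self.embeddings = FakeEmbeddings()


def normalize_scope(part):
    # Assumption: map `layer_normalization` (optionally numbered) to `LayerNorm`.
    return re.sub(r"^layer_normalization(_\d+)?$", "LayerNorm", part)


def resolve(model, name, rename=False):
    """Walk `name` scope by scope and return the attribute it ends on."""
    pointer = model
    for part in name.split("/")[1:]:  # drop the leading model prefix
        if rename:
            part = normalize_scope(part)
        if part == "beta":
            pointer = getattr(pointer, "bias")    # not guarded, may raise
        elif part == "gamma":
            pointer = getattr(pointer, "weight")  # not guarded, may raise
        else:
            try:
                pointer = getattr(pointer, part)
            except AttributeError:
                continue  # unknown scope: stay on the parent object
    return pointer


name = "albert/embeddings/layer_normalization/beta"
try:
    resolve(FakeModel(), name)
except AttributeError as err:
    print("fails like the log:", err)

print(resolve(FakeModel(), name, rename=True))
```

Without the rename, the walk stays on the embeddings object after the unknown scope and the `bias` lookup raises `AttributeError`. With the rename, it reaches the normalization layer. This covers only the embeddings variable; the numbered `layer_normalization_1` to `layer_normalization_25` scopes elsewhere in the listing would need their own mapping, which this sketch does not attempt.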