🐛 [Bug] Failed to fast refit Llama2-7b with STRIP_PLAN + REFIT #3255

Open
zewenli98 opened this issue Oct 22, 2024 · 2 comments

Comments

@zewenli98
Collaborator

Bug Description

If llama2-7b is compiled with STRIP_PLAN + REFIT and then fast refit (i.e., refit with weight_name_map), the outputs are all zeros. Slow refit, however, produces correct outputs.
If llama2-7b is compiled with STRIP_PLAN + REFIT_IDENTICAL and then fast refit, the outputs are correct.

@zewenli98 zewenli98 added the bug Something isn't working label Oct 22, 2024
@zewenli98
Collaborator Author

For REFIT, I compared the mappings used in fast refit and slow refit. The fast refit mapping is missing many CONSTANT weights that refitter.get_all_weights() reports as required for refitting:

['model/arange_step CONSTANT', 'model/full_start CONSTANT', 'model/full_delta CONSTANT', 'model/full_mul_rhs CONSTANT', 'model/full_add_rhs CONSTANT', 'model/arange_1_start_rank_0 CONSTANT', 'model/arange_1_step CONSTANT', 'model/arange_2_start_rank_0 CONSTANT', 'model/arange_2_step CONSTANT', 'model/ge_gt_rhs CONSTANT', 'model/ge_eq_rhs CONSTANT', 'model/where_y CONSTANT', 'model/arange_3_start_rank_0 CONSTANT', 'model/arange_3_step CONSTANT', 'model.layers.0.input_layernorm/pow_1_rhs_val CONSTANT', 'model.layers.0.input_layernorm/add_34_rhs CONSTANT', 'model.layers.0.input_layernorm/div_lhs CONSTANT', 'model.layers.0.self_attn/div_1_rhs CONSTANT', 'model.layers.0.post_attention_layernorm/pow_2_rhs_val CONSTANT', 'model.layers.0.post_attention_layernorm/add_317_rhs CONSTANT', 'model.layers.0.post_attention_layernorm/div_2_lhs CONSTANT', 'model.layers.1.input_layernorm/pow_3_rhs_val CONSTANT', 'model.layers.1.input_layernorm/add_379_rhs CONSTANT', 'model.layers.1.input_layernorm/div_3_lhs CONSTANT', 'model.layers.1.self_attn/div_4_rhs CONSTANT', 'model.layers.1.post_attention_layernorm/pow_4_rhs_val CONSTANT', 'model.layers.1.post_attention_layernorm/add_662_rhs CONSTANT', 'model.layers.1.post_attention_layernorm/div_5_lhs CONSTANT', 'model.layers.2.input_layernorm/pow_5_rhs_val CONSTANT', 'model.layers.2.input_layernorm/add_724_rhs CONSTANT', 'model.layers.2.input_layernorm/div_6_lhs CONSTANT', 'model.layers.2.self_attn/div_7_rhs CONSTANT', 'model.layers.2.post_attention_layernorm/pow_6_rhs_val CONSTANT', 'model.layers.2.post_attention_layernorm/add_1007_rhs CONSTANT', 'model.layers.2.post_attention_layernorm/div_8_lhs CONSTANT', 'model.layers.3.input_layernorm/pow_7_rhs_val CONSTANT', 'model.layers.3.input_layernorm/add_1069_rhs CONSTANT', 'model.layers.3.input_layernorm/div_9_lhs CONSTANT', 'model.layers.3.self_attn/div_10_rhs CONSTANT', 'model.layers.3.post_attention_layernorm/pow_8_rhs_val CONSTANT', 'model.layers.3.post_attention_layernorm/add_1352_rhs 
CONSTANT', 'model.layers.3.post_attention_layernorm/div_11_lhs CONSTANT', 'model.layers.4.input_layernorm/pow_9_rhs_val CONSTANT', 'model.layers.4.input_layernorm/add_1414_rhs CONSTANT', 'model.layers.4.input_layernorm/div_12_lhs CONSTANT', 'model.layers.4.self_attn/div_13_rhs CONSTANT', 'model.layers.4.post_attention_layernorm/pow_10_rhs_val CONSTANT', 'model.layers.4.post_attention_layernorm/add_1697_rhs CONSTANT', 'model.layers.4.post_attention_layernorm/div_14_lhs CONSTANT', 'model.layers.5.input_layernorm/pow_11_rhs_val CONSTANT', 'model.layers.5.input_layernorm/add_1759_rhs CONSTANT', 'model.layers.5.input_layernorm/div_15_lhs CONSTANT', 'model.layers.5.self_attn/div_16_rhs CONSTANT', 'model.layers.5.post_attention_layernorm/pow_12_rhs_val CONSTANT', 'model.layers.5.post_attention_layernorm/add_2042_rhs CONSTANT', 'model.layers.5.post_attention_layernorm/div_17_lhs CONSTANT', 'model.layers.6.input_layernorm/pow_13_rhs_val CONSTANT', 'model.layers.6.input_layernorm/add_2104_rhs CONSTANT', 'model.layers.6.input_layernorm/div_18_lhs CONSTANT', 'model.layers.6.self_attn/div_19_rhs CONSTANT', 'model.layers.6.post_attention_layernorm/pow_14_rhs_val CONSTANT', 'model.layers.6.post_attention_layernorm/add_2387_rhs CONSTANT', 'model.layers.6.post_attention_layernorm/div_20_lhs CONSTANT', 'model.layers.7.input_layernorm/pow_15_rhs_val CONSTANT', 'model.layers.7.input_layernorm/add_2449_rhs CONSTANT', 'model.layers.7.input_layernorm/div_21_lhs CONSTANT', 'model.layers.7.self_attn/div_22_rhs CONSTANT', 'model.layers.7.post_attention_layernorm/pow_16_rhs_val CONSTANT', 'model.layers.7.post_attention_layernorm/add_2732_rhs CONSTANT', 'model.layers.7.post_attention_layernorm/div_23_lhs CONSTANT', 'model.layers.8.input_layernorm/pow_17_rhs_val CONSTANT', 'model.layers.8.input_layernorm/add_2794_rhs CONSTANT', 'model.layers.8.input_layernorm/div_24_lhs CONSTANT', 'model.layers.8.self_attn/div_25_rhs CONSTANT', 'model.layers.8.post_attention_layernorm/pow_18_rhs_val CONSTANT', 
'model.layers.8.post_attention_layernorm/add_3077_rhs CONSTANT', 'model.layers.8.post_attention_layernorm/div_26_lhs CONSTANT', 'model.layers.9.input_layernorm/pow_19_rhs_val CONSTANT', 'model.layers.9.input_layernorm/add_3139_rhs CONSTANT', 'model.layers.9.input_layernorm/div_27_lhs CONSTANT', 'model.layers.9.self_attn/div_28_rhs CONSTANT', 'model.layers.9.post_attention_layernorm/pow_20_rhs_val CONSTANT', 'model.layers.9.post_attention_layernorm/add_3422_rhs CONSTANT', 'model.layers.9.post_attention_layernorm/div_29_lhs CONSTANT', 'model.layers.10.input_layernorm/pow_21_rhs_val CONSTANT', 'model.layers.10.input_layernorm/add_3484_rhs CONSTANT', 'model.layers.10.input_layernorm/div_30_lhs CONSTANT', 'model.layers.10.self_attn/div_31_rhs CONSTANT', 'model.layers.10.post_attention_layernorm/pow_22_rhs_val CONSTANT', 'model.layers.10.post_attention_layernorm/add_3767_rhs CONSTANT', 'model.layers.10.post_attention_layernorm/div_32_lhs CONSTANT', 'model.layers.11.input_layernorm/pow_23_rhs_val CONSTANT', 'model.layers.11.input_layernorm/add_3829_rhs CONSTANT', 'model.layers.11.input_layernorm/div_33_lhs CONSTANT', 'model.layers.11.self_attn/div_34_rhs CONSTANT', 'model.layers.11.post_attention_layernorm/pow_24_rhs_val CONSTANT', 'model.layers.11.post_attention_layernorm/add_4112_rhs CONSTANT', 'model.layers.11.post_attention_layernorm/div_35_lhs CONSTANT', 'model.layers.12.input_layernorm/pow_25_rhs_val CONSTANT', 'model.layers.12.input_layernorm/add_4174_rhs CONSTANT', 'model.layers.12.input_layernorm/div_36_lhs CONSTANT', 'model.layers.12.self_attn/div_37_rhs CONSTANT', 'model.layers.12.post_attention_layernorm/pow_26_rhs_val CONSTANT', 'model.layers.12.post_attention_layernorm/add_4457_rhs CONSTANT', 'model.layers.12.post_attention_layernorm/div_38_lhs CONSTANT', 'model.layers.13.input_layernorm/pow_27_rhs_val CONSTANT', 'model.layers.13.input_layernorm/add_4519_rhs CONSTANT', 'model.layers.13.input_layernorm/div_39_lhs CONSTANT', 
'model.layers.13.self_attn/div_40_rhs CONSTANT', 'model.layers.13.post_attention_layernorm/pow_28_rhs_val CONSTANT', 'model.layers.13.post_attention_layernorm/add_4802_rhs CONSTANT', 'model.layers.13.post_attention_layernorm/div_41_lhs CONSTANT', 'model.layers.14.input_layernorm/pow_29_rhs_val CONSTANT', 'model.layers.14.input_layernorm/add_4864_rhs CONSTANT', 'model.layers.14.input_layernorm/div_42_lhs CONSTANT', 'model.layers.14.self_attn/div_43_rhs CONSTANT', 'model.layers.14.post_attention_layernorm/pow_30_rhs_val CONSTANT', 'model.layers.14.post_attention_layernorm/add_5147_rhs CONSTANT', 'model.layers.14.post_attention_layernorm/div_44_lhs CONSTANT', 'model.layers.15.input_layernorm/pow_31_rhs_val CONSTANT', 'model.layers.15.input_layernorm/add_5209_rhs CONSTANT', 'model.layers.15.input_layernorm/div_45_lhs CONSTANT', 'model.layers.15.self_attn/div_46_rhs CONSTANT', 'model.layers.15.post_attention_layernorm/pow_32_rhs_val CONSTANT', 'model.layers.15.post_attention_layernorm/add_5492_rhs CONSTANT', 'model.layers.15.post_attention_layernorm/div_47_lhs CONSTANT', 'model.layers.16.input_layernorm/pow_33_rhs_val CONSTANT', 'model.layers.16.input_layernorm/add_5554_rhs CONSTANT', 'model.layers.16.input_layernorm/div_48_lhs CONSTANT', 'model.layers.16.self_attn/div_49_rhs CONSTANT', 'model.layers.16.post_attention_layernorm/pow_34_rhs_val CONSTANT', 'model.layers.16.post_attention_layernorm/add_5837_rhs CONSTANT', 'model.layers.16.post_attention_layernorm/div_50_lhs CONSTANT', 'model.layers.17.input_layernorm/pow_35_rhs_val CONSTANT', 'model.layers.17.input_layernorm/add_5899_rhs CONSTANT', 'model.layers.17.input_layernorm/div_51_lhs CONSTANT', 'model.layers.17.self_attn/div_52_rhs CONSTANT', 'model.layers.17.post_attention_layernorm/pow_36_rhs_val CONSTANT', 'model.layers.17.post_attention_layernorm/add_6182_rhs CONSTANT', 'model.layers.17.post_attention_layernorm/div_53_lhs CONSTANT', 'model.layers.18.input_layernorm/pow_37_rhs_val CONSTANT', 
'model.layers.18.input_layernorm/add_6244_rhs CONSTANT', 'model.layers.18.input_layernorm/div_54_lhs CONSTANT', 'model.layers.18.self_attn/div_55_rhs CONSTANT', 'model.layers.18.post_attention_layernorm/pow_38_rhs_val CONSTANT', 'model.layers.18.post_attention_layernorm/add_6527_rhs CONSTANT', 'model.layers.18.post_attention_layernorm/div_56_lhs CONSTANT', 'model.layers.19.input_layernorm/pow_39_rhs_val CONSTANT', 'model.layers.19.input_layernorm/add_6589_rhs CONSTANT', 'model.layers.19.input_layernorm/div_57_lhs CONSTANT', 'model.layers.19.self_attn/div_58_rhs CONSTANT', 'model.layers.19.post_attention_layernorm/pow_40_rhs_val CONSTANT', 'model.layers.19.post_attention_layernorm/add_6872_rhs CONSTANT', 'model.layers.19.post_attention_layernorm/div_59_lhs CONSTANT', 'model.layers.20.input_layernorm/pow_41_rhs_val CONSTANT', 'model.layers.20.input_layernorm/add_6934_rhs CONSTANT', 'model.layers.20.input_layernorm/div_60_lhs CONSTANT', 'model.layers.20.self_attn/div_61_rhs CONSTANT', 'model.layers.20.post_attention_layernorm/pow_42_rhs_val CONSTANT', 'model.layers.20.post_attention_layernorm/add_7217_rhs CONSTANT', 'model.layers.20.post_attention_layernorm/div_62_lhs CONSTANT', 'model.layers.21.input_layernorm/pow_43_rhs_val CONSTANT', 'model.layers.21.input_layernorm/add_7279_rhs CONSTANT', 'model.layers.21.input_layernorm/div_63_lhs CONSTANT', 'model.layers.21.self_attn/div_64_rhs CONSTANT', 'model.layers.21.post_attention_layernorm/pow_44_rhs_val CONSTANT', 'model.layers.21.post_attention_layernorm/add_7562_rhs CONSTANT', 'model.layers.21.post_attention_layernorm/div_65_lhs CONSTANT', 'model.layers.22.input_layernorm/pow_45_rhs_val CONSTANT', 'model.layers.22.input_layernorm/add_7624_rhs CONSTANT', 'model.layers.22.input_layernorm/div_66_lhs CONSTANT', 'model.layers.22.self_attn/div_67_rhs CONSTANT', 'model.layers.22.post_attention_layernorm/pow_46_rhs_val CONSTANT', 'model.layers.22.post_attention_layernorm/add_7907_rhs CONSTANT', 
'model.layers.22.post_attention_layernorm/div_68_lhs CONSTANT', 'model.layers.23.input_layernorm/pow_47_rhs_val CONSTANT', 'model.layers.23.input_layernorm/add_7969_rhs CONSTANT', 'model.layers.23.input_layernorm/div_69_lhs CONSTANT', 'model.layers.23.self_attn/div_70_rhs CONSTANT', 'model.layers.23.post_attention_layernorm/pow_48_rhs_val CONSTANT', 'model.layers.23.post_attention_layernorm/add_8252_rhs CONSTANT', 'model.layers.23.post_attention_layernorm/div_71_lhs CONSTANT', 'model.layers.24.input_layernorm/pow_49_rhs_val CONSTANT', 'model.layers.24.input_layernorm/add_8314_rhs CONSTANT', 'model.layers.24.input_layernorm/div_72_lhs CONSTANT', 'model.layers.24.self_attn/div_73_rhs CONSTANT', 'model.layers.24.post_attention_layernorm/pow_50_rhs_val CONSTANT', 'model.layers.24.post_attention_layernorm/add_8597_rhs CONSTANT', 'model.layers.24.post_attention_layernorm/div_74_lhs CONSTANT', 'model.layers.25.input_layernorm/pow_51_rhs_val CONSTANT', 'model.layers.25.input_layernorm/add_8659_rhs CONSTANT', 'model.layers.25.input_layernorm/div_75_lhs CONSTANT', 'model.layers.25.self_attn/div_76_rhs CONSTANT', 'model.layers.25.post_attention_layernorm/pow_52_rhs_val CONSTANT', 'model.layers.25.post_attention_layernorm/add_8942_rhs CONSTANT', 'model.layers.25.post_attention_layernorm/div_77_lhs CONSTANT', 'model.layers.26.input_layernorm/pow_53_rhs_val CONSTANT', 'model.layers.26.input_layernorm/add_9004_rhs CONSTANT', 'model.layers.26.input_layernorm/div_78_lhs CONSTANT', 'model.layers.26.self_attn/div_79_rhs CONSTANT', 'model.layers.26.post_attention_layernorm/pow_54_rhs_val CONSTANT', 'model.layers.26.post_attention_layernorm/add_9287_rhs CONSTANT', 'model.layers.26.post_attention_layernorm/div_80_lhs CONSTANT', 'model.layers.27.input_layernorm/pow_55_rhs_val CONSTANT', 'model.layers.27.input_layernorm/add_9349_rhs CONSTANT', 'model.layers.27.input_layernorm/div_81_lhs CONSTANT', 'model.layers.27.self_attn/div_82_rhs CONSTANT', 
'model.layers.27.post_attention_layernorm/pow_56_rhs_val CONSTANT', 'model.layers.27.post_attention_layernorm/add_9632_rhs CONSTANT', 'model.layers.27.post_attention_layernorm/div_83_lhs CONSTANT', 'model.layers.28.input_layernorm/pow_57_rhs_val CONSTANT', 'model.layers.28.input_layernorm/add_9694_rhs CONSTANT', 'model.layers.28.input_layernorm/div_84_lhs CONSTANT', 'model.layers.28.self_attn/div_85_rhs CONSTANT', 'model.layers.28.post_attention_layernorm/pow_58_rhs_val CONSTANT', 'model.layers.28.post_attention_layernorm/add_9977_rhs CONSTANT', 'model.layers.28.post_attention_layernorm/div_86_lhs CONSTANT', 'model.layers.29.input_layernorm/pow_59_rhs_val CONSTANT', 'model.layers.29.input_layernorm/add_10039_rhs CONSTANT', 'model.layers.29.input_layernorm/div_87_lhs CONSTANT', 'model.layers.29.self_attn/div_88_rhs CONSTANT', 'model.layers.29.post_attention_layernorm/pow_60_rhs_val CONSTANT', 'model.layers.29.post_attention_layernorm/add_10322_rhs CONSTANT', 'model.layers.29.post_attention_layernorm/div_89_lhs CONSTANT', 'model.layers.30.input_layernorm/pow_61_rhs_val CONSTANT', 'model.layers.30.input_layernorm/add_10384_rhs CONSTANT', 'model.layers.30.input_layernorm/div_90_lhs CONSTANT', 'model.layers.30.self_attn/div_91_rhs CONSTANT', 'model.layers.30.post_attention_layernorm/pow_62_rhs_val CONSTANT', 'model.layers.30.post_attention_layernorm/add_10667_rhs CONSTANT', 'model.layers.30.post_attention_layernorm/div_92_lhs CONSTANT', 'model.layers.31.input_layernorm/pow_63_rhs_val CONSTANT', 'model.layers.31.input_layernorm/add_10729_rhs CONSTANT', 'model.layers.31.input_layernorm/div_93_lhs CONSTANT', 'model.layers.31.self_attn/div_94_rhs CONSTANT', 'model.layers.31.post_attention_layernorm/pow_64_rhs_val CONSTANT', 'model.layers.31.post_attention_layernorm/add_11012_rhs CONSTANT', 'model.layers.31.post_attention_layernorm/div_95_lhs CONSTANT', 'model.norm/pow_65_rhs_val CONSTANT', 'model.norm/add_11074_rhs CONSTANT', 'model.norm/div_96_lhs CONSTANT']

For REFIT_IDENTICAL, the fast refit mapping is not missing anything. I think the missing CONSTANT weights are the reason why REFIT + fast refit outputs all zeros.
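
The comparison can be sketched in plain Python, assuming the engine's required names are available as a list of strings (for example, the result of refitter.get_all_weights()) and that weight_name_map is a dict keyed by those same names. The helper names, the KERNEL entry, and the map value are hypothetical and only for illustration; this is not the actual Torch-TensorRT code.

```python
from collections import Counter


def missing_weights(required_names, weight_name_map):
    """Return the names the engine requires that the fast-refit mapping lacks.

    The order of required_names is preserved so the result is easy to diff.
    """
    mapped = set(weight_name_map)
    return [name for name in required_names if name not in mapped]


def count_by_layer_type(names):
    """Count names by their trailing layer-type token, e.g. 'CONSTANT'."""
    return Counter(name.rsplit(" ", 1)[-1] for name in names)


# Illustrative inputs. The two CONSTANT names appear in the list above;
# the KERNEL name and the state-dict key are hypothetical.
required_names = [
    "model.layers.0.self_attn.q_proj/mm KERNEL",
    "model/arange_step CONSTANT",
    "model.norm/div_96_lhs CONSTANT",
]
weight_name_map = {
    "model.layers.0.self_attn.q_proj/mm KERNEL": "model.layers.0.self_attn.q_proj.weight",
}

missing = missing_weights(required_names, weight_name_map)
print(missing)
print(count_by_layer_type(missing))
```

Grouping by the trailing token makes it easy to see whether everything that is missing is of type CONSTANT, as it was here.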

A guess about REFIT vs. REFIT_IDENTICAL

Since fast refit works for REFIT_IDENTICAL, my guess is that an engine built with REFIT_IDENTICAL does not strip CONSTANT weights.

Limitations in the current fast refit

The current fast refit doesn't handle CONSTANT weights, so if an engine contains CONSTANT weights like those in the list above, fast refit will fail silently.
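
One way to make this failure loud rather than silent is to validate the mapping before refitting. The following is a minimal sketch, assuming the required weight names are available as a list of strings and weight_name_map is a dict keyed by those names; IncompleteWeightMapError and ensure_mapping_complete are hypothetical names, not part of Torch-TensorRT or TensorRT.

```python
class IncompleteWeightMapError(RuntimeError):
    """Raised when the fast-refit mapping does not cover every required weight."""


def ensure_mapping_complete(required_names, weight_name_map, preview=5):
    """Raise if any weight the engine requires has no entry in weight_name_map.

    Returns None when the mapping covers every required name.
    """
    uncovered = sorted(set(required_names) - set(weight_name_map))
    if uncovered:
        shown = ", ".join(uncovered[:preview])
        raise IncompleteWeightMapError(
            f"{len(uncovered)} required weight(s) have no mapping, e.g. {shown}"
        )


# Example: a CONSTANT name from the list above with an empty mapping.
try:
    ensure_mapping_complete(["model/where_y CONSTANT"], {})
except IncompleteWeightMapError as err:
    print(f"fast refit is not safe here: {err}")
```

A caller could catch the error and fall back to slow refit, which is reported above to produce correct outputs.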

@zewenli98
Collaborator Author

Added a constant mapping for fast refit. This issue can be closed after PR #3167 is merged.

@zewenli98 zewenli98 self-assigned this Nov 6, 2024