[moe] merge branch feature/pipeline #4435
Commits on Jun 20, 2023
- [cluster] add process group mesh (hpcaitech#4039) (commit 1015f04)
  * [cluster] add process group mesh
  * [test] add process group mesh test
  * force sync
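A process group mesh of this kind arranges global ranks along named parallel axes (e.g. pipeline, data, tensor) so that sub-groups can be carved out per axis. A minimal, hypothetical sketch of the rank-to-coordinate arithmetic such a mesh relies on (the row-major layout and function names here are assumptions for illustration, not the library's actual API):

```python
from typing import Tuple

def rank_to_coord(rank: int, mesh_shape: Tuple[int, ...]) -> Tuple[int, ...]:
    """Map a global rank to its coordinate in a row-major mesh."""
    coord = []
    for dim in reversed(mesh_shape):
        coord.append(rank % dim)
        rank //= dim
    return tuple(reversed(coord))

def coord_to_rank(coord: Tuple[int, ...], mesh_shape: Tuple[int, ...]) -> int:
    """Inverse mapping: mesh coordinate back to global rank."""
    rank = 0
    for c, dim in zip(coord, mesh_shape):
        rank = rank * dim + c
    return rank

# e.g. 8 ranks arranged along three axes of size (2, 2, 2)
assert rank_to_coord(5, (2, 2, 2)) == (1, 0, 1)
assert coord_to_rank((1, 0, 1), (2, 2, 2)) == 5
```

Fixing all coordinates but one and sweeping the remaining axis yields the rank list for one communication sub-group along that axis.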
Commits on Jun 27, 2023
- [pipeline] add stage manager (hpcaitech#4093) (commit b10821a)
  * [pipeline] add stage manager
  * [test] add pipeline stage manager test
  * [pipeline] add docstring for stage manager
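A pipeline stage manager's core job is deciding which model layers each stage owns. A hedged sketch of a uniform layer partition (the function name `distribute_layers` and the choice to give remainder layers to the later stages are assumptions for illustration):

```python
def distribute_layers(num_layers: int, num_stages: int) -> list:
    """Split num_layers into contiguous chunks, one per stage; any
    remainder layers go one-per-stage to the later (assumed) stages."""
    base = num_layers // num_stages
    remainder = num_layers % num_stages
    sizes = [base + (1 if s >= num_stages - remainder else 0)
             for s in range(num_stages)]
    # convert chunk sizes into (start, end) layer index ranges
    bounds, start = [], 0
    for size in sizes:
        bounds.append((start, start + size))
        start += size
    return bounds

assert distribute_layers(10, 4) == [(0, 2), (2, 4), (4, 7), (7, 10)]
```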
Commits on Jun 28, 2023
- [pipeline] implement p2p communication (hpcaitech#4100) (commit bd6b0a3)
  * [pipeline] add p2p communication
  * [test] add p2p communication test
  * [test] add rerun decorator
  * [test] rename to avoid conflict
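Pipeline p2p layers commonly send arbitrary Python objects between adjacent stages by first transmitting a size header and then the serialized payload. The sketch below illustrates only that framing pattern over an in-memory stream; it is not the actual distributed implementation, which would use real communication primitives:

```python
import io
import pickle
import struct

def send_obj(stream: io.BytesIO, obj) -> None:
    """Serialize obj and write it as a length-prefixed frame, mimicking
    the two-step (size first, payload second) p2p handshake."""
    payload = pickle.dumps(obj)
    stream.write(struct.pack(">Q", len(payload)))  # 8-byte big-endian size
    stream.write(payload)

def recv_obj(stream: io.BytesIO):
    """Read one frame: the size header says how many bytes to consume."""
    (size,) = struct.unpack(">Q", stream.read(8))
    return pickle.loads(stream.read(size))

channel = io.BytesIO()
send_obj(channel, {"hidden_states": [1.0, 2.0], "micro_batch_id": 3})
channel.seek(0)
assert recv_obj(channel) == {"hidden_states": [1.0, 2.0], "micro_batch_id": 3}
```

The size header is what lets the receiver allocate a correctly sized buffer before the payload arrives, which is why the pattern survives in tensor-based p2p as well.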
Commits on Jun 29, 2023
- [pipeline] refactor 1f1b schedule (hpcaitech#4115) (commit faeac9d)
  * [api] update optimizer wrapper to fit pipeline
  * [pipeline] add base schedule
  * [pipeline] add 1f1b schedule
  * [test] add pipeline schedule utils test
  * [pipeline] fix import
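In a standard 1F1B (one-forward-one-backward) schedule, each stage runs a warmup of forward-only microbatches, then alternates one forward with one backward, and finally drains the remaining backwards. A small sketch of the usual per-stage phase arithmetic (function name and return convention are illustrative, not this repo's API):

```python
def one_f_one_b_counts(num_microbatches: int, num_stages: int, stage: int):
    """Phase sizes for one stage of a 1F1B schedule: earlier stages need
    more warmup forwards so every stage can then alternate 1F with 1B."""
    warmup = min(num_stages - stage - 1, num_microbatches)
    steady = num_microbatches - warmup  # number of 1F1B pairs
    cooldown = warmup                   # leftover backwards to drain
    return warmup, steady, cooldown

# 4 stages, 8 microbatches: the first stage warms up with 3 forwards,
# while the last stage goes straight into steady state
assert one_f_one_b_counts(8, 4, 0) == (3, 5, 3)
assert one_f_one_b_counts(8, 4, 3) == (0, 8, 0)
```

Capping the activation working set at roughly `num_stages` microbatches (rather than all of them, as in GPipe) is the main memory advantage of 1F1B.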
Commits on Jul 4, 2023
- [pipeline] add pipeline policy and bert forward (hpcaitech#4130) (commit 9852743)
  * add pipeline policy and bert forward (to be done)
  * add bertmodel pipeline forward and make tests
  * add Bert_Policy and test for policy
  * update formatting
  * update the code
  * fix bugs
  * fix name conflict
- [cluster] add process group mesh (hpcaitech#4039) (commit 3be0c35)
  * [cluster] add process group mesh
  * [test] add process group mesh test
  * force sync
- [pipeline] add stage manager (hpcaitech#4093) (commit 18c7539)
  * [pipeline] add stage manager
  * [test] add pipeline stage manager test
  * [pipeline] add docstring for stage manager
- [pipeline] implement p2p communication (hpcaitech#4100) (commit 5a467e9)
  * [pipeline] add p2p communication
  * [test] add p2p communication test
  * [test] add rerun decorator
  * [test] rename to avoid conflict
- [pipeline] refactor 1f1b schedule (hpcaitech#4115) (commit 9526f44)
  * [api] update optimizer wrapper to fit pipeline
  * [pipeline] add base schedule
  * [pipeline] add 1f1b schedule
  * [test] add pipeline schedule utils test
  * [pipeline] fix import
- [pipeline] add pipeline policy and bert forward (hpcaitech#4130) (commit 836a3a2)
  * add pipeline policy and bert forward (to be done)
  * add bertmodel pipeline forward and make tests
  * add Bert_Policy and test for policy
  * update formatting
  * update the code
  * fix bugs
  * fix name conflict
- Merge pull request hpcaitech#4166 from ver217/sync/main (commit ef1f972)
  * [sync] update from main
Commits on Jul 5, 2023
- [pipeline] build bloom model and policy, revise the base class of policy (hpcaitech#4161) (commit 386d34e)
  * add pipeline policy and bert forward (to be done)
  * add bertmodel pipeline forward and make tests
  * add Bert_Policy and test for policy
  * update formatting
  * update the code
  * fix bugs
  * fix name conflict
  * add bloom model and policy, revise the base class of policy
  * revise
  * revision
  * add bert_for_pretraining
- Commit 15a4e82
- Commit 8b6679d
- Commit 9143556
- Commit 0cbe423
- Commit 1a87dd7
- Commit d4b96ab
- Merge pull request hpcaitech#4176 from ver217/feature/pipeline-policy (commit 12e6d5d)
  * [pipeline] fit shardformer policy
Commits on Jul 6, 2023
- [pipeline] add bert_for_pretraining bert_lmhead forward and policy (hpcaitech#4172) (commit 15b34e0)
  * add pipeline policy and bert forward (to be done)
  * add bertmodel pipeline forward and make tests
  * add Bert_Policy and test for policy
  * update formatting
  * update the code
  * fix bugs
  * fix name conflict
  * add bloom model and policy, revise the base class of policy
  * revise
  * revision
  * add bert_for_pretraining
  * add bert_for_pretraining forward and policy
  * fix typos
  * cancel warning
  * change the immediate output to default dict
  * change the default output of get_shared_params
Commits on Jul 7, 2023
- Feature/vit support (hpcaitech#4182) (commit ec217de)
  * [shardformer] added tests
  * [shardformer] vit test finish and support
  * fix attention dropout
- [pipeline] move bert related pipeline components to shardformer (hpcaitech#4187) (commit c6f9c2c)
  * move bert related pipeline components to shardformer
  * fix bugs
  * revision
  * fix bert model tests
  * fix bert_lm_head model tests
  * fix tests
  * done checks
  * skip bloom
Commits on Jul 10, 2023
- [shardformer] support lazy init (hpcaitech#4202) (commit 0192011)
  * [shardformer] support lazy init
  * [shardformer] linear support lazy init
  * [shardformer] embedding support lazy init
  * [shardformer] norm support lazy init
  * [shardformer] fused linear support lazy init
  * [test] update shardformer test layer
  * [test] shardformer with lazy init fit ddp
  * [lazy] hotfix deepcopy of param
  * [shardformer] fix bert policy and update test
  * [shardformer] fix bloom policy and update test
  * [shardformer] fix opt policy and update test
  * [shardformer] fix t5 policy and update test
  * [shardformer] fix gpt2 policy and update test
  * [shardformer] fix llama policy and update test
- [pipeline] Bert pipeline for shardformer and its tests (hpcaitech#4197) (commit b30d1b9)
  * add pipeline forward
  * complete pipeline forward check
  * fix bert forward without pipeline
  * fix comments
  * discard useless line
  * add todo
  * clean prints
  * fix distribute layers
Commits on Jul 11, 2023
- [pipeline] Llama pipeline (hpcaitech#4205) (commit a2619c3)
  * bloom policy
  * llama pipeline forward and tests
  * fix the output and attention_mask
  * fix name
  * bind argument to policy
  * Revert "bloom policy" (reverts commit 8dee68a; this policy should be reverted and copied to feature/bloom)
  * revert the bloom changes
  * cancel unneeded inputs
  * gpt
- [pipeline] Llama causal lm and llama for sequence classification pipeline (hpcaitech#4208) (commit 981764c)
  * bloom policy
  * llama pipeline forward and tests
  * fix the output and attention_mask
  * fix name
  * bind argument to policy
  * Revert "bloom policy" (reverts commit 8dee68a; this policy should be reverted and copied to feature/bloom)
  * revert the bloom changes
  * cancel unneeded inputs
  * gpt
  * finish llama
  * causal lm and sequence classification
  * revision
Commits on Jul 13, 2023
- [pipeline] add bloom model pipeline (hpcaitech#4210) (commit 3595eba)
  * bloom policy
  * llama pipeline forward and tests
  * fix the output and attention_mask
  * fix name
  * bind argument to policy
  * finish bloom model
  * test shard gpt2
  * clear cache
- [pipeline] Add Pipeline Forward for GPT2Model Shardformer (hpcaitech#4224) (commit 236f294)
  * fix typehint & docstring in sharder.py
  * update pipeline forward for GPT2Model
  * add test for pipeline forward of GPT2Model
  * add cache cleaning in gpt2 test
  * change assert to raise command
Commits on Jul 14, 2023
-
Configuration menu - View commit details
-
Copy full SHA for ad2687c - Browse repository at this point
Copy the full SHA ad2687cView commit details -
- [shardformer] support SAM (hpcaitech#4231) (commit ddecf73)
  * 1. support sam; 2. add fused qkv for nn.Linear
  * update utils to support setting an element in a list
  * overwrite SamVisionAttention forward to use DropoutForParallelInput
  * remove unused code
Commits on Jul 17, 2023
- [shardformer] support whisper (hpcaitech#4212) (commit afcf4a0)
  * support whisper
  * fix bug in vocabembedding
  * support downstream model of whisper
  * update readme
- [pipeline] add pipeline forward for variants of gpt2 (hpcaitech#4238) (commit 383d2e3)
  * add forward for GPTLMHeadModel
  * add test for gpt_lm
  * arranging get_held_layers method
  * arrange forward replacement
  * add forward for GPT2ForTokenClassification
  * add forward for GPT2ForSequenceClassification
  * fix test_shard_gpt2.py
  * add GPT2DoubleHeadsmodel & fix bugs
  * add id checking in get_shared_params
- [pipeline] All bert models (hpcaitech#4233) (commit 7b8756f)
  * bloom policy
  * llama pipeline forward and tests
  * fix the output and attention_mask
  * fix name
  * bind argument to policy
  * Revert "bloom policy" (reverts commit 8dee68a; this policy should be reverted and copied to feature/bloom)
  * revert the bloom changes
  * cancel unneeded inputs
  * gpt
  * finish llama
  * causal lm and sequence classification
  * revision
  * add pure pipeline test
  * finish some bert models
  * finish all bert models
  * finish bert tests
  * fix bugs
  * fix test pipeline
  * fix data gen for qa
  * update the set pipeline forward
  * shared params
  * fix bugs
- [pipeline] finish bloom models pipeline and tests (hpcaitech#4223) (commit a895458)
  * bloom policy
  * llama pipeline forward and tests
  * fix the output and attention_mask
  * fix name
  * bind argument to policy
  * finish bloom model
  * test shard gpt2
  * clear cache
  * support all bloom models
  * add bloom models policies
  * finish bloom pipeline and tests
  * add set pipeline
  * finish bloom
Commits on Jul 18, 2023
- [bugs] hot fix some testing bugs for new models (hpcaitech#4268) (commit 843158b)
  * hot fix
  * hot fix fx tracer
Commits on Jul 19, 2023
- [pipeline] support shardformer for GPT2ForQuestionAnswering & complete pipeline support for GPT2 (hpcaitech#4245) (commit 3918898)
  * change for transformers loggers
  * add forward for GPT2ForQuestionAnswering
  * fix assert
  * fix torchrec test
Commits on Jul 20, 2023
- [shardformer] support inplace sharding (hpcaitech#4251) (commit 7b5a155)
  * [shardformer] embedding support inplace sharding
  * [shardformer] linear support inplace sharding
  * [shardformer] layernorm support inplace sharding
  * [shardformer] qkv support inplace sharding
  * [test] update shardformer layer test
  * [shardformer] fix shared param sharding
  * [shardformer] fix bert policy
  * [shardformer] fix bloom policy
  * [shardformer] fix llama policy
  * [shardformer] fix opt policy
  * [shardformer] fix t5 policy
  * [shardformer] fix fused qkv linear
  * [shardformer] fix bugs
  * force sync
  * [test] fix bugs
  * [test] fix transformer version
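Inplace sharding means a layer keeps its identity while its weight storage is swapped for this rank's shard, instead of a new sharded module being constructed. A toy sketch of the column-shard slice arithmetic using nested lists in place of tensors (the function name and even-divisibility assumption are illustrative only):

```python
def shard_columns_inplace(weight, rank: int, world_size: int) -> None:
    """Replace each row of `weight` with this rank's column shard,
    mutating the nested list in place (mimicking inplace sharding,
    where the original parameter's storage is swapped for the shard)."""
    cols = len(weight[0])
    assert cols % world_size == 0, "columns must divide evenly (assumed)"
    chunk = cols // world_size
    start = rank * chunk
    for i, row in enumerate(weight):
        weight[i] = row[start:start + chunk]

w = [[1, 2, 3, 4], [5, 6, 7, 8]]
shard_columns_inplace(w, rank=1, world_size=2)
assert w == [[3, 4], [7, 8]]
```

Mutating in place rather than rebuilding modules is what lets existing references to the parameter (optimizers, shared embeddings) keep working after sharding.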
- [pipeline] refactor gpt2 pipeline forwards (hpcaitech#4287) (commit cc120c6)
  * move gpt2 pipeline forwards to modeling folder
  * check pipeline status when adding replacing policy
  * fix typehint
  * fix arguments processing in gpt2_model_forward
- [pipeline] OPT model pipeline (hpcaitech#4258) (commit 7b583c0)
  * opt forward and test
  * pause
  * finish opt model pipeline
  * finish opt pipeline
  * fix opt
  * set transformers version
  * refactor the test pipeline
- [hotfix] fix opt pipeline (hpcaitech#4293) (commit 3b92e4a)
  * opt forward and test
  * pause
  * finish opt model pipeline
  * finish opt pipeline
  * fix opt
  * set transformers version
  * refactor the test pipeline
  * fix bug
- Feature/chatglm (hpcaitech#4240) (commit 77cc087)
  * [shardformer] added tests
  * [shardformer] vit test finish and support
  * [shardformer] chatglm ready
  * import chatglm
  * [shardformer] add test kit in model zoo for chatglm
  * [shardformer] add first version of policy of chatglm
  * [shardformer] polish chatglm code
  * [shardformer] polish code
  * [shardformer] support chatglm without layernorm
  * [shardformer] chatglm shard without mlp sharding
  * [shardformer] delete some file
  * [shardformer] ChatGLM support layernorm sharding
  * [shardformer] register without auto policy
  * [shardformer] pre-commit check files
  * [shardformer] fix chatglm configuration with pre-commit
- Commit 6c2acf0
- Commit 7668b24
- Commit b135b75
- Commit e3cd5cb
- Commit 30574a7
- Commit 28677d4
- Commit 28319c2
- Commit 3f19de9
- Commit 2a4bbcf
- Commit 32448e3
- Commit eb1c71a
- Commit 127e385
- Commit 9d5b141
Commits on Jul 21, 2023
- [pipeline] reformat for unified design (hpcaitech#4283) (commit d7e584c)
  * bert_reformat
  * reformat
  * fix a typo
  * format
  * fix bug
- [pipeline] add pipeline support for T5Stack/T5EncoderModel (hpcaitech#4300) (commit 9605805)
  * modify t5 policy & add test
  * pipeline stage distribution for t5
  * complete t5 base policy
  * t5 stack: halfway
  * modify gpt2 pipeline test
  * complete pipeline forward for T5Stack/T5EncoderModel
  * fix docstring
  * move t5 util tests to test_pipeline
- Merge pull request hpcaitech#4297 from klhhhhh/feature/support_ChatGLMForConditionalGeneration (commit 805f342)
  * Feature/support chat glm for conditional generation
Commits on Jul 25, 2023
- [shardformer] support Blip2 (hpcaitech#4243) (commit f48a8bb)
  * support base blip2
  * add support for downstream blip2 model
  * update readme
  * add forward injection
  * skip not compatible models test
  * fix test for gemini and low_level_zero_plugin
- [pipeline] test pure pipeline process using llama (hpcaitech#4218) (commit 965bf20)
  * bloom policy
  * llama pipeline forward and tests
  * fix the output and attention_mask
  * fix name
  * bind argument to policy
  * Revert "bloom policy" (reverts commit 8dee68a; this policy should be reverted and copied to feature/bloom)
  * revert the bloom changes
  * cancel unneeded inputs
  * gpt
  * finish llama
  * causal lm and sequence classification
  * revision
  * add pure pipeline test
  * fixed version
  * pure pipeline
- [pipeline] add pipeline support for all T5 models (hpcaitech#4310) (commit 28e6980)
  * complete policy for T5Model & T5ForConditionalGeneration
  * modify function signature in forwards
  * add forward for T5model
  * add forward for T5ForConditionalGeneration
  * fix a bug
  * fix hidden_states transporting in decoder
  * fix the passing of encoder_outputs
- [shardformer] support pipeline base vit model (hpcaitech#4284) (commit 2e93d9b)
  * Feature/vit support (hpcaitech#4182): [shardformer] added tests; vit test finish and support; fix attention dropout
  * support base vit pipeline
  * support vit downstream model
  * fix vit shard test
  * modify hidden states return type
  Co-authored-by: Kun Lin <[email protected]>
- [plugin] add 3d parallel plugin (hpcaitech#4295) (commit 78dd508)
  * [amp] add mixed precision optimizer
  * [plugin] add 3d parallel plugin
  * [booster] support pipeline
  * [plugin] 3d parallel plugin support clip grad norm
  * [shardformer] fix sharder and add plugin test
  * [plugin] rename 3d parallel plugin
  * [ci] support testmon core pkg change detection (hpcaitech#4305)
  * [hotfix] debug testmon
  * [hotfix] fix llama
  * [hotfix] fix p2p bugs
  * [hotfix] fix requirements
Commits on Jul 27, 2023
- [hotfix] fix gemini and zero test (hpcaitech#4333) (commit 8ad05d1)
  * [hotfix] fix gemini and zero test
  * [hotfix] fix lazy init test
- Commit d547377
Commits on Jul 31, 2023
- [pipeline] add unit test for 1f1b (hpcaitech#4303) (commit b941e65)
  * add unit test for 1f1b
  * polish code
  * polish code and update ut version
  * fix
Commits on Aug 1, 2023
- [pipeline] refactor test pipeline and remove useless utils in pipeline (hpcaitech#4324) (commit 7d5b144)
  * refactor tests
  * refactor bloom model
  * finish policy tests
  * refactor tests
  * fix test pure pipeline
  * remove test pipeline and cut down launch process
- Merge remote-tracking branch 'upstream/feature/pipeline' into feature/shardformer-models (commit 01ef6c5)
- [pipeline] support fp32 for HybridPlugin/merge shardformer test and pipeline test into one file (hpcaitech#4354) (commit 992cbb7)
  * add naive optimizer for 3DPlugin/refactor gpt2 shardformer test
  * merge tests of PP/DP/TP combinations into one test file
  * fix bug when sync grad for dp in HybridPlugin
  * update supported precisions for 3DPlugin/fix bug when shifting tp_degree
  * improve the passing of lazy_init
  * modify lazy_init/use sync_shared_params
- Commit 260df9e
Commits on Aug 2, 2023
- Commit 5403578
- Merge pull request hpcaitech#4358 from hpcaitech/feature/shardformer-models (commit b849657)
  * [Merge] Feature/shardformer models to feature/pipeline
Commits on Aug 3, 2023
- [test] Hotfix/fix some model test and refactor check util api (hpcaitech#4369) (commit 3bfdd53)
  * fix llama test
  * fix test bug of bert, blip2, bloom, gpt2
  * fix opt test
  * fix sam test
  * fix t5 test
  * fix vit test
  * fix whisper test
  * polish code
  * adjust allclose parameter
  * add mistakenly deleted code
  * adjust allclose
  * change loss function for some base models
- [shardformer] add util functions for shardformer tests/fix sync_shared_param (hpcaitech#4366) (commit 21c6bb0)
  * add util functions for shardformer tests & rewrite gpt2 test
  * fix shared_params & embedding/merging
  * fix precision
Commits on Aug 4, 2023
- [pipeline] add chatglm (hpcaitech#4363) (commit c5f4844)
  * add pipeline policy and bert forward (to be done)
  * add bertmodel pipeline forward and make tests
  * add Bert_Policy and test for policy
  * update formatting
  * update the code
  * fix bugs
  * fix name conflict
  * add bloom model and policy, revise the base class of policy
  * revise
  * revision
  * add bert_for_pretraining
  * add bert_for_pretraining forward and policy
  * fix typos
  * cancel warning
  * change the immediate output to default dict
  * change the default output of get_shared_params
  * add chatglm
  * finish chatglm
  * deletes
  * fix rmsnorm
  * fix chatglm shard
  * init
Commits on Aug 7, 2023
- [Shardformer] Merge flash attention branch to pipeline branch (hpcaitech#4362)
  * [shardformer] supported flash attention test dependency (hpcaitech#4158)
  * [shardformer] fix flash attention utils test (hpcaitech#4180)
  * [shardformer] opt support flash attention (hpcaitech#4163); move to modeling
  * [shardformer] add performance benchmark of shardformer (hpcaitech#4175)
  * [shardformer] llama support flash attention (hpcaitech#4185); move the import statement for xformer outside the forward function
  * [shardformer] gpt2 support flash attention (hpcaitech#4191)
  * [shardformer] bloom support flash attention (hpcaitech#4188); add assert on sequence length
  * [shardformer] bert support flash attention (hpcaitech#4206)
  * [shardformer] t5 support flash attention (hpcaitech#4216); fix typos
  * [shardformer] support 'paddedcausal' type of attention mask in ColoAttention (hpcaitech#4215)
  * [shardformer] t5 flash attention fix (hpcaitech#4239)
  * [shardformer] update gpt2 to use ColoAttention (hpcaitech#4234)
  * [shardformer] update opt and llama to use ColoAttention (hpcaitech#4226)
  * [shardformer] shardformer support jit fused operator (hpcaitech#4236): bloom and t5
  * [shardformer] add roadmap of flash attention; add type hint to 'self' param of forward
  * [shardformer] merge feature/shardformer-models branch to feature/flash-attention-shardformer branch (hpcaitech#4290): vit (hpcaitech#4182), SAM (hpcaitech#4231), whisper (hpcaitech#4212), chatglm (hpcaitech#4240)
  * [shardformer] whisper support flash attention (hpcaitech#4301); whisper support jit operator
  * [shardformer] sam support flash attention (hpcaitech#4316)
  Co-authored-by: Kun Lin <[email protected]>
  Co-authored-by: FoolPlayer <[email protected]>
* remove unused code * [shardformer] support whisper (hpcaitech#4212) * support whisper * fix bug in vocabembedding * support downstream model of whisper * update readme * Feature/chatglm (hpcaitech#4240) * [shardformer] added tests * [shardformer] vit test finish and support * [shardformer] chatglm ready * import chatglm * [shardformer] add test kit in model zoo for chatglm * [sharformer] add first version of policy of chatglm * [shardformer] polish chatglm code * [shardformer] polish code * [shardformer] support chatglm without layernorm * [shardformer] chatglm shard without mlp sharding * [shardformer] delete some file * [shardformer] ChatGLM support layernorm sharding * [shardformer] register without auto policy * [shardformer] pre-commit check files * [shardformer] fix chatglm configuration with pre-commit * [shardformer] sam support flash attention --------- Co-authored-by: Kun Lin <[email protected]> Co-authored-by: FoolPlayer <[email protected]> * [shardformer] merge blip2/chatglm (hpcaitech#4321) * Feature/vit support (hpcaitech#4182) * [shardformer] added tests * [shardformer] vit test finish and support * fix attention dropout * [shardformer] support SAM (hpcaitech#4231) * 1.support sam 2.add fused qkv for nn.Linear * update utils support set element in list * overtwrite SamVisionAttention foward to use DropoutForParallelInput * remove unused code * [shardformer] support whisper (hpcaitech#4212) * support whisper * fix bug in vocabembedding * support downstream model of whisper * update readme * Feature/chatglm (hpcaitech#4240) * [shardformer] added tests * [shardformer] vit test finish and support * [shardformer] chatglm ready * import chatglm * [shardformer] add test kit in model zoo for chatglm * [sharformer] add first version of policy of chatglm * [shardformer] polish chatglm code * [shardformer] polish code * [shardformer] support chatglm without layernorm * [shardformer] chatglm shard without mlp sharding * [shardformer] delete some file * 
[shardformer] ChatGLM support layernorm sharding * [shardformer] register without auto policy * [shardformer] pre-commit check files * [shardformer] fix chatglm configuration with pre-commit * [shardformer] added tests * [shardformer] vit test finish and support * import chatglm * [shardformer] add test kit in model zoo for chatglm * [sharformer] add first version of policy of chatglm * [shardformer] polish chatglm code * [shardformer] polish code * [shardformer] support chatglm without layernorm * [shardformer] delete some file * [shardformer] ChatGLM support layernorm sharding * [shardformer] register without auto policy * [shardformer] pre-commit check files * [shardformer] support ChatGLMForConditionalGeneration & add fusedlayernorm for vit * [shardformer] support Blip2 (hpcaitech#4243) * support base blip2 * add support for downstream blip2 model * update readme * add forward injection * skip not compatible models test * fix test for gemini and low_level_zero_pugin --------- Co-authored-by: Kun Lin <[email protected]> Co-authored-by: FoolPlayer <[email protected]> Co-authored-by: klhhhhh <[email protected]> * [shardformer] blip2 support flash attention and jit operator (hpcaitech#4325) * Feature/vit support (hpcaitech#4182) * [shardformer] added tests * [shardformer] vit test finish and support * fix attention dropout * [shardformer] support SAM (hpcaitech#4231) * 1.support sam 2.add fused qkv for nn.Linear * update utils support set element in list * overtwrite SamVisionAttention foward to use DropoutForParallelInput * remove unused code * [shardformer] support whisper (hpcaitech#4212) * support whisper * fix bug in vocabembedding * support downstream model of whisper * update readme * Feature/chatglm (hpcaitech#4240) * [shardformer] added tests * [shardformer] vit test finish and support * [shardformer] chatglm ready * import chatglm * [shardformer] add test kit in model zoo for chatglm * [sharformer] add first version of policy of chatglm * [shardformer] 
polish chatglm code * [shardformer] polish code * [shardformer] support chatglm without layernorm * [shardformer] chatglm shard without mlp sharding * [shardformer] delete some file * [shardformer] ChatGLM support layernorm sharding * [shardformer] register without auto policy * [shardformer] pre-commit check files * [shardformer] fix chatglm configuration with pre-commit * [shardformer] added tests * [shardformer] vit test finish and support * import chatglm * [shardformer] add test kit in model zoo for chatglm * [sharformer] add first version of policy of chatglm * [shardformer] polish chatglm code * [shardformer] polish code * [shardformer] support chatglm without layernorm * [shardformer] delete some file * [shardformer] ChatGLM support layernorm sharding * [shardformer] register without auto policy * [shardformer] pre-commit check files * [shardformer] support ChatGLMForConditionalGeneration & add fusedlayernorm for vit * [shardformer] support Blip2 (hpcaitech#4243) * support base blip2 * add support for downstream blip2 model * update readme * add forward injection * skip not compatible models test * fix test for gemini and low_level_zero_pugin * [shardformer] blip2 support flash attention and jit operator * [shardformer] blip2 support flash attention and jit operator * [shardformer] blip2 support flash attention and jit operator --------- Co-authored-by: Kun Lin <[email protected]> Co-authored-by: FoolPlayer <[email protected]> Co-authored-by: klhhhhh <[email protected]> * [shardformer] chatglm support flash attention and jit operator (hpcaitech#4330) * Feature/vit support (hpcaitech#4182) * [shardformer] added tests * [shardformer] vit test finish and support * fix attention dropout * [shardformer] support SAM (hpcaitech#4231) * 1.support sam 2.add fused qkv for nn.Linear * update utils support set element in list * overtwrite SamVisionAttention foward to use DropoutForParallelInput * remove unused code * [shardformer] support whisper (hpcaitech#4212) * 
support whisper * fix bug in vocabembedding * support downstream model of whisper * update readme * Feature/chatglm (hpcaitech#4240) * [shardformer] added tests * [shardformer] vit test finish and support * [shardformer] chatglm ready * import chatglm * [shardformer] add test kit in model zoo for chatglm * [sharformer] add first version of policy of chatglm * [shardformer] polish chatglm code * [shardformer] polish code * [shardformer] support chatglm without layernorm * [shardformer] chatglm shard without mlp sharding * [shardformer] delete some file * [shardformer] ChatGLM support layernorm sharding * [shardformer] register without auto policy * [shardformer] pre-commit check files * [shardformer] fix chatglm configuration with pre-commit * [shardformer] added tests * [shardformer] vit test finish and support * import chatglm * [shardformer] add test kit in model zoo for chatglm * [sharformer] add first version of policy of chatglm * [shardformer] polish chatglm code * [shardformer] polish code * [shardformer] support chatglm without layernorm * [shardformer] delete some file * [shardformer] ChatGLM support layernorm sharding * [shardformer] register without auto policy * [shardformer] pre-commit check files * [shardformer] support ChatGLMForConditionalGeneration & add fusedlayernorm for vit * [shardformer] support Blip2 (hpcaitech#4243) * support base blip2 * add support for downstream blip2 model * update readme * add forward injection * skip not compatible models test * fix test for gemini and low_level_zero_pugin * [shardformer] chatglm support flash attention and jit operator * [shardformer] chatglm support flash attention and jit operator * [shardformer] chatglm support flash attention and jit operator * [shardformer] chatglm support flash attention and jit operator --------- Co-authored-by: Kun Lin <[email protected]> Co-authored-by: FoolPlayer <[email protected]> Co-authored-by: klhhhhh <[email protected]> * [shardformer] vit support flash attention and 
jit operator (hpcaitech#4334) * Feature/vit support (hpcaitech#4182) * [shardformer] added tests * [shardformer] vit test finish and support * fix attention dropout * [shardformer] support SAM (hpcaitech#4231) * 1.support sam 2.add fused qkv for nn.Linear * update utils support set element in list * overtwrite SamVisionAttention foward to use DropoutForParallelInput * remove unused code * [shardformer] support whisper (hpcaitech#4212) * support whisper * fix bug in vocabembedding * support downstream model of whisper * update readme * Feature/chatglm (hpcaitech#4240) * [shardformer] added tests * [shardformer] vit test finish and support * [shardformer] chatglm ready * import chatglm * [shardformer] add test kit in model zoo for chatglm * [sharformer] add first version of policy of chatglm * [shardformer] polish chatglm code * [shardformer] polish code * [shardformer] support chatglm without layernorm * [shardformer] chatglm shard without mlp sharding * [shardformer] delete some file * [shardformer] ChatGLM support layernorm sharding * [shardformer] register without auto policy * [shardformer] pre-commit check files * [shardformer] fix chatglm configuration with pre-commit * [shardformer] added tests * [shardformer] vit test finish and support * import chatglm * [shardformer] add test kit in model zoo for chatglm * [sharformer] add first version of policy of chatglm * [shardformer] polish chatglm code * [shardformer] polish code * [shardformer] support chatglm without layernorm * [shardformer] delete some file * [shardformer] ChatGLM support layernorm sharding * [shardformer] register without auto policy * [shardformer] pre-commit check files * [shardformer] support ChatGLMForConditionalGeneration & add fusedlayernorm for vit * [shardformer] support Blip2 (hpcaitech#4243) * support base blip2 * add support for downstream blip2 model * update readme * add forward injection * skip not compatible models test * fix test for gemini and low_level_zero_pugin * 
[shardformer] vit support flash attention and jit operator * [shardformer] vit support flash attention and jit operator --------- Co-authored-by: Kun Lin <[email protected]> Co-authored-by: FoolPlayer <[email protected]> Co-authored-by: klhhhhh <[email protected]> * [pipeline] merge flash attention branch * [pipeline] merge flash attention branch * [pipeline] merge flash attention branch * [pipeline] fix conflict * [pipeline] fix conflict * Merge branch 'feature/pipeline' into feature/pipeline * Merge branch 'feature/pipeline' into feature/pipeline * Merge branch 'feature/pipeline' into feature/pipeline * activate checks * activate checks * activate checks * activate checks * activate checks * activate checks * activate checks * activate checks * fix flash attention tests * gemini ignore whisper * fix vit * fix xformers import handle --------- Co-authored-by: Frank Lee <[email protected]> Co-authored-by: Kun Lin <[email protected]> Co-authored-by: FoolPlayer <[email protected]> Co-authored-by: klhhhhh <[email protected]>
Full SHA: 7c84f51
Commits on Aug 8, 2023
-
[pipeline] rewrite t5 tests & support multi-tensor transmitting in pipeline (hpcaitech#4388)
* fix remaining t5 bugs / rewrite t5 tests
* fix multi-tensor communication in pipeline
* rearrange test_config
* fix keyerror in sync_shared_params
* fix get_held_layers & Randomnizer, complete t5 tests
* erase printing
* fix get_held_layers through modifying _release_unheld_layers
* fix _get_recursive_held_layers bug
Full SHA: 2e77e57
Commits on Aug 9, 2023
-
[shardformer] update shardformer to use flash attention 2 (hpcaitech#4392)
* cherry-pick flash attention 2
* [shardformer] update shardformer to use flash attention 2, fix
Full SHA: c14920a
Commits on Aug 10, 2023
-
[shardformer] test all optimizations (hpcaitech#4399)
Full SHA: ed2c229
Commits on Aug 11, 2023
-
[pipeline] rewrite bert tests and fix some bugs (hpcaitech#4409)
* add pipeline policy and bert forward to be done
* add bertmodel pipeline forward and make tests
* add Bert_Policy and test for policy
* update formatting
* update the code
* fix bugs
* fix name conflict
* add bloom model and policy, revise the base class of policy
* revise / revision
* add bert_for_pretraining forward and policy
* fix typos
* cancel warning
* change the immediate output to default dict
* change the default output of get_shared_params
* rewrite bert test
* fix some bugs
* del pipeline tests
* del useless print
* rewrite data repeats
Full SHA: 9916a19
-
[shardformer] fix, test gpt2 for AMP+TP (hpcaitech#4403)
* [shardformer] gpt2 tests fix
* [shardformer] test all optimizations (hpcaitech#4399)
Full SHA: fcbf80f
-
[shardformer] rewrite tests for opt/bloom/llama/vit/chatglm (hpcaitech#4395)
* rewrite opt tests
* rewrite llama tests
* rewrite bloom & vit tests
* rewrite chatglm tests
* fix LinearCol for classifiers
* add judge for other tp layers, fix lazy init in util
Full SHA: 1e518ae
-
[shardformer] update tests for all optimization (hpcaitech#4413)
Full SHA: d4a3a10
Commits on Aug 14, 2023
-
Full SHA: 6990477
-
[shardformer] update t5 tests for using all optimizations (hpcaitech#4407)
* [shardformer] gpt2 tests fix
* [shardformer] test all optimizations (hpcaitech#4399)
* [shardformer] update t5 to use all optimizations
Full SHA: ac8d4ed
-
[shardformer] update bloom/llama/vit/chatglm tests (hpcaitech#4420)
[shardformer] update bloom/llama/vit/chatglm tests
[shardformer] update opt tests
Full SHA: 82ea190
-
Merge pull request hpcaitech#4424 from ver217/sync/pipeline
[sync] update pipeline branch with main
Full SHA: 60db2cc
-
Full SHA: 9d1a6d2
-
Full SHA: 4f095e6