Need a running script for ‘dist_flash_attn’ #22
Well, after making the input sequence length divisible by world_size * block_size, it runs normally.
What is block_size?
The block_size used by flash-attn.
I'm sorry, I don't understand. I didn't find any block_size in the code.
It seems to be here.
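For reference, a minimal sketch of the padding workaround described above, assuming a (batch, seq_len) tensor of token ids; the function name, the pad_token_id, and the world_size/block_size values are illustrative, not taken from the repo:

```python
import torch


def pad_to_multiple(input_ids: torch.Tensor, multiple: int, pad_token_id: int) -> torch.Tensor:
    """Right-pad a (batch, seq_len) tensor so seq_len is divisible by `multiple`."""
    seq_len = input_ids.shape[1]
    remainder = seq_len % multiple
    if remainder == 0:
        return input_ids
    pad_len = multiple - remainder
    padding = torch.full(
        (input_ids.shape[0], pad_len), pad_token_id,
        dtype=input_ids.dtype, device=input_ids.device,
    )
    return torch.cat([input_ids, padding], dim=1)


# Example: with 2 GPUs and an assumed flash-attn block size of 256,
# the sequence length must be a multiple of 2 * 256 = 512.
world_size, block_size = 2, 256
input_ids = torch.randint(0, 32000, (1, 1000))
input_ids = pad_to_multiple(input_ids, world_size * block_size, pad_token_id=0)
assert input_ids.shape[1] % (world_size * block_size) == 0  # padded from 1000 to 1024
```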
Can you provide a script to run dist_flash_attn? I tried setting parallel_mode to dist_flash_attn, but it did not run successfully.
When trying to use 'dist_flash_attn' with 2x A100, process 0 gets stuck in torch.cuda.synchronize() inside _lightseq_forward of one decoder layer, while process 1 has already reached the same step of the next decoder layer. Strangely, the model only gets stuck on the second sample. What might be causing this bug, and is there any way to solve it?
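This is not the repo's actual code, but an illustrative sketch of why such a hang can occur: ring-style attention exchanges tensors between neighbouring ranks with asynchronous point-to-point ops, and if one rank posts a send/recv that its peer never matches (for example because the ranks disagree on the number of blocks after uneven sequence splitting), the pending op never completes and the later torch.cuda.synchronize() blocks forever. A correctly matched exchange looks roughly like this; the function name and tensor layout are assumptions for the example, and all tensors are assumed to live on the local GPU:

```python
import torch
import torch.distributed as dist


def ring_exchange_kv(k: torch.Tensor, v: torch.Tensor):
    """Send the local k/v block to the next rank and receive one from the previous rank.

    Every isend posted here has a matching irecv on the peer rank, so all
    requests can complete. If the peer skipped one of its ops, the wait()
    below (or a later torch.cuda.synchronize()) would hang, which matches
    the symptom described above.
    """
    rank, world_size = dist.get_rank(), dist.get_world_size()
    send_rank = (rank + 1) % world_size
    recv_rank = (rank - 1) % world_size
    recv_k, recv_v = torch.empty_like(k), torch.empty_like(v)
    ops = [
        dist.P2POp(dist.isend, k, send_rank),
        dist.P2POp(dist.isend, v, send_rank),
        dist.P2POp(dist.irecv, recv_k, recv_rank),
        dist.P2POp(dist.irecv, recv_v, recv_rank),
    ]
    for req in dist.batch_isend_irecv(ops):
        req.wait()
    return recv_k, recv_v
```

In practice this means all ranks must agree on tensor shapes and on the number of communication rounds, which is consistent with the earlier observation that making the sequence length divisible by world_size * block_size resolves the hang.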
EasyContext/easy_context/dist_flash_attn/lightseq_async_attn.py, line 291 (commit 41324ec)
It seems that the communication issued by process 0 in maybe_send_recv_fwd_qkvo never completes.
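One way to surface more information before the hang (a generic sketch, not specific to EasyContext): enable the standard PyTorch/NCCL debug environment variables so that communicator activity and mismatched ops are logged, and fail fast on the divisibility requirement instead of deadlocking. The environment variables below are standard PyTorch/NCCL settings; the helper function is hypothetical.

```python
import os

# Standard PyTorch / NCCL debug settings. Set these before torch.distributed
# is initialised, ideally exported in the launch environment for every rank.
os.environ.setdefault("NCCL_DEBUG", "INFO")                 # log NCCL communicator activity
os.environ.setdefault("TORCH_DISTRIBUTED_DEBUG", "DETAIL")  # detect mismatched collective/P2P usage


def check_seq_len(seq_len: int, world_size: int, block_size: int) -> None:
    """Raise immediately if the divisibility requirement is violated, instead of hanging later."""
    if seq_len % (world_size * block_size) != 0:
        raise ValueError(
            f"seq_len={seq_len} must be divisible by world_size * block_size = "
            f"{world_size * block_size}; pad or truncate the batch first."
        )
```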