Allow MPT models to return attention weights #599
Conversation
Could you please add a simple unit test for this?
This error should be noted if you plan to use it.
Updated an existing unit test to check the attention weights.
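A minimal sketch of what such a test could look like. The import path, config fields, and class names are assumptions about llm-foundry's API and may not match the repo's actual test:

```python
import torch

# Illustrative imports; the exact module path is an assumption.
from llmfoundry.models.mpt import MPTConfig, MPTForCausalLM


def test_forward_with_output_attentions():
    # Tiny config so the test runs quickly on CPU; attn_impl='torch' is
    # required because the flash attention path cannot return weights.
    config = MPTConfig(
        d_model=32,
        n_heads=4,
        n_layers=2,
        max_seq_len=16,
        vocab_size=64,
        attn_config={'attn_impl': 'torch'},
    )
    model = MPTForCausalLM(config)

    input_ids = torch.randint(0, config.vocab_size, (1, 8))
    with torch.no_grad():
        outputs = model(input_ids=input_ids, output_attentions=True)

    # One attention tensor per transformer block, each of shape
    # (batch, n_heads, query_len, key_len).
    assert len(outputs.attentions) == config.n_layers
    for attn in outputs.attentions:
        assert attn.shape == (1, config.n_heads, 8, 8)
```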
Yeah, I am aware that flash attention won't return attention weights.
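For illustration, a guard along these lines could surface that limitation to callers. The helper name and warning text are hypothetical, not the repo's API:

```python
import warnings


def resolve_output_attentions(attn_impl: str, output_attentions: bool) -> bool:
    # Illustrative helper: the flash attention kernel never materializes the
    # full attention matrix, so weights cannot be returned from that path.
    if output_attentions and attn_impl == 'flash':
        warnings.warn(
            'output_attentions=True has no effect with attn_impl="flash"; '
            'no attention weights will be returned.')
        return False
    return output_attentions
```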
One comment to make sure we're returning the right shape of stuff, but otherwise lgtm! Thanks for the PR!
Co-authored-by: Daniel King <[email protected]>
@lorabit110 could you run
Previously, output_attentions was not propagated to the attention layer, so even when it was set to True, no attention weights were returned.
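To make the described behavior concrete, here is a simplified, self-contained sketch of a torch-style attention module that only returns the softmax weights when output_attentions is propagated down to it. Class, parameter, and method names are illustrative, not the exact llm-foundry code:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class TorchAttention(nn.Module):
    def __init__(self, d_model: int, n_heads: int):
        super().__init__()
        self.n_heads = n_heads
        self.head_dim = d_model // n_heads
        self.Wqkv = nn.Linear(d_model, 3 * d_model)
        self.out_proj = nn.Linear(d_model, d_model)

    def forward(self, x: torch.Tensor, output_attentions: bool = False):
        b, s, _ = x.shape
        q, k, v = self.Wqkv(x).chunk(3, dim=-1)
        # Reshape to (batch, heads, seq, head_dim).
        q, k, v = (t.view(b, s, self.n_heads, self.head_dim).transpose(1, 2)
                   for t in (q, k, v))
        scores = q @ k.transpose(-2, -1) / self.head_dim ** 0.5
        attn_weights = F.softmax(scores, dim=-1)
        out = (attn_weights @ v).transpose(1, 2).reshape(b, s, -1)
        # Return the weights only when requested; the bug described above was
        # that the flag never reached this layer, so the weights were dropped.
        return self.out_proj(out), attn_weights if output_attentions else None
```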