Sharing two extension libraries for TorchSharp - cross-platform module loading/saving and flash-attention! #1231
shaltielshmid started this conversation in Show and tell
I'm thrilled to introduce two new extension libraries to our community:

- **TorchSharp.PyBridge** - This library enables loading PyTorch model weights saved with the standard `torch.save(model.state_dict(), path)` call, as well as weights saved in HuggingFace's `safetensors` format, including sharded models split across several files. For usage examples and further instructions, please refer to the README on the project page (a loading sketch follows below).
- **TorchSharp.FlashAttention** - With the rise of attention mechanisms in the latest generation of language models, the demand for computational resources grows with sequence length. Enter Flash Attention, a fast, memory-efficient, IO-aware implementation of exact attention. This package is a wrapper that brings flash-attention's benefits to the TorchSharp ecosystem (a reference computation is sketched below).
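To give a flavor of the loading flow, here is a minimal C# sketch. The `TorchSharp.PyBridge` namespace and the `load_py`/`load_safetensors` extension-method names follow my reading of the project README, so treat this as illustrative and check the project page for the authoritative usage:

```csharp
using TorchSharp;
using TorchSharp.PyBridge;   // assumed namespace for the load_py / load_safetensors extensions
using static TorchSharp.torch;

// Build the same architecture the weights were trained with in Python.
var model = nn.Sequential(
    ("lin1", nn.Linear(100, 10)),
    ("relu", nn.ReLU()),
    ("lin2", nn.Linear(10, 5)));

// Load weights saved in Python via: torch.save(model.state_dict(), "model.pth")
model.load_py("model.pth");

// Or load HuggingFace-style safetensors weights instead.
model.load_safetensors("model.safetensors");
```

Note that only the weights are stored in these files, so the C# model definition must match the Python architecture that produced the state_dict.

For context on what the flash-attention package accelerates, here is a naive TorchSharp version of the exact attention computation, softmax(QKᵀ/√d)V. This is a sketch of the reference math only, not the package's API; the naive form materializes the full seqLen × seqLen score matrix, which is exactly the quadratic memory cost flash attention avoids through tiling:

```csharp
using System;
using TorchSharp;
using static TorchSharp.torch;

// Shapes: (batch, heads, seqLen, headDim)
var q = randn(1, 8, 1024, 64);
var k = randn(1, 8, 1024, 64);
var v = randn(1, 8, 1024, 64);

// Naive exact attention: the (seqLen x seqLen) score matrix is what
// makes memory grow quadratically with sequence length.
var scores = matmul(q, k.transpose(-2, -1)) / Math.Sqrt(64);
var weights = nn.functional.softmax(scores, -1);
var output = matmul(weights, v);
```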
Hope you all enjoy, and any comments/bug reports/contributions are more than welcome!
Replies: 2 comments

- I thought Torch now ships with an (even more improved) implementation of flash attention.
- With these two extension libraries, it would be great to have a summary of what we can and still cannot do when dealing with trained PyTorch binary models available from HuggingFace - those models that do not require additional preprocessing, e.g. BERT.