Support Stable Diffusion model #129
Comments
Hello Sirius, thanks for taking interest in wonnx! The It will be required to do an approximation of the
Thanks for your answer. Again, let me reiterate my ignorance of this field, but this is what I've found. The implementation used in tract seems very simple: https://github.com/sonos/tract/blob/21928fb3652d028db5be1348e6017494318d4b86/onnx-opl/src/erf.rs Looking at the WGSL shaders for other operations, it seems translatable. signum in WGSL is just sign, abs is the same, for powi we can just use pow or even unroll it, since the exponent is 16 (so the unrolled form is short and efficient), and recip is just 1/x. copysign is trickier, but for the erf function it should just be a multiplication by the sign of the original input (as erf(0) == 0). I've looked a little at the other missing ops, and they don't seem as straightforward.
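To make the mapping above concrete, here is a minimal Rust sketch of an erf approximation built only from abs, a sign fix-up, a 16th power, and a reciprocal. It is not copied from tract. The function name `erf_approx` is made up for illustration, and the coefficients are the ones I recall from Abramowitz and Stegun formula 7.1.28 (quoted maximum error around 3e-7), so they should be checked against the linked file before being relied on.

```rust
/// Approximates erf(x) as sign(x) * (1 - 1 / p(|x|)^16),
/// where p(t) = 1 + a1*t + a2*t^2 + ... + a6*t^6.
/// Coefficients are assumed to be those of Abramowitz & Stegun 7.1.28.
fn erf_approx(x: f32) -> f32 {
    const A: [f32; 6] = [
        0.0705230784,
        0.0422820123,
        0.0092705272,
        0.0001520143,
        0.0002765672,
        0.0000430638,
    ];

    // Work on |x|; the sign is restored at the end because erf is odd.
    let t = x.abs();

    // Horner evaluation of a1*t + a2*t^2 + ... + a6*t^6, then add 1.
    let mut poly = 0.0f32;
    for &a in A.iter().rev() {
        poly = (poly + a) * t;
    }
    poly += 1.0;

    // powi(16) unrolled as four squarings.
    let p2 = poly * poly;
    let p4 = p2 * p2;
    let p8 = p4 * p4;
    let p16 = p8 * p8;

    // 1 - recip(p16); for very large |x| p16 overflows to infinity and y becomes 1.
    let y = 1.0 - 1.0 / p16;

    // copysign replaced by a plain sign fix-up, which is safe because erf(0) == 0.
    if x < 0.0 {
        -y
    } else {
        y
    }
}
```

In a WGSL shader the same steps would use sign and abs in place of the Rust methods, and the four squarings avoid a call to pow altogether.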
I looked into this a few weeks ago - it is a significant chunk of work for 2 reasons:
Thanks for looking at it. I hope one day we will be able to run something like SD in pure Rust.
As a matter of interest, tch-rs recently implemented Stable Diffusion: https://github.com/LaurentMazare/diffusers-rs It's not directly applicable to this, but it could inform future development efforts.
I am not too familiar with SD, but at least for BERT and other text encoders, parameterized dimensions can be replaced with fixed dimensions just fine (the model will then work with token sequences up to the statically set length).
The shape inference engine in WONNX now supports this (it allows you to set parameterized dimensions, then infer shapes for other outputs).
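As an illustration of what replacing parameterized dimensions with fixed ones means, here is a small self-contained Rust sketch. It does not use the WONNX API: the `Dim` enum, the `fix_dims` function, and the dimension names are all hypothetical and exist only to show the substitution step.

```rust
use std::collections::HashMap;

/// A tensor dimension that is either known or named by a parameter
/// (hypothetical type, not part of wonnx).
#[derive(Debug, Clone, PartialEq)]
enum Dim {
    Fixed(usize),
    Param(String),
}

/// Replaces every named dimension with the value bound to it.
/// Returns an error naming the first parameter that has no binding.
fn fix_dims(shape: &[Dim], bindings: &HashMap<String, usize>) -> Result<Vec<usize>, String> {
    shape
        .iter()
        .map(|d| match d {
            Dim::Fixed(n) => Ok(*n),
            Dim::Param(name) => bindings
                .get(name)
                .copied()
                .ok_or_else(|| format!("no value bound for dimension parameter `{}`", name)),
        })
        .collect()
}
```

With `batch` bound to 1 and `sequence` bound to 128, a shape of `[batch, sequence, 768]` becomes `[1, 128, 768]`, and shorter inputs would then have to be padded up to 128 tokens.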
As for Einsum: this may be feasible; a first attempt is in #154
Is your feature request related to a problem? Please describe.
I would like to be able to run Stable Diffusion using wonnx
Describe the solution you'd like
At a minimum, these operators are missing and should be implemented before even trying to run Stable Diffusion on wonnx:
Einsum, Erf, Expand, InstanceNormalization, Shape, Slice
This is the minimum set, based on this guide to simplifying the ONNX model (see the simplification table):
https://www.photoroom.com/tech/stable-diffusion-25-percent-faster-and-save-seconds/
Many more things will probably be needed, but I'm creating this issue because being able to run SD in Rust directly on the GPU would be a really interesting use case.
I don't have much experience with wonnx or even ML, but I decided to create this issue because it surprised me how few operators are missing to run this model. I would need to get more experience with Stable Diffusion, the diffusers library, and ONNX in Python before attempting to port it here, but maybe more experienced users are interested too.