MLX-swift stable diffusion implementation is ~5x slower than python on M1 Macs #165

louen (Contributor) commented Nov 21, 2024

Timings are taken from the example implementations in mlx-examples/stable_diffusion for Python and mlx-swift-examples/StableDiffusionExample for Swift.

For Swift I ran:
./mlx-run image-tool sd text --model base --prompt "a dog running on a beach" --seed 3141592654

and for Python:
python txt2image.py --n_images 1 --model sd --steps 50 --cfg 7.50 --seed 3141592654 "a dog running on a beach"

I took the average iteration speed of the denoising loop for each implementation, on two different M1 machines:

| Model | CPU | RAM | Swift speed | Swift time/iter | Python speed | Python time/iter | Speedup |
| --- | --- | --- | --- | --- | --- | --- | --- |
| MacBook Pro | M1 Max | 32 GB | 0.34 it/s | 2.9 s | 1.96 it/s | 0.5 s | ~6x |
| Mac Studio | M1 Ultra | 64 GB | 0.7 it/s | 1.4 s | 3.41 it/s | 0.3 s | ~5x |
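(For reference, the speedup column is just the ratio of iteration rates: 1.96 / 0.34 ≈ 5.8 on the M1 Max and 3.41 / 0.7 ≈ 4.9 on the M1 Ultra, which rounds to the ~6x and ~5x figures above.)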

The Python speedup appears consistent across runs, and it seems to be much more pronounced on M1 machines than on later models. Since these are two implementations of the same pipeline, the discrepancy should not be this wide.

davidkoski self-assigned this Nov 21, 2024
davidkoski (Collaborator) commented

It looks like this may be mlx v0.18.1 (Swift) vs mlx v0.19.3 (Python): making both of those match (v0.18.1) makes the timings match as well (per offline conversation).

We can revisit with a newer mlx once that is ready: #150
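
A minimal way to check the version match locally (a sketch, assuming the Python example installs the mlx package from PyPI):

# check which mlx version the Python example is running against
pip show mlx
# pin the Python package to match the Swift side (0.18.1), then rerun the benchmark
pip install mlx==0.18.1
python txt2image.py --n_images 1 --model sd --steps 50 --cfg 7.50 --seed 3141592654 "a dog running on a beach"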
