MLX-swift stable diffusion implementation is ~5x slower than python on M1 Macs #165

louen (Contributor) commented Nov 21, 2024

Timings are taken from the example implementations in mlx-examples/stable_diffusion for Python and mlx-swift-examples/StableDiffusionExample for Swift.

For Swift I ran:
./mlx-run image-tool sd text --model base --prompt "a dog running on a beach" --seed 3141592654

and for Python:
python txt2image.py --n_images 1 --model sd --steps 50 --cfg 7.50 --seed 3141592654 "a dog running on a beach"

I took the average iteration speed of the denoising loop for each implementation, on two different M1 machines:

| Model | CPU | RAM | Swift speed | Swift time/iter | Python speed | Python time/iter | Speedup |
| --- | --- | --- | --- | --- | --- | --- | --- |
| MacBook Pro | M1 Max | 32 GB | 0.34 it/s | 2.9 s | 1.96 it/s | 0.5 s | ~6x |
| Mac Studio | M1 Ultra | 64 GB | 0.7 it/s | 1.4 s | 3.41 it/s | 0.3 s | ~5x |
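(For reference, the speedup column is just the ratio of iteration rates: 1.96 / 0.34 ≈ 5.8 on the M1 Max and 3.41 / 0.7 ≈ 4.9 on the M1 Ultra, which rounds to the ~6x and ~5x figures above.)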

The Python speedup appears consistent across runs, and it seems to be much more pronounced on M1 machines than on later models. Since these are two implementations of the same pipeline, the discrepancy should not be this wide.

davidkoski self-assigned this Nov 21, 2024
davidkoski (Collaborator) commented

It looks like this may be mlx v0.18.1 (Swift) vs mlx v0.19.3 (Python): making both of those match (v0.18.1) makes the timings match as well (per offline conversation).

We can revisit with a newer mlx once that is ready: #150
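
A minimal way to check the version match locally (a sketch, assuming the Python example installs the mlx package from PyPI):

# check which mlx version the Python example is running against
pip show mlx
# pin the Python package to match the Swift side (0.18.1), then rerun the benchmark
pip install mlx==0.18.1
python txt2image.py --n_images 1 --model sd --steps 50 --cfg 7.50 --seed 3141592654 "a dog running on a beach"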
