feat: support benchmarking vips processor #44

Merged
merged 5 commits into from
Dec 12, 2024

knarewski
Contributor

Background

GdkPixbuf-based image processing can take over 1GB of memory for processing a single high-resolution photo. We've decided to introduce another processor, hoping that it'll improve Morandi's memory profile and performance.

Problems

The benchmarking scripts only supported the pixbuf processor.

Solution

  • pass the image processor id to Morandi.process in the bin/process-single script - this adds support for the vips processor and simplifies the code
  • add the vips processor to the benchmarking list in the bin/benchmark-all script
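
To illustrate why a single processor argument simplifies the scripts, here is a minimal, self-contained sketch. It does not use Morandi's real API: `PROCESSORS`, `process_image` and the `processor:` keyword are hypothetical stand-ins, and the lambdas only return a label instead of touching any image.

```ruby
# Hypothetical stand-ins for the two processors; the real ones live in Morandi.
PROCESSORS = {
  'pixbuf' => ->(source, options) { "pixbuf processed #{source} with #{options.keys.sort.join(',')}" },
  'vips'   => ->(source, options) { "vips processed #{source} with #{options.keys.sort.join(',')}" }
}.freeze

# Look the processor up by id instead of branching on it in every script.
def process_image(source, options, processor: 'pixbuf')
  handler = PROCESSORS.fetch(processor) do
    raise ArgumentError, "unknown processor: #{processor.inspect}"
  end
  handler.call(source, options)
end

# A benchmark loop can then iterate over ids without custom conditions.
%w[vips pixbuf].each do |id|
  puts process_image('tmp/IMG_1425.jpg', { 'gamma' => 0.85 }, processor: id)
end
```

With this shape, adding another processor to a benchmark only means adding its id to the list being iterated.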

Notes

I've noticed that vips produces noticeably larger output files, so I've also included output_size_mb in the benchmarking results.

Example benchmark

Observations

  • memory used by vips is typically lower than by pixbuf; the difference becomes very significant (e.g. about 155MB vs 2.4GB RAM) for high-resolution images; that's expected, as pixbuf loads the entire image into memory, while vips processes it in chunks
  • CPU consumption for vips is noticeably higher than pixbuf's; it can be limited via the concurrency level, but that impacts rendering time; to keep the time comparable between vips and pixbuf, vips concurrency has to be at least 2
  • the output image from vips is roughly 23-67% larger than pixbuf's in the results below; I don't know the exact cause, but I'd guess it's an implementation detail of saving; the size difference is apparent even on a no-op read->write
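
The relative differences behind these observations can be recomputed from the averages in the raw benchmark results. The sketch below is only a convenience calculation: the `RESULTS` hash and the `percent_diff` helper are names introduced here, and the numbers are copied by hand from the vips and pixbuf `avg` values.

```ruby
# Averages copied from the raw benchmark results, as [vips, pixbuf] pairs.
RESULTS = {
  'high-res-small-size-16000x11000px.jpg' => { rss_max_mb: [154.67, 2430.64], output_size_mb: [4.12, 2.47] },
  'spider-8288816.jpg' => { rss_max_mb: [178.26, 457.16], output_size_mb: [12.64, 10.03] },
  'apple-8027938.jpg' => { rss_max_mb: [155.61, 140.83], output_size_mb: [1.89, 1.5] },
  'IMG_1425.jpg' => { rss_max_mb: [124.43, 243.4], output_size_mb: [1.59, 1.29] }
}.freeze

# Percentage by which vips differs from pixbuf (positive means vips is higher).
def percent_diff(vips, pixbuf)
  ((vips - pixbuf) / pixbuf * 100).round(1)
end

RESULTS.each do |image, metrics|
  memory = percent_diff(*metrics[:rss_max_mb])
  size = percent_diff(*metrics[:output_size_mb])
  puts format('%-40s rss_max_mb: %+.1f%%  output_size_mb: %+.1f%%', image, memory, size)
end
```

A negative `rss_max_mb` figure means vips peaked at less memory than pixbuf for that image.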

Raw benchmark results

root@3a2291c480a4:/workspaces/morandi-rb# bundle exec bin/benchmark-full
Processing image: tmp/high-res-small-size-16000x11000px.jpg (Huge, pixelised greyscale gradient), 10 runs
Options: {"crop"=>"815,850,14909,10005", "straighten"=>0.5, "gamma"=>0.85}
  vips:
    real_time: avg 7.91; min 6.94; max 8.94
    kernel_time: avg 4.78; min 4.19; max 5.47
    user_time: avg 9.23; min 8.16; max 10.45
    cpu_percentage: avg 176; min 172; max 180
    rss_max_mb: avg 154.67; min 154.45; max 154.89
    output_size_mb: avg 4.12; min 4.12; max 4.12
  pixbuf:
    real_time: avg 7.37; min 7.02; max 8.08
    kernel_time: avg 1.26; min 1.14; max 1.41
    user_time: avg 5.93; min 5.66; max 6.47
    cpu_percentage: avg 97; min 97; max 97
    rss_max_mb: avg 2430.64; min 2430.45; max 2430.77
    output_size_mb: avg 2.47; min 2.47; max 2.47
Processing image: tmp/spider-8288816.jpg (10MB stock photo), 10 runs
Options: {"crop"=>"100,100,6000,4000", "angle"=>180, "straighten"=>-0.5, "gamma"=>1.2}
  vips:
    real_time: avg 2.96; min 2.73; max 3.45
    kernel_time: avg 0.33; min 0.29; max 0.42
    user_time: avg 3.37; min 3.15; max 4.04
    cpu_percentage: avg 124; min 122; max 129
    rss_max_mb: avg 178.26; min 177.95; max 178.59
    output_size_mb: avg 12.64; min 12.64; max 12.64
  pixbuf:
    real_time: avg 4.07; min 3.26; max 5.9
    kernel_time: avg 0.49; min 0.36; max 0.74
    user_time: avg 3.12; min 2.56; max 4.48
    cpu_percentage: avg 88; min 86; max 90
    rss_max_mb: avg 457.16; min 456.82; max 457.31
    output_size_mb: avg 10.03; min 10.03; max 10.03
Processing image: tmp/apple-8027938.jpg (1MB stock photo), 10 runs
Options: {"crop"=>"300,300,5000,3000", "gamma"=>1.0}
  vips:
    real_time: avg 2.31; min 1.92; max 2.9
    kernel_time: avg 0.21; min 0.15; max 0.27
    user_time: avg 2.0; min 1.71; max 2.51
    cpu_percentage: avg 95; min 94; max 97
    rss_max_mb: avg 155.61; min 155.44; max 155.82
    output_size_mb: avg 1.89; min 1.89; max 1.89
  pixbuf:
    real_time: avg 2.11; min 1.86; max 2.45
    kernel_time: avg 0.16; min 0.13; max 0.2
    user_time: avg 1.83; min 1.59; max 2.06
    cpu_percentage: avg 93; min 92; max 95
    rss_max_mb: avg 140.83; min 140.57; max 140.94
    output_size_mb: avg 1.5; min 1.5; max 1.5
Processing image: tmp/IMG_1425.jpg (A typical phone upload), 10 runs
Options: {"crop"=>"100,100,2500,2500", "straighten"=>0.5, "gamma"=>0.85}
  vips:
    real_time: avg 2.36; min 2.12; max 2.71
    kernel_time: avg 0.22; min 0.17; max 0.26
    user_time: avg 2.26; min 2.06; max 2.5
    cpu_percentage: avg 104; min 102; max 107
    rss_max_mb: avg 124.43; min 124.07; max 124.73
    output_size_mb: avg 1.59; min 1.59; max 1.59
  pixbuf:
    real_time: avg 2.28; min 2.07; max 2.69
    kernel_time: avg 0.21; min 0.16; max 0.27
    user_time: avg 1.96; min 1.77; max 2.32
    cpu_percentage: avg 94; min 94; max 96
    rss_max_mb: avg 243.4; min 243.2; max 243.57
    output_size_mb: avg 1.29; min 1.29; max 1.29

That's how most operations of the GdkPixbuf-based image processor work when saving to jpeg.

- match-multiple-operations - MAE (Mean Absolute Error): 0.000500558
- match-multiple-operations-and-straighten - MAE (Mean Absolute Error): 0.0422601
The MAE of 0.04 comes from making the vips implementation more consistent. When saving to JPEG, pixbuf handled
transparency differently depending on whether the straighten option was used (flattening alpha) or not (discarding alpha).
Vips currently always discards alpha. Long term, flattening seems preferable, but I'm trying to avoid
changing too much at once.
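
For readers unfamiliar with the metric, here is a minimal sketch of a mean absolute error calculation. The source does not say which tool produced the figures above, so this is an illustration of the metric only: `mean_absolute_error` is a name introduced here, and the assumption that channel values are normalised to 0.0..1.0 is mine.

```ruby
# Mean absolute error between two equally sized lists of channel values.
# Values are assumed to be normalised to 0.0..1.0 (an assumption, not stated in the source).
def mean_absolute_error(expected, actual)
  raise ArgumentError, 'inputs must have the same length' unless expected.length == actual.length
  raise ArgumentError, 'inputs must not be empty' if expected.empty?

  total = expected.zip(actual).sum { |e, a| (e - a).abs }
  total / expected.length.to_f
end

# Toy data: four channel values from a reference and a candidate image.
reference = [0.0, 0.5, 1.0, 0.25]
candidate = [0.0, 0.5, 0.9, 0.35]
puts mean_absolute_error(reference, candidate)
```

Under that normalisation, identical images give 0.0 and the value grows with the average per-channel deviation.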
Thanks to the dedicated argument in Morandi.process, we can delegate the processor selection instead of relying on custom conditions
…ript

It is a preparatory step before including the output file size in the benchmark.
I have noticed that vips output tends to be larger despite using the same quality settings. I believe this is caused
by vips preserving more detail when performing operations, but I decided to expose that difference in the benchmark script
to allow more informed decisions.
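
As a rough sketch of how an output size metric could be gathered and summarised in the same shape as the results above: the helpers `output_size_mb` and `summarise` are hypothetical and not taken from the benchmark script, and the 1024 * 1024 divisor is an assumption (the script may use 1,000,000 instead).

```ruby
require 'tempfile'

BYTES_PER_MB = 1024.0 * 1024 # assumption: the script may instead use 1_000_000

# Size of a file on disk in megabytes, rounded to two decimals.
def output_size_mb(path)
  (File.size(path) / BYTES_PER_MB).round(2)
end

# Summarise samples in an "avg; min; max" shape.
def summarise(samples)
  avg = (samples.sum / samples.length.to_f).round(2)
  "avg #{avg}; min #{samples.min}; max #{samples.max}"
end

# Stand-in for a rendered image: a temporary 3 MB file measured three times.
Tempfile.create('output') do |file|
  file.binmode
  file.write("\0" * (3 * 1024 * 1024))
  file.flush
  puts "output_size_mb: #{summarise([output_size_mb(file.path)] * 3)}"
end
```

Because the output is deterministic for a given input and options, min, avg and max coincide for this metric, as they do in the reported runs.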
@knarewski knarewski requested a review from a team December 12, 2024 12:00
Base automatically changed from feat-unify-alpha-handling to master December 12, 2024 13:08
@knarewski knarewski merged commit 0c5f3b2 into master Dec 12, 2024
8 checks passed
@knarewski knarewski deleted the feat-improve-benchmarks branch December 12, 2024 13:09

2 participants