feat: support benchmarking vips processor #44

knarewski · 2024-12-12T12:00:37Z

Background

GdkPixbuf-based image processing can take over 1GB of memory to process a single high-resolution photo. We've decided to introduce another processor, hoping that it will improve Morandi's memory profile and performance.

Problems

Benchmarking scripts only supported the pixbuf processor

Solution

pass the image processor id to Morandi.process in the bin/process-single script; this adds support for the vips processor and simplifies the code
add the vips processor to the benchmarking list in the bin/benchmark-all script
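
A minimal sketch of the delegation idea, not the actual script. FakeMorandi is a hypothetical stand-in so the example runs without the gem, and both the 'processor' key and the 'pixbuf' default are assumptions about how Morandi.process receives the processor id.

```ruby
# Hypothetical stand-in for the library so the sketch runs without the gem.
# The 'processor' key and the 'pixbuf' default are assumptions.
module FakeMorandi
  SUPPORTED = %w[pixbuf vips].freeze

  def self.process(source, options, target_path, local_options = {})
    processor = local_options.fetch('processor', 'pixbuf')
    raise ArgumentError, "unknown processor: #{processor}" unless SUPPORTED.include?(processor)

    # A real implementation would render the image here.
    { processor: processor, source: source, options: options, target: target_path }
  end
end

# The script forwards the id instead of branching on it itself.
def process_single(processor_id, source, options, target_path)
  FakeMorandi.process(source, options, target_path, 'processor' => processor_id)
end

result = process_single('vips', 'in.jpg', { 'gamma' => 0.85 }, 'out.jpg')
puts result[:processor]
```

With this shape the script needs no per-processor conditions of its own: an unknown id is rejected in one place, by the library.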

Notes

I've noticed that vips outputs noticeably bigger (in terms of storage space) images, so I've also included output_size_mb in benchmarking results
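
A rough sketch of how an output_size_mb metric could be collected and summarised in the avg/min/max format used in the results. The helper names are hypothetical, and treating "MB" as 1024 * 1024 bytes is an assumption; the real script may measure it differently.

```ruby
require 'tempfile'

BYTES_PER_MB = 1024.0 * 1024.0 # assumption: "MB" means mebibytes here

# Size of a file in megabytes, rounded to two decimals.
def output_size_mb(path)
  (File.size(path) / BYTES_PER_MB).round(2)
end

# Formats a list of measurements as "avg ..; min ..; max ..".
def summarise(values)
  avg = (values.sum / values.size.to_f).round(2)
  "avg #{avg}; min #{values.min}; max #{values.max}"
end

Tempfile.create('output') do |file|
  file.write('x' * 2 * 1024 * 1024) # stands in for a rendered 2MB JPEG
  file.flush
  sizes = Array.new(3) { output_size_mb(file.path) }
  puts "output_size_mb: #{summarise(sizes)}"
end
```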

Example benchmark

Observations

memory used by vips is typically lower than pixbuf's; the difference becomes very significant (e.g. 150MB vs 2.4GB RAM) for high-resolution images; that's expected, as pixbuf loads the entire image into memory, while vips processes it in chunks
cpu consumption for vips is noticeably higher than pixbuf's; it can be controlled with the concurrency level, but lowering it increases rendering time; to keep rendering time comparable between vips and pixbuf, vips concurrency has to be at least 2
size of the output image for vips is roughly 10-50% higher than pixbuf's (about 67% higher for the 16000x11000px image in the raw results); I don't know the exact cause, though I'd guess it is some implementation detail when saving; the size difference is apparent even on a no-op read->write

Raw benchmark results

root@3a2291c480a4:/workspaces/morandi-rb# bundle exec bin/benchmark-full
Processing image: tmp/high-res-small-size-16000x11000px.jpg (Huge, pixelised greyscale gradient), 10 runs
Options: {"crop"=>"815,850,14909,10005", "straighten"=>0.5, "gamma"=>0.85}
  vips:
    real_time: avg 7.91; min 6.94; max 8.94
    kernel_time: avg 4.78; min 4.19; max 5.47
    user_time: avg 9.23; min 8.16; max 10.45
    cpu_percentage: avg 176; min 172; max 180
    rss_max_mb: avg 154.67; min 154.45; max 154.89
    output_size_mb: avg 4.12; min 4.12; max 4.12
  pixbuf:
    real_time: avg 7.37; min 7.02; max 8.08
    kernel_time: avg 1.26; min 1.14; max 1.41
    user_time: avg 5.93; min 5.66; max 6.47
    cpu_percentage: avg 97; min 97; max 97
    rss_max_mb: avg 2430.64; min 2430.45; max 2430.77
    output_size_mb: avg 2.47; min 2.47; max 2.47
Processing image: tmp/spider-8288816.jpg (10MB stock photo), 10 runs
Options: {"crop"=>"100,100,6000,4000", "angle"=>180, "straighten"=>-0.5, "gamma"=>1.2}
  vips:
    real_time: avg 2.96; min 2.73; max 3.45
    kernel_time: avg 0.33; min 0.29; max 0.42
    user_time: avg 3.37; min 3.15; max 4.04
    cpu_percentage: avg 124; min 122; max 129
    rss_max_mb: avg 178.26; min 177.95; max 178.59
    output_size_mb: avg 12.64; min 12.64; max 12.64
  pixbuf:
    real_time: avg 4.07; min 3.26; max 5.9
    kernel_time: avg 0.49; min 0.36; max 0.74
    user_time: avg 3.12; min 2.56; max 4.48
    cpu_percentage: avg 88; min 86; max 90
    rss_max_mb: avg 457.16; min 456.82; max 457.31
    output_size_mb: avg 10.03; min 10.03; max 10.03
Processing image: tmp/apple-8027938.jpg (1MB stock photo), 10 runs
Options: {"crop"=>"300,300,5000,3000", "gamma"=>1.0}
  vips:
    real_time: avg 2.31; min 1.92; max 2.9
    kernel_time: avg 0.21; min 0.15; max 0.27
    user_time: avg 2.0; min 1.71; max 2.51
    cpu_percentage: avg 95; min 94; max 97
    rss_max_mb: avg 155.61; min 155.44; max 155.82
    output_size_mb: avg 1.89; min 1.89; max 1.89
  pixbuf:
    real_time: avg 2.11; min 1.86; max 2.45
    kernel_time: avg 0.16; min 0.13; max 0.2
    user_time: avg 1.83; min 1.59; max 2.06
    cpu_percentage: avg 93; min 92; max 95
    rss_max_mb: avg 140.83; min 140.57; max 140.94
    output_size_mb: avg 1.5; min 1.5; max 1.5
Processing image: tmp/IMG_1425.jpg (A typical phone upload), 10 runs
Options: {"crop"=>"100,100,2500,2500", "straighten"=>0.5, "gamma"=>0.85}
  vips:
    real_time: avg 2.36; min 2.12; max 2.71
    kernel_time: avg 0.22; min 0.17; max 0.26
    user_time: avg 2.26; min 2.06; max 2.5
    cpu_percentage: avg 104; min 102; max 107
    rss_max_mb: avg 124.43; min 124.07; max 124.73
    output_size_mb: avg 1.59; min 1.59; max 1.59
  pixbuf:
    real_time: avg 2.28; min 2.07; max 2.69
    kernel_time: avg 0.21; min 0.16; max 0.27
    user_time: avg 1.96; min 1.77; max 2.32
    cpu_percentage: avg 94; min 94; max 96
    rss_max_mb: avg 243.4; min 243.2; max 243.57
    output_size_mb: avg 1.29; min 1.29; max 1.29
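
The relative differences behind the observations can be recomputed from the averages above. This is only an arithmetic sketch: the numbers are copied from the raw results, while the short image labels and the helper name are made up for illustration.

```ruby
# Averages copied from the raw results above: [vips, pixbuf].
RESULTS = {
  'high-res 16000x11000' => { rss_max_mb: [154.67, 2430.64], output_size_mb: [4.12, 2.47] },
  'spider (10MB)'        => { rss_max_mb: [178.26, 457.16],  output_size_mb: [12.64, 10.03] },
  'apple (1MB)'          => { rss_max_mb: [155.61, 140.83],  output_size_mb: [1.89, 1.5] },
  'phone upload'         => { rss_max_mb: [124.43, 243.4],   output_size_mb: [1.59, 1.29] }
}.freeze

# Percentage by which `value` exceeds `baseline` (negative when it is lower).
def percent_change(value, baseline)
  ((value - baseline) / baseline * 100).round(1)
end

RESULTS.each do |image, metrics|
  vips_rss, pixbuf_rss = metrics[:rss_max_mb]
  vips_size, pixbuf_size = metrics[:output_size_mb]
  puts format('%-22s memory %+7.1f%%  output size %+6.1f%%',
              image,
              percent_change(vips_rss, pixbuf_rss),
              percent_change(vips_size, pixbuf_size))
end
```

A negative memory figure means vips used less RAM than pixbuf for that image; a positive output size figure means the vips file was larger.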

That's how most operations of the GdkPixbuf-based image processor work when saving to jpeg.

- match-multiple-operations - MAE (Mean Absolute Error): 0.000500558
- match-multiple-operations-and-straighten - MAE (Mean Absolute Error): 0.0422601

The MAE of 0.04 comes from improving consistency in the vips implementation. When saving to jpeg, Pixbuf handled transparency differently depending on whether the straighten option was used (flattening alpha) or not (discarding alpha). Vips currently always discards it. Long-term, flattening feels more desirable, but I'm trying to avoid changing too much at once.
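
To illustrate why the two alpha strategies produce a measurable MAE, here is a small sketch on raw RGBA values. It is not the code either processor uses: the white background, the helper names and the 0..1 normalisation of the error are all assumptions.

```ruby
# Flattening composites the pixel over a background colour (white assumed here).
def flatten_alpha(rgba, background = [255, 255, 255])
  r, g, b, a = rgba
  alpha = a / 255.0
  [r, g, b].zip(background).map { |fg, bg| (fg * alpha + bg * (1 - alpha)).round }
end

# Discarding simply drops the alpha channel and keeps the colour values.
def discard_alpha(rgba)
  rgba.first(3)
end

# Mean absolute error over all channels, normalised to 0..1.
def mean_absolute_error(pixels_a, pixels_b)
  a = pixels_a.flatten
  b = pixels_b.flatten
  a.zip(b).sum { |x, y| (x - y).abs } / (255.0 * a.size)
end

# One opaque, one semi-transparent and one fully transparent pixel.
pixels = [[200, 100, 50, 255], [200, 100, 50, 128], [0, 0, 0, 0]]
flattened = pixels.map { |px| flatten_alpha(px) }
discarded = pixels.map { |px| discard_alpha(px) }
puts mean_absolute_error(flattened, discarded)
```

Fully opaque pixels come out identical under both strategies, so any difference is confined to areas where alpha is below 255, such as the corners exposed by straightening.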

I have noticed that vips output tends to be bigger despite using the same quality settings. I believe this is caused by vips preserving more detail when performing operations, but I decided to expose that difference in the benchmark script to allow more informed decisions.

knarewski added 5 commits December 10, 2024 11:28

feat: add support for vips processor in bin/process-single script

c0911f1

Thanks to the dedicated argument in Morandi.process, we can delegate the processor selection instead of custom conditions

feat: add support for vips processor to the benchmarking script

6d82eec

feat: make the output path customisable and available in benchmark script

742d289

It is a preparatory step before including output file size in the benchmark

knarewski requested a review from a team December 12, 2024 12:00

R-Bell approved these changes Dec 12, 2024

Base automatically changed from feat-unify-alpha-handling to master December 12, 2024 13:08

knarewski merged commit 0c5f3b2 into master Dec 12, 2024
8 checks passed

knarewski deleted the feat-improve-benchmarks branch December 12, 2024 13:09
