Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

egui_plots: simplifying (complex) lines with Ramer–Douglas–Peucker might save a lot of triangulation/overdraw time #22

Open
abey79 opened this issue Feb 14, 2024 · 14 comments

Comments

@abey79
Copy link
Contributor

abey79 commented Feb 14, 2024

A HN commenter suggested looking at the Ramer–Douglas–Peucker polyline simplification algorithm. For very big lines that generate lots of triangulation/overdraw load, that might be a very neat optimisation. We'd need to run it with e.g. 0.1px tolerance, and cache the result against the zoom level.

@lokegustafsson
Copy link

I found something like this necessary (chunking the time series in 1px-equivalents and replacing each chunk with its max value) for my hobby project linux task manager (https://github.com/lokegustafsson/pi/blob/master/crates/pi/src/system/time_series.rs). It seems like a very reasonable thing to have built into egui_plot.

@willtemperley
Copy link

I'm the HN commenter - FWIW this is the Swift code I'm using. I've found radial distance gives very similar results and seems faster.

https://gist.github.com/willtemperley/abd73fa097486bc1aac3a88d9d6d5458

@emilk emilk mentioned this issue Feb 15, 2024
@mkalte666
Copy link
Contributor

I think just simplifying to max values in the range might be inaccurate when looking at lines with high frequency contents; you already get significant aliasing with that.

I'm still a fan of using a user stored data format as plot input that "mipmaps" the data using an actual low pass filter

Instead of flat mapping you could do iterative addition and single pole filtering if you split the thing into multiple flat arrays

@abey79
Copy link
Contributor Author

abey79 commented Feb 15, 2024

@mkalte666 I agree than min/max aggregation works only for time series and is not a very good general-purpose approach. The RDP algorithm on the other-hand preserves high frequency content so I think it would work much better. Both approach could be combined with a mipmap that use RDP to produce lower-res version of the data.

@mkalte666
Copy link
Contributor

Depending on what you want to do, High frequency content needs to be filtered out instead of preserved, but now that i think about it, thst really depends on the use case - aliasing might be wanted or needs to be filtered out.

Hmm.

@abey79
Copy link
Contributor Author

abey79 commented Feb 15, 2024

I guess your proposed avenue of having some kind of Series trait to abstract over the actual data that is accepted by egui_plot would allow some user-controlled flexibility in how data is filtered/decimated before display.

@lokegustafsson
Copy link

In my use case the signal was already sufficiently well low-pass filtered that max-aggregration was sufficient. Having a Series trait seems useful, although most callsites probably could have egui_plot do low-pass-filtering internally.

@enomado
Copy link
Contributor

enomado commented Mar 19, 2024

journal.pone.0094694.pdf

@NChechulin
Copy link

Hi! Just tried plotting around 2.5M points on egui 0.27.2, and it is basically unusable.
Is there anyone currently working on this algorithm? If no, I could try, but I have zero understanding of egui internals, and might need some guidance along the way.

@NChechulin
Copy link

Offtopic

Also, I have noticed that performance does not get any better as I zoom into the plot.
A neat trick would be to infer the x_start and x_end --- the leftmost and rightmost points on the X axis. Then we can use binary search to find all the points such that x_start <= x <= x_end and draw only them.

If someone is working on that, please help me find an issue (i havent found one).

P.S. If you think this comment does not belong to the conversation, please, feel free to delete it.

@abey79
Copy link
Contributor Author

abey79 commented Apr 5, 2024

@NChechulin "brute force" render of 2.5M points gets you in "custom renderer" territory (been there, done that).

I'm not really sure egui is the right place for that amount of drawn data. The original issue was meant more as an (immediate-mode) optimisation when data is ~10-100x what can possibly visualised with the available pixels. For 1000x+, you'll probably need to do probably-cached aggregation type of things. I would certainly be great to have some support for that in egui plot, but that sounds complicated and somewhat at odds with the IM paradigm. FWIW, we are in that latter case at Rerun, and we currently use at least the zoom level, probably the min max boundaries as well (can't remember) to drive the data query and aggregation.

As you say, though, there are likely a myriad of low hanging fruits to make things at least a bit better.

@mkalte666
Copy link
Contributor

Hi! Just tried plotting around 2.5M points on egui 0.27.2, and it is basically unusable. Is there anyone currently working on this algorithm? If no, I could try, but I have zero understanding of egui internals, and might need some guidance along the way.

I have done > 10 million points in a static plot and its doable.. but not without work. I mean, if you wrote your own gpu based renderer (and thats what @abey79 basically is suggesting, its probably very much doable. but you want this in egui.

I cannot share the whole code (again, work project), but i have shared the core of how i solve this problem: I mip-map the data beforehand.

The basic idea is to look at the current plot bounds, and provide a chunk of data that is just a bit more (or less, depending on your taste) than those bounds, and then select an appropriate filter level and chunk from your pre-mip-mapped data. This seems weirdly much at first, but in my experience is incredibly fast. (gpus do that for texture filtering, and while on cpu, this is just 1d flat data that is easily addressed into :D).

You will face challanges such as initially filling the plot bounds, actually generating your mipmap using a proper filter function (see this issue for one choice), etc etc. But it works without much custom rendering. I think the whole codebase around that is less than 200 lines, and its not clever code.

I will try to clear with my employer if i can publish this open source, ping me in a month if i forget.

@NChechulin
Copy link

Hi everyone!
I made a crate with 1-dimensional mipmapper inspired by @mkalte666's code.
It is hosted on github and is available on crates.io as well.

The key differences are:

  • Now it is generic
  • It uses Vec<Vec<_>> for internal data storage. It shouldn't introduce much memory or speed overhead.
  • It does not discard data
  • We don't get annoying 0's in the end of each mipmap
  • It currently aggregates 2 points by averaging their value. New ways are to be added soon.

It is still called mipmap, because I did not come up with a better name.

P.S. I might forget to change this comment after I release a new version, so if you're interested, check the crates readme, some things might've changed.

@emilk emilk transferred this issue from emilk/egui Jul 15, 2024
@enomado
Copy link
Contributor

enomado commented Jul 16, 2024

I still think all you need is journal.pone.0094694.pdf

Ramer–Douglas–Peucker is very expensive algorithm. it is good for random cartography geometry.

if you have sorted time series data you can just downsample chunks each frame, even without caching. its ok even with 1-to-100 point.

I even tesselate lines in my own - by adding min-max trapezoids to mesh

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

6 participants