improve benchmarks for Data.IntMap #657

Open
jwaldmann opened this issue Jul 4, 2019 · 9 comments

@jwaldmann
Contributor

Benchmarks should

NB: aren't these bulk operations the main reason to use IntMap? If we only operated element by element, we could use hash maps instead.
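
To make the bulk versus by-element distinction concrete, here is a minimal sketch (not taken from the containers benchmarks) that computes the same union two ways: once with `IM.union`, and once by inserting every binding one at a time. The names `bulkUnion` and `elementwiseUnion` are hypothetical, chosen only for this illustration. A benchmark of bulk operations would presumably compare something like the first against something like the second.

```haskell
import qualified Data.IntMap.Strict as IM

-- Bulk union: a single call that merges both maps (left-biased on key clashes).
bulkUnion :: IM.IntMap a -> IM.IntMap a -> IM.IntMap a
bulkUnion = IM.union

-- Element-wise union: insert every binding of the left map into the right map.
-- insert overwrites an existing key, so the left map's values win,
-- which matches the left bias of IM.union.
elementwiseUnion :: IM.IntMap a -> IM.IntMap a -> IM.IntMap a
elementwiseUnion a b = IM.foldlWithKey' (\m k v -> IM.insert k v m) b a
```

Both functions return equal maps; only the amount of work differs, which is the point a bulk-operation benchmark would measure.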

@int-e
Contributor

int-e commented Jul 6, 2019

Here's a potential source for inspiration: https://gist.github.com/int-e/36578cb04d0a187252c366b0b45ddcb6#file-intmapfal-hs-L20-L45

@jwaldmann
Contributor Author

@sjakobi
Member

sjakobi commented Jul 29, 2019

@jwaldmann That looks like a nice improvement! Why don't you simply make a PR?

@jwaldmann
Contributor Author

"Why don't you ..." - because it's a drastic change that should be discussed first? The current benchmark:

  defaultMain
    [ bench "lookup" $ whnf (lookup keys) m ...

My proposal:

  defaultMain $ do
    e <- [ 10, 15 .. 25 ]
    return $ bgroup ("2^" <> show e)
      [ bulk
        [ ("contiguous/overlapping", [1..2^e], [1..2^e]) ...
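
The proposal above is elided, so the following is only a sketch of how the input data for such size-indexed groups could be built, using nothing beyond `containers`. Only the `"contiguous/overlapping"` case and the `[10, 15 .. 25]` exponents come from the snippet; the other two case names and the helpers `bulkCases`, `caseMaps` and `allCases` are hypothetical. Wiring the result into `bgroup`/`bench` is deliberately left out.

```haskell
import qualified Data.IntMap.Strict as IM

-- A named pair of key lists to feed to a bulk operation.
type Case = (String, [Int], [Int])

-- Input shapes for size 2^e. Only the first entry appears in the proposal;
-- the other two are illustrative assumptions.
bulkCases :: Int -> [Case]
bulkCases e =
  [ ("contiguous/overlapping", [1 .. n], [1 .. n])
  , ("contiguous/disjoint", [1 .. n], [n + 1 .. 2 * n])
  , ("interleaved", [1, 3 .. 2 * n], [2, 4 .. 2 * n])
  ]
  where
    n = 2 ^ e

-- Build the two maps for a case (unit values; only the key layout matters here).
caseMaps :: Case -> (IM.IntMap (), IM.IntMap ())
caseMaps (_, xs, ys) = (fromKeys xs, fromKeys ys)
  where
    fromKeys ks = IM.fromList [(k, ()) | k <- ks]

-- One labelled group per exponent, enumerated in the list monad as in the proposal.
allCases :: [(String, [Case])]
allCases = do
  e <- [10, 15 .. 25]
  return ("2^" <> show e, bulkCases e)
```

Because the lists are lazy, the large cases cost nothing until a benchmark actually forces them.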

@sjakobi
Member

sjakobi commented Jul 31, 2019

@jwaldmann I guess a PR would be the perfect platform for that discussion! :)

@gereeter

"test bulk operations (union, intersection) - currently, they don't?"

These are tested in the set-operations-intmap benchmark, which also uses a variety of data sets.

@sjakobi
Member

sjakobi commented Aug 14, 2020

For reference: In #653, there's some performance work on fromList[WithKey] that needs better benchmarks with more realistic inputs.

Also related: #652

@sjakobi
Member

sjakobi commented Aug 14, 2020

Regarding the problem of realistic inputs for fromList and friends: How about using e.g. splitmix to generate them randomly? Certainly that's not very realistic for many applications, but it adds another data point.

prettyprinter has a benchmark that is similarly based on randomly generated data: https://github.com/quchen/prettyprinter/blob/ab2c09419cca51fcc37760e71ef6861d26753e94/prettyprinter/bench/LargeOutput.hs#L173-L181
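
As a rough sketch of the random-input idea, the code below generates a reproducible list of keys from a seed and builds an `IntMap` from it. To stay dependency-free it uses a hand-rolled SplitMix64-style step instead of the splitmix package itself; that substitution, and the names `next`, `randomKeys` and `randomMap`, are assumptions made for this example only.

```haskell
import Data.Bits (shiftR, xor)
import Data.Word (Word64)
import qualified Data.IntMap.Strict as IM

-- One step of a SplitMix64-style generator: advance the state, then mix it.
-- Hand-rolled stand-in; a real benchmark would more likely use the splitmix package.
next :: Word64 -> (Word64, Word64)
next s = (z3, s')
  where
    s' = s + 0x9e3779b97f4a7c15
    z1 = (s' `xor` (s' `shiftR` 30)) * 0xbf58476d1ce4e5b9
    z2 = (z1 `xor` (z1 `shiftR` 27)) * 0x94d049bb133111eb
    z3 = z2 `xor` (z2 `shiftR` 31)

-- n pseudo-random keys, fully determined by the seed.
randomKeys :: Word64 -> Int -> [Int]
randomKeys seed n = take n (go seed)
  where
    go s = let (w, s') = next s in fromIntegral w : go s'

-- A map keyed by those random keys; the values are just insertion positions.
randomMap :: Word64 -> Int -> IM.IntMap Int
randomMap seed n = IM.fromList (zip (randomKeys seed n) [0 ..])
```

Fixing the seed keeps runs comparable across commits, while varying it gives additional data points.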
