Skip to content

Commit

Permalink
elaborate text benchmark
Browse files Browse the repository at this point in the history
  • Loading branch information
dmetivie committed Nov 29, 2023
1 parent a865533 commit e4b6a38
Showing 1 changed file with 4 additions and 3 deletions.
7 changes: 4 additions & 3 deletions docs/src/benchmarks.md
Original file line number Diff line number Diff line change
Expand Up @@ -26,9 +26,10 @@ You can find the benchmark code in [here](https://github.com/dmetivie/Expectatio

<!-- I guess that to increase performance in this package, it would be nice to be able to do in place `fit_mle` for large multidimensional cases. -->

[^1]: I would have loved that `@btime` with `RCall` and `PyCall` would just [work](https://discourse.julialang.org/t/benchmarking-julia-vs-python-vs-r-with-pycall-and-rcall/37308).
I did compare with `R` `microbenchmark` and Python `timeit` (not a pleasing experience).
[^1]: Note that `@btime` with `RCall` and `PyCall` might produce a small-time overhead compare to the true R/Python time see [here for example](https://discourse.julialang.org/t/benchmarking-julia-vs-python-vs-r-with-pycall-and-rcall/37308).
I did compare with `R` `microbenchmark` and Python `timeit` and it produces very similar timing but in my experience `BenchmarkTools` is smarter and simpler to use, i.e. it will figure out alone the number of repetition to do in function of the run.

[^2]: This is suspect since it triggers a warning regarding K-means which I do not want to use. I asked a question [here](https://github.com/scikit-learn/scikit-learn/discussions/25916). Plus, the step by step likelihood of `Sklearn` is not the same as outputted by `ExpectationMaximization.jl` and [mixtool.R](https://cran.r-project.org/web/packages/mixtools/index.html) (both agree), so I am a bit suspicious.
[^2]: There is a suspect triggers warning regarding K-means which I do not want to use here. I asked a question [here](https://github.com/scikit-learn/scikit-learn/discussions/25916). It lead to [this issue](https://github.com/scikit-learn/scikit-learn/issues/26015) and [that PR](https://github.com/scikit-learn/scikit-learn/pull/26021). Turns out even if intial condition were provided K-mean were still computed. However to this day 23-11-29 with `scikit-learn 1.3.2` it still get the warning. Maybe it will be in the next release? I also noted this recent [PR](https://github.com/scikit-learn/scikit-learn/pull/26416).
Last, the step by step likelihood of `Sklearn` is not the same as outputted by `ExpectationMaximization.jl` and [mixtool.R](https://cran.r-project.org/web/packages/mixtools/index.html) (both agree), so I am a bit suspicious.

[^3]: It overflows very quickly for $n>500$ or so. I think it is because of naive implementation of [`logsumexp`](https://github.com/sseemayer/mixem/blob/2ffd990b22a12d48313340b427feae73bcf6062d/mixem/em.py#L5). So I eventually did not include the result in the benchmark.

0 comments on commit e4b6a38

Please sign in to comment.