The Benchmark

Dataset

The dataset consists of 59 collections submitted by users, totaling 2,936,244 reviews.

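The models are evaluated on four metrics: log loss, R-squared, root-mean-square error (RMSE), and mean absolute error (MAE). Lower is better for log loss, RMSE, and MAE; higher is better for R-squared.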
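As a rough illustration of three of these metrics, here is a minimal Python sketch. It is not the benchmark's own evaluation code: the function names and sample values are hypothetical, and it assumes log loss is taken over per-review recall predictions (1 = recalled, 0 = forgotten), while RMSE and MAE compare any two equal-length lists of numbers. The benchmark may aggregate predictions differently before computing its errors.

```python
import math


def log_loss(outcomes, probabilities, eps=1e-15):
    """Mean binary cross-entropy between 0/1 outcomes and predicted probabilities."""
    total = 0.0
    for y, p in zip(outcomes, probabilities):
        # Clamp so that log(0) can never occur.
        p = min(max(p, eps), 1 - eps)
        total -= y * math.log(p) + (1 - y) * math.log(1 - p)
    return total / len(outcomes)


def rmse(actual, predicted):
    """Root-mean-square error between two equal-length sequences."""
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def mae(actual, predicted):
    """Mean absolute error between two equal-length sequences."""
    absolute = [abs(a - p) for a, p in zip(actual, predicted)]
    return sum(absolute) / len(absolute)


# Hypothetical sample: four reviews with their predicted recall probabilities.
outcomes = [1, 0, 1, 1]
probabilities = [0.9, 0.2, 0.8, 0.6]

print("log loss:", log_loss(outcomes, probabilities))
print("RMSE:", rmse(outcomes, probabilities))
print("MAE:", mae(outcomes, probabilities))
```

RMSE penalizes large individual errors more heavily than MAE, which is why the two columns can rank models differently.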

Model	Log loss	R-squared	RMSE	MAE
FSRS v4	0.34	0.80	3.1%	1.7%
LSTM	0.35	0.67	3.7%	2.1%
FSRS v3	0.36	-0.20	5.0%	2.9%
SM-2	0.50	-31.22	18.2%	11.4%
Memrise	0.61	-72.69	16.8%	13.2%

Note that the negative R-squared values are not the result of a bug. R-squared is negative whenever a model's predictions fit the data worse than always predicting the mean of the observed values.
The best result on every metric belongs to FSRS v4: the lowest log loss, RMSE, and MAE, and the highest R-squared.
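To see how R-squared can drop below zero, consider the following small Python sketch. It uses the standard definition, one minus the ratio of the residual sum of squares to the total sum of squares; the function name and the numbers are made up for illustration and are not taken from the benchmark.

```python
def r_squared(actual, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    if ss_tot == 0:
        raise ValueError("R-squared is undefined when all actual values are equal")
    return 1 - ss_res / ss_tot


# Hypothetical observed retention rates, all close to 0.9.
actual = [0.90, 0.85, 0.95]

# Perfect predictions give the maximum score.
print(r_squared(actual, actual))

# Always predicting the mean scores approximately zero.
print(r_squared(actual, [0.90, 0.90, 0.90]))

# Predictions far from the observed values score well below zero.
print(r_squared(actual, [0.50, 0.50, 0.50]))
```

Because the observed values vary so little here, the total sum of squares is tiny, and even a moderate systematic error pushes the score far below zero. Strongly negative scores such as those of SM-2 and Memrise are therefore arithmetically possible.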

My fantastic research experience on spaced repetition algorithm: How did I publish a paper in ACMKDD as an undergraduate?

The largest open-source dataset on spaced repetition with time-series features: open-spaced-repetition/FSRS-Anki-20k