
Fix/consider short-term params when clipping PLS #150

Merged
4 commits merged into main on Dec 9, 2024

Conversation

@L-M-Sherlock (Member) commented Dec 9, 2024

@L-M-Sherlock L-M-Sherlock added the bug Something isn't working label Dec 9, 2024
@user1823 (Contributor) commented Dec 9, 2024

While the change looks good, it doesn't really fix any of the problems mentioned in the linked issue. So, the issue should not be closed after merging this.

On a side note, I think that we need a fourth memory state to model the short-term memory, just like SuperMemo introduced a third memory state (Difficulty) in SM-17.

The two-component model of long-term memory says that a status of memory in a synapse can be described with variables: stability and retrievability. The DSR model, first used in Algorithm SM-17, adds the third variable called memory difficulty, which is an expression of the complexity of the synaptic pattern involved in storing a given memory. The more complex the net of connections involved in a memory, the harder it is to maintain the memory in the long term using spaced repetition.

Source: https://help.supermemo.org/wiki/Glossary:DSR_model
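The suggested fourth memory state could be sketched as a simple data structure alongside the existing DSR variables. This is purely illustrative; the type and field names are hypothetical and not part of FSRS:

```python
from dataclasses import dataclass

# Hypothetical sketch of a memory state extended with a fourth, short-term
# component next to the DSR variables. Names are illustrative only.
@dataclass
class MemoryState:
    difficulty: float      # D: complexity of the memory (SM-17's third variable)
    stability: float       # S: how slowly retrievability decays
    retrievability: float  # R: probability of recall right now
    short_term: float      # proposed fourth state for same-day/short-term memory
```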

@Expertium (Contributor) commented Dec 9, 2024

I'm planning to add a fourth component called memory Instability, which decays over time and is highest during the first review and after a lapse.
[image: sketch of Instability decaying over time]
The overall curve would look like this, with X = time in days (of course, the shape depends on how Instability is calculated and its parameters):
[image: sketch of the resulting forgetting curve]

I'll see whether this helps.
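As a rough illustration of the idea, a forgetting curve whose effective stability is reduced early on by a decaying Instability term might look like this. The functional forms and constants are assumptions for illustration, not Expertium's actual proposal or FSRS-5's formulas:

```python
import math

# Illustrative sketch only: an exponential forgetting curve where an
# Instability term, highest right after a review or lapse, temporarily
# lowers the effective stability and then decays away over time.

def instability(t, i0=1.0, decay=0.5):
    """Instability starts at i0 and decays exponentially with time t (days)."""
    return i0 * math.exp(-decay * t)

def retrievability(t, s=10.0):
    """Recall probability at time t; instability lowers effective stability."""
    effective_s = s / (1.0 + instability(t))
    return math.exp(-t / effective_s)
```

With these assumptions, retrievability drops faster in the first days (while instability is high) and then settles toward the ordinary stability-driven curve.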

@L-M-Sherlock (Member, Author) commented:
The fix makes FSRS-5 slightly worse in the overall benchmark, but it actually improves the metrics on @brishtibheja's affected deck.

The benchmark results are here:

Model: FSRS-5-dev
Total number of users: 9999
Total number of reviews: 349923850
Weighted average by reviews:
FSRS-5-dev LogLoss (mean±std): 0.3276±0.1526
FSRS-5-dev RMSE(bins) (mean±std): 0.0518±0.0333
FSRS-5-dev AUC (mean±std): 0.7010±0.0786

Weighted average by log(reviews):
FSRS-5-dev LogLoss (mean±std): 0.3534±0.1696
FSRS-5-dev RMSE(bins) (mean±std): 0.0713±0.0462
FSRS-5-dev AUC (mean±std): 0.6995±0.0887

Weighted average by users:
FSRS-5-dev LogLoss (mean±std): 0.3568±0.1721
FSRS-5-dev RMSE(bins) (mean±std): 0.0742±0.0479
FSRS-5-dev AUC (mean±std): 0.6986±0.0908

parameters: [0.4299, 1.162, 3.1897, 15.8179, 7.1441, 0.5397, 1.7835, 0.0104, 1.5175, 0.1351, 1.0064, 1.9183, 0.1007, 0.3016, 2.3446, 0.2315, 3.0117, 0.4463, 0.635]

Model: FSRS-5
Total number of users: 9999
Total number of reviews: 349923850
Weighted average by reviews:
FSRS-5 LogLoss (mean±std): 0.3271±0.1521
FSRS-5 RMSE(bins) (mean±std): 0.0511±0.0331
FSRS-5 AUC (mean±std): 0.7015±0.0787

Weighted average by log(reviews):
FSRS-5 LogLoss (mean±std): 0.3529±0.1692
FSRS-5 RMSE(bins) (mean±std): 0.0705±0.0464
FSRS-5 AUC (mean±std): 0.7000±0.0887

Weighted average by users:
FSRS-5 LogLoss (mean±std): 0.3563±0.1719
FSRS-5 RMSE(bins) (mean±std): 0.0735±0.0484
FSRS-5 AUC (mean±std): 0.6991±0.0907

parameters: [0.4026, 1.1502, 3.1509, 15.8122, 7.1329, 0.5388, 1.7808, 0.0087, 1.5174, 0.1203, 1.0013, 1.9055, 0.11, 0.2961, 2.325, 0.2262, 3.0157, 0.5126, 0.6509]

The metrics of affected deck:

After fix:

Loss before training: 0.4525
Loss after training: 0.4319

Last rating = all
R-squared: 0.9728
MAE: 0.0118
ICI: 0.0075
E50: 0.0076
E90: 0.0148
EMax: 0.0383
RMSE(bins): 0.0459
AUC: 0.6967

Last rating = 1
R-squared: 0.8075
MAE: 0.0324
ICI: 0.0216
E50: 0.0208
E90: 0.0464
EMax: 0.1289
RMSE(bins): 0.0583
AUC: 0.7209

Last rating = 2
R-squared: 0.3371
MAE: 0.2579
ICI: 0.2579
E50: 0.1535
E90: 0.5278
EMax: 0.6626
RMSE(bins): 0.3524
AUC: 1.0000

Last rating = 3
R-squared: 0.9416
MAE: 0.0183
ICI: 0.0111
E50: 0.0087
E90: 0.0241
EMax: 0.0255
RMSE(bins): 0.0528
AUC: 0.6855

Last rating = 4
R-squared: 0.2246
MAE: 0.0872
ICI: 0.0588
E50: 0.0358
E90: 0.0908
EMax: 0.4747
RMSE(bins): 0.2139
AUC: 0.8810

Before fix:

Loss before training: 0.4855
Loss after training: 0.4572

Last rating = all
R-squared: 0.9036
MAE: 0.0262
ICI: 0.0195
E50: 0.0165
E90: 0.0471
EMax: 0.0668
RMSE(bins): 0.0472
AUC: 0.7197

Last rating = 1
R-squared: 0.7542
MAE: 0.0588
ICI: 0.0438
E50: 0.0246
E90: 0.1098
EMax: 0.1297
RMSE(bins): 0.0592
AUC: 0.7740

Last rating = 2
R-squared: 0.1317
MAE: 0.3333
ICI: 0.3333
E50: 0.2068
E90: 0.5749
EMax: 0.7267
RMSE(bins): 0.4035
AUC: 1.0000

Last rating = 3
R-squared: 0.9496
MAE: 0.0171
ICI: 0.0122
E50: 0.0133
E90: 0.0175
EMax: 0.0364
RMSE(bins): 0.0498
AUC: 0.6849

Last rating = 4
R-squared: 0.2181
MAE: 0.1070
ICI: 0.0599
E50: 0.0364
E90: 0.1814
EMax: 0.2393
RMSE(bins): 0.1736
AUC: 0.7518

@L-M-Sherlock L-M-Sherlock marked this pull request as ready for review December 9, 2024 10:41
@L-M-Sherlock L-M-Sherlock added this pull request to the merge queue Dec 9, 2024
Merged via the queue into main with commit 6882c8c Dec 9, 2024
4 checks passed
@L-M-Sherlock L-M-Sherlock deleted the Fix/consider-short-term-params-when-clipping-PLS branch December 9, 2024 10:46
@Expertium (Contributor) commented Dec 9, 2024

Interesting. If it's better for some users but not for others, maybe we could turn it into an optimizable parameter.

```python
new_minimum_s = old_s / self.w[19]
return torch.minimum(new_s, new_minimum_s)
```

Make w[19] range from 1 to 5.
Would you benchmark it, please? My PC is busy.
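A minimal self-contained sketch of that suggestion, with plain Python standing in for torch; `clip_new_stability` and the parameter handling are hypothetical names for illustration, not FSRS's actual API:

```python
# Hypothetical sketch of making the post-lapse-stability clipping divisor an
# optimizable parameter w[19], constrained to the suggested range [1, 5].

def clamp(x, lo, hi):
    """Keep a trainable parameter inside its allowed range."""
    return max(lo, min(hi, x))

def clip_new_stability(new_s, old_s, w19):
    """Cap the new stability at old_s / w[19] (mirrors torch.minimum above)."""
    w19 = clamp(w19, 1.0, 5.0)       # enforce w[19] in [1, 5]
    new_minimum_s = old_s / w19      # the clipping bound
    return min(new_s, new_minimum_s)
```

In the real optimizer the clamping would happen during training (e.g. on the parameter tensor), not inside the forward pass, but the arithmetic is the same.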

L-M-Sherlock added a commit to open-spaced-repetition/fsrs-rs that referenced this pull request Dec 10, 2024
Labels: bug (Something isn't working)
Linked issue that merging may close: [Feature Request] Improve the S increase formula for same-day reviews
3 participants