Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix/keep data if count >= 6 #99

Merged
merged 2 commits into from
Mar 25, 2024
Merged

Conversation

L-M-Sherlock
Copy link
Member

No description provided.

@L-M-Sherlock L-M-Sherlock added the bug Something isn't working label Mar 25, 2024
@L-M-Sherlock L-M-Sherlock merged commit a02e177 into main Mar 25, 2024
@L-M-Sherlock L-M-Sherlock deleted the Fix/keep-data-if-count-=-6 branch March 25, 2024 09:39
@user1823
Copy link
Contributor

@L-M-Sherlock, the new code doesn't work as it was originally designed.

The original design was:

  • If count < 6, remove all the cards in that delta_t (even if it is more than 5% of the total). This is because when count is less than 6, then the value of recall is not reliable. For example, if I get 4/5 correct, the recall is 80% and if I get 3/5 correct, the recall is 60%. This is a very significant difference that is caused by only a single additional lapse.
  • Even if count ≥ 6, we can remove up to 5% of the total reviews (or up to 20 reviews, whatever is greater). The reason is that the case of 6 reviews or 7 reviews is not very much different than that of 5 reviews. So, we shouldn't stop filtering after we have at least 6 reviews in a given delta_t. The condition of 5% will prevent filtering of too many reviews.

@L-M-Sherlock
Copy link
Member Author

This PR made the implementation consistent with the FSRS-rs. And it could keep more data from the filter.

@user1823
Copy link
Contributor

I would say that it is better to make FSRS-rs consistent with the previous behavior of fsrs-optimizer. I have explained the reasons in the previous comment.

And it could keep more data from the filter.

Yes, but pre-train is sensitive to outliers and needs a robust outlier filter. In fact, the need for a stricter outlier filter in pretrain and a less strict outlier filter in train was the motivation for open-spaced-repetition/fsrs-rs#121

@L-M-Sherlock
Copy link
Member Author

I’m satisfied with the current implementation. So it will not be changed unless someone reports that the optimizer provides bad parameters.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants