Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Refactor code #75

Open
wants to merge 63 commits into
base: main
Choose a base branch
from

Conversation

vlopezferrando
Copy link

@vlopezferrando vlopezferrando commented Dec 31, 2024

Hi, I've followed this repository for a while and thought I'd contribute some refactors of the code. I have tried to keep the logic the same (haven't changed the tests) but simplified the logic wherever possible and reduced the amount of repeated code.

This is a work in progress, but I'd like to know if work like this would be welcome and take any suggestions.

My plan was to simplify the code as much as possible and then port it to my other project simple-spaced-repetition, which has a more minimalistic API (single Card class).


Update: this ended up being an almost complete rewrite of the code, but keeping the same functionality. Let me summarize the changes and the logic behind them.

Code split in 3 files

I have split the code in 3 files:

  • models.py: which has the Rating, State, Card and ReviewLog classes. These classes just serve the purpose of holding the data and its serialization, but there is no business logic.
  • fsrs_v5.py: a new FSRSv5 class that encapsulates all the FSRS v5 logic. To put it in another way, this is the place where all the 19 weights are used, and only the 19 weights are used.
  • scheduler.py: the Scheduler class implements the top level class that users should use. It is a wrapper around FSRS v5 that adds learning/relearning steps, fuzzing and interval clamping.

Test changes

The tests have stayed the same except for two cases:

  1. The scheduler now has a fsrs attribute with a FSRSv5 object. The attributes' comparison when serializing/deserializing failed because objects had a different address.
  2. Instead of raising a ValueError when using a naive datetime, I believe it is more convenient to just add a UTC timezone to the datetime. Also, if the datetime is timezone aware but has a different timezone, we can use it directly.

Important refactors

The logic stays the same for the rest of the code, but I would highlight some of the improvements on the new approach:

  • Centralizing all the FSRS v5 algorithm in a single class: it makes it much easier to understand the algorithm, plus I added docstrings with explanations from the algorithm spec.
  • Refactor of review_card. Before it was a huge function, now it is much smaller and easier to understand.
  • Extraction of update_from_steps function, that is used both for learning and relearning steps.
  • Reduced the lines of code by simplifying a lot of conditionals and other logic, while increasing the amount of comments to make the code more understandable.

I know this is a huge rewrite, but hopefully you find it useful. Even if you don't feel like merging all of it, I think it can provide some ideas on how to improve the code.

@joshdavham
Copy link
Member

Hey Victor, I should be able to look at this PR in about two or three days, since I'm currently travelling at the moment.

But as for initial thoughts: we are likely going to be adding an optimizer to this package in the coming month which may change the codebase a fair bit so I'd be a bit hesitant to do another refactoring right away. I'd feel more comfortable to implement the optimizer first, before any refactoring.

@vlopezferrando
Copy link
Author

Hi Josh, thanks for your reply. I'll take this couple days to clean up the PR and implement some other improvements I still have in mind.

I understand that adding the optimizer is the priority, but if the refactor could cut code size by half and simplify the logic, it may actually be a good idea to do the refactor before adding new code. Your call!

I noticed when tweaking the code that tests are not completely exhaustive: some logic changes resulted in tests passing anyway. I wonder: is there some related project (from another programming language) that we could port tests from? If so, I'd be happy to contribute some tests too to make the refactor safer.

@joshdavham
Copy link
Member

@vlopezferrando

Woah that's a lot of changes haha! Also, I just read your updated message

So I just talked with the other maintainer and we both think that it'd be best to press pause on this PR for now. You've got some good ideas and a couple ideas that I disagree with, but we think it'd be best to halt progress on this for the moment and then maybe in a month or two (or after the next couple of releases), we can think of doing some smaller refactorings rather than one whole rewrite at once.

@vlopezferrando
Copy link
Author

Sure, I'd be happy to split some of the functionality if you think that'd be useful.

@joshdavham
Copy link
Member

@vlopezferrando

Sure, I'd be happy to split some of the functionality if you think that'd be useful.

Yes, splitting this PR into a smaller PR's in the future would be better rather than one big PR for a full rewrite.

That said, I'm gonna suggest that we put on hold any pure refactorings of this repo for maybe about two months, or after a couple more releases. After that, I'd be open to focusing on improving the code quality. To re-iterate, the main reason for this is that this repo is still under active development and will be getting more features in the coming months like the optimizer which is unfortunately non-trivial.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants