Make descending retrievability the default sort order; drop day sorting? #3460

Open
dae opened this issue Sep 30, 2024 · 50 comments

Comments

@dae
Member

dae commented Sep 30, 2024

Lastly, if you don’t feel like making this big change right now, at least add Reverse Relative Overdueness (aka Retrievability Descending), as it’s objectively the best. Might be worth naming it Least Overdue First, and renaming Relative Overdueness to Most Overdue First.

See the linked thread for wording discussion.

Originally reported on https://forums.ankiweb.net/t/improving-sort-orders/50081

@Expertium
Contributor

Expertium commented Sep 30, 2024

I still think that long-term we should only have 3-4 sort orders + "Custom" that opens a new window where there would be lots of metrics (stability/difficulty/overdueness/interval length/due date/etc.) and an Ascending/Descending toggle, so that people with oddly specific preferences don't lose their favorite functionality.

Regarding wording, "Reverse Relative Overdueness" is kind of a mouthful, there's probably a better name. I suggested "Least Overdue First". @brishtibheja any ideas?

@brishtibheja
Contributor

I still think that long-term we should only have 3-4 sort orders

I think designing user experience for a wider (non-nerds) audience involves simplification. I'll agree with you. Also, the mandatory UXmyth article.

users may assume it works based on the number of days a card is overdue without regard to the card’s interval. (dae's comment)

In my experience, most users including me, Expertium, and some others still misunderstood relative overdueness. Isn't that fine?

I think we should be a bit more task-oriented. The namings here do not convey that least overdue first should be used with large backlogs and not most overdue first. I think most of the namings should convey what we at least expect the usage to be rather than what the sort is internally doing.

@dae
Member Author

dae commented Oct 2, 2024

I'm skeptical we could come up with task-based names for 3-4 sort orders, let alone all the existing ones. Can you provide some examples of what you had in mind?

@brishtibheja
Contributor

That would depend on what the users who requested the sort orders were trying to do. For example, depending on user goals ascending intervals can be renamed to recently reviewed first (though it doesn't work that way if you're going through a backlog which then would need addressing).

Some others:

random = increase interleaving (just an e.g.; that's how some see it but might confuse users more)

reverse relative overdueness = optimise for backlog

difficulty ascending = easy cards first

difficulty descending = difficult cards first


I also think it would be useful to remove relative overdueness altogether as it doesn't serve much purpose. IIRC this was mainly thought to be useful for backlogs but it actually harms the user with a backlog (imho).

@dae
Member Author

dae commented Oct 4, 2024

difficulty ascending = easy cards first

That's not task-based, that's just a rewording for the better. I find your random example unconvincing. The only task listed there is 'optimise for backlog'. And even that's not clear: you can optimize for retention, you can optimize for total workload, etc.

I think simply changing the default option may be sufficient here.

I also think it would be useful to remove relative overdueness altogether as it doesn't serve much purpose.

Has that been shown to be universally inferior to the other options in all metrics? And what do we say to users who say "I don't care if it's less efficient, I want to study that way"?

@Expertium
Contributor

Expertium commented Oct 4, 2024

Has that been shown to be universally inferior to the other options in all metrics?

It's inferior to reverse relative overdueness and cannot maintain retention in the presence of a backlog.

And what do we say to users who say "I don't care if it's less efficient, I want to study that way"?

As I said before: "3-4 sort orders + Custom that opens a new window where there would be lots of metrics (stability/difficulty/overdueness/interval length/due date/etc.) and an Ascending/Descending toggle, so that people with oddly specific preferences don't lose their favorite functionality."
Btw, a lot of the current sort orders are just inverses of each other (ascending vs. descending).

@brishtibheja
Contributor

brishtibheja commented Oct 4, 2024 via email

@user1823
Contributor

user1823 commented Oct 8, 2024

consider making it the default

I support the addition of the 'reverse relative overdueness' option because it seems to be the best sort order for dealing with a backlog. But, I am not convinced that it should be the default.

Multiple studies show that interleaving the practice of different topics leads to better learning outcomes. For this reason, I think that the Random option should be the default.

Answering the question raised by @brishtibheja on the forums

you are assuming that the Random sort order is somehow producing more interleaving between topics than other sort orders.

Other sort orders decrease interleaving in one or the other condition.

  • Orders based on D prevent interleaving of hard and easy material.
  • Orders based on S prevent interleaving of mature and young cards.
  • Orders based on R prevent interleaving of cards from subdecks having different desired retentions.

So, I think that Random should be the default sort because it is the best in the absence of a backlog. However, when a backlog is present, Descending R becomes the best sort order as it helps to minimize the loss of R and, thus, minimize the number of cards that need to be relearnt.

@dae
Member Author

dae commented Oct 8, 2024

There are quite a few users who run with a permanent backlog. How much harm do we do to them with a random order, compared to how much gain to give to non-backlogged users by interleaving different desired retentions? I feel like it makes more sense to optimize for the former, given the relative gains/losses.

@Expertium
Contributor

I agree. We can't quantify the benefits of interleaving using a simulation, but I'd be very surprised if it outweighs the benefit of having constant retention regardless of whether you have a backlog or not.

@user1823
Contributor

user1823 commented Oct 8, 2024

Ok, because an informed user would still be able to choose a Random sort order when they don't have a backlog, I think that making Descending R the default makes sense.

@brishtibheja
Contributor

I mostly agree with @user1823 now. I have seen some criticism from SM users about this point. Although we might not be able to do something like SM, it makes sense to optimise for interleaving in the current proposal. Currently, random does interleaving best.

If we're going to rework a new sort order I think it makes sense to do it this way. The new default sort order will be random if you have zero prop:due<0 cards, but otherwise it will switch to using reverse relative overdueness or what we decide to be best. I've called this dynamic or simply default. The latter is probably a better choice for the name.

Now it comes down to whether writing the code will be easy (I assume it is) and whether we tolerate this degree of ambiguity/complexity in the sort orders.
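
To make the "dynamic default" idea concrete, here is a minimal sketch, assuming a card counts as backlog when its due day is earlier than today (the thread's prop:due<0). The types `DueCard` and `DefaultOrder` and both function names are hypothetical and are not taken from Anki's code. Note that the check scans the cards once before any sorting happens.

```rust
/// Hypothetical card summary: `due` is a day number, like `today`.
#[derive(Debug, Clone, Copy)]
pub struct DueCard {
    pub due: i32,
}

/// The two behaviours the proposed default would switch between.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum DefaultOrder {
    Random,
    ReverseRelativeOverdueness,
}

/// A backlog exists if any card was due before today.
pub fn has_backlog(cards: &[DueCard], today: i32) -> bool {
    cards.iter().any(|c| c.due < today)
}

/// Random when the user is caught up, RRO as soon as anything is overdue.
pub fn pick_default_order(cards: &[DueCard], today: i32) -> DefaultOrder {
    if has_backlog(cards, today) {
        DefaultOrder::ReverseRelativeOverdueness
    } else {
        DefaultOrder::Random
    }
}
```

With every card due today the function picks `Random`; one card due a few days ago is enough to switch it to `ReverseRelativeOverdueness`.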

@dae
Member Author

dae commented Oct 11, 2024

Review cards are gathered and sorted at the same time. We can't know if we have a backlog until we've sorted the cards, unless we reimplement the gathering process to check for a backlog, or gather cards twice. I'm concerned this will have an impact on maintainability or performance.

I do appreciate the problem you're trying to solve though. Perhaps there is a different way we could accomplish it? For example, what if we merged the sort orders instead, and did something like this: sort(due==today, RRO), so that all the cards due today are shown first? In the backlog=false case, that's the same as random. In the backlog=true case, today's cards would appear first, ahead of the rest, but presumably they'd tend to anyway?
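
A rough sketch of what sort(due==today, RRO) could mean as a single comparator. It assumes RRO can be approximated by plain descending retrievability and that today's cards keep whatever order they arrive in (for example, already shuffled by the caller). `QueueCard` and `sort_today_then_rro` are made-up names for illustration, not Anki's.

```rust
use std::cmp::Ordering;

/// Hypothetical card summary used only for this sketch.
#[derive(Debug, Clone, PartialEq)]
pub struct QueueCard {
    pub id: u32,
    pub due: i32,            // day number
    pub retrievability: f64, // estimated recall probability in [0, 1]
}

/// Cards due today come first; all other cards follow in descending retrievability.
pub fn sort_today_then_rro(cards: &mut [QueueCard], today: i32) {
    // `sort_by` is stable, so today's cards keep their incoming order.
    cards.sort_by(|a, b| {
        let a_backlog = a.due != today; // false sorts before true
        let b_backlog = b.due != today;
        a_backlog.cmp(&b_backlog).then_with(|| {
            if a_backlog {
                // Both are backlog cards: highest retrievability first.
                b.retrievability.total_cmp(&a.retrievability)
            } else {
                // Both are due today: leave their relative order alone.
                Ordering::Equal
            }
        })
    });
}
```

Because everything lives in one comparator, no separate "is there a backlog?" pass is needed: with no overdue cards the second branch never runs and the order is whatever the caller supplied.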

@brishtibheja
Contributor

I'm sorry about this, but we have something new to see: https://forums.ankiweb.net/t/new-sort-oder-psg/50600


sort(due==today, RRO)

That's a great suggestion, and I believe we can combine it with my new suggestion too without doing people much harm.

@mlidbom

mlidbom commented Oct 15, 2024

If you have a dev environment, and you're dying for this feature, you might find this diff useful:

diff --git a/rslib/src/storage/card/mod.rs b/rslib/src/storage/card/mod.rs
index f290a7f71..9004d1476 100644
--- a/rslib/src/storage/card/mod.rs
+++ b/rslib/src/storage/card/mod.rs
@@ -762,7 +762,7 @@ impl fmt::Display for ReviewOrderSubclause {
                 let today = timing.days_elapsed;
                 let next_day_at = timing.next_day_at.0;
                 temp_string =
-                    format!("extract_fsrs_relative_overdueness(data, due, {today}, ivl, {next_day_at}) desc");
+                    format!("extract_fsrs_relative_overdueness(data, due, {today}, ivl, {next_day_at}) asc");
                 &temp_string
             }
             ReviewOrderSubclause::Added => "nid asc, ord asc",

It reverses the current relative overdueness sorting without renaming it. Ugly but I've been running it for about a day now and it seems to be working just fine. I'm entirely new to this codebase though, and basically took a stab in the dark with the change, so there are absolutely no guarantees whatsoever!
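
For anyone wondering what a less ugly version might look like, here is a hedged sketch in which the direction is part of the variant instead of a hard-coded suffix. `OrderSketch` and `Direction` are hypothetical stand-ins written for this comment; only the SQL fragment strings are copied from the diff above, and this is not how Anki's `ReviewOrderSubclause` is actually defined.

```rust
use std::fmt;

#[derive(Debug, Clone, Copy)]
pub enum Direction {
    Ascending,
    Descending,
}

/// Hypothetical stand-in for illustration; not Anki's real type.
#[derive(Debug, Clone, Copy)]
pub enum OrderSketch {
    RelativeOverdueness {
        today: u32,
        next_day_at: i64,
        direction: Direction,
    },
    Added,
}

impl fmt::Display for OrderSketch {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            OrderSketch::RelativeOverdueness { today, next_day_at, direction } => {
                // Same SQL fragment as in the diff; only the suffix varies.
                let dir = match direction {
                    Direction::Ascending => "asc",
                    Direction::Descending => "desc",
                };
                write!(
                    f,
                    "extract_fsrs_relative_overdueness(data, due, {today}, ivl, {next_day_at}) {dir}"
                )
            }
            OrderSketch::Added => write!(f, "nid asc, ord asc"),
        }
    }
}
```

With `Direction::Ascending` this yields the patched clause and with `Direction::Descending` the original one, so both orders could coexist under separate names.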

@dae
Member Author

dae commented Oct 15, 2024

@brishtibheja what was the outcome of that forum discussion? Are we simply reversing the current order, or was there some other ordering you guys decided was better? I am ok with the default for new presets being updated to it.

@Expertium
Contributor

Expertium commented Oct 15, 2024

@dae there is some stuff I want to test, I asked LMSherlock to add the code to the simulator. I'll probably tell you whether we're keeping reverse relative overdueness or adding a new order tomorrow.

Btw, I really wish you implemented my suggestion with removing most sort orders and adding a "Custom" sort order so that people who want a specific sort order can "cook" it themselves.
EDIT: here's how I envision it

[Image: Custom sort order]

We can also add "Deck" to the list of metrics, and it would open a second Metric dropdown list, so that the user could configure something like "Deck, then retrievability descending" or "Deck, then random". I was just too lazy to draw it, but basically it would be First Metric -> Second Metric -> Order (of the second metric)
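
As a rough illustration of the First Metric -> Second Metric -> Order idea, the sketch below models a custom order as an optional deck grouping, one metric, and an ascending/descending toggle. All names (`Metric`, `CustomOrder`, `CardStats`, `sort_custom`) are invented for this example, and only three metrics are included to keep it short.

```rust
use std::cmp::Ordering;

#[derive(Debug, Clone, Copy)]
pub enum Metric {
    Stability,
    Difficulty,
    Retrievability,
}

/// Hypothetical user-configured order: optional deck grouping, then one metric.
#[derive(Debug, Clone, Copy)]
pub struct CustomOrder {
    pub group_by_deck: bool,
    pub metric: Metric,
    pub descending: bool,
}

/// Hypothetical per-card values the sort can draw on.
#[derive(Debug, Clone)]
pub struct CardStats {
    pub id: u32,
    pub deck: String,
    pub stability: f64,
    pub difficulty: f64,
    pub retrievability: f64,
}

fn metric_value(card: &CardStats, metric: Metric) -> f64 {
    match metric {
        Metric::Stability => card.stability,
        Metric::Difficulty => card.difficulty,
        Metric::Retrievability => card.retrievability,
    }
}

/// Sorts by deck name first (if requested), then by the chosen metric.
pub fn sort_custom(cards: &mut [CardStats], order: CustomOrder) {
    cards.sort_by(|a, b| {
        let by_deck = if order.group_by_deck {
            a.deck.cmp(&b.deck)
        } else {
            Ordering::Equal
        };
        by_deck.then_with(|| {
            let by_metric =
                metric_value(a, order.metric).total_cmp(&metric_value(b, order.metric));
            if order.descending { by_metric.reverse() } else { by_metric }
        })
    });
}
```

"Deck, then retrievability descending" is then just `CustomOrder { group_by_deck: true, metric: Metric::Retrievability, descending: true }`, and each ascending/descending pair of built-in orders collapses into one metric plus the toggle.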

@richard-lacasse

  • Orders based on R prevent interleaving of cards from subdecks having different desired retentions.

@user1823 "Reverse Relative Overdueness" wouldn't be straight up based on retrievability, it would be based on R in relation to the deck's desired retention, at least that's how I understand the current Relative Overdueness to work. So interleaving wouldn't be an issue at all.

@mlidbom

mlidbom commented Oct 15, 2024

@Expertium

Btw, I really wish you implemented my suggestion with #3460 (comment) so that people who want a specific sort order can "cook" it themselves.

I could hardly agree more. Keep the built-in orders to a few well thought out orders known to be the best for most use cases. Then allow those that want something custom complete flexibility to get what they need in an advanced configuration section that most users will never use or need to understand. This is the best of both worlds to my mind.

The current model is sub-optimal both for flexibility and simplicity to my mind. Those that need something special probably can't get it. Those that need simplicity and clear guidance on what is best for most users can't get that either.

Expertium's suggestion solves both problems and I've actually been grumbling in my head that I want exactly that advanced configuration for my own needs. For one important point, consider the fact that had this been the way Anki worked, right now all users would already be able to get RRO sorting without having to wait for a new version....

Oh, and one more important point. Name the Default something like "Optimal for most users". Make it explicit in the documentation that this is the state of the art of the knowledge of the developers and that it may change. It is not necessarily one sort order, but rather Anki examining the state of your decks and choosing the best way to do things. This is the one to choose to automatically get the benefit of future improved knowledge without having to spend any mental effort and time researching which one to use. Make that impossible to miss with the naming.

@richard-lacasse

richard-lacasse commented Oct 15, 2024

If you have a dev environment, and you're dying for this feature, you might find this diff useful:

@mlidbom I did not have a dev environment and it took a few hours to figure it out, but I got it working. Thank you!! I've wanted this from Anki for so long lol. It's working how I've always imagined

@DerIshmaelite

How did you get it to work for you, and what does it look like among the options?

@mlidbom

mlidbom commented Oct 16, 2024

Apply the patch above, build and run. Choose Relative overdueness as the sort order. The patch reverses the order in which that option sorts.

@DerIshmaelite

I don't know how to do that... I guess I will just wait for this to be added to Anki.

@brishtibheja
Contributor

Please don't use the issue tracker for off-topic discussions.

@Expertium
Contributor

@brishtibheja what was the outcome of that forum discussion? Are we simply reversing the current order, or was there some other ordering you guys decided was better? I am ok with the default for new presets being updated to it.

@dae I tried another promising sort order, but it couldn't maintain desired retention at the specified level. Reverse relative overdueness is the best one, so please implement it.
brishtibheja will disagree and will say that more testing of other sort orders could reveal even better ones. I am not going to test them.

@brishtibheja
Contributor

brishtibheja commented Oct 17, 2024

I mentioned this in my post. I don't think that's a good metric to look at. If a user is in a backlog, the priority isn't having a constant recall rate (if recall rate were the goal, everyone would've set DR to .99) but to clear the backlog in the most efficient/effective way. PSG produces better results for that than retrievability_desc does.

I find it ironic that we're using time per remembered card for CMRR but here we're looking at recall rate instead. I invite @dae to read my post and decide on a metric that we'll look at, as there seems to be no consensus on what we should look at. FYI, sherlock looks at seconds_per_remembered_card, some don't trust it, some care more about the avg. recall rate, and so on.


As for PSG, it's better at two things: seconds_per_remembered_card and total_remembered.

@user1823
Contributor

user1823 commented Oct 17, 2024

but it couldn't maintain desired retention at the specified level.

Using this criterion to judge a sort order is not a good idea when there is a backlog. The overdue cards have a low R and if you review them, your average R will go down. So, in essence, if we want to maintain the average R close to the DR, we have to ignore the overdue cards and review only the freshly due cards. This is not a good solution because the user won't want to give up on some of their cards just because they couldn't complete all their due cards on some days. The backlog might get cleared eventually, but is it a good idea to keep the cards waiting for so long?

You might be concerned that a user will ask "Why is my retention lower than my desired retention? Isn't FSRS working correctly?" I think that the answer is simple, if the user has a backlog, the average R is bound to be lower than the desired R because the user didn't do their reviews on the date FSRS calculated.

IMO, the best metric to look for is time required to clear the backlog (i.e., the number of days after which the number of due cards becomes zero.)

@brishtibheja
Contributor

Dae previously mentioned here,

There are quite a few users who run with a permanent backlog.

For such users, not reviewing some of their cards for a long time will be detrimental. Even in subjects like Biology I can attest that knowledge is still very connected and not studying some basic facts can't be acceptable.

FYI, here is a graph showing the distribution of R with different sort orders:

[Image: distribution of R under each sort order]

The simulations were done for a situation where the user has a permanent backlog of cards.

If we are optimising for such users too (which from dae's comment, I assume we are) what should we be looking at? To me @user1823's suggestion doesn't look much good either, i.e. "the best metric to look for is time required to clear the backlog".


In the forums, Keks showed PSG_desc also prioritises cards with low R values. So it might be something worth exploring but I'm not sure.

@user1823
Contributor

If we are optimising for such users too, what should we be looking at? To me @user1823's suggestion doesn't look much good either

Well, why does it not look good?

If we have a sort order that allows the user to clear the backlog in the shortest possible time, then they may not really get a permanent backlog in the first place.

Also, even if they still get a permanent backlog, such a sort order will allow them to get through the backlog as fast as possible until they hit another block of busy days on which they are not able to review enough cards, building up a new backlog.

@Expertium
Contributor

then they may not really get a permanent backlog in the first place.

You are making the same mistake as brishtibheja. Sort orders cannot remove backlogs. If the user can only do a fixed number of reviews per day, and the real number of due cards is much larger, no sort order will fix it.
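
The arithmetic behind that can be sketched under one loud simplification: a constant number of cards become due every day, regardless of sort order. Under that assumption the backlog shrinks only by the gap between daily reviews and daily newly due cards. The function name and parameters below are made up for illustration.

```rust
/// Days until the backlog reaches zero, or `None` if it never does.
/// Simplifying assumption: `newly_due_per_day` is constant and does not
/// depend on which cards get reviewed first.
pub fn days_to_clear(backlog: u32, reviews_per_day: u32, newly_due_per_day: u32) -> Option<u32> {
    if backlog == 0 {
        return Some(0);
    }
    if reviews_per_day <= newly_due_per_day {
        // The user cannot even keep up with fresh cards, so the backlog never shrinks.
        return None;
    }
    let net_per_day = reviews_per_day - newly_due_per_day;
    // Ceiling division: a partially filled last day still counts as a day.
    Some((backlog + net_per_day - 1) / net_per_day)
}
```

For example, a 1000-card backlog with 200 reviews per day and 150 newly due cards per day clears in 20 days, while 100 reviews per day against the same inflow never clears it. In practice the sort order can change how many cards lapse and come back, which this toy model ignores.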

@brishtibheja
Contributor

People finish their backlogs all the time irl. The default settings make new cards disappear when reviews hit the review limit.

@DerIshmaelite

My only concern regarding Reverse Relative Overdueness: With an ever-growing collection and backlog, not just a permanent backlog, what would happen to the cards with very low R cards at the end of the queue? Would I even be able to see them at all, or would they remain trapped there?

@user1823
Contributor

the real number of due cards is much larger

Do we really need a new sort order for dealing with such cases? I think that a user in such a situation can just permanently suspend the most overdue cards, which will essentially have the same effect as using Descending R as the sort order.

I think that the new sort order should try to help those people who want to eliminate their backlog. This was the basis of my suggestion of using time required to clear the backlog as the metric.


what would happen to the cards with very low R cards at the end of the queue?

@DerIshmaelite, with reverse relative overdueness sort order, you will never see them unless you are somehow able to eliminate all of your backlog.


I think that all of us have different priorities, which is causing a mess here. So, let's not make further comments and wait for @dae to give his opinion on what should the purpose of the new sort order be and, therefore, what metric should be used to determine which sort order is the best.

@brishtibheja
Contributor

I agree, let dae's word be final on this. But I'd like to bring everyone's attention to the fact that we can still add new sort orders for people to use. The issue at hand is to select a new default.

@richard-lacasse

My only concern regarding Reverse Relative Overdueness: With an ever-growing collection and backlog, not just a permanent backlog, what would happen to the cards with very low R cards at the end of the queue? Would I even be able to see them at all, or would they remain trapped there?

They'd remain trapped if you never got to them, but if you aren't ever getting through your backlog, you are not reaching your goals regardless of what sort order you're using. The goal is to eventually get through the backlog and we want to find the best sort order that accomplishes that in the least amount of work.

Just think of those cards at the very back of the deck as "New" cards. You're never going to get to all possible new cards, life isn't long enough. But Reverse Relative Overdueness will make sure you're staying on top of the cards that you are studying, and allows you to chip away at the backlog when you can. Also, if there are specific important cards at the bottom of the pile you don't want neglected, just study them today. Now they're back in the rotation.

@richard-lacasse

The overdue cards have a low R and if you review them, your average R will go down.

This isn't true. When you review a card, its R immediately becomes 1.0, which raises the average.

IMO, the best metric to look for is time required to clear the backlog (i.e., the number of days after which the number of due cards becomes zero.)

Agreed, and I did change the sim to reflect that. Surprisingly, the results were pretty close to the same. The same sort orders performed the best.

@brishtibheja
Contributor

This isn't true.

They were actually talking about retention rate and not average R (it's a misnomer, yeah).

@DerIshmaelite

I think that the new sort order should try to help those people who want to eliminate their backlog. This was the basis of my suggestion of using time required to clear the backlog as the metric.

I see your point and therefore agree.

@dae
Member Author

dae commented Oct 26, 2024

wait for @dae to give his opinion on what should the purpose of the new sort order be and, therefore, what metric should be used to determine which sort order is the best.

Sorry to keep you guys waiting. Adding RRO seems non-contentious, so let's ensure we do that at the very least before the next stable release.

The current default sort order is not ideal for users with larger backlogs - they are presented with the longest-waiting cards first (which will tend to be harder), and subsequent reviews with short intervals can be delayed while the user works through other parts of their backlog. It's demotivating, and that can drive users to give up on Anki.

I'm not sure how to state it in terms of metrics, but in the case of a backlog, RRO seems like an improvement over what we currently have. The user will start with the easy stuff, making it less of a slog to start working through the backlog, and the subsequent reviews will come up promptly. It may not be perfect, but it seems like an improvement over what we have. Any objections to making it the default until a compelling argument for something else comes along?

And any objections to 'today's due cards first, then remaining cards in RRO', as described here? #3460 (comment)

@Gardengul

What would be an effective alternative to "Deck, then due date"?

@richard-lacasse

It's demotivating, and that can drive users to give up on Anki.

RRO seems like an improvement over what we currently have. The user will start with the easy stuff, making it less of a slog to start working through the backlog, and the subsequent reviews will come up promptly.

Just want to highlight that it's not just about user experience (not to imply you didn't know this already), but it's also more efficient objectively. Cards with higher R values are in the steeper part of the forgetting curve, and are thus losing R faster with each passing day.
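
A quick numeric check of that claim, assuming the FSRS-4.5 power forgetting curve R(t, S) = (1 + 19/81 · t/S)^(-0.5) and comparing two cards with the same stability. The thread does not say which curve version applies, so treat the constants as an assumption; the function names are made up for this sketch.

```rust
// Assumed FSRS-4.5 constants; other FSRS versions use different values.
const DECAY: f64 = -0.5;
const FACTOR: f64 = 19.0 / 81.0;

/// Probability of recall after `elapsed_days` for a card with the given stability.
pub fn retrievability(elapsed_days: f64, stability: f64) -> f64 {
    (1.0 + FACTOR * elapsed_days / stability).powf(DECAY)
}

/// Inverse of the curve: how many days have elapsed when retrievability equals `r`.
pub fn elapsed_for(r: f64, stability: f64) -> f64 {
    stability / FACTOR * (r.powf(1.0 / DECAY) - 1.0)
}

/// How much retrievability a card currently at `r_now` loses over the next day.
pub fn one_day_loss(r_now: f64, stability: f64) -> f64 {
    let t = elapsed_for(r_now, stability);
    r_now - retrievability(t + 1.0, stability)
}
```

With a stability of 10 days, a card at R = 0.9 loses more retrievability over the next day than a card at R = 0.5, so at equal stability, delaying the high-R card costs more. Cards with different stabilities would need a separate comparison.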

And any objections to 'today's due cards first, then remaining cards in RRO', as described here?

No objections, but if that will cause any complications at all, I don't think it's necessary. RRO will in the vast majority of cases give you today's cards first, and in the instances where it doesn't, RRO is probably preferable to "today's due cards" anyway imo. If it's an easy add though, I like it fine.

@dae
Member Author

dae commented Oct 26, 2024

@Gardengul that is a good point, it would make sense to update that sort order as well

@richard-lacasse I'd seen the discussions above, but your concise explanation puts it nicely. The special-casing of today is motivated by the non-backlog case: I presume we don't want users who are keeping up with their reviews to suddenly find they're appearing easiest to hardest.

@dae dae added this to the 24.10 milestone Oct 26, 2024
@richard-lacasse

richard-lacasse commented Oct 26, 2024

@dae makes sense, I just don't think that anyone can perceive the difference between prop:r=0.87 and prop:r=0.89. If you did a blind test and gave people their prop:due=0 cards sorted by RRO and sorted by Random, nobody would be able to tell the difference.

Again, if it's easy to implement, I don't see why not. But seems like added steps in the code and in the explanation to users.

I'd seen the discussions above

my bad lol, I've been discussing this in like 4 different places, so forgot what was said here already.

@DerIshmaelite

DerIshmaelite commented Oct 26, 2024

I think it would be nice to consider updating the list of sorting orders for Filtered Decks as well.

@Expertium
Contributor

Expertium commented Oct 28, 2024

@dae please consider implementing this: #3460 (comment)

image

And any objections to 'today's due cards first, then remaining cards in RRO', as described here?

That just sounds like extra steps for no extra reason IMO.

@dae
Member Author

dae commented Oct 28, 2024

Custom sort orders are a large change, and are unlikely in the next few months.

for no extra reason

I mentioned my concern above - I'm worried that this change would result in the day's cards being shown in roughly easiest to hardest order, instead of purely randomly as is currently done. @richard-lacasse has asserted that users would not be able to tell the difference, but I wonder if that's the case. The difference between 0.87 and 0.89 may indeed not be apparent, but if cards are showing up in R order, I think they might notice that the tail end and head end of the queue are not equally difficult.

@user1823
Contributor

user1823 commented Oct 28, 2024

'today's due cards first, then remaining cards in RRO'

Seems reasonable to me.

But seems like added steps in the explanation to users.

If we follow the below suggestion by mlidbom, this problem won't arise.

Name the Default something like "Optimal for most users". Make it explicit in the documentation that this is the state of the art of the knowledge of the developers and that it may change. It is not necessarily one sort order, but rather Anki examining the state of your decks and choosing the best way to do things. This is the one to choose to automatically get the benefit of future improved knowledge without having to spend any mental effort and time researching which one to use. Make that impossible to miss with the naming.

@Expertium
Contributor

Custom sort orders are a large change, and are unlikely in the next few months.

Oh well. Alright.

I think they might notice that the tail end and head end of the queue are not equally difficult.

Is it a problem, though?

@user1823
Contributor

In addition to what dae pointed out, having a Random order in the no backlog case helps to increase interleaving (at least theoretically), which improves learning.

@brishtibheja
Contributor

If we follow the below suggestion by mlidbom, this problem won't arise.

I've previously suggested the wording for default sort order to be "Default". But I believe we are not making RRO the default yet?

@dae dae self-assigned this Nov 7, 2024
@dae dae removed their assignment Nov 7, 2024
@dae dae removed this from the 24.11 milestone Nov 7, 2024
@dae dae changed the title from "Add 'reverse relative overdueness'; consider making it the default" to "Make descending retrievability the default sort order; drop day sorting?" Nov 7, 2024
8 participants