Counts in background #7127

Arthur-Milchior · 2020-09-17T03:55:22Z

Pull Request template

Purpose / Description

Running the profiler, I realized that one of the reason undo was still slow was that computing counts is done in foreground and not in background. So I moved it to background.

This is not a problem for normal review since counts are required to find the next card, so the only case where counts should actually be computed and can't be read from cache is when a card can be found without using sched.getCard, i.e. after undo.

How Has This Been Tested?

Running it on my merge branch, reviewing, undoing, and seeing that the card's question appear almost immediately while the counts takes a few seconds to actually update.

Checklist

Please, go through these checks before submitting the PR.

You have not changed whitespace unnecessarily (it makes diffs hard to read)
You have a descriptive commit message with a short title (first line, max 50 chars).
Your code follows the style of the project (e.g. never omit braces in if statements)
You have commented your code, particularly in hard-to-understand areas
You have performed a self-review of your own code

mikehardy

Ah dangit - I think this is going to run up against @david-allison-1 having a strong preference for not pushing task-specific items into TaskData. I am sympathetic to that, I'd prefer TaskData stay not task-specific. @Arthur-Milchior is there a way to implement this passing the data through TaskData but without putting it specifically in to TaskData in the form of concrete types and task-specific methods?

Arthur-Milchior · 2020-09-30T16:17:58Z

It can be simply saved as an object, but I really don't see the point of loosing typing information.
Of course, my strong preference is to have CollectionTask well typed, instead of using <TaskData, TaskData, TaskData>, and I even have a PR dealing with this.

david-allison · 2020-09-30T16:19:56Z

My objection is that each time we add an object, it gets harder and harder to refactor it into something reasonable, and we take the hit from allocating more fields on all tasks, including our hot paths.

If we're using the internal object field, that makes it easier to turn into a generic, or Result later on down the line.

But, if there's a good reason to use another field, then let's take the hit

david-allison

Counts can now throw if it hits a path with database access.

Could we ensure that this is properly handled in the reviewer?

Arthur-Milchior · 2020-09-30T16:29:25Z

I am surprised that there is a hint. In java, this member is simply a pointer. There is no more reason to be worried about pointer than to be worried about ints. And I've not really seen any reason to state that we should avoid introducing more ints when needed. Casting (which require to check whether the cast is valid at all) is far more expensive

CollectionTask extends a class which already deal well with Generic, so I don't see why we would want to use generic in TaskData at all in the first place. What is Result ?

I don't understand your point about database access. Counts (the class) only save three ints and gives them back. It does not interacts with database at all. So I don't know what you believe it can throw. And the method counts is exactly the same in the background than the method that was used in the reviewer previously

david-allison · 2020-09-30T16:57:19Z

I don't understand your point about database access

d694006 introduced an exception into counts(), which wasn't there previously we should deal with it without crashing.

CollectionTask extends a class which already deal well with Generic, so I don't see why we would want to use generic in TaskData at all in the first place. What is Result ?

BaseAsyncTask is a monster and the API surface is terrible. CollectionTask massively improves this, we can do better, but we don't want to go backwards.

TaskData is useful because it provides a sane API to check if the result of a task was a success or not, we'll want to keep this concept going forwards.

Result would be the ideal we move to in the future: a wrapper/union which contains the status & strongly typed (optional) result of the execution of the task. Async libraries will have put much more thought into that than we will.

I am surprised that there is a hint. In java, this member is simply a pointer. There is no more reason to be worried about pointer than to be worried about ints. And I've not really seen any reason to state that we should avoid introducing more ints when needed. Casting (which require to check whether the cast is valid at all) is far more expensive

Anything which is unnecessarily increasing the memory allocations that ANSWER_CARD makes is something that we want to avoid, one field isn't the end of the world, but it'll be hard to go back if we take this route, and I don't think it's a good decision.

Arthur-Milchior · 2020-09-30T17:18:15Z

I don't understand your point about database access

d694006 introduced an exception into counts(), which wasn't there previously we should deal with it without crashing.

Current code in the reviewer already call counts,
https://github.com/ankidroid/Anki-Android/pull/7127/files#diff-15d704f4d834521dda5e28231ddbd727L922

Why would calling it in background introduce more risk than calling it directly from the reviewer ?

Anything which is unnecessarily increasing the memory allocations that ANSWER_CARD makes is something that we want to avoid, one field isn't the end of the world, but it'll be hard to go back if we take this route, and I don't think it's a good decision.

Updating the counts is also a part of answering a card. And casting seems a far more expensive operation. So avoiding a cast seems to save more time than avoiding a field.

The remaining of the answer does not seems relevant for the current PR, so I'll leave it for when we discuss background tasks

david-allison

How does this work WRT the JavaScript API? Can a JS call to counts return a stale value?

Arthur-Milchior · 2020-10-01T08:17:54Z

The JS API send what is currently in the the FlashCardViewer. So you are right, while computation occurs in background, JS only have access to the stale number.
I guess the ideal thing to do would be to send a message to JS that those number have changed. Ideally without reloading the card entirely. But it means that current card who don't handle receiving the message won't know it. I don't know whether any such card even exists.

Do you know whether calls to ReviewerJavaScriptFunction is done in background or in the UI interface ? If it's in background, then I can simply call counts directly, which ensure that the correct value is computed, even if it takes time.
If it's the UI interface, I may use a semaphore to ensure that we are not waiting for value to be updated

mikehardy · 2020-10-01T13:04:54Z

I do not want any sort of message passing or signaling infrastructure between the model (a card, and it's contents) and the viewers, that seems really overdone to me, for performance. Can we have at least some variant of the counts API just be reliable, such that if you call it....you get counts. Real counts ?

Arthur-Milchior · 2020-10-01T13:24:30Z

I am lost once again.
The very fact that we tell to the Reviewer "here is the card's question and answer" is already a message passing from the model to the viewer. It seems impossible to remove that.

Currently, there is a problem, the card is computed in backgrounds, but the counts are computer in the UI thread. This means that when the counts are not trivially known (i.e. after undo, suspend, bury... or when you open a new deck), you have computation done in the UI thread. Worse, you are loosing time because you could display the card content sooner instead of waiting for counts to be computed. For complex deck hierarchy, it's a noticeable difference.

Up until now, the answer has always been to start a collection task, and when the data is available, return it to the UI so it can display it. That's exactly what I'm using right now, the reviewer ask for the counts, a task is started to compute the counts, and when the task is done, it returns the counts to the UI.

I can do exactly the same thing for javascript of course, when it asks for the number of new card, I create a task in background to compute it. So that it gets the real value. The only downside is that it means the function will takes some time to return if the user press "undo", because the counts of new card takes some time to compute. The creator of the javascript code may not expect it to take time.

If we follow this idea, the impact is as follow:

For normal review (i.e. after answering a card of a deck), everything is quick
If you press undo, and the card's javascript does not request access to the count, then the card's question is displayed as soon as possible, and the count is computed once the question is displayed, so that no time is lost
If you press undo and the card's javascript require counts, then the card will takes some time to display - exactly as it does now - because it needs to wait until counts are computed to ends its script. That won't be slower than it is now, but it won't benefits from the increase in speed that this PR should offer

Arthur-Milchior · 2020-10-01T16:56:18Z

I've added two new commits. Once ensure that if counts are asked to be recomputed twice at the same time, it still behaves properly (this is a risk since javascript will need the computation to occur in the UI thread)
The second commit ensure that javascript uses actual number. Usually, this method returns immediatly, unless a reset just occurred (i.e. after undo, or when selecting a new deck). I added comment explaining this

david-allison · 2020-10-01T20:10:19Z

Do you know whether calls to ReviewerJavaScriptFunction is done in background or in the UI interface ?

JavaScript interacts with Java object on a private, background thread of this WebView. Care is therefore required to maintain thread safety.

https://developer.android.com/reference/android/webkit/WebView#addJavascriptInterface(java.lang.Object,%20java.lang.String)

The most sensible (and simple) thing would be to block until the data is available, and make this explicit in the documentation. Realistically, we should only be talking milliseconds.

Arthur-Milchior · 2020-10-01T20:37:36Z

Realistically, we should only be talking milliseconds.

If the numbers are not already computed and that the request trigger the computation of numbers, it may takes far more than miliseconds. If it never took more than miliseconds to compute the counts, I would not be doing this PR.
The only way to ensure it always takes milisecond would be to wait until the numbers are computed before displaying the card content. Which is what I want to avoid in the first place.

mikehardy · 2020-12-08T17:04:38Z

@david-allison-1 what's the prescriptive advice here to get this one in? It looks like it might be close?

david-allison · 2020-12-08T17:10:57Z

I don't know, I still stand by my suggestion for a blocking function which is documented, but that's weak. The main aim is to avoid introducing bugs.

Maybe a JS async wrapper over a blocking API (I don't do JavaScript, so don't know if this is possible), we haven't verified what blocking will do to the JavaScript.

I'm out of my comfort zone for best practices here.

mikehardy · 2020-12-08T18:25:37Z

I've got some JS knowledge and (from react-native-webview plus here) some experience with the actual WebView / JS interfaces. I'll take this and think on it then.

Arthur-Milchior · 2020-12-12T13:17:24Z

Rebased

This makes a difference only during the fraction of second between the time where the card is displayed and the time where reviewer update its counts. This fraction of second only exists when couts needs to be recomputed

github-actions · 2021-02-10T14:42:16Z

Hello 👋, this PR has been opened for more than 2 months with no activity on it. If you think this is a mistake please comment and ping a maintainer to get this merged ASAP! Thanks for contributing! You have 7 days until this gets closed automatically

Arthur-Milchior · 2021-02-13T13:52:35Z

This is still extremely relevant. We still do a lot of database work in foreground. So Clearly not to close

david-allison · 2021-03-22T07:13:06Z

This is going to take a long time for me to review again. Apologies, but I'll be looking at it over the next few days.

BUG (not directly relevant to this change):

Anki-Android/AnkiDroid/src/main/java/com/ichi2/libanki/sched/SchedV2.java

Lines 225 to 228 in 67f0938

    
           @Override 
        
           public void resetCounts(@NonNull CancelListener cancelListener) { 
        
               resetCounts(true); 
        
           }

CancelListener is not propagated. This should be resetCounts(cancelListener, true);

I'll fix this

david-allison · 2021-03-22T07:25:05Z

AnkiDroid/src/main/java/com/ichi2/libanki/sched/Sched.java

    protected void _resetLrnCount(@Nullable CancelListener cancelListener) {
        // sub-day
-        mLrnCount = mCol.getDb().queryScalar(
+        int lrnCount = mCol.getDb().queryScalar(


This implies a race condition. To my knowledge, CollectionTask ensured that access to sched was single-threaded.

Did this occur due to other UI-based interactions, or was this done as a precaution?

If this wasn't a precaution, it means there's the potential for race conditions inside the calling method chain (mHaveCounts at the very least)

You're entirely right, CollectionTask ensures that a single instance is called at each time.

However, the JS interfaces ankiGetLrnCardCount akiGetNewCardCount and ankiGetRevCardCount all require instant access to the number of cards. This currently works because showing the card content is blocked until the numbers are actually computed. My goal is too show card content before the number are computed, to win up to a second on big collection. This creates usually no problem because the numbers are shown in java on top of screen and JS usually do not use them.
This lead to the problem when js require access to those numbers when a card is created. There is two ways to act:

return the current number that may be false

block until the number are computed (I want to emphasize that it is what we are currently doing, except that we block even when those numbers are not used.)

The second case seems to make more sens to me, if we have a function, let it returns the right number.
So my last commit ensure that this method call counts from scheduler. In general, this will return instantaneously, however, in the case where a reset was just done and the computation not yet redone, it may actually trigger the computation to be done immediately. Since the computation was already planned in background, there is a small risk of actual race condition. However, this method is idempotent (with this change), so it presents no danger if the computation is done twice simultaneously, it just will get slower than necessary when it happens.

An easy solution would be of course to ensure that counts is synchronized.

In this case, I can't use CollectionTask in the java method, since I need to return immediately. If the @JavascriptInterface can be made async, we could go back to CollectionTask.
Clearly, it would be ideal as it would move a computation that require database access back to background.
@infinyte7 is it possible to use async/promise in the interface? Do you know if it would break anybody's add-on yet?

The downside is that it suddenly makes add-on code more complex, since users suddenly have to use async/await or promises.

I admit that the current PR is not entirely optimal, but I still believe it to be better than the current state of the code, since it now blocks only if the numbers are actually required and not available, instead of blocking even when the numbers are not useful and don't block anything without introduction the synchronized cost

mikehardy · 2022-07-22T21:38:46Z

#11808 will supercede this with regards to speed, we're pretty sure

mikehardy added this to the 2.14 release milestone Sep 17, 2020

Arthur-Milchior requested review from david-allison and mikehardy and removed request for david-allison September 17, 2020 19:50

Arthur-Milchior added the Needs Review label Sep 17, 2020

Arthur-Milchior force-pushed the counts_in_background branch 2 times, most recently from bdd8ea6 to 5118f28 Compare September 29, 2020 06:21

Arthur-Milchior mentioned this pull request Sep 30, 2020

Get first card quicker during reviews #6035

Merged

mikehardy added the Review High Priority Request for high priority review label Sep 30, 2020

mikehardy reviewed Sep 30, 2020

View reviewed changes

mikehardy added the Needs Author Reply Waiting for a reply from the original author label Sep 30, 2020

david-allison requested changes Sep 30, 2020

View reviewed changes

david-allison reviewed Sep 30, 2020

View reviewed changes

Arthur-Milchior force-pushed the counts_in_background branch from 5118f28 to ead4f4e Compare October 1, 2020 17:40

mikehardy modified the milestones: 2.14 release, 2.15 release Nov 10, 2020

david-allison self-assigned this Dec 8, 2020

mikehardy assigned mikehardy and unassigned david-allison Dec 8, 2020

NF: reviewer.setCounts

44fbcab

Arthur-Milchior force-pushed the counts_in_background branch from ead4f4e to a8fc078 Compare December 12, 2020 13:16

Arthur-Milchior removed the Needs Author Reply Waiting for a reply from the original author label Dec 12, 2020

Arthur-Milchior added 3 commits December 12, 2020 14:54

NF: update reviewer's count in background

ab977d6

NF: introduce temporary variable to avoid synchronization conflict

4b1e92b

NF: javascript return actual number and not reviewer's value

17a3459

This makes a difference only during the fraction of second between the time where the card is displayed and the time where reviewer update its counts. This fraction of second only exists when couts needs to be recomputed

Arthur-Milchior force-pushed the counts_in_background branch from a8fc078 to 17a3459 Compare December 12, 2020 13:57

Arthur-Milchior mentioned this pull request Dec 31, 2020

[Bug] SchedV1: Learn cards are not fully completed #7984

Closed

4 tasks

github-actions bot added the Stale label Feb 10, 2021

Arthur-Milchior added Keep Open avoids the stale bot and removed Stale labels Feb 13, 2021

david-allison mentioned this pull request Mar 22, 2021

fix[SchedV2]: use cancelListener param #8309

Merged

7 tasks

david-allison reviewed Mar 22, 2021

View reviewed changes

david-allison added the Needs Author Reply Waiting for a reply from the original author label Mar 24, 2021

mikehardy modified the milestones: 2.15 release, 2.16 release Apr 4, 2021

dae mentioned this pull request Jul 16, 2022

Add support for the V3 scheduler; drop support for V1 #11808

Merged

mikehardy closed this Jul 22, 2022

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Counts in background #7127

Counts in background #7127

Arthur-Milchior commented Sep 17, 2020

mikehardy left a comment

Arthur-Milchior commented Sep 30, 2020

david-allison commented Sep 30, 2020

david-allison left a comment

Arthur-Milchior commented Sep 30, 2020

david-allison commented Sep 30, 2020

Arthur-Milchior commented Sep 30, 2020

david-allison left a comment

Arthur-Milchior commented Oct 1, 2020

mikehardy commented Oct 1, 2020

Arthur-Milchior commented Oct 1, 2020

Arthur-Milchior commented Oct 1, 2020

david-allison commented Oct 1, 2020

Arthur-Milchior commented Oct 1, 2020

mikehardy commented Dec 8, 2020

david-allison commented Dec 8, 2020

mikehardy commented Dec 8, 2020

Arthur-Milchior commented Dec 12, 2020

github-actions bot commented Feb 10, 2021

Arthur-Milchior commented Feb 13, 2021

david-allison commented Mar 22, 2021 •

edited

Loading

david-allison Mar 22, 2021

Arthur-Milchior Jun 13, 2021

mikehardy commented Jul 22, 2022

Counts in background #7127

Counts in background #7127

Conversation

Arthur-Milchior commented Sep 17, 2020

Pull Request template

Purpose / Description

How Has This Been Tested?

Checklist

mikehardy left a comment

Choose a reason for hiding this comment

Arthur-Milchior commented Sep 30, 2020

david-allison commented Sep 30, 2020

david-allison left a comment

Choose a reason for hiding this comment

Arthur-Milchior commented Sep 30, 2020

david-allison commented Sep 30, 2020

Arthur-Milchior commented Sep 30, 2020

david-allison left a comment

Choose a reason for hiding this comment

Arthur-Milchior commented Oct 1, 2020

mikehardy commented Oct 1, 2020

Arthur-Milchior commented Oct 1, 2020

Arthur-Milchior commented Oct 1, 2020

david-allison commented Oct 1, 2020

Arthur-Milchior commented Oct 1, 2020

mikehardy commented Dec 8, 2020

david-allison commented Dec 8, 2020

mikehardy commented Dec 8, 2020

Arthur-Milchior commented Dec 12, 2020

github-actions bot commented Feb 10, 2021

Arthur-Milchior commented Feb 13, 2021

david-allison commented Mar 22, 2021 • edited Loading

david-allison Mar 22, 2021

Choose a reason for hiding this comment

Arthur-Milchior Jun 13, 2021

Choose a reason for hiding this comment

mikehardy commented Jul 22, 2022

david-allison commented Mar 22, 2021 •

edited

Loading